The effects of Pleistocene climatic cycles on avian historic demography across Amazonia

Laís Araújo Coelho

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Graduate School of Arts and Sciences

COLUMBIA UNIVERSITY

2020

© 2020

Laís Araújo Coelho

All Rights Reserved

Abstract

The effects of Pleistocene climatic cycles on avian historic demography across Amazonia

Laís Araújo Coelho

Understanding the history of Amazonian diversity and how it relates to past environmental changes in the region is fundamental for elucidating the processes behind the origin of global diversity distribution patterns and for understanding future threats to its preservation. The diversification of Amazonian biota has been a topic of debate for centuries. Recent studies have found that biodiversity in Amazonia is highly underestimated and that many taxa are younger than previously thought. The distribution and dynamics of rivers, vegetation, soil types and moisture gradients created a complex scenario of diversification in the region. These emerging patterns have raised new, unanswered questions regarding the effects of landscape history on Amazonian biodiversity.

Climate cycles of the Pleistocene were important in shaping biodiversity patterns worldwide, and they have been hypothesized to be drivers of diversification in Amazonia. However, little is known about the effects of historic climate cycles on Amazonian organisms. I addressed this knowledge gap by combining recently gathered abiotic evidence of past changes in precipitation across Amazonia with population genetics methods to explicitly investigate the effects of climate cycles on the historic demography of upland forest birds.

Specifically, I compared the demographic histories of populations occurring in regions with contrasting climatic histories: Northwestern Amazonia (NW, relatively stable paleoclimate and humid during the Last Glacial Maximum, LGM) and Southeastern Amazonia (SE, marked paleoclimatic cycles and dry in the LGM). Demographic history was assessed at two timescales: late Pleistocene (mtDNA of 33 taxa) and mid-late Pleistocene (whole genomes of four taxa). I hypothesized that: 1) populations co-occurring in the SE would show signals of recent, synchronous co-expansion due to rainfall increase since the LGM; 2) NW populations would have idiosyncratic demographic histories in response to drivers other than climatic cycles, given the milder regional precipitation oscillation; 3) populations occurring in SE would show cycles of population size change spanning multiple glacial cycles that would be more pronounced than, and off-phase with, those of their NW counterparts. I found synchronous changes in population size using both mtDNA and genomic reconstructions. Most populations in both regions (23 in each) underwent expansions, based on ABC model testing with mtDNA. Contrary to my expectations, both the SE and NW populations showed co-expansion (87-97% of populations co-expanding in SE and 94-99% in NW), and both expansion events preceded the LGM (106-121 Kya for SE and 120-138 Kya for NW). These results were corroborated by whole-genome-based demographic reconstructions: the focal populations showed signs of increase (Rhegmatorhina gymnops, Psophia dextralis and Psophia napensis) and decrease (R. melanosticta) during the transition from the penultimate glacial maximum to the last interglacial period (~120 Kya). These synchronous demographic changes across Amazonia suggest a joint response to changes in the environment spanning the whole region, which points to climatic cycles of the Pleistocene.

Amazonian forest birds respond to habitat change in concurrence, and they were sensitive to even relatively subtle changes in precipitation during the mid to late Quaternary.

Table of Contents

List of Figures ...... iv

List of Tables ...... v

Appendices ...... vi

Introduction ...... 1

Overview ...... 5

CHAPTER 1-Convergent demographic history of lowland birds across Amazonia ...... 8

1.Introduction ...... 8

2.Methods...... 13

2.1 Data collection ...... 13

2.2 Population delimitation ...... 13

2.3 Demographic history inference ...... 18

2.4 Effects of climate on genetic diversity and magnitude of population size change ...... 21

2.5 Coexpansion modelling ...... 22

2.6 Simulations for method validation ...... 24

3.Results ...... 27

3.2 Pairwise comparisons of summary statistics between regions ...... 32

3.3 Co-expansion modelling ...... 33

3.4 Synchronicity estimation assessment ...... 37

4.Discussion ...... 40

4.1 Genetic diversity and magnitude of population size change across Amazonia ...... 40

4.2 Synchronicity and timing of expansion across Amazonia ...... 43

4.3 Considerations about mechanisms driving demography ...... 46

5. Conclusions ...... 47

CHAPTER 2-A Multireference-Based Whole Genome Assembly for the Obligate Ant-Following Antbird, Rhegmatorhina melanosticta (Thamnophilidae) ...... 48

1. Introduction ...... 48

2. Methods...... 51

2.1 Genome Sequencing and De Novo Assembly ...... 51

2.2 Single reference assisted chromosome level assembly ...... 53

2.3 Multiple reference assisted chromosome level assembly ...... 55

2.4 Evaluation of genome completeness ...... 58

2.5 Genotyping-by-sequencing (GBS) reference mapping ...... 60

3. Results ...... 61

3.1 De-novo assembly ...... 61

3.2 Reference-Based Assembly ...... 61

3.3 Genome Completeness...... 63

3.4 Synteny ...... 65

3.5 GBS Reference Mapping ...... 68

4. Discussion ...... 68

4.1 General Genome Structure, Contiguity and Content ...... 68

5. Conclusions ...... 72

CHAPTER 3-Demographic history of Amazonian bird populations over multiple glacial cycles ...... 74

1.Introduction ...... 74

2.Methods...... 78

2.1 Reference genomes ...... 80

2.2 Genome resequencing ...... 81

2.3 Single Nucleotide Polymorphism Calling and Filtering ...... 82

2.4 Demographic reconstruction ...... 83

3. Results ...... 86

3.1 Genome Resequencing and genetic diversity ...... 86

3.2 Demographic history within regions ...... 87

3.3 Demographic history within taxa ...... 88

4. Discussion ...... 93

4.1 Comparative population history ...... 94

References ...... 101

Appendices ...... 133

List of Figures

Figure 1. Example of sampling strategy in study region. The dots correspond to sampling localities of sequences...... 18

Figure 2. Bayes factor comparison between three demographic models for populations in SE Amazonia and NW ...... 28

Figure 3. Pairwise comparison of nucleotide diversity (A), haplotype diversity (B) and population size change (C) in Southeastern (orange) and Northwestern (blue) Amazonia...... 33

Figure 4. Posterior distribution of four co-expansion hyperparameters estimated for 22 bird populations in Northwestern Amazonia...... 36

Figure 5. Posterior distribution of four co-expansion hyperparameters estimated for 24 bird populations in Southeastern Amazonia ...... 37

Figure 6. Mode and credibility interval of the hyperparameters community congruence (zeta) and co-expansion time (Ts) ...... 38

Figure 7. Chromosome ideogram...... 62

Figure 8. Benchmarking Universal Single-Copy Orthologs version 3 (BUSCO) results for R. melanosticta plus eight related (Eufalconimorphae) ...... 65

Figure 9. Synteny plots between single-reference-based genome assemblies ...... 67

Figure 10. Sampling localities of specimens used in Chapter 3...... 80

Figure 11. Distribution of nucleotide diversity (pi, x axis) across 500 Kb windows ...... 87

Figure 12. Demographic history of two Rhegmatorhina populations across Amazonia ...... 90

Figure 13. Demographic history of two Psophia populations across Amazonia ...... 91

Figure 14. Comparison between demographic histories of two populations in SE ...... 92

Figure 15. Comparison between demographic histories of two populations in NW...... 93

List of Tables

Table 1. List of study taxa, number of populations per region, source of sequences, loci used, number of samples per population and number of base pairs per sequence...... 17

Table 2. Settings of NetRecodon for the simulations of each single population of the simulated population assemblages...... 26

Table 3. Values of Spearman’s correlation coefficient, Tajima’s D and R2 for the populations from Southeast Amazonia...... 31

Table 4. Values of Spearman’s correlation coefficient, Tajima’s D and R2 for the populations from Northwest Amazonia...... 30

Table 5. Results of co-expansion simulations for northwestern and southeastern Amazonia. CH corresponds to the model from Chan et al. (2014). Threshold (TH), partitioned time (PT) and narrow co-expansion (NCT) correspond to models with different strategies for sampling from time prior (Gehara et al. 2017)...... 35

Table 6. Results of co-expansion analysis for ten simulated populations (pseudo-observed data, POD) under the model from Chan et al. (2014)...... 39

Table 7. List of genomes used as reference to Rhegmatorhina melanosticta genome assembly...... 57

Table 8. Comparison of the Rhegmatorhina melanosticta linked-read de novo assembly with PacBio long-read bird assemblies...... 64

Table 9. Museum voucher information for samples. The average coverage and its standard deviation for the genomes are presented in the last two columns...... 79

Appendices

SFigure 1. PCA containing 1000 closest simulations for three hABC models in A) NW Amazonia and B) SE Amazonia...... 133

SFigure 2. The phylogeny used to inform the whole-genome alignment. The topology was extracted from Oliveiros et al (2018)...... 133

SFigure 3. Number of structural variants (insertions/deletions, duplications, rearrangements) mapped to Rhegmatorhina melanosticta chromosomes ...... 134

SFigure 4. Phylogenetic relationships for some birds in the family Thamnophilidae...... 134

SFigure 5. Distribution of lengths of scaffolds that form each chromosome (thin lines) in the Rhegmatorhina melanosticta genome...... 135

SFigure 6. Score density (y axes) of quality metrics used for data filtering for raw SNPs of all Rhegmatorhina genomes combined...... 136

SFigure 7. Effects of data thinning and different substitution rates on demographic reconstruction...... 137

SFigure 8. Effects of using different numbers of cross-validation folds on demographic reconstruction...... 137

SFigure 9. Comparison of regularization methods for population size plotting for Psophia napensis...... 138

SFigure 10. Evaluation of dataset size and contiguity on demographic history reconstruction of Psophia populations...... 138

Introduction

Amazonia has been the focus of studies investigating drivers of diversification for centuries (Antonelli et al., 2018; Haffer, 2008). The region spans an area of continental proportions (6,000,000 km²) and has the highest biodiversity of plants and vertebrates on the planet (Jenkins et al., 2013; Schulman et al., 2007; Steege et al., 2013). Initial reconstructions of the origins of Amazonian lowland biota were based solely on the distribution of species and the identification of areas of endemism (Cracraft and Prum, 1988; Haffer, 1969). Haffer (1969) identified congruence between the location of areas of endemism for birds and regions where rainfall is higher in Amazonia, which led him to propose the refugia diversification hypothesis. This hypothesis suggested that Amazonian populations were repeatedly isolated and eventually differentiated in pockets of forest (refugia) during glacial cycles of the mid to late Pleistocene (past million years). Beyond refugia, rivers limit the distribution of many Amazonian species, a pattern initially described in Wallace’s 19th century expeditions (Colwell, 2000; Wallace, 1889). This pattern triggered the hypothesis that the formation of rivers created barriers to gene flow, making rivers likely main drivers of diversification (Fernandes et al., 2014; Ribas et al., 2012; Wallace, 1889). Orogenic events, such as the rise of the Andean mountains, have also been suggested to have driven diversification of Amazonian taxa (Antonelli et al., 2011; Hoorn et al., 2010). These hypotheses have provided an important background for subsequent studies attempting to reconstruct the history of Amazonian biota, particularly in the past two decades (Garzón‐Orduña et al., 2014; Leite and Rogers, 2013; Prates et al., 2016; Pulido-Santacruz et al., 2018; Silva et al., 2019; Smith et al., 2014; Weir et al., 2015).

Studies investigating the origins of Amazonian biodiversity have highlighted the importance of interactions between riverine barriers and other environmental factors, such as climate oscillations (Sousa et al., 2019). Putative barriers, such as major rivers and the mountains of the Andes, affect the structuring of genetic diversity in multiple Amazonian taxa (e.g. Lynch-Alfaro et al., 2014; Naka et al., 2012; Ribas et al., 2012). An important framework for identifying diversification drivers is the temporal congruence between historic events, such as diversification events of lineages (Barker et al., 2014). Many studies have tested the role of rivers in the diversification of Amazonian taxa (Fernandes et al., 2014; Lima et al., 2017; Ribas et al., 2012; Weir et al., 2015). Comparative studies report varying degrees of discordance but also some congruence in spatial and temporal patterns of diversification across rivers (Naka and Brumfield, 2018; Smith et al., 2014; Sousa et al., 2019). The incongruences found across these studies highlight the complex history of Amazonian biodiversity (Leite and Rogers, 2013; Rull, 2011). River permeability as a barrier to gene flow in Amazonia depends on the river’s age, geomorphological origin and location of headwaters, and also on the ecology of the taxon in question (Capurucho et al., 2013; Naka and Brumfield, 2018; Smith et al., 2014; Weir et al., 2015). The potential synergistic effect between climate cycles and the permeability of riverine barriers, through changes in river discharge, in floodplain extent during shifts in climate and in vegetation distribution around headwaters, has brought forth a renewed interest in understanding the effects of past climate cycles on Amazonian biodiversity (Haffer, 2008; Irion and Kalliola, 2010; Pupim et al., 2019; Weir et al., 2015).

The Quaternary (past 2.6 million years) was marked by major cyclical shifts in climate, from low global temperatures during glaciations to peaks in temperature during interglacial periods (Lisiecki et al., 2008; Lisiecki and Raymo, 2005). Climate cycles of the Pleistocene drove speciation, shifts in distribution and extinctions across the world (Carstens et al., 2018; Van Dam et al., 2006; Merceron et al., 2010; Weigelt et al., 2016). While the effects of Pleistocene climate cycles on biodiversity have been broadly identified for temperate regions (Carstens and Knowles, 2007; Hewitt, 2004; Shafer et al., 2010), their effects on the species-rich tropics remain poorly understood. It has been suggested that Amazonian populations did not experience changes in size during the Last Glacial Maximum (LGM, 20,000 years ago) due to the climatic stability of the tropics (Lessa et al., 2003). This notion has long been disputed, as mounting evidence indicates that evolutionary history, environment and climate were dynamic during the Quaternary in Amazonia (Cheng et al., 2013; Cohen et al., 2014; Lima et al., 2017; Prates et al., 2016; Zular et al., 2019; reviewed in Bennett et al., 2012 and Baker and Fritz, 2015). Despite this body of work, and although cycles of the Pleistocene were hypothesized to be the main factor driving diversification in Amazonia (Haffer, 1969), associating biotic and evolutionary patterns with past shifts in climate remains challenging in Amazonia.

While abiotic (e.g. glacier marks in rocks) and biotic (e.g. fossil pollen) evidence of landscape changes during climate cycles is well preserved in temperate regions, such records are scarce in the tropics (Baker, 2014). Assessing the effects of past climate on Amazonian biota has been historically challenging due to the paucity of records of past environments (e.g. palynological records in Amazonia; Flantua et al., 2015). An important focus of the study of drivers of Amazonian diversification has been to identify periods in geological history that concentrated speciation events (Rull, 2008). It was previously thought that orogenic events in Amazonia predated the Pleistocene, with changes in forest and dry-vegetation cover due to climatic cycles being the main source of landscape variation throughout this period (Rull, 2011). Based on this premise, diversification events in the Pleistocene would have occurred mainly due to dispersal across established barriers and shifts in habitat availability (refugia), while diversification in the Miocene would have occurred due to vicariance (e.g. Garzón-Orduña et al., 2014; Pavan et al., 2014). Indeed, the Miocene was a very important period of diversification for higher groups of Amazonian taxa (Hoorn et al., 2010; Rull, 2011). However, investigation of diversification patterns focusing on shallower taxa (putative species and geographic variants or subspecies) showed that many Amazonian lineages are younger than previously thought, and that diversification was continuous throughout the Pleistocene (Derryberry et al., 2011; Garzón-Orduña et al., 2014; Lynch-Alfaro et al., 2012; Smith et al., 2014). Furthermore, the assumed stability of Amazonian landscapes during the Pleistocene underestimates the changes that actually took place in this period (e.g. Latrubesse and Rancy, 2000; Cheng et al., 2013). Associations between taxon age and diversification process have often lacked the combined environmental evidence that could clarify the process being assumed (Rull, 2015).

A reconstruction of historic precipitation patterns across Amazonia over the past 200 Ky, based on a collection of oxygen isotope data from cave speleothems coupled with palynological records, suggests that while western Amazonia had less intense variation in precipitation (relative stability in relation to other regions), eastern Amazonia underwent intense oscillations between humid and dry climatic phases (Cheng et al., 2013). The periodicity of these cycles corresponded to global precession cycles (~25 Ky), and the two regions (east and west) were off-phase in relation to each other: while eastern Amazonia was at its driest during glacial maxima, western Amazonia had its highest precipitation rate (Cheng et al., 2013). Marine sediment cores have corroborated the increase in precipitation in western Amazonia during glacial periods over the past 250 Ky (Gavin et al., 2014). In contrast, palynological records close to the forest-savannah ecotone in eastern Amazonia suggest that forest cover was not stable during the Last Glacial Maximum (Hermanowski et al., 2012a; Hermanowski et al., 2012b). In addition to the geographically limited distribution of palynological records for Amazonia, most of the available records span at most the past 20 Ky (Flantua et al., 2015). Accumulating evidence for historic landscape change in Amazonia suggests that climatic cycles could have played an important role in habitat distribution and availability during the Quaternary (Bennett et al., 2012).

The effects that environmental changes during climatic cycles of the Pleistocene had on Amazonian populations remain an important missing link in our understanding of the drivers shaping the region's biodiversity patterns. Microevolutionary events at the level of population genetics are the basic units that build broad macroevolutionary phenomena such as the diversification of megadiverse regions (Harvey et al., 2019). Understanding the effects of climatic cycles of the Pleistocene on the demographic history of Amazonian populations could shed light on long-standing questions regarding the relationship between climate cycles and diversification (Haffer, 1969; Janzen, 1967). Furthermore, uncovering the response of populations to past environmental changes is key to understanding ecological relationships between organisms, their environment, and current change (Knowles, 2009).

Overview

In this thesis, I combine advances in the reconstruction of Pleistocene environment and climate across Amazonia with the power of a comparative phylogeographic approach to investigate the relationship between past climate cycles and the population history of Amazonian understory birds.

For the second chapter of this thesis, I implemented a hierarchical Approximate Bayesian Computation (hABC) approach to assess synchronous demographic expansion among populations in northwestern (NW) and southeastern (SE) Amazonia. I used mitochondrial DNA sequences collected over a decade of phylogeographic studies in the region to assess whether demographic patterns within the putatively historically stable western Amazonia were idiosyncratic, and whether populations from the unstable eastern Amazonia showed evidence of synchronous responses to environmental variation in the region. I expected that a region that has been stable over an evolutionary timescale would have posed fewer shared physiological and ecological constraints on its populations, so that demographic histories in the region would result from idiosyncratic events. In contrast, cyclically unstable regions should put joint strain on their populations, which would respond jointly to periods of increased adversity. I also investigated whether patterns of genetic diversity differed between the two regions. I expected that the unstable SE would have had more population contractions in the Pleistocene and would therefore have lost genetic diversity over time.
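
As a rough illustration of what synchronous versus idiosyncratic expansion means in a hierarchical co-expansion model, the sketch below draws expansion times for a set of populations: a proportion zeta of them shares a single expansion time, and the remaining populations draw their own times independently. This is a toy sketch and not the pipeline used in this thesis; the function name, the uniform prior on idiosyncratic times and all numeric values are assumptions chosen only for illustration.

```python
import random


def draw_expansion_times(n_pops, zeta, shared_time, t_min, t_max, rng):
    """Draw one expansion time per population (hypothetical helper).

    A proportion `zeta` of the populations expands at `shared_time`;
    the others draw independent times from Uniform(t_min, t_max).
    """
    if not 0.0 <= zeta <= 1.0:
        raise ValueError("zeta must lie between 0 and 1")
    n_sync = round(zeta * n_pops)  # number of co-expanding populations
    times = [float(shared_time)] * n_sync
    times += [rng.uniform(t_min, t_max) for _ in range(n_pops - n_sync)]
    rng.shuffle(times)  # population order carries no information
    return times


# Illustrative values only (years before present), not estimates from this study.
rng = random.Random(42)
times = draw_expansion_times(
    n_pops=20, zeta=0.9, shared_time=110_000, t_min=20_000, t_max=300_000, rng=rng
)
n_shared = sum(1 for t in times if t == 110_000.0)
print(f"{n_shared} of {len(times)} populations share the same expansion time")
```

With zeta close to 1 almost all populations expand together, which is the pattern expected under a shared regional driver; with zeta close to 0 the expansion times are scattered, which is the idiosyncratic pattern.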

The third chapter of this dissertation was dedicated to creating a high-quality reference genome to be used in the following chapter, where I employed methods of historic population genomics. For this, I created a reference genome for Rhegmatorhina melanosticta, one of my focal taxa in Chapter 4, using linked-read sequencing and simultaneously employing multiple references for genome assembly. This approach optimized the contiguity of my reference genome, an important property for genomic data that are analyzed under a coalescent and recombination framework (Terborgh et al., 2017). This chapter resulted in the first chromosome-level assembly of a suboscine, and the first multireference chromosome-level assembly for a bird.

Finally, chapter 4 investigates the effects of past climatic cycles in Southeastern (SE) and Northwestern (NW) Amazonia on four focal taxa. I employed a whole-genome demographic reconstruction framework to assess differences in patterns of population size change on a deep coalescent timescale. This approach, which joins site frequency spectrum metrics to a sequential coalescent framework, is a powerful tool for inferring past population history (Terborgh et al., 2017), and its use for non-model organisms has been very limited so far. The species I investigated in this chapter were Psophia napensis and Rhegmatorhina melanosticta in western Amazonia, and P. dextralis and R. gymnops in eastern Amazonia. The focal species were chosen based on previous knowledge of their phylogeographic distribution, their restriction to understory forest habitats and their sensitivity to current landscape disturbance (Ribas et al., 2012; Ribas et al., 2018; Stouffer and Bierregaard, 1995). The study populations were restricted to the areas of endemism (Napo and Tapajós) where the speleothem records of changes in precipitation in the late Quaternary were found (Cheng et al., 2013; Wang et al., 2017). I expected to find similar demographic histories for the SE populations, with a few cycles of population contraction and expansion spanning the last half of the Pleistocene (past million years). I expected NW populations not to show marked changes in population size throughout the mid to late Pleistocene. I expected to find synchronous yet opposite changes in population size between populations occurring in different regions, and higher genetic diversity in NW populations.

CHAPTER 1-Convergent demographic history of lowland birds across Amazonia

1.Introduction

Change in climate has long been identified as a major driver of species diversification (Carstens and Knowles, 2007; Van Dam et al., 2006; Haffer, 1969), range shifts (Carstens et al., 2018; Timmermann and Friedrich, 2016) and extinction (Crowley and North, 1988; Merceron et al., 2010). The late Pleistocene (past million years) was marked by major glacial climate cycles of roughly 100 Kyr (thousand years) duration, ranging from low global temperatures during glaciations to peaks in temperature during interglacial periods (Lisiecki et al., 2008; Lisiecki and Raymo, 2005). Although the link between glacial cycles and change in biodiversity distribution and composition has been studied in multiple regions (e.g. Gómez Cano et al., 2013; Shafer et al., 2010; Weigelt et al., 2016), this relationship remains elusive for much of the Neotropical lowlands (Baker and Fritz, 2015; Rull, 2013).

Amazonia is one of the most biodiverse regions of the planet (Jenkins et al., 2013). The region was also the predominant evolutionary origin of taxa across many Neotropical biomes (Antonelli et al., 2018a). Uncovering the origins of Amazonia’s biodiversity has been the focus of much discussion and investigation, and it is paramount to a broader understanding of the mechanisms driving diversification and maintenance of biodiversity (Antonelli et al., 2011; Leite and Rogers, 2013; Ribas et al., 2012; Smith et al., 2014; Weir, 2006). Climate cycles were at the center of one of the most influential hypotheses for the mechanisms of Amazonian diversification: Haffer’s (1969) Pleistocene refuge hypothesis postulated that because populations of forest taxa tracked changes in vegetation distribution during glacial cycles, they became isolated and differentiated as forests contracted. In the decades following Haffer’s hypothesis, there were no Pleistocene records of climate oscillations across Amazonia against which to test the refuge hypothesis. This shifted the focus to major rivers as potential drivers of allopatry (Rocha and Kaefer, 2019). Although it has been suggested that lowland vegetation cover and its associated populations were not affected by historic climate change in Amazonia (Bennett et al., 2012; Lessa et al., 2003), recent paleoenvironmental evidence suggests that vegetation cover, precipitation and temperature were dynamic in Amazonia during climatic cycles of the Pleistocene (Cheng et al., 2013; Cohen et al., 2014; Hermanowski et al., 2012b; Lima et al., 2018; Rossetti et al., 2017, 2018). Therefore, the responses of Amazonian biodiversity to Pleistocene climate cycles remain an unresolved and essential part of the effort to uncover the evolutionary history of this megadiverse region.

Pleistocene climate in Amazonia had a spatially complex pattern of cyclical variation in temperature and precipitation (Baker et al., 2019; Cheng et al., 2013; Zular et al., 2019). The most complete record of past climate in Amazonia identified a dipolar pattern of change in precipitation during the past 150 Ky associated with global Pleistocene cycles: while periods of increased rainfall in western Amazonia coincided with glacial maxima, eastern Amazonia had marked drops in precipitation during these periods (Cheng et al., 2013). The eastern climate speleothem record suggests that the contrasting precipitation pattern across eastern and western Amazonia was present during glacial maxima but was not predominant in interglacial phases (Wang et al., 2017). The complexities of past change in climate across Amazonia are difficult to unravel due to the paucity of comparable evidence of past climate elsewhere.

Amazonian forest composition and structure respond to current and past changes in climate (Olivares et al., 2015; Saatchi et al., 2013; Zular et al., 2019). Palynological records on the margins of the region show an encroachment of open vegetation into currently forested sites spanning the past 40 Ky along the southern ecotone between Amazonia and the savanna of the Cerrado, which reflects the fluctuations in precipitation identified in the climate record (Fontes et al., 2017; Hermanowski et al., 2012b; Mayle et al., 2007; Mayle and Power, 2008; Reis et al., 2017). In contrast, the single palynological record from the Hill of Six Lakes in central Amazonia supports maintenance of forest cover over the past 130 Ky, with some changes in forest structure due to periodic occurrence of palms and cold-adapted trees beyond their current range (D’Apolito et al., 2017, 2013). A model of past vegetation cover based on climate and calibrated by pollen records found some stable forest refugia in western Amazonia during the Last Glacial Maximum and predominant change in forest type in the southeast (Arruda et al., 2018), mirroring the stable-unstable dipole from the speleothem records (Cheng et al., 2013). The evidence that climate was dynamic, changed in complex patterns across Amazonia and drove shifts in vegetation distribution in the region calls for an assessment of the effects of Pleistocene climate cycles on the region’s biota (Baker et al., 2019).

The distribution of bird diversity and lineage age in Amazonia follows the moisture gradient across the region, which increases from southeastern Amazonia to the northwest (Silva et al., 2019). This pattern suggests that climate could have driven the diversification and distribution of species in the region (Silva et al., 2019). Microevolutionary processes such as the maintenance or loss of genetic diversity, and changes in a population’s range and size, are elemental components of the mechanisms driving speciation and accumulation of biodiversity (Harvey et al., 2019). The effects of regional historic disturbance on the persistence of populations, on fluctuations in their size over time and on the consequent maintenance of genetic diversity are an important and often overlooked aspect of the diversification process (Allmon, 1992; Dynesius and Jansson, 2000). Numerous studies have shown that many bird populations across Amazonia had dynamic population sizes in the late Pleistocene, and a minority show signs of stability (e.g. Aleixo, 2004; Fernandes et al., 2012; Fernandes et al., 2013; Ribas et al., 2012; Silva et al., 2019). However, no emerging regional patterns have been identified (Silva et al., 2019), hindering our understanding of the relationship between past changes in climate and the demographic histories of Amazonian taxa. Climatic cycles are usually invoked as the driver of historic demographic changes. The lack of a comprehensive understanding of climate shifts and their effects on habitat distribution across most of Amazonia makes it difficult to associate historic demography with landscape history. As a result, the effects of Pleistocene climate cycles on Amazonian lowland taxa remain a missing piece in our understanding of the relationship between landscape history and maintenance of Amazonian biodiversity (Rull, 2013).

The climate models of eastern and western Amazonia create an opportunity to investigate the response of Amazonian biota to past climate cycles. Comparative investigation of past evolutionary events across different co-occurring populations adds power to the reconstruction of past environmental changes and their effects on organisms (Arbogast and Kenagy, 2001; Avise et al., 2016; Hickerson et al., 2010). This study investigates the effects of Pleistocene climate cycles on Amazonian lowland forest bird populations through a comparative demographic analysis of 33 bird species. Because cave speleothem records and most palynological records are located on the periphery of Amazonia, the climatic and environmental history of most of Amazonia is currently intractable (Baker et al., 2019; Flantua et al., 2015). I therefore focused on populations co-occurring in areas with different climatic histories: the southeastern region between the Tapajós and Amazon rivers (SE) and the northwest region (NW) bounded by the Solimões river to the south and the Negro river to the east.

As Northwestern Amazonia had relatively stable precipitation patterns in the past (Cheng et al., 2013), it could be expected that populations occurring in this region would show more stability than populations in other regions. Furthermore, demographic oscillations observed in this region should be spaced out in time, as they would likely be idiosyncratic events: in a stable environment, changes in population size respond to drivers that are particular to each taxon. In contrast, marked changes in precipitation in SE would override idiosyncratic taxon sensitivities and drive a joint response to climate cycles from populations occurring in the region.

I expect that SE populations will have co-expanded due to a community-wide recovery after the marked oscillations in precipitation during climate cycles of the late Pleistocene (past 150 Ky, Cheng et al., 2013; Wang et al., 2017), whereas populations in NW would be predicted to have a low occurrence of synchronous expansion. If there is synchronous expansion in NW, I expect that its timing will be different from the timing of expansion in SE, reflecting the dipole precipitation pattern across Amazonia (Cheng et al., 2013). Genetic diversity is usually higher in stable landscapes (Arenas et al., 2012; Carnaval et al., 2009), therefore I predict higher genetic diversity in NW than in SE populations. Finally, I expect that the greater decrease in habitat availability in SE during periods of low precipitation caused a greater change in population size in each of the SE populations, and that the amplitude of population size change for NW will be narrower. With this study, I hope to elucidate whether and how climate cycles affected Amazonian bird populations, and how climate history might affect diversity and diversification in the region.


2. Methods

2.1 Data collection

I compiled mitochondrial DNA gene sequences (ND2, ND3, CytB and/or COI) from GenBank and Dryad, as well as some previously unpublished data, for 33 Amazonian bird taxa (Table 1).

In most cases, populations belonged to species endemic to only one of the sub-regions, with closely related species in the other regions (represented by “NA” in Table 1). Some of the study taxa were populations restricted to NW and/or SE that were part of species with broad distributions in Amazonia (e.g. Glyphorynchus spirurus). All sequences with fewer than 850 base pairs were excluded. I concatenated sequences of different genes in the cases where multiple mtDNA genes were available.

2.2 Population delimitation

A central assumption of models based on the distribution of neutral mutations and coalescent processes in population genetics is that populations are not structured within their ranges (panmictic populations, Heller et al. 2013). For proper reconstruction of population dynamics, it is of paramount importance to identify independently evolving lineages and their geographic distribution. Each population in this study is a subset of sequences that comprises the smallest genetic cluster that was also geographically cohesive. These populations were identified in two steps. First, phylogenetic trees were constructed with samples covering each species’ entire range under a Maximum Likelihood approach in RAxML (Stamatakis 2014), and 100 bootstrap resamples were performed to assess topological support. I identified clades that contained sequences from a shared geographic range with at least 80% bootstrap support. I limited this study to clades occurring in southeastern Amazonia (SE: Tapajós, Xingu and Belém interfluvia, with a few populations extending to eastern Rondônia) and northwestern Amazonia (NW: including populations limited to the Napo, Imeri and Negro areas of endemism, based on the classification proposed by Cracraft in 1985). I chose these regions due to their proximity to key evidence of past climate in Amazonia (Fig. 1, Cheng et al. 2013, Wang et al. 2017) and one of the only palynological records of changes in vegetation within Amazonia during the Pleistocene (Fig. 1, D’Apolito et al. 2013).

Clades were assessed for finer geographic structure with the clustering algorithm implemented in Bayesian Analysis of Population Structure (BAPS; Corander et al., 2008). BAPS uses linked SNP information from single locus data to identify optimal genetic clustering within an alignment. If the genetic clusters identified by BAPS were geographically isolated from other clusters, they were considered a population. If not, all samples included in the original clade were considered to be a single population. Although larger matrices are ideal, alignments with as few as five sequences of mitochondrial DNA are sufficient to discriminate between high and low population genetic diversity (Goodall-Copestake et al., 2012). Populations with at least eight samples and for which there were at least two genetic markers available were kept in this study.


| Taxon | NW | SE | Source | Loci | Sample size | BP |
|---|---|---|---|---|---|---|
| Automolus paraensis | NA | 1 | Schultz et al. 2017 | ND2, cytb | 17 | 1976 |
| Automolus infuscatus badius | 2 | NA | Schultz et al. 2017 | ND2, cytb | 13-24 | 1976 |
| Automolus infuscatus infuscatus | 1 | NA | Schultz et al. 2017 | ND2, cytb | 9 | 1976 |
| Automolus subulatus | 1 | NA | Schultz et al. 2017 | ND2, cytb | 8 | 1976 |
| Automolus ochroptera turdinus | 1 | NA | Schultz et al. 2017 | ND2, cytb | 11 | 1976 |
| Dendrocincla fuliginosa | 1 | 1 | Burney et al. 2009, Smith et al. 2014, Weir et al. 2011 | cytb | 24-50 | 869 |
| Galbula albirostris | 1 | NA | Ferreira et al. 2018 | ND2 | 8 | 1001 |
| Galbula cyanicollis | NA | 1 | Ferreira et al. 2018 | ND2 | 12 | 1001 |
| Glyphorynchus spirurus | 4 | 3 | Smith et al. 2014, Fernandes et al. 2013 | cytb | 10-100 | 957-1004 |
| Hylophylax naevius | 0 | 1 | Fernandes et al. 2014b | ND2 | 11 | 891 |
| Lepidothrix iris eucephala | NA | 1 | Unpublished | ND2, COI | 9 | 1871 |
| Lepidothrix coronata | 1 | NA | Smith et al. 2014 | cytb | 10 | 989 |
| Lepidothrix vilasboasi | NA | 1 | Unpublished | ND2, COI | 12 | 1871 |
| Malacoptila fusca | 1 | NA | Ferreira et al. 2016 | cytb, ND3 | 8 | 1228 |
| Malacoptila rufa | NA | 3 | Ferreira et al. 2016 | cytb, ND3 | 10-11 | 1213 |
| Microcerculus marginatus | 1 | 1 | Smith et al. 2014 | cytb | 10-14 | 976 |
| Myrmotherula axillaris | 1 | 2 | Burney et al. 2009, Smith et al. 2014 | cytb | 11-46 | 967 |
| Myrmeciza hemimelaena | 0 | 1 | Fernandes et al. 2012 | cytb | 15 | 950 |
| Phoenicircus nigricollis | 1 | NA | Unpublished | ND2 | 9 | 923 |
| Psophia dextralis | NA | 1 | Ribas et al. 2012 | ND2, cytb | 10 | 2034 |
| Psophia interjecta | NA | 1 | Ribas et al. 2012 | ND2, cytb | 9 | 2034 |
| Psophia napensis | 1 | NA | Ribas et al. 2012 | ND2, cytb | 8 | 2034 |
| Psophia ochroptera | 1 | NA | Ribas et al. 2012 | ND2, cytb | 11 | 2034 |
| Pyriglena leuconota | NA | 2 | Maldonado-Coelho et al. 2013 | ND2 | 32-52 | 950 |
| Rhegmatorhina cristata | 1 | NA | Ribas et al. 2018 | ND2, ND3, cytb | 8 | 2400 |
| Rhegmatorhina gymnops | NA | 1 | Ribas et al. 2018 | ND2, ND3, cytb | 27 | 2159 |
| Rhegmatorhina melanosticta | 1 | NA | Ribas et al. 2018 | ND2, ND3, cytb | 8 | 2396 |
| Schiffornis turdina | 1 | 0 | Nyári 2007, Smith et al. 2014 | cytb | 18 | 840 |
| Sclerurus rufigularis | 1 | 0 | d'Horta et al. 2013 | ND2, ND3, cytb | 10 | 2357 |
| Synallaxis rutilans | 1 | 1 | Unpublished | COI, ND2 | 8-13 | 990 |
| Willisornis poecilinotus | 0 | 1 | Weir et al. 2015 | cytb | 31 | 1084 |
| Xenops minutus | 1 | 2 | Smith et al. 2014 | cytb | 9-18 | 981 |
| Xiphorhynchus guttatus | NA | 1 | Rocha et al. 2015 | cytb, ND2 | 22 | 1610 |

Table 1. List of study taxa, number of populations per region (NW, SE), source of sequences, loci used, number of samples per population (Sample size) and number of base pairs per sequence (BP).

Figure 1. Example of sampling strategy in the study region. The dots correspond to sampling localities of sequences. The three shades of blue in the Southeast (delimited by the Tapajós river, in orange) correspond to three populations of Malacoptila rufa in the region (from Ferreira et al, 2017). The red and orange dots correspond to populations of Malacoptila fusca in the Northwestern region, delimited by the Negro and Solimões rivers (thicker blue lines). The stars are the sampling locations for data reflecting past environmental conditions in Amazonia: El Condor and Paraíso caves (Cheng et al. 2013) and the palynological records from Lake Pata (D’Apolito et al. 2013).

2.3 Demographic history inference

Excluding populations that were either constant or decreasing in size reduces noise when assessing whether populations that expanded did so synchronously (Gehara et al., 2017). I used three complementary strategies to identify populations with signals of past expansion: Approximate Bayesian Computation (ABC), Bayesian Skyline Plots (Drummond et al., 2005) and Ramos-Onsins and Rozas’ R2 (Ramos-Onsins and Rozas, 2002).


I assessed changes in effective population size (Ne) through time with Bayesian Skyline Plots (BSP), built with the software BEAST version 1.8.4 (Drummond et al. 2005, Drummond and Rambaut 2007). I implemented the HKY substitution model and a strict clock model with a rate of 0.0105 substitutions/site/million years (Weir and Schluter 2008). I ran the MCMC chain for 10 to 50 million steps, sampling every 5000 steps. I verified that the effective sample size (ESS) was higher than 200, and plotted and collected data on population size change through time using Tracer v. 1.6, after discarding the first 10% of each run as burn-in. I estimated Spearman’s rank correlation coefficient (ρ) between time and the median population size through time from the BSPs to assess whether population sizes increased towards the present (negative coefficient, expanding populations) or decreased (positive coefficient, bottleneck populations) with the cor.test function in R, with “method” set to “spearman” (R Core Team, 2019).
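The sign convention of the correlation step can be illustrated with a minimal sketch. It is written in plain Python rather than with R's cor.test, and the skyline values are hypothetical; the function names are introduced here for illustration only. Time is measured in years before present, so growth towards the present yields a negative coefficient.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical BSP median curve (not data from this study).
years_before_present = [0, 20_000, 40_000, 60_000, 80_000, 100_000]
median_ne = [900_000, 850_000, 600_000, 300_000, 120_000, 100_000]

rho = spearman_rho(years_before_present, median_ne)
print(f"Spearman's rho = {rho:.2f}")
```

Because Ne falls monotonically as time before present increases in this toy curve, the coefficient is negative, which corresponds to an expanding population under the convention used here.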

I also implemented an Approximate Bayesian Computation (ABC) approach to identify whether the populations underwent an increase (expansion model), a decrease (bottleneck model), or remained constant (constant size model) in relation to past population size. The ABC method is used to compare how well different models of demographic history explain observed data, as well as to estimate demographic parameters of interest. For this, sequence data are simulated under different demographic histories. Summary statistics are estimated for both simulated and observed data, and the simulated datasets that are closest to the observed dataset are retained based on a rejection criterion (Gehara et al., 2017; Reid et al., 2019). The posterior distribution of parameters and the fit of the data to the different models are assessed based on the selected simulations. Current effective population size (Ne) was estimated for the three models. Past effective population size (NeA) was estimated for the expansion and bottleneck models, along with the time of instantaneous increase/decrease of population size (te). The prior for the change in population size was expressed as the ratio of ancestral to current effective population size (NeA/Ne), which had a uniform distribution ranging from 0.001 to 0.1 for the expansion model and from 2 to 20 for the bottleneck model. The priors for population size and expansion time had uniform distributions ranging from 10^3 to 10^6 individuals and from 20 thousand years ago (Kya) to one million years ago (Mya), respectively. The broad prior for expansion time spans the last million years of the Pleistocene, a period with more frequent and intense climatic oscillations, and also reflects the timing of past demographic events for many organisms globally (Hewitt, 2000; Burbrink et al., 2016; Gehara et al., 2017). The prior for substitution rate had a normal distribution with a mean of 1.1e-8 substitutions/site/yr (s/s/yr) and SD of 1.5e-9 s/s/yr, which encompasses a range of rates estimated for many bird groups (Weir and Schluter, 2008). All generation times were of one year, except for Psophia, a large terrestrial taxon for which sexual maturity is reached at two years of age (Sherman 1995). The summary statistics used to represent demographic history were Tajima’s D, nucleotide diversity, number of segregating sites, haplotype diversity and the frequency of the three most common alleles (Gehara et al., 2017).
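For readers unfamiliar with these summary statistics, the following sketch computes them from a small alignment using their standard textbook definitions. It is not the PipeMaster implementation: the function name and the toy alignment are hypothetical, gaps and missing data are assumed absent, and "alleles" are treated as whole-sequence haplotypes, which is an assumption.

```python
from collections import Counter
from itertools import combinations


def summary_statistics(seqs):
    """Summary statistics for aligned sequences of equal length (no gaps)."""
    n, length = len(seqs), len(seqs[0])
    # Segregating sites: alignment columns with more than one state.
    s = sum(1 for pos in range(length) if len({seq[pos] for seq in seqs}) > 1)
    # Mean number of pairwise differences (k) and per-site diversity (pi).
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(p, q)) for p, q in pairs) / len(pairs)
    pi = k / length
    # Haplotype diversity with the n / (n - 1) sample-size correction.
    counts = Counter(seqs)
    h = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))
    # Frequencies of the three most common haplotypes.
    top3 = [c / n for _, c in counts.most_common(3)]
    # Tajima's D; set to 0.0 here when the alignment has no variation.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1, e2 = c1 / a1, c2 / (a1**2 + a2)
    d = (k - s / a1) / (e1 * s + e2 * s * (s - 1)) ** 0.5 if s > 0 else 0.0
    return {"S": s, "pi": pi, "H": h, "top3": top3, "tajima_d": d}


# Hypothetical four-sequence alignment with two singleton mutations.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGTACGT", "TCGTACGT"]
stats = summary_statistics(toy_alignment)
print(stats)
```

In the toy alignment both variable sites are singletons, so Tajima's D is negative, the direction expected after a population expansion.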

Two hundred thousand sets of summary statistics were simulated under each demographic model, resulting in 600,000 simulations per population. The simulations were run with the ms software (Hudson 2002) as implemented in the R package PipeMaster, with the function “test.demog” (Gehara et al. in prep; github.com/gehara/PipeMaster).
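The logic of simulating under each model and retaining the closest simulations can be sketched as follows. This is a deliberately simplified illustration and not the ms/PipeMaster workflow: the simulator is a stand-in that draws one fake summary statistic per model from a normal distribution, and all names, centres and the observed value are hypothetical.

```python
import random

MODELS = ("expansion", "constant", "bottleneck")


def toy_simulator(model, rng):
    """Stand-in for a coalescent simulator: returns one fake summary statistic.

    Assumption for illustration only: expansion tends to give negative values,
    bottleneck positive values and constant size values near zero.
    """
    centre = {"expansion": -1.5, "constant": 0.0, "bottleneck": 1.5}[model]
    return rng.gauss(centre, 1.0)


def abc_model_choice(observed, n_sims_per_model, tolerance, rng):
    """Rejection ABC: keep the closest fraction of simulations across models
    and report the share of retained simulations contributed by each model."""
    distances = []
    for model in MODELS:
        for _ in range(n_sims_per_model):
            distances.append((abs(toy_simulator(model, rng) - observed), model))
    distances.sort(key=lambda pair: pair[0])
    kept = distances[: int(len(distances) * tolerance)]
    return {m: sum(1 for _, km in kept if km == m) / len(kept) for m in MODELS}


rng = random.Random(2020)
posterior = abc_model_choice(observed=-1.6, n_sims_per_model=2000,
                             tolerance=0.05, rng=rng)
best_model = max(posterior, key=posterior.get)
print(posterior, best_model)
```

With equal numbers of simulations per model, the share of retained simulations approximates each model's posterior probability, and the ratio of two shares approximates the corresponding Bayes factor.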

The comparison of the expansion, constant size and bottleneck models and the assessment of their fit were performed with the abc package (Csilléry et al., 2012) in R under the rejection and neural network (neuralnet) algorithms, with a tolerance of 0.05. Bayes factors representing pairwise comparisons of the performance of the models, as well as the posterior probability of each model in relation to the other two, were estimated with the “postpr” command (Csilléry et al., 2010; 2012). Populations were considered to be expanding if the model with the highest posterior probability was the expansion model, and if the observed data were contained within the credibility interval (goodness of fit within the 0.1–0.9 interval) of the posterior distribution of the expansion model. Goodness of fit was assessed with the “gfit” command, run with 100 replicates (Csilléry et al., 2012). Populations that had higher posterior probability for the constant population size model or the bottleneck model, or that had observed data outside the credibility interval under the expansion model, were excluded from the co-expansion analysis. I also verified whether expanding populations had R2 lower than 0.10 using the pegas package with default settings (Paradis 2010). This cutoff, based on a study of the behavior of this statistic in summarizing multiple simulated expanding populations (Sano and Tachida, 2005), encompasses a range of R2 values found for populations that had exponential, logistic and sudden expansions. The R2 statistic, which summarizes the number of singleton mutations as well as the average number of nucleotide differences between sequences and the number of segregating sites, has been shown to be more powerful than other methods for detecting departures from neutrality (Ramos-Onsins and Rozas, 2002; Sano and Tachida, 2005). Finally, only populations that showed signals of demographic expansion under the ABC approach and which also passed at least one of the two other demographic tests (negative Spearman’s ρ for the BSP curve and/or R2 under 0.10) were included in the co-expansion analysis.
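The inclusion rule described above can be written as a small decision function. This is a sketch under stated assumptions: the function name is hypothetical, the goodness-of-fit value is treated as a single number that must fall within 0.1–0.9, and the goodness-of-fit value used in the examples is a placeholder because per-population values are not reported in the text. The ρ, R2 and posterior probabilities are taken from the rows for the named taxa in Tables 3–4.

```python
def include_in_coexpansion(posterior, gof, rho, r2,
                           gof_interval=(0.1, 0.9), r2_cutoff=0.10):
    """Return True if a population qualifies for the co-expansion analysis.

    posterior: dict with posterior probabilities for "BT", "CT" and "EX".
    gof: goodness-of-fit value under the expansion model (assumed scalar).
    rho: Spearman's coefficient between time and median Ne from the BSP.
    r2: Ramos-Onsins and Rozas' R2.
    """
    best_model = max(posterior, key=posterior.get)
    passes_abc = best_model == "EX" and gof_interval[0] <= gof <= gof_interval[1]
    # At least one of the two additional tests must agree with expansion.
    passes_other = rho < 0 or r2 < r2_cutoff
    return passes_abc and passes_other


PLACEHOLDER_GOF = 0.5  # hypothetical value, not reported in the source

automolus_paraensis = include_in_coexpansion(
    {"BT": 0.12, "CT": 0.37, "EX": 0.51}, PLACEHOLDER_GOF, rho=-0.99, r2=0.06)
phoenicircus_nigricollis = include_in_coexpansion(
    {"BT": 0.22, "CT": 0.35, "EX": 0.43}, PLACEHOLDER_GOF, rho=0.85, r2=0.31)
willisornis_poecilinotus = include_in_coexpansion(
    {"BT": 0.37, "CT": 0.45, "EX": 0.18}, PLACEHOLDER_GOF, rho=-0.71, r2=0.07)
print(automolus_paraensis, phoenicircus_nigricollis, willisornis_poecilinotus)
```

Phoenicircus nigricollis fails because neither additional test agrees with expansion, whereas Willisornis poecilinotus fails at the ABC step.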

2.4 Effects of climate on genetic diversity and magnitude of population size change

I compared the mean nucleotide diversity (pi) and haplotype diversity (H) between populations from the climatically unstable southeast and populations of their close relatives from the putatively stable northwest with paired t-tests. For these comparisons, pairs of populations of closely related species or pairs of populations of the same species were compared as ecological replicates under different climatic histories. For taxa that had more than one possible pairing (e.g. one Malacoptila fusca population from NW to be compared with three M. rufa populations from SE), a population from the largest cohort was randomly selected. I used two-tailed t-tests to compare differences between paired means of pi and H, and a one-tailed t-test to verify whether mean genetic diversity is higher in the NW than in the SE. I also compared Ne ratios for the same population pairs described previously with paired Wilcoxon tests, to verify whether the magnitude of population size change was more marked in SE than in NW Amazonia. I used the ratio between the highest and lowest values of Ne (maxNe/minNe) extracted from the median curve of the population size through time in the Bayesian Skyline Plots. The Ne values were extracted from the BSP curves with the “max” and “min” functions in R’s core package (R Core Team, 2019). Possible biases arising from the increased uncertainty in the most recent and oldest portions of the demographic reconstructions (edge effect) were assessed for a subset of five randomly selected populations (data not shown). As no bias was found, the full demographic history recovered by BSP was used for the magnitude of demographic change assessment.
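The magnitude comparison can be sketched as follows: the ratio maxNe/minNe is taken from each median skyline curve, and paired ratios are compared with an exact Wilcoxon signed-rank test. The sketch is in Python rather than R, the function names and ratio values are hypothetical, and the exact test shown assumes no zero differences and no tied absolute differences.

```python
from itertools import product


def ne_ratio(median_ne_curve):
    """Magnitude of population size change: highest over lowest median Ne."""
    return max(median_ne_curve) / min(median_ne_curve)


def wilcoxon_signed_rank(x, y):
    """Exact paired Wilcoxon signed-rank test for small samples.

    Returns V (sum of ranks of positive differences x - y), the one-tailed
    p-value for x > y, and the two-tailed p-value.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    v = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Null distribution: every rank 1..n is positive with probability 1/2.
    null_sums = [sum(r for r, keep in zip(range(1, n + 1), signs) if keep)
                 for signs in product((False, True), repeat=n)]
    p_greater = sum(s >= v for s in null_sums) / len(null_sums)
    p_less = sum(s <= v for s in null_sums) / len(null_sums)
    p_two_sided = min(1.0, 2 * min(p_greater, p_less))
    return v, p_greater, p_two_sided


# Hypothetical maxNe/minNe ratios for five SE/NW population pairs.
se_ratios = [12.0, 30.0, 8.0, 25.0, 15.0]
nw_ratios = [10.0, 11.0, 9.0, 12.0, 7.0]
v, p_one_tailed, p_two_tailed = wilcoxon_signed_rank(se_ratios, nw_ratios)
print(v, p_one_tailed, p_two_tailed)
```

The one-tailed p-value tests the directional expectation (SE greater than NW), while the two-tailed value tests for any difference, which is why the two can fall on opposite sides of a significance threshold.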

2.5 Coexpansion modelling

I implemented a hierarchical approximate Bayesian computation (hABC) to detect synchronous demographic expansion, as described in Chan et al. (2014) and expanded by Gehara et al. (2017). The hABC approach developed by Gehara et al. (2017) combines summary statistics (nucleotide and haplotype diversity, number of segregating sites and Tajima’s D) of single populations from a population assemblage in the form of hyper-summary statistics (hSS). The method accommodates variation in mutation rates, population size and sample size across taxa.

To investigate the occurrence of congruent demographic changes, genetic data were simulated under the coalescent model based on hierarchical prior sampling: the proportion of populations that expand synchronously (community congruence, ζ) was sampled from a uniform hyperprior distribution ranging from ζ=0 (idiosyncratic individual expansions) to ζ=1 (synchronous expansion for all populations). Intermediate values of ζ are proportional to the number of populations that have synchronous expansion. Similarly, the hyperparameter for time of co-expansion (Ts) was sampled from an assemblage-wide hyperprior that had a uniform distribution from 20 to 500 Kya, reflecting the findings from the BSPs. The population-specific priors of expansion time for sequence simulation are sampled conditional on the hyperparameters ζ and Ts. For example, if the co-expansion proportion sampled from the ζ prior is x and the time of co-expansion sampled from the Ts prior is y, then a proportion x of the populations in the assemblage is simulated to have expanded at y, and the expansion times for the remaining proportion 1-x of the populations are re-sampled from the Ts hyperprior in different manners depending on the co-expansion model in question (refer to following paragraph). The magnitude of expansion for each population, the mutation rate and the individual expansion times for the non-co-expanding populations are then sampled from the population-specific priors. These priors, such as mutation rates and the ratio between past (NeA) and present Ne (magnitude of population size change), had the same distribution as the ones described in the individual demography step.
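The hierarchical sampling of expansion times can be illustrated with a minimal sketch for the simplest case, in which asynchronous times are drawn from the same uniform range as Ts. The function name, the rounding of ζ to a whole number of populations and the assemblage size are assumptions made for illustration; this is not the code of Gehara et al. (2017).

```python
import random


def sample_expansion_times(n_pops, rng, t_min=20_000, t_max=500_000):
    """Draw zeta and Ts from their hyperpriors, then assign expansion times.

    A fraction zeta of the populations (rounded to a whole number) expands
    at the shared time Ts; the rest receive independent times drawn from the
    same uniform range. Times are in years before present.
    """
    zeta = rng.random()                      # uniform hyperprior on [0, 1)
    ts = rng.uniform(t_min, t_max)           # shared co-expansion time
    n_sync = round(zeta * n_pops)
    synchronous = set(rng.sample(range(n_pops), n_sync))
    times = [ts if i in synchronous else rng.uniform(t_min, t_max)
             for i in range(n_pops)]
    return zeta, ts, times


# Hypothetical assemblage of 22 populations.
zeta, ts, times = sample_expansion_times(22, random.Random(42))
n_at_ts = sum(t == ts for t in times)
print(f"zeta = {zeta:.2f}, Ts = {ts:.0f}, populations expanding at Ts = {n_at_ts}")
```

Each call represents one draw from the hierarchical prior; in an hABC run, sequence data would then be simulated for every population at its assigned time.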

The method developed by Gehara et al. (2017) implements four models with different strategies for sampling individual time of expansion (tn) from the population-specific priors: the original no-time-buffer model (CH) from Chan et al. (2014), where Ts and tn are sampled from the same uniform prior; the threshold model (TH), where a buffer (n) is added to the time prior, forcing Ts and tn to be n years apart for a better representation of scenarios with intermediate ζ values; the partitioned time model (PT), where tn is sampled from a number of partitions of the time prior corresponding to the number of populations in the assemblage, for a more accurate representation of asynchronous expansions; and the narrow co-expansion time model (NCT), where a narrow threshold is added to the time buffer (n) for testing specific hypotheses of co-expansion in response to a known event. The method also estimates two ancillary hyperparameters: the mean time of all demographic changes (Et) and the dispersion of all times around the mean (DI) (Chan et al. 2014, Gehara et al. 2017), the priors of which depend on the ζ and Ts values. For population assemblages with synchronous expansion, it is expected that the posterior estimate of Et will approach the value of Ts, and DI will have a low median.
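The time buffer of the threshold model and the two ancillary hyperparameters can be sketched as below. Two assumptions are made and should be checked against the original method: the buffer is enforced by re-drawing tn until it lies at least n years from the co-expansion time, and DI is computed as the variance of the expansion times divided by their mean. Function names and values are hypothetical.

```python
import random
from statistics import mean, pvariance


def sample_tn_with_buffer(ts, buffer, rng, t_min=20_000, t_max=500_000):
    """Draw an individual expansion time at least `buffer` years from ts."""
    while True:
        tn = rng.uniform(t_min, t_max)
        if abs(tn - ts) >= buffer:
            return tn


def ancillary_hyperparameters(times):
    """Return (Et, DI): mean expansion time and variance-to-mean ratio."""
    et = mean(times)
    di = pvariance(times) / et
    return et, di


rng = random.Random(11)
ts = 100_000  # hypothetical co-expansion time in years before present
asynchronous = [sample_tn_with_buffer(ts, 40_000, rng) for _ in range(5)]

# Fully synchronous assemblage: all ten populations expand at ts.
et_sync, di_sync = ancillary_hyperparameters([ts] * 10)
# Mixed assemblage: five populations at ts, five at buffered times.
et_mixed, di_mixed = ancillary_hyperparameters([ts] * 5 + asynchronous)
print(et_sync, di_sync, round(et_mixed), round(di_mixed))
```

In the fully synchronous case Et equals the co-expansion time and DI is zero, matching the expectation stated above for assemblages with synchronous expansion.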

The four models above were implemented and their fit to the data was assessed with the abc package (Csilléry et al., 2012) in R. The population priors (generation time, substitution rate and population size) were the same as those used for individual demographic history inference. The prior representing NeA was expressed as the ratio of ancestral to current effective population size (NeA/Ne), which had a uniform distribution ranging from 0.001 to 0.1. The time buffer used for the TH model was 40 Ky (Gehara et al. 2017), and the narrow co-expansion prior for the NCT model ranged from 20 to 200 Kya. I ran 2.5 million simulations for each model. Posteriors were inferred from the 1000 simulations retained with a tolerance of 0.0004 under the rejection and the neural network algorithms. Finally, I compared the relative performance of the four co-expansion models with the postpr function in the abc package, using the same algorithms and tolerance as described for previous steps to estimate each model’s posterior probability and Bayes factors for pairwise model comparisons.

2.6 Simulations for method validation

I simulated 10 population assemblages to test the accuracy of the models, each containing 22 populations (Table 2). The population alignments for the assemblages were simulated with the software NetRecodon v.6.0 (Arenas and Posada, 2010), using the number of samples per population and generation time per taxon matching the SE empirical dataset (Table 1, Table 2). Southeast Amazonia was arbitrarily chosen as the reference dataset for creating simulated assemblages, and this choice has no implication for downstream analysis. Half of the populations in each assemblage were simulated to have increased in size 40 Kya, while the other half expanded at idiosyncratic times (ζ=0.5) randomly chosen from 50 to 1000 Kya. This time threshold was chosen to investigate whether this method is adequate for detecting recent expansion, and it reflects the onset of the increase in precipitation in NW, which reached its peak around 25 Kya (Cheng et al., 2013). The populations were randomly selected to be part of either the synchronous or asynchronous expansion. I chose this mid-value of simultaneous expansion to verify whether the hABC approach would satisfactorily recover low ζ values when simultaneous expansion is not occurring for the majority of the populations. Therefore, this approach assessed overestimation of community congruence. The population-size change ranged from a 10- to 20-fold increase, based on the range of increase observed in the Bayesian Skyline Plots. This magnitude of increase, ranging from 0.04 to 0.08 in NeA/Ne ratio, is also a close range around the average (0.05) of the prior for magnitude of population-size change from the ABC steps (uniform distribution ranging from 0.001 to 0.1). Increase in population size happened over a time period of 5 Ky, representing the rate of transition between climate phases (e.g. Lisiecki et al., 2008). The substitution rate used was 1.1e-8 substitutions/site/yr, and generation time was one year for all populations.

I assessed co-expansion for the simulated population assemblages implementing the methods described previously. I estimated ζ and Ts for each assemblage with hABC under the original co-expansion model with no time buffer from Chan et al. (2014) implemented in Gehara et al. (2017). I ran one million simulations for each assemblage, resulting in 10 million simulations, for the hABC approach. The posteriors were built under a tolerance of 0.001, collecting the 1000 closest simulations.

| SimPopulation | Samples | Base pairs | Ne | NeA | tn |
|---|---|---|---|---|---|
| 1* | 11 | 891 | 400000 | 20000 | 40000 |
| 2 | 27 | 2159 | 500000 | 40000 | 670000 |
| 3* | 10 | 1216 | 600000 | 40000 | 40000 |
| 4* | 11 | 1216 | 600000 | 40000 | 40000 |
| 5* | 10 | 1720 | 600000 | 40000 | 40000 |
| 6 | 15 | 984 | 600000 | 40000 | 650000 |
| 7 | 9 | 1560 | 600000 | 40000 | 450000 |
| 8* | 24 | 934 | 800000 | 40000 | 40000 |
| 9* | 9 | 944 | 800000 | 40000 | 40000 |
| 10* | 9 | 1871 | 800000 | 40000 | 40000 |
| 11* | 11 | 969 | 800000 | 40000 | 40000 |
| 12* | 13 | 1494 | 800000 | 40000 | 40000 |
| 13 | 17 | 961 | 800000 | 40000 | 250000 |
| 14 | 52 | 948 | 800000 | 40000 | 350000 |
| 15 | 32 | 993 | 800000 | 40000 | 530000 |
| 16* | 10 | 1216 | 800000 | 50000 | 40000 |
| 17 | 11 | 1498 | 800000 | 50000 | 90000 |
| 18 | 12 | 1871 | 800000 | 50000 | 120000 |
| 19 | 17 | 1976 | 1000000 | 40000 | 300000 |
| 20* | 18 | 960 | 1000000 | 50000 | 40000 |
| 21 | 45 | 962 | 1000000 | 50000 | 100000 |
| 22 | 93 | 987 | 1000000 | 50000 | 600000 |

Table 2. Settings of NetRecodon for the simulations of each single population of the simulated population assemblages. Each line represents one simulated population. Ne is the effective population size since increase to present; NeA is the ancestral effective population size before increase; tn is time of population expansion (years ago). Populations marked with an asterisk (*) were simulated to be co-expanding 40 Kya.
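As a quick arithmetic check, the NeA/Ne ratios implied by the distinct combinations of Ne and NeA listed in Table 2 can be computed directly. The sketch below also computes the midpoint of a uniform prior from 0.001 to 0.1, assuming the uniform prior described for the individual demography step; variable names are introduced here for illustration.

```python
# Distinct (Ne, NeA) combinations listed in Table 2.
size_pairs = [
    (400_000, 20_000),
    (500_000, 40_000),
    (600_000, 40_000),
    (800_000, 40_000),
    (800_000, 50_000),
    (1_000_000, 40_000),
    (1_000_000, 50_000),
]

# Ratio of ancestral to current effective population size for each pair.
ratios = [nea / ne for ne, nea in size_pairs]
lowest, highest = min(ratios), max(ratios)

# Midpoint of a uniform prior on NeA/Ne ranging from 0.001 to 0.1.
prior_midpoint = (0.001 + 0.1) / 2

print(f"NeA/Ne range: {lowest:.2f} to {highest:.2f}")
print(f"prior midpoint: {prior_midpoint:.4f}")
```

The simulated ratios bracket the prior midpoint, so the simulated magnitudes sit in the central part of the prior rather than at its edges.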

3. Results

3.1 Demographic history inference

Out of datasets with samples of each taxon from the full range across Amazonia, a total of 26 and 24 population clusters were found to be restricted to the southeastern and northwestern sub-regions, respectively (Tables 3–4). The vast majority of populations in both sub-regions showed signals of past population expansion. Of all populations identified, 24 populations in SE and 23 populations in NW had the best fit for the expansion model in relation to the constant or bottleneck models in the ABC approach (Tables 3–4, Fig. 2). Phoenicircus nigricollis did not pass the additional criteria (R2 < 0.1 and ρ < 0) to be included in the expanding population group other than the ABC model test, so it was excluded from the co-expansion analysis of the NW assemblage. Most of the expanding populations under the ABC model also showed signals of expansion under both other analyses: negative Spearman’s correlation coefficient (ρ < 0) for the BSP curves and R2 under 0.10. Populations that did not have higher posterior probability under the expansion model were Willisornis poecilinotus, Xenops minutus from the Xingu region, and the population of Glyphorynchus spirurus from the Jaú region (Tables 3–4). Many populations identified to have expanded in their demographic histories based on ABC and summary statistics (Tables 3–4) had Bayes factors higher than 3 (the threshold for positive support for a model in relation to an alternative model; Raftery, 1995) in comparison with the constant size and bottleneck models (Fig. 2).


Figure 2. Bayes factor comparison between three demographic models for populations in SE Amazonia (left panel) and NW (right). The first two are ratios between the expansion (ex) model and the other two (bottleneck and constant size), the second pair are ratios comparing the constant size model, and the last two compare the bottleneck model. Taxa marked with green symbols were considered to have expanded (full circle passed 3 criteria, empty circle passed ABC and BSP cutoff). Red triangles mean populations were not expanding: full triangle means populations met none of the criteria, hashed lines mean population met all criteria but ABC, and empty triangle means only passed ABC. Details in methods.


| Population | ρ | TD | R2 | BT | CT | EX |
|---|---|---|---|---|---|---|
| Automolus paraensis | -0.99 | -1.86 | 0.06 | 0.12 | 0.37 | 0.51 |
| Dendrocincla fuliginosa | -0.99 | -0.69 | 0.09 | 0.13 | 0.32 | 0.54 |
| Galbula cyanicollis | -0.99 | -1.11 | 0.08 | 0.23 | 0.31 | 0.46 |
| Glyphorynchus spirurus 1 | -0.99 | -2.49 | 0.02 | 0.23 | 0.31 | 0.46 |
| Glyphorynchus spirurus 2 | -0.99 | -0.67 | 0.11 | 0.13 | 0.35 | 0.52 |
| Glyphorynchus spirurus 3 | -0.96 | -1.85 | 0.08 | 0.18 | 0.29 | 0.53 |
| Hylophylax naevius | -0.94 | -1.52 | 0.16 | 0.21 | 0.30 | 0.49 |
| Lepidothrix iris eucephala | -0.96 | -1.42 | 0.12 | 0.15 | 0.21 | 0.64 |
| Lepidothrix vilasboasi | -0.99 | -1.57 | 0.06 | 0.13 | 0.24 | 0.63 |
| Malacoptila rufa 1 | -0.99 | -1.46 | 0.09 | 0.22 | 0.35 | 0.43 |
| Malacoptila rufa 2 | -0.20 | -1.65 | 0.10 | 0.24 | 0.29 | 0.46 |
| Malacoptila rufa 3 | -0.98 | -2.04 | 0.12 | 0.18 | 0.27 | 0.55 |
| Microcerculus marginatus | -0.96 | 0.34 | 0.16 | 0.21 | 0.38 | 0.41 |
| Myrmotherula axillaris 1 | -0.99 | -1.88 | 0.05 | 0.18 | 0.30 | 0.52 |
| Myrmotherula axillaris 2 | -0.98 | -1.54 | 0.13 | 0.22 | 0.30 | 0.48 |
| Myrmeciza hemimelaena | -0.85 | -1.51 | 0.10 | 0.14 | 0.29 | 0.57 |
| Psophia dextralis | -0.99 | -2.12 | 0.06 | 0.19 | 0.31 | 0.51 |
| Psophia interjecta | -0.96 | -1.88 | 0.15 | 0.15 | 0.23 | 0.62 |
| Pyriglena leuconota 1 | -0.99 | -1.70 | 0.06 | 0.21 | 0.31 | 0.49 |
| Pyriglena leuconota 2 | -0.99 | -2.34 | 0.04 | 0.19 | 0.24 | 0.57 |
| Rhegmatorhina gymnops | -0.99 | -1.56 | 0.06 | 0.22 | 0.25 | 0.53 |
| Synallaxis rutilans | -0.99 | -2.05 | 0.10 | 0.22 | 0.36 | 0.43 |
| Willisornis poecilinotus* | -0.71 | -1.39 | 0.07 | 0.37 | 0.45 | 0.18 |
| Xenops minutus 1 | -0.85 | -0.42 | 0.11 | 0.25 | 0.33 | 0.42 |
| Xenops minutus 2* | 0.46 | -0.23 | 0.20 | 0.21 | 0.46 | 0.33 |
| Xiphorhynchus guttatus | -0.99 | -1.57 | 0.12 | 0.14 | 0.28 | 0.58 |

Table 3. Values of Spearman’s correlation coefficient (ρ), Tajima’s D (TD) and R2 for the populations from Southeast Amazonia. The last three columns show the posterior probability of the demographic models (BT: bottleneck; CT: constant size; EX: expansion) using the rejection algorithm in an Approximate Bayesian Computation analysis. Marked populations (*) were not included in the co-expansion analysis.

| Taxon | ρ | TD | R2 | BT | CT | EX |
|---|---|---|---|---|---|---|
| Automolus badius 1 | -0.98 | -1.93 | 0.05 | 0.18 | 0.20 | 0.63 |
| A. badius 2 | -0.94 | -1.54 | 0.10 | 0.12 | 0.16 | 0.75 |
| A. infuscatus | -0.99 | -0.92 | 0.10 | 0.22 | 0.26 | 0.52 |
| A. subulatus | -0.99 | -0.62 | 0.12 | 0.22 | 0.28 | 0.51 |
| A. ochroptera turdinus | -0.40 | 0.35 | 0.17 | 0.27 | 0.35 | 0.38 |
| Dendrocincla fuliginosa | -0.99 | -1.47 | 0.06 | 0.19 | 0.33 | 0.49 |
| Galbula albirostris | -0.99 | -1.20 | 0.08 | 0.25 | 0.32 | 0.43 |
| Glyphorynchus spirurus - Japurá | -0.98 | -1.11 | 0.10 | 0.25 | 0.30 | 0.45 |
| G. spirurus - Sopé | -0.99 | -0.87 | 0.10 | 0.22 | 0.29 | 0.49 |
| G. spirurus 3 - Jaú* | 0.96 | -1.99 | 0.20 | 0.33 | 0.48 | 0.19 |
| G. spirurus 4 - Marañon | -0.99 | -0.43 | 0.12 | 0.25 | 0.32 | 0.44 |
| Lepidothrix coronata | -0.99 | -1.03 | 0.09 | 0.23 | 0.29 | 0.48 |
| Malacoptila fusca | -0.99 | -0.97 | 0.10 | 0.22 | 0.28 | 0.50 |
| Microcerculus marginatus | -0.99 | -1.21 | 0.10 | 0.22 | 0.33 | 0.45 |
| Myrmotherula axillaris | -0.99 | 0.09 | 0.12 | 0.23 | 0.37 | 0.40 |
| Phoenicircus nigricollis* | 0.85 | -1.09 | 0.31 | 0.22 | 0.35 | 0.43 |
| Psophia napensis | -0.95 | -1.22 | 0.11 | 0.12 | 0.17 | 0.71 |
| P. ochroptera | -0.93 | 0.63 | 0.19 | 0.30 | 0.33 | 0.38 |
| Rhegmatorhina cristata | -0.99 | -0.31 | 0.16 | 0.24 | 0.30 | 0.46 |
| R. melanosticta | -0.98 | -1.38 | 0.15 | 0.25 | 0.32 | 0.43 |
| Schiffornis turdina | -0.98 | -1.58 | 0.07 | 0.21 | 0.33 | 0.46 |
| Sclerurus rufigularis | -0.89 | -2.22 | 0.10 | 0.22 | 0.25 | 0.53 |
| Synallaxis rutilans | -0.23 | -1.70 | 0.16 | 0.16 | 0.38 | 0.46 |
| Xenops minutus | -0.99 | -0.53 | 0.13 | 0.25 | 0.31 | 0.45 |

Table 4. Values of Spearman’s correlation coefficient (ρ), Tajima’s D (TD) and R2 for the populations from Northwest Amazonia. The last three columns show the posterior probability of the demographic models (BT: bottleneck; CT: constant size; EX: expansion) using the rejection algorithm in an Approximate Bayesian Computation analysis. Marked populations (*) were not included in the co-expansion analysis.


3.2 Pairwise comparisons of summary statistics between regions

There was no difference in genetic diversity between NW and SE Amazonia. There were 13 population pairs with a comparable, closely related taxon in both sub-regions for performing the pairwise comparisons of summary statistics between regions (Fig. 3): Automolus infuscatus and A. paraensis; Dendrocincla fuliginosa; Galbula cyanicollis and G. albirostris; two pairs of Glyphorynchus spirurus; Lepidothrix coronata and L. iris; Malacoptila rufa and M. fusca; Myrmotherula axillaris; Psophia dextralis and P. napensis; Psophia interjecta and P. ochroptera; Rhegmatorhina gymnops and R. melanosticta; Synallaxis rutilans; and Xenops minutus. There was no significant difference between mean nucleotide diversity (Fig. 3A, pi two-tailed t-test: t(12)=-0.031, p=0.98) or haplotype diversity (Fig. 3B, H two-tailed t-test: t(12)=-0.24, p=0.81) of populations occurring in the two regions. Hence, the hypothesis that mean genetic diversity is greater in the putatively stable NW than in the SE was not supported (pi one-tailed t-test p=0.51; H one-tailed t-test p=0.59).

Populations from the SE underwent a more pronounced increase in population size than populations from the NW. The ratio of the largest to the smallest Ne was higher in SE than in NW (Fig. 3C, p=0.04, V=71, one-tailed Wilcoxon signed rank test), but the two-tailed test did not show a difference between the two regions that was significantly different from 0 (p=0.08, V=71, two-tailed Wilcoxon test).


Figure 3. Pairwise comparison of nucleotide diversity (A), haplotype diversity (B) and population size change (C) in Southeastern (orange) and Northwestern (blue) Amazonia.

3.3 Co-expansion modelling

The assemblages of southeastern and northwestern Amazonia each had a high proportion of populations co-expanding in the late Pleistocene. The behavior of the two population assemblages was similar under the hABC approach: co-expansion indices (ζ) were high, with 82% to 98% of the populations expanding simultaneously, and co-expansion time (Ts) spanned from 90 ky to 130 ky under all models (Table 5). Similarly, the Narrow Co-expansion Time (NCT) model was the one with the best fit and highest posterior probability in both regions (Table 5). Although the NCT model also had the highest Bayes factor ratio in pairwise comparisons with other models, the ratio was below 3 for all comparisons. This suggests that the models resulted in similar joint demographic histories, which is also visible in the Principal Component Analysis plotted with the 1000 retained posterior simulations for the NCT, TH and PT models (SFig. 1). The rejection algorithm resulted in the most similar posterior parameter estimates across models (Table 5). The differences in mode estimates of parameters under the rejection and neuralnet algorithms were small and did not change the overall interpretation of results. As the neuralnet algorithm more accurately recovered the demographic histories of the simulated population assemblages (following section), I will focus on results from this approach.
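For readers unfamiliar with the rejection step mentioned above, the following is a generic sketch of simple rejection in approximate Bayesian computation. It is not the hABC pipeline used in this study: the one-parameter simulator, the uniform prior, and all names and values are hypothetical and chosen only to show how simulations closest to the observed summary statistic are retained to form the posterior sample.

```python
import random


def simulate_summary(theta, rng):
    """Toy simulator: the summary statistic is theta plus Gaussian noise."""
    return theta + rng.gauss(0.0, 0.1)


def abc_rejection(observed, n_sims, n_keep, rng):
    """Draw parameters from the prior and keep those whose simulated
    summary statistic lies closest to the observed one."""
    draws = []
    for _ in range(n_sims):
        theta = rng.uniform(0.0, 10.0)  # hypothetical uniform prior
        distance = abs(simulate_summary(theta, rng) - observed)
        draws.append((distance, theta))
    draws.sort()  # smallest distance first
    return [theta for _, theta in draws[:n_keep]]


rng = random.Random(1)
posterior = abc_rejection(observed=5.0, n_sims=5000, n_keep=100, rng=rng)
posterior_mean = sum(posterior) / len(posterior)
print(f"retained {len(posterior)} draws, posterior mean = {posterior_mean:.2f}")
```

Regression-based variants such as the neuralnet algorithm start from the same retained simulations and then adjust the retained parameter values according to the relationship between parameters and summary statistics.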

Region Northwestern Amazonia                           Southeastern Amazonia
Model  Rejection            Neuralnet            GoF   Rejection            Neuralnet            GoF

Co-expansion index (ζ)
CH     0.98(0.44–1)         0.98(0.80–1)         0.07  0.98(0.35–1)         0.87(0.72–0.91)      0.04
TH     0.98(0.48–1)         0.89(0.65–0.95)      0.09  0.98(0.43–1)         0.82(0.71–0.90)      0.13
PT     0.99(0.52–1)         0.94(0.65–1)         0.19  0.94(0.13–1)         0.94(0.71–1)         0.12
NCT    0.98(0.52–1)         0.92(0.73–0.96)      0.22  0.93(0.09–1)         0.88(0.66–0.95)      0.19

Co-expansion time (Kya)
CH     116.6(74.4–215.2)    111.4(88.3–141.2)          109.6(65.6–243.4)    94.2(75.1–120)
TH     120(73–221.1)        101.8(72.4–148.2)          116.3(64.3–230.1)    79.4(51.1–125.1)
PT     105.9(74.3–182.0)    102.3(74.1–148.7)          121.3(56.4–277)      101.8(80.4–139.8)
NCT    105.9(74.3–181.9)    94.8(74–131.2)             117.4(52.6–300.7)    100.6(80.5–139.4)

Expansion time (Kya)
CH     140.3(93.8–321)      121.6(96.5–184.3)          140.8(84.9–346.5)    154.9(136.2–205.7)
TH     151.7(88–341.2)      161.4(127.8–262.3)         136.2(79.7–358)      154.5(129.7–224.5)
PT     162.2(89.0–303.6)    143.8(107.0–248.9)         146.5(91–260.2)      111.7(90.1–153.4)
NCT    162.2(89–303.6)      151.0(117.8–212.8)         148.3(91–262.2)      119(103.1–161.1)

Dispersion Index (Kya)
CH     5(0–310.9)           10.2(48–243.3)             5.2(0–330)           71.5(11.1–302.6)
TH     4.8(0–305.4)         121.4(65.2–354.2)          3.8(0–337.3)         132.2(81.5–351.7)
PT     5.0(0–328.3)         71.5(13.2–333.3)           3.3(0–121.9)         23.7(-17.4–74)
NCT    5(0–328.3)           72.2(34.4–300)             4.2(0–122)           34.2(-20.7–99.6)

Table 3. Results of co-expansion simulations for northwestern and southeastern Amazonia. CH corresponds to the model from Chan et al. (2014). Threshold (TH), partitioned time (PT) and narrow co-expansion (NCT) correspond to models with different strategies for sampling from the time prior (Gehara et al. 2017). The mode and credible interval of the posterior are presented under the rejection and non-linear regression (neuralnet) algorithms. GoF: goodness of fit.

The population expansions that took place in SE Amazonia are better represented by a simultaneous expansion model than those of the NW, although the proportion of synchronous expansion was high in the NW as well. A narrow DI posterior distribution is a better representation of a co-expansion scenario (Gehara et al. 2017). The dispersion index (DI) of the SE assemblage was overall lower, and its posterior had a narrower range than that of the NW. The goodness of fit for the no-buffer model (CH) was the lowest for both SE and NW (Table 5), so the results for this model will not be discussed further. The values for ζ varied little across models for the NW assemblage, ranging from 0.91 to 0.94, compared with the 0.82–0.94 range of the SE (Table 5, Fig. 4–5, A).
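To make the hyperparameters discussed above more concrete, the sketch below computes them for a hypothetical set of expansion times. It assumes that the dispersion index is the variance of expansion times divided by their mean and uses the population variance; both choices, along with the function name and the example values, are illustrative assumptions rather than a description of the software used here.

```python
from statistics import fmean, pvariance


def coexpansion_summaries(expansion_times, coexpansion_time):
    """Summarise an assemblage's expansion times (in years).

    zeta: proportion of populations expanding at the shared time
    mean_time: average expansion time across populations
    dispersion: variance of expansion times divided by their mean (assumed definition)
    """
    n_sync = sum(1 for t in expansion_times if t == coexpansion_time)
    zeta = n_sync / len(expansion_times)
    mean_time = fmean(expansion_times)
    dispersion = pvariance(expansion_times) / mean_time
    return zeta, mean_time, dispersion


# Hypothetical assemblage: 8 populations expand together at 100 kya, 2 expand earlier.
times = [100_000.0] * 8 + [150_000.0, 200_000.0]
zeta, mean_time, dispersion = coexpansion_summaries(times, coexpansion_time=100_000.0)
print(f"zeta = {zeta:.2f}, mean = {mean_time:.0f} years, DI = {dispersion:.0f} years")
```

Under this definition a fully synchronous assemblage has a dispersion index of zero, so a posterior concentrated near low values is what a co-expansion scenario would be expected to produce.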

Figure 4. Posterior distribution of four co-expansion hyperparameters estimated for 22 bird populations in Northwestern Amazonia: the proportion of simultaneous change in the assemblage (zeta, A), the time of synchronous expansion (Ts, B), the average time of demographic change (tn, C) and the variation of expansion time in relation to the mean (dispersion index, D). The posteriors shown were built under the Threshold model (TH, green), the Partitioned Time model (PT, red), and the Narrow Co-expansion Time model (NCT, blue). The dashed lines represent the mode of each posterior distribution. The gray curve is the prior distribution for each parameter.

Figure 5. Posterior distribution of four co-expansion hyperparameters estimated for 24 bird populations in Southeastern Amazonia: the proportion of simultaneous change in the assemblage (zeta, A), the time of synchronous expansion (Ts, B), the average time of demographic change (tn, C) and the variation of expansion time in relation to the mean (dispersion index, D). The posteriors shown were inferred under the Threshold model (TH, green), the Partitioned Time model (PT, red), and the Narrow Co-expansion Time model (NCT, blue). The dashed lines represent the mode of each posterior distribution. The gray curve is the prior distribution for each parameter.

3.4 Synchronicity estimation assessment

The hABC approach performed satisfactorily in estimating synchronicity and time of co-expansion for the simulated assemblages. The credibility interval of the ζ posterior in all 10 simulations included the pseudo-observed value of 0.5, and in most cases the interval did not include ζ over 0.7 (Fig. 6, Table 6). The pseudo-observed ζ and Ts were recovered best by the neuralnet algorithm: modes of the two parameters fluctuated close to ζ at 0.5 and Ts at 40 kya (Fig. 6). The simple rejection algorithm tended to overestimate the mode for co-expansion time. Modes estimated with the simple rejection algorithm greatly underestimated synchronicity, with most values close to ζ of 0.1. This pattern is expected, as the neuralnet approach is better suited to handling models with a large number of summary statistics (Blum and François, 2010) such as the hABC implemented here (16 joint summary statistics, Gehara et al. 2017).

Figure 6. Mode and credibility interval of the hyperparameters community congruence (zeta, A) and co-expansion time (Ts, B) for 10 simulated datasets with the neural network (green) and simple rejection (maroon) algorithms. The dashed red line represents the known hyperparameter value.

Rejection
POD  ζ                  Ts               tn              DI
1    0.09 (0.05–0.64)   92 (25–194)      465 (263–601)   197 (117–362)
2    0.10 (0.05–0.64)   157 (25–195)     453 (276–597)   183 (111–358)
3    0.09 (0.05–0.64)   76 (24–196)      452 (201–596)   188 (76–344)
4    0.17 (0.05–0.64)   67 (28–195)      449 (264–577)   180 (123–346)
5    0.09 (0.05–0.64)   151 (26–195)     501 (287–604)   183 (121–351)
6    0.10 (0.05–0.59)   72 (29–196)      472 (299–323)   183 (116–598)
7    0.10 (0.05–0.64)   79 (26–195)      478 (279–597)   194 (120–349)
8    0.11 (0.05–0.64)   72 (26–194)      476 (299–603)   201 (119–343)
9    0.10 (0.05–0.64)   48 (25–195)      466 (281–606)   190 (119–372)
10   0.09 (0.05–0.55)   58 (24–194)      492 (316–599)   177 (114–332)

Neuralnet
POD  ζ                  Ts               tn              DI
1    0.50 (0.43–0.67)   65 (61–77)       290 (195–413)   342 (257–419)
2    0.41 (0.25–0.69)   122 (39–166)     324 (229–443)   299 (242–358)
3    0.29 (0.22–0.50)   45 (23–92)       419 (368–478)   247 (193–348)
4    0.69 (0.59–0.88)   36 (13–53)       185 (119–300)   384 (335–458)
5    0.41 (0.27–0.71)   30 (-7–102)      288 (191–426)   332 (266–421)
6    0.33 (0.27–0.45)   61 (44–72)       403 (332–484)   251 (167–326)
7    0.50 (0.42–0.65)   103 (57–143)     287 (221–397)   295 (222–389)
8    0.59 (0.43–0.84)   40 (-0–99)       264 (186–343)   367 (285–439)
9    0.58 (0.46–0.97)   -67 (-125–-26)   254 (152–345)   404 (345–490)
10   0.53 (0.40–0.84)   51 (28–87)       292 (163–401)   358 (294–441)

Table 4. Results of co-expansion analysis for ten simulated populations (pseudo-observed data, POD) under the model from Chan et al. (2014). The mode and credible interval of the posterior are presented under the rejection (top panel), and non-linear regression (neuralnet, lower panel) algorithms. ζ: co-expansion index; Ts: time of co-expansion; tn: average time of individual expansions; DI: dispersion index.
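One way to summarise a table of this kind is to tally how many credible intervals contain the pseudo-observed value and how wide the intervals are. The sketch below does this for the ζ intervals transcribed from Table 4, treating interval bounds as inclusive; the helper names are hypothetical and the tally is offered only as an illustration of the assessment, not as part of the original analysis.

```python
# ζ credible intervals transcribed from Table 4 as (lower, upper), PODs 1 to 10.
rejection = [(0.05, 0.64), (0.05, 0.64), (0.05, 0.64), (0.05, 0.64), (0.05, 0.64),
             (0.05, 0.59), (0.05, 0.64), (0.05, 0.64), (0.05, 0.64), (0.05, 0.55)]
neuralnet = [(0.43, 0.67), (0.25, 0.69), (0.22, 0.50), (0.59, 0.88), (0.27, 0.71),
             (0.27, 0.45), (0.42, 0.65), (0.43, 0.84), (0.46, 0.97), (0.40, 0.84)]

TRUE_ZETA = 0.5  # pseudo-observed value used to simulate the datasets


def coverage(intervals, true_value):
    """Number of intervals whose bounds (inclusive) contain the true value."""
    return sum(1 for low, high in intervals if low <= true_value <= high)


def mean_width(intervals):
    """Average interval width, a rough measure of posterior precision."""
    return sum(high - low for low, high in intervals) / len(intervals)


for name, intervals in (("rejection", rejection), ("neuralnet", neuralnet)):
    print(name, coverage(intervals, TRUE_ZETA), round(mean_width(intervals), 2))
```

Reading coverage and width together helps separate an algorithm that contains the true value because its intervals are wide from one that places narrow intervals close to it.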

4. Discussion

I investigated the demographic history of Amazonian bird populations occurring in two regions with contrasting climate histories and found a convergent temporal expansion pattern across the populations surveyed. These findings suggest that bird populations across Amazonia underwent bouts of concerted population expansion, likely in response to environmental changes occurring at a region-wide scale during the onset of the last interglacial (100 Kya, LIG). Despite the general assemblage-wide response across the region, northwestern Amazonia maintained a more stable environment that partially buffered its populations from the effects of environmental change at a broad spatial scale.

This study employed one of the largest datasets comparing demographic histories to date, along with the study of global bat demography (Carstens et al., 2018), tetrapods in the Nearctic (Burbrink et al., 2016), birds in Australia (Chan et al., 2014) and frogs and lizards in the Caatinga (Gehara et al., 2017). These efforts represent an important shift towards a more comprehensive reconstruction of the relationship between past landscape and community-wide demographic histories.

4.1 Genetic diversity and magnitude of population size change across Amazonia

Environmental stability over evolutionary time leads to the accumulation of genetic diversity, likely due to an increase in survival, maintenance of larger populations and consequent preservation of a higher variety of haplotypes, while environmental instability results in the loss of genetic diversity (Banks et al., 2013; Beheregaray et al., 2003; Carnaval et al., 2009). I expected NW Amazonia to have higher genetic diversity than SE due to the former’s less pronounced changes in precipitation over the past ~150 Ky. Surprisingly, however, there were no differences in the amount of genetic diversity between southeastern and northwestern populations. These findings suggest that the two regions had comparable environmental suitability for terra-firme forest birds in the mid–late Pleistocene.

There are multiple lines of evidence that SE Amazonia had past changes in climate and vegetation cover (e.g. Reis et al., 2017; Wang et al., 2017). The only palynological record within the northwestern Amazonian population assemblage region has a low temporal resolution, spanning back to the mid-Holocene. This record showed evidence of tropical rainforest throughout, although it captured the signature of a sharp decrease in warm-temperate forest elements at the beginning of the record period, which suggests that the period directly preceding the record was cooler (~10 kya, Marchant et al., 2006). A palynological record in Central Amazonia, on the left bank of the Negro river (Lake Pata, Fig. 1), similarly registered the temporal occurrence of cold-adapted taxa of Andean origin during the past 60 ky (D’Apolito et al., 2017, but see Nascimento et al., 2019). Factors other than climate cycles cannot be ruled out as the drivers of landscape change in the past 100 ky in NW Amazonia (D’Apolito et al., 2017; Pupim et al., 2019). The changes in the palynological profile at Six Lakes were heterogeneous. Although the presence of cold-adapted taxa around 70 kya was likely climate driven, an increase in the occurrence of palms between 120 and 70 kya was probably related to edaphic changes (D’Apolito et al., 2019). Furthermore, gradual expansions of forest in north-central Amazonia over the past 250 ky are related to increased river channel stability (incision), as opposed to periods of channel instability and extensive floodplain areas (aggradation, Pupim et al. 2019). Although the landscape changes that took place in NW Amazonia remain elusive, the record of climate fluctuation in the region (Cheng et al., 2013), mid-Holocene changes in vegetation (Marchant et al. 2006), along with long-term vegetation changes in neighboring regions (D’Apolito et al., 2017), suggest that the unexpected genetic diversity deficit in the NW could be a result of a dynamic landscape history, albeit one whose nature likely differed from that of SE Amazonia.

Another explanation for a lack of difference between mitochondrial DNA diversity across the two focal regions is that selection pressure could be constraining the accumulation of genetic diversity in the populations (Ribeiro et al., 2011; Shen et al., 2009). If this were the case, broad mitonuclear discordance would also be expected (e.g. Morales et al., 2015). Mitonuclear incompatibilities are not widespread for Amazonian birds (see studies in Table 1, with the exception of Ferreira et al. 2018), so this explanation is unlikely. Studies comparing neutral diversity in nuclear genes will be important for assessing variation in the accumulation or loss of genetic diversity across Amazonia, and whether these parallel the east-west biodiversity and precipitation gradient.

Populations in southeast Amazonia had a greater magnitude of population size change than populations in northwest Amazonia. Demographic reconstructions based on genome-wide variants have also provided evidence for more marked demographic change in southeastern populations than in western populations of Xenops minutus, one of our focal taxa (Harvey and Brumfield, 2015). These results suggest that there was a greater decrease in habitat availability in SE than in NW Amazonia during periods of lower habitat suitability. This is consistent with the climate models that propose more severe decreases in rainfall in eastern Amazonia than in the west (Cheng et al., 2013; Wang et al., 2017). The larger magnitude of population size change and the environmental instability it implies for SE Amazonia suggest that the biodiversity gradient increasing from eastern to western Amazonia may be related to an increased loss of lineages in the east, and maintenance of lineages in the west. This pattern recapitulates the accumulation of younger lineages in eastern Amazonia in relation to the west, and the correlation between the biodiversity and moisture gradients in the region (Silva et al., 2019). Although the environmental changes in SE were likely more prominent than in the NW, as reflected by ratios of population size change, demographic histories from both regions suggest that environmental conditions were not stable in either region during the late Pleistocene.

4.2 Synchronicity and timing of expansion across Amazonia

I compared two regions with putatively different climate histories and expected to find contrasting demographic histories in these regions: the populations in climatically stable NW Amazonia were expected to have idiosyncratic demographic patterns due to the lack of a major precipitation change in the region of the co-occurring populations. Therefore, community congruence (ζ) would be low for this assemblage. In contrast, I expected that only the assemblage in SE Amazonia would show a high degree of simultaneous expansion in response to marked past climatic changes in the region. Contrary to these expectations, the majority of populations in both regions showed similar signals of demographic co-expansion. Concerted demographic and/or phylogenetic events in co-occurring populations are an indication that organisms are responding to a common environmental factor (Chan et al., 2014; Hickerson and Meyer, 2008).

Along with the lack of differences in genetic diversity between SE and NW Amazonia, these findings suggest that both NW and SE Amazonia underwent landscape changes and periods of decreasing suitability for forest taxa during the mid–late Pleistocene. The environment was periodically less suitable for forest birds even in regions where climate cycles were less pronounced, such as northwestern Amazonia. Extended dry periods decrease population growth of tropical forest birds despite apparent habitat stability (Brawn et al., 2017). These findings, along with the sensitivity of these birds to landscape changes such as selective logging and fragmentation (Moura et al., 2016; Stratford and Stouffer, 2015), suggest that lowland forest birds are sensitive to climatic change even without extensive change in their habitat.

Timing of co-expansion was very similar across the study regions (around 100 Kya), although it was slightly earlier for SE Amazonia. This time period for simultaneous expansion was also found for frog and lizard populations in the Caatinga (Gehara et al., 2017) and for an Amazonian anole lizard (Prates et al., 2016). These findings suggest that a major shift in climatic conditions at a continental scale, likely related to an increase in global temperatures during the onset of the last interglacial (LIG, ~120 Kya), had a strong effect on neotropical population dynamics. The similar onset of co-expansion for Amazonian forest bird populations from regions with different climatic histories found here goes against the original prediction that expansion times in the two regions would be different, reflecting Cheng et al.’s (2013) on-off phase climate-cycle model. However, the large credibility intervals around the time of co-expansion estimated for both regions (ranging from 80 to 140 Kya) make it difficult to pinpoint climatic events and a consequent onset of change in population size. Perhaps the credibility intervals reflect protracted demographic events around a common climate event (LIG) across the regions. Precipitation is at its highest both in SE and NW Amazonia during interglacial periods (such as from the mid-Holocene to current times), whereas western Amazonia had a smaller decrease in precipitation during glaciations than eastern Amazonia (Wang et al., 2017). The increase in disparity of rainfall during glacial times reflects a decrease in water vapor transported across the region due to changes in forest cover in these periods (Wang et al., 2017). A change in temperature affected populations region-wide around the onset of the LIG, but the likely drop in precipitation in SE during the penultimate glacial maximum shifted the population recovery (expansion) to a later time than the population expansion in NW, where precipitation levels remained high during cold glacial maxima.

The synchronous demographic expansions I found for SE and NW Amazonia predate the Last Glacial Maximum (LGM), based on both the hierarchical approximate Bayesian computation approach and Bayesian skyline plots. It has been suggested that the effects of the LGM on demographic patterns have overridden genetic signals reflecting previous climatic shifts (Ramírez‐Barahona and Eguiarte, 2013). Multiple phylogeographic studies in Amazonia have failed to recover demographic events that occurred during or after the LGM. Rather, much like this study, demographic changes predating the LGM were also found for other populations of Amazonian birds (Menger et al. 2018; Ribas et al. 2018, 2012; Thom et al., 2015) and lizards (Prates et al., 2016). It is possible that the transition from the penultimate glacial maximum to the LIG had a stronger effect on Neotropical organisms than the LGM. On the other hand, population-level studies employing substitution rates inferred from phylogenies tend to overestimate the timing of demographic events, a common issue for studies of non-model organisms (Grant, 2015). Our study employed a prior with a range of substitution rates based on phylogenetic studies of birds (Weir and Schluter 2008). It may be that this range is not the best representation of substitution rates at the population level (Ho et al., 2011), and that the dates I recovered are older than the actual demographic events. Regardless of the limitations of the dataset (mtDNA) in elucidating recent and detailed demographic history, the broad similarity of demographic histories across and within regions in Amazonia brings important and novel insight into the relationship between climatic shifts and Amazonian evolutionary processes at a continental scale.
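The dependence of the estimated dates on the assumed substitution rate, raised above, can be made concrete with a short sketch. It assumes the usual conversion in which a time scaled in substitutions per site is divided by a per-site, per-year rate; the scaled time and both rates below are hypothetical placeholders, not values used in this study.

```python
def time_in_years(scaled_time, rate_per_site_per_year):
    """Convert a time expressed in substitutions per site into years."""
    return scaled_time / rate_per_site_per_year


SCALED_TIME = 0.001          # hypothetical expansion time, substitutions per site
PHYLOGENETIC_RATE = 1.0e-8   # hypothetical slower, phylogeny-based rate
POPULATION_RATE = 2.5e-8     # hypothetical faster, population-level rate

older = time_in_years(SCALED_TIME, PHYLOGENETIC_RATE)
younger = time_in_years(SCALED_TIME, POPULATION_RATE)
print(f"phylogenetic rate: {older:,.0f} years; population-level rate: {younger:,.0f} years")
```

Because the conversion is a simple division, any underestimate of the population-level rate inflates the inferred dates by the same factor.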

4.3 Considerations about mechanisms driving demography

The taxa in this study comprise a variety of dietary guilds, body sizes and life history traits, and likely respond to climate shifts through different mechanisms. A recent study found limited microclimatic variation and a corresponding lack of preferential habitat use by birds in relation to microclimate, but a general avoidance of areas with higher light incidence (Pollock et al., 2015). Furthermore, Amazonian forest birds are not structured by variation in microclimatic conditions (e.g. moisture, light and temperature) at a landscape genetics scale (Menger 2017). These studies suggest that direct effects of climate shifts on microclimatic conditions were not a main driver of the demographic patterns described here. Another possible mechanism driving demographic change could be change in resource availability. A decrease in resource availability could be aggravated by an increase in competition due to an encroachment of cold-adapted highland or open-vegetation taxa into lowland forests undergoing a decrease in precipitation and temperature. Some Neotropical cloud forest taxa have signals of range expansion in the Pleistocene likely due to downslope population range expansion during glacial maxima (Cadena et al., 2007; Valderrama et al., 2014), which reflects the encroachment of cold-adapted vegetation in terra-firme forest during cooler periods of the late Pleistocene (Cohen et al., 2014; D’Apolito et al., 2017, 2013). The mechanism through which climate cycles trigger demographic fluctuations in forest birds remains to be clarified. Studies focusing on physiological descriptions and testing their relationship with environmental variables are paramount but lacking.


5. Conclusions

Forest bird populations from the extremes of the regional precipitation gradient had synchronous expansion during the late Pleistocene, predating the Last Glacial Maximum. Although northwestern Amazonia maintained more stable precipitation levels during the period, populations in this region had a high level of co-expansion around the onset of the Last Interglacial, similar to populations in the hydro-climatically unstable southeast Amazonia. Populations in SE Amazonia had a greater magnitude of change in size, reflecting the harsher environmental conditions for forest taxa in the region during glacial maxima. The similar demographic histories in regions with opposite patterns of precipitation fluctuation during glacial cycles highlight the sensitivity of Amazonian forest birds to even minor changes in historical climate suitability.


CHAPTER 2 - A Multireference-Based Whole Genome Assembly for the Obligate Ant-Following Antbird, Rhegmatorhina melanosticta (Thamnophilidae)

1. Introduction

Organismal biology has been revolutionized over the past decade by the ‘omics’ era, in which the rapid development of high-throughput sequencing technologies has enabled the acquisition of more genetic data than ever before, including for non-model organisms (Ellegren 2014). For example, collaborative endeavours to sequence thousands of bird (Bird10K project, Zhang 2015) and other vertebrate genomes (Genome10K project, Genome 10K Community of Scientists 2009) across many countries and research groups have been launched in the past decade, and have produced promising results (Koepfli et al. 2015; Jarvis et al. 2015; Zhang et al. 2014; Jarvis et al. 2014). Thus, high-throughput sequencing has rapidly improved our ability to make robust inferences in various fields, including avian systematics (Jarvis et al. 2014; Prum et al. 2015; McCormack et al. 2013), population genomics and phylogeography (Toews et al. 2016; Harvey et al. 2017; Smith et al. 2013; Raposo do Amaral et al. 2018; Oswald et al. 2019; Oswald et al. 2017; Nadachowska-Brzyska et al. 2015), biogeography (Musher et al. 2019; Oliveros et al. 2019), molecular evolution (Nam et al. 2010), and speciation (Ellegren et al. 2012). There has been an especially rapid increase in the number of studies specifically using whole-genome sequencing (often combined with reduced representation approaches) to answer difficult ornithological questions (Runemark et al. 2018; Irwin et al. 2018; Alcaide et al. 2014; Jarvis et al. 2014; Ellegren et al. 2012). Although the number of avian genomes available on GenBank has increased by more than tenfold over the past five years (from 11 (Ellegren 2014) to 182 as of this writing) the total number is still relatively low; just under two percent of recognized avian species are represented compared with over 6% of mammals (see also Bird10K project (Zhang 2015)). Lower still is the number of available avian genomes that are assembled to the chromosome-level (only 18 on GenBank as of this writing, most of which were de novo assembled).

Current-generation (high-throughput) sequencing methods are improving the quality of genome assemblies by producing highly contiguous sequences, which have traditionally been difficult to obtain due in part to the complications associated with assembling highly repetitive or heterozygous regions as well as centromeres (Hron et al. 2015; Tigano et al. 2018). Most publicly available non-model bird genomes are of relatively low contiguity, with more than half of the genomes containing tens or hundreds of thousands of relatively short contigs that are difficult to assemble into longer scaffolds. Additionally, many genomes have been of relatively poor quality, often missing thousands of GC-rich genes, which are typically more difficult to sequence with short reads alone (Botero-Castro et al. 2017). Improving contiguity of publicly available genomes is critical for improving inferences based on a range of biological methods, such as whole-genome resequencing, genotyping-by-sequencing, phylogenomics, historical demography (Tigano et al. 2018), and determining architectural changes to the genome (Sotero-Caio et al. 2017; English et al. 2012).

Chromosome-level assemblies have typically been achieved by the combination of different genome sequencing and cytogenomics strategies, such as mate-paired and paired-end libraries, long-read sequencing, Hi-C sequencing, fluorescent in situ hybridization (FISH), and/or bacterial artificial chromosome (BAC) clones, which make the process very costly and labor-intensive (Lieberman-Aiden et al. 2009; Korbel et al. 2007; English et al. 2012; Myers et al. 2000). The use of long-read technology, from either direct sequencing of long molecules (e.g. Pacific Biosciences (English et al. 2012)) or local assemblies of linked-read genome sequencing (Chromium 10x), greatly increases the length of de novo assembled scaffolds and decreases gap content in genomes (Korlach et al. 2017; Ozerov et al. 2018; Weisenfeld et al. 2017). These methods have different advantages and pitfalls: PacBio can create long contigs with no gaps but with high sequencing error rates and cost (Sohn & Nam 2016). In contrast, 10x Chromium linked-read sequencing has the advantage of reduced costs associated with high-throughput Illumina sequencing, but carries the limitations of short reads in sequencing GC-rich and high repeat density regions (Sedlazeck et al. 2018; Sohn & Nam 2016). Recent developments in reference-based whole genome assembly methods assemble scaffolds to chromosomes based on available high-quality chromosome-level assemblies (Kolmogorov et al., 2018; Tamazian et al. 2016; Kim et al. 2013). Reference-based chromosome assembly, in addition to long-read sequencing methods, creates an exciting prospect of more accessible chromosome-level genome sequencing in the near future.

Among birds, few groups have been as important to understanding biotic diversification and macroevolutionary processes as the New World suboscines (suborder Tyranni), which contain an enormous level of taxonomic, ecological, and functional diversity (Marcondes & Brumfield 2019; Musher et al. 2019; Seeholzer et al. 2017; Raikow 1986; Oliveros et al. 2019; Derryberry & Claramunt 2011; Isler et al. 2007; Willis 1968; Moyle et al. 2009). Within this group, the army ant-following clade, which includes multiple genera (Rhegmatorhina, Gymnopithys, Willisornis, Phaenostictus, Phlegopsis, and Pithys), has enamored researchers for decades and formed the foundation of many ecological, phylogeographic, biogeographic, and population genetic studies (Willis 1968; Willis 1969; Ribas et al. 2018; Isler et al. 2014; Pulido-Santacruz et al. 2018). Such studies have been fundamental to developing and testing important evolutionary hypotheses, including, for example, the hotly debated history of Amazonian biogeography and diversification (Ribas et al. 2018; Hackett 1993; Willis 1969; Silva et al. 2019). Additional genomic-level data are necessary to help unravel how Amazonian history has affected avian demography and speciation. Despite the extensive interest in this group by many researchers, genomes are not publicly available for any of the six genera at present. In fact, only six suboscine genomes are currently available on NCBI GenBank (five species of Pipridae and one of Tyrannidae).

Here I present the first publicly available high-contiguity genome for an army ant-following antbird (Tribe Pithyini), the Hairy-crested Antbird (Rhegmatorhina melanosticta). In doing so, I report on genome contents and describe its potential for future use by other researchers. Our objectives are to (1) assemble scaffolds to chromosome level and report on structural differences from other genomes, (2) assess genome completeness and content relative to other published genomes, and (3) assess the suitability of linked-read sequencing assemblies in mapping reduced-representation markers that are broadly implemented in comparative phylogenomics and population genomics studies.

2. Methods

2.1 Genome Sequencing and De Novo Assembly

The study specimen was a wild-caught adult female Rhegmatorhina melanosticta from San Martin, Peru (Museum of Southwestern Biology voucher MSB:Birds:36483; http://arctos.database.museum/guid/MSB:Bird:36483). Muscle, heart and liver samples were frozen in liquid nitrogen and stored at -80 °C. Muscle tissue was transported to the sequencing facilities on dry ice for preservation of DNA quality. DNA extraction, linked-read library preparation and sequencing were carried out at the HudsonAlpha Genome Sequencing Center facilities (Huntsville, Alabama; https://hudsonalpha.org/sequencing/). High molecular weight DNA was extracted with Qiagen’s MagAttract Kit (Qiagen, Valencia, California). Fragment lengths were verified to be over the minimum ideal length for linked-read sequencing libraries (>50 kb) with pulsed-field gel analysis. Library preparation for the 10x Chromium platform implements bead-in-emulsion barcoding to add location barcodes to fragments that originated from a single long DNA molecule (Ott et al. 2018). This barcode is then used to re-assemble the short reads into pseudo long reads after sequencing (Ott et al. 2018). The paired-end library was sequenced on the Illumina HiSeq X platform, with a read length of 150 bp and an average insert size of 350 bp.

I used the Chromium Genome Software Suite for raw read processing, scaffold-level genome assembly and structural variant mapping. Raw read processing and de novo genome assembly were done with Supernova version 2.1 (Zheng et al. 2016; Marks et al. 2019), which includes adapter trimming within its pipeline. The raw reads were demultiplexed and assembled into scaffolds with default settings of the mkfastq and run functions, respectively. I ran the Supernova version 2.1 assembler on 40 threads with 1 TB of RAM on the Sackler Institute for Comparative Genomics private server at the American Museum of Natural History for three days. The final genome sequences were generated with the mkoutput function under the “pseudohap2” style (Weisenfeld et al. 2017). The “pseudohap2” option generates two parallel fasta files corresponding to the paternal and maternal haplotypes of the sequence. This option flattens bubbles in variant regions by randomly selecting an allele and assigning it to one of the two haplotypes, resulting in two final genome sequences composed of scaffolds with mixed occurrence of paternal and maternal haplotypes.

I used Longranger version 2.2.2 to map and phase structural variants in the R. melanosticta genome (Marks et al. 2019). Longranger uses the linked-read barcode information to enhance the performance of external variant calling software by mapping the 10x raw reads to a reference genome. Longranger performs optimally when using references with a reduced number of scaffolds (preferably under 1000 scaffolds). I mapped single nucleotide polymorphisms (SNPs) and other variants only to scaffolds over 150 Kb. This length was chosen after an exploratory analysis showed that it offered a good trade-off, reducing the number of scaffolds in the reference genome without losing a significant amount of sequence. The 150 Kb cutoff reduced the number of scaffolds by 13%, while reducing the total genome length by only one percent (96 excluded scaffolds with a total of 11.7 Mb). I mapped variants by running the wgs function of Longranger with the Genome Analysis ToolKit (GATK) (McKenna et al. 2010) as the variant caller, and used the 619 scaffolds of the de novo assembled genome from the previous step as reference. Longranger filters variants that have a VCF standard phred-scaled quality score lower than 15 (QUAL < 15) or 50 (QUAL < 50) if they are heterozygous or homozygous, respectively. Heterozygous sites with allele fraction under 15% are also excluded.
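The trade-off behind a scaffold length cutoff can be checked with a few lines of code. The following is a minimal sketch, not the exploratory analysis itself: the function name `summarize_cutoff` and the toy lengths are assumptions made for illustration. It reports how many scaffolds, and what share of total sequence, would be excluded for a given minimum length.

```python
from typing import Iterable, NamedTuple


class CutoffSummary(NamedTuple):
    kept: int                      # scaffolds retained in the reference
    dropped: int                   # scaffolds excluded by the cutoff
    frac_scaffolds_dropped: float  # share of scaffolds excluded
    frac_length_dropped: float     # share of total sequence excluded


def summarize_cutoff(lengths: Iterable[int], min_len: int) -> CutoffSummary:
    """Summarize the effect of keeping only scaffolds longer than min_len (bp)."""
    lengths = list(lengths)
    if not lengths:
        raise ValueError("no scaffold lengths supplied")
    # "Over the cutoff" is read here as strictly longer than min_len.
    dropped = [n for n in lengths if n <= min_len]
    total = sum(lengths)
    return CutoffSummary(
        kept=len(lengths) - len(dropped),
        dropped=len(dropped),
        frac_scaffolds_dropped=len(dropped) / len(lengths),
        frac_length_dropped=sum(dropped) / total,
    )


# Toy lengths in bp (hypothetical, not the real scaffold lengths).
toy = [100_000, 200_000, 300_000, 400_000]
print(summarize_cutoff(toy, min_len=150_000))
```

Running the same summary over a range of candidate cutoffs makes it easy to see where the scaffold count drops sharply while the excluded sequence stays small.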

2.2 Single reference assisted chromosome level assembly

I first assembled the R. melanosticta scaffolds to chromosomes using the software Chromosomer (Tamazian et al. 2016). This method generates draft chromosome-level assemblies based on a BLAST alignment (Altschul et al. 1990) between the reference genome and target genome scaffolds (Kolmogorov et al., 2018). The algorithm considers a scaffold to be anchored to a specific position on the reference genome if the ratio between the first and second highest alignment scores is higher than a predefined ratio threshold (default is 1.2). If the ratio is lower than the threshold, the scaffold is not mapped: it is listed as unplaced (on a given chromosome) if the two best hits map to the same chromosome, or unlocalized, if the best hits are on different chromosomes (Tamazian et al. 2016). Although Chromosomer is ideally used for genomes from closely related taxa, the conserved nature of avian genomes (Ellegren 2013) likely reduces the rate of insertion errors associated with an increase in phylogenetic distance between the reference and target taxa. I mapped scaffolds to all available chromosome-level genomes of the order Passeriformes found on GenBank and to relatively high-quality genomes representing four other avian orders (Table 7). The total sample, in order of increasing relatedness (all passerine references belong to the suborder Passeri and are therefore equidistant from R. melanosticta), included (1) Chicken (Gallus gallus, Galliformes) (Sohn et al. 2018), (2) Rock Pigeon (Columba livia, Columbiformes) (Damas et al. 2019), (3) Anna’s Hummingbird (Calypte anna, Apodiformes) (Korlach et al. 2017), (4) Peregrine Falcon (Falco peregrinus, Falconiformes) (Damas et al. 2019), (5) Kakapo (Strigops habroptila, Psittaciformes) (Koepfli et al. 2015), (6) Great Tit (Parus major, Passeriformes) (Laine et al. 2016), (7) House Sparrow (Passer domesticus, Passeriformes) (Elgvin et al. 2017), (8) Zebra Finch (Taeniopygia guttata, Passeriformes) (Korlach et al. 2017) and (9) Collared Flycatcher (Ficedula albicollis, Passeriformes) (Kawakami et al. 2014).
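The score-ratio rule described above can be illustrated with a short sketch. This is not Chromosomer's code: the function name, the input format (a list of chromosome and alignment-score pairs per scaffold) and the neutral outcome labels are assumptions chosen for illustration only.

```python
from typing import List, Optional, Tuple

Hit = Tuple[str, float]  # (reference chromosome, alignment score)


def classify_scaffold(hits: List[Hit],
                      ratio_threshold: float = 1.2) -> Tuple[str, Optional[str]]:
    """Classify one scaffold from its alignment hits using a score-ratio rule.

    Returns (status, chromosome). The status labels are hypothetical and only
    describe the three outcomes of the rule.
    """
    if not hits:
        return ("no_hits", None)
    ranked = sorted(hits, key=lambda h: h[1], reverse=True)
    best_chrom, best_score = ranked[0]
    if len(ranked) == 1:
        # A single hit has no competitor, so it is treated as unambiguous.
        return ("anchored", best_chrom)
    second_chrom, second_score = ranked[1]
    # Guard against a zero score for the runner-up hit.
    if second_score <= 0 or best_score / second_score > ratio_threshold:
        return ("anchored", best_chrom)
    if best_chrom == second_chrom:
        # Chromosome is known, but the position on it is ambiguous.
        return ("ambiguous_same_chromosome", best_chrom)
    # The two best hits disagree on the chromosome itself.
    return ("ambiguous_different_chromosomes", None)


print(classify_scaffold([("chr1", 120.0), ("chr2", 90.0)]))
print(classify_scaffold([("chr1", 100.0), ("chr1", 95.0)]))
print(classify_scaffold([("chr1", 100.0), ("chr2", 95.0)]))
```

A higher threshold makes anchoring more conservative, leaving more scaffolds in the two ambiguous classes.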

I converted the masked repeat regions from GenBank assemblies to BLAST-readable masks with the convert2blastmask function from the NCBI BLAST+ package (Camacho et al. 2009), using the “repeatmasker default” option. I created BLAST databases from each reference with makeblastdb and aligned R. melanosticta scaffolds to reference genome databases with blastn (Camacho et al. 2009). I mapped the target scaffolds to the reference genomes with the fragmentmap function from the Chromosomer package (Tamazian et al. 2016). I set the gap size between non-overlapping scaffolds to 500 bp, which is higher than our maximum read insert size (Tamazian et al. 2016). Finally, the chromosomes were assembled with default options of the assemble function.


2.3 Multiple reference assisted chromosome level assembly

The consistency of sequence adjacency across multiple genomes adds powerful information to reference-based chromosome assembly (Kim et al. 2013; Kolmogorov et al. 2018). I used the software Ragout 2 (Kolmogorov et al. 2018) to assemble the R. melanosticta scaffolds into chromosomes. Ragout uses phylogenetic information to reconstruct the most likely chromosome rearrangements for the target genome (Kolmogorov et al. 2018). First, I assembled the W chromosome separately based on Ficedula albicollis (ENA accession code PRJEB7359) (Smeds et al. 2015), Calypte anna (Table 7) and Gallus gallus (GenBank accession number NC_006126.5) (Bellott et al. 2017): these were the only assembled W chromosomes I found to be publicly available. As our sample is female, I took this approach to a) assemble the W chromosome of R. melanosticta and b) exclude confounding W chromosome scaffolds that would not be correctly mapped to any scaffolds of the reference genomes, given that none of the genomes used for multireference assembly had the W chromosome. Then, I mapped the remaining scaffolds to the five available genomes of the taxa closest to R. melanosticta: all four passerine genomes used in the previous step and that of S. habroptila, representing the sister group to Passeriformes (Table 7). I created the input genome alignments in hal format for the Ragout runs with Cactus, using default options (Paten et al. 2011). The phylogenetic topology used as reference for the alignment was based on a recently published tree for passerines (Oliveros et al. 2019) with no branch length information (all branch lengths = 1, Supplementary Fig. 2), and for the W chromosome the topology was based on another study (Prum et al. 2015). Given the high contiguity and sequencing depth of the scaffolds from the Supernova de novo assembly, I then ran Ragout version 2.2 with the --solid-scaffolds option. The W chromosome of F. albicollis was set to "draft" because it is assembled only to the scaffold level. All other settings were left at their defaults. Final chromosome names in the R. melanosticta genome were based on F. albicollis chromosomes (randomly selected by Ragout).

Finally, I visually assessed synteny between the R. melanosticta genome and the nine genomes used as references by creating synteny plots between the multiple-reference based assembly (Ragout) and each single-reference based assembly (Chromosomer). Instead of performing regular analyses of synteny, which usually involve anchoring homologous sites from target to reference genomes (Liu et al. 2018), I compared the scaffold-to-chromosome assignment for the R. melanosticta genome assemblies based on single references as an assessment of synteny across Aves. This approach allowed us to assess synteny while comparing the performance of the multireference and single-reference assembly methods. I used the single-reference assemblies of the antbird to make graphical representations of the genomes of the reference taxa because these assemblies are highly constrained to the reference genomes' structure, although this approach loses any portion of the reference genomes that was not mapped to the R. melanosticta scaffolds. This strategy underestimates the amount of intra-chromosomal rearrangements in R. melanosticta in relation to other taxa, because the order of scaffold placement is guided by the sequence of the reference taxon. However, the high contiguity of the de novo assembled scaffolds (see Results) ensures that much of the inherent arrangement of the R. melanosticta genome remains well represented within the scaffolds and allows for an assessment of synteny with other genomes (the minimum ideal N50 for synteny representation is 1 Mb; Liu et al. 2018). I plotted synteny maps in R version 3.5 (R Core Team 2019) with the package 'circlize' (Gu et al. 2014), and chromosome ideograms with 'karyoploteR' (Gel & Serra 2017). Gaps inserted between scaffolds were removed from both types of plots for visualization purposes. Pseudo chromosome fragments (PCF) were concatenated for genome visualization with the orientation of the output fasta sequences (i.e. their concatenation does not represent actual sequence orientation or order in the R. melanosticta genome).

Order           Species                     Scaffold N50 (Mb)  Contig N50 (bp)  Coverage  BioProject   Accession        Size (Gb)  Scaffolds
Galliformes     Gallus gallus               90.11              639,813          248.3×    PRJNA412424  GCA_002798355.1  1.02       1822
Columbiformes   Columba livia               24.54              27,697           60×       PRJNA347893  GCA_001887795.1  1.02       91
Apodiformes     Calypte anna                74.1               14,522,327       54×       PRJNA489139  GCA_003957555.1  1.06       159
Falconiformes   Falco peregrinus            26.78              33,994           137.6×    PRJNA347893  GCA_001887755.1  1.11       72
Psittaciformes  Strigops habroptila         83.2               9,454,100        76.1×     PRJNA489135  GCA_004027225.1  1.17       100
Passeriformes   Taeniopygia guttata         70.43              11,998,827       88.2×     PRJNA489098  GCA_003957565.1  1.06       134
Passeriformes   Ficedula albicollis         6.54               410,964          60×       PRJNA208061  GCA_000247815.2  1.12       21836
Passeriformes   Parus major                 71.37              148,693          95×       PRJNA312399  GCA_001522545.3  1.02       1675
Passeriformes   Passer domesticus           6.37               51,426           130×      PRJNA255814  GCA_001700915.1  1.04       2571
Passeriformes   Rhegmatorhina melanosticta  3.3                136,760          38×       PRJNA561634  Pending          1.03       165

Table 5. List of genomes used as references for the Rhegmatorhina melanosticta genome assembly.

2.4 Evaluation of genome completeness

I evaluated genome completeness through direct assessment of assembly metrics such as expected and observed genome length, number of scaffolds and gap length, as well as content of well-known genomic regions of interest, such as target-capture markers and conserved single-copy genes. I estimated the amount of sequence missing from our assembly as the difference between the expected genome length and the final assembly length, plus the total gap length within scaffolds (Peona et al. 2018). As an independent estimate of genome size, I used the haploid DNA content (in pg) based on flow cytometry of Rhegmatorhina melanosticta (from the same specimen I used; Wright et al. 2014), converted to Gb assuming 1 pg = 0.978 Gb (Dolezel et al. 2003), alongside Supernova's default genome size estimate based on k-mer distribution. I estimated within-scaffold gap lengths (number of bases marked as "N") with the function comp from the seqtk package (Li 2013).
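The arithmetic behind the missing-sequence estimate can be written out as a short sketch. It follows the definition given with Table 6 (unsequenced genome plus gap length, relative to expected genome size); the function names are assumptions for illustration, and the conversion factor is the 1 pg = 0.978 Gb value cited above.

```python
PG_TO_GB = 0.978  # 1 pg of DNA = 0.978 Gb (Dolezel et al. 2003)


def c_value_to_gb(picograms: float) -> float:
    """Convert a haploid DNA content in picograms to gigabases."""
    return picograms * PG_TO_GB


def missing_sequence(expected_mb: float, assembly_mb: float, gap_mb: float):
    """Return (unsequenced Mb, total missing Mb, missing fraction of expected).

    unsequenced = expected genome size - assembly size
    missing     = unsequenced + gaps ("N" bases) within scaffolds
    """
    if expected_mb <= 0:
        raise ValueError("expected genome size must be positive")
    unsequenced = expected_mb - assembly_mb
    missing = unsequenced + gap_mb
    return unsequenced, missing, missing / expected_mb


# Values for R. melanosticta as listed in Table 6 (sizes converted to Mb).
unsequenced, missing, fraction = missing_sequence(1300.0, 1030.0, 12.2)
print(f"unsequenced: {unsequenced:.1f} Mb, missing: {missing:.1f} Mb")
```

With the rounded sizes in Table 6 this gives 270 Mb unsequenced and 282.2 Mb missing in total; the exact percentage depends on how the expected genome size is rounded.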

To evaluate the completeness of our genome assembly in relation to sequence content, I used the software Benchmarking Universal Single-Copy Orthologs version 3 (BUSCO) (Simão et al. 2015; Waterhouse et al. 2017). BUSCO measures genome completeness by quantifying the proportion of known genes from compiled datasets that are only present in genomes as single copies and are highly conserved (i.e., they are evolving under “single-copy control” and so conserved that they should be detectable in a variety of organisms (Waterhouse et al. 2011)). BUSCO genes are good candidates for assessing genome completeness because the expectation that they are present in a given genome is reasonable from an evolutionary perspective (Waterhouse et al. 2011; Waterhouse et al. 2013; Simão et al. 2015; Waterhouse et al. 2017). I ran BUSCO on the R. melanosticta genome, plus nine related (Eufalconimorphae sensu (Suh et al. 2011; Jarvis et al. 2014)) genomes.

I first chose five genomes that were assembled to chromosome level and used as outgroups in our chromosome mapping approach: Falco peregrinus, Strigops habroptilus, Passer domesticus, Taeniopygia guttata, and Parus major. I then chose an additional four genomes from a recent study that sequenced dozens of bird genomes (Jarvis et al. 2014): Nestor notabilis, Acanthisitta chloris, Corvus brachyrhynchos, and Manacus vitellinus. BUSCO outputs were summarized in three metrics: (1) percent complete BUSCOs (complete sequence matches), (2) percent fragmented BUSCOs (partial sequence matches), and (3) percent missing BUSCOs (unmatched BUSCO sequences). Finally, I compared the genes missing from R. melanosticta with those missing from each of the other species in order to understand whether missing genes were consistent or variable among assemblies.
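The comparison of missing BUSCOs across assemblies amounts to set intersections. The sketch below assumes the missing BUSCO identifiers for each genome have already been read into Python sets; the function names, species labels and identifiers are hypothetical and used only for illustration.

```python
from typing import Dict, Set, Tuple


def shared_missing(focal: Set[str],
                   others: Dict[str, Set[str]]) -> Dict[str, Tuple[int, float]]:
    """For each other genome, count focal missing BUSCOs that it also lacks.

    Returns {genome: (number shared, percent of focal missing set)}.
    """
    if not focal:
        raise ValueError("focal set of missing BUSCOs is empty")
    result = {}
    for name, missing in others.items():
        n_shared = len(focal & missing)
        result[name] = (n_shared, 100.0 * n_shared / len(focal))
    return result


def missing_in_all(focal: Set[str], others: Dict[str, Set[str]]) -> Set[str]:
    """BUSCOs missing from the focal genome and from every other genome."""
    common = set(focal)
    for missing in others.values():
        common &= missing
    return common


# Toy identifiers (hypothetical, not real BUSCO IDs).
focal = {"b1", "b2", "b3", "b4"}
others = {"species_A": {"b1", "b2", "b9"}, "species_B": {"b1", "b7"}}
print(shared_missing(focal, others))
print(missing_in_all(focal, others))
```

Genes that are absent from every assembly are more likely to reflect lineage-wide absence or systematic detection problems than a gap in any single genome.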

To evaluate the efficacy of harvesting target-capture data from linked-read sequencing genomes, I also mapped the Tetrapods-UCE-5kv1 probeset, which targets 5,060 ultraconserved elements (UCEs; https://www.ultraconserved.org/). UCEs are genome-wide markers that are informative at both deep and shallow evolutionary timescales, and which have become widely used for phylogenomic and population genomic studies (Faircloth et al. 2012; Smith et al. 2013; Raposo do Amaral et al. 2018). To do this, I used the phyluce pipeline for harvesting UCEs from genomes (Faircloth 2016). I converted the de novo assembled R. melanosticta genome from fasta to twoBit and extracted sequence length information from it with the faToTwoBit and twoBitInfo tools from the Kent Source Archive (Kent et al. 2002). I then aligned and harvested the UCE loci from the genome using scripts in the phyluce package.


2.5 Genotyping-by-sequencing (GBS) reference mapping

In order to demonstrate the efficacy of our genome for potential future research, I mapped genotyping-by-sequencing data for six species in the family Thamnophilidae. Specimens were provided by two institutions, the American Museum of Natural History (AMNH) and the Museu Paraense Emilio Goeldi (MPEG), and included Thamnophilus aethiops (AMNH LJM 225), Myrmotherula menetriesii (AMNH GT104), Myrmotherula longipennis (AMNH GDR 275), Willisornis poecilinotus (AMNH GDR239), Hypocnemis rondoni (AMNH LJM325), and Phlegopsis nigromaculata (MPEG T15868). Library preparation and sequencing were undertaken at the University of Wisconsin Biotechnology Center (Madison, WI) using the PstI and MspI enzymes, with only the latter as the cutter. Paired-end sequencing (150 bp) was performed on an Illumina NovaSeq 6000. I then used ipyrad 0.7.30 (Eaton & Overcast 2016) to trim low-quality bases (minimum quality score = 20) from the raw Illumina reads, map the cleaned reads to the R. melanosticta reference genome at a 70% clustering threshold, and identify GBS loci with a minimum statistical read depth of six. Mapped loci with more than ten ambiguous base calls, 15 heterozygous sites, or 12 SNPs were discarded to eliminate possibly erroneous and non-orthologous alignments. Because sequence divergence across these species is expected to be about 0.01–0.02 substitutions per site given a highly conserved coding gene (Moyle et al. 2009), I chose these settings to allow for the somewhat higher levels of divergence expected from GBS loci. Sequences for all loci were then concatenated.
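The locus-level filter described above is a simple conjunction of three thresholds. The sketch below illustrates the rule only and is not ipyrad's implementation: the `LocusStats` container and `keep_locus` function are hypothetical names, and the per-locus counts are assumed to have been computed beforehand.

```python
from dataclasses import dataclass


@dataclass
class LocusStats:
    ambiguous: int     # ambiguous base calls in the locus
    heterozygous: int  # heterozygous sites in the locus
    snps: int          # variable sites (SNPs) in the locus


def keep_locus(stats: LocusStats,
               max_ambiguous: int = 10,
               max_heterozygous: int = 15,
               max_snps: int = 12) -> bool:
    """Keep a locus only if it does not exceed any of the three limits."""
    return (stats.ambiguous <= max_ambiguous
            and stats.heterozygous <= max_heterozygous
            and stats.snps <= max_snps)


# Toy loci (hypothetical counts).
loci = {
    "locus_1": LocusStats(ambiguous=2, heterozygous=3, snps=4),
    "locus_2": LocusStats(ambiguous=11, heterozygous=0, snps=1),
    "locus_3": LocusStats(ambiguous=0, heterozygous=0, snps=13),
}
retained = [name for name, stats in loci.items() if keep_locus(stats)]
print(retained)
```

Loosening any one limit admits more divergent loci at the cost of a higher risk of paralogous or misaligned sequence.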

To determine the utility of the R. melanosticta assembly as a reference genome, I reconstructed the phylogenetic relationships of the six thamnophilid species using RAxML version 8.2.4 (Stamatakis 2014), assuming a root demonstrated in previous works (Moyle et al. 2009; Marcondes & Brumfield 2019). I ran 20 maximum likelihood searches assuming a GTR + gamma model of nucleotide substitution across the entire dataset. I additionally ran 500 bootstrap replicates to evaluate the robustness of each node given our data. Our expectation in employing these analyses is that a more complete genome will yield thousands of mapped GBS loci resulting in accurate, robust phylogenetic reconstruction, whereas a less complete genome would not.

3. Results

3.1 De novo assembly

I generated 560.02 million reads with a mean length of 140 bp. The de novo assembly size was 1.03 Gb with raw and effective coverage of 62x and 38x, respectively. The fraction of sequence duplication was 7.5%, and the GC content of the assembly was 42.2%. The contig N50 was 136.8 Kb. The final genome had 715 scaffolds ranging from 0.1 to 13.8 Mb in length, with an N50 of 3.3 Mb. There were 5.3 million SNPs in the R. melanosticta genome, of which 99.8% were phased. The longest phaseblock and the phaseblock N50 were 6.9 Mb and 1.9 Mb, respectively. Of the 257 large structural variants that were called (over 30 Kb in size), 163 were deletions, 4 were sequence inversions, 22 were sequence duplications and 68 were distal (of at least 500 Kb) sequence translocations (Fig. 7). In addition to large structural variants, 3797 short deletions (from 50 bp to 30 Kb in size) were detected.
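For readers less familiar with the N50 statistic reported here, the sketch below shows the standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total length. The function name and toy lengths are illustrative, and this is not the code used by Supernova.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no lengths supplied")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that brings the running sum to half
        # of the total assembly length.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: running sum always reaches the total")


# Toy lengths (arbitrary units): total is 20, and 6 + 5 = 11 covers half.
print(n50([2, 3, 4, 5, 6]))
```

The same function applies to contigs, scaffolds or phase blocks, which is why all three N50 values in this section are directly comparable in definition.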

3.2 Reference-Based Assembly

The number of scaffolds mapped to chromosomes based on a single reference ranged from 577 to 695 and did not vary exclusively due to phylogenetic distance from the reference. While the range of scaffolds mapped was similar from passerines to G. gallus (the most distantly related bird used), the fewest scaffolds were mapped in the C. anna and C. livia assemblies. The number of chromosomes created ranged from 29 to 38, and closely reflected the number of chromosomes in the reference assemblies. The length of gaps added to the final assemblies ranged from 0.11 to 0.18 Mb.

Figure 7. Chromosome ideogram. The black vertical lines represent the placement of ultraconserved elements (UCEs) on each chromosome and the gray links represent large structural variants (insertions/deletions, inversions, duplications and rearrangements) over 30 Kb in size. The green rectangle represents two scaffolds that were homologous to Chr5 and Chr1 in the multireference assembly, and the yellow rectangle represents scaffolds placed on Chr4 in single-reference assemblies.


The genome assembled based on multiple reference genomes (henceforth referred to as the multireference genome or assembly) was composed of 46 pseudo-chromosome fragments (PCFs), which formed 27 chromosomes (Fig. 7). Fifteen chromosomes were formed by a single PCF, nine chromosomes were formed by two separate PCFs, three were formed by three PCFs and the Z chromosome was formed by four disjointed PCFs. The PCFs ranged in length from 0.29 to 103.33 Mb, which correspond to the single PCF of chromosome 22 and one of the two PCFs of chromosome 3, respectively. The genome assembly placed 595 of the 715 de novo assembled scaffolds, while 120 scaffolds (56.17 Mb) remained unplaced. Although the number of scaffolds that were unplaced in the multireference assembly was an order of magnitude larger than in the single-reference assemblies, the unplaced scaffolds represented only five percent of the original de novo assembly. The total assembly length was 990.29 Mb in 163 scaffolds (unplaced and PCFs combined), including gaps introduced between placed scaffolds (42.89 Mb, 4.3% of the assembly). The multireference assembly N50 was 53.31 Mb.

3.3 Genome Completeness

The genome sizes estimated by Supernova and flow cytometry were very similar: 1.36 Gb and 1.3 Gb [77], respectively. The total length of gaps within scaffolds was 12.2 Mb (Table 8). Based on the estimated genome size, 282.2 Mb (21%) were either not sequenced or were assembled as unidentified nucleotides (“N”). The total length of gaps in scaffolds was shorter in R. melanosticta than in three of the five long-read assembled genomes (Table 8). In contrast, the number of scaffolds in the R. melanosticta assembly was higher than in all but one long-read assembled genome.

Our analysis of conserved genetic marker content showed that the R. melanosticta genome was relatively complete. After extracting UCE markers from the R. melanosticta genome, I found that 87% of the 5,060 UCEs were present in the genome. BUSCO analysis revealed that our R. melanosticta genome was relatively complete, with 89.2% (n = 4384) of BUSCOs detected as complete sequences, just 3.5% (n = 172) as fragmented sequences, and only 7.3% (n = 355) missing in the assembly (Fig. 8). This number was very similar to that of the other evaluated species, with a lower proportion of fragmented genes, but slightly higher levels of missing genes.

Of the 355 BUSCOs that were missing in R. melanosticta, 16.3% (n = 58) were also missing from C. brachyrhynchos, 15.2% (n = 54) from P. major, 17.5% (n = 62) from P. domesticus, 26.5% (n = 94) from T. guttata, 14.1% (n = 50) from M. vitellinus, 14.9% (n = 53) from A. chloris, 13.8% (n = 49) from N. notabilis, 25.4% (n = 90) from S. habroptilus, 12.7% (n = 45) from F. peregrinus, and 6.5% (n = 23) were missing in all six genomes.

Taxon                       Scaffolds  Expected size (Gb)  Assembly size (Gb)  Missing (Mb)  Gaps (Mb)  % Missing
Calypte anna                159        1.14                1.06                80.3          16.1       8.5
Corvus cornix               145        1.19                1.05                144           9.6        12.9
Gallus gallus               1821       1.25                1.02                199           19.2       19.9
Rhegmatorhina melanosticta  715        1.3                 1.03                270           12.2       21
Strigops habroptilus        99         1.28                1.17                115.5         27.6       11.2
Taeniopygia guttata         134        1.22                1.06                192           2.3        13.6

Table 6. Comparison of the Rhegmatorhina melanosticta linked-read de novo assembly with PacBio long-read bird assemblies. Expected genome size is based on chromosome density from flow cytometry (from [76,77,90]). Missing (Mb) is an estimate of unsequenced genome (assembly size subtracted from the expected genome size [76]). Gaps is the total number of “N”s in the assembly. The percentage of missing sequence is the sum of unsequenced genome and gap length relative to expected genome size. Corvus cornix GenBank accession number: GCA_002023255.2.

Figure 8. Benchmarking Universal Single-Copy Orthologs version 3 (BUSCO) results for R. melanosticta plus eight related (Eufalconimorphae) non-model genomes. The phylogenetic relationships are based on previous work [8,18]. Bars represent the proportion of complete (blue), fragmented (pink), and missing BUSCOs (gold) for each genome.

3.4 Synteny

Our mapping of synteny recovered relatively consistent results among species: the chromosome placement of sequences within the R. melanosticta multireference genome was similar to the placement of homologous sequences throughout Aves (Fig.9). However, two species assemblies (S. habroptila and F. peregrinus) were structurally different (Fig.9E,F). Out of the two taxa, F. peregrinus had the most rearranged genome in relation to R. melanosticta.

Chromosomes 5 and 6 in S. habroptila and R. melanosticta multireference genomes were highly syntenic. Four scaffolds of chromosome 1 in the R. melanosticta multireference genome consistently mapped to chromosome 4 in all assemblies except the one based on F. peregrinus, where they were placed in chromosome 2 (Fig. 7, yellow rectangle). This difference reflects the homology of chromosome 2 in F. peregrinus to chromosome 4 in other birds (Fig. 9). One of the PCFs created by Ragout2 (2 scaffolds with a total of 520.2 Kb) was homologous to both chromosomes 5 and 1 of F. albicollis. This was the only case in which a PCF was ambiguously placed in the multireference assembly. As the PCF had more sequences homologous to chromosome 5 in the five reference genomes, I decided to represent it as part of chromosome 5 in the genome plots (highlighted in green in Fig. 7). One of the scaffolds that were part of the translocation from chromosome 4 to 1 also had a distal structural variation detected by Longranger: a region of this scaffold is translocated to the PCF homologous to both chromosomes 1 and 5 described above (Fig. 7).


Figure 9. Synteny plots between single-reference-based genome assemblies (lower half of the circle, reference taxa represented by bottom right figures: Taeniopygia guttata (A); Ficedula albicollis (B); Passer domesticus (C); Parus major (D); Strigops habroptila (E); Falco peregrinus (F); Calypte anna (G); Columba livia (H); Gallus gallus (I)) and the multiple-reference-based genome assembly (top half of the circle, highlighted by a blue line and represented by the top left figure of Rhegmatorhina melanosticta). The red segments at the end of the single-reference and beginning of the multiple-reference genomes correspond to unplaced scaffolds.

3.5 GBS Reference Mapping

Reference mapping of the GBS data resulted in 55,753 loci in total, of which 4,870 orthologous loci met our conservative criteria for data retention. Our concatenated matrix of retained loci contained a total of 605,952 bp. RAxML resulted in a topology consistent with previous studies, with 100% bootstrap support across all nodes. Assuming our root was correct based on previous studies, I specifically recovered Hypocnemis to be sister to Willisornis plus Phlegopsis, and these (Tribe Pithyini) to be sister to Myrmotherula plus Thamnophilus (Tribe Thamnophilini) [48,92]. However, I represent our results as an unrooted network to avoid this assumption (Supplementary Fig. 4).

4. Discussion

4.1 General Genome Structure, Contiguity and Content

I herein present a reference-based chromosome-level genome assembly for the obligate ant-following Rhegmatorhina melanosticta. This genome represents one of the few publicly available genomes of a suboscine (Suborder: Tyranni), and the first of any species of the infraorder Furnariides (Moyle et al. 2009; Dickinson & Christidis 2014). The proportion of the original assembly that was effectively placed in the final assembly (Ragout) is comparable to that of three other chromosome-level bird genomes assembled based on multiple references (Kim et al. 2013; O’Connor et al. 2018). Our assembly of the R. melanosticta genome is both highly contiguous and relatively complete in terms of BUSCO scores (Fig. 8). In terms of completeness, I showed that our genome is similar to other published assemblies (Fig. 8) (Jarvis et al. 2014; Koepfli et al. 2015; Laine et al. 2019). Although our genome was missing a slightly higher proportion of BUSCOs than most of the other genomes I evaluated, it also contained a lower number of fragmented BUSCOs, consistent with our assessment of high contiguity (i.e. high N50). BUSCO completeness measures correlate poorly with scaffold N50, making the two metrics good independent assessments of genome quality (Simão et al. 2015). Additionally, I successfully mapped reduced-representation genetic markers from both UCEs and GBS data, further demonstrating the high completeness of our assembly.

I found that short linked-read scaffolding achieves contiguity comparable to that of long-read assemblies, albeit with more gaps within sequences. The high scaffold number in R. melanosticta in relation to long-read assemblies likely reflects the concentration of repeat regions towards the scaffold breaks when using short-read sequencing (Weissensteiner et al. 2017), but mapping repeat regions was beyond the scope of this paper. The large number of scaffolds left unassigned to any chromosome in the final assembly consisted mostly of short scaffolds (<1 Mb). These unplaced scaffolds were likely regions that are difficult to assemble and are therefore highly fragmented, such as highly repetitive genome regions (Smeds et al. 2015). The high occurrence of "short" scaffolds in the sex chromosomes, which have high densities of repetitive sequence in relation to autosomes, corroborates the likely high density of difficult-to-assemble repeat regions in these scaffolds (Supplementary Fig. 4). Furthermore, the concentration of structural variants in these chromosomes (Fig. 7, Supplementary Fig. 3) in relation to autosomal chromosomes is likely due to the propensity of repeat regions to undergo insertions/deletions (Li & Freudenberg 2014). Additionally, some unplaced scaffolds likely correspond to whole microchromosomes, and perhaps even to pathogenic DNA (Laine et al. 2019). Given that most bird genomes, including suboscines, have a 2n of 80 on average, the final R. melanosticta assembly is likely missing around 13 microchromosomes.

4.2 Analysis of synteny

Our synteny plots reflect the relative stability of avian chromosome structure relative to other groups of organisms (Ellegren 2013; Damas et al. 2019). These results expand upon the high syntenic relationship found for the genomes of other passerines (Kawakami et al. 2014; Prost et al. 2019), as well as for non-passerine birds (e.g. Struthio camelus (O’Connor et al. 2018) and G. gallus (Kawakami et al. 2014)). The level of synteny of F. peregrinus and S. habroptilus with the R. melanosticta genome was expected to be lower than for other genomes, because Falconiformes and Psittaciformes have highly rearranged genomes within Aves (Damas et al. 2019; O’Connor et al. 2018). Regardless of their high level of rearrangement, the chromosome 1 fragment of the R. melanosticta genome that mapped to chromosome 4 in all birds was located in chromosome 2 of F. peregrinus, which is homologous to chromosome 4 in other birds. This pattern highlights the conserved nature of sequences among avian groups (Kawakami et al. 2014).

The translocation from chromosome 4 to 1 could represent an actual translocation from the ancestral chromosome 4 to both chromosomes 1 and 5 in R. melanosticta: the structural variant map created with Longranger showed that a sequence fragment on the alternate haplotype of one of these scaffolds is translocated to a scaffold mapped to chromosome 5 or 1 of the multireference assembly (Fig. 7). Alternatively, the two scaffolds homologous to both chromosomes 5 and 1 could actually be in chromosome 1 in R. melanosticta. A third possibility is that these two difficult-to-place scaffolds are a whole microchromosome that split off from the ancestral chromosomes 1 and 5. Whichever the case, R. melanosticta harbours variation in placement of at least a part of the translocated sequence, which could help unravel the path through which the split and/or merging of these genomic regions took place. Sequencing other suboscine genomes, as well as using different sequencing tools such as PacBio long reads or Hi-C mapping, will clarify the placement of these sequences in the R. melanosticta genome and when this event happened since the split from oscines.

The translocations identified in our synteny plots likely underestimate the actual genomic translocations in the R. melanosticta genome in relation to other avian genomes, given that our chromosome-level assemblies were contingent on reference sequences. However, I have demonstrated that using information from multiple genomes in conjunction to place scaffolds on chromosomes allows for the emergence of patterns intrinsic to the target genome: the multireference assembly placed a group of scaffolds that were mapped to chromosome 4 in single-reference assemblies on chromosome 1 of R. melanosticta. The occurrence of rearrangements in this genomic region was detected by Longranger independently, and without the use of other avian genomes as reference, in the distal translocation of sequences between scaffolds in the same region (Fig. 7). This performance is likely possible due to the conservation of sequence adjacency information in the highly contiguous de novo assembled scaffolds, which is an important source of information for reference-based scaffold-to-chromosome placement (Kolmogorov et al. 2018; Liu et al. 2018).

4.3 Linked-read genome applicability in comparative phylogenomics

I successfully extracted most (87%) UCE loci from the genome, a number comparable to that of phylogenomic studies of birds that employed reduced representation genomic libraries with the same probeset (Andersen et al. 2018; White et al. 2017; Moyle et al. 2016; Manthey et al. 2016). This finding not only demonstrates the genome’s completeness but also its potential for future incorporation into the growing number of studies utilizing UCEs for phylogenomic research (Musher & Cracraft 2018; Oliveros et al. 2019; Andersen et al. 2018; Moyle et al. 2016). The use of allele information adds power to phylogenetic and demographic reconstructions based on target-capture libraries (Andermann et al. 2019), but phasing procedures of short-read non-model genomes are prone to errors (Bukowicki et al. 2016). In addition to recovering a high number of UCE loci in the genome, the SNPs recovered with the aid of linked-read barcodes are phased with high accuracy into long phaseblocks.

As a more in-depth assessment of the genome’s utility for comparative phylogenomics, I additionally mapped GBS data and used the identified loci to reconstruct the phylogenetic relationships of a sample of . In doing so, the resulting RAxML phylogeny agreed with the well-accepted taxonomic relationships that have been recovered in multiple previous studies

(Moyle et al. 2009; Marcondes & Brumfield 2019). Therefore, I have demonstrated that our R. melanosticta genome is valuable for a range of potential future uses. These results also corroborate the idea that GBS data, typically considered most useful at population-genetic scales, can be highly informative even at relatively deep timescales (Cariou et al. 2013; Manthey et al. 2016; Eaton et al. 2017).

5. Conclusions

Overall, I have demonstrated that linked-read technologies are valuable resources for generating high contiguity genomes, with relatively complete sequencing of putatively conserved genomic regions. Because of this high contiguity, I was able to phase SNPs in relatively long phaseblocks, a result that indicates significant informativeness of this genome for a range of studies. Similarly, the high completeness of our assembly relative to other publicly available genomes means that new data are available for incorporation into ongoing phylogenomic and population genomic work on avian evolution. I additionally showed that multireference-based assembly methods are useful for assembling scaffolds and comparing genomic structure across the avian tree of life and found relatively stable chromosomal structure spanning at least 65 My of evolutionary history. Notably, I detected a single sequence transfer from chromosome 4 to chromosome 1 in R. melanosticta in relation to the other genomes I evaluated. Future work should determine whether this translocation is synapomorphic for Tyranni, Thamnophilidae, or another node in the R. melanosticta history. I demonstrated that genomic structure unique to the target taxon can be detected with the guidance of reference genomes in assembling scaffolds to chromosomes.


CHAPTER 3 - Demographic history of Amazonian bird populations over multiple glacial cycles

1. Introduction

Climate cycles of the Pleistocene were major drivers of speciation, change in gene flow and change in biodiversity distribution globally (Garg et al., 2018; Saupe et al., 2019; Weir et al., 2016). After decades of phylogeographic and population genetics studies, some emerging patterns can be identified. Taxa from temperate regions tend to have idiosyncratic responses to climate shifts (Burbrink et al., 2016; Carstens et al., 2018); the geographic location of refugia of climatic stability shifted between cycles, and refugia were not all concentrated in lower latitudes (Fordham et al., 2019; Shafer et al., 2010; Stewart and Lister, 2001); and climatic stability is positively correlated with biodiversity and genetic diversity gradients (Carnaval et al., 2009; Hewitt, 2004, but see Cabanne et al. 2016). Furthermore, habitat generalist species underwent less fragmentation of past ranges in the late Pleistocene and show less phylogenetic structure than species with more restricted habitats (Reid et al., 2019; Silva et al., 2017). Because the tropics are currently and were historically more stable climatically than temperate regions, it has been suggested that climate cycles did not have pervasive effects on the demographic and evolutionary histories of tropical taxa (Lessa et al., 2003; Toews, 2015). Although a vast amount of work in population genetics of tropical species now disputes this notion by showing dynamic biodiversity histories in tropical regions during the Pleistocene (Carstens et al. 2018; Esquivel-Muelbert et al., 2019; Gehara et al. 2019), we still know relatively little about how historic climate shifts are related to the history of tropical biodiversity.


Amazonia is one of the planet’s most biodiverse regions (Antonelli et al., 2019). Variations in regional climate have been associated with both current patterns of biodiversity distribution (Silva et al., 2019) and mechanisms driving diversification in the region (Bonaccorso et al., 2006; Haffer, 1969; Peterson and Nyári, 2008). Amazonian climate, vegetation, and landscape were dynamic in the Pleistocene (Bush et al., 2011; Cheng et al., 2013; Pupim et al., 2019; Rossetti et al., 2017). Pleistocene climate cycles in Amazonia were complex: while southeastern Amazonia had a more unstable pattern of precipitation marked by sharp decreases during the Last Glacial Maximum (LGM), northwestern Amazonia maintained high levels of precipitation during glacial maxima, despite the decrease in temperatures (Cheng et al., 2013; Wang et al., 2017). These findings put renewed emphasis on the relationship between climate cycles and the history of Amazonian biota.

Major advancements have recently been made with single-locus sequencing in uncovering the effects of past climate cycles on the evolutionary history of Amazonian organisms (Chapter 1, Silva et al., 2019). Population genetics studies that took a multi-taxon, region-wide approach have found evidence that climate and the evolutionary history of biodiversity are tightly linked in Amazonia: there is a positive correlation between moisture gradient and taxon age, where taxa occurring in the drier eastern Amazonia are younger than taxa in moist western Amazonia (Silva et al. 2019). The magnitude of past change in population size is larger for populations occurring in southeastern Amazonia than for populations from the stable northwest, reflecting the effects of habitat instability on demographic history (Chapter 1).

These findings are broadly associated with climatic patterns, but the resolution of single-locus data restricts the depth and precision of demographic histories that can be recovered. Vegetation in Amazonia responds not only to long glacial cycles, but also to 40 Ky orbital cycles as well as abrupt millennial-scale climatic shifts (Zular et al., 2019). The demographic history recovered with one or a few genetic loci, as in previous studies, may be driven by major climate shifts that override responses to shorter climate cycles. Larger datasets, such as whole-genome data, are needed to recover detailed demographic history and unravel nuanced relationships between specific climate shifts and past population dynamics. Furthermore, past glacial-interglacial cycles are also natural laboratories where answers to pressing questions regarding the effects of global warming and climate change on current biodiversity can be elucidated (McGee 2020).

Understanding population dynamics and persistence is an important part of uncovering the relationship between historic environmental processes and the drivers of macroevolutionary biodiversity patterns (Harvey et al., 2019). In this study, I sought to uncover the relationship between climatic cycles of the Pleistocene and demographic history across two pairs of closely related forest taxa: populations of Psophia dextralis and Rhegmatorhina gymnops from the southeastern dry-corridor Tapajós region (SE from here on) and P. napensis and R. melanosticta from northwestern Amazonia (NW). Psophia are large, ground-dwelling, polyandrous birds with limited flight. They are gregarious and have an omnivorous diet. Rhegmatorhina are understory obligate ant-followers, and usually occur in pairs. These taxa were selected because their distributions are close to the cave speleothems from which records of past precipitation patterns were recovered (Cheng et al., 2013; Wang et al., 2019). The geographical limits in the dipole pattern of precipitation change across Amazonia (stable western Amazonia versus an unstable eastern Amazonia) are not well defined (Cheng et al., 2013). Therefore, focusing on populations that occur within the range of evidence of past climate cycles allows for a better understanding of evolutionary history in a paleoclimatic context. These taxa are good models for investigating response to past climate cycles because they are habitat specialists associated with undisturbed forest, as evidenced by their sensitivity to current forest disturbance (Lees and Peres, 2010; Moura et al., 2016; Stouffer and Bierregaard, 1995).

Here, I reconstructed demographic histories with a Hidden Markov Model (HMM) approach based on whole genomes (Terhorst et al., 2017), using the powerful inferences made possible by this method to elucidate the complex issue of the effects of Pleistocene climate cycles on Amazonian biodiversity. Using HMMs to recover the demographic history of populations that have geographic structure generates spurious results and inflates estimates of population size (Mather et al., n.d.; Mazet et al., 2015). Previous studies show that there is no geographic structure within the distribution of the focal taxa in this study (Psophia in Ribas et al. 2012 and Rhegmatorhina in Ribas et al., 2018), so the populations included in this study do not violate the unstructured population assumption.

Single-locus data suggest that the bird population expansions dating back to the last interglacial (LIG) were synchronous not only within SE and NW Amazonia but also between these two regions (Chapter 1). However, because of the limitations of dating based on haploid mitochondrial DNA and the large credibility intervals of the resulting time estimates, it is possible that the expansions observed within each region were in fact dispersed around the onset of the LIG (Chapter 1). Therefore, I expect to find that genome-wide inferences of demographic events across SE and NW Amazonia will be synchronous but of opposite natures, reflecting precipitation dipole patterns across Amazonia (Cheng et al. 2013; Wang et al. 2017): while populations in SE undergo decreases in size, populations in NW will be expanding, and vice-versa. I expect to find multiple cycles of population expansion and contraction, corresponding to multiple glacial cycles. Finally, I expect that NW populations will have higher genetic diversity than their SE counterparts, reflecting the stability in precipitation in NW Amazonia in relation to SE.

2. Methods

This study focuses on taxa occurring in two regions where both biotic and abiotic evidence for past climate was available (Cheng et al., 2013; Wang et al., 2017): the Napo region in northwestern Amazonia, here defined as the interfluvium between the Negro and Solimões rivers (from here on NW); and the Tapajós region, comprising the interfluvium between the Tapajós and Xingu rivers (SE, Fig. 1). Two taxa were sampled per region, and three individuals (two for Psophia napensis) were sampled per population (Fig. 10, Table 9): Rhegmatorhina melanosticta melanosticta and Psophia napensis were the focal taxa of the NW region, and R. gymnops and P. dextralis were sampled in SE (Table 9). I included one outgroup taxon for each study population for estimating split time from ancestral populations: P. crepitans and R. melanosticta purusianus were the outgroups of the NW populations; P. interjecta and R. berlepschi were the outgroups in SE. This sampling strategy resulted in 15 whole genomes: 11 for the focal populations, and four outgroups.


Taxon                       Voucher     Source  Country   Coverage (x)  Coverage SD
Psophia crepitans           YPM139299   YPM     Suriname  23.321044     74.527905
Psophia dextralis           FMNH391995  FMNH    Brazil    23.343959     71.136209
Psophia dextralis           FMNH391996  FMNH    Brazil    17.287674     67.307009
Psophia dextralis           A14623      INPA    Brazil    5.570935      53.805942
Psophia interjecta          FMNH456447  FMNH    Brazil    23.284485     78.554307
Psophia napensis            FMNH456446  FMNH    Brazil    20.388183     71.61875
Psophia napensis            RAP047      MPEG    Brazil    7.147403      59.972656
R. berlepschi               A11494      INPA    Brazil    9.927034      124.542167
R. gymnops                  FMNH392075  FMNH    Brazil    14.873618     146.658885
R. gymnops                  MPEG69308   MPEG    Brazil    16.247109     159.475592
R. gymnops                  A11932      INPA    Brazil    14.151088     114.275234
R. melanosticta             MSB36483    MSB     Peru      49.789978     305.303487
R. melanosticta             LSUB4324    LSU     Peru      14.663404     193.16211
R. melanosticta             LSUB7001    LSU     Peru      13.110432     193.674312
R. melanosticta purusianus  PUC179      PUC     Brazil    17.44012      165.933474

Table 7. Museum voucher information for samples. The average coverage and its standard deviation for the genomes are presented in the last two columns. FMNH: Field Museum of Natural History; INPA: Instituto Nacional de Pesquisas da Amazônia; MPEG: Museu Paraense Emilio Goeldi; LSU: Louisiana Museum of Natural History; MSB: Museum of Southwestern Biology; PUC: Pontifícia Universidade Católica; YPM: Yale Peabody Museum.

Figure 10. Sampling localities of specimens used in the study. The bold lines represent the regions investigated: northwestern Amazonia (NW) is the region highlighted in blue, and southeastern Amazonia (SE) is in orange. Green represents Psophia samples (P. napensis in NW and P. dextralis in SE), and purple represents Rhegmatorhina samples (NW: R. melanosticta, SE: R. gymnops). Triangles represent sampling sites of specimens used as outgroups. The stars represent localities with evidence for past environmental changes in Amazonia (from Cheng et al., 2013; D’Apolito et al., 2013; Wang et al., 2017).

2.1 Reference genomes

The specimen used for the Rhegmatorhina reference genome was a female Rhegmatorhina melanosticta from San Martin, Peru (Museum of Southwestern Biology voucher MSB:Birds:36483; http://arctos.database.museum/guid/MSB:Bird:36483). The genome was sequenced with a linked-read approach, which generated a high-contiguity genome (Coelho et al., 2019). Detailed descriptions of the sequencing and assembly protocols can be found in Chapter 3 (Coelho et al., 2019).


I used the Psophia crepitans genome of a male specimen from Guyana (USNM-625105), available from the Bird 10K project (Zhang, 2015, sequence ID B10K-DU-001-60), as the reference genome for all Psophia genome resequencing. The genome was sequenced with paired-end (150bp reads with an insert size of 500kb) and mate-paired (49bp reads with 2000bp insert size) libraries, resulting in a 1.17Gb genome composed of 36535 scaffolds and contigs (N50 of 468kb and 39kb, respectively).

2.2 Genome resequencing

I extracted DNA from tissue samples (Table 9) using the DNeasy Blood & Tissue Kit protocol from Qiagen (Qiagen, Valencia, California). Whole genomes were shotgun sequenced at the RAPiD Genomics genome sequencing facilities. The samples were pooled in two batches, one of 7 and another of 6 samples, for paired-end sequencing in two lanes of the Illumina HiSeqX platform, with a read length of 150bp and insert size of 500bp. The outgroup P. crepitans genome was sequenced in one shared lane (with a target of 50 Gb of sequence) of the Illumina NovaSeq S4 platform at HudsonAlpha, with the same read type and length as previously described. The adapters were removed from raw reads with Trimmomatic v.0.38 using the TruSeq3-PE-2.fa adapter set with clipping settings 2:30:10, trimming the first 2 bases (Bolger et al., 2014). I assessed quality and successful removal of adapters with FastQC version 0.11.6 (Simon Andrews, Babraham Bioinformatics). When adapter contamination was detected after initial trimming (as was the case for P. crepitans), an additional adapter removal step was done with default settings of AdapterRemoval v.2 with the --identify-adapters function (Schubert et al., 2016).

The raw reads were mapped to the references (described in the previous section) with bwa v.0.7.15 using the mem function and options -M and -t 28 (Li and Durbin, 2009). I used the picard toolset (from http://broadinstitute.github.io/picard/) to process the raw alignments into indexed bam alignments. First, I converted the raw alignments to sorted bam format with the SortSam function (sorting order set to coordinate), followed by the addition of read group information with AddOrReplaceReadGroups. I then removed duplicated reads (MarkDuplicates) and collected alignment metrics (CollectRawWgsMetrics). Finally, I created an alignment index (BuildBamIndex) and validated the alignment (ValidateSamFile). This process was also done with the 10x Chromium genome used as reference, so that this sample could be included in the variant calling steps. I trimmed the first 22 bases (the read location barcode) from the raw forward reads of the R. melanosticta sample sequenced with the 10x Chromium platform. I then aligned the reads to the de novo assembled scaffolds as described for the other samples.
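The order of the mapping and post-processing steps described above can be sketched as a small Python helper that only assembles the command lines, without executing them. This is an illustration under stated assumptions rather than the exact commands that were run: the file names, the read-group values, the picard.jar location and the REMOVE_DUPLICATES=true setting are hypothetical placeholders, and bwa mem writes its alignment to standard output, which would need to be redirected to the SAM file.

```python
from typing import List


def alignment_commands(sample: str, ref: str, fq1: str, fq2: str,
                       picard_jar: str = "picard.jar") -> List[List[str]]:
    """Return the ordered command lines for one sample (nothing is executed).

    All file names and read-group values are hypothetical placeholders.
    """
    picard = ["java", "-jar", picard_jar]
    sam = f"{sample}.sam"                # bwa mem stdout would be redirected here
    sorted_bam = f"{sample}.sorted.bam"
    rg_bam = f"{sample}.rg.bam"
    dedup_bam = f"{sample}.dedup.bam"
    return [
        # Map reads with the options given in the text (-M, -t 28).
        ["bwa", "mem", "-M", "-t", "28", ref, fq1, fq2],
        # Sort by coordinate and convert to BAM.
        picard + ["SortSam", f"I={sam}", f"O={sorted_bam}",
                  "SORT_ORDER=coordinate"],
        # Add read-group information (placeholder values).
        picard + ["AddOrReplaceReadGroups", f"I={sorted_bam}", f"O={rg_bam}",
                  f"RGID={sample}", f"RGLB={sample}", "RGPL=illumina",
                  f"RGPU={sample}", f"RGSM={sample}"],
        # Remove duplicates (assumed setting: REMOVE_DUPLICATES=true).
        picard + ["MarkDuplicates", f"I={rg_bam}", f"O={dedup_bam}",
                  f"M={sample}.dup_metrics.txt", "REMOVE_DUPLICATES=true"],
        # Collect alignment metrics.
        picard + ["CollectRawWgsMetrics", f"I={dedup_bam}",
                  f"O={sample}.wgs_metrics.txt", f"R={ref}"],
        # Index and validate the final alignment.
        picard + ["BuildBamIndex", f"I={dedup_bam}"],
        picard + ["ValidateSamFile", f"I={dedup_bam}", "MODE=SUMMARY"],
    ]
```

Each returned list could be passed to subprocess.run, with the output of the first command redirected to the SAM file.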

2.3 Single Nucleotide Polymorphism Calling and Filtering

I followed the Genome Analysis Toolkit (GATK, v. 3.8) best practices for non-model organism variant calling (Van der Auwera et al., 2013). First, I called variants for each sample individually with HaplotypeCaller in GVCF mode. I set option hets to 0.005, nct to 16, and minPruning and minDanglingBranchLength to 1 to conserve SNPs with low coverage. I then did a joint genotype calling (GenotypeGVCFs) of all samples of each genus combined, given that the inclusion of polymorphism information from different taxa can improve variant calling for populations (Galla et al., 2019).

I plotted the scores of SNP quality metrics (QD, DP, FS, MQ, MQRankSum, SOR and ReadPosRankSum) in R v. 3.5 (R Core Team, 2019) for the two sets of raw SNPs (Rhegmatorhina and Psophia) to identify cutoff scores for quality filtering (Fig S4.1). The final cutoffs were the same for the two datasets and were the minimum values suggested in the GATK best practices pipeline. SNPs were flagged for removal if they met any of the following conditions: DP < 9; ReadPosRankSum < -2; QD < 9; FS > 20; SOR > 2.0; MQ < 55.0; or MQRankSum < -1.0. Variants were tagged with the VariantFiltration function, and SNPs with passing scores were selected with SelectVariants and the --excludeFiltered flag. I used the R library SNPRelate (Zheng et al., 2012) to plot PCA graphs of the samples after haplotype calling, to ascertain sample identification and patterns of genetic variation distribution among populations.
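To make the direction of each cutoff explicit, the hard-filter thresholds listed above can be written as a small Python check. This is only a sketch of the filtering logic, not the GATK implementation: the function name is hypothetical, and treating a missing annotation as passing is an assumption made here for illustration.

```python
from typing import Callable, Dict, List, Optional

# Failure conditions taken from the thresholds listed in the text.
FILTER_RULES: Dict[str, Callable[[float], bool]] = {
    "DP": lambda v: v < 9,
    "ReadPosRankSum": lambda v: v < -2.0,
    "QD": lambda v: v < 9,
    "FS": lambda v: v > 20,
    "SOR": lambda v: v > 2.0,
    "MQ": lambda v: v < 55.0,
    "MQRankSum": lambda v: v < -1.0,
}


def failed_filters(annotations: Dict[str, Optional[float]]) -> List[str]:
    """Return the names of the filters that a SNP fails.

    Missing annotations are treated as passing (an assumption of this sketch).
    """
    failed = []
    for name, is_bad in FILTER_RULES.items():
        value = annotations.get(name)
        if value is not None and is_bad(value):
            failed.append(name)
    return failed
```

A SNP would be retained only when the returned list is empty, which mirrors selecting unfiltered records after tagging.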

2.4 Demographic reconstruction

I used the statistical package SMC++ version 1.15.3 to reconstruct plots of past effective population size (Ne) through time (Terhorst et al., 2017). SMC++ employs a Hidden Markov Model (HMM) based on linkage information across the genome and unphased SNP distribution. I masked missing sites (base pairs marked as N) from the genomes prior to demographic reconstruction to preclude the false occurrence of long homozygosity stretches. I made missing data masks with the function cutN and flag -gp set to 107 of the software seqtk version 1.2. I indexed the vcf files with filtered SNPs using tabix (version 1.9) of the samtools package (Li, 2011).
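The idea behind the missing-data mask can be illustrated with a short Python function that lists the runs of N in a sequence as BED-style intervals. This is a simplified stand-in written for illustration, not the algorithm used by seqtk cutN; the function name and the min_len parameter are hypothetical.

```python
import re
from typing import List, Tuple


def n_runs(name: str, seq: str, min_len: int = 1) -> List[Tuple[str, int, int]]:
    """Return BED-style (0-based, half-open) intervals of N runs in seq.

    Only runs of at least min_len bases are reported.
    """
    return [
        (name, m.start(), m.end())
        for m in re.finditer(r"[Nn]+", seq)
        if m.end() - m.start() >= min_len
    ]
```

For example, the sequence ACNNNGTNA contains two runs of missing bases, at positions 2-5 and 7-8 in half-open coordinates; intervals like these are what the -m mask tells the demographic model to treat as missing rather than as homozygous.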

HMM-based analyses perform better with more contiguous genome sequences and can lose accuracy when scaffolds shorter than one million base pairs (Mb) are used (Gower et al., 2018). Conversely, if the genome scaffolds are long enough to comprise uninterrupted linkage blocks, demographic history can be recovered using HMM methods (Patton et al., 2019). As the contiguity of the reference genome assembly for Psophia was relatively low (0.47Mb N50), I tested two different sequence-length cutoffs to assess the tradeoff between increase in sequence contiguity, number of base pairs included in the analyses, and demographic reconstruction. I created two datasets: one including all Psophia scaffolds over 1Mb in length (a total of 287Mb) and another including scaffolds over 500 thousand base pairs (Kb, a total of 557.7 Mb). To decrease the disparity between the amount of data input for the two genera, I randomly selected 108 scaffolds longer than 2.5Mb from the Rhegmatorhina reference genome (514.2 Mb). After some exploratory analysis of the distribution of SNPs across Psophia scaffolds (details in results), I found that these cutoffs (2.5Mb for Rhegmatorhina and 500 Kb for Psophia) represented a good tradeoff: they maintained scaffold contiguity for both genera while decreasing the disparity in total sequence length and contiguity input into the analyses, thus improving comparability.
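The scaffold selection described above amounts to a length filter followed, for Rhegmatorhina, by a random draw. A minimal Python sketch follows; the function names, the seed and the example scaffold lengths are hypothetical and serve only to show the logic.

```python
import random
from typing import Dict, List, Optional


def select_scaffolds(lengths: Dict[str, int], min_len: int,
                     n_random: Optional[int] = None,
                     seed: int = 0) -> List[str]:
    """Keep scaffolds longer than min_len; optionally draw n_random of them."""
    keep = sorted(name for name, size in lengths.items() if size > min_len)
    if n_random is not None and n_random < len(keep):
        # Random draw without replacement, reproducible through the seed.
        keep = sorted(random.Random(seed).sample(keep, n_random))
    return keep


def total_mb(lengths: Dict[str, int], names: List[str]) -> float:
    """Total length of the selected scaffolds in Mb."""
    return sum(lengths[n] for n in names) / 1e6
```

With a dictionary of scaffold lengths, a call such as select_scaffolds(lengths, 2_500_000, n_random=108) would correspond to the Rhegmatorhina selection, and total_mb reports how much sequence each cutoff retains.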

The default settings of SMC++ are set to improve performance when analyzing dozens of unphased human genomes. To optimize the parameters for bird genomes, I tested data thinning (k parameter), number of cross-validation folds (2-5) and the cubic and piecewise spline methods of model building for our dataset (SFig. 7-9). I built the conditional site-frequency spectrum (CSFS) for every 300, 500, 600 and 1200 sites (k=500 in SFig. 7). After assessing effects of thinning on demographic curves, I ran all analyses with the default value (k=400). This value is close to the suggested calculation of k for our sample size (k=477, given k=log(n)*1000) and also worked well for multiple groups analyzed by the authors (Terhorst et al., 2018). I did not implement thinning for the models built for single individuals (outgroups, more details below): as there is no problem with correlation of branches for a single individual, branches can be assumed to be independent. Running the analysis in two folds (default) recovered the same pattern as 3-5 folds (SFig. 8), while being less computationally intensive. Therefore, final analyses were run with 2-fold cross-validation. Although the smoothing of abrupt changes in population size by the cubic spline method is likely to better represent natural population dynamics (Terhorst et al., 2017), I noticed that it lost demographic events that were detected with the piecewise model (SFig. 9), and I therefore opted for the piecewise model for all of our demographic history reconstructions. This approach also facilitates future comparability of our results with other HMM methods that implement this type of demographic reconstruction.
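The suggested thinning value quoted above (k=477, given k=log(n)*1000) can be reproduced with a one-line calculation. The sketch below assumes a base-10 logarithm and n = 3 samples; both are assumptions made here because they recover the quoted value, and the function name is hypothetical.

```python
import math


def suggested_thinning(n: int) -> int:
    """Suggested thinning k = log10(n) * 1000, rounded to the nearest integer.

    The base-10 logarithm is an assumption that reproduces k = 477 for n = 3.
    """
    return round(1000 * math.log10(n))
```

Under these assumptions, suggested_thinning(3) gives 477, the value compared with the default of k=400 in the text.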

I converted variant information from VCF files for each scaffold to SMC++ input format using the vcf2smc command in SMC++, masking the missing sites (-m flag) of the reference genome. I made input files using each sample taken one at a time as the distinguished individual, resulting in a total number of input files equal to the number of scaffolds multiplied by the number of samples in the population (three, except P. napensis with two samples). This approach results in a final estimate of historic population size with composite likelihood (Terhorst et al., 2018). Composite likelihood reduces noise in the data and increases robustness of demographic reconstruction in recent time. I then made input files including all samples in each population and the single outgroup sequences as input for an estimate of split time (split function). The split time is estimated with coalescent scaling, so it was converted to years by multiplying by 2*N0*generation time. I used the cv function with a mutation rate of 1.6e-9 per site per generation (Nadachowska-Brzyska et al., 2015b) for Psophia and 4.6e-9 per site per generation (Smeds et al., 2016) for Rhegmatorhina. As natural history information such as recruitment, average breeding female age and survival is lacking for the species studied, I used female reproductive maturity as a conservative lower bound of generation time (Brommer et al., 2004): two years for Psophia and one for Rhegmatorhina, based on closely related taxa (Sherman, 1995; Willis and Oniki 1978).

These generation times were used for most of the analyses, unless otherwise stated. To explore the effects of differences in generation time (T) in demographic history reconstruction, I also used an upper bound of generation time based on single observation values of yearly recruitment (1.6 for Psophia, Sherman 1995; 0.5 for Rhegmatorhina, Willson 2004) as proxies for survival (sf), and female reproductive maturity (rf), where T = rf + 1/sf (Lehtonen and Lanfear, 2014): 2.6 and 3 years for Psophia and Rhegmatorhina, respectively. I ran 50 bootstrap analyses for the demographic models by randomly sampling 70% (Efron and Tibshirani, 1997) of the scaffolds of each population and rerunning the analyses. I excluded 5% of the data from either end, as population size inference becomes less robust towards the present and the most distant past retrieved in the analysis.
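The arithmetic behind the upper-bound generation times, the conversion of coalescent-scaled split times to years, and the scaffold resampling can be sketched in Python as follows. This is an illustration only: the function names and the seed are hypothetical, and drawing the 70% subsample without replacement is an assumption of the sketch.

```python
import random
from typing import List, Sequence


def generation_time(rf: float, sf: float) -> float:
    """Upper-bound generation time T = rf + 1/sf."""
    return rf + 1.0 / sf


def coalescent_to_years(t_coal: float, n0: float, gen_time: float) -> float:
    """Convert a coalescent-scaled time to years: t * 2 * N0 * generation time."""
    return t_coal * 2.0 * n0 * gen_time


def bootstrap_scaffolds(scaffolds: Sequence[str], fraction: float = 0.7,
                        replicates: int = 50, seed: int = 0) -> List[List[str]]:
    """Draw `replicates` random subsets holding `fraction` of the scaffolds.

    Sampling without replacement is an assumption of this sketch.
    """
    rng = random.Random(seed)
    size = round(len(scaffolds) * fraction)
    return [rng.sample(list(scaffolds), size) for _ in range(replicates)]
```

With the values given in the text, generation_time(2, 1.6) is 2.625 (reported as 2.6 years for Psophia) and generation_time(1, 0.5) is 3.0 (Rhegmatorhina).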

2.5 Genetic diversity estimates

I estimated nucleotide diversity (pi) for 500 Kb windows across the genomes, only including sites with no missing data, with the --window-pi function from the vcftools package (Danecek et al., 2011). For comparisons between genetic diversity of the two focal regions, I did pairwise comparisons of only the first 500 Kb window of each scaffold, to ensure independence between the nucleotide diversity estimates. For this, I excluded all scaffolds under 500 Kb. I then ran paired t-tests in R v.3.5.3 (R Core Team, 2019) between the nucleotide diversity of this subset of scaffolds (388 for Rhegmatorhina and 571 for Psophia).
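The paired comparison described above can be illustrated with a short Python function that computes the paired t statistic and its degrees of freedom from two matched lists of window-level pi values (one value per scaffold for each population). This is a sketch using only the standard library, not the R code used for the analysis; the function name is hypothetical and the p-value is deliberately left out, since it would require a t-distribution routine.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(x: Sequence[float], y: Sequence[float]) -> Tuple[float, int]:
    """Return (t statistic, degrees of freedom) for a paired t-test of x - y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    diffs = [a - b for a, b in zip(x, y)]
    # Standard error of the mean difference (sample standard deviation).
    se = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / se, len(diffs) - 1
```

With one window per scaffold, the degrees of freedom equal the number of scaffolds minus one, so 388 paired windows give 387 degrees of freedom.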

3. Results

3.1 Genome Resequencing and genetic diversity

The sequencing coverage of the seven resequenced Rhegmatorhina genomes had an average of 14.3x, ranging from 9.9 to 17x (Table 9). I detected 25.62 million (M) raw SNPs across all Rhegmatorhina genomes that were reduced to 22.65M after quality filtering. The average sequencing coverage for seven resequenced Psophia genomes was 17.4x, ranging from 5.6 to 23.3x (Table 9). I detected 21.4 M SNPs across all Psophia samples.

Genetic diversity was higher in NW populations than in SE populations (Fig. 2) both in the pairwise comparison between Rhegmatorhina populations (t=-21.669, df=387, p<2e-16) and between Psophia populations (t=-22.93, df=570, p<2e-16). The average nucleotide diversity was 0.0021 for R. melanosticta and 0.0012 for R. gymnops. Nucleotide diversity was 0.0004 for P. napensis and 0.0002 for P. dextralis.

Figure 11. Distribution (density, y axis) of nucleotide diversity (pi, x axis) across 500 Kb genome windows in Rhegmatorhina (A) and Psophia (B). The hashed lines are the means of the distributions.

3.2 Demographic history within regions

The population size through time plots for Psophia built using the 500 Kb cutoff produced underfitted curves with less independence between sequential Ne values (i.e. less detail) than those built with the 1 Mb cutoff (Fig. S4.5), likely due to an increase in noise in the shorter sequences, and the analysis was more computationally intensive. However, the overall demographic history pattern was preserved (SFig. 10). I opted to use the 500 Kb scaffold dataset to improve comparability with the Rhegmatorhina dataset (both totaling over 500Mb, as described in methods).

The curves of population size change through time for populations from SE and NW had some overarching patterns: the genomes of both populations from SE had less signal of deep coalescent events than their northwestern counterparts, resulting in shorter demographic history curves (orange lines in Fig. 12-13). Change in population size through time followed the same pattern for both SE populations in the past 300 Ky (e.g. T2-4 in Fig. 14), though the onset of these changes is delayed in R. gymnops in relation to P. dextralis. The oldest demographic events depicted for these populations show a small initial population that experienced two events of steep increase, separated by a period of decrease and relative stability. The population histories in the NW region have idiosyncratic patterns (blue lines in Fig. 12-13), other than having medium-sized initial populations when compared to SE populations (orange lines, Fig. 12-13).

3.3 Demographic history within taxa

The comparative demographic histories within each focal taxon had discrepant patterns. From 400 Ky to 160 Ky, R. melanosticta had a steady increase in population size, with periods of stability interspersed with periods of increase at a rate convergent with shifts in global temperature cycles (Fig. 12, orange and blue shaded areas). The gradual change in R. melanosticta population size was interrupted by a marked increase in population size around 160 Ky ago. R. gymnops had a more punctuated pattern of demographic change during this period: a marked increase in population size around 250 Ky ago from a small ancestral population (orange line at T3 in Fig. 12) was followed by a period of relative stability. The split between R. gymnops and its sister group (R. berlepschi, purple triangle on the edge of the range of R. gymnops in Fig. 1, Ribas et al., 2018) happened during this phase, estimated at 185.5 Ky. Another sharp increase in population size, in which the population more than doubled in size (from 100 K individuals to 242 K), occurred following the termination of the penultimate glacial maximum (PGM, 125 Ky ago at T2 in Fig. 12), corresponding to an increase in precipitation in the region in the Last Interglacial (LIG). In contrast, there was a sharp decrease in the population size of R. melanosticta around the same time period (118 Ky ago), when the population was reduced to almost half its size in the preceding generations (from 390 K individuals to 220 K). There was a decrease in the R. gymnops population around 90 Ky. The split between R. melanosticta melanosticta and R. melanosticta purusianus (purple triangle in central Amazonia, Fig. 2) was estimated at around 80 Ky ago. No responses were detected in the period corresponding to the LGM in either of the populations (Fig. 12, T1), nor any increases in population size thereafter.

In contrast with Rhegmatorhina populations, the demographic histories for Psophia populations in NW and SE Amazonia were more disparate in the deeper temporal scale and converged toward more recent times (Fig. 13). The population of Psophia dextralis had a sharp increase from its initial small population around 530 Ky (Fig. 13, T6), whereas the population of P. napensis had been reduced by half in the preceding years (Fig. 13, T7). From that initial discrepancy, the demographic history of the two populations had similar patterns for the following 300 Ky. While a major change in population size was out of phase between Rhegmatorhina populations during the onset of LIG, populations of both P. napensis and P. dextralis had population expansions leading up to this period (cold period in blue preceding T2, Fig. 13). Phases of stable population sizes (e.g. between 400-300 Ky and 280-160 Ky ago, Fig. 13), expansion (140 Ky and 169 Ky ago for P. napensis and P. dextralis, respectively), and decrease (around 290 Ky ago and again 60 Ky ago) happened in a similar sequence for the two Psophia populations. P. napensis lagged behind the onset of demographic change for P. dextralis by intervals ranging from 10 to 30 Ky. Another difference between the two histories was that the amplitude of change during these periods of similar demographic events was mostly greater for P. dextralis than for P. napensis (Fig.4, 400-50 Ky ago). The split times of P. dextralis and P. napensis from their respective outgroups were 334.5 Ky and 887.5 Ky, respectively.


The use of different generation times did not alter the demographic patterns within each genus and region, but it did alter their relationship with specific climate events, pushing the onsets of population size change back by 20 Ky to 200 Ky (Figs. 14-15, SFig. 9).
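The dependence on generation time follows from how coalescent methods date events: times are estimated in generations and converted to years by multiplying by an assumed generation time. The sketch below illustrates the resulting window for a single event. The event age in generations and the two generation times are hypothetical values chosen for illustration, not estimates from this study.

```python
def event_window_ky(time_generations: float,
                    gen_time_low: float,
                    gen_time_high: float) -> tuple:
    """Youngest and oldest age of an event (in Ky) for a range of generation times."""
    if not 0 < gen_time_low <= gen_time_high:
        raise ValueError("generation times must be positive and ordered")
    youngest = time_generations * gen_time_low / 1000.0
    oldest = time_generations * gen_time_high / 1000.0
    return youngest, oldest

# Hypothetical event dated at 50,000 generations ago,
# with generation times of 2.5 and 4 years (assumed bounds).
youngest, oldest = event_window_ky(50_000, 2.5, 4.0)
print(f"event window: {youngest:.0f}-{oldest:.0f} Ky ago "
      f"(shift of {oldest - youngest:.0f} Ky)")
```

Because the conversion is multiplicative, older events shift by more years than recent ones, which is why the temporal window widens toward the past.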

[Figure 12: plot of effective population size (Ne) against time (Kyr, 0-400) for R. melanosticta and R. gymnops, with terminations T1-T4 marked.]

Figure 12. Demographic history of two Rhegmatorhina populations across Amazonia: orange lines represent R. gymnops (SE) and blue lines are R. melanosticta melanosticta (NW). Thin lines represent bootstrap runs. The gray lines represent the transitions from glacial maxima to interglacial phases (terminations, T1-T4 from Cheng et al., 2016). The shaded blue areas are periods of cooler global temperatures and the orange areas are periods of warmer global climate (Cheng et al., 2016).


[Figure 13: plot of effective population size (Ne) against time (Kyr, 0-700) for P. napensis and P. dextralis, with terminations T1-T7 marked.]

Figure 13. Demographic history of two Psophia populations across Amazonia: orange lines represent P. dextralis (Tapajós) and blue lines are P. napensis (Napo). Thin lines represent bootstrap runs. The gray lines represent the transitions from glacial maxima to interglacial phases (terminations, T1-T7 from Cheng et al., 2016). The shaded blue areas are periods of cooler global temperatures and the orange areas are periods of warmer global climate (Cheng et al., 2016).


Figure 14. Comparison between demographic histories of two populations in SE. The thick hashed lines represent the demographic history estimated with the upper bound of generation time for each taxon, and the thin lines represent the range of time for each demographic event given the lower and upper generation times. Asian monsoon data are from Cheng et al. (2016). Note that precipitation in SE is inversely related to the Chinese monsoon (Wang et al., 2017): low δ18O values (high monsoon precipitation) represent low precipitation in SE.


Figure 15. Comparison between demographic histories of two populations in NW. The thick hashed lines represent the demographic history estimated with the upper bound of generation time for each taxon, and the thin lines represent the range of time for each demographic event given the lower and upper generation times. Asian monsoon data are from Cheng et al. (2016). Note that precipitation in Napo corresponds directly to the Chinese monsoon (Cheng et al., 2013): low δ18O values (high monsoon precipitation) represent high precipitation in Napo.

4. Discussion

I presented herein the comparative population genomics of two taxa that, despite being ecologically and phylogenetically divergent, share many similarities in their population dynamics in the mid to late Pleistocene, but also show revealing discrepancies. The fact that the ranges of these populations are adjacent to sites with detailed records of past climate in Amazonia adds a powerful context in which to interpret the patterns I have uncovered. The similarity in population fluctuation patterns and genetic diversity among taxa in the unstable SE, coupled with temporally protracted onsets (Psophia) and off-phase simultaneous changes in size (Rhegmatorhina) within populations occurring in different regions, reveals complexity and nuance in the relationship between the evolutionary history of Amazonian populations and climate cycles.

4.1 Comparative population history

Populations occurring in northwestern Amazonia had idiosyncratic demographic histories.

Precipitation patterns in this region were more stable during global shifts in temperature (Cheng et al., 2013). Ecological differences of the taxa likely prevailed over the effects of broad precipitation shifts and drove highly variable demographic histories for P. napensis and R. melanosticta. Regardless of differences between the change in their population size through time, both these populations had higher genetic diversity than their southeastern counterparts. This difference in genetic diversity across regions suggests that NW Amazonia had a more stable environment during the Pleistocene in relation to SE (Arenas et al., 2012; Carnaval et al., 2009).

The small population sizes and shorter demographic history during deeper coalescent events of SE populations could be indicative of a reduced ancestral population size and/or loss of genetic diversity through a population bottleneck (deep coalescent events are less frequent and are lost at higher rates during population bottlenecks; Gehara et al., 2017; Patton et al., 2019). In addition to comparable deficits of genomic diversity, SE populations had similar patterns of demographic change in the mid to late Pleistocene, with two cycles of population increase in the past 500 Ky. This time interval is longer than the oxygen isotope records of past hydroclimate in Amazonia and suggests that the patterns described by Cheng et al. (2013) and Wang et al. (2017) likely prevailed well into the latter half of the Pleistocene.
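The loss of diversity through a bottleneck can be illustrated with the standard expectation that, under genetic drift alone, heterozygosity declines by a factor of 1 - 1/(2Ne) each generation. The sketch below applies that expectation to two hypothetical populations. It ignores mutation and migration, and the sizes and duration are illustrative assumptions rather than values from this study.

```python
def heterozygosity_remaining(ne: float, generations: int) -> float:
    """Expected fraction of initial heterozygosity left after drift alone."""
    if ne <= 0 or generations < 0:
        raise ValueError("ne must be positive and generations non-negative")
    return (1.0 - 1.0 / (2.0 * ne)) ** generations

# Hypothetical scenario: 1,000 generations at two effective sizes.
bottlenecked = heterozygosity_remaining(ne=500, generations=1000)
stable = heterozygosity_remaining(ne=50_000, generations=1000)

print(f"bottlenecked population retains {bottlenecked:.1%} of heterozygosity")
print(f"stable population retains {stable:.1%} of heterozygosity")
```

Under these assumptions the small population loses most of its diversity over the same interval in which the large population loses about one percent.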

The relationship between global cooling, regional precipitation, moisture fluctuations and vegetation dynamics is complex, and cooler temperatures do not necessarily equate to a drier climate (McGee, 2020; Scheff et al., 2017). Although one species might track changes in humidity, a co-occurring species could be more affected by changes in temperature. Psophia populations occurring in regions with disparate changes in precipitation during glacial maxima had similar demographic events, with short lag times between their onsets. It is likely that these changes in population size occurred within the same glacial cycles, regardless of the methodological constraints on assigning them with precision to any particular cycle (discussed below). Models of past climate in Amazonia depict historical oscillations in precipitation, as oxygen isotope ratios in cave speleothems reflect atmospheric moisture (Cheng et al., 2013; Wang et al., 2017). As these precipitation patterns largely overlap with long-term global patterns of cyclical changes in sea level and monsoons, precipitation oscillations can be related to global cycles of cooling and warming, as well as to regional patterns of atmospheric moisture transfer during these global events (Wang et al., 2017). This suggests that, while precipitation varied regionally and oscillations were off-phase in eastern and western Amazonia, changes in temperature likely occurred concurrently across the region. Region-wide records of past temperature cycles are scarce in Amazonia (Wang et al., 2017), but changes in vegetation composition due to encroachment of C3 grasses and Andean taxa suggest large-scale cooling of

Amazonia during past glacial cycles (Colinvaux et al., 1996; D’Apolito et al., 2017; Malhi et al., 2004; Punyasena et al., 2008). Psophia populations could be more affected by temperature oscillations than Rhegmatorhina populations, which would drive the similar demographic patterns in populations occurring at the geographic and climatic extremes of Amazonia. Genome-wide variants of an Amazonian lizard species also showed signals of synchronous expansion for populations occurring in different regions of Amazonia (Prates et al., 2016). Marked precipitation oscillations in SE likely intensified the effects of global temperature changes in the region, driving the earlier onset of demographic change in P. dextralis and the convergent patterns of demographic history for the taxa in this region. There is some evidence that Neotropical birds have increased mortality during temperature increases (Wolfe et al., 2014; Woodworth et al., 2018). There are also records of a decrease in insectivorous bird activity and an increase in frugivorous and omnivorous bird activity during periods of cooler temperatures in Amazonia (Willis, 1976). These suggestive ecological findings, along with the evolutionary-scale conclusions of this study, highlight the complexity of the effects of climate on the evolutionary history of Amazonian birds. The ideas presented here point to the need for more studies with geographically explicit predictions in Amazonia, and to the power of trait-based evolutionary biological studies (Papadopoulou and Knowles, 2016).

There was an off-phase relationship between R. melanosticta and R. gymnops during a major event of population size change at the onset of the LIG. These findings correspond to expectations based on the precipitation dipole described for Amazonia (Cheng et al., 2013). The study based on mtDNA found synchronous expansion both within NW and SE Amazonia and between these two regions (Chapter 2). The discordance with the genomic data could have methodological reasons. Although studies based on single loci have brought important advances in our understanding of the evolutionary history of Neotropical taxa (Carnaval et al., 2009; Smith et al., 2014; reviewed in Leite and Rogers, 2013), the information contained in single loci is very limited. The large credibility interval around the co-expansion times found in Chapter 1 could represent a broad convergent demographic pattern for co-occurring taxa around periods of intense climatic change, but not necessarily with exact temporal overlap (but see the section on methodological issues with whole-genome-based demography below).


4.2 Individual population history

Violations of coalescent model assumptions, such as panmixia and the absence of population structure, can compromise the estimation of effective population size (or inverse coalescence rate) in HMM-based approaches (Mather et al., 2019; Mazet et al., 2016). Peaks in population size could be associated with an increase in population structure, especially if migration rates between subpopulations are low (Mather et al., 2019). The magnitude of population size change in the deepest part of the demographic plots of P. napensis (600-900 Kya) was higher than in any other period, and the population size during this period was conspicuously higher than the average observed throughout the remainder of the demographic history recovered. The large ancestral population size of P. napensis, which coincides with the estimated split time from P. crepitans, likely reflects a period of increased structure, resulting in rapid diversification of the ancestral lineage that gave rise to P. napensis, P. crepitans and P. ochroptera (Ribas et al., 2012). These ancestral populations likely maintained some gene flow during differentiation, which is also reflected in the difficulty of recovering relationships between these clades in a phylogenetic framework (Ribas et al., 2012). Explicit tests of demographic models with the genomic data can help elucidate the complex history of recent diversifications and contact zones across the Negro and Branco rivers (Beichman et al., 2018; Naka and Brumfield, 2018).
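The effect of structure on inferred size can be illustrated with a symmetric island model, in which the inverse instantaneous coalescence rate (IICR) of two lineages sampled from the same deme rises backward in time even though no deme changes size. The sketch below is a minimal illustration under assumptions that are not taken from this study: `n_demes` equal-sized demes, a scaled migration rate `M` at which two lineages in the same deme become separated, and time measured so that the within-deme coalescence rate is 1. It computes only the ancient plateau of the IICR, from the slowest-decaying eigenvalue of the two-state rate matrix. The function name and parameter values are hypothetical.

```python
import math

def ancient_iicr_plateau(n_demes: int, M: float) -> float:
    """Long-term IICR limit, in units of the size of a single deme."""
    if n_demes < 2 or M <= 0:
        raise ValueError("need at least two demes and a positive migration rate")
    # Rate matrix restricted to the two non-coalesced states:
    #   state 0: both lineages in the same deme
    #   state 1: lineages in different demes
    separate = M                # state 0 -> state 1
    meet = M / (n_demes - 1)    # state 1 -> state 0
    a = -(1.0 + separate)       # total rate out of state 0 (coalescence rate is 1)
    d = -meet                   # total rate out of state 1
    trace = a + d
    det = a * d - separate * meet
    # For large t the probability of not yet having coalesced decays as
    # exp(slowest * t), so the inverse coalescence rate tends to -1 / slowest.
    slowest = (trace + math.sqrt(trace * trace - 4.0 * det)) / 2.0
    return -1.0 / slowest

for M in (0.1, 1.0, 100.0):  # hypothetical scaled migration rates
    print(f"M = {M:>5}: plateau = {ancient_iicr_plateau(2, M):.2f} deme sizes")
```

With high migration the plateau approaches the total size of the metapopulation (`n_demes` deme sizes), whereas low migration inflates it well beyond that, which is the kind of apparent ancestral peak that structure can produce in a single-population reconstruction.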

Populations in SE Amazonia had more marked changes in size in the past 500 Ky than their NW counterparts. The large population expansions in R. gymnops that broadly coincide with major terminations of past glaciations (T2 and T3 in Fig. 12; Cheng et al., 2016) could reflect population recovery during the onset of interglacial periods following cold and dry glacial maxima. While populations in the west track gradual changes in temperature within glacial cycles, eastern populations have abrupt changes in population size, probably related to marked decreases in precipitation during glacial maxima (Wang et al., 2017). Although historical precipitation oscillated over millennial scales as well as over the longer orbital scales of the glacial cycles (Cheng et al., 2013; Zular et al., 2019), Amazonian bird populations have stronger demographic responses to the climatic conditions that result from the convergence of shifts in precipitation and temperature extremes during glacial to interglacial transitions. Changes in population size recovered in our study occur in cycles of durations comparable to glacial cycles. The general pattern of demographic cycles corresponding to 100 Ky glacial cycles was also found for multiple birds in a reconstruction based on genomes (Nadachowska-Brzyska et al., 2015b). I did not detect nuanced demographic changes associated with short-cycled precession and orbital forcing (Cheng et al., 2016; McGee, 2020; climate curves in Figs. 14-15). Either precipitation changes occurring at shorter time scales within glacial cycles did not have major effects on Amazonian bird populations, or these changes were overridden by responses to the major climatic shifts in the transition from glacial to interglacial periods.

4.3 Methodological considerations

Different values for species-specific parameters such as generation time and mutation rate have profound effects on the timing and amplitude of demographic events recovered with coalescent-based methods. My test of generation times based on ecological studies of taxa closely related to this study’s focal populations resulted in a large temporal window around each demographic event. Despite this, the relative change in size and the timing of events are comparable across congeneric populations, given that potential errors introduced into the analysis are shared by the two taxa (Vijay et al., 2018). Therefore, the similar demographic histories in Psophia, along with the synchronous off-phase demographic events in Rhegmatorhina, are compelling evidence for the roles of ecological traits of species and region-wide climate changes in shaping the current biodiversity patterns in Amazonia.
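The argument that shared parameter error leaves relative timing intact can be sketched numerically. The example below assumes, for illustration only, that each event is recovered as a per-site pairwise divergence `d` and converted to years as `d * g / (2 * mu)`, with mutation rate `mu` per site per generation and generation time `g` in years. All numeric values are hypothetical and are not estimates from this study.

```python
import math

def event_age_years(divergence: float, mu: float, gen_time: float) -> float:
    """Convert a scaled event time (per-site divergence) to years."""
    if mu <= 0 or gen_time <= 0:
        raise ValueError("mu and gen_time must be positive")
    return divergence * gen_time / (2.0 * mu)

# Hypothetical scaled times of the same type of event in two congeners.
d_pop_a, d_pop_b = 1.0e-3, 1.2e-3

# Two alternative parameter sets, each shared by both congeners.
for mu, g in ((2.5e-9, 2.5), (4.6e-9, 4.0)):
    age_a = event_age_years(d_pop_a, mu, g)
    age_b = event_age_years(d_pop_b, mu, g)
    print(f"mu={mu:.1e}, g={g}: A={age_a / 1000:.0f} Ky, "
          f"B={age_b / 1000:.0f} Ky, ratio B/A={age_b / age_a:.2f}")
```

The absolute ages change with each parameter set, but the ratio between the two congeners does not, which is why within-genus comparisons are more robust than the assignment of any single event to a particular climate cycle.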

4.4 Historic demography in Amazonia

Comparative inferences of past evolutionary events across different populations add power to the reconstruction of past environmental changes (Avise et al., 2016). Past studies comparing individual demographic histories of Amazonian birds based on single-locus sequencing across Amazonia have uncovered a complex and difficult-to-interpret mixture of populations with histories of demographic stability, expansion and, to a lesser extent, bottlenecks, without a clear geographic pattern (e.g. Fernandes et al., 2012; Ferreira et al., 2016; Silva et al., 2019; Thom et al., 2015). Conversely, our reconstruction of demographic history based on genome-wide variation detects demographic fluctuations in both western and eastern Amazonia. This pattern was also found in a joint analysis of mitochondrial DNA of multiple co-occurring populations in western Amazonia (Chapter 1). Both this study and the phylogeographic studies cited here and in Chapter 1 find signals of demographic change predating the LGM. This pattern is in stark contrast with the prevailing widespread effect of the LGM on population dynamics of temperate species (Hewitt, 2004; Hofreiter et al., 2004; Hofreiter and Stewart, 2009; Shafer et al., 2010). It is possible that glacial cycles previous to the LGM had more severe effects on tropical landscapes. Reconstructing recent demography is challenging because a large amount of data (such as genome-wide SNPs of multiple individuals) is required to increase the chances of recovering recent coalescent events, which are naturally rarer towards the present (Beichman et al., 2018; Puckett and Munshi-South, 2019). For example, demographic reconstructions based on a few hundred individuals detected recent (post-LGM) population shifts due to differential landscape use by burrowing owls (Mueller et al., 2018) and barn swallows (Scordato et al., 2017). An increase in individuals sampled per population could add sufficient nuance to untangle the effects of these different forces on demographic history.
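The scarcity of recent coalescent events, and the benefit of sampling more individuals, can be illustrated with the standard coalescent expectation that k lineages in a population of constant effective size Ne coalesce at a rate of k(k-1)/2 pairs times 1/(2Ne) per generation. The sketch below computes the probability of at least one coalescence within a recent window. The effective size, the window length and the sample sizes are hypothetical, and constant size and panmixia are simplifying assumptions.

```python
import math

def prob_recent_coalescence(n_diploid: int, ne: float, generations: float) -> float:
    """Probability that at least one pair of lineages coalesces within the window."""
    if n_diploid < 1 or ne <= 0 or generations < 0:
        raise ValueError("invalid sample size, Ne, or window length")
    lineages = 2 * n_diploid                # two gene copies per diploid individual
    pairs = math.comb(lineages, 2)          # number of lineage pairs
    rate = pairs / (2.0 * ne)               # total coalescence rate per generation
    return 1.0 - math.exp(-rate * generations)

# Hypothetical population: Ne = 200,000, window = last 5,000 generations.
for n in (1, 10, 100):
    p = prob_recent_coalescence(n, ne=200_000, generations=5_000)
    print(f"{n:>3} individuals: P(recent coalescence) = {p:.3f}")
```

Under these assumptions a single genome almost never records a coalescence in the recent window, while a sample of ten individuals is very likely to, which is consistent with the large sample sizes used in the studies that recovered post-LGM events.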

These findings reflect the general pattern of lineage age in Amazonia based on Sanger sequencing data, in which western lineages are older than their eastern counterparts (Silva et al., 2019), probably due to a higher rate of loss of lineages and genetic diversity in eastern Amazonia. The comparative approach employed in this study shows that major demographic changes are affected by regional and global climate cycles, which reinforces the notion that Amazonian biodiversity patterns are driven not only by regional idiosyncratic changes in landscape (such as riverine system formation), but also by global climatic cycles (Silva et al., 2019).


References

Alcaide, M. et al., 2014. Genomic divergence in a ring species complex. Nature, 511(7507), pp.83–85.

Aleixo, A., 2004. Historical diversification of a terra‐firme forest bird superspecies: a phylogeographic perspective on the role of different hypotheses of Amazonian diversification. Evolution, 58(6), pp.1303-1317.

Allmon, W.D., 1992. A causal analysis of stages in allopatric speciation. Oxford surveys in evolutionary biology, 8, pp.219-219.

Altschul, S.F. et al., 1990. Basic local alignment search tool. Journal of molecular biology, 215(3), pp.403–410.

Andermann, T. et al., 2019. Allele Phasing Greatly Improves the Phylogenetic Utility of Ultraconserved Elements. Systematic biology, 68(1), pp.32–46.

Andersen, M.J. et al., 2018. A phylogeny of kingfishers reveals an Indomalayan origin and elevated rates of diversification on oceanic islands. Journal of biogeography, 45(2), pp.269–281.

Andrews, C.B. & Gregory, T.R., 2009. Genome size is inversely correlated with relative brain size in parrots and cockatoos. Genome / National Research Council Canada = Genome / Conseil national de recherches Canada, 52(3), pp.261–267.

Antonelli, A., Quijada-Mascareñas, A., Crawford, A.J., Bates, J.M., Velazco, P.M., Wüster, W., 2011. Molecular Studies and Phylogeography of Amazonian Tetrapods and their Relation to Geological and Climatic Models, in: Hoorn, C., Wesselingh, F.P. (Eds.), Amazonia: Landscape and Species Evolution. Wiley-Blackwell Publishing Ltd., Oxford, UK, pp. 386–404. https://doi.org/10.1002/9781444306408.ch24

Antonelli, A., Zizka, A., Carvalho, F.A., Scharn, R., Bacon, C.D., Silvestro, D. and Condamine, F.L., 2018a. Amazonia is the primary source of Neotropical biodiversity. Proceedings of the National Academy of Sciences, 115(23), pp.6034-6039.

Antonelli, A., Ariza, M., Albert, J., Andermann, T., Azevedo, J., Bacon, C., Faurby, S., Guedes, T., Hoorn, C., Lohmann, L.G., Matos-Maraví, P., Ritter, C.D., Sanmartín, I., Silvestro, D., Tejedor, M., ter Steege, H., Tuomisto, H., Werneck, F.P., Zizka, A., Edwards, S.V., 2018b. Conceptual and empirical advances in Neotropical biodiversity research. PeerJ 6, e5644. https://doi.org/10.7717/peerj.5644

Arbogast, B.S., Kenagy, G.J., 2001. Comparative phylogeography as an integrative approach to historical biogeography. Journal of Biogeography 28, 819–825. https://doi.org/10.1046/j.1365-2699.2001.00594.x

Arenas, M. and Posada, D., 2010. Coalescent simulation of intracodon recombination. Genetics, 184(2), pp.429-437.

Arenas, M., Ray, N., Currat, M., Excoffier, L., 2012. Consequences of Range Contractions and Range Shifts on Molecular Diversity. Molecular Biology and Evolution 29, 207–218. https://doi.org/10.1093/molbev/msr187

Arruda, D.M., Schaefer, C.E.G.R., Fonseca, R.S., Solar, R.R.C., Fernandes‐Filho, E.I., 2018. Vegetation cover of Brazil in the last 21 ka: New insights into the Amazonian refugia and Pleistocenic arc hypotheses. Global Ecology and Biogeography 27, 47–56. https://doi.org/10.1111/geb.12646

Avise, J.C., Bowen, B.W., Ayala, F.J., 2016. In the light of evolution X: Comparative phylogeography. Proceedings of the National Academy of Sciences 113, 7957–7961. https://doi.org/10.1073/pnas.1604338113

Baker, P.A., Fritz, S.C., Dick, C.W., Eckert, A.J., Horton, B.K., Manzoni, S., Ribas, C.C., Garzione, C.N. and Battisti, D.S., 2014. The emerging field of geogenomics: constraining geological problems with genetic data. Earth-Science Reviews, 135, pp.38-47.

Baker, P.A. and Fritz, S.C., 2015. Nature and causes of Quaternary climate variation of tropical South America. Quaternary Science Reviews, 124, pp.31-47.


Baker P. A., Fritz S. C., Dick C. W., Battisti D. S., Vargas O. M., Asner G. P., Martin R. E., Wheatley A., Prates I. 2019. Beyond Refugia: New insights on Quaternary climate variation and the evolution of biotic diversity in tropical South America. In: Rull V., Carnaval A. C. (Eds.), Neotropical Diversification. Springer, Berlin.

Banks, S.C., Cary, G.J., Smith, A.L., Davies, I.D., Driscoll, D.A., Gill, A.M., Lindenmayer, D.B. and Peakall, R., 2013. How does ecological disturbance influence genetic diversity?. Trends in ecology & evolution, 28(11), pp.670-679.

Beheregaray, L.B., Ciofi, C., Geist, D., Gibbs, J.P., Caccone, A. and Powell, J.R., 2003. Genes record a prehistoric volcano eruption in the Galápagos. Science, 302(5642), pp.75-75.

Beichman, A.C., Huerta-Sanchez, E., Lohmueller, K.E., 2018. Using Genomic Data to Infer Historic Population Dynamics of Nonmodel Organisms. Annual Review of Ecology, Evolution, and Systematics 49, 433–456. https://doi.org/10.1146/annurev-ecolsys-110617-062431

Bennett, K.D., Bhagwat, S.A. and Willis, K.J., 2012. Neotropical refugia. The Holocene, 22(11), pp.1207-1214

Blum, M.G. and François, O., 2010. Non-linear regression models for Approximate Bayesian Computation. Statistics and computing, 20(1), pp.63-73.

Bonaccorso, E., Koch, I., Peterson, A.T., 2006. Pleistocene fragmentation of Amazon species’ ranges. Diversity and Distributions 12, 157–164. https://doi.org/10.1111/j.1366-9516.2005.00212.x

Botero-Castro, F., Figuet, E., Tilak, M.K., Nabholz, B. and Galtier, N., 2017. Avian genomes revisited: hidden genes uncovered and the rates versus traits paradox in birds. Molecular biology and evolution, 34(12), pp.3123-3131.

Brawn, J.D., Benson, T.J., Stager, M., Sly, N.D. and Tarwater, C.E., 2017. Impacts of changing rainfall regime on the demography of tropical birds. Nature Climate Change, 7(2), pp.133-136.

Brommer, J.E., Gustafsson, L., Pietiäinen, H., Merilä, J., 2004. Single‐Generation Estimates of Individual Fitness as Proxies for Long‐Term Genetic Contribution. The American Naturalist 163, 505–517. https://doi.org/10.1086/382547

Burbrink, F.T., Chan, Y.L., Myers, E.A., Ruane, S., Smith, B.T., Hickerson, M.J., 2016. Asynchronous demographic responses to Pleistocene climate change in Eastern Nearctic vertebrates. Ecology Letters 19, 1457–1467. https://doi.org/10.1111/ele.12695

Burney, C.W. and Brumfield, R.T., 2009. Ecology predicts levels of genetic differentiation in Neotropical birds. The American Naturalist, 174(3), pp.358-368

Bush, M.B., Flenley, J., Gosling, W.D. (Eds.), 2011. Tropical rainforest responses to climatic change, 2nd ed., Springer-Praxis books in environmental sciences. Springer, Chichester, UK.

Cadena, C.D., Klicka, J., Ricklefs, R.E., 2007. Evolutionary differentiation in the Neotropical montane region: Molecular phylogenetics and phylogeography of Buarremon brush-finches (Aves, Emberizidae). Molecular Phylogenetics and Evolution 44, 993–1016. https://doi.org/10.1016/j.ympev.2006.12.012

Capurucho, J.M.G., Cornelius, C., Borges, S.H., Cohn-Haft, M., Aleixo, A., Metzger, J.P., Ribas, C.C., 2013. Combining phylogeography and landscape genetics of Xenopipo atronitens (Aves: Pipridae), a white sand campina specialist, to understand Pleistocene landscape evolution in Amazonia. Biol J Linn Soc 110, 60–76. https://doi.org/10.1111/bij.12102

Carnaval, A.C., Hickerson, M.J., Haddad, C.F.B., Rodrigues, M.T., Moritz, C., 2009. Stability Predicts Genetic Diversity in the Brazilian Atlantic Forest Hotspot. Science 323, 785–789. https://doi.org/10.1126/science.1166955

Carstens, B.C., Knowles, L.L., 2007. Shifting distributions and speciation: species divergence during rapid climate change. Molecular Ecology 16, 619–627. https://doi.org/10.1111/j.1365-294X.2006.03167.x

Carstens, B.C., Morales, A.E., Field, K., Pelletier, T.A., 2018. A global analysis of bats using automated comparative phylogeography uncovers a surprising impact of Pleistocene glaciation. Journal of Biogeography 45, 1795–1805. https://doi.org/10.1111/jbi.13382


Chan, Y.L., Schanzenbach, D., Hickerson, M.J., 2014. Detecting Concerted Demographic Response across Community Assemblages Using Hierarchical Approximate Bayesian Computation. Mol Biol Evol 31, 2501–2515. https://doi.org/10.1093/molbev/msu187

Cheng, H., Sinha, A., Cruz, F.W., Wang, X., Edwards, R.L., d’Horta, F.M., Ribas, C.C., Vuille, M., Stott, L.D., Auler, A.S., 2013. Climate change patterns in Amazonia and biodiversity. Nature Communications 4. https://doi.org/10.1038/ncomms2415

Coelho, L., Musher, L., Cracraft, J., 2019. A Multireference-Based Whole Genome Assembly for the Obligate Ant-Following Antbird, Rhegmatorhina melanosticta (Thamnophilidae). Diversity 11, 144. https://doi.org/10.3390/d11090144

Cohen, M.C.L., Rossetti, D.F., Pessenda, L.C.R., Friaes, Y.S., Oliveira, P.E., 2014. Late Pleistocene glacial forest of Humaitá—Western Amazonia. Palaeogeography, Palaeoclimatology, Palaeoecology 415, 37–47. https://doi.org/10.1016/j.palaeo.2013.12.025

Colinvaux, P.A., De Oliveira, P.E., Moreno, J.E., Miller, M.C., Bush, M.B., 1996. A Long Pollen Record from Lowland Amazonia: Forest and Cooling in Glacial Times. Science 274, 85–88. https://doi.org/10.1126/science.274.5284.85

Colwell, R.K., 2000. A barrier runs through it … or maybe just a river. Proc Natl Acad Sci U S A 97, 13470–13472.

Corander, J., Marttinen, P., Sirén, J. and Tang, J., 2008. Enhanced Bayesian modelling in BAPS software for learning genetic structures of populations. BMC bioinformatics, 9(1), p.539.

Cracraft, J., 1985. Historical biogeography and patterns of differentiation within the South American avifauna: areas of endemism. Ornithological monographs, pp.49-84.

Cracraft, J., Prum, R.O., 1988. Patterns and Processes of Diversification: Speciation and Historical Congruence in Some Neotropical Birds. Evolution 42, 603–620. https://doi.org/10.1111/j.1558-5646.1988.tb04164.x

Crowley, T.J. and North, G.R., 1988. Abrupt climate change and extinction events in earth history. Science, 240(4855), pp.996-1002.


Csilléry, K., Blum, M.G.B., Gaggiotti, O.E., François, O., 2010. Approximate Bayesian Computation (ABC) in practice. Trends in Ecology & Evolution 25, 410–418. https://doi.org/10.1016/j.tree.2010.04.001

Csilléry, K., François, O. and Blum, M.G., 2012. abc: an R package for approximate Bayesian computation (ABC). Methods in ecology and evolution, 3(3), pp.475-479

D’Apolito, C., Absy, M.L., Latrubesse, E.M., 2017. The movement of pre-adapted cool taxa in north-central Amazonia during the last glacial. Quaternary Science Reviews 169, 1–12. https://doi.org/10.1016/j.quascirev.2017.05.017.

D’Apolito, C., Absy, M.L., Latrubesse, E.M., 2013. The Hill of Six Lakes revisited: new data and re-evaluation of a key Pleistocene Amazon site. Quaternary Science Reviews 76, 140–155. https://doi.org/10.1016/j.quascirev.2013.07.013.

d’Horta, F.M., Cuervo, A.M., Ribas, C.C., Brumfield, R.T. and Miyaki, C.Y., 2013. Phylogeny and comparative phylogeography of Sclerurus (Aves: Furnariidae) reveal constant and cryptic diversification in an old radiation of rain forest understorey specialists. Journal of Biogeography, 40(1), pp.37-49.

Damas, J. et al., 2019. Avian Chromosomal Evolution. In R. H. S. Kraus, ed. Avian Genomics in Ecology and Evolution: From the Lab into the Wild. Cham: Springer International Publishing, pp. 69–92.

Danecek, P., Auton, A., Abecasis, G., Albers, C.A., Banks, E., DePristo, M.A., Handsaker, R.E., Lunter, G., Marth, G.T., Sherry, S.T., McVean, G., Durbin, R., 2011. The variant call format and VCFtools. Bioinformatics 27, 2156–2158. https://doi.org/10.1093/bioinformatics/btr330

Del Hoyo, J. et al., 2014. Handbook of the birds of the world alive. Lynx Edicions, Barcelona.

Derryberry, E.P. & Claramunt, S., 2011. Lineage diversification and morphological evolution in a large‐scale continental radiation: the Neotropical ovenbirds and woodcreepers (Aves: Furnariidae). Evolution. Available at: https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1558-5646.2011.01374.x.


Dickinson, E.C. & Christidis, L., 2014. The Howard and Moore complete checklist of the birds of the World: Passerines, Aves Press.

Dolezel, J. et al., 2003. Nuclear DNA content and genome size of trout and human. Cytometry. Part A: the journal of the International Society for Analytical Cytology, 51(2), pp.127–8; author reply 129.

Drummond, A.J., 2005. Bayesian Coalescent Inference of Past Population Dynamics from Molecular Sequences. Molecular Biology and Evolution 22, 1185–1192. https://doi.org/10.1093/molbev/msi103

Drummond, A.J. and Rambaut, A., 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC evolutionary biology, 7(1), p.214.

Dynesius, M., Jansson, R., 2000. Evolutionary consequences of changes in species’ geographical distributions driven by Milankovitch climate oscillations. Proceedings of the National Academy of Sciences 97, 9115–9120. https://doi.org/10.1073/pnas.97.16.9115

Eaton, D.A.R. et al., 2017. Misconceptions on Missing Data in RAD-seq Phylogenetics with a Deep-scale Example from Flowering Plants. Systematic biology, 66(3), pp.399–412.

Eaton, D.A.R. & Overcast, I., 2016. ipyrad: interactive assembly and analysis of RADseq data sets.

Efron, B., Tibshirani, R., 1997. Improvements on Cross-Validation: The .632+ Bootstrap Method. Journal of the American Statistical Association 92, 548–560. https://doi.org/10.2307/2965703

Elgvin, T.O. et al., 2017. The genomic mosaicism of hybrid speciation. Science advances, 3(6), p.e1602996.

Ellegren, H. et al., 2012. The genomic landscape of species divergence in Ficedula flycatchers. Nature, 491(7426), pp.756–760.

Ellegren, H., 2013. The Evolutionary Genomics of Birds. Annual review of ecology, evolution, and systematics, 44(1), pp.239–259.

Ellegren, H., 2014. Genome sequencing and population genomics in non-model organisms. Trends in ecology & evolution, 29(1), pp.51–63.

English, A.C. et al., 2012. Mind the gap: upgrading genomes with Pacific Biosciences RS long-read sequencing technology. PloS one, 7(11), p.e47768.

Esquivel‐Muelbert, A., Baker, T.R., Dexter, K.G., Lewis, S.L., Brienen, R.J.W., Feldpausch, T.R., Lloyd, J., Monteagudo‐Mendoza, A., Arroyo, L., Álvarez‐Dávila, E., Higuchi, N., Marimon, B.S., Marimon‐Junior, B.H., Silveira, M., Vilanova, E., Gloor, E., Malhi, Y., Chave, J., Barlow, J., Bonal, D., Cardozo, N.D., Erwin, T., Fauset, S., Hérault, B., Laurance, S., Poorter, L., Qie, L., Stahl, C., Sullivan, M.J.P., Steege, H. ter, Vos, V.A., Zuidema, P.A., Almeida, E., Oliveira, E.A. de, Andrade, A., Vieira, S.A., Aragão, L., Araujo‐Murakami, A., Arets, E., C, G.A.A., Camargo, P.B., Barroso, J.G., Bongers, F., Boot, R., Camargo, J.L., Castro, W., Moscoso, V.C., Comiskey, J., Valverde, F.C., Costa, A.C.L. da, Pasquel, J. del A., Fiore, T.D., Duque, L.F., Elias, F., Engel, J., Llampazo, G.F., Galbraith, D., Fernández, R.H., Coronado, E.H., Hubau, W., Jimenez‐Rojas, E., Lima, A.J.N., Umetsu, R.K., Laurance, W., Lopez‐Gonzalez, G., Lovejoy, T., Cruz, O.A.M., Morandi, P.S., Neill, D., Vargas, P.N., Pallqui, N.C., Gutierrez, A.P., Pardo, G., Peacock, J., Peña‐Claros, M., Peñuela‐Mora, M.C., Petronelli, P., Pickavance, G.C., Pitman, N., Prieto, A., Quesada, C., Ramírez‐Angulo, H., Réjou‐Méchain, M., Correa, Z.R., Roopsind, A., Rudas, A., Salomão, R., Silva, N., Espejo, J.S., Singh, J., Stropp, J., Terborgh, J., Thomas, R., Toledo, M., Torres‐Lezama, A., Gamarra, L.V., Meer, P.J. van de, Heijden, G. van der, Hout, P. van der, Martinez, R.V., Vela, C., Vieira, I.C.G., Phillips, O.L., n.d. Beyond Refugia: New insights on Quaternary climate variation and the evolution of biotic diversity in tropical South America. Global Change Biology 0. https://doi.org/10.1111/gcb.14413

Faircloth, B.C., 2016. PHYLUCE is a software package for the analysis of conserved genomic loci. Bioinformatics, 32(5), pp.786–788.

Faircloth, B.C. et al., 2012. Ultraconserved elements anchor thousands of genetic markers spanning multiple evolutionary timescales. Systematic biology, 61(5), pp.717–726.

Flantua, S.G.A., Hooghiemstra, H., Grimm, E.C., Behling, H., Bush, M.B., González-Arango, C., Gosling, W.D., Ledru, M.-P., Lozano-García, S., Maldonado, A., Prieto, A.R., Rull, V., Van Boxel, J.H., 2015. Updated site compilation of the Latin American Pollen Database. Review of Palaeobotany and Palynology 223, 104–115. https://doi.org/10.1016/j.revpalbo.2015.09.008

Fernandes, A.M., Wink, M., Aleixo, A., 2012. Phylogeography of the chestnut-tailed antbird (Myrmeciza hemimelaena) clarifies the role of rivers in Amazonian biogeography. Journal of Biogeography 39, 1524–1535. https://doi.org/10.1111/j.1365-2699.2012.02712.x

Fernandes, A.M., Gonzalez, J., Wink, M. and Aleixo, A., 2013. Multilocus phylogeography of the Wedge-billed Woodcreeper Glyphorynchus spirurus (Aves, Furnariidae) in lowland Amazonia: Widespread cryptic diversity and paraphyly reveal a complex diversification pattern. Molecular Phylogenetics and Evolution, 66(1), pp.270-282

Fernandes, A.M., Cohn-Haft, M., Hrbek, T., Farias, I.P., 2014a. Rivers acting as barriers for bird dispersal in the Amazon. Revista Brasileira de Ornitologia 11

Fernandes, A.M., Wink, M., Sardelli, C.H. and Aleixo, A., 2014b. Multiple speciation across the Andes and throughout Amazonia: the case of the spot‐backed antbird species complex (Hylophylax naevius/Hylophylax naevioides). Journal of Biogeography, 41(6), pp.1094-1104.

Ferreira, M., Aleixo, A., Ribas, C.C. and Santos, M.P.D., 2017. Biogeography of the Neotropical genus Malacoptila (Aves: Bucconidae): the influence of the Andean orogeny, Amazonian drainage evolution and palaeoclimate. Journal of Biogeography, 44(4), pp.748-759

Ferreira, M., Fernandes, A.M., Aleixo, A., Antonelli, A., Olsson, U., Bates, J.M., Cracraft, J. and Ribas, C.C., 2018. Evidence for mtDNA capture in the jacamar Galbula leucogastra/chalcothorax species-complex and insights on the evolution of white-sand ecosystems in the Amazon basin. Molecular phylogenetics and evolution, 129, pp.149-157.

Fontes, D., Cordeiro, R.C., Martins, G.S., Behling, H., Turcq, B., Sifeddine, A., Seoane, J.C.S., Moreira, L.S., Rodrigues, R.A., 2017. Paleoenvironmental dynamics in South Amazonia, Brazil, during the last 35,000 years inferred from pollen and geochemical records of Lago do Saci. Quaternary Science Reviews 173, 161–180. https://doi.org/10.1016/j.quascirev.2017.08.021

Fordham, D.A., Brown, S.C., Wigley, T.M.L., Rahbek, C., 2019. Cradles of diversity are unlikely relics of regional climate stability. Current Biology 29, R356–R357. https://doi.org/10.1016/j.cub.2019.04.001

Freeman, B.G., Class Freeman, A.M., 2014. Rapid upslope shifts in New Guinean birds illustrate strong distributional responses of tropical montane species to global warming. Proceedings of the National Academy of Sciences 111, 4490–4494. https://doi.org/10.1073/pnas.1318190111

Galla, S.J., Forsdick, N.J., Brown, L., Hoeppner, M.P., Knapp, M., Maloney, R.F., Moraga, R., Santure, A.W., Steeves, T.E., 2019. Reference Genomes from Distantly Related Species Can Be Used for Discovery of Single Nucleotide Polymorphisms to Inform Conservation Management. Genes 10, 9. https://doi.org/10.3390/genes10010009

Garg, K.M., Chattopadhyay, B., Wilton, P.R., Malia Prawiradilaga, D., Rheindt, F.E., 2018. Pleistocene land bridges act as semipermeable agents of avian gene flow in Wallacea. Molecular Phylogenetics and Evolution 125, 196–203. https://doi.org/10.1016/j.ympev.2018.03.032

Garzón‐Orduña, I.J., Benetti‐Longhini, J.E., Brower, A.V.Z., 2014. Timing the diversification of the Amazonian biota: butterfly divergences are consistent with Pleistocene refugia. Journal of Biogeography 41, 1631–1638. https://doi.org/10.1111/jbi.12330

Gavin, D.G., Fitzpatrick, M.C., Gugger, P.F., Heath, K.D., Rodríguez‐Sánchez, F., Dobrowski, S.Z., Hampe, A., Hu, F.S., Ashcroft, M.B., Bartlein, P.J. and Blois, J.L., 2014. Climate refugia: joint inference from fossil records, species distribution models and phylogeography. New Phytologist, 204(1), pp.37-54.

Gehara, M., Garda, A.A., Werneck, F.P., Oliveira, E.F., Fonseca, E.M. da, Camurugi, F., Magalhães, F. de M., Lanna, F.M., Sites, J.W., Marques, R., Silveira‐Filho, R., Pedro, V.A.S., Colli, G.R., Costa, G.C., Burbrink, F.T., 2017. Estimating synchronous demographic changes across populations using hABC and its application for a herpetological community from northeastern Brazil. Molecular Ecology 26, 4756–4771. https://doi.org/10.1111/mec.14239

Gel, B. & Serra, E., 2017. karyoploteR: an R/Bioconductor package to plot customizable genomes displaying arbitrary data. Bioinformatics, 33(19), pp.3088–3090.

Genome 10K Community of Scientists, 2009. Genome 10K: a proposal to obtain whole-genome sequence for 10,000 vertebrate species. The Journal of heredity, 100(6), pp.659–674.

Gómez Cano, A.R., Cantalapiedra, J.L., Mesa, A., Moreno Bofarull, A., Hernández Fernández, M., 2013. Global climate changes drive ecological specialization of mammal faunas: trends in rodent assemblages from the Iberian Plio-Pleistocene. BMC Evolutionary Biology 13, 94. https://doi.org/10.1186/1471-2148-13-94

Goodall-Copestake, W.P., Tarling, G.A. and Murphy, E.J., 2012. On the comparison of population-level estimates of haplotype and nucleotide diversity: a case study using the gene cox1 in animals. Heredity, 109(1), pp.50-56.

Gower, G., Tuke, J., Rohrlach, A.B., Soubrier, J., Llamas, B., Bean, N., Cooper, A., 2018. Population size history from short genomic scaffolds: how short is too short? bioRxiv 382036. https://doi.org/10.1101/382036

Grant, W.S., 2015. Problems and Cautions With Sequence Mismatch Analysis and Bayesian Skyline Plots to Infer Historical Demography. J Hered 106, 333–346. https://doi.org/10.1093/jhered/esv020

Gu, Z. et al., 2014. circlize implements and enhances circular visualization in R. Bioinformatics, 30(19), pp.2811–2812.

Hackett, S.J., 1993. Phylogenetic and biogeographic relationships in the Neotropical genus Gymnopithys (Formicariidae). The Wilson bulletin, pp.301–315.

Haffer, J., 2008. Hypotheses to explain the origin of species in Amazonia. Braz. J. Biol. 68, 917–947. https://doi.org/10.1590/S1519-69842008000500003

Haffer, J., 1969. Speciation in Amazonian forest birds. Science, 165(3889), pp.131-137.

Harvey, M.G., Brumfield, R.T., 2015. Genomic variation in a widespread Neotropical bird (Xenops minutus) reveals divergence, population expansion, and gene flow. Molecular Phylogenetics and Evolution 83, 305–316. https://doi.org/10.1016/j.ympev.2014.10.023

Harvey, M.G. et al., 2017. Habitat Association Predicts Genetic Diversity and Population Divergence in Amazonian Birds. The American naturalist, 190(5), pp.631–648.

Harvey, M.G., Singhal, S., Rabosky, D.L., 2019. Beyond Reproductive Isolation: Demographic Controls on the Speciation Process. Annual Review of Ecology, Evolution, and Systematics 50, null. https://doi.org/10.1146/annurev-ecolsys-110218-024701

Heller, R., Chikhi, L. and Siegismund, H.R., 2013. The confounding effect of population structure on Bayesian skyline plot inferences of demographic history. PloS one, 8(5)

Hermanowski, B., da Costa, M.L. and Behling, H., 2012a. Environmental changes in southeastern Amazonia during the last 25,000 yr revealed from a paleoecological record. Quaternary Research, 77(1), pp.138-148.

Hermanowski, B., da Costa, M.L., Carvalho, A.T., Behling, H., 2012b. Palaeoenvironmental dynamics and underlying climatic changes in southeast Amazonia (Serra Sul dos Carajás, Brazil) during the late Pleistocene and Holocene. Palaeogeography, Palaeoclimatology, Palaeoecology 365–366, 227–246. https://doi.org/10.1016/j.palaeo.2012.09.030

Hewitt, G.M., 2004. Genetic consequences of climatic oscillations in the Quaternary. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences 359, 183–195. https://doi.org/10.1098/rstb.2003.1388

Hickerson, M.J., Meyer, C.P., 2008. Testing comparative phylogeographic models of marine vicariance and dispersal using a hierarchical Bayesian approach. BMC Evolutionary Biology 8, 322. https://doi.org/10.1186/1471-2148-8-322

Hickerson, M.J., Carstens, B.C., Cavender-Bares, J., Crandall, K.A., Graham, C.H., Johnson, J.B., Rissler, L., Victoriano, P.F., Yoder, A.D., 2010. Phylogeography’s past, present, and future: 10 years after Avise, 2000. Molecular Phylogenetics and Evolution 54, 291–301. https://doi.org/10.1016/j.ympev.2009.09.016

Ho, S.Y.W., Lanfear, R., Bromham, L., Phillips, M.J., Soubrier, J., Rodrigo, A.G., Cooper, A., 2011. Time-dependent rates of molecular evolution. Molecular Ecology 20, 3087–3101. https://doi.org/10.1111/j.1365-294X.2011.05178.x

Hofreiter, M., Serre, D., Rohland, N., Rabeder, G., Nagel, D., Conard, N., Münzel, S., Pääbo, S., 2004. Lack of phylogeography in European mammals before the last glaciation. PNAS 101, 12963–12968. https://doi.org/10.1073/pnas.0403618101

Hofreiter, M., Stewart, J., 2009. Ecological Change, Range Fluctuations and Population Dynamics during the Pleistocene. Current Biology 19, R584–R594. https://doi.org/10.1016/j.cub.2009.06.030

Hohenlohe, P.A., Hand, B.K., Andrews, K.R., Luikart, G., 2019. Population Genomics Provides Key Insights in Ecology and Evolution, in: Rajora, O.P. (Ed.), Population Genomics: Concepts, Approaches and Applications, Population Genomics. Springer International Publishing, Cham, pp. 483–510. https://doi.org/10.1007/13836_2018_20

Hoorn, C., Wesselingh, F.P., ter Steege, H., Bermudez, M.A., Mora, A., Sevink, J., Sanmartin, I., Sanchez-Meseguer, A., Anderson, C.L., Figueiredo, J.P., Jaramillo, C., Riff, D., Negri, F.R., Hooghiemstra, H., Lundberg, J., Stadler, T., Sarkinen, T., Antonelli, A., 2010. Amazonia Through Time: Andean Uplift, Climate Change, Landscape Evolution, and Biodiversity. Science 330, 927–931. https://doi.org/10.1126/science.1194585

Hron, T. et al., 2015. Hidden genes in birds. Genome biology, 16, p.164.

Hudson, R.R., 2002. Generating samples under a Wright‐Fisher neutral model of genetic variation. Bioinformatics, 18, 337–338.

Irion, G., Kalliola, R., 2010. Fluvial landscape evolution in lowland Amazonia during the Quaternary. Amazonia, Landscape and Species Evolution: A Look Into the Past 185–197.

Irwin, D.E. et al., 2018. A comparison of genomic islands of differentiation across three young avian species pairs. Molecular ecology, 27(23), pp.4839–4855.

Isler, M.L., Bravo, G.A. & Brumfield, R.T., 2014. Systematics of the obligate ant-following clade of antbirds (Aves: Passeriformes: Thamnophilidae). The Wilson journal of ornithology, 126(4), pp.635–648.

Isler, M.L., Isler, P.R. & Whitney, B.M., 2007. Species Limits in Antbirds (Thamnophilidae): The Warbling Antbird (Hypocnemis cantator) Complex. The Auk, 124(1), pp.11–28.

Janzen, D.H., 1967. Why Mountain Passes are Higher in the Tropics. The American Naturalist 101, 233–249.

Jarvis, E.D. et al., 2015. Phylogenomic analyses data of the avian phylogenomics project. GigaScience, 4, p.4.

Jarvis, E.D. et al., 2014. Whole-genome analyses resolve early branches in the tree of life of modern birds. Science, 346(6215), pp.1320–1331.

Jenkins, C.N., Pimm, S.L., Joppa, L.N., 2013. Global patterns of terrestrial vertebrate diversity and conservation. Proceedings of the National Academy of Sciences 110, E2602–E2610. https://doi.org/10.1073/pnas.1302251110

Kawakami, T. et al., 2014. A high-density linkage map enables a second-generation collared flycatcher genome assembly and reveals the patterns of avian recombination rate variation and chromosomal evolution. Molecular ecology, 23(16), pp.4035–4058.

Kent, W.J. et al., 2002. The human genome browser at UCSC. Genome research, 12(6), pp.996–1006.

Kim, J. et al., 2013. Reference-assisted chromosome assembly. Proceedings of the National Academy of Sciences of the United States of America, 110(5), pp.1785–1790.

Knowles, L.L., 2009. Statistical Phylogeography. Annual Review of Ecology, Evolution, and Systematics 40, 593–612. https://doi.org/10.1146/annurev.ecolsys.38.091206.095702

Koepfli, K.-P. et al., 2015. The Genome 10K Project: a way forward. Annual review of animal biosciences, 3, pp.57–111.

Kolmogorov, M., Armstrong, J., Raney, B.J., Streeter, I., Dunn, M., Yang, F., Odom, D., Flicek, P., Keane, T.M., et al., 2018. Chromosome assembly of large and complex genomes using multiple references. Genome research, 28(11), pp.1720–1732.

Korbel, J.O. et al., 2007. Paired-end mapping reveals extensive structural variation in the human genome. Science, 318(5849), pp.420–426.

Korlach, J. et al., 2017. De novo PacBio long-read and phased avian genome assemblies correct and add to reference genes generated with intermediate and short reads. GigaScience, 6(10), pp.1–16.

Laine, V.N. et al., 2016. Evolutionary signals of selection on cognition from the great tit genome and methylome. Nature communications, 7, p.10474.

Laine, V.N. et al., 2019. Exploring the unmapped DNA and RNA reads in a songbird genome. BMC genomics, 20(1), p.19.

Latrubesse, E.M. and Rancy, A., 2000. Neotectonic influence on tropical rivers of southwestern Amazon during the late Quaternary: the Moa and Ipixuna river basins, Brazil. Quaternary International, 72(1), pp.67-72.

Lees, A.C., Peres, C.A., 2010. Habitat and Life History Determinants of Antbird Occurrence in Variable-Sized Amazonian Forest Fragments. Biotropica 42, 614–621. https://doi.org/10.1111/j.1744-7429.2010.00625.x

Lehtonen, J., Lanfear, R., 2014. Generation time, life history and the substitution rate of neutral mutations. Biology Letters 10. https://doi.org/10.1098/rsbl.2014.0801

Leite, R.N., Rogers, D.S., 2013. Revisiting Amazonian phylogeography: insights into diversification hypotheses and novel perspectives. Organisms Diversity & Evolution 13, 639–664. https://doi.org/10.1007/s13127-013-0140-8

Lessa, E.P., Cook, J.A., Patton, J.L., 2003. Genetic footprints of demographic expansion in North America, but not Amazonia, during the Late Quaternary. PNAS 100, 10331–10334. https://doi.org/10.1073/pnas.1730921100

Li, H., Durbin, R., 2009. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760. https://doi.org/10.1093/bioinformatics/btp324

Li, H., 2011. Tabix: fast retrieval of sequence features from generic TAB-delimited files. Bioinformatics 27, 718–719. https://doi.org/10.1093/bioinformatics/btq671

Li, H., 2013. Seqtk: a fast and lightweight tool for processing FASTA or FASTQ sequences. Available at: https://github.com/lh3/seqtk

Li, W. & Freudenberg, J., 2014. Mappability and read length. Frontiers in genetics, 5, p.381.

Lieberman-Aiden, E. et al., 2009. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science, 326(5950), pp.289–293.

Lien, S. et al., 2016. The Atlantic salmon genome provides insights into rediploidization. Nature, 533(7602), pp.200–205.

Lima, M.G.M., Buckner, J.C., Silva-Júnior, J. de S. e, Aleixo, A., Martins, A.B., Boubli, J.P., Link, A., Farias, I.P., da Silva, M.N., Röhe, F., Queiroz, H., Chiou, K.L., Di Fiore, A., Alfaro, M.E., Lynch Alfaro, J.W., 2017. Capuchin monkey biogeography: understanding Sapajus Pleistocene range expansion and the current sympatry between Cebus and Sapajus. J. Biogeogr. 44, 810–820. https://doi.org/10.1111/jbi.12945

Lima, W.J.S., Cohen, M.C.L., Rossetti, D.F., França, M.C., 2018. Late Pleistocene glacial forest elements of Brazilian Amazonia. Palaeogeography, Palaeoclimatology, Palaeoecology 490, 617–628. https://doi.org/10.1016/j.palaeo.2017.11.050

Lisiecki, L.E., Raymo, M.E., 2005. A Pliocene-Pleistocene stack of 57 globally distributed benthic δ18O records. Paleoceanography 20. https://doi.org/10.1029/2004PA001071

Lisiecki, L.E., Raymo, M.E., Curry, W.B., 2008. Atlantic overturning responses to Late Pleistocene climate forcings. Nature 456, 85–88. https://doi.org/10.1038/nature07425

Liu, D., Hunt, M. & Tsai, I.J., 2018. Inferring synteny between genome assemblies: a systematic evaluation. BMC Bioinformatics, 19(1). Available at: http://dx.doi.org/10.1186/s12859-018-2026-4.

Lynch Alfaro, J.W., Boubli, J.P., Olson, L.E., Di Fiore, A., Wilson, B., Gutiérrez‐Espeleta, G.A., Chiou, K.L., Schulte, M., Neitzel, S., Ross, V. and Schwochow, D., 2012. Explosive Pleistocene range expansion leads to widespread Amazonian sympatry between robust and gracile capuchin monkeys. Journal of biogeography, 39(2), pp.272-288.

Maldonado‐Coelho, M., Blake, J.G., Silveira, L.F., Batalha‐Filho, H. and Ricklefs, R.E., 2013. Rivers, refuges and population divergence of fire‐eye antbirds (Pyriglena) in the Amazon Basin. Journal of Evolutionary Biology, 26(5), pp.1090-1107.

Malhi, Y., Phillips, O.L., Mayle, F.E., Beerling, D.J., Gosling, W.D., Bush, M.B., 2004. Responses of Amazonian ecosystems to climatic and atmospheric carbon dioxide changes since the last glacial maximum. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences 359, 499–514. https://doi.org/10.1098/rstb.2003.1434

Mather, N., Traves, S.M., Ho, S.Y.W., n.d. A practical introduction to sequentially Markovian coalescent methods for estimating demographic history from genomic data. Ecology and Evolution n/a. https://doi.org/10.1002/ece3.5888

Manthey, J.D. et al., 2016. Comparison of Target-Capture and Restriction-Site Associated DNA Sequencing for Phylogenomics: A Test in Cardinalid Tanagers (Aves, Genus: Piranga). Systematic biology, 65(4), pp.640–650.

Marchant, R., Berrío, J.C., Behling, H., Boom, A., Hooghiemstra, H., 2006. Colombian dry moist forest transitions in the Llanos Orientales—A comparison of model and pollen-based biome reconstructions. Palaeogeography, Palaeoclimatology, Palaeoecology 234, 28–44. https://doi.org/10.1016/j.palaeo.2005.10.028

Marcondes, R.S. & Brumfield, R.T., 2019. Fifty shades of brown: Macroevolution of plumage brightness in the Furnariida, a large clade of drab Neotropical passerines. Evolution; international journal of organic evolution, 73(4), pp.704–719.

Marks, P. et al., 2019. Resolving the full spectrum of human genome variation using Linked-Reads. Genome research, 29(4), pp.635–645.

Mayle, F.E., Langstroth, R.P., Fisher, R.A., Meir, P., 2007. Long-term forest–savannah dynamics in the Bolivian Amazon: implications for conservation. Philosophical Transactions of the Royal Society B: Biological Sciences 362, 291–307. https://doi.org/10.1098/rstb.2006.1987

Mayle, F.E., Power, M.J., 2008. Impact of a drier Early–Mid-Holocene climate upon Amazonian forests. Philosophical Transactions of the Royal Society B: Biological Sciences 363, 1829–1838. https://doi.org/10.1098/rstb.2007.0019

Mazet, O., Rodríguez, W., Chikhi, L., 2015. Demographic inference using genetic data from a single individual: Separating population size variation from population structure. Theoretical Population Biology 104, 46–58. https://doi.org/10.1016/j.tpb.2015.06.003

McCormack, J.E. et al., 2013. A phylogeny of birds based on over 1,500 loci collected by target enrichment and high-throughput sequencing. PloS one, 8(1), p.e54848.

McGee, D., 2020. Glacial–Interglacial Precipitation Changes. Annu. Rev. Mar. Sci. 12, annurev-marine-010419-010859. https://doi.org/10.1146/annurev-marine-010419-010859

McKenna, A. et al., 2010. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome research, 20(9), pp.1297–1303.

Menger, J., Unrein, J., Woitow, M., Schlegel, M., Henle, K. and Magnusson, W.E., 2018. Weak evidence for fine-scale genetic spatial structure in three sedentary Amazonian understorey birds. Journal of ornithology, 159(2), pp.355-366.

Merceron, G., Kaiser, T.M., Kostopoulos, D.S., Schulz, E., 2010. Ruminant diets and the Miocene extinction of European great apes. Proceedings of the Royal Society B: Biological Sciences 277, 3105–3112. https://doi.org/10.1098/rspb.2010.0523

Miller, M.J., Bermingham, E., Klicka, J., Escalante, P., do Amaral, F.S.R., Weir, J.T. and Winker, K., 2008. Out of Amazonia again and again: episodic crossing of the Andes promotes diversification in a lowland forest flycatcher. Proceedings of the Royal Society B: Biological Sciences, 275(1639), pp.1133-1142.

Morales, H.E., Pavlova, A., Joseph, L., Sunnucks, P., 2015. Positive and purifying selection in mitochondrial genomes of a bird with mitonuclear discordance. Molecular Ecology 24, 2820–2837. https://doi.org/10.1111/mec.13203

Moyle, R.G. et al., 2009. Phylogeny and phylogenetic classification of the antbirds, ovenbirds, woodcreepers, and allies (Aves: Passeriformes: infraorder Furnariides). Cladistics. Available at: https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1096-0031.2009.00259.x.

Moyle, R.G. et al., 2016. Tectonic collision and uplift of Wallacea triggered the global songbird radiation. Nature communications, 7, p.12709.

Moura, N.G., Lees, A.C., Aleixo, A., Barlow, J., Berenguer, E., Ferreira, J., Mac Nally, R., Thomson, J.R., Gardner, T.A., 2016. Idiosyncratic responses of Amazonian birds to primary forest disturbance. Oecologia 180, 903–916. https://doi.org/10.1007/s00442-015-3495-z

Mueller, J.C., Kuhl, H., Boerno, S., Tella, J.L., Carrete, M., Kempenaers, B., 2018. Evolution of genomic variation in the burrowing owl in response to recent colonization of urban areas. Proceedings of the Royal Society B: Biological Sciences 285, 20180206. https://doi.org/10.1098/rspb.2018.0206

Musher, L.J. et al., 2019. Why is Amazonia a “source” of biodiversity? Climate-mediated dispersal and synchronous speciation across the Andes in an avian group (Tityrinae). Proceedings of the Royal Society B, 286(1900), p.20182343.

Musher, L.J. & Cracraft, J., 2018. Phylogenomics and species delimitation of a complex radiation of Neotropical suboscine birds (Pachyramphus). Molecular phylogenetics and evolution, 118, pp.204–221.

Myers, E.W. et al., 2000. A whole-genome assembly of Drosophila. Science, 287(5461), pp.2196–2204.

Nadachowska-Brzyska, K., Li, C., Smeds, L., Zhang, G., Ellegren, H., 2015. Temporal Dynamics of Avian Populations during Pleistocene Revealed by Whole-Genome Sequences. Current Biology 25, 1375–1380. https://doi.org/10.1016/j.cub.2015.03.047

Naka, L.N., Bechtoldt, C.L., Henriques, L.M.P. and Brumfield, R.T., 2012. The role of physical barriers in the location of avian suture zones in the Guiana Shield, northern Amazonia. The American Naturalist, 179(4), pp.E115-E132.

Nam, K. et al., 2010. Molecular evolution of genes in avian genomes. Genome biology, 11(6), p.R68.

Nyári, Á.S., 2007. Phylogeographic patterns, molecular and vocal differentiation, and species limits in Schiffornis turdina (Aves). Molecular phylogenetics and evolution, 44(1), pp.154-164.

Naka, L.N., Brumfield, R.T., 2018. The dual role of Amazonian rivers in the generation and maintenance of avian diversity. Science Advances 4, eaar8575. https://doi.org/10.1126/sciadv.aar8575

Nascimento, M.N., Martins, G.S., Cordeiro, R.C., Turcq, B., Moreira, L.S., Bush, M.B., 2019. Vegetation response to climatic changes in western Amazonia over the last 7,600 years. J Biogeogr 46, 2389–2406. https://doi.org/10.1111/jbi.13704

O’Connor, R.E. et al., 2018. Chromosome-level assembly reveals extensive rearrangement in saker falcon and budgerigar, but not ostrich, genomes. Genome biology, 19(1), p.171.

Olivares, I., Svenning, J.-C., van Bodegom, P.M., Balslev, H., 2015. Effects of Warming and Drought on the Vegetation and Plant Diversity in the Amazon Basin. The Botanical Review 81, 42–69. https://doi.org/10.1007/s12229-014-9149-8

Oliveros, C.H. et al., 2019. Earth history and the passerine superradiation. Proceedings of the National Academy of Sciences of the United States of America, 116(16), pp.7916–7925.

Oswald, J.A. et al., 2017. Isolation with asymmetric gene flow during the nonsynchronous divergence of dry forest birds. Molecular ecology, 26(5), pp.1386–1400.

Oswald, J.A. et al., 2019. Evolutionary dynamics of hybridization and introgression following the recent colonization of Glossy Ibis (Aves: Plegadis falcinellus) into the New World. Molecular ecology. Available at: http://dx.doi.org/10.1111/mec.15008.

Ott, A. et al., 2018. Linked read technology for assembling large complex and polyploid genomes. BMC genomics, 19(1), p.651.

Ozerov, M.Y. et al., 2018. Highly Continuous Genome Assembly of Eurasian Perch (Perca fluviatilis) Using Linked-Read Sequencing. G3, 8(12), pp.3737–3743.

Papadopoulou, A., Knowles, L.L., 2016. Toward a paradigm shift in comparative phylogeography driven by trait-based hypotheses. PNAS 113, 8018–8024. https://doi.org/10.1073/pnas.1601069113

Paradis, E., 2010. pegas: an R package for population genetics with an integrated–modular approach. Bioinformatics, 26(3), pp.419-420

Paten, B. et al., 2011. Cactus: Algorithms for genome multiple sequence alignment. Genome research, 21(9), pp.1512–1528.

Patton, A.H., Margres, M.J., Stahlke, A.R., Hendricks, S., Lewallen, K., Hamede, R.K., Ruiz-Aravena, M., Ryder, O., McCallum, H.I., Jones, M.E., Hohenlohe, P.A., Storfer, A., 2019. Contemporary demographic reconstruction methods are robust to genome assembly quality: A case study in Tasmanian Devils. Molecular Biology and Evolution msz191. https://doi.org/10.1093/molbev/msz191

Pedreschi, D., García-Rodríguez, O., Yannic, G., Cantarello, E., Diaz, A., Golicher, D., Korstjens, A.H., Heckel, G., Searle, J.B., Gillingham, P., Hardouin, E.A., Stewart, J.R., 2018. Challenging the European southern refugium hypothesis: Species-specific structures versus general patterns of genetic diversity and differentiation among small mammals. Global Ecology and Biogeography. https://doi.org/10.1111/geb.12828

Peona, V., Weissensteiner, M.H. & Suh, A., 2018. How complete are “complete” genome assemblies?—An avian perspective. Molecular ecology resources, 18(6), pp.1188–1195.

Peterson, A.T., Nyári, Á.S., 2008. Ecological Niche Conservatism and Pleistocene Refugia in the Thrush-Like Mourner, Schiffornis Sp., in the Neotropics. Evolution 62, 173–183. https://doi.org/10.1111/j.1558-5646.2007.00258.x

Pollock, H.S., Cheviron, Z.A., Agin, T.J., Brawn, J.D., 2015. Absence of microclimate selectivity in insectivorous birds of the Neotropical forest understory. Biological Conservation 188, 116–125. https://doi.org/10.1016/j.biocon.2014.11.013

Prates, I., Xue, A.T., Brown, J.L., Alvarado-Serrano, D.F., Rodrigues, M.T., Hickerson, M.J., Carnaval, A.C., 2016. Inferring responses to climate dynamics from historical demography in neotropical forest lizards. Proceedings of the National Academy of Sciences 113, 7978–7985. https://doi.org/10.1073/pnas.1601063113

Prost, S. et al., 2019. Comparative analyses identify genomic features potentially involved in the evolution of birds-of-paradise. GigaScience, 8(5). Available at: http://dx.doi.org/10.1093/gigascience/giz003.

Prum, R.O. et al., 2015. A comprehensive phylogeny of birds (Aves) using targeted next-generation DNA sequencing. Nature, 526(7574), pp.569–573.

Puckett, E.E., Munshi-South, J., 2019. Brown rat demography reveals pre-commensal structure in eastern Asia prior to expansion into Southeast Asia. Genome Res. gr.235754.118. https://doi.org/10.1101/gr.235754.118

Pulido-Santacruz, P., Aleixo, A., Weir, J.T., 2018. Morphologically cryptic Amazonian bird species pairs exhibit strong postzygotic reproductive isolation. Proceedings of the Royal Society B: Biological Sciences 285, 20172081. https://doi.org/10.1098/rspb.2017.2081

Punyasena, S.W., Mayle, F.E., McElwain, J.C., 2008. Quantitative estimates of glacial and Holocene temperature and precipitation change in lowland Amazonian Bolivia. Geology 36, 667–670. https://doi.org/10.1130/G24784A.1

Pupim, F.N., Sawakuchi, A.O., Almeida, R.P., Ribas, C.C., Kern, A.K., Hartmann, G.A., Chiessi, C.M., Tamura, L.N., Mineli, T.D., Savian, J.F., Grohmann, C.H., Bertassoli, D.J., Stern, A.G., Cruz, F.W., Cracraft, J., 2019. Chronology of Terra Firme formation in Amazonian lowlands reveals a dynamic Quaternary landscape. Quaternary Science Reviews 210, 154–163. https://doi.org/10.1016/j.quascirev.2019.03.008

R Core Team, 2019. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

Raftery, A.E., 1995. Bayesian model selection in social research. Sociological methodology, pp.111-163

Raikow, R.J., 1986. Why are there so many kinds of passerine birds? Systematic zoology, 35(2), pp.255–259

Ramírez‐Barahona, S., Eguiarte, L.E., 2013. The role of glacial cycles in promoting genetic diversity in the Neotropics: the case of cloud forests during the Last Glacial Maximum. Ecology and Evolution 3, 725–738. https://doi.org/10.1002/ece3.483

Ramos-Onsins, S.E. and Rozas, J., 2002. Statistical properties of new neutrality tests against population growth. Molecular biology and evolution, 19(12), pp.2092-2100.

Raposo do Amaral, F. et al., 2018. Recent chapters of Neotropical history overlooked in phylogeography: Shallow divergence explains phenotype and genotype uncoupling in Antilophia manakins. Molecular ecology, 27(20), pp.4108–4120.

Reid, B.N., Naro‐Maciel, E., Hahn, A.T., FitzSimmons, N.N., Gehara, M., 2019. Geography best explains global patterns of genetic diversity and postglacial co-expansion in marine turtles. Molecular Ecology 28, 3358–3370. https://doi.org/10.1111/mec.15165

Reis, L.S., Guimarães, J.T.F., Souza-Filho, P.W.M., Sahoo, P.K., de Figueiredo, M.M.J.C., de Souza, E.B., Giannini, T.C., 2017. Environmental and vegetation changes in southeastern Amazonia during the late Pleistocene and Holocene. Quaternary International 449, 83–105. https://doi.org/10.1016/j.quaint.2017.04.031

Ribas, C.C., Aleixo, A., Nogueira, A.C.R., Miyaki, C.Y., Cracraft, J., 2012. A palaeobiogeographic model for biotic diversification within Amazonia over the past three million years. Proceedings of the Royal Society B: Biological Sciences 279, 681–689. https://doi.org/10.1098/rspb.2011.1120

Ribas, C.C., Aleixo, A., Gubili, C., d’Horta, F.M., Brumfield, R.T., Cracraft, J., 2018. Biogeography and diversification of Rhegmatorhina (Aves: Thamnophilidae): Implications for the evolution of Amazonian landscapes during the Quaternary. Journal of Biogeography 45, 917–928. https://doi.org/10.1111/jbi.13169

Ribeiro, Â.M., Lloyd, P., Bowie, R.C.K., 2011. A Tight Balance Between Natural Selection and Gene Flow in a Southern African Arid-Zone Endemic Bird. Evolution 65, 3499–3514. https://doi.org/10.1111/j.1558-5646.2011.01397.x

Rocha, D.G. da, Kaefer, I.L., n.d. What has become of the refugia hypothesis to explain biological diversity in Amazonia? Ecology and Evolution 0. https://doi.org/10.1002/ece3.5051

Rocha, T.C., Sequeira, F., Aleixo, A., Rêgo, P.S., Sampaio, I., Schneider, H. and Vallinoto, M., 2015. Molecular phylogeny and diversification of a widespread Neotropical rainforest bird group: The Buff-throated Woodcreeper complex, Xiphorhynchus guttatus/susurrans (Aves: Dendrocolaptidae). Molecular phylogenetics and evolution, 85, pp.131-140.

Rossetti, D.F., Cohen, M.C.L., Pessenda, L.C.R., 2017. Vegetation Change in Southwestern Amazonia (Brazil) and Relationship to the Late Pleistocene and Holocene Climate. Radiocarbon 59, 69–89. https://doi.org/10.1017/RDC.2016.107

Rossetti, D.F., Gribel, R., Toledo, P.M., Tatumi, S.H., Yee, M., Tudela, D.R.G., Munita, C.S., Coelho, L. de S., 2018. Unfolding long-term Late Pleistocene–Holocene disturbances of forest communities in the southwestern Amazonian lowlands. Ecosphere 9, e02457. https://doi.org/10.1002/ecs2.2457

Rull, V., 2008. Speciation timing and neotropical biodiversity: the Tertiary–Quaternary debate in the light of molecular phylogenetic evidence. Molecular ecology, 17(11), pp.2722-2729.

Rull, V., 2011. Neotropical biodiversity: timing and potential drivers. Trends in Ecology & Evolution 26, 508–513. https://doi.org/10.1016/j.tree.2011.05.011

Rull, V., 2013. Some problems in the study of the origin of neotropical biodiversity using palaeoecological and molecular phylogenetic evidence. Systematics and Biodiversity 11, 415–423. https://doi.org/10.1080/14772000.2013.865682

Runemark, A. et al., 2018. Variation and constraints in hybrid genome formation. Nature ecology & evolution, 2(3), pp.549–556.

Saatchi, S., Asefi-Najafabady, S., Malhi, Y., Aragão, L.E.O.C., Anderson, L.O., Myneni, R.B., Nemani, R., 2013. Persistent effects of a severe drought on Amazonian forest canopy. PNAS 110, 565–570. https://doi.org/10.1073/pnas.1204651110

Sano, A. and Tachida, H., 2005. Gene genealogy and properties of test statistics of neutrality under population growth. Genetics, 169(3), pp.1687-1697.

Saupe, E.E., Farnsworth, A., Lunt, D.J., Sagoo, N., Pham, K.V., Field, D.J., 2019. Climatic shifts drove major contractions in avian latitudinal distributions throughout the Cenozoic. Proc Natl Acad Sci USA 116, 12895–12900. https://doi.org/10.1073/pnas.1903866116


Scheff, J., Seager, R., Liu, H., Coats, S., 2017. Are Glacials Dry? Consequences for Paleoclimatology and for Greenhouse Warming. J. Climate 30, 6593–6609. https://doi.org/10.1175/JCLI-D-16-0854.1

Schubert, M., Lindgreen, S., Orlando, L., 2016. AdapterRemoval v2: rapid adapter trimming, identification, and read merging. BMC Res Notes 9. https://doi.org/10.1186/s13104-016-1900-2

Schulman, L., Ruokolainen, K., Junikka, L., Sääksjärvi, I.E., Salo, M., Juvonen, S.-K., Salo, J., Higgins, M., 2007. Amazonian biodiversity and protected areas: do they meet? Biodivers Conserv 16, 3011–3051. https://doi.org/10.1007/s10531-007-9158-6

Schultz, E.D., Burney, C.W., Brumfield, R.T., Polo, E.M., Cracraft, J. and Ribas, C.C., 2017. Systematics and biogeography of the Automolus infuscatus complex (Aves; Furnariidae): Cryptic diversity reveals western Amazonia as the origin of a transcontinental radiation. Molecular phylogenetics and evolution, 107, pp.503-515.

Scordato, E.S.C., Wilkins, M.R., Semenov, G., Rubtsov, A.S., Kane, N.C., Safran, R.J., 2017. Genomic variation across two barn swallow hybrid zones reveals traits associated with divergence in sympatry and allopatry. Mol Ecol 26, 5676–5691. https://doi.org/10.1111/mec.14276

Sedlazeck, F.J. et al., 2018. Piercing the dark matter: bioinformatics of long-range sequencing and mapping. Nature Reviews Genetics, 19(6), pp.329–346. Available at: http://dx.doi.org/10.1038/s41576-018-0003-4.

Seeholzer, G.F., Claramunt, S. & Brumfield, R.T., 2017. Niche evolution and diversification in a Neotropical radiation of birds (Aves: Furnariidae). Evolution; international journal of organic evolution, 71(3), pp.702–715.

Shafer, A.B.A., Cullingham, C.I., Côté, S.D., Coltman, D.W., 2010. Of glaciers and refugia: a decade of study sheds new light on the phylogeography of northwestern North America. Molecular Ecology 19, 4589–4621. https://doi.org/10.1111/j.1365-294X.2010.04828.x

Shen, Y.-Y., Shi, P., Sun, Y.-B., Zhang, Y.-P., 2009. Relaxation of selective constraints on avian mitochondrial DNA following the degeneration of flight ability. Genome Res. 19, 1760–1765. https://doi.org/10.1101/gr.093138.109


Sherman, P.T., 1995. Breeding Biology of White-Winged Trumpeters (Psophia leucoptera) in Peru. The Auk 112, 285–295. https://doi.org/10.2307/4088717

Silva, A.C.A., Bragg, J.G., Potter, S., Fernandes, C., Coelho, M.M., Moritz, C., 2017. Tropical specialist vs. climate generalist: Diversification and demographic history of sister species of Carlia skinks from northwestern Australia. Molecular Ecology 26, 4045–4058. https://doi.org/10.1111/mec.14185

Silva, S.M., Peterson, A.T., Carneiro, L., Burlamaqui, T.C.T., Ribas, C.C., Sousa-Neves, T., Miranda, L.S., Fernandes, A.M., d’Horta, F.M., Araújo-Silva, L.E., Batista, R., Bandeira, C.H.M.M., Dantas, S.M., Ferreira, M., Martins, D.M., Oliveira, J., Rocha, T.C., Sardelli, C.H., Thom, G., Rêgo, P.S., Santos, M.P., Sequeira, F., Vallinoto, M., Aleixo, A., 2019. A dynamic continental moisture gradient drove Amazonian bird diversification. Science Advances 5, eaat5752. https://doi.org/10.1126/sciadv.aat5752

Simão, F.A. et al., 2015. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics, 31(19), pp.3210–3212.

Smeds, L. et al., 2015. Evolutionary analysis of the female-specific avian W chromosome. Nature communications, 6, p.7330.

Smeds, L., Qvarnström, A., Ellegren, H., 2016. Direct estimate of the rate of germline mutation in a bird. Genome Res. 26, 1211–1218. https://doi.org/10.1101/gr.204669.116

Smith, B.T. et al., 2013. Target capture and massively parallel sequencing of ultraconserved elements for comparative studies at shallow evolutionary time scales. Systematic Biology. Available at: https://academic.oup.com/sysbio/article-abstract/63/1/83/1689074.

Smith, B.T., McCormack, J.E., Cuervo, A.M., Hickerson, M.J., Aleixo, A., Cadena, C.D., Pérez-Emán, J., Burney, C.W., Xie, X., Harvey, M.G., Faircloth, B.C., Glenn, T.C., Derryberry, E.P., Prejean, J., Fields, S., Brumfield, R.T., 2014. The drivers of tropical speciation. Nature 515, 406–409. https://doi.org/10.1038/nature13687

Sohn, J. & Nam, J.W., 2016. The present and future of de novo whole-genome assembly. Briefings in bioinformatics. Available at: https://academic.oup.com/bib/article-abstract/19/1/23/2339783.


Sohn, J.-I. et al., 2018. Whole genome and transcriptome maps of the entirely black native Korean chicken breed Yeonsan Ogye. GigaScience, 7(7). Available at: http://dx.doi.org/10.1093/gigascience/giy086.

Sotero-Caio, C.G. et al., 2017. Evolution and Diversity of Transposable Elements in Vertebrate Genomes. Genome biology and evolution, 9(1), pp.161–177.

Stamatakis, A., 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 30(9), pp.1312–1313.

Steege, H. ter, Pitman, N.C.A., Sabatier, D., Baraloto, C., Salomão, R.P., Guevara, J.E., Phillips, O.L., Castilho, C.V., Magnusson, W.E., Molino, J.-F., Monteagudo, A., Vargas, P.N., Montero, J.C., Feldpausch, T.R., Coronado, E.N.H., Killeen, T.J., Mostacedo, B., Vasquez, R., Assis, R.L., Terborgh, J., Wittmann, F., Andrade, A., Laurance, W.F., Laurance, S.G.W., Marimon, B.S., Marimon, B.-H., Vieira, I.C.G., Amaral, I.L., Brienen, R., Castellanos, H., López, D.C., Duivenvoorden, J.F., Mogollón, H.F., Matos, F.D. de A., Dávila, N., García-Villacorta, R., Diaz, P.R.S., Costa, F., Emilio, T., Levis, C., Schietti, J., Souza, P., Alonso, A., Dallmeier, F., Montoya, A.J.D., Piedade, M.T.F., Araujo-Murakami, A., Arroyo, L., Gribel, R., Fine, P.V.A., Peres, C.A., Toledo, M., C, G.A.A., Baker, T.R., Cerón, C., Engel, J., Henkel, T.W., Maas, P., Petronelli, P., Stropp, J., Zartman, C.E., Daly, D., Neill, D., Silveira, M., Paredes, M.R., Chave, J., Filho, D. de A.L., Jørgensen, P.M., Fuentes, A., Schöngart, J., Valverde, F.C., Fiore, A.D., Jimenez, E.M., Mora, M.C.P., Phillips, J.F., Rivas, G., Andel, T.R. van, Hildebrand, P. von, Hoffman, B., Zent, E.L., Malhi, Y., Prieto, A., Rudas, A., Ruschell, A.R., Silva, N., Vos, V., Zent, S., Oliveira, A.A., Schutz, A.C., Gonzales, T., Nascimento, M.T., Ramirez-Angulo, H., Sierra, R., Tirado, M., Medina, M.N.U., Heijden, G. van der, Vela, C.I.A., Torre, E.V., Vriesendorp, C., Wang, O., Young, K.R., Baider, C., Balslev, H., Ferreira, C., Mesones, I., Torres-Lezama, A., Giraldo, L.E.U., Zagt, R., Alexiades, M.N., Hernandez, L., Huamantupa-Chuquimaco, I., Milliken, W., Cuenca, W.P., Pauletto, D., Sandoval, E.V., Gamarra, L.V., Dexter, K.G., Feeley, K., Lopez-Gonzalez, G., Silman, M.R., 2013. Hyperdominance in the Amazonian Tree Flora. Science 342, 1243092. https://doi.org/10.1126/science.1243092

Stewart, J.R., Lister, A.M., 2001. Cryptic northern refugia and the origins of the modern biota. Trends in Ecology & Evolution 16, 608–613. https://doi.org/10.1016/S0169-5347(01)02338-2

Stouffer, P.C., Bierregaard, R.O., 1995. Use of Amazonian Forest Fragments by Understory Insectivorous Birds. Ecology 76, 2429–2445. https://doi.org/10.2307/2265818


Stratford, J.A., Stouffer, P.C., 2015. Forest fragmentation alters microhabitat availability for Neotropical terrestrial insectivorous birds. Biological Conservation 188, 109–115. https://doi.org/10.1016/j.biocon.2015.01.017

Suh, A. et al., 2011. Mesozoic retroposons reveal parrots as the closest living relatives of passerine birds. Nature communications, 2, p.443.

Tamazian, G. et al., 2016. Chromosomer: a reference-based genome arrangement tool for producing draft chromosome sequences. GigaScience, 5(1), p.38.

Terhorst, J., Kamm, J.A., Song, Y.S., 2017. Robust and scalable inference of population history from hundreds of unphased whole genomes. Nature Genetics 49, 303–309. https://doi.org/10.1038/ng.3748

Thom, G. and Aleixo, A., 2015. Cryptic speciation in the white-shouldered antshrike (Thamnophilus aethiops, Aves–Thamnophilidae): the tale of a transcontinental radiation across rivers in lowland Amazonia and the northeastern Atlantic Forest. Molecular phylogenetics and evolution, 82, pp.95-110.

Thom, G., Amaral, F.R.D., Hickerson, M.J., Aleixo, A., Araujo-Silva, L.E., Ribas, C.C., Choueri, E., Miyaki, C.Y., 2018. Phenotypic and Genetic Structure Support Gene Flow Generating Gene Tree Discordances in an Amazonian Floodplain Endemic Species. Systematic Biology 67, 700–718. https://doi.org/10.1093/sysbio/syy004

Tigano, A., Sackton, T.B. & Friesen, V.L., 2018. Assembly and RNA-free annotation of highly heterozygous genomes: The case of the thick-billed murre (Uria lomvia). Molecular ecology resources, 18(1), pp.79–90.

Timmermann, A., Friedrich, T., 2016. Late Pleistocene climate drivers of early human migration. Nature 538, 92. https://doi.org/10.1038/nature19365

Toews, D.P.L., 2015. Evolution: A Genomic Guide to Bird Population History. Current Biology 25, R465–R467. https://doi.org/10.1016/j.cub.2015.04.008

Toews, D.P.L. et al., 2016. Plumage Genes and Little Else Distinguish the Genomes of Hybridizing Warblers. Current biology: CB, 26(17), pp.2313–2318.


Valderrama, E., Pérez‐Emán, J.L., Brumfield, R.T., Cuervo, A.M., Cadena, C.D., 2014. The influence of the complex topography and dynamic history of the montane Neotropics on the evolutionary differentiation of a cloud forest bird (Premnoplex brunnescens, Furnariidae). Journal of Biogeography 41, 1533–1546. https://doi.org/10.1111/jbi.12317

Van Dam, J.A., Aziz, H.A., Sierra, M.Á.Á., Hilgen, F.J., van den Hoek Ostende, L.W., Lourens, L.J., Mein, P., van Der Meulen, A.J. and Pelaez-Campomanes, P., 2006. Long-period astronomical forcing of mammal turnover. Nature, 443(7112), pp.687-691

Van der Auwera, G.A., Carneiro, M.O., Hartl, C., Poplin, R., del Angel, G., Levy-Moonshine, A., Jordan, T., Shakir, K., Roazen, D., Thibault, J., Banks, E., Garimella, K.V., Altshuler, D., Gabriel, S., DePristo, M.A., 2013. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr Protoc Bioinformatics 11, 11.10.1-11.10.33. https://doi.org/10.1002/0471250953.bi1110s43

Vijay, N., Park, C., Oh, J., Jin, S., Kern, E., Kim, H.W., Zhang, J., Park, J.-K., 2018. Population Genomic Analysis Reveals Contrasting Demographic Changes of Two Closely Related Dolphin Species in the Last Glacial. Mol Biol Evol 35, 2026–2033. https://doi.org/10.1093/molbev/msy108

Wallace, A.R., 1889. A narrative of travels on the Amazon and Rio Negro: with an account of the native tribes, and observations on the climate, geology, and natural history of the Amazon valley (No. 8). Ward, Lock.

Wang, X., Edwards, R.L., Auler, A.S., Cheng, H., Kong, X., Wang, Y., Cruz, F.W., Dorale, J.A., Chiang, H.-W., 2017. Hydroclimate changes across the Amazon lowlands over the past 45,000 years. Nature 541, 204–207. https://doi.org/10.1038/nature20787

Waterhouse, R.M. et al., 2017. BUSCO applications from quality assessments to gene prediction and phylogenomics. Molecular biology and evolution. Available at: http://dx.doi.org/10.1093/molbev/msx319.

Waterhouse, R.M. et al., 2013. OrthoDB: a hierarchical catalog of animal, fungal and bacterial orthologs. Nucleic acids research, 41(Database issue), pp.D358–65.

Waterhouse, R.M., Zdobnov, E.M. & Kriventseva, E.V., 2011. Correlating traits of gene retention, sequence divergence, duplicability and essentiality in vertebrates, arthropods, and fungi. Genome biology and evolution, 3, pp.75–86.

Weigelt, P., Steinbauer, M.J., Cabral, J.S., Kreft, H., 2016. Late Quaternary climate change shapes island biodiversity. Nature 532, 99–102. https://doi.org/10.1038/nature17443

Weir, J.T., 2006. Divergent Timing and Patterns of Species Accumulation in Lowland and Highland Neotropical Birds. Evolution 60, 842–855. https://doi.org/10.1111/j.0014-3820.2006.tb01161.x

Weir, J.T., Schluter, D., 2008. Calibrating the avian molecular clock. Molecular Ecology 17, 2321–2328. https://doi.org/10.1111/j.1365-294X.2008.03742.x

Weir, J.T. and Price, M., 2011. Andean uplift promotes lowland speciation through vicariance and dispersal in Dendrocincla woodcreepers. Molecular Ecology, 20(21), pp.4550-4563.

Weir, J.T., Faccio, M.S., Pulido-Santacruz, P., Barrera-Guzmán, A.O., Aleixo, A., 2015. Hybridization in headwater regions, and the role of rivers as drivers of speciation in Amazonian birds: Hybridization in Amazonian headwaters. Evolution 69, 1823–1834. https://doi.org/10.1111/evo.12696

Weir, J.T., Haddrath, O., Robertson, H.A., Colbourne, R.M., Baker, A.J., 2016. Explosive ice age diversification of kiwi. PNAS 113, E5580–E5587. https://doi.org/10.1073/pnas.1603795113

Weisenfeld, N.I. et al., 2017. Direct determination of diploid genome sequences. Genome Research, 27(5), pp.757–767.

Weissensteiner, M.H. et al., 2017. Combination of short-read, long-read, and optical mapping assemblies reveals large-scale tandem repeat arrays with population genetic implications. Genome research, 27(5), pp.697–708.

White, N.D., Mitter, C. & Braun, M.J., 2017. Ultraconserved elements resolve the phylogeny of potoos (Aves: Nyctibiidae). Journal of avian biology, 48(6), pp.872–880.

Willis, E.O., 1968. Taxonomy and behavior of Pale-faced Antbirds. The Auk, 85(2), pp.253–264.

Willis, E.O., 1969. On the behavior of five species of Rhegmatorhina, ant-following antbirds of the Amazon basin. The Wilson bulletin, pp.363–395.

Willis, E.O., 1976. Effects of a cold wave on an Amazonian avifauna in the upper Paraguay drainage, Western Mato Grosso, and suggestions on Oscine-Suboscine relationships. Acta Amaz. 6, 379–394. https://doi.org/10.1590/1809-43921976063379

Willson, S.K., 2004. Obligate Army-Ant-Following Birds: A Study of Ecology, Spatial Movement Patterns, and Behavior in Amazonian Peru. Ornithological Monographs 1–67. https://doi.org/10.2307/40166802

Wolfe, J.D., Stouffer, P.C., Seeholzer, G.F., 2014. Variation in tropical bird survival across longitude and guilds: a case study from the Amazon. Oikos 123, 964–970. https://doi.org/10.1111/oik.00849

Woodworth, B.K., Norris, D.R., Graham, B.A., Kahn, Z.A., Mennill, D.J., 2018. Hot temperatures during the dry season reduce survival of a resident tropical bird. Proc. R. Soc. B 285, 20180176. https://doi.org/10.1098/rspb.2018.0176

Wright, N.A., Gregory, T.R. & Witt, C.C., 2014. Metabolic “engines” of flight drive genome size reduction in birds. Proceedings. Biological sciences / The Royal Society, 281(1779), p.20132780.

Zhang, G. et al., 2014. Comparative genomics reveals insights into avian genome evolution and adaptation. Science, 346(6215), pp.1311–1320.

Zhang, G., 2015. Genomics: Bird sequencing project takes off. Nature 522, 34. https://doi.org/10.1038/522034d

Zheng, X., Levine, D., Shen, J., Gogarten, S.M., Laurie, C., Weir, B.S., 2012. A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics 28, 3326–3328. https://doi.org/10.1093/bioinformatics/bts606


Zheng, G.X.Y. et al., 2016. Haplotyping germline and cancer genomes with high-throughput linked-read sequencing. Nature biotechnology, 34(3), pp.303–311.

Zular, A., Sawakuchi, A.O., Chiessi, C.M., d’Horta, F.M., Cruz, F.W., Demattê, J.A.M., Ribas, C.C., Hartmann, G.A., Giannini, P.C.F., Soares, E.A.A., 2019. The role of abrupt climate change in the formation of an open vegetation enclave in northern Amazonia during the late Quaternary. Global and Planetary Change 172, 140–149. https://doi.org/10.1016/j.gloplacha.2018.09.006


Appendices

SFigure 1. PCA containing the 1000 closest simulations for three hABC models in A) NW Amazonia and B) SE Amazonia. The cross in the middle of each figure represents the observed summary statistics.

SFigure 2. The phylogeny used to inform the whole-genome alignment. The topology was extracted from Oliveros et al. (2018).


SFigure 3. Number of structural variants (insertions/deletions, duplications, rearrangements) mapped to Rhegmatorhina melanosticta chromosomes in relation to chromosome length. The red points represent the sex chromosomes.

SFigure 4. Phylogenetic relationships among some birds in the family Thamnophilidae. The topology was inferred using RAxML based on 4,870 GBS loci mapped to the R. melanosticta genome. The relationships are consistent with well-accepted phylogenetic hypotheses.


SFigure 5. Distribution of lengths of the scaffolds that form each chromosome (thin lines) in the Rhegmatorhina melanosticta genome. The red and green lines correspond to the W and Z chromosomes, respectively. The bold line corresponds to the length distribution of all scaffolds combined. The blue dashed line represents the scaffold N50 (3.2 Mb).


SFigure 6. Score density (y axes) of quality metrics used for data filtering for raw SNPs of all Rhegmatorhina genomes combined. The lines represent minimum or maximum values of the metrics suggested by the GATK best practices haplotype calling guide.


[Plot: "smcpp thinning and mu test"; x axis: Time (yr); y axis: Ne; legend labels: "berlep.108s, 3.1e−9", "108,k=500", "108s,k=500,g=2.3yr", "108s, 3.1e−9", "108s, 7e−9"]

SFigure 7. Effects of data thinning and different substitution rates on demographic reconstruction.

[Plot: "P. napensis, fold test"; x axis: Time (yr); y axis: Ne; legend labels: 2-fold (default), 3-fold, 4-fold, 5-fold]

SFigure 8. Effects of using different numbers of cross-validation folds on demographic reconstruction.


[Plot: x axis: Time (yr); y axis: Ne; legend labels: piecewise, piecewise-cubic, cubic]

SFigure 9. Comparison of regularization methods for population size plotting for Psophia napensis. Default knot numbers (eight changes in size) were used, resulting in loss of some patterns of population size change.

[Plot: x axis: Time (yr); y axis: Ne]

SFigure 10. Evaluation of the effects of dataset size and contiguity on demographic history reconstruction of Psophia populations. Dashed lines represent scaffolds over 500 Kb in length (~550 Mb) and solid lines represent scaffolds over 1 Mb (~280 Mb).
