Louisiana State University LSU Digital Commons

LSU Doctoral Dissertations Graduate School

March 2020

Genomics and Population History of Black-headed Bulbul (Brachypodius atriceps) Color Morphs

Subir B. Shakya Louisiana State University and Agricultural and Mechanical College

Follow this and additional works at: https://digitalcommons.lsu.edu/gradschool_dissertations

Part of the Biology Commons

Recommended Citation Shakya, Subir B., "Genomics and Population History of Black-headed Bulbul (Brachypodius atriceps) Color Morphs" (2020). LSU Doctoral Dissertations. 5187. https://digitalcommons.lsu.edu/gradschool_dissertations/5187

This Dissertation is brought to you for free and open access by the Graduate School at LSU Digital Commons. It has been accepted for inclusion in LSU Doctoral Dissertations by an authorized graduate school editor of LSU Digital Commons. For more information, please [email protected].

GENOMICS AND POPULATION HISTORY OF BLACK-HEADED BULBUL (BRACHYPODIUS ATRICEPS) COLOR MORPHS

A Dissertation

Submitted to the Graduate Faculty of the Louisiana State University and Agricultural and Mechanical College in partial fulfillment of the requirements for the degree of Doctor of Philosophy

in

The Department of Biological Sciences

by Subir B. Shakya

B.Sc., Southern Arkansas University, 2014

May 2020

ACKNOWLEDGMENTS

A dissertation represents not only the effort of a single candidate but a document highlighting the roles and endeavors of many people and institutions. To this end, I have a lot of people and institutions to thank, without whom this dissertation would never have been completed. First and foremost, I would like to thank my advisor, Dr. Frederick H. Sheldon, who has guided me through the six years of my Ph.D. studies. He has not only believed in me throughout this process but also nudged me along academic paths that I would never have taken on my own. As a great writer, editor, and mentor, he has helped me polish this dissertation and all my projects at LSU, allowing me to produce some great research. This dissertation would also not have been possible without the aid and advice of my graduate committee mentors, Dr. Robb T. Brumfield and Dr. Brant C. Faircloth, as well as my Dean’s Representative, Dr. Javier G. Nevarez. All of them have helped me enormously with planning, implementing, and writing my dissertation. They were never afraid to ask me questions and help me think more critically about my project and what I should do next. My parents, Birendra B. Shakya and Suravi Shakya, and sister, Sneha Shakya, have been instrumental in getting me to where I am now. They have supported me from the very beginning and always have comforting words for me when I am struggling. Living halfway around the world is never an easy thing, so support from my family was integral. If they had not acknowledged my interest in natural history, pushed me to pursue a degree in the U.S., and allowed me to work on what I really loved, this dissertation would never have been started. None of this work would ever have been possible without the facilities at the LSU Museum of Natural Sciences and its faculty and staff. This was my home away from home. I would like to thank Dr. Van Remsen, Dr. Chris Austin, Dr. Prosanta Chakrabarty, and Dr. 
Jake Esselstyn, who have been a great help and have provided excellent guidance whenever I had any questions or issues. Tammie Jackson, who basically holds the museum upright, deserves exceptional commendation for help with any and all academic, financial, travel, and permitting work that requires jumping bureaucratic hurdles. No expedition can be contemplated without the excellent work of Tammie. Valerie Deroun’s help with all expense paperwork, outreach organization, and even mundane tasks was greatly appreciated. Finally, nothing related to the collection and specimen/tissue work would be doable without the knowledge and expertise of Steve Cardiff and Donna Dittmann. Navigating the import/export process with specimens takes a lot of work and effort, and Steve and Donna have effortlessly helped me and everyone else navigate those tough waters. They also instilled knowledge about bird preparation and curation, which has really helped with smooth functioning in the field and in the museum collections. I would like to thank colleagues at Louisiana State University Museum of Natural Science, University of Kansas Biodiversity Institute and Natural History Museum, Yale Peabody Museum, Field Museum of Natural History, and California Academy of Sciences for providing tissues and toe-pads for DNA extraction. For field work in Indonesia, I would also like to thank the Indonesia State Ministry of Research and Technology (RISTEK research license numbers 388/SIP/FRP/E5/Dit.KI/XI/2016, 389/SIP/FRP/E5/Dit.KI/XI/2016, 107/FRP/E5/Dit.KI/I/2018, 1079/FRP/E5/Dit.KI/II/2018, and 1126/FRP/E5/Dit.KI/I/2018) and the Ministry of Forestry for permission to undertake research on Maratua, Bawean, Kalimantan, and Sumatra islands. I also thank the Research Center for Biology, Indonesian Institute of Science, and the Museum Zoologicum Bogoriensis. Their staff, and particularly Dewi Prawiradilaga, Tri Haryoko,
Mohammand Irham, and Suparno, helped with permitting, fieldwork, field logistics, and specimen export and, in the process, became good friends. For specimens from Malaysia, I thank the Malaysian Prime Minister’s Department, the Chief Ministers’ Departments of Sarawak and Sabah; Sarawak Forest Department, Forestry Corporation, and Biodiversity Institute; the Universiti Malaysia Sarawak; and Sabah Parks, Museum, and Wildlife Department. Funding for the project and fieldwork was provided largely through grants acquired by my advisor Dr. Sheldon from the NSF (DEB-0228688 and DEB-1241059), National Geographic Society (8753-10), and Coypu Foundation. I also received funding from the American Museum of Natural History, American Ornithologists’ Union, and LSU Museum of Natural Science for various portions of my dissertation. Current and former members of the Sheldon lab have been a great help and source of support for my dissertation. Dr. Haw Chuan Lim has been an influential force in helping me formulate the project and ideas on how to proceed in the face of various difficulties. Dr. Clare Brown helped a lot with computational and bioinformatics work, and her suggestions on best practices for presenting dissertation research were greatly appreciated. Vivien Chua helped with a lot of lab work and the collection of key specimens. Ryan Burner helped train me as a field ornithologist and helped me greatly during expeditions in Malaysia and Indonesia. Matt Brady, along with Oscar Johnson, Jon Nations, and Heru Handika, also helped a lot with field work. I would like to thank all my friends and colleagues at LSU for their moral and academic support. I started graduate school with an exceptional cohort (Joanna Griffiths, Oscar Johnson, Rafael Marcondes, Genevieve Mount, Zachary Rodriguez, and Mark Swanson), and we all have held each other’s hands as we moved along with our dissertations.
I would also like to thank a number of other current and former graduate students and post-docs at the Museum: Dr. Mike Harvey, Dr. Glenn Seeholzer, Dr. Ryan Terrill, Dr. Jessica Oswald, Dr. Cathy Newman, Dr. Fernando Alda, Dr. William (Bill) Ludt, Dr. Carl Oliveros, Andre Moncrieff, Marco Rego, Glaucia Del-Rio, Jon Nations, Jessie Salter, Anna Hiller, Alec Turner, Pam Hart, Diego Elias, Jackson Roberts, Eamon Corbett, Spenser Babb-Biernacki, Heru Handika, and Sheila Rodriguez. Alongside my friends at the Museum, other graduate students from the Department of Biological Sciences and the Department of Renewable Natural Resources have been exceptional in my development at LSU: Michael Henson, Alicia Riegel, Elisa Elizondo, Amie Settlecowski, Allison Snider, Cameron Rutt, and Vitek Jirinek. To everyone else who has influenced me through my time at LSU, I am greatly thankful. Finally, I would like to thank Dr. Martin Päckert and Sontaya Manawatthana for providing sequences from their studies. I also acknowledge the help of Dr. Rebecca Kimball and two anonymous reviewers for insightful comments that improved the manuscript based on Chapter 2, which was published in the journal Ibis. I am thankful for the B10K consortium for providing annotation data for the genome of the Black-headed Bulbul (Brachypodius atriceps), the main subject of the dissertation. I thank Dr. Daniel Portik and Dr. Kevin Winker for their help with running dadi models.

TABLE OF CONTENTS

ACKNOWLEDGMENTS

ABSTRACT

CHAPTER 1. INTRODUCTION

CHAPTER 2. THE PHYLOGENY OF THE WORLD'S BULBULS (PYCNONOTIDAE) INFERRED USING A SUPERMATRIX APPROACH

CHAPTER 3. GENETIC INVESTIGATION OF INSULAR MELANISM IN BLACK-HEADED BULBUL (BRACHYPODIUS ATRICEPS) OF SOUTHEAST ASIA

CHAPTER 4. HISTORICAL DEMOGRAPHY OF THE BLACK-HEADED BULBUL (BRACHYPODIUS ATRICEPS) ON MARATUA AND BAWEAN ISLANDS, INDONESIA

CHAPTER 5. CONCLUSIONS

APPENDIX A. SUPPLEMENTARY MATERIAL FOR CHAPTER 2

APPENDIX B. COPYRIGHT INFORMATION FOR CHAPTER 2

REFERENCES

VITA

ABSTRACT

Intraspecific polymorphism in birds, especially color polymorphism, is an area of active research in evolutionary biology. Using whole-genome sequencing (WGS), it is possible to investigate the genetic basis of such polymorphisms and gain a basic understanding of the mechanics of bird coloration. In this dissertation, I applied WGS to uncover the potential genetic underpinnings of color polymorphism in the Black-headed Bulbul (Brachypodius atriceps) of Southeast Asia. This species was selected because of the heterogeneous dispersion of two morphs across its range: a yellow form predominating on mainland Asia and the Greater Sunda Islands and a gray morph on two islands (Bawean, a continental island in the Java Sea, and Maratua, an oceanic island off the east coast of Borneo). I approached this project from three angles. First, I reconstructed the phylogeny of the bulbul family, Pycnonotidae, to examine patterns of coloration among all species and infer the commonality of color changes relevant to B. atriceps. To build the phylogenetic tree, I used a super-matrix approach, which allowed the inclusion of 121 of the 130 known species of bulbuls. Using the tree, I determined the most appropriate outgroups for comparison with B. atriceps in subsequent genomic study. Next, I generated a high-quality reference genome of a yellow individual of B. atriceps and, subsequently, sequenced low-coverage genomes of multiple gray and yellow individuals, as well as three outgroup taxa. I compared Fst values between genomes of gray and yellow individuals to locate peaks of divergence and identify potential candidate loci for the color polymorphism. I also tested the protein-coding genes between yellow and gray birds for signs of selection. Among genes potentially responsible for the color polymorphism, several are involved in lipid uptake, transport, and deposition, processes fundamental to carotenoid expression. In the final chapter, I assessed population characteristics among B. atriceps populations across the species' range in Sundaland, with an emphasis on Bawean and Maratua islands. The Bawean population was barely discernible genetically from that on mainland Borneo. The Maratua population, however, was notably divergent from the mainland Bornean and other populations. Therefore, I modelled its demographic parameters and used the information to gain a better idea of the historical processes that have led to its unique coloration. The Maratua population was originally isolated from other Sundaic populations c. 1.9 Ma, but c. 1000 years ago began to experience a small amount of gene flow.

CHAPTER 1. INTRODUCTION

Examples of intraspecific polymorphism abound in nature. Polymorphism can manifest itself in phenotype (e.g., color), behavior, or simple genetic differences (e.g., SNPs). Genetic polymorphism and morphological variation within populations often act as precursors to speciation (Mayr 1954; West-Eberhard 1986). Thus, an understanding of the genetics of such polymorphisms can yield insight into the process of evolution itself.

Intraspecific polymorphism, especially color variation reflected as color “morphs”, is common in bird species (Galeotti et al. 2003). In some groups, such as cuckoos and owls, more than 10% of the species exhibit some form of color polymorphism, whereas only 1% of passerines do (Galeotti et al. 2003). One species that exhibits dramatic polymorphism, however, is the Black-headed Bulbul (Brachypodius atriceps) of Southeast Asia. It has two morphs, yellow and gray. The yellow morph is bright yellow with a dark black head; the gray morph is similar, except that it lacks any yellow whatsoever. In some cases, partially gray birds also occur in populations, especially in northeast India. In these cases, individuals lack yellow plumage in their underparts but retain yellow wings and tail. The gray morph is rare throughout B. atriceps’ range, except on two islands on the Sunda continental shelf. On Bawean Island, a continental island in the Java Sea between Borneo and Java, gray and yellow individuals are about equally common. On Maratua Island, an oceanic island off the east coast of Borneo, the entire population is gray.

Originally, the gray populations of B. atriceps on Bawean and Maratua islands were described as different species, baweanus and hodiernus, respectively. Both were later subsumed into a single species by Mayr and Greenway (1960). Subsequent observations of mixed pairs on Bawean Island further strengthened the case that gray and yellow birds are color morphs of the same species (Hoogerwerf 1966).
Some recent authorities, however, have suggested splitting the Bawean, Maratua, and Mentawai (subspecies hyperemnus) populations into different species (Eaton et al. 2016).

Historically, the color polymorphism in B. atriceps was assumed to result from a lack of yellow pigments. Hume (1878) noted that applying carbolic acid to yellow feathers of B. atriceps removed the yellow coloration, leading to a typical-looking gray bird. Hume (1878) postulated that the skin had ceased production of yellow pigment, leading to the lack of yellow coloration in gray morphs. We now know that bright yellow coloration in birds in general is caused mainly by pigments called carotenoids (Hill and McGraw 2006). The genetic basis of carotenoid pigmentation has been investigated, but little is known about the role of various genes in carotenoid pathways (Walsh et al. 2012; Lopes et al. 2016; Mundy et al. 2016; Toomey et al. 2017). The loss of carotenoid pigmentation could be caused by a loss of the pigment itself or by a mutation within the genes responsible for carotenoid absorption, transport, or deposition.

In this dissertation, I examine the genetic basis of the color polymorphism in B. atriceps in an attempt to understand the phylogenetic events and phylogeographic forces that have resulted in the establishment of gray morphs on Maratua and Bawean islands. In Chapter 2, I reconstruct the phylogeny of the bulbul family, Pycnonotidae. The phylogeny not only provides information about genealogical relationships among species, including B. atriceps, but also can be used to illuminate patterns that are key to understanding bulbul evolution. In this case, the patterns and corresponding inferences that are of paramount importance include the commonality of color changes and the potential roles of convergence and parallelism in determining these changes. The family Pycnonotidae includes approximately 130 species (Shakya and Sheldon 2017). However, bright yellow carotenoid pigmentation is not a common characteristic of bulbuls. It occurs only in a few taxa, including B. atriceps, the melanicterus-complex, R. squamatus, and Thapsinillas species. To reconstruct the phylogeny, I applied a super-matrix approach, using published sequences from repositories such as GenBank supplemented by new sequences I generated for missing taxa from tissues available in the LSU Museum of Natural Science’s Genetic Resources Collection and collections at other museums. In all, I compared sequences from 121 species of bulbul and used maximum likelihood and Bayesian methods to estimate the phylogeny.

The feasibility of whole-genome sequencing (WGS) has revolutionized the field of biology. It is now possible to compare whole genomes from multiple individuals as well as multiple species to investigate fundamental questions in evolution, ecology, and other aspects of biology. WGS has been applied to illuminate genes and gene pathways involved in various biological processes, including coloration. In Chapter 3, I use WGS to produce a high-quality reference genome from one yellow individual of B. atriceps. I also resequenced at low coverage the genomes of 17 other individuals, including six gray birds from Maratua Island, one gray bird from Bawean Island, and ten yellow birds from Borneo, Sumatra, Palawan, and the Mentawai Islands, as well as three outgroup taxa. Then I scanned the genomes of yellow and gray individuals for islands of genetic divergence signaled by high Fst values, which are often associated with phenotypic characters that are fixed between populations (e.g., Lamichhaney, Berglund, et al. 2015; Lamichhaney, Fan, et al. 2015; Toomey et al. 2018; Kim et al. 2019).
Because the gray morphs appear on two geologically distinct islands (Bawean and Maratua) that are separated by large bodies of land and water, I also investigated whether the gray morphs on the two islands resulted from convergent or parallel evolution and tested for selection among protein-coding genes.

Since small islands play a role in the phenotypic dispersion of B. atriceps morphs, in Chapter 4 I examine the role of geography in shaping the population history of the species. Mayr (1954) noted the common occurrence of morphologically distinct populations of mainland species on peripheral islands. Several hypotheses have been postulated to explain this phenomenon (Mayr 1954; Carson 1968; Templeton 1980). Using whole genome sequences, I assessed population characteristics among B. atriceps populations across Sundaland. I also tested models that estimate various demographic parameters for the Maratua Island population. These parameters include time of divergence, gene flow, and effective population sizes. The modeled estimates paint a picture of the population dynamics and environmental conditions under which the Maratua Island population was established.

This dissertation is an amalgamation of various avenues of biological inference designed to provide insight into the evolution and natural history of Brachypodius atriceps color morphs. We are slowly beginning to discover the genes and unravel the genetic pathways involved in the variation and evolution of vertebrate color. In particular, this dissertation adds to the accumulating body of research on genes responsible for carotenoid coloration. It also adds to our understanding of the role of continental versus oceanic islands in population diversification. Finally, because B. atriceps, like many other colorful birds in Southeast Asia, is under extreme pressure from pet traders, the dissertation serves to highlight the biological importance and conservation plight of these beautiful bulbuls in a changing world.

CHAPTER 2. THE PHYLOGENY OF THE WORLD'S BULBULS (PYCNONOTIDAE) INFERRED USING A SUPERMATRIX APPROACH

INTRODUCTION

Bulbuls are a prominent group of passerines in the Old World tropics and subtropics. They are generalists, feeding on a wide variety of fruits and insects, and they play a particularly important ecological role as seed dispersers (e.g., Corlett 1998). They are also common garden birds and known for their melodious songs. Several species are kept in captivity and some, such as the Straw-headed Bulbul Pycnonotus zeylanicus, have suffered severe population declines due to their popularity as cage-birds (Eaton et al. 2015). Bulbuls have been introduced in many parts of the world, with species such as the Red-whiskered Bulbul P. jocosus flourishing in such disparate places as Florida, Hawaii, and coastal Australia. Currently, 27 genera and 130 species of bulbuls are recognized (Dickinson and Christidis 2014). A diagnostic characteristic of this family is a thin sheet of bone in the operculate nostrils (Fishpool and Tobias 2005). Most species also have drab or dark-coloured fluffy plumage, with characteristically colourful patches of feathers, especially on the throat and abdomen. The family is distributed throughout Africa and most of South and South-east Asia but is absent (barring introductions) from Europe and east of Wallacea.

Several molecular studies have examined the phylogenetic relationships of species within the Pycnonotidae (Pasquet et al. 2001; Moyle and Marks 2006; Johansson et al. 2007; Oliveros and Moyle 2010; Zuccon and Ericson 2010; Fuchs et al. 2015). Pasquet et al. (2001) used two mitochondrial markers, 12S and 16S, to compare 27 species and demonstrated that the family is divisible into two fundamental clades, one from Africa and another from Asia. Moyle and Marks (2006) compared 57 species and used both mitochondrial and nuclear genes to determine relationships within the family. Like Pasquet et al.
(2001), they recovered African and Asian clades and noted that a few members of the Asian clade had subsequently dispersed into Africa, including Madagascar (Warren et al. 2005; Moyle and Marks 2006). Johansson et al. (2007) compared numerous African members of the family using three nuclear genes and ended up splitting several African genera. Most recently, molecular studies have concentrated on phylogenetic relationships within specific groups, such as the Philippine bulbuls (Oliveros and Moyle 2010), Sundaic bulbuls (Dejtaradol et al. 2015), Alophoixus (Fuchs et al. 2015) and Bleda (Huntley and Voelker 2016). Oliveros and Moyle (2010), for example, redefined the limits of several Asian taxa to encompass monophyletic groups. Despite these advances, the classification of bulbul genera is still in flux. This is especially true in the largest bulbul genus, Pycnonotus (41 species), which is almost certainly polyphyletic, and yet little has been done to revise its classification (Fishpool and Tobias 2005).

Recent studies at higher phylogenetic levels have caused several putative members of the Pycnonotidae to be assigned to other families. Five taxa in Phyllastrephus are now considered members of the endemic Malagasy family Bernieridae (Cibois et al. 2001; 2010; Gill and Donsker 2005). Similarly, three species of the genus Nicator are currently placed in their own family Nicatoridae (Beresford et al. 2005; Johansson, Fjeldså, and Bowie 2008; Fregin et al. 2012). The phylogenetic position of Malia grata is still in doubt; one possibility is that it belongs in the Locustellidae (Oliveros, Reddy, and Moyle 2012). It turns out that most of these former “bulbuls” lack operculate nostrils, strengthening the usefulness of this character in defining the group. Nevertheless, at least one true pycnonotid, the Black-collared Bulbul Neolestes torquatus, lacks it (Fishpool and Tobias 2005).

This chapter was previously published as Shakya, S.B., and F.H. Sheldon. 2017. “The Phylogeny of the World’s Bulbuls (Pycnonotidae) Inferred Using a Supermatrix Approach.” Ibis 159: 498–509. Reprinted by permission from John Wiley & Sons, Inc.

The recent, substantial restructuring of African and Asian bulbul classification demonstrates how molecular studies have helped improve our understanding of bulbul relationships, but it also indicates how much more remains to be done. Limited taxon sampling has severely undermined efforts at phylogenetic reconstruction. So far, most molecular studies of the Pycnonotidae have either sampled broadly at the intergeneric level or focused on specific genera or biogeographic groups. A comprehensive phylogenetic reconstruction of the entire family is needed. Fortunately, DNA sequences from all previous molecular studies are available to serve as a base for such a study. In addition, tissue samples have become available for a few bulbul taxa that were not compared in earlier studies, and further gaps in sampling may be filled using toe-pad DNA from traditional museum specimens. Given these sources of data, it is now possible to infer phylogenetic relationships among a much broader group of bulbuls than ever before using a supermatrix approach.

Supermatrices constructed from DNA sequences available in public databases, such as GenBank, have been successfully used to reconstruct phylogeny (e.g., de Queiroz and Gatesy 2007; Thomson and Shaffer 2010). The strengths and weaknesses of the supermatrix approach have been well considered, especially the problems posed by large amounts of missing sequence data (Lemmon et al.
2009; Roure, Baurain, and Philippe 2013; Hosner, Braun, and Kimball 2016; Streicher, Schulte, and Wiens 2016). Proponents of supermatrices recognize this as an important issue, but suggest it is minor given the advantages of greater taxic coverage. Moreover, one particularly useful feature of supermatrices is that they can always be enhanced as future studies produce more sequences and improve sampling.

Here I use DNA sequences from previous phylogenetic projects coupled with new sequences derived from fresh tissues and toe-pads of missing species to generate a more comprehensive tree of bulbul intrafamilial relationships. In all, I compare 121 of the 130 species of bulbuls (Dickinson and Christidis 2014), as well as individuals of several recently recognized species (i.e., subspecies raised to species level). These comparisons substantially improve our understanding of phylogenetic relationships within the Pycnonotidae and, thus, help with the formulation of a more accurate classification. The comparisons also provide a view of the efficacy, strengths, and weaknesses of the supermatrix approach to phylogenetic reconstruction.

METHODS

All available sequences of bulbuls (and a few related species) were downloaded from GenBank (http://www.ncbi.nlm.nih.gov/genbank). For each species, I manually selected a single sequence representing each locus, and each sequence-type was binned into a specific folder based on its locus. Whenever possible, I chose sequences from the same individual and used the minimum number of individuals per species. If sequences were from multiple individuals, I made sure they were members of the same subspecies group. For all sequences, I also checked tissue and specimen numbers to ensure correct species assignment of any taxon that has been split recently. I followed Dickinson and Christidis (2014) for most species-level assignments. I also included sequences of taxa designated as subspecies by Dickinson and Christidis (2014) that have recently been raised to species status by other taxonomists. Four species, three of which were formerly considered bulbuls, were included as outgroups: Malia Malia grata, Long-billed Bernieria Bernieria madagascariensis, Yellow-spotted Nicator Nicator chloris and Barn Swallow Hirundo rustica. Sequences in bins were aligned using MUSCLE (Edgar 2004) and manually checked for anomalous sequences or misidentified loci. The trailing ends of each alignment were trimmed to provide maximum sequence overlap (i.e. each nucleotide site was present in at least 50% of the taxa).

In addition, I sequenced loci of bulbul species not in GenBank (Table S1). Total genomic DNA was extracted from frozen or alcohol-preserved tissue or blood samples using the DNeasy® Blood and Tissue Kit (Qiagen) and the manufacturer’s protocol. PCR amplifications were performed in 25 µl reactions using Taq DNA Polymerase (New England BioLabs Inc). The primer pairs L5215 (Hackett 1996) and HTrpC (STRI), L10755 and H11151 (Chesser 1999), and MB-2R and MB-3F (Kimball et al. 2009) were used to amplify the loci NADH dehydrogenase subunit 2 (ND2), NADH dehydrogenase subunit 3 (ND3) and Myoglobin intron 2 (MB-I2), respectively. Thermocycler runs were set to 34 cycles with a denaturing temperature of 95°C, an annealing temperature that varied with the primer pair used, and an extension temperature of 72°C. The PCR products were visualized in 1% agarose gel stained with SYBR® Safe DNA Gel Stain (Invitrogen). The PCR products were sequenced by Beckman Coulter Genomics (Danvers, MA).

DNA from toepads was extracted using well-established ancient-DNA protocols (Sheldon et al. 2012). ND2 was amplified using primers designed for Black-headed Bulbul Pycnonotus atriceps (Chua et al. 2015). Because several amplification efforts using these primer pairs failed, I used forward primers L5215 (Hackett 1996) and P. atriceps F2, F3, F4, and F5 (Chua et al. 2015), and reverse primers Copsychus malabaricus R1, R2, R3, R4 (Chua et al. 2015) and HTrpC (STRI) to amplify the missing fragments.

Including GenBank and newly generated sequences, I had seven mitochondrial genes and seven nuclear loci (Table 2.1). These were aligned using MUSCLE (Edgar 2004). ML tree searches were conducted for each locus using RAxML v8 (Stamatakis 2014) to see if the sequences produced reasonable bulbul trees (i.e. did not include anomalous or paralogous loci). After this check, the loci were concatenated into a single supermatrix using a custom python script. Substitution models were estimated using PartitionFinder v1.1.1 (Lanfear et al. 2012) and partitioned by locus. Mitochondrial loci were further partitioned by codon position. I implemented a greedy search algorithm, specified blocks as described above, and used the BIC criterion to find the best partitioning scheme.

ML searches were run on the concatenated supermatrix using RAxML v8 (Stamatakis 2014) implemented through the CIPRES Science Gateway (Miller, Pfeiffer, and Schwarz 2010). The supermatrix was partitioned based on the best scheme obtained from PartitionFinder, and the GTRGAMMA model was used for each partition. To obtain branch support values on the best tree generated from RAxML, 1000 non-parametric bootstrap replicates were run. Bayesian tree reconstruction was performed using MrBayes v3 (Huelsenbeck and Ronquist 2001; Ronquist and Huelsenbeck 2003) implemented through the CIPRES Science Gateway (Miller, Pfeiffer, and Schwarz 2010), using the same partitioning scheme as the ML search. Two runs, each with four chains, were performed for 10,000,000 generations, sampling every 1000 generations. A burn-in of 250,000 was discarded and convergence of runs was checked in Tracer v1.6 (Rambaut et al. 2014).
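The concatenation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the custom script used in this study: the function names (`concatenate`, `raxml_partition_lines`), the toy taxa and sequences, and the choice of `?` to pad loci that a taxon lacks are all hypothetical.

```python
from typing import Dict, List, Tuple

Alignment = Dict[str, str]  # taxon name -> aligned sequence


def concatenate(
    alignments: Dict[str, Alignment], missing: str = "?"
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Join per-locus alignments into a single supermatrix.

    Taxa lacking a locus are padded with `missing` for that locus.
    Returns the supermatrix and 1-based inclusive column ranges per locus.
    """
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    rows: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    start = 1
    for locus, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{locus}: sequences must be aligned to one length")
        width = lengths.pop()
        for taxon in taxa:
            rows[taxon].append(aln.get(taxon, missing * width))
        partitions.append((locus, start, start + width - 1))
        start += width
    return {taxon: "".join(parts) for taxon, parts in rows.items()}, partitions


def raxml_partition_lines(partitions: List[Tuple[str, int, int]]) -> List[str]:
    """Format locus ranges in the 'DNA, name = start-end' partition style."""
    return [f"DNA, {name} = {first}-{last}" for name, first, last in partitions]


# Toy input (hypothetical): two loci, three taxa with uneven sampling.
toy = {
    "ND2": {"taxon_A": "ACGT", "taxon_B": "ACGA"},
    "ND3": {"taxon_A": "GG", "taxon_C": "GA"},
}
supermatrix, partitions = concatenate(toy)
for taxon, seq in supermatrix.items():
    print(taxon, seq)
print("\n".join(raxml_partition_lines(partitions)))
```

Recording the column range of each locus during concatenation is what allows the matrix to be partitioned by locus afterwards; splitting mitochondrial loci by codon position would need an additional step that is not shown here.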

Because of missing and short sequences, and sequences containing numerous unresolved sites, the complete matrix of sequences produced trees with many unresolved nodes. To improve resolution, I maximized species’ sequence coverage by reducing the number of loci. A smaller matrix was constructed using loci that occurred in at least 1/3 of the taxa. This reduced supermatrix consisted of 3 mitochondrial genes and 5 nuclear loci (Table 2.1). I also identified ‘rogue taxa’ (i.e. taxa that move around disproportionately during tree reconstructions, thereby reducing branch resolution) in ML tree searches of the complete supermatrix using RogueNaRok (Aberer, Krompass, and Stamatakis 2013). Several of these ‘rogue taxa’ were pruned from the dataset before compiling the reduced matrix. Fortunately, some of the ‘rogue taxa’ were also taxa that had already been pruned from the large matrix because of sequence-quality problems. Using the reduced matrix, I repeated ML and Bayesian tree searches.
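The one-third coverage rule can be illustrated with a short sketch. The taxon counts below are copied from Table 2.1; the denominator of 125 taxa (121 bulbuls plus four outgroups) is an assumption, chosen because it matches the coverage percentages reported in the Results, and the function name `loci_above_threshold` is hypothetical.

```python
# Number of taxa sequenced per locus, copied from Table 2.1.
# "TGFB2-I5" is an ASCII spelling of the TGF-beta-2 intron 5 abbreviation.
TAXA_PER_LOCUS = {
    "12S": 34, "16S": 35, "ATP6": 17, "COI": 37, "CYTB": 58,
    "ND2": 121, "ND3": 106, "FIB-I5": 64, "FIB-I7": 79,
    "GAPDH-I11": 16, "MB-I2": 80, "ODC-I6/I7": 70, "RAG1": 17,
    "TGFB2-I5": 50,
}
TOTAL_TAXA = 125  # assumption: 121 bulbul species + 4 outgroups


def loci_above_threshold(counts, total, min_fraction=1 / 3):
    """Return the loci sampled in at least `min_fraction` of all taxa."""
    return [locus for locus, n in counts.items() if n / total >= min_fraction]


reduced_loci = loci_above_threshold(TAXA_PER_LOCUS, TOTAL_TAXA)
print(reduced_loci)
```

Under these assumptions the filter keeps eight loci (CYTB, ND2, ND3, FIB-I5, FIB-I7, MB-I2, ODC-I6/I7, and TGFB2-I5), the same three mitochondrial genes and five nuclear loci marked with * in Table 2.1.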

RESULTS

Together, the GenBank and 58 newly generated sequences amounted to 790 sequences representing 14 loci: 7 mitochondrial genes and 7 nuclear gene segments (Table 2.1 and Table A.1). In all, 125 species were compared, including 121 species of bulbuls representing 26 of 27 genera in the family (Acritillas was missing from my study). The sequences used in the analysis are listed in Table A.1, and the sequences used per taxon are shown in Figure 2.1.B. Taxon coverage for the different loci ranged from 13% (for Glyceraldehyde-3-phosphodehydrogenase intron 11) to 97% (for ND2). The total matrix contained 55.2% missing sequence data.

Table 2.1. Genetic loci included in the supermatrix analyses. The column headed “Taxa” lists the total number of individuals used in the study for each locus. Loci marked with * were used in the reduced supermatrix.

Loci                                             Abbreviation  Taxa  Consensus length (bp)
Mitochondrially-encoded 12S RNA                  12S           34    860
Mitochondrially-encoded 16S RNA                  16S           35    523
ATP synthase subunit 6                           ATP6          17    684
Cytochrome c oxidase subunit I                   COI           37    652
Cytochrome b*                                    CYTB          58    1143
NADH dehydrogenase subunit 2*                    ND2           121   1041
NADH dehydrogenase subunit 3*                    ND3           106   351
Fibrinogen beta chain intron 5*                  FIB-I5        64    588
Fibrinogen beta chain intron 7*                  FIB-I7        79    1016
Glyceraldehyde-3-phosphodehydrogenase intron 11  GAPDH-I11     16    347
Myoglobin intron 2*                              MB-I2         80    730
Ornithine decarboxylase intron 6 and 7*          ODC-I6/I7     70    801
Recombination activating protein 1               RAG1          17    1467
Transforming growth factor beta 2 intron 5*      TGFβ2-I5      50    596
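The one-third coverage rule used to build the reduced supermatrix can be checked against the counts in Table 2.1. The short Python sketch below is illustrative only: the function name is hypothetical, and it assumes that coverage is the "Taxa" count divided by the 125 species compared, which reproduces the percentages quoted in the text.

```python
# Taxon counts per locus, copied from Table 2.1.
TAXA_PER_LOCUS = {
    "12S": 34, "16S": 35, "ATP6": 17, "COI": 37, "CYTB": 58, "ND2": 121,
    "ND3": 106, "FIB-I5": 64, "FIB-I7": 79, "GAPDH-I11": 16, "MB-I2": 80,
    "ODC-I6/I7": 70, "RAG1": 17, "TGFB2-I5": 50,
}
N_TAXA = 125  # assumed denominator: total species compared


def loci_above_coverage(counts, n_taxa, min_fraction):
    """Return loci sampled in at least min_fraction of the taxa."""
    return [locus for locus, n in counts.items() if n / n_taxa >= min_fraction]


retained = loci_above_coverage(TAXA_PER_LOCUS, N_TAXA, 1 / 3)
coverage_pct = {locus: round(100 * n / N_TAXA) for locus, n in TAXA_PER_LOCUS.items()}

print(retained)
print(coverage_pct["GAPDH-I11"], coverage_pct["ND2"])
```

Under these assumptions the retained loci are the eight marked with * in Table 2.1 (3 mitochondrial genes and 5 nuclear loci).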

Gene trees constructed for each locus did not show any unexpected relationships that might have indicated paralogy, misidentification, or other sequence problems. The concatenated, 14-locus supermatrix consisted of 10,802 nucleotide sites. Trees generated using ML and Bayesian approaches were highly consistent, but most branches in the trees were not well supported (Fig. 2.1.A). Although most taxa appear to evolve at similar rates, a few taxa lay on unusually long branches.

In the reduced supermatrix, I included only those loci occurring in more than a third of the taxa: 3 mitochondrial genes (ND2, ND3, and cytb) and 5 nuclear loci (Fib5, Fib7, MB, ODC, and TGF). I also removed four rogue species identified by RogueNaRok: Prigogine’s Greenbul Chlorocichla prigoginei, C. laetissima, White-bearded Greenbul Criniger ndussumensis, and Yellow-eared Bulbul Pycnonotus penicillatus. Compared to the full matrix, the reduced matrix had a coverage per gene ranging from 40% of the taxa (for TGFβ2-I5) to 97% of the taxa (for ND2). The reduced matrix contained 37% missing sequences. Trees generated from the reduced matrix were very similar to ones generated from the complete supermatrix, but had much stronger nodal support (Fig. 2.2). All phylogenetic inferences below are based on the reduced matrix tree.

Phylogenetic analysis indicates that the Pycnonotidae is monophyletic and that overall tree structure is similar to that found by Moyle & Marks (2006), especially with respect to the existence of African and Asian clades (Fig. 2.2). Relationships among the African species are similar to those found by Johansson et al. (2007). For example, Calyptocichla serinus, Stelgidillas gracilirostris, and Andropadus importunus form a clade that is sister to the rest of the African clade. Similarly, in agreement with Zuccon & Ericson (2010), I also found Neolestes torquatus to be a member of the African bulbul clade. Among the other African bulbuls, several groupings are clear. Most African genera are monophyletic. The only exceptions are Chlorocichla and Arizelocichla; C. simplex is sister to Baeopogon, not the other Chlorocichla species, and A. montana is sister to a clade containing Chlorocichla and Baeopogon.
Within the largely Asian clade, two major subdivisions are well supported, one including most of Pycnonotus and Spizixos and the other including all remaining Asian genera. As in most previous studies (Moyle and Marks 2006; Oliveros and Moyle 2010), I found Pycnonotus to be polyphyletic. Most Pycnonotus taxa form a clade with Spizixos (finchbills) embedded.

Several species that have never been compared using molecular methods were included in my study, and I summarize their relationships here. C. falkensteini is sister to C. flaviventris, but their congener C. simplex is sister to Baeopogon, albeit with low support. In Bleda, B. notatus is sister to B. canicapillus, which in turn is sister to B. eximius and then B. syndactylus. Criniger chloronotus is sister to C. barbatus, and C. calurus is sister to C. chloronotus. Two species of Phyllastrephus that have not been included in previous molecular studies, P. fulviventris and P. baumanni, are sister species and together form the sister group of P. hypochloris.

Among the Asian species, I found that Alophoixus finschii is not closely related to other Alophoixus species, but instead is sister to Iole. Iole viridescens is sister to I. propinqua. Ixos virescens is sister to a clade containing I. malaccensis and I. mcclellandii. Thapsinillas affinis is embedded within Hypsipetes and is closest to H. guimarasensis. Cerasophila thompsoni is also embedded within Hypsipetes and is sister to H. leucocephalus. Among other Pycnonotus species, P. striatus is sister to the clade containing Iole, Hemixos, Ixos, and Hypsipetes. P. leucogrammicus is sister to Hemixos, albeit with low support. Pycnonotus priocephalus is sister to P. atriceps. P. squamatus is sister to P. cyaniventris. Spizixos canifrons is sister to S. semitorques. Among the subspecies of Pycnonotus barbatus, P. b. tricolor is sister to P. capensis, and this clade is sister to P. b. barbatus.


Figure 2.1. A) Bayesian tree generated from the complete supermatrix with posterior probability and bootstrap support indicated by numbers above and below branches, respectively. Several monophyletic groups are collapsed into triangles for easier visualization. Grey dotted branches represent “rogue taxa” identified by RogueNaRok and subsequently pruned from the complete supermatrix to make the reduced supermatrix. B) Data completeness matrix showing presence/absence of loci per taxon used in my analyses. Filled boxes indicate sequences used. Gray boxes and labels indicate loci or taxa pruned from my complete supermatrix to make the reduced supermatrix. See Table A.1 for GenBank accession numbers of each locus.


Figure 2.2. Phylogenetic tree generated from the reduced supermatrix. Posterior probability and bootstrap support are indicated by numbers above and below branches, respectively. * represents a posterior probability of 1.0 or bootstrap support of 100%. Continental regions where species occur are shown to the right of the taxon name. Illustrations of exemplar taxa are shown to the right (illustrated by Subir B. Shakya).


DISCUSSION

Supermatrix issues

I used newly generated and GenBank sequences to build a supermatrix to infer the phylogeny of Pycnonotidae. Initially, my approach was to compile a matrix of all loci from previous studies and to supplement these data with new sequences, either from fresh tissues that had recently become available (e.g., Pycnonotus squamatus, Alophoixus finschii, and Spizixos semitorques) or from toe-pads of especially difficult-to-obtain taxa (e.g., P. leucogrammicus and Thapsinillas affinis). However, when all old and new data were compiled into a matrix, the resulting tree was poorly resolved (Fig. 2.1.A). Its lack of resolution could have been due to missing loci, short sequences, pseudogenes, paralogous loci, or a variety of other issues. Removal of loci that were represented in less than a third of the sample taxa greatly improved resolution of the tree (Fig. 2.2). Two disadvantages of building supermatrices using sequences from public databases, such as GenBank, are inconsistency in sampled loci and low sequence quality (e.g., sequences that consist only of subsections of commonly used loci, include pseudogenes, or have incorrectly identified bases). Also, the information accompanying sequences from public databases is not necessarily correct. No standardized set of loci is used in all phylogenetic studies. Thus, centralized repositories assemble a variety of loci of variable quality. Some of the available sequences, such as ND2, are widely represented in phylogenetic studies, whereas others are under-represented. If commonly used sequences are available for all or a majority of the taxa, then the supermatrix approach usually produces a well-resolved tree (Wiens 2003; Wiens and Morrill 2011; Jiang et al. 2014). Thorough coverage of one or two genes, however, was not a problem in my study; the problem was under-representation, as exemplified by Chlorocichla prigoginei and C. laetissima. The only gene available for these species was cytochrome oxidase I (COI).
Although well known as the barcode gene (Austerlitz et al. 2009), COI is not commonly used in phylogenetic studies of birds. Moreover, the COI sequences from these two species were short (<300 bp). Thus, when these species were included in the supermatrix, they appeared in several different African clades during tree construction, causing low support at several nodes. RogueNaRok identified these two species as the top ‘rogue taxa,’ and their removal from the supermatrix greatly improved nodal support values.

The second issue, low-quality sequences, is usually the result of sequences generated from degraded DNA in toe-pads or mistakenly amplified sequences of pseudogenes. Low sequence quality can affect nodal support as adversely as inconsistent sequence sampling. For example, the Pycnonotus penicillatus sequence I used was generated from a toe-pad of a 60-year-old specimen. The sequence was short (300 bp) and contained several unidentified nucleotides. Its removal greatly improved support of several nodes in the Asian clade. Other clades with low nodal support also contained taxa represented by short sequences (<200 bp). These included the Indian Ocean Hypsipetes group (H. parvirostris, H. moheliensis, and H. crassirostris) and the Pycnonotus melanicterus-complex (P. m. melanicterus, P. m. flaviventris, P. m. gularis, P. m. montis, and P. m. dispar).

A final consideration in using supermatrices constructed from public databases is the influence of locus sampling on branch-lengths. Missing data are known to affect branch-lengths (Darriba, Weiß, and Stamatakis 2016). Similarly, sequences of a locus compiled for a single species may, in fact, be derived from different individuals representing geographically disparate populations. Branch-lengths of trees generated from such heterogeneous datasets do not accurately reflect the evolutionary history of a species, but rather represent a sort of average for the composite “species”. Although resulting trees may reflect the appropriate branching position of a composite species, the branch length of that “species” must not be considered a reliable parameter in analyses that require accurate tree dimensions.

Biogeography

As in previous studies, I recovered two distinct groups of bulbuls: one restricted to Africa and the other consisting mainly of Asian taxa. Two Asian genera, Hypsipetes and Pycnonotus, have expanded into Africa quite recently (Fig. 2.2). Hypsipetes appears to have reached the islands of east Africa, including Madagascar, by island-hopping across the Indian Ocean (Warren et al. 2005). Its route is evidenced by several Indian Ocean populations of black bulbuls (Fig. 2.3). If I interpret my tree literally, this cross-ocean dispersal happened either in two waves or in a single event followed by reinvasion of mainland Asia. The other dispersal event, by Pycnonotus bulbuls, seems to have occurred via the Arabian Peninsula. Hypsipetes appears to be a highly vagile group, with a predisposition for colonizing remote islands. With the exception of two mainland forms, H. leucocephalus (including H. l. ganeesa) and Cerasophila thompsoni (which is embedded in Hypsipetes), all other members of the genus are endemic to islands (Fig. 2.3). They have reached (and are the only bulbul species on) the islands of east Africa, including Comoros, Madagascar, Seychelles, Reunion, Aldabra and Moheli. They have also colonized islands in the Philippines and Japan. Most interestingly, my comparisons demonstrated that an enigmatic island genus, Thapsinillas, endemic to the Moluccas, is in fact embedded in Hypsipetes.

Taxonomy

By including African taxa not previously compared by Johansson et al. (2007), I discovered potential relationships that help improve bulbul classification. However, the availability of only short sequences (100-200 bp) for some key species, such as Chlorocichla prigoginei and C. laetissima, might have caused their inaccurate placement in my tree. Without better sampling, and longer and more reliable sequences of taxa in the African clade, it is premature to suggest taxonomic changes in this group. Such improvements will have to wait for thorough investigations of individual groups, such as that by Huntley and Voelker (2016), who compared sequences of many individuals in all Bleda species and suggested raising B. notatus ugandae to species status.

Most of my taxonomically useful discoveries pertain to Asian taxa. I compared 35 species of Pycnonotus plus 5 subspecies of Pycnonotus melanicterus that recently have been raised to full species (Rasmussen and Anderton 2005). Pycnonotus and Spizixos form a large monophyletic group. However, three species of “Pycnonotus” are not members of this clade. The Striped Bulbul P. striatus of the Himalayas is sister to a large clade comprising Alophoixus finschii, Iole, Hemixos, Ixos, Thapsinillas, Hypsipetes and Cerasophila bulbuls. I recommend resurrecting the genus Alcurus Hodgson, 1844, for this species and calling it Alcurus striatus. The Cream-striped Bulbul P. leucogrammicus is sister to Hemixos and could warrant its own genus on morphological grounds (Fishpool and Tobias 2005). However, to emphasize its phylogenetic position, I suggest applying the generic name Hemixos to it. For the Yellow-eared Bulbul Pycnonotus penicillatus, I was not able to obtain good quality sequences, and it was excluded from the final matrix. However, based on its short DNA fragment, this species does not appear to belong in Pycnonotus.

At the highest level, Pycnonotus may be divided into three groups (indicated by letters in Fig. 2.1). The members of Clade A were previously revised by Dickinson and Christidis (2014), who placed P. atriceps, P. priocephalus, and P. fuscoflavens (not sampled in my study) in Brachypodius; P. urostictus in Poliolophus; P. eutilotus in Euptilotus; and P. melanoleucos in Microtarsus. For Clade B, Fishpool & Tobias (2005) have suggested placing P. melanicterus in Rubigula, and splitting P. melanicterus into five species, P. melanicterus, P. montis, P. flaviventris, P. gularis, and P. dispar (also see Rasmussen and Anderton 2005; Collar and Pilgrim 2007). Fishpool and Tobias (2005) also suggest placing P. squamatus and P. cyaniventris in Ixidea. My study indicates that P. erythropthalmos should also go in Ixidea. Only the species in Clade C should remain in Pycnonotus. Within this clade, I found that P. barbatus tricolor is sister to P. capensis, not P. b. barbatus. P. b. tricolor, along with P. b. somaliensis and P. b. dodsoni, which were not included in my study, is sometimes raised to full species (Fishpool and Tobias 2005; Turner and Pearson 2015). Such an arrangement, at least for P. b. tricolor, seems warranted based on my study.

Provided these generic name changes are adopted for Pycnonotus, the name Spizixos may be retained for the two species of finchbill. The remarkable, long-term misplacement of Spizixos is because of the oddly-shaped bills of these species, which are adapted to seed-eating. Such plastic characters are not expected to be useful in phylogenetic reconstruction (McCracken and Sheldon 1998).

Finsch’s Bulbul Alophoixus finschii and several similar Asian bulbuls (e.g., A. bres, A. phaeocephalus, A. frater, A. ochraceus and A.
pallidus) were originally placed in the genus Criniger. However, Pasquet et al. (2001) showed that African Criniger species formed a group distinct from Asian taxa. The Asian forms, including A. finschii, were then reclassified as Alophoixus. A. finschii was moved to this genus even though it does not resemble the other Alophoixus bulbuls. My study suggests that A. finschii is most closely related to the similar-looking Iole bulbuls and should be called Iole finschii. The only bulbul found east of Wallace’s line is the Golden Bulbul Thapsinillas affinis. This bird has a messy taxonomic history, having at various times been synonymized with Criniger, Alophoixus and Ixos (Fishpool and Tobias 2005; Collar, Eaton, and Hutchinson 2013). My study found it to be a member of Hypsipetes. Hypsipetes species have colonized many islands in the Indian Ocean (Fig. 2.3) and, hence, discovering that the island-dwelling Golden Bulbul of Wallacea is a part of the Hypsipetes clade is not surprising. Another species, White-headed Bulbul Cerasophila thompsoni from Indochina, is also a member of Hypsipetes. Unfortunately, I was not able to include Yellow-browed Bulbul Acritillas indica of South India and Sri Lanka in my study. Acritillas is a monotypic genus previously considered allied with Criniger and Iole, but its unique morphology and nest structure led Dickinson and Gregory (2002) to move it to its own genus. The phylogenetic position of this species remains unknown.


Figure 2.3. Ranges of Hypsipetes species (including Cerasophila and Thapsinillas) showing their tendency to inhabit islands. Ranges were estimated from accounts in Fishpool & Tobias (2005), Dickinson & Christidis (2014), and Gill & Donsker (2015).


CHAPTER 3. GENETIC INVESTIGATION OF INSULAR MELANISM IN BLACK-HEADED BULBUL (BRACHYPODIUS ATRICEPS) OF SOUTHEAST ASIA.

INTRODUCTION

It is not uncommon for peripheral island populations to differ phenotypically from a widely distributed, generally invariable mainland population of a species. Various hypotheses have been presented to explain the origin and persistence of such distinct forms on peripheral islands (Mayr 1954; Carson and Templeton 1984; Grant 2001). Changes in phenotype may result from random genetic events or dramatic selective differences on individual islands (Mayr 1954; Losos and Ricklefs 2009). Among random factors, the founder effect and drift may facilitate the emergence or fixation of novel or rare phenotypes in island populations (Mayr 1954; Grant 2001; Losos and Ricklefs 2009). When selection is at play, island taxa often diverge from the mainland population and converge on a few typical phenotypes, commonly called island syndromes, such as those related to body size (Foster 1964; Van Valen 1973; Lomolino 1985), flightlessness (Slikas, Olson, and Fleischer 2002; Kirchman 2012; Wright, Steadman, and Witt 2016), and coloration (Theron et al. 2001; Uy and Vargas-Castro 2015). Fully understanding island syndromes requires investigating their genetic and physiological underpinnings. Although little is known about the underlying processes and pathways of traits such as body size and shape, a fair amount of work has been focused on coloration, including coloration in island taxa (Theron et al. 2001; Uy et al. 2016). In vertebrate systems, melanins and carotenoids are two of the three main sources of color, along with structural refraction of light (Hill and Mcgraw 2006). Melanins are the predominant pigment compounds and are derived through enzymatic conversion of the amino acid tyrosine in the melanogenesis pathway (Hill and Mcgraw 2006).
Melanin genetics have been well studied, especially in mammals and birds, and several genes have been associated with melanin-based polymorphism, including MC1R (Melanocortin 1 receptor), TYR (Tyrosinase), ASIP (Agouti-signaling protein), and SLC25A2 (Solute carrier family 25 member A2) (Barsh 1996; Theron et al. 2001; Nachman, Hoekstra, and D’Agostino 2003; Mundy 2005; Hoekstra 2006; Uy et al. 2009; Lamichhaney, Fan, et al. 2015; Schneider et al. 2015). Carotenoid coloration is also influenced genetically, e.g., by genes that play a role in the metabolic pathways that lead to carotenoid deposition (Hill and Mcgraw 2006). Most carotenoids are diet-based: they are absorbed from food, transported by blood in association with low density lipoproteins (LDLs) or high density lipoproteins (HDLs), and then deposited in integumentary tissues, including feathers (Hill and Mcgraw 2006). Mutations in genes associated with any of these processes are likely candidates for causing carotenoid-based polymorphism. Recently, knowledge of the diversity and metabolic pathways of carotenoids in avian species has advanced substantially (LaFountain et al. 2010; Ligon et al. 2016; Badyaev et al. 2019). A few recent studies, facilitated by next-generation sequencing and the characterization of entire genomes, have also progressed in the search for particular genes that influence carotenoid metabolism and deposition (Berry et al. 2009; Walsh et al. 2012; Lopes et al. 2016; Mundy et al. 2016; Toomey et al. 2017; 2018; Kim et al. 2019). Nevertheless, genes responsible for most of the diverse carotenoid pigmentation pathways are still unknown.


To date, most studies of carotenoid genetics in avian taxa have focused on model or captive-bred species, such as zebra finch Taeniopygia guttata (Lopes et al. 2016; Mundy et al. 2016), Gouldian finch Erythrura gouldiae (Toomey et al. 2018; Kim et al. 2019) and island canary Serinus canaria (Toomey et al. 2017), whose phenotypes can be manipulated experimentally. To investigate the genetics responsible for carotenoid phenotypes in wild organisms, however, requires the comparison of closely related populations (i.e., those with very similar genotypes) that vary in phenotype. A comparative approach has been applied mainly to taxa in which melanin pigments are known to be the causative agents, such as Hooded/Carrion Crow Corvus cornix/corone, Ruff Calidris pugnax, Collared Flycatcher Ficedula albicollis, and White-throated Sparrow Zonotrichia albicollis (Ellegren et al. 2012; Poelstra et al. 2014; Lamichhaney et al. 2015; Tuttle et al. 2016; Campagna et al. 2017). However, comparative analyses have also been used in a few studies of taxa whose coloration is influenced by carotenoids, such as T. guttata, New World warblers (Parulidae) and finches (Fringillidae) (Ligon et al. 2016; Toews et al. 2016; Irwin et al. 2018). A similar comparative approach to coloration can be applied in an insular setting in which island populations are distinct from their closely related mainland population, and it thus has great potential to help disentangle the genetics of this key evolutionary phenomenon. Such an insular example, which involves both melanin and carotenoid coloration, is the Black-headed Bulbul Brachypodius atriceps. This species inhabits the mainland and virtually all continental islands of Southeast Asia, including those in the important biogeographic region of Sundaland, encompassing the Malay Peninsula, Java, Borneo, Palawan and smaller islands of the Sunda continental shelf. There are two distinct morphs of B. atriceps, yellow and gray.
Yellow and gray birds are identical except that gray individuals lack yellow carotenoid pigmentation. The yellow morph predominates on the mainland of Southeast Asia and the Greater Sunda Islands. Gray birds occur only rarely in yellow populations across most of the B. atriceps range, but on two small islands off the coast of Borneo they are common. On Bawean Island, a continental island between Java and Borneo, both yellow and gray forms occur in about equal numbers (Oberholser 1917; Hoogerwerf 1966; Burner et al. 2018). On Maratua Island, which lies off the east coast of Borneo and is one of the few oceanic islands on which the species occurs, all individuals are gray; no yellow birds occur (Bangs and Peters 1927; Riley 1930; Chua et al. 2015; Burner et al. 2018). There are several positive aspects to studying the genetics of color polymorphism in B. atriceps. The species is common throughout its range and easily located in forests due to its distinct song, (usually) bright yellow coloration and blue eyes. Hence, obtaining samples of this species is straightforward. Also, both yellow and gray birds co-occur on Bawean Island (Hoogerwerf 1966), which eliminates diet as a potential source of color variation. Because the heterogeneous dispersion of yellow and gray birds in Sundaland is controlled entirely by genetics, the differential proportion of birds on these small islands can be used to design robust comparisons of genetic variation; the presence of gray individuals on at least two islands sets up a natural replicate of the phenotype. To explore the genetics of coloration in B. atriceps, I sampled populations across Sundaland and mainland Southeast Asia and sequenced whole genomes to identify genomic regions fixed for one color morph compared to the other, signaling possible locations of genes influencing coloration. I then investigated each of these genomic areas more thoroughly to identify individual genes that might play a role in determining color.
I focused mainly on the populations of Maratua and Bawean islands because the presence of gray morphs on these two geologically and geographically distinct islands allowed me to test for convergent or parallel causes of gray coloration. This sampling design also allowed me to test for selection of genes associated with phenotype.
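The idea of searching for sites fixed for different alleles in the two morphs can be illustrated with a small Python sketch. It is a simplified illustration, not the analysis pipeline used in this chapter: the function names, sample labels, site names, and genotype strings are all hypothetical, and only biallelic sites with VCF-style genotype strings are considered.

```python
from typing import Dict, List


def alt_allele_frequency(genotypes: List[str]) -> float:
    """Alternate-allele frequency among called genotypes such as '0/1'."""
    alleles = [
        allele
        for gt in genotypes
        for allele in gt.replace("|", "/").split("/")
        if allele in ("0", "1")
    ]
    if not alleles:
        raise ValueError("no called genotypes")
    return alleles.count("1") / len(alleles)


def fixed_differences(
    sites: Dict[str, Dict[str, str]], gray: List[str], yellow: List[str]
) -> List[str]:
    """Return sites at which the two groups are fixed for different alleles."""
    fixed = []
    for site, calls in sites.items():
        try:
            f_gray = alt_allele_frequency([calls[s] for s in gray])
            f_yellow = alt_allele_frequency([calls[s] for s in yellow])
        except ValueError:
            continue  # skip sites with no called genotypes in one group
        if abs(f_gray - f_yellow) == 1.0:
            fixed.append(site)
    return fixed


# Hypothetical genotypes for two gray and two yellow birds at three sites.
toy_sites = {
    "scaffold1:100": {"g1": "1/1", "g2": "1/1", "y1": "0/0", "y2": "0/0"},
    "scaffold1:250": {"g1": "0/1", "g2": "1/1", "y1": "0/0", "y2": "0/0"},
    "scaffold2:40": {"g1": "./.", "g2": "./.", "y1": "0/0", "y2": "0/1"},
}
print(fixed_differences(toy_sites, gray=["g1", "g2"], yellow=["y1", "y2"]))
```

In practice, low-coverage data call for more cautious criteria (for example, windowed summaries or genotype likelihoods) than the strict frequency difference of 1.0 used here.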

MATERIALS AND METHODS

De novo genome assembly and genome annotation

A high quality de novo Brachypodius atriceps reference genome was assembled at Dovetail Genomics (Santa Cruz, CA) using 100 mg of liver tissue from a yellow individual: Louisiana State University Museum of Natural Science (LSUMNS) specimen 176475, tissue B-50336 (Table 3.1). This specimen was collected at Ulu Rukuruku, Tawai Forest Reserve, Sabah, Malaysia (5.6155 N, 117.1943 E) on 16 August 2004. Its tissue was flash-frozen in liquid nitrogen in the field and stored at -170°C in a liquid nitrogen freezer at LSUMNS. A female individual was selected since females are the heterogametic sex in birds.

Table 3.1. Specimens used for de novo whole genome sequencing and resequencing.

Species                           Catalog No.                    Locality                                                                       Coordinates
Brachypodius atriceps atriceps*   LSUMNS B-50336 (LSUMZ 176475)  MALAYSIA: Sabah, Tawai Forest Reserve, Ulu Rukuruku                            5.6155 N, 117.1943 E
Brachypodius atriceps atriceps    LSUMNS B-36320                 MALAYSIA: Sabah, Crocker Range National Park, approximately 15 km NW Keningau  5.3997 N, 116.1022 E
Brachypodius atriceps atriceps    LSUMNS B-47145 (LSUMZ 179296)  MALAYSIA: Sabah, Klias Forest Reserve, 8 km W Beaufort                         5.3261 N, 115.6736 E
Brachypodius atriceps atriceps    LSUMNS B-57004 (LSUMZ 176475)  MALAYSIA: Sarawak, Samarakan, 25 km S Bintulu                                  2.9416 N, 113.0333 E
Brachypodius atriceps atriceps    LSUMNS B-57461                 MALAYSIA: Sabah, Meliau Range, Ulu Tungud Forest Reserve                       5.8455 N, 117.1825 E
Brachypodius atriceps atriceps    LSUMNS B-58506 (LSUMZ 186231)  MALAYSIA: Sarawak, Mt. Pueh                                                    1.8011 N, 109.7122 E
Brachypodius atriceps atriceps    LSUMNS B-88148 (LSUMZ 191390)  MALAYSIA: Sarawak, Mt. Penrissen                                               1.115 N, 110.217 E
Brachypodius atriceps atriceps    LSUMNS B-93591                 INDONESIA: East Kalimantan Province, Teluk Bayur, 5.8 km SW of Berau Airport   2.108 N, 117.402 E
Brachypodius atriceps atriceps    LSUMNS B-95121                 INDONESIA: West Sumatra Province, Mt. Talakmau                                 0.1 N, 99.937 E
Brachypodius atriceps atriceps    KU tissue 12634 (KU 98891)     PHILIPPINES: Palawan, Puerto Princesa, Irawan Watershed                        9.8372 N, 118.6412 E
Brachypodius atriceps hyperemnus  LSUMNS B-93485                 INDONESIA: West Sumatra Province, Mentawai Islands, Siberut Island             1.604 S, 99.159 E
Brachypodius atriceps hodiernus   LSUMNS B-93492                 INDONESIA: East Kalimantan Province, Maratua Island                            2.265 N, 118.563 E
Brachypodius atriceps hodiernus   LSUMNS B-93498                 INDONESIA: East Kalimantan Province, Maratua Island                            2.265 N, 118.563 E
Brachypodius atriceps hodiernus   LSUMNS B-93515                 INDONESIA: East Kalimantan Province, Maratua Island                            2.265 N, 118.563 E
Brachypodius atriceps hodiernus   LSUMNS B-93520                 INDONESIA: East Kalimantan Province, Maratua Island                            2.265 N, 118.563 E
Brachypodius atriceps hodiernus   LSUMNS B-93531                 INDONESIA: East Kalimantan Province, Maratua Island                            2.265 N, 118.563 E
Brachypodius atriceps baweanus    LSUMNS B-93614                 INDONESIA: East Java Province, Bawean Island                                   5.81 S, 112.614 E
Poliolophus urostictus            KU tissue 18105 (KU 112385)    PHILIPPINES: Mindanao, City of Zamboanga                                       6.9780 N, 122.0661 E
Poliolophus urostictus            KU tissue 33044 (KU 131913)    PHILIPPINES: Eastern Samar, Sitio Makasael, Kaantulan Creek                    11.2074 N, 125.367 E
Setornis criniger                 LSUMNS B-47079 (LSUMZ 176507)  MALAYSIA: Kinabalu Park, Serinsim                                              6.2933 N, 116.708 E

The method of de novo genome assembly was provided by Dovetail Genomics (https://dovetailgenomics.com/) and is also described in detail in Salter et al. (2019). Dovetail Genomics obtained a high molecular weight (HMW) DNA extract for sequencing. The extract was sheared and quality controlled for concentration, 260/280 and 260/230 ratios, and average fragment size using pulsed-field gel electrophoresis. Shotgun libraries were constructed for assembly and quality controlled using qPCR. Two Chicago® libraries were then constructed and quality controlled by sequencing 1-2 million 150 bp paired-end reads on an Illumina HiSeq X instrument and mapping the reads to the de novo assembly. Both the shotgun and Chicago® libraries were sequenced on an Illumina HiSeq X instrument. The shotgun library was assembled into a genome assembly using a modified version of the Meraculous assembler (http://jgi.doe.gov/data-and-tools/meraculous/). The assembly was scaffolded using Chicago® library data through the HiRise™ software pipeline. In the end, Dovetail Genomics provided a fasta file with the genome assembly and a BAM file with alignments of read pairs from the Chicago® libraries, along with reports of the genome assembly. Dovetail Genomics’ report also includes BUSCO statistics for the genome assembly compared to the eukaryote odb9 dataset.

Genome annotation was provided by the Bird 10,000 Genomes (B10K) Project (https://b10k.genomics.cn/). The annotation used the program Augustus and a reference set of 20,181 avian genes, consisting of 12,337 genes orthologous between chicken (Gallus gallus) and zebra finch, 4877 genes unique to zebra finch, and 3025 unique to chicken (Feng et al. 2020). A second reference gene-set of 20,169 human genes and 5267 genes derived from published avian transcriptomes supplemented the annotation process. The B10K Project provided a GFF3 file containing the annotations.

Genome resequencing

For comparison with the high quality de novo genome, I produced low coverage genomes (~8-10X) of 20 individuals, including: 6 gray B. atriceps from Maratua Island; 1 gray B. atriceps from Bawean Island; 10 yellow B. atriceps from various sites on Borneo, Sumatra, and the Mentawai islands; 2 Poliolophus urostictus; and 1 Setornis criniger (Table 3.1; Fig. 3.1). The latter two species represent outgroup clades (Shakya and Sheldon 2017); P. urostictus is the closest non-yellow outgroup member. To obtain these genomes, genomic DNA was extracted from ethanol or liquid-nitrogen preserved tissues using DNEasy® Blood and Tissue Kits (Qiagen) following the manufacturer’s protocol. The extracts were run on a 1.5% agarose gel to visualize the quality of DNA. The samples were quantified using a Qubit 2.0 fluorometer (Life Technologies, Inc.) and, where necessary, concentrated to a minimum of 30 ng/µl in a Vacufuge vacuum centrifuge (Eppendorf), heated to 45°C if needed. The concentrated samples were re-quantified using the Qubit 2.0. Samples greater than 30 ng/µl were diluted to the standard concentration of 30 ng/µl, and 1000 ng of the DNA was used to construct libraries. The samples were sonicated using an Episonic Multi-Function Bioprocessor in a water-bath set to 8°C. The sonicator was set at 180-190 W for 7 to 8 minutes, emitting 15-second pulses, each followed by a cooling period of 30 seconds. The sheared DNA was visualized in a 1.5% agarose gel to ensure the average fragment sizes were between 300 and 400 bp. Samples with larger average fragment sizes were re-sonicated to reduce the fragment size to 300-400 bp. I used a KAPA HyperPrep Kit (Roche Sequencing Systems) to prepare a barcoded library for Illumina sequencing from each sample, mostly following the manufacturer’s protocol. For barcoding, I used dual-indexed iTru adapters (Glenn et al. 2019), with a unique combination applied to each sample.
While adhering to the kit protocol for library preparation, I used only half reagent volumes for the end-repair and A-tailing, adapter ligation, and post-ligation cleanup steps. I used 1.5X Serapure Speedbeads for cleanup (Rohland and Reich 2012). Libraries were amplified with seven cycles of PCR and quantified using a Qubit 2.0 both before and after amplification. One µl of each library was processed in an Agilent Bioanalyzer to visualize successful amplification and fragment distribution of samples. All libraries were sequenced for 8-10X coverage. Four libraries were combined in equimolar ratios with 14 pools of ultraconserved element (UCE) libraries generated for a separate population genetics study of Bornean birds and sequenced on an Illumina HiSeq X 2x150 paired-end instrument at Oklahoma Medical Research Foundation (OMRF). Eight libraries were combined in equimolar ratios and sequenced on another lane of the Illumina HiSeq X 2x150 paired-end instrument at OMRF. Finally, eight libraries were sequenced on a NovaSeq6000 instrument by Macrogen (Rockville, MD).

Figure 3.1. Map of Sundaland showing sampling sites. Yellow circles represent yellow morph of Brachypodius atriceps, gray circles represent gray morph of B. atriceps, and black circles represent locations of outgroup taxa.

Low-coverage genome assembly and haplotype calling

Demultiplexed fastq files were adapter and quality trimmed using Trimmomatic implemented through the illumiprocessor code in the phyluce pipeline (Faircloth 2016). Each cleaned file was aligned to the reference genome using bwa-mem (bwa version 0.7.17-r1188) with default settings. The generated SAM files were converted to BAM files and sorted, mate-pair information was fixed, and duplicate reads were flagged using Samtools (Samtools version 1.8). Read group meta-data, to tag individual data files, were added using Picard (Picard version 2.18.2). For variant calling, I followed the GATK Best Practices recommendations (https://software.broadinstitute.org/gatk/best-practices/). Variant calling to identify SNPs and indels was done using the HaplotypeCaller function in the GATK package (GATK version 4.0.3.0). The per-sample variant call files were combined into one variant call file using the CombineGVCFs function in GATK. Genotypes were generated for the combined file using the GenotypeGVCFs function in GATK. Genotypes were filtered using the VariantFiltration function in GATK with the following filter "QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0" as per GATK Best Practices recommendations. The filter expression flags sites with quality by depth (QD) less than 2.0, Phred-scaled P-value using Fisher’s exact test to detect strand bias (FS) greater than 60.0, mapping quality (MQ) less than 40.0, MappingQualityRankSumTest (MQRankSum) value less than -12.5, or ReadPosRankSumTest (ReadPosRankSum) value less than -8.0. Filtered and non-variant sites were removed from the genotype VCF file using the SelectVariants function in GATK, and this pruned file was used in subsequent analyses.
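
The logic of the quoted filter expression can be illustrated with a minimal Python sketch. This is not the GATK implementation; the function name, the dictionary of annotations, and the choice to let a missing annotation pass its clause are all assumptions made for illustration. Only the five thresholds come from the text.

```python
# Thresholds copied from the filter expression quoted in the text.
# Everything else (names, dict input, handling of missing values)
# is a hypothetical simplification, not GATK's own code.
HARD_FILTER_CLAUSES = [
    ("QD", lambda v: v < 2.0),
    ("FS", lambda v: v > 60.0),
    ("MQ", lambda v: v < 40.0),
    ("MQRankSum", lambda v: v < -12.5),
    ("ReadPosRankSum", lambda v: v < -8.0),
]


def fails_hard_filter(annotations):
    """Return True if any clause of the expression flags the site.

    `annotations` maps annotation names to numeric values. A missing
    annotation is assumed here to pass its clause.
    """
    return any(
        name in annotations and is_bad(annotations[name])
        for name, is_bad in HARD_FILTER_CLAUSES
    )


# A site flagged only because of strand bias (FS > 60.0).
site = {"QD": 12.3, "FS": 75.0, "MQ": 59.8}
print(fails_hard_filter(site))
```

Because the clauses are joined with a logical OR, a single failing annotation is enough to flag a site.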

Investigating genomes for the color-morph gene

To locate sites correlated with color-morphs in the genomes, Weir and Cockerham Fst statistics were calculated using vcftools (vcftools version 0.1.13) with a sliding window of 10000 bp shifted in 2500 bp steps across whole genomes of B. atriceps for (1) all gray versus all yellow birds, (2) Maratua gray versus all yellow birds, and (3) Bawean gray versus all yellow birds. Small scaffolds (<20000 bp, which produced too few sliding-windows) were excluded in further analyses. Nucleotide diversity (π) was also calculated for each comparison using vcftools with a sliding window of 10000 bp shifted by 2500 bp increments. Genes in regions of high Fst values were queried to see if they had been implicated by earlier studies in carotenoid metabolism or lipid absorption/transport activities. For all the genes in regions of high Fst, I compiled a list of Gene Ontology (GO) terms to see if any GO category was overrepresented.
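
The window layout described above (10,000 bp windows shifted in 2500 bp steps, with scaffolds shorter than 20,000 bp excluded) can be sketched as follows. This is an illustrative reconstruction only: the function names are hypothetical, the coordinates are assumed to be 1-based and inclusive, and the final windows are assumed to be truncated at the scaffold end, which may differ from how vcftools reports its bins.

```python
def keep_scaffold(length, min_length=20_000):
    """Scaffolds shorter than min_length are excluded, as in the text."""
    return length >= min_length


def sliding_windows(scaffold_length, window=10_000, step=2_500):
    """Yield 1-based inclusive (start, end) windows along one scaffold.

    Windows near the scaffold end are truncated (an assumption).
    """
    start = 1
    while start <= scaffold_length:
        yield start, min(start + window - 1, scaffold_length)
        start += step


# Windows for a hypothetical scaffold at the 20,000 bp cutoff.
if keep_scaffold(20_000):
    for start, end in sliding_windows(20_000):
        print(start, end)
```

With these settings each site falls in up to four overlapping windows, which smooths the Fst and π profiles at the cost of non-independent window values.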

Testing for selection

All coding sequences in the genome annotation file that did not have nonsense mutations or reading-frame shifts were tested for selection. Sequences with mid-stops and shifts were omitted because they could be caused by misaligned sequences or erroneous base identification. In the end, 8598 genes were tested. Their coding sequences were extracted by applying the variants in the VCF file to the reference genome to generate sequences of individual birds. I aligned each coding sequence of the reference sequence (Bornean) with one individual from each island/population: Maratua, Bawean, Mentawai, Palawan, Sumatra, and Borneo. I also included sequences of Poliolophus urostictus and Setornis criniger as outgroup taxa. The sequences were aligned using ClustalO (Sievers et al. 2011), and indels and poorly aligned regions were trimmed using Gblocks with the no-indel setting (Castresana 2000). The alignments were used to calculate pairwise dN/dS ratios using CodeML (Yang 2007). Any alignment in which mid-stop codons were introduced into the sequence was omitted from the analyses. In the end, I generated dN/dS ratios between every population for 8019 proteins. For proteins with dN/dS ratios greater than 2 (implying selection), I compiled a list of GO (Gene Ontology) terms to see if any GO category was overrepresented.
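
The screening step described above, in which sequences with mid-stop codons or reading-frame shifts are dropped, can be illustrated with a short Python sketch. It is a simplification and not the procedure actually used: the function name is hypothetical, the standard genetic code is assumed, and a length that is not a multiple of three stands in as a crude proxy for a frameshift (the dissertation identified frameshifts relative to the reference gene set).

```python
# Stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_testable_cds(seq):
    """Return True if a coding sequence has no mid-stop and stays in frame.

    A terminal stop codon is allowed; any stop before the last codon
    counts as a mid-stop. Length not divisible by three is treated as
    a frameshift (a rough proxy, assumed for illustration).
    """
    seq = seq.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return not any(codon in STOP_CODONS for codon in codons[:-1])


print(is_testable_cds("ATGAAATAA"))  # in frame, stop only at the end
print(is_testable_cds("ATGTAAAAA"))  # premature stop at codon 2
```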

RESULTS

De novo genome assembly and genome annotation

The Dovetail HiRise assembly yielded a 906.73 Mb Brachypodius atriceps genome with an L50 of 14 scaffolds and N50 of 17.538 Mb (Table 3.2). The average physical coverage of the genome was estimated to be 428.16X. Altogether, the final assembly consisted of 3820 scaffolds, with the longest being 66.93 Mb. The final assembly contained 30,006 gaps (0.39% of total genome). Compared to the BUSCO eukaryote_odb9 dataset of 303 genes (Simão et al. 2015; Waterhouse et al. 2018), my genome yielded 232 single copy genes (77%), 2 duplicated genes (1%), 16 fragmented genes (5%) and 53 missing genes (17%). Genome annotation yielded 15,647 proteins. Of the annotated proteins, 543 gene sequences contained mid-stop codons, and 1125 had frameshifts compared to the reference gene set.

Table 3.2. Genome assembly statistics

Total Length | 906.73 Mb
Average estimated coverage | 428.16X
L50/N50 | 14 scaffolds/17.538 Mb
L90/N90 | 66 scaffolds/2.579 Mb
Total number of scaffolds | 3820
Length of longest scaffold | 66.93 Mb
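
For readers unfamiliar with the N50/L50 and N90/L90 statistics in Table 3.2, the following Python sketch shows the usual definition: scaffolds are sorted from longest to shortest, and Nx is the length of the scaffold at which the running total first reaches x% of the assembly, with Lx the number of scaffolds needed to get there. The function name and the toy lengths are hypothetical; this is not the code used in the Dovetail report.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of scaffold lengths.

    fraction=0.5 gives N50/L50; fraction=0.9 gives N90/L90.
    """
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    raise ValueError("no scaffold lengths given")


# Toy assembly of five scaffolds (arbitrary units).
toy = [10, 8, 6, 4, 2]
print(nx_lx(toy))        # N50 and L50
print(nx_lx(toy, 0.9))   # N90 and L90
```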

Investigating color-morph genes

Fst values from the three comparisons of B. atriceps genomes, (1) all gray individuals versus all yellow individuals, (2) Maratua grays versus all yellow individuals, and (3) the Bawean gray versus all yellow individuals, are shown in Fig. 3.2. Fst patterns of all grays versus all yellows and Maratua grays versus all yellows differ only slightly. Their similarity is probably because I had only one gray individual from the Bawean population, and any variation with respect to the Bawean Island bird was masked by the peaks for Maratua’s all-gray population (n = 6). The locations of high Fst peaks in the comparison of the Bawean gray individual versus all yellows do not match those in the comparison of Maratua grays versus all yellows; Fst peaks do not overlap between the two gray populations. In the comparison between Maratua grays and all yellows, there was a reduction of average nucleotide diversity among Maratua gray birds (n=6) compared to yellow birds from elsewhere in Borneo (n=7) (Fig. 3.3). Comparing Fst values, 16 scaffolds have windows with Fst >0.5, versus an average Fst value for the entire genome of 0.158 (Fig. 3.4). Two (scaffolds 2045 and 479) consist of only one 10,000 bp window and do not correspond to any protein coding genes. Within the remaining 14 scaffolds, 19 protein coding genes are present in regions encompassing one or several continuous high Fst windows (Table 3.3). These 19 proteins are part of 150 GO bins of recognized functions. No single GO bin, however, contains a significant number of loci with high Fst values.

Table 3.3. Protein coding regions with Fst greater than 0.5 between Maratua gray and all yellow individuals. Potential functions based on the role of each protein in humans (UNIPROT).

ENSEMBL Protein ID | Gene | Scaffold | Name | Potential Function
ENSGALP00000049513 | SLC12A7 | 170 | Solute carrier family 12 (potassium/chloride transporters), member 7 | Potassium uptake during cell inflammation in ears
ENSGALP00000009635 | RRAS2 | 499 | RAS Related 2 | GTPase associated with plasma membrane and may function as a signal transducer
ENSGALP00000018559 | MSH4 | 514 | MutS Homolog 4 | DNA mismatch repair protein
ENSTGUP00000008692 | COPB1 | 520 | Coatomer Protein Complex Subunit Beta 1 | Encodes a protein subunit of the coatomer complex associated with non-clathrin coated vesicles that functions as a lipid transporter in golgi complex
ENSTGUP00000008725 | PSMA1 | 520 | Proteasome Subunit Alpha 1 | Proteinase that helps cleave polypeptides
ENSGALP00000019958 | Unknown | 690 | |
ENSTGUP00000001950 | GOLGA4 | 690 | Golgin A4 | Vesicle-mediated transport of lipids and proteins
ENSTGUP00000016914 | LRRFIP2 | 690 | Leucine Rich Repeat (In FLII) Interacting Protein 2 | Toll-like receptor that facilitates activation of nuclear factor kappa B signaling
ENSTGUP00000002008 | NET1 | 1299 | Neuroepithelial Cell Transforming 1 | Interacts with RhoA within the cell nucleus and may play a role in repairing DNA damage after ionizing radiation
ENSTGUP00000009466 | PDHB | 1459 | Pyruvate Dehydrogenase E1 Beta Subunit | Catalyzes the overall conversion of pyruvate to acetyl-CoA and carbon dioxide
ENSTGUP00000009467 | KCTD6 | 1459 | Potassium Channel Tetramerization Domain Containing 6 | Sweet taste signaling
ENSTGUP00000009531 | ACOX2 | 1459 | Acyl-CoA Oxidase 2 | Involved in the degradation of long branched fatty acids and bile acid intermediates in peroxisomes.
ENSGALP00000011549 | FAM107A | 1459 | Family With Sequence Similarity 107 Member A | Stress-inducible actin-binding protein that plays a role in synaptic and cognitive functions by modulating actin filamentous (F-actin) dynamics. Mediates polymerization of globular actin to F-actin.
ENSTGUP00000015442 | FAM3D | 1459 | Family With Sequence Similarity 3 Member D |
ENSTGUP00000009549 | C3orf67 | 1459 | |
ENSTGUG00000000899 | PTAFR | 1651 | platelet activating factor receptor | Localizes to lipid rafts and/or caveolae in the cell membrane
ENSTGUG00000000898 | EYA3 | 1651 | EYA transcriptional coactivator and phosphatase 3 | Transcriptional activator that may have a role during development
ENSTGUP00000003162 | MRC2 | 3813 | Mannose Receptor C Type 2 | Plays a role in extracellular matrix remodeling by mediating the internalization and lysosomal degradation of collagen ligands.
ENSTGUP00000016992 | Unknown | 3813 | |

Table 3.4. Protein coding regions exhibiting high dN/dS values between A) Maratua gray and the reference genome and B) Bawean gray and the reference genome, and their potential functions based on the role of each protein in humans (UNIPROT).

ID | dn/ds | Gene_symbol | Gene_name | Gene_function
A) Maratua versus Ref
ENSTGUT00000018021 | 20.4099 | PTP4A2 | Protein Tyrosine Phosphatase 4A2 | Stimulates progression of G1 into S phase during mitosis
ENSTGUT00000013984 | 22.7502 | SLC19A2 | Solute Carrier Family 19 Member 2 | Thiamine transporter protein
ENSTGUT00000005559 | 23.5759 | | |
ENSTGUT00000018948 | 25.51 | | |
ENSTGUT00000011715 | 31.4253 | KCNAB1 | Potassium Voltage-gated Channel Subfamily A | Potassium based voltage-gated channel
ENSTGUT00000011683 | 31.5742 | DNAL1 | Dynein Axonemal Light Chain 1 | Part of axonemal ATPase complex that generates force for cilia motility
ENSTGUT00000000908 | 31.5923 | EFNA2 | Ephrin A2 | Cell surface GPI-bound ligand for Eph receptors, critical for neuronal development
ENSTGUT00000004152 | 33.9716 | CRIPT | CXXC Repeat Containing Interactor of PDZ3 Domain | Involved in cytoskeletal anchoring of DLG4 in excitatory synapses
ENSTGUT00000016825 | 35.1464 | CLUH | Clustered Mitochondria Homolog | mRNA-binding protein involved in proper cytoplasmic distribution of mitochondria
ENSTGUT00000004632 | 35.9802 | PRKRIP1 | PRKR Interacting Protein 1 | Binds double-stranded RNA
ENSTGUT00000005197 | 36.0758 | SYS1 | SYS1 Golgi Trafficking Protein | Involved in protein trafficking
ENSTGUT00000009664 | 36.5755 | UTS2B | Urotensin 2B | Vasoconstrictor
ENSTGUT00000016080 | 37.8621 | | |
ENSTGUT00000004905 | 38.4974 | CISH | Cytokine Inducible SH2 Containing Protein | Part of negative feedback system that regulates cytokine signal transduction
ENSTGUT00000003129 | 39.187 | | |
ENSTGUT00000013565 | 45.3547 | AAMDC | Adipogenesis Associated Mth938 Domain Containing | May play a role in preadipocyte differentiation and adipogenesis
ENSTGUT00000006707 | 50.3472 | NUP62 | Nucleoporin 62 | Essential part of nuclear pore complex
ENSTGUT00000005986 | 56.7107 | TPPP3 | Tubulin Polymerization Promoting Protein Family Member 3 | Binds tubulin and has microtubule bundling activity
ENSTGUT00000007621 | 61.4832 | | |
ENSTGUT00000008008 | 62.5131 | | |
ENSTGUT00000005248 | 63.3257 | MRPS17 | Mitochondrial Ribosomal Protein S17 | Facilitates protein synthesis in mitochondrion
ENSTGUT00000001463 | 69.4163 | | |
ENSTGUT00000013930 | 71.4057 | LRRC58 | Leucine Rich Repeat Containing 58 |
ENSTGUT00000009694 | 71.7395 | HIST1H46L2 | |
ENSTGUT00000017854 | 76.0145 | CRIP1 | Cysteine Rich Protein 1 | Potentially plays a role in zinc transport
ENSTGUT00000013660 | 80.6659 | AICDA | Activation Induced Cytidine Deaminase | Involved in somatic hypermutation, gene conversion, and class-switch recombination
ENSTGUT00000014805 | 83.2493 | ACKR2 | Atypical Chemokine Receptor 2 | Control chemokine levels and localization
ENSTGUT00000002639 | 92.2803 | TNFRSF9 | TNF Receptor Superfamily Member 9 | Receptor for TNFSF9/4-1BBL, possibly active during T-cell activation
ENSTGUT00000002721 | 93.9239 | ICMT | Isoprenylcysteine Carboxyl Methyltransferase | Catalyzes post-translational methylation of isoprenylated C-terminal cysteine residue
B) Bawean versus Ref
ENSTGUT00000015826 | 24.2905 | TOP2A | DNA Topoisomerase II Alpha | Controls topological states of DNA
ENSTGUT00000017971 | 31.2598 | XPO7 | Exportin 7 | Protein transporter
ENSTGUT00000004286 | 33.7357 | | |
ENSTGUT00000004632 | 36.759 | PRKRIP1 | PRKR Interacting Protein 1 | Binds double-stranded RNA
ENSTGUT00000003829 | 41.2813 | CMTM8 | CKLF like MARVEL Transmembrane Domain Containing 8 | Potentially involved in cytokine activity
ENSTGUT00000006707 | 46.2308 | NUP62 | Nucleoporin 62 | Essential part of nuclear pore complex
ENSTGUT00000009005 | 47.0312 | CDKN2C | Cyclin Dependent Kinase Inhibitor 2C | Protein kinase binding and cyclin-dependent kinase inhibitors
ENSTGUT00000011343 | 55.3568 | IFNGR1 | Interferon Gamma Receptor 1 | Associates with IFNGR2 to form receptor for cytokine interferon gamma
ENSTGUT00000012092 | 56.6367 | SSBP1 | Single Stranded DNA Binding Protein 1 | Involved in mitochondrial biogenesis
ENSTGUT00000017944 | 60.4278 | YBX1 | Y-Box Binding Protein 1 | Mediates pre-mRNA alternative splicing regulation
ENSTGUT00000000398 | 63.2472 | LARP6L | LARP6 like |
ENSTGUT00000016001 | 63.5831 | YTHDC1 | YTH Domain Containing 1 | Chromatin regulation
ENSTGUT00000013782 | 68.3178 | FKBP3 | FKBP Prolyl Isomerase 3 | Immunosuppressant
ENSTGUT00000013194 | 75.876 | MSANTD4 | Myb/SANT DNA Binding Domain Containing 4 With Coiled-Coils |
ENSTGUT00000016362 | 77.8422 | | |
ENSTGUT00000010200 | 82.7379 | DRG1 | Developmentally Regulated GTP Binding Protein 1 | Catalyzes conversion of GTP to GDP
ENSTGUT00000004585 | 91.0793 | GRAMD2A | GRAM Domain Containing 2A | Participates in organization of endoplasmic reticulum-plasma membrane contact sites
ENSTGUT00000011399 | 95.0902 | RPL22L1 | Ribosomal Protein L22 Like 1 |

Figure 3.2. Absolute mean Fst values of comparisons between A) all gray versus all yellow individuals, B) Maratua gray vs all yellow individuals, and C) Bawean gray vs all yellow individuals. All Fst values from a single scaffold are presented on a single X-axis bin.

Among the scaffolds with high Fst values, two scaffolds (520 and 499) have multiple sequential windows with Fst greater than 0.7 (Fig. 3.5.A). The Fst peaks are localized at the end of scaffold 499 and the beginning of scaffold 520 and, based on synteny with the zebra finch genome, are adjacent to one another on chromosome 5. These two scaffolds contain the protein coding genes RRAS2 (RAS related 2), COPB1 (COPI coat complex subunit beta 1), and PSMA1 (Proteasome 20S subunit alpha 1), which represent a GTPase, a Golgi-body vesicular transporter, and a proteinase, respectively (Fig. 3.5.A). The vesicular transport protein COPB1 is particularly interesting, as one of its functions (in humans) is lipid storage and transport (UNIPROT). Carotenoids are bound and transported by lipids and, hence, COPB1 is a candidate locus for participation in the carotenoid-based color polymorphism in B. atriceps. The window with the highest Fst value in scaffold 520 also lies in the middle of the COPB1 gene. Alignment of the coding sequence of this gene indicated that yellow birds on Borneo and gray birds on Maratua differ by only one synonymous nucleotide substitution. The majority of fixed SNPs are located in the introns and non-coding regions upstream and downstream from the coding sequences. The gene is conserved even with respect to zebra finch; 1072 of the 1085 amino acids are identical, three amino acids differ, and two indels are present (one comprising 9 amino acids/27 bp, the other 1 amino acid/3 bp). A second protein coding gene occurring in another high Fst area, GOLGA4 (Golgin A4), is also a vesicle-mediated lipid transporter and, as such, a color-polymorphism candidate. The coding region of this gene differs between the Maratua gray and Bornean yellow birds at only 5 sites.

In comparisons between the Bawean gray and all yellows, only a few Fst peaks were >0.5. The average Fst value for all scaffolds was 0.228 (Fig. 3.4). A series of adjacent windows with high Fst values occurs at the edge of scaffold 2045 (Fig. 3.5.B). These windows are located in a non-coding region upstream of the gene TRPC5 (Transient receptor potential cation channel subfamily C member 5). This gene encodes a receptor-activated non-selective calcium permeant cation channel (UNIPROT). I could not find any functional connection between this gene and carotenoid transport or metabolism.

Figure 3.3. Distribution of Fst values of Bawean gray vs all yellow individuals (Red) and Maratua gray vs all yellow individuals (Blue). Mean values of each distribution are represented by dotted lines.

Testing for selection

In comparisons of genes among B. atriceps populations, the general trend appears to be that in most populations most genes do not have any changes in protein coding sequences, a few genes tend to show signs of positive selection (dN/dS > 1), and the remainder are distributed bimodally, with dN/dS values either approaching 0 or centered at 1 (Fig. 3.6). Values approaching 0 represent genes in which synonymous outweigh non-synonymous substitutions (purifying selection), whereas those centered at ~1 represent equal frequencies of non-synonymous and synonymous substitutions (neutrality). Between the reference genome (Bornean) and Maratua, Bawean, and a second Bornean population, most of the dN/dS values between 0 and 1 are skewed toward 1, suggesting most of the variation is attributable to neutral processes. This was usually not the case in comparisons of the reference genome to Sumatran populations or between B. atriceps and Poliolophus urostictus; in these cases, most of the dN/dS values between 0 and 1 are skewed toward 0, signaling more purifying selection.

Figure 3.4. Distribution of π (nucleotide diversity) values among Maratua gray (Red) and yellow individuals (Blue).

Figure 3.5. Absolute mean Fst (red) and π (blue) for A) Maratua gray vs all yellow individuals in scaffolds 499 and 520, and B) Bawean gray vs all yellow individuals in scaffold 2045. Genes in regions of high Fst are shown with gray boxes.

Figure 3.6. Distribution of dN/dS ratios from various comparisons between the Brachypodius atriceps reference genome (of a Bornean individual) vs. the same species from other islands, between B. atriceps and Poliolophus urostictus, and between B. atriceps and Setornis criniger.

Among genes compared between Maratua grays and the yellow reference genome, 765 have one or more synonymous and zero non-synonymous substitutions, 99 have dN/dS ratios between 1 and 2, and 29 have dN/dS ratios >2. The occurrence of only synonymous substitutions might result from missing data or misalignment. Genes with dN/dS between 1 and 2 have evolved close to neutral rates. Among the remaining 29 genes (Table 3.4.A), seven are essential cell-membrane genes, five are essential parts of the nuclear membrane, and the remaining genes perform other functions throughout the cell. None of the high dN/dS genes overlaps with the high Fst genes identified above. Of the genes compared between the Bawean gray and the reference genome, 802 have one or more synonymous and zero non-synonymous substitutions, 96 have dN/dS ratios between 1 and 2, and 18 have dN/dS ratios >2. Among the 18 genes with dN/dS >2 (Table 3.4.B), six are essential parts of the nuclear membrane, four are cytoplasm-limited genes, and another four are essential elements of the cell membrane. Two genes, PRKRIP1 (PRKR interacting protein 1) and NUP62 (Nucleoporin 62), are positively selected in both the Bawean and Maratua populations. PRKRIP1 is a protein-kinase binding protein and NUP62 is a part of the nuclear pore complex. Neither gene has been associated with carotenoid or lipid transport or metabolism.
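
The tallies above sort each gene into a bin according to its pairwise dN and dS estimates. A minimal Python sketch of such a binning is given below. The function name, the bin labels, the handling of dS = 0, and the treatment of a ratio of exactly 1 or 2 are assumptions for illustration; the text does not specify how boundary values were assigned.

```python
from collections import Counter


def dnds_bin(dn, ds):
    """Assign one gene comparison to a reporting bin.

    dn and ds are hypothetical per-gene estimates of non-synonymous
    and synonymous divergence. Boundary handling is an assumption.
    """
    if ds == 0:
        return "undefined (dS = 0)"
    if dn == 0:
        return "synonymous only"
    ratio = dn / ds
    if ratio > 2:
        return "dN/dS > 2"
    if ratio > 1:
        return "1 < dN/dS <= 2"
    return "dN/dS <= 1"


# Made-up (dN, dS) pairs, one per gene.
estimates = [(0.0, 0.01), (0.002, 0.01), (0.015, 0.01), (0.05, 0.01)]
print(Counter(dnds_bin(dn, ds) for dn, ds in estimates))
```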

DISCUSSION

Ornithologists are slowly beginning to understand the genetic underpinnings of carotenoid-based coloration in birds thanks to advances in whole-genome sequencing. A few genes such as CYP2J19B (Lopes et al. 2016; Mundy et al. 2016), SCARB1 (Toomey et al. 2017), and FST (Toews et al. 2016; Toomey et al. 2018; Kim et al. 2019) have been shown to play roles in carotenoid plumage variation in birds. These genes fall into two basic categories: those, such as SCARB1, that facilitate lipid uptake and transport, and those, such as CYP2J19B and BCO2, that facilitate modification of dietary carotenoids. Clearly, however, other genes that serve various pathways in carotenoid uptake, transport, and metabolism remain unidentified or unknown.

My genomic comparisons of Brachypodius atriceps individuals pointed to high Fst in regions that contained protein coding genes and others that did not. The regions without protein coding genes may contain sites for gene regulation. Regions with protein coding genes include a few genes that play a role in lipid transport and processing. Mutations in genes that facilitate carotenoid uptake and transport, such as the scavenger receptor gene SCARB1, influence color polymorphism not only in birds (Toomey et al. 2017) but also in silkworms (Sakudoh et al. 2010; 2012), salmon (Sundvold et al. 2011), and scallops (Liu et al. 2015). Several of the candidate loci potentially affecting carotenoid-based polymorphism in golden pheasant Chrysolophus pictus and Lady Amherst’s pheasant Chrysolophus amherstiae also belong to the category of lipid transport and processing (Gao et al. 2018). I also found variation in the genes COPB1 and GOLGA4, both of which play roles in lipid transport in the Golgi body and endoplasmic reticulum. Although the exact function and role of these proteins in carotenoid transport in birds are unknown, based on their roles in humans and other organisms we can assume they affect carotenoid deposition and transport in birds as well. 
Coincidentally, COPB1 is one of the genes listed by Gao et al. (2018) that has experienced selection in the golden and Lady Amherst’s pheasants. However, Gao et al. (2018) did not consider it as a potential candidate locus for carotenoid coloration.

Two regions with very high Fst values but low nucleotide diversity (π) were present at the end of scaffold 499 and beginning of scaffold 520. Based on zebra finch synteny, these scaffolds are adjacent to one another in chromosome 5. This pattern of high Fst and low π at the edge of adjacent scaffolds is similar to one discovered by Poelstra et al. (2014) as the source of melanin-based difference between hooded crow Corvus cornix and carrion crow Corvus corone. Poelstra et al. (2014) suggested that such an anomalous pattern might signal an inversion. Inversions often inhibit recombination and thus protect components of inverted regions as supergenes. In B. atriceps, several genomic regions show similar signs of inversion. Besides the potential inversion in chromosome 5, others include part of scaffold 3209 and a region encompassing scaffolds 170, 690, and 1854. Further studies, with deeper sequencing and alignment to better scaffolds, would be required to verify inversions in chromosomes among B. atriceps populations.

Because I did not find non-synonymous substitutions within the candidate color genes, it is possible the polymorphism in bulbul coloration is due to regulatory variation in non-coding regions. Non-coding mutations are considered potential causes of color polymorphism in Gouldian finches (Toomey et al. 2018; Kim et al. 2019), New World warblers (Toews et al. 2016), and seedeaters (Campagna et al. 2017). Since most variation within the candidate loci I discovered also occurs in non-coding regions, it is possible that regulatory mutations underlie the bulbul color polymorphism.

In comparisons of individual genes, dN/dS tests indicated positive selection in several loci. However, none of these genes appears to be involved in lipid transport or carotenoid metabolism. dN/dS patterns suggest most of the variation between Bornean populations and those on nearby islands is neutral (Fig. 3.6). These populations would have been separated only recently from one another, and gene flow likely maintains many polymorphisms. However, in comparisons of mainland Bornean and Sumatran populations, or B. 
atriceps with Poliolophus urostictus, dN/dS patterns tend towards an increase in balancing/purifying over neutral selection. If populations or species have been separated for a longer time, then an increase in balancing/purifying selection may be due to purging of slightly deleterious substitutions. Given larger time frames, the number of mutations between Sumatran and Bornean populations or bulbul species would have been greater than between Bornean populations, but there would also be more time for purifying selection to remove the slightly deleterious mutations.

I did not find any evidence that the gray color of B. atriceps in Maratua and Bawean populations was caused by parallel or convergent evolution. The regions of high Fst between yellow populations and Maratua’s gray population do not correspond to those between yellows and the Bawean gray. It is still possible, however, that the gray mutation is caused by the same allele fixed independently in the two island populations through drift. Unfortunately, on Bawean Island, all (or almost all) B. atriceps have been extirpated by pet traders. Thus, although in principle the island contains both yellow and gray individuals (Oberholser 1917; Hoogerwerf 1966; Burner et al. 2018), I was only able to obtain one individual from Bawean (a gray bird). The lack of sampling from Bawean Island limited the scope of my investigation. Island populations are also prone to genetic drift and bottlenecks, which lead to random fixing of rarer or novel alleles. Maratua’s monomorphic gray color may have been influenced by the founder effect and/or genetic drift expected on a small, isolated oceanic island. Such possibilities are investigated in Chapter 4.

Mayr (1954) introduced the idea of “genomic revolutions” to explain speciation in peripheral populations, i.e., the development of distinct species from founding populations in allopatry, as on islands. 
According to Mayr, once a population is established on an island, genetic drift and small population size cause a drastic reduction in genetic variation and a breakdown of established epistatic systems among genes, reducing phenotypic variability. The change in epistatic forces leads to the expression of novel or alternative phenotypes, which may become fixed in the island population. Other hypotheses for establishment of distinct island populations have also been suggested, most relying on epistatic interactions of genes and expansion or contraction of population sizes and variability (Carson 1968; Templeton 1980). If founder effects proceed as described under these hypotheses, it might not be possible to tease apart the genes responsible for various phenotypic traits through methods that we currently employ, as most of these rely on natural selection (e.g., dN/dS comparisons) to signal genetic changes associated with phenotype. Being gray on islands may or may not impart any selective advantage to B. atriceps, and thus tests of selection will not identify causal genes. On the other hand, darker feathers are known to impart resistance against parasites, such as mites, but we have no way to investigate such an advantage in this species.

CHAPTER 4. HISTORICAL DEMOGRAPHY OF THE BLACK-HEADED BULBUL (BRACHYPODIUS ATRICEPS) ON MARATUA AND BAWEAN ISLANDS, INDONESIA

INTRODUCTION

Islands are natural laboratories of evolution (MacArthur and Wilson 1967). Their small population sizes, isolation, and unique abiotic and biotic environments allow variable sets of evolutionary forces to influence morphology in a way not seen on the mainland (MacArthur and Wilson 1967). The founder effect, genetic drift, and changes in selective forces can yield novel (sometimes predictable, sometimes unpredictable) evolutionary results on islands. Differences in land area and habitat (Thorpe et al. 2015) and predation pressure (Kirchman 2012; Brandley, Kuriyama, and Hasegawa 2014) on islands, for example, may lead to common, distinct features in island taxa. Dwarfism, gigantism, flightlessness, and melanism are well-known characteristics of island species caused by convergent evolution stemming from commonality in selective regimes on islands. Islands also host highly divergent taxa, featuring characteristics unlike any in their mainland progenitors, e.g., the Kagu (Rhynochetos jubatus) of New Caledonia, the Malagasy avifauna, and paradise kingfishers (Tanysiptera) on islands off New Guinea. Such uniqueness in island taxa is often multiplied by adaptive radiation, e.g., of Galapagos finches and Hawaiian honeycreepers. As emphasized by Mayr (1954), distinct characteristics occur not only on distant, isolated islands but also on “peripheral islands”, i.e., those nearer the source population. Understanding the processes that lead to unique characteristics on islands is, thus, a topic of great interest in evolutionary biology.

MacArthur and Wilson (1967) argued that a balance exists between an island’s distance from the mainland and its area when it comes to rates of colonization and extinction, defining how island diversity arises and is maintained. Based on this view, islands close to the mainland tend to have greater rates of colonization and gene flow with the mainland population than more distant islands. 
Similarly, larger islands accumulate immigrants more rapidly and comprise more habitat diversity than smaller islands, increasing chances of successful colonization and greater diversity. In theory, on closer islands, forces that traditionally affect island populations such as founder effect and genetic drift are counteracted by the effects of gene flow and population assimilation, constraining genetic diversification. In addition to these generally expected patterns, island geography is also an important factor in shaping island diversity. For example, most continental islands, regardless of their size and distance, may be connected periodically to the mainland during times of lower ocean levels, as during Pleistocene glacial events, leading to greater rates of colonization and gene flow via land bridges than occur between the mainland and permanently isolated oceanic islands. Despite such expected patterns, however, enough island taxa deviate from expectations in characteristics and diversity to require alternative explanations of their uniqueness.

Several hypotheses have been proposed to explain the prevalence of distinct forms on island systems, including peripheral islands. Of these hypotheses, three are particularly noteworthy. In developing the first, Mayr (1954) noted that Tanysiptera kingfishers on the main island of New Guinea are uniform across various habitats, but on nearby islands, such as Waigeo, Numfor, and Biak, morphologically distinct forms predominate. Based on such observations, Mayr (1954) proposed the concept of “genomic revolutions” to explain the distinct taxa. He suggested that on the mainland, large effective population sizes and epistatic interactions between loci maintain phenotypic uniformity, whereas on peripheral islands genetic drift encouraged by small founding population sizes causes a breakdown in epistatic interactions, leading to the expression of novel, alternative phenotypes. Carson (1968) proposed a second, more dynamic explanation for peripheral island speciation, called “founder-flush”, based on the idea that island populations initially flourish from a relaxation of selective forces that were present on the mainland. This relaxation allows new populations on islands to express novel phenotypes that would have been selected against on the mainland and, consequently, to increase in heterozygosity and genetic variation. Eventually, the colonizing population collapses due to limitation in island size. This causes a shift in adaptive peaks driven by the genetic and phenotypic make-up of the individuals that remain and novel selection pressures on the island. The third hypothesis, called “genetic transilience” (Templeton 1980), follows Carson’s assumption of initially increased heterozygosity, but it assumes that stochastic changes in only a few loci, especially those involved in reproduction, are necessary to achieve reproductive isolation. The change in these few loci influences the entire genomic landscape through epistasis and leads to the fixation of linked and co-adapted genes.

The three hypotheses (genomic revolutions, founder-flush, and genetic transilience) have been discussed and tested at length, with mixed results (Barton and Charlesworth 1984; Carson and Templeton 1984; Gavrilets and Hastings 1996; Slatkin 1996). Coyne (1992) noted that most of these founder-effect based speciation models tend not to be supported by simulations. Barton and Charlesworth (1984) found a low probability of stochastic transition to new states based on assumptions of such models. 
However, other simulations have supported founder-effect speciation hypotheses (Gavrilets and Hastings 1996; Slatkin 1996). With respect to empirical data, early allozyme studies of Drosophila on the Hawaiian Islands found an increase in heterozygosity on small islands, thus supporting the initial elements of founder-flush and genetic transilience (Craddock and Johnson 1979). Increased or undifferentiated heterozygosity has been observed in other island species, including Galapagos finches and silvereyes (Zosteropidae) (Clegg et al. 2002; Lamichhaney, Berglund, et al. 2015). However, in a comparison of 294 island bird species, Hughes (2010) found a decrease in heterozygosity in most cases, and in a comparison of island populations of both mammals and birds, Frankham (1997) noted a decrease in heterozygosity in a majority of species, except those with greater dispersal capabilities. These findings suggest greater complexity in the process of island diversification than predicted by simple models. Moreover, our understanding of the range of this complexity is poor, since most studies have relied on limited genetic data (a few microsatellites or other loci). To investigate the dynamics of populations in a peripheral island system, I chose to work on a bird species, the Black-headed Bulbul (Brachypodius atriceps), inhabiting Sundaland in Southeast Asia. This region encompasses the Malay Peninsula, Greater Sunda Islands (Borneo, Sumatra, Java, and Palawan), and hundreds of smaller islands. By virtue of its geologically recent, dynamic history and myriad islands, Sundaland offers numerous opportunities to compare populations on islands of various sizes, varying distances to the mainland, and different geographic histories (Mayr 1944; Sheldon, Lim, and Moyle 2015; Lim et al. 2017). Brachypodius atriceps occurs on the mainland and virtually all continental islands of Southeast Asia. There are two distinct morphs of this bulbul, yellow and gray.
Throughout most of its range the yellow morph predominates and only occasionally does the gray morph occur. However, on two islands the situation changes. On Maratua Island, an oceanic island off the east coast of Borneo, all individuals in the population are gray (Bangs and Peters 1927; Chua et al. 2015; Burner et al. 2018). On Bawean Island, a continental island between Java and Borneo in the Java Sea, both yellow and gray forms are equally common (Oberholser 1917; Hoogerwerf 1966; Burner et al. 2018). In several respects, this distributional scenario is reminiscent of that of the Australo-Papuan Tanysiptera paradise-kingfishers, which Mayr (1954) used to develop his ideas on peripheral speciation through genetic revolutions: a widespread species largely uniform in morphology across most of its range but distinct on small islands. As such, B. atriceps offers an opportunity to examine the underpinnings of classic peripheral island divergence and convergence in phenotype. Although the common occurrence of gray morphs on Maratua and Bawean suggests similar evolutionary forces at play, the two islands have experienced very different geographic histories, which should have strongly influenced the role of drift and (possibly) selection. Maratua is a raised volcanic atoll of coralline limestone ~50 km east of Tanjung Batu, East Kalimantan (Fig. 4.1) and is separated from Borneo by a permanent oceanic barrier greater than 180 m in depth (Riley 1930). The island has never been connected to mainland Borneo since its formation, even during dramatically lower sea levels of the Pleistocene caused by global glacial events (Sathiamurthy and Voris 2006). Bawean, on the other hand (Fig. 4.1), lies 150 km north of Java on the Sunda continental shelf in the Java Sea, where the ocean depth is only 20-60 m (Koropitan and Ikeda 2008). Bawean was joined by land to both Borneo and Java numerous times during the Pleistocene when sea levels were lower, including during the Last Glacial Maximum (LGM, c. 21 ka). Thus, the population on Bawean Island would have been periodically connected with those on the Greater Sundas (Borneo, Sumatra, and Java) until very recently. These connections would have exposed Bawean to a much higher level of gene flow than the continuously isolated Maratua.
The process of examining hypotheses of peripheral island speciation first requires that panmixia exist in the mainland population. MtDNA comparisons across Southeast Asia indicate little population variation in B. atriceps, except in Indochina (Dejtaradol et al. 2015) and between Maratua Island and the rest of Southeast Asia (Chua et al. 2015). Comparing Ultraconserved Element (UCE) data, Lim et al. (2019) examined the population dynamics of B. atriceps on mainland Asia, the Greater Sundas, and Bawean (but not on Maratua). They found substantial structure in Indochina (similar to the findings of Dejtaradol et al. 2015), but little variation among the Sunda Islands and the Thai-Malay Peninsula. Thus, preliminary studies strongly support a generally panmictic genetic environment in Sundaland, except for the Maratua population. In Chapter 3, I located and examined genes potentially involved in B. atriceps’ coloration and assessed the role of natural selection on those genes in the populations of Maratua and Bawean islands. In the process, I obtained a well-annotated genome of one yellow individual from Borneo and seventeen low-coverage genomes of individuals of B. atriceps from throughout its range on Borneo, Sumatra, and Palawan, as well as from the smaller islands Maratua, Bawean, and Siberut in the Mentawai Islands off western Sumatra (Fig. 4.1). In this chapter, I take advantage of the enormous wealth of data from those genomes to study in detail the demographic history of B. atriceps on Maratua and Bawean. Specifically, I extracted UCEs from the genomic data to produce sequences for phylogeographic inference. I also identified single nucleotide polymorphisms (SNPs) from introns to use in population genetic and demographic comparisons. I estimated the age of divergence of the island populations and ascertained relative heterozygosity among the populations to assess the applicability of the genetic revolutions, founder-flush, and genetic transilience hypotheses. Finally, I used the data to fit models and tested these models to infer historical demographic patterns in Sundaland.

Figure 4.1. Map of Sundaland showing elevational patterns and bathymetric contours. The red contour indicates 120 m, corresponding to the approximate sea-level during the last glacial maximum (c. 21 ka). Insets with Maratua and Bawean islands showing bathymetric contours deeper than 120 m (red) at 500 m intervals (black).

METHODS

Extracting UCE data and phylogenetic analysis

Methods of assembling genomic sequence contigs (BAM files) of yellow and gray B. atriceps individuals are described in Chapter 3. From these BAM files, I generated fasta files for each individual using the mpileup option in Samtools in concert with bcftools and vcf2fq scripts, and then by converting fastq to fasta formats. I then extracted UCE data from the assembled genome fasta files by following the phyluce pipeline “Harvesting UCE Loci From Genomes” described by Faircloth (2016). The fasta files were first converted to 2bit format. Then, the UCE 5k probe set was aligned with the genomic data, and fasta sequences of the UCEs along with flanking sequences were extracted from each genome. The UCE sequences were aligned using MAFFT and then internally trimmed and cleaned using Gblocks. A matrix of UCE loci with 90% completeness was compiled such that a relatively large number of UCE loci were included in the matrix with minimal amount of missing data. These loci were concatenated for use in phylogenetic reconstruction. Trees were constructed using RAxML (Stamatakis 2014), which was run with the GTRGAMMA model of nucleotide substitution and with 1000 non-parametric bootstrap replicates through the CIPRES scientific gateway (Miller, Pfeiffer, and Schwarz 2010).
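
The 90% completeness criterion can be illustrated with a minimal Python sketch. This is not the phyluce implementation; the function name, the individual labels, and the toy locus table are hypothetical and serve only to show how a locus is retained when it is present in at least 90% of the sampled individuals.

```python
def loci_meeting_completeness(locus_to_taxa, n_taxa, min_fraction=0.9):
    """Return the loci recovered in at least min_fraction of all taxa."""
    kept = []
    for locus, taxa in sorted(locus_to_taxa.items()):
        # Fraction of sampled individuals in which this locus was recovered
        if len(set(taxa)) / n_taxa >= min_fraction:
            kept.append(locus)
    return kept


# Hypothetical toy data: 10 individuals and 3 loci
individuals = [f"ind{i}" for i in range(10)]
toy_loci = {
    "uce-1": individuals,      # present in 10/10 individuals
    "uce-2": individuals[:9],  # present in 9/10 individuals
    "uce-3": individuals[:8],  # present in 8/10 individuals
}
kept_loci = loci_meeting_completeness(toy_loci, n_taxa=10)
print(kept_loci)
```

In this toy case the first two loci pass the threshold and the third is dropped, which mirrors the trade-off described above between the number of loci retained and the amount of missing data.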

SNAPP tree

To improve resolution of relationships between individuals of Brachypodius atriceps, I followed a coalescent approach using only intronic biallelic SNPs implemented through SNAPP. I pruned the vcf file produced in Chapter 3, which contained all variants of all B. atriceps populations and outgroups using vcftools (Danecek et al. 2011), and kept only biallelic SNPs that were identified with high confidence from intronic sequences in all individuals (max-missing = 1). The vcf file was also thinned to include only sites separated by a minimum distance of 10,000 bp to avoid linked sites. In total, 92,802 SNPs were used for SNAPP analyses. A control file was created using BEAUti 2 with default settings. The six Maratua individuals were designated as a single unit. Multiple MCMC runs were initiated for 5 million generations, sampling every 1000 generations, using BEAST 2. Convergence among runs was visualized using Tracer v1.6.7 (Rambaut et al. 2014).
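
The effect of distance-based thinning can be sketched in Python as follows. This is an illustrative greedy rule under stated assumptions, not the vcftools source code: a site is kept only if it lies at least the minimum distance beyond the previously kept site on the same chromosome. The function name and the toy coordinates are hypothetical.

```python
def thin_sites(sites, min_dist=10_000):
    """Greedy thinning of (chromosome, position) pairs.

    A site is kept only if it lies at least min_dist bp beyond the
    previously kept site on the same chromosome.
    """
    kept = []
    last_kept = {}  # chromosome -> position of the most recently kept site
    for chrom, pos in sorted(sites):
        previous = last_kept.get(chrom)
        if previous is None or pos - previous >= min_dist:
            kept.append((chrom, pos))
            last_kept[chrom] = pos
    return kept


# Hypothetical toy coordinates
toy_sites = [
    ("chr1", 100), ("chr1", 5_000), ("chr1", 10_100),
    ("chr1", 15_000), ("chr2", 50), ("chr1", 20_100),
]
thinned = thin_sites(toy_sites)
print(thinned)
```

Thinning of this kind reduces the number of SNPs substantially but lowers the chance that retained sites are in linkage disequilibrium, which matters for methods that assume unlinked markers.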

PCA and STRUCTURE analysis

Similar to the filtering approach used for the SNAPP tree, I generated a dataset of intronic biallelic SNPs for all B. atriceps. I removed the outgroup species from the vcf file and kept only SNPs that were identified with high confidence in all individuals (100% complete matrix) using vcftools. The vcf file was similarly thinned to include only sites separated by a minimum of 10,000 bp to avoid linked sites. In total, 29,474 SNPs were used for further analyses. Using the thinned, biallelic vcf file, population structure was examined via principal component analysis (PCA) and STRUCTURE (Pritchard, Stephens, and Donnelly 2000). PCA was performed in R using the glPca function in adegenet (Jombart 2008). For STRUCTURE analysis, the vcf file was converted to STRUCTURE format using the program PGDSpider2 (Lischer and Excoffier 2011). To determine the appropriate number of populations (k), five replicate STRUCTURE runs were initiated with different values of k for 50,000 iterations and a burnin of 5000. The Evanno method (Evanno, Regnaut, and Goudet 2005) implemented in Structure Harvester v0.6.94 (Earl and vonHoldt 2012) found that a k of 3 produced the maximum likelihood for my dataset. Hence, I set k to 3 and computed a structure plot from ten replicates, each with 500,000 iterations and a burnin of 50,000, using ParallelStructure v2.3.4 implemented through the CIPRES Scientific Gateway (Miller, Pfeiffer, and Schwarz 2010).

dadi analysis

To assess the demographic history of the Maratua population in more detail, I applied the program dadi, which uses a diffusion-based approach as opposed to coalescent simulations to infer population history. Diffusion models assume a small per-generation change in allele frequencies (Kimura 1964; Gutenkunst et al. 2009). They are computationally faster than coalescent analysis methods and, thus, particularly useful for whole-genome datasets. Diffusion models like dadi can also account for gene flow (Gutenkunst et al. 2009). dadi relies on the allele-frequency spectrum, which can be simulated under various models to infer demographic histories (Gutenkunst et al. 2009).
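
As a minimal sketch of the data structure involved, a two-population allele-frequency spectrum can be tabulated from per-site allele counts as shown below. This is not how dadi or easySFS are implemented (neither projection nor folding is shown); the function name, sample sizes, and toy counts are hypothetical.

```python
def joint_sfs(allele_counts, n1, n2):
    """Tabulate a two-population site-frequency spectrum.

    allele_counts: iterable of (count_in_pop1, count_in_pop2) for one
                   allele at each SNP.
    n1, n2:        number of sampled chromosomes in each population.
    Entry [i][j] is the number of SNPs at which the allele is seen
    i times in population 1 and j times in population 2.
    """
    sfs = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for c1, c2 in allele_counts:
        sfs[c1][c2] += 1
    return sfs


# Hypothetical toy counts: 4 chromosomes in population 1, 2 in population 2
toy_counts = [(1, 0), (1, 0), (0, 2), (2, 2), (1, 0)]
toy_sfs = joint_sfs(toy_counts, n1=4, n2=2)
for row in toy_sfs:
    print(row)
```

In this toy spectrum, entries along the first row or first column correspond to alleles found in only one of the two populations, and the remaining entries correspond to shared alleles.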

I used the program dadi (Gutenkunst et al. 2009) on the thinned, biallelic vcf file. Only populations from mainland Borneo (n=7) and Maratua Island (n=6) were used in this analysis. The vcf file was projected into a site-frequency spectrum (SFS) using easySFS with all mainland Borneo individuals designated as population 1 and the Maratua individuals designated as population 2. The number of segregating sites in each population at different downward projections of sample size was calculated using easySFS, and the projection with the maximum number of segregating sites for each population was selected to make an SFS. I fit the SFS to ten island-based two-population models and the standard null model implemented in dadi (Gutenkunst et al. 2009). The ten models were: founder_nomig, founder_sym, founder_asym, founder_nomig_admix_early, founder_nomig_admix_late, vic_no_mig, vic_no_mig_admix_early, vic_no_mig_admix_late, vic_sec_contact_asym_mig, and vic_anc_asym_mig. For each model, I first optimized the parameters to obtain values that converged on a similar log-likelihood score. I used grid sizes of 30, 40, and 50 points, each at least 10 greater than the largest projection used, as suggested in the dadi manual. Each optimization step included four rounds of increasingly focused optimization routines repeated 10, 20, 30, and 40 times, each round with a maximum of 3, 5, 10, and 15 iterations, respectively. For every model, initial parameters were selected randomly with the population size parameter bound between 10^-2 and 100, divergence time parameters bound between 0 and 5000, migration rate parameters bound between 0 and 10, and island population fraction parameter (s) bound between 0 and 1, following recommendations in the dadi manual. The best fitting parameters for each model were then used as initial parameters, and each was perturbed to estimate best-fit values for each model. Each model was run at least five times.
For models in which parameters did not converge, I ran dadi more times until the parameters converged in at least three runs. The model with parameters having the highest likelihood value was used for further demographic inference. To convert parameter values to biologically relevant units, I calculated the mutation rate (µ) for B. atriceps. To this end, I generated a list of 500 introns randomly sampled from across the genome and extracted the sequences of each of these intron regions from the genome to construct a fasta file. These regions were blasted to the genomes of model species Taeniopygia guttata (taeGut3.2.4), Ficedula albicollis (FicAlb_1.4), and Manacus vitellinus (Manacus_vitellinus.ASM171598v2). From the blast output of each species, I selected only the best matched hit with an e-value smaller than 10^-100 for each intron. The rate of substitution between B. atriceps and each model species was calculated for each intron. The mutation rate (µ) was estimated using the formula d_AB = 2µT_AB, where d_AB is the number of nucleotide substitutions between species A and B, and T_AB is the time of divergence between species A and B. The divergence times between B. atriceps and the three model species were based on divergence time between the families Pycnonotidae (B. atriceps) and the Estrildidae (T. guttata: 27.0 my), Muscicapidae (F. albicollis: 27.0 my), and Pipridae (M. vitellinus: 43.9 my) from Oliveros et al. (2019). The average mutation rate for each comparison was calculated by averaging the mutation rates estimated for all introns. The mutation rate used to convert parameter values was the average of the rates derived from the three species (1.661 × 10^-9 per year). For generation time, I used 1.7 years, which is the average for passerine birds (Sæther et al. 2005). To calculate effective population size (Nuref), I used the equation: θ = 4 × µ × Nuref × L.
Since the number of SNPs was subset during the thinning and filtering process, the value of L (length of sequence used in the dataset) was adjusted to the length of sequence that would yield the number of SNPs used in the final analysis. Split effective population sizes were based on the fraction parameter, s, in which the effective size of population 1 (mainland) is nuA * (1-s) and population 2 (island) is nuA * s. Split time parameters are reported in 2 Nuref generations and subsequently converted to years using the calculated value of Nuref and the generation time. Migration rates were estimated using mab * 2 * Nuref.
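
The unit conversions described above can be sketched in Python. This is an illustration under stated assumptions rather than the code used in the analysis: the function names are hypothetical, all input values are toy numbers rather than values from this study, d_AB is assumed to be expressed per site, and rescaling the per-year mutation rate by the generation time is an assumption about how the per-generation rate is obtained.

```python
def mutation_rate(d_ab, t_ab_years):
    """Solve d_AB = 2 * mu * T_AB for mu (substitutions per site per year)."""
    return d_ab / (2.0 * t_ab_years)


def reference_size(theta, mu_per_generation, seq_length):
    """Solve theta = 4 * mu * Nuref * L for Nuref."""
    return theta / (4.0 * mu_per_generation * seq_length)


def split_time_years(t_model, n_ref, generation_time):
    """Convert a time given in units of 2 * Nuref generations to years."""
    return t_model * 2.0 * n_ref * generation_time


# Hypothetical toy inputs (not values from this study)
generation_time = 1.7                                  # years per generation
mu_year = mutation_rate(d_ab=0.1, t_ab_years=25e6)     # per site per year
mu_generation = mu_year * generation_time              # assumption: rescaled per generation
n_ref = reference_size(theta=40.0, mu_per_generation=mu_generation, seq_length=1e6)
years = split_time_years(t_model=0.5, n_ref=n_ref, generation_time=generation_time)
print(f"mu per year = {mu_year:.3e}")
print(f"Nuref = {n_ref:.1f}")
print(f"split time = {years:.1f} years")
```

Because Nuref scales inversely with both µ and L, any uncertainty in the mutation rate or in the adjusted sequence length carries directly into the estimated population sizes and split times.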

Heterozygosity and substitution ratios

The number of heterozygous sites in various populations was estimated using the vcftools --het command (Danecek et al. 2011). Transition-transversion ratios within intron sequences were calculated for Maratua gray vs Bornean yellow birds using the vcftools --TsTv-summary command (Danecek et al. 2011). One individual from each population was used to generate statistics. Similar ratios were calculated for exon sequences and mitochondrial sequences.
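
For illustration, the transition-transversion ratio can be computed from a list of reference and alternate alleles as in the following Python sketch. It is not the vcftools implementation; the function names and the toy SNP list are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """A transition swaps purine for purine or pyrimidine for pyrimidine."""
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snps):
    """Return transitions divided by transversions for (ref, alt) SNPs."""
    transitions = sum(1 for ref, alt in snps if is_transition(ref, alt))
    transversions = len(snps) - transitions
    if transversions == 0:
        raise ValueError("no transversions; ratio is undefined")
    return transitions / transversions


# Hypothetical toy SNPs: four transitions and two transversions
toy_snps = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "A"), ("T", "G"), ("T", "C")]
ratio = ts_tv_ratio(toy_snps)
print(ratio)
```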

RESULTS

UCE tree

Altogether 4539 UCE loci were extracted from the genomes. The 90% complete matrix comprises 4255 UCE loci with an average locus size of 1124 bp (610 to 1781 bp) to yield a sequence dataset totaling 4,783,674 bp. The RAxML tree (Fig. 4.2) strongly supports two groups, one comprising Maratua birds and the other comprising the populations of Sumatra, Borneo, Siberut, Bawean, and Palawan (i.e., all other Sunda islands compared). Branches between individual populations within the latter group are not well-supported.

Figure 4.2. RAxML tree of Brachypodius atriceps populations based on UCE loci. Bootstrap support is indicated next to corresponding nodes.

SNAPP tree

The tree generated from SNPs using SNAPP was similar to that generated from the UCE sequence data (Fig. 4.3). The shallow branches were still not resolved using intronic SNPs. However, for the individuals from mainland Sumatra, Borneo, Siberut, Bawean, and Palawan (i.e., all other Sunda islands compared), two groups were well-supported. One group consisted of individuals from NW Borneo (Mt. Pueh and Mt. Penrissen), E Kalimantan, Siberut, and Ulu Tungud in Sabah, NE Borneo. The second group consisted of mostly birds from NE Borneo (Sabah) and north-central Borneo (Bintulu), Sumatra, Palawan, and Bawean.

Figure 4.3. SNAPP tree of Brachypodius atriceps populations based on intronic sequences. Posterior probability support is indicated near respective nodes.

PCA analysis

The first principal component axis explains 88.1% of the variation in the data, while the second and third PC axes explain 1.47% and 1.27%, respectively. In the plot of PC1 versus PC2 (Fig. 4.4), PC1 segregates the Maratua population from all the other populations. PC2 distinguishes the Bawean population. PC3 (not shown) distinguishes western Bornean and Palawan populations from all the other populations.

STRUCTURE analysis

The Evanno method (Evanno, Regnaut, and Goudet 2005) specified 3 populations as the best k for clustering. The STRUCTURE plot places Maratua populations in one cluster (Fig. 4.5). The remaining individuals contain genetic characteristics of 2 populations, but without clear dominance patterns.

Figure 4.4. PCA showing PC1 versus PC2 using SNPs from intronic sequences.

Figure 4.5. STRUCTURE plot of B. atriceps populations.

dadi analysis

The 2D Site Frequency Spectrum (SFS) of the mainland Bornean populations vs. Maratua populations has most alleles restricted to one population each, and only a few alleles shared between populations (Fig. 4.6.A). Most Bornean SNPs are singletons, whereas most Maratua SNPs are shared among all six individuals. The model “vic_sec_contact_asym_mig” had the best likelihood fit (-152.706) among the eleven models tested (Table 4.1). A comparison of the data and the model distributions of allele frequencies between Maratua and Bornean populations indicates that most of the population-specific sites were well fitted by the model, but alleles shared between Maratua and Bornean populations were less so (Fig. 4.6.B). In any event, residuals from the model fit are centered close to 0 (Fig. 4.6.C; Fig. 4.6.D).

Table 4.1: Models tested using dadi and their best-fitting parameter likelihoods.

| Model | Brief description | Num. of parameters | Likelihood | AIC |
|---|---|---|---|---|
| vic_sec_contact_asym_mig | Vicariance with no migration, followed by secondary contact with asymmetrical migration | 5 | -152.706 | 315.412 |
| vic_anc_asym_mig | Vicariance with asymmetrical migration | 5 | -160.885 | 331.77 |
| founder_asym | Founder event with asymmetrical migration | 5 | -164.175 | 338.349 |
| vic_no_mig | Vicariance with no migration | 2 | -174.793 | 353.586 |
| founder_nomig | Founder event with no migration | 3 | -174.284 | 354.567 |
| vic_no_mig_admix_late | Vicariance with no migration, followed by a period of late admixture | 3 | -174.699 | 355.398 |
| vic_no_mig_admix_early | Vicariance with no migration, followed by a period of early admixture | 3 | -174.793 | 355.586 |
| founder_nomig_admix_late | Initial founder event with no migration, followed by a late period of admixture | 4 | -174.267 | 356.535 |
| founder_nomig_admix_early | Initial founder event with no migration, followed by an early period of admixture | 4 | -174.283 | 356.567 |
| founder_sym | Founder event with symmetrical migration | 4 | -174.284 | 356.567 |
| snm | Standard null model | - | -885.03 | 1770.060 |
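
The AIC values in Table 4.1 are consistent with the standard definition AIC = 2k - 2 ln L, where k is the number of parameters and ln L is the log-likelihood. The following Python sketch recomputes AIC for five of the models from the parameter counts and likelihoods reported in the table and ranks them. The variable names are illustrative, and differences in the last decimal place can arise because the tabulated likelihoods are rounded.

```python
# (model name, number of parameters k, log-likelihood) as reported in Table 4.1
models = [
    ("vic_sec_contact_asym_mig", 5, -152.706),
    ("vic_anc_asym_mig", 5, -160.885),
    ("founder_asym", 5, -164.175),
    ("vic_no_mig", 2, -174.793),
    ("founder_nomig", 3, -174.284),
]


def aic(k, log_likelihood):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood


scores = {name: aic(k, ll) for name, k, ll in models}
best = min(scores, key=scores.get)
# Difference in AIC between each model and the best-scoring model
delta = {name: score - scores[best] for name, score in scores.items()}

for name in sorted(scores, key=scores.get):
    print(f"{name:26s} AIC = {scores[name]:8.3f}   delta AIC = {delta[name]:7.3f}")
```

Note that a model with fewer parameters can rank above one with a slightly higher likelihood, as with vic_no_mig and founder_nomig, because AIC penalizes each additional parameter.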

Figure 4.6. A) and B) Distribution of SNPs in data and model. C) Spread of residuals in model. D) Distribution of residuals

Converted to biologically significant units, the parameter estimates of the model “vic_sec_contact_asym_mig” (Fig. 4.7) set the value of θ at 53.673. Given a generation time of 1.7 years, this value indicates that approximately 0.18% of the Bornean population was isolated on Maratua at c. 1.9 Ma. The two populations remained isolated until secondary contact c. 1000 years ago, with the occurrence of an asymmetrical migration rate of 0.01 from Borneo to Maratua and 0.00001 from Maratua to Borneo. For comparison, I converted the parameter estimates of the next highest supported models into biological units. The model with the second highest support (likelihood -160.885) was “vic_anc_asym_mig”, which assumes vicariance with ancient asymmetric migration without secondary contact as in the previous model. For this model, θ was estimated at 81.979. It suggests that the Maratua population split from the Bornean population c. 11.4 Ma and c. 4.8% of the Bornean population was isolated on Maratua. The asymmetrical migration rate from Borneo to Maratua was 0.0003 and from Maratua to Borneo was 8.61 × 10^-10.

Heterozygosity and substitution ratios

Using an individual from Borneo as the reference genome, the average number of heterozygous sites was 4,077,642 on mainland Borneo (n=7), 5,138,960 on Sumatra (n=1), 812,572 on Maratua (n=6), 3,799,034 on Bawean (n=1), 5,832,377 on Siberut (n=1), and 4,448,497 on Palawan (n=1). Between the Maratua and mainland Bornean populations, the transition/transversion ratios for introns and exons were 2.09 and 3.43, respectively.

Figure 4.7. Thematic representation of the dadi population structure model with the highest likelihood and parameters converted to relevant biological units.

DISCUSSION

Using whole genome data, I examined the demographic history of Brachypodius atriceps populations on the Sunda Islands, with special emphasis on a small oceanic island, Maratua, and a small continental island, Bawean, where gray colored birds predominate or are common. The results suggest the Bawean population, as well as populations on other small to medium continental islands (Palawan and Siberut), are barely distinguishable genetically from those on the Greater Sundas (Borneo, Java, and Sumatra). However, the Maratua population is distinct from all other populations. The Maratua population was originally isolated from other Sundaic populations c. 1.9 Ma, but c. 1000 years ago began to experience a small amount of gene flow. Earlier studies have shown that B. atriceps exhibits little to no population structure across most of its range, except in central or northern Indochina and on Maratua Island (Chua et al. 2015; Dejtaradol et al. 2015; Lim et al. 2019). In their UCE study, Lim et al. (2019) included only a single gray specimen (from Bawean Island) and found that it barely differed from other populations (they did not compare individuals from Maratua). However, the UCE data of that gray specimen were derived from a toe-pad sample >100 years old and did not yield many complete loci. Nevertheless, my results using a much larger set of loci, all of which derived from well-preserved, recently collected muscle samples, corroborate the observations of both Chua et al. (2015) and Lim et al. (2019). The UCE tree (Fig. 4.2), SNAPP tree (Fig. 4.3), PCA (Fig. 4.4), and STRUCTURE analysis (Fig. 4.5) all strongly support the Maratua population as a group distinct from all other populations, including the Bawean population. In contrast, based on the second axis of PCA, which explained 1.47% of the variation, the Bawean population most likely only started accumulating unique SNPs very recently (c. 10,000 years ago). Based on the dadi analysis, the Maratua and Bornean populations split c.
1.9 Ma and secondary contact occurred c. 1000 years ago. I am not aware of any geographic process that corresponds to these times of division and secondary contact. At the time of the population split, only c. 0.18% of the effective population of Borneo was isolated on Maratua. This percentage makes sense given Maratua’s small size compared to Borneo. Because Maratua was never connected to Borneo, even during the low sea level of the LGM, its B. atriceps population would have evolved largely in isolation for c. 1.9 million years. The island’s long isolation and small size (~24 km²) explain Maratua’s accumulation of variable SNPs compared to populations elsewhere in Sundaland. Possibly the founder effect and likely genetic drift promoted the rapid fixation of alleles that occur in low proportions elsewhere in B. atriceps’ distribution. Borneo, Sumatra, and Java, on the other hand, were often connected during periods of low sea-level in the Pleistocene, and the combination of their large sizes and periodic gene flow would have had an amalgamating effect on their populations. My results show a significant reduction of heterozygosity among Maratua individuals relative to Borneo. This reduction is not observed among the other peripheral island populations I examined (Bawean, Palawan, and Siberut). Mayr’s (1954) genetic revolutions hypothesis predicts a loss in heterozygosity, but my results do not specify the degree of heterozygosity when the Maratua population was founded or during the early stages of population establishment. A rise in heterozygosity as predicted by founder-flush (Carson 1968) and genetic transilience (Templeton 1980), with a subsequent loss of heterozygosity through longevity in isolation, may have occurred. Founder-flush and genetic transilience also posit shifts in adaptive peaks that would be expressed as positive selection of genes, which I did not find.
Other measures of genetic differentiation suggest the Maratua population has been evolving through neutral processes even with the loss in heterozygosity. The transition-transversion ratio of all introns in the Maratua population was ~2, and for exons >3, suggesting the accumulation of mutations within Maratua birds occurred in accordance with the expectations of neutral mutation rates. My results from Chapter 3, based on dN/dS ratios, indicated a lack of selection in coding genes potentially involved in color polymorphism. Most of the protein-coding genes were found to have experienced little positive selection, with the majority lacking any substitutions or having evolved at neutral rates. Hence, positive selection may not have played a part in establishing the gray morphs on the island. Based on these results, if Mayr’s hypothesis is correct, the yellow coloration is maintained in mainland populations through epistatic interactions with other gene complexes. On Maratua, the reduction in genetic variability and small population sizes would have interrupted the complexes. Genetic drift may then have led to random fixation of the distinct (gray) phenotype. However, in Chapter 3 I was unable to pinpoint any such region with confidence. To understand the evolutionary forces at play during the beginning stages of the Maratua population, one can refer to the present genetic structure of the Bawean population. Genetic drift should be an active force on that island as well, given its small size. However, unlike Maratua, Bawean is a continental island that was often connected to the Greater Sundas during the last 2.5 million years and would have experienced substantial gene flow and population amalgamation. On Bawean Island, both gray and yellow morphs are equally common and regularly interbreed (Hoogerwerf 1966). Carson (1975) described the existence of two classes of loci, those that are “independent” and others that are bound by strong epistatic interactions with other genes. He labelled them as “open” and “closed” systems. A strong force is required to break down the epistatic interactions in “closed” systems. On small islands the founder effect and drift may be adequate to break down such epistatic interactions. Upon disorganization of closed systems, new variants can be established in populations. This breakdown may explain why gray birds are more common on Bawean Island than on the Greater Sundas, and the existence of both color morphs may represent conditions before the population crash predicted by founder-flush and genetic transilience. If so, sometime in the future following a population crash, only gray morphs or yellow morphs will likely persist on Bawean Island (barring low sea levels and a new influx of genes). Alternatively, it is also possible that yellow birds on the island are recent migrants from Borneo and/or Java. Unfortunately, with a sample consisting of only a single gray individual and no yellows, I was not able to test this possibility. The size of the peripheral island may also play an important role in the chances of epistatic breakdown. Maratua and Bawean islands, with gray birds, are 24 and 210 km², respectively, and are much smaller than all other islands whose populations I examined. Their small size is critical to the breakdown of epistatic interactions. Other islands in Sundaland that have been permanently separated by a water barrier from the Greater Sundas (Lim et al. 2014) include Palawan and Siberut.
These islands have populations with typical yellow morphs that do not vary genetically from populations on their nearest large neighbors (Borneo and Sumatra, respectively). It might be expected that these two islands, like Maratua, would have experienced substantial isolation and divergence in their populations. However, both are much larger than Maratua, increasing the likelihood of gene flow and decreasing genetic drift and the consequent fixation of distinct alleles.

Conservation

With respect to conservation, island populations are often distinct evolutionary groups with small population sizes. As such, island populations are not only inherently valuable as distinct lineages for evolutionary research, but they are also often particularly vulnerable to disturbance. Maratua Island is well known in Southeast Asia as a potential biogeographic museum by virtue of its distinct populations of vertebrates. It may serve as an archive of genetic structure and morphology of early populations that invaded Sundaland from mainland Asia (Fooden 1995; Chua et al. 2015). Unfortunately, small island populations of B. atriceps are being disproportionately affected by the conversion of forests to agricultural lands and trapping for the Indonesian bird trade. Distinct populations of birds on both Maratua and Bawean, not only of B. atriceps but also of several other species, attract poachers who capture and sell the birds as pets (Chng, Shepherd, and Eaton 2018). The population of B. atriceps on Bawean, for example, appears to have been completely extirpated in recent years (Burner et al. 2018). Once the birds are gone, the chance to understand their role in the evolutionary history of Sundaland is likewise lost forever.

CHAPTER 5. CONCLUSIONS

Understanding the genetics of polymorphisms can yield key insight into the process of evolution. I examined the genetic basis and demographic history of color morphs in the Black-headed Bulbul (Brachypodius atriceps) by inferring the phylogeny of the family Pycnonotidae (Chapter 2), searching for genes associated with color morphs within B. atriceps (Chapter 3), and comparing the phylogeographic history of various B. atriceps populations across Sundaland (Chapter 4) to gain a better understanding of population dynamics in the species.

By reconstructing the phylogeny of the family Pycnonotidae, I established the relationships of B. atriceps, as well as other yellow bulbuls. I used publicly available sequences generated from previous phylogenetic studies, augmented by new sequences from several unstudied taxa, to create a supermatrix from which to infer the phylogeny. In all, I compared 121 of the 130 species of bulbuls. My tree supports the monophyly of the family and comprises an exclusively African and a predominantly Asian clade. I found several genera not to be monophyletic and, hence, I suggest taxonomic changes to provide a more accurate classification based on phylogeny. The tree also showed that bright yellow coloration among bulbul species evolved multiple times. I was able to identify the Yellow-wattled Bulbul (Poliolophus urostictus) as the closest non-yellow species to B. atriceps and used this information to choose outgroup taxa in the following chapters.

The genetic underpinnings of carotenoid coloration are slowly being unveiled as more studies apply whole-genome sequencing (WGS) to investigate carotenoid-based polymorphism in birds. In Chapter 3, I used WGS to compare the genomes of both yellow and gray morphs of B. atriceps. Although I located a few genes in regions of high Fst within the genome that are likely involved in lipid (and hence carotenoid) transport and deposition, I was unable to associate any loci directly with the polymorphism. Tests of selection among protein-coding genes in high Fst regions in gray and yellow morphs did not identify many established candidates as controlling carotenoid coloration.

Establishing the phylogeographic history of a species is integral to constructing a full picture of its evolution. Thus, in Chapter 4, I used the whole-genome dataset from Chapter 3 to infer the population genetics of B. atriceps. I found the Maratua Island population of gray individuals to be distinct from all other B. atriceps populations, whereas the Bawean gray and yellow individuals, and its population as a whole, are completely indistinguishable from one another and are embedded within an almost panmictic Sundaic clade. Upon analyzing the demographic history of Maratua birds using models implemented in dadi, I found that the Maratua population was originally isolated from other Sundaic populations c. 1.9 Ma, but c. 1000 years ago began to experience a small amount of gene flow from mainland Borneo.

While conducting fieldwork to collect specimens for this study, I discovered that small Sundaic island populations of B. atriceps are being disproportionately affected by the Indonesian pet-bird trade because of the species’ elegant song and beautiful plumage. Fortunately, sampling the populations when I did, before they were lost forever, provided genetic material for this and future studies, which eventually will develop a full picture of the evolutionary history of these remarkable birds.

APPENDIX A. SUPPLEMENTARY MATERIAL FOR CHAPTER 2

Table A.1. GenBank accession numbers for all taxa included in the supermatrix analyses. Accession numbers for new sequences generated for this study are in italics. All tissues used in this study and previous studies are listed in the column “Tissue No.”. Numeric superscripts in front of each GenBank accession number correspond to the respective numbers in front of tissue numbers from which they were sequenced. Abbreviations used to designate collections: AMNH: American Museum of Natural History; ANSP: Academy of Natural Sciences of Philadelphia; CAS: California Academy of Sciences; CU: Cornell University; FMNH: Field Museum of Natural History; KU: University of Kansas Biodiversity Institute; LSUMNS: Louisiana State University Museum of Natural Science; MNHN: Museum national d’Histoire Naturelle; MZB: Museum Zoologicum Bogoriense-LIPI; NRM: Naturhistoriska Riksmuseet, Swedish Museum of Natural History; RBINS: Royal Belgian Institute of Natural Sciences; UMMZ: University of Michigan Museum of Zoology; USNM: National Museum of Natural History, Smithsonian Institution; UWBM: University of Washington Burke Museum; VH: Vogelwarte Hiddensee, Zoological Institute and Museum; WFVZ: Western Foundation of Vertebrate Zoology; YPM: Yale Peabody Museum; ZMUC: Zoological Museum, Natural History Museum of Denmark.

Mitochondrial loci:

Species Tissue No. 12S 16S ATP6 COI CYTB ND2 ND3

Alophoixus bres 1ANSP 1056 1AF386490 1AF391227 2KT312509 - 3JN827010 4DQ402228 4DQ402289 2AMNH DOT 451 3KU 17701 4LSUMNS B- 38567 Alophoixus finschii 1LSUMNS B- - - - - 1KY404180 1KY454707 1KY454726 88066 Alophoixus flaveolus 1MNHN 4-5M 1AF386478 1AF391215 2KT312498 2JQ173973 ?KJ456190 ?KJ455318 3KT312363 2USNM:Birds: 620490 3USNM:Birds: 620489

Alophoixus frater 1KU 12607 - - - - - 1GU112679 1GU112725

Alophoixus ochraceus 1MNHN 1989- 1AF386482 1AF391219 2KT312419 - 3KY404181 3DQ402229 3DQ402290 90 2MNHN 1939- 495 3LSUMNS B- 38614 Alophoixus pallidus 1MNHN 4-4I 1AF386471 1AF391209 2KT312428 - 3DQ008507 3GQ242078 3GQ242112 2MHNG JF127 3NRM 20046822 Alophoixus phaeocephalus 1ANSP 1058 1AF386491 1AF391228 2KT312520 - 3KY404182 3DQ402230 3DQ402291 2AMNH DOT 15084 3LSUMNS B- 39568 Andropadus curvirostris 1USNM:Birds:6 - - - 1JQ174018 - - - 31688 Andropadus gracilis 1USNM:Birds:6 - - - 1JQ174019 - - - 27329 Andropadus importunus 1UWBM 70425 - - - 1FJ473230 - 2AF003469 3GQ242115 2ZMUC Aimp1773 3NRM 86388 Andropadus latirostris 1USNM:Birds:6 - - - 1JQ174020 - - - 31695 Andropadus virens 1RMCA:124702 - - - 1HQ998164 - - -

Arizelocichla fusciceps chlorigula 1ZMUC O2949 - - - - 1AH005509 1AF003457 -

Arizelocichla fusciceps fusciceps 1ZMUC atfusci - - - - 1AH005510 1AF003458 -

Arizelocichla masukuensis kakamegae 1UWBM 70425 - - - - 1FJ487856 2KY454708 2KY454727 2FMNH 384856

Arizelocichla nigriceps kikuyensis 1ZMUC mr334 - - - - 1AH005516 1AF003463 -

Arizelocichla masukuensis masukuensis 1NRM 86250 - - - - - 1GQ242085 1GQ242120

Arizelocichla milanjensis milanjensis 1FMNH 447363 - - - - - 1KY454709 -

Arizelocichla montana 1ZMUC Amont1a - - - - 1AH005507 1AF003455 -

Arizelocichla neumanni 1ZMUC mr511 - - - - 1AH005504 1AF003456 -

Arizelocichla nigriceps nigriceps 1ZMUC O4422 - - - - 1AH005511 1AF003459 -

Arizelocichla tephrolaema 1AtCb1 - - - - 1AF282785 2GQ242086 2GQ242121 2NRM 20086239 Atimastillas flavicollis 1VH B0720 - - - - 1JX236375 2DQ402205 2DQ402266 2FMNH 391690 Baeopogon clamans 1FMNH 429455 - - - - - 1DQ402203 1DQ402264

Baeopogon indicator 1MHNH 2-49 1AF386465 1AF391205 - - - 2DQ402204 2DQ402265 2FMNH 391689 Berniera madagascariensis 1FMNH 356644 - - - - 1AF199387 2DQ402194 2DQ402255 2CMNH C37768 Bleda canicapillus 1LSUMNS B------1DQ402201 1DQ402262 39389 Bleda eximius 1USNM:Birds: - - - 1JQ174174 - 2DQ402202 2DQ402263 631669 2LSUMNS B- 39580 Bleda notatus 1MNHN 2-13 1AF386474 1AF391203 - - - 2KY454711 2KY454729 2FMNH 396345 Bleda syndactylus 1MNHN 1-02 1AF386466 1AF391204 - 2JQ174177 3AF003435 4AY136592 5AY590757 2USNM:Birds: 630777 3ZMUC mr415 4UMMZ 233829 5O3196 Calyptocichla serinus 1ANSP 11614 - - - - - 1DQ402199 1DQ402260

Cerasophila thompsoni 1LSUMNS B-69586 - - - - - 1KY454712 1KY454730

Chlorocichla falkensteini 1KU 29237 - - - - - 1KY454713 1KY454731

Chlorocichla flaviventris 1ZMUC O1789 - - - - 1AY228053 2GQ242087 1AY590758 2ZMUC 117578 Chlorocichla laetissima 1RBINS:68636 - - - 1HQ998101 - - -

Chlorocichla prigoginei 1RMCA:122299 - - - 1HQ998159 - - -

Chlorocichla simplex 1KU 15767 - - - - - 1KY454714 1KY454732

Criniger barbatus 1MNHN 2-21 1AF386487 1AF391224 - 2JQ174571 - 3DQ402208 3DQ402269 2USNM:Birds: 631589 3FMNH 389346 Criniger calurus 1AMNH PB222 1AF386483 1AF391220 - 2JQ174574 - 3DQ402206 3DQ402267 2USNM:Birds: 631618 3LSUMNS B- 27104 Criniger chloronotus 1Y47 - - 1JX259157 - - 2DQ402207 2DQ402268 2LSUMNS B- 27094 Criniger ndussumensis 1MNHN 3-10 1AF386472 1AF391210 - - - 2DQ402209 2DQ402270 2FMNH 357235 Criniger olivaceus 1AMNH 8275 1AF386488 1AF391225 - - - - -

Eurillas ansorgei 1ANSP 11736 - - - - - 1DQ402195 1DQ402256

Eurillas curvirostris 1ZMUC - - - - ?KC355099 1AH005522 2DQ402259 Acurv3180 2FMNH 384875 Eurillas gracilis 1FMNH 396569 - - - - - 1DQ402196 1DQ402257

Eurillas latirostris 1MNHN 2-52 1AF386467 1AF096477 - - 2DQ008508 2GQ242089 2GQ242124 2NRM 20046809 Eurillas virens 1MNHN 2-01 1AF386477 1AF391214 - - 2AH005519 3DQ402197 3DQ402258 2ZMUC mr415 3FMNH 384863 Hemixos castanonotus 1KU10273 - - - - - 1GU112647 1GU112693

Hemixos cinereus 1LSUMNS B------1DQ402224 1DQ402285 38659 Hemixos flavala 1KU 15140 - - - - ?KJ456298 1GU112648 1GU112694

Hirundo rustica 1UWBM 49276 ?AB042349 ?AB042382 1GU460221 2FJ027653 ?DQ119526 ?HF548562 3DQ402247 2MACN-Or-ct 1844 3LSUMNS B14102 Hypsipetes amaurotis 1HA0127 ?EU167068 - - - 1AB159161 2DQ402222 2DQ402283 2LSUMNS B- 51130 Hypsipetes borbonicus 1BW350 - - 1AY590702 - - - 2AY590733 2BW270 Hypsipetes crassirostris 1BW043 - - 1AY590703 - - - 1AY590741

Hypsipetes everetti 1KU 14184 - - - - - 1GU112651 1GU112697

Hypsipetes guimarasensis 1KU 15281 - - - - - 1GU112658 1GU112704

Hypsipetes leucocephalus 1MNHN 15-60 1AF386481 - - - ?KJ456308 2GU112649 2GU112695 2KU10337 Hypsipetes 1USNM:Birds:6 ?AB042346 ?AB042379 ?AY590704 1JQ175126 - 2DQ402225 2DQ402286 20522 madagascariensis 2FMNH 393308

Hypsipetes mindorensis 1KU 12566 - - - - - 1GU112655 1GU112701

Hypsipetes moheliensis 1BW171 - - 1AY590710 - - - 1AY590737

Hypsipetes olivaceus 1BWM9 - - 1AY590709 - - - 2AY590735 2BWM10

Hypsipetes parvirostris 1BW221 - - 1AY590711 - - - 2AY590739 2BW202 Hypsipetes philippinus 1FMNH 6590 1AF386486 1AF391223 ?AY590713 - - 2GU112654 2GU112700 2KU 11006 Hypsipetes rufigularis 1KU 18213 - - - - - 1GU112660 1GU112706

Hypsipetes siquijorensis 1KU 14448 - - - - - 1GU112661 1GU112707

Iole charlottae 1KU 17765 - - - - - 1GU112690 1GU112736

Iole palawanensis 1KU 12618 - - - - - 1GU112688 1GU112734

Iole propinqua 1MNHN 4-4D 1AF386475 1AF391212 - 2JQ175164 - 3GQ369691 - 2USNM:Birds: 620494 3MNHN 31-81 Iole viridescens 1CAS ORN - - - - - 1KU601856 1KU601803 89534 Ixonotus guttatus 1MNHN 3-3 1AF386470 1AF391208 - - - 2GQ242088 2GQ242123 2NRM 86201 Ixos malaccensis 1LSUMNS B- - - - - 1KY404183 1DQ402227 1DQ402288 51130 Ixos mcclellandii 1MNHN 4-4H 1AF386468 1AF391206 - - 2JX398905 3GQ242079 3GQ242113 2AMNH:299482 3NRM 20046796 Ixos philippinus 1AL28 - - - 1EU541456 - - -

Ixos virescens 1KU 31141 - - - - - 1KY454715 1KY454733

Malia grata 1AMNH:299482 - - - - 1JX398905 1JX398921 1JN826851

Microscelis amaurotis - - - ?AB765881 - - -

Neolestes torquatus 1RMCA:111532 - - - 1HQ998140 - 1GQ242083 1GQ242117 2NRM 71084 Nicator chloris 1USNM:Birds: - - - 1JQ175554 2JX236396 3DQ402188 3DQ402249 627315 2VH A1439 3FMNH 391705 Phyllastrephus albigula 1ZMUC 132973 - - - - - 1HQ716722 -

Phyllastrephus albigularis 1MNHN 1-42 1AF386476 1AF391213 - - - 2DQ402210 2DQ402271 2LSUMNS B- 39597 Phyllastrephus baumanni 1YPM ORN - - - - - 1KY454716 - 076960 Phyllastrephus 1VH A1600 - - - - 1JX236400 - -

cerviniventris

Phyllastrephus debilis 1FMNH 356727 - - - - 1AF199383 1DQ402213 1DQ402274

Phyllastrephus fischeri 1FMNH 384931 - - - - - 1DQ402216 1DQ402277

Phyllastrephus flavostriatus flavostriatus 1ZMUC O3470 - - - - 1AH005523 2DQ402214 2DQ402275 2FMNH 384938

Phyllastrephus fulviventris 1YPM ORN - - - - - 1KY454717 - 095391 Phyllastrephus hypochloris 1FMNH 384936 - - - - - 1DQ402215 1DQ402276

Phyllastrephus icterinus 1MNHN E-22 1AF386469 1AF391207 - 2JQ175790 3AF199382 4DQ402211 4DQ402272 2USNM:Birds: 631564 3FMNH 357201 4LSUMNS B- 27097 Phyllastrephus lorenzi 1RMCA:106134 - - - 1HQ998127 - - -

Phyllastrephus cabanisi placidus 1LSUMNS B-21133 - - - - - 1DQ402212 1DQ402273

Phyllastrephus poensis 1NRM - - - - - 1GQ242084 1GQ242118 20086270 Phyllastrephus scandens 1FMNH 389335 - - - - 1AF199381 2DQ402218 2DQ402279 2FMNH 389336 Phyllastrephus terrestris 1FMNH 390098 - - - - - 1DQ402217 1DQ402278

Phyllastrephus xavieri 1FMNH 357212 - - - - - 1DQ402219 ?AY590761

Pycnonotus atriceps 1MNHN 4-3C 1AF386484 1AF391221 - - 2KY404184 2DQ402231 2DQ402292 2LSUMNS B- 36320 Pycnonotus aurigaster 1KU 23510 - - - - - 1KY454718 1KY454734

Pycnonotus barbatus 1MNHN 2-21 1AF386479 1AF391216 - 2FJ473231 2FJ487857 3DQ402232 3DQ402293 2UWBM 70428 3LSUMNS B- 39494 Pycnonotus bimaculatus 1MP-PlzV - - - - - 1KP943466 -

Pycnonotus blanfordi 1USNM:Birds:6 - - - 1JQ176058 - 2KY454719 2KY454735 20333 2KU 23346

Pycnonotus brunneus 1Sarawak- - 1AY574891 - 2FJ473091 2FJ487694 3DQ402233 3DQ402294 REBF1 2MZB DM33 3LSUMNS B- 36341 Pycnonotus cafer 1AV21 1KF289822 1KF289828 - 2JF498896 ?KJ456440 ?KJ455616 - 2USNM:643446 Pycnonotus cinereifrons 1KU12660 - - - - - 1GU112684 1GU112730

Pycnonotus cyaniventris 1LSUMNS B- - - - - 1KY404185 1DQ402234 1DQ402295 36425 Pycnonotus melanicterus 1MP_2380 - - - - - 1KP943427 -

dispar

Pycnonotus 1MZB SPS27 - - - - 1FJ487794 2DQ402235 2DQ402296 2LSUMNS B- erythropthalmos 36378

Pycnonotus eutilotus 1Sarawak- - 1AY574900 - - 2KY404186 2DQ402236 2DQ402297 BPBBM1 2LSUMNS B- 36313 Pycnonotus finlaysoni 1MNHN 4-3I 1AF386473 1AF391211 - 2FJ473154 2FJ487764 3GQ242076 3GQ242110 2NRM 20026612 3NRM 20046850 Pycnonotus flavescens 1KU 17746 - - - - 1KY404187 1GU112669 1GU112715

Pycnonotus melanicterus flaviventris 1AHNU:A0014 1KJ186975 1KJ186975 1KJ186975 1KJ186975 1KJ186975 1KJ186975 1KJ186975

Pycnonotus goiavier 1PNM 19900 - - - 1FJ473057 2FJ487665 3DQ402237 3DQ402298 2MZB BS03 3LSUMNS B- 36351 Pycnonotus melanicterus 1USNM585513 - - - - - 1KP943372 -

gularis

Pycnonotus jocosus 1MNHN 4-4B 1AF386480 1AF391217 - 2GU170351 ?KJ456441 3GQ242077 3GQ242111 2S00101 3NRM 20046820 Pycnonotus leucogenys 1PSRCEW-BB- - - - 1HQ168045 - 2DQ402241 2DQ402302 WCB01 2FMNH 347847 Pycnonotus 1MP-PlzI - - - - - 1KP943465 -

leucogrammicus

Pycnonotus leucotis 1MP_2498 - - - - - 1KP943428 -

Pycnonotus luteolus 1USNM:Birds: - - - - - 1KP943367 - 585521 Pycnonotus melanicterus 1YPM ORN - - - - - 1KY454720 - 034088 melanicterus

Pycnonotus melanoleucos 1LSUMNS B------1DQ402242 1DQ402303 50353 Pycnonotus melanicterus 1LSUMNS B- - - - - 1KY404188 2DQ402243 2DQ402304 52674 montis 2LSUMNS B- 51038 Pycnonotus nigricans 1LSUMNS B------1DQ402238 1DQ402299 34228

Pycnonotus plumosus 1Sarawak- - 1AY574884 - ?FJ473241 2FJ487748 3DQ402239 3DQ402300 OWBW1 2LSUMNS B- 46965 3LSUMNS B- 23354 Pycnonotus priocephalus 1YPM ORN - - - - - 1KY454721 - 032052 Pycnonotus simplex 1LSUMNS B- - - - 1FJ473143 1FJ487754 2GU112671 2GU112717 47170 2KU 17777 Pycnonotus sinensis 1NSMT:A17263 ?AB042361 ?AB042394 ?GU475148 1AB843108 2FJ487714 3GU112672 3GU112718 2AMNH DOT 5237 3KU 10313 Pycnonotus squamatus 1LSUMNS B- - - - - 1KY404189 1KY454722 2GU112737 88116 2WFVZ 38646 Pycnonotus striatus 1CAS ORN - - - - 1KJ456444 1KJ455620 - 95901 Pycnonotus taivanus ?FJ378536 ?FJ378536 ?FJ378536 ?FJ378536 ?FJ378536 ?FJ378536 ?FJ378536

Pycnonotus barbatus tricolor 1FMNH 384925 - - - - - 1KY454723 1KY454736

Pycnonotus urostictus 1PNM 19890 - - - 1EU541463 - 2KJ931003 ?AY590762 2KU 14038 Pycnonotus xanthopygos 1MP_5004 - - - - - 1KP943432 -

Pycnonotus xanthorrhous 1MNHN 14-19 1AF386485 1AF391222 - - - 2GU112677 2GU112723 2KU 11153 Pycnonotus zeylanicus 1LSUMNS B------1DQ402240 1DQ402301 23321

Setornis criniger 1LSUMNS B- - - - - 1KY404190 1DQ402221 1DQ402282 23359 Spizixos canifrons 1CAS ORN - - - - - 1KY454724 1KY454737 6462 Spizixos semitorques 1AHNU:Y06039 - - - 1FJ661100 ?JF509584 2DQ402244 2DQ402305 2LSUMNS B- 20554 Stelgidillas gracilirostris 1NRM 86447 - - - - - 1GQ242082 1GQ242116

Thapsinillas affinis 1YPM ORN - - - - - 1KY454725 - 075448 Thescelocichla leucopleura 1ANSP 11476 - - - - - 1DQ402200 1DQ402261

Tricholestes criniger 1ANSP 1179 1AF386489 1AF391226 - - 2KY404191 2DQ402223 2DQ402284 2LSUMNS B- 38601

Nuclear loci:

Species Tissue No. FIB-I5 FIB-I7 GAPDH-I11 MB-I2 ODC-I6/I7 RAG1 TGFB2-I5

Alophoixus bres 2AMNH DOT - 4DQ402319 2KT311996 - 2KT312713 - 4GU112563 451 4LSUMNS B- 38567 Alophoixus finschii 1LSUMNS B- - - - 1KY454741 - - 1KY454738 88066 Alophoixus flaveolus 3USNM:Birds: - - 3KT311974 ?KJ454749 3KT312691 ?KJ455972 - 620489 Alophoixus frater 1KU 12607 - 1GU112633 - - - - 1GU112587

Alophoixus ochraceus 2MNHN 1939- - 3DQ402322 2KT311798 - 4KT312551 - 3GU112564 495 3LSUMNS B- 38614 4MNHN 1989- 90 Alophoixus pallidus 2MHNG JF127 3EF626743 4GU112600 2KT311828 3EF625281 2KT312565 - 4GU112531 3NRM 20046822 4KU 10111 Alophoixus phaeocephalus 2AMNH DOT - 3DQ402323 2KT312018 - 2KT312735 - 3GU112565 15084 3LSUMNS B- 39568 Andropadus importunus 3NRM 86388 4EF626713 3GQ242059 - 4EF625252 4EF625302 - - 4ZMUC 117563 Arizelocichla fusciceps 2ZMUC 136603 2EF626703 - - 2EF625243 2EF625292 - -

chlorigula

Arizelocichla fusciceps fusciceps 2ZMUC 136603 2EF626700 - - 2EF625240 2EF625289 - -

Arizelocichla masukuensis kakamegae 2FMNH 384856 - - - 2KY454742 - - -

Arizelocichla nigriceps kikuyensis 2ZMUC 123194 2EF626701 - - 2EF625241 2EF625290 - -

Arizelocichla masukuensis masukuensis 1NRM 86250 2EF626698 1GQ242064 - 2JX236344 2EF625287 2JX236416 - 2ZMUC 117572

Arizelocichla milanjensis milanjensis 1FMNH 447363 - - - 1KY454743 - - -

Arizelocichla neumanni 2ZMUC 136084 2EF626702 - - 2EF625242 2EF625291 - -

Arizelocichla milanjensis olivaceiceps 1FMNH MLW 985 1EF626704 - - 1EF625244 1EF625293 - -

Arizelocichla milanjensis striifacies 2FMNH 356720 - - - 2KY454744 - - -

Arizelocichla tephrolaema 2NRM 2GQ242044 2GQ242065 - 2GQ242109 2GQ242147 - - 20086239 Atimastillas flavicollis 2FMNH 391690 3EF626721 2DQ402318 4JX236287 4JX236346 3EF625310 - - 3FMNH 429738 4VH A1575 Baeopogon clamans 1FMNH 429455 1EF626716 - - 1EF625256 1EF625305 - -

Baeopogon indicator 2FMNH 391689 2EF626717 2DQ402317 - 2EF625255 2EF625306 - -

Berniera madagascariensis 2CMNH - 2DQ402338 3HQ333100 3HQ333071 3HQ333086 3JX236419 - C37768 3FMNH 431202 Bleda canicapillus 1LSUMNS B- 2GQ242043 1DQ402315 - 2GQ242108 2GQ242146 - 1GU112528 39389 2NRM 86188

Bleda eximius 2LSUMNS B- - 2DQ402316 - - - - - 39580 Bleda notatus 2FMNH 396345 3EF626739 - - 2KY454745 3EF625328 - - 3FMNH 429498 Bleda syndactylus 6ZMUC119088 6EF626738 7GQ242063 - 6EF625276 6EF625327 8AY319976 - 7NRM 86179 8AMNH (PRS2036) Calyptocichla serinus 1ANSP 11614 2EF626715 1DQ402314 - 2EF625254 2EF625304 - 3GU112566 2ANSP 11799 3KU 11614 Cerasophila thompsoni 1LSU B-69586 - - - 1KY454746 - - -

Chlorocichla falkensteini 1KU 29237 - - - 1KY454747 - - -

Chlorocichla flaviventris 1ZMUC O1789 3EF626720 2GQ242066 - 1AY228290 3EF625309 1AY228009 - 2ZMUC 117578 3ZMUC123467 Chlorocichla simplex 1KU 15767 - - - 1KY454748 - - -

Criniger barbatus 4ZMUC 122882 4EF626740 - - 4EF625278 4EF625329 - -

Criniger calurus 1AMNH PB222 - - - 1DQ125947 - - 3GU112529 3LSUMNS B- 27104 Criniger chloronotus 2LSUMNS B- - 2DQ402321 - 2JX259184 - - - 27094 Criniger ndussumensis 2FMNH 357235 3EF626741 2DQ402324 - 3EF625279 3EF625330 - - 3FMNH 429738 Eurillas ansorgei 1ANSP 11736 1EF626708 1DQ402313 - 1EF625246 1EF625297 - -

Eurillas curvirostris 2FMNH 384875 3EF626706 2DQ402359 - 3EF625247 3EF625295 - - 3ZMUC 119035

Eurillas gracilis 1FMNH 396569 1EF626709 1DQ402357 - 1EF625245 1EF625298 - -

Eurillas latirostris 2NRM 2EF626710 2GQ242068 - - 2EF625299 - - 20046809 Eurillas virens 3FMNH 384863 4EF626711 3DQ402358 - 4EF625250 4EF625300 - - 4ZMUC 118832 Hemixos castanonotus 1KU 10273 - 1GU112601 - - - - 1GU112532

Hemixos cinereus 1LSUMNS B- 2GQ242038 - - 2GQ242104 2GQ242141 - 1GU112567 38659 2NRM 86032 Hemixos flavala 1KU 15140 - 1GU112602 - - - - 1GU112533

Hirundo rustica 3LSUMNS B- 4EF626748 3DQ402309 5HQ333105 4AY064258 4EF441240 4AY064271 - 14102 4NRM 976238 5VH A1574 Hypsipetes amaurotis 3NRM 85997 3GQ242037 4DQ402325 - 3GQ242103 3GQ242140 - 4GU112570 4LSUMNS B- 16985 Hypsipetes everetti 1KU 14184 - 1GU112605 - - - - 1GU112536

Hypsipetes guimarasensis 1KU 15281 - 1GU112612 - - - - 1GU112543

Hypsipetes leucocephalus 2KU 10337 3GQ242040 2GU112603 - 3GQ242105 3GQ242143 - 2GU112534 3NRM 20047104 Hypsipetes 2FMNH 393308 - 2DQ402328 - - - - 2GU112568

madagascariensis

Hypsipetes mindorensis 1KU 12566 - 1GU112609 - - - - 1GU112540

Hypsipetes philippinus 2KU 11006 3EF626742 2GU112608 - 3EF625280 3EF625331 - 2GU112539 3ZMUC 121498 Hypsipetes rufigularis 1KU 18213 - 1GU112614 - - - - 1GU112545

Hypsipetes siquijorensis 1KU 14448 - 1GU112615 - - - - 1GU112546

Iole charlottae 1KU 17765 2GQ242036 3DQ402354 - 2GQ242102 2GQ242139 - 1GU112598 2NRM 86056 3LSUMNS B- 51043 Iole palawanensis 1KU12618 - 1GU112642 - - - - 1GU112596

Iole propinqua 3MNHN 31-81 - - - 3GQ369646 - - -

Iole viridescens 1CAS ORN - 1KU601909 - - - - - 89534 Ixonotus guttatus 2NRM 86201 3EF626718 2GQ242067 - 3EF625257 3EF625307 - - 3ANSP11726 Ixos malaccensis 1LSUMNS B- - 1DQ402329 - - - - 1GU112569 51130 Ixos mcclellandii 3NRM 3GQ242039 4GU112607 5KJ455060 3DQ008558 3GQ242142 5KJ456057 4GU112538 20046796 4KU 10269 5CAS ORN 95882 Ixos philippinus - - - - - ?EF568257 -

Ixos virescens 1KU 31141 - - - 1KY454749 - - -

Neolestes torquatus 2NRM 71084 2GQ242041 2GQ242061 - 2GQ242106 2GQ242144 - -

Nicator chloris 4ZMUC 119452 4EU680666 - - 4EU680603 4EU680745 5AY319991 - 5AMNH (PRS2029) Phyllastrephus albigula 1ZMUC 132973 1HQ716832 ------

Phyllastrephus albigularis 2LSUMNS B- 3EF626734 2DQ402330 - 3EF625272 3EF625323 - - 39597 3ZMUC 136994 Phyllastrephus 1FMNH 444178 1EF626736 - - 1EF625274 1EF625325 - -

flavostriatus alfredi

Phyllastrephus cabanisi cabanisi 1A92037 1EF626728 - - 1EF625267 1EF625317 - -

Phyllastrephus cerviniventris 1VH A1600 2EF626726 - - 1JX236363 2EF625315 1JX236447 - 2FMNH MLW 1108

Phyllastrephus debilis 1FMNH 356727 2EF626737 1DQ402335 ?HQ716929 2EF625275 2EF625326 - - 2ZMUC 119477 Phyllastrephus fischeri 1FMNH 384931 2EF626730 1DQ402339 - 2EF625266 2EF625319 - - 2ZMUC 119469 Phyllastrephus 2FMNH 384938 3EF626735 2DQ402336 - 3EF625273 3EF625324 - - 3ZMUC 119260 flavostriatus flavostriatus

Phyllastrephus hypochloris 1FMNH 384936 1EF626727 1DQ402337 - 1EF625265 1EF625316 - -

Phyllastrephus icterinus 4LSUMNS B- 5EF626731 4DQ402331 - 5EF625269 5EF625320 - 4GU112530 27097 5ZMUC 136974

Phyllastrephus lorenzi 2ZMUC 136984 2EF626732 - - 2EF625270 2EF625321 - -

Phyllastrephus cabanisi placidus 1LSUMNS B-21133 2EF626729 1DQ402332 - 2EF625268 2EF625318 - - 2ZMUC 120659

Phyllastrephus poensis 1NRM 1GQ242042 1GQ242062 - 1GQ242107 1GQ242145 - - 20086270 Phyllastrephus scandens 1FMNH 389335 1EF626723 - - 1EF625261 1EF625312 - -

Phyllastrephus strepitans 1ZMUC 122585 1EF626725 - - 1EF625263 1EF625314 - -

Phyllastrephus terrestris 2N2-22112005 2EF626724 - - 2EF625262 3EF625313 - - 3BE 18126 Phyllastrephus xavieri 1FMNH 357212 2EF626733 1DQ402340 - 2EF625271 2EF625322 - - 2ZMUC 136878 Pycnonotus atriceps 2LSUMNS B- 3GQ242029 2DQ402341 - 3GQ242096 3GQ242132 - 2GU112571 36320 3NRM 86511 Pycnonotus aurigaster 1KU 23510 - - - 1KY454750 - - -

Pycnonotus barbatus 1MNHN 2-21 4EF626746 3DQ402342 1FJ357922 4EF625284 4EF625335 5AY057027 3GU112572 2UWBM 70428 3LSUMNS B- 39494 4ZMUC 117636 5AMNH 24822 (PRS2125) Pycnonotus blanfordi 2KU 23346 - 3KP943348 - 2KY454751 - - 3KP943473 3PSUZC-AV 2014.25 Pycnonotus brunneus 3LSUMNS B- - 3DQ402343 - - - - 3GU112573 36341

Pycnonotus cafer - - ?KJ455154 ?KJ454895 ?KJ455904 ?KJ456130 -

Pycnonotus capensis 11-10102005 1EF626745 - - 1EF625283 1EF625334 - -

Pycnonotus cinereifrons 1KU12660 - 1GU112638 - - - - 1GU112592

Pycnonotus cyaniventris 1LSUMNS B- - 1DQ402344 - - - - 1GU112574 36425 Pycnonotus 2LSUMNS B- - 2DQ402345 - - - - 2GU112575 36378 erythropthalmos

Pycnonotus eutilotus 2LSUMNS B- 3GQ242030 2DQ402346 - 3GQ242097 3GQ242133 - 2GU112576 36313 3NRM 86490 Pycnonotus finlaysoni 3NRM 3GQ242033 3GQ242053 - 3GQ242100 3GQ242136 - - 20046850 Pycnonotus flavescens 1KU 17746 - 1GU112623 - - - - 1GU112554

Pycnonotus melanicterus 2NRM 2GQ242032 - - 2GQ242099 2GQ242135 - - 20046769 flaviventris

Pycnonotus goiavier 3LSUMNS B- - 3DQ402347 - - - - 3KY454739 36351 Pycnonotus jocosus 3NRM 3GQ242034 4GU112624 ?KJ455155 3DQ008557 3GQ242137 ?KJ456131 4GU112555 20046820 4KU 10347 Pycnonotus leucogenys 2FMNH 347847 - 2DQ402351 ?KJ455156 ?KJ454896 ?KJ455905 ?EF568258 2GU112578

Pycnonotus melanicterus melanicterus 2LSUMZ B-52674 - - - - - 2KJ456132 -

Pycnonotus melanoleucos 1LSUMNS B- - 1DQ402352 - - - - 1GU112579 50353 Pycnonotus melanicterus 1LSUMNS B- - 2DQ402353 1KJ455157 - - - - 52674 montis 2LSUMNS B- 51038 Pycnonotus nigricans 1LSUMNS B- - 1DQ402348 - - - - 1GU112580 34228 Pycnonotus plumosus 3LSUMNS B- 4JN826138 3DQ402349 - - - - 4JN826387 23354 4KU 12667 Pycnonotus simplex 2KU 17777 - 2GU112625 - - - - 2GU112556

Pycnonotus sinensis 2KU 17777 - 2GU112626 - - - - 2GU112557

Pycnonotus squamatus 1LSUMNS B- - 2GU112645 - - - - 1KY454740 88116 2WFVZ38646 Pycnonotus striatus 1CAS ORN - - 1KJ455158 1KJ454898 1KJ455907 1KJ456133 - 95901 Pycnonotus barbatus 1FMNH 384925 - - - 1KY454752 - - -

tricolor

Pycnonotus urostictus 3ZMUC 119539 3EF626744 4GU112629 - 3EF625282 3EF625333 - 4GU112560 4KU 14038

Pycnonotus xanthorrhous 2KU 11153 - 2GU112631 - - - - 2GU112562

Pycnonotus zeylanicus 1LSUMNS B- - 1DQ402350 - - - - 1GU112582 23321 Setornis criniger 1LSUMNS B- - 1DQ402355 - - - - 1GU112583 23359 Spizixos semitorques 2LSUMNS B- 3GQ242031 2DQ402356 - 3GQ242098 3GQ242134 - 2GU112584 20554 3NRM 20086548 Stelgidillas gracilirostris 1NRM 86447 2EF626705 1GQ242060 - 2EF625249 2EF625294 - - 2ANSP 11447 Thescelocichla leucopleura 1ANSP 11476 2EF626722 1DQ402312 - 2EF625260 2EF625311 - - 2ANSP 11716

Tricholestes criniger 2LSUMNS B- 3GQ242035 2DQ402326 - 3GQ242101 3GQ242138 - 2GU112585 38601 3NRM 86086

APPENDIX B. COPYRIGHT INFORMATION FOR CHAPTER 2

REFERENCES

Aberer, Andre J., Denis Krompass, and Alexandros Stamatakis. 2013. “Pruning Rogue Taxa Improves Phylogenetic Accuracy: An Efficient Algorithm and Webservice.” Systematic Biology 62 (1): 162–66.

Austerlitz, Frederic, Olivier David, Brigitte Schaeffer, Kevin Bleakley, Madalina Olteanu, Raphael Leblois, Michel Veuille, et al. 2009. “DNA Barcode Analysis: A Comparison of Phylogenetic and Statistical Classification Methods.” BMC Bioinformatics 10 (Suppl 14): S10.

Badyaev, Alexander V., Alexander B. Posner, Erin S. Morrison, and Dawn M. Higginson. 2019. “Cycles of External Dependency Drive Evolution of Avian Carotenoid Networks.” Nature Communications 10 (1): 1–10.

Bangs, Outram, and James L. Peters. 1927. “Birds from Maratua Island, off the East Coast of Borneo.” Occasional Papers of the Boston Society of Natural History 5: 235–42.

Barsh, Gregory S. 1996. “The Genetics of Pigmentation: From Fancy Genes to Complex Traits.” Trends in Genetics 12 (8): 299–305.

Barton, N. H., and B. Charlesworth. 1984. “Genetic Revolutions, Founder Effects, and Speciation.” Annual Review of Ecology and Systematics 15: 133–64.

Beresford, P., F. K. Barker, P. G. Ryan, and T. M. Crowe. 2005. “African Endemics Span the Tree of Songbirds (Passeri): Molecular Systematics of Several Evolutionary ‘Enigmas’.” Proceedings of the Royal Society B: Biological Sciences 272 (1565): 849–58.

Berry, S. D., S. R. Davis, E. M. Beattie, N. L. Thomas, A. K. Burrett, H. E. Ward, A. M. Stanfield, et al. 2009. “Mutation in Bovine β-Carotene Oxygenase 2 Affects Milk Color.” Genetics 182 (3): 923–26.

Brandley, Matthew C., Takeo Kuriyama, and Masami Hasegawa. 2014. “Snake and Bird Predation Drive the Repeated Convergent Evolution of Correlated Life History Traits and Phenotype in the Izu Island Scincid Lizard (Plestiodon latiscutatus).” PLoS ONE 9 (3).

Burner, Ryan C., Subir B. Shakya, Tri Haryoko, Mohammad Irham, Dewi Malia Prawiradilaga, and Frederick H. Sheldon. 2018. “Ornithological Observations from Maratua and Bawean Islands, Indonesia.” Treubia 45 (December): 11–24.

Campagna, Leonardo, Márcio Repenning, Luís Fábio Silveira, Carla Suertegaray Fontana, Pablo L. Tubaro, and Irby J. Lovette. 2017. “Repeated Divergent Selection on Pigmentation Genes in a Rapid Finch Radiation.” Science Advances 3 (5): e1602404.

Carson, H. L. 1968. “The Population Flush and Its Genetic Consequences.” In Population Biology and Evolution, 123–37. Syracuse, NY: Syracuse Univ. Press.

Carson, H. L., and A. R. Templeton. 1984. “Genetic Revolutions in Relation to Speciation Phenomena: The Founding of New Populations.” Annual Review of Ecology and Systematics 15: 97–131.

Carson, H. L. 1975. “The Genetics of Speciation at the Diploid Level.” The American Naturalist 109 (965): 83–92.

Castresana, J. 2000. “Selection of Conserved Blocks from Multiple Alignments for Their Use in Phylogenetic Analysis.” Molecular Biology and Evolution 17 (4): 540–52.

Chesser, R. T. 1999. “Molecular Systematics of the Rhinocryptid Genus Pteroptochos.” The Condor 101: 439–46.

Chng, Serene C. L., Chris R. Shepherd, and James A. Eaton. 2018. “In the Market for Extinction: Birds for Sale at Selected Outlets in Sumatra.” TRAFFIC Bulletin 30 (1): 15–22.

Chua, Vivien L., Quentin Phillipps, Haw Chuan Lim, Sabrina S. Taylor, Dency F. Gawin, Abdul Rahman, Robert G. Moyle, and Frederick H. Sheldon. 2015. “Phylogeography of Three Endemic Birds of Maratua Island, a Potential Archive of Bornean Biogeography.” Raffles Bulletin of Zoology 63 (June): 259–69.

Cibois, Alice, Normand David, Steven M. S. Gregory, and Eric Pasquet. 2010. “Bernieridae (Aves: Passeriformes): A Family-Group Name for the Malagasy Sylvoid Radiation.” Zootaxa 2554: 65–68.

Cibois, Alice, Beth Slikas, T. S. Schulenberg, and Eric Pasquet. 2001. “An Endemic Radiation of Malagasy Songbirds is Revealed by Mitochondrial DNA Sequence Data.” Evolution 55 (6): 1198–1206.

Clegg, Sonya M., Sandie M. Degnan, Jiro Kikkawa, Craig Moritz, Arnaud Estoup, and Ian P. F. Owens. 2002. “Genetic Consequences of Sequential Founder Events by an Island-Colonizing Bird.” Proceedings of the National Academy of Sciences 99 (12): 8127–32.

Coffman, Alec J., Ping Hsun Hsieh, Simon Gravel, and Ryan N. Gutenkunst. 2016. “Computationally Efficient Composite Likelihood Statistics for Demographic Inference.” Molecular Biology and Evolution 33 (2): 591–93.

Collar, Nigel J., James A. Eaton, and R. O. Hutchinson. 2013. “Species Limits in the Golden Bulbul Alophoixus (Thapsinillas) affinis Complex.” Forktail 29: 19–24.

Collar, Nigel J., and J. D. Pilgrim. 2007. “Species-Level Changes Proposed for Asian Birds, 2005–2006.” BirdingASIA 8: 14–30.

Corlett, Richard T. 1998. “Frugivory and Seed Dispersal by Vertebrates in the Oriental (Indomalayan) Region.” Biological Reviews 73 (4): 413–48.

Coyne, Jerry A. 1992. “Genetics and Speciation.” Nature 355 (6360): 511–15.

Craddock, Elysse M., and Walter E. Johnson. 1979. “Genetic Variation in Hawaiian Drosophila. V. Chromosomal and Allozymic Diversity in Drosophila silvestris and its Homosequential Species.” Evolution 33 (1): 137–55.

Danecek, Petr, Adam Auton, Goncalo Abecasis, Cornelis A. Albers, Eric Banks, Mark A. DePristo, Robert E. Handsaker, et al. 2011. “The Variant Call Format and VCFtools.” Bioinformatics 27 (15): 2156–58.

Darriba, Diego, Michael Weiß, and Alexandros Stamatakis. 2016. “Prediction of Missing Sequences and Branch Lengths in Phylogenomic Data.” Bioinformatics 32 (9): 1331–37.

Dejtaradol, Ariya, Swen C. Renner, Sunate Karapan, Paul J. J. Bates, Robert G. Moyle, and Martin Päckert. 2015. “Indochinese-Sundaic Faunal Transition and Phylogeographical Divides North of the Isthmus of Kra in Southeast Asian Bulbuls (Aves: Pycnonotidae).” Journal of Biogeography 43 (3): 471–83.

Dickinson, E. C., and Les Christidis. 2014. The Howard and Moore Complete Checklist of the Birds of the World, 4th Edition, Vol. 2: Passerines.

Dickinson, E. C., and S. M. S. Gregory. 2002. “Systematic Notes on Asian Birds. 24. On the Priority of the Name Hypsipetes Vigors, 1831, and the Division of the Broad Genus of That Name.” Zoologische Verhandelingen 340 (1960): 75–92.

Earl, Dent A., and Bridgett M. vonHoldt. 2012. “STRUCTURE HARVESTER: A Website and Program for Visualizing STRUCTURE Output and Implementing the Evanno Method.” Conservation Genetics Resources 4 (2): 359–61.

Eaton, James A., Sebastianus van Balen, Nick W. Brickle, and Frank E. Rheindt. 2016. Birds of the Indonesian Archipelago: Greater Sundas and Wallacea. Lynx.

Eaton, James A., Chris R. Shepherd, Frank E. Rheindt, J. B. C. Harris, Bas van Balen, David S. Wilcove, and Nigel J. Collar. 2015. “Trade-Driven Extinctions and near-Extinctions of Avian Taxa in Sundaic Indonesia.” Forktail 31: 1–12.

Edgar, Robert C. 2004. “MUSCLE: Multiple Sequence Alignment with High Accuracy and High Throughput.” Nucleic Acids Research 32 (5): 1792–97.

Ellegren, Hans, Linnéa Smeds, Reto Burri, Pall I. Olason, Niclas Backström, Takeshi Kawakami, Axel Künstner, et al. 2012. “The Genomic Landscape of Species Divergence in Ficedula Flycatchers.” Nature 491 (7426): 756–60.

Evanno, G., S. Regnaut, and J. Goudet. 2005. “Detecting the Number of Clusters of Individuals Using the Software Structure: A Simulation Study.” Molecular Ecology 14 (8): 2611–20.

Faircloth, Brant C. 2016. “PHYLUCE Is a Software Package for the Analysis of Conserved Genomic Loci.” Bioinformatics 32 (5): 786–88.

Feng, S., J. Stiller, Y. Deng, J. Armstrong, Q. Fang, A. H. Reeve, D. Xie, et al. 2020. “Densely Sampling Genomes across the Diversity of Birds Increases Power of Comparative Genomics Analyses.” Nature (submitted).

Fishpool, L. D. C., and J. A. Tobias. 2005. “Family Pycnonotidae.” In Handbook of Birds of the World Vol. 10, edited by J. del Hoyo, A. Elliott, and J. Sargatal, 124–251. Lynx Edicions/Birdlife International.

Fooden, Jack. 1995. “Systematic Review of Southeast Asian Longtail Macaques, Macaca fascicularis (Raffles, [1821]).” Fieldiana: Zoology New Series 81: 1–206.

Foster, J. Bristol. 1964. “Evolution of Mammals on Islands.” Nature 202 (4929): 234–35.

Frankham, R. 1997. “Do Island Populations Have Less Genetic Variation than Mainland Populations?” Heredity 78 (3): 311–27.

Fregin, Silke, Martin Haase, Urban Olsson, and Per Alström. 2012. “New Insights into Family Relationships within the Avian Superfamily Sylvioidea (Passeriformes) Based on Seven Molecular Markers.” BMC Evolutionary Biology 12 (1): 157.

Fuchs, Jérôme, Per G. P. Ericson, Céline Bonillo, Arnaud Couloux, and Eric Pasquet. 2015. “The Complex Phylogeography of the Indo-Malayan Alophoixus Bulbuls with the Description of a Putative New Ring Species Complex.” Molecular Ecology 24: 5460–71.

Galeotti, Paolo, D. Rubolini, P. O. Dunn, and M. Fasola. 2003. “Colour Polymorphism in Birds: Causes and Functions.” Journal of Evolutionary Biology 16 (4): 635–46.

Gao, Guangqi, Meng Xu, Chunling Bai, Yulan Yang, Guangpeng Li, Junyang Xu, Zhuying Wei, et al. 2018. “Comparative Genomics and Transcriptomics of Chrysolophus Provide Insights into the Evolution of Complex Plumage Coloration.” GigaScience 7 (10): 1–14.

Gavrilets, Sergey, and Alan Hastings. 1996. “Founder Effect Speciation: A Theoretical Reassessment.” The American Naturalist 147 (3): 466–91.

Gill, Frank B., and D. Donsker. 2005. “IOC World Bird List (v5.4).”

Glenn, Travis C., Roger A. Nilsen, Troy J. Kieran, Jon G. Sanders, Natalia J. Bayona-Vásquez, John W. Finger, Todd W. Pierson, et al. 2019. “Adapterama I: Universal Stubs and Primers for 384 Unique Dual-Indexed or 147,456 Combinatorially-Indexed Illumina Libraries (ITru & INext).” PeerJ 7 (e7755).

Grant, P. R. 2001. “Reconstructing the Evolution of Birds on Islands: 100 Years of Research.” Oikos 92 (3): 385–403.

Gutenkunst, Ryan N., Ryan D. Hernandez, Scott H. Williamson, and Carlos D. Bustamante. 2009. “Inferring the Joint Demographic History of Multiple Populations from Multidimensional SNP Frequency Data.” PLoS Genetics 5 (10).

Hackett, S. J. 1996. “Molecular Phylogenetics and Biogeography of Tanagers in the Genus Ramphocelus (Aves).” Molecular Phylogenetics and Evolution 5 (2): 368–82.

Hill, Geoffrey E., and Kevin J. McGraw. 2006. Bird Coloration, Volume 1: Mechanisms and Measurements. Cambridge, MA: Harvard University Press.

Hoekstra, H. E. 2006. “Genetics, Development and Evolution of Adaptive Pigmentation in Vertebrates.” Heredity 97 (3): 222–34.

Hoogerwerf, A. 1966. “Notes on the Island of Bawean (Java Sea) with Special Reference to the Birds.” The Natural History Bulletin of the Siam Society 21: 313–40.

Hosner, Peter A., Edward L. Braun, and Rebecca T. Kimball. 2016. “Rapid and Recent Diversification of Curassows, Guans, and Chachalacas (Galliformes: Cracidae) out of Mesoamerica: Phylogeny Inferred from Mitochondrial, Intron, and Ultraconserved Element Sequences.” Molecular Phylogenetics and Evolution 102: 320–30.

Huelsenbeck, John P., and Fredrik Ronquist. 2001. “MRBAYES: Bayesian Inference of Phylogeny.” Bioinformatics 17: 754–55.

Hughes, Austin L. 2010. “Reduced Microsatellite Heterozygosity in Island Endemics Supports the Role of Long-Term Effective Population Size in Avian Microsatellite Diversity.” Genetica 138 (11): 1271–76.

Hume, Allan Octavian. 1878. “A Revised List of the Birds of Tenasserim.” Stray Feathers 6: 1–572.

Huntley, Jerry W., and Gary Voelker. 2016. “Cryptic Diversity in Afro-Tropical Lowland Forests: The Systematics and Biogeography of the Avian Genus Bleda.” Molecular Phylogenetics and Evolution 99: 297–308.

Irwin, Darren E., Borja Milá, David P. L. Toews, Alan Brelsford, Haley L. Kenyon, Alison N. Porter, Christine Grossen, Kira E. Delmore, Miguel Alcaide, and Jessica H. Irwin. 2018. “A Comparison of Genomic Islands of Differentiation across Three Young Avian Species Pairs.” Molecular Ecology 27 (23): 4839–55.

Jombart, Thibaut. 2008. “Adegenet: A R Package for the Multivariate Analysis of Genetic Markers.” Bioinformatics 24 (11): 1403–5.

Jiang, Wei, Si-Yun Chen, Hong Wang, De-Zhu Li, and John J. Wiens. 2014. “Should Genes with Missing Data Be Excluded from Phylogenetic Analyses?” Molecular Phylogenetics and Evolution 80 (November): 308–18.

Johansson, Ulf S., Jon Fjeldså, and Rauri C. K. Bowie. 2008. “Phylogenetic Relationships within Passerida (Aves: Passeriformes): A Review and a New Molecular Phylogeny Based on Three Nuclear Intron Markers.” Molecular Phylogenetics and Evolution 48 (3): 858–76.

Johansson, Ulf S., Jon Fjeldså, L. G. Sampath Lokugalappatti, and Rauri C. K. Bowie. 2007. “A Nuclear DNA Phylogeny and Proposed Taxonomic Revision of African Greenbuls (Aves, Passeriformes, Pycnonotidae).” Zoologica Scripta 36 (5): 417–27.

Kim, Kang-Wook, Benjamin C. Jackson, Hanyuan Zhang, David P. L. Toews, Scott A. Taylor, Emma I. Greig, Irby J. Lovette, et al. 2019. “Black or Red: A Sex-Linked Colour Polymorphism in a Songbird Is Maintained by Balancing Selection.” Nature Communications 10: 1852–63.

Kimball, Rebecca T., Edward L. Braun, F. Keith Barker, Rauri C. K. Bowie, Michael J. Braun, Jena L. Chojnowski, Shannon J. Hackett, et al. 2009. “A Well-Tested Set of Primers to Amplify Regions Spread across the Avian Genome.” Molecular Phylogenetics and Evolution 50 (3): 654–60.

Kimura, Motoo. 1964. “Diffusion Models in Population Genetics.” Journal of Applied Probability 1 (2): 177–232.

Kirchman, Jeremy J. 2012. “Speciation of Flightless Rails on Islands: A DNA-Based Phylogeny of the Typical Rails of the Pacific.” The Auk 129 (1): 56–69.

Koropitan, Alan F., and Motoyoshi Ikeda. 2008. “Three-Dimensional Modeling of Tidal Circulation and Mixing over the Java Sea.” Journal of Oceanography 64 (1): 61–80.

LaFountain, Amy M., Shanti Kaligotla, Shannon Cawley, Ken M. Riedl, Steven J. Schwartz, Harry A. Frank, and Richard O. Prum. 2010. “Novel Methoxy-Carotenoids from the Burgundy-Colored Plumage of the Pompadour Cotinga Xipholena punicea.” Archives of Biochemistry and Biophysics 504 (1): 142–53.

Lamichhaney, Sangeet, Jonas Berglund, Markus Sällman Almén, Khurram Maqbool, Manfred Grabherr, Alvaro Martinez-Barrio, Marta Promerová, et al. 2015. “Evolution of Darwin’s Finches and Their Beaks Revealed by Genome Sequencing.” Nature 518 (7539): 371–75.

Lamichhaney, Sangeet, Guangyi Fan, Fredrik Widemo, Ulrika Gunnarsson, Doreen Schwochow Thalmann, Marc P. Hoeppner, Susanne Kerje, et al. 2015. “Structural Genomic Changes Underlie Alternative Reproductive Strategies in the Ruff (Philomachus pugnax).” Nature Genetics 48 (1): 84–88.

Lanfear, R., B. Calcott, S. Y. W. Ho, and S. Guindon. 2012. “PartitionFinder: Combined Selection of Partitioning Schemes and Substitution Models for Phylogenetic Analyses.” Molecular Biology and Evolution 29 (6): 1695–1701.

Lemmon, Alan R., Jeremy M. Brown, Kathrin Stanger-Hall, and Emily Moriarty Lemmon. 2009. “The Effect of Ambiguous Data on Phylogenetic Estimates Obtained by Maximum Likelihood and Bayesian Inference.” Systematic Biology 58 (1): 130–45.

Ligon, Russell A., Richard K. Simpson, Nicholas A. Mason, Geoffrey E. Hill, and Kevin J. McGraw. 2016. “Evolutionary Innovation and Diversification of Carotenoid-Based Pigmentation in Finches.” Evolution 70 (12): 2839–52.

Lim, Haw Chuan, Vivien L. Chua, Phred M. Benham, Carl H. Oliveros, Mustafa Abdul Rahman, Robert G. Moyle, and Frederick H. Sheldon. 2014. “Divergence History of the Rufous-Tailed Tailorbird (Orthotomus sericeus) of Sundaland: Implications for the Biogeography of Palawan and the Taxonomy of Island Species in General.” The Auk 131 (4): 629–42.

Lim, Haw Chuan, Dency F. Gawin, Subir B. Shakya, Michael G. Harvey, Mustafa A. Rahman, and Frederick H. Sheldon. 2017. “Sundaland’s East-West Rain Forest Population Structure: Variable Manifestations in Four Polytypic Bird Species Examined Using RAD-Seq and Plumage Analyses.” Journal of Biogeography 44 (10): 2259–71.

Lim, Haw Chuan, Subir B. Shakya, Michael G. Harvey, Robert G. Moyle, Robert C. Fleischer, Michael J. Braun, and Frederick H. Sheldon. 2019. “Opening the Door to Greater Phylogeographic Inference in Southeast Asia: Comparative Genomic Study of Five Co-Distributed Rainforest Bird Species Using Target Capture and Historical DNA.” Ecology and Evolution (in press).

Lischer, H. E. L., and L. Excoffier. 2012. “PGDSpider: An Automated Data Conversion Tool for Connecting Population Genetics and Genomics Programs.” Bioinformatics 28 (2): 298–99.

Liu, Helu, Huaiping Zheng, Hongkuan Zhang, Longhui Deng, Wenhua Liu, Shuqi Wang, Fang Meng, et al. 2015. “A de novo Transcriptome of the Noble Scallop, Chlamys nobilis, Focusing on Mining Transcripts for Carotenoid-Based Coloration.” BMC Genomics 16 (1): 1–13.

Lomolino, Mark V. 1985. “Body Size of Mammals on Islands: The Island Rule Reexamined.” The American Naturalist 125 (2): 310–16.

Lopes, Ricardo J., James D. Johnson, Matthew B. Toomey, Mafalda S. Ferreira, Pedro M. Araújo, José Melo-Ferreira, Leif Andersson, Geoffrey E. Hill, Joseph C. Corbo, and Miguel Carneiro. 2016. “Genetic Basis for Red Coloration in Birds.” Current Biology 26 (11): 1427–34.

Losos, Jonathan B., and Robert E. Ricklefs. 2009. “Adaptation and Diversification on Islands.” Nature 457 (7231): 830–36.

MacArthur, Robert H., and Edward O. Wilson. 1967. The Theory of Island Biogeography. Princeton: Princeton University Press.

Mayr, Ernst. 1944. “Wallace’s Line in the Light of Recent Zoogeographic Studies.” The Quarterly Review of Biology 19 (1): 1–14.

———. 1954. “Change of Genetic Environment and Evolution.” In Evolution as a Process [Essays], edited by Julian Huxley, A. C. Hardy, and E. B. Ford, 157–80. London, England: Allen & Unwin.

Mayr, Ernst, and James C. Greenway, eds. 1960. Check-list of Birds of the World (Vol. IX). Cambridge: Harvard University Press.

McCracken, Kevin G., and Frederick H. Sheldon. 1998. “Molecular and Osteological Heron Phylogenies: Sources of Incongruence.” The Auk 115 (1): 127–41.

Miller, M. A., W. Pfeiffer, and T. Schwarz. 2010. “Creating the CIPRES Science Gateway for Inference of Large Phylogenetic Trees.” In Proceedings of the Gateway Computing Environments Workshop (GCE), 14 Nov. 2010, New Orleans, LA, 1–8.

Moyle, Robert G., and Ben D. Marks. 2006. “Phylogenetic Relationships of the Bulbuls (Aves: Pycnonotidae) Based on Mitochondrial and Nuclear DNA Sequence Data.” Molecular Phylogenetics and Evolution 40 (3): 687–95.

Mundy, Nicholas I. 2005. “A Window on the Genetics of Evolution: MC1R and Plumage Colouration in Birds.” Proceedings of the Royal Society B: Biological Sciences 272 (August): 1633–40.

Mundy, Nicholas I., Jessica Stapley, Clair Bennison, Rachel Tucker, Hanlu Twyman, Kang Wook Kim, Terry Burke, Tim R. Birkhead, Staffan Andersson, and Jon Slate. 2016. “Red Carotenoid Coloration in the Zebra Finch Is Controlled by a Cytochrome P450 Gene Cluster.” Current Biology 26 (11): 1435–40.

Nachman, Michael W., Hopi E. Hoekstra, and Susan L. D’Agostino. 2003. “The Genetic Basis of Adaptive Melanism in Pocket Mice.” Proceedings of the National Academy of Sciences 100 (9): 5268–73.

Oberholser, Harry C. 1917. “The Birds of Bawean Island, Java Sea.” Proceedings of the U.S. National Museum 52 (2175): 183–98.

Oliveros, Carl H., Daniel J. Field, Daniel T. Ksepka, F. Keith Barker, Alexandre Aleixo, Michael J. Andersen, Per Alström, et al. 2019. “Earth History and the Passerine Superradiation.” Proceedings of the National Academy of Sciences 116 (16): 7916–25.

Oliveros, Carl H., and Robert G. Moyle. 2010. “Origin and Diversification of Philippine Bulbuls.” Molecular Phylogenetics and Evolution 54 (3): 822–32.

Oliveros, Carl H., Sushma Reddy, and Robert G. Moyle. 2012. “The Phylogenetic Position of Some Philippine ‘Babblers’ Spans the Muscicapoid and Sylvioid Bird Radiations.” Molecular Phylogenetics and Evolution 65 (2): 799–804.

Pasquet, Eric, Lian-Xian Han, Obhas Khobkhet, and Alice Cibois. 2001. “Towards a Molecular Systematics of the Genus Criniger, and a Preliminary Phylogeny of the Bulbuls (Aves, Passeriformes, Pycnonotidae).” Zoosystema 23 (4): 857–63.

Poelstra, J. W., N. Vijay, C. M. Bossu, H. Lantz, B. Ryll, I. Müller, V. Baglione, et al. 2014. “The Genomic Landscape Underlying Phenotypic Integrity in the Face of Gene Flow in Crows.” Science 344 (6190): 1410–14.

Pritchard, Jonathan K., Matthew Stephens, and Peter Donnelly. 2000. “Inference of Population Structure Using Multilocus Genotype Data.” Genetics 155 (2): 945–59.

de Queiroz, Alan, and John Gatesy. 2007. “The Supermatrix Approach to Systematics.” Trends in Ecology and Evolution 22 (1): 34–41.

Rambaut, Andrew, Marc A. Suchard, D. Xie, and Alexei J. Drummond. 2014. “Tracer v1.6.”

Rasmussen, Pamela C., and J. C. Anderton. 2005. Birds of South Asia: The Ripley Guide. Washington, D.C., and Barcelona: Smithsonian Institution and Lynx Edicions.

Riley, J. H. 1930. “Birds from the Small Islands off the East Coast of Dutch Borneo.” Proceedings of the U.S. National Museum 77 (12): 2597–2630.

Rohland, Nadin, and David Reich. 2012. “Cost-Effective, High-Throughput DNA Sequencing Libraries for Multiplexed Target Capture.” Genome Research 22 (5): 939–46.

Ronquist, Fredrik, and John P. Huelsenbeck. 2003. “MRBAYES 3: Bayesian Phylogenetic Inference under Mixed Models.” Bioinformatics 19: 1572–74.

Roure, Béatrice, Denis Baurain, and Hervé Philippe. 2013. “Impact of Missing Data on Phylogenies Inferred from Empirical Phylogenomic Data Sets.” Molecular Biology and Evolution 30 (1): 197–214.

Sæther, Bernt Erik, Russell Lande, Steinar Engen, Henri Weimerskirch, Magnar Lillegård, Res Altwegg, Peter H. Becker, et al. 2005. “Generation Time and Temporal Scaling of Bird Population Dynamics.” Nature 436 (7047): 99–102.

Sakudoh, Takashi, Tetsuya Iizuka, Junko Narukawa, Hideki Sezutsu, Isao Kobayashi, Seigo Kuwazaki, Yutaka Banno, et al. 2010. “A CD36-Related Transmembrane Protein Is Coordinated with an Intracellular Lipid-Binding Protein in Selective Carotenoid Transport for Cocoon Coloration.” Journal of Biological Chemistry 285 (10): 7739–51.

Sakudoh, Takashi, Seigo Kuwazaki, Tetsuya Iizuka, Junko Narukawa, Kimiko Yamamoto, Keiro Uchino, Hideki Sezutsu, Yutaka Banno, and Kozo Tsuchida. 2012. “CD36 Homolog Divergence Is Responsible for the Selectivity of Carotenoid Species Migration to the Silk Gland of the Silkworm Bombyx mori.” Journal of Lipid Research 54 (2): 482–95.

Salter, Jessie F., Oscar Johnson, Norman J. Stafford, William F. Herrin, Darren Schilling, Cody Cedotal, Robb T. Brumfield, and Brant C. Faircloth. 2019. “A Highly Contiguous Reference Genome for Northern Bobwhite (Colinus virginianus).” G3: Genes, Genomes, Genetics 9 (12): 3929–32.

Sathiamurthy, Edlic, and Harold K. Voris. 2006. Maps of Holocene Sea Level Transgression and Submerged Lakes on the Sunda Shelf. The Natural History Journal of Chulalongkorn University. Vol. 2.

Schneider, Alexsandra, Corneliu Henegar, Kenneth Day, Devin Absher, Marilyn Menotti-Raymond, Gregory S. Barsh, and Eduardo Eizirik. 2015. “Recurrent Evolution of Melanism in South American Felids.” PLoS Genetics 10 (2): 1–19.

Shakya, Subir B., and Frederick H. Sheldon. 2017. “The Phylogeny of the World’s Bulbuls (Pycnonotidae) Inferred Using a Supermatrix Approach.” Ibis 159: 498–509.

Sheldon, Frederick H., Haw Chuan Lim, and Robert G. Moyle. 2015. “Return to the Malay Archipelago: The Biogeography of Sundaic Rainforest Birds.” Journal of Ornithology 156 (1): 91–113.

Sheldon, Frederick H., Carl H. Oliveros, Sabrina S. Taylor, Bailey McKay, Haw Chuan Lim, Mustafa Abdul Rahman, Herman Mays, and Robert G. Moyle. 2012. “Molecular Phylogeny and Insular Biogeography of the Lowland Tailorbirds of Southeast Asia (Cisticolidae: Orthotomus).” Molecular Phylogenetics and Evolution 65 (1): 54–63.

Sievers, Fabian, Andreas Wilm, David Dineen, Toby J. Gibson, Kevin Karplus, Weizhong Li, Rodrigo Lopez, et al. 2011. “Fast, Scalable Generation of High-Quality Protein Multiple Sequence Alignments Using Clustal Omega.” Molecular Systems Biology 7 (1): 539.

Simão, Felipe A., Robert M. Waterhouse, Panagiotis Ioannidis, Evgenia V. Kriventseva, and Evgeny M. Zdobnov. 2015. “BUSCO: Assessing Genome Assembly and Annotation Completeness with Single-Copy Orthologs.” Bioinformatics 31 (19): 3210–12.

Slatkin, Montgomery. 1996. “In Defense of Founder-Flush Theories of Speciation.” The American Naturalist 147 (4): 493–505.

Slikas, Beth, Storrs L. Olson, and Robert C. Fleischer. 2002. “Rapid, Independent Evolution of Flightlessness in Four Species of Pacific Island Rails (Rallidae): An Analysis Based on Mitochondrial Sequence Data.” Journal of Avian Biology 33 (1): 5–14.

Stamatakis, Alexandros. 2014. “RAxML Version 8: A Tool for Phylogenetic Analysis and Post-Analysis of Large Phylogenies.” Bioinformatics 30 (9): 1312–13.

Streicher, Jeffrey W., James A. Schulte, and John J. Wiens. 2016. “How Should Genes and Taxa Be Sampled for Phylogenomic Analyses with Missing Data? An Empirical Study in Iguanian Lizards.” Systematic Biology 65 (1): 128–45.

Sundvold, Hilde, Hanna Helgeland, Matthew Baranski, Stig W. Omholt, and Dag I. Våge. 2011. “Characterisation of a Novel Paralog of Scavenger Receptor Class B Member I (SCARB1) in Atlantic Salmon (Salmo salar).” BMC Genetics 12.

Templeton, Alan R. 1980. “The Theory of Speciation via the Founder Principle.” Genetics 94 (4): 1011–38.

Theron, Emmalize, Kim Hawkins, Eldredge Bermingham, Robert E. Ricklefs, and Nicholas I. Mundy. 2001. “The Molecular Basis of an Avian Plumage Polymorphism in the Wild: A Melanocortin-1-Receptor Point Mutation Is Perfectly Associated with the Melanic Plumage Morph of the Bananaquit, Coereba flaveola.” Current Biology 11 (8): 550–57.

Thomson, Robert C., and H. Bradley Shaffer. 2010. “Sparse Supermatrices for Phylogenetic Inference: Taxonomy, Alignment, Rogue Taxa, and the Phylogeny of Living Turtles.” Systematic Biology 59 (1): 42–58.

Thorpe, Roger S., Axel Barlow, Anita Malhotra, and Yann Surget-Groba. 2015. “Widespread Parallel Population Adaptation to Climate Variation across a Radiation: Implications for Adaptation to Climate Change.” Molecular Ecology 24 (5): 1019–30.

Toews, David P. L., Scott A. Taylor, Rachel Vallender, Alan Brelsford, Bronwyn G. Butcher, Philipp W. Messer, and Irby J. Lovette. 2016. “Plumage Genes and Little Else Distinguish the Genomes of Hybridizing Warblers.” Current Biology 26 (17): 2313–18.

Toomey, Matthew B., Ricardo J. Lopes, Pedro M. Araújo, James D. Johnson, Małgorzata A. Gazda, Sandra Afonso, Paulo G. Mota, et al. 2017a. “High-Density Lipoprotein Receptor SCARB1 Is Required for Carotenoid Coloration in Birds.” Proceedings of the National Academy of Sciences 114 (20): 5219–24.

Toomey, Matthew B., Cristiana I. Marques, Ricardo J. Lopes, Joseph C. Corbo, Sandra Afonso, Stephen Sabatino, Pedro M. Araújo, Pedro Andrade, Miguel Carneiro, and Małgorzata A. Gazda. 2018. “A Non-Coding Region near Follistatin Controls Head Colour Polymorphism in the Gouldian Finch.” Proceedings of the Royal Society B: Biological Sciences 285: 20181788.

Turner, D. A., and D. J. Pearson. 2015. “Systematic and Taxonomic Issues Concerning Some East African Bird Species, Notably Those Where Treatment Varies between Authors.” Scopus: Journal of East African Ornithology 34: 1–23.

Tuttle, Elaina M., Alan O. Bergland, Marisa L. Korody, Michael S. Brewer, Daniel J. Newhouse, Patrick Minx, Maria Stager, et al. 2016. “Divergence and Functional Degradation of a Sex Chromosome-like Supergene.” Current Biology 26 (3): 344–50.

Uy, J. Albert C., Robert G. Moyle, Christopher E. Filardi, and Zachary A. Cheviron. 2009. “Difference in Plumage Color Used in Species Recognition between Incipient Species Is Linked to a Single Amino Acid Substitution in the Melanocortin-1 Receptor.” The American Naturalist 174 (2): 244–54.

Uy, J. Albert C., and Luis E. Vargas-Castro. 2015. “Island Size Predicts the Frequency of Melanic Birds in the Color-Polymorphic Flycatcher Monarcha castaneiventris of the Solomon Islands.” The Auk 132 (4): 787–94.

Uy, J. Albert C., Elizabeth A. Cooper, Stephen Cutie, R. Concannon, Jelmer W. Poelstra, Robert G. Moyle, and Christopher E. Filardi. 2016. “Mutations in Different Pigmentation Genes Are Associated with Parallel Melanism in Island Flycatchers.” Proceedings of the Royal Society B: Biological Sciences 283 (1834): 20160731.

Van Valen, Leigh. 1973. “Pattern and the Balance of Nature.” Evolutionary Theory 1: 31–49.

Walsh, N., J. Dale, Kevin J. McGraw, M. A. Pointer, and Nicholas I. Mundy. 2012. “Candidate Genes for Carotenoid Coloration in Vertebrates and Their Expression Profiles in the Carotenoid-Containing Plumage and Bill of a Wild Bird.” Proceedings of the Royal Society B: Biological Sciences 279 (1726): 58–66.

Warren, Ben H., Eldredge Bermingham, Robert P. Prys-Jones, and Christophe Thebaud. 2005. “Tracking Island Colonization History and Phenotypic Shifts in Indian Ocean Bulbuls (Hypsipetes: Pycnonotidae).” Biological Journal of the Linnean Society 85 (3): 271–87.

Waterhouse, Robert M., Mathieu Seppey, Felipe A. Simão, Mosè Manni, Panagiotis Ioannidis, Guennadi Klioutchnikov, Evgenia V. Kriventseva, and Evgeny M. Zdobnov. 2018. “BUSCO Applications from Quality Assessments to Gene Prediction and Phylogenomics.” Molecular Biology and Evolution 35 (3): 543–48.

West-Eberhard, M. J. 1986. “Alternative Adaptations, Speciation, and Phylogeny. (A Review).” Proceedings of the National Academy of Sciences 83 (5): 1388–92.

Wiens, John J. 2003. “Missing Data, Incomplete Taxa, and Phylogenetic Accuracy.” Systematic Biology 52 (4): 528–38.

Wiens, John J., and Matthew C. Morrill. 2011. “Missing Data in Phylogenetic Analysis: Reconciling Results from Simulations and Empirical Data.” Systematic Biology 60 (5): 719–31.

Wright, Natalie A., David W. Steadman, and Christopher C. Witt. 2016. “Predictable Evolution toward Flightlessness in Volant Island Birds.” Proceedings of the National Academy of Sciences of the United States of America 113 (17): 4765–70.

Yang, Ziheng. 2007. “PAML 4: Phylogenetic Analysis by Maximum Likelihood.” Molecular Biology and Evolution 24 (8): 1586–91.

Zuccon, D., and Per G. P. Ericson. 2010. “The Phylogenetic Position of the Black-Collared Bulbul Neolestes torquatus.” Ibis 152 (2): 386–92.

VITA

Subir Bahadur Shakya was born in Kathmandu, Nepal, to Birendra Bahadur Shakya and Suravi Shakya. Growing up in the shadow of the majestic Himalayan Mountains, he spent much of his childhood hiking and exploring. Early on, he was enthralled by science and nature and pored over encyclopedias and field guides. He also spent hours and hours on art, illustrating the biodiversity of the world. At age 16, he was introduced to the prominent Nepalese ornithologist Hari Sharan Nepali, who trained Subir in bird identification and the importance of bird collections. Inspired by Mr. Nepali, Subir went on to pursue a bachelor of science degree in biology through the College of Science and Engineering at Southern Arkansas University (SAU) in Magnolia, Arkansas. While at SAU, he spent a summer at the Smithsonian’s National Museum of Natural History in Washington, D.C., working under Dr. Michael Braun as part of an NSF REU program. There, he developed his skills in next-generation DNA sequencing. After graduating from SAU in 2014, he started his doctoral work on the genetic basis of avian coloration at Louisiana State University (LSU) under the guidance of Dr. Frederick H. Sheldon. At LSU, he also led five successful expeditions to Malaysia and Indonesia, which enhanced the impressive collections of birds and genetic resources at the LSU Museum of Natural Sciences.
