Genomic Investigations of Diversity Within the Milkweed Genus Asclepias, at Multiple Scales
Total Page:16
File Type:pdf, Size:1020Kb
AN ABSTRACT OF THE DISSERTATION OF Kevin A. Weitemier for the degree of Doctor of Philosophy in Botany and Plant Pathology presented on September 7, 2016. Title: Genomic Investigations of Diversity within the Milkweed Genus Asclepias, at Multiple Scales Abstract approved: ________________________________________________________________________ Aaron I. Liston At a time when the biodiversity on Earth is being rapidly lost, new technologies and methods in genomic analysis are fortunately allowing scientists to catalog and explore the diversity that remains more efficiently and precisely. The studies in this dissertation investigate genomic diversity within the milkweed genus, Asclepias, at multiple scales, including diversity found within a single individual, diversity within and among populations in a species, and diversity across the entire genus. These investigations contribute to our understanding of the genomic content and architecture within Asclepias and the Gentianales, patterns of population diversification in the western United States, and the evolutionary history of select loci within Asclepias, with implications across flowering plants. Chapter 2 investigates patterns of polymorphisms among paralogous copies of nuclear ribosomal DNA (nrDNA) within individual genomes, and presents a bioinformatic pipeline for characterizing polymorphisms among copies of a high-copy locus. Results are presented for intragenomic nrDNA polymorphisms across Asclepias. The 18S-26S portion of the nrDNA cistron of Asclepias syriaca served as a reference for assembly of the region from 124 samples representing 90 species of Asclepias. Reads were mapped back to each individual’s consensus and at each position reads differing from the consensus were tallied using a custom Perl script. Low frequency polymorphisms existed in all individuals (mean = 5.8%). Most nrDNA positions (91%) were polymorphic in at least one individual, with polymorphic sites being less frequent in subunit regions and loops. Highly polymorphic sites existed in each individual, with highest abundance in the “noncoding” ITS regions. Phylogenetic signal was present in the distribution of intragenomic polymorphisms across the genus. Intragenomic polymorphisms in nrDNA are common in Asclepias, being found at higher frequency than any other study to date. The high and variable frequency of polymorphisms across species highlights concerns that phylogenetic applications of nrDNA may be error-prone. The new analytical approach provided in this chapter is applicable to other taxa and other high-copy regions characterized by low coverage genome sequencing (genome skimming). Chapter 3 presents Hyb-Seq, a new method combining target enrichment and genome skimming to allow simultaneous data collection for low-copy nuclear genes and high-copy genomic targets for plant systematics and evolution studies. A program is presented that takes genome and transcriptome assemblies and locates loci likely to be low copy and phylogenetically informative, to be used for probe development and enrichment in sequence libraries. A workflow is presented for processing data, from raw sequence reads to assembled exons and reconstructed trees. Genome and transcriptome assemblies for Asclepias syriaca were used to design enrichment probes for 3385 exons from 768 genes (>1.6 Mbp) followed by Illumina sequencing of enriched libraries. Hyb-Seq of 12 individuals (10 Asclepias species and two related genera) resulted in at least partial assembly of 92.6% of exons and 99.7% of genes and an average assembly length >2 Mbp. Importantly, complete plastomes and nrDNA cistrons were assembled using off-target reads. Phylogenomic analyses demonstrated signal conflict between genomes. The Hyb-Seq approach enables targeted sequencing of thousands of low-copy nuclear exons and flanking regions, as well as genome skimming of high-copy repeats and organellar genomes, to efficiently produce genome-scale data sets for phylogenomics. Chapter 4 presents an assembly of the genome of the common milkweed, Asclepias syriaca. It uses principles from Chapter 3 to target SNPs and reconstruct linkage groups, enabling an analysis of chromosomal evolution within Gentianales, the order containing Asclepias. Asclepias syriaca is the first species in Apocynaceae with reconstructions of the nuclear, chloroplast, and mitochondrial genomes, and the first to have linkage group information incorporated into the nuclear assembly. The final assembly of Asclepias syriaca contains 54,266 scaffolds ≥1 kbp, with N50 = 3415 bp, representing 37% (156.6 Mbp) of the estimated 420 Mbp genome. Scaffolds ≥200 bp sum to 229.7 Mbp, with N50 = 1904 bp. A total of 14,474 protein coding genes were identified based on transcript evidence, closely related proteins, and ab initio models, and 95% of genes were annotated based on genes from Coffea canephora and Catharanthus rosea. A large proportion of gene space is represented in the assembly, with 96.7% of Asclepias transcripts, 88.4% of transcripts from the related genus Calotropis, and 90.6% of proteins from Coffea mapping to the assembly. Analyses were performed for three gene families, involved in rubber production, light sensing, and cardenolide production, with the finding that the cardenolide related progesterone 5β- reductase gene family is likely reduced in Asclepias relative to other Apocynaceae. Scaffolds covering 75 Mbp of the Asclepias assembly were grouped into eleven linkage groups. Comparisons of these groups with pseudochromosomes in Coffea found that six chromosome show consistent stability in gene content, while one may have a long history of fragmentation and rearrangement. Finally, in Chapter 5, diversity within a species across its entire range is investigated with a phylogeographic study of the jewel milkweed, Asclepias cryptoceras. This study applies the SNP targets developed from Chapter 4 to populations of A. cryptoceras, asking whether two recognized subspecies are genetically distinct, and searching for the origin of populations that are morphologically intermediate between the two. A total of 54,673 SNPs were found on 7372 contigs, across 96 individuals from ten populations. Principal component analysis and measures of allelic differentiation indicate a clear disjunction between subspecies cryptoceras and davisii (FST = 0.092 between geographic regions). For intermediate populations, estimates of hybrid index below 0.25 and measures of allelic diversity and private alleles, argue against a hybrid origin due to secondary contact, and instead support their origin as stepping stone populations during expansion along a southern corridor from east to west. ©Copyright by Kevin A. Weitemier September 7, 2016 All Rights Reserved Genomic Investigations of Diversity within the Milkweed Genus Asclepias, at Multiple Scales by Kevin A. Weitemier A DISSERTATION submitted to Oregon State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy Presented September 7, 2016 Commencement June 2017 Doctor of Philosophy dissertation of Kevin A. Weitemier presented on September 7, 2016. APPROVED: ________________________________________________________________________ Major Professor, representing Botany and Plant Pathology ________________________________________________________________________ Head of the Department of Botany and Plant Pathology ________________________________________________________________________ Dean of the Graduate School I understand that my dissertation will become part of the permanent collection of Oregon State University libraries. My signature below authorizes release of my dissertation to any reader upon request. ________________________________________________________________________ Kevin A. Weitemier, Author ACKNOWLEDGEMENTS The title page of this work lists a single author, but without the support of many people it could not have been completed. First, thanks to my advisor, Dr. Aaron Liston. You have helped me reject unworkable ideas, sharpen those that remained, and provided fresh thoughts when I was at a loss. Your continually positive outlook has helped me overcome frustration and tedium. Your door was literally and figuratively always open, and I never felt that I couldn’t come to you for consultation or advice. Thanks to Dr. Richard Cronn. Your assistance with laboratory techniques has been invaluable, including allowing me to inhabit your lab for hours at a time. Your ability to envision a broader application or scope of a study has helped on numerous occasions, and my work is greater for it. Thanks to Drs. Shannon Straub and Matthew Parks. As office-mates and collaborators you have modeled dedication and perseverance with cheerfulness, as well as the ability to take time away from work for the (many) more important things in life. Thank you to the many friends I have met while in graduate school. Without the moral support I have received and the joy you have provided my professional work would not only have suffered, but the many years I have spent in its pursuit would have been truly wasted. Thanks to the Coalition of Graduate Employees. Not only have I materially benefited from its work, I have received leadership opportunities and have developed a deep sense of professionalism in my employment, and an awareness of advocacy and activism. Thanks to the Native Plant Society of Oregon. It is an organization with a deep network of knowledge. It spreads the importance of our botanical heritage to the public and looks out for even