
The genome and epigenome of the European ash tree (Fraxinus excelsior)

Elizabeth Sollars

Thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

School of Biological and Chemical Sciences
Queen Mary University of London

Supervisor: Dr. Richard Buggs

April 2017

Statement of Originality

I, Elizabeth Sollars, confirm that the research included within this thesis is my own work or that where it has been carried out in collaboration with, or supported by others, that this is duly acknowledged below and my contribution indicated. Previously published material is also acknowledged below.

I attest that I have exercised reasonable care to ensure that the work is original, and does not to the best of my knowledge break any UK law, infringe any third party's copyright or other Intellectual Property Right, or contain any confidential material.

I accept that the College has the right to use plagiarism detection software to check the electronic version of the thesis.

I confirm that this thesis has not been previously submitted for the award of a degree by this or any other university.

The copyright of this thesis rests with the author and no quotation from it or information derived from it may be published without the prior written consent of the author.

Signature: E. Sollars
Date: 26th April 2017

Abstract

European ash trees (Fraxinus excelsior) are under threat from the fungal pathogen Hymenoscyphus fraxineus, which causes ash dieback disease (ADB). Previous research has shown heritable variation in ADB susceptibility in natural ash populations. Prior to this project, very little genetic data were available for ash, which hampered efforts to identify markers associated with susceptibility. In this thesis, I present nuclear and organellar assemblies of the 880 Mbp F. excelsior genome, with a combined N50 scaffold size of over 100 kbp.
Using Ks distributions for six plant species, I found evidence for two whole genome duplication (WGD) events in the history of the ash lineage, one potentially shared with olive (Ks ∼0.4), and one potentially shared with other members of the order Lamiales (Ks ∼0.7). Using a further 38 genome sequences from trees originating throughout Europe, I found little evidence of any population structure throughout the European range of F. excelsior, but found a substantial decrease in effective population size, both in the distant (from ∼10 mya) and recent past. Linkage disequilibrium is low at small distances between loci, with an r² of 0.15 at a few hundred bp, but decays slowly from this point. From whole genome DNA methylation data of twenty F. excelsior and F. mandshurica trees, I identified 665 Differentially Methylated Regions (DMRs) between those with high and low ADB susceptibility. Of genes putatively duplicated in historical WGD events, an average of 25.9% were differentially methylated in at least one cytosine context, possibly indicative of unequal silencing. Finally, I found some variability in methylation patterns among clonal replicates (Pearson's correlation coefficient ∼0.960), but this was less than the variability found between different genotypes (∼0.955). The results from this project, and the genome sequence especially, will be valuable to researchers aiming to breed or select ash trees with low susceptibility to ADB.

Acknowledgments

I have been exceptionally proud to study for my PhD as a member of the EU-funded ITN network, INTERCROSSING. In no other forum would I have had the chance to meet such a diverse and intelligent group of people and visit such beautiful European cities. Therefore, I would like to thank all twelve other members of INTERCROSSING, and their PIs, for enhancing my knowledge in a wide range of topics and enjoying many workshops together.
My special thanks go to my INTERCROSSING partner, Jasmin Zohren, for being a wonderful colleague and friend, and to Richard Nichols, for organising and co-ordinating the network.

I would like to thank my supervisor Richard Buggs for his encouragement and support throughout my PhD. In addition, I have received advice and help from members of the Buggs lab numerous times, so I thank all of them, especially Laura Kelly and Endymion Cooper.

I would also like to thank all the employees at CLC bio in Aarhus, who went to much effort to make a newcomer feel at home, and made every day at work a fun one. To have had the chance to live and work in the city of Aarhus has been truly life-changing. I would also like to thank the members of the Aarhus 1900 Triathlon and Run for Friendship clubs, for being great friends and making training so fun.

As always, thanks go to my family and friends for their continued love and support.

Funding

This work was supported by the EU FP7-PEOPLE project 'INTERCROSSING', ID: 289974. Sequencing of the reference tree was funded by NERC emergency grant NE/K01112X/1.

Contents

1 Introduction to ash trees and the threat from ash dieback disease
    1.1 Biology and geography of Fraxinus excelsior
    1.2 Value of ash
    1.3 Hymenoscyphus fraxineus, the causative agent of ash dieback
    1.4 Natural genetic variation in susceptibility to ADB
    1.5 Current status of ADB research and future directions
2 Introduction to current genome projects of forest trees
    2.1 Challenges in sequencing and assembling plant genomes
        2.1.1 Heterozygosity
        2.1.2 Large genome size
    2.2 Current forest tree genome projects
        2.2.1 The first tree genome; Populus trichocarpa
        2.2.2 Large and complex genomes - Gymnosperms Pinus taeda and Picea spp.
        2.2.3 Breeding for desirable traits - Salix spp.
        2.2.4 Disease resistance - Castanea mollissima
        2.2.5 Population biology and phylogeography - Betula spp.
        2.2.6 Tree genome databases
    2.3 Conclusions
3 De novo genome assembly and annotation of a British Fraxinus excelsior tree
    3.1 Introduction to genome assembly and finishing methods
        3.1.1 de Bruijn graphs vs Overlap Layout Consensus methods
        3.1.2 Scaffolding and gap filling
        3.1.3 Assembly verification and comparison
    3.2 Methods
        3.2.1 DNA Extraction and sequencing
        3.2.2 De novo assembly
        3.2.3 Gene annotation
    3.3 De novo assembly results
        3.3.1 Overall comparison of released assemblies
        3.3.2 Testing different software
    3.4 RNA-seq aided annotation of genes
    3.5 Conclusion
4 Assembly and annotation of organellar genomes from whole genome sequencing reads
    4.1 Methods
        4.1.1 Generating k-mer distributions
        4.1.2 Extracting organellar reads
        4.1.3 Plastid genome assembly and annotation
        4.1.4 Mitochondrial genome assembly and annotation
    4.2 Results
        4.2.1 K-mer distributions reveal peaks of organellar sequence coverage
        4.2.2 Assembly and annotation of the plastid genome
        4.2.3 Assembly of the mitochondrial genome using a map, extend and join method
    4.3 Conclusion
5 Analysis of whole genome duplications in Fraxinus excelsior
    5.1 Introduction
        5.1.1 A rich history of whole genome duplications (WGD) in plants
        5.1.2 Overview of the Ks method
        5.1.3 Correcting for redundant Ks values in homeolog groups
    5.2 Methods
    5.3 Evidence for two WGD events in the ash lineage
    5.4 Conclusion
6 Population structure among European ash trees
    6.1 Introduction
        6.1.1 Current population research on ash
        6.1.2 Approaches for analyzing population structure
        6.1.3 Approaches for estimating past Ne
    6.2 Methods
        6.2.1 Locations and origins of samples
        6.2.2 DNA sequencing and variant calling methods
        6.2.3 Population structure methods
        6.2.4 Effective population size methods
    6.3 Results and Discussion
        6.3.1 Analysis of population structure
        6.3.2 Estimating effective population size history
    6.4 Conclusion and future directions
7 Epigenetic variation in isogenic samples
    7.1 Introduction
        7.1.1 DNA methylation in plants
        7.1.2 Background to the ash methylome project
        7.1.3 Bisulphite sequencing and alignment software
    7.2 Methods
        7.2.1 Description of samples and genotypes
        7.2.2 DNA extraction, bisulphite conversion and sequencing
        7.2.3 Data QC and read mapping
        7.2.4 Data analysis methods
    7.3 Results and Discussion
        7.3.1 Landscape of DNA methylation across the ash genome
        7.3.2 Many homeologs retained after WGD are differentially methylated
        7.3.3 Methylation differences between two Fraxinus species and within genotypes
        7.3.4 Methylation in genes relating to ADB susceptibility
    7.4 Conclusion
8 Conclusions and further research
    8.1 Summary of results
    8.2 Future research using the ash genome
Bibliography
Appendix
    Appendix 1: Published article on the ash tree genome
    Appendix 2: Book chapter on genomics projects of angiosperm trees

List of Figures

1.1 Phylogeny of Fraxinus genus
1.2 European distribution of Fraxinus excelsior
1.3 Distribution of ash in Great Britain
1.4 Life cycle of Hymenoscyphus fraxineus
1.5 Year of first identification of ash dieback in European countries
1.6 Distribution of confirmed ADB sites across UK