The Corylus Mandshurica Genome Provides Insights Into the Evolution of Betulaceae Genomes and Hazelnut Breeding
Total Page:16
File Type:pdf, Size:1020Kb
Li et al. Horticulture Research (2021) 8:54 Horticulture Research https://doi.org/10.1038/s41438-021-00495-1 www.nature.com/hortres ARTICLE Open Access The Corylus mandshurica genome provides insights into the evolution of Betulaceae genomes and hazelnut breeding Ying Li1,PengchuanSun 2,ZhiqiangLu3,4,JinyuanChen1, Zhenyue Wang1,XinDu1,ZeyuZheng1,YingWu1, Hongyin Hu1,JiaoYang1, Jianxiang Ma1, Jianquan Liu1,2 and Yongzhi Yang1 Abstract Hazelnut is popular for its flavor, and it has also been suggested that hazelnut is beneficial to cardiovascular health because it is rich in oleic acid. Here, we report the first high-quality chromosome-scale genome for the hazelnut species Corylus mandshurica (2n = 22), which has a high concentration of oleic acid in its nuts. The assembled genome is 367.67 Mb in length, and the contig N50 is 14.85 Mb. All contigs were assembled into 11 chromosomes, and 28,409 protein-coding genes were annotated. We reconstructed the evolutionary trajectories of the genomes of Betulaceae species and revealed that the 11 chromosomes of the hazelnut genus were derived from the most ancestral karyotype in Betula pendula, which has 14 protochromosomes, by inferring homology among five Betulaceae genomes. We identified 96 candidate genes involved in oleic acid biosynthesis, and 10 showed rapid evolution or positive selection. These findings will help us to understand the mechanisms of lipid synthesis and storage in hazelnuts. Several gene families related to salicylic acid metabolism and stress responses experienced rapid expansion in this hazelnut species, which may have increased its stress tolerance. The reference genome presented here constitutes a valuable resource 1234567890():,; 1234567890():,; 1234567890():,; 1234567890():,; for molecular breeding and genetic improvement of the important agronomic properties of hazelnut. Introduction cardiovascular disease by inhibiting cholesterogenesis The hazelnut genus Corylus L., which belongs to the in vivo as well as in vitro, reducing blood pressure, and – family Betulaceae, contains ~20 species, all bearing edible inhibiting the atherosclerotic process6 8. Currently, the nuts1,2. Hazelnuts, which are a favorite among consumers, major hazelnut-producing countries are Turkey, Italy, are widely used in food processing and in the manufacture Spain, and the USA, and the total annual global hazelnut of confectionery products, including chocolate, biscuits, production is in excess of 550,000 tons (http://www.fao. and hazelnut oil. Hazelnuts have a high overall content of org/3/x4484e/x4484e03.htm). fatty acids (~60% of the hazelnut kernel), mostly oleic acid Hazelnut has a long history of utilization and production, (~80% of the fatty acids), which are important dietary likely predating the Roman era9, and the most widely cul- – components for humans3 5. Oleic acid is usually regarded tivated species have been European hazelnut (Corylus as a healthy fatty acid, as it can reduce the risk of avellana), which has been bred mainly for high fruit yield. European hazelnut is usually susceptible to diseases such as eastern filbert blight (EFB), which can cause serious damage Correspondence: Yongzhi Yang ([email protected]) to the commercial production of European hazelnut10. 1State Key Laboratory of Grassland Agro-Ecosystem, Institute of Innovation Ecology & School of Life Sciences, Lanzhou University, Lanzhou, China During the long history of hazelnut usage, people have 2Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education focused on increasing kernel size, oil content, and disease & State Key Laboratory of Hydraulics & Mountain River Engineering, College of resistance11,12. With the development of molecular plant Life Sciences, Sichuan University, Chengdu, China Full list of author information is available at the end of the article breeding, an increasing number of technologies are being These authors contributed equally: Ying Li, Pengchuan Sun. © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a linktotheCreativeCommons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. Li et al. Horticulture Research (2021) 8:54 Page 2 of 13 utilized, especially CRISPR–Cas9 editing, which can greatly due to the low quality of the assembled genome and poor reduce the length of the breeding cycle and improve quality annotation of the relevant genes. In this study, we con- and efficiency13. Genes related to fatty acid biosynthesis and structed a high-quality chromosome-level reference genome oleic acid accumulation are important targets for future for C. mandshurica by combining Illumina short reads, breeding to improve the quality of the oil content. One Nanopore long reads and chromosomal conformational recent study generated cotton with a high oleic acid content capture (Hi-C) sequencing reads. The high-quality genome using the CRISPR/Cas9 system to knock out a microsomal for a hazelnut species presented here helps us to clarify the ω-6 fatty acid desaturase gene, the product of which can biosynthesis of oleic acid in hazelnuts, investigate disease catalyze the desaturation of oleic acid to form linoleic resistance in this species and infer the evolutionary origin of acid14. In the case of disease resistance, the focus is usually the hazelnut genus. This genomic resource also provides a on disease-resistance (R) proteins, which help plants defend valuable foundation for improving our agronomic under- against a range of pathogenic organisms, including para- standing of hazelnut and carrying out molecular breeding in sites, fungi, bacteria, oomycetes, insects, and viruses15.The the future, including cloning key genes that control hazelnut largest class of disease-resistance R genes, those encoding traits and identifying disease-resistance alleles. nucleotide-binding site (NBS) proteins, have a critical role in defending plants from a multitude of pathogens and Results pests16. However, identifying such important gene resources Genome sequence and assembly in hazelnut has been difficult because of the lack of a high- A mature C. mandshurica tree growing in Xinglong quality genome assembly. Mountain National Nature Reserve was selected for de In China, large-scale cultivation of hazelnut is rare, but novo whole-genome sequencing (Fig. 1). A total of wild varieties, mainly C. heterophylla Fisch. and C. man- 38.78 Gb (~98.90×, Supplementary Table 1) of Illumina dshurica Maxim., are distributed in mountainous forest clean short reads was retrieved and used to perform a belts and deep valleys at high altitudes in northern and genome survey, which indicated that the size of the C. northeastern China1. C. mandshurica is a deciduous mandshurica genome was 392.16 Mb and that it has shrubby Asian hazel, the bracts of which form a tubular relatively high heterozygosity (~0.92%) (Supplementary husk with pubescence and setae; it grows naturally or is Fig. 1). The estimated genome size is very close to that artificially cultivated across much of eastern and northern previously reported for European hazelnut22 (378 Mb). A Asia. The nuts of C. mandshurica are characterized by a total of 38.08 Gb (~97.12×, Supplementary Table 1) of raw thin husk, a high kernel weight to hazelnut weight ratio, and Nanopore long reads were self-corrected, assembled, and a high concentration of oleic acid3. C. mandshurica is also polished with NextDenovo and Nextpolish23. After dele- highly resistant to EFB17, and interspecific cross-breeding tion of candidate allelic haplotigs, the final contig-level trials have been carried out between C. mandshurica and assembly of 367.67 Mb was obtained with a contig N50 of European hazelnut to utilize the former’s valuable agro- 14.85 Mb (accounting for 93.7% of the estimated genome nomic properties18.Thus,aC. mandshurica genome is size; Table 1 and Supplementary Table 2). To evaluate our urgently needed for use in hazel breeding. assembly, first, Benchmarking Universal Single-Copy In addition to the need to identify key gene resources in Orthologs24 (BUSCO) was selected, and the results hazel, elucidating the evolutionary history of the genome showed that 97.1% of conserved genes could be com- across the Betulaceae is also important to help us under- pletely annotated within our genome, which is more than stand the evolution of specific traits. Two subfamilies are has been achieved for other published Betulaceae spe- – recognized in the Betulaceae: Alnus (2n = 28) and Betula cies22,25 29 (Supplementary Table 3). The base accuracy of (2n = 28) form the Betuloideae, and the other four genera the assembly was also estimated based on the mapping of comprise the Coryloideae. Within the Coryloideae, Cor- Illumina reads. We aligned genome sequencing reads and ylus (2n = 22) and Ostryopsis (2n = 16) are successively transcriptome sequencing reads to the C. mandshurica sister to Carpinus (2n = 16) and Ostrya (2n = 16)1,19,20. genome, with mapping rates of 98.86% and 99.68%, Although the Betuloideae diverged early as a monophyletic respectively (Supplementary Table 4). The RNA tran- lineage, the hazelnut genus Corylus seems to possess many scripts were further aligned, and 97.00%–99.37% of tran- traits similar to those of the Betuloideae21. All of these scripts could be mapped with over more than half their traits indicate that the evolutionary origins of the hazelnut length to the same contig (Supplementary Table 5). These genus, especially chromosomal evolution, in the Betula- results collectively suggest that the C.