The Genome Evolution and Low-Phosphorus Adaptation in White
ARTICLE https://doi.org/10.1038/s41467-020-14891-z OPEN

The genome evolution and low-phosphorus adaptation in white lupin

Weifeng Xu¹, Qian Zhang¹, Wei Yuan¹, Feiyun Xu¹, Mehtab Muhammad Aslam¹, Rui Miao¹, Ying Li¹, Qianwen Wang¹, Xing Li¹, Xin Zhang², Kang Zhang², Tianyu Xia¹ & Feng Cheng²

White lupin (Lupinus albus) is a legume crop that develops cluster roots and has high phosphorus (P)-use efficiency (PUE) in low-P soils. Here, we assemble the genome of white lupin and find that it has evolved from a whole-genome triplication (WGT) event. We then decipher its diploid ancestral genome and reconstruct the three sub-genomes. Based on the results, we further reveal the sub-genome dominance and the genic expression of the different sub-genomes varying in relation to their transposable element (TE) density. The PUE genes in white lupin have been expanded through WGT as well as tandem and dispersed duplications. Furthermore, we characterize four main pathways for high PUE, which include carbon fixation, cluster root formation, soil-P remobilization, and cellular-P reuse. Among these, auxin modulation may be important for cluster root formation through involvement of potential genes LaABCG36s and LaABCG37s. These findings provide insights into the genome evolution and low-P adaptation of white lupin.

¹ Center for Plant Water-use and Nutrition Regulation and College of Life Sciences, Joint International Research Laboratory of Water and Nutrient in Crop, Fujian Agriculture and Forestry University, Jinshan, Fuzhou 350002, China.
² Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops of the Ministry of Agriculture, Sino-Dutch Joint Laboratory of Horticultural Genomics, Beijing, China.
email: [email protected]; [email protected]; [email protected]; [email protected]

NATURE COMMUNICATIONS | (2020) 11:1069 | https://doi.org/10.1038/s41467-020-14891-z | www.nature.com/naturecommunications

Phosphorus (P) is required for plant growth. P is often present in the soil in unavailable forms, such as phytic acid, or calcium (Ca), iron (Fe), and aluminum (Al) phosphates¹, and these forms are difficult for plants to absorb. About 5.7 billion hectares of land worldwide contain too little P for crop growth, and P deficiency constrains agricultural productivity². To cope with P-limiting environments, several processes have evolved in plants. These include morphological, physiological, biochemical, and molecular adaptations³.

White lupin (Lupinus albus, 2n = 50), which belongs to the family Fabaceae, can mobilize soil phosphates by forming densely packed lateral roots. These are also called cluster roots or proteoid roots. White lupin can grow without the addition of P fertilizer or help from mycorrhizal fungi, and its ability to fix dinitrogen (N2) is less inhibited by P deficiency compared to other legumes². The morphological and physiological factors of low-P adaptation in white lupin have been well studied⁴. However, the genome evolution and low-P adaptation in white lupin remain unclear.

All plant species have experienced polyploidization events⁵, which are involved in plant speciation and evolution of new functions. Narrow-leaf lupin (Lupinus angustifolius) is a close relative of white lupin. Lupinus angustifolius experienced a hexaploidization event that was shared with white lupin⁶. Based on genomic block (GB) systems⁷, diploid ancestors of paleopolyploids have been found in Brassica⁸ and in the Poaceae⁹. After allopolyploidization events, sub-genome dominance phenomena are important features shaping the evolution of polyploid genomes. Sub-genome dominance indicates that one sub-genome has a higher gene density than the others. In addition, there are more genes from this sub-genome expressed at higher levels than their paralogs from the other sub-genomes¹⁰,¹¹. Transposable elements (TEs) have been shown to play a role in dominant expression of paralogous genes between sub-genomes in Brassica rapa and maize¹²,¹³. However, these species do not show low-P adaptation. Knowledge of the evolution of the paleo-genome and sub-genomes in white lupin will contribute to our understanding of its adaptations to low P levels.

In this study, we use the long-read sequencing of PacBio technology combined with high-throughput chromatin capture (Hi-C) datasets, as well as mRNA-sequencing (mRNA-seq), comparative and evolutionary genomic analysis, pharmacology assays, genetic transformation, physiology, and biochemistry analyses to characterize the reference genome of white lupin and investigate its chromosomal evolution and the molecular basis of its adaptation to low-P availability.

Results

Pseudo-chromosome construction of the white lupin genome. We assembled the genome of the white lupin cultivar Amiga with combined datasets from third-generation long-read SMRT sequencing (PacBio) and long-range Hi-C sequencing. We confirmed that the white lupin plant used for sequencing had 25 pairs of chromosomes using in situ hybridization (Supplementary Fig. 1). We then generated 60 Gb of Illumina Solexa 150 bp paired-end read data and estimated the genome size of white lupin as 584.51 Mb by 17-mer (k = 17) counting. We produced 84.29 Gb (~144×) of PacBio read data (Supplementary Table 1) and assembled the data into contigs using the software Canu¹⁴, followed by sequence polishing¹⁵ and filtering. We obtained 3171 contigs with a total size of 558.74 Mb. The contig N50 was 1.76 Mb; the largest contig was 9.48 Mb (Table 1). A Hi-C library was constructed and generated ~100-fold coverage of Hi-C linkage data (100 bp paired-end reads). We then linked the contigs into scaffolds based on the Hi-C data¹⁶ with the help of a previously published linkage map¹⁷ (Supplementary Fig. 2). Finally, we obtained 1580 scaffolds, and the scaffold N50 was 18.66 Mb (Table 1). The 25 largest scaffolds comprised 1616 contigs, which accounted for 84.87% (474.20 Mb) of the assembled genome and corresponded to the 25 chromosomes of white lupin (Supplementary Table 2 and Supplementary Fig. 3). We compared this assembled genome with a recently released white lupin genome¹⁸, and found that they have a good chromosomal synteny relationship, except for some small-scale segment inversions (Supplementary Fig. 4).

Table 1 The assembly statistics of white lupin genome.

Type            Contig                        Scaffold
                Size (Mb)          Number     Size (Mb)   Number
Maximum         9.48               1          25.25       1
N50             1.76               71         18.66       14
N90             0.052              770        1.67        33
Total length    558.74             3171       558.90      1580
Chromosomes     474.20 (84.87%)    25
Genes           /                  /          48,719
TEs             262.71 (43.86%)    /

We combined de novo prediction, homology search, and mRNA-seq assisted prediction to predict genes in the genome of white lupin (see Methods), and we obtained 48,719 genes (Supplementary Fig. 3 and Supplementary Tables 3 and 4). BUSCO (Benchmarking Universal Single-Copy Ortholog) analysis with embryophyta datasets¹⁹ estimated that 95.20% of the conserved single-copy genes were predicted. TEs are a major component of the white lupin genome. We used the tools RepeatModeler, RepeatMasker, and RepeatProteinMask to predict TEs from the white lupin genome²⁰ (see Methods). The combined size of all TE sequences was 262.71 Mb and occupied 43.86% of the assembled genome (Supplementary Table 5 and Supplementary Fig. 3). Among the sequences, long terminal repeats were the major component (37.81%).

The diploid ancestral genome of paleohexaploid white lupin. Lupinus albus and L. angustifolius experienced a common whole-genome triplication (WGT) event. We compared the genome of white lupin to 15 other legume species with sequenced genomes (Supplementary Table 6). Arabidopsis thaliana was used as the outgroup. First, we determined syntenic gene pairs between pairs of the 16 legume genomes using SynOrths²¹. From these syntenic gene datasets, we obtained 1664 homologous genes that were shared by all 16 genomes. We picked 78,926 synonymous sites from these homologous genes to build a phylogenetic tree (Fig. 1a). The times of divergence between legume species were estimated by calculating Ks values (the number of synonymous substitutions per synonymous site) between syntenic genes (Fig. 1b). The results showed that L. albus has the closest relationship to L. angustifolius (Ks = ~0.12, diverged ~9.40 million years ago) (see Methods), followed by a group of combined species from Galegoid and Millettioid that contains Cicer arietinum and Phaseolus vulgaris (L. albus to P. vulgaris, Ks = ~0.51, diverged ~39.97 million years ago), then by the group of Dalbergioid that contains the two diploid ancestors of cultivated peanut, Arachis ipaensis and Arachis duranensis (L. albus to A. ipaensis, Ks = ~0.52, diverged ~40.75 million years ago) (Fig. 1). Two relatively recent polyploidization events were found in these analyzed legume species: one is a WGT (Ks = ~0.28, ~21.94 million years ago) event shared by L. albus and L. angustifolius (Fig. 1 and Supplementary Fig. 5), and the other is a whole-genome duplication

[Fig. 1, panels a and c (image not reproduced). Panel a: phylogenetic tree of Vigna radiata, Vigna angularis, Vigna unguiculata, Phaseolus vulgaris, Glycine max, Glycine soja, Cajanus cajan, Lotus japonicus, Glycyrrhiza uralensis, Cicer arietinum, Medicago truncatula, Trifolium pratense, Lupinus albus, Lupinus angustifolius, Arachis ipaensis, Arachis duranensis, and Arabidopsis thaliana, with clade labels Millettioid, Galegoid, Genistoid, and Dalbergioid and node support values of 100. Panel c: L. albus chromosomes (Chr01 to Chr25) plotted against L. angustifolius chromosomes (Lan01 to Lan20), with axes in Mb.]
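Several figures reported above follow from simple arithmetic, and the sketch below shows one way to check them. It makes three assumptions that are not stated in this section: that N50 is the length of the sequence at which the running total of sorted lengths first reaches half the assembly size (the usual definition, with the Number column of the N50 row in Table 1 being the corresponding count), that sequencing depth is total bases divided by estimated genome size, and that divergence time relates to Ks through the common molecular-clock form T = Ks / (2r). The substitution rate r is not given here, so the code only back-calculates the rate implied by each reported Ks and time pair. All function names and the toy contig lengths are hypothetical.

```python
def n50(lengths):
    """Return (N50 length, count of sequences needed to reach half the total)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("lengths must be non-empty")


def coverage_depth(total_bases_gb, genome_size_mb):
    """Sequencing depth as total bases over genome size (Gb converted to Mb)."""
    return total_bases_gb * 1000 / genome_size_mb


def implied_rate(ks, divergence_my):
    """Per-site, per-year rate r implied by T = Ks / (2r) (assumed clock model)."""
    return ks / (2 * divergence_my * 1e6)


# Toy contig lengths (invented for illustration): total 27, half 13.5.
toy_contigs = [9, 7, 5, 3, 2, 1]
toy_n50 = n50(toy_contigs)  # (7, 2): 9 + 7 = 16 is the first sum >= 13.5

# Values reported in the text: 84.29 Gb of PacBio data, 584.51 Mb genome estimate.
pacbio_depth = coverage_depth(84.29, 584.51)

# Share of the 558.74 Mb contig assembly anchored to 25 chromosomes (474.20 Mb).
anchored_percent = 474.20 / 558.74 * 100

# (Ks, divergence time in million years) pairs reported in the text.
reported = {
    "L. albus vs L. angustifolius": (0.12, 9.40),
    "L. albus vs P. vulgaris": (0.51, 39.97),
    "L. albus vs A. ipaensis": (0.52, 40.75),
    "Lupinus WGT": (0.28, 21.94),
}
rates = {name: implied_rate(ks, t) for name, (ks, t) in reported.items()}

if __name__ == "__main__":
    print("toy N50:", toy_n50)
    print(f"PacBio depth: {pacbio_depth:.1f}x")
    print(f"anchored: {anchored_percent:.2f}%")
    for name, rate in rates.items():
        print(f"{name}: implied r = {rate:.3e} per site per year")
```

Under these assumptions the four Ks and time pairs imply nearly the same rate, which suggests that a single clock rate was applied throughout; the Methods would be the place to confirm the actual value and formula.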