Genome and Transcriptome of Papaver Somniferum Chinese
Total Page:16
File Type:pdf, Size:1020Kb
Pei et al. Horticulture Research (2021) 8:5 Horticulture Research https://doi.org/10.1038/s41438-020-00435-5 www.nature.com/hortres ARTICLE Open Access Genome and transcriptome of Papaver somniferum Chinese landrace CHM indicates that massive genome expansion contributes to high benzylisoquinoline alkaloid biosynthesis Li Pei1,BaishiWang1,2,JianYe1,XiaodiHu3,LihongFu2,KuiLi3,ZhiyuNi2,4, Zhenlong Wang5, Yujie Wei6,LuyeShi5, Ying Zhang1,XueBai1, Mengwan Jiang5,ShuhuiWang7, Chunling Ma2, Shujin Li2,KaihuiLiu1, Wanshui Li1 and Bin Cong2 Abstract Opium poppy (Papaver somniferum) is a source of morphine, codeine, and semisynthetic derivatives, including oxycodone and naltrexone. Here, we report the de novo assembly and genomic analysis of P. somniferum traditional landrace ‘Chinese Herbal Medicine’. Variations between the 2.62 Gb CHM genome and that of the previously sequenced high noscapine 1 (HN1) variety were also explored. Among 79,668 protein-coding genes, we functionally annotated 88.9%, compared to 68.8% reported in the HN1 genome. Gene family and 4DTv comparative analyses with three other Papaveraceae species revealed that opium poppy underwent two whole-genome duplication (WGD) events. The first of these, in ancestral Ranunculales, expanded gene families related to characteristic secondary metabolite production and disease resistance. The more recent species-specific WGD mediated by transposable 1234567890():,; 1234567890():,; 1234567890():,; 1234567890():,; elements resulted in massive genome expansion. Genes carrying structural variations and large-effect variants associated with agronomically different phenotypes between CHM and HN1 that were identified through our transcriptomic comparison of multiple organs and developmental stages can enable the development of new varieties. These genomic and transcriptomic analyses will provide a valuable resource that informs future basic and agricultural studies of the opium poppy. Introduction used in traditional Chinese herbal medicine for ~1400 Opium poppy (Papaver somniferum L.), as one of the years1. The worldwide distribution of P. somniferum results longest utilized medicinal plants in human history, has from its long history of cultivation, and this species con- produced both great benefits and great challenges for tinues to serve as the major agricultural source for extrac- human civilization1. In particular, it has been cultivated and table pharmaceutical alkaloids used as narcotics, analgesics, and relaxants2.Therearefive main alkaloids that accumu- late in the capsular latex of P. somniferum:morphine, 2 Correspondence: Bin Cong ([email protected]) codeine, thebaine, papaverine, and noscapine .Theopium 1Institute of Forensic Science, Ministry of Public Security, No. 17 South Muxidi poppy is mostly grown to extract thebaine, the first penta- ’ Lane, Xicheng District, 100038 Beijing, People s Republic of China cyclic morphinan alkaloid3, which is then used as a sub- 2College of Forensic Medicine, Hebei Medical University, Hebei Key Laboratory of Forensic Medicine, Innovation Center of Forensic Medical Molecular strate to create natural (codeine and morphine) and Identification, No. 361 Zhongshan East Road, 050017 Shijiazhuang, Hebei, semisynthetic opioid alkaloids (naltrexone and hydro- ’ People s Republic of China codone). Morphine is the dominant alkaloid and the Full list of author information is available at the end of the article These authors contributed equally: Li Pei, Baishi Wang, Jian Ye, Xiaodi Hu, strongest naturally occurring analgesic. Industrial synthesis Lihong Fu, Kui Li, Zhiyu Ni © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a linktotheCreativeCommons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. Pei et al. Horticulture Research (2021) 8:5 Page 2 of 13 of morphine is possible, but only at low yields. Codeine is Results used as a cough suppressant, and papaverine, another CHM genome assembly and feature annotation metabolite of this pathway, is a smooth-muscle relaxant. Greater than 956.96 Gb (~279.10 × genome coverage) of Thereareawidevarietyofmethodsusedtoobtain sequence data were generated from opium poppy plants morphine equivalents for pharmacotherapeutic treat- (Supplementary Fig. S1) using a HiSeq X Ten instrument ment of health-related suffering. Impoverished popula- with read sizes ranging from 250 bp to 20 kb (Supple- tions in some developing countries lack access to pain mentary Table S1 and Supplementary Materials and relievers or palliative care. For example, an estimated Methods). Using 17-mer analysis, the genome size was 84% of the need for morphine equivalents in China is estimated to be 3.37 Gb (Supplementary Fig. S2 and not met. Addressing this major global health inequality Supplementary Table S2). For each library, we confirmed is a key priority for the World Health Organization4. that the raw data were not biased by measuring the dis- Neither synthetic chemical nor recombinant bio- tribution of insert sizes (Supplementary Table S1). After technological approaches are currently commercially filtering, the genome was assembled into 2.62 Gb (77.74% viable for any of the molecules of the morphinan sub- of the estimated genome size) with a scaffold N50 of class of benzylisoquinoline alkaloids (BIAs), making 6.86 Mb determined using Platanus13 and other scaffold- opium poppy the only commercial source of such pro- ing software described in the “Materials and methods” ducts2,5,6. As a result, opium poppy is a major cash crop section; 90% of the genome assembly was contained in contributing to the economies of many poppy-growing 2303 scaffolds. Through three-dimensional proximity countries, such as Turkey and India7,8. However, due to information generated by chromosome conformation the addictive properties of morphine, the cultivation of capture sequencing14, we linked the scaffolds into super- opium poppy without strict regulation is outlawed scaffolds using SALSA. Then, nucmer (version 3.23) was in China and other countries. The development of novel, used to anchor superscaffolds to the HN1 genome11. The high-yielding varieties through molecular marker- chromosomal locations of blocks mapped to the HN1 assisted breeding is therefore urgently needed to meet genome were retrieved for anchoring and orienting the global demand. superscaffolds to the corresponding chromosome (Sup- To date, the genomes of three species within the plementary Tables S3 and S4). The final assembly com- Papaveraceae have been published, including those of prised 11 pseudochromosomes (87.6% of the genome) Macleaya cordata9, Eschscholzia californica10,andthe (Supplementary Table S5), with the longest scaffold high noscapine variety of Papaver somniferum (HN1)11. (chromosome) of 287.96 Mb and superscaffolds N50 of Macleaya cordata (five-seeded plume poppy) was the 227.38 Mb (L50 = 5) (Fig. 1a, Table 1 and Supplementary firsttohaveitswholegenomesequencedamongBIA- Table S4). producing (e.g., sanguinarine, protopine, allocrypto- The completeness of gene regions was evaluated using pine, and chelerythrine) members of Papaveraceae9. Benchmarking Universal Single-Copy Orthologs (BUSCO, However,thegenomesizeofP. somniferum (HN1, version 4.0.5). Of the 1614 single-copy orthologs identi- ~2.72 Gb) is much larger than that of M. cordata fied in embryophytes, 96.4% were more complete in our (~378 Mb) and E. californica (~502 Mb). High-BIA- assembly than in that of HN1 (93.1%) (Fig. 1c). CEGMA producing cultivars have lost substantial genetic diver- assessment15 of CHM showed that this assembly captured sity through successive bottlenecks owing to domes- 96.77% (240 of 248) of the core eukaryotic genes and that tication and long-term selective breeding for traits that 94.35% (234/248) were complete (Supplementary Table increase yield12. Here, we report the draft genome of the S6). To further verify the accuracy of the assembly, we traditional CHM opium poppy and compare its gene used nucmer (version 3.23) and MCScan to analyze col- family composition and transposable elements with linearity with HN1 and found a high degree of synteny those of other members of Papaveraceae to better across the whole genome (Fig. 1b and Supplementary understand the evolutionary history leading to its Fig. S3). massive genome expansion. We also conducted in- We identified 1.69 Gb (65.79%) of the assembled CHM depth transcriptomic analysis across multiple tissues genomes as transposable elements (TEs) (Table 1 and and developmental stages to characterize the spatio- Supplementary Table S7). The predominant type of TE temporal and genetic basis of BIA synthesis in CHM was long terminal repeat (LTR) elements, which repre- compared to NH1. We further identified specific genetic sented ~54.4% (1.43 Gb) of the total genome of TEs. Most variations (SNPs and InDels) that differ