The Draft Genome Sequence of Herbaceous Diploid Bamboo
Total Page:16
File Type:pdf, Size:1020Kb
bioRxiv preprint doi: https://doi.org/10.1101/2020.04.27.064089; this version posted April 29, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 1 The draft genome sequence of herbaceous diploid 2 bamboo Raddia distichophylla 3 4 Wei Li,*,1 Cong Shi,†,1 Kui Li,†,1 Qun-jie Zhang,* Yan Tong,† Yun Zhang,† Jun Wang,‡ 5 Lynn Clark, §,2 and Li-zhi Gao*,† 2 6 7 *Institution of Genomics and Bioinformatics, South China Agricultural University, 8 Guangzhou 510642, China, †Plant Germplasm and Genomics Center, Germplasm Bank 9 of Wild Species in Southwestern China, Kunming Institute of Botany, Chinese Academy 10 of Sciences, Kunming 650204, China, ‡Southwest China Forestry University, Kunming 11 650224, China, and §Ecology, Evolution and Organismal Biology, Iowa State University, 12 251 Bessey Hall, Ames, 50011-1020, IA, USA. 13 14 1These authors contributed equally to this work. 15 2Correspond author: E-mail address: [email protected] & [email protected] 16 1 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.27.064089; this version posted April 29, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 17 ABSTRACT 18 Bamboos are important non-timber forest plants widely distributed in the tropical and 19 subtropical regions of Asia, Africa, America, and Pacific islands. They comprise the 20 Bambusoideae in the grass family (Poaceae), including approximately 1,700 described 21 species in 127 genera. In spite of the widespread uses of bamboo for food, construction 22 and bioenergy, the gene repertoire of bamboo still remains largely unexplored. Raddia 23 distichophylla (Schrad. ex Nees) Chase, belonging to the tribe Olyreae (Bambusoideae, 24 Poaceae), is diploid herbaceous bamboo with only slightly lignified stems. In this study, 25 we report a draft genome assembly of the approximately ~589 Mb whole-genome 26 sequence of R. distichophylla with a contig N50 length of 86.36 Kb. Repeated sequences 27 account for ~49.08% of the genome, of which LTR retrotransposons occupy ~35.99% of 28 whole genome. A total of 30,763 protein-coding genes were annotated in the R. 29 distichophylla genome with an average transcript size of 2,887 bp. Access to this 30 herbaceous bamboo genome sequence will provide novel insights into biochemistry, 31 molecular-assisted breeding programs and germplasm conservation for bamboo species 32 world-wide. 33 34 KEYWORDS 35 Bamboos, Raddia distichophylla, whole-genome sequence 36 37 2 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.27.064089; this version posted April 29, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 38 INTRODUCTION 39 Bamboos are important non-timber forest plants with a wide native geographic 40 distribution in tropical, subtropical and temperate regions except for Europe and 41 Antarctica (BambooPhylogenyGroup, 2012). Bamboos are of notable economic and 42 cultural significance worldwide, and can be used as food, bioenergy and building 43 materials. Bamboos comprise the Bambusoideae in grass family (Poaceae), including 44 approximately 1,700 described species in 127 genera (Clark and Oliveira, 2018; Soreng 45 et al., 2017; Vorontsova et al., 2016). Molecular phylogenetic analysis shows that 46 Bambusoideae falls into the Bambusoideae-Oryzoideae-Pooideae (BOP) clade and is 47 phylogenetically sister to Pooideae (Saarela et al., 2018). Bambusoideae may be divided 48 into two morphologically distinct growth forms: woody bamboos and herbaceous 49 bamboos (tribe Olyreae); woody bamboos can be further divided into two lineages: 50 temperate woody (Arundinarieae) and tropical woody (Bambuseae) 51 (BambooPhylogenyGroup, 2012; Kelchner and Group, 2013; Sungkaew et al., 2009). 52 Although the phylogenetic relationship between the two woody bamboo lineages is 53 still controversial (BambooPhylogenyGroup, 2012; Bouchenak-Khelladi et al., 2008; 54 Clark et al., 2015; GrassPhylogenyWorkingGroup et al., 2001; Sungkaew et al., 2009; 55 Triplett et al., 2014; Wysocki et al., 2016), their shared phenology and sexual systems are 56 suggestive of common ancestry (Wysocki et al., 2016). Woody bamboos form highly 57 lignified, usually hollow culms, complex rhizome systems, specialized culm leaves, 58 complex vegetative branching, and outer ligules on the foliage leaves 59 (BambooPhylogenyGroup, 2012; Clark et al., 2015). Woody bamboos possess bisexual 60 flowers and flower at intervals between 7 and 120 years usually followed by a die-off 61 (BambooPhylogenyGroup, 2012; Dransfield and Widjaja, 1995; Judziewicz et al., 1999). 62 All woody bamboos with known chromosome counts are uniformly polyploidy 63 (Soderstrom, 1981). 64 The tribe Olyreae comprises 22 genera and 124 described species native to tropical 65 America except Buergersiochloa and Olyra latifolia (BambooPhylogenyGroup, 2012; 66 Clark et al., 2015). Herbaceous bamboos are characterized by usually weakly developed 67 rhizomes and less lignification in the culms. Culm leaves and foliage leaves with the 68 outer ligule are absent in herbaceous bamboos. In contrast to woody bamboos, 3 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.27.064089; this version posted April 29, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 69 herbaceous bamboos have at least functionally unisexual flowers and they flower 70 annually or seasonally for extended periods (Clark et al., 2015; Gaut et al., 1997; 71 Kelchner and Group, 2013; Oliveira et al., 2014; Wysocki et al., 2015). The tribe is 72 fundamentally diploid, but chromosome counts indicating tetraploidy or hexaploidy or 73 possibly even octoploidy (in Eremitis genus) are available (Judziewicz et al., 1999; 74 Soderstrom, 1981). 75 Since the first comparative DNA sequence analysis of bamboos by Kelchner and 76 Clark (Kelchner and Clark, 1997), a number of studies have been further carried out 77 combined various biotechnologies (Das et al., 2005; Gui et al., 2010; Oliveira et al., 2014; 78 Peng et al., 2013; Sharma et al., 2008; Sungkaew et al., 2009; Wysocki et al., 2016; 79 Zhang et al., 2011). Peng et al. (Peng et al., 2013) reported the first draft genome of 80 tetraploid moso bamboo (Phyllostachys edulis). Recently, Guo et al. (Guo et al., 2019) 81 have released four draft genomes of bamboo lineages, including O. latifolia, R. 82 guianensis, Guadua angustifolia, and Bonia amplexicaulis. The lack of a high-quality 83 genome sequence for a diploid bamboo has been an impediment to our understanding of 84 bamboo biology and evolution. 85 R. distichophylla (Schrad. ex Nees) Chase, a diploid herbaceous bamboo, is almost 86 completely restricted to the forests of eastern Brazil (Oliveira et al., 2014). In this study, 87 we generate the complete genome sequence of R. distichophylla through the Illumina 88 sequencing platform. The availability of the fully sequenced and annotated genome will 89 provide functional, ecological and evolutionary insights into the bamboo species. 4 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.27.064089; this version posted April 29, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 90 METHODS AND MATERIALS 91 Sample collection, total DNA and RNA extraction and sequencing 92 The source plant was an individual of R. distichophylla grown in cultivation at the R.W. 93 Pohl Conservatory, Iowa State University. Fresh and healthy leaves were harvested and 94 immediately frozen in liquid nitrogen, followed by storage at -80°C in the laboratory 95 prior to DNA extraction. A modified CTAB method (Porebski et al., 1997) was used to 96 extract high-quality genomic DNA. The quantity and quality of the extracted DNA were 97 examined using a NanoDrop D-1000 spectrophotometer (NanoDrop Technologies, 98 Wilmington, DE) and electrophoresis on a 0.8% agarose gel, respectively. A total of 9 99 paired-end sequencing libraries, spanning 180, 300, 500, 2 000, 5 000, 10 000 and 20 000 100 bp, were prepared and sequenced on Illumina HiSeq 2000 platform. 101 Total RNA was extracted from four tissues (root, stem, young leaf and female 102 inflorescence), using a Water Saturated Phenol method. RNA libraries were built using 103 the Illumina RNA-Seq kit (mRNA-Seq Sample Prep Kit P/N 1004814). The extracted 104 RNA was quantified using NanoDrop-1000 UV-VIS spectrophotometer (NanoDrop), and 105 RNA integrity was checked using Agilent 2100 Bioanalyzer (Agilent Technologies, Palo 106 Alto, CA, USA). For each tissue, only total RNAs with a total amount ≥ 15 μg with a 107 concentration ≥ 400 ng/μl, RNA integrity number (RIN) ≥ 7, and rRNA ratio ≥ 1.4 were 108 used for constructing cDNA library according to the manufacturer’s instruction (Illumina, 109 USA). The libraries were then sequenced (100 nt, pair-end) with Illumina HiSeq 2000 110 platform. 111 112 De novo assembly of the R. distichophylla genome 113 Two orthogonal methods were used to estimate the genome size of R. distichophylla, 114 including k-mer frequency distribution and flow cytometric analysis. Firstly, we 115 generated the 17-mer occurrence distribution of sequencing reads using GCE v1.0.0 (Liu 116 et al., 2013), and the genome size was then calculated with the equation NLK( 1) 117 G , where G represents the genome size; N is the total number of reads LD 118 used; L is the average length of reads; K is set to 17. Secondly, the genome size was 119 further estimated and validated using flow cytometry analysis. We employed the rice 5 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.27.064089; this version posted April 29, 2020.