J. Plant Biol. (2012) 55:33–42
DOI 10.1007/s12374-011-9187-2

ORIGINAL RESEARCH

Analysis of ESTs from a Normalized cDNA Library of the Rhizome Tip of Oryza longistaminata

Ting Zhang, Lijuan Li, Fengyi Hu, Xiuqin Zhao, Binying Fu, Daichang Yang

Received: 4 August 2011 / Revised: 21 September 2011 / Accepted: 23 September 2011 / Published online: 4 October 2011
© The Botanical Society of Korea 2011

T. Zhang, X. Zhao, B. Fu (*)
Institute of Crop Sciences/National Key Facility for Crop Gene Resources and Genetic Improvement, Chinese Academy of Agricultural Sciences, 12 South Zhong-Guan-Cun St., Beijing 100081, China

T. Zhang, D. Yang (*)
College of Life Sciences, Wuhan University, Wuhan 430072, China

L. Li, F. Hu
Food Crops Research Institute, Yunnan Academy of Agricultural Sciences, Kunming 650205, China

Electronic supplementary material The online version of this article (doi:10.1007/s12374-011-9187-2) contains supplementary material, which is available to authorized users.

Abstract Oryza longistaminata, a perennial wild species with an AA genome, is characterized by the presence of rhizomatous stems. The rhizomatous trait in rice was previously shown to be quantitatively controlled by many genes, but the molecular mechanism related to rhizome development is still unknown. In the present study, expressed sequence tags (ESTs) generated from rhizome tips of O. longistaminata were collected and analyzed. A total of 10,283 complementary deoxyribonucleic acid clones were randomly sequenced, which generated 10,136 raw sequences and, finally, 4,419 unisequences with diverse functional categories. These unisequences were mapped onto the genome, which revealed that 4,285 (96.97%) and 4,151 (93.94%) of the unisequences were alignable to the japonica and indica genomic sequences, respectively, with >80% sequence identity. Additionally, 41 unisequences showed four typical types of alternative splicing patterns. More than 600 simple sequence repeats were identified in these unisequences. A subset of unisequences were physically colocalized onto rhizome-related quantitative trait locus intervals in rice and sorghum, and one gene, OLRR1, was further confirmed to be enriched in the rhizome tip and young leaf by real-time polymerase chain reaction and in situ hybridization. Unisequences reported in this study provide valuable data for molecular dissection of the rhizomatous growth habit in O. longistaminata.

Keywords Oryza longistaminata . Rhizome . Expressed sequence tags . Unisequence

Introduction

Rice (Oryza sativa) is a staple food for more than half of the world's population. However, rice productivity is continuously threatened by diverse environmental stresses. Wild rice relatives are a valuable source of genetic variation, such as resistance to diverse stresses, and provide genetic resources for improving agronomically important traits of cultivated rice (Nakagahra et al. 1997; Tanksley and McCouch 1997). Oryza longistaminata, a wild rice species indigenous to Africa, has the same AA genome as O. sativa (Ghesquiere 1985; Vaughan 1994) and possesses many important adaptive traits for cultivation, such as tolerance to cold and drought, resistance to diseases, a perennial life history, and a growth form characterized by strong rhizomatous stems (Song et al. 1995; Sacks et al. 2003).

Expressed sequence tags (ESTs) are a valuable resource for gene discovery, genome annotation, and comparative genomic analysis. The Rice Genome Research Program isolated and partially sequenced more than 29,000 complementary deoxyribonucleic acid (cDNA) clones from a variety of tissues and calli of the rice japonica cultivar Nipponbare (Yamamoto and Sasaki 1997). A total of 39,208 raw sequences were generated from a normalized cDNA library prepared from 15 different tissues of the indica cultivar Minghui 63 (Zhang et al. 2005).

According to the National Center for Biotechnology Information (NCBI) BioProject database, about 300,000 rice ESTs are currently lodged in the GenBank/DDBJ/EMBL databases. Comparative analyses of 5,211 leaf ESTs of Oryza minuta and 1,888 full-length cDNAs from Oryza rufipogon W1943 have been undertaken (Cho et al. 2004; Lu et al. 2008). Since the rice genome has been sequenced (Yu et al. 2002; International Rice Genome Sequencing Project 2005), the whole-transcriptome shotgun-sequencing procedure was developed to enable transcriptome profiling based on deep-sequencing technology. Application of the technique in rice has mainly focused on the model genotypes japonica Nipponbare and indica 93–11 (Wang et al. 2009b; Lu et al. 2010; Zhang et al. 2010). Deep sequencing of the root transcriptome of O. longistaminata revealed that 15.7% of transcripts showed no significant similarity to known sequences (Yang et al. 2010). These data contribute to an improved understanding of the genetic characteristics of wild rice species and to more effective use of wild rice genetic resources.

Of the two cultivated and 22 wild species of rice, O. longistaminata provides a model system for genetic and molecular dissection of the rhizomatous trait in grasses. In common with O. sativa, O. longistaminata possesses an AA genome. Much effort has been devoted to the identification of the molecular mechanisms underlying the rhizomatous trait in the bamboo Phyllostachys praecox, Sorghum propinquum, and especially O. longistaminata (Ghesquiere 1991; Ghesquiere and Causse 1992; Paterson et al. 1995; Maekawa et al. 1998; Hu et al. 2003; Jang et al. 2006, 2009; Wang et al. 2009a). Hu et al. (2003) reported that the rhizome phenotype in O. longistaminata is controlled by two dominant complementary genes, Rhz2 and Rhz3, located on chromosomes 3 and 4. Furthermore, Hu et al. (2011) identified rhizome-specific genes by genome-wide differential expression analysis in O. longistaminata and suggested that a complex gene regulatory network underlies rhizome development and growth.

Few studies have reported cDNA library construction from the rhizome in a crop plant. Herein, we describe, for the first time, analysis of ESTs generated from the rhizome tip of O. longistaminata, as a contribution to the molecular dissection of rhizome-related genes.

Materials and Methods

Plant Material and cDNA Library Construction

Accessions of O. longistaminata originating from Niger were cultured in a greenhouse at the Food Crops Research Institute, Yunnan Academy of Agricultural Sciences, China. The rhizome tips (distal 1 cm of young rhizomes) of plants at the active tillering stage were sampled for cDNA library construction. Total ribonucleic acid (RNA) was isolated from rhizome tips using TRIzol reagent (Invitrogen, Cat. No. 15596–018) according to the manufacturer's instructions and purified using the RNeasy MinElute Cleanup Kit (Qiagen, Cat. No. 74204). First-strand cDNA was synthesized using the Creator SMART cDNA Construction kit (Clontech, Cat. No. 634903) and normalized using the Trimmer-Director kit (Evrogen, Cat. No. NK002). Long-distance polymerase chain reaction (PCR) was applied for double-stranded cDNA synthesis using the Advantage 2 PCR kit (Clontech, Cat. No. 639207). The normalized double-stranded cDNA products were digested with SfiI for 2 h at 50°C, and the 1–3 kb cDNA fraction recovered after gel filtration was ligated to vector at 16°C overnight and transformed into competent Escherichia coli strain DH10B cells by electroporation. The average insert size was more than 1 kb, as characterized by colony PCR using 30 randomly selected colonies. In addition, 192 randomly selected colonies were sequenced to evaluate the proportion of full-length cDNAs and empty vectors.

Nucleotide Sequencing and Sequence Data Assembly

Randomly selected clones were sequenced with an ABI 3730 sequencer by the dideoxy chain termination method using the BigDye Terminator v2.0 Cycle Sequencing Ready Reaction Kit (Applied Biosystems). The sequence trace files were checked using the Phred base-calling software (phred_0.020425.c). Contaminating sequences, such as vector, ribosomal RNA (rRNA), and mitochondrial DNA sequences, were removed using the Cross Match program (University of Washington, Seattle, WA, USA) with the following parameters: penalty −2, minmatch 12, and minscore 20. All processed sequences were assembled using Phrap software, and the ORF Finder tool was used to identify open reading frames of each unigene.

Comparative Analysis of the EST Sequences

Similarity searches were performed with the BLAST program (Altschul et al. 1997) against sequence data in the NCBI GenBank, ntdb, nrdb, and dbEST databases, the Nipponbare genomic sequence (http://rgp.dna.affrc.go.jp/IRGSP/), 93–11 whole-genome shotgun sequences (http://rice.genomics.org.cn/rice/index2.jsp), and the National Center for Gene Research Rice Indica cDNA Database (http://www.ncgr.ac.cn/ricd). We downloaded all relevant sequence data, and the 4,419 unisequences were used as query sequences. A BLAST-like alignment tool, BLAT, was used to align the unisequences with the rice genomic sequence (Kent 2002). The similarity threshold was an E-value of less than 1e-10. The unisequences were classified into functional categories based on the Gene Ontology (GO) database.
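The similarity cutoffs used in the comparative analysis can be illustrated with a short sketch. It assumes the hits have already been parsed into simple records; the `Hit` type, the field names, and the example values are hypothetical and are not taken from the study's pipeline. The E-value cutoff (< 1e-10) is the one stated above, and the identity cutoff (> 80%) is the one quoted in the abstract.

```python
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    """One alignment hit (hypothetical record layout, not the authors' format)."""
    query: str
    subject: str
    pct_identity: float
    evalue: float


def best_hits(hits: Iterable[Hit],
              max_evalue: float = 1e-10,
              min_identity: float = 80.0) -> Dict[str, Hit]:
    """Keep, for each query, the lowest-E-value hit that passes both cutoffs."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        # Discard hits that fail either threshold (E-value < 1e-10, identity > 80%).
        if hit.evalue >= max_evalue or hit.pct_identity <= min_identity:
            continue
        current = best.get(hit.query)
        if current is None or hit.evalue < current.evalue:
            best[hit.query] = hit
    return best


# Made-up example records, for illustration only.
example_hits = [
    Hit("u1", "chr1", 95.0, 1e-50),
    Hit("u1", "chr2", 99.0, 1e-80),   # better E-value, replaces the chr1 hit
    Hit("u2", "chr3", 75.0, 1e-30),   # fails the identity cutoff
    Hit("u3", "chr5", 90.0, 1e-5),    # fails the E-value cutoff
]
mapped = best_hits(example_hits)
mapped_fraction = len(mapped) / 3     # 3 distinct queries in the example
```

Keeping a single best hit per query is a design choice of this sketch; a query with no surviving hit simply does not appear in the result, which is how an "unmapped" unisequence would be recognized.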

Alternative splicing patterns of the unisequences were predicted by comparison with EST or messenger RNA (mRNA) sequences lodged in the public databases specified above. Alternative splicing-specific primers were selected for PCR analysis to confirm the predicted alternative splicing patterns. Primer sequences are listed in Table S1. SSR motifs were detected using the Perl script MISA (Thiel et al. 2003; http://pgrc.ipk-gatersleben.de/misa).

Physical Mapping of ESTs Compared with Rhizome-related QTLs

Physical mapping of ESTs was performed by genetic alignment between the rhizome-related quantitative trait loci (QTLs) map derived from an RD23 × O. longistaminata F2 population and the physical location of ESTs obtained from The Institute for Genomic Research (TIGR) japonica rice assembly. Sorghum homologous genes corresponding to the unisequences were aligned against the TIGR sorghum assembly release 1 using the BLASTN algorithm.

Real-time PCR and In Situ Hybridization Analysis

Real-time PCR was performed using the ABI Prism 7300 Sequence Detection System (Applied Biosystems). Diluted cDNA was amplified using the primers OLRR1F (5′-CGTACCAGACCAACCAAT-3′) and OLRR1R (5′-GACCTGAGGCAGCCAAAG-3′) with the SYBR Green Master Mix (Applied Biosystems). We normalized the levels of OLRR1 transcripts against endogenous Actin transcripts amplified with the primers actin F (5′-TTATGGTTGGGATGGGACA-3′) and actin R (5′-AGCACGGCTTGAATAGCG-3′). Each set of experiments was repeated three times, and the relative quantification method (ΔΔCT) was used to evaluate quantitative variation.

In situ hybridization was carried out using the method of Jackson (1991). The apical portion (1 cm long) of the rhizome was excised and fixed carefully to avoid RNase contamination. Two templates were constructed by cloning the coding DNA sequence of OLRR1 into the pBluescript plasmid (Invitrogen). The antisense and sense RNA probes were transcribed separately with T3 and T7 RNA polymerase, respectively, after linearization of the plasmid, and were labelled with dig-UTP (Roche). Each experiment was performed three times using independent samples.

Results and Discussion

Sequence Analysis of the ESTs

A cDNA library was constructed using mRNA isolated from rhizome apical tissue of O. longistaminata. Overall complexity of the library was estimated to be 1.5 × 10^6. Sequencing of 10,283 randomly selected cDNAs generated 10,136 raw sequences with an average size of 461 bp without the vector, of which 8,691 high-quality ESTs ranging in length from 100 to 756 bp formed 4,419 unisequences with an average size of 556 bp (accession numbers of the 8,691 high-quality ESTs: JK502261–JK510951). Among these unisequences, 3,162 (71.55%) were single copies, 685 (15.5%) were from two overlapping sequences, 252 (5.7%) were from three overlapping sequences, and 320 (7.24%) were from four or more overlapping sequences (Table 1). Defining novelty as the ratio of unigenes to assembled ESTs, the novelty of the cDNA library was about 50.85% and the redundancy was 49.15%. The most frequently represented contig, consisting of 188 ESTs and spanning 521 bp, was shown by sequence similarity to represent a gene encoding a cytosolic triosephosphate isomerase of Zea mays (GenBank accession no. EU976612.1; E-value = 0). The longest contig was 2,525 bp in length, consisted of 101 ESTs, and was identified as a ubiquitin protein-coding gene on chromosome 6 of the rice genome (Gene ID: 4341860, Os06g0681400).

Table 1 Assembly result for all 4,419 unisequences from the cDNA library of rhizome apical tissue of Oryza longistaminata

Cluster size    No. of unigenes    Percentage of total (%)
1               3,162              71.55
2               685                15.50
3               252                5.70
4–5             170                3.85
6–10            82                 1.86
11–20           34                 0.77
21–50           26                 0.59
51–100          5                  0.11
>100            3                  0.07

Fig. 1 Chromosome distribution of the consensus sequences in rice genome
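The novelty and redundancy values quoted for the library, and the percentages in Table 1, follow directly from the reported counts. The sketch below simply redoes that arithmetic; the function and variable names are illustrative and are not part of the original analysis.

```python
def library_stats(n_unigenes: int, n_ests: int) -> tuple:
    """Return (novelty %, redundancy %), with novelty = unigenes / assembled ESTs."""
    novelty = 100.0 * n_unigenes / n_ests
    return round(novelty, 2), round(100.0 - novelty, 2)


# Counts reported in the text and in Table 1.
TOTAL_UNIGENES = 4419
HIGH_QUALITY_ESTS = 8691
cluster_counts = {
    "1": 3162, "2": 685, "3": 252, "4-5": 170, "6-10": 82,
    "11-20": 34, "21-50": 26, "51-100": 5, ">100": 3,
}

novelty, redundancy = library_stats(TOTAL_UNIGENES, HIGH_QUALITY_ESTS)

# Percentage of all unigenes that falls in each cluster-size class.
cluster_pct = {size: round(100.0 * n / TOTAL_UNIGENES, 2)
               for size, n in cluster_counts.items()}

# Unigenes assembled from four or more ESTs (all classes from "4-5" upward).
four_or_more = sum(n for size, n in cluster_counts.items()
                   if size not in ("1", "2", "3"))
four_or_more_pct = round(100.0 * four_or_more / TOTAL_UNIGENES, 2)
```

Summing the Table 1 classes from 4–5 upward gives 320 unigenes, which is 7.24% of 4,419.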

Fig. 2 Gene ontology classification of the unisequences from rhizome tip of Oryza longistaminata. a Molecular function, b Biological process, c Cellular component

Mapping of Unisequences onto Cultivated Rice Genomic Assembly

The 4,419 unisequences from O. longistaminata were mapped onto the O. sativa genome assembly using the BLAST program (E-value < 1e-10). A total of 4,285 (96.97%) and 4,151 (93.94%) unisequences were aligned with the Nipponbare and 93–11 genomic sequences, respectively, with >80% sequence identity over the entire length. Of these unisequences, 396 (8.96%) and 389 (8.8%) were aligned with the Nipponbare and 93–11 genomic sequences, respectively, with 100% sequence identity. These results indicated that the O. longistaminata sequences showed a very high similarity with those of japonica and indica cultivars of O. sativa, which was consistent with the conclusion reached for root transcriptome data for O. longistaminata (Yang et al. 2010). These sequences were mapped with almost equal distribution on all of the 12 chromosomes of cultivated rice (Fig. 1).

Unisequences that were not mapped on the O. sativa genomic sequence (71) were analyzed further. In comparisons with ntdb of NCBI, 48 unisequences were aligned with sequences from other species with sequence identity >85% over the entire length, most of which were from monocots. Of these unisequences, 35 showed similarity to ESTs of Z. mays, seven matched ESTs of Oryza glaberrima, two matched ESTs of O. rufipogon, and four were homologous to ESTs of Cryptococcus neoformans, O. minuta, Salmo salar, and Sepioteuthis lessoniana (Table S2). The remaining 23 unisequences showing no hit to any genomic sequence might represent genes specific to O. longistaminata.

Functional Classification of the ESTs

All 4,419 unisequences were searched with BLASTx against the UniProt plant database, of which 4,017 (90.9%) showed a significant alignment to existing gene models at an E-value threshold of 1e-10. The remaining 402 (9.1%) of the O. longistaminata unisequences did not match any known protein sequences.

In total, 2,494 (56.4%) unisequences were classified functionally according to the GO database. Among them, 1,604, 1,444, and 1,900 unisequences were classified by the GO into molecular function, biological process, and cellular component functions, respectively (Fig. 2). With regard to molecular function, nucleotide binding (11.9%), catalytic activity (11.6%), and protein binding (9.2%) were the most frequent categories. Cellular processes (8.8%) were the most highly represented category among biological processes. Of the cellular component category, 21.2% of the unisequences were predicted to match mitochondrion proteins. These results indicated that the rhizome apex acts as an apical meristem with high respiration for active metabolic activity.

Fig. 3 A total of 41 O. longistaminata unisequences had alternative splicing patterns different from previous ESTs or mRNAs in public databases. a Four types of alternative splicing events: the exon-skipping (ES) type, the intron-retention (IR) type, the alternative 5′ splice site (A5SS) type, and the alternative 3′ splice site (A3SS) type. b PCR analysis of alternative splicing patterns. Total RNA from O. longistaminata and specific primer sets were used. M DNA size markers (sizes in bp)
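In the functional classification, the per-aspect counts (1,604, 1,444, and 1,900) sum to 4,948, more than the 2,494 classified unisequences. This is expected if a single unisequence can be annotated in more than one GO aspect. The sketch below illustrates that counting with made-up annotations; the identifiers and the `count_by_aspect` helper are hypothetical and do not reproduce the study's GO assignment.

```python
from collections import Counter
from typing import Dict, Set


def count_by_aspect(annotations: Dict[str, Set[str]]) -> Counter:
    """Count how many sequences carry at least one term in each GO aspect."""
    counts: Counter = Counter()
    for aspects in annotations.values():
        counts.update(set(aspects))   # each sequence counts once per aspect
    return counts


# Hypothetical annotations: sequence id -> GO aspects in which it has a term.
example = {
    "u1": {"molecular_function", "biological_process", "cellular_component"},
    "u2": {"molecular_function"},
    "u3": {"biological_process", "cellular_component"},
}
aspect_counts = count_by_aspect(example)
n_classified = len(example)                      # 3 sequences
sum_over_aspects = sum(aspect_counts.values())   # 6, larger than 3
```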

Table 2 Information of the genes with specific alternative splicing patterns in O. longistaminata

Unigene                        Gene ID           Annotation

Exon skipping
sdb_Cluster3521.seq.Contig1    LOC_Os10g29514    Expressed protein
sdb_Cluster3521.seq.Contig2    LOC_Os10g29514    Expressed protein
sdb_Cluster4081.seq.Contig2    LOC_Os03g29350    von Willebrand factor type A domain containing protein, expressed
sdb_Cluster3979.seq.Contig2    LOC_Os03g27019    Expressed protein

Intron retention
sdb_Cluster3491.seq.Contig1    LOC_Os01g02720    Elongation factor Tu, putative, expressed
sdb_Cluster4257.seq.Contig3    LOC_Os05g04700    Uncharacterized protein family protein, expressed
sdb_Cluster4153.seq.Contig3    LOC_Os05g07690    Thioredoxin H-type, putative, expressed
sdb_Cluster3811.seq.Contig2    LOC_Os08g39100    Protein phosphatase 2C family protein, putative, expressed

Alternative 5′ splice site
sdb_Cluster3275.seq.Contig1    LOC_Os01g61710    Coatomer delta subunit, putative, expressed
sdb_Cluster1318                LOC_Os01g61814    40S ribosomal protein S23, putative, expressed
sdb_Cluster3532.seq.Contig2    LOC_Os02g45320    F-box family protein, putative, expressed
sdb_Cluster1290                LOC_Os02g47440    SNARE domain containing protein, expressed
sdb_Cluster3273.seq.Contig1    LOC_Os04g39864    Os4bglu11 - beta-glucosidase homolog, expressed
sdb_Cluster3837.seq.Contig1    LOC_Os05g46845    Conserved hypothetical protein
sdb_Cluster2546                LOC_Os07g13950    Hypothetical protein
sdb_Cluster4250.seq.Contig1    LOC_Os07g33921    60S ribosomal protein L44, putative, expressed
sdb_Cluster4177.seq.Contig1    LOC_Os08g45220    Expressed protein
sdb_Cluster4177.seq.Contig2    LOC_Os08g45220    Expressed protein
sdb_Cluster3794.seq.Contig1    LOC_Os09g07360    ABI3-interacting protein 2, putative, expressed
sdb_Cluster3878.seq.Contig1    LOC_Os09g17830    Protein transport protein Sec61 alpha subunit isoform 2, putative
sdb_Cluster3438.seq.Contig2    LOC_Os09g30478    Expressed protein
sdb_Cluster3541.seq.Contig1    LOC_Os09g36180    Glycosyl transferase family 8 protein
sdb_Cluster3779.seq.Contig2    LOC_Os10g42630    Expressed protein
sdb_Cluster3795.seq.Contig1    LOC_Os11g33330    Metallopeptidase family M24 containing protein, expressed
sdb_Cluster3981.seq.Contig1    LOC_Os12g37970    myb family transcription factor, putative, expressed
sdb_Cluster3981.seq.Contig2    LOC_Os12g37970    myb family transcription factor, putative, expressed

Alternative 3′ splice site
sdb_Cluster2149                LOC_Os01g20860    Phospholipase D. Active site motif family protein, expressed
sdb_Cluster4043.seq.Contig1    LOC_Os01g60440    HEAT repeat-containing protein, putative, expressed
sdb_Cluster3853.seq.Contig1    LOC_Os02g20310    HEAT repeat family protein, expressed
sdb_Cluster2775                LOC_Os02g36974    14-3-3 Protein, putative, expressed
sdb_Cluster4260.seq.Contig1    LOC_Os02g36974    14-3-3 Protein, putative, expressed
sdb_Cluster4260.seq.Contig2    LOC_Os02g36974    14-3-3 Protein, putative, expressed
sdb_Cluster3532.seq.Contig2    LOC_Os02g45320    F-box family protein, putative, expressed
sdb_Cluster3847.seq.Contig2    LOC_Os03g07420    Predicted 3-dehydroquinate synthase family protein, expressed
sdb_Cluster3964.seq.Contig2    LOC_Os03g49180    Alkaline phytoceramidase family protein, expressed
sdb_Cluster651                 LOC_Os03g61640    Expressed protein
sdb_Cluster4260.seq.Contig3    LOC_Os04g38870    14-3-3-Like protein GF14-6, putative, expressed
sdb_Cluster4250.seq.Contig1    LOC_Os07g33921    60S ribosomal protein L44, putative, expressed
sdb_Cluster3853.seq.Contig1    LOC_Os08g42189    Expressed protein
sdb_Cluster3853.seq.Contig1    LOC_Os08g42268    Expressed protein
sdb_Cluster3534.seq.Contig1    LOC_Os09g27700    Microtubule-associated protein family protein, putative, expressed
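The four event classes in Table 2 can be told apart by comparing the exon coordinates of a reference transcript with those of an alternative transcript. The following is a minimal sketch of that idea, not the prediction procedure used in this study. It assumes forward-strand transcripts given as sorted lists of 1-based inclusive (start, end) exons; on the reverse strand the A5SS and A3SS labels would swap. All coordinates are hypothetical.

```python
from typing import List, Set, Tuple

Exon = Tuple[int, int]


def introns(exons: List[Exon]) -> List[Tuple[int, int]]:
    """Return (donor, acceptor) pairs: end of exon i and start of exon i + 1."""
    return [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]


def overlaps(x: Exon, y: Exon) -> bool:
    return x[0] <= y[1] and y[0] <= x[1]


def classify(ref: List[Exon], alt: List[Exon]) -> Set[str]:
    """Label splicing differences between a reference and an alternative transcript."""
    events: Set[str] = set()
    ref_introns, alt_introns = introns(ref), introns(alt)

    # Intron retention: one alt exon covers both flanks of a reference intron.
    for donor, acceptor in ref_introns:
        if any(s <= donor and acceptor <= e for s, e in alt):
            events.add("IR")

    # Exon skipping: an internal reference exon lies inside an alt intron.
    for start, end in ref[1:-1]:
        if any(d < start and end < a for d, a in alt_introns):
            events.add("ES")

    # Alternative splice sites: one intron boundary shared, the other shifted,
    # with the shifted boundary belonging to overlapping exons.
    for i, (d, a) in enumerate(ref_introns):
        for j, (d2, a2) in enumerate(alt_introns):
            if a == a2 and d != d2 and overlaps(ref[i], alt[j]):
                events.add("A5SS")   # donor (5' splice site) differs
            if d == d2 and a != a2 and overlaps(ref[i + 1], alt[j + 1]):
                events.add("A3SS")   # acceptor (3' splice site) differs
    return events


# Hypothetical three-exon reference and four alternative isoforms.
reference = [(1, 100), (201, 300), (401, 500)]
skipped = classify(reference, [(1, 100), (401, 500)])
retained = classify(reference, [(1, 300), (401, 500)])
alt_donor = classify(reference, [(1, 120), (201, 300), (401, 500)])
alt_acceptor = classify(reference, [(1, 100), (221, 300), (401, 500)])
```

The overlap requirement in the last loop is what keeps a skipped exon from also being reported as an alternative splice site, since the exons flanking the longer intron do not overlap the skipped one.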

O. longistaminata Genes with Specific Alternative Splicing Patterns

Alternative splicing in plants plays an important role in modulating gene expression and ultimately plant form and function (Reddy 2007). Alternative 5′ and 3′ splice sites, intron retention, exon skipping, and mutually exclusive exon splicing are typical types of alternative splicing events (Graveley 2001; Black 2003; Blencowe 2006). Forty-one O. longistaminata unisequences showed specific alternative splicing patterns, which were classified into the exon skipping, intron retention, alternative 5′ splice site, and alternative 3′ splice site types (Fig. 3a). These types comprised 4, 4, 18, and 15 unisequences, respectively (Table 2). PCR analysis of alternative splicing patterns with specific primers confirmed the predicted alternative splicing events (Fig. 3b). Unisequences with alternative splicing patterns included genes encoding F-box proteins, PP2C proteins, a Myb transcription factor, and others with known or unknown functions. All of the detected alternative splicing patterns might be specific to O. longistaminata or unique to the rhizome.

SSR Analysis

EST-based simple sequence repeat (SSR) analysis is useful for dissection of genetic diversity among closely related species and cultivars (Nicot et al. 2004). Of the 4,419 unisequences examined, 516 unisequences contained 666 SSRs, representing 93 different motifs (Table 3). Trinucleotide repeats were the most abundant SSR type (241), followed by dinucleotide (190), mononucleotide (103), pentanucleotide (62), hexanucleotide (43), and tetranucleotide (27) types. The repeat AG/CT was the most abundant (74.74%) dinucleotide SSR, A/T repeats were the most abundant (96.12%) mononucleotide repeat motif, and CCG/CGG repeats were the most frequent (28.63%) trinucleotide SSR. These results were consistent with the results of a previous study on O. rufipogon (Lu et al. 2008). These SSR data will be valuable for EST-SSR marker development.

Table 3 Type of motifs of SSRs detected from 4,419 unisequences

Motif type       Number    Motif type       Number    Motif type       Number
AG/CT            142       AATG/ATTC        2         ACCCC/GGGGT      1
A/T              99        AGGG/CCCT        2         ACCTC/AGGTG      1
CCG/CGG          69        ATCG/ATCG        2         ACGAG/CGTCT      1
AGG/CCT          55        AAAACC/GGTTTT    2         ACGAT/ATCGT      1
AGC/CTG          45        AAGCAG/CTGCTT    2         ACGGG/CCCGT      1
AAG/CTT          37        ACCCAT/ATGGGT    2         ACTCC/AGTGG      1
AC/GT            27        AGCCGG/CCGGCT    2         ACTCG/AGTCG      1
AT/AT            15        AAAC/GTTT        1         AGCCG/CGGCT      1
ACG/CGT          14        AAAT/ATTT        1         AGCGG/CCGCT      1
AGAGG/CCTCT      12        AACC/GGTT        1         AGGGC/CCCTG      1
ATC/ATG          8         AAGC/CTTG        1         ATCGC/ATGCG      1
CG/CG            6         AATT/AATT        1         ATGCC/ATGGC      1
AAGAG/CTCTT      6         ACCG/CGGT        1         AAAAAT/ATTTTT    1
AGGCGG/CCGCCT    6         AGAT/ATCT        1         AAAGAT/ATCTTT    1
ACC/GGT          5         ATGC/ATGC        1         AAAGCC/CTTTGG    1
ACT/AGT          5         AAAAC/GTTTT      1         AAGACG/CGTCTT    1
C/G              4         AAAAG/CTTTT      1         AAGATG/ATCTTC    1
AGGGG/CCCCT      4         AAAAT/ATTTT      1         ACAGCC/CTGTGG    1
AAGAGG/CCTCTT    4         AAACC/GGTTT      1         ACCGAG/CGGTCT    1
AAC/GTT          3         AAACG/CGTTT      1         ACCGGC/CCGGTG    1
AATC/ATTG        3         AAAGC/CTTTG      1         ACGGCG/CCGTCG    1
AGCG/CGCT        3         AACCC/GGGTT      1         ACTCCG/AGTCGG    1
ATCC/ATGG        3         AAGCG/CGCTT      1         ACTGCT/AGCAGT    1
ACTGC/AGTGC      3         AAGCT/AGCTT      1         AGAGGG/CCCTCT    1
AGAGC/CTCTG      3         AAGGG/CCCTT      1         AGATGG/ATCTCC    1
AGGCG/CCTCG      3         AATCC/ATTGG      1         AGCAGG/CCTGCT    1
CCGCG/CGCGG      3         AATCG/ATTCG      1         AGCATG/ATGCTC    1
AAAAAG/CTTTTT    3         AATGC/ATTGC      1         AGCCGC/CGGCTG    1
ACCTCC/AGGTGG    3         ACACT/AGTGT      1         AGCCTG/AGGCTC    1
AAAG/CTTT        2         ACAGC/CTGTG      1         AGCTCC/AGCTGG    1
AAGG/CCTT        2         ACAGT/ACTGT      1         AGGCCG/CCTCGG    1

Colocalization of Unisequences and Rhizome-related QTLs in Rice and Sorghum

In a previous study, we genetically identified rhizome-related QTLs in 12 regions on eight rice chromosomes, and two major dominant-complementary genes (Rhz2 and Rhz3) controlling expression of the rhizomatous phenotype were localized on chromosomes 3 and 4 (Hu et al. 2003). In this study, all of the unisequences expressed in rhizome tips were aligned with O. sativa mRNA sequences, which revealed that 178 unisequences were colocalized to 10 of the QTL intervals (Table S3).

Five unisequences were mapped to the Rhz2 region; one of them encodes a response regulator receiver domain-containing protein (sdb_Cluster2813), which is analyzed further in the following section. Twenty-five unisequences were mapped to the Rhz3 region, which included a gene encoding a cryptochrome 1 apoprotein, a homolog of cryptochrome 1 (CRY1) in Arabidopsis. The Arabidopsis CRY1 was reported to restrain lateral-root growth by inhibiting auxin transport (Zeng et al. 2010). One unisequence (Sdb_Cluster1017) on the QRn5 interval was a homolog of Auxin Response Factor 3 in Arabidopsis, which is involved in developmental timing and patterning (Fahlgren et al. 2006). Another unisequence (sdb_Cluster1899) on the QRn5 interval is a homolog of trehalose 6-phosphate synthase in Arabidopsis, which is essential for embryogenic and vegetative growth and responsiveness to ABA in germinating seeds and stomatal guard cells (Gomez et al. 2010). A unisequence (Sdb_Cluster3971.seq.Contig1) mapped to the QRl7 region is a homolog of AtSAT32 in Arabidopsis, which is involved in root elongation (Park et al. 2009). These unisequences colocalized with rhizome-related QTLs may play important roles in the formation and development of rhizomes in O. longistaminata.

Additionally, five of the unisequences corresponding to known rhizome-specific expressed genes (Hu et al. 2011), comprising genes encoding betaine-aldehyde dehydrogenase (Os04g0464200), SL-TPS/P (Os05g0518600), naringenin-chalcone synthase (Os10g0472900), protein kinase domain-containing protein (Os01g0655500), and one hypothetical protein (Os07g0588800), were also colocalized to the regions of Rhz3, QRn5, QRn10, QRl1, and QRl7, respectively. The unisequences reported in this study will be a useful resource for gene discovery in O. longistaminata, especially of genes involved in rhizome formation and development.

Sorghum homologous genes corresponding to the unisequences were aligned against the TIGR sorghum assembly release 1 using the BLASTN algorithm, and 58 unisequences were comparatively mapped on sorghum rhizome-related QTL intervals (Paterson et al. 1995), including all of the six unisequences mapped on the Rhz2 region in O. longistaminata. With the completion of sorghum genome sequencing (Paterson et al. 2009), further comparative genomic study is needed to elucidate the molecular role of these rhizome-related QTL-associated candidate genes.

Expression Pattern of OLRR1

One unisequence (Sdb_Cluster2813) mapped on the Rhz2 region was a gene encoding a response regulator receiver domain-containing protein (Os03g0224200, OLRR1). The homologous gene AtARR1, which encodes a primary cytokinin-response transcription factor in Arabidopsis, has a role in the promotion of cell differentiation by activating the gene SHY2/IAA3 (Dello Ioio et al. 2008). Quantitative RT-PCR analysis of OLRR1 in five tissues of O. longistaminata, comprising the rhizome internode, young leaf, rhizome tip, shoot tip, and shoot internode, indicated that OLRR1 expression was abundant in the rhizome tip, shoot tip, and young leaf (Fig. 4). In situ hybridization indicated that OLRR1 was highly expressed in the apical meristem of the rhizome tip, suggesting it might play an important role in rhizome development.

Fig. 4 The expression pattern of OLRR1. a The qRT-PCR profiles (relative expression) of OLRR1 in various organs of O. longistaminata, including rhizome internodes (RI), rhizome tip (RT), young leaf (YL), shoot internodes (SI), and shoot tip (ST); bars denote standard deviation. b In situ hybridization of OLRR1 in rhizome tip. Scale bars 200 μm, longitudinal section, antisense probe. c OLRR1 in rhizome tip, longitudinal section, sense probe

Conclusion

In this study, we sequenced 10,283 cDNA clones from a normalized cDNA library constructed from rhizome tip tissues of O. longistaminata. A total of 4,419 unisequences were assembled and comparatively mapped onto the O. sativa genomic sequence; alternative splicing events and a set of SSRs were determined in these unisequences. Further bioinformatic analysis revealed that a number of unisequences were colocalized on rhizome-related QTL intervals in rice and sorghum. Collectively, these results provide valuable data for further identification of rhizome-related genes functionally involved in rhizome initiation and development.

Acknowledgment This work was supported by the National Natural Science Foundation of China (grant no. U0836605) and the Key Project from MOA (grant no. 2008ZX001-003).

References

Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402
Black DL (2003) Mechanisms of alternative pre-messenger RNA splicing. Annu Rev Biochem 72:291–336
Blencowe BJ (2006) Alternative splicing: new insights from global analyses. Cell 126:37–47
Cho SK, Ok SH, Jeung JU, Shim KS, Jung KW, You MK, Kang KH, Chung YS, Choi HC, Moon HP, Shin JS (2004) Comparative analysis of 5,211 leaf ESTs of wild rice (Oryza minuta). Plant Cell Rep 22:839–847
Dello Ioio R, Nakamura K, Moubayidin L, Perilli S, Taniguchi M, Morita MT, Aoyama T, Costantino P, Sabatini S (2008) A genetic framework for the control of cell division and differentiation in the root meristem. Science 322:1380–1384
Fahlgren N, Montgomery TA, Howell MD, Allen E, Dvorak SK, Alexander AL, Carrington JC (2006) Regulation of AUXIN RESPONSE FACTOR3 by TAS3 ta-siRNA affects developmental timing and patterning in Arabidopsis. Curr Biol 16:939–944
Ghesquiere A (1985) Evolution of Oryza longistaminata. In: Stephen JB (ed) Rice genetics I. International Rice Research Institute (IRRI), Philippines, pp 15–27
Ghesquiere A (1991) Re-examination of genetic control of the reproductive barrier between Oryza longistaminata and O. sativa and relationship with rhizome expression. In: Khush GS (ed) Rice genetics II. International Rice Research Institute (IRRI), Philippines, pp 729–730
Ghesquiere A, Causse M (1992) Linkage study between molecular markers and genes controlling the reproductive barrier in interspecific backcross between O. sativa and O. longistaminata. RGN 9:28–31
Gomez LD, Gilday A, Feil R, Lunn JE, Graham IA (2010) AtTPS1-mediated trehalose 6-phosphate synthesis is essential for embryogenic and vegetative growth and responsiveness to ABA in germinating seeds and stomatal guard cells. Plant J 64:1–13
Graveley BR (2001) Alternative splicing: increasing diversity in the proteomic world. Trends Genet 17:100–107
Hu FY, Tao DY, Sacks E, Fu BY, Xu P, Li J, Yang Y, McNally K, Khush GS, Paterson AH, Li ZK (2003) Convergent evolution of perenniality in rice and sorghum. Proc Natl Acad Sci USA 100:4050–4054
Hu F, Wang D, Zhao X, Zhang T, Sun H, Zhu L, Zhang F, Li L, Li Q, Tao D, Fu B, Li Z (2011) Identification of rhizome-specific genes by genome-wide differential expression analysis in Oryza longistaminata. BMC Plant Biol 11:18
International Rice Genome Sequencing Project (2005) The map-based sequence of the rice genome. Nature 436:793–800
Jackson D (1991) In-situ hybridization in plants. In: Bowles DJ, Gurr SJ, McPherson M (eds) Molecular plant pathology: a practical approach. Oxford Univ Press, Oxford, UK, pp 163–174
Jang CS, Kamps TL, Skinner DN, Schulze SR, Vencill WK, Paterson AH (2006) Functional classification, genomic organization, putatively cis-acting regulatory elements, and relationship to quantitative trait loci, of sorghum genes with rhizome-enriched expression. Plant Physiol 142:1148–1159
Jang CS, Kamps TL, Tang H, Bowers JE, Lemke C, Paterson AH (2009) Evolutionary fate of rhizome-specific genes in a non-rhizomatous sorghum genotype. Heredity 102:266–273
Kent WJ (2002) BLAT: the BLAST-like alignment tool. Genome Res 12:656–664
Lu TT, Yu S, Fan D, Mu J, Shangguan Y, Wang Z, Minobe Y, Lin Z, Han B (2008) Collection and comparative analysis of 1888 full-length cDNAs from wild rice Oryza rufipogon Griff. W1943. DNA Res 15:285–295
Lu T, Lu G, Fan D, Zhu C, Li W, Zhao Q, Feng Q, Zhao Y, Guo Y, Huang X, Han B (2010) Function annotation of the rice transcriptome at single-nucleotide resolution by RNA-seq. Genome Res 20:1238–1249
Maekawa M, Inukai T, Rikiishi K, Matsuura T, Govidaraj KG (1998) Inheritance of the rhizomatous traits in hybrid of Oryza longistaminata Chev. et Roehr. and O. sativa L. SABRAO J Breeding Genet 30:69–72
Nakagahra M, Okuno K, Vaughan D (1997) Rice genetic resources: history, conservation, investigative characterization and use in Japan. Plant Mol Biol 35:69–77
Nicot N, Chiquet V, Gandon B, Amilhat L, Legeai F, Leroy P, Bernard M, Sourdille P (2004) Study of simple sequence repeat (SSR)

markers from wheat expressed sequence tags (ESTs). Theor Appl Vaughan DA (1994) Wild relatives of rice: genetic resources handbook. Genet 109:800–805 International Rice Research Institute (IRRI), Philippines, pp 46–47 Park MY, Chung MS, Koh HS, Lee DJ, Ahn SJ, Kim CS (2009) Wang K, Peng H, Lin E, Jin Q, Hua X, Yao S, Bian H, Han N, Pan J, Isolation and functional characterization of the Arabidopsis salt- Wang J, Deng M, Zhu M (2009a) Identification of genes related to tolerance 32 (AtSAT32) gene associated with salt tolerance and the development of bamboo rhizome bud. J Exp Bot 61:551–561 ABA signaling. Physiol Plant 135:426–435 Wang Z, Gerstein M, Snyder M (2009b) RNA-Seq: a revolutionary Paterson AH, Schertz KF, Lin YR, Liu SC, Chang YL (1995) The tool for transcriptomics. Nat Rev Genet 10:57–63 weediness of wild plants: molecular analysis of genes influencing Yamamoto K, Sasaki T (1997) Large-scale EST sequencing in rice. dispersal and persistence of johnsongrass, Sorghum halepense Plant Mol Biol 35:135–144 (L.) Pers. Proc Natl Acad Sci U S A 92:6127–6131 Yang H, Hu L, Hurek T, Reinhold-Hurek B (2010) Global characteriza- Paterson AH, Bowers JE, Bruggmann R, Dubchak I, Grimwood J, tion of the root transcriptome of a wild species of rice, Oryza Gundlach H, Haberer G, Hellsten U, Mitros T, Poliakov A, longistaminata, by deep sequencing. 
BMC Genomics 11:705 Schmutz J, Spannagl M, Tang H, Wang X, Wicker T, Bharti AK, Yu J, Hu S, Wang J, Wong GK, Li S, Liu B, Deng Y, Dai L, Zhou Y, Chapman J, Feltus FA, Gowik U, Grigoriev IV, Lyons E, Maher Zhang X, Cao M, Liu J, Sun J, Tang J, Chen Y, Huang X, Lin W, CA, Martis M, Narechania A, Otillar RP, Penning BW, Salamov Ye C, Tong W, Cong L, Geng J, Han Y, Li L, Li W, Hu G, Li J, AA, Wang Y, Zhang L, Carpita NC, Freeling M, Gingle AR, Liu Z, Qi Q, Li T, Wang X, Lu H, Wu T, Zhu M, Ni P, Han H, Hash CT, Keller B, Klein P, Kresovich S, McCann MC, Ming R, Dong W, Ren X, Feng X, Cui P, Li X, Wang H, Xu X, Zhai W, Peterson DG, Mehboob ur R, Ware D, Westhoff P, Mayer KFX, Xu Z, Zhang J, He S, Xu J, Zhang K, Zheng X, Dong J, Zeng W, Messing J, Rokhsar DS (2009) The Sorghum bicolor genome and Tao L, Ye J, Tan J, Chen X, He J, Liu D, Tian W, Tian C, Xia H, the diversification of grasses. Nature 457:551–556 Bao Q, Li G, Gao H, Cao T, Zhao W, Li P, Chen W, Zhang Y, Hu Reddy AS (2007) Alternative splicing of pre-messenger RNAs in J, Liu S, Yang J, Zhang G, Xiong Y, Li Z, Mao L, Zhou C, Zhu plants in the genomic era. Annu Rev Plant Biol 58:267–294 Z, Chen R, Hao B, Zheng W, Chen S, Guo W, Tao M, Zhu L, Sacks EJ, Roxas JP, Sta CM (2003) Developing perennial upland Yuan L, Yang H (2002) A draft sequence of the rice genome rice II: filed performance of S1 families from an intermated (Oryza sativa L. ssp. indica). Science 296:79–92 Oryza sativa/O. longistaminata population. Crop Sci 43:129– Zeng J, Wang Q, Lin J, Deng K, Zhao X, Tang D, Liu X (2010) 134 Arabidopsis cryptochrome-1 restrains lateral roots growth by Song WY, Wang GL, Chen LL, Kim HS, Pi LY, Holsten T, Gardner J, inhibiting auxin transport. 
J Plant Physiol 167:670–673 Wang B, Zhai WX, Zhu LH, Fauquet C, Ronald P (1995) A Zhang J, Feng Q, Jin C, Qiu D, Zhang L, Xie K, Yuan D, Han B, Zhang receptor kinase-like protein encoded by the rice disease resistance Q, Wang S (2005) Features of the expressed sequences revealed by a gene, Xa21. Science 270:1804–1806 large-scale analysis of ESTs from a normalized cDNA library of the Tanksley SD, McCouch SR (1997) Seed banks and molecular maps: elite indica rice cultivar Minghui 63. Plant J 42:772–780 unlocking genetic potential from the wild. Science 277:1063–1066 Zhang G, Guo G, Hu X, Zhang Y, Li Q, Li R, Zhuang R, Lu Z, He Z, Thiel T, Michalek W, Varshney RK, Graner A (2003) Exploiting EST Fang X, Chen L, Tian W, Tao Y, Kristiansen K, Zhang X, Li S, databases for the development and characterization of gene-derived Yang H, Wang J (2010) Deep RNA sequencing at single base- SSR-markers in barley (Hordeum vulgare L.). Theor Appl Genet pair resolution reveals high complexity of the rice transcriptome. 106:411–422 Genome Res 20:646–654