Pleurochrysome: A Web Database of Pleurochrysis Transcripts and Orthologs Among Heterogeneous Algae

Special Online Collection – Database Paper

Naoki Yamamoto1,4,5, Toru Kudo1,4, Shoko Fujiwara2,4,*, Yukiko Takatsuka2, Yasutaka Hirokawa2, Mikio Tsuzuki2, Tomoyuki Takano1, Masaaki Kobayashi1, Kunihiro Suda3, Erika Asamizu3, Koji Yokoyama1, Daisuke Shibata3, Satoshi Tabata3 and Kentaro Yano1,*

1Bioinformatics Laboratory, School of Agriculture, Meiji University, 1-1-1 Higashi-mita, Tama-ku, Kawasaki, Kanagawa, 214-8571 Japan
2School of Life Sciences, Tokyo University of Pharmacy and Life Sciences, 1432-1 Horinouchi, Hachioji, Tokyo, 192-0392 Japan, CREST, Japan
3Kazusa DNA Research Institute, 2-6-7 Kazusa-kamatari, Kisarazu, Chiba, 292-0818 Japan
4These authors contributed equally to this work.
5Present address: International Rice Research Institute, DAPO 7777, Metro Manila 1301, Philippines.
*Corresponding authors: Shoko Fujiwara, E-mail, [email protected]; Fax, +81-44-934-7046; Kentaro Yano, E-mail: kyano@isc.meiji.ac.jp; Fax, +81-42-676-6721.

(Received September 1, 2015; Accepted November 27, 2015)

Pleurochrysis is a coccolithophorid genus which belongs to the Coccolithales in the Haptophyta. The genus has been used extensively for biological research, together with Emiliania in the Isochrysidales, to understand the distinctive features of the two coccolithophorid-including orders. However, molecular biological research on Pleurochrysis, such as elucidation of the molecular mechanism behind coccolith formation, has not made great progress, at least in part because of a lack of comprehensive gene information. To provide such information to the research community, we built an open web database, the Pleurochrysome (http://bioinf.mind.meiji.ac.jp/phapt/), which currently stores 9,023 unique gene sequences (designated as UNIGENEs) assembled from expressed sequence tag sequences of P. haptonemofera as core information. The UNIGENEs were annotated with gene sequences sharing significant homology, conserved domains, Gene Ontology, KEGG Orthology, predicted subcellular localization, open reading frames and orthologous relationships with genes of 10 other algal species, a cyanobacterium and the yeast Saccharomyces cerevisiae. This sequence and annotation information can be easily accessed via several search functions. Besides fundamental functions such as BLAST and keyword searches, this database also offers search functions to explore orthologous genes in the 12 organisms and to seek novel genes. The Pleurochrysome will promote molecular biological and phylogenetic research on coccolithophorids and other haptophytes by helping scientists mine data from the primary transcriptome of P. haptonemofera.

Keywords: Coccolithophorids • Expressed sequence tag • Haptophyta • Integrated database • Ortholog • Pleurochrysis

Abbreviations: C-cells, coccolith-bearing cells; EST, expressed sequence tag; GO, Gene Ontology; ID, identifier; ORF, open reading frame.

Plant Cell Physiol. 57(1): e6(1–7) (2016) doi:10.1093/pcp/pcv195, Advance Access publication on 7 January 2016, available online at www.pcp.oxfordjournals.org
© The Author 2016. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Introduction

Members of the genus Pleurochrysis are coccolithophorids, which produce calcified scales, called coccoliths, on the cell surface. Coccolithophorids are included in two orders of the Haptophyta, the Coccolithales and Isochrysidales (Edvardsen et al. 2000). As representatives of the orders, Pleurochrysis (Coccolithales) and Emiliania (Isochrysidales) have been used for coccolithophorid research as they can be cultured in the laboratory (Berges et al. 2001, Marsh 2003). To date, research with the two genera has revealed that the coccolith production mechanism and other morpho-physiological characteristics are very different between the two orders, e.g. the subcellular compartments where coccolith production occurs (van der Wal et al. 1983, Inouye and Pienaar 1984, Westbroek et al. 1989), chemical variation of coccolith acid polysaccharides (van Emburg et al. 1986, Marsh et al. 1992, Hirokawa et al. 2005) and the morphology of the base plate of the coccolith (de Vrind-de Jong and de Vrind 1997). Furthermore, although the Haptophyta have been regarded as members of the formerly proposed supergroup Chromalveolates, which probably gained photosynthetic ability through secondary endosymbiosis with rhodophyte(s) (Jordan and Chamberlain 1997, Cavalier-Smith 1999), the phylogenetic position of the Haptophyta is still being resolved (Burki et al. 2007, Hackett et al. 2007, Burki et al. 2012).

To understand the molecular mechanisms behind the biological functions and to uncover the evolutionary history, comprehensive gene information is very important since such information accelerates molecular biological studies and enables comparative genomics. Recently, a reference genome of Emiliania huxleyi, which is the predominant coccolithophorid species in the current ecosystem, has been published (Read et al. 2013). On this E. huxleyi genome, 30,569 gene models were predicted (Read et al. 2013) and their sequence information is available in the genome portal of the Joint Genome Institute (JGI) (Nordberg et al. 2014). However, there is no Pleurochrysis database, and publicly available gene sequence information on Pleurochrysis is limited to only 156 nucleotide and 64 protein sequences deposited in the GenBank database (Benson et al. 2013). Therefore, to facilitate research with Pleurochrysis toward elucidating phycological issues including the coccolith formation mechanism and the evolutionary history, it is crucial to establish comprehensive gene information for the genus and to provide it through a freely available and easily accessible platform, more specifically a public web database.

Here, we introduce a Pleurochrysis transcript sequence database, the Pleurochrysome (http://bioinf.mind.meiji.ac.jp/phapt/). The Pleurochrysome stores >9,000 transcript sequences of Pleurochrysis haptonemofera, which were constructed from expressed sequence tag (EST) sequences newly or previously analyzed (Fujiwara et al. 2007), together with their structural and functional annotation derived from our bioinformatic analyses. To allow comparative genomics approaches, the database incorporates information on orthologous sequences among 12 divergent unicellular species: 10 algae, a cyanobacterium and a yeast. The database offers fundamental and unique search functions to explore candidate genes relevant to targeted traits and novel genes in Pleurochrysis. To our knowledge, this is the first 'full-blown' web database which provides comprehensive transcript information in the Haptophyta.

Results

Information stored in the database

To obtain comprehensive transcript information, we newly analyzed 4,924 EST sequences of P. haptonemofera in addition to the 9,564 ESTs which we had already sequenced (Fujiwara et al. 2007; DDBJ accession Nos. HX954614–HX969076). These EST sequences were assembled into contigs after quality control. The resulting sequences of 1,868 contigs and 7,155 singlets, […] database (nt), EST database (dbEST) and protein database (nr) using each UNIGENE sequence as a query. Applying 1E-10 as a threshold e-value, 3,215 UNIGENEs were annotated with at least one similar sequence in the databases. Conserved domain information and Gene Ontology (GO) information were retrieved by InterProScan. Conserved domains were detected in 5,786 UNIGENEs, and GO terms were assigned to 3,157 of the UNIGENEs where a domain was found. Analysis with the KEGG Automatic Annotation Server yielded KEGG Orthology information for 994 UNIGENEs. To obtain information related to subcellular localization of polypeptides potentially encoded by UNIGENEs, analyses using the SignalP (Petersen et al. 2011), ChloroP (Emanuelsson et al. 1999) and WoLF PSORT (Horton et al. 2007) programs were also performed. SignalP predicted 'Signal peptide' and 'Signal anchor' in 3,912 and 6,771 UNIGENEs, respectively; ChloroP predicted that 7,485 UNIGENEs may encode plastid-localized proteins. It should be noted that the localization predictions do not take into account the probability that the analyzed frame is the true ORF of the transcript, so they should be read together with the results of open reading frame (ORF) estimation. The estimation of ORFs was performed using the ESTScan tool (Iseli et al. 1999). The result showed that significant ORFs were deduced in 4,210 UNIGENEs. Hence, the subcellular localization predicted in the estimated ORFs of the 4,210 UNIGENEs would be more reliable than that in the other ORFs. Orthologous genes for UNIGENEs were predicted based on results obtained by analysis with OrthoMCL (Chen et al. 2006) and BLAST search as described later. These annotation and prediction results are collectively housed in the Pleurochrysome.

Construction and search functions of the database

The Pleurochrysome was developed as a web database so that users can access the database via the Internet using a web browser. The database and user interfaces were built as a typical server–client system using open source software: Linux (CentOS release 5.10,
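
The filtering logic described under 'Information stored in the database' can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the record type `Hit`, the function names, the UNIGENE identifiers such as `UG0001` and the choice of an inclusive comparison against the 1E-10 threshold are all assumptions made for illustration. Only the threshold value and the contig, singlet and UNIGENE counts are taken from the text.

```python
from dataclasses import dataclass

# Threshold e-value stated in the text; treating it as inclusive is an assumption.
E_VALUE_THRESHOLD = 1e-10


@dataclass(frozen=True)
class Hit:
    """One similarity-search hit (hypothetical record layout)."""
    unigene_id: str
    subject_id: str
    e_value: float


def annotated_unigenes(hits, threshold=E_VALUE_THRESHOLD):
    """Return the IDs of UNIGENEs with at least one hit at or below the threshold."""
    return {h.unigene_id for h in hits if h.e_value <= threshold}


def reliable_localizations(localization, significant_orf_ids):
    """Keep localization predictions only for UNIGENEs with a significant ORF.

    This mirrors the caveat in the text: a localization predicted on a frame
    that ORF estimation did not support should be treated with more caution.
    """
    return {uid: loc for uid, loc in localization.items()
            if uid in significant_orf_ids}


# Counts reported in the text: contigs plus singlets match the UNIGENE total.
assert 1_868 + 7_155 == 9_023

# Hypothetical example data, not taken from the database.
hits = [
    Hit("UG0001", "subjA", 1e-50),
    Hit("UG0001", "subjB", 1e-5),
    Hit("UG0002", "subjC", 1e-3),
]
localization = {"UG0001": "plastid", "UG0002": "signal peptide"}
significant_orfs = {"UG0002"}

print(annotated_unigenes(hits))
print(reliable_localizations(localization, significant_orfs))
```

Keeping the ORF check as a separate step leaves the raw localization predictions untouched, so they can still be inspected for UNIGENEs without a significant ORF.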