Functional Annotation of Conserved Hypothetical Proteins in Rickettsia Massiliae MTU5
Journal of Computer Science & Systems Biology - Open Access
Research Article. OPEN ACCESS, freely available online. doi:10.4172/jcsb.1000055. www.omicsonline.com. JCSB/Vol.3 Issue 2
J Comput Sci Syst Biol, Volume 3(2): 050-052 (2010). ISSN: 0974-7230. JCSB, an open access journal

Joy Hoskeri. H (1), V. Krishna (1) and C. Amruthavalli (2)
(1) P.G. Department of Studies and Research in Biotechnology and Bioinformatics, Kuvempu University, Shankaraghatta – 577 451, Karnataka, India
(2) Bioinformatics Division, Centre for Information Science and Technology, University of Mysore, Mysore

*Corresponding author: Joy Hoskeri. H, P.G. Department of Studies and Research in Biotechnology and Bioinformatics, Kuvempu University, Shankaraghatta – 577 451, Karnataka, India, Tel: +91-9343544218; E-mail: [email protected]

Received March 22, 2010; Accepted April 11, 2010; Published April 11, 2010

Citation: Hoskeri JH, Krishna V, Amruthavalli C (2010) Functional Annotation of Conserved Hypothetical Proteins in Rickettsia Massiliae MTU5. J Comput Sci Syst Biol 3: 050-052. doi:10.4172/jcsb.1000055

Copyright: © 2010 Hoskeri JH, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Completed genome sequences are rapidly increasing for Rickettsia, but most of the genes in them do not have an assigned function. The present investigation focused on identifying the function of hypothetical proteins of Rickettsia massiliae MTU5 that are conserved in Rickettsia species. A total of 114 such conserved hypothetical proteins were selected. Function annotation was carried out using databases such as SMART, InterProScan, Pfam, JAFA, COG, and BLAST. Among the 114 conserved hypothetical proteins, only 35 were annotated. The study motivated us to thoroughly study the genome of Rickettsia massiliae MTU5 by a bioinformatics approach in order to categorize the proteins whose function was unknown.

Keywords: Functional annotation; Hypothetical proteins; Rickettsia massiliae MTU5

Introduction

Over the last decade, the genomes of several hundred organisms have been sequenced (http://www.ncbi.nlm.nih.gov/Genomes/), providing a tremendous amount of data that needs to be interpreted and decorated with functions.

Approximately 10 complete Rickettsia genome sequences have been deposited in public databases. Rickettsiae have been commonly detected in Rhipicephalus ticks from Central Africa, France, Greece, Mali, Portugal, Spain, Switzerland, and the United States (Matsumoto et al., 2005; Eremeeva et al., 2005; Eremeeva et al., 2006). R. massiliae might be commonly associated with these worldwide-distributed ticks. R. massiliae has also been recently identified in Ixodes ricinus ticks (Fernandez-Soto et al., 2006). The sequencing and primary analysis of the genome of R. massiliae strain MTU5, isolated from a Rhipicephalus turanicus tick collected on horses in Camargues, France, have been reported (Blanc et al., 2007).

Rickettsia rickettsii, the cause of Rocky Mountain spotted fever (RMSF), is the prototype bacterium of the spotted fever group of rickettsiae. Rickettsia massiliae is also found in the Americas and is transmitted to humans through the bite of infected ticks. Rickettsiae are a group of organisms belonging to the class Alphaproteobacteria, a large and metabolically diverse group of gram-negative bacteria (Olsen et al., 1994; Stothard et al., 1995; Weisburg et al., 1989).

In Spain, R. massiliae is prevalent in ticks (Cardenosa et al., 2003). These virulent species of rickettsiae are of great interest both as emerging infectious diseases (Azad, 1998) and for their potential deployment as bioterrorism agents (Azad, 2007). Species in the genus Rickettsia are obligate intracellular symbionts of plants (Beati and Raoult, 1993). Rickettsia massiliae was first isolated from Rhipicephalus sanguineus collected in Marseille (France) in 1992 (Beati et al., 1993a; Beati et al., 1993b). The Rickettsiae have very small genomes of about 1.0-1.5 million bases. Certain segments of rickettsial genomes resemble those of mitochondria (Emelyanov, 2003). The deciphered genome of Rickettsia prowazekii is 1,111,523 bp long and contains 834 protein-coding genes (Andersson et al., 1998).

However, in any newly sequenced bacterial genome, as many as 30-40% of the genes do not have an assigned function. A significant proportion of the remaining 60-70% of genes, for which functional annotations have been made, are often imprecisely described or assigned vague functions. The functional annotations are in most cases derived by inference rather than by experiment, through the observation of some level of sequence identity to a characterized gene product from another organism. Genomics itself provides that rare opportunity in science where the boundaries of current knowledge can be clearly defined.

Completed genome sequences are rapidly increasing for Rickettsia, but most of the genes in them do not have an assigned function. In the present investigation, we selected a true human pathogen, the MTU5 strain of Rickettsia massiliae, for function annotation of its hypothetical proteins.

Materials and Methods

The method comprises a new protocol for function annotation of hypothetical proteins found in R. massiliae MTU5 that are conserved in Rickettsia species, using a bioinformatics approach. It departs from many paradigms of function annotation and does not involve any conventional methods.

Annotation of the identified proteins was facilitated by highly automated bioinformatics tools, such as the ExPASy Proteomics server, the NCBI website, and the KEGG database (Kyoto Encyclopedia of Genes and Genomes). These sites provided many links to different branches of information and special annotations for individual data. Function annotation was carried out using databases such as SMART (http://smart.embl-heidelberg.de/), InterProScan (http://www.ebi.ac.uk/Tools/InterProScan/), Pfam (http://pfam.sanger.ac.uk/), JAFA (http://jafa.burnham.org/), COG (http://www.ncbi.nlm.nih.gov/COG/), and BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi). In addition to our own extensive literature survey, bioinformatics tools helped us to check experimental data by comparison with theoretical data and with the results of other fields (Figure 1).

As per the data from NCBI (http://www.ncbi.nlm.nih.gov), Rickettsia massiliae MTU5 has a total of 1423 genes, with 968 protein-coding genes and 416 pseudogenes with other coding frames. Among these, nearly 296 hypothetical proteins were found. Further, only the hypothetical proteins that were conserved in Rickettsia species were considered for function annotation. The results so obtained are presented in Table 1 (included as supplementary information).

Figure 1: Flowchart depicting the overall methodology.

Results and Discussion

In this investigation, the genome of Rickettsia massiliae MTU5 was thoroughly studied by identifying the proteins whose function was not annotated and that are conserved in Rickettsia, using various bioinformatics tools that helped us in identifying the genome strategies of the strain Rickettsia massiliae MTU5. The results are presented in Table 1 (included as supplementary information).

The central goal of this study was to determine protein functions from these sequences. Functional annotation of conserved hypothetical proteins is of major importance in providing insights into their molecular functions and will help in the identification of new drugs against spotted fever caused by Rickettsia massiliae MTU5 infection. Table 1 shows the functional proteomics of Rickettsia massiliae MTU5 hypothetical proteins obtained using various online tools and databases. These hypothetical proteins were further studied for their conservation across the parent genus Rickettsia. This resulted in 114 conserved hypothetical proteins. The function of these 114 proteins was then identified based on the results obtained by studying their sequences with databases and tools such as SMART, InterProScan, Pfam, JAFA, COG, and NCBI-BLAST.

We followed a different protocol for function annotation of conserved hypothetical proteins. This protocol predicts protein function using several different online functional annotation tools. The results of all these tools were compared with one another; where the results proved similar across most of the tools, that is, for proteins whose predicted information overlapped, the annotation was considered significant.

Confirmation that Rickettsia helvetica sp. nov. is a distinct species of the spotted fever group of rickettsiae. Int J Syst Bacteriol 43: 521-526.

From the results mentioned (see Table 1, included as supplementary information), from 114 conserved hypothetical