Journal of Computer Science & Systems Biology - Open Access Research Article OPEN ACCESS Freely available online doi:10.4172/jcsb.1000055 www.omicsonline.com JCSB/Vol.3 Issue 2

Functional Annotation of Conserved Hypothetical Proteins in Rickettsia Massiliae MTU5

Joy Hoskeri. H¹, V. Krishna¹ and C. Amruthavalli²

¹P.G. Department of Studies and Research in Biotechnology and Bioinformatics, Kuvempu University, Shankaraghatta – 577 451, Karnataka, India
²Bioinformatics Division, Centre for Information Science and Technology, University of Mysore, Mysore

Abstract

Completed genome sequences for Rickettsia are rapidly increasing, but most of the genes in them do not have an assigned function. The present investigation focused on identifying the function of hypothetical proteins of Rickettsia massiliae MTU5 that are conserved in Rickettsia species. Nearly 114 such conserved hypothetical proteins were selected. Function annotation was carried out using databases and tools such as SMART, InterProScan, Pfam, JAFA, COG, and BLAST. Of the 114 conserved hypothetical proteins, only 35 were annotated. The study motivated us to examine the genome of Rickettsia massiliae MTU5 thoroughly with a bioinformatics approach in order to categorize the proteins whose function was unknown.

Keywords: Functional annotation; Hypothetical proteins; Rickettsia massiliae MTU5

*Corresponding author: Joy Hoskeri. H, P.G. Department of Studies and Research in Biotechnology and Bioinformatics, Kuvempu University, Shankaraghatta – 577 451, Karnataka, India, Tel: +91-9343544218; E-mail: [email protected]

Received March 22, 2010; Accepted April 11, 2010; Published April 11, 2010

Citation: Hoskeri JH, Krishna V, Amruthavalli C (2010) Functional Annotation of Conserved Hypothetical Proteins in Rickettsia Massiliae MTU5. J Comput Sci Syst Biol 3: 050-052. doi:10.4172/jcsb.1000055

Copyright: © 2010 Hoskeri JH, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Introduction

Over the last decade, the genomes of several hundred organisms have been sequenced (http://www.ncbi.nlm.nih.gov/Genomes/), providing a tremendous amount of data that needs to be interpreted and annotated with functions.

Approximately 10 complete Rickettsia genome sequences have been deposited in public databases. Rickettsiae have been commonly detected in Rhipicephalus ticks from Central Africa, France, Greece, Mali, Portugal, Spain, Switzerland, and the United States (Matsumoto et al., 2005; Eremeeva et al., 2005; Eremeeva et al., 2006). R. massiliae might be commonly associated with these worldwide-distributed ticks. R. massiliae has also been recently identified in ticks (Fernandez-Soto et al., 2006). The sequencing and primary analysis of the genome of R. massiliae strain MTU5, isolated from Rhipicephalus turanicus ticks collected on horses in the Camargue, France, have been reported (Blanc et al., 2007).

Rickettsia massiliae is the cause of Rocky Mountain spotted fever (RMSF) and is the prototype bacterium in the spotted fever group of rickettsiae. Rickettsia massiliae is found in the Americas and is transmitted to humans through the bite of infected ticks. Rickettsiae are a group of organisms belonging to a large and metabolically diverse class of gram-negative bacteria (Olsen et al., 1994; Stothard et al., 1995; Weisburg et al., 1989).

In Spain, R. massiliae is prevalent in ticks (Cardenosa et al., 2003). These virulent species of rickettsiae are of great interest both as emerging infectious diseases (Azad, 1998) and for their potential deployment as bioterrorism agents (Azad, 2007). Species in the genus Rickettsia are obligate intracellular symbionts of plants (Beati and Raoult, 1993). Rickettsia massiliae was first isolated from Rhipicephalus sanguineus collected in Marseille (France) in 1992 (Beati et al., 1993a; Beati et al., 1993b).

The Rickettsiae have very small genomes of about 1.0-1.5 million bases. Certain segments of rickettsial genomes resemble those of mitochondria (Emelyanov, 2003). The deciphered genome of Rickettsia is approximately 1,111,523 bp long and contains 834 protein-coding genes (Andersson et al., 1998).

However, in any newly sequenced bacterial genome, as many as 30-40% of the genes do not have an assigned function. A significant proportion of the remaining 60-70% of genes, for which functional annotations have been made, are often imprecisely described or assigned vague functions. The functional annotations are in most cases derived by inference rather than by experiment, through the observation of some level of sequence identity to a characterized gene product from another organism. Genomics itself provides that rare opportunity in science where the boundaries of current knowledge can be clearly defined.

Completed genome sequences for Rickettsia are rapidly increasing, but most of the genes in them do not have an assigned function. In the present investigation, we selected a true human pathogen, Rickettsia massiliae strain MTU5, for function annotation of hypothetical proteins.

Materials and Methods

The method comprises a new protocol for function annotation of hypothetical proteins found in R. massiliae MTU5 that are conserved in Rickettsia species, using a bioinformatics approach. It departs from many paradigms of function annotation and does not involve any conventional methods.

Annotation of the identified proteins was facilitated by highly automated bioinformatics tools, such as the ExPASy Proteomics server, the NCBI website, and the KEGG database (Kyoto Encyclopedia of Genes and Genomes). These sites provided many links to different branches of information and special annotations for individual data. Function annotation was then carried out with a set of online databases and tools.


J Comput Sci Syst Biol

Volume 3(2): 050-052 (2010) - 050

ISSN:0974-7230 JCSB, an open access journal

The databases and tools used for function annotation were SMART (http://smart.embl-heidelberg.de/), InterProScan (http://www.ebi.ac.uk/Tools/InterProScan/), Pfam (http://pfam.sanger.ac.uk/), JAFA (http://jafa.burnham.org/), COG (http://www.ncbi.nlm.nih.gov/COG/), and BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi). In addition to our own extensive literature survey, bioinformatics tools helped us to check experimental data by comparison with theoretical data and with the results from other fields (Figure 1).

According to data from NCBI (http://www.ncbi.nlm.nih.gov), Rickettsia massiliae MTU5 has a total of 1423 genes, with 968 protein-coding genes and 416 pseudogenes with other coding frames. Among these, nearly 296 hypothetical proteins were found. Only the hypothetical proteins that were conserved in Rickettsia species were considered for function annotation. The results obtained are given in Table 1 (included as supplementary information).

Results and Discussion

In this investigation, the genome of Rickettsia massiliae MTU5 was thoroughly studied by identifying the proteins whose function was not annotated and that are conserved in Rickettsia, using various bioinformatics tools that helped us in identifying the genome strategies of the strain Rickettsia massiliae MTU5. The results are given in Table 1 (included as supplementary information).

The central goal of this study was to determine protein functions from these sequences. Functional annotation of conserved hypothetical proteins is of major importance in providing insights into their molecular functions and will also help in the identification of new drugs against spotted fever caused by Rickettsia massiliae MTU5 infection. Table 1 shows the functional proteomics of Rickettsia massiliae MTU5 hypothetical proteins obtained using various online tools and databases. These hypothetical proteins were further studied for their conservation within the genus Rickettsia. This resulted in nearly 114 conserved hypothetical proteins. These 114 hypothetical proteins were taken up for function identification based on the results obtained by studying their sequences with databases and tools such as SMART, InterProScan, Pfam, JAFA, COG, and NCBI-BLAST. We followed a different protocol for function annotation of conserved hypothetical proteins. This protocol includes prediction of protein function with several online functional annotation tools. The results of all these tools were considered and compared, with attention to the results that proved similar in most of the tools.
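The gene counts quoted in this section, together with the 35 annotated proteins reported in the Abstract, describe a simple selection funnel. As a rough check, the Python sketch below recomputes the proportions at each step. The variable names are illustrative, several counts are stated only approximately in the text ("nearly"), and taking the protein-coding genes as the denominator for the hypothetical proteins is an assumption, since the text does not say which total was meant.

```python
# Counts quoted in the text; several are approximate ("nearly 296", "nearly 114").
counts = {
    "total_genes": 1423,
    "protein_coding": 968,
    "hypothetical": 296,
    "conserved_hypothetical": 114,
    "annotated": 35,
}


def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Assumption: hypothetical proteins are counted against protein-coding genes.
hypothetical_share = percentage(counts["hypothetical"], counts["protein_coding"])
# Share of hypothetical proteins that are conserved in Rickettsia species.
conserved_share = percentage(counts["conserved_hypothetical"], counts["hypothetical"])
# Share of conserved hypothetical proteins that received an annotation.
annotated_share = percentage(counts["annotated"], counts["conserved_hypothetical"])

print(f"hypothetical / protein-coding:       {hypothetical_share}%")
print(f"conserved / hypothetical:            {conserved_share}%")
print(f"annotated / conserved hypothetical:  {annotated_share}%")
```

Under these assumptions, roughly 30.6% of the protein-coding genes are hypothetical, which sits within the 30-40% range quoted in the Introduction, and roughly 30.7% of the conserved hypothetical proteins received an annotation.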

Figure 1: Flowchart depicting the overall methodology.
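One way to picture the comparison step of this workflow is as a majority vote over the predictions returned by the individual tools. The Python sketch below is only an illustration under stated assumptions: the function name `consensus_annotation`, the `min_agreement` threshold, and the example predictions are hypothetical and are not taken from Table 1, and the text does not state the exact agreement rule that was applied.

```python
from collections import Counter
from typing import Dict, Optional


def consensus_annotation(predictions: Dict[str, Optional[str]],
                         min_agreement: int = 4) -> Optional[str]:
    """Return the function label reported by at least `min_agreement` tools.

    `predictions` maps a tool name to its predicted function, or to None when
    the tool returned no hit. The default threshold of 4 out of 6 tools is an
    assumption standing in for "similar in most of the tools".
    """
    # Normalise labels so that trivial case/spacing differences still agree.
    labels = [label.strip().lower() for label in predictions.values() if label]
    if not labels:
        return None
    label, votes = Counter(labels).most_common(1)[0]
    return label if votes >= min_agreement else None


# Made-up predictions for a single protein (illustrative only, not real results).
example = {
    "SMART": "ABC transporter",
    "InterProScan": "ABC transporter",
    "Pfam": "ABC transporter",
    "JAFA": None,
    "COG": "abc transporter",
    "BLAST": "hypothetical protein",
}

print(consensus_annotation(example))
```

In practice the labels returned by different tools rarely match word for word, so a real comparison would need a mapping to shared terms (for example domain or family identifiers) before any vote is taken; the exact string match used here is a simplification.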


Proteins whose information overlapped across the tools were considered significant. From the results (see Table 1, included as supplementary information), only 35 of the 114 conserved hypothetical proteins were annotated, and these were considered for further studies on pathway fixation and reverse vaccinology.

The present investigation focused on identifying the function of hypothetical proteins of Rickettsia massiliae MTU5 that are conserved in Rickettsia species. The study motivated us to examine the genome of Rickettsia massiliae MTU5 thoroughly in order to categorize the proteins whose function was unknown.

In any newly sequenced bacterial genome, nearly 30-40% of the genes do not have an assigned function. The majority of such hypothetical genes without any assigned function have a wider phylogenetic distribution and are therefore usually referred to as conserved hypothetical. Hence there is a great need to identify the functions of these proteins by an in silico approach. We therefore made an effort to annotate the function of nearly 35 conserved hypothetical proteins of Rickettsia massiliae MTU5.

References

1. Andersson SG, Zomorodipour A, Andersson JO, Sicheritz-Ponten T, Alsmark UC, et al. (1998) The genome sequence of Rickettsia prowazekii and the origin of mitochondria. Nature 396: 133-140.
2. Azad AF, Beard CB (1998) Rickettsial pathogens and their arthropod vectors. Emerg Infect Dis 4: 179-186.
3. Azad AF (2007) Pathogenic rickettsiae as bioterrorism agents. Clin Infect Dis 45: S52-S55.
4. Beati L, Raoult D (1993) Rickettsia massiliae sp. nov. A new spotted fever group rickettsia. Int J Syst Bacteriol 43: 839-840.
5. Beati L, Finidori JP, Raoult D (1993) First isolation of Rickettsia slovaca from Dermacentor marginatus in France. Am J Trop Med Hyg 48: 257-268.
6. Beati L, Péter O, Burgdorfer W, Aeschlimann A, Raoult D (1993) Confirmation that Rickettsia helvetica sp. nov. is a distinct species of the spotted fever group of rickettsiae. Int J Syst Bacteriol 43: 521-526.
7. Blanc G, Ogata H, Robert C, Audic S, Claverie JM, Raoult D (2007) Lateral gene transfer between obligate intracellular bacteria: evidence from the Rickettsia massiliae genome. Genome Res 17: 1657-1664.
8. Cardenosa N, Segura F, Raoult D (2003) Serosurvey among Mediterranean spotted fever patients of a new spotted fever group rickettsial strain (Bar29). Eur J Epidemiol 18: 351-356.
9. Emelyanov VV (2003) Mitochondrial connection to the origin of the eukaryotic cell. Eur J Biochem 270: 1599-1618.
10. Eremeeva ME, Bosserman EA, Demma LJ, Zambrano ML, Blau DM, et al. (2006) Isolation and identification of Rickettsia massiliae from Rhipicephalus sanguineus ticks collected in Arizona. Appl Environ Microbiol 72: 5569-5577.
11. Eremeeva ME, Madan A, Shaw CD, Tang K, Dasch GA (2005) New perspectives on Rickettsial evolution from new genome sequences of Rickettsia, particularly R. canadensis, and Orientia tsutsugamushi. Ann NY Acad Sci 1063: 47-63.
12. Fernandez-Soto P, Perez-Sanchez R, Diaz Martin V, Encinas-Grandes A, Alamo Sanz R (2006) Rickettsia massiliae in ticks removed from humans in Castilla y Leon, Spain. Eur J Clin Microbiol Infect Dis 25: 811-813.
13. Matsumoto K, Ogawa M, Brouqui P, Raoult D, Parola P (2005) Transmission of Rickettsia massiliae in the tick, Rhipicephalus turanicus. Med Vet Entomol 19: 263-270.
14. Olsen GJ, Woese CR, Overbeek R (1994) The winds of (evolutionary) change: breathing new life into microbiology. J Bacteriol 176: 1-6.
15. Stothard DR, Fuerst PA (1995) Evolutionary analysis of the spotted fever and typhus groups of Rickettsia using 16S rRNA gene sequences. Syst Appl Microbiol 18: 52-61.
16. Weisburg WG, Dobson ME, Samuel JE, Dasch GA, Mallavia LP (1989) Phylogenetic diversity of the Rickettsiae. J Bacteriol 171: 4202-4206.
