Sunil Pande et al. / Journal of Pharmacy Research 2015,9(4),278-287
Research Article
ISSN: 0974-6943
Available online through www.jprsolutions.info

Are Hypothetical Proteins of Yersinia pestis CO92 Capable of Coding Enzymes?

Sunil Pande1 and Dilip Gore2*
1Rajiv Gandhi Biotechnology Centre, Rashtrasant Tukdoji Maharaj Nagpur University, L.I.T. Premises, Nagpur, Maharashtra, India. 440 033
2Sai Bioinfosys Institute of Bioinformatics Research, Plot No. 271 Raghuji Nagar, Nagpur (Maharashtra), India. 440009

Received on: 25-03-2015; Revised on: 16-04-2015; Accepted on: 12-05-2015

ABSTRACT
Background: Yersinia pestis is known as the cause of plague outbreaks. When its genome was sequenced, more than 900 proteins were marked as hypothetical, and their exact functions remain obscure. Methods: The combined results of the CDD-BLAST, INTERPROSCAN, PFAM and CATH domain search programs were used to infer functions from the conserved domains present in the hypothetical proteins. Results and Discussion: Conserved domains indicate enzyme-coding ability for 210 hypothetical proteins of Y. pestis. Conclusion: These Y. pestis hypothetical proteins, which possess conserved enzyme domains, may function in cellular metabolism and contribute to virulence.

KEYWORDS: Bioinformatics, Enzyme conserved domain, Hypothetical proteins.

INTRODUCTION
The deadly plague is caused by the bacterium Yersinia pestis, which diverged from Yersinia pseudotuberculosis (YPT) some 5000–15,000 years ago.1 Y. pestis is reported to be transmitted to humans chiefly through the bite of fleas that have fed on infected rodents. Once inside the human host, the infection spreads from the skin to the lymph nodes, establishing bubonic plague, and can ultimately lead to septicemia and death within seven days if not treated properly. Human-to-human transmission is another deadly mode on record, causing rapid spread in highly crowded communities, as in Madagascar and in the Democratic Republic of Congo.2, 3

The early genome sequence of Y. pestis CO92 offered the first opportunity to understand the virulence traits of this deadly pathogen.4 Additionally, the whole genomes of two other Y. pestis strains, KIM and 91001, were sequenced.5-8 These findings provided an increasing opportunity to uncover virulence-associated genes linked to pathogenicity in humans.9 In the present study, this genomic information has been used for the functional annotation of the hypothetical proteins of Y. pestis, which were investigated by conserved domain analysis using four bioprograms.

*Corresponding author.
Dilip Gore
Sai Bioinfosys Institute of Bioinformatics Research, Plot No. 271 Raghuji Nagar, Nagpur (Maharashtra), India. 440009

MATERIALS AND METHODS
Data Collection:
A total of 998 sequences marked as hypothetical for the Y. pestis CO92 strain (biovar Orientalis) were retrieved in FASTA format from the website: www.genome.jp//.

Search for Conserved Sequences
The 998 hypothetical proteins encoded by the Y. pestis genome were screened for the presence of enzyme domains by submitting each query sequence to four conserved domain search programs. These programs are linked to conserved domain databases and report the domains showing the best-scoring homology to the given query sequence. The programs used were:
A) CDD-BLAST (http://www.ncbi.nlm.nih.gov/BLAST/) 10-12
B) INTERPROSCAN (http://www.abi.ac.uk/interpro) 13
C) PFAM (http://www.pfam.sanger.ac.uk/) 14
D) CATH (http://www.cathdb.info/) 15

Functional Categorization
The hypothetical proteins reported by the four programs to possess enzyme domains were categorized into percent confidence levels using the following parameters:
1. If four tools indicate the same enzyme function, the confidence level is set as 100 percent.
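The confidence scheme of this Functional Categorization step (100, 75, 50 or 25 percent, depending on how many of the four tools agree) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the protein IDs and the function labels below are hypothetical, and it assumes each tool's hit has already been reduced by hand to a normalized function label, with `None` standing for a "NO" result. The value 0 for a protein with no hit at all is also an assumption, since the paper does not score that case.

```python
from collections import Counter

# The four conserved domain search programs named in the paper.
TOOLS = ("CDD-BLAST", "INTERPROSCAN", "PFAM", "CATH")


def confidence_percent(calls):
    """Return 100, 75, 50 or 25 according to the largest group of agreeing tools.

    `calls` maps a tool name to a normalized function label, or to None when
    the tool reports "NO". Returns 0 when no tool reports a domain
    (an assumption; the paper does not define this case).
    """
    labels = [calls.get(tool) for tool in TOOLS]
    hits = [label for label in labels if label is not None]
    if not hits:
        return 0
    largest_group = Counter(hits).most_common(1)[0][1]
    return 25 * largest_group


def tally(proteins):
    """Count how many proteins fall into each confidence level."""
    return Counter(confidence_percent(calls) for calls in proteins.values())


# Hypothetical input: IDs and labels are placeholders, not rows of Table 1.
example = {
    "protein_A": dict.fromkeys(TOOLS, "hydrolase"),
    "protein_B": {"CDD-BLAST": "kinase", "INTERPROSCAN": "kinase",
                  "PFAM": "kinase", "CATH": None},
    "protein_C": {"CDD-BLAST": "epimerase", "INTERPROSCAN": None,
                  "PFAM": "epimerase", "CATH": "transporter"},
    "protein_D": {"CDD-BLAST": None, "INTERPROSCAN": None,
                  "PFAM": "peptidase", "CATH": None},
}

for protein_id, calls in example.items():
    print(protein_id, confidence_percent(calls))
```

The hard part in practice is the step this sketch takes for granted: deciding when two differently worded domain names (for example a PFAM family name and a CATH superfamily name) describe the same enzyme function. That judgment needs curation and is not captured by simple string comparison.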

Journal of Pharmacy Research Vol.9 Issue 4.April 2015 278-287

Table 1: Enzyme function prediction in the hypothetical proteins of Y. pestis

KEGG NO: YPO0032; YPO0042; YPO0063; YPO0065; YPO0076; YPO0082; YPO0083; YPO0141; YPO0275; YPO0284; YPO0290; YPO0291; YPO0292; YPO0293; YPO0368; YPO0392a; YPO0392; YPO0396; YPO0433; YPO0466; YPO0489a; YPO0593; YPO0601; YPO0608; YPO0610; YPO0624; YPO0625; YPO0629; YPO0636; YPO0652; YPO0655; YPO0659; YPO0660; YPO0695

%: 50; 25; 50; 75; 75; 25; 50; 100; 50; 50; 75; 50; 75; 50; 50; 75; 75; 75; 100; 100; 25; 75; 25; 75; 25; 50; 75; 50; 25; 75; 50; 100; 75; 50

CATH: NO; NO; Lipoprotein -like domain; tRNA sulfurtransferase -like domain; NO; 4-hydroxy-4-methyl-2-oxoglutarate aldolase -like domain; 1D-myo-inositol -like domain; Putative -like domain; Sulfurtransferase TusA -like domain; UPF0659 protein YMR090W -like domain; NO; Glycerol-3-phosphate acyltransferase 6 -like domain; Phosphoribosyltransferase -like domain; Succinyl-CoA [ADP-forming] subunit beta -like domain; Hydroxyethylthiazole kinase -like domain; NO; HTH-type transcriptional regulator malT -like domain; Lysophospholipase -like domain; Mycothiol acetyltransferase -like domain 1/2/3; NO; Ca(2+)/calmodulin-dependent protein kinase -like domain; NO; NO; NO; NO; S-norcoclaurine synthase -like domain; NO; NO; Adenylate cyclase -like domain; NO; 4,5-DOPA dioxygenase extradiol -like domain; NO; NO; Sensor protein -like domain

PFAM: Putative thioesterase (yiiD_Cterm); Isoprenylcysteine carboxyl methyltransfe; Peptidase family M23; Rhodanese-like domain; Putative transposase, YhgA-like; Demethylmenaquinone methyltransferase; GlcNAc-PI de-N-acetylase; Haloacid dehalogenase-like hydrolase; Sulfurtransferase TusA; NADH(P)-binding; Phosphoribosyl; Sucrose-6F-phosphate phosphohydrolase; TRSP domain C terminus to PRTase_2; ATP-grasp in the biosynthetic pathway with Ter operon; YjeF-related protein N-terminus; Transposase; kinase-, DNA gyrase B-, and HSP90-like ATPase; Putative exonuclease SbcCD, C subunit; Patatin-like phospholipase; Acetyltransferase (GNAT) family; Cdc14 phosphatase binding protein N-terminus; Protein phosphatase 2C; Pre-toxin domain with VENN motif; Putative glucoamylase; NO; Na+/H+ antiporter family; Polyketide cyclase / dehydrase and lipid transport; Putative mono-oxygenase ydhR; Protein-kinase domain of FAM69; CYTH domain; permease; Catalytic LigB subunit of aromatic ring-opening dioxygenase; Glutathionylspermidine synthase pre ATP-grasp; Type IV leader peptidase family

INTERPROSCAN: Thioesterase; UPF0126; Duplicated hybrid motif; Rhodanese; Transposase_31; RraA-like; N-ACETYLGLUCOSAMINYL-PHOSPHATIDYLINOSITOL DE-N-ACETYLASE-RELATED; HADHALOGNASE; SirA-like; NO; PRTase_1; HAD-like; Uncharacterised conserved protein, UCP020967 type; Uncharacterised conserved protein, UCP029120 type; yjeF_cterm: YjeF family C-terminal domain; HTH_Tnp_1; ATPase domain of HSP90 chaperone/DNA topoisomerase II/histidine kinase; SbcCD_C; Patatin; GNAT; NO; PP2C-like; NO; Glycoamylase; Six-hairpin glycosidases; Na_H_antiporter; Polyketide_cyc2; Dimeric alpha+beta barrel; NO; CYTH-like phosphatases; AA_permease_2; LigB; GSP_synth; Peptidase_A24

CDD BLAST: N-Acyltransferase superfamily; UPF0126 domain; Peptidase family M23; Rhodanese Homology Domain (RHOD); YhgA-like; hypothetical protein; Uncharacterized proteins, LmbE homologs; HAD_like; SirA, YedF, and YeeD; nucleoside-diphosphate-sugar epimerases; PELOTA RNA binding domain; NO; TRSP domain C terminus to PRTase_2; ATP-grasp domain; Uncharacterized conserved protein; HTH_Tnp_1; Histidine kinase-like ATPases; SbcCD_C; Pat_hypo_Ecoli_yjju_like; Acetyltransferase (GNAT) family; NO; Protein phosphatase 2C; V-type ATP synthase subunit E; Glycoamylase super family; Cellobiose phosphorylase; ArsB_NhaD_permease super family; PYR_PYL_RCAR_like; putative monooxygenase; NO; CYTH-like_Pase_CHAD; putative transporter; 45_DOPA_Dioxygenase; Glutathionylspermidine synthase; Flp pilus assembly protein


50

50 25

50 75

25 50

50

75 50

50 50

75 75

50 50

% 25 100

75 25

50

75 75

50 50

25 75

50 50

75 50

Beta-agarase A -like domain Inositol 2-dehydrogenase -like domain

Receptor-associated protein of the synapse - like domain 1/2 NO

Saccharopine dehydrogenase -like domain

NO NO

Unsaturated rhamnogalacturonyl hydrolase yesR -like domain

Succinate dehydrogenase assembly factor 2, - like domain NO

UPF0301 protein MCA2336 2 -like domain General secretion pathway protein E -like

domain Alanine racemase -like domain Light-independent protochlorophyllide

reductase -like domain 1/2/3 Lipoprotein -like domain NO

CATH NO Catechol 1,2-dioxygenase -like domain

Cysteine desulfuration protein sufE -like domain NO

4-hydroxyphenylpyruvate dioxygenase -like domain 1/2

Inpp5b protein -like domain 1/2 Ribosomal RNA-processing protein 8 -like domain

NO NO

2-phospho-L-lactate transferase -like domain NO

NO NO

Hydroxyacylglutathione hydrolase -like domain Hydroxypyruvate -like domain

Rossmann fold

Sugar-binding cellulase-like

Tetratricopeptide repeat Spi protease inhibitor

Oxidoreductase family, NAD-binding Saccharopine dehydrogenase

Glutathione S-transferase, C-terminal domain Metallo-beta-lactamase superfamily

Glycosyl Hydrolase Family 88

Flavinator of succinate dehydrogenase FAD binding domain

Uncharacterized ACR, COG1678 Type II/IV secretion system protein

Alanine racemase, N-terminal domain Periplasmic binding protein

Peptidase family M23 CAAX protease self-immunity

Esterase PHB depolymerase Dioxygenase

PFAM Fe-S metabolism associated domain tRNA pseudouridine synthase C

Glyoxalase/Bleomycin resistance protein/Dioxygenase superfamily

Endonuclease/Exonuclease/phosphatase family Methyltransferase domain

Polyketide cyclase / dehydrase and lipid transport CAAX protease self-immunity

Uncharacterised protein family UPF0052 HNH endonuclease Transglycosylase associated protein

2OG-Fe(II) oxygenase superfamily

Beta-lactamase superfamily domain Xylose isomerase-like TIM barrel

no description

TPR_1 NO

GFO_IDH_MocA Saccharop_dh

NO Metallo-beta-lactamase superfamily

Six-hairpin glycosidases

Sdh5 FAD_binding_3

T2SP_E T2SP_E

Ala_racemase_N Peripla_BP_2

Lysozyme-like Abi

TAT Dioxygenase_C SufE

INTERPROSCAN Uncharacterised conserved protein

Glyoxalase

Exo_endo_phos Methyltransf_11

Polyketide_cyc Abi

UPF0052 HNH Transgly_assoc

Uncharacterised conserved protein, UCP030125 type

no description Xylose isomerase-like

Cellulase-like super family

Soluble NSF attachment protein, SNAP NO

Oxidoreductase family NADB_Rossmann super family

DUF2058 super family Zn-dependent , including

LanC_like superfamily

hypothetical protein Ubiquinone biosynthesis hydroxylase

Tfp pilus assembly protein, pilus retraction ATPase PilT Tfp pilus assembly protein, pilus retraction

ATPase PilT PLPDE_III_Yggs_like Periplasmic binding protein TroA_f

lysozyme_like domain CAAX protease self-immunity

Sulfur oxidation protein SoxY intradiol_dioxygenase_like cysteine desulfurase, sulfur acceptor subunit

CDD BLAST CsdE Uncharacterized protein conserved in bacteria

Glo_EDI_BRP_like_1

EEP superfamily Methyltransferase domain

Coenzyme Q-binding protein COQ10p and similar proteins CAAX protease self-immunity

YvcK_like HNH nucleases hypothetical protein

2OG-Fe(II) oxygenase superfamily

Predicted Zn-dependent hydrolases of the beta-lactamase fold Fructose/tagarose-bisphosphate

aldolase class II

YPO0757

YPO0763 YPO0769

YPO0774 YPO0775

YPO0790a YPO0800 glyoxylases

YPO0840 YPO0897

YPO0908

YPO0936 YPO0940

YPO0941 YPO0955

YPO0969 YPO0983

YPO0986 YPO0987 YPO1027

KEGG NO YPO1037

YPO1061

YPO1077 YPO1080

YPO1102 YPO1140

YPO1158 YPO1179 YPO1181

YPO1192

YPO1228 YPO1238

Table 1 (continued)

KEGG NO: YPO1258; YPO1260; YPO1276; YPO1286; YPO1289; YPO1325; YPO1346; YPO1347; YPO1355; YPO1382; YPO1394; YPO1401; YPO1407; YPO1408; YPO1427b; YPO1445; YPO1450a; YPO1473; YPO1476; YPO1488; YPO1500; YPO1527; YPO1551; YPO1556; YPO1564; YPO1574; YPO1588; YPO1614; YPO1632; YPO1637; YPO1639; YPO1650; YPO1656

%: 75; 75; 75; 50; 100; 100; 75; 25; 75; 75; 75; 75; 50; 50; 100; 50; 25; 75; 75; 50; 25; 50; 50; 100; 25; 25; 75; 50; 50; 25; 75; 75; 50

CATH: NO; Inner membrane protein yejM -like domain; Acid phosphatase -like domain; NO; Fumarylacetoacetate hydrolase family protein -like domain; Acid phosphatase -like domain; Ribosomal protein L11 methyltransferase -like domain; Opine dehydrogenase -like domain; NO; NO; Hydroxyacylglutathione hydrolase -like domain; Homoserine kinase -like domain; Matrix metalloproteinase-21 -like domain; Sonic hedgehog protein -like domain 1/2; 3-oxoacyl-[acyl-carrier-protein] synthase 3 -like domain 1; NO; NO; Lipoprotein -like domain; Lipoprotein -like domain; Amine oxidase -like domain 1/2; NO; UDP-N-acetylenolpyruvoylglucosamine reductase -like domain; NO; Glycerol dehydrogenase -like domain; NO; NO; NIPSNAP1 protein -like domain 1/2; Homoserine kinase -like domain; Lysine-specific demethylase NO66 -like domain 1/2; High frequency lysogenization protein HflD -like domain; RNA pyrophosphohydrolase -like domain; NO; NO

PFAM: DNase/tRNase domain of colicin-like bacteriocin; Sulfatase; PAP2 superfamily; Putative zinc- or iron-chelating domain; Fumarylacetoacetate (FAA) hydrolase family; PAP2 superfamily; Nicotianamine synthase protein; Uncharacterized protein conserved in bacteria; NAD dependent epimerase/dehydratase family; Glycosyltransferase family 25; Metallo-beta-lactamase superfamily; Fructosamine kinase; L,D-transpeptidase catalytic domain; Bacterial protein of unknown function; Beta-ketoacyl synthase, N-terminal domain; S-adenosylmethionine-dependent methyltransferase; DDE domain; Peptidase family M23; Peptidase family M23; Gene 25-like lysozyme; Glycosyl hydrolase family 1; Transporter associated domain; NADH(P)-binding; Iron-containing alcohol dehydrogenase; Protein of unknown function; NADH-ubiquinone oxidoreductase chain 4, amino terminus; Antibiotic biosynthesis monooxygenase; Phosphotransferase enzyme family; Cupin superfamily protein; Putative methyltransferase; NUDIX domain; 3-deoxy-D-manno-oct-2-ulosonic acid (Kdo) hydroxylase; Dextransucrase DSRB

INTERPROSCAN: Colicin-DNase; Sulfatase; PAP2; CxxCxxCC; FAA_hydrolase; PAP2; NAS; no description; Epimerase; Glyco_transf_25; Lactamase_B; Protein kinase-like (PK-like); PGBD-like; Hedgehog/DD-peptidase; ketoacyl-synt; Putative RNA-binding Domain in PseudoUridine; RNaseH_fold; Peptidase_M23; Peptidase_M23; GPW_gp25; NO; CBS; no description; Fe-ADH; DUF393; NO; ABM; APH; Cupin_4; DUF489; NUDIX; Kdo_hydroxy; DSRB

CDD BLAST: Colicin-DNase super family; Sulfatase super family; PAP2_like proteins; Predicted Fe-S-cluster oxidoreductase; Fumarylacetoacetate (FAA) hydrolase family; PAP2_like_6 proteins; Nicotianamine synthase protein; Uncharacterized protein conserved in bacteria; Nucleoside-diphosphate-sugar epimerases; Glycosyltransferase family 25; Metallo-beta-lactamase superfamily; hypothetical protein; L,D-transpeptidase catalytic domain; Peptidase_M15_3 super family; Beta-ketoacyl-acyl carrier protein; Middle domain of the SAM-dependent methyltransferase RlmI and related proteins; Integrase core domain; Peptidase family M23; Peptidase family M23; Predicted component of the type VI protein secretion system; NO; Transporter associated domain; Nucleoside-diphosphate-sugar epimerases; Glycerol dehydrogenases-like; Predicted thiol-disulfide oxidoreductase; NO; ABM superfamily; Phosphotransferase enzyme family; Uncharacterized conserved protein; putative lysogenization regulator; Nudix_Hydrolase_2; 3-deoxy-D-manno-oct-2-ulosonic acid (Kdo) hydroxylase; hypothetical protein

Table 1 (continued)

KEGG NO: YPO1670; YPO1690; YPO1732; YPO1734; YPO1747; YPO1772; YPO1854; YPO1856; YPO1868; YPO1874; YPO1941; YPO1976; YPO1980; YPO1991; YPO1992; YPO1997; YPO2002; YPO2028; YPO2035; YPO2038; YPO2049; YPO2050; YPO2070; YPO2072; YPO2082; YPO2096; YPO2109; YPO2149; YPO2153; YPO2156

%: 75; 25; 100; 100; 75; 100; 75; 50; 75; 50; 50; 50; 50; 75; 75; 75; 75; 100; 100; 50; 100; 100; 50; 50; 100; 25; 25; 75; 75; 100

CATH: Inosose dehydratase -like domain; NO; Metalloproteinase -like domain; Homoserine kinase -like domain; Putative uncharacterized protein -like domain; RNA pyrophosphohydrolase -like domain; Potential lactate/pyruvate transporter -like domain 1; NO; Integrase -like domain; NO; NO; Anthranilate synthase -like domain; Hydroxyacylglutathione hydrolase -like domain; NO; NO; NO; NO; Glycerophosphodiester phosphodiesterase -like domain; Glycerophosphodiester phosphodiesterase -like domain; Chaperone protein torD -like domain; tRNA (mo5U34)-methyltransferase -like domain; tRNA (cmo5U34)-methyltransferase -like domain; RutC family protein yjgF -like domain; Acetate kinase -like domain 1/2; Fumarylacetoacetate hydrolase family protein -like domain; NO; NO; Quercetin 2,3-dioxygenase -like domain; ATP-dependent protease ATPase subunit HslU -like domain; Aldose 1-epimerase family protein -like domain

PFAM: Xylose isomerase-like TIM barrel; DNA repair metallo-beta-lactamase; Glucose-regulated metallo-peptidase M90; RIO1 family; Multi-copper polyphenol oxidoreductase laccase; NUDIX domain; Iron permease FTR1 family; Dyp-type peroxidase family; Phage integrase family; Transposase; Iron permease FTR1 family; chorismate binding enzyme; Metallo-beta-lactamase superfamily; Ribose-5-phosphate isomerase; Methyltransferase domain; Alpha/beta hydrolase family; Alpha/beta hydrolase family; Glycerophosphoryl diester phosphodiesterase family; Glycerophosphoryl diester phosphodiesterase family; Nitrate reductase delta subunit; Protein of unknown function; Methyltransferase domain; Endoribonuclease L-PSP; Glycoprotease family; Fumarylacetoacetate (FAA) hydrolase family; N terminal extension of bacteriophage endosialidase; Isocitrate family; Pirin C-terminal cupin domain; PrkA serine protein kinase C-terminal domain; Aldose 1-epimerase

INTERPROSCAN: Xylose isomerase-like; NO; MtfA; RIO-like kinase; Cu-oxidase_4; NUDIX_COA; FTR1; TAT; Phage_integrase; HTH_Tnp_1; FTR1; ADC synthase; Lactamase_B; Ribose/Galactose isomerase RpiB/AlsB; Methyltransf_11; Abhydrolase_5; alpha/beta-Hydrolases; GDPD; GDPD; Nitrate_red_del; tRNA_methyltr_CmoA; tRNA_methyltr_CmoB; Ribonuc_L-PSP; Peptidase_M22; FAA_hydrolase; NO; NO; Pirin; PrkA; Aldose_epim

CDD BLAST: Xylose isomerase-like TIM barrel; NO; DgsA anti-repressor MtfA; RIO kinase family; Multi-copper polyphenol oxidoreductase laccase; a member of the Nudix hydrolase superfamily; Iron permease FTR1 family; Dyp-type peroxidase family; DNA_BRE_C super family; Transposase proteins are necessary for efficient DNA transposition; High-affinity Fe2+/Pb2+ permease; chorismate binding enzyme; Zn-dependent hydrolases; Ribose-5-phosphate isomerase; Methyltransferase domain; Alpha/beta hydrolase family; Hydrolases of the alpha/beta superfamily; Glycerophosphoryl diester phosphodiesterase; Glycerophosphoryl diester phosphodiesterase; Uncharacterized component of anaerobic dehydrogenases; S-adenosylmethionine-dependent methyltransferases; S-adenosylmethionine-dependent methyltransferases; YjgF_YER057c_UK114_like_2; Glycoprotease family; 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase; NO; NO; Pirin C-terminal cupin domain; PrkA serine protein kinase C-terminal domain; D-hexose-6-phosphate epimerase-like


100

50 75

25 50

50 50

75

50 75

75

75 25

% 100

25 25

50 75

50 50 75

75 75

25 25

75 75

50

Oxygen-insensitive NAD(P)H nitroreductase -like domain

Error-prone DNA polymerase -like domain NO

NO NO

NO 1,4-dihydroxy-2-naphthoyl-CoA

hydrolase -like domain UDP-N-acetylenolpyruvoylglucosamine reductase -like domain

NO NO

Polysaccharide deacetylase -like domain

Homoserine kinase -like domain Alpha-L-arabinofuranosidase axhA -

CATH like domain Mycothiol acetyltransferase -like domain 1/2/3

uncharacterised family 168 -like domain Vacuolar protein-sorting-associated protein 46 -like domain

Inositol 2-dehydrogenase -like domain Inosose dehydratase -like domain

NIPSNAP1 protein -like domain 1/2 NO 2-succinyl-6-hydroxy-2,4-cyclohexadiene

-1-carboxylate -like domain NO NO

NO NO

NO RNA pyrophosphohydrolase -like domain

Carboxyesterase-related protein -like domain 1

Nitroreductase family

PHP domain NAD dependent epimerase/dehydratase family

NO alpha/beta hydrolase fold

Carboxymuconolactone decarboxylase family Thioesterase superfamily

FAD binding domain

Kinase/pyrophosphorylase EamA-like transporter family

Polysaccharide deacetylase

Fructosamine kinase DUF

Acetyltransferase (GNAT) family

PFAM YIEGIA protein HD domain

Oxidoreductase family, NAD -binding Rossmann fold Xylose isomerase-like TIM barrel

Antibiotic biosynthesis monooxygenase Acyltransferase family Alpha/beta hydrolase family

Allophanate hydrolase subunit 2 Methyltransferase domain

Tubby C 2 AdoMet dependent proline di-methyltransferase

NAD dependent epimerase/dehydratase family NUDIX domain

Putative lysophospholipase

Nitroreductase

DNA polymerase alpha chain like domain Epimerase

Adenylylcyclase toxin (the edema factor) Abhydrolase_3

AhpD-like unchar_dom_1: uncharacterized

domain 1 FAD-oxidase_C

Kinase-PPPase EamA

Polysacc_deac_1

Protein kinase-like DUF

Acetyltransf_1

INTERPROSCAN DUF386 Metal dependent phosphohydrolases with conse

GFO_IDH_MocA Xylose isomerase-like

ABM NO ABHYDROLASE

AHS2 N6_MTASE

NO NO

Epimerase NUDIX

Hydrolase_4

Nitroreductase-like family which includes NADH oxidase and arsenite oxidiase

Predicted metal-dependent phosphoesterases Nucleoside-diphosphate-sugar epimerases

NO Esterase_lipase super family

Carboxymuconolactone decarboxylase family PaaI_thioesterase

FAD binding domain

Uncharacterized protein conserved in bacteria 4-amino-4-deoxy-L-arabinose- phosphoundecaprenol

flippase subunit ArnF 4-deoxy-4-formamido-L-arabinose- phosphoundecaprenol

deformylase ArnD Phosphotransferase enzyme family Predicted glycosylase

N-Acyltransferase superfamily

CDD BLAST Beta-galactosidase, beta subunit 5'-nucleotidase

Oxidoreductase family Sugar phosphate /epimerases

Uncharacterized conserved protein Predicted O-acyltransferase Alpha/beta hydrolase family

Allophanate hydrolase subunit 2 S-adenosylmethionine-dependent

methyltransferases Iron permease FTR1 family NO

Predicted nucleoside-diphosphate sugar epimerase Nudix hydrolase superfamily

Putative lysophospholipase

YPO2163

YPO2211 YPO2238

YPO2313 YPO2336

YPO2360 YPO2406

YPO2407

YPO2410 YPO2416

YPO2419

YPO2444 YPO2474

YPO2508

KEGG NO YPO2541 YPO2559

YPO2584 YPO2586

YPO2589 YPO2592 YPO2638

YPO2699 YPO2709

YPO2715 YPO2726

YPO2778 YPO2781

YPO2814

%

75

50 25 25

25 50 50

25 25 50 50

25 50 50

50 25 50

50

50 25

50 50 50

25 25 50

50 CATH

30S ribosomal protein S4 -like domain

Glucose N-acetyltransferase 1 -like domain NO Collagen alpha-3(VI) chain -like domain 1/2/3/4/5/6/7/8/9/11/12

NO NO NO

NO NO Protease HtpX -like domain NO

NO UPF0133 protein Rv3716c/MT3819 -like domain Photosystem II 12 kDa extrinsic

protein -like domain NO NO NO

Glycerophosphodiester phosphodiesterase - like domain

NO NO

Asparagine synthetase [glutamine- hydrolyzing] -like domain 1/2 NO NO

Putative uncharacterized protein - like domain NO NO 1-deoxy-D-xylulose-5-phosphate PFAM

S4 domain

Glycosyltransferase Uncharacterized protein conserved in bacteria von Willebrand factor type A domain

Dyp-type peroxidase family Winged helix-turn helix Uncharacterised protein family

Putative neutral zinc metallopeptidase NO Peptidase family M48 SPFH domain / Band 7 family

Elongator subunit Iki1 YbaB/EbfC DNA-binding family Helix-hairpin-helix motif

Acyl-CoA dehydrogenase N terminal Methyltransferase domain Protein of unknown function

Glycerophosphoryl diester phosphodiesterase family

Protein of unknown function L,D-transpeptidase catalytic domain

Glutamine amidotransferases class-II Adhesin biosynthesis transcription regulatory protein Predicted periplasmic lipoprotein

Multi-copper polyphenol oxidoreductase laccase DUF L-fucose isomerase, C-terminal domain Transketolase, pyrimidine binding domain INTERPROSCAN

NO

ImpA-rel_N GlcNAc Creatinase_N

DUF386 NUDIX arsC_related: transcriptional regulator, Spx/MgsR DUF441

Zn_peptidase UPF0118 Peptidase_M48

Aminoacyl-tRNA deacylase, YbaK type NAD_binding_8 TIGR00051: acyl-CoA thioester

hydrolase, YbgC/YbaW TIGR00426: competence protein ComEA helix-hairpin- DUF1255 Methyltransf_11

NO GDPD

DUF539

YkuD GATASE_TYPE_2 Acetyltransf_1

NO Cu-oxidase_4 MFS_1 FucI/AraA N-terminal and middle

domains synthase -like domain CDD BLAST

S4/Hsp/ tRNA synthetase RNA-binding domain

ImpA-related N-terminal Glycosyltransferase (GlcNAc) A family including aminopeptidase P, aminopeptidase M, and prolidase.

Beta-galactosidase nudix-type nucleoside diphosphatase Arsenate Reductase (ArsC) family Predicted membrane]

protein [Function unknown Predicted metalloprotease Predicted permease Peptidase family M48

NO Protoporphyrinogen oxidase [Coenzyme metabolism] 4-hydroxybenzoyl-CoA thioesterase (4HBT).

Helix-hairpin-helix motif hypothetical protein S-adenosylmethionine-dependent

methyltransferases (SAM or AdoMet-MTase) NO Glycerophosphoryl diester

phosphodiesterase [Energy production and conversion] Uncharacterized protein conserved in bacteria [Function unknown]

Uncharacterized protein conserved in bacteria Glutamine amidotransferases class-II (Gn-AT)_YafJ-type N-Acyltransferase superfamily

NO hypothetical protein putative MFS family transporter protein L-fucose isomerase (FucIase) and

L-arabinose isomerase (AI) family KEGG NO

YPO2825

YPO2951 YPO2952 YPO3007

YPO3025 YPO3042 YPO3055 YPO3058

YPO3065 YPO3069 YPO3083

YPO3089 YPO3121 YPO3152

YPO3161 YPO3218 YPO3219

YPO3232 YPO3233

YPO3241

YPO3242 YPO3245 YPO3274

YPO3276 YPO3290 YPO3311 YPO3313

Table 1 (continued)

KEGG NO: YPO3314; YPO3336; YPO3410; YPO3432; YPO3439; YPO3440; YPO3445; YPO3466; YPO3468; YPO3473; YPO3474; YPO3478; YPO3480; YPO3484; YPO3486; YPO3544; YPO3546; YPO3564; YPO3565; YPO3574; YPO3586; YPO3590; YPO3640; YPO3647; YPO3676; YPO3693; YPO3785

%: 75; 25; 50; 25; 25; 25; 50; 25; 50; 100; 75; 25; 25; 50; 75; 50; 50; 25; 25; 50; 50; 50; 50; 50; 50; 25; 100

CATH: 1-deoxy-D-xylulose-5-phosphate synthase -like domain 1/2/3; NO; NO; Estrogen receptor -like domain; NO; NO; Regulator of ribonuclease activity B -like domain; Oxidoreductase HTATIP2 -like domain; Alginate lyase -like domain; Alginate lyase -like domain; NO; NO; NO; Bacterial luciferase family protein -like domain; Bacterial luciferase family protein -like domain; Glutathione S-transferase family protein -like domain; Quercetin 2,3-dioxygenase -like domain; Chromosomal replication initiator protein DnaA -like domain; NO; NO; Dephospho-CoA kinase -like domain; RutC family protein yjgF -like domain; NO; Hydroxyacylglutathione hydrolase -like domain; NO; NO; Tyrosine-protein phosphatase -like domain

PFAM: Transketolase, thiamine diphosphate binding domain; Protein of unknown function; Bacterial chaperone lipoprotein; Transposase zinc-ribbon domain; Predicted permease YjgP/YjgQ family; Predicted permease YjgP/YjgQ family; Regulator of ribonuclease activity B; NADH(P)-binding; Alginate lyase; Alginate lyase; Alginate lyase; NO; Peptidase family U32; Luciferase-like monooxygenase; Luciferase-like monooxygenase; Glutathione S-transferase, N-terminal domain; Pirin; AFG1-like ATPase; DUF; Permease; P-loop ATPase protein family; Endoribonuclease L-PSP; Esterase-like activity of phytase; Metallo-beta-lactamase superfamily; D-alanyl-D-alanine carboxypeptidase; Acetyltransferase (GNAT) family; Tyrosine phosphatase family C-terminal region

INTERPROSCAN: Transket_pyr; Transketolase_N; Polysacc_deac_1; DUF1342; DUF262; YjgP_YjgQ; YjgP_YjgQ; DUF386; UPF0306; Alginate_lyase; Alginate_lyase; Alginate_lyase; NO; Peptidase_U32; Bac_luciferase; NO; GST_N_2; NO; AFG1_ATPase; MCE; LptC; ATP_bind_2; Phytase-like; no description; no description; NO; Y_phosphatase3C

CDD BLAST: Transketolase, C-terminal domain; Thiamine pyrophosphate (TPP) family; Predicted xylanase/chitin deacetylase; Uncharacterized protein conserved in bacteria; Uncharacterized conserved protein [Function unknown]; lipopolysaccharide ABC transporter permease; lipopolysaccharide ABC transporter permease LptF; Beta-galactosidase, beta subunit; hypothetical protein; Alginate lyase; Alginate lyase; Heparinase II/III-like protein; NO; putative protease; Flavin-utilizing monoxygenases; NO; Glutathione S-transferase, N-terminal domain; NO; Predicted ATPase [General function prediction only]; ABC-type transport system involved in resistance to organic solvents; lipopolysaccharide exporter periplasmic protein; P-loop ATPase protein family; Uncharacterized protein conserved in bacteria; Zn-dependent hydrolases; D-alanyl-D-alanine carboxypeptidase; NO; Tyrosine phosphatase family


2. If three tools indicate the same enzyme function and one shows a different function, the confidence level is set as 75 percent.
3. If two tools indicate the same enzyme function and two show different functions, the confidence level is set as 50 percent.
4. If one tool indicates the enzyme function and three tools show different functions, the confidence level is set as 25 percent.16

OBSERVATIONS AND RESULTS
Functional Annotation to the Hypothetical Proteins
Enzymatic coding was successfully assigned to 210 hypothetical proteins using the bioprograms CDD-BLAST, CATH, INTERPROSCAN and PFAM. Of the 998 hypothetical proteins analyzed, only 210 showed a probable enzyme function. Further, the confidence in the enzyme coding of these hypothetical proteins was scored and classified at four levels, namely 100% for 135 proteins, 75% for 119, 50% for 52 and 25% for 45, according to the agreement among the reports obtained from each program, as given in Table 1. Based on the percent confidence obtained, the 135 proteins with 100% confidence could be the first choice for in vivo analysis and eventually for determining the fate of these hypothetical proteins in the metabolism of Y. pestis.

CONCLUSION
The advancement of computational proteomics and sequencing has created an opportunity to elucidate in silico the enzyme functions of the hypothetical proteins of Y. pestis. This study investigated 998 hypothetical proteins of Y. pestis for probable enzyme function using the conserved domain search strategy described earlier by other researchers.17-20 By homology-based searching, a total of 210 hypothetical proteins were found to code for enzymes on the basis of the conserved domains available. Similar studies performed earlier also reported that all bacteria harbor hypothetical proteins and that several of those code for enzymes, as per in silico studies confirmed at the molecular level in Shigella flexneri, Bacillus anthracis, Haemophilus influenzae and Helicobacter pylori.17-20 Based on these results, the enzyme-coding hypothetical proteins are likely to play several crucial roles in the metabolism of Y. pestis, and further expression analysis will probably reveal the fate of these mystery proteins of Y. pestis.

REFERENCES
1. Achtman M, Zurth K, Morelli G, Torrea G, Guiyoule A, Carniel E, Yersinia pestis, the cause of plague, is a recently emerged clone of Yersinia pseudotuberculosis, Proc. Natl. Acad. Sci. USA, 96(24), 1999, 14043–14048.
2. Ratsitorahina M, Chanteau S, Rahalison L, Ratsifasoamanana L, Boisier P, Epidemiological and diagnostic aspects of the outbreak of pneumonic plague in Madagascar, Lancet, 355(9198), 2000, 111-3.
3. Bertherat E, Lamine KM, Formenty P, Thuier P, et al., Major pulmonary plague outbreak in a mining camp in the Democratic Republic of Congo: brutal awakening of an old scourge, Med Trop (Mars), 65(6), 2005, 511-4.
4. Parkhill J, Wren BW, Thomson NR, Titball RW, Holden MT, Prentice MB, et al., Genome sequence of Yersinia pestis, the causative agent of plague, Nature, 413, 2001, 523–527.
5. Lindler LE, Plano GV, Burland V, Mayhew GF, Blattner FR, Complete DNA sequence and detailed analysis of the Yersinia pestis KIM5 plasmid encoding murine toxin and capsular antigen, Infect. Immun., 66, 1998, 5731–5742.
6. Perry RD, Straley SC, Fetherston JD, Rose DJ, Gregor J, Blattner FR, DNA sequencing and analysis of the low-Ca2+-response plasmid pCD1 of Yersinia pestis KIM5, Infect. Immun., 66, 1998, 4611–4623.
7. Deng W, Burland V, Plunkett G, et al., Genome sequence of Yersinia pestis KIM, J. Bacteriol., 184, 2002, 4601–4611.
8. Song Y, Tong Z, Wang J, et al., Complete genome sequence of Yersinia pestis strain 91001, an isolate avirulent to humans, DNA Res., 11, 2004, 179–197.
9. Zhou D, Han Y, Song Y, Huang P, Yang R, Comparative and evolutionary genomics of Yersinia pestis, Microbes Infect, 6, 2004, 1226–1234.
10. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ, Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res, 25(17), 1997, 3389-402.
11. Schaffer AA, Aravind L, Madden TL, Shavirin S, Spouge JL, Wolf YI, Koonin EV, Altschul SF, Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements, Nucleic Acids Res, 29(14), 2001, 2994-3005.
12. Marchler-Bauer A et al., CDD: a conserved domain database for interactive domain family analysis, Nucleic Acids Res, 35(D), 2007, 237-240.
13. Zdobnov EM, Apweiler R, InterProScan - an integration platform for the signature-recognition methods in InterPro, Bioinformatics, 17, 2001, 847-848.
14. Bateman A et al., The Pfam protein families database, Nucleic Acids Res, 30, 2002, 276-80.
15. Sillitoe I et al., New functional families (FunFams) in CATH to improve the mapping of conserved functional sites to 3D structures, Nucleic Acids Res., 41(Database issue), 2013, D490-8.
16. Kewate P et al., In silico enzyme function prediction in hypothetical proteins of Mycobacterium bovis AF2122/97, Journal of Pharmacy Research, 9(3), 2015, 182-189.
17. Gore D, In silico prediction of structure and enzymatic activity for hypothetical proteins of Shigella flexneri, Biofrontiers, 1 (2), 2009, 1-10.


18. Gore D, Raut A, Computational function and structural annotations for hypothetical proteins of Bacillus anthracis, Biofrontiers, 1 (1), 2009, 27-36.
19. Dogra P, Gore D, Prediction of Enzymatic Function and Structure of H. influenzae Hypothetical Proteins - An In silico Approach, International Journal of Soft Computing and Bioinformatics, 1(2), 2010, 67-77.
20. Gore D, Denge P, Amrute M, Homology Modeling and Enzyme Function Prediction in the Hypothetical Proteins of Helicobacter pylori - an Insilico Approach, Biomirror, 1-5, 2010, bm-1111251610.

Source of support: Nil, Conflict of interest: None Declared
