INTERNATIONAL JOURNAL OF PHARMACEUTICAL AND CHEMICAL SCIENCES ISSN: 22775005

Research Article

In Silico Analysis of Stem GS. Sekhar*, P. Rajkeshwar and P. Anil Shambhunath Institute of Pharmacy, Jhalwa, Allahabad, Uttar Pradesh-211012, India.

ABSTRACT Bromelain is a general name for a family of sulfhydryl-containing, proteolytic obtained from Ananas comosus. In this work we have performe Sequence analysis, Functional, Structural & Phylogenetic analysis and Prediction of structure of protein using different Bio-informatics tools and databases like as NCBI, BLAST, FASTA, Clustal W, Protein Data Bank etc. Sequence Similarity was search using BLAST against Homo sapiens protein database. Stem& was extracted from Stem & Fruit of Pineapple plant which contains the enzymes Stem Bromelain (Cysteine endopeptidase ) and Fruit Bromelain (Aspartic endopeptidase), respectively. INTRODUCTION ovaries of many flowers more or less grown The pineapple (Ananas comosus) is placed in together into one mass. the Order Bromeliales and the Family Bromelain is a general name for a family of Bromeliaceae. It is a medium tall (1–1.5 m) sulfhydryl-containing, proteolytic enzymes herbaceous perennial plant with 30 or more obtained from Ananas comosus, the pineapple trough-shaped and pointed leaves 30–100 cm plant6-7. The primary component of Bromelain long, surrounding a thick stem. Pineapples is a sulfhydryl proteolytic fraction. It also are short stemmed, herbaceous plants having contains a peroxidase, acid phosphatase, basal rosettes of stiff, spiny leaves. The several inhibitors, and organically- flowers are clustered together on a short stalk bound calcium. It is made up of 212 amino that arises from the rosette of leaves. acids and the molecular weight is 33,000 D. Bioinformatics is the application of computer technology to the management of biological information. Computers are used to gather, store, analyze and integrate biological and genetic information which can then be applied to -based drug discovery and development17. The need for Bioinformatics capabilities has been precipitated by the Fig. 1: explosion of publicly available genomic information resulting from the Project. The cluster of flowers resembles a pine cone, thus the name pineapple1-5. Leaves also grow MATERIALS AND METHODS from the top of the flower cluster. As the Bioinformatics tools and databases ovaries of the individual flowers on the stalk Stem Bromelain, the protein of interest was ripen, they swell up to form the pineapple fruit. analyzed using different Bioinformatics tools The pineapple fruit is considered a multiple and databases listed below: fruit because it consists of the enlarged

S.No Tool / Database Application 1 NCBI Sequence retrieval 2 BLAST Sequence similarity 3 Clustal W Multiple Sequence Alignment 4 Clustal Distance Matrix Evolutionary distance 5 Protparam Primary Structure analysis 6 Hierarchial Neural Network Secondary Structure analysis 7 CPH, HHPred Tertiary Structure analysis 8 Protein Data Bank (PDB) 3D-structure 9 Rasmol 3D-structure visualization

Vol. 2 (2) Apr-Jun 2013 www.ijpcsonline.com 992 INTERNATIONAL JOURNAL OF PHARMACEUTICAL AND CHEMICAL SCIENCES ISSN: 22775005

1) Sequence analysis Sequence Similarity Analysis For sequence analysis we have retrieve the Sequence Similarity was search using sequence Stem Bromelain form NCBI in BLAST14 against Homo sapiens protein FASTA format. database. In bioinformatics, Basic Local Alignment Aminoacid sequence of Stem Bromelain Search Tool, or BLAST, is an algorithm for >gi|115139|sp|P14518.1|BROM2_ANACO comparing primary biological sequence Stem bromelain information, such as the amino-acid AVPQSIDWRDYGAVTSVKNQNPCGACWAF sequences of different or the AAIATVESIYKIKKGILEPLSEQQVLDCAKGYG nucleotides of DNA sequences. A BLAST CKGGWEFRAFEFIISNKGVASGAIYPYKAAK search enables a researcher to compare a GTCKTDGVPNSAYITGYARVPRNNESSMMY query sequence with a library or database of AVSKQPITVAVDANANFQYYKSGVFNGPCG sequences, and identify library sequences that TSLNHAVTAIGYGQDSIIYPKKWGAKWGEAG resemble the query sequence above a certain YIRMARDVSSSSGICGIAIDPLYPTLEE threshold. Length: 212 aa I have searched for the protein sequences of Homo sapiens showing maximum similarity Function with the query sequence i.e. Stem bromelain, Peptidase C1A subfamily; (composed of by using the Sequence Similarity search tool cysteine peptidases (CPs) similar to , “Basic Local Alignment Search Tool” (BLAST). including the mammalian CPs ( B, I have performed Protein-Protein BLAST and C, F, H, L, K, O, S, V, X and W). Papain is an selected 4 sequences showing maximum endopeptidase with specific substrate similarity depending on the parameters viz., preferences, primarily for bulky hydrophobic or Score, e-value, Identities, Positives & Gaps. aromatic residues at the S2 subsite, a They were represented below : hydrophobic pocket in papain that >gb|AAH02642.1| S [Homo accommodates the P2 sidechain of the sapiens] substrate (the second residue away from the emb|CAG46477.1| CTSS [Homo scissile bond). Most members of the papain sapiens] subfamily are endopeptidases. Some Length=331 exceptions to this rule can be explained by GENE ID: 1520 CTSS | [Homo specific details of the catalytic domains like the sapiens] occluding loop in which confers an Score = 101 bits (226), Expect = 2e-21, additional carboxydipeptidyl activity and the Method: Compositional matrix adjust. mini-chain of resulting in an N- Identities = 59/147 (40%), Positives = 84/147 terminal exopeptidase activity. (57%), Gaps = 10/147 (6%) Papain-like CPs have different functions in various organisms15-16. Plant CPs are used to FASTA format mobilize storage proteins in seeds. Parasitic >gi|12803615|gb|AAH02642.1| Cathepsin S CPs act extracellularly to help invade tissues [Homo sapiens] and cells, to hatch or to evade the host MKRLVCVLLVCSSAVAQLHKDPTLDHHWHL immune system. Mammalian CPs are primarily WKKTYGKQYKEKNEEAVRRLIWEKNLKFVM lysosomal enzymes with the exception of LHNLEHSMGMHSYDLGMNHLGDMTSEEVM , which is retained in the SLMSSLRVPSQWQRNITYKSNPNWILPDSV endoplasmic reticulum. They are responsible DWREKGCVTEVKYQGSCGACWAFSAVGAL for protein degradation in the lysosome. EAQLKLKTGKLVSLSAQNLVDCSTEKYGNK Papain-like CPs are synthesized as inactive GCNGGFMTTAFQYIIDNKGIDSDASYPYKAM proenzymes with N-terminal propeptide DQKCQYDSKYRAATCSKYTELPYGREDVLK regions, which are removed upon activation. In EAVANKGPVSVGVDARHPSFFLYRSGVYYE addition to its inhibitory role, the propeptide is PSCTQNVNHGVLVVGYGDLNGKEYWLVKN required for proper folding of the newly SWGHNFGEEGYIRMARNKGNHCGIASFPSY synthesized and its stabilization in PEI. denaturing pH conditions. Residues within the propeptide region also play a role in the Function transport of the proenzyme to lysosomes or Cathepsin propeptide inhibitor domain (I29). acidified vesicles. Also included in this This domain is found at the N-terminus of subfamily are proteins classified as non- some C1 peptidases such as peptidase homologs, which lack peptidase where it acts as a propeptide. There are also a activity or have missing residues. number of proteins that are composed solely

of multiple copies of this domain such as the

Vol. 2 (2) Apr-Jun 2013 www.ijpcsonline.com 993 INTERNATIONAL JOURNAL OF PHARMACEUTICAL AND CHEMICAL SCIENCES ISSN: 22775005 peptidase inhibitor salarin. This family is GENE ID: 1512 CTSH | cathepsin H [Homo classified as I29 by MEROPS18-20. sapiens] Peptidase C1A subfamily; composed of Score = 136 bits (308), Expect = 6e-32, cysteine peptidases (CPs) similar to papain, Method: Compositional matrix adjust. including the mammalian CPs (cathepsins B, Identities = 77/219 (35%), Positives = 118/219 C, F, H, L, K, O, S, V, X and W). Papain is an (53%), Gaps = 17/219 (7%). endopeptidase with specific substrate preferences, primarily for bulky hydrophobic or FASTA format aromatic residues at the S2 subsite, a >gi|16506813|gb|AAL23961.1|AF426247_1 hydrophobic pocket in papain that cathepsin H [Homo sapiens] accommodates the P2 sidechain of the MWATLPLLCAGAWLLGVPVCGAAELCVNSL substrate (the second residue away from the EKFHFKSWMSKHRKTYSTEEYHHRLQTFAS scissile bond). Most members of the papain NWRKINAHNNGNHTFKMALNQFSDMSFAEI subfamily are endopeptidases. Some KHKYLWSEPQNCSATKSNYLRGTGPYPPSV exceptions to this rule can be explained by DWRKKGNFVSPVKNQGACGSCWTFSTTGA specific details of the catalytic domains like the LESAIAIATGKMLSLAEQQLVDCAQDFNNYG occluding loop in cathepsin B which confers an CQGGLPSQAFEYILYNKGIMGEDTYPYQGK additional carboxydipeptidyl activity and the DGYCKFQPGKAIGFVKDVANITIYDEEAMVE mini-chain of cathepsin H resulting in an N- AVALYNPVSFAFEVTQDFMMYRTGIYSSTSC terminal exopeptidase activity. Papain-like HKTPDKVNHAVLAVGYGEKNGIPYWIVKNS CPs have different functions in various WGPQWGMNGYFLIERGKNMCGLAACASYP organisms. Plant CPs are used to mobilize IPLV storage proteins in seeds. Parasitic CPs act extracellularly to help invade tissues and cells, Inference to hatch or to evade the host immune system. The Query Sequence i.e. Stem Bromelain Mammalian CPs are primarily lysosomal protein sequence is showing maximum enzymes with the exception of cathepsin W, Sequence and functional similarity with which is retained in the endoplasmic reticulum. different types of Cathepsins of Homo sapiens. They are responsible for protein degradation in The Query sequence and all the Cathepsins the lysosome. Papain-like CPs are are having the same function i.e. Peptidase synthesized as inactive proenzymes with N- C1A subfamily; Cysteine peptidases similar to terminal propeptide regions, which are Papain; indicating that they may have common removed upon activation. In addition to its ancestor. inhibitory role, the propeptide is required for proper folding of the newly synthesized Phylogenetic analysis of Stem Bromelain enzyme and its stabilization in denaturing pH by using Clustal W conditions. Residues within the propeptide Clustal W is a general purpose Multiple region also play a role in the transport of the Sequence Alignment program for DNA or proenzyme to lysosomes or acidified vesicles. Proteins. It produces biologically meaningful Also included in this subfamily are proteins multiple sequence alignments of divergent classified as non-peptidase homologs, which sequences. It calculates the best match for lack peptidase activity or have missing active the selected sequences, and lines them so site residues. that the identities, similarities and differences >gb|AAL23961.1|AF426247_1 cathepsin H can be seen. Evolutionary relationships can be [Homo sapiens] seen via viewing Cladograms or Length=335 Dendrograms.

Vol. 2 (2) Apr-Jun 2013 www.ijpcsonline.com 994 INTERNATIONAL JOURNAL OF PHARMACEUTICAL AND CHEMICAL SCIENCES ISSN: 22775005

Clastal W 2.1 Multiple sequence alignment

Clustal Distance Matrix Used to know the evolutionary distance among proteins sequences.

(1) (2) (3) (4) (5) CATHEPSIN_K_ (1) 0.000 0.451 0.491 0.594 0.620 Cathepsin_S_ (2) 0.451 0.000 0.508 0.558 0.660 CATHEPSIN_L1 (3) 0.491 0.508 0.000 0.583 0.648 Stem_bromelain (4) 0.594 0.558 0.583 0.000 0.635 CATHEPSIN_H (5) 0.620 0.660 0.648 0.635 0.000

Structural analysis of Stem Bromelain13 1) Primary Structure analysis by using estimated half-life, instability index, aliphatic “Protparam” tool index and grand average of hydropathicity ProtParam computes various physico- (GRAVY). Molecular weight and theoretical pI chemical properties that can be deduced from are calculated as in Compute pI/Mw. a protein sequence. The parameters computed by ProtParam include the molecular Number of amino acids: 212 weight, theoretical pI, amino acid composition, Molecular weight: 22830.9 atomic composition, extinction coefficient, Theoretical pI: 8.60.

Vol. 2 (2) Apr-Jun 2013 www.ijpcsonline.com 995 INTERNATIONAL JOURNAL OF PHARMACEUTICAL AND CHEMICAL SCIENCES ISSN: 22775005

Aminoacid composition Stem Bromelain is rich in Alanine (11.8%) followed by Glycine (10.4%). It is poor in Histidine (0.5%) and Methionine (1.4%).

Instability index The instability index (II) is computed to be 26.22 This classifies the protein as stable.

Grand average of hydropathicity (GRAVY):

-0.158 α-helix were represented in Magenta color 2) Secondary structure analysis by β-sheets were represented in Blue color “Hierarchial Neural Network (HNN) Cysteine residues were represented in Yellow color 10 20 30 40 50 60 70 CONCLUSION | | | | | | | Stem Bromelain was analyzed in silico by AVPQSIDWRDYGAVTSVKNQNPCGACWAF using different Bioinformatics tools and AAIATVESIYKIKKGILEPLSEQQVLDCAKGY databases described above. It is showing GCKGGWEFR more sequence similarity with Cathepsin cccccccccccceeeeecccccccchhhhhhhhhhhhhh of Homo sapiens. From the hhhhhchhhcchhhhhhhhhcccccccceeh phylogenetic analysis, it was determined that AFEFIISNKGVASGAIYPYKAAKGTCKTDGV the evolutionary distance between the protein PNSAYITGYARVPRNNESSMMYAVSKQPIT of interest i.e. Stem Bromelain and Cathepsin VAVDANANF S is less. Structural analysis revealed the fact hhheheccccccccceeccccccccccccccccceeeeec that the Stem Bromelain is showing more ceccccccceeeeeecccceeeeeccccce similarity with Papain, molecular visualization QYYKSGVFNGPCGTSLNHAVTAIGYGQDSII of which has been represented. YPKKWGAKWGEAGYIRMARDVSSSSGICGI AIDPLYPTL eehhcccccccccccccceeeeeeccccceeecccccccc REFERENCES ccchhehhhhccccccceeeeecccccccc 1. Kelly S Bromelain. A Literature EE Review and Discussion of its Cc Therapeutic applications. Gregory Alt Alpha helix (Hh): 44 is 20.75% Med Rev. 1996;1(4):243-257. Extended strand (Ee): 46 is 21.70% 2. Brien S, Lewith G and Walker A. Random coil (Cc) : 122 is 57.55% Bromelain as a Treatment for Osteoarthritis: a Review of Clinical 3) Tertiary Structure prediction Using CPH Studies. Evidence-based Model complementary and alternative Entry: 1PCI chain: A score: 109 E: 7e-25 medicine. 2004;1(3):251-257. HHPred : 3. Kleef R, Delohery T and Bovbjerg. PDB ID : 1PCI Selective modulation of cell adhesion molecules on lymphocytes by 4) Molecular Visualization by using Rasmol Bromelain protease. Pathobiology. RasMol is a computer program written for 1996;64(6):339-46. molecular graphics visualization intended and 4. Heinicke RM and Gortner WA. Stem used primarily for the depiction and exploration Bromelain-a new protease preparation of biological macromolecule structures, such from pineapple plants Econ Bot. as those found in the Protein Data Bank. It 1957;11(3): 225-234. was originally developed by Roger Sayle in the 5. Gutfreund A, Taussig S and Morris A. early 90s. Effect of oral bromelain on blood pressure and heart rate of hypertensive patients. Hawaii medical journal. 1978;37(5):143-6. 6. Gregory S and Kelly ND. Bromelain: A Literature Review and Discussion of

Vol. 2 (2) Apr-Jun 2013 www.ijpcsonline.com 996 INTERNATIONAL JOURNAL OF PHARMACEUTICAL AND CHEMICAL SCIENCES ISSN: 22775005

its Therapeutic Applications. Alt Med 13. Van Kuik JA , Hoffmann RA , Rev. 1996;1(4):243-257. Mutsaers JHGM , Kamerling JP and 7. White RR, Crawley FE and Vellini M. Vliegenthart JFG. Department of Bio- Bioavailability of 125I bromelain after Organic Chemistry, State University of oral administration to rats. Biopharm Utrecht, Transitorium III, Padualaan Drug Dispos. 1988;9:397-403. 8, NL-3584 CH Utrecht, The 8. Kumakura S, Yamashita M and Netherlands. BLASTP 2.2.18. Tsurufuji S. Effect of bromelain on 14. Stephen F, Thomas L, Alejandro A, kaolin-induced inflammation in rats. Jinghui Z, Zheng Z, Webb Mi and Eur J. Pharmacol. 1988;150:295-301. David J. Gapped BLAST and PSI- 9. Uchida Y and Katori M. Independent BLAST: a new generation of protein consumption of high and low database search programs. Nucleic molecular weight kininogens in vivo. Acids Res. 1997;25:3389-3402. Adv Exp Med Biol. 1986;198:113-118. 15. Protein structure database search and 10. Taussig SJ and Batkin S. Bromelain, evolutionary classification Jinn-Moon the enzyme complex of pineapple Y and Chi-Hua T. Nucleic Acids Res. (Ananas comosus) and its clinical 2006;34(13): 3646–3659. application. J Ethnopharmacol. 16. Krzysztof G, Nick VG, Adam G and 1988;22:191-203. Leszek R. Practical lessons from 11. Felton GE. Fibrinolytic and protein structure prediction. Nucleic antithrombotic action of bromelain Acids Res. 2005;33(6):1874–1891. may eliminate thrombosis in heart 17. Eric DS and Philip EB. Application of patients. Med Hypotheses. protein structure alignments to iterated 1980;6:1123-1133. hidden Markov model protocols for 12. Heinicke RM, Van der Wal M, structure prediction. BMC Yokoyama MM. Effect of bromelain Bioinformatics. 2006;7:410. (Ananase) on human platelet 18. www.scienceresearch.com. aggregation. Experientia. 19. www.ncbi nlm.nih.org./PubMed/ 1972;28:844-845. 20. www. highwire.stanford.edu

Vol. 2 (2) Apr-Jun 2013 www.ijpcsonline.com 997