In Silico Characterization of Human Cytochrome P450 Monooxygenases
Science Vision www.sciencevision.org
Science Vision volume 15 number 4 2015 October-December
Original Research
ISSN (print) 0975-6175 ISSN (online) 2229-6026

In silico characterization of human cytochrome P450 monooxygenases

K. Syed Ibrahim, Anuradha Laishram, Esther Lalnunmawii, N. Senthil Kumar*
Department of Biotechnology, Mizoram University, Aizawl 796004, India
Received 20 July 2015 | Revised 26 October 2015 | Accepted 30 October 2015

ABSTRACT

Cytochrome P450 monooxygenases (CYPs) represent a large and diverse family of enzymes involved in a myriad of biological processes in humans. In the present study, a total of 57 protein sequences of human CYPs retrieved from UniProtKB were characterized for various physicochemical properties, homology, motifs, superfamily membership and phylogenetic relationships. Physicochemical analysis showed that the isoelectric point values and GRAVY index ranged from 5.84 to 9.47 and 0.018 to -0.367, respectively. Most proteins (50 members, 87.8%) were basic, while a few (7 members, 12.2%) were acidic. Moreover, the GRAVY index indicated that only CYP26C1 is hydrophobic, while all others are hydrophilic. Phylogenetic analysis revealed that P450 proteins fall into two main clades and are divided into five subgroups. Motif analysis with MEME indicated presence, absence and even shuffling of motifs within clades. Clustering by maximum likelihood analysis was also in accordance with the central roles of P450s in drug and xenobiotic metabolism as well as steroid hormone synthesis, fat-soluble vitamin metabolism, and the conversion of polyunsaturated fatty acids to biologically active molecules. Motif conservation within clusters showed the evolutionary pressure for maintenance of the structural and functional organization between different groups of proteins.
These results will help in understanding the characteristics of the cytochrome P450 monooxygenase isoforms.

Key words: Cytochrome P450; in silico; maximum likelihood; motif analysis; phylogenetic analysis.

Corresponding author: Senthil Kumar
Phone: +91-9436352574
E-mail: [email protected]
CC BY-SA 4.0 International

INTRODUCTION

Cytochrome P450 (CYP450) is a large family of heme-thiolate proteins that plays a key role in xenobiotic metabolism, metabolizing most of the drugs and chemicals of toxicological importance. Besides that, they also play major roles in diverse physiological processes including steroid and cholesterol biosynthesis, fatty acid metabolism (prostacyclin, thromboxane) and the maintenance of calcium homoeostasis.[1] These highly conserved genes are found distributed in different life forms, including prokaryotes (archaea, bacteria), unicellular eukaryotes (protists, fungi) and multicellular eukaryotes (plants and animals).[2] The human genome encodes 57 P450 genes[3] grouped into 18 mammalian families and 44 subfamilies. They are basically membrane-associated proteins[4] located on the smooth endoplasmic reticulum of cells ubiquitously in the body, but predominantly expressed in the liver.[5] Any imbalance in enzyme availability or its malfunction, e.g., due to a genetic mutation, may lead to a disease state in humans.[6,7] In humans, CYP450 are known for their central role in phase I drug metabolism, where they are of critical importance to two of the most significant problems in clinical pharmacology: drug interactions and inter-individual variability in drug metabolism.[8] Besides detoxification, they also synthesize biologically active compounds such as steroids, prostaglandins, and arachidonate metabolites. CYP proteins have been identified in life forms like animals, plants, fungi, protists, bacteria, archaea, and even in viruses,[9] but not in Escherichia coli.[8]

P450s are usually divided into gene families. Besides, they are also divided into four different classes depending on the redox partner required. They have varied primary, secondary, and tertiary structures. Each P450 has a preferred set of substrates and must therefore have a unique method of substrate recognition and a unique active site, yet they all appear to share a similar structural core. Thus, the charge distribution and topography of the ‘redox-partner binding region’ of each class of P450 vary to accommodate the various types of redox partners, thereby providing a great variety of phylogenetically distributed isoform activities. Though P450s have been much studied, much remains poorly understood.

In the present study, we performed an in silico analysis of human P450 protein sequences by analyzing their biochemical features, homology, motif patterns, cluster and superfamily distribution to understand their functional evolution.

METHOD

The amino acid sequences of the human P450s in FASTA format were downloaded from the UniProt Knowledgebase (UniProtKB) (Table 1). Physicochemical analyses were performed by the methods explained by Brindha et al.[10] Data on physicochemical properties were generated from the protein sequences using tools such as ProtParam, Protein calculator, Compute pI/Mw and ProtScale from the Expert Protein Analysis System (ExPASy) proteomic server.[11] The molecular weights (kilodalton) were calculated by adding the average isotopic masses of the amino acids in the protein and deducting the average isotopic mass of one water molecule. The pI was calculated using pK values of amino acids according to Bjellqvist et al.[12] The atomic composition, extinction coefficients and aliphatic index were derived using the ProtParam tool, available at ExPASy. The instability index, which predicts regional instability by calculating the weighted sum of dipeptides that occur more frequently in unstable proteins than in stable proteins, was calculated using the approach of Guruprasad et al.[13] The grand average of hydropathy (GRAVY) was calculated by summing the hydropathy values of all residues and dividing by the length of the sequence.[14] The aliphatic index was calculated using the formula X(Ala) + a·X(Val) + b·X(Leu) + b·X(Ile), where X is the mole percent of the residue and a = 2.9 and b = 3.9 are constants.[15]

MEGA 6.01[16] was used for phylogenetic analysis by the maximum likelihood (ML) method. An optimal model of evolution for the aligned dataset was determined. Sequence divergences were then calculated using the best model identified by MEGA 6.01. This model choice was guided by the need to avoid underparameterisation.[17] Domain analysis was performed using Pfam 27.0.[18] Motif analysis was done with MEME,[19] which uses the expectation maximization approach.

RESULTS AND DISCUSSION

Physicochemical characterization of P450 proteins

Humans have 57 CYP450 genes,[3] which are subdivided into 18 families and 44 subfamilies. There are more than 9000 known CYP450 sequences.[20] Owing to its importance, many ani-

Table 1. Physicochemical parameters of the P450 proteins. Values are listed column by column; the order within one column does not necessarily match the order in another.

Major clade: Group 1
Sub clade: 1; 2; 3
Gene/amino acids: CYP4F8/520; CYP4F3/520; CYP4F2/520; CYP4Z1/505; CYP4X1/509; CYP4B1/511; CYP5A1/533; CYP4V2/525; CYP3A7/503; CYP3A5/502; CYP3A4/503; CYP4F22/531; CYP4F12/524; CYP4F11/524; CYP27B1/508; CYP11B2/521; CYP11B1/503; CYP27C1/372; CYP26A1/497; CYP27A1/531; CYP24A1/514; CYP11A1/521; CYP4A22/519; CYP4A11/519; CYP3A43/503
Accession no.: P19099; P15538; P05108; P24557; P98187; P78329; P13584; P24462; P20815; P08684; Q9HBI6; O43174; O15528; Q02318; Q07973; Q08477; Q02928; Q4G0S4; Q8N118; Q6NT55; Q9HB55; Q9HCS2; Q5TCH4; Q6ZWL3; Q86W10
Mw: 56504; 60102; 60724; 56198.6; 42632.4; 60234.7; 58875.4; 57560.1; 57572.9; 60518.3; 59085.9; 58875.2; 61958.2; 60269.8; 60145.7; 59994.6; 59846.7; 59853.4; 58991.1; 59245.8; 59347.8; 57669.7; 57525.5; 57108.6; 57343.1
pI: 9.4; 6.6; 8.96; 9.31; 9.34; 9.05; 8.94; 9.47; 8.89; 7.56; 9.29; 8.74; 7.19; 8.95; 7.02; 6.26; 8.73; 7.59; 8.47; 9.21; 8.96; 8.27; 9.21; 8.86; 8.27
Extinction coefficient: 51755; 48610; 66390; 78185; 80955; 67755; 69245; 87445; 47620; 85995; 84060; 83935; 92650; 95630; 88640; 99515; 86330; 74870; 85870; 50225; 53205; 53080; 46215; 102830; 102745
Instability index: 52.4; 42.4; 50.27; 40.97; 56.21; 49.52; 49.31; 42.31; 47.96; 34.54; 44.89; 43.32; 49.69; 52.52; 55.43; 52.23; 54.23; 57.86; 44.19; 43.76; 33.92; 29.92; 32.31; 41.19; 52.518
Aliphatic index: 96.2; 95.15; 89.31; 90.23; 88.04; 99.11; 96.96; 89.06; 90.75; 90.16; 88.55; 90.11; 96.97; 94.39; 94.18; 94.88; 95.29; 94.71; 85.11; 98.46; 98.03; 97.59; 96.08; 95.47; 100.05
GRAVY: -0.1; -0.14; -0.21; -0.16; -0.07; -0.04; -0.081; -0.179; -0.185; -0.367; -0.073; -0.245; -0.059; -0.196; -0.284; -0.201; -0.126; -0.134; -0.074; -0.118; -0.204; -0.208; -0.186; -0.036; -0.116
Domain predicted by pfam: 442(K).; 442(C).; 318(C).; 455(C).; 476(C).; 428(C).; 450(C).; 450(C).; 462(C).; 479(C).; 452(C).; 454(C).; 468(C).; 468(C).; 442(C).; 441(C).; 442(C).; 467(C), 329(E).; 475(C), 335(E).; 468(C), 328(E).; 468(C), 328(E).; 468(C), 328(E).; 453(C), 315(E).; 457(C), 321(E).; 457(C), 321(E).
Function: Drug/xenobiotic metabolism; Thromboxane A2 synthesis; Arachidonic acid and fatty acid metabolism; Bile acid biosynthesis, vitamin D activation; Key steps in steroid biosynthesis; Retinoic acid metabolism/inactivation; Vitamin D metabolism/inactivation

Table 1 (continued).

Function: Drug/xenobiotic metabolism; Testosterone and oestrogen biosynthesis; Cholesterol biosynthesis; Cholesterol metabolism; Oestrogen biosynthesis (aromatase); Steroid biosynthesis; Retinoic acid metabolism/inactivation
Domain predicted by pfam: 435(R) 436(L).
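
The GRAVY and aliphatic index definitions given in the Method section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ProtParam implementation used in the study: the function names `gravy` and `aliphatic_index` and the toy peptide are hypothetical, the hydropathy values are the standard Kyte-Doolittle scale, and X is taken as the mole percent of each residue.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: sum of residue values / sequence length.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence: str, a: float = 2.9, b: float = 3.9) -> float:
    """Aliphatic index: X(Ala) + a*X(Val) + b*(X(Ile) + X(Leu)).

    X is the mole percent (100 * count / length) of each residue;
    a and b default to the constants quoted in the Method section.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    def mole_percent(residue: str) -> float:
        return 100.0 * seq.count(residue) / len(seq)

    return (mole_percent("A")
            + a * mole_percent("V")
            + b * (mole_percent("I") + mole_percent("L")))


if __name__ == "__main__":
    toy_peptide = "MALLLAVF"  # hypothetical example, not a real CYP sequence
    print("GRAVY:", round(gravy(toy_peptide), 3))
    print("Aliphatic index:", round(aliphatic_index(toy_peptide), 2))
```

Under this convention a negative GRAVY value indicates a hydrophilic protein and a positive value a hydrophobic one, which is how the values in Table 1 are read in the abstract.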