Variants to Decipher the Relationship Between Protein Stability

Variants to Decipher the Relationship Between Protein Stability

www.nature.com/scientificreports OPEN Computational analysis of androgen receptor (AR) variants to decipher the relationship between protein stability and related‑diseases Fangfang Chen1*, Xiaoqing Chen2, Fan Jiang3, Feng Leng4, Wei Liu5, Yaoting Gui1 & Jing Yu6* Although more than 1,000 androgen receptor (AR) mutations have been identifed and these mutants are pathologically important, few theoretical studies have investigated the role of AR protein folding stability in disease and its relationship with the phenotype of the patients. Here, we extracted AR variant data from four databases: ARDB, HGMD, Cosmic, and 1,000 genome. 905 androgen insensitivity syndrome (AIS)-associated loss-of-function mutants and 168 prostate cancer-associated gain-of-function mutants in AR were found. We analyzed the efect of single-residue variation on the folding stability of AR by FoldX and guanidine hydrochloride denaturation experiment, and found that genetic disease-associated mutations tend to have a signifcantly greater efect on protein stability than gene polymorphisms. Moreover, AR mutants in complete androgen insensitivity syndrome (CAIS) tend to have a greater efect on protein stability than in partial androgen insensitive syndrome (PAIS). This study, by linking disease phenotypes to changes in AR stability, demonstrates the importance of protein stability in the pathogenesis of hereditary disease. Since most proteins need to be folded to function, protein stability is one of the most basic properties of a pro- tein. Te protein stability discussed herein primarily refers to the thermodynamic stability of a protein, which determines whether the protein is in a naturally folded confguration or a denatured (unfolded or extended) state. Protein stability is a fundamental property that afects protein confguration, activity and regulation. It plays an essential role in evolution, a variety of diseases and industrial applications1–4. Te most common cause of monogenic diseases is single-nucleotide variation (SNV) leading to amino acid substitutions. Tese missense variants can have a strong efect on the stability of a protein, leading to detrimental changes to protein function. Loss of protein stability is a major contributor to this single-gene disease1. More and more attention has been paid in the past few decades to understand the biological principles of protein stability5,6. Accurately predicting protein stability through theoretical and experimental methods is crucial for academic research and industrial applications. Androgens have a wide range of physiological efects on male reproductive and non-reproductive systems at diferent stages of development7–9. During the fetal period, androgens are primarily responsible for sex dif- ferentiation by masculinizing the Wolf tube and external genitalia 9. During puberty, androgens regulate the growth and function of the male reproductive system 9. In adults, androgens play key role in regulating behavior, spermatogenesis and bone metabolism 9. Androgens mediate their actions primarily via the androgen receptor 1Guangdong and Shenzhen Key Laboratory of Male Reproductive Medicine and Genetics, Peking University Shenzhen Hospital, Shenzhen PKU-HKUST Medical Center, Shenzhen 518036, China. 2Department of Orthopaedics, Quanzhou First Hospital Afliated to Fujian Medical University, Quanzhou 350005, China. 3NanoAI Biotech Co., Ltd., Huahan Technology Industrial Park, Pingshan District, Shenzhen 518109, China. 4National Cancer Institute, National Institute of Health, Bethesda, MD 20892, USA. 5Shenzhen Key Laboratory for Neuronal Structural Biology, Biomedical Research Institute, Shenzhen PKU-HKUST Medical Center, Shenzhen 518036, China. 6Department of Laboratory Medicine, Peking University Shenzhen Hospital, Shenzhen 518036, China. *email: [email protected]; [email protected] SCIENTIFIC REPORTS | (2020) 10:12101 | https://doi.org/10.1038/s41598-020-68731-7 1 Vol.:(0123456789) www.nature.com/scientificreports/ Figure 1. Flowchart of the computational analysis pipeline for disease-associated androgen receptor (AR) variants. (AR), a ligand-dependent nuclear transcription factor expressed in primary/secondary sex organs 8,9. AR is also expressed in non-genital organs such as the skeletal muscle, skin, adrenal gland, kidney and nervous system8,9. AR is an extensively studied steroid receptors. AR mutations have been identifed in various diseases, includ- ing hereditary diseases such as androgen insensitivity syndrome (AIS)10,11, spinal and bulbar muscular atrophy (SBMA)12,13 and benign prostatic hyperplasia14,15. AR also plays an important role in the development and metastasis of several hormone-related cancers, including prostate cancer12, breast cancer16,17, liver cancer18–20. Te AR contains three major functional domains: (1) the N-terminal domain (NTD) comprises an activation function 1 (AF-1) region, (2) DNA binding domain (DBD) and (3) the C-terminal ligand binding domain (LBD) comprise an AF-2 region8,21. Te primary mechanism of action for AR is to directly regulate gene transcription. Androgen binds to the AR, leading to conformational change of AR, dissociation of heat shock proteins, driv- ing the interaction between the N- and C-terminus of AR, and importin-α binds AR to transport AR into the nucleus22. In the nucleus, the AR dimerizes and binds to androgen response elements (ARES) in the promoter region of the target genes23. AR interact with additional proteins in the nucleus, causing the transcription of specifc target genes to be up- or down-regulated. Notable target genes for AR are insulin-like growth factor I receptor (IGF-1R)24, prostate-specifc antigen (PSA)25,26, and transmembrane protease serine 2 (TMPRSS2)27–29. Although more than 1,000 AR mutations have been identified and these mutants are pathologically important12, few theoretical studies have investigated the impact of mutations on AR protein folding stability in disease and its relationship with the phenotype of the patients. Several algorithms have been developed to predict the efect of mutations on protein stability 30–35. Notable algorithms include FoldX 36, Dmutant37, I-Mutant2.038, CUPSAT39, Eris40 and STRUM41. Compared with other methods, FoldX performs well and is the most commonly used protein stability prediction algorithm42. Folding free energy refects the overall protein stability, and changes in protein stability due to naturally occurring missense mutations ofen cause disease42. Tis work calculates the folding free energy of AR variants by FoldX and measures the guanidine hydrochloride (GdmHCl) denaturation curves of diferent mutants, trying to establish correlation between protein stability and patient phenotype. By correlating the patient’s phenotype with changes in AR stability, this study may prove to be diagnostic and/or predictive tools for assessing the efects of mutations on disease outcome. Results and discussion Computational pipeline for disease‑associated androgen receptor (AR) variants. In order to study the pathogenic pattern of AR mutants in related diseases, our computational analysis consist of the fol- lowing steps as in Fig. 1: (1) extracted AR variations from the following four databases: ARDB (Te androgen receptor gene mutations database: 2012 update; https ://andro gendb .mcgil l.ca), HGMD (Human Gene Mutation Database, 2015), Cosmic (2018.10.01), 1,000 genome; (2) analyzed mutation frequency, protein stability and SCIENTIFIC REPORTS | (2020) 10:12101 | https://doi.org/10.1038/s41598-020-68731-7 2 Vol:.(1234567890) www.nature.com/scientificreports/ Figure 2. AR domains and structure. Blue and red spheres represent N-terminal and C-terminal, respectively. relative surface accessibility (RSA) of these variants; (3) summarized the pathogenic mechanism of AR mutants in related diseases. AR structure. Te NTD domain accounts for more than half of the entire AR protein as in Fig. 2 (amino acids 1–558). AR NTD contains polyglutamine (ploy-Q) and polyglycine (poly-G) repeats, and the length of these two repeats is highly variable in the human population 43,44. Te latest human AR reference gene sequence (NM_000044.2) encodes a protein of 920 amino acids in length (instead of the previous 919). Because the refer- ence length of ploy-Q is replaced by 23 instead of the original 21, and the reference length of poly-G is changed from 24 to 23. Te length of the ploy-Q repeat is inversely proportional to the AR transcriptional activity, and the longer the polyglutamine repeat, the smaller the AR transcriptional activity 45. Te AR NTD is an intrinsi- cally disordered proteins (IDP) that lacks a stable structure in aqueous solution (Fig. 2)46,47. AR NTD undergoes conformational changes when interacting with DNA and/or target proteins and in the presence of structurally stable solutes46,47. Te plasticity of the AR NTD structure allows it to interact with many structurally distinct proteins (e.g., P160 family coactivator, transcription factor IIF) and intramolecularly interact with C terminal LBD domain (N/C interaction of AR)48–51. AR DBD (amino acids 560–620) consists of two zinc-coordinated modules, and it is highly conserved among steroid hormone receptors (Fig. 2). DBD selectively binds to the androgen response elements (ARES) on the promoter, activating specifc AR target genes, such as TMPRSS2, PSA and IGF-1R52. AR contains two NLS sequences-one in the DBD domain, the other one in the hinge region between DBD and LBD. NLS consists of two basic amino acid clusters separated by ten residues (617-RKCYEAGMTLGARKLKK-634).

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    11 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us