Protein Intrinsic Disorder As a Flexible Armor and a Weapon of HIV-1
Cell. Mol. Life Sci. (2012) 69:1211–1259
DOI 10.1007/s00018-011-0859-3
Cellular and Molecular Life Sciences
REVIEW

Bin Xue · Marcin J. Mizianty · Lukasz Kurgan · Vladimir N. Uversky

Received: 15 July 2011 / Revised: 28 September 2011 / Accepted: 3 October 2011 / Published online: 28 October 2011
© Springer Basel AG 2011

Abstract Many proteins and protein regions are disordered in their native, biologically active states. These proteins/regions are abundant in different organisms and carry out important biological functions that complement the functional repertoire of ordered proteins. Viruses, with their highly compact genomes, small proteomes, and high adaptability to fast changes in their biological and physical environment, utilize many of the advantages of intrinsic disorder. In fact, viral proteins are generally rich in intrinsic disorder, and intrinsically disordered regions are commonly used by viruses to invade host organisms, to hijack various host systems, to accommodate to their hostile habitats, and to manage their economic usage of genetic material. In this review, we focus on the structural peculiarities of HIV-1 proteins, on the abundance of intrinsic disorder in viral proteins, and on the role of intrinsic disorder in their functions.

Keywords HIV-1 · Viral protein · Protein–protein interaction · Intrinsically disordered protein · MoRF

Electronic supplementary material The online version of this article (doi:10.1007/s00018-011-0859-3) contains supplementary material, which is available to authorized users.

B. Xue · V. N. Uversky (✉)
Department of Molecular Medicine, University of South Florida, College of Medicine, 12901 Bruce B. Downs Blvd, MDC07, Tampa, FL 33612, USA
e-mail: [email protected]

M. J. Mizianty · L. Kurgan
Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 2V4, Canada

V. N. Uversky
Institute for Biological Instrumentation, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia

Introduction

In addition to transmembrane, globular, and fibrous proteins, it is becoming increasingly recognized that the protein universe includes intrinsically disordered proteins (IDPs) and proteins with intrinsically disordered regions (IDRs). These IDPs and IDRs are biologically active yet fail to form specific 3D structures, existing as collapsed or extended dynamic conformational ensembles [1–7]. These floppy proteins and regions are known as pliable [8], rheomorphic [9], flexible [10], mobile [11], partially folded [12], natively denatured [13], natively unfolded [3, 14], natively disordered [6], intrinsically unstructured [2, 5], intrinsically denatured [13], intrinsically unfolded [14], intrinsically disordered [15], vulnerable [16], chameleon [17], malleable [18–20], 4D [21], protein clouds [22, 23], and dancing proteins [24], among several other terms. The variability of terms used to describe such proteins and regions reflects their highly dynamic nature and their lack of unique 3D structures.

Since these proteins are highly abundant in any given proteome [25], the role of disorder in determining protein functionality in organisms can no longer be ignored. Native biologically active proteins were conceptualized as parts of the “protein trinity” [15] or the “protein quartet” [26] models, in which functional proteins might exist in one of several conformations: ordered, collapsed-disordered (molten globule-like), partially collapsed-disordered (pre-molten globule-like), or extended-disordered (coil-like). Protein function might be derived from any one of these states and/or from the transitions between them. Disordered proteins are typically involved in regulation, signaling, and control pathways [27–29]. Their functions complement the functional repertoire of ordered proteins, which have evolved mainly to carry out efficient catalysis [30–33].

Because IDPs play crucial roles in numerous biological processes, it was not too surprising to find that many of them are involved in human diseases [34]. Originally, this hypothesis was based on numerous case studies in which a particular IDP was shown to be associated with a particular disease. For example, the presence of disorder has been directly observed in several cancer-associated proteins, including p53 [35], p57Kip2 [36], Bcl-XL and Bcl-2 [37], c-Fos [38], thyroid cancer associated protein TC-1 [39], and many others. Other maladies associated with IDPs include Alzheimer’s disease (deposition of amyloid-β, tau-protein, and the α-synuclein fragment NAC [40–43]); Niemann-Pick disease type C, subacute sclerosing panencephalitis, argyrophilic grain disease, myotonic dystrophy, and motor neuron disease with neurofibrillary tangles (accumulation of tau-protein in the form of neurofibrillary tangles [42]); Down’s syndrome (nonfilamentous amyloid-β deposits [44]); Parkinson’s disease, dementia with Lewy bodies, diffuse Lewy body disease, Lewy body variant of Alzheimer’s disease, multiple system atrophy, and Hallervorden–Spatz disease (deposition of α-synuclein in the form of Lewy bodies or Lewy neurites [45]); prion diseases (deposition of PrPSc [46]); and a family of polyQ diseases, a group of neurodegenerative disorders caused by expansion of CAG trinucleotide repeats coding for polyQ in the gene products [47].

At least three computational and bioinformatics approaches were elaborated to estimate the abundance of IDPs in various pathological conditions. The first approach was based on the assembly of specific data sets of proteins associated with a given disease and the computational analysis of these data sets using a number of disorder predictors [27, 48–52]. In essence, this was an analysis of individual proteins extended to a set of independent proteins. Using this approach, a prevalence of intrinsic disorder was observed in proteins associated with cancer [27], cardiovascular disease [49], neurodegenerative diseases [7, 51], various amyloidoses [50], and diabetes [34]. A second approach utilized the diseasome, a network of genetic diseases in which the related proteins are interlinked within one disease and between different diseases [53]. A third approach was based on evaluating the association between a particular protein function (including disease-specific functional keywords) and the level of intrinsic disorder in a set of proteins known to carry out this function [31–33]. Based on the fact that IDPs and proteins with long IDRs were commonly found in various diseases, the “disorder in disorders” or D² concept was introduced to summarize […] disease-related unfoldome and unfoldomics were developed [25, 54].

Most viral proteins (e.g., proteins involved in replication and morphogenesis of viruses, and the major capsid proteins of icosahedral virions) have no homologues in modern cells despite being shared by many groups of RNA and DNA viruses [55]. This suggests that viruses might be very ancient, and that viral genes either primarily originated in the virosphere during the replication of viral genomes, and/or were recruited from cellular lineages that are now extinct [56]. Viruses represent an interesting example of adaptation to extreme conditions, which include both environmental peculiarities and the biological and genetic features of their hosts. Viruses have to survive outside and within the host cell (some viruses infect Archaea isolated from geothermally heated hot environments [57]), and need to infect the host organism and replicate their genes while avoiding the host’s countermeasures [58]. Genomes of many viruses are characterized by unusually high rates of mutation, which, when estimated as exchanges per nucleotide per generation, can be as high as 10⁻⁵–10⁻³ for RNA viruses, 10⁻⁵ for single-stranded DNA (ssDNA) viruses, and 10⁻⁸–10⁻⁷ for double-stranded DNA viruses, compared to 10⁻¹⁰–10⁻⁹ in bacteria and eukaryotes [59]. Viral genomes are unusually compact and contain overlapping reading frames. Therefore, a single mutation might affect more than one viral protein [58].

All these peculiarities raised the intriguing question of whether viral proteins possess unique structural features. In an attempt to answer this question, a detailed analysis of viral proteins was undertaken [60]. First, 123 representative single-domain proteins of 70–250 amino acids that contained no covalent cofactors and whose crystal structures had been solved at high resolution were analyzed. Of these 123 proteins, 26 were from RNA viruses, 19 were from DNA viruses (18 from double-stranded and one from a single-stranded DNA virus), 26 were from hyperthermophiles, 26 were from mesophilic eukaryotes, and 26 were from mesophilic prokaryotes. The analysis revealed that viral proteins, and especially RNA viral proteins, possessed systematically lower van der Waals contact densities than proteins from the other groups. Furthermore, viral proteins were shown to have a larger fraction of residues that are not arranged in well-defined secondary structural elements such as helices and strands. Finally, the effects of mutations on protein conformational stability (ΔΔG values) were compared for all these proteins. This analysis showed that viral proteins have a lower average ΔΔG per residue than proteins from other organisms. RNA viral proteins show particularly low ΔΔG values on average, 0.20 kcal/mol lower than mesophilic proteins of the same size, and 0.26 kcal/mol lower