Structural Biology and Its Role in Variant Classification
Total Page:16
File Type:pdf, Size:1020Kb
Structural biology and its role in variant classification Created for educational purposes by Iain Kerr, Ph.D. Medical Services, Myriad Genetic Laboratories, Inc. Data presented as of June 2016 Myriad and the Myriad logo are registered trademarks of Myriad Genetics, Inc., in the United States and other jurisdictions. ©2016 Myriad Genetic Laboratories, Inc. Introduction/Background Structural Biology is the study of the three-dimensional structure of proteins and nucle- ic acids and their interaction with other molecules in the cell. As the structure of these molecules is associated with their function,1-6 mutations that impart structural changes may lead to the production of abnormal/malfunctioning proteins that cause disease. transcription translation DNA RNA protein The basis of the central dogma and the foundation of genetic testing lies in the discovery of the double-helical structure of DNA, which revealed the molecular details of the genetic code and implied a copying mechanism that would later be replicated in vitro to perform DNA sequencing.7, 8 MEEPQSDPSVEPPLSQETFSDLWKLLPENN A B VLSPLPSQAMDDLMLSPDDIEQWFTEDPGP DEAPRMPEAAPPVAPAPAAPTPAAPAPAPS unfolded protein chain Folded protein Figure 1. Linear strings of amino acids form a folded, three dimensional protein. (A) Part of the p53 amino acid sequence (above). (below) Individual blocks in the schematic represent indvidual amino acids, like beads on a string (B) The three dimensional structure of the DNA-binding domain of TP53 bound to DNA. Created for educational purposes by Medical Services, Myriad Genetics Laboratories. Data presented as of June 2016. 2 A B C MSH2 MSH6 Figure 2. Protein domain structure and organization. (A) the ATPase domain of MSH2. (B) The full-length MSH2 protein. Individual domains differ in color. The ATPase domain is circled (C) The structure of the MSH2•MSH6 mismatch recognition complex. During protein synthesis, as the primary amino acid sequence is converted into more complex structure, separate protein domains begin to form (Fig. 2A). Domains represent conserved, independently folded modules of a protein that have their own compact, structure. Protein domains often have unique functions that are essential to the role played by the protein in the cell. A protein may contain one domain (Fig. 2A) or multiple domains (Fig. 2B). Often, two or more proteins interact to form a functional complex (Fig. 2C), as is often the case during DNA repair. For this reason, conservation analysis of the primary sequence is insufficient to explain the role of a variant and may be thought of as one-dimensional solution to a more complex, multi-component, three-dimensional problem. Created for educational purposes by Medical Services, Myriad Genetics Laboratories. Data presented as of June 2016. 3 Techniques While Structural Biology encompasses a number of experimental techniques, Nuclear Magnetic Resonance Spectrospcopy (NMR) and X-ray Crystallography produce the most accurate, detailed protein structures. X-ray Crystallography is routinely used by the pharmaceutical industry in rational drug-design9-12 and is the primary method of choice for full structure determination. NMR has the advantage of being able to study the dynamics of a protein (how proteins ‘move’). However, the NMR experiment is limited by the size of protein that can be studied in solution. Many of the commonly studied hereditary cancer genes encode large proteins outside the reach of NMR. Figure 3. Three stages of the X-ray crystallography experiment – crystal growth, diffraction image from a protein crystal exposed to X-rays and the final structure In cases where experimental structure data are not available for a given protein, homology modelling may be used when the structure of a protein, similar to the desired target, is known. As the precision of the homology model cannot be validated against experimental structure data, great care must be taken when using this method. To ensure the accuracy of the model, the target protein and the homolog with known structure should have amino acid sequences that share a high degree of identity. 13, 14 More advanced computational techniques may be used to further explore the effect a given variant may have on protein structure and function.15 These methods are especially useful in X-ray crystal structure analysis; while proteins and their atoms are known to exhibit dynamic motion, crystal structures represent a ‘static’, averaged, snap-shot in time. Techniques such as Molecular Dynamics (MD), can provide information on the motion of surrounding structure when a variant is introduced, or on the protein as a whole. In recognition of the importance of this field, Drs. Martin Karplus, Michael Levitt and Arieh Warshel received the 2013 Nobel Prize in Chemistry for “the development of multiscale models for complex chemical systems.” Created for educational purposes by Medical Services, Myriad Genetics Laboratories. Data presented as of June 2016. 4 Structural Biology in Variant Classification The 2015 ACMG “Standards and guidelines for the interpretation of sequence variants”16 state that mutations located within a “critical and well-established functional domain” may be considered moderate evidence of pathogenicity. This provides a platform for the use of Structural Biology in variant classification. In particular, this method is advantageous for missense variants where current non-structural data, in isolation, are insufficient to lead to a definitive classification. Case 1. TP53 c.844C>T (p.Arg282Trp) DNA L1 Arg282 (wild-type) DNA L1 ? Arg282Trp (variant) Figure 4. A three-dimensional structure of the TP53 tetramer bound to DNA. The small square denotes the region where the variant occurs on one p53 molecule. Large squares show this region in greater detail. Created for educational purposes by Medical Services, Myriad Genetics Laboratories. Data presented as of June 2016. 5 The variant is located in the DNA-binding domain (DBD) of TP53. In addition, Arg282 is a mutation ‘hotspot’ for TP53 pathogenic mutations. For this variant, an experimentally determined structure is available.17 This is often not the case and protein modelling of the wild-type structure is usually required to visualize the effect of most variants. The substitution of arginine (Arg) for tryptophan (Trp) at amino acid 282 causes displacement of the DNA-binding L1 loop and the surrounding structure becomes more flexible, showing this to be a destabilizing mutation. When analyzed at 37˚C in a biophysical assay, the mutation is show to completely unfold the TP53 protein18, confirming the structural mechanism. Unfolded proteins like TP53 are unable to perform their function. Given the evidence, this mutation is classified as suspected deleterious (likely pathogenic). Case 2. BRCA1 c.5153G>C (p.Trp1718Ser) Trp1718 (wild-type) BRCT1 BRCT2 Figure 5. A three-dimensional structure of the BRCT domains of BRCA1. The variant is buried in the BRCT1 domain of BRCA1. Trp1718Ser Created for educational purposes by Medical Services, Myriad Genetics Laboratories. Data presented as of June 2016. 6 The ability of the C-terminal BRCT domains to bind phosphorylated proteins is essential to BRCA1’s role as a tumour suppressor. 19 Trp1718 is buried in the core of the BRCT1 domain. Mutations buried in the protein core are frequently associated with disease.20 The variant, Trp1718Ser, has a much smaller side chain creating a cavity in the protein core. Cavity creating mutations are known to be destabilizing to protein structure. In vitro, biochemical analysis of Trp1718Ser by Lee et al. confirmed that the mutant protein harbors a severe folding defect which compromises the downstream activity of BRCA1.21 On the basis of structural analysis and functional data, c.5153G>C (p.Trp1718Ser) is therefore considered suspected deleterious. References [1] Phillips, D. C. (1966) The three-dimensional structure of an enzyme molecule, Scientific American 215, 78-90. [2] Edelman, G. M. (1973) Antibody structure and molecular immunology, Science 180, 830-840. [3] Martin, A. C., Orengo, C. A., Hutchinson, E. G., Jones, S., Karmirantzou, M., Laskowski, R. A., Mitchell, J. B., Taroni, C., and Thornton, J. M. (1998) Protein folds and functions, Structure 6, 875-884. [4] Hegyi, H., and Gerstein, M. (1999) The relationship between protein structure and function: a comprehensive survey with application to the yeast genome, Journal of molecular biology 288, 147-164. [5] Thornton, J. M., Orengo, C. A., Todd, A. E., and Pearl, F. M. (1999) Protein folds, functions and evolution, Journal of molecular biology 293, 333-342. [6] Dessailly, B. H., Redfern, O. C., Cuff, A., and Orengo, C. A. (2009) Exploiting structural classifications for function prediction: towards a domain grammar for protein function, Current opinion in structural biology 19, 349-356. [7] Watson, J. D., and Crick, F. H. (1953) Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid, Nature 171, 737-738. [8] Watson, J. D., and Crick, F. H. (1953) Genetical implications of the structure of deoxyribonucleic acid, Nature 171, 964-967. [9] Colman, P. M. (1994) Structure-based drug design, Current opinion in structural biology 4, 868-874. [10] Wlodawer, A., and Vondrasek, J. (1998) Inhibitors of HIV-1 protease: a major success of structure-assisted drug design, Annual review of biophysics and biomolecular structure 27, 249-284. [11] Chen, L., Morrow, J. K., Tran, H. T., Phatak, S. S., Du-Cuny, L., and Zhang, S. (2012) From laptop to benchtop to bedside: structure-based drug design on protein targets, Current pharmaceutical design 18, 1217-1239. [12] Congreve, M., Dias, J. M., and Marshall, F. H. (2014) Structure-based drug design for G protein-coupled receptors, Progress in medicinal chemistry 53, 1-63. [13] Dalton, J. A., and Jackson, R. M. (2007) An evaluation of automated homology modelling methods at low target template sequence similarity, Bioinformatics 23, 1901-1908. [14] Kopp, J., and Schwede, T. (2004) Automated protein structure homology modeling: a progress report, Pharmacogenomics 5, 405-416. [15] Friedman, R., Boye, K., and Flatmark, K. (2013) Molecular modelling and simulations in cancer research, Biochimica et biophysica acta 1836, 1-14.