
Structural Biology and its role in variant classification

Created for educational purposes by Iain Kerr, Ph.D. Medical Services, Myriad Genetic Laboratories, Inc. Data presented as of June 2016

Myriad and the Myriad logo are registered trademarks of Myriad Genetics, Inc., in the United States and other jurisdictions. ©2016 Myriad Genetic Laboratories, Inc.

Introduction/Background

Structural Biology is the study of the three-dimensional structure of proteins and nucleic acids and their interaction with other molecules in the cell. As the structure of these molecules is associated with their function,1-6 mutations that impart structural changes may lead to the production of abnormal or malfunctioning proteins that cause disease.

[Diagram: transcription of DNA to RNA]

The basis of the central dogma and the foundation of genetic testing lie in the discovery of the double-helical structure of DNA, which revealed the molecular details of the genetic code and implied a copying mechanism that would later be replicated in vitro to perform DNA sequencing.7,8

[Figure 1, panel A (unfolded protein chain): MEEPQSDPSVEPPLSQETFSDLWKLLPENN VLSPLPSQAMDDLMLSPDDIEQWFTEDPGP DEAPRMPEAAPPVAPAPAAPTPAAPAPAPS. Panel B: folded protein.]

Figure 1. Linear strings of amino acids form a folded, three-dimensional protein. (A) Part of the p53 sequence (above). Individual blocks in the schematic (below) represent individual amino acids, like beads on a string. (B) The three-dimensional structure of the DNA-binding domain of TP53 bound to DNA.
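To make the "beads on a string" picture concrete, the minimal sketch below treats the sequence fragment from Figure 1A as a plain string of one-letter amino acid codes and tallies its residues. The names P53_FRAGMENT and residue_counts are illustrative and not part of the original material.

```python
from collections import Counter

# Sequence fragment shown in Figure 1A (one-letter amino acid codes).
P53_FRAGMENT = (
    "MEEPQSDPSVEPPLSQETFSDLWKLLPENN"
    "VLSPLPSQAMDDLMLSPDDIEQWFTEDPGP"
    "DEAPRMPEAAPPVAPAPAAPTPAAPAPAPS"
)


def residue_counts(sequence: str) -> Counter:
    """Count how often each one-letter amino acid code occurs."""
    return Counter(sequence)


counts = residue_counts(P53_FRAGMENT)
print(len(P53_FRAGMENT))       # number of residues in the fragment
print(counts.most_common(3))   # the three most frequent residues
```

A tally like this describes only the primary sequence. It carries no information about how the chain folds, which is the point the figure makes.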


Figure 2. Protein structure and organization. (A) The ATPase domain of MSH2. (B) The full-length MSH2 protein. Individual domains differ in color. The ATPase domain is circled. (C) The structure of the MSH2•MSH6 mismatch recognition complex.

During protein synthesis, as the primary amino acid sequence is converted into more complex structure, separate protein domains begin to form (Fig. 2A). Domains are conserved, independently folded modules of a protein that have their own compact structure. Protein domains often have unique functions that are essential to the role played by the protein in the cell. A protein may contain one domain (Fig. 2A) or multiple domains (Fig. 2B). Two or more proteins may also interact to form a functional complex (Fig. 2C), as is often the case during DNA repair. For this reason, conservation analysis of the primary sequence is insufficient to explain the role of a variant and may be thought of as a one-dimensional solution to a more complex, multi-component, three-dimensional problem.

Techniques

While Structural Biology encompasses a number of experimental techniques, Nuclear Magnetic Resonance Spectroscopy (NMR) and X-ray Crystallography produce the most accurate, detailed protein structures. X-ray Crystallography is routinely used by the pharmaceutical industry in rational drug design9-12 and is the method of choice for full structure determination. NMR has the advantage of being able to study the dynamics of a protein (how proteins ‘move’). However, the NMR experiment is limited by the size of protein that can be studied in solution. Many of the commonly studied hereditary cancer genes encode large proteins outside the reach of NMR.

Figure 3. Three stages of the X-ray crystallography experiment: crystal growth, the diffraction image from a protein crystal exposed to X-rays, and the final structure.

In cases where experimental structure data are not available for a given protein, homology modelling may be used when the structure of a similar protein is known. As the precision of the homology model cannot be validated against experimental structure data, great care must be taken when using this method. To ensure the accuracy of the model, the target protein and the homolog with known structure should have amino acid sequences that share a high degree of identity.13,14 More advanced computational techniques may be used to further explore the effect a given variant may have on protein structure and function.15 These methods are especially useful in X-ray crystal structure analysis: while proteins are known to exhibit dynamic motion, crystal structures represent a ‘static’, averaged snapshot in time. Techniques such as molecular dynamics (MD) can provide information on the motion of the surrounding structure when a variant is introduced, or on the protein as a whole. In recognition of the importance of this field, Drs. Martin Karplus, Michael Levitt, and Arieh Warshel received the 2013 Nobel Prize in Chemistry for “the development of multiscale models for complex chemical systems.”
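The sequence-identity requirement for homology modelling can be illustrated with a short calculation. The sketch below computes percent identity over the ungapped columns of a pairwise alignment. The function name, the gap character, and the two toy sequences are assumptions made for illustration. Published tools differ in how they define the denominator, so this is one possible convention rather than a standard.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are ignored under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up aligned fragments (not real proteins).
target = "MKT-AYIAKQR"
template = "MKSLAYLAKQR"
identity = percent_identity(target, template)
print(f"{identity:.1f}% identity")
```

The calculation assumes the two sequences have already been aligned. Producing that alignment is a separate step that this sketch does not cover.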

Structural Biology in Variant Classification

The 2015 ACMG “Standards and guidelines for the interpretation of sequence variants”16 state that mutations located within a “critical and well-established functional domain” may be considered moderate evidence of pathogenicity. This provides a platform for the use of Structural Biology in variant classification. In particular, this method is advantageous for missense variants where current non-structural data, in isolation, are insufficient to lead to a definitive classification.

Case 1. TP53 c.844C>T (p.Arg282Trp)

[Figure 4 panels: Arg282 (wild-type) and Arg282Trp (variant), each showing the DNA and the L1 loop.]

Figure 4. A three-dimensional structure of the TP53 tetramer bound to DNA. The small square denotes the region where the variant occurs on one p53 monomer. Large squares show this region in greater detail.

The variant is located in the DNA-binding domain (DBD) of TP53. In addition, Arg282 is a ‘hotspot’ for TP53 pathogenic mutations. For this variant, an experimentally determined structure is available.17 This is often not the case, and protein modelling of the wild-type structure is usually required to visualize the effect of most variants. The substitution of arginine (Arg) with tryptophan (Trp) at amino acid 282 displaces the DNA-binding L1 loop and makes the surrounding structure more flexible, showing this to be a destabilizing mutation. When analyzed at 37°C in a biophysical assay, the mutation was shown to completely unfold the TP53 protein,18 confirming the structural mechanism. An unfolded TP53 protein is unable to perform its function. Given the evidence, this mutation is classified as suspected deleterious (likely pathogenic).
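The relationship between the DNA-level name c.844C>T and the protein-level name p.Arg282Trp can be checked with simple arithmetic. The sketch below maps a coding position to its codon and applies the substitution. The function names are illustrative. The wild-type codon CGG is an assumption inferred from the standard genetic code, since it is the only arginine codon that a C>T change at the first position converts to a tryptophan codon (TGG). The codon table is deliberately limited to the two codons needed here.

```python
from typing import Tuple


def codon_location(c_position: int) -> Tuple[int, int]:
    """Map a 1-based coding-DNA position to (codon number, position within codon 1-3)."""
    if c_position < 1:
        raise ValueError("coding positions start at 1")
    return (c_position - 1) // 3 + 1, (c_position - 1) % 3 + 1


def apply_substitution(codon: str, pos_in_codon: int, ref: str, alt: str) -> str:
    """Replace the base at pos_in_codon (1-3) after checking the reference base."""
    if codon[pos_in_codon - 1] != ref:
        raise ValueError("reference base does not match the codon")
    return codon[:pos_in_codon - 1] + alt + codon[pos_in_codon:]


# Partial codon table: only the entries needed for this example.
CODON_TO_AA = {"CGG": "Arg", "TGG": "Trp"}

codon_number, pos = codon_location(844)
wild_type = "CGG"  # assumed wild-type codon, inferred as described above
variant = apply_substitution(wild_type, pos, "C", "T")
print(codon_number, pos, CODON_TO_AA[wild_type], "->", CODON_TO_AA[variant])
```

The same arithmetic places c.5153 at the second base of codon 1718, consistent with the protein-level name used in Case 2.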

Case 2. BRCA1 c.5153G>C (p.Trp1718Ser)

[Figure 5 labels: Trp1718 (wild-type); Trp1718Ser (variant); BRCT1; BRCT2.]

Figure 5. A three-dimensional structure of the BRCT domains of BRCA1. The variant is buried in the BRCT1 domain of BRCA1.

The ability of the C-terminal BRCT domains to bind phosphorylated proteins is essential to BRCA1’s role as a tumour suppressor.19 Trp1718 is buried in the core of the BRCT1 domain. Mutations buried in the protein core are frequently associated with disease.20 In the variant, Trp1718Ser, the serine has a much smaller side chain than the tryptophan it replaces, creating a cavity in the protein core. Cavity-creating mutations are known to destabilize protein structure. In vitro biochemical analysis of Trp1718Ser by Lee et al. confirmed that the mutant protein harbors a severe folding defect which compromises the downstream activity of BRCA1.21 On the basis of structural analysis and functional data, c.5153G>C (p.Trp1718Ser) is therefore considered suspected deleterious.
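The size difference behind the cavity argument can be put in rough numbers. The sketch below compares approximate residue volumes for tryptophan and serine. The volumes are illustrative assumptions of the kind tabulated in standard biochemistry references, and exact values differ between sources. The names RESIDUE_VOLUME_A3 and volume_change are hypothetical.

```python
# Approximate residue volumes in cubic angstroms (illustrative assumptions;
# published tables differ slightly from one another).
RESIDUE_VOLUME_A3 = {"Trp": 227.8, "Ser": 89.0}


def volume_change(wild_type: str, variant: str) -> float:
    """Return variant volume minus wild-type volume; negative means a smaller residue."""
    return RESIDUE_VOLUME_A3[variant] - RESIDUE_VOLUME_A3[wild_type]


delta = volume_change("Trp", "Ser")
print(f"Trp -> Ser volume change: {delta:.1f} cubic angstroms")
```

A volume difference alone does not establish that a variant is destabilizing. It matters here because Trp1718 is buried in the protein core, where the lost volume is left as a cavity.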

References

[1] Phillips, D. C. (1966) The three-dimensional structure of an enzyme molecule, Scientific American 215, 78-90.
[2] Edelman, G. M. (1973) Antibody structure and molecular immunology, Science 180, 830-840.
[3] Martin, A. C., Orengo, C. A., Hutchinson, E. G., Jones, S., Karmirantzou, M., Laskowski, R. A., Mitchell, J. B., Taroni, C., and Thornton, J. M. (1998) Protein folds and functions, Structure 6, 875-884.
[4] Hegyi, H., and Gerstein, M. (1999) The relationship between protein structure and function: a comprehensive survey with application to the yeast genome, Journal of molecular biology 288, 147-164.
[5] Thornton, J. M., Orengo, C. A., Todd, A. E., and Pearl, F. M. (1999) Protein folds, functions and evolution, Journal of molecular biology 293, 333-342.
[6] Dessailly, B. H., Redfern, O. C., Cuff, A., and Orengo, C. A. (2009) Exploiting structural classifications for function prediction: towards a domain grammar for protein function, Current opinion in structural biology 19, 349-356.
[7] Watson, J. D., and Crick, F. H. (1953) Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid, Nature 171, 737-738.
[8] Watson, J. D., and Crick, F. H. (1953) Genetical implications of the structure of deoxyribonucleic acid, Nature 171, 964-967.
[9] Colman, P. M. (1994) Structure-based drug design, Current opinion in structural biology 4, 868-874.
[10] Wlodawer, A., and Vondrasek, J. (1998) Inhibitors of HIV-1 protease: a major success of structure-assisted drug design, Annual review of biophysics and biomolecular structure 27, 249-284.
[11] Chen, L., Morrow, J. K., Tran, H. T., Phatak, S. S., Du-Cuny, L., and Zhang, S. (2012) From laptop to benchtop to bedside: structure-based drug design on protein targets, Current pharmaceutical design 18, 1217-1239.
[12] Congreve, M., Dias, J. M., and Marshall, F. H. (2014) Structure-based drug design for G protein-coupled receptors, Progress in medicinal chemistry 53, 1-63.
[13] Dalton, J. A., and Jackson, R. M. (2007) An evaluation of automated homology modelling methods at low target template sequence similarity, Bioinformatics 23, 1901-1908.
[14] Kopp, J., and Schwede, T. (2004) Automated protein structure homology modeling: a progress report, Pharmacogenomics 5, 405-416.
[15] Friedman, R., Boye, K., and Flatmark, K. (2013) Molecular modelling and simulations in cancer research, Biochimica et biophysica acta 1836, 1-14.
[16] Richards, S., Aziz, N., Bale, S., Bick, D., Das, S., Gastier-Foster, J., Grody, W. W., Hegde, M., Lyon, E., Spector, E., Voelkerding, K., Rehm, H. L., and ACMG Laboratory Quality Assurance Committee (2015) Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology, Genet Med 17, 405-424.
[17] Joerger, A. C., Ang, H. C., and Fersht, A. R. (2006) Structural basis for understanding oncogenic p53 mutations and designing rescue drugs, Proc Natl Acad Sci U S A 103, 15056-15061.
[18] Bullock, A. N., Henckel, J., and Fersht, A. R. (2000) Quantitative analysis of residual folding and DNA binding in mutant p53 core domain: definition of mutant states for rescue in cancer therapy, Oncogene 19, 1245-1256.
[19] Shakya, R., Reid, L. J., Reczek, C. R., Cole, F., Egli, D., Lin, C. S., deRooij, D. G., Hirsch, S., Ravi, K., Hicks, J. B., Szabolcs, M., Jasin, M., Baer, R., and Ludwig, T. (2011) BRCA1 tumor suppression depends on BRCT phosphoprotein binding, but not its E3 ligase activity, Science 334, 525-528.
[20] Gao, M., Zhou, H., and Skolnick, J. (2015) Insights into Disease-Associated Mutations in the Human Proteome through Protein Structural Analysis, Structure 23, 1362-1369.
[21] Lee, M. S., Green, R., Marsillac, S. M., Coquelle, N., Williams, R. S., Yeung, T., Foo, D., Hau, D. D., Hui, B., Monteiro, A. N., and Glover, J. N. (2010) Comprehensive analysis of missense variations in the BRCT domain of BRCA1 by structural and functional assays, Cancer Res 70, 4880-4890.
