Abstract

The human protein PMS1 functions in DNA mismatch repair. PMS1 is part of the High Mobility Group (HMG) protein family. Using homology modeling in YASARA, a three-dimensional structure of the PMS1 protein was produced, and molecular dynamics was used to verify that the structure is realistic. Evolutionary analysis in the online program ConSurf identified the highly and poorly conserved regions of the PMS1 protein. Sequences with high conservation scores indicate structurally and functionally important parts of the protein. Using the GNOMAD database, human variants of the protein were found, with a focus on those that cause missense, loss-of-function, and frameshift mutations, which can be located in the three-dimensional structure. The tissues in which the protein is found are the likely sites of phenotypic consequences of mutation. Using the Human Protein Atlas, PMS1 was found in all cells. However, it is most common in cells that divide rapidly, such as cells in the digestive tract. Malfunctioning of PMS1 leads to genome instability and more frequent mutations, which can cause genetic defects or cancers. The Catalogue of Somatic Mutations in Cancer (COSMIC) was used to identify PMS1 mutations associated with cancers such as colorectal cancer. It is hoped that this sequence-to-structure-to-function-to-phenotype approach will contribute to the future of genomic medicine.

Acknowledgments

I would like to thank Dr. Hawkins and the entire honors committee for the riveting opportunity to partake in Walsh University’s Honors Program. This faculty has pushed me past my comfort level for the past four years to become the successful student and person I am today. Conducting and presenting original research is an opportunity I am thankful for and would not have had without this program. Being a part of the honors program has prepared me for my future in veterinary medicine, as well as for life outside of the classroom.
Next, I would like to thank Dr. Freeland for being my thesis advisor. He had several advisees but was still dedicated to my success in this research. I am thankful for the many hours he spent discussing, reading, editing, and researching with me. He has made an impact on my education as an advisor, a professor, and a friend. I would also like to thank my fellow honors peers, who have contributed to my success in the program with unending support. Most importantly, I would like to thank my family. My family has been my greatest support in my journey as an honors student.

Table of Contents

List of Figures
List of Graphs
Introduction
    Limitations
    Why is this work relevant?
Literature Review
    Genomic Medicine
    Understanding DNA
    MMR Through Evolution
    Early Discoveries
    Medical Consequences
Research Statement
Research Methods
    NCBI
    YASARA Homology Model
    Molecular Dynamics
    ConSurf
    COSMIC
    Human Protein Atlas
    GNOMAD; Looking for human genome variants
Results and Discussion
    YASARA Molecular Dynamics & ConSurf Dimer Report
    Missing Residues
    BLOSUM Scoring Matrix
    COSMIC Results
    Human Protein Atlas Results
Conclusion
Works Cited

List of Figures

Figure 1. Illustrates where genes are located in the cell and their role in protein expression.
Figure 2. Shows the 5 prime and 3 prime ends on a DNA strand.
Figure 3. Shows where exons and introns are located in the structure of a gene and how the introns interrupt the gene sequence and must be removed.
Figure 4. Shows the structure of the PMS1 gene on chromosome 2.
Figure 5. Shows and explains the levels of protein folding.
Figure 6. Illustrates base pairing within DNA.
Figure 7. Shows the involvement of PMS1 protein in various forms of DNA repair, and how the DNA repair proteins often function as heterodimers.
Figure 8. Shows a representation of the relationships between Eukaryotic, Bacterial, and Archaeal MMR proteins, including their common ancestor.
Figure 9. A picture of the PMS1 structure after homology modeling has been run.
Figure 10. Shows a beta sheet formed by residues 117-136 of the PMS1 homodimer.
Figure 11. Shows an alpha helix formed by amino acid residues 270-290 of the PMS1 homodimer.
Figure 12. The structure of PMS1, for residues 353-932, as a homodimer.
Figure 13. Shows PMS1 residues 353-932 as a monomer.
Figure 14. Shows the HMG box for PMS1. The three helices comprising the HMG box are labeled 1, 2, and 3.
Figure 15. Shows the BLOSUM scoring matrix.
Figure 16. Shows a graph of the percentage of different types of mutations that occur in PMS1.
Figure 17. Shows where PMS1 is located in the cell. PMS1 is an MMR protein, so we hypothesized it would be found in the nucleus, where DNA is replicated.

List of Graphs

Graph 1. Root Mean Square Deviation (RMSD) vs. Time.
Graph 2. Shows the total movement of each amino acid in the sequence over 20 ns.
Graph 3. Shows the RMSD for every amino acid on a smaller scale.
Graph 4. Running average conservation score and RMSD for each residue.
Graph 5. Shows the comparison of BLOSUM scores between residues 353-932 of PMS1.
Graph 6. Shows where PMS1 protein is expressed in parts of the body.

Introduction

There are roughly 22,000 genes in the human genome. There are 3.2 billion nucleotide pairs in one set of chromosomes, which equals 6.4 billion nucleotide pairs in every human cell. Individual humans differ in their DNA sequences at about 1 out of every 1,000 nucleotides. Therefore, each person’s genome contains about 6 million variants from what is considered the “normal” human genome. In the genomic medicine of the future, a physician will look at variants in a patient’s genome with the goal of detecting increased risk of disease or determining which drugs are suitable for that patient, based on the patient’s genomic profile. Detailed knowledge about gene sequences will be necessary for using genomic data in any predictive way. Doctors engaging in genomic medicine will not have to look at 6 million different gene variants in order to make decisions about the best treatments for a patient, because most of these variants are in DNA sequences that will not affect the function of the protein encoded by the gene. The current research analyzes gene sequences to determine protein structures, uses evolutionary comparisons to identify the critical gene sequences, and filters human gene variants to identify which ones are likely to alter protein structure or function. Analysis of one gene at a time is the only way to acquire the detailed knowledge of genes that may affect human health.
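The estimate of about 6 million variants per genome follows from simple arithmetic on the figures quoted above. A minimal sketch of that calculation is given here; the constant and function names are hypothetical and chosen only for illustration, and the inputs are the rounded values from the text, not measured data.

```python
# Rounded figures quoted in the text; all names below are illustrative only.
NUCLEOTIDE_PAIRS_PER_SET = 3_200_000_000  # one set of chromosomes
SETS_PER_CELL = 2                         # two sets of chromosomes per cell
ONE_DIFFERENCE_PER = 1_000                # people differ at about 1 in 1,000 nucleotides


def nucleotide_pairs_per_cell(pairs_per_set: int = NUCLEOTIDE_PAIRS_PER_SET,
                              sets: int = SETS_PER_CELL) -> int:
    """Total nucleotide pairs in a cell that carries `sets` chromosome sets."""
    return pairs_per_set * sets


def expected_variants(total_pairs: int,
                      one_difference_per: int = ONE_DIFFERENCE_PER) -> int:
    """Approximate count of positions that differ between two genomes."""
    # Integer division keeps the result exact for these round numbers.
    return total_pairs // one_difference_per


total_pairs = nucleotide_pairs_per_cell()
variants = expected_variants(total_pairs)

print(f"{total_pairs:,} nucleotide pairs per cell")
print(f"about {variants / 1e6:.1f} million variants per genome")
```

With these inputs the calculation gives 6.4 million positions, which the text rounds to about 6 million.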
In this research we will explore the structure and function of the human Post Meiotic Segregation protein 1 (PMS1) because it is important in the repair of DNA damage. Malfunctioning of PMS1 leads to genome instability and more frequent mutations, which can cause genetic defects or cancers. PMS1 has not been extensively studied, although evolutionarily related proteins and heterodimers (complexes of two different proteins joined together) with similar function have been studied. This research can fill the knowledge gap concerning the three-dimensional structure of PMS1, the evolutionary conservation of amino acids within the protein, and which human variants are likely to have phenotypic consequences. Biologists know that genomic variants may change amino acids in a protein sequence; the goal of this research is to find which amino acid changes are most likely to affect the structure or function of PMS1.

Limitations

This research comes with limitations. The three-dimensional structure we arrive at will be a realistic approximation, but it will be hard to verify that it is the correct structure of the PMS1 protein in the cell. We will use techniques to validate the quality of the structure we produce, but that still falls short of knowing whether we have arrived at the correct structure. We will use a simulation of a realistic cellular environment in which the protein is able to react to its surroundings. However, because it is a simulation and not a living cell in the lab, the results are approximate. When we analyze human variants in the genome, we focus on variants that affect the amino acid sequence, which may alter the function or structure of the protein. This means