Assessing Functional Impacts of Human Coding Variants
by Fan Yang

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Department of Molecular Genetics
University of Toronto

© Copyright by Fan Yang, 2017

Abstract

Advances in sequencing technology have made it routine to determine all coding variation in an individual human genome. A pressing challenge in the post-genomic era is to functionally characterize these variants, particularly within disease-associated genes. Within cancer genome research, a critical remaining problem is how to separate ‘driver’ from ‘passenger’ mutations and to further understand the functional mechanisms and consequences of driver mutations.

I analyzed the missense somatic tumor mutations from 71 whole-genome or whole-exome sequencing studies across 21 cancer types. I identified cancer-type-specific mutated domains and mutational hotspots. In some cases, I identified shifts in mutation and domain position between cancer types (but within a given gene product). I also provided clues to the functional effects of these mutations, and I identified different domain-centric mutational distribution patterns between oncoproteins and tumor suppressor proteins. The systematic correlation of mutations and cancer types at the domain level has the potential to guide more precise cancer treatments.

I also developed predictive models to quantify the impact of antigenicity on the spectrum of tumor missense somatic mutations. I found that somatic mutations are significantly depleted in peptides that are predicted to be displayed by MHC class I proteins, and characterized the dependence of this depletion on expression level. My results indicate that HLA class I alleles are, in general, incompletely dominant.
I developed a model that produces an ‘antigenicity score’ for any input somatic coding mutation. These antigenicity scores could guide immunotherapy or aid in developing personalized cancer vaccines.

In another collaborative effort to characterize human variation, I developed yeast-based functional assays to assess the functionality of disease-associated coding variants. I evaluated the ability of wild-type human disease-associated genes to rescue homologous yeast mutants. Complementation between homologous human and yeast genes could often be found in the absence of annotated orthology, and these complementation relationships were of similar value to orthologous relationships for detecting human disease-causing variation. Finally, I found that the ability to detect pathogenic variation from complementation assays was not limited to variants that occur within the aligned region of human and yeast homology.

Acknowledgments

I would first like to express my deepest appreciation to my supervisor, Dr. Frederick Roth, who has supported and inspired me throughout my time in graduate school. In addition to teaching me the subject of high-throughput biology, he has also indirectly taught me how to be a generous and kind person by setting such an excellent example. It has been a great privilege and pleasure to work under his supervision. I thank him for his creativity and his insight over the past four and a half years; this time working under his supervision has been a life-changing and irreplaceable experience.

I would like to thank my two supervisory committee members, Drs. Frank Sicheri and Lincoln Stein, for providing a constant source of high-caliber ideas as well as offering a critical eye. They have both contributed a great deal, including helpful guidance and continuous support, which has resulted in the successful completion of my project.
I offer my sincerest gratitude to my collaborators; without them, this work would not have been possible. Their technical expertise and never-ending support have been invaluable to me. As a non-exhaustive list, I would like to thank Guihong Tan, Nidhi Sahni, Song Yi, David E. Hill, Marc Vidal, and Charlie Boone. A special thanks to Drs. Hidewaki Nakagawa and Seiya Imoto, and their respective teams, whose hard work and dedication contributed to the success of my analysis on the immunogenicity of cancer mutations.

To all of the members of the Roth lab: it has been a pleasure working with you. I would like to thank each and every one of you for constantly providing me with excellent advice, for your senses of humor, for pulling me up when I was down, for keeping me excited about research, and for being the best possible teammates I could ask for. I have never felt so much a part of something important as I have here with all of you. Of special note, thank you to Song Sun for being a patient teacher while I ‘learned the ropes’ of human-yeast complementation. Thank you to Evangelia Petsalakis for her continuous support with everything, especially data analysis, as well as her friendship, which has made the hardest times much easier. Thank you to Dae-Kyum Kim and Yingzhou Wu for their patience in answering all of my questions. Thank you to Kristina Ognjanovic for her great effort in proofreading this thesis. A huge thank you to Rong Huang and Shijie Zhou for being my home away from home, making every day fun and filled with laughter. I have spent the most productive hours of my last four and a half years with them, and they have no idea how much I appreciate their presence in my life.

Finally, I would like to thank my friends and family. I have been lucky enough to be surrounded by more beautiful, supportive people than I could possibly list here. I would like to thank them for being my safety net, my biggest fans, and my best friends.
To Yang Zhao and Huan Lian, my sister-friends, thank you for understanding exactly how much this means to me. To Meng Zhang, Liang Chen and Bo Bao, I am thankful for the drinks, games, and late nights, which helped me survive. I thank Qiuyue Qu for being a constant source of hugs; I just hope I’m not too boring once grad school is over. To my parents, I am grateful for the myriad ways that they have helped me on this journey, and for backing me up no matter what. They have kept me focused on the light at the end of the tunnel, and given me perspective, something that I often overlooked in the final years of my work. I am grateful to Jiantao Xie for holding me up during the best and worst of times, being my voice of reason in the middle of the night, and making sure I always take good care of myself. He is everything I could have asked for as a boyfriend, and I appreciate his love, understanding and support more than he knows.

Table of Contents

Acknowledgments
Table of Contents
List of Tables
List of Figures
List of Appendices
List of Abbreviations
Chapter 1 Introduction
  1 Introduction
    1.1 Addressing the genotype-phenotype question
    1.2 Assessing the functional impact of human disease mutations using the Saccharomyces cerevisiae as a model system
      1.2.1 Yeast temperature sensitive strains
      1.2.2 Human-yeast functional complementation assay
    1.3 Cancer
      1.3.1 Complexity and development of cancer
      1.3.2 Primary and metastatic tumor
      1.3.3 Oncogenes and tumor suppressors
    1.4 Introduction to current cancer genomics research
      1.4.1 TCGA, ICGC and PCAWG Projects
    1.5 The function and dysfunction of the immune system in cancer
      1.5.1 Cancer immune-editing
      1.5.2 Antigen presentation process