Iii COMPUTATIONAL APPROACHES for ASSESSING KINOME

COMPUTATIONAL APPROACHES FOR ASSESSING KINOME FUNCTION AND DEREGULATION A Dissertation Presented to the Faculty of the Weill Cornell Graduate School of Medical Sciences in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy by Charles Joseph Murphy August 2018 iii © 2018 Charles Joseph Murphy ALL RIGHTS RESERVED COMPUTATIONAL APPROACHES FOR ASSESSING KINOME FUNCTION AND DEREGULATION Charles Joseph Murphy, PhD Cornell University 2018 Protein kinases are a diverse family of about 500 proteins that all share the common ability to catalyze phosphorylation of the side chains of amino acids in proteins. Kinases play a vital role across diverse biological functions including proliferation, differentiation, cell migration, and cell-cycle control. Moreover, they are frequently altered across most cancers types and have been a focus for development of anticancer drugs, which has led to the development of 38 approved kinase inhibitors as of 2018. In this thesis, I developed two orthogonal computational approaches for investigating kinase function and deregulation. Starting with data from a large cohort of mouse triple negative breast cancer (TNBC) tumors, I use a combination of whole exome sequencing (WES) and RNA-seq to identify somatic alterations that drive individual tumors. I discovered that a large number of these alterations involve protein kinases and subsequent therapeutic targeting led to tumor regression. For my second approach, I utilized a large peptide library dataset from about 300 kinases. Which kinase phosphorylate which phosphorylation site is determined by both kinase- intrinsic and contextual factors. Peptide library approaches provide kinase-intrinsic amino acid specificity, which I used to predict novel kinase substrates and map out kinase phosphorylation networks. In summary, I developed methods using next- generation sequencing and peptide library data to generate novel insights into protein kinase function and deregulation. BIOGRAPHICAL SKETCH Charlie graduated from University of Wisconsin-Milwaukee in 2013 with a Bachelor of Science in Mathematics. Throughout his undergraduate career he participated in various research experiences, but the pivotal one was his acceptance into the two-year fellowship, Undergraduate Research Experiences in Aquatic Biology and Mathematical Modeling, that inspired his interest in computational biology. He enrolled in the Tri-Institutional Training Program in Computational Biology and Medicine at Weill Cornell Medical College in 2013 and joined the labs of Dr. Lewis Cantley and Dr. Olivier Elemento. iii ACKOWLEDGEMENTS I would like to thank my thesis advisors Lewis Cantley and Olivier Elemento for their guidance and patience. I would also like to thank lab members of both the Cantley and Elemento labs for the advice and training they offered on a variety of questions and problems I posed. I especially want to thank Hui Liu and Jared Johnson for their collaboration that enabled me to achieve my dissertation. I would also like to thank the Tri-I CBM program: David Christini for his advice and Margie Hinonangan- Mendoza for the reminders and gentle pushes to meet my deadlines. I am also grateful for my various undergraduate advisors; without the opportunity they provided I would never have pursued a PhD. Gabriella Pinter and Istvan Lauko, who accepted and guided me in their undergraduate fellowship that sparked my interest in computational biology. Peter Tonellato, who let me work in his UWM and Harvard labs my last couple undergraduate years. He not only taught me science but also inspired me to have more confidence and persistence. Finally, I would like to thank my partner, Mike, and my family for their consistent support. iv TABLE OF CONTENTS BIOGRAPHICAL SKETCH .................................................................................. iii ACKOWLEDGEMENTS ....................................................................................... iv TABLE OF CONTENTS ......................................................................................... v LIST OF FIGURES ................................................................................................ vi CHAPTER 1 INTRODUCTION ............................................................................. 1 1.1 Kinase signaling networks underpin normal cell functioning .................................. 2 1.2 Kinases are frequent drivers of cancer ..................................................................... 6 1.3 The mechanisms of somatic alterations for changing kinase activity....................... 8 1.4 In this dissertation ................................................................................................... 10 CHAPTER 2 ONCOGENIC KINASE DRIVERS IN MOUSE TRIPLE NEGATIVE BREAST CANCER .......................................................................... 12 2.1 Modeling human TNBC .......................................................................................... 14 2.2 Characterization of the mouse TNBC transcriptome and genome ........................ 19 2.3 Mouse precision medicine identifies driver kinases ................................................ 35 2.4 Summary .................................................................................................................. 44 CHAPTER 3 KINOME SUBSTRATE SPECIFICITY ........................................ 47 3.1 Modeling kinase substrate specificity ...................................................................... 51 3.2 Phospho-priming ..................................................................................................... 60 3.3 Summary .................................................................................................................. 62 CHAPTER 4 CONCLUDING REMARKS .......................................................... 64 4.1 – Kinase drivers in triple-negative breast cancer ................................................... 64 4.2 – Kinome substrate specificity ................................................................................. 65 4.3 – Future work .......................................................................................................... 66 BIBLIOGRAPHY .................................................................................................. 68 v LIST OF FIGURES Figure 1-1: Family tree of the human kinome.......................................................... 4 Figure 1-2: Determinants of kinase phosphorylation. ............................................. 6 Figure 1-3: Driver kinases identified from cancer genomic studies. ....................... 7 Figure 2-1: Generation of mouse TNBC tumors. ................................................... 14 Figure 2-2: Representative staining for ER, PR, and HER2. ................................ 15 Figure 2-3: Classification of mouse transcriptomes into TNBC. ........................... 17 Figure 2-4: Unsupervised clustering of mouse TNBC with human TNBC. .......... 18 Figure 2-5: Pan-cancer gene fusion rate. ................................................................ 21 Figure 2-6: Gene fusions in mouse primary tumors. ............................................. 22 Figure 2-7: Sample of in-frame kinase fusions. ...................................................... 23 Figure 2-8: Pan-cancer CNA rates. ........................................................................ 24 Figure 2-9: Copy-number alteration landscape of mouse TNBC. ......................... 28 Figure 2-10: CNA rate comparison between Brca1 genotype. ............................... 28 Figure 2-11: Recurrent focal amplifications. ......................................................... 30 Figure 2-12: Single focal amplifications. ................................................................ 31 Figure 2-13: Pan-cancer mutation rate. ................................................................. 32 Figure 2-14: Number and type of mutations in mouse TNBC. .............................. 34 Figure 2-15: Mutations in kinase domains. ............................................................ 35 Figure 2-16: Oncogenic and targetable alterations. ............................................... 38 Figure 2-17: Kinases with two or more alterations. ............................................... 38 Figure 2-18: Sanger sequence validation. ............................................................... 41 Figure 2-19: Western blots. ..................................................................................... 42 Figure 2-20: Tumor treatment. ............................................................................... 43 Figure 3-1: Kinase peptide library. ........................................................................ 50 Figure 3-2: t-SNE by kinase type. ........................................................................... 53 Figure 3-3: t-SNE by kinase family. ....................................................................... 54 Figure 3-4: Kinase domain and specificity correlation. ......................................... 55 Figure 3-5: Putative kinase substrates. ................................................................... 56 Figure 3-6: Correlation with quality metrics. ........................................................ 57 Figure 3-7: Kinase peptide library performance. .................................................. 59 Figure 3-8: Phospho-tyrosine selectivity................................................................. 61 Figure

Iii COMPUTATIONAL APPROACHES for ASSESSING KINOME

Subcellular Localization and Mitotic Interactome Analyses Identify SIRT4 As a Centrosomally Localized and Microtubule Associated Protein

Sirt1 Is Regulated by Mir-135A and Involved in DNA Damage Repair During Mouse Cellular Reprogramming

Cases. Genes Are Rank-Ordered According to Their P-Value

UC San Diego UC San Diego Electronic Theses and Dissertations

Suppl. Table 1

Peripheral Blood Epi-Signature of Claes-Jensen Syndrome Enables Sensitive and Specific Identification of Patients and Healthy Ca

Novel Targets of Apparently Idiopathic Male Infertility

Anti-TUBGCP4 Monoclonal Antibody, Clone HU923 (DCABH-7801) This Product Is for Research Use Only and Is Not Intended for Diagnostic Use

Mouse Tubgcp3 Conditional Knockout Project (CRISPR/Cas9)

Molecular Targeting and Enhancing Anticancer Efficacy of Oncolytic HSV-1 to Midkine Expressing Tumors

Supplemental Solier

Lineage-Specific Effector Signatures of Invariant NKT Cells Are Shared Amongst Δγ T, Innate Lymphoid, and Th Cells