Interpretation of Mutations, Expression, Copy Number in Somatic Breast Cancer: Implications for Metastasis and Chemotherapy
Total Page:16
File Type:pdf, Size:1020Kb
Western University Scholarship@Western Electronic Thesis and Dissertation Repository 9-15-2015 12:00 AM Interpretation of Mutations, Expression, Copy Number in Somatic Breast Cancer: Implications for Metastasis and Chemotherapy Stephanie Dorman The University of Western Ontario Supervisor Dr. Peter Rogan The University of Western Ontario Graduate Program in Biochemistry A thesis submitted in partial fulfillment of the equirr ements for the degree in Doctor of Philosophy © Stephanie Dorman 2015 Follow this and additional works at: https://ir.lib.uwo.ca/etd Part of the Diagnosis Commons, Medical Genetics Commons, Neoplasms Commons, Oncology Commons, and the Therapeutics Commons Recommended Citation Dorman, Stephanie, "Interpretation of Mutations, Expression, Copy Number in Somatic Breast Cancer: Implications for Metastasis and Chemotherapy" (2015). Electronic Thesis and Dissertation Repository. 3227. https://ir.lib.uwo.ca/etd/3227 This Dissertation/Thesis is brought to you for free and open access by Scholarship@Western. It has been accepted for inclusion in Electronic Thesis and Dissertation Repository by an authorized administrator of Scholarship@Western. For more information, please contact [email protected]. INTERPRETATION OF MUTATIONS, EXPRESSION AND COPY NUMBER IN SOMATIC BREAST CANCER: IMPLICATIONS FOR METASTASIS AND CHEMOTHERAPY (Thesis format: Integrated Article) by Stephanie N. Dorman Graduate Program in Biochemistry A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy The School of Graduate and Postdoctoral Studies The University of Western Ontario London, Ontario, Canada © Stephanie N. Dorman 2015 Abstract Breast cancer (BC) patient management has been transformed over the last two decades due to the development and application of genome-wide technologies. The vast amounts of data generated by these assays, however, create new challenges for accurate and comprehensive analysis and interpretation. This thesis describes novel methods for fluorescence in-situ hybridization (FISH), array comparative genomic hybridization (aCGH), and next generation DNA- and RNA-sequencing, to improve upon current approaches used for these technologies. An ab initio algorithm was implemented to identify genomic intervals of single copy and highly divergent repetitive sequences that were applied to FISH and aCGH probe design. FISH probes with higher resolution than commercially available reagents were developed and validated on metaphase chromosomes. An aCGH microarray was developed that had improved reproducibility compared to the standard Agilent 44K array, which was achieved by placing oligonucleotide probes distant from conserved repetitive sequences. Splicing mutations are currently underrepresented in genome-wide sequencing analyses, and there are limited methods to validate genome-wide mutation predictions. This thesis describes Veridical, a program developed to statistically validate aberrant splicing caused by a predicted mutation. Splicing mutation analysis was performed on a large subset of BC patients previously analyzed by the Cancer Genome Atlas. This analysis revealed an elevated number of splicing mutations in genes involved in NCAM pathways in basal-like and HER2- enriched lymph node positive tumours. Genome-wide technologies were leveraged further to develop chemosensitivity models that predict BC response to paclitaxel and gemcitabine. A type of machine learning, called support vector machines (SVM), was used to create predictive models from small sets of biologically-relevant genes to drug disposition or resistance. SVM models generated were able to predict sensitivity in two groups of independent patient data. High variability between individuals requires more accurate and higher resolution genomic data. However the data themselves are insufficient; also needed are more insightful analytical methods to fully exploit these data. This dissertation presents both improvements in data quality and accuracy as well as analytical procedures, with the aim of detecting and ii interpreting critical genomic abnormalities that are hallmarks of BC subtypes, metastasis and therapy response. Keywords genomic technology, breast cancer, nucleic acid hybridization, fluorescence in-situ hybridization, microarray, copy number changes, next generation sequencing, splicing mutations, NCAM, mutation validation, chemosensitivity, support vector machines, machine learning, paclitaxel, gemcitabine, formalin fixed paraffin embedded tissue iii Co-Authorship Statement Chapter 1 – Stephanie Dorman wrote the introduction/literature review and created all figures and tables, with the exception of Figure 1.4, which was adopted from previous work (as cited). Peter Rogan and Joan Knoll provided helpful comments and feedback. For Chapters 2, 3, 4, and 5, Stephanie Dorman performed all research and analyses with the exceptions noted below. Stephanie Dorman and Peter Rogan conceived and designed the experiments. Stephanie Dorman wrote all Chapters with helpful comments and revisions from Peter Rogan and the other co-authors, unless stated otherwise. Chapter 2 – Ben Shirley developed and ran the ab initio algorithm, and ran PICKY to design the genome-wide oligonucleotides. Natasha Caminsky designed and validated the NOTCH1 and CCND1 probes under the supervision of Stephanie Dorman, Peter Rogan and Joan Knoll. Chapter 3 – Peter Rogan elaborated the problem and conceived of the analytic solution. Coby Viner implemented the algorithm, wrote, and tested the Veridical software and its accompanying Perl and R scripts. Coby Viner and Stephanie Dorman designed the methods used, generated, and interpreted mutation results. Ben Shirley wrote and tested the Shannon pipeline, which provided the predictions evaluated by Veridical. Coby Viner, Stephanie Dorman, and Peter Rogan wrote the manuscript, which has been approved by all of the authors. Chapter 4 – Peter Rogan and Stephanie Dorman conceived of the study. Peter Rogan directed the project, and Stephanie Dorman and Coby Viner performed all analyses. Specifically, Coby Viner ran Strelka and Veridical, and generated the Circos plots and word clouds. Stephanie Dorman and Peter Rogan wrote and revised the chapter, with input from Coby Viner. Chapter 5 – Peter Rogan, Joan Knoll, and Stephanie Dorman conceived of the study. Stephanie Dorman and Katherina Baranova performed all bioinformatics analysis. Stephanie Dorman completed all nucleic acid extractions and cDNA synthesis on the breast cancer formalin fixed paraffin embedded tissue samples. Stephanie Dorman and Katharina Baranova completed all copy number and gene expression measurements. Stephanie Dorman, iv Katharina Baranova, and Peter Rogan wrote the manuscript, which was approved by all authors. Chapter 6 – Stephanie Dorman wrote the discussion with helpful comments and feedback from Peter Rogan and Joan Knoll. v Acknowledgments I would like to start off by thanking my supervisor, Dr. Peter Rogan, for his guidance and advice over the course of my graduate studies. I am truly grateful for the scientific training I have received and opportunities I was granted as a PhD student in his laboratory. I would also like to acknowledge Dr. Joan Knoll, who has been an invaluable mentor and advisor. I feel very privileged to have worked with two scientists who have shared so much of their vast knowledge, expertise, and stories of human genetics research with me, as well as their love for Italian gelato and pizza. I would like to thank Dr. Rob Hegele, for all of the helpful discussions during committee meetings, as well as a number of other individuals who have helped with the studies presented in this thesis – the Vancouver Prostate Center Microarray Facility for performing the HapMap hybridizations, Dr. Brad Urquhart for his guidance with the growth inhibition studies, and Linda Jackson for her help preparing the FFPE tissue blocks. I would like to thank all of the past and present Rogan/Knoll Lab members. From identifying chromosomes to programming code, I hope I have taught you as much about Excel as you have in all of your respective areas of expertise. I will miss morning coffees with John and Natasha, pasta bar lunches with Li, Kat, and Ben, and tennis with Wahab. I am also grateful for all of the friendships I have made throughout the Department and CaRTT. A huge thanks to Ben Kleinstiver, for all of his support in both work and life, and Tom and Jason for the Grad Club dates and group chat support over the years. Last but not least, I would like to thank my mom and dad, for their unwavering love and support, and for being the best parents anyone could ask for. I would like to thank my sisters, Sarah and Katie, and brother-in-law Tim, for being my biggest cheerleaders and keeping me in check with my mixed up expressions. A special thanks to my best friend and partner, Brad - I can’t wait for our next adventure. To all of my other friends from Western and London, you mean the world to me and my time here would not have been the same without you. vi Table of Contents Abstract ............................................................................................................................... ii Co-Authorship Statement................................................................................................... iv Acknowledgments.............................................................................................................. vi Table of Contents .............................................................................................................