Characterization of Genomic Copy Number Variation in Mus musculus Associated with the Germline of Inbred and Wild Mouse Populations, Normal Development, and Cancer
Western University
Scholarship@Western
Electronic Thesis and Dissertation Repository

4-18-2019 2:00 PM

Characterization of genomic copy number variation in Mus musculus associated with the germline of inbred and wild mouse populations, normal development, and cancer

Maja Milojevic, The University of Western Ontario

Supervisor: Hill, Kathleen A., The University of Western Ontario

Graduate Program in Biology

A thesis submitted in partial fulfillment of the requirements for the degree in Doctor of Philosophy

© Maja Milojevic 2019

Follow this and additional works at: https://ir.lib.uwo.ca/etd
Part of the Genetics and Genomics Commons

Recommended Citation
Milojevic, Maja, "Characterization of genomic copy number variation in Mus musculus associated with the germline of inbred and wild mouse populations, normal development, and cancer" (2019). Electronic Thesis and Dissertation Repository. 6146.
https://ir.lib.uwo.ca/etd/6146

This Dissertation/Thesis is brought to you for free and open access by Scholarship@Western. It has been accepted for inclusion in Electronic Thesis and Dissertation Repository by an authorized administrator of Scholarship@Western. For more information, please contact [email protected].

Abstract

Mus musculus is a human commensal species and an important model of human development and disease with a need for approaches to determine the contribution of copy number variants (CNVs) to genetic variation in laboratory and wild mice, and arising with normal mouse development and disease. Here, the Mouse Diversity Genotyping array (MDGA)-approach to CNV detection is developed to characterize CNV differences between laboratory and wild mice, between multiple normal tissues of the same mouse, and between primary mammary gland tumours and metastatic lung tissue.
A CNV detection pipeline was used in conjunction with evaluated probe sets, targeting 925,378 loci at an inter-probe-set median distance of 319 bp, to identify CNVs in a publicly available dataset that includes representatives of 114 classical laboratory (CL) strain mice, 52 wild-derived (WD) mice, and 19 wild-caught (WC) mice. On average, WC and WD mice (~50 CNVs/mouse) have twice as many CNVs as CL mice. DdPCR confirmed 96% of MDGA-predicted copy number states. CL CNVs impact gene pathways related to immunity and nucleosome-associated functions, whereas olfaction and pheromone detection are impacted in WC mice. WD mice share impacted genic pathways with both cohorts.

In a five-member C57BL/6J inbred mouse family, losses of developmentally-important HOXA genes were detected and confirmed in multiple normal tissues. Further confirmation of postzygotic Hoxa13 losses in unrelated C57BL/6J, CBA/CaJ, and DBA/2J mice points to a widespread phenomenon occurring in mice, involving mutation hotspots and/or programmed losses.

In comparison to normal tissues (25 CNVs/mouse), cancer samples from an MMTV-PyMT mouse breast cancer model with lung metastasis have 1.6- to 3.2-fold more CNVs. CNV size is reduced and CNV recurrence is increased among primary tumours in the absence of the hyaluronan-mediated motility receptor, suggestive of altered mechanisms of CNV formation and selection for specific phenotypes in the tumour microenvironment, respectively.

CNVs were found to arise during normal development, producing different CNV profiles than with tumorigenesis and metastasis. CNV profiles also differ between laboratory and wild mice. This thesis presents improvements to an array-based CNV detection and analysis pipeline which was used to determine the contribution of CNVs to genetic variation in M. musculus.
Keywords

Mus musculus, Mouse Diversity Genotyping Array, copy number variants, single nucleotide polymorphism, somatic mosaicism, de novo genetic variation, cancer, classical laboratory strains, wild-derived strains, wild-caught mice.

Co-Authorship Statement

Chapters 2 and 3 contain material from a manuscript published in BMC Genomics on July 4, 2015, entitled: “Genomic copy number variation in Mus musculus”. This publication was co-authored by M. Elizabeth O. Locke, Susan T. Eitutis, Nisha Patel, Andrea E. Wishart, Mark Daley and Kathleen A. Hill. I generated the filtered probe lists, compiled a list of genes expected to remain consistent in copy number, performed analysis of genic impact of CNVs, and confirmed putative CNVs. These components are included in the thesis while work primarily done by others was excluded. The CNV calls from this publication are presented in Chapter 3 in the broad survey of mouse genotypes and were generated by M.E.O. Locke. M.E.O. Locke also determined CNV concordance with previous studies. M.E.O. Locke, K.A. Hill, and I drafted the manuscript. S. Eitutis provided the initial probe list and filtering criteria. N. Patel performed probe to reference genome alignment. A.E. Wishart provided helpful discussions pertaining to the design of the study. M. Daley provided interpretation of statistical analysis and contributed critical revisions. K.A. Hill conceived of the study and participated in its design and coordination. All authors participated in useful discussion, as well as read, edited, and approved the final manuscript.

Dr. Melissa Holmes provided the naked mole-rat samples described in Chapter 2. Chloe D. Rose performed the DNA extraction from tails of these samples.

For Chapter 4, I performed the tissue harvesting for all members of the mouse family together with Alanna K. Edge, Chloe D. Rose, Zachary Hawley, and Hasan Baassiri. DNA extractions were performed by A.K. Edge (pancreas and bladder), C.D. Rose (tail), Z.
Hawley (kidney and lung) and me (hippocampus).

Cancer sample data files used in Chapter 5 were provided by Dr. Eva Turley and generated by Conny Toelg and David Carter at the London Regional Genomics Centre. M.E.O. Locke generated SNP and CNV genotype calls.

Acknowledgments

There are many people involved in the making of this thesis who I would like to thank for providing guidance, research assistance, and support. First, I would like to express my gratitude to my supervisor, Dr. Kathleen Hill, for being supportive of my research and career goals and for providing me with the opportunity to work autonomously, gain new skills, and present my research to the scientific community. I would also like to thank my advisors Dr. Mark Daley and Dr. Robert Cumming for providing helpful advice and feedback throughout my studies. This research would not have been possible without the help of Beth Locke who provided critical computational assistance and research advice, as well as great friendship. Thank-you to Freda Qi, Alanna K. Edge, Nisha Patel, and other past and present members of the Hill laboratory for their research collaboration and for all the fun times and laughter we shared. I would like to thank Carol Curtis, Diane Gauley, Arzie Chant, Hillary Bain, and Sherri Fenton from the Biology Graduate Office for their availability, kindness, and willingness to provide assistance with administrative tasks over the years.

Thank-you to my parents for their unwavering support and love, and for encouraging me to pursue higher education. Thank-you to my sisters for bringing me so much joy and inspiration, and for patiently waiting for me to finish my studies. Finally, I would like to thank Nicolas Bensoussan for being there for me every day, for providing moral support throughout the thesis writing process, and for inspiring me with his curiosity, determination, and ingenuity.
This research was supported by the Natural Sciences and Engineering Research Council of Canada Discovery Grant awarded to Dr. Kathleen Hill as well as funds awarded through the Western Strategic Support for NSERC Success initiative at Western University. This research was also supported by external and internal funding awarded to me, including the Queen Elizabeth II Graduate Scholarship in Science and Technology, and the Dr. Irene Uchida Fellowship in Life Sciences awarded by Western’s Biology Department. Financial support for conference attendance was provided by the Department of Biology Graduate Travel Award, and Environmental Mutagenesis and Genomics Society Student and New Investigator Travel Awards.

Table of Contents

Abstract
Co-Authorship Statement
Acknowledgments
Table of Contents
List of Tables
List of Figures
List of Appendices
List of Abbreviations
Chapter 1
1 Introduction to CNVs and Thesis Aims
1.1 Copy Number Variants (CNVs)