Reciprocal Translocations in the Domestic Pig, the Prevalence, Genetic and Genomic Factors Associated with Breakpoint Formation

By Brendan Thomas Donaldson

A Thesis presented to The University of Guelph

In partial fulfilment of the requirements for the degree of Doctor of Philosophy in Biomedical Sciences

Guelph, Ontario, Canada © Brendan Thomas Donaldson, August 2020

ABSTRACT

RECIPROCAL CHROMOSOME TRANSLOCATIONS IN THE DOMESTIC PIG, THE PREVALENCE, GENOMIC, AND GENETIC FACTORS ASSOCIATED WITH BREAKPOINT FORMATION

Brendan Thomas Donaldson
University of Guelph, 2020

Advisors:
Dr. W. Allan King
Dr. Jon LaMarre

Chromosome rearrangements such as reciprocal translocations are prevalent in the domestic pig, estimated to occur in 1/200 live births, and are suspected to be the reason behind 50% of cases of hypoprolificacy. Despite there being over 200 chromosome rearrangements found in the pig, little is known about why rearrangements form in the pig genome. In order to better understand chromosome rearrangements, their breakpoints, and the factors influencing their formation, a routine cytogenetic screening program was created to identify carriers in Canadian swine herds. Using data from this project and others, a comprehensive analysis of rearrangement breakpoints was conducted, and a GWAS and a CNV analysis were performed using DNA samples from identified carrier boars. Routine cytogenetic screening of 6491 boars revealed 101 carriers of chromosome rearrangements, a prevalence of 1.56%. Comprehensive analysis of rearrangement breakpoints in pigs revealed a non-random distribution, with hotspots for rearrangement sharing a set of architectural features including a euchromatic composition, as well as higher densities of genes, simple repeats, and tRNA. A GWAS and CNV analysis of carrier boars and their parents revealed five SNP associations, four CNVR, and eleven nearby genes, each of which played roles in genomic stability or DNA repair. The results of this study show the high prevalence of rearrangements in pigs, as well as the effectiveness of screening efforts. In addition, genomic architectural features along with genetic and genomic variants in the pig genome may be proposed to influence chromosome breakage and promote rearrangement.
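As an illustration of the prevalence figure reported above, the following minimal Python sketch recomputes the proportion of carriers from the screening counts (101 carriers among 6491 boars). The Wilson score interval is included only as an illustrative assumption about how the uncertainty of such a proportion might be expressed; it is not a statistic reported in this thesis, and the function names are hypothetical.

```python
import math


def prevalence(carriers: int, screened: int) -> float:
    """Proportion of screened animals identified as carriers."""
    if screened <= 0:
        raise ValueError("screened must be a positive count")
    return carriers / screened


def wilson_interval(carriers: int, screened: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% interval (an assumption
    made for this sketch, not a choice taken from the thesis).
    """
    p = prevalence(carriers, screened)
    denom = 1 + z ** 2 / screened
    centre = (p + z ** 2 / (2 * screened)) / denom
    half_width = (
        z
        * math.sqrt(p * (1 - p) / screened + z ** 2 / (4 * screened ** 2))
        / denom
    )
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Counts taken from the abstract: 101 carriers among 6491 boars.
    p = prevalence(101, 6491)
    low, high = wilson_interval(101, 6491)
    print(f"prevalence: {p:.2%}")
    print(f"approximate 95% interval: {low:.2%} to {high:.2%}")
```

The interval is shown only to make clear that the reported prevalence is an estimate from a sample of boars; any formal inference should follow the statistical methods described in Chapter 2.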

Declaration of Work Performed

I hereby declare that the contents of this thesis and the research included within were performed by me, with the following exceptions as detailed below:

Peripheral blood samples were collected by skilled farm technicians or veterinarians at the various farms. Lymphocyte cell culture, slide preparation, GTG-banding, and imaging were performed with the help of Samira Rezaei, Daniel Villagomez, Tamas Revay, Elizabeth St. John, Nicolas Mary, Anh Quach, Joohwan Kim, Yatin Sidhu, Tim Carter, Stacey Del Castillo Poppe, Lauren Piccoli-Kuschke, Diana Caravajal, Baharae Ahmadi, and Olutobi Oluwole. Karyotyping was performed with the help of Daniel Villagomez, Nicolas Mary, and Samira Rezaei.

DNA extraction for SNP array genotyping was performed by skilled technicians at various farms, as well as Tamas Revay and Elizabeth St. John.


Acknowledgements

I would like to thank my advisors Dr. W. Allan King and Dr. Jon LaMarre for their support throughout this program. I would also like to thank my co-advisors Dr. Thomas Koch and Dr. Andy Robinson for their guidance.

I would like to thank Dr. Daniel Villagomez, and Dr. Tamas Revay, true mentors and friends who pushed me to think and expand the scope of my views.

I would like to thank the members of the RHB lab, including Dr. Monica Antenos, Ed Reyes, Allison MacKay, and Elizabeth St. John.

A special thank you to Samira Rezaei. I could not have asked for a better lab partner, and she made my time here especially enjoyable.

I am grateful for my friends and family.


Table of Contents

Abstract
Declaration of Work Performed
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Symbols, Abbreviations, and Nomenclature
Introduction
Chapter 1: Review of Literature
  The Chromosome
  Chromosomes in the Cell Cycle
  The Chromosome Constitution of Mammalian Species
  Chromosome Abnormalities
  Numerical Chromosome Abnormalities
  Structural Chromosome Abnormalities
  Sub-Microscopic Genomic Variation
  Copy Number Variants
  Single Nucleotide Polymorphisms
  Detection of Chromosome and Genomic Variation
  Classical Cytogenetics
  Veterinary Cytogenetics
  Molecular Cytogenetics
  Cytogenomics
  Genome Wide Association Studies
  Detection of CNV
  Chromosomes of the Domestic Pig
  Chromosome Abnormalities in the Domestic Pig
  Mosaic Chromosome Rearrangements
  Economic and Breeding Perspectives
  Routine Cytogenetic Screening and Management of Chromosome Rearrangements
  The Generation of Chromosome Rearrangements
  Generation of DSBs
  Susceptibility to Chromosome Breakage
  DNA Damage Recognition and the Initiation of DNA Repair
  DNA Repair
  Rationale
Chapter 2: Cytogenetic Screening of Canadian Swine Herds: The Prevalence of Chromosome Abnormalities
(Work from this Chapter has been published in Genes, Scientific Reports, and Elsevier)
  Introduction
  Materials and Methods
  Peripheral Blood Collection
  Lymphocyte Culture and Chromosome Analysis
  Familial and Reproductive Information
  Statistical Analysis
  Results
  Constitutional Reciprocal Translocations in Canadian Swine Herds
  Reproductive Performance of Constitutional Reciprocal Translocation Carriers
  Mosaic Reciprocal Translocations in Canadian Swine Herds
  Other Chromosome Rearrangements or Aberrations in Canadian Swine Herds
  Prevalence of Chromosome Rearrangements in Canadian Swine Herds
  Extent of Mosaicism in Carriers, and Prevalence of Mosaic Cells in Swine Herds
  Prevalence of Chromosome Rearrangements in the Major Breeds in Canada
  Prevalence of Chromosome Rearrangements in Different Herds
  Prevalence of Chromosome Rearrangements over Generations with Interference via Cytogenetic Screening
Chapter 3: Reciprocal Translocation Breakpoints are non-Randomly Distributed in the Pig Genome
(Work from this Chapter has been published in Genes, and Scientific Reports)
  Introduction
  Materials and Methods
  Cytogenetic Screening Analysis of Pig Populations
  Selection and Analysis of Reciprocal Chromosome Translocations Published in the Literature
  Definition of Chromosome Parameters
  Procuring a List of Genes in the Pig Genome and Estimating Cytogenetic Positions
  Definition of Evolutionary Breakpoints in the Pig Genome
  Definition of Repetitive Elements in the Pig Genome
  Definition of Hotspots for Rearrangement and Recurrent Rearrangements in the ...
  Statistical Analysis
  Results
  Analysis of Translocations between Non-Homologous Chromosomes
  The Distribution of Rearrangements between Autosomal Chromosomes
  Analysis of Reciprocal Translocation Breakpoints on Individual Chromosomes
  The Impact of Chromosome Length on Translocation Frequency
  Impact of Gene Density on Chromosome Translocation Frequency
  The Impact of Length on Chromosome Arm Translocation Frequency
  The Role of Gene Density in the Differential Translocation Frequency of Chromosome Arms
  The Influence of Chromosome Arm Morphology on Translocation Frequency
  Analysis of the Translocation Frequency on Cytogenetic Bands
  Influence of the Length of Cytogenetic Bands on Translocation Frequency
  Influence of Relative Chromosome Arm Cytogenetic Band Position on Translocation Frequency
  The Influence of GTG-banding, or Relative Chromatin Density on Translocation Frequency
  The Influence of Gene Density on Translocation Frequency of Cytogenetic Bands
  The Influence of Evolutionary Breakpoint Regions on Translocation Frequency
  The Influence of Common Fragile Sites on Translocation Frequency
  The Influence of G-banding on the Translocation Frequency of Bands with Common Fragile Sites
  General Chromosome Features are Associated with Cytogenetic Band Translocation Frequency
  Proposal of Hotspots for Rearrangement in the Pig Genome
  Analysis of Repetitive Elements in the Pig Genome and the Association with Chromosome and Cytogenetic Band Translocation Frequency
  Influence of Repetitive Elements on Cytogenetic Band Translocation Frequency
  Distribution of Mosaic Translocation Breakpoints in the Pig Genome
  Discussion
Chapter 4: A Genome Wide Association Study of Chromosome Rearrangements in the Domestic Pig
  Introduction
  Materials and Methods
  Animals and Phenotypes
  Genotyping and Quality Controls
  Genome-wide Association Studies
  Candidate Gene Search and Functional Annotation
  Haplotype Block Analysis
  CNV Analysis
  Functional Annotation of CNVRs
  Results
  SNP Data Statistics
  Genome Wide Association Study
  Gene Annotation of Carrier Boars
  Linkage Disequilibrium Analysis
  CNV Analysis
  Annotation of CNVRs
  Discussion
General Discussion
Conclusions, Summary, and Future Directions
  Summary
  Conclusions
  Future Directions
References
Appendix I


List of Tables

Table 1: Count of Chromosome Rearrangements Observed in Canadian Swine Herds
Table 2: List of Constitutional Reciprocal Translocations Observed in Canadian Boars
Table 3: List of Repeated Constitutional Reciprocal Translocations
Table 4: List of Mosaic Reciprocal Translocations in Canadian Swine Herds
Table 5: List of Non-Reciprocal Chromosome Aberrations Observed in Canadian Swine Herds
Table 6: Prevalence of Chromosome Rearrangements in Canadian Swine Herds by Year
Table 7: Distribution of Chromosome Rearrangements by Breed
Table 8: Distribution of Constitutional Reciprocal Translocations by Breed
Table 9: Distribution of Mosaic Reciprocal Translocations by Breed
Table 10: Distribution of Chromosome Rearrangements by Herd
Table 11: Distribution and Prevalence of Chromosome Rearrangements by Generation
Table 12: Rearrangements with Recurring Breakpoints
Table 13: Mosaic Rearrangement Breakpoints Showing Re-Use
Table 14: Distribution of Observations for each Possible Autosome-Autosome Translocation
Table 15: Poisson Distribution of the Number of Observations per Translocation
Table 16: Distribution of Chromosome Breakpoints on Autosomal Chromosomes
Table 17: Gene Density per Chromosome
Table 18: The Distribution of Translocation Breakpoints on Chromosome Arms
Table 19: Comparison of the Translocation Frequency between Short and Long Chromosome Arms
Table 20: The Distribution of Translocation Breakpoints by Chromosome Arm Morphology
Table 21: The Distribution of Breakpoints per Cytogenetic Band
Table 22: The Distribution of Breakpoints by Relative Position on Chromosome Arms
Table 23: The Distribution of Translocation Breakpoints on GTG-bands
Table 24: Translocation Frequency of the G-positive and G-negative Bands of each Chromosome
Table 25: Distribution of Translocation Breakpoints on Defined and Proposed EBR Bands
Table 26: The Distribution of Translocation Breakpoints on Bands with Common Fragile Sites
Table 27: Translocation Frequency for the Normal and Fragile Bands of Each Chromosome
Table 28: The Distribution of Translocation Breakpoints by GTG-banding and the Presence of Common Fragile Sites
Table 29: Multiple Linear Regression Analysis for Translocation Breakpoints and Frequency
Table 30: Proposed Hotspots for Rearrangement Based on Breakpoint Number
Table 31: Proposed Hotspots for Rearrangement Based on Translocation Frequency
Table 32: Multiple Linear Regression of Multiple Classes of LTR Elements
Table 33: Multiple Linear Regression of Multiple Classes of Repetitive Elements
Table 34: Simple Linear Regression Models of Repeat Classes and Translocation Frequency
Table 35: Repetitive Elements Associated with Cytogenetic Band Translocation Frequency
Table 36: Multiple Linear Regression Analysis of Repetitive Elements and Translocation Frequency on Cytogenetic Bands
Table 37: Multiple Linear Regression Model of Cytogenetic and Repetitive Elements and Translocation Frequency
Table 38: Poisson Distribution of Mosaic Breakpoints on Cytogenetic Bands
Table 39: Cytogenetic Bands with Multiple Mosaic Breakpoints
Table 40: Number of Samples for each GWAS
Table 41: SNPs Removed from each Quality Control Step
Table 42: Contingency Table for Association Tests of Alleles between Cases and Controls
Table 43: List of Significant SNPs Associated with Chromosome Rearrangements
Table 44: Significant GO terms of genes within 100kb of significant SNPs from the Boars
Table 45: Significant GO terms of genes within 2Mb of significant SNPs from the Boars
Table 46: Significant GO terms of genes within 2Mb of significant SNPs from the Dams
Table 47: List of SNPs with Nearest Genes
Table 48: Significant GO terms of genes within 50kb of carrier Boar CNVR
Table 49: Significant GO terms of genes within 50kb of Dams of carriers CNVR
Table 50: List of Constitutional Reciprocal Translocations in the Domestic Pig
Table 51: List of Mosaic Reciprocal Translocations and non-Reciprocal rearrangements in the Domestic Pig
Table 52: List of Common Fragile Sites in the Pig Genome
Table 53: List of Inferred and Established Evolutionary Breakpoint Regions in the Pig Genome
Table S1: Estimated Physical Length and Cytogenetic Position of Porcine Cytogenetic Bands
Table S2: Verification of Estimated Cytogenetic Band Length using BAC Clones
Table S3: Comparative Map Demonstrating Synteny Between Human and Porcine Cytogenetic Bands
Table S4: List of Constitutional Reciprocal Translocations in the Domestic Pig Including the Impact on Fertility
Table S5: List of Mosaic Reciprocal Translocations and non-Reciprocal rearrangements in the Domestic Pig Including the Type of Rearrangement


List of Figures

Figure 1: The Length of Chromosomes but not the Gene Density is Related to the Number of Observed Instances of each Translocation.
Figure 2: Physical chromosome length is associated with the number of breakpoints on chromosomes, but not the translocation frequency.
Figure 3: Gene Density is not related to Chromosome Translocation Frequency.
Figure 4: Physical length of chromosome arms is associated with breakpoint number but not the translocation frequency of chromosome arms.
Figure 5: Gene Density of Chromosome Arms has no Relationship with Translocation Frequency.
Figure 6: Ideogram of the domestic pig karyotype with important cytogenetic markers displayed.
Figure 7: Physical cytogenetic band length is associated with the number of translocation breakpoints.
Figure 8: Translocation Frequency of cytogenetic bands is associated but not correlated with gene density.
Figure 9: Classes of repetitive elements are positively or negatively associated with translocation frequency.
Figure 10: Manhattan plots for the GWAS analyses for the Boars (a), Dams (b), and Sires (c).
Figure 11: Linkage disequilibrium plot of a significantly associated region centering on two significant SNPs.
Figure 12: A. GTG-banded karyotype of a Yorkshire boar carrying a t(3;13)(q21;q21).
Figure 13: A. GTG-banded karyotype of a Duroc boar carrying a t(1;7)(q21;p11).
Figure 14: A. GTG-banded karyotype of a Duroc boar carrying a t(5;12)(q11;q12).
Figure 15: A. GTG-banded karyotype of a Yorkshire boar carrying a t(1;3)(p23;q25).
Figure 16: A. GTG-banded karyotype of a Duroc boar carrying a t(4;12)(p11;p15).
Figure 17: A. GTG-banded karyotype of a Landrace boar carrying a t(4;9)(p13;p24).
Figure 18: A. GTG-banded karyotype of a Landrace boar carrying a t(14;15)(q13;q15).
Figure 19: A. GTG-banded karyotype of a Yorkshire boar carrying a t(Y;13)(p13;q33).
Figure 20: A. GTG-banded karyotype of a Duroc boar carrying a t(9;13)(q24;q31).
Figure 21: A. GTG-banded karyotype of a Duroc boar carrying a t(9;13)(q24;q31).
Figure 22: A. GTG-banded karyotype of a Yorkshire boar carrying a t(4;6)(q11;q27).
Figure 23: A. GTG-banded karyotype of a Yorkshire boar carrying a t(2;15)(q13;q24).
Figure 24: A. GTG-banded karyotype of a Yorkshire boar carrying a t(1;14)(q21;q14).
Figure 25: A. GTG-banded karyotype of a Landrace boar carrying a t(3;6)(q13;p13).
Figure 26: A. GTG-banded karyotype of a Yorkshire boar carrying a t(12;14)(q13;q21).
Figure 27: A. GTG-banded karyotype of a Yorkshire boar carrying a t(6;7)(q33;q22).
Figure 28: A. GTG-banded karyotype of a Yorkshire boar carrying a t(13;18)(q21;q13).
Figure 29: A. GTG-banded karyotype of a Yorkshire boar carrying a t(10;13)(p13;q31).
Figure 30: A. GTG-banded karyotype of a Landrace boar carrying a t(12;14)(q15;q23).
Figure 31: A. GTG-banded karyotype of a Yorkshire boar carrying a t(10;13)(q13;q21).
Figure 32: A. GTG-banded karyotype of a Duroc boar carrying a t(2;10)(p17;q13).
Figure 33: A. GTG-banded karyotype of a Yorkshire boar carrying a t(15;18)(q24;q24).
Figure 34: A. GTG-banded karyotype of a Yorkshire boar carrying a t(6;15)(q33;q13).
Figure 35: A. GTG-banded karyotype of a Yorkshire boar carrying a t(2;17)(p17;q13).
Figure 36: A. GTG-banded karyotype of a Yorkshire boar carrying a t(4;15)(q21;q11).
Figure 37: A. GTG-banded karyotype of a Yorkshire boar carrying a t(Y;1)(q11;q17).
Figure 38: A. GTG-banded karyotype of a Yorkshire boar carrying a t(9;14)(p13;q11).
Figure 39: A. GTG-banded karyotype of a Landrace boar carrying a t(1;14)(q2.11;q25).
Figure 40: A. GTG-banded karyotype of a Yorkshire boar carrying a t(5;18)(q21;q11).
Figure 41: A. GTG-banded karyotype of a Yorkshire boar carrying a t(5;13)(q21;q43).
Figure 42: A. GTG-banded karyotype of a French Yorkshire boar carrying a t(7;9)(q15;p24).
Figure 43: A. GTG-banded karyotype of a Yorkshire boar carrying a t(13;14)(q31;q29).
Figure 44: A. GTG-banded karyotype of a Duroc boar carrying a t(1;7)(q21;p11).
Figure 45: A. GTG-banded karyotype of a Duroc boar carrying a t(5;12)(q11;q12).
Figure 46: A. GTG-banded karyotype of a Duroc boar carrying a t(12;14)(q13;q15).
Figure 47: A. GTG-banded karyotype of a Duroc boar carrying a t(13;18)(q21;q13).
Figure 48: A. GTG-banded karyotype of a Duroc boar carrying a t(10;13)(p13;q31).
Figure 49: A. GTG-banded karyotype of a Pietrain boar carrying two mos t(7;9)(q24;q24) in a mosaic state.
Figure 50: A. GTG-banded karyotype of a Duroc boar carrying a t(7;18)(q22;q11) in a mosaic state.
Figure 51: A. GTG-banded karyotype of a Landrace boar carrying a t(8;16)(q21;q21) in a mosaic state.
Figure 52: A. GTG-banded karyotype of a French Yorkshire boar carrying two mos t(1;2)(p23;q23) in a mosaic state.
Figure 53: A. GTG-banded karyotype of a Duroc boar carrying two mos t(1;1)(q2.11;q21) in a mosaic state.
Figure 54: A. GTG-banded karyotype of a boar carrying a del(Y).
Figure 55: A. GTG-banded karyotype of a Duroc boar carrying an inv(9)(p11;p22).
Figure 56: Karyotypes of a boar exhibiting two distinct cell lines.
Figure S1: The Standard GTG Karyotype of the Domestic Pig


List of Symbols, Abbreviations, and Nomenclature

A: Adenine
A.I: Artificial Insemination
ABL1: Abelson murine leukemia viral oncogene homolog 1
ACRBP: Acrosin binding protein
ACTR5: Actin-related protein 5
ADAD1: Adenosine deaminase domain-containing protein 1
ADAMTS2: A disintegrin and metalloproteinase with thrombospondin motifs 2
Alu: Arthrobacter luteus restriction endonuclease
ANOVA: Analysis of Variance
ARID4A: AT-rich interactive domain-containing protein 4A
ATM: Ataxia telangiectasia serine/threonine kinase
ATR: Ataxia telangiectasia and Rad3-related protein
ATRIP: ATR-interacting protein
BAC: Bacterial Artificial Chromosome
BCR: Breakpoint cluster region protein
B-DNA: Refers to the right-handed double helix DNA, its normal conformation
C: Cytosine
CCSI: Canadian Centre for Swine Improvement
CELF3: CUGBP Elav-like family member 3
CNV: Copy Number Variant
CNVR: Copy Number Variant Region
COSMIC: Catalogue of Somatic Mutations in Cancer
CREB3L4: Cyclic AMP-responsive element-binding protein 3-like protein 4
CRY1: Cryptochrome-1
DAVID: Database for Annotation, Visualization and Integrated Discovery
DDR: Epithelial discoidin domain-containing receptor 1
del: Deletion
DNA: Deoxyribonucleic Acid
DPAGT1: UDP-N-acetylglucosamine-dolichyl-phosphate N-acetylglucosaminephosphotransferase
DSB: Double Strand Break
DSN1: Kinetochore-associated protein DSN1 homolog
E.U: European Union
EBR: Evolutionary Breakpoint Region
EBV: Estimated Breeding Value
ENO2: Enolase 2
ERV1: Endogenous retrovirus-1
ERVK: Endogenous Retrovirus Group K
ERVL: Endogenous Retrovirus Group L
FA: Fanconi Anemia
FANCM: Fanconi anemia, complementation group M
FBS: Foetal Bovine Serum
FISH: Fluorescent in Situ Hybridization
FoSTeS: Fork Stalling and Template Switching
G: Guanine
GAMT: Guanidinoacetate methyltransferase
GAPDH: Glyceraldehyde 3-phosphate dehydrogenase
G-negative: Giemsa Negative
G-positive: Giemsa Positive
GTG: G Banding with Trypsin and Giemsa
GWAS: Genome Wide Association Study
H2AX: Histone 2A Variant X
hAT: hAT transposons
HORMAD1: HORMA domain-containing protein 1
HR: Homologous Recombination
HWE: Hardy-Weinberg Equilibrium
I.R: Ionizing Radiation
I-DNA: Four-stranded quadruplex structures of DNA
IGF1: Insulin Like Growth Factor 1
INO80: Chromatin-remodeling ATPase INO80
inv: Inversion
ISCN: International System for Chromosome Nomenclature
ISH: Stress response protein ish1
Kb: Kilobase
LCR: Low Copy Repeat
LINE: Long Interspersed Nuclear Elements
LTR: Long Terminal Repeat
Mb: Megabase
MMBIR: Microhomology Mediated Break Induced Replication
MMEJ: Microhomology Mediated End Joining
MMR: DNA mismatch repair
MMS22L: Protein MMS22-like
mos: Mosaic
MRE11: Double-strand break repair protein MRE11
MSH2: DNA mismatch repair protein Msh2
MSH6: DNA mismatch repair protein Msh6
MutSα: Putative DNA mismatch repair protein
NAHR: Non-Allelic Homologous Recombination
NBA: Number Born Alive
NCBI: National Center for Biotechnology Information
NGS: Next Generation Sequencing
NHEJ: Non-Homologous End Joining
NSB1: Nibrin 1
OAZ3: Ornithine decarboxylase antizyme 3
PAFAH1B2: Platelet-activating factor acetylhydrolase IB subunit beta
p-arm: Short Chromosome Arm
PATRR: Palindromic AT rich repeat
PCR: Polymerase Chain Reaction
PCSK4: Proprotein convertase subtilisin/kexin type 4
PDE5A: cGMP-specific 3',5'-cyclic phosphodiesterase
PHA: Phytohemagglutinin
PPP1R16B: Protein phosphatase 1 regulatory inhibitor subunit 16B
PRE: Porcine Repetitive Element
PRPF19: Pre-mRNA-processing factor 19
PVRL2: Poliovirus receptor-related 2
q-arm: Long Chromosome Arm
QM: Quinacrine Mustard
RAD50: DNA repair protein RAD50
RAD51: DNA repair protein RAD51
RAG: Recombination-activating gene
RARA: Retinoic acid receptor alpha
RBA: Reverse Banding
RBL1: Retinoblastoma-like 1
rcp: Reciprocal
Rob: Robertsonian
ROS: Reactive Oxygen Species
RPA: Replication protein A
RPA1: Replication protein A 70 kDa DNA-binding subunit
RPA2: Replication protein A 32 kDa subunit
RPMI: Roswell Park Memorial Institute
SETMAR: Histone-lysine N-methyltransferase SETMAR
SINE: Short Interspersed Nuclear Elements
SLC37A4: Glucose-6-phosphate exchanger SLC37A4
SMC5: Structural maintenance of chromosomes protein 5
SMC6: Structural maintenance of chromosomes protein 6
SNP: Single Nucleotide Polymorphism
SOCS5: Suppressor of cytokine signaling 5
SRC: Proto-oncogene tyrosine-protein kinase Src
SSB: Single Strand Breaks
Sscrofa: Sus scrofa domestica
STK11: Serine/threonine-protein kinase STK11
SYCP3: Synaptonemal complex protein 3
T: Thymine
t: Translocation
TCR: T-cell Receptor
TDRKH: Tudor and KH domain-containing protein
THEG: Testicular haploid expressed gene protein
TNB: Total Number Born
TONSL: Tonsoku-like protein
TOP1: Topoisomerase 1
TPI1: Triosephosphate isomerase
TRA: T-cell receptor alpha locus
TRB: T-cell receptor beta locus
TRD: T-cell receptor delta locus
TRG: T-cell receptor gamma locus
tRNA: Transfer Ribonucleic Acid
U.S.A: United States of America
USDA: United States Department of Agriculture
V(D)J: Variable, Diversity, Joining recombination


Introduction

The Canadian swine industry is one of the highest value sectors of Canadian agriculture, reporting cash receipts in excess of 4 billion dollars annually (Statistics Canada, 2018). Swine producers must work to meet both Canadian and global market demands, with breeders and producers increasingly exporting pork products into the global marketplace, helping to meet the increasing demand for pork products worldwide. As the world population continues to increase, and the middle class continues to grow, demand for livestock products is expected to trend upward in tandem with economic and population growth (Thornton, 2010). In order to meet these demands, swine breeders in Canada have placed considerable emphasis on the recording of phenotypic data and have put in place genetic selection systems in order to select those pigs with the most desirable traits for breeding (Robinson and Buhr, 2005). These selection systems consider not only physical market characteristics such as back fat content, but also reproductive traits such as litter size, and in doing so have led to increases in the quality of pigs and in litter size in Canadian swine herds (CCSI, 2019).

Despite the implementation of phenotypic and genetic selection, many breeders fail to account for the presence of chromosome rearrangements in swine herds. Chromosome rearrangements are structural chromosome abnormalities resulting from the simultaneous breakage of one or more chromosomes and their subsequent mis-repair, which creates derivative chromosomes formed from rearranged chromosome material. Despite the large amount of genetic material being rearranged in the genome, carriers of chromosome rearrangements nearly always appear normal from an observer's perspective. This is because very little genetic material is lost during the rearrangement process, so carriers have no significant genetic imbalance. Carriers of chromosome rearrangements do, however, experience predictable reproductive loss dependent on the morphology of the rearrangement (Quach et al., 2016). The most severe reproductive losses occur in carriers of reciprocal translocations, with carriers experiencing an average of 40% loss in litter size relative to other members of their herd due to the production of unbalanced gametes which result in spontaneous abortion (King et al., 1981; Pinton et al., 2000). Chromosome rearrangements including reciprocal translocations are known to appear in various mammalian species such as humans and bovids; however, reciprocal translocations are especially prevalent in the domestic pig.

Reciprocal translocations are expected to occur frequently in swine herds throughout the world, being proposed to occur spontaneously in 1/200 live births (Ducos et al., 2007; King et al., 2019). Carriers of rearrangements, if permitted to breed, may then pass on the rearrangement to approximately 50% of their successful offspring, increasing the prevalence of chromosome rearrangements in swine herds over time (Ducos et al., 1998a). To combat the presence of chromosome rearrangements in swine herds, several labs operate cytogenetic screening programs that screen prospective breeding boars for chromosome rearrangements (Ducos et al., 2008).

Although screening programs cannot eliminate rearrangements from herds, they can prevent carriers from breeding, which maintains litter size, eliminates the possibility of inheritance, and reduces the prevalence of rearrangements. In countries where such programs are available many breeders will voluntarily submit their breeding boars for cytogenetic screening, seeing clear economic benefits to managing the presence of rearrangements. Most large swine producing countries, however, fail to implement cytogenetic screening of their swine herds. Thus, there is room for cytogenetic screening, or some other method of identifying carriers or potential carriers, to be widely implemented in the swine industry and to greatly reduce the impact of chromosome rearrangements on swine herds.


To date over 200 distinct chromosome rearrangements have been identified in the domestic pig, and hundreds have been identified in humans; however, little is known about the factors underlying their formation. Analysis of rearrangement breakpoints in the human genome has revealed hotspots for rearrangement, where breaks appear more likely to occur, but no such analysis has yet been conducted in the pig genome. In addition, several chromosome features are proposed to be associated with the presence of rearrangement breakpoints, such as genomic regions that are more transcriptionally active, and regions with particular repetitive elements such as palindromic sequences or low copy repeats (Ou et al., 2011; Bacolla et al., 2015). However, little is known about whether rearrangements preferentially locate to these regions, and if so, why this preference for breakage and rearrangement exists.

The introduction of cytogenomics techniques and their increasing affordability has made more specific investigations of the genomes of rearrangement carriers possible. These investigations in the human genome have largely focused on rearrangements as a contributor to disease, studying breakpoint junctions for the interruption of genes. Others study rearrangement from a mechanistic perspective, attempting to elucidate the mechanism of repair. These studies, however, do not attempt to delineate the underlying causes of rearrangement from a genetic perspective. The high prevalence of chromosome rearrangements in the domestic pig suggests that some animals may be more susceptible to acquiring rearrangements than others. Animals that are more prone to DNA breaks, or less able to initiate efficient DNA repair, could be among those more susceptible.

The large number of pigs in Canada as well as the large number of rearrangements reported in the literature provides a wealth of opportunity to observe new rearrangements and understand the genomic landscape of rearrangements in the pig genome. In addition, the accessibility of DNA samples along with cytogenomics technology allows for the first true investigations of the genomes of rearrangement carriers and their family members in order to determine associations between the genomic landscape of pigs and rearrangements. The studies presented in this Doctoral thesis were designed to investigate the porcine genome in order to determine whether genomic structure and genomic variants are associated with the presence of chromosome rearrangements in the pig genome. It is anticipated that this information may be used to develop more efficient selection systems that identify animals most susceptible to acquiring rearrangements and remove them from breeding eligibility without need for cytogenetic screening.


Chapter 1: Review of Literature

The Chromosome

The genomes of mammals are immensely complex, with the building blocks of each organism, covalently linked stretches of deoxyribonucleic acid (DNA), twisted helically and stretching for nearly 2 metres if fully unraveled. In order to accommodate this large amount of genetic material the DNA is split up into several demarcated divisions known as chromosomes. Stabilization and condensation of DNA is achieved with the help of a highly conserved octamer of proteins known as histones, around which the DNA wraps itself, forming nucleosome cores (Van Holde, 2012). This DNA wound around the histone octamers is known as a chromatin fibre and continues to condense by winding and coiling around other chromatin fibres. At their most condensed these centimetres-long strands of DNA are successively coiled into a chromatid about 700 nm wide, which allows the entire genetic complement of each organism to fit into a single cell (Manuelidis and Chen, 1990).

During the synthesis phase of the cell cycle these chromatids are replicated, producing two identical sister chromatids which align, shorten and condense as the cell moves through prophase towards metaphase. It is here that the two aligned sister chromatids are at their most condensed, forming a 1400nm wide X-shaped structure producing the familiar image of a chromosome. Due to the highly condensed structure of chromosomes at this point, the metaphase stage of cell division is where chromosomes are most often imaged, allowing the structure of chromosomes to be observed under light microscopy. Here chromosomes may be subjected to a number of techniques that allow them to be differentiated from one another, and subsequently allow any abnormalities, numerical or structural, to be identified.


Chromosomes in the Cell Cycle

One of the hallmarks of chromosomes is their ability to duplicate and divide, producing genetically identical daughter cells, and extending their lineage. During the interphase stage chromosomes exist as single chromatids which are then faithfully duplicated during the synthesis stage, producing an identical sister chromatid. As the mitotic phase of the cell cycle is initiated, these sister chromatids shorten, thicken, and align with one another, producing the condensed 1400nm chromosome structure. Chromosomes must then align along the metaphase plate, an imaginary demarcation along the middle of the cell, where microtubule fibres will attach to the centromere of chromosomes, pulling and separating the sister chromatids towards opposite ends of the cell. With the genetic material evenly split, the cell will begin to furrow about the middle, eventually pinching off the cell into two identical daughter cells.

The cell cycle is essential to life, with cells constantly undergoing this process over the lifespan of an organism, ensuring a continual supply of new cells to replace aging ones. Faithful production of daughter cells is essential in each cell division, but is arguably most important in meiotic cell divisions, where the resulting daughter cells develop into gametes that may fuse with the gamete of the opposite sex, creating a zygote which will subsequently develop into an embryo. Errors in chromosome composition are most damaging here as the genome is largely dose sensitive, with gametes carrying an incorrect chromosome number, or structural damage to chromosomes, often resulting in early embryonic death or spontaneous abortion (Hassold, 1980).

The Chromosome Constitution of Mammalian Species

Each organism has a unique chromosome constitution, referring to the number and structure of chromosomes. Even amongst closely related species the number and structure of chromosomes may differ, providing an indication of the importance of the organization of genetic material in differentiating life. Within a species, however, this chromosome composition should be consistent across all individuals. For instance, the human genome is expected to contain a diploid chromosome number of 2n = 46, with each individual having two copies (one maternally derived, and the other paternal) of each of 22 distinct autosomes, and two sex chromosomes denoting genetic sex, XX in females, and XY in males (Tjio and Levan, 1956; Ford and Hamerton, 1956).

The domestic pig has 38 chromosomes (2n = 38), with 18 pairs of autosomes and the two sex chromosomes (Krallinger, 1931; Bryden, 1933). Deviations in the expected number or structure of these chromosomes change the chromosome constitution of the organism and, depending on the severity of the change, may have detrimental consequences.

Chromosome Abnormalities

Despite most individuals within a species conforming to a standard chromosome constitution, a small number of individuals may deviate from this expectation. Any number of chromosome abnormalities may be present in individuals which either change the number or structure of chromosomes. Aneuploidy indicates a deviation from the expected chromosome number and is most commonly observed as either monosomy (one copy) or trisomy (three copies) of an individual chromosome. In some cases, gametes carrying an abnormal chromosome number may go on to produce live offspring which will carry a constitutional change in chromosome number, either a chromosomal monosomy or a trisomy. Individuals diagnosed with an aneuploidy are denoted as such; for example, a human female carrier of a trisomy of chromosome 18 would have a 47,XX,+18 chromosome constitution.

Structural abnormalities of chromosomes may be present as well, the result of the generation of double strand breaks (DSBs) during meiosis, and their subsequent mis-repair, or a change in the copy number of chromosome segments, resulting in a rearrangement or duplication/deletion respectively. As with aneuploidies, gametes that contain rearranged chromosome segments may go on to produce live offspring with an altered chromosome constitution, with each cell having an altered chromosome arrangement. For example, a male carrier of a terminal deletion of the long arm of chromosome 18 would have a 46,XY,-del(18)(qter) chromosome constitution.

In addition to constitutional changes, which affect all cell lines, individuals may exist with two or more genetically distinct cell lines. The occurrence of non-disjunction events or chromosome rearrangement events in somatic cell lines may produce a new subset of cells, a condition known as mosaicism.

Chromosomal mosaicism is typically less damaging to the individual overall, as just a subset of cells are affected, with the mosaic cell lines typically producing an effect in proportion to their prevalence and the amount of chromosome material affected (Taylor et al., 2014).

Numerical Chromosome Abnormalities

Numerical chromosome abnormalities are one of the main sources of genomic instability seen in mammalian genomes, with chromosomal aneuploidies typically being incompatible with life or associated with severe pathologies in carriers (Hassold and Hunt, 2001). Upon the publication of the correct human chromosome number of 2n = 46 by Tjio and Levan (1956), and its confirmation by Ford and Hamerton (1956), cytogeneticists began to associate monosomy and trisomy of whole chromosomes with known diseases. For instance trisomy of chromosome 21 was determined to be the cause of Down’s syndrome (Lejeune et al., 1959), while two other autosomal trisomies of chromosomes 13 and 18 were found to be the cause of Patau’s Syndrome (Patau et al., 1960), and Edwards’ syndrome respectively (Edwards et al., 1960). The presence of an extra chromosome in each case was associated with severe physical and neurological pathologies. Prior to this, it had been assumed that any deviation from the expected chromosome number was incompatible with life.

Numerical abnormalities of the sex chromosomes were also linked to diseases, such as an additional X chromosome in biological males resulting in Klinefelter’s syndrome (Jacobs and Strong, 1959), and a missing X chromosome in biological females resulting in Turner’s syndrome (Ford et al., 1959). Although trisomies of chromosomes 13, 18, 21, X, and Y are compatible with life, albeit with associated pathologies, monosomy X is the only whole-chromosome monosomy compatible with life, as just one X chromosome is required to be active in each cell.

Other chromosomal aneuploidies, such as trisomy of chromosome 16, occur quite frequently but result in embryonic death (see Jacobs et al., 1995 and Wolstenholme, 1995 for review).

Therefore, numerical chromosome abnormalities of most chromosomes are confined to a mosaic state. As such there is a wide phenotypic range dependent on the prevalence and distribution of mosaic cells, with individuals with low level mosaicism likely facing no observable or minor effects, while individuals with a higher grade of mosaicism may see more severe associated pathologies (Taylor et al., 2014).

Numerical chromosome abnormalities are primarily the result of non-disjunction events during meiotic or mitotic cell divisions. In this case chromosomes fail to segregate properly during the anaphase stage of meiosis I or mitosis, resulting in both chromosome homologues being placed together into gametes, or sister chromatids fail to separate during meiosis II (see Sankaranayanan, 1979 for review). This results in the creation of two daughter cells, one of which will carry a trisomy, and the other a monosomy. Generally speaking, the greatest risk factor for numerical chromosome abnormalities is maternal age, with several studies finding concrete links between advanced maternal age and increased risk of aneuploidy (Hassold et al., 1984; Risch et al., 1986).

Structural Chromosome Abnormalities

In addition to changes in the number of whole chromosomes, alterations in the structure of chromosomes may occur. Structural chromosome rearrangements are alterations in chromosome structure that are large enough to be viewed under a microscope. Structural chromosome rearrangements may be divided into four main groups: deletions, insertions/duplications, inversions, and translocations (Griffiths et al., 2005). These rearrangements may then be further sub-divided into genetically balanced (inversions and translocations), and unbalanced (deletions and duplications) rearrangements. Balanced rearrangements involve structural changes with minimal loss of genetic material, and thus primarily re-orientate genetic material within the cell, while unbalanced rearrangements result in a change in copy number (gain or loss) of a chromosome segment. Carriers of balanced rearrangements typically see no observable signs of the rearrangement, as they have the correct amount of genetic material, while carriers of unbalanced rearrangements are more likely to see observable physical or neurological manifestations due to the gain or loss of genetic material which may impact dose sensitive genes.

Deletions, duplications, and insertions of chromosome material are relatively simple chromosome events. As the name implies, deletions involve chromosome breakage, where a section of the chromosome is lost, either interstitially or terminally. Insertions or duplications of chromosome material occur in the opposite fashion, as regions of chromosomes may be duplicated on the chromosome itself or inserted into another chromosome. Duplications are genetically unbalanced and change the copy number of the affected chromosome segments, producing phenotypic effects largely proportional to their size. Insertions on the other hand may be balanced but greatly alter the structure of chromosomes, which in some cases may result in phenotypic effects.

Chromosome inversions are genetically balanced rearrangements that occur as the result of the generation of two DSBs on the same chromosome, producing a broken chromosome segment. The resulting chromosome segment is subsequently inverted and re-inserted back on the chromosome with minimal genetic loss, but in the wrong orientation. Inversions that are confined to a single chromosome arm are paracentric, while inversions that cross the centromere, and involve both the long and short arms, are referred to as pericentric. Carriers of inversions typically do not experience any visible phenotypic signs of the rearrangement.

Chromosome translocations in mammalian genomes may take on several forms including: Robertsonian translocation (centric fusion), tandem fusion, and reciprocal translocation.

Robertsonian translocations can only occur amongst acrocentric or telocentric chromosomes, with very short arms, as they involve the fusion of two chromosomes at the centromere. Here the chromosomes break about the centromere, and rather than repairing correctly, the long arms of the chromosomes fuse together producing a derivative chromosome. A second derivative chromosome is formed by the joining of the short arms; however, due to the short length of this new derivative chromosome, it is typically lost in the first few cell divisions. The genetic material of these acrocentric short arms is typically made up of repetitive non-coding DNA, and therefore can be lost without producing significant phenotypic consequences. Just a handful of chromosomes in the human and pig genomes with sufficiently short chromosome arms may undergo such a rearrangement. Robertsonian translocations, unlike deletions and insertions of chromosome material, also result in the apparent loss of a chromosome due to the fusion of two chromosomes, and loss of the second derivative chromosome consisting of the short arms. Tandem fusions are similar to Robertsonian translocations; however, they are much rarer. These rearrangements involve the fusion of the centromere of one chromosome to the telomere of another. As with Robertsonian translocations, carriers of tandem fusions demonstrate the apparent loss of a chromosome, but with no significant loss of genetic material.

Reciprocal translocations are perhaps the most common structural chromosome rearrangement amongst humans and pigs, occurring at a rate of approximately 1/500 and 1/200 live births respectively (Jacobs et al., 1992; Ducos et al., 2007). Reciprocal translocations occur as the result of the production of simultaneous double strand breaks (DSBs) on two non-homologous chromosomes. These broken chromosome segments are subsequently mis-repaired, resulting in the production of two derivative chromosomes, each of which carries two non-homologous chromosome segments, with a minimal loss of genetic material. As with carriers of inversions and other translocations, carriers of reciprocal translocations most often appear phenotypically normal.

Carriers of translocations however often experience loss of fertility, with carriers of reciprocal translocations experiencing greater fertility reduction than carriers of inversions or Robertsonian translocations (Quach et al., 2016). This loss of fertility is caused by the production of unbalanced gametes, often resulting in early embryonic loss (King et al., 1981). Depending on the structure of the rearrangement some human carriers of translocations may produce offspring that carry the rearrangement in an unbalanced form. In these cases, the derivative chromosomes are separated into different gametes, resulting in the production of gametes and eventually embryos with a partial monosomy and a partial trisomy. Carriers of these unbalanced translocations, as they carry two partial aneuploidies, often suffer from a range of disorders dependent on the extent of the aneuploidy, and the chromosome regions involved (AlMajhad et al., 2017; Caballero et al., 2017; Wu et al., 2017).
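The way a reciprocal translocation carrier produces balanced and unbalanced gametes can be illustrated with a minimal sketch. It assumes two hypothetical chromosomes, A and B, each split at its breakpoint into two labelled segments, and enumerates only the six possible 2:2 segregations of the two normal and two derivative chromosomes. The segment labels and function names are illustrative, and the sketch does not model the relative frequency of each segregation pattern or 3:1 segregations.

```python
from itertools import combinations

# Hypothetical segment labels: chromosome A = a1 + a2, chromosome B = b1 + b2.
# The reciprocal translocation exchanges the distal segments a2 and b2.
CHROMOSOMES = {
    "A": ("a1", "a2"),
    "B": ("b1", "b2"),
    "der(A)": ("a1", "b2"),
    "der(B)": ("b1", "a2"),
}
SEGMENTS = ("a1", "a2", "b1", "b2")


def segment_counts(gamete):
    """Count how many copies of each segment a gamete carries."""
    counts = {segment: 0 for segment in SEGMENTS}
    for chromosome in gamete:
        for segment in CHROMOSOMES[chromosome]:
            counts[segment] += 1
    return counts


def is_balanced(gamete):
    """A haploid gamete is balanced when every segment is present exactly once."""
    return all(count == 1 for count in segment_counts(gamete).values())


def two_two_segregations():
    """Map every pair of chromosomes a gamete could receive to its balance status."""
    return {gamete: is_balanced(gamete) for gamete in combinations(CHROMOSOMES, 2)}


if __name__ == "__main__":
    for gamete, balanced in two_two_segregations().items():
        label = "balanced" if balanced else "unbalanced"
        print(gamete, label, segment_counts(gamete))
```

Under these assumptions only the gamete receiving both normal chromosomes and the gamete receiving both derivative chromosomes are balanced; every other combination carries one segment twice and lacks another, which corresponds to the partial trisomy and partial monosomy described above.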


Sub-Microscopic Genomic Variation

Copy Number Variants

Sub-microscopic genomic variation (< 5Mb) is of insufficient size to be visible under a microscope, and specialized genomics tools such as DNA arrays must be utilized in order to detect its presence. Copy number variants (CNV) comprise one of the larger classes of sub-microscopic variants in mammalian genomes. CNV are segments of DNA at least 1kb in length that are variable in copy number relative to a reference genome (Feuk et al., 2006). Variants smaller than 1kb are referred to as indels. Copy number variants include deletions and duplications of genetic material, with the size of these variants ranging from 1kb to millions of base pairs (Lupski, 2015). As CNV consist of concrete changes in the copy number of chromosome segments, and may include genes, they account for much of the genomic variation observed between any two individuals. Since their discovery, CNV have been increasingly associated with a number of disease traits and individual genomic variation (Conrad et al., 2010; Stankiewicz et al., 2010).

Single Nucleotide Polymorphisms

Single nucleotide polymorphisms (SNPs) are single nucleotide variations that occur frequently in the population, with at least 1% of the population being carriers (Sachidanandam et al., 2001). SNPs are prevalent in mammalian genomes, occurring at an estimated 1 in 300 nucleotides in humans, with at least 10 million SNPs predicted to reside in the human genome (Shaw, 2011). SNPs were once thought to confer much of the variation between individual genomes; however, since the discovery of CNVs, the idea of SNPs as a major source of genomic variation has diminished. Many SNPs are present in non-coding DNA, and thus confer no noticeable effect on the individual. SNPs are however used in genome wide association studies in order to identify SNPs that are more often carried by individuals of interest relative to a control population. Although such SNPs may not be causative of the trait of interest, they may be linked to, or associate closely with, variants within a gene of interest, and thus individuals carrying such a SNP may be at heightened risk of carrying a causative variant as well.

Detection of Chromosome and Genomic Variation

Classical Cytogenetics

In order to view chromosomes under a microscope a series of steps must be taken to culture and prepare cells allowing for chromosomes to be effectively viewed and abnormalities to be delineated. Classical cytogenetics techniques encompass the culture of peripheral blood lymphocytes, and preparation of chromosomes for viewing under light microscopy. Following the development of modern cell culture techniques providing well-spread metaphase chromosomes of consistent quality on glass slides (Moorhead et al., 1960; Arakaki and Sparkes, 1963), researchers have attempted to establish methods to identify and differentiate chromosomes from one another.

One of the first methods introduced was the application of fluorescent quinacrine mustard (QM) to chromosomes, producing a reproducible fluorescent pattern on chromosomes as a function of the amount of guanine residues present, and the ability of QM to bind to the chromosome (Caspersson et al., 1970). This method was followed by the introduction of chromosome banding using a protease, trypsin, to partially digest chromosomes, and a dye, Giemsa, to stain the chromosomes, producing a reproducible banding pattern on each chromosome known as G-banding with trypsin and Giemsa (GTG) (Seabright, 1971; Sumner et al., 1972; Wang and Fedoroff, 1972). GTG-banding stains heterochromatic regions of chromosomes, which are typically more AT-rich, more condensed and less transcriptionally active, more intensely than the more GC-rich, less condensed, more transcriptionally active euchromatic regions, as the trypsin can less easily penetrate and digest heterochromatic regions (Bickmore, 2001; Iannuzzi and DiBernardino, 2008).

Other methods were introduced soon after, such as R-banding using the acridine orange fluorochrome, also known as RBA-banding, which performs the inverse of GTG-banding, staining euchromatic regions more intensely (Dutrillaux and Lejeune, 1971; Dutrillaux, 1973).

Other banding methods were also developed and used to reveal specific chromosome features, such as constitutive heterochromatin blocks (C-bands), nucleolar organizing regions (Ag-NOR-bands), and telomeric regions (T-bands) (Sumner, 1972; Bloom and Goodpasture, 1976; Dutrillaux, 1973).

Veterinary Cytogenetics

Following the advances of cytogenetics techniques in humans, and their application to the diagnosis of chromosome abnormalities, these methods were adapted for use in analysing domestic animal genomes. The discovery of animal carriers of chromosome rearrangements in Sweden spurred interest in the expansion of veterinary cytogenetics, and the adaptation of established techniques which are still largely used today (Hageltorn and Gustavsson, 1973). The standard application of the GTG and RBA banding techniques to porcine chromosomes, in tandem with the guidelines provided by the Reading conference (Ford et al., 1980), resulted in the establishment of a standard karyotype of the domestic pig (Gustavsson, 1988).

RBA-banding and especially GTG-banding are typically the most used banding techniques in porcine cytogenetics and produce approximately 300 bands across all pig chromosomes. Standard GTG-banding and RBA-banding produce a resolution of approximately 5-10Mb, with terminal chromosome abnormalities under 5Mb in size typically not being visible (Bickmore et al., 2001; Vorsanova et al., 2010). Higher resolution banding is achievable by synchronizing chromosomes in pro-metaphase, producing GTG or RBA banded chromosomes with 600 bands, and allowing viewing of chromosomes at a resolution of 2-5 Mb (Rønne 1990; Yerle et al., 1991; Bickmore et al., 2001). The implementation of classical techniques has been instrumental in identifying over 200 distinct structural rearrangements in the pig genome, including reciprocal translocations, Robertsonian translocations, tandem fusions, inversions, and deletions of chromosomes (King et al., 2019; Tables 50, 51, S5).

Molecular Cytogenetics

Despite the many chromosome rearrangements identified in pigs using classical cytogenetics techniques, the limited resolution of banding makes it ineffective at observing precise rearrangement breakpoints, as well as structural variants under 2-5 Mb. Molecular cytogenetics techniques such as fluorescent in situ hybridization (FISH) are capable of viewing the genome at a resolution of up to 0.5Mb and reveal chromosome elements not visible using banding techniques (Danielak-Czech et al., 2016).

FISH has been applied to the study of chromosome rearrangements in order to properly delineate rearrangement breakpoints. The first instance of FISH being used in this way to study porcine rearrangements was by Konfortova et al. (1995), who utilized single-coloured painting probes in order to visualize the reciprocal exchange of a t(7;15)(q24;p12). Chromosome painting probes are obtained by flow sorting chromosomes, followed by fluorescently labelling the probes, then hybridizing them to chromosomes, emitting a fluorescent signal that can be viewed under a microscope (Telenius et al., 1992; Langford et al., 1993; Yerle et al., 1993). Painting probes specific for chromosome arms or bands may also be produced via microdissection with a needle or a laser, and are particularly useful for the study of specific breakpoint locations of intra-chromosomal rearrangements and inversions, for which painting of the whole chromosome would not reveal the inverted chromosome region (Pinton et al., 2003; Kubickova et al., 2002).

Chromosome painting probes have been successfully implemented in porcine cytogenetics and may be used in concert with classical cytogenetics techniques. As banding provides a limited resolution of approximately 5Mb, classical cytogenetics may be insufficient to properly identify or delineate the breakpoints of some chromosome rearrangements, such as those near the terminal ends of chromosomes and those occurring between bands or chromosome regions with similar banding patterns and intensities (O’Connor et al., 2017).

Rearrangement breakpoints identified via G-banding have been revisited with chromosome painting probes, helping to properly resolve breakpoints that were initially incorrectly identified (Pinton et al., 1998). Recent advances in molecular cytogenetics have employed sub-telomeric bacterial artificial chromosome (BAC) probes, emitting different colours for each chromosome, allowing chromosome rearrangements between terminal regions of chromosomes to be visualized that would otherwise be missed via banding techniques (O’Connor et al., 2017).

Cytogenomics

In order to obtain the highest resolution views of the genome, cytogenomics techniques must be applied. Cytogenomics refers to the use of DNA microarrays and next generation sequencing (NGS) to visualize the genome at a high resolution. DNA microarrays are a tool used to analyse genomes, which consist of a series of DNA probes attached to a solid surface (chip). Single stranded DNA may be hybridized to the DNA probes, producing a fluorescent signal which can be read and interpreted, providing an indication of the relative amount of genetic material present corresponding to each probe (Bumgarner, 2013). DNA microarrays such as SNP arrays are useful to cytogeneticists for the diagnosis of unbalanced chromosome rearrangements. As the probes used produce a signal once bound which indicates the intensity of binding, the signal can be interpreted to determine probes where stronger or weaker signals are produced relative to the expected intensity, which can correspond to gains or losses in copy number respectively. Regions of chromosomes with consistently higher or lower signal intensities can thus be used to determine the presence of duplications or deletions of chromosome material, as well as the presence of partial trisomies or monosomies in patients, such as those of unbalanced translocation carriers (Treff et al., 2011). SNP arrays are blind to balanced translocations however, as they have little to no loss of genetic material.

Next generation sequencing methods provide a higher resolution look at the genome than DNA microarrays by fragmenting DNA into short DNA segments several bases to hundreds of bases long. Adapters are ligated to these DNA segments, which are then amplified via polymerase chain reaction (PCR), producing several copies of each DNA segment. The DNA segments are then exposed to fluorescently labelled nucleotides and DNA polymerase, which incorporates one base at a time, with an image taken after each incorporation and interpreted by a computer. This process is repeated several times allowing for each segment to be sequenced several times over and aligned, producing an accurate sequence of the genomic region of interest. NGS technologies may also be useful to cytogeneticists as they allow for high resolution viewing of the genome. NGS has been applied by cytogeneticists to delineate chromosome rearrangement breakpoints and the breakpoint signatures coinciding with the repair mechanism, and to detect small copy number variants and indels (less than 1kb in length) not visible via SNP array, as well as novel single nucleotide variants within genes that may be associated with disease (Nilsson et al., 2017).


Genome Wide Association Studies

Although SNPs are now less thought of as conferring substantial genomic variation between individuals, SNP arrays may still be useful in evaluating genetic differences between individuals. SNPs are sites where a significant portion of the population (at least 1%) harbours one of two single nucleotide variants. Although the SNPs themselves are typically uninformative, as they often exist in non-coding regions, they can be useful to identify associations with disease.

Genome wide association studies (GWAS) take advantage of the hundreds of thousands of identified SNPs in mammalian genomes to identify allelic variants (SNPs) that are associated with a disease or trait of interest (Visscher et al., 2017). Groups of case individuals with a disease or trait of interest can be genotyped alongside a group of control individuals that serves as an unaffected reference population. SNPs with statistically significant differences in allele frequency between case and control groups may then be analysed in order to determine if there is a direct causative link between the SNP and the disease, such as a gene variant, or if the SNP is in linkage disequilibrium with another variant with a direct causative link (LaFramboise, 2009). SNPs in linkage disequilibrium occur together more often than expected by chance, typically because there is little historical recombination between them. The presence of one SNP in these cases is associated with the presence of another, thus they confer the same information, and may be useful for genetic association.
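The case-control comparison described above can be sketched for a single SNP as an allelic chi-square test on a 2x2 table of allele counts. This is a minimal illustration rather than the analysis used in this thesis: the function names and genotype counts are hypothetical, the test has one degree of freedom, and a real GWAS would additionally correct for multiple testing and population structure.

```python
import math


def allele_counts(n_aa, n_ab, n_bb):
    """Convert genotype counts into (A allele count, B allele count)."""
    return 2 * n_aa + n_ab, 2 * n_bb + n_ab


def allelic_chi_square(case_genotypes, control_genotypes):
    """Chi-square test (1 d.f.) comparing allele counts in cases and controls.

    Each argument is a hypothetical (n_AA, n_AB, n_BB) tuple of genotype counts.
    Returns (chi-square statistic, p-value).
    """
    a_case, b_case = allele_counts(*case_genotypes)
    a_ctrl, b_ctrl = allele_counts(*control_genotypes)
    table = [[a_case, b_case], [a_ctrl, b_ctrl]]
    row_totals = [a_case + b_case, a_ctrl + b_ctrl]
    col_totals = [a_case + a_ctrl, b_case + b_ctrl]
    total = sum(row_totals)

    # A monomorphic SNP or an empty group carries no information.
    if 0 in row_totals or 0 in col_totals:
        return 0.0, 1.0

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical genotype counts (AA, AB, BB) for one SNP.
    chi2, p = allelic_chi_square((30, 15, 5), (10, 20, 20))
    print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```

A SNP whose allele frequencies are identical in both groups gives a statistic of zero, whereas a large statistic flags the SNP for follow-up, either as a candidate causative variant or as a marker in linkage disequilibrium with one.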

The underlying technical principle of SNP arrays is that nucleotide residues (i.e., A, T, C, and G) will bind to their complementary partners. Standard commercial SNP arrays thus act by hybridizing single stranded DNA to arrays containing hundreds of thousands of nucleotide probe sequences (LaFramboise, 2009). Each probe is labelled with a nucleotide, and a fluorescent protein, which will then bind its complementary sequence, producing a fluorescent signal. These signals will then be read by specialized equipment, which interprets the signal, producing a measure of signal intensity for each probe. The signal intensity of the probes is dependent both on the amount of target DNA and the affinity for the target (LaFramboise, 2009). Subsequent processing of the signal data will then transform the raw signal intensity and make inferences on the absence or presence of an allele. The signal intensity data is then normalized internally, producing an allele specific copy measurement at each SNP. Specifically, for each SNP the processing defines a transformed ratio of signal intensity, a theta (B allele) signal intensity, and normalized allele intensities. From this it will define three genotype clusters as AA, AB, and BB. The theta intensity is used to signify the presence of a B allele, with the most intense signals indicating a BB genotype, a signal roughly half as intense indicating an AB genotype, and a low signal intensity indicating an AA genotype. High quality SNPs will be those with high call rates, and well separated clusters of genotypes.
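The genotype clustering described above can be illustrated with a simplified sketch. It assumes a commonly used convention in which theta is computed from the normalized A and B allele intensities (here called x and y) as (2/pi)·arctan(y/x), and total intensity as x + y. The fixed thresholds and function names are hypothetical; commercial genotyping software instead uses cluster positions estimated separately for each SNP.

```python
import math


def theta_and_r(x, y):
    """Return (theta, r) from normalized A (x) and B (y) allele intensities.

    Assumed convention: theta = (2/pi) * arctan(y/x), ranging from 0 (only A
    signal) to 1 (only B signal); r = x + y is the total intensity.
    """
    theta = (2 / math.pi) * math.atan2(y, x)
    return theta, x + y


def call_genotype(theta, r, low=0.25, high=0.75, min_r=0.2):
    """Assign AA, AB or BB from theta; thresholds are illustrative only."""
    if r < min_r:
        return None  # signal too weak to call
    if theta < low:
        return "AA"
    if theta > high:
        return "BB"
    return "AB"


def call_rate(calls):
    """Fraction of samples with a genotype call at one SNP."""
    return sum(call is not None for call in calls) / len(calls)


if __name__ == "__main__":
    # Hypothetical normalized intensities (x, y) for four samples at one SNP.
    samples = [(1.0, 0.05), (0.5, 0.5), (0.04, 1.1), (0.05, 0.05)]
    calls = [call_genotype(*theta_and_r(x, y)) for x, y in samples]
    print(calls, call_rate(calls))
```

In this sketch a SNP with a low call rate, or with theta values that fall between the clusters for many samples, would be flagged as low quality.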

Detection of CNV

As described above SNP arrays produce a measure of signal intensity for each SNP. These measures of signal intensity are used to infer the copy number of a given allele. SNP arrays assume that there are two copies of an allele at a given locus; however, by using both the B allele frequency and normalized signal intensity, regions of the chromosomes with different copy number states, such as deletions and duplications, can be inferred. Copy number calling algorithms such as PennCNV (Wang et al., 2007) take advantage of this signal intensity data produced by SNP arrays, in order to infer the copy number of SNPs for each individual beyond the simple AA, AB, and BB genotypes inferred by SNP arrays. PennCNV takes in the B allele frequency and normalized signal intensity data for each SNP and each individual, along with the population B allele frequency, and applies a Hidden Markov Model, which treats copy number as a ‘hidden state’, in order to infer the copy number for each SNP. In its simplest form, CNV may be identified by observing runs of successive SNPs that have B allele frequencies that deviate from the expected 0, 0.5, and 1 (denoting AA, AB, and BB genotypes), as well as signal intensities above or below that which would be expected. Although limited to large CNVs in the genome (typically crossing at least 3 SNPs), these CNV algorithms provide a low-cost option to narrow down genomic regions of interest for more accurate CNV analysis via PCR or NGS of the selected genomic region.
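The simplest form of CNV detection described above, finding runs of successive SNPs with unexpected signal intensity, can be sketched as follows. This is not the Hidden Markov Model used by PennCNV: it is a simplified run-based illustration that looks only at the normalized signal intensity (expressed here as a log R ratio, where 0 is the expected two-copy signal) and ignores B allele frequency. The thresholds and function name are hypothetical.

```python
def find_cnv_runs(lrr, min_snps=3, loss_threshold=-0.3, gain_threshold=0.3):
    """Find runs of consecutive SNPs whose log R ratio suggests a loss or gain.

    lrr is a list of log R ratio values ordered by position on one chromosome.
    Returns a list of (first_index, last_index, "loss" | "gain") tuples for
    runs spanning at least min_snps SNPs. Thresholds are illustrative only.
    """

    def state(value):
        if value <= loss_threshold:
            return "loss"
        if value >= gain_threshold:
            return "gain"
        return "normal"

    runs = []
    start = 0
    for i in range(1, len(lrr) + 1):
        # Close the current run at the end of the data or when the state changes.
        if i == len(lrr) or state(lrr[i]) != state(lrr[start]):
            run_state = state(lrr[start])
            if run_state != "normal" and i - start >= min_snps:
                runs.append((start, i - 1, run_state))
            start = i
    return runs


if __name__ == "__main__":
    # Hypothetical log R ratios for 13 consecutive SNPs.
    values = [0.0, 0.1, -0.5, -0.6, -0.4, 0.0, 0.5, 0.4, 0.0, 0.6, 0.5, 0.7, 0.4]
    print(find_cnv_runs(values))
```

In this example the two-SNP gain at indices 6 to 7 is discarded because it is shorter than the minimum run length, mirroring the limitation to CNVs that cross at least three SNPs.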

Chromosomes of the Domestic Pig

The chromosomes of the domestic pig (Sus scrofa domestica) are among the best studied amongst domestic animals. The diploid chromosome number of the pig is 38, consisting of 18 pairs of autosomes and the two sex chromosomes X and Y. Porcine chromosomes are notably well differentiated from each other, with twelve bi-armed chromosomes, and six one-armed chromosomes, which present with a variety of lengths. Porcine chromosomes are typically divided into four distinct groups: Sub-Metacentric, Acrocentric, Metacentric, and Telocentric and are organized into a karyotype as such (Gustavsson, 1988). The morphological variety of porcine chromosomes results in the pig karyotype more closely resembling the human karyotype than the karyotypes of bovids, which are predominantly composed of poorly differentiated acrocentric chromosomes. This in particular allows the chromosomes to be easily distinguished from one another, making the identification of chromosome abnormalities in the pig considerably easier than in other domestic species.

Chromosome Abnormalities in the Domestic Pig

Chromosome abnormalities, particularly rearrangements, are quite prevalent in the pig. To date over 200 distinct chromosome rearrangements have been identified in the pig, with rearrangements occurring at an estimated rate of 1/200 live births (Ducos et al., 2007; Table S5).


The first chromosome rearrangement identified in the domestic pig was observed to be a reciprocal translocation in a hypoprolific boar producing litters 56% smaller than the herd average (Henricson and Backstrom, 1964). The hypoprolificacy was apparently the result of the rearrangement, as semen and physical parameters appeared normal. The observation of fertility issues in a pig with no concrete observable phenotype associated with a chromosome rearrangement helped to spur interest in the field of veterinary cytogenetics. Soon after, other chromosome abnormalities were identified in the domestic pig, including numerical abnormalities in embryos (McFeely et al., 1967), and the presence of a t(1;11)(q-;q+) reciprocal translocation in a mosaic form in a stillborn piglet exhibiting physical malformations (Hansen-Melander and Melander, 1970).

Although both numerical and structural chromosome abnormalities have been observed in pigs, structural rearrangements are far more prevalent. Numerical chromosome abnormalities in the pig are relatively rare and are largely confined to the sex chromosomes. Aneuploidy has been observed in embryos (McFeely et al., 1967) and in live births, with cases of 37,X X-monosomy (Lojda, 1975) and 39,XXY Klinefelter Syndrome being reported (Breeuwsma, 1968; Hancock and Daker, 1981; Makinen et al., 1998). Mosaic aneuploidies of the sex chromosomes have also been observed in a handful of cases (Breeuwsma, 1970; Hancock and Daker, 1981; Quilter et al., 2003). Chimerism, the presence of two distinct sets of DNA in blood leukocytes, has also been described in the form of XX/XY individuals (Bruere et al., 1968; Somlev et al., 1970; Toyama, 1974; Christensen and Nielsen, 1980; Clarkson et al., 1995; Padula, 2005; Ducos et al., 2007; Rezaei et al., 2020).

In contrast to sex chromosome aneuploidy, the pig genome appears completely intolerant to aneuploidy of whole autosomes, with no animals yet reported carrying such an aberration constitutionally. To date just a handful of pigs have been observed carrying any autosomal aneuploidy, most of which are in a mosaic state, where only a subset of cells is affected (Vogt et al., 1974; Bosch et al., 1985). Just a single pig carrying a whole-body partial aneuploidy has been identified, the result of inheritance of an unbalanced translocation (Villagomez et al., 1995b). This pig was notable in that the littermates that also inherited the unbalanced rearrangement died soon after birth, while this pig showed no physical malformations aside from acrosomal defects in the sperm (Villagomez et al., 1995b).

Structural chromosome rearrangements are much more prevalent in the pig than numerical aberrations; however, not all types of rearrangement occur at the same rate. Deletions and duplications of chromosomes in the pig are quite rare, likely reflecting the apparent intolerance of the pig genome towards genetic imbalances. These genetic imbalances change the dosage of genes, producing either too much or too little gene product, and appear most often to result in embryonic death. Just two cases are known: one a deletion of a chromosome arm in an embryo (McFeely, 1966), and the second the aforementioned partial aneuploidy in a piglet, the result of inheritance of an unbalanced rearrangement (Villagomez et al., 1995b).

Inversions of chromosomes are also generally quite rare. Just twelve cases have been reported, primarily from the same laboratory in France (Ducos et al., 2007); however, inversions have been found amongst boars in other countries as well (Danielak-Czech et al., 1996; Quach et al., 2016; Sanchez-Sanchez et al., 2019). Carriers of inversions are usually phenotypically normal and lack the low litter sizes affecting other rearrangement carriers, and thus are typically only identified through routine cytogenetic screening rather than case-based submissions (Quach et al., 2016; Raudsepp and Chowdhary, 2011).

The presence of Robertsonian translocations in pigs is interesting, as they are primarily associated with wild boars rather than the domestic pig. The karyotype of the wild boar is rarely 38,XX or 38,XY, but instead more often has 36 or 37 chromosomes (Tikhonov et al., 1975). This is due to the ubiquitous presence of Robertsonian translocations in these animals, typically between chromosomes 13 and 17, resulting in most wild boars carrying a rob(13;17) rearrangement in a heterozygous or homozygous state (Rejduch et al., 2013). Robertsonian translocations have, however, been observed in the domestic pig as well, though more rarely. Seven cases of Robertsonian translocation have been described in the domestic pig, primarily the rob(13;17) rearrangement mentioned previously (Miyake et al., 1977; Alonso and Cantu, 1982; Schwerin et al., 1986; Ducos et al., 2007; Quach et al., 2016). Other Robertsonian rearrangements have been observed as well, including rob(14;15), rob(14;17), and rob(16;17) rearrangements (Ducos et al., 2007; Astakhova et al., 1991). Carriers of Robertsonian translocations are known to produce smaller litters than normal boars; however, the effect seems quite variable from carrier to carrier, and is less severe than that of reciprocal translocations (Schwerin et al., 1986; Quach et al., 2016). This is thanks in part to male and female carriers of these rearrangements demonstrating different proportions of unbalanced gametes, with the males (3.2%) having a lower prevalence than the female carriers (28.9%) (Pinton et al., 2009). Overall, carriers of Robertsonian translocations are expected to have less impacted litter sizes than carriers of reciprocal translocations (Quach et al., 2016).

Though inversions and Robertsonian translocations are known to occur in pigs, reciprocal chromosome translocations are by far the most prevalent chromosome abnormality in the domestic pig. To date 170 distinct reciprocal translocation carriers have been reported in over seventeen countries (Tables 50, S4). The first case was reported in 1964, in a hypoprolific Swedish Landrace boar, which was identified to be a carrier of a t(11;15)(p15;q13) reciprocal translocation through subsequent cytogenetic analyses (Henricson and Backstrom, 1964; Akesson and Henricson, 1972; Gustavsson et al., 1973). By the end of the next decade five more carriers would be identified in labs throughout Europe, which demonstrated distinctly lower litter sizes compared to the respective herd averages (Bouters et al., 1974; Hageltorn et al., 1976; Locniskar et al., 1976; Madan et al., 1978; Popescu and Legault, 1979). These observations increased awareness of the presence of chromosome rearrangements in swine herds, and the negative impact they have on fertility.

Carriers of reciprocal translocations, as with other balanced structural chromosome rearrangements, appear phenotypically normal and have normal semen parameters. Carriers however experience a predictable loss of fertility, averaging around 40%, which reveals the presence of the rearrangement (Table S4; Pinton et al., 2000). This is due to the generation of both balanced and unbalanced gametes in carriers of chromosome rearrangements (King et al., 1981). Unbalanced gametes arise because the homologous chromosome segments of the derivative chromosomes need to align during meiosis. The derivative chromosomes arrange themselves into a quadrivalent formation along with their normal homologous counterparts, allowing homologous chromosome segments to properly align (Scriven et al., 1998). From this quadrivalent formation chromosomes segregate unpredictably, either by alternate segregation, which produces balanced gametes, or by adjacent-1 or adjacent-2 segregation, which produce unbalanced gametes (Scriven et al., 1998). Balanced gametes contain either both derivative chromosomes or both normal chromosomes, thus there is no net gain or loss of genetic material. Balanced gametes, if fertilized, may then go on to produce live offspring either with a normal chromosome constitution, or offspring that will inherit the rearrangement and carry the balanced rearrangement constitutionally. Unbalanced gametes in contrast contain just one of the derivative chromosomes and one of the normal chromosomes. This creates a partial aneuploidy whereby there is a partial monosomy of one chromosome and a partial trisomy of the other. Aneuploidies in the pig genome, even minor ones, appear almost entirely incompatible with life, thus fertilized unbalanced gametes will undergo predictable early embryonic death (King et al., 1981).
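The 2:2 segregation outcomes described above can be enumerated in a short sketch. The chromosome labels (A, B, derA, derB) and the two-segment representation of each chromosome are simplifying assumptions made for illustration. The sketch ignores 3:1 segregation and crossing over within the quadrivalent, and it does not imply that the six chromosome pairs arise at equal frequencies.

```python
from collections import Counter
from itertools import combinations

# Each chromosome is modelled as (centric segment, translocated distal segment).
# derA keeps the centromere of A but carries the distal segment of B, and vice versa.
CHROMS = {
    "A": ("A_cen", "A_dist"),
    "B": ("B_cen", "B_dist"),
    "derA": ("A_cen", "B_dist"),
    "derB": ("B_cen", "A_dist"),
}
SEGMENTS = ("A_cen", "A_dist", "B_cen", "B_dist")

def segment_counts(gamete):
    """Copy number of every segment in a gamete (a pair of chromosome names)."""
    counts = Counter()
    for chrom in gamete:
        counts.update(CHROMS[chrom])
    return {seg: counts[seg] for seg in SEGMENTS}

def is_balanced(gamete):
    """A haploid gamete is balanced when it holds exactly one copy of each segment."""
    return all(n == 1 for n in segment_counts(gamete).values())

def classify(gamete):
    """Name the 2:2 segregation mode that yields this gamete."""
    centromeres = [CHROMS[chrom][0] for chrom in gamete]
    if centromeres[0] == centromeres[1]:
        return "adjacent-2"  # homologous centromeres travel together
    return "alternate" if is_balanced(gamete) else "adjacent-1"

def two_two_gametes():
    """Map each possible two-chromosome gamete to (segregation mode, balanced?)."""
    return {g: (classify(g), is_balanced(g)) for g in combinations(sorted(CHROMS), 2)}

for gamete, (mode, balanced) in two_two_gametes().items():
    print(gamete, mode, "balanced" if balanced else "unbalanced", segment_counts(gamete))
```

Only the two alternate products, (A, B) and (derA, derB), are balanced; each adjacent product shows the partial monosomy and partial trisomy pattern as a segment count of 0 alongside a segment count of 2.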

Carriers of chromosome rearrangements are thus expected to undergo predictable reproductive loss due to the generation of unbalanced gametes. Although the proportion of unbalanced gametes may vary between different carriers, a high proportion of unbalanced gametes is still expected in each case (Ogilvie and Scriven, 2002). In addition to the loss of fertility, approximately 50% of the liveborn offspring will inherit the chromosome rearrangement in its balanced form (Ducos et al., 1998b). If uncontrolled, this allows the rearrangement to spread through the herd, extending the impact of a single rearrangement over several generations.

Mosaic Chromosome Rearrangements

In addition to those chromosome rearrangements originating in the germline, producing carriers of constitutional chromosome rearrangements, some individuals may acquire chromosome rearrangements in somatic cells over their lifetime. These chromosome rearrangements, rather than being present in every cell, are instead restricted to a subset of cells and may be confined to single tissues, with carriers being referred to as mosaics due to the presence of multiple cell lines. These mosaic rearrangements are structurally similar to constitutional rearrangements; however, they are associated with different consequences. Constitutional rearrangements are subject to intense selection pressure, ensuring that only those rearrangements that do not significantly interfere with cellular functions can survive. Thus, only those rearrangements that do not significantly interrupt or alter genes, especially those genes affected by dosage effects, can go on to produce live offspring.

Mosaic chromosome rearrangements that exist in just a subset of cells however face less extreme selection pressure. As they make up just a subset of cells, mosaic cells in which gene function is interrupted or altered may persist, with normally functioning cells compensating for the lack of gene function.

As mosaic rearrangements that alter gene activity may persist in the genome, many mosaic chromosome rearrangements are associated with diseases, often due to the interruption of genes or the creation of fusion genes with new products. Several mosaic rearrangements have been associated with cancers in humans, most notably a mos t(9;22)(q34;q11) rearrangement known as the Philadelphia chromosome (Nowell and Hungerford, 1960). This rearrangement results in the creation of an oncogenic BCR-ABL fusion gene, which produces an “always on” tyrosine kinase signalling protein that causes the cells to divide uncontrollably, and is associated with chronic myeloid leukemia (Daley et al., 1990; Elefanty et al., 1990). To date no mosaic rearrangements have been associated with cancers in the pig.

Numerous other mosaic rearrangements are known in humans, with the most notable rearrangements typically being associated with cancers. Mosaic rearrangements occur in healthy individuals as well, and between 0.002% and 0.004% of cells in an individual are expected to carry mosaic rearrangements (Warburton, 1991). Unlike constitutional rearrangements, mosaic rearrangements are less often associated with fertility problems, with the chance of finding a somatic rearrangement in an individual experiencing fertility problems being just 0.04% (Opheim et al., 1995). For this reason, mosaic chromosome rearrangements are rarely reported in the domestic pig, as cytogenetic labs primarily focus on constitutional rearrangements which are known to cause fertility problems. To date just a handful of mosaic rearrangements have been identified in the pig (Table 51), none of which were identified through routine cytogenetic screening. One was present in a stillborn boar exhibiting physical malformations (Hansen-Melander and Melander, 1970), while a series of somatic rearrangements involving aberrant rearrangement between genes of the T-cell receptor were sought and subsequently confirmed to occur in pigs (Musilova et al., 2014).

Economic and Breeding Perspectives

Modern swine breeding practices have continuously emphasized the use of a small number of pure-bred boars for simultaneous artificial insemination (AI) of multiple sows. Swine breeding in Canada, for example, has become an increasingly concentrated business, with the number of Canadian farms reporting pigs declining 72.8% since 1991, while the average number of pigs per farm increased by 430%, from 345 to 1,829 per farm in 2019 (Statistics Canada, 2019). The ratio of sows to breeding boars in Canada is over 50:1, thus it may be presumed that the average number of breedings per boar is at least 50 (Statistics Canada, 2019). As such, a single carrier of a reciprocal translocation permitted to breed in this way will produce approximately 50 litters simultaneously, with an associated loss of 40% of potential piglets. Presuming the average litter size to be 12 (CCSI Annual Report, 2019), this would result in the loss of approximately 250 piglets, costing the breeders thousands to tens of thousands of dollars.
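The arithmetic behind this estimate can be sketched as follows. The function name and parameters are illustrative assumptions, and the inputs are the figures quoted above (about 50 breedings per boar, an average litter of 12, and roughly 40% loss per litter).

```python
def expected_piglet_loss(litters, mean_litter_size, loss_fraction):
    """Piglets lost when every litter sired by one carrier shrinks by loss_fraction."""
    if not 0.0 <= loss_fraction <= 1.0:
        raise ValueError("loss_fraction must lie between 0 and 1")
    return litters * mean_litter_size * loss_fraction

# Figures taken from the paragraph above; real herds will differ.
lost = expected_piglet_loss(litters=50, mean_litter_size=12, loss_fraction=0.40)
print(f"Expected piglets lost per carrier boar: {lost:.0f}")
```

With these inputs the estimate is 240 piglets, in line with the approximate figure of 250 given above.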

In addition to the smaller litter sizes from chromosome rearrangement carriers, the presence of rearrangement carriers in swine herds raises distinct economic concerns. Canada, despite its size, is one of the largest exporters of pork products in the world, exporting over 1 million metric tonnes of pork products in 2018 into a diversified global marketplace (USDA, 2019). In addition, high quality breeding boars are also increasingly exported from Canada into the United States, with 15,296 such boars being exported in 2017, generating over a thousand more dollars in revenue per head than feeder hogs (USDA, 2018). Given the economic importance of the pork industry to the Canadian economy, it is clear that the presence of chromosome rearrangements in Canadian swine herds is detrimental to the industry at large, and that managing their impact may ensure acceptable litter sizes for breeding boars, minimizing losses and increasing the stock of these boars for export.

Routine Cytogenetic Screening and Management of Chromosome Rearrangements

The recognized association between chromosome rearrangements and lower fertility has spurred the development of cytogenetic screening programs in several countries. The National Sow Herd Management Program in France was the first such screening program, and stipulated that boars siring litters of 8 piglets or fewer on average are to be cytogenetically examined prior to additional breeding (Dagorn, 1978; Popescu et al., 1983). This program greatly increased the number of boars being cytogenetically examined in France over the years and resulted in 20 reported reciprocal translocations in French boars by 1999 (Ducos et al., 1998a; Ducos et al., 1998b). This program was expanded in 1999 to include mandatory cytogenetic screening of boars born of small litters prior to approval for AI (Pinton et al., 2000). The success of this program has led to many French breeders voluntarily submitting their boars for cytogenetic screening regardless of whether they met the criteria, and it has been strongly suggested that all boars entering AI centres be screened for chromosome rearrangements (Ducos et al., 2000; Ducos et al., 2002).

As of 2017, most cytogenetic screening of pigs is conducted at the National Veterinary School of France in Toulouse, with 31,000 boars having passed through this lab (Ducos et al., 2017). Several other cytogenetic screening efforts have been put in place in other countries as well. The Cooperative Pig Centers for Artificial Insemination in Pigs has screened over 1000 pigs in the Netherlands (Ducos et al., 2008), and nearly 2000 animals have been screened at the National Institute of Animal Production in Balice, Poland, amongst others (Ducos et al., 2008).


Other cytogenetic screening programs have been implemented at the National Institute for Agricultural and Food Research and Technology in Madrid, Spain (over 800 animals), and at the University of Guelph, Canada (over 700 animals) (Sanchez-Sanchez et al., 2019; Quach et al., 2016). Despite these labs performing cytogenetic screening for breeding boars, just a small percentage of breeding boars worldwide are subject to cytogenetic screening. Indeed, the technology to perform cytogenetic screening is widely available; however, most breeders, especially those in highly productive countries such as the U.S.A. and China, do not take advantage of cytogenetic screening on their own herds. Thus, the implementation of cytogenetic screening has much room to grow in the swine industry.

The Generation of Chromosome Rearrangements

Despite the large number of chromosome rearrangements identified in both humans and pigs, little is known about the factors leading to their formation. Chromosome rearrangements at their core require three events to occur: the first is the simultaneous generation of DSBs on one or more chromosomes, the second is the recognition of the damage and initiation of the DNA repair response, and the third is the mis-repair of the chromosome segments, resulting in a reciprocal translocation. Most research concerning chromosome rearrangements focuses not on this process, but instead on the consequences of the rearrangement itself, such as the impact of gene fusions from somatic rearrangements. Thus, little is known about the processes that lead to and result in the formation of reciprocal translocations.

Generation of DSBs

Double strand breaks (DSBs) are estimated to occur quite frequently in the mammalian genome, with an average of ten DSBs generated in each cell every day (Lieber et al., 2003; Martin et al., 1985). DSBs may form due to a variety of factors, both exogenous and endogenous. Exogenous factors include ionizing radiation and stress derived from chemical sources, while endogenous factors include programmed DSBs that are a normal part of cellular development, mistakes during DNA replication, and mechanical stress from within the cell itself.

Ionizing radiation (IR) has long been known to generate DSBs through the breaking of water molecules (Sax, 1938; Yamaguchi et al., 2015). This results in the production of hydroxyl free radicals which may react with nearby DNA, producing single strand breaks (SSBs) which may later be converted to DSBs (Milligan et al., 1995; Friedberg et al., 2005). A similar mechanism operates within the cells themselves in the form of reactive oxygen species (ROS). Normal oxidative cellular respiration converts a small portion of oxygen molecules into superoxide, which may be converted into hydroxyl free radicals by superoxide dismutase (Chance et al., 1979). It is estimated that this results in the production of 10^9 ROS per cell per hour, a small portion of which may enter the nucleus, resulting in DSBs (Lieber, 2010).

Another exogenous method of generating DSBs is through chemical agents that damage DNA, known as clastogens. These are radiomimetic compounds, so named because their effects on DNA mimic those of ionizing radiation. Clastogens may damage DNA through a variety of methods, including DNA alkylation and DNA cross-linking (Wyrobek et al., 2005). DNA-alkylating agents may affect DNA by adding an alkyl group to guanine residues in the DNA, preventing the double helix from linking properly and generating DSBs. DNA-crosslinking agents work a bit differently, creating novel covalent bonds between nucleotide residues, which inhibit DNA strand separation and prevent transcription and replication. Topoisomerase inhibitors may also produce DSBs by trapping topoisomerase-DNA complexes, preventing topoisomerase from relieving coiling-induced stress, which may lead to the collapse of replication forks (Koster et al., 2007). Although such agents may be useful in the treatment of cancer, as they more seriously affect quickly dividing tumour cells which are more susceptible to DNA damage, in otherwise normal cells these extra DSBs are unwanted.

Endogenous sources of DSBs are ever present in cells, as DNA may be damaged during replication, which if left unrepaired may promote genomic instability (Syeda et al., 2014). DSBs also form naturally during meiosis in order to facilitate crossover events between homologous chromosomes. Programmed DSBs are likewise an integral part of several other cellular processes. An example of this is V(D)J recombination, the mechanism of genetic recombination necessary for the production of a diverse set of immunoglobulin receptor genes, as well as the genes of the T-cell receptor. Immunoglobulin class switching may also lead to the generation of DSBs (Soulas-Sprauel et al., 2007). These chromosome breaks are highly regulated, lacking the spontaneity of exogenously induced breaks.

Nicks in the DNA or the presence of DNA secondary structures such as hairpins and cruciforms may cause polymerases to stall during DNA replication, leading to the collapse of a replication fork and subsequent DSB formation (Pfeiffer et al., 2000; Lu et al., 2015). Stalled replication forks may regress and partially displace newly synthesized DNA from their template strands, resulting in a structure similar to a Holliday junction, which may be cleaved, resulting in a DSB (Mehta and Haber, 2014). DSBs may also arise endogenously from segregation defects which delay chromosome clearing from the central spindle (Hoffelder et al., 2004).

Susceptibility to Chromosome Breakage

Chromosome breakage is thought not to occur evenly across chromosomes, with particular chromosomes and chromosome regions thought to be more susceptible to breakage than others. The structure of chromosomes has been suggested to influence the frequency of rearrangement, with longer chromosomes and more acrocentric chromosomes being thought to rearrange more frequently (Bickmore et al., 2001; Lin et al., 2018). Several studies of translocation breakpoints have found hotspots for rearrangement in the human genome, where chromosome breaks resulting in rearrangement occur more frequently than in other regions (Yu et al., 1978; Aurias et al., 1978; Warburton, 1991). These chromosome regions are proposed to have some common factors, including a less condensed, more transcriptionally active euchromatic composition (Goetze et al., 2007; Falk et al., 2008). Other chromosome features thought to increase susceptibility to breakage include common fragile sites, heritable features of mammalian genomes known to break under exposure to distinct chemical stressors such as aphidicolin, bromodeoxyuridine (BrdU), and folate (Riggs et al., 1993; Yang and Long, 1993; Ronne, 1995). Sixty of these fragile sites are considered common amongst pigs and are expected to occur in most individuals (Ronne, 1995). Analysis of fragile sites in pigs has shown that cytogenetic bands harbouring common fragile sites often overlap with known reciprocal translocation breakpoints (Ronne, 1995; Quach et al., 2016).

Genomic elements on chromosomes such as repetitive elements are also proposed to be associated with rearrangement breakpoint regions. Many repetitive elements such as long interspersed repetitive elements (LINE), short interspersed repetitive elements (SINE), low copy repeats (LCR), and palindromic sequences are thought to influence chromosome breakage or rearrangement. This is proposed due to the potential for high homology between different LINE, SINE, and LCRs, and the ability of repetitive sequences such as palindromes to adopt non-B DNA structures such as hairpins or cruciforms that may promote DNA breakage (Ou et al., 2011; Bacolla et al., 2015). These repetitive sequences have been identified near the breakpoints of both somatic and germline reciprocal translocations and are proposed to influence or promote rearrangement at these sites; however, the precise roles of these repetitive elements in all circumstances are unknown (Nilsson et al., 2017; Luokonnen et al., 2018).

DNA Damage Recognition and the Initiation of DNA Repair

Given the importance of the recognition of DSBs and the initiation of repair, cells have numerous mechanisms to perform these actions. DNA damage recognition (DDR) proteins such as the MRE11-NBS1-RAD50 complex are always active in cells, on the lookout for DSBs. Once a DSB is identified, these DDR proteins initiate a series of signalling events that lead to DNA repair. This begins with protein kinases such as ATM, which become activated and phosphorylate key proteins involved in DNA damage repair. These protein kinases play a role in the detection of DNA damage, and soon after the formation of a DSB, ATM will phosphorylate H2AX near the site of the DSB. H2AX plays several roles in the initiation of the DNA damage response, including recruiting DNA damage signalling and repair proteins to the DSB, remodelling chromatin near the site of the DSB, and amplifying the signal of the DNA damage checkpoint proteins. ATM then activates checkpoint proteins, which aim to stop the cell cycle from progressing while the DNA repair response is initiated, and transducer proteins, which themselves go on to activate effector proteins which will initiate DNA repair.

DNA Repair

The DSBs routinely generated in cells must be repaired in a timely manner in order to maintain genomic stability. In most cases, these DSBs will be correctly repaired, with the homologous chromosome segments rejoining together in the correct orientation, thus properly resolving the break. Instances of mis-repair, where the chromosome segments are rejoined either in the wrong orientation or to new non-homologous partners, result in the generation of chromosome rearrangements (Agarwal et al., 2006; Richardson and Jasin, 2000; Khanna and Jackson, 2001). Several pathways for DSB repair are known, with the main pathways being non-homologous end joining (NHEJ) and homologous recombination (HR).

NHEJ requires little to no homology between chromosome segments and acts by modifying the broken DNA ends via small deletions and insertions, and ligating the broken DNA ends together (Pannunzio et al., 2014). NHEJ can work throughout the cell cycle, as it requires no homologous partner to use as a template for DNA repair (Avlon et al., 2004; Ira et al., 2004; Moore and Haber, 1996). NHEJ is thought to suppress chromosome translocations as it acts on DSBs in a timely manner. Mutants with delayed or inactive NHEJ increase the time that DSBs exist in cells, increasing the likelihood that two DSBs will exist simultaneously (Iarovaia et al., 2014). In addition, insufficient NHEJ activity may lead to mutagenic modes of DNA repair such as Alt-NHEJ, which may take on several error-prone forms including microhomology-mediated end joining (MMEJ) and single-strand annealing (SSA). In general, Alt-NHEJ results in extensive resection and cleavage of DNA near the breakpoints, resulting in small to large deletions (Rodgers and McVey, 2016). Although the mutagenic nature of Alt-NHEJ has led to the proposal that it contributes to rearrangement, little evidence suggests this is the case, with canonical NHEJ being more often observed at breakpoint junctions (Ghezraoui et al., 2014).

HR is different from NHEJ in that it requires a homologous template, typically in the form of a sister chromatid, in order to repair DSBs (Moynahan and Jasin, 2010). This limits the timeframe in which HR may be active, as it cannot be active in the G0 or G1 phases of the cell cycle. As HR requires a homologous template in order to repair DSBs, it is considered to be largely error free; however, there are cases where HR will occur between highly homologous chromosome regions on different chromosomes, resulting in the mis-repair of the DSBs and in a chromosome rearrangement. Recurrent chromosome rearrangements are proposed to occur via non-allelic homologous recombination (NAHR) between long, highly homologous chromosome segments known as low-copy repeats (Ou et al., 2011).

More recently the development of NGS technologies has allowed for the precise delineation of rearrangement breakpoints, revealing breakpoint signatures from which the potential mechanisms underlying the rearrangement can be inferred. The limited number of rearrangement breakpoints that have been sequenced suggests two major underlying mechanisms of DNA repair leading to rearrangement. Some breakpoints have long homologous segments flanking recurrent translocation junctions, suggesting that in these cases NAHR mediated the rearrangement (Giglio et al., 2002; Hastings et al., 2009; Ou et al., 2011). Other publications however have shown a distinct lack of any significant homology at the sites of breakpoints, suggesting that NHEJ underlies the formation of non-recurrent chromosome rearrangements (Chiang et al., 2012).

In addition to NHEJ and HR being implicated in the generation of chromosome rearrangements, signatures of error prone DNA repair mechanisms have been observed at rearrangement breakpoint junctions as well. Error prone replication-based mechanisms such as microhomology mediated break induced replication (MMBIR), and fork stalling and template switching (FoSTeS) have been suggested to underlie nonrecurrent rearrangements associated with disease (Hastings et al., 2009; Lee et al., 2007; Abyzov et al., 2015). The hallmark of these DNA repair methods is the presence of microhomology and small templated insertions at the breakpoint junction, as well as other signatures such as accompanying inversions and copy number gains (Carvalho et al., 2011; Carvalho et al., 2013). The microhomology at the breakpoint junction is used to prime the resumption of synthesis at a stalled or collapsed replication fork.


Limited analysis of balanced chromosome translocation breakpoints reveals a variety of breakpoint signatures corresponding to the different DNA repair methods discussed above. Many breakpoint junctions have been observed to be precisely repaired, with no evidence of microhomology, suggesting NHEJ as the mechanism for repair (Nilsson et al., 2017). Still, many other breakpoints have been observed with microhomology and small insertions at the breakpoint junctions, suggesting repair by MMBIR or FoSTeS (Nilsson et al., 2017). Microhomology at breakpoint junctions could also be explained by Alt-NHEJ, which takes over when NHEJ is insufficient, and also leaves short microhomology sequences at the breakpoint junction (Ghezraoui et al., 2014).
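The way a breakpoint signature is read from sequence can be illustrated with a minimal sketch. It measures the homology shared by the two partner sequences around their breakpoints and maps the result onto the repair mechanisms discussed above. The function names, the toy sequences, and in particular the 200 bp cut-off separating microhomology from long homology are hypothetical choices for illustration, not values taken from the cited studies.

```python
def microhomology_length(left, left_break, right, right_break):
    """Count bases around a junction that could derive from either partner.

    The derivative chromosome is assumed to be left[:left_break] + right[right_break:].
    Homology is counted in both directions away from the two breakpoints.
    """
    forward = 0
    while (left_break + forward < len(left)
           and right_break + forward < len(right)
           and left[left_break + forward] == right[right_break + forward]):
        forward += 1
    backward = 0
    while (left_break - backward - 1 >= 0
           and right_break - backward - 1 >= 0
           and left[left_break - backward - 1] == right[right_break - backward - 1]):
        backward += 1
    return forward + backward

def classify_junction(homology, inserted_bases=0, long_homology_cut=200):
    """Map a junction signature to candidate mechanisms (cut-off is hypothetical)."""
    if homology >= long_homology_cut:
        return "long homology (consistent with NAHR)"
    if homology > 0 or inserted_bases > 0:
        return "microhomology or insertion (consistent with Alt-NHEJ, MMBIR or FoSTeS)"
    return "blunt join (consistent with NHEJ)"

# Toy example: the partners share the 3 bases "GTA" immediately after the breakpoints.
n = microhomology_length("AAAAACGTACCC", 6, "TTTTTTGTAGGG", 6)
print(n, classify_junction(n))
```

A classification of this kind only narrows down the candidate mechanisms, since, as noted above, the same microhomology signature can be produced by more than one repair pathway.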

The presence of repetitive elements such as LINE, SINE, Alu, and LCR sequences at the sites of recurrent breakpoints has led to the suggestion that NAHR between these highly homologous sequences resulted in the generation of a translocation at those breakpoint junctions (Bailey et al., 2003; Deininger et al., 2003; Ou et al., 2011). These sequences are rarely identified however at non-recurrent breakpoint junctions, suggesting that the DNA repair mechanisms acting in different genomic regions differ based on the underlying genomic architecture (Nilsson et al., 2017). Despite several recurrent rearrangement breakpoints being observed in mammalian genomes, it is still unclear why they so frequently undergo rearrangement. Indeed, it is unclear if breakpoints are promoted by similar underlying factors in each case, or if different sets of factors are involved in each case of rearrangement.

Rationale

Chromosome rearrangements are a well-known, but understudied phenomenon in the domestic pig. Chromosome rearrangements are routinely generated in swine herds, occurring in 1/200 live births (Ducos et al., 2007). Although carriers of chromosome rearrangements may appear normal, they experience predictable reproductive loss averaging 40% per carrier. This is particularly troublesome for large swine breeders who select a small number of pure-bred boars to breed to several sows simultaneously via artificial insemination. As chromosome rearrangements give no visible signs of their presence, carriers may easily be included in breeding operations, causing tens to hundreds of thousands of dollars in losses, while also passing on the rearrangement to half of liveborn offspring.

Currently the only method of detecting carriers of chromosome rearrangements is by routine cytogenetic screening, which while effective, is costly and time consuming. In addition, despite the availability of the technology, few countries have appropriate cytogenetic labs, and few breeders take advantage of their presence. The establishment of permanent cytogenetic labs for the purpose of screening pure-bred boars is paramount to ensuring only those boars with proper chromosome constitutions are permitted to breed, thus improving the genomics of swine herds.

Even with the identification of carriers and their removal from breeding operations, de novo rearrangements reliably occur in 1/200 live births, thus necessitating that cytogenetic screening be performed on an ongoing basis until comparably effective methods are developed. The high prevalence of rearrangements in the domestic pig suggests they occur non-randomly, and that there may be a genetic component which could be mechanistically investigated.

Despite the existence of more than 200 unique rearrangements described in the pig, little is known about the factors that promote the formation of rearrangement in those carriers. The characterization of rearrangements and their carriers could help to elucidate risk factors in the genome for acquiring chromosome rearrangements. Thus, the hypothesis of the present study is that carriers of chromosome rearrangements are prevalent in Canadian swine herds, and that there are distinct features of the genomic architecture and genomic landscape of pigs that are associated with chromosomal rearrangement.

In order to test this hypothesis, the study had several objectives:

Objective 1: Carry out routine cytogenetic screening of prospective breeding boars to identify and characterize chromosome rearrangements

Objective 2: Perform bioinformatic analysis of chromosome rearrangement breakpoints in order to elucidate genomic factors associated with their formation

Objective 3: Obtain DNA samples of carrier boars and their direct family members in order to perform a GWAS and evaluation of CNV in order to identify SNPs and CNVs associated with chromosome rearrangements and propose a novel mechanism to explain the formation and high prevalence of rearrangements in the domestic pig.

Chapter 2: Cytogenetic Screening of Canadian Swine Herds: The Prevalence of Chromosome Abnormalities

Work from this Chapter has been published in Genes, Scientific Reports, and Elsevier

Introduction

Chromosome rearrangements, most notably reciprocal translocations, are one of the leading causes of reproductive dysfunction in the domestic pig, with an estimated 50% of hypoprolific boars being carriers (Gustavsson, 1990). Chromosome rearrangements result from the breakage and mis-repair of chromosomes, leading to a re-ordering of chromosome material. Despite this large-scale re-organization of chromosome material in the genomes of carriers, phenotypic consequences are typically not observed. This is because the chromosome rearrangements are genetically balanced, formed without significant loss of genetic material. As a result, reciprocal translocation carriers may reside in herds undetected for a long period of time. The presence of a chromosome rearrangement in a carrier typically only becomes known if the carrier is selected for breeding. In this case the carrier boar will produce significantly smaller litters, with an average of 40% fewer offspring relative to other members of the herd (Gustavsson, 1990; Pinton et al., 2000).

Although the presence of chromosome rearrangements affecting the fertility of pigs has been known for over 50 years, few breeders account for the presence of rearrangements in their herds. This is likely to result in thousands of dollars in losses for each carrier boar permitted to breed (Quach et al., 2016; King et al., 2019). This is especially damaging due to the high prevalence (1/200) of rearrangements in pigs (Ducos et al., 2007; Quach et al., 2016). The introduction of cytogenetic screening into herds may therefore have clear economic benefits by identifying rearrangement carriers prior to breeding. In addition to the financial benefits for breeders, submitting boars for cytogenetic screening may also reveal novel chromosome rearrangements in swine herds. These new rearrangements may help cytogeneticists further understand the prevalence of rearrangements in swine herds, and the breadth of rearrangement possible in the pig genome. In doing so there is hope that this information may help to uncover potential reasons as to why the prevalence of rearrangements appears so high in the pig.

Here we describe the results of a cytogenetic screening program established at the University of Guelph in Canada. Over a period of five years peripheral blood samples from more than 6000 boars were subjected to cytogenetic screening, leading to the observation of several new chromosome rearrangements in the domestic pig. These novel chromosome rearrangements are reported, as well as the prevalence of rearrangements over time, where we show the ability of cytogenetic screening to drastically reduce the prevalence of rearrangements that impair fertility in a short time span.

Materials and Methods

Peripheral Blood Collection

Peripheral blood samples from 6,491 pigs (Sus scrofa domestica) raised in various Canadian farms were collected by experienced farm technicians or Canadian Food Inspection Agency veterinarians according to the Canadian Council on Animal Care and the University of Guelph’s Animal Care Committee Guidelines. The majority of these pigs (99.5%) were reproductively unproven young boars from commercial herds, approximately six months of age, and in good general health. The remaining 0.5% of pigs were gilts or sows that were related to a boar identified to carry a chromosome rearrangement. The breeds of the pigs were as follows: Duroc (n = 1,870), Landrace (n = 1,420), Pietrain (n = 132), Yorkshire (n = 2,207), and Other/Undeclared (n = 864).

Lymphocyte Culture and Chromosome Analysis

Lymphocyte cultures were set up as follows. Peripheral blood samples were obtained from Canadian farms and kept in chilled heparin-lined tubes. From each sample, 1 ml of peripheral blood was pipetted into a T25 ventilated flask (Corning, USA), containing a prepared media consisting of 8.9 ml Roswell Park Memorial Institute (RPMI) 1640 medium (Gibco, Grand Island, USA), 1.0 ml of fetal bovine serum (FBS; Gibco, Canada), 0.07 ml phytohemagglutinin, M-form (PHA-M; Gibco, Grand Island, USA), and 0.02 ml of penicillin/streptomycin (10,000 units/ml penicillin; 10,000 µg/ml streptomycin; Sigma-Aldrich, St. Louis, USA). The flasks were then placed into an incubator at 37.5°C for a period of 72 hours.

At the 72 hour mark the cell cultures were treated with 30 µl of 10 µg/ml colcemid solution (Gibco, Grand Island, USA) for twenty minutes, resulting in the termination of the culture, and the arrest of cells in the metaphase stage. After this time the contents of the cell culture flasks were transferred into 15 ml conical tubes and placed in a centrifuge at 300x g for 10 minutes. The supernatant was then aspirated, and a 0.56% hypotonic solution (1.12 g, 0.075 M KCl in 200 ml MilliQ water; Sigma-Aldrich, St. Louis, USA) was pipetted into each tube, and gently mixed in order to resuspend the blood cells. Once resuspended, the tubes were filled to the 11 ml mark with the remaining hypotonic solution and left in a 37.5°C oven for 20 minutes, allowing the cells to swell.
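The stated composition of the hypotonic solution can be checked with a short calculation. The sketch below converts the mass and volume given above into a percent (w/v) and a molar concentration. The molar mass of KCl (about 74.55 g/mol) is a standard reference value that is not given in the text, and the function name is illustrative only.

```python
# Minimal sketch: check that 1.12 g KCl in 200 ml corresponds to
# 0.56% (w/v) and roughly 0.075 M.
# KCL_MOLAR_MASS is a standard reference value, not taken from the thesis.
KCL_MOLAR_MASS = 74.55  # g/mol (K 39.10 + Cl 35.45)


def concentration(mass_g, volume_ml, molar_mass):
    """Return (percent w/v, molarity in mol/L) for a solute made up to volume_ml."""
    percent_wv = mass_g / volume_ml * 100                    # grams per 100 ml
    molarity = (mass_g / molar_mass) / (volume_ml / 1000)    # mol per litre
    return percent_wv, molarity


percent_wv, molarity = concentration(1.12, 200, KCL_MOLAR_MASS)
print(f"{percent_wv:.2f}% w/v, {molarity:.3f} M")
```

Both values agree with the nominal 0.56% and 0.075 M concentrations quoted for the solution.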

After this time, each sample was treated with 200 µl of a chilled 3:1 methanol (Fisher Chemical, Fair Lawn, USA) and glacial acetic acid (Fisher Chemical, Fair Lawn, USA) fixative solution. The tubes were then gently mixed and placed in the centrifuge at 300x g for 10 minutes, after which the supernatant of the tubes was aspirated. Further fixation of the solution was performed, adding 3 ml of fixative solution, 1 ml at a time, to each sample while vortexing at 2200 rpm in order to resuspend the cells. Once resuspended, each tube was filled to the 13 ml mark with fixative solution, and chilled for 30 minutes. This fixation step was then repeated twice, allowing for three full rounds of fixative treatment. After chilling the tubes for a final time, they were placed in the centrifuge at 300x g for 10 minutes, the supernatant aspirated, and the cells resuspended with a small amount of fixative solution. At a temperature of approximately 24°C and 60% humidity, two to three drops of cell suspension were then pipetted onto two clean labelled glass slides (Fisherbrand, Pittsburgh, USA) near a humidifier, and were left over a 37.5°C water bath to dry for 3 minutes. The slides were then removed and left to age for three days.

At the end of the third day the slides were placed in an oven at 40.0°C overnight prior to GTG-banding. The slides were then treated with a 0.01% trypsin solution (1.6 ml of 0.25% sterile trypsin diluted in 40 ml of 0.01 M phosphate buffer solution; Sigma-Aldrich, St. Louis, USA) for 20 seconds to over a minute. The goal of this step was to partially digest the chromosomes, with tests being carried out beforehand to determine the optimal exposure time. The slides were then stained in a 6% Giemsa solution (3 ml of Giemsa stain diluted in 50 ml of MilliQ water; Sigma-Aldrich, St. Louis, USA) for eight minutes. After this time the slides were removed from the solution and left to dry.
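The working concentrations of the trypsin and Giemsa solutions follow from the dilution relation C1 x V1 = C2 x V2. The sketch below is only illustrative: it assumes the stock is added to the stated volume of diluent (so the final volume is stock plus diluent) and treats undiluted Giemsa stain as 100%. Neither assumption is stated in the protocol, and the function name is hypothetical.

```python
# Minimal sketch of C1*V1 = C2*V2 for the two banding solutions.
# Assumption: the stock is added to the stated diluent volume, so the
# final volume is stock_ml + diluent_ml.
def diluted_percent(stock_percent, stock_ml, diluent_ml):
    """Final concentration (%) after adding stock_ml of stock to diluent_ml."""
    return stock_percent * stock_ml / (stock_ml + diluent_ml)


trypsin = diluted_percent(0.25, 1.6, 40)  # 0.25% trypsin stock in phosphate buffer
giemsa = diluted_percent(100, 3, 50)      # undiluted Giemsa stain treated as 100%
print(f"trypsin {trypsin:.4f}%, Giemsa {giemsa:.2f}%")
```

Under these assumptions both results round to the nominal values given above (0.01% trypsin and 6% Giemsa).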

Once dry, the slides were observed under a camera-mounted Leica microscope (Leica Camera AG, Germany). At least fifteen images of high-quality metaphases (well-spread, long chromosomes, with clear G-banding) were captured under the 100x objective. The resulting images were then exported and at least two high-quality images were karyotyped using SmartType software (Digital Scientific UK). Each karyotype was evaluated for the presence of chromosome rearrangements or abnormalities. Any rearrangement present was then characterized according to the International System for Human Cytogenetic Nomenclature (ISCN), in reference to the standard G-banded pig karyotype (Gustavsson, 1988).

Familial and Reproductive Information

Where a carrier boar was identified, additional peripheral blood samples from the carrier, or from direct relatives, were obtained wherever possible for further cytogenetic analysis. The cytogenetic screening of the parents and siblings of carrier boars allowed partial pedigrees to be obtained in ten cases. In the other cases, one or both parents had been culled before the chromosome rearrangement was reported, leaving the origin of the rearrangement undetermined.

Among the cases of unique rearrangements, both parents were available for cytogenetic analysis in only 10 cases. In 9 of these both parents were revealed to have a normal chromosome composition (Cases #4, 5, 7, 8, 9, 13, 14, 16, 20). The boar in Case #1 had inherited the reciprocal translocation from the sire. In all other cases only a single parent, or neither parent, was available for cytogenetic analysis. In addition to unique instances of a rearrangement, we observed 27 carriers of rearrangements previously identified in our lab, each of which was confirmed to be related to the first observed carrier. Thus, all recurrently observed rearrangements were contained within families. For the purpose of analysis, the first instance of any constitutional rearrangement was considered to be de novo or spontaneous in origin, unless otherwise determined.

Two carriers of constitutional reciprocal translocations, and three carriers of mosaic reciprocal translocations were experimentally bred by a farm. The farms provided reproductive data for each boar in the form of the total number born (TNB), number born alive (NBA), as well as the number of stillborn and fetal mummies in the case of the mosaic boars. The farms also provided litter size data for the sires, dams, and siblings of mosaic carriers, alongside the herd average for each metric to allow for comparison. Analysis of the effect of mosaic rearrangements on litter size in the domestic pig is discussed in Rezaei et al. (2020).

Statistical Analysis

Statistical analysis was performed where appropriate. Specifically, the chi-square test was used to determine statistically significant differences in outcome between two or more groups.

Results

From the beginning of 2015 through to the end of 2019, 6491 pigs were karyotyped in our laboratory. The number of pigs karyotyped annually ranged between 847 and 1640, generally increasing each year (Table 1). Over this period of time 101 pigs were found to carry a chromosome abnormality (Table 1). Nearly all chromosome abnormalities were structural in nature, with 98 pigs observed to be carriers of reciprocal translocations in either a constitutional (n = 59) or a mosaic form (n = 39), while single carriers of a pericentric inversion and of a deletion were identified as well. Two pigs were observed to be chimeras, having two distinct cell lines with different sex chromosomes, and are thus 2n=38,XX/XY. One of the chimeric boars also carried a mosaic rearrangement and is listed amongst the other carriers of mosaic rearrangements, with chimerism being listed as a secondary characteristic.

Table 1: Count of Chromosome Rearrangements Observed in Canadian Swine Herds

| Year | Total | Normal | Constitutional Reciprocal Translocation | Mosaic Reciprocal Translocation | Other Chromosome Aberration | Total Number of Rearrangements |
|---|---|---|---|---|---|---|
| 2015 | 847 | 843 | 3 | 1 | 0 | 4 |
| 2016 | 1101 | 1079 | 21 | 1 | 0 | 22 |
| 2017 | 1640 | 1621 | 9 | 10 | 0 | 19 |
| 2018 | 1615 | 1577 | 17 | 19 | 2 | 38 |
| 2019 | 1290 | 1272 | 9 | 8 | 1 | 18 |
| Total | 6493 | 6389 | 59 | 38 | 4 | 101 |

Constitutional Reciprocal Translocations in Canadian Swine Herds

In total 32 distinct constitutional reciprocal translocations, with a unique set of breakpoints, were observed amongst 59 carriers (Table 2). In each individual, all karyotypes presented with identical reciprocal translocations, thus attesting to the constitutional nature of the rearrangements.

Recurring observation of five distinct constitutional chromosome rearrangements among family members was found to have resulted from the inheritance of the rearrangement from a previously identified carrier parent or grandparent that was used for breeding. In total we observed 27 carriers that had inherited one of five distinct reciprocal translocations directly from a previously identified carrier parent (Table 3). Each of the 32 unique reciprocal translocation carriers is described in detail in Appendix I, along with descriptions of those pigs that inherited rearrangements afterwards.

Table 2: List of Constitutional Reciprocal Translocations Observed in Canadian Boars

| Case | Year | Sex | Breed | Karyotype | Rearrangement |
|---|---|---|---|---|---|
| 1 | 2015 | M | Yorkshire | Translocation | t(3;13)(q21;q21) |
| 2 | 2015 | M | Duroc | Translocation | t(1;7)(q21;p11) |
| 3 | 2015 | M | Yorkshire | Translocation | t(5;12)(q11;q12) |
| 4 | 2016 | M | Yorkshire | Translocation | t(1;3)(p23;q25) |
| 5 | 2016 | M | Duroc | Translocation | t(4;12)(p11;p15) |
| 6 | 2016 | M | Landrace | Translocation | t(4;9)(p13;p24) |
| 7 | 2016 | M | Large White | Translocation | t(14;15)(q13;q15) |
| 8 | 2016 | M | Yorkshire | Translocation | t(Y;13)(p13;q33) |
| 9 | 2016 | M | Duroc | Translocation | t(9;13)(q24;q31) |
| 10 | 2016 | M | Landrace | Translocation | t(3;6)(q25;q11) |
| 11 | 2016 | M | Yorkshire | Translocation | t(4;6)(q11;q27) |
| 12 | 2016 | M | Yorkshire | Translocation | t(2;15)(q13;q24) |
| 13 | 2016 | M | Yorkshire | Translocation | t(1;14)(q21;q14) |
| 14 | 2016 | M | Landrace | Translocation | t(3;6)(q13;p13) |
| 15 | 2017 | M | Duroc | Translocation | t(12;14)(q13;q21) |
| 16 | 2017 | M | Yorkshire | Translocation | t(6;7)(q33;q22) |
| 17 | 2017 | M | Duroc | Translocation | t(13;18)(q21;q13) |
| 18 | 2017 | M | Duroc | Translocation | t(10;13)(p15;q31) |
| 19 | 2017 | M | Landrace | Translocation | t(12;14)(q15;q23) |
| 20 | 2017 | M | Yorkshire | Translocation | t(10;13)(q13;q21) |
| 21 | 2017 | M | Duroc | Translocation | t(2;10)(p17;q13) |
| 22 | 2018 | M | Yorkshire | Translocation | t(15;18)(q24;q24) |
| 23 | 2018 | M | Yorkshire | Translocation | t(6;15)(q33;q13) |
| 24 | 2018 | M | Yorkshire | Translocation | t(2;17)(p17;q13) |
| 25 | 2018 | M | Yorkshire | Translocation | t(4;15)(q21;q11) |
| 26 | 2018 | M | Yorkshire | Translocation | t(Y;1)(q11;q17) |
| 27 | 2019 | M | Yorkshire | Translocation | t(9;14)(p13;q11) |
| 28 | 2019 | M | Landrace | Translocation | t(1;14)(q2.11;q25) |
| 29 | 2019 | M | Yorkshire | Translocation | t(5;18)(q21;q11) |
| 30 | 2019 | M | Yorkshire | Translocation | t(5;13)(q21;q43) |
| 31 | 2019 | M | Yorkshire | Translocation | t(7;9)(q15;p24) |
| 32 | 2019 | M | Crossbred | Translocation | t(13;14)(q31;q29) |

Table 3: List of Repeated Constitutional Reciprocal Translocations

| Case | Year | Sex | Breed | Karyotype | Rearrangement |
|---|---|---|---|---|---|
| 2b | 2016 | M | Duroc | Translocation | t(1;7)(q21;p11) |
| 2c | 2016 | M | Duroc | Translocation | t(1;7)(q21;p11) |
| 2d | 2016 | M | Duroc | Translocation | t(1;7)(q21;p11) |
| 2e | 2016 | M | Duroc | Translocation | t(1;7)(q21;p11) |
| 2f | 2016 | M | Duroc | Translocation | t(1;7)(q21;p11) |
| 2g | 2016 | M | Duroc | Translocation | t(1;7)(q21;p11) |
| 2h | 2016 | M | Duroc | Translocation | t(1;7)(q21;p11) |
| 2i | 2016 | M | Duroc | Translocation | t(1;7)(q21;p11) |
| 3b | 2016 | M | Yorkshire | Translocation | t(5;12)(q11;q12) |
| 3c | 2016 | M | Yorkshire | Translocation | t(5;12)(q11;q12) |
| 15b | 2017 | M | Duroc | Translocation | t(12;14)(q13;q21) |
| 15c | 2018 | F | Duroc | Translocation | t(12;14)(q13;q21) |
| 15d | 2018 | M | Duroc | Translocation | t(12;14)(q13;q21) |
| 15e | 2018 | M | Duroc | Translocation | t(12;14)(q13;q21) |
| 15f | 2019 | M | Duroc | Translocation | t(12;14)(q13;q21) |
| 15g | 2019 | M | Duroc | Translocation | t(12;14)(q13;q21) |
| 17b | 2017 | M | Duroc | Translocation | t(13;18)(q21;q13) |
| 17c | 2018 | M | Duroc | Translocation | t(13;18)(q21;q13) |
| 17d | 2018 | F | Duroc | Translocation | t(13;18)(q21;q13) |
| 17e | 2018 | F | Duroc | Translocation | t(13;18)(q21;q13) |
| 17f | 2018 | M | Duroc | Translocation | t(13;18)(q21;q13) |
| 17g | 2019 | M | Duroc | Translocation | t(13;18)(q21;q13) |
| 18b | 2018 | M | Duroc | Translocation | t(10;13)(p15;q31) |
| 18c | 2018 | F | Duroc | Translocation | t(10;13)(p15;q31) |
| 18d | 2018 | F | Duroc | Translocation | t(10;13)(p15;q31) |
| 18e | 2018 | F | Duroc | Translocation | t(10;13)(p15;q31) |
| 18f | 2018 | F | Duroc | Translocation | t(10;13)(p15;q31) |

Reproductive Performance of Constitutional Reciprocal Translocation Carriers

Of the five chromosome rearrangements identified in more than one individual, two were due to the specific experimental breeding of the original carriers. The first boar to be bred was Case #2, a Duroc carrier of a rcp(1;7)(q21;p11), which produced 15 litters. The average NBA was 5.4, representing a 33.6% decrease in litter size relative to the reported herd average. Samples from fifteen of the resulting offspring were submitted for cytogenetic analysis, with eight being observed to have inherited the rearrangement (Cases #2b-2i). Overall 53% of the offspring from this boar inherited the rearrangement.

The second boar to be experimentally bred, Case #3, a Yorkshire carrier of a rcp(5;12)(q11;q12), produced 15 litters as well. The average NBA was 4.6, indicating a 63% decline in litter size relative to the reported herd average. Samples from five of the boars born from the litters were karyotyped, with two of the boars (40%) being observed to have inherited the rearrangement (Cases #3b and 3c). Considering the offspring from Cases #2 and #3 together, on average 50% of the offspring were found to have inherited the rearrangement from their sire.
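The inheritance proportions quoted for the two experimentally bred boars can be reproduced from the offspring counts given above. The sketch below is a minimal illustration; pooling the two sires before taking the percentage is an inference about how the 50% figure arises, and the variable and function names are hypothetical.

```python
# Karyotyped offspring of the two experimentally bred carriers
# (counts taken from the text above).
offspring = {
    "Case #2": {"karyotyped": 15, "carriers": 8},
    "Case #3": {"karyotyped": 5, "carriers": 2},
}


def percent_carriers(carriers, karyotyped):
    """Percentage of karyotyped offspring that inherited the rearrangement."""
    return 100 * carriers / karyotyped


per_boar = {
    case: percent_carriers(d["carriers"], d["karyotyped"])
    for case, d in offspring.items()
}

# Pool both sires: all carrier offspring over all karyotyped offspring.
total_carriers = sum(d["carriers"] for d in offspring.values())
total_karyotyped = sum(d["karyotyped"] for d in offspring.values())
pooled = percent_carriers(total_carriers, total_karyotyped)

for case, pct in per_boar.items():
    print(f"{case}: {pct:.0f}% of karyotyped offspring were carriers")
print(f"Pooled: {pooled:.0f}%")
```

The pooled value matches the proportion expected when a balanced carrier transmits the rearrangement to half of its liveborn offspring.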

Mosaic Reciprocal Translocations in Canadian Swine Herds

Through the course of cytogenetic screening we identified 39 carriers of mosaic rearrangements (Table 4). In each case a subset of karyotypes presented with a reciprocal translocation, while the remaining karyotypes had a normal chromosome composition. Of these cases, 32 (Cases #33-64) have been previously reported and described by Rezaei et al. (2020).

Since this time, a further seven carriers of reciprocal translocations in a mosaic state were observed, and are described in Appendix I. In each of these new cases a reciprocal translocation was observed in one of the two karyotypes arranged from high-quality metaphases. A number of additional karyotypes were then arranged, bringing the total number to 25. In each case, no additional chromosome aberrations were identified, indicating the mosaic nature of the rearrangement, and the carriers were estimated to have ~ 4% mosaicism.

Table 4: List of Mosaic Reciprocal Translocations in Canadian Swine Herds

| Case | Year | Sex | Breed | Rearrangement | Author |
|---|---|---|---|---|---|
| 33 | 2015 | M | Duroc | mos t(7;9)(q24;q24); mos t(3;13)(q21;q49) | Rezaei et al., 2020 |
| 34 | 2016 | M | Yorkshire | mos t(7;9)(q24;q24) | Rezaei et al., 2020 |
| 35 | 2017 | M | Unknown | mos t(3;7)(p15;q13) | Rezaei et al., 2020 |
| 36 | 2017 | M | Landrace | mos t(7;9)(q24;q24) | Rezaei et al., 2020 |
| 37 | 2017 | M | Landrace | mos t(7;9)(q24;q24) | Rezaei et al., 2020 |
| 38 | 2017 | M | Unknown | mos t(9;13)(p22;q41) | Rezaei et al., 2020 |
| 39 | 2017 | M | Landrace | mos t(8;9)(q21;q24) | Rezaei et al., 2020 |
| 40 | 2017 | M | Unknown | mos t(6;7)(q21;q22); mos t(7;9)(q24;q24) | Rezaei et al., 2020 |
| 41 | 2017 | M | Landrace | mos t(7;9)(q24;q24) | Rezaei et al., 2020 |
| 42 | 2017 | M | Duroc | mos t(7;9)(q24;q24) | Rezaei et al., 2020 |
| 43 | 2017 | M | Landrace | mos t(3;7)(q23;q26) | Rezaei et al., 2020 |
| 44 | 2017 | M | Yorkshire | mos t(3;10)(q23;p13) | Rezaei et al., 2020 |
| 45 | 2018 | M | Yorkshire | mos t(7;9)(q24;q24) | Rezaei et al., 2020 |
| 46 | 2018 | F | Landrace | mos t(7;9)(q24;q24) | Rezaei et al., 2020 |
| 47 | 2018 | M | Duroc | mos t(7;9)(q24;q24) | Rezaei et al., 2020 |
| 48 | 2018 | M | Yorkshire | mos t(7;9)(q24;q24) | Rezaei et al., 2020 |
| 49 | 2018 | M | Duroc | mos t(7;9)(q24;q24) | Rezaei et al., 2020 |
| 50 | 2018 | M | Unknown | mos t(7;9)(q24;q24) | Rezaei et al., 2020 |
| 51 | 2018 | M | Yorkshire | mos t(7;9)(q24;q24) | Rezaei et al., 2020 |
| 52 | 2018 | F | Duroc | mos t(7;9)(q24;q24) | Rezaei et al., 2020 |
| 53 | 2018 | M | Landrace | mos t(7;9)(q24;q24); 2n = 38,XX/XY | Rezaei et al., 2020 |
| 54 | 2018 | M | Yorkshire | mos t(7;7)(q24;q15) | Rezaei et al., 2020 |
| 55 | 2018 | M | Yorkshire | mos t(7;9)(q24;q24) | Rezaei et al., 2020 |
| 56 | 2018 | M | Landrace | mos t(9;18)(q22;q11) | Rezaei et al., 2020 |
| 57 | 2018 | M | Duroc | mos t(7;18)(q22;q11) | Rezaei et al., 2020 |
| 58 | 2018 | M | Yorkshire | mos t(7;18)(q22;q11) | Rezaei et al., 2020 |
| 59 | 2018 | M | Duroc | mos t(5;9)(q21;p22) | Rezaei et al., 2020 |
| 60 | 2018 | M | Duroc | mos t(2;8)(q23;q21) | Rezaei et al., 2020 |
| 61 | 2018 | M | Yorkshire | mos t(6;16)(p15;q21) | Rezaei et al., 2020 |
| 62 | 2018 | M | Duroc | mos t(7;9)(q24;q24) | Rezaei et al., 2020 |
| 63 | 2018 | M | Duroc | mos t(7;9)(q15;q15) | Rezaei et al., 2020 |
| 64 | 2019 | M | Pietrain | mos t(7;13)(q22;q21) | Rezaei et al., 2020 |
| 65 | 2019 | M | Unknown | mos t(7;9)(q24;q24) | This Thesis |
| 66 | 2019 | M | Duroc | mos t(7;9)(q24;q24) | This Thesis |
| 67 | 2019 | M | Duroc | mos t(7;9)(q24;q24) | This Thesis |
| 68 | 2019 | M | Duroc | mos t(7;18)(q15;q22) | This Thesis |
| 69 | 2019 | M | Landrace | mos t(8;16)(q21;q21) | This Thesis |
| 70 | 2019 | M | Yorkshire | mos t(1;2)(p23;q23) | This Thesis |
| 71 | 2019 | M | Duroc | mos t(1;1)(1q2.11;1q21) | This Thesis |

Other Chromosome Rearrangements or Aberrations in Canadian Swine Herds

In addition to constitutional and mosaic reciprocal translocations, carriers of a deletion and an inversion, and two boars displaying XX/XY chimerism, were identified. These cases are listed in Table 5 and described in Appendix I. Case #53, a boar observed to display both XX/XY chimerism and a mos t(7;9)(q24;q24) rearrangement, has previously been described by Rezaei et al. (2020).

Table 5: List of Non-Reciprocal Chromosome Aberrations Observed in Canadian Swine Herds

| Case | Year | Sex | Breed | Karyotype | Rearrangement |
|---|---|---|---|---|---|
| 72 | 2018 | M | Unknown | Deletion | 38, X, del(Yq) |
| 73 | 2019 | M | Duroc | Inversion | inv(9)(p11;p22) |
| 74 | 2018 | M | Duroc | Chimera | 2n=38, XY/XX |

Prevalence of Chromosome Rearrangements in Canadian Swine Herds

Through the routine cytogenetic screening of 6491 pigs, we observed the prevalence of chromosome abnormalities in Canadian swine herds to be 1.56% (Table 6). Thus 1 in 64 samples from pigs that entered our laboratory carried a chromosome abnormality. The most prevalent chromosome abnormalities were constitutional reciprocal translocations, with 0.91% of pigs submitted for screening being found to be carriers. Mosaic reciprocal translocations were prevalent as well, with 0.59% of pigs found to be carriers. In contrast, other chromosome aberrations such as inversions (0.015%), deletions (0.015%), and chimerism (0.03%) were far less prevalent. Considering all constitutional rearrangements together (reciprocal translocation, inversion, and deletion), there were 61 total carriers of rearrangements, and 34 distinct cases of rearrangement. Thus, the prevalence of constitutional chromosome rearrangements was 0.94%, while the de novo prevalence of rearrangements is estimated to be 0.52%.
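The prevalence figures in this paragraph are simple proportions of the 6491 pigs screened. The following sketch recomputes them from the carrier counts quoted in the text; the dictionary labels and function name are illustrative rather than part of the original analysis.

```python
# Carrier counts quoted in the text; 6491 is the number of pigs screened.
PIGS_SCREENED = 6491
counts = {
    "all chromosome abnormalities": 101,
    "constitutional reciprocal translocations": 59,
    "all constitutional rearrangement carriers": 61,
    "distinct constitutional rearrangements (taken as de novo)": 34,
}


def prevalence_percent(n_carriers, n_screened=PIGS_SCREENED):
    """Prevalence as a percentage of pigs screened."""
    return 100 * n_carriers / n_screened


prevalence = {label: prevalence_percent(n) for label, n in counts.items()}
one_in = PIGS_SCREENED / counts["all chromosome abnormalities"]

for label, value in prevalence.items():
    print(f"{label}: {value:.2f}%")
print(f"about 1 in {one_in:.0f} pigs carried an abnormality")
```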

Table 6: Prevalence of Chromosome Rearrangements in Canadian Swine Herds by Year

| Year | Constitutional Reciprocal Translocation | Mosaic Reciprocal Translocation | Other Chromosome Aberrations | All Chromosome Rearrangements |
|---|---|---|---|---|
| 2015 | 0.35% | 0.12% | 0% | 0.47% |
| 2016 | 1.91% | 0.09% | 0% | 2% |
| 2017 | 0.55% | 0.61% | 0% | 1.16% |
| 2018 | 1.05% | 1.11% | 0.19% | 2.35% |
| 2019 | 0.7% | 0.62% | 0.08% | 1.4% |
| Total | 0.91% | 0.59% | 0.06% | 1.56% |

Extent of Mosaicism in Carriers, and Prevalence of Mosaic Cells in Swine Herds

Although the prevalence of carriers of mosaic chromosome rearrangements was 0.59% in the Canadian pig population, this does not necessarily reflect the extent of mosaicism in the species. Based on the total number of pigs karyotyped, as well as the number of mosaic carriers karyotyped, we can estimate the prevalence of mosaicism in lymphocytes. Cases #33 and #40 were found to have two distinct mosaic cell lines, presenting with ~ 8% mosaicism and ~ 4% mosaicism each, while Case #59 revealed fifteen additional mosaic rearrangements, and demonstrated ~ 60% mosaicism. In total we observed 59 mosaic cell lines amongst 6491 pigs. Based on the total number of pigs karyotyped, as well as the additional karyotypes arranged for mosaic carriers, we have performed 13,993 total karyotypes during this study. The prevalence of mosaic cell lines is thus observed to be 0.4%, with a frequency of 1 in 250 lymphocytes. A more conservative estimate, which treats Case #59 as an outlier because it was the only boar identified to have a rate of mosaicism exceeding 8%, could also be considered. This results in the consideration of the 41 distinct mosaic cell lines observed amongst 13933 karyotypes. In this case the prevalence of mosaic cells is 0.29%, with 1 in 330 lymphocytes expected to carry a rearrangement.

Prevalence of Chromosome Rearrangements in the Major Breeds in Canada

Of the 6,491 pigs examined, 5,497 were from one of the three major breeds in Canada: Duroc, Landrace, and Yorkshire. Comparing the prevalence of chromosome rearrangements between the breeds showed that no breed had significantly more rearrangements than the others (χ² = 0.992, p = 0.609; Chi-square test; Table 7). Breaking down the rearrangements into constitutional and mosaic rearrangements, we found that Yorkshire pigs had a prevalence of constitutional rearrangements 2.86x higher than that of the other breeds; however, this difference failed to reach statistical significance (χ² = 5.772, p = 0.056; Chi-square test; Table 8), while no breed had a significantly higher prevalence of mosaic rearrangements than the others (χ² = 1.097, p = 0.578; Chi-square test; Table 9).

Table 7: Distribution of Chromosome Rearrangements by Breed

| Breed | Number of Pigs Karyotyped | Number of Distinct Rearrangements | Expected Number of Rearrangements | Prevalence of Rearrangements | χ² Value |
|---|---|---|---|---|---|
| Duroc | 1870 | 23 | 22.79 | 1.23% | 0.002 |
| Landrace | 1420 | 14 | 17.31 | 0.99% | 0.633 |
| Yorkshire | 2207 | 30 | 26.9 | 1.36% | 0.357 |

χ² = 0.992, d.f. = 2, p = 0.609
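The expected counts and test statistic in Table 7 are consistent with a chi-square goodness-of-fit test in which each breed's expected number of rearrangements is proportional to the number of pigs karyotyped. The sketch below reproduces the calculation under that assumption; the software actually used for the analysis is not stated, and the closed-form p-value shown holds only for two degrees of freedom.

```python
import math

# (pigs karyotyped, distinct rearrangements observed) per breed, from Table 7.
breeds = {
    "Duroc": (1870, 23),
    "Landrace": (1420, 14),
    "Yorkshire": (2207, 30),
}

total_pigs = sum(n for n, _ in breeds.values())
total_obs = sum(o for _, o in breeds.values())

chi2 = 0.0
for breed, (n_pigs, observed) in breeds.items():
    # Expected count if every breed shared the pooled rearrangement rate.
    expected = total_obs * n_pigs / total_pigs
    contribution = (observed - expected) ** 2 / expected
    chi2 += contribution
    print(f"{breed}: expected {expected:.2f}, contribution {contribution:.3f}")

# With 2 degrees of freedom the chi-square survival function is exp(-x/2).
p_value = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.3f}, d.f. = 2, p = {p_value:.3f}")
```

Replacing the observed counts with those of Tables 8 and 9 gives statistics close to the values reported there, with small differences arising from rounding of the expected counts.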

Table 8: Distribution of Constitutional Reciprocal Translocations by Breed

| Breed | Number of Pigs Karyotyped | Number of Unique Rearrangements | Expected Number of Rearrangements | Prevalence of Rearrangements | χ² Value |
|---|---|---|---|---|---|
| Duroc | 1870 | 7 | 10.55 | 0.37% | 1.195 |
| Landrace | 1420 | 5 | 8.01 | 0.35% | 1.131 |
| Yorkshire | 2207 | 19 | 12.45 | 0.86% | 3.446 |

χ² = 5.772, d.f. = 2, p = 0.056

Table 9: Distribution of Mosaic Reciprocal Translocations by Breed

| Breed | Number of Pigs Karyotyped | Number of Unique Rearrangements | Expected Number of Rearrangements | Prevalence of Rearrangements | χ² Value |
|---|---|---|---|---|---|
| Duroc | 1870 | 14 | 11.23 | 0.75% | 0.683 |
| Landrace | 1420 | 8 | 8.52 | 0.56% | 0.032 |
| Yorkshire | 2207 | 11 | 13.25 | 0.5% | 0.382 |

χ² = 1.097, d.f. = 2, p = 0.578

Prevalence of Chromosome Rearrangements in Different Herds

Pigs from 25 different herds were included in this study, 15 of which contributed at least 100 pigs. Comparing the observed number of rearrangements to the expected number, we found just one herd that had significantly more rearrangements than expected; overall, however, there was no significant difference between the numbers of rearrangements observed and expected (χ² = 16.786, p = 0.268; Chi-square test; Table 10). The exception, a herd of Duroc boars, had 2.46x more rearrangements than expected, primarily due to an elevated rate of mosaic rearrangements. Overall, however, we found little evidence that different swine herds experienced a significantly different prevalence of rearrangement than others.

Table 10: Distribution of Chromosome Rearrangements by Herd

| Herd | Breed | Number of Pigs Karyotyped | Number of Unique Rearrangements | Expected Number of Rearrangements | χ² Value |
|---|---|---|---|---|---|
| A | Duroc | 778 | 4 | 8.69 | 2.531 |
| B | Landrace | 368 | 2 | 4.11 | 1.083 |
| C | Canadian Landrace | 419 | 5 | 4.68 | 0.022 |
| D | French Landrace | 371 | 5 | 4.14 | 0.179 |
| E | Pietrain | 116 | 1 | 1.3 | 0.069 |
| F | Yorkshire | 210 | 5 | 2.35 | 2.988 |
| G | Canadian Yorkshire | 514 | 4 | 5.74 | 0.527 |
| H | French Yorkshire | 1220 | 17 | 13.63 | 0.833 |
| I | Unknown | 112 | 1 | 1.25 | 0.05 |
| J | Duroc | 198 | 3 | 2.21 | 0.282 |
| K | Unknown | 728 | 4 | 8.13 | 2.098 |
| L | Duroc | 182 | 5 | 2.03 | 4.345 |
| M | Duroc | 702 | 11 | 7.84 | 1.274 |
| N | Landrace | 170 | 2 | 1.9 | 0.005 |
| O | Yorkshire | 179 | 1 | 2 | 0.5 |

χ² = 16.786, d.f. = 14, p = 0.268
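One way to see which herd stands out in Table 10 is to look at each herd's individual contribution to the overall statistic. The sketch below recomputes the contributions from the counts in the table and evaluates the overall p-value with the closed-form chi-square survival function for an even number of degrees of freedom. Screening individual herds against 3.841 (the 5% critical value for one degree of freedom) is an assumption made for illustration and is not stated to be the criterion used in the original analysis.

```python
import math

# (pigs karyotyped, unique rearrangements) per herd, transcribed from Table 10.
herds = {
    "A": (778, 4), "B": (368, 2), "C": (419, 5), "D": (371, 5), "E": (116, 1),
    "F": (210, 5), "G": (514, 4), "H": (1220, 17), "I": (112, 1), "J": (198, 3),
    "K": (728, 4), "L": (182, 5), "M": (702, 11), "N": (170, 2), "O": (179, 1),
}


def chi2_sf_even_df(x, df):
    """Chi-square survival function P(X > x); closed form valid for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even number of d.f.")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


total_pigs = sum(n for n, _ in herds.values())
total_obs = sum(o for _, o in herds.values())

contributions = {}
for herd, (n_pigs, observed) in herds.items():
    # Expected count if every herd shared the pooled rearrangement rate.
    expected = total_obs * n_pigs / total_pigs
    contributions[herd] = (observed - expected) ** 2 / expected

chi2_total = sum(contributions.values())
p_overall = chi2_sf_even_df(chi2_total, df=len(herds) - 1)

# Hypothetical per-herd screen: 5% critical value of chi-square with 1 d.f.
flagged = [herd for herd, c in contributions.items() if c > 3.841]
print(f"chi2 = {chi2_total:.3f}, p = {p_overall:.3f}, flagged herds: {flagged}")
```

Under this screen only Herd L, the Duroc herd with five rearrangements against roughly two expected, exceeds the threshold.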

Prevalence of Chromosome Rearrangements over Generations with Interference via Cytogenetic Screening

One particular breeder has contributed over 4000 pigs to our screening program allowing us the opportunity to examine how cytogenetic screening over several generations affected the prevalence of chromosome rearrangements in this population. In total six generations of pigs have been evaluated in the screening program. Due to the varying sample size between generations we grouped the first three generations and the last three generations together. The prevalence of constitutional rearrangements was highest in the first three generations, with 1.14% of pigs being carriers, compared to just 0.52% in the last three generations, a decrease of 54% (Table 11). This is largely due to the removal of existing carriers from breeding eligibility, eliminating inheritance as an option, and ensuring all new rearrangements are likely of a spontaneous or de novo origin.

Table 11: Distribution and Prevalence of Chromosome Rearrangements by Generation

| Generation | Number of Pigs Karyotyped | Constitutional Reciprocal Translocations | Mosaic Reciprocal Translocations | All Rearrangements |
|---|---|---|---|---|
| A - C | 2196 | 25 | 5 | 30 |
| D - F | 1912 | 10 | 14 | 25 |

| Generation | Number of Pigs Karyotyped | Prevalence Constitutional Rearrangements | Prevalence Mosaic Rearrangements | Prevalence All Rearrangements |
|---|---|---|---|---|
| A - C | 2196 | 1.14% | 0.23% | 1.37% |
| D - F | 1912 | 0.52% | 0.73% | 1.31% |

In contrast to constitutional rearrangements the prevalence of mosaic rearrangements increased between the first and last three generations, from 0.23% to 0.73%, a 217.4% increase.


This increase in prevalence likely reflects the increased scrutiny placed upon the chromosomes prior to karyotyping, especially for the recurrent mos t(7;9)(q24;q24) rearrangement, which skilled cytogeneticists may observe in closely examined metaphase spreads without karyotyping. As a result, the overall prevalence of chromosome rearrangements is similar between the first three and last three generations (1.37% and 1.31%). Despite this, the prevalence of constitutional rearrangements, which are of most interest to breeders because of the associated reduction in litter size, decreased by 54% between the first three and the last three generations. This suggests that routine cytogenetic screening rapidly identifies carriers of rearrangements and removes them from breeding eligibility, resulting in a substantial reduction in rearrangement prevalence within a five-year span.
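The prevalences in Table 11 and the relative changes quoted in the text follow directly from the carrier counts. The sketch below illustrates the arithmetic; the function names are hypothetical, and the relative changes are assumed to be computed from the prevalences after rounding to two decimals, which is what reproduces the 54% and 217.4% figures.

```python
def prevalence_pct(carriers, screened):
    """Prevalence as a percentage of animals screened."""
    return 100 * carriers / screened


def relative_change_pct(before, after):
    """Relative change from `before` to `after`, as a percentage of `before`."""
    return 100 * (after - before) / before


# Carrier counts from Table 11
table_11 = {
    "A - C": {"screened": 2196, "constitutional": 25, "mosaic": 5},
    "D - F": {"screened": 1912, "constitutional": 10, "mosaic": 14},
}

# Prevalences rounded to two decimals, as reported in the table
prev = {
    gen: {
        kind: round(prevalence_pct(row[kind], row["screened"]), 2)
        for kind in ("constitutional", "mosaic")
    }
    for gen, row in table_11.items()
}

const_change = relative_change_pct(
    prev["A - C"]["constitutional"], prev["D - F"]["constitutional"]
)
mosaic_change = relative_change_pct(prev["A - C"]["mosaic"], prev["D - F"]["mosaic"])

print(prev)
print(f"constitutional: {const_change:.1f}%, mosaic: {mosaic_change:+.1f}%")
```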

Discussion

Over a five-year period of routine cytogenetic screening of Canadian pigs for chromosome rearrangements, we identified 101 carriers of chromosome rearrangements amongst 6491 pigs. The prevalence of chromosome rearrangements in the Canadian swine population was observed to be 1.56%. Though the prevalence of rearrangements is high, the introduction of routine cytogenetic screening into swine herds, along with proper management of carriers, led to a 54% reduction in the prevalence of constitutional reciprocal translocations over a five-year period.

Generally, different breeds and herds of pigs were equally susceptible to rearrangement, with no breed and just one herd having more rearrangements than expected. This work has contributed 34 novel constitutional chromosome rearrangements and 41 novel mosaic chromosome rearrangements to the over 200 chromosome rearrangements previously described in the domestic pig (Table S5).


The introduction of a routine cytogenetic screening program for reproductively unproven boars was effective at identifying carriers of chromosome rearrangements in herds prior to breeding, with 101 total carriers of rearrangements being observed amongst 6491 total pigs. The prevalence of chromosome rearrangements in Canadian swine herds was observed to be 1.56%.

Compared to other countries which implement cytogenetic screening programs this figure is quite high, with only a more recent program in Spain reporting a higher prevalence, 3.3% (Sanchez-Sanchez et al., 2019). This is largely due to our inclusion of carriers of mosaic chromosome rearrangements, which are apparently observed but disregarded by other laboratories because they have no apparent impact on fertility (Rezaei et al., 2020).

Considering only carriers of constitutional rearrangements as most other cytogenetics laboratories have done, the prevalence of constitutional chromosome rearrangements was 0.94%.

Compared to other nations, the incidence of abnormalities in Canada is mid-range: other countries such as the Netherlands and Poland report higher prevalences of 1.5% and 1%, respectively, while France reports a lower prevalence of just 0.47% (Ducos et al., 2007; Ducos et al., 2008). Newer screening programs, such as that in Spain, are expected to identify a higher prevalence of chromosome rearrangements because they detect both de novo and inherited cases in swine herds, while older programs, such as that in France, are expected to observe fewer chromosome rearrangements because inherited cases have already been screened out and only de novo rearrangements remain (Sanchez-Sanchez et al., 2019; Dagorn et al., 1978; Popescu et al., 1983; Ducos et al., 2007). The prevalence of constitutional chromosome rearrangements in Canada reflects the duration of screening: it has decreased since 2015 and is predicted to decrease further over time as screening continues.


This is not the first cytogenetic screening program conducted in Canada; a previous program observed that 1.64% of Canadian boars carried constitutional chromosome rearrangements (Quach et al., 2016). Since then, we have observed a 42.7% decrease in the prevalence of constitutional chromosome rearrangements in Canadian swine herds, to just 0.94%. Populations in which identified carrier boars are effectively removed from breeding eligibility also show the effectiveness of cytogenetic screening. Between the first three and the last three generations of boars screened from a single breeder, we observed a decrease in the prevalence of constitutional chromosome rearrangements from 1.14% to 0.52%, a 54% decline. The constitutional chromosome rearrangements identified in the last three generations are likely to be de novo, as no repeated cases have been found. Taking only unique constitutional rearrangements (presumed to be de novo) likewise gives a prevalence of 0.52%. This figure is just 0.05 percentage points higher than that observed in France and can be presumed to reflect the de novo rate of formation, indicating that spontaneous constitutional chromosome rearrangements are expected to occur in approximately 1/200 live births (Ducos et al., 2007). This also indicates that the spontaneous rate of rearrangement formation is consistent across different populations, with France and Canada reporting similar numbers. Overall, we propose that the introduction of routine cytogenetic screening into swine herds, coupled with the removal of identified carriers from breeding eligibility, is effective at decreasing the prevalence of constitutional chromosome rearrangements to the proposed spontaneous rate of formation in just a few generations. It is notable, however, that this effect is not permanent, and cessation of screening will result in the prevalence of chromosome rearrangements rising again with each subsequent generation.

The consistency in the apparent de novo rate of constitutional rearrangement formation between Canada and France perhaps also indicates why we failed to identify differences in the prevalence of rearrangements between breeds and herds. None of the major breeds in Canada, and just one herd of the fifteen considered, showed significant deviation from the expected number of de novo rearrangements. This indicates that chromosome rearrangement does not appear to be a feature of a particular breed or population but is instead a general and consistent feature of the pig genome. Thus, with the implementation of routine cytogenetic screening, the prevalence of constitutional chromosome rearrangements can be expected to fall to approximately 0.47%, the figure reported by France, without significant deviation (Ducos et al., 2007).

Both rearrangement carriers that were experimentally bred experienced fertility loss, averaging 48%, while propagating their rearrangement to 50% of offspring. On average a rearrangement carrier is expected to experience a 40% loss of fertility, with this figure ranging from 10% to 100% depending on the boar (Pinton et al., 2000; Popescu et al., 1983; Quach et al., 2016). The fertility loss of these boars thus falls in the expected range. Meanwhile, carriers are expected to pass on their rearrangement to 50% of liveborn offspring, which was the case here (Ducos et al., 1998a). This demonstrates the effect of constitutional chromosome rearrangements on swine herds: moderate to severe impairment of fertility, coupled with the generation of new chromosome rearrangement carriers that must be controlled or will otherwise compound the problems associated with rearrangements in swine herds.

Of the 32 unique constitutional reciprocal translocations observed in Canadian swine herds, 31 are novel, not having been previously reported in pigs. One rearrangement, however, an rcp(12;14)(q13;q21), shared its breakpoints with a rearrangement previously reported in France by Pinton et al. (2005). This is the first instance of a constitutional reciprocal translocation with identical cytogenetic breakpoints being reported in two apparently unrelated pigs. Constitutional rearrangements are generally considered to be familial events, unique to the first animal in which they were identified, with cases of recurrence being confined within individual families. Given that cytogenetic bands may be millions of base pairs long, without sequencing of the breakpoint regions it is impossible to determine whether these rearrangements occurred in chromosomal positions very near to one another, or whether the breaks are far apart but on the same cytogenetic bands by chance. Until this is determined, we consider these two rearrangements to represent distinct cases. However, we will wait to see if other identical rearrangements are observed in pigs, which may provide an opportunity to study this rearrangement in depth.

In addition to the proposed novel recurrent rearrangement, other interesting cases of constitutional reciprocal translocation were observed, including two full brothers carrying different rearrangements, rcp(2;15)(q13;q24) and rcp(4;6)(q11;q27). This is the first instance of two full siblings being observed to carry different, apparently de novo, chromosome rearrangements in the domestic pig. Typically, littermates of de novo rearrangement carriers are expected to present with a normal karyotype. Just one other litter of pigs has been observed to contain two different chromosome rearrangements, the result of two carrier parents being bred together (Gustavsson et al., 1983). In the present case, a sample from the sire had passed through our lab and presented with a normal karyotype, while the dam had been culled. Although inheritance of a rearrangement cannot be ruled out in this case, the presence of two distinct rearrangements, while no other rearrangement carriers were found amongst their siblings, allows the presumption of a de novo origin. This suggests an unusual situation in which chromosome rearrangement may occur at an elevated frequency in one of the parents.

The vast majority of chromosome rearrangements observed in the pig occur between two autosomal chromosomes, with rearrangements involving the sex chromosomes being rare. Over the course of routine cytogenetic screening we identified two carriers of Y-autosome rearrangements, rcp(Y;13)(p13;q33) and rcp(Y;1)(q11;q17). This brings the total number of chromosome rearrangements involving a sex chromosome in the pig to six cases (two involving the X, and four involving the Y) (Table 50). Sex-chromosome rearrangements are notable in the pig as they lead to near complete meiotic arrest in male carriers due to incomplete synapsis of the sex chromosomes and the triggering of meiotic division checkpoints, resulting in complete infertility (Villagomez et al., 2017; Neal et al., 1998). Tissue biopsy of the rcp(Y;13) carrier revealed the complete absence of sperm from the testis, showing the distinct azoospermia associated with this type of rearrangement (Villagomez et al., 2017).

The distinct constitutional rearrangements observed in Canadian pigs were identified seemingly randomly throughout the pig genome, with no two rearrangements sharing the same two breakpoints. Rearrangements were found involving 15 of the 18 porcine autosomes, and two rearrangements involved chromosome Y. The only chromosomes not observed to participate in rearrangement were chromosomes 8, 11, 16, and X, all of which have been previously described to rearrange less often than other chromosomes in the pig (Basrur and Stranzinger, 2008). Looking more closely at rearrangement breakpoints, however, started to reveal patterns in their formation. For instance, chromosomes 13 and 14, which have been suggested to rearrange frequently, had nine and seven breakpoints, respectively, far more than any other chromosomes (Basrur and Stranzinger, 2008).

Despite the apparent rarity of recurrence for rearrangements in the pig genome, with just two of the 201 constitutional reciprocal translocations identified in the pig occurring at two identical cytogenetic bands, recurrence of individual cytogenetic bands is much more frequent. Amongst the 32 distinct constitutional reciprocal translocations identified, 21 shared at least one breakpoint with another rearrangement we identified (Table 12). Eight cytogenetic bands, 1q21, 2p17, 3q25, 5q21, 6q33, 9p24, 10q13, and 15q24, had two observed breakpoints each, while 13q21 and 13q31 had three and four breakpoints, respectively. Although there has been little comprehensive analysis of rearrangement breakpoints in the pig genome, some bands have been suggested to rearrange more frequently, including 1q21 (Tarocco et al., 1987; Basrur and Stranzinger, 2008). Comparison of the rearrangement breakpoints identified in our laboratory with all those previously reported revealed that of the 64 individual breakpoints, 53 were re-used in the pig genome, and just two rearrangements, Case #3, t(5;12)(q11;q12), and Case #8, t(Y;13)(p13;q33), presented with two distinct breakpoints yet to be observed in the pig genome. Thus, recurrence of breakpoint regions appears to occur routinely in the pig genome.

Table 12: Rearrangements with Recurring Breakpoints

Band    Rearrangement        Case
1q21    t(1;7)(q21;p11)      #2
        t(1;14)(q21;q14)     #13
2p17    t(2;10)(p17;q13)     #20
        t(2;17)(p17;q13)     #24
3q25    t(1;3)(p23;q25)      #4
        t(3;6)(q25;q11)      #10
5q21    t(5;18)(q21;q11)     #29
        t(5;13)(q21;q41)     #30
6q33    t(6;7)(q33;q22)      #16
        t(6;15)(q33;q13)     #23
9p24    t(4;9)(p13;p24)      #6
        t(7;9)(q15;p24)      #31
10q13   t(10;13)(q13;q21)    #19
        t(2;10)(p17;q13)     #20
13q21   t(3;13)(q21;q21)     #1
        t(13;18)(q21;q13)    #17
        t(10;13)(q13;q21)    #19
13q31   t(9;13)(q24;q31)     #9
        t(10;13)(p15;q31)    #18
        t(10;13)(p13;q31)    #21
        t(13;14)(q31;q29)    #32
15q24   t(2;15)(q13;q24)     #12
        t(15;18)(q24;q24)    #22
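The band-level tally in Table 12 can be reproduced by parsing each translocation formula into its two breakpoints and counting how often each band appears. The following is a minimal sketch using only the rearrangements listed in Table 12; the regular expression and helper name are illustrative, and the parser assumes the simple two-break `t(A;B)(bandA;bandB)` notation used in the table.

```python
import re
from collections import Counter

# Case number -> rearrangement, transcribed from Table 12
table_12 = {
    1: "t(3;13)(q21;q21)", 2: "t(1;7)(q21;p11)", 4: "t(1;3)(p23;q25)",
    6: "t(4;9)(p13;p24)", 9: "t(9;13)(q24;q31)", 10: "t(3;6)(q25;q11)",
    12: "t(2;15)(q13;q24)", 13: "t(1;14)(q21;q14)", 16: "t(6;7)(q33;q22)",
    17: "t(13;18)(q21;q13)", 18: "t(10;13)(p15;q31)", 19: "t(10;13)(q13;q21)",
    20: "t(2;10)(p17;q13)", 21: "t(10;13)(p13;q31)", 22: "t(15;18)(q24;q24)",
    23: "t(6;15)(q33;q13)", 24: "t(2;17)(p17;q13)", 29: "t(5;18)(q21;q11)",
    30: "t(5;13)(q21;q41)", 31: "t(7;9)(q15;p24)", 32: "t(13;14)(q31;q29)",
}

# Matches t(chrA;chrB)(bandA;bandB)
FORMULA = re.compile(r"t\(([^;]+);([^)]+)\)\(([^;]+);([^)]+)\)")


def breakpoints(rearrangement):
    """Split a translocation formula into its two breakpoint bands, e.g. '1q21'."""
    match = FORMULA.fullmatch(rearrangement)
    if match is None:
        raise ValueError(f"unrecognised formula: {rearrangement}")
    chrom_a, chrom_b, band_a, band_b = match.groups()
    return [chrom_a + band_a, chrom_b + band_b]


band_counts = Counter(
    band for formula in table_12.values() for band in breakpoints(formula)
)
recurrent = {band: n for band, n in band_counts.items() if n > 1}

for band, n in sorted(recurrent.items(), key=lambda item: -item[1]):
    print(band, n)
```

Applied to the full set of 32 constitutional translocations, the same tally would also identify the rearrangements that share no breakpoint with any other case.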

In addition to examining constitutional chromosome rearrangements, we observed, for the first time during routine screening, mosaic chromosome rearrangements in swine herds. Mosaic chromosome rearrangements have been previously reported in pigs; however, these reports are few and far between. The first mosaic rearrangement was reported in 1970 in a stillborn boar exhibiting obvious malformation (Hansen-Melander and Melander, 1970). This boar was found to have an average 6.6% degree of mosaicism for a t(1;11)(q-;q+) apparently reciprocal translocation, with the translocation being presumed to be linked to the physical manifestations. No other cases of chromosomal mosaicism in pigs were reported until a series of pigs exhibiting several mosaic rearrangements, mos t(7;9), mos t(7;18), and mos t(9;18), were described by Musilova et al. (2014). In neither of these cases was chromosomal mosaicism found incidentally: in the first case cytogenetic analysis was carried out in order to diagnose the cause of the malformation, while in the second case the presence of these mosaic rearrangements in the pig genome was specifically sought using molecular cytogenetic techniques.

During the course of routine cytogenetic screening we identified 39 carriers of mosaic reciprocal translocations. In most cases only 1 of 25 cells karyotyped carried a chromosome rearrangement, thus the carriers exhibited ~ 4% mosaicism, with three exceptions: Cases #33 and #40, which had an additional mosaic cell line amongst 25 and 50 cells respectively, and Case #59, which exhibited ~ 60% mosaicism (16 of 25 cells). The prevalence of mosaic rearrangement carriers was estimated to be 0.59%; however, a more accurate way of assessing the prevalence of mosaic rearrangements is to look at the prevalence in cells. Considering the 41 distinct mosaic cell lines observed amongst 13,993 karyotypes, the prevalence of mosaic cells amongst blood leukocytes is estimated to be 0.29%, or 1 in 330 cells. This is the first estimation of the prevalence of mosaic chromosome rearrangements in blood leukocytes in the domestic pig. There is little evidence that the presence of mosaic cells in blood leukocytes reflects their presence in other tissues, thus we cannot estimate the prevalence of mosaicism in other tissues (Rezaei et al., 2020). Little study of the true prevalence of mosaic rearrangements in tissues has been conducted. In humans, a study of 377,357 pre-natal cytogenetic diagnoses observed just 2 mosaic reciprocal translocations, suggesting a prevalence of 0.0005% (Warburton, 1991). By this comparison, the prevalence of mosaic rearrangements in porcine lymphocytes is quite high; however, no direct comparison is available with human lymphocytes, thus it is difficult to determine whether the prevalence of mosaic rearrangements in porcine lymphocytes is exceptional relative to other species.

Unlike constitutional rearrangements, where almost all rearrangements were unique to individual families, over half of the mosaic rearrangements were found recurrently in unrelated animals. The 39 mosaic rearrangement carriers were all unrelated to one another. However, of the 41 total rearrangements identified, just 17 were distinct to their carriers, while 24 were found recurrently. These recurrent rearrangements comprised just two translocations, a mos rcp(7;9)(q24;q24) found in 21 unrelated pigs, and a mos rcp(7;18)(q22;q11) found in three unrelated pigs. None of the 17 distinct mosaic rearrangements have been previously reported in pigs in either a constitutional or mosaic form. Meanwhile, both recurrent rearrangements appear to have very similar breakpoints to mosaic rearrangements previously reported by Musilova et al. (2014). The rate at which we observed the mos t(7;9) recurrent rearrangement, 15.1 per 10000 cells, is quite similar to the 18.1 per 10000 rate observed by Musilova et al. (2014). The frequency at which we observed the mos t(7;18) rearrangement, 2.16 per 10000 cells, however, was far lower than that observed by Musilova et al. (2014), 22.8 per 10000 cells. It is unclear why this discrepancy exists; however, it is possible that bias was introduced in the selection of metaphases, favouring those displaying possible rearrangement between chromosomes 7 and 9. The rcp(7;18) rearrangement involves shorter chromosome segments, and is thus less visible without arranging a full karyotype for the sample.

The combined frequency of these recurrent rearrangements observed by Musilova et al. (2014) is 40.9 per 10000 cells, or 0.41% of cells, identical to the prevalence of all mosaic cells observed in our pigs. Though the extensive karyotyping of thousands of cells per animal performed by Musilova et al. (2014) is likely to provide a better reflection of the true prevalence of mosaic cells in porcine blood leukocytes, this shows that similar figures are obtained through routine cytogenetic screening examining a small number of karyotypes in thousands of samples. It also shows that mosaicism is prevalent in the pig genome and may be observed incidentally at a rate similar to that of intensive karyotyping of thousands of cells in an animal. Given that the intensive karyotyping performed by Musilova et al. (2014) considered just three mosaic rearrangements, and we identified nineteen distinct rearrangements, it is likely that the true prevalence of mosaic rearrangements in porcine lymphocytes exceeds this 0.41% figure. Considering the prevalence of the two recurrent mosaic rearrangements established by Musilova et al. (2014), along with the prevalence of distinct non-recurrent rearrangements observed in our lab (12.15 per 10000 cells), we can estimate the true prevalence of mosaic rearrangements to be approximately 0.53%, or just over 1 in 200 cells.
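The combined estimate can be traced step by step from the per-10000-cell rates quoted above. This is only an illustrative sketch of the arithmetic; the variable names are hypothetical, and it assumes the rates from the two studies can simply be added, as the text does.

```python
# Rates per 10000 cells, as quoted in the text
musilova_t7_9 = 18.1            # mos t(7;9), Musilova et al. (2014)
musilova_t7_18 = 22.8           # mos t(7;18), Musilova et al. (2014)
nonrecurrent_this_study = 12.15  # distinct non-recurrent mosaic rearrangements

# Combined rate of the two recurrent rearrangements
recurrent_rate = musilova_t7_9 + musilova_t7_18

# Assumed additive estimate of all mosaic rearrangements
combined_rate = recurrent_rate + nonrecurrent_this_study

combined_pct = combined_rate / 100  # per 10000 cells -> percent of cells
cells_per_event = 10_000 / combined_rate

print(f"recurrent: {recurrent_rate:.1f} per 10000 cells")
print(f"combined:  {combined_rate:.2f} per 10000 cells = {combined_pct:.2f}%")
print(f"about 1 in {cells_per_event:.0f} cells")
```

Expressed as a ratio, the combined rate corresponds to slightly fewer than 200 cells per mosaic cell, which is the sense in which the estimate is "just over 1 in 200".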


Mosaic rearrangements, in contrast to constitutional rearrangements, were less randomly distributed across the genome. Most breakpoints occurred on chromosomes 7 and 9, with fewer breakpoints on the other autosomal chromosomes. In all, 24 rearrangements were identified to occur recurrently, with 21 unrelated animals carrying an observably identical mos t(7;9)(q24;q24), and three unrelated animals carrying a mos t(7;18)(q22;q11) rearrangement. This same discrepancy in recurrence between constitutional and mosaic rearrangements is also evident in the human genome, as few recurrent constitutional rearrangements have been observed, while recurrence of somatic rearrangements, especially those associated with cancers, is much more prevalent (Ou et al., 2011; see Mitelman et al., 2007 for review). A large number of recurrent somatic rearrangements have been catalogued in humans in databases such as the Catalogue of Somatic Mutations in Cancer (COSMIC; http://www.sanger.ac.uk/genetics/CGP/cosmic/) and the Mitelman Database (https://mitelmandatabase.isb-cgc.org/).

Notably, rearrangement breakpoints were observed on chromosomes 8 and 16 in a mosaic form, but not in a constitutional form. Rearrangement breakpoints also appeared recurrently on bands other than 7q24 and 9q24. Several bands, 2q23, 3q23, 7q22, 9p22, and 16q21, had two observed breakpoints each, while three rearrangements had a breakpoint on 8q21. Compared to constitutional rearrangements, these mosaic rearrangements had a greater tendency to re-use breakpoints. Excluding the mos t(7;9) and mos t(7;18) recurrent cases, 13 of the 17 (76.5%) mosaic rearrangements re-used at least one breakpoint, compared to 65.6% (21 of 32) of constitutional rearrangements that shared breakpoints (Table 13). Although this value is only 16.6% larger in relative terms amongst mosaic rearrangements, it should be noted that far fewer mosaic rearrangements were identified, making this consistent re-use of breakpoints more notable. Overall, we see a tendency for recurrence at the chromosome and cytogenetic band level for mosaic rearrangements not seen in constitutional rearrangements.

Table 13: Mosaic Rearrangement Breakpoints Showing Re-Use

Band    Rearrangement            Case
2q23    mos t(2;8)(q23;q21)      #60
        mos t(1;2)(p23;q23)      #70
3q23    mos t(3;7)(q23;q26)      #43
        mos t(3;10)(q23;p13)     #44
7q22    mos t(7;18)(q22;q11)     #56
        mos t(7;13)(q22;q21)     #64
8q21    mos t(8;9)(q21;q24)      #39
        mos t(2;8)(q23;q21)      #60
        mos t(8;16)(q21;q21)     #69
9p22    mos t(9;13)(p22;q41)     #38
        mos t(5;9)(q21;p22)      #59
16q21   mos t(6;16)(p15;q21)     #61
        mos t(8;16)(q21;q21)     #69

Aberrations other than reciprocal translocations were rarely observed in pigs, making up just four of the 104 total distinct chromosome aberrations observed. Of the over 200 distinct chromosome rearrangements identified in the pig, only 14 cases of inversions have been reported, the majority of which are pericentric. The inv(9)(p11;p21) inversion reported here is the fourth paracentric inversion observed in the pig, and the second reported on chromosome 9 (Fries and Stranzinger, 1982). The potentially ambiguous banding patterns of inverted chromosomes, especially paracentric inversions that do not alter centromere placement, coupled with their low impact on reproduction relative to reciprocal translocations, may help to explain why few inversions are identified in pigs (Massip et al., 2009; Quach et al., 2016).

Cytogenetic investigation revealed for the first time a boar exhibiting the constitutional loss of a whole chromosome arm, in the form of a 38,X,del(Yq) karyotype. This is the first case of a whole chromosome arm deletion being observed in a phenotypically normal pig and is exceptional considering that genomic imbalance is generally not tolerated by the pig genome. Only one pig has previously been reported to carry a constitutional genomic imbalance, the result of inheritance of an unbalanced translocation resulting in a partial trisomy of chromosome 17 (Villagomez et al., 1995b). That boar appeared normal but was demonstrated to have acrosomal defects in its sperm. The Y chromosome is considered to be generally gene poor, with the majority of genes concentrated on the short arm, while the long arm contains mostly repetitive sequences (Skinner et al., 2016). It could thus be reasoned that the majority of genes on the Y chromosome would be unaffected by this deletion, and that no observable phenotypic effect would occur. Although tissue samples were not available for this case, it can be speculated that this deletion did manifest itself in the form of infertility and azoospermia, as the failure of the sex chromosomes to properly align and synapse during meiosis could trigger meiotic checkpoints, bringing a cessation to meiotic cell divisions and resulting in azoospermia and infertility as seen in carriers of Y-autosome rearrangements (Villagomez et al., 2017).


In addition to rearrangements of chromosome material, we identified two boars carrying two distinct cell lines with different sex chromosome compositions. Both boars displayed at least two distinct leukocyte cell lines, one of which was XX and the other XY, thus both boars were considered to be XX/XY chimeras. Both boars have been previously discussed in detail by Rezaei et al. (2020). Chimerism is rarely reported in the pig, likely because these pigs usually appear phenotypically normal (Bruere et al., 1968; Clarkson et al., 1995; Padula, 2005). Twelve boars displaying XX/XY mosaicism have been previously identified during the course of routine cytogenetic screening in France, a prevalence (0.14%) higher than that observed in our own laboratory (Ducos et al., 2007).

Routine cytogenetic screening of reproductively unproven boars is not commonly performed globally. The largest programs, in Poland, France, the Netherlands, and Canada, together screen an estimated 5,000 boars per year or fewer, just a fraction of all breeding boars (Ducos et al., 2007; Ducos et al., 2008). There are an estimated 18,500 breeding boars in Canada at any time; however, a maximum of 1600 are screened in any given year, just 8.6% of the total (Statistics Canada, 2019). The proven ability of cytogenetic screening to identify carriers of chromosome rearrangements prior to breeding provides a massive opportunity for swine breeders to maximize the litter sizes of their breeding boars and minimize potential losses. Given that a single pure-bred boar is estimated to sire 50 or more litters simultaneously, and the average litter size of Yorkshire and Landrace boars in Canada is approximately 12, a single rearrangement carrier permitted to breed may produce litters with approximately 240 fewer piglets in total, a 40% loss (Statistics Canada, 2019; CCSI, 2019). A conservative loss of $25 per lost piglet, based on the market price of hogs, would result in a loss of $6000; however, as these boars are meant for breeding, and are consequently worth more than feeder hogs, losses from a single rearrangement carrier may reach the tens to possibly hundreds of thousands of dollars (Quach et al., 2016; King et al., 2019).
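The conservative loss estimate above can be laid out as a short calculation. This is a minimal sketch with hypothetical variable names; it takes the 40% average fertility loss at face value and assumes the $25 figure applies to each piglet lost, which is the reading under which 50 litters of 12 piglets yields a $6000 loss.

```python
# Inputs quoted in the text
litters_per_boar = 50      # litters sired by a single pure-bred boar
piglets_per_litter = 12    # approximate average litter size in Canada
fertility_loss = 0.40      # average loss expected for a rearrangement carrier
value_per_piglet = 25      # conservative market-based value in dollars (assumed per piglet)

# Piglets expected from a non-carrier boar
expected_piglets = litters_per_boar * piglets_per_litter

# Piglets lost when the boar carries a rearrangement
lost_piglets = round(expected_piglets * fertility_loss)

# Conservative financial loss at market value
financial_loss = lost_piglets * value_per_piglet

print(f"{expected_piglets} expected, {lost_piglets} lost, ${financial_loss} loss")
```

Because the loss scales linearly with every input, a carrier at the severe end of the 10% to 100% fertility-loss range, or a higher per-animal value for breeding stock, raises the estimate proportionally.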

Despite the accessibility of cytogenetic screening to swine producers, few appear to take advantage of it. Of the ten largest swine-producing countries or economic unions, just two, the European Union and Canada, implement cytogenetic screening programs on breeding boars. Notably, the programs implemented in the EU are largely localized to four countries, which together account for approximately a third of swine production (USDA, 2020). Meanwhile China and the United States implement no such large programs, while accounting for over half of the world’s production of pork products (USDA, 2020). In order to increase the efficiency of swine breeding it is imperative that cytogenetic screening programs be integrated with current swine assessment parameters in order to select the best boars for breeding.

The swine industry in Canada expresses interest in competing with our closest trading partner, the United States, not in quantity, but by producing better quality pigs. Genetic improvement programs have been active in Canada since the 1960s, with many of the largest breeders in the country actively participating in these programs. As time has progressed, the average litter size of Canadian pigs has increased, while economically desirable characteristics such as fat content have been improved (CCSI, 2019). The incorporation of routine cytogenetic screening into the evaluation of breeding boars helps to remove carriers from breeding herds and minimize financial losses due to unexpectedly low litter size. In turn this helps to increase the marketability of these breeding boars, especially to other countries that do not implement such programs. Cytogenetic screening could therefore help drive improvements in Canadian breeding programs and could help to raise Canadian breeding boars to even higher levels in the global swine industry.


The introduction of routine cytogenetic screening of Canadian boars was shown to successfully identify 101 distinct carriers of chromosome abnormalities in Canadian swine herds. This in turn established the breadth of chromosome rearrangements present in Canadian swine herds, including 32 novel constitutional reciprocal translocations, and for the first time recurrent and non-recurrent mosaic rearrangements. The prevalence of chromosome rearrangements in Canadian swine herds was found to be quite high; however, the introduction of routine cytogenetic screening was found to progressively lower the prevalence of constitutional rearrangements to the expected de novo rate of formation in just a few years. The results of this program demonstrate that cytogenetic screening can be used both to routinely identify novel chromosome rearrangements and, in breeding selection, to prevent the breeding of carriers and reduce the prevalence of constitutional rearrangements to the de novo rate of formation within a five-year period. Cytogeneticists and breeders thus benefit mutually, through the discovery of new rearrangements and through the implementation of new herd management strategies that yield cost savings for breeders and maintain litter-size expectations.


Chapter 3: Reciprocal Translocation Breakpoints are Non-Randomly Distributed in the Pig Genome

Work from this Chapter has been published in Genes, and Scientific Reports

Introduction

The establishment of labs performing routine cytogenetic screening of young boars has resulted in a large number of rearrangements being observed in the pig genome (Ducos et al., 2007; Raudsepp and Chowdhary, 2011). Basic analysis of rearrangement breakpoints identified at the University of Guelph reveals that many cytogenetic breakpoints are re-used between individuals, with it being apparently rare to identify a novel rearrangement where at least one breakpoint has not been previously reported in another rearrangement (Donaldson et al., 2019). Although several studies of chromosome rearrangements in the pig have suggested that breakpoints are non-randomly distributed in the pig genome, this hypothesis has yet to be thoroughly tested (Tarocco et al., 1987; Basrur and Stranzinger, 2008).

Several chromosomes and cytogenetic bands have been suggested to break more frequently in the pig; however, it is unclear whether the positions of rearrangement breakpoints occur by chance, or whether as yet unidentified factors promote rearrangement in particular genomic regions. Although a handful of chromosomes have been examined in some detail, no comprehensive analysis of translocation breakpoints across the pig karyotype has been conducted. With over 200 reciprocal translocations now identified in the domestic pig, there is ample information to investigate whether rearrangement breakpoints occur randomly in the pig genome, and whether specific chromosomal and genomic architectural features may be associated with their presence (Table S5). We can also compare chromosome rearrangements between the pig and human genomes in order to determine whether similar factors are driving rearrangement in both species. In turn this may help us to better understand how chromosome rearrangements are distributed in the pig genome, and whether any chromosomal factors appear to be associated with these rearrangements, allowing for a more comprehensive understanding of risk factors for chromosome breakage in the pig genome.

Here we performed a comprehensive analysis of translocation breakpoints in the pig genome at the chromosome and cytogenetic band levels. We considered larger genomic features and performed more specific bioinformatic analyses to identify genomic features that may be associated with translocation breakpoints, and to determine how they may promote breakage and rearrangement in these chromosome regions.

Materials and Methods

2.1. Cytogenetic Screening Analysis of Pig Populations

As described in detail in Chapter 2, peripheral blood samples were routinely collected from 6491 reproductively unproven young boars raised at various Canadian farms by experienced farm workers or Canadian Food Inspection Agency veterinarians according to the Canadian Council on Animal Care and the University of Guelph’s Animal Care Committee guidelines. These animals were from commercial herds, in good general health, and were not selected specifically for research purposes. The samples were submitted to the Animal Health Laboratory of the University of Guelph for commercial genetic screening. Lymphocyte cell cultures were set up according to the standard cytogenetic protocols of our laboratory, as in Chapter 2.2, and as previously published (Quach et al., 2016; Villagomez et al., 2017). Twenty-five high-quality metaphase spreads were captured from each animal under the 100x objective and a minimum of two optimal quality GTG-banded karyotypes were arranged at the level of 300 bands resolution (Gustavsson, 1988). Following this conventional procedure, we identified 32 constitutional reciprocal chromosome translocations (Table 2).


2.2. Selection and Analysis of Reciprocal Chromosome Translocations Published in the Literature

A comprehensive list of all published G-banded reciprocal chromosome translocations was compiled, starting from a previously published list of 132 reciprocal translocations identified in the pig (see Raudsepp and Chowdhary, 2011). To this list we added the 32 novel reciprocal chromosome translocations identified in our own lab and a further 37 reciprocal translocations not included in the original list, resulting in a total of 201 unique rearrangements (Table 50). Each chromosome rearrangement was independently verified by consulting the original source in order to generate the most accurate list possible.

2.3. Definition of Chromosome Parameters

The physical length in megabases (Mb) for each chromosome was obtained from the Sscrofa 11.1 genome assembly (https://www.ncbi.nlm.nih.gov/genome/84). Using the physical lengths of chromosomes as a basis, the lengths of cytogenetic bands were estimated as follows: The standard GTG-banded ideogram and chromosome landmarks of the domestic pig karyotype were used as a reference to measure the lengths of each chromosome and the constituent bands, and calculate the fractional lengths of each cytogenetic band per chromosome (Gustavsson, 1988) (Figure S1). The physical length of each band, and their start and stop points were calculated by multiplying the fractional length with the physical length of the chromosome (Table S1). This resulted in a conversion map between cytogenetic bands and their physical length.
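The conversion from fractional band lengths to physical coordinates can be illustrated with a minimal sketch. This is not the procedure used to build Table S1: the band names and fractional lengths below are made-up placeholders, and the function name `build_band_map` is hypothetical. Only the chromosome length (55.983 Mb, the value reported for chromosome 18 in Table 16) comes from this chapter.

```python
def build_band_map(chrom_length_mb, fractional_lengths):
    """Turn ordered (band, fraction) pairs into (band, start_mb, stop_mb) tuples.

    Each band's physical length is its fractional length multiplied by the
    physical length of the chromosome; bands are laid end to end from 0 Mb.
    """
    total = sum(fraction for _, fraction in fractional_lengths)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("fractional band lengths must sum to 1")
    band_map = []
    start = 0.0
    for band, fraction in fractional_lengths:
        stop = start + fraction * chrom_length_mb
        band_map.append((band, start, stop))
        start = stop
    return band_map


# Placeholder bands and fractions (assumptions, not measured ideogram values).
example_fractions = [("band_A", 0.25), ("band_B", 0.45), ("band_C", 0.30)]
example_map = build_band_map(55.983, example_fractions)

for band, start, stop in example_map:
    print(f"{band}: {start:.3f} to {stop:.3f} Mb")
```

Because each band starts where the previous one stops, the last stop coordinate equals the chromosome length, which gives a quick sanity check on measured fractions.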

These measurements were verified by selecting 25 bacterial artificial chromosome (BAC) clones from the literature that were mapped to cytogenetic bands via FISH, as well as physically mapped to exact genetic loci in the genome (Table S2). These cytogenetic bands were then converted by our map to physical positions and compared to the established genomic positions of the probes. Of the 25 cytogenetically mapped probes, 20 fell within the estimated physical positions for their respective cytogenetic bands, and the remaining five fell within an adjacent band. Therefore, it was assumed that the method for estimating pig cytogenetic band lengths is sufficient for rough estimations (within approximately 3 million base pairs).

We defined the translocation frequency of each chromosome segment (i.e., whole chromosome, chromosome arm, cytogenetic band, and groups of chromosomes) as the number of translocation breakpoints per 1 Mb. Translocation frequency was calculated as the number of translocation breakpoints for a given chromosome segment, over the physical length of the chromosome segment, resulting in the number of translocation breakpoints per 1 Mb of chromosome material. The expected number of translocations per chromosome segment was calculated by multiplying the total number of breakpoints by the percent chromosome segment length (Table S1).
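The two quantities defined above can be written out as a short sketch. The function names are hypothetical; the numbers are those reported for chromosome 14 in Table 16, and the total autosomal length of 2265.775 Mb is simply the sum of the 18 chromosome lengths listed in that table.

```python
def translocation_frequency(breakpoints, length_mb):
    """Breakpoints per 1 Mb of chromosome material."""
    return breakpoints / length_mb


def expected_breakpoints(total_breakpoints, segment_length_mb, total_length_mb):
    """Expected count if breakpoints were spread in proportion to length."""
    return total_breakpoints * segment_length_mb / total_length_mb


# Chromosome 14: 40 observed breakpoints over 141.755 Mb (Table 16).
freq_chr14 = translocation_frequency(40, 141.755)

# 396 autosomal breakpoints in total; 2265.775 Mb is the summed autosome length.
expected_chr14 = expected_breakpoints(396, 141.755, 2265.775)

print(f"frequency = {freq_chr14:.4f} per Mb, expected = {expected_chr14:.2f}")
```

The same two functions apply unchanged to chromosome arms, cytogenetic bands, or groups of chromosomes, since only the segment length and its breakpoint count change.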

The standard GTG-banded karyotype of the domestic pig was used to define each cytogenetic band, and their GTG-banding designation (Gustavsson, 1988) (Figure S1). In total there were 267 distinct cytogenetic bands across the 18 autosomal chromosomes. The positions of cytogenetic bands on chromosome arms were defined as proximal, median, and distal according to the position of the band in the top third, middle third, and bottom third of bands, respectively on each chromosome (Gustavsson, 1988; Holmquist, 1992). A list of common fragile sites in the pig genome was used to define which bands had a common fragile site (Ronne, 1995) (Table 52).

2.4 Procuring a List of Genes in the Pig Genome and Estimating Cytogenetic Positions

A list of genes in the pig was obtained from the Ensembl gene annotation of the Sscrofa 11.1 assembly (https://useast.ensembl.org/Sus_scrofa/Info/Annotation/). This list included the chromosome, position, and gene name in the form of an Ensembl gene stable ID, which was translated into the official gene name and symbol using Ensembl BioMart (https://www.ensembl.org/biomart/martview/). Using the physical chromosome positions defined for each gene, we converted this to a cytogenetic position using our cytogenetic conversion map. From this information we could define the number of genes on chromosomes and cytogenetic bands, and calculate the gene density as the number of genes in a chromosome region divided by its length.
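As an illustration of how physical gene positions might be assigned to cytogenetic bands and turned into gene densities, the sketch below uses a binary search over band start coordinates. The band map, gene names, and positions are all hypothetical placeholders, and the rule that a boundary position belongs to the band starting there is an assumption made for the example.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical band map for one chromosome: (band, start_mb, stop_mb), sorted by start.
band_map = [("band_A", 0.0, 14.0), ("band_B", 14.0, 39.2), ("band_C", 39.2, 56.0)]
band_starts = [start for _, start, _ in band_map]


def band_for_position(position_mb):
    """Return the band containing a position (a boundary belongs to the band starting there)."""
    index = bisect_right(band_starts, position_mb) - 1
    if index < 0 or position_mb > band_map[-1][2]:
        raise ValueError("position lies outside the chromosome")
    return band_map[index][0]


# Hypothetical gene start positions in Mb.
gene_positions = {"gene_1": 3.2, "gene_2": 13.9, "gene_3": 20.5, "gene_4": 41.0}

genes_per_band = Counter(band_for_position(pos) for pos in gene_positions.values())

# Gene density = number of genes in the band divided by the band length in Mb.
gene_density = {
    band: genes_per_band[band] / (stop - start) for band, start, stop in band_map
}
print(gene_density)
```

The same lookup could be reused for any feature with a physical coordinate, such as repetitive elements, once a band map is available.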

Following a similar procedure we procured a list of genes in the human genome from the GRCh38.p13 genome assembly (https://useast.ensembl.org/Homo_sapiens/Info/Annotation), and converted the Ensembl stable gene ID to the gene names and symbols using BioMart. The physical positions of cytogenetic bands in the human genome were obtained from the NCBI Genome Decoration Page (https://www.ncbi.nlm.nih.gov/genome/tools/gdp/). Using these positions alongside the physical positions for the genes, we assigned cytogenetic positions to each gene. We then determined which annotated genes were shared between analyses and compared the cytogenetic and physical positions of those shared genes between the pig and human genomes. This resulted in the creation of a comparative map between the pig and human genomes which denoted the homologous cytogenetic positions of genes between the two genomes (Table S3). Using this comparative map, the cytogenetic position of a breakpoint in the pig could be directly compared to its homologous region in the human, showing homologous breakpoint regions between the species.

2.5 Definition of Evolutionary Breakpoints in the Pig Genome

In order to define evolutionary breakpoints in the pig genome we consulted two studies by Lahbib-Mansais et al. (2005; 2006), which defined evolutionary breakpoints on three chromosomes: 2, 16, and 17. We then estimated the positions of evolutionary breakpoints on other chromosomes using our human-pig comparative genome map. Inferred evolutionary breakpoint regions (EBRs) were cytogenetic bands which shared homology with two non-adjacent human cytogenetic bands. Each of the evolutionary breakpoints previously defined on chromosomes 2, 16, and 17 was also inferred as an EBR by this method; we therefore considered it a reasonably accurate method of estimating EBR positions. In total we estimated the positions of an additional 75 cytogenetic bands with proposed EBRs using this method (Table 53).
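The inference rule for EBRs can be sketched as follows. This is only one possible reading of the criterion: human bands are represented here as hypothetical (chromosome, band index) pairs, and two bands are treated as non-adjacent when they lie on different chromosomes or their indices differ by more than one. That adjacency test, the function names, and all band labels are assumptions made for illustration.

```python
from itertools import combinations


def non_adjacent(band_a, band_b):
    """True when two human bands are on different chromosomes or not neighbours."""
    chrom_a, index_a = band_a
    chrom_b, index_b = band_b
    return chrom_a != chrom_b or abs(index_a - index_b) > 1


def infer_ebr_bands(pig_to_human):
    """Return pig bands homologous to at least two non-adjacent human bands."""
    ebr_bands = []
    for pig_band, human_bands in pig_to_human.items():
        pairs = combinations(sorted(human_bands), 2)
        if any(non_adjacent(a, b) for a, b in pairs):
            ebr_bands.append(pig_band)
    return ebr_bands


# Hypothetical comparative map: pig band -> set of homologous human bands.
comparative_map = {
    "pig_band_1": {("human_chr_A", 3), ("human_chr_A", 4)},   # neighbouring bands
    "pig_band_2": {("human_chr_A", 4), ("human_chr_B", 10)},  # two chromosomes
    "pig_band_3": {("human_chr_C", 1), ("human_chr_C", 5)},   # same chromosome, far apart
    "pig_band_4": {("human_chr_B", 7)},                        # single homologous band
}

inferred = infer_ebr_bands(comparative_map)
print(inferred)
```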

2.6 Definition of Repetitive Elements in the Pig Genome

Repetitive elements in the pig genome were retrieved using the RepeatMasker database (http://www.repeatmasker.org/). This generated a list of repetitive elements in the pig genome, along with their chromosome positions, which were converted to cytogenetic positions using our conversion map. This allowed the number of repetitive elements to be determined for each chromosome and cytogenetic band.

2.7 Definition of Hotspots for Rearrangement and Recurrent Rearrangements in the Human Genome

A total of 1997 autosomal breakpoints from constitutional reciprocal translocations in the human genome were obtained from two studies by Warburton (1991), and Cohen et al. (1996). In total 465 breakpoints were obtained from Warburton (1991), and 1530 were obtained from Cohen et al. (1996). The number of breakpoints per chromosome and cytogenetic band was calculated and, using the chromosome and cytogenetic band lengths previously obtained for the human genome, the translocation frequency of chromosomes and cytogenetic bands was calculated. Cytogenetic bands that fell within the top 10% of breakpoints and/or the top 10% of translocation frequency were classified as hotspots for rearrangement in the human genome.
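The hotspot classification can be sketched as the union of two top-10% sets. The band labels, counts, and lengths below are invented for illustration, the function names are hypothetical, and the handling of ties at the cutoff (broken by sort order) is an assumption, since the text does not specify it.

```python
import math


def top_fraction(values, fraction=0.10):
    """Return the keys with the highest values (ties at the cutoff broken by sort order)."""
    n_top = max(1, math.ceil(len(values) * fraction))
    ranked = sorted(values, key=values.get, reverse=True)
    return set(ranked[:n_top])


def classify_hotspots(breakpoints, lengths_mb, fraction=0.10):
    """Bands in the top fraction by breakpoint count and/or translocation frequency."""
    frequency = {band: breakpoints[band] / lengths_mb[band] for band in breakpoints}
    return top_fraction(breakpoints, fraction) | top_fraction(frequency, fraction)


# Invented example: ten bands, so the top 10% is a single band per criterion.
example_breakpoints = {
    "b0": 20, "b1": 3, "b2": 2, "b3": 2, "b4": 1,
    "b5": 1, "b6": 1, "b7": 0, "b8": 0, "b9": 0,
}
example_lengths = {band: 5.0 for band in example_breakpoints}
example_lengths["b0"] = 20.0  # many breakpoints, but a long band
example_lengths["b1"] = 1.0   # few breakpoints, but a very short band

hotspots = classify_hotspots(example_breakpoints, example_lengths)
print(sorted(hotspots))
```

In this toy case "b0" qualifies on breakpoint count and "b1" on translocation frequency, which shows why the two criteria can select different bands.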


2.8. Statistical Analysis

Statistical analysis was performed in R 3.5.1 (R Core Team). Spearman’s rank correlation coefficient was applied in order to determine the presence of an association between two variables. The Chi-square test was applied in order to determine a statistical difference between the observed and expected frequencies of variables such as translocation breakpoint number. The Student’s t-test was used in order to determine if the means of two groups were significantly different from one another. One-way analysis of variance (ANOVA) was used to determine if the means of three or more groups were significantly different from one another. The Poisson distribution was used to determine if translocation breakpoints occurred on cytogenetic bands independently of one another.
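For readers unfamiliar with the rank correlation used throughout this chapter, the following is a small standard-library sketch of Spearman’s coefficient, computed as the Pearson correlation of average ranks. It is not the R code used for the analyses, the function names are hypothetical, and the input values are toy data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank for the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks of x and y."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rank_x) / n, sum(rank_y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    var_x = sum((a - mean_x) ** 2 for a in rank_x)
    var_y = sum((b - mean_y) ** 2 for b in rank_y)
    return covariance / (var_x * var_y) ** 0.5


# Toy data: a perfectly monotonic increasing and decreasing relationship.
rho_up = spearman_rho([1, 2, 3, 4], [10, 20, 30, 40])
rho_down = spearman_rho([1, 2, 3, 4], [4, 3, 2, 1])
print(rho_up, rho_down)
```

Because only ranks enter the calculation, the coefficient is insensitive to the large differences in scale between, for example, chromosome length in Mb and breakpoint counts.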

Results

A total of 201 constitutional reciprocal translocations were considered for analysis (Table 50). Of these 201 translocations, 195 were between autosomal chromosomes. The small number of translocations involving the sex chromosomes resulted in sex chromosomes being analysed separately from autosomes. Across the 201 reciprocal translocations there were 364 defined cytogenetic breakpoints on autosomal cytogenetic bands. Using this information we performed a comprehensive interrogation of the pig genome in relation to these breakpoints in order to better understand if any factors may influence or promote rearrangement in the pig genome.

3.1 Analysis of Reciprocal Translocations

The Distribution of Rearrangements between Autosomal Chromosomes

There are 153 possible combinations of autosomal reciprocal translocations in the pig genome. We mapped the 195 autosomal translocations, creating a table of the number of rearrangements between each chromosome pair (Table 14). Just under 2/3 (101 of 153) of all possible combinations have been observed, with several rearrangements being repeated multiple times. The most frequently observed translocations were t(1;6), t(1;11), t(1;14), t(1;15), t(10;13), and t(12;14), each of which has been observed at least five times. We attempted to fit our observations to a Poisson distribution and found a poor fit, largely because translocations with six or more observations were found in far greater numbers than expected (X2 = 46.385; p < 0.0001; Chi-square test; Table 15). The translocations in that category were t(1;6), t(1;14), t(10;13), and t(12;14), each of which involved one of the four longest chromosomes in the pig genome, chromosomes 1, 6, 13, and 14. Comparing the observed number of each rearrangement to the combined physical length of the chromosomes involved demonstrated a significant correlation, indicating that longer chromosomes tend to rearrange with one another more frequently (r = 0.461, p < 0.0001; Spearman’s correlation coefficient; Figure 1a).

Table 14: Distribution of Observations for each Possible Autosome-Autosome Translocation


Table 15: Poisson Distribution of the Number of Observations per Translocation

Number of Rearrangements     Number of Chromosome Pairs
Between Two Chromosomes      Observed    Expected^a    X2 Value
0                            48          42.59         0.687
1                            57          54.65         0.101
2                            27          34.87         1.776
3                            11          14.76         0.958
4                            4           4.66          0.093
5                            2           1.17          0.588
6+                           4           0.3           46.385

^a Based on a Poisson Distribution with m = 1.27 and n = 153
X2 = 50.588, d.f = 6, p < 0.0001
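The goodness-of-fit comparison in Table 15 can be approximated with a short sketch. It takes the observed counts and the rounded mean (m = 1.27) from the table, pools the upper tail into the "6+" class, and sums the chi-square contributions. Because the footnote reports m to two decimals only, the expected values and total computed here will not match the table to the last digit; the variable names are hypothetical and no p-value is computed.

```python
import math

# Observed number of chromosome pairs with k recorded translocations (Table 15).
observed = {0: 48, 1: 57, 2: 27, 3: 11, 4: 4, 5: 2}
observed_6_plus = 4
n_pairs = 153
mean_per_pair = 1.27  # rounded value from the table footnote


def poisson_pmf(k, m):
    """Probability of exactly k events under a Poisson distribution with mean m."""
    return math.exp(-m) * m ** k / math.factorial(k)


expected = {k: n_pairs * poisson_pmf(k, mean_per_pair) for k in observed}
# Everything not in classes 0..5 is pooled into the "6+" class.
expected_6_plus = n_pairs - sum(expected.values())

chi_terms = {k: (observed[k] - expected[k]) ** 2 / expected[k] for k in observed}
chi_terms["6+"] = (observed_6_plus - expected_6_plus) ** 2 / expected_6_plus
chi_square = sum(chi_terms.values())

for label, term in chi_terms.items():
    print(f"{label}: chi-square contribution = {term:.3f}")
print(f"total chi-square = {chi_square:.2f}")
```

As in the table, almost all of the lack of fit comes from the pooled "6+" class, where four pairs are observed against an expectation well below one.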


Despite the overall relationship between rearrangement number and length, many translocations were not observed in proportion to the combined length of the chromosomes involved; specifically, the translocations t(10;13) and t(12;14) had exceptionally high X2 values (X2 = 23.98 and X2 = 14.94, respectively), indicating that they were observed more frequently than expected. Indeed, fitting these data to a linear regression model shows that while length is significantly associated with rearrangement number, it explains just a small percentage of the variation between rearrangements (p < 0.0001; One-way ANOVA). We thus considered how chromosome organization may influence the number of rearrangements. Chromosomes are known to occupy defined space within the nucleus, with gene density proposed to influence this organization. Comparing either the combined gene density of the two chromosomes, or the difference in gene density between them, to the number of rearrangements showed no evidence of gene density influencing rearrangement number (r = -0.038, p = 0.643; Spearman’s correlation test; Figure 1b; r = 0.008, p = 0.918; Spearman’s correlation test; Figure 1c). Thus, gene-rich chromosomes, gene-poor chromosomes, and chromosomes with similar gene densities showed no more affinity for one another than would be expected by chance.


Figure 1: The Length of Chromosomes but not the Gene Density is Related to the Number of Observed Instances of each Translocation. (a) Scatterplot comparing the number of observed instances of each translocation and the combined physical length of the chromosomes. Spearman’s correlation coefficient was used to determine if there is a relationship between the two variables, r = Spearman’s correlation coefficient, p = numerical representation that the result was seen by chance, n = number of chromosome pairs considered in the analysis. (b) Scatterplot comparing the combined gene densities of chromosomes for each translocation with the number of observations per translocation. Spearman’s correlation coefficient was used as described above. (c) Scatterplot comparing the difference in gene density between the chromosomes for each translocation with the number of observations per translocation. Spearman’s correlation coefficient was used as described above.

3.2 Analysis of Reciprocal Translocation Breakpoints on Individual Chromosomes

The Impact of Chromosome Length on Translocation Frequency

From the 201 reciprocal translocations included for analysis in this study we mapped 396 breakpoints to the 18 autosomal chromosomes. The number of breakpoints per chromosome ranged between 8 and 44 (Table 16). The chromosomes with the highest number of breakpoints were chromosomes 1, 7, 13, 14, and 15, while the chromosomes with the fewest breakpoints were chromosomes 10, 11, 16, 17, and 18. Longer chromosomes typically appeared to have more breakpoints than shorter chromosomes. Comparing chromosome length and breakpoint number identified a strong correlation between the two, indicating that breakpoints preferentially occurred on longer chromosomes (r = 0.797, p = 7 × 10^-5; Spearman’s correlation coefficient; Figure 2a). Many chromosomes did not break in direct proportion to their length, however, exhibiting large differences between the observed and expected breakpoint number (X2 = 30.388; p = 0.024; Chi-square test; Table 16).

Table 16: Distribution of Chromosome Breakpoints on Autosomal Chromosomes

Chromosome  Physical     Observed     Expected     X2 Value  Fold Change  Translocation
            Length (Mb)  Breakpoints  Breakpoints            (Obs/Exp)    Frequency
1           274.331      44           47.95        0.325     0.9176       0.1604
2           151.936      19           26.55        2.147     0.7156       0.1251
3           132.849      19           23.22        0.767     0.8183       0.143
4           130.911      20           22.88        0.363     0.8741       0.1528
5           104.526      19           18.27        0.029     1.04         0.1818
6           170.844      27           29.86        0.274     0.9042       0.158
7           121.844      29           21.3         2.784     1.3615       0.238
8           138.966      17           24.29        2.188     0.6999       0.1223
9           139.512      17           24.38        2.234     0.6973       0.1219
10          69.359       16           12.12        1.242     1.3201       0.2307
11          79.17        14           13.84        0.002     1.0116       0.1768
12          61.603       17           10.77        3.604     1.5785       0.276
13          208.335      31           36.41        0.804     0.8514       0.1488
14          141.755      40           24.78        9.348     1.6142       0.2822
15          140.413      33           24.54        2.917     1.3447       0.235
16          79.944       12           13.97        0.278     0.859        0.1501
17          63.494       14           11.1         0.758     1.2613       0.2205
18          55.983       8            9.78         0.324     0.818        0.1429

X2 = 30.388, d.f = 17, p = 0.024
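The columns of Table 16 can be recomputed from the chromosome lengths and observed breakpoint counts alone, as the sketch below illustrates. Expected counts are taken as proportional to physical length. The table appears to derive its X2 terms from expected values rounded to two decimals, so the total obtained here differs slightly in the second decimal place; the variable names are hypothetical and the p-value is not computed.

```python
# Physical length (Mb) and observed breakpoints for autosomes 1..18 (Table 16).
lengths_mb = [
    274.331, 151.936, 132.849, 130.911, 104.526, 170.844,
    121.844, 138.966, 139.512, 69.359, 79.17, 61.603,
    208.335, 141.755, 140.413, 79.944, 63.494, 55.983,
]
observed = [44, 19, 19, 20, 19, 27, 29, 17, 17, 16, 14, 17, 31, 40, 33, 12, 14, 8]

total_length = sum(lengths_mb)
total_observed = sum(observed)

# Expected breakpoints if breakage were proportional to chromosome length.
expected = [total_observed * length / total_length for length in lengths_mb]

chi_terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(chi_terms)

fold_change = [o / e for o, e in zip(observed, expected)]
frequency = [o / length for o, length in zip(observed, lengths_mb)]

# Chromosome contributing most to the chi-square statistic (1-based number).
top_chromosome = chi_terms.index(max(chi_terms)) + 1

print(f"total chi-square = {chi_square:.2f} on {len(observed) - 1} d.f.")
print(f"largest contribution: chromosome {top_chromosome}")
```

Note that fold change and translocation frequency differ only by the constant factor total_observed / total_length, which is why the two columns rank the chromosomes identically.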


Of the 18 autosomes, 14 had at least a 10% difference between observed and expected breakpoint number. Chromosomes 7, 10, 12, 14, and 15 had the largest fold increase, while chromosomes 2, 3, 8, 9, and 18 had the largest fold decrease. We used the translocation frequency of chromosomes, taking the number of breakpoints over the physical length, to better describe the frequency of rearrangement for each chromosome (Table 16). The translocation frequency of chromosomes was directly proportional to the fold change between observed and expected breakpoints, so chromosomes with higher fold increases correspondingly had higher translocation frequencies. The translocation frequency ranged between 0.1219 per Mb and 0.2822 per Mb, with an average of 0.1815 per Mb, and was found to be independent of chromosome length (r = -0.224, p = 0.372; Spearman’s correlation coefficient; Figure 2b). The translocation frequency of chromosomes better represents the susceptibility of chromosomes to rearrangement by placing them on the same scale, and indicates that factors beyond length influence the frequency of rearrangement on each chromosome.

[Figure 2 plots. (a) Number of Breakpoints vs. Physical Chromosome Length (Mb): r = 0.797, p = 7 × 10^-5, n = 18. (b) Translocation Frequency vs. Physical Chromosome Length (Mb): r = -0.224, p = 0.372, n = 18. (c) Number of Breakpoints and Translocation Frequency by Chromosome (1 to 18).]

Figure 2: Physical chromosome length is associated with the number of breakpoints on chromosomes, but not the translocation frequency. (a) Scatterplot comparing the number of breakpoints to the physical length of each chromosome. Spearman’s correlation coefficient was used to determine if there is a relationship between the two variables, r = Spearman’s correlation coefficient, p = numerical representation that the result was seen by chance, n = number of chromosomes considered in the analysis. (b) Scatterplot comparing the translocation frequency and physical length of each chromosome. Spearman’s correlation coefficient was used as described above. (c) Bar graph showing the number of breakpoints on each chromosome in black, primary y-axis, and the translocation frequency of each chromosome in grey, secondary Y-axis.

Impact of Gene Density on Chromosome Translocation Frequency

Selection has been proposed to act against rearrangements on gene-rich chromosomes, so we tested whether this was the case in the pig genome. Comparing the gene density of chromosomes to translocation frequency failed to find a significant correlation between the two metrics (r = 0.108, p = 0.669; Spearman’s correlation coefficient; Figure 3). Indeed, dividing chromosomes into three equal groups based on gene density, and comparing the translocation frequencies of the chromosomes in each group, showed no significant difference between them (p = 0.677; One-way ANOVA; Table 17). As such, there appears to be no selection for or against rearrangement in gene-dense chromosomes.


Figure 3: Gene Density is not related to Chromosome Translocation Frequency. A scatterplot comparing the gene density per Mb of chromosomes to the translocation frequency per Mb. Spearman’s correlation coefficient was used to determine if there is a relationship between the two variables, r = Spearman’s correlation coefficient, p = numerical representation that the result was seen by chance, n = number of chromosomes considered in the analysis. (Plot annotation: r = 0.108, p = 0.669, n = 18; x-axis: Gene Density per Mb; y-axis: Translocation Frequency.)

Table 17: Gene Density per Chromosome

Chromosome  Translocation  Gene Density  Group
            Frequency
1           0.1604         8.83          Low
2           0.1251         17.66         High
3           0.143          13.59         High
4           0.1528         12.6          Medium
5           0.1818         13.95         High
6           0.158          15.94         High
7           0.238          16.23         High
8           0.1223         7.89          Low
9           0.1219         12.29         Medium
10          0.2307         9.57          Medium
11          0.1768         7.06          Low
12          0.276          23.83         High
13          0.1488         8.84          Low
14          0.2822         11.76         Medium
15          0.235          7.88          Low
16          0.1501         7.36          Low
17          0.2205         12.93         Medium
18          0.1429         11.77         Medium


The Impact of Length on Chromosome Arm Translocation Frequency

The pig karyotype is notable for displaying a variety of chromosome morphologies, with chromosomes 1 through 12 displaying distinct chromosome arms (bi-armed chromosomes). We mapped the breakpoints on each of these chromosomes to their respective chromosome arms in order to determine if breakpoint number was related to chromosome arm length. Unsurprisingly, we found a distinct relationship between the length of chromosome arms and the number of breakpoints, with the longest chromosome arms having the most breakpoints (r = 0.614, p = 0.0014; Spearman’s correlation coefficient; Figure 4a). Generally, chromosome arms did not rearrange in proportion to their length, with 19 of the 24 chromosome arms having at least a 10% difference between observed and expected breakpoints (X2 = 37.9, p = 0.026; Chi-square test; Table 18).

Table 18: The Distribution of Translocation Breakpoints on Chromosome Arms

Chromosome  Physical  Observed     Expected     X2 Value  Translocation
Arm         Length    Breakpoints  Breakpoints            Frequency
1p          90.048    12           14.6         0.463     0.1333
2p          53.556    10           8.68         0.201     0.1867
3p          49.507    9            8.03         0.117     0.1818
4p          47.558    5            7.71         0.953     0.1051
5p          43.588    6            7.07         0.162     0.1377
6p          48.264    10           7.82         0.608     0.2072
7p          36.398    3            5.9          1.425     0.0824
8p          57.667    9            9.35         0.013     0.1561
9p          62.17     10           10.08        0.001     0.1608
10p         32.049    4            5.2          0.277     0.1248
11p         36.45     6            5.91         0.001     0.1646
12p         27.224    3            4.41         0.451     0.1102
1q          182.052   26           29.51        0.417     0.1428
2q          97.052    6            15.73        6.019     0.0618
3q          81.299    9            13.18        1.326     0.1107
4q          81.347    14           13.19        0.05      0.1721
5q          59.272    12           9.61         0.594     0.2025
6q          120.938   15           19.61        1.084     0.124
7q          83.493    23           13.54        6.609     0.2755
8q          79.19     7            12.84        2.656     0.0884
9q          75.952    6            12.31        3.234     0.079
10q         36.429    10           5.91         2.83      0.2745
11q         40.953    6            6.64         0.062     0.1465
12q         32.889    12           5.33         8.347     0.3649

X2 = 37.9, d.f = 23, p = 0.026

[Figure 4 plots. (a) Number of Breakpoints vs. Physical Chromosome Arm Length (Mb): r = 0.614, p = 0.0014, n = 24. (b) Translocation Frequency vs. Physical Chromosome Arm Length (Mb): r = -0.153, p = 0.475, n = 24.]

Figure 4. Physical length of chromosome arms is associated with breakpoint number but not the translocation frequency of chromosome arms. (a) Scatterplot comparing the number of breakpoints to the physical length of each chromosome arm. Spearman’s correlation coefficient was used to determine if there is a relationship between the two variables. r = Spearman’s correlation coefficient, p = numerical representation that the result was seen by chance, and n = number of chromosome arms included in the analysis. (b) Scatterplot comparing the translocation frequency and physical length of each chromosome arm. Spearman’s correlation coefficient was used as described above.

Given this discrepancy we calculated the translocation frequency for each chromosome arm in order to better describe the susceptibility of each arm to rearrangement. The translocation frequency of chromosome arms was more variable than that of whole chromosomes, ranging between 0.0618 per Mb and 0.3649 per Mb. Translocation frequency was directly proportional to the fold change between observed and expected breakpoint number. Interestingly, translocation frequency was not consistent between the short and long arms of the same chromosome, with the more frequently translocating arm rearranging 2.02x more frequently on average (Table 19). This difference was also apparently independent of chromosome arm length, with the short arm having a higher translocation frequency than the long arm in six of the twelve bi-armed chromosomes (r = -0.153, p = 0.475; Spearman’s correlation test; Figure 4b). Notably, chromosomes whose short arms had higher translocation frequencies than their long arms were typically those with lower overall translocation frequencies, demonstrating that a low translocation frequency in a longer chromosome region leads to a lower overall number of breakpoints.

Table 19: Comparison of the Translocation Frequency between Short and Long Chromosome Arms

Chromosome  Short Arm      Long Arm       Fold Change     Fold Change
            Translocation  Translocation  Between Long    Between Short
            Frequency      Frequency      and Short Arms  and Long Arms
1           0.1333         0.1428         1.0713          0.9335
2           0.1867         0.0618         0.331           3.021
3           0.1818         0.1107         0.6089          1.6423
4           0.1051         0.1721         1.6375          0.6107
5           0.1377         0.2025         1.4706          0.68
6           0.2072         0.124          0.5985          1.671
7           0.0824         0.2755         3.3434          0.2991
8           0.1561         0.0884         0.5663          1.7658
9           0.1608         0.079          0.4913          2.0354
10          0.1248         0.2745         2.1995          0.4546
11          0.1646         0.1465         0.89            1.1235
12          0.1102         0.3649         3.3113          0.302
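The summary statistics quoted for Table 19 can be recomputed from the arm-level translocation frequencies. The sketch below takes, for each bi-armed chromosome, the ratio of the more frequently translocating arm to the less frequently translocating arm, averages those ratios, and lists the chromosomes whose short arm has the higher frequency. The variable names are hypothetical; the frequencies are the rounded values from the table.

```python
# (short arm frequency, long arm frequency) per bi-armed chromosome, from Table 19.
arm_frequency = {
    1: (0.1333, 0.1428), 2: (0.1867, 0.0618), 3: (0.1818, 0.1107),
    4: (0.1051, 0.1721), 5: (0.1377, 0.2025), 6: (0.2072, 0.124),
    7: (0.0824, 0.2755), 8: (0.1561, 0.0884), 9: (0.1608, 0.079),
    10: (0.1248, 0.2745), 11: (0.1646, 0.1465), 12: (0.1102, 0.3649),
}

# Ratio of the more active arm to the less active arm on each chromosome.
higher_over_lower = [
    max(short, long_) / min(short, long_) for short, long_ in arm_frequency.values()
]
mean_ratio = sum(higher_over_lower) / len(higher_over_lower)

# Chromosomes on which the short arm translocates more frequently than the long arm.
short_arm_higher = [
    chromosome for chromosome, (short, long_) in arm_frequency.items() if short > long_
]

print(f"mean fold difference between arms = {mean_ratio:.2f}")
print(f"short arm higher on chromosomes {short_arm_higher}")
```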


The Role of Gene Density in the Differential Translocation Frequency of Chromosome Arms

Although gene density was unrelated to translocation frequency across whole chromosomes, we considered whether it influenced the translocation frequency of chromosome arms. The long and short arms of each chromosome often had different gene densities, but this had no relationship with the translocation frequency of those chromosome arms (r = 0.171; p = 0.424; Spearman’s correlation coefficient; Figure 5). Therefore, we again find no evidence supporting selection against rearrangement in gene-dense chromosome regions.

Figure 5: Gene Density of Chromosome Arms has no Relationship with Translocation Frequency. A scatterplot comparing the gene density of chromosome arms with the translocation frequency. Spearman’s correlation coefficient was used to determine if there is a relationship between the two variables. r = Spearman’s correlation coefficient, p = numerical representation that the result was seen by chance, and n = number of chromosome arms included in the analysis. (Plot annotation: r = 0.171, p = 0.424, n = 24; x-axis: Gene Density; y-axis: Translocation Frequency.)

The Influence of Chromosome Arm Morphology on Translocation Frequency

Porcine chromosomes may be divided into four categories based on the length of their chromosome arms: metacentric, sub-metacentric, acrocentric, and telocentric. These groups of chromosomes were found to generally rearrange in proportion to their lengths as a whole, with no group having significantly more rearrangements than expected (X2 = 5.504, p = 0.138; Chi-square test; Table 20). The fold change between observed and expected breakpoints was quite low overall, and the translocation frequencies of the groups were quite similar, ranging between 0.1523 per Mb and 0.2 per Mb. Comparing the translocation frequencies of the chromosomes between the groups found no influence of centromere position (p = 0.557; One-way ANOVA).

Table 20: The Distribution of Translocation Breakpoints by Chromosome Arm Morphology

Chromosome       Physical  Observed     Expected     X2 Value  Fold Change  Translocation
Arms             Length    Breakpoints  Breakpoints            (Obs/Exp)    Frequency
Metacentric      488.61    81           85.4         0.227     0.9485       0.1658
Sub-Metacentric  794.553   121          138.87       2.3       0.8713       0.1523
Acrocentric      292.688   56           51.15        0.46      1.0948       0.1913
Telocentric      689.924   138          120.58       2.517     1.1445       0.2

X2 = 5.504, d.f = 3, p = 0.138

3.3 Analysis of the Translocation Frequency on Cytogenetic Bands

Failing to identify features of chromosomes and chromosome arms associated with translocation frequency, we looked at the breakpoints of each chromosome at the level of cytogenetic bands in order to better understand their distribution on chromosomes. There are 267 distinct cytogenetic bands across the 18 porcine autosomes, with the number of bands per chromosome being roughly proportionate to the length. Cytogenetic bands serve as an appropriate way to divide chromosomes into more manageable segments for analysis, typically 3 to 15 Mb in length.

Of the 201 reciprocal translocations, and 402 total breakpoints, 364 were described on specific cytogenetic bands. We mapped each breakpoint onto ideograms of the standard GTG-banded pig karyotype at the level of 300 bands (Figure 6). Rearrangement breakpoints appeared non-randomly distributed across the ideograms, with several bands having five or more rearrangements, while many bands had none at all. The bands with the most rearrangements are 1q17 and 13q21, with seven breakpoints apiece, 9p24 and 13q41, with eight breakpoints each, and 15q13, with eleven breakpoints. Despite the re-use of these bands for the generation of translocation breakpoints, not a single reciprocal translocation involving these bands is recurrent; all eleven rearrangement partners for 15q13, for example, were distinct cytogenetic bands (1p25, 1q2.11, 3q27, 6p15, 6q33, 7q13, 9p24, 10p15, 11p15, 14q28, and 17q21).

Given this uneven distribution of rearrangement breakpoints across bands we considered whether it fit a Poisson distribution. Overall, we found that the distribution of breakpoints fit poorly to a Poisson distribution (X2 = 299.41, p < 0.00001; Chi-square test; Table 21). More bands than expected had no breakpoints, while fewer bands than expected had 1 or 2 breakpoints. Meanwhile, far more bands than expected had 4 or more breakpoints. This indicated that a degree of clustering of breakpoints was present in the pig genome, with particular cytogenetic bands appearing more susceptible to rearrangement. Indeed, we found that just 12.7% of bands (those with four or more breakpoints) had 48.6% of all breakpoints. Overall, it appeared that breakpoints were distributed non-randomly, with significant re-use of specific cytogenetic breakpoints in the pig genome.

Table 21: The Distribution of Breakpoints per Cytogenetic Band

Breakpoints  Number of Cytogenetic Bands
per Band     Observed    Expected^a    X2 Value
0            120         68.14         39.47
1            59          93.23         12.57
2            34          63.61         13.78
3            20          28.85         2.71
4            15          9.79          2.77
5            8           2.65          10.8
6            6           0.6           48.6
7+           5           0.14          168.71

^a Based on a Poisson Distribution with m = 1.36 and n = 267
X2 = 299.41, d.f = 7, p < 0.0001
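The clustering figures quoted alongside Table 21 can be checked directly from the observed band counts. In the sketch below, the five bands in the "7+" class are given the individual breakpoint counts stated in the text (seven for 1q17 and 13q21, eight for 9p24 and 13q41, eleven for 15q13); the variable names are hypothetical.

```python
# Number of bands observed with exactly k breakpoints, k = 0..6 (Table 21).
bands_by_count = {0: 120, 1: 59, 2: 34, 3: 20, 4: 15, 5: 8, 6: 6}
# Breakpoint counts of the five bands in the "7+" class, as listed in the text.
top_band_counts = [7, 7, 8, 8, 11]

total_bands = sum(bands_by_count.values()) + len(top_band_counts)
total_breakpoints = (
    sum(k * n for k, n in bands_by_count.items()) + sum(top_band_counts)
)
mean_per_band = total_breakpoints / total_bands

# Bands with four or more breakpoints and the breakpoints they carry.
clustered_bands = (
    sum(n for k, n in bands_by_count.items() if k >= 4) + len(top_band_counts)
)
clustered_breakpoints = (
    sum(k * n for k, n in bands_by_count.items() if k >= 4) + sum(top_band_counts)
)

share_of_bands = clustered_bands / total_bands
share_of_breakpoints = clustered_breakpoints / total_breakpoints

print(f"mean breakpoints per band = {mean_per_band:.2f}")
print(f"{share_of_bands:.1%} of bands carry {share_of_breakpoints:.1%} of breakpoints")
```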


Figure 6: Ideogram of the domestic pig karyotype with important cytogenetic markers displayed. An ideogram of the standard GTG(G-banding using trypsin and giemsa)-banded karyotype of the domestic pig is displayed. Diamonds represent breakpoints on cytogenetic bands. Arrows indicate bands with known common fragile sites. Cytogenetic bands on each chromosome are pointed out, usually those bands with the most breakpoints, to help the positional context of each chromosome. Centromeres are denoted with red horizontal lines in bi-armed and acrocentric chromosomes.


Influence of the Length of Cytogenetic Bands on Translocation Frequency

Given that the length of whole chromosomes is associated with breakpoint number, we tested whether this was also the case at the cytogenetic band level. Unsurprisingly, we found a weak yet significant relationship, indicating that longer cytogenetic bands typically had more breakpoints than shorter bands (r = 0.31, p = 0; Spearman’s correlation coefficient; Figure 7). It could clearly be seen, however, that the number of breakpoints was not directly proportional to the length of bands, with many of the longest cytogenetic bands having no breakpoints at all, or just one breakpoint. Given the disparity between band lengths, we calculated the translocation frequency (breakpoints per Mb) of cytogenetic bands to place them on the same scale. In contrast to the chromosome level, there was a weak relationship between band length and translocation frequency (r = 0.124, p = 0.043; Spearman correlation coefficient). This relationship appeared at least partially driven by the disparity in length and breakpoint number between bands, with many long bands having no breakpoints, and many short bands having one or two breakpoints.

[Figure 7 scatterplot. y-axis: Number of Breakpoints; x-axis: Physical Chromosome Length (Mb); r = 0.31, p = 0, n = 267]

Figure 7. Physical cytogenetic band length is associated with the number of translocation breakpoints. (a) Scatterplot comparing the number of breakpoints to the physical length of each cytogenetic band. Spearman’s correlation coefficient was used to determine if there is a relationship between the two variables. r = Spearman’s correlation coefficient, p = probability that the result was seen by chance, and n = number of cytogenetic bands included in the analysis.


Influence of Relative Chromosome Arm Cytogenetic Band Position on Translocation Frequency

Several studies have suggested that chromosomes break more frequently in centromeric and telomeric regions, thus we organized bands by chromosome arm position into three groups (proximal, medial, and distal) comprising roughly equal band numbers and total physical lengths. There were fewer bands in the medial group (78, compared to 91 and 98 in the proximal and distal groups); however, bands in the medial group tended to be longer, thus the medial group had the highest total length. The number of breakpoints was roughly equal between the groups, with no group deviating significantly from the expected number of breakpoints (X2 = 0.218, p = 0.897; Chi-square test; Table 22). Comparing the translocation frequencies of the bands within each group again failed to find a significant difference between them (p = 0.811; One-way ANOVA). Thus, the translocation frequency of bands is not dependent on their position on the chromosome arm.

Table 22: The Distribution of Breakpoints by Relative Position on Chromosome Arms

Position    Number of    Physical    Observed       Expected       X2 Value    Translocation
            Bands        Length      Breakpoints    Breakpoints                Frequency
Proximal    91           739.493     118            119.89         0.03        0.1596
Medial      78           778.896     124            126.27         0.041       0.1592
Distal      98           726.879     122            117.84         0.147       0.1678

X2 = 0.218, d.f = 2, p = 0.897
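The expected breakpoint counts in Table 22 are consistent with distributing the total number of breakpoints across groups in proportion to their physical length. A minimal sketch of that calculation is given below; the function name is illustrative, and the p value is left out because only the standard library is used.

```python
def length_weighted_chi_square(observed, lengths_mb):
    """Chi-square statistic with expected counts proportional to physical length.

    observed   -- breakpoints observed in each group
    lengths_mb -- total physical length of each group (Mb)
    """
    total_observed = sum(observed)
    total_length = sum(lengths_mb)
    expected = [total_observed * length / total_length for length in lengths_mb]
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi_square


# Proximal, medial and distal groups (Table 22).
observed = [118, 124, 122]
lengths_mb = [739.493, 778.896, 726.879]

expected, chi_square = length_weighted_chi_square(observed, lengths_mb)
print([round(e, 2) for e in expected], round(chi_square, 3))
```

The same function can be applied to any grouping of bands for which observed breakpoints and physical lengths are tabulated, such as the G-banding and fragile site groups in Tables 23, 26, and 28.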

The Influence of GTG-banding, or Relative Chromatin Density on Translocation Frequency

Observation of the breakpoints mapped to ideograms of GTG-banded chromosomes (Figure 6) showed that most breakpoints tended to appear on the lighter G-negative bands. Dividing cytogenetic bands into G-negative and G-positive groups, we attempted to determine if G-negative bands were truly more susceptible to rearrangement. G-negative bands as a group had 6x more breakpoints than G-positive bands, despite having just 24% more bands. It was therefore unsurprising that we found a significant difference between the observed and expected breakpoint number in each group, as G-negative bands had far more breakpoints than expected, while G-positive bands had far fewer (X2 = 125.26, p < 0.00001; Chi-square test; Table 23).

Table 23: The Distribution of Translocation Breakpoints on GTG-bands

G-banding     Number of    Physical     Observed       Expected       X2 Value    Translocation
              Bands        Length       Breakpoints    Breakpoints                Frequency
G-negative    148          1271.776     312            206.18         54.311      0.2453
G-positive    119          973.492      52             157.82         70.953      0.0534

X2 = 125.264, d.f = 1, p < 1 x 10^-5

In order to place each band on the same scale we compared the translocation frequencies of the bands of each group. This revealed a significant difference between the translocation frequencies of G-positive and G-negative bands (t = 8.41, p < 0.00001; Student’s t-test). G-positive bands had a low translocation frequency as a group, while G-negative bands had an exceptionally high translocation frequency of 0.2453 per Mb, 4.6x higher than G-positive bands. This is a consistent trend across the pig genome, with all but one chromosome having a G-negative translocation frequency at least 2x higher than that of G-positive bands (Table 24).
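As a worked illustration of the group-level translocation frequency, the sketch below divides the observed breakpoints by the physical length (Mb) of each group in Table 23 and takes the ratio of the two frequencies. The helper name is illustrative only.

```python
def translocation_frequency(breakpoints, length_mb):
    """Breakpoints per Mb of chromosome material."""
    return breakpoints / length_mb


# Group totals from Table 23.
freq_g_negative = translocation_frequency(312, 1271.776)
freq_g_positive = translocation_frequency(52, 973.492)
fold_change = freq_g_negative / freq_g_positive

print(round(freq_g_negative, 4), round(freq_g_positive, 4), round(fold_change, 1))
```

The per-chromosome fold changes in Table 24 follow the same ratio, computed from the G-negative and G-positive bands of each chromosome separately.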

Table 24: Translocation Frequency of the G-positive and G-negative Bands of each Chromosome

Chromosome    Translocation Frequency    Translocation Frequency    Fold Change
              G-Positive                 G-Negative                 (G-neg/G-pos)
1             0.0313                     0.2358                     7.53
2             0.0885                     0.1208                     1.36
3             0.0667                     0.1976                     2.96
4             0.0188                     0.2378                     12.65
5             0.095                      0.2104                     2.21
6             0.0645                     0.1959                     3.04
7             0.0605                     0.3273                     5.41
8             0.0201                     0.172                      8.56
9             0.035                      0.1729                     4.94
10            0.0605                     0.3386                     5.6
11            0.0313                     0.242                      7.73
12            0.0413                     0.3897                     9.44
13            0.0257                     0.2833                     11.02
14            0.1223                     0.4192                     3.43
15            0.0444                     0.3432                     7.73
16            0.0911                     0.1915                     2.1
17            0.0553                     0.2863                     5.18
18            0.0804                     0.1929                     2.4

The Influence of Gene Density on Translocation Frequency of Cytogenetic Bands

We compared the gene densities of cytogenetic bands to translocation frequency in order to determine if rearrangements selected against gene-dense bands. Looking at cytogenetic bands as a whole, we failed to find a correlation between gene density and translocation frequency (r = 0.109, p = 0.075; Spearman correlation test; Figure 8). Fitting these data to a linear regression model, however, surprisingly indicated that gene density was indeed associated with the translocation frequency of bands, although not in the way we expected (p = 0.0217; One-way ANOVA). Bands with higher translocation frequencies tended to be more gene dense: bands with translocation frequencies >0.3 per Mb had an average gene density of 9.08 per Mb, compared to 6.63 per Mb for bands with lower translocation frequencies.


Figure 8: Translocation frequency of cytogenetic bands is associated but not correlated with gene density. (a) Scatterplot comparing the translocation frequency to the gene density of each cytogenetic band (r = 0.109, p = 0.075, n = 267). Spearman’s correlation coefficient was used to determine if there is a relationship between the two variables. r = Spearman’s correlation coefficient, p = probability that the result was seen by chance, and n = number of cytogenetic bands included in the analysis.

The Influence of Evolutionary Breakpoint Regions on Translocation Frequency

Mammalian chromosome rearrangement breakpoints known as evolutionary breakpoint regions (EBRs) are thought to be re-used often throughout mammalian evolution (Larkin et al., 2009). Given that these breakpoints are re-used in many species, we considered whether EBRs may still drive rearrangement in the modern pig genome. We performed two analyses comparing rearrangement frequency between bands with EBRs and those without, using an established list of EBRs (Lahbib-Mansais et al., 2005; 2006) as well as our inferred list based on the synteny between human and porcine cytogenetic bands. In both cases we found that bands with EBRs were translocating more frequently than expected (X2 = 7.31, p = 0.0069 and X2 = 13.41, p = 0.00025, respectively; Chi-square test; Table 25). Looking at bands individually, however, revealed that a higher translocation frequency was not a consistent feature, as about half of bands had high translocation frequencies, while the other half had low translocation frequencies. Fitting these data to a linear regression model showed that in each case the presence of an EBR in a cytogenetic band was uninformative to a model of translocation frequency (p = 0.151; p = 0.108; One-way ANOVA). There is thus little evidence suggesting that EBRs are re-used as breakpoints in the modern pig genome any more often than expected by chance.

Table 25: Distribution of Translocation Breakpoints on Defined and Proposed EBR Bands

Category    Bands    Physical    Observed       Expected       X2 Value    Translocation
                     Length      Breakpoints    Breakpoints                Frequency
EBR         9        102.594     23             14.65          4.76        0.2242
Normal      25       191.453     19             27.35          2.55        0.0992

X2 = 7.31, d.f = 1, p = 0.0069

Category    Bands    Physical    Observed       Expected       X2 Value    Translocation
                     Length      Breakpoints    Breakpoints                Frequency
EBR         84       724.791     166            132.39         8.53        0.2033
Normal      183      1520.477    198            231.61         4.88        0.1386

X2 = 13.41, d.f = 1, p = 0.00025

The Influence of Common Fragile Sites on Translocation Frequency

Common fragile sites in the pig genome have long been known to overlap with cytogenetic bands containing breakpoints; however, this has not yet been examined comprehensively (Quach et al., 2016). Cytogenetic bands were grouped into Fragile (containing a common fragile site) and Normal groups (Figure 6). Comparing the observed and expected numbers of breakpoints, we found that fragile bands were rearranging more frequently than expected (X2 = 7.956, p = 0.0048; Chi-square test; Table 26). However, comparing the translocation frequencies of normal and fragile bands showed that fragile bands did not have significantly higher translocation frequencies than normal bands (p = 0.069; One-way ANOVA). Although eight chromosomes had significantly higher translocation frequencies on their fragile bands relative to the normal bands, the other ten chromosomes did not (Table 27). Breaking this down, we found that 24 fragile bands had high translocation frequencies (>0.24 per Mb), while 19 fragile bands had no breakpoints at all. Thus, although some fragile bands had exceptionally high translocation frequencies, this was not a consistent feature of all fragile bands, and there is little evidence that the presence of a fragile site directly promotes rearrangement.

Table 26: The Distribution of Translocation Breakpoints on Bands with Common Fragile Sites

Fragility    Number of    Physical     Observed       Expected       X2 Value    Translocation
             Bands        Length       Breakpoints    Breakpoints                Frequency
Normal       210          1634.322     241            264.95         2.165       0.1475
Fragile      57           610.946      123            99.05          5.791       0.2013

X2 = 7.956, d.f = 1, p = 0.0048

Table 27: Translocation Frequency for the Normal and Fragile Bands of Each Chromosome

Chromosome    Translocation Frequency    Translocation Frequency    Fold Change
              Normal Bands               Fragile Bands              (Fragile/Normal)
1             0.0615                     0.2112                     3.4341
2             0.1231                     0.0667                     0.5418
3             0.1047                     0.2719                     2.5969
4             0.1118                     0.2619                     2.3426
5             0.175                      0                          0
6             0.1748                     0.0912                     0.5217
7             0.2138                     0.2256                     1.0552
8             0.1446                     0.0302                     0.2089
9             0.1267                     0                          0
10            0.1912                     0.349                      1.8253
11            0.1678                     0                          0
12            0.2626                     0                          0
13            0.0467                     0.2885                     6.1777
14            0.2558                     0.3228                     1.2619
15            0.2097                     0.169                      0.8059
16            0.1122                     0.2848                     2.5383
17            0.181                      0.3111                     1.7188
18            0.1443                     0.1388                     0.9619

The Influence of G-banding on the Translocation Frequency of Bands with Common Fragile Sites

Given that G-negative bands were found to have significantly higher translocation frequencies than G-positive bands, we wondered if this interacted with the presence of common fragile sites on bands to influence translocation frequency. We grouped bands as G-negative Normal, G-negative Fragile, G-positive Normal, and G-positive Fragile, and summed the breakpoints and physical lengths of the bands of each group accordingly. Interestingly, we found that each group deviated significantly from the expected number of breakpoints, with both G-negative groups having more breakpoints than expected, while both G-positive groups had fewer breakpoints than expected (X2 = 133.276, p < 0.00001; Chi-square test; Table 28).

Table 28: The Distribution of Translocation Breakpoints by GTG-banding and the Presence of Common Fragile Sites

Category              Number of    Physical    Observed       Expected       X2 Value    Translocation
                      Bands        Length      Breakpoints    Breakpoints                Frequency
G-positive-Normal     96           731.854     38             118.65         54.82       0.0519
G-positive-Fragile    23           241.638     14             39.17          16.17       0.0579
G-negative-Normal     114          902.468     203            146.31         21.965      0.2249
G-negative-Fragile    34           369.308     109            59.87          40.317      0.2951

X2 = 133.276, d.f = 3, p < 0.00001


We next calculated the translocation frequency for each group, showing that both G-positive groups had similarly low translocation frequencies regardless of the presence of a fragile site (p = 0.462; Student’s t-test). In contrast, both G-negative groups had exceptionally high translocation frequencies. Despite a tendency for G-negative Fragile bands to have high translocation frequencies, we found that G-negative Fragile bands did not rearrange more frequently than G-negative Normal bands (p = 0.079; One-way ANOVA). Overall, there was a large disparity between the translocation frequencies of G-negative Fragile bands, with many having very high translocation frequencies, while several had very low translocation frequencies. Thus, there is little evidence that these fragile bands had high translocation frequencies based on more than chance, and the frequent rearrangement of some of these bands is perhaps better considered on an individual basis.

General Chromosome Features are Associated with Cytogenetic Band Translocation Frequency

We performed multiple linear regression analysis to establish how the length of cytogenetic bands, G-banding, fragility, and gene density influence breakpoint number and translocation frequency. All four variables contributed significantly to a linear model of breakpoint number on cytogenetic bands, with G-banding being the best predictor of breakpoint number (p = 8.82 x 10^-16; Table 29). The adjusted R2 value, however, showed that these four variables explained only 32.2% of the variation in breakpoint number amongst cytogenetic bands. Considering the translocation frequency of bands, G-banding and gene density were again found to be significantly associated with translocation frequency, while fragility was not. G-banding and gene density, however, explained only 23.2% of the variation in translocation frequency amongst cytogenetic bands (adjusted R2 = 0.232; Table 29). Thus, G-banding, physical length, gene density, and the presence of fragile sites only moderately explain the number of breakpoints and translocation frequency of cytogenetic bands. It appears that while we can determine a general architecture of bands more susceptible to rearrangement, specific elements of the genomic architecture are more likely to influence individual breakpoints.

Table 29: Multiple Linear Regression Analysis for Translocation Breakpoints and Frequency

Model: Observed Translocation Breakpoints on Cytogenetic Bands

Variable                Coefficient    Standard Error    t Value    p Value
G-Bands                 1.5861         0.185             8.574      8.82 x 10^-16
Physical Length (Mb)    0.138          0.0258            5.354      1.88 x 10^-7
Fragility               -0.5818        0.2373            -2.452     0.0149
Gene Density            0.033          0.0152            2.166      0.0312

Model Summary: N = 267, R2 = 0.3323, Adjusted R2 = 0.3221, p < 2.2 x 10^-16

Model: Translocation Frequency of Cytogenetic Bands

Variable                Coefficient    Standard Error    t Value    p Value
G-Bands                 0.1904         0.0227            8.393      2.96 x 10^-15
Gene Density            0.005          0.0019            2.703      0.0073

Model Summary: N = 267, R2 = 0.2402, Adjusted R2 = 0.2315, p = 1.33 x 10^-15
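The adjusted R2 reported for the breakpoint-number model can be recovered from its R2, the number of bands, and the number of predictors using the standard adjustment. The sketch below is illustrative only and assumes the four predictors listed for that model in Table 29.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Standard adjusted R2: penalises R2 for the number of predictors in the model."""
    residual_df = n_observations - n_predictors - 1
    return 1.0 - (1.0 - r_squared) * (n_observations - 1) / residual_df


# Breakpoint-number model in Table 29: 267 bands, four predictors, R2 = 0.3323.
adj_r2_breakpoints = adjusted_r_squared(0.3323, 267, 4)
print(round(adj_r2_breakpoints, 4))
```

With so many bands relative to predictors the adjustment is small, so the modest explanatory power noted in the text reflects the model fit rather than a penalty for model size.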

Proposal of Hotspots for Rearrangement in the Pig Genome

It is clear from analyzing chromosomes and cytogenetic bands that particular chromosome regions, specifically those with a more open chromatin composition, with additional influence from gene density and the presence of common fragile sites, are more susceptible to rearrangement. We thus sought to propose hotspots for chromosome rearrangement in the pig genome based on breakpoint number and translocation frequency. Starting with the average number of breakpoints per band, 2.4, and the average translocation frequency, 0.29 per Mb, we proposed that any band with at least five breakpoints and/or a translocation frequency of 0.58 per Mb or higher be considered a hotspot for rearrangement in the pig genome. In total, nineteen bands based on number of breakpoints (Table 30), and fifteen bands based on translocation frequency (Table 31), were proposed as hotspots for rearrangement. Six bands were shared between the lists. These bands are derived from a variety of chromosomes and positions. All bands are in G-negative regions, and twelve had a common fragile site. Notably, many bands from shorter chromosomes with high translocation frequencies feature prominently amongst our proposed hotspot bands. For instance, chromosome 12, with just seventeen breakpoints, has three bands amongst our hotspots, indicating the specific bands that appear to drive the high translocation frequency of this chromosome. In total, twenty-eight bands with varied characteristics were proposed as hotspots for rearrangement in the pig genome.
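The two hotspot criteria can be expressed as a simple classification rule. The sketch below is illustrative only: the thresholds are those stated above, the function name is hypothetical, and the three example bands are taken from Tables 30 and 31.

```python
MIN_BREAKPOINTS = 5      # breakpoint-number criterion
MIN_FREQUENCY = 0.58     # translocation-frequency criterion (per Mb)


def classify_band(breakpoints, length_mb):
    """Return (translocation frequency, hotspot by number?, hotspot by frequency?)."""
    frequency = breakpoints / length_mb
    return frequency, breakpoints >= MIN_BREAKPOINTS, frequency >= MIN_FREQUENCY


# (observed breakpoints, physical length in Mb) for three bands from Tables 30 and 31.
bands = {
    "15q13": (11, 14.326),   # listed in both tables
    "10q13": (4, 4.449),     # listed in Table 31 only
    "16q21": (5, 17.556),    # listed in Table 30 only
}

results = {name: classify_band(*values) for name, values in bands.items()}
for name, (frequency, by_number, by_frequency) in results.items():
    print(name, round(frequency, 4), by_number, by_frequency)
```

A band meeting either criterion is treated as a proposed hotspot, which is why short bands with few breakpoints can qualify on frequency alone.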

Table 30: Proposed Hotspots for Rearrangement Based on Breakpoint Number

Cytogenetic    Observed       Physical       Translocation
Band           Breakpoints    Length (Mb)    Frequency
15q13^1        11             14.326         0.7678
9p24^1         8              9.302          0.86
13q41*^1       8              13.603         0.5881
1q17*^1        7              10.738         0.6519
13q21*         7              14.346         0.4879
5q21           6              12.922         0.4643
7q13^1         6              6.851          0.8758
12q13          6              11.864         0.5057
13q31*         6              13.696         0.4381
14q21*         6              15.891         0.3776
17q21*         6              19.289         0.3111
4q21*          5              12.04          0.4153
7q24*          5              9.272          0.5393
8p21           5              16.63          0.3007
8q27           5              11.401         0.4386
12q15^1        5              5.798          0.8624
14q23          5              10.93          0.4575
15q24*         5              11.879         0.4209
16q21*         5              17.556         0.2848

The symbol * denotes cytogenetic bands with a common fragile site, while ^1 denotes bands present in both Tables 30 and 31.

Table 31: Proposed Hotspots for Rearrangement Based on Translocation Frequency

Cytogenetic    Observed       Physical       Translocation
Band           Breakpoints    Length (Mb)    Frequency
10q13          4              4.449          0.8991
11q11          3              3.413          0.879
7q13^1         6              6.851          0.8758
12q15^1        5              5.798          0.8624
9p24^1         8              9.302          0.86
14q13          4              5.14           0.7782
15q13^1        11             14.326         0.7678
4q11           3              4.013          0.7476
6q11           3              4.078          0.7357
15q26          4              5.628          0.7107
1q17*^1        7              10.738         0.6519
2q13           3              4.73           0.6342
12p15          2              3.347          0.5976
13q41*^1       8              13.603         0.5881
14q15*         3              5.114          0.5866

The symbol * denotes cytogenetic bands with a common fragile site, while ^1 denotes bands present in both Tables 30 and 31.

3.4 Comparative Analysis of Rearrangement Hotspots in the Human and Pig Genomes

We sought to determine whether hotspots for rearrangement in the pig genome were homologous to hotspots in the human genome. We took the top 10% of bands in the pig (n = 43) and human (n = 57) genomes for translocation frequency and/or breakpoint number and, using our comparative map (Figure ???), determined if any of these bands were homologous to one another. Just 17 of these bands were homologous to a human rearrangement hotspot, and only twelve of the twenty-eight proposed hotspot bands were found to be homologous. Conspicuously absent were three of the five bands with the highest breakpoint numbers (13q21, 13q41, and 15q13), as well as thirteen of the fifteen bands with the highest translocation frequencies, suggesting that longer bands were more likely to be homologous to a hotspot.

To determine if these hotspot bands were more likely than other bands to be homologous with a human hotspot band, we labelled each band as hotspot or normal, and as homologous or non-homologous. We found that hotspot bands in the pig genome were no more likely than other bands to be homologous to human hotspot bands (p = 0.3749; One-way ANOVA). There is thus little evidence that chromosome regions in the pig that are more susceptible to rearrangement are homologous to similar regions in the human genome, suggesting that the susceptibility of porcine chromosome regions to breakage is due to different factors than in the human genome.

3.5 Analysis of Repetitive Elements in the Pig Genome and the Association with Chromosome and Cytogenetic Band Translocation Frequency

Repetitive elements in mammalian genomes have been associated with rearrangement breakpoints and are thought to promote or mediate rearrangement in some cases. We compared the densities and percent length of the major families of repetitive elements on each chromosome to the translocation frequency. The density of two families, SINE/Other (largely tRNA) and Simple Repeats, showed a positive correlation with translocation frequency, while the percent length of long terminal repeats (LTR) showed a negative correlation with translocation frequency. Sub-dividing these families into classes, we again found that Simple Repeats were positively correlated with translocation frequency, as were tRNA (from the SINE family) and hAT (from the DNA family), while the percent length of ERV1, ERVK, ERVL, and LTR (all classes of LTR) was negatively correlated with translocation frequency (Figures 9a-9g).


Figure 9: Classes of repetitive elements are positively or negatively associated with translocation frequency. (a-d) Scatterplots comparing the percent length of four classes of LTRs (ERV1, ERVK, ERVL, and LTR) with chromosome translocation frequency. (e-g) Scatterplots comparing the density of three classes of repetitive elements (hAT, Simple Repeats, and tRNA) with chromosome translocation frequency.

We attempted to incorporate these classes of repeats into a multiple linear regression model. We began with the multiple classes of LTR, finding that these classes demonstrated collinearity, rendering any model incorporating multiple LTR classes uninformative towards a model of chromosome translocation frequency (Table 32). With this in mind, we next created a multiple linear regression model consisting of tRNA, Simple Repeats, hAT, and LTR, excluding ERV1, ERVK, and ERVL. The inclusion of these classes as predictor variables again showed multicollinearity, making the estimated contribution of any individual predictor variable, and any assessment of its redundancy, unreliable (Table 33). Overall, it appears that these repeat classes are generally correlated with one another at the chromosome level, as well as with translocation frequency.

Table 32: Multiple Linear Regression of Multiple Classes of LTR Elements

Model: Multiple Linear Regression Model Incorporating Various Classes of LTR

Variable    Coefficient    Std. Error    t Value    p Value
LTR         -184.145       147.928       -1.245     0.2352
ERVL        -6.712         10.709        -0.627     0.5417
ERVK        6.105          104.393       0.058      0.9543
ERV1        -11.282        18.999        -0.594     0.5628

Model Summary: N = 18, R2 = 0.4203, Adjusted R2 = 0.242, p = 0.108

Table 33: Multiple Linear Regression of Multiple Classes of Repetitive Elements

Model: Classes of Repetitive Elements Associated with Translocation Frequency

Variable         Coefficient    Std. Error    t Value    p Value
tRNA             -1.10E-04      2.60E-04      -0.426     0.6768
LTR              -2.04E+02      1.52E+02      -1.343     0.2024
Simple Repeat    2.74E-03       1.52E-03      1.797      0.0956
hAT              3.78E-02       3.24E-02      1.167      0.264

Model Summary: N = 18, R2 = 0.5325, Adjusted R2 = 0.3887, p = 0.032

Influence of Repetitive Elements on Cytogenetic Band Translocation Frequency

Given the results at the level of whole chromosomes, we considered whether they held true when breaking chromosomes down into individual cytogenetic bands, each with its own translocation frequency. We began by looking at those classes of repetitive elements that were positively associated with chromosome translocation frequency. Here we found that tRNA and Simple Repeats were also significantly associated with cytogenetic band translocation frequency, while hAT elements were not (Table 34).

Table 34: Simple Linear Regression Models of Repeat Classes and Translocation Frequency

Variable         df        F Value    p Value    R2        Adjusted R2
hAT              1, 266    1.775      0.184      0.0067    0.0029
Simple Repeat    1, 266    4.754      0.0301     0.0176    0.0139
tRNA             1, 266    6.263      0.0129     0.0231    0.0194

Table 35: Repetitive Elements Associated with Cytogenetic Band Translocation Frequency

Variable     Class    df        F Value    p Value    R2        Adjusted R2
PRE1j        tRNA     1, 265    12.5       0.0005     0.0451    0.0415
(GGC)n       SR       1, 265    11.79      0.0007     0.0426    0.039
SINE1B_SS    tRNA     1, 265    10.28      0.0015     0.0373    0.0337
SINE1D_SS    tRNA     1, 265    10.1       0.0017     0.0367    0.0331
(CCCT)n      SR       1, 265    9.768      0.002      0.0356    0.0319
SINE1C_SS    tRNA     1, 265    9.705      0.002      0.0353    0.0317
(C)n         SR       1, 265    8.947      0.003      0.0327    0.029
PRE1i        tRNA     1, 265    8.24       0.0044     0.0302    0.0265
PRE1g        tRNA     1, 265    8.233      0.0044     0.0301    0.0265
SINE1_SS     tRNA     1, 265    7.812      0.0056     0.0286    0.025
PRE1f2       tRNA     1, 265    7.73       0.0058     0.0283    0.0247
CHR-1        tRNA     1, 265    7.604      0.0062     0.0279    0.0242
(TTT)n       SR       1, 265    7.36       0.0071     0.027     0.0234
(CCA)n       SR       1, 265    7.314      0.0073     0.0269    0.0232
(TC)n        SR       1, 265    6.991      0.0087     0.0257    0.022
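For a simple linear regression with one predictor, the F value and R2 in Table 35 carry the same information, which helps when reading the table: a highly significant F can still correspond to a very small share of explained variation. The sketch below is illustrative only, using the standard relation between the two quantities and the residual degrees of freedom listed in the table.

```python
def f_from_r_squared(r_squared, residual_df, n_predictors=1):
    """F statistic implied by R2 for a regression with the given degrees of freedom."""
    explained = r_squared / n_predictors
    unexplained = (1.0 - r_squared) / residual_df
    return explained / unexplained


# PRE1j and (GGC)n rows of Table 35 (df = 1, 265).
f_pre1j = f_from_r_squared(0.0451, 265)
f_ggc = f_from_r_squared(0.0426, 265)
print(round(f_pre1j, 2), round(f_ggc, 2))
```

Both values agree with the tabulated F values to within rounding of R2, while each element on its own accounts for under 5% of the variation in band translocation frequency.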

We examined whether any of the most common repeats from these classes were associated with translocation frequency. We considered 163 specific repetitive elements from the three classes of repeats that were most prevalent in the pig genome; fifteen were significantly associated with higher translocation frequency on cytogenetic bands (p < 0.01; Table 35). Of these, only six were significant at both the chromosome and cytogenetic band levels, most being tRNA elements. Just one simple repeat, (CCA)n, was significant in both analyses, while other simple repeats such as (TCCC)n and (TAA)n that were significant at the chromosome level were not significant at the cytogenetic band level. Although little is known about the specific activities of many of these repetitive elements, this does indicate that there are specific tRNA elements and Simple Repeats that are highly associated with cytogenetic bands with higher translocation frequencies.

Table 36: Multiple Linear Regression Analysis of Repetitive Elements and Translocation Frequency on Cytogenetic Bands

Variable    Coefficient    Std. Error    t Value    p Value
PRE1j       0.0061         0.0023        2.625      0.0092
(TTT)n      -0.0977        0.0407        -2.398     0.0172
(GGC)n      0.0594         0.029         2.046      0.0417

Model Summary: N = 267, R2 = 0.0853, Adjusted R2 = 0.0748, p = 3.199 x 10^-5

Looking further at these repetitive elements, we created multiple linear regression models in order to determine if these elements largely explained the same variation in translocation frequency, or if a model could be improved by the inclusion of multiple repeats. We began each model with PRE1j, the element with the strongest association with translocation frequency. We found that the addition of multiple tRNA elements to a model did not strengthen it, showing that they did not explain any additional variation not already explained by PRE1j density. Instead, we found that the addition of two simple repeats, (TTT)n and (GGC)n, produced the strongest model (Table 36). Still, very little variation in cytogenetic band translocation frequency could be explained by the inclusion of these variables.


Table 37: Multiple Linear Regression Model of Cytogenetic and Repetitive Elements and Translocation Frequency

Variable     Coefficient    Std. Error    t Value    p Value
G-banding    0.185          0.0222        8.332      4.49 x 10^-15
PRE1j        0.0062         0.0021        3.005      0.0029
(GGC)n       0.0614         0.0258        2.385      0.0178
Fragility    -0.066         0.0272        -2.428     0.0159

Model Summary: N = 267, R2 = 0.2792, Adjusted R2 = 0.2682, p = 2.2 x 10^-16

Taking those repetitive elements, PRE1j, (TTT)n, and (GGC)n, that significantly contribute to a model of translocation frequency on cytogenetic bands, along with the chromosomal features previously discussed, we attempted to fit them to a regression model (Table 37). Here we found that once G-banding was in the model, PRE1j density, (GGC)n density, and the fragility of bands all contributed to a model of translocation frequency on cytogenetic bands. Still, these four variables together could account for less than a third of the variation in cytogenetic band translocation frequencies. Though these chromosomal features are examined at low resolution, this does suggest that overarching chromosomal features can be at best loosely linked to translocation frequency. Instead, it appears that sequencing of breakpoints is necessary to determine the genomic architecture of the breakpoint region and its precise genomic position. This would allow the best determination of whether reciprocal translocation breakpoints are largely private events or are influenced by a shared underlying genomic architecture, either within each cytogenetic band or at each breakpoint regardless of band.

Distribution of Mosaic Translocation Breakpoints in the Pig Genome

Considering the non-recurrent mosaic rearrangements identified in the pig genome, there are sixteen unique rearrangements across 23 unique breakpoints. Thus far, all breakpoints involving mosaic rearrangements have occurred in G-negative bands, with nine of these bands having common fragile sites. Despite the small number of mosaic rearrangements observed, many breakpoints were re-used, with seven cytogenetic bands having two or more breakpoints, and two bands, 7q22 and 8q21, having three breakpoints apiece. We fit the number of breakpoints per band to a Poisson distribution, finding a generally poor fit due to many bands having more breakpoints than expected (X2 = 27.733, p < 0.0001; Chi-square test; Table 38). Even with the small number of mosaic rearrangement breakpoints, there is a significant degree of clustering. Notably, however, each band with two or more mosaic breakpoints had few if any constitutional breakpoints, with the exception of 16q21 (Table 39). Two bands, 2q23 and 9p22, had no constitutional breakpoints at all, while 3q23, 7q22, and 8q21 had only one constitutional breakpoint. This perhaps indicates that some bands that are not particularly susceptible to rearrangements resulting from germline translocation may be susceptible to mosaic rearrangement, where the same biological constraints may not be present.

Table 38: Poisson Distribution of Mosaic Breakpoints on Cytogenetic Bands

Number of Breakpoints    Number of Cytogenetic Bands
per Band                 Observed     Expected^a     X2 Value
0                        244          240.38         0.054
1                        16           25.3           3.43
2                        7            0.005          24.25

^a Based on a Poisson Distribution with m = 0.105 and n = 267
X2 = 27.73, d.f = 2, p < 0.0001

Table 39: Cytogenetic Bands with Multiple Mosaic Breakpoints

Band     Mosaic         Constitutional
         Breakpoints    Breakpoints
2q23     2              0
3q23     2              1
7q22     2              1
8q21     3              1
9p22     2              0
16q21    2              5

Discussion

Analysing the distribution of reciprocal translocation breakpoints in the pig genome, we have identified the chromosomes and chromosome regions that most often undergo rearrangement. A significant amount of breakpoint re-use occurs in the pig genome, with a subset of cytogenetic bands being particularly favoured for rearrangement. We have identified chromosomal features associated with a higher frequency of rearrangement, including the length of cytogenetic bands, G-banding, presence of common fragile sites, gene density, and the presence of repetitive elements. Although these features reveal a characteristic architecture of the cytogenetic bands most susceptible to rearrangement, they fail to explain finer differences between the translocation frequencies of bands. We also revealed that rearrangement hotspots in the human and pig genomes are rarely homologous to one another in the case of constitutional rearrangements; however, the breakpoints of mosaic rearrangements appear to play by another set of rules.

In order to fully encompass the breadth of rearrangements, we began by looking at interactions between chromosomes, finding that chromosome pairs were generally not rearranging with one another more frequently than expected, with a few exceptions. Nearly two-thirds of all possible reciprocal translocation combinations between chromosomes have been observed, with the number of rearrangements between chromosomes generally being proportional to the combined length of the chromosomes, as has been observed in the human genome (Bickmore et al., 2001). Still, there are distinct exceptions, as t(1;2) and t(1;13) rearrangements have been observed infrequently in the pig genome despite involving very long chromosomes. To account for this, we considered whether the gene density of chromosomes played a role in determining the frequency of rearrangement. Chromosomes are known to occupy distinct regions of the nucleus known as chromosome territories (CT) (Cremer and Cremer, 2010). While little is known about CT in the pig genome, it has been suggested that in the germline chromosomes are generally organized according to relative gene density, with more gene-dense chromosomes occupying the nuclear interior, and gene-poor chromosomes occupying the periphery (Foster et al., 2012). While several of the most frequent rearrangements were found to have low combined gene densities and a small difference between their gene densities, suggesting a similar latitudinal position in the nucleus, we found little evidence that gene density played a significant role in determining rearrangement partners, indicating a degree of randomness in this assortment.

Looking more closely at individual chromosomes we found similar results, with the longest chromosomes tending to have more breakpoints, though not in strict proportion to their length. In order to place chromosomes on the same scale we introduced the translocation frequency, defined as the number of rearrangements per Mb of chromosome material. The translocation frequency demonstrated which chromosomes were rearranging more frequently than expected and provided a relative measure with which to judge against other chromosomes. The translocation frequency was independent of length and appeared to rely on factors more specific to individual chromosomes. Typically, longer chromosomes have been suggested to break more frequently than shorter chromosomes, with the length proposed to increase the likelihood for rearrangement (Basrur and Stranzinger, 2008; Raudsepp and Chowdhary, 2011; Warburton, 1991; Bickmore et al., 2002). While this is true on its face, it gives a poor indication of susceptibility to breakage given the disparate lengths between chromosomes and arms. While chromosome 1 is the longest and has by far the most breakpoints, in reality it rearranges at just under the expected frequency, while the shorter chromosomes 10, 12, and 17, not thought to rearrange frequently, have some of the highest translocation frequencies. This indicates that chromosomal factors independent of length are responsible for determining susceptibility to rearrangement.
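The idea of a length-normalised translocation frequency can be sketched as below. This is only an illustration: the function names are hypothetical, the chromosome labels, breakpoint counts and lengths are invented for the example and are not data from this study, and comparing each chromosome with the pooled genome-wide rate is one assumed way of expressing "more or less frequent than expected".

```python
def translocation_frequency(breakpoints, length_mb):
    """Breakpoints per Mb of chromosome material."""
    if length_mb <= 0:
        raise ValueError("length_mb must be positive")
    return breakpoints / length_mb


def relative_to_genome(counts, lengths_mb):
    """Ratio of each chromosome's frequency to the pooled genome-wide frequency.

    A ratio above 1 means the chromosome rearranges more often than its
    length alone would predict; below 1 means less often.
    """
    genome_rate = sum(counts.values()) / sum(lengths_mb.values())
    return {
        chrom: translocation_frequency(counts[chrom], lengths_mb[chrom]) / genome_rate
        for chrom in counts
    }


# Hypothetical values for illustration only (not taken from the thesis).
counts = {"long_chr": 30, "short_chr": 12}
lengths_mb = {"long_chr": 300.0, "short_chr": 60.0}

ratios = relative_to_genome(counts, lengths_mb)
print(ratios)
```

In this toy case the long chromosome carries more breakpoints in absolute terms yet has a ratio below 1, which is the pattern described for chromosome 1 above.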

Rearrangement breakpoints showed no selection for or against structural chromosome features such as arm morphology, or molecular features such as gene density. Rearrangement on more gene dense chromosomes has been proposed to be selected against in the human genome, due to the increased likelihood of gene disruption (Bickmore et al., 2002). We, however, could find no evidence suggesting that translocation frequency on chromosomes or chromosome arms was influenced by gene density at all. There also appeared to be no selection for or against rearrangement based on chromosome and chromosome arm morphology. Chromosomes with acro- and telocentric arms rearranged no more frequently than bi-armed chromosomes, and no selection against rearrangement on short or long arms was found. This runs counter to observations in the human genome that have suggested acrocentric chromosomes are more susceptible to breakage (Lin et al., 2018). Although there is no clear indication as to why chromosome structure may influence breakage, it has been suggested that breaks near the centromere of chromosomes may be selected against due to an increased likelihood of generating unbalanced rearrangements (Bickmore et al., 2002). Notably, however, we did find that the translocation frequency between chromosome arms on the same chromosome was often quite different, with one arm having a translocation frequency twice as high as the other on average. Although chromosome arms have been known to rearrange at different rates, this highlights a distinct difference between these arms that is independent of length and demonstrates that different chromosome regions rearrange at vastly different frequencies (Aurias et al., 1978).

The differences between chromosome arm translocation frequencies led us to consider the distribution of breakpoints across cytogenetic bands, the highest resolution demarcation available. Rearrangement breakpoints were not evenly distributed across cytogenetic bands, with 48.6% of breakpoints appearing on just 12.7% of bands. Translocation breakpoints on porcine cytogenetic bands have previously been suggested to occur non-randomly. The observation of large numbers of breakpoints on different cytogenetic bands on chromosome 14, and on individual bands such as 1q21, underlies the suggestion of regions of fragility in the pig genome, leading to preferential breakage and rearrangement (Tarocco et al., 1987; Basrur and Stranzinger, 2008). The distribution of cytogenetic breakpoints has been studied more intensely in the human genome, with several cytogenetic bands demonstrating consistent breakpoint re-use. This in turn has led to the suggestion of hotspots for rearrangement in the human genome (Yu et al., 1978; Warburton, 1991; Ou et al., 2011; Lin et al., 2018).

The re-use of breakpoints is prevalent in the pig genome, with 60% of all bands with rearrangements having at least two observed breakpoints. Length appeared to have a small influence on breakpoint number and translocation frequency, largely due to the wide range of band lengths in the pig genome. Bands with high numbers of breakpoints were found on a variety of chromosomes, and at a variety of chromosome positions. The proximity of bands to the centromere or telomere regions had no influence on translocation frequency. This runs counter to proposals in the human genome which suggest rearrangements select against centromeric regions due to the potential for creating genetic imbalances (Cohen et al., 1996). We, however, find no evidence of this in the pig genome, with chromosome morphology appearing to play no role in translocation frequency.

Looking at more specific features of chromosomes revealed several that were associated with translocation frequency. G-negative bands, corresponding to the more transcriptionally active euchromatin, had a translocation frequency 4.6x greater than that of G-positive bands. Translocation breakpoints in the pig genome have previously been suggested to occur more often in G-negative or the equivalent R-positive bands (Gustavsson et al., 1989; Quach et al., 2016). These results are largely in agreement with studies of breakpoint distribution in the human genome, which continually indicate that G-negative or R-positive bands have higher breakpoint numbers than G-positive or R-negative bands (Yu et al., 1978; Aurias et al., 1978; Warburton, 1991; Cohen et al., 1996; Lin et al., 2018). Despite this, there have been suggestions that cytogeneticists are biased towards placing rearrangements on G-negative or R-positive bands due to the contrasting nature of these lighter bands against darker bands. Studies of R-banded chromosome rearrangements in the pig (Gustavsson et al., 1989) and human genomes (Cohen et al., 1996; Fantes et al., 2008) indicate that rearrangements are consistently identified in R-positive bands and are placed more often in R-positive bands than R-negative bands, indicating that cytogeneticists are largely correct in their diagnoses.

G-negative bands are considered to be more gene dense than G-positive bands, thus we considered whether gene density was associated with translocation frequency, finding that gene dense bands had higher translocation frequencies than gene poor bands. Gene density with regard to chromosome rearrangements has not been well studied, with one study of somatic rearrangements showing a similar relationship (Lin et al., 2018). This comparison should, however, be considered carefully, as somatic rearrangements face less rigorous selection pressures than constitutional rearrangements, and thus may more readily interrupt genes without fatal repercussions. Indeed, many somatic rearrangements are known to interrupt genes in humans (Heisterkamp et al., 1982) and pigs (Musilova et al., 2014), while the interruption of genes is rarer in constitutional rearrangements (Nilsson et al., 2017). It could be suggested that the more open chromatin composition of gene dense chromosome regions increases susceptibility to breakage, with rearrangement being more likely to occur in intergenic regions, where the disruption of genes is minimized.

We also explored the structural nature of cytogenetic bands, finding that regions associated with historical chromosome breakage, and with breakage under chemical stress, were not significantly related to translocation frequency. EBRs were historically re-used in mammalian evolution; however, we could find no evidence that these sites were being disproportionately re-used in the modern pig genome. Meanwhile, common fragile sites are known to break under exposure to specific chemical stressors. Although G-negative bands with common fragile sites had on average high translocation frequencies, this did not quite reach statistical significance. EBRs have been suggested to be re-used in modern mammalian genomes for chromosome breakage, with these regions thought to be enriched for genes and repetitive elements conducive to rearrangement (Larkin et al., 2009). It appears, however, that these regions fell out of favour for rearrangement over the course of evolution, with more modern genomic elements appearing to be favoured for rearrangement instead. The cytogenetic positions of common fragile sites have been previously noted to overlap with translocation breakpoints in the pig; however, no comprehensive analysis associating these factors has been undertaken (Quach et al., 2016; Ronne, 1995; Riggs et al., 1993; Long et al., 1991). Many bands with fragile sites had an elevated number of rearrangements; however, this was not a consistent feature across all fragile bands. Farming practices occasionally expose animals to poor environmental conditions, which is thought to increase the susceptibility of these regions to breakage (Long et al., 1991). Many of the pigs identified in cytogenetic screening, however, are from nucleus farms notable for their high environmental standards, suggesting that environmental contamination is not routinely leading to chromosome rearrangement in these pigs. Our results contrast with those found in humans, where bands with fragile sites seem more prone to breakage (Warburton, 1991; Hecht and Hecht, 1984a; Hecht and Hecht, 1984b; Koduru and Chaganti, 1988). These studies, however, looked only at breakpoint number, which in the present study indicated that fragile bands had more breakpoints than expected. Overall, we failed to find sufficient evidence that the presence of common fragile sites resulted in increased translocation frequency in the pig genome.

In order to better understand the interaction of variables, we combined the chromosomal features associated with, or suggestive of, higher translocation frequency. We determined that G-banding, followed by length of bands, presence of common fragile sites, and gene density, contributed significantly to a regression model of translocation breakpoint number. However, only G-banding and gene density contributed to a model of translocation frequency. Chromatin density and physical chromosome length have previously been shown to influence breakpoint number in the human genome (Lin et al., 2018; Bickmore et al., 2002); however, little work has been done to demonstrate whether these factors influence translocation frequency in the human genome. Although it is apparent that these chromosomal features influence breakpoint number and translocation frequency, together they explain at most approximately a third of the variation present amongst bands. We may speculate, then, that more specific chromosome features, unique to each band, may contribute more directly to the promotion of translocation events.

Despite these variables having a low overall predictive power, we could still establish hotspots of rearrangement in the pig genome. We chose just those bands with the highest breakpoint number and translocation frequency in order to limit the number of bands, representing 9.4% of bands in the pig genome. All bands are in G-negative regions, and 12 of the 28 bands have a common fragile site. These bands were found on sixteen different chromosomes, including many of the shortest chromosomes, providing insight into how a single hotspot band can influence the rearrangement frequency of a whole chromosome. These bands are representative of a band architecture which tends to be more euchromatic and gene dense, and more likely to have a fragile site. The high amount of breakpoint re-use in the pig genome led us to consider whether these rearrangement hotspots were homologous to rearrangement hotspots in the human genome. Surprisingly, we found that this was not the case, with hotspots in the pig genome rarely being homologous to similar regions in the human genome. Although bands of a similar architecture are susceptible to rearrangement in both species, more specific features of cytogenetic bands, likely developed independently from one another, appear to influence this rearrangement.

Despite the lack of homology between individual hotspots for rearrangement between the genomes, we considered whether recurrent rearrangements in these genomes shared any homology that may explain their formation. Just one recurrent rearrangement has been proposed in the pig, a rcp(12;14)(q13;q21), and a handful of recurrent rearrangements are known in humans. These rearrangements are notable as constitutional rearrangements are typically considered to be unique, private events. Both breakpoints of the porcine rearrangement, 12q13 and 14q21, lie on cytogenetic bands that are homologous to human hotspots for rearrangement, namely 17p13 and 22q11 respectively. Searching databases of reciprocal translocations we found a rcp(17;22)(p13;q11) in the human genome, in a mosaic form, found recurrently in several patients exhibiting Chronic Myeloid Leukemia, suggesting the breakpoints of the rearrangement interrupt or alter gene activity at the breakpoint region (Hagemeijer et al., 1980; Heim et al., 1986; Helenglass et al., 1987; Yang et al., 2000; Johansson et al., 2002; El-Zimaity et al., 2004). Two recurrent human rearrangements shared homologous breakpoints with porcine rearrangements as well: the recurrent t(11;22)(q23;q11) was homologous to a t(5;9)(q21;p13) rearrangement in the pig genome, while the t(8;12)(p23;q13) was homologous to a t(5;14)(q21;q12) porcine rearrangement (Ou et al., 2011). The t(11;22)(q23;q11) recurrent human rearrangement is known to occur as the result of a rearrangement between a palindromic AT-rich repeat (PATRR), which forms cruciform structures on 11q23, and a long low-copy repeat, LCR22-3a, on 22q11 (Kurahashi et al., 2000; Kurahashi and Emanuel, 2001; Kurahashi et al., 2006). Meanwhile, the t(8;12)(p23;q13) rearrangement is predicted to be recurrent in the human genome due to the presence of two LCR clusters hundreds of thousands of base pairs long on both 8p23 and 12q13, which share 285 kb of significant homology (>94%) (Ou et al., 2011). In each case we do not have access to sequence data of the porcine rearrangement breakpoint junctions, and thus cannot know whether the breakpoints occur in similar homologous regions or coincide by chance. This does, however, suggest that recurrent human rearrangements may share some features with porcine rearrangements, and that similar genomic architecture could underlie rearrangement at these positions in both species.

Given that repetitive elements are associated with these homologous rearrangements, we looked into how repetitive elements in the pig genome may influence the translocation frequency of bands. We observed that chromosomes and cytogenetic bands with higher concentrations of Simple Repeats and the tRNA family of SINEs, and with lower percent lengths of LTRs, had higher translocation frequencies on average. Several specific repetitive elements, including PRE1j, SINE1B_SS, (GGC)n, and (CCCT)n, were enriched in those bands with higher translocation frequencies; however, the effect appeared small overall, explaining just 7.5% of the variation in translocation frequency between bands. Repetitive elements in the human genome have been previously associated with balanced and unbalanced translocation breakpoints, including LTRs, PATRRs, and members of retrotransposon families such as LINE, SINE, and Alu sequences (Nilsson et al., 2017; Kurahashi et al., 2000; Ou et al., 2011; Luokonnen et al., 2018; Robberrecht et al., 2013). Recurrent rearrangements in the human genome are proposed to occur via NAHR between highly homologous repetitive sequences such as LCR, or retrotransposons such as LINE and SINE/Alu sequences (Elliot et al., 2005; Ou et al., 2011). Non-recurrent rearrangements, however, appear to be generated by a wider variety of factors, with these highly repetitive elements being less often associated with their presence (Nilsson et al., 2017). One study of human mosaic reciprocal translocation breakpoints found they were enriched for SINE sequences and depleted for LTRs (Machiela et al., 2017), while another found simple repeats such as (AT)n and (GAAA)n, which are capable of producing hairpin or cruciform structures, to be associated with breakpoints in cancers (Bacolla et al., 2016). The simple repeats such as (GCC)n and (CCCT)n observed to be associated with translocation frequency have not yet been linked to rearrangement in other studies. Both repeats, however, have been associated with fragile sites and the generation of I-DNA motifs respectively (Li et al., 1996; Manzini et al., 1994). Interestingly, the porcine repetitive elements (PREs) associated with translocation frequency are closely related to Alu elements in humans, suggesting they have a similar architecture that may lead to the promotion of rearrangement between regions enriched for these elements (Funkhouser et al., 2017).

Mosaic chromosome rearrangements in the pig genome appeared to play by a distinctly different set of rules, appearing recurrently far more frequently than constitutional rearrangements. The mos t(7;9)(q24;q24) and mos t(7;18)(q22;q11) rearrangements are the primary examples, appearing in about 0.41% of peripheral blood lymphocytes (Musilova et al., 2014). Looking at the regions of the human genome homologous to the breakpoints 7q21-22, 9q21-22, and 18q11-12, we identified that both recurrent rearrangements had direct homologues in the human genome, mos t(7;14)(p15;q11) and mos t(7;14)(q34;q11). The cytogenetic breakpoint regions of these two recurrent somatic translocations occur in the approximate cytogenetic regions of the four genes that code for the T cell receptor (TCR), TRA/TRD, TRG, and TRB, which are present at SSC7q15.3-q21, SSC9q21-q22, and SSC18q11.3-q12 respectively (Hiraiwa et al., 2001). These genes go through a RAG mediated V(D)J recombination process as part of T-cell maturation, which rearranges exons within the gene loci, creating a new functional exon which will determine the antigen specific receptor that is to be expressed on the cell surface (Capone et al., 1998). In the case of the recurrent rearrangements, these loci appear to be aberrantly recombining with one another, leading to reciprocal translocations which juxtapose the TCR gene loci with one another.

Other mosaic rearrangements share similar breakpoints, including the mos t(9;18)(q22;q11) rearrangement, which has breakpoints in the vicinity of the TRG and TRB genes of the T-cell receptor as well. A direct human homologue to this rearrangement, mos t(7;7)(p15;q34), is the result of such an aberrant recombination event (Cautweiler et al., 2007; Le Noir et al., 2012). The mos t(6;7)(q21;q22) rearrangement, for instance, had a direct homologue in the human genome, a mos t(14;19)(q11;q13). This rearrangement has been identified in four patients exhibiting Peripheral T-cell lymphoma (Almire et al., 2007; Lepretre et al., 2000; Shin et al., 2012). In these cases the mos t(14;19)(q11;q13) rearrangement is identified to occur recurrently between the TRA/TRD locus at 14q11 and the PVRL2 locus at 19q13. A curious reciprocal translocation between homologous chromosomes, mos t(7;7)(q22;q26), was also found to have a direct homologue in the human genome, another reciprocal translocation between homologous chromosomes, mos t(14;14)(q11;q32). This rearrangement has also been studied in the human genome, revealing that it likely takes place between the TRA/TRD locus on 14q11 and the IgH locus on 14q32, which could be proposed to be the case in this porcine rearrangement as well. The last mosaic rearrangement considered was the other homologous reciprocal translocation, mos t(1;1)(q2.11;q21), which was also found to have a direct human homologue, a mos t(9;15)(q34;q22) rearrangement found in patients with multiple myeloma (Sawyer et al., 1998; Gabrea et al., 2008). Although this particular rearrangement has not been subject to molecular analysis, other recurrent somatic rearrangements on these chromosome regions involve the aberrant recombination of ABL1 on 9q34 and PML/RARA on 15q22, which could be hypothesized to be the case here. Although these mosaic rearrangements are homologous to somatic rearrangements in humans associated with cancers, there is no evidence that these rearrangements share similar breakpoints, or that carriers of these rearrangements will develop tumours. These observations do, however, add empirical evidence that although constitutional rearrangements in the human and pig genomes appear to be largely private events, with different specific factors appearing to influence rearrangement in either species, the generation of mosaic rearrangements appears to play by a different set of rules, allowing for the presence of chromosome rearrangements that would otherwise be selected against in a constitutional form.

Analysing the distribution of constitutional reciprocal translocation breakpoints in the pig genome, we have demonstrated that specific chromosomes are more susceptible to rearrangement, and in some cases also appear to have preferential partners for exchange. This preference for rearrangement is also true at the cytogenetic level, as we observed that a sub-set of cytogenetic bands are consistently re-used in rearrangement events. This points towards reciprocal translocation breakpoints being distributed non-randomly in the pig genome, with cytogenetic features including G-banding, the presence of common fragile sites, gene density, and the presence of repetitive elements all playing an apparent role in influencing the frequency at which individual cytogenetic bands rearrange. Although these overarching features may suggest a general genomic architecture that is more susceptible to rearrangement, the precise features of the genome within those regions have yet to be deciphered. These regions were enriched for Simple Repeats and SINE elements, with several specific repeats within those families being associated with frequently rearranging cytogenetic bands. The exact role these elements have in promoting rearrangements, however, seems to be small, at least at the cytogenetic level. Thus the next step in understanding the underlying promotion of rearrangement in the pig genome should be to sequence breakpoint junctions in order to delineate the surrounding genomic architecture and determine underlying commonalities shared by breakpoints, as well as possible unique features that may make individual chromosome regions susceptible to breakage.

For the first time translocation breakpoints of constitutional and mosaic chromosome rearrangements were investigated in the pig genome, revealing distinct patterns and genomic elements associated with their formation. The breakpoint positions of constitutional and mosaic rearrangements appear to be influenced by different factors. An underlying genomic architecture of higher gene density, lower chromatin density, and the presence of simple repeats and tRNA elements was most associated with constitutional breakpoints. Meanwhile, mosaic rearrangement breakpoints appear to occur more often near or within genes, indicating these likely result from the aberrant recombination of genes, particularly those associated with the immune system. This is evidenced by constitutional rearrangement hotspots in the pig and human genomes rarely being syntenic with one another, indicating that these rearrangements are not occurring due to the structure of genes, but instead may be more influenced by the presence of repetitive elements, which are less stable over millions of years. Meanwhile mosaic rearrangements are the opposite, appearing to occur at many of the same genes in the pig and human genomes, a reflection of the stability of those genes over millennia. The observations of hotspots for rearrangement in the pig genome, and the identification of architectural features associated with them, provide a starting point for future research into porcine rearrangement breakpoints that aims to identify specific genomic elements where breaks may be occurring, and to propose how such genomic structures could influence the frequency with which those genomic regions experience breakage and translocation.

Chapter 4: A Genome Wide Association Study of Chromosome Rearrangements in the Domestic Pig

Introduction

Boars are primarily raised to provide meat for human consumption, thus prospective breeding animals are primarily selected for their genetic potential to produce large litters of quick growing offspring with commercially desirable traits. Genetic selection programs aim to evaluate boars on a series of traits associated with these goals, which include growth rate, back fat, and litter size. These phenotypic traits, along with pedigree information, are translated into estimated breeding values (EBVs), which, along with physical and genetic evaluations, help breeders to rank individual animals against one another on their suitability for breeding (Robinson and Buhr, 2005).

While EBVs provide accurate measures of physical characteristics, they are often inaccurate regarding their evaluations of litter size. This is partially due to there being little heritability of litter size, with few distinct genetic trends, while also requiring measuring of the litter sizes produced by their daughters to be truly accurate (Robinson and Quinton, 2002; Southwood and Kennedy, 1991). In addition, most breeders fail to account for the presence of chromosome rearrangements in their swine herds, which account for approximately 50% of cases of hypoprolificacy in boars (Gustavsson, 1989).

The prevalence of chromosome rearrangements in the domestic pig (1/200) is high relative to other species such as humans (1/500) and bovids (1/700) (Ducos et al., 2007; Jacobs et al., 1992; De Lorenzi et al., 2012). It is unclear why pigs have such a high prevalence of rearrangements; however, some research has shown that family members of mosaic rearrangement carriers are more likely to be carriers themselves, suggesting a genetic predisposition towards acquiring rearrangements (Rezaei et al., 2020). Despite the large number of chromosome rearrangements identified in domestic species as well as humans, little work has been done to identify whether genetic factors are associated with chromosome rearrangements. The limited number of studies which have sought to examine chromosome rearrangements from a genomic perspective have concentrated on the breakpoint junctions of rearrangements (Nilsson et al., 2017; Luokonnen et al., 2018). These studies aimed to identify the signatures of DNA repair mechanisms and the genomic architecture near the sites of breakpoint junctions, revealing a breadth of repair mechanisms and genomic architecture at breakpoint junctions. Chromosome breakage is a routine adversity faced by cells, and many pathways exist to respond to and repair DSBs. Studies have suggested that individuals deficient in canonical DNA repair pathways are at increased risk of acquiring chromosome rearrangements, indicating that a genetic component may promote chromosome rearrangement in some individuals. It is still unclear, however, whether rearrangements are the result of simple mistakes that occur in a subset of cells, or whether individuals and families with carriers of rearrangements may be predisposed to acquiring them over time.

With the availability of commercial high-density SNP arrays for the domestic pig, there has been an increasing number of GWAS and CNV studies conducted on various traits and diseases. GWAS using high density SNP arrays provide an efficient platform with which to detect and explain variation of traits of interest in a population. Although GWAS has yet to be applied to the study of constitutional chromosome rearrangements in any species, it has been applied to study economically important traits in the pig such as meat quality (Vernardo et al., 2017), as well as reproductive traits such as the farrowing interval (Wang et al., 2017) and the number of stillborn animals (Wu et al., 2019). SNP array genotyping data may also be used to call CNVs in the pig genome, which are increasingly viewed as a large source of genomic variation associated with disease (Zhang et al., 2009). Over the past few years thousands of copy number variant regions (CNVR) have been identified in pigs and have been linked to a wide variety of traits including meat quality (Wang et al., 2015) and fertility (Revay et al., 2015).

In this study we report the results of a GWAS of chromosome rearrangement carriers and their family members, identified through the routine cytogenetic screening of Canadian breeding boars. We also report the results of a gene annotation analysis conducted on the significantly associated genomic regions and surrounding genes. Using the genotyping data, CNV analysis of these animals was also conducted, and analysis of the genomic regions and genes surrounding called CNVRs is reported.

Materials and Methods

Animals and Phenotypes

A total of 7,245 reproductively unproven young boars from various nucleus breeding farms in Canada were subjected to routine cytogenetic screening at the University of Guelph. Peripheral blood samples were collected weekly and cultured using standard cytogenetic procedures (discussed in Chapter 2). Chromosome rearrangements were identified by arranging karyotypes of each animal using SmartType software (Digital Scientific, UK). Nineteen carriers of chromosome rearrangements identified in this way were reported to the farms of origin. We requested pedigree information and blood samples from the sire and dam, if available, in order to perform additional cytogenetic analysis. In eleven cases we were able to obtain blood samples from the sire and dam, allowing the origin of the rearrangements to be established. Eight rearrangements were found to have a de novo origin, with both parents having normal chromosome constitutions, while two rearrangements were found to have been inherited from the dam, and one from the sire. The remaining rearrangements are presumptively de novo, with no evidence of parental inheritance; however, this cannot be ruled out.


We requested genetic information from Illumina PorcineSNP60 SNP arrays for both parents which were carried out by the farms as part of their own genetic selection programs. This allowed us to obtain fifteen complete parent-child trios, as well as SNP genotypes from four additional carriers, and two parents. In addition, we obtained SNP genotypes from 11 non-carrier trios, consisting of non-carrier boars and their parents which were to serve as control animals.

Initially quality control of samples was conducted in Plink and sought to remove low-quality samples with genotyping rates < 90%. This resulted in the removal of four samples. Additional quality control was performed by removing instances where a sample was present amongst case and control animals, and removing those three boars found to have inherited rearrangements, as well as their carrier parents. The remaining carrier boars were carriers of non-recurrent constitutional reciprocal translocations, with no other carriers of rearrangements considered for this study. From an original 81 unique samples, we proceeded with 65 samples (37 cases, 28 controls) for further analysis.

Samples were divided into three groups in order to run three separate GWAS: Boars, Dams, and Sires. The boars and parents were separated for these analyses because a chromosome rearrangement must have originated in one of the parents and been passed to the carrier offspring.

In addition, without sequencing of the breakpoints, it is impossible to determine which parent contributed the rearrangement. Studies of human rearrangements suggest that in 90% of cases de novo rearrangements arise in the father (Thomas et al., 2010; Hockner et al., 2012); thus we may assume that the sire of a carrier had a higher chance of producing the rearrangement than the dam. Control samples were matched accordingly from the control parent-child trios. Carrier boars are presumed to be de novo carriers, while both parents are confirmed or presumed to have normal chromosome compositions. In each case the sire was confirmed to carry no chromosome rearrangements, with the status of three dams being uncertain.

The three separate GWAS analyses were composed of carrier boars v. control boars, dams of carriers v. dams of control boars, and the sires of carriers v. the sires of control boars. The number of case and control samples for each GWAS is listed below (Table 40). These analyses are henceforth referred to as the Boars, Dams, and Sires GWAS respectively.

Table 40: Number of Samples for each GWAS

Analysis   Cases   Controls   Total
Boars      14      9          23
Dams       12      10         22
Sires      11      9          20

Given that chromosome rearrangements appear de novo in approximately 1/200 boars and must be detected manually, and that several farms were unwilling to provide genetic material from carriers, we failed to reach a sufficient effective population size (Grossi et al., 2017). Failing to meet this threshold means that our results may be biased, and the results herein should be considered suggestive at best until larger sample sizes can be acquired to allow a more comprehensive analysis and more precise conclusions.

Genotyping and Quality Controls

Genotyping was conducted by the nucleus farms as part of their own genetic selection programs, using the Porcine SNP80 BeadChip (GeneSeek, Lincoln, Nebraska, USA). Quality control of samples was conducted in Plink software, which removed four samples with call rates < 90% (Purcell et al., 2007). Using the remaining samples we then performed independent quality control of SNPs for each analysis in Plink. SNPs were pruned beginning with those on the sex chromosomes and those without defined positions. This left 62,331 SNPs on autosomal chromosomes. SNPs were then pruned by quality, with those with minor allele frequencies < 0.05, call rates < 0.9, or Hardy-Weinberg equilibrium (HWE) p < 0.0001 excluded from each dataset (Table 41). Samples were clustered in each analysis using the pairwise identity-by-state (IBS) distance method in order to detect population stratification. In each case we failed to detect sufficient population stratification to necessitate clustering, as the numbers of breeds in each analysis were roughly equal.

Table 41: SNPs Removed from each Quality Control Step

Analysis   Starting SNPs   MAF < 0.05   Geno < 0.90   HWE < 0.0001   Remaining
Boars      62331           6123         828           0              55380
Dams       62331           5863         743           0              55725
Sires      62331           5334         1243          0              55754
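The per-SNP filters summarized in Table 41 can be illustrated with a minimal Python sketch. This is not Plink's implementation: the genotype coding (0, 1 or 2 copies of one allele, with None for a missing call) and the function names are assumptions made for illustration, and the HWE test is omitted for brevity.

```python
from typing import Optional, Sequence

# Assumed coding: 0, 1 or 2 copies of one allele; None marks a missing call.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype at this SNP."""
    if not genotypes:
        return 0.0
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Frequency of the rarer allele among the called genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def passes_qc(genotypes: Sequence[Genotype],
              min_maf: float = 0.05,
              min_call_rate: float = 0.90) -> bool:
    """Apply the MAF and call-rate thresholds used in this chapter."""
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)
```

The counts in Table 41 follow by subtraction; for the Boars, 62331 - 6123 - 828 - 0 = 55380 SNPs remain.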

Genome-wide Association Studies

Tests of genetic association were performed for each group (Boars, Dams, and Sires) separately. Each SNP was evaluated for genetic association in Plink using standard allelic association. In this method, for a SNP with a minor allele, ‘a’, and a major allele, ‘A’, the case and control groups, representing a total number of samples, ‘n’, can be written as a 2 x 2 contingency table of counts of disease status by allele (a or A) (Table 42). The null hypothesis is that of no association between either allele and disease, so the relative frequencies of alleles are not expected to differ between case and control groups. Tests for association can therefore be conducted by a chi-square (χ2) test for independence between the rows (case/control) and columns (alleles) of the contingency table.

Table 42: Contingency Table for Association Tests of Alleles between Cases and Controls

Allele     a      A      Total
Cases      m11    m12    m1.
Controls   m21    m22    m2.
Total      m.1    m.2    2n

GWAS was performed for each group independently in Plink using standard allelic association. The allelic association test is based on a simple χ2 test for independence of rows and columns:

$$X^2 = \sum_{i=1}^{2} \sum_{j=1}^{2} \frac{\left(m_{ij} - E(m_{ij})\right)^2}{E(m_{ij})}$$

where $E(m_{ij}) = \frac{m_{i\cdot}\, m_{\cdot j}}{2n}$. X2 has a χ2 distribution with 1 d.f. under the null hypothesis (no association).

The statistical significance of SNPs was based on the positive false discovery rate (pFDR) (Benjamini and Hochberg, 1995; Storey et al., 2003). At the 1% false discovery rate, the pFDR was set to 0.01, and the threshold P-values were calculated as follows:

$$\mathrm{pFDR} = E\left(\frac{V}{R} \,\middle|\, R > 0\right)$$

where V is the confidence threshold, and where R is the number of unadjusted SNPs with P < V.

At the 1% level the P-value significance thresholds were 1.77 x 10^-05, 4.89 x 10^-06, and 2.61 x 10^-05 for the Boars, Dams, and Sires respectively. At the 5% level, the P-value significance thresholds were 8.87 x 10^-05, 2.45 x 10^-05, and 1.3 x 10^-04 for the Boars, Dams, and Sires respectively.
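As an illustration of the allelic association test described above, the following Python sketch computes the chi-square statistic and its 1 d.f. P-value for a single SNP from the four allele counts of the contingency table. It is a stand-alone sketch rather than Plink's code; the function name and argument names are hypothetical, and any counts passed to it in examples are invented.

```python
import math
from typing import Tuple


def allelic_chi_square(case_a: int, case_A: int,
                       control_a: int, control_A: int) -> Tuple[float, float]:
    """Chi-square test of independence on a 2 x 2 table of allele counts.

    Rows are cases/controls, columns are alleles a/A. Returns (X2, P-value).
    """
    table = [[case_a, case_A], [control_a, control_A]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)  # equals 2n, the total number of alleles
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column total must be non-zero")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper tail of a chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, with 14 cases and 9 controls (28 and 18 alleles, as in the Boars analysis), invented counts such as `allelic_chi_square(20, 8, 6, 12)` give a statistic of about 6.47, which would not approach the genome-wide thresholds reported above.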


Candidate Gene Search and Functional Annotation

Post GWAS bioinformatic analysis was conducted by populating a list of all genes within 2Mb up or downstream of the significant SNPs. The gene annotations were based on the Sscrofa10.2 swine genome assembly (https://www.ensembl.org/Sus_scrofa/Info/Index). Annotations of genes were carried out using the NCBI gene database (https://www.ncbi.nlm.nih.gov/), which provides known gene functions and related literature. Gene Ontology (GO) analysis on these lists of genes was conducted using the DAVID Bioinformatics Resources (https://david.ncifcrf.gov). We applied the default statistical tests and performed Fisher’s exact test to assess the significance of enriched terms, with only those terms with P < 0.05 being selected (Dennis et al., 2003). Genes related to these significant terms were then explored for functional relevance to chromosome rearrangement, with those genes meeting these criteria being used to populate a list of candidate genes.
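The window-based gene search described above can be sketched as follows. This is an illustrative Python example only: the `Gene` record, the function names and any coordinates used with them are hypothetical, and in practice the gene coordinates would come from the Sscrofa10.2 annotation.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # bp
    end: int    # bp


def distance_to_gene(snp_pos: int, gene: Gene) -> int:
    """Distance in bp from a SNP to the nearest gene boundary (0 if within)."""
    if gene.start <= snp_pos <= gene.end:
        return 0
    return min(abs(snp_pos - gene.start), abs(snp_pos - gene.end))


def genes_within_window(snp_chrom: str, snp_pos: int, genes: List[Gene],
                        window_bp: int = 2_000_000) -> List[Tuple[str, int]]:
    """List (gene name, distance) for genes within the window, nearest first."""
    hits = []
    for gene in genes:
        if gene.chrom != snp_chrom:
            continue
        dist = distance_to_gene(snp_pos, gene)
        if dist <= window_bp:
            hits.append((gene.name, dist))
    return sorted(hits, key=lambda hit: hit[1])
```

Setting `window_bp=100_000` gives the narrower 100kb search, and a distance of 0 corresponds to a SNP lying within a gene.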

Haplotype Block Analysis

To generate haplotype blocks and detect regions of linkage disequilibrium between the significant SNPs and candidate genes associated with chromosome rearrangements, we conducted linkage disequilibrium analysis using Haploview 4.2 software (Barrett et al., 2005). Given the poor understanding of chromosome rearrangements in the genome, we chose a large mapping distance, up to 2Mb on either side of each SNP, in order to capture any relevant information. A haplotype block was defined in this program using the solid spine algorithm criteria of Gabriel et al. (2002).
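For readers unfamiliar with the pairwise statistics underlying such an analysis, the sketch below computes D' and r2 for two biallelic SNPs. It is not Haploview's implementation: it assumes that phased haplotype counts are already available, the function name is hypothetical, and the LOD score is not computed.

```python
from typing import Tuple


def ld_statistics(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> Tuple[float, float]:
    """Return (D', r2) for two biallelic SNPs from phased haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total  # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / total  # frequency of allele B at the second SNP
    if p_A in (0.0, 1.0) or p_B in (0.0, 1.0):
        raise ValueError("both SNPs must be polymorphic")

    d = p_AB - p_A * p_B  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d_prime, r2
```

The two measures can differ: haplotype counts of 40, 10, 0 and 50 give D' = 1 while r2 is only about 0.67, because one of the four possible haplotypes is absent although the allele frequencies at the two SNPs differ.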

CNV Analysis

The calling of copy number variants was performed using PennCNV software, which uses the signal intensity data calculated by the Illumina SNP array to infer copy number states (Wang et al., 2007). PennCNV uses the Log-R Ratio calculated by the Illumina SNP array, log2(Robserved/Rexpected), for each SNP, along with the B allele frequency of the sample and population, to infer copy number states (Pfeiffer et al., 2006). This information was calculated in Illumina GenomeStudio software (Illumina, San Diego, USA) and transferred to PennCNV for quality control and analysis.

Quality control of samples was performed in GenomeStudio and PennCNV. In GenomeStudio we applied the same SNP quality control parameters as we did in Plink, removing any samples and SNPs with < 0.90 call rate, < 0.05 minor allele frequency, and HWE p < 0.0001. In order to address samples with excess Log-R ‘noise’, we applied an additional filter in PennCNV, removing any samples with a Log-R standard deviation > 0.24. SNP marker locations, as above, were annotated on the Sscrofa10.2/susScr3 genome assembly (UCSC, 2011).
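The Log-R Ratio and the sample-level noise filter can be illustrated with a short Python sketch. The function names are hypothetical, and the use of the sample standard deviation is an assumption; the exact estimator used by PennCNV is not specified here.

```python
import math
import statistics
from typing import Sequence


def log_r_ratio(r_observed: float, r_expected: float) -> float:
    """Log-R Ratio for one SNP: log2(R_observed / R_expected)."""
    if r_observed <= 0 or r_expected <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(r_observed / r_expected)


def passes_lrr_noise_filter(lrr_values: Sequence[float],
                            max_sd: float = 0.24) -> bool:
    """Keep a sample only if the SD of its Log-R Ratios is at most max_sd."""
    return statistics.stdev(lrr_values) <= max_sd
```

A Log-R Ratio of 0 indicates that the observed intensity equals the expected intensity, negative values are consistent with copy number loss, and positive values with copy number gain.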

CNV analysis was carried out using two methods, the first being univariate CNV calling of each sample separately, and the second being trio-CNV calling using samples from the carrier, sire, and dam, where possible. Filters were applied such that a CNV must be called across at least three SNP markers. The program specified the SNP coordinates and the copy number for each CNV call for each sample. Overlapping CNV were grouped into copy number variant regions (CNVR) manually. We then used CNVRuler (Kim et al., 2012) in order to detect significant association between CNVR and carriers and parents of chromosome rearrangements. A -2 Log Likelihood Ratio Test was applied in CNVRuler to detect significant differences in the frequency of CNVR between pigs associated with carriers, and control samples.
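Although CNVs were grouped into CNVR manually in this study, the grouping rule can be expressed as a simple interval merge. The Python sketch below assumes that any overlap of at least one base pair on the same chromosome places two calls in the same region; that criterion and the function name are assumptions for illustration.

```python
from typing import Iterable, List, Tuple

CnvCall = Tuple[str, int, int]  # (chromosome, start bp, end bp)


def merge_into_cnvrs(cnv_calls: Iterable[CnvCall]) -> List[CnvCall]:
    """Merge overlapping CNV calls on the same chromosome into CNVRs."""
    regions: List[CnvCall] = []
    for chrom, start, end in sorted(cnv_calls):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # Overlaps the current region: extend its end if necessary.
            last_chrom, last_start, last_end = regions[-1]
            regions[-1] = (last_chrom, last_start, max(last_end, end))
        else:
            regions.append((chrom, start, end))
    return regions
```

Sorting the calls first means each call only needs to be compared with the most recently opened region, so calls from all samples can be pooled and merged in one pass.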

Functional Annotation of CNVRs

A list of genes was populated from those that overlapped with CNVRs or were within close proximity (within 50 kb). The gene annotations were based on the Sscrofa10.2 swine genome assembly (https://www.ensembl.org/Sus_scrofa/Info/Index). Annotations of genes were carried out using the NCBI gene database (https://www.ncbi.nlm.nih.gov/), which provides known gene functions and related literature. GO analysis on these lists of genes was conducted using the DAVID Bioinformatics Resources (https://david.ncifcrf.gov). We applied the default statistical tests as before, with Fisher’s exact test used to assess the significance of enriched terms, with only those terms with P < 0.05 being selected (Dennis et al., 2003). Genes related to these significant terms were then explored for functional relevance to chromosome rearrangement, with those genes meeting this criterion being used to populate a list of candidate genes.
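The logic of a Fisher's exact test for GO term enrichment can be sketched as a one-sided hypergeometric tail probability. This Python example is a generic illustration and may differ from the scoring DAVID applies by default; the function name, the background size and the term size used in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(overlap: int, list_size: int,
                       term_size: int, background_size: int) -> float:
    """One-sided Fisher's exact P-value for GO term enrichment.

    Probability of drawing at least `overlap` genes annotated to a term when
    `list_size` genes are sampled from a background of `background_size`
    genes, of which `term_size` carry the annotation.
    """
    max_overlap = min(list_size, term_size)
    numerator = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return numerator / comb(background_size, list_size)
```

For instance, `enrichment_p_value(2, 53, 10, 20000)` asks how surprising it would be to find two genes from a ten-gene term among the 53 genes near the Boars SNPs, given an assumed background of 20,000 genes.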

Results

SNP Data Statistics

After quality controls we had a total of 65 samples left across the three analyses. On average between the analyses there were 55,570 SNPs retained for analysis, with the largest difference in SNP number between the analyses being just 479 SNPs. The average number of SNPs per chromosome was 3,087, with chromosome 1 having on average the most SNPs, 5,428, and chromosome 18 having the least, 1,565. The average distance between SNPs was 44.11kb, with chromosome 1 having the largest average distance between SNPs, 58.06kb, while chromosome 12 had the shortest average distance, 29.12kb.

Genome Wide Association Study

Across the three studies we identified 15 SNP associations meeting our significance threshold, minimum 5% pFDR, with three SNP associations, two from the Dams and one from the Sires analyses, meeting the 1% pFDR threshold (Table 43). These SNPs were spread unevenly across the three analyses, with 9 identified in the Dams, 5 in the Boars, and 1 in the Sires. SNPs were not evenly spread across chromosomes, with two thirds of all significant SNPs being located on chromosome 1 (4 SNPs) and chromosome 3 (6 SNPs). The remaining significant SNPs were located on chromosomes 10, 11, 12, 15, and 17.

For the Boars, the five significant SNP associations were located on three chromosomes, 1, 3, and 17. Two significant SNP associations each were located on chromosomes 1 and 3, in close proximity to one another, just 419.55kb and 70.15kb apart respectively. The nine significant SNP associations in the Dams were spread across chromosomes 1, 3, 10, 11, 12, and 15. Four of these SNPs were located in close proximity on chromosome 3, within 4.89Mb of one another, forming two smaller clusters of two SNPs, 465kb and 626kb apart respectively. The remaining SNP, the lone significant SNP association in the Sires analysis, was located on chromosome 1.

Although there were four significant SNP associations on chromosome 1 found across the three analyses, these SNPs from different analyses were not in close proximity to one another. The opposite was true for the six significant SNP associations on chromosome 3, which were located within 8.83Mb of one another. This cluster was located between 91.12 Mb and 99.95 Mb on chromosome 3. No SNPs or any adjacent SNPs were shared between the analyses, with SNPs generally appearing as single SNP association peaks exclusive to each analysis.
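The distances quoted above follow directly from the SNP positions listed in Table 43. The short Python sketch below reproduces the arithmetic; the variable and function names are chosen for illustration only.

```python
# Positions (bp, Sscrofa10.2) copied from Table 43.
BOARS_CHR1 = [193873995, 194293541]
BOARS_CHR3 = [99882029, 99952176]
DAMS_CHR3 = [91123979, 91589542, 95385808, 96012125]
ALL_CHR3 = BOARS_CHR3 + DAMS_CHR3


def span_kb(positions):
    """Distance in kb between the first and last SNP in a group."""
    return (max(positions) - min(positions)) / 1000
```

Here `span_kb(BOARS_CHR1)` is about 419.55 kb, `span_kb(BOARS_CHR3)` is about 70.15 kb, and `span_kb(ALL_CHR3)` is about 8828 kb, corresponding to the 8.83Mb cluster on chromosome 3.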

Table 43: List of Significant SNPs Associated with Chromosome Rearrangements

Chromosome   SNP ID                Position    p Value         Analysis
1            ALGA0003413           58535098    6.76 x 10^-06   Dams
1            ASGA0005318           193873995   3.13 x 10^-05   Boars
1            ASGA0005319           194293541   3.13 x 10^-05   Boars
1            ALGA0009505           286146126   1.91 x 10^-05   Sires
3            ASGA0092248           91123979    6.76 x 10^-06   Dams
3            ALGA0112408           91589542    6.76 x 10^-06   Dams
3            ALGA0020048           95385808    2.25 x 10^-05   Dams
3            MARC0003169           96012125    2.53 x 10^-06   Dams
3            WU_10.2_3_99882029    99882029    7.15 x 10^-05   Boars
3            WU_10.2_3_99952176    99952176    7.15 x 10^-05   Boars
10           WU_10.2_10_48049509   48049509    2.3 x 10^-05    Dams
11           CASI0009142           31405855    1.33 x 10^-05   Dams
12           WU_10.2_12_43188411   43188411    1.27 x 10^-05   Dams
15           SIRI0001076           23592174    2.53 x 10^-06   Dams
17           WU_10.2_17_47439537   47439537    7.88 x 10^-05   Boars

Figure 10: Manhattan plots for the GWAS analyses for the Boars (a), Dams (b), and Sires (c). Chromosomes are listed on the x-axis, and transformed -log10 P values are plotted against the y-axis. The threshold for pFDR p-value significance is plotted as a red line (1%), and blue line (5%).

Gene Annotation of Carrier Boars

Genes within 100kb and within 2Mb of each significant SNP were mapped based on the Sscrofa10.2 genome assembly. In total twelve genes were mapped to within 100kb of a significant SNP, and 164 genes were mapped to within 2Mb. In order to identify possible candidate genes, we performed gene ontology (GO) analysis on both sets of genes for each analysis. We looked for GO terms related to biological processes related to the generation of DSBs, recognition of DNA damage, and the initiation and process of DNA repair. These processes are essential to the development of rearrangements, as DNA must be broken and mis-repaired for this process to occur. GO analysis was generally uninformative for those genes within 100kb of the SNPs. In the Dams and Sires analyses, no significant GO terms were revealed, while just three significant terms were revealed in the Boars. The significant GO terms were all related to the negative regulation of protein modification, or phosphate metabolism (Table 44). None of these functions were related to processes directly relevant to the generation of DSBs, or DNA repair.

Table 44: Significant GO terms of genes within 100kb of significant SNPs from the Boars

Category             GO Term      GO Term Description                                   Number of Genes   Genes             Fisher p value
Biological Process   GO:0031400   negative regulation of protein modification process   2                 PPP1R16B, SOCS5   2.6 x 10^-03
Biological Process   GO:0045936   negative regulation of phosphate metabolic process    2                 PPP1R16B, SOCS5   2.7 x 10^-03
Biological Process   GO:0010563   negative regulation of phosphorus metabolic process   2                 PPP1R16B, SOCS5   2.7 x 10^-03

Due to the small number of significant terms, and the general lack of information regarding chromosome rearrangements in mammalian genomes, we expanded our analysis to encompass all genes within 2Mb of the significant SNPs. We failed to identify any significant GO terms amongst the genes from the Sires analysis. Moving on to the Dams, we identified 50 significant GO terms, however, none of these terms appeared related to processes relevant to chromosome rearrangements. The most significant terms were processes largely related to the immune system, such as lymphocyte migration, and neutrophil chemotaxis (Table 46).

GO analysis of those genes nearest the significant SNPs from the Boars analysis proved the most fruitful, with 38 significant GO terms, many of which appeared relevant to chromosome rearrangement (Table 45). The eleven most significant GO terms, along with several others, were all related to processes involving DNA, including DNA repair, recombination, and meiosis. Several genes were routinely associated with these terms; thus we looked up their more specific functions in order to determine relevance to the generation of chromosome rearrangements. Of the 53 genes within 2Mb of the significant SNPs from the Boars analysis, we identified seven genes, ACTR5, FANCM, MSH2, MSH6, TOP1, SRC, and RBL1, as candidate genes, which would be further explored for associations with chromosome rearrangements in the domestic pig genome (Table 47).

Table 45: Significant GO terms of genes within 2Mb of significant SNPs from the Boars

Category             GO Term      GO Term Description                                       Number of Genes   Genes                                 Fisher p value
Biological Process   GO:0000710   meiotic mismatch repair                                   2                 MSH2, MSH6                            1.10 x 10^-09
Biological Process   GO:0043570   maintenance of DNA repeat elements                        2                 MSH2, MSH6                            3.20 x 10^-09
Biological Process   GO:0045910   negative regulation of DNA recombination                  2                 MSH2, MSH6                            2.10 x 10^-07
Biological Process   GO:0007131   reciprocal meiotic recombination                          3                 MSH2, MSH6, FANCM                     2.30 x 10^-07
Biological Process   GO:0051053   negative regulation of DNA metabolic process              3                 MSH2, MSH6, SRC                       3.60 x 10^-06
Biological Process   GO:0007127   meiosis I                                                 3                 MSH2, MSH6, FANCM                     6.30 x 10^-06
Biological Process   GO:0007126   meiotic nuclear division                                  4                 MSH2, MSH6, FANCM, DSN1               6.80 x 10^-06
Biological Process   GO:0009411   response to UV                                            3                 MSH2, MSH6, ACTR5                     8.40 x 10^-06
Biological Process   GO:1903046   meiotic cell cycle process                                4                 MSH2, MSH6, FANCM, DSN1               1.50 x 10^-05
Biological Process   GO:0045128   negative regulation of reciprocal meiotic recombination   1                 MSH2                                  1.50 x 10^-05
Biological Process   GO:0000018   regulation of DNA recombination                           2                 MSH2, MSH6                            1.80 x 10^-05
Biological Process   GO:0006311   meiotic gene conversion                                   1                 MSH2                                  4.60 x 10^-05
Biological Process   GO:0060631   regulation of meiosis I                                   1                 MSH2                                  2.30 x 10^-04
Biological Process   GO:0006259   DNA metabolic process                                     5                 MSH2, MSH6, ACTR5, FANCM, SRC, TOP1   1.10 x 10^-03


Table 46: Significant GO terms of genes within 2Mb of significant SNPs from the Dams

Category             GO Term      GO Term Description                            Number of Genes   Genes              Fisher p value
Biological Process   GO:0071346   cellular response to interferon-gamma          3                 CCL1, CCL2, MRC1   2.70 x 10^-08
Biological Process   GO:0034341   response to interferon-gamma                   3                 CCL1, CCL2, MRC1   2.20 x 10^-07
Biological Process   GO:0072676   lymphocyte migration                           2                 CCL1, CCL2         5.50 x 10^-06
Biological Process   GO:0030593   neutrophil chemotaxis                          2                 CCL1, CCL2         6.10 x 10^-06
Biological Process   GO:0070555   response to interleukin-1                      2                 CCL1, CCL2         9.10 x 10^-06
Biological Process   GO:1990266   neutrophil migration                           2                 CCL1, CCL2         1.10 x 10^-05
Biological Process   GO:0050729   positive regulation of inflammatory response   2                 CCL1, CCL2         1.30 x 10^-05
Biological Process   GO:0071621   granulocyte chemotaxis                         2                 CCL1, CCL2         1.50 x 10^-05
Biological Process   GO:0034612   response to tumor necrosis factor              2                 CCL1, CCL2         7.60 x 10^-05
Biological Process   GO:0070371   ERK1 and ERK2 cascade                          3                 CCL1, CCL2, FSHR   1.30 x 10^-04

Linkage Disequilibrium Analysis

We generated haplotype blocks for each of the 15 significant SNPs. Those SNPs in the same analyses which were located close together on chromosomes were grouped into one analysis. Fourteen of the fifteen significant SNPs failed to generate haplotype blocks centering on the SNP of interest. One haploblock was generated which centered on the significant SNPs on chromosome 3 from the Boars analysis. Consequently, none of the candidate genes were in linkage disequilibrium with any significant SNPs.

We looked at linkage disequilibrium surrounding the SNPs with nearby candidate genes for the Boars and Dams analyses. Two of the four SNPs with nearby candidate genes in the Boars analysis did not generate haplotype blocks surrounding the SNPs and showed no significant linkage between those SNPs and the candidate genes. The two SNPs on chromosome 3 did generate a haplotype block, 335Kb long; however, neither SNP had significant linkage with those SNPs nearest the candidate genes. This haploblock contained two genes, SOCS5 and PIGF, neither of which appeared functionally relevant to chromosome rearrangement. We generated a further seven analyses for the SNPs with nearby candidate genes from the Dams analysis. Across each significant SNP, we failed to identify haploblocks linking significant SNPs and candidate genes.

Figure 11: Linkage disequilibrium plot of a significantly associated region centering on two significant SNPs. The boxes are colored according to a standard color scheme: LOD > 2 and D’ = 1, red; LOD > 2 and D’ < 1, lighter shades of red; LOD < 2 and D’ = 1, blue; LOD < 2 and D’ < 1, white; where D’ is the coefficient of linkage disequilibrium, and LOD is the log of the likelihood odds ratio, a measure of confidence in the value of D’.


Table 47: List of SNPs with Nearest Genes

Analysis   Chr   SNP ID                Position    Closest/Candidate Gene   Distance (Kb)
Dams       1     ALGA0003413           58535098    RIMS1                    Within
Dams       3     ASGA0092248           91123979    EML6                     134.27
Dams       3     ALGA0112408           91589542    C2orf73                  111.13
Dams       3     ALGA0020048           95385808    NRXN1                    1275.87
Dams       3     MARC0003169           96012125    NRXN1                    649.55
Dams       10    WU_10.2_10_48049509   48049509    SLC35F5                  Within
Dams       11    CASI0009142           31405855    C17orf75                 594.96
Dams       12    WU_10.2_12_43188411   43188411    FSHR                     3.78
Dams       15    SIRI0001076           23592174    ACTR3                    804.12
Sires      1     ALGA0009505           286146126   WHRN                     36.66
Boars      1     ASGA0005318           193873995   C14orf28                 1036.85
Boars      1     ASGA0005319           194293541   C14orf28                 617.3
Boars      1     ASGA0005319           194293541   FANCM*                   926.71
Boars      3     WU_10.2_3_99882029    99882029    SOCS5                    Within
Boars      3     WU_10.2_3_99882029    99882029    MSH2*                    575.78
Boars      3     WU_10.2_3_99882029    99882029    MSH6*                    1149.63
Boars      3     WU_10.2_3_99952176    99952176    SOCS5                    12.34
Boars      17    WU_10.2_17_47439537   47439537    DHX35                    Within
Boars      17    WU_10.2_17_47439537   47439537    ACTR5*                   226.36
Boars      17    WU_10.2_17_47439537   47439537    SRC*                     1497.51
Boars      17    WU_10.2_17_47439537   47439537    TOP1*                    1695.98
Boars      17    WU_10.2_17_47439537   47439537    RBL1*                    1722.75

*Candidate genes, with apparent functional relevance to chromosome rearrangement.

CNV Analysis

CNV analysis was performed using the samples previously integrated into the GWAS. Using both univariate and trio CNV calling where samples permitted, we identified a total of 226 copy number variants across 65 samples. Overlapping CNVs were re-organized into CNVR, of which there were 137. CNVRs were present on every chromosome; however, the distribution was uneven. Each chromosome had an average of 7.6 CNVR, with a maximum of 17 on chromosome 2 and a minimum of just two on chromosome 16. The average length of CNVR was 129.06kb, with the shortest CNVR being just 15.53kb, while the longest CNVR was 738.92kb. The total length of CNVR was 17Mb, covering approximately 0.65% of the pig genome. Case and control animals had similar numbers of CNV calls, with an average of 3.65 per case animal and 3.43 per control animal. Of the 137 CNVRs, 84 were exclusive to case animals, 35 were exclusive to controls, and 18 were mixed. The majority of CNVR, 86, were copy number losses, with the ratio between losses and gains being much more biased amongst cases, 3.4:1, than controls, 1.27:1. Most CNVR with more than two observations were made up of a mix of case and control animals; CNVR100, for example, had eight observations, comprising three cases and five controls.

In order to determine the presence of an association between a given CNVR and the presence of chromosome rearrangements (either carrier or parent), we used CNVRuler and the Fisher’s exact test. Examining all 137 CNVRs found no evidence of an association between a CNVR and the presence of chromosome rearrangements in a family. Grouping our CNV calls separately as Boars, Dams, and Sires, and producing additional analyses, also did not yield any results suggesting association between any CNVR and chromosome rearrangements. Indeed, many CNVRs with multiple samples were a mix of case and control samples, with no clear indication that animals associated with chromosome rearrangements had a non-random assortment of CNVs relative to control animals.

Annotation of CNVRs

As with the GWAS, we looked at genes within and in close proximity to CNVR in order to determine if those genes had any functional relevance towards chromosome rearrangement. CNVRs were mapped to genes within the CNVR, and within 50Kb up or downstream, based on the Sscrofa10.2 swine genome assembly. We then performed functional Gene Ontology analysis as we did with those genes mapped to positions close to significant SNPs. The most significant GO terms for the Carrier Boars were G-protein coupled receptor signaling pathway and sensory perception of chemical stimulus. Although these processes do not appear relevant to chromosome rearrangement, other significant GO terms identified amongst these genes do, such as DNA replication and replication fork processing (Table 48). The genes found near CNVR in the Control Boars share similar GO terms to the Carrier Boars, but in contrast those genes are not related to DNA replication or replication fork processing.

Table 48: Significant GO terms of genes within 50kb of carrier Boar CNVR

Category             GO Term      GO Term Description                                     Number of Genes   Genes                 Fisher p Value
Biological Process   GO:0007186   G-protein coupled receptor signalling pathway           1                 HTR3A                 1.1 x 10^-03
Biological Process   GO:0006260   DNA replication                                         3                 FANCM, PDGFA, TONSL   1.3 x 10^-03
Biological Process   GO:0031297   Replication fork processing                             2                 FANCM, TONSL          2.6 x 10^-03
Biological Process   GO:0045005   DNA-dependent DNA replication maintenance of fidelity   2                 FANCM, TONSL          4 x 10^-03

A similar theme was found in the GO annotation of genes in Carrier Dams and Control Dams. The most significant GO terms found in the Carrier Dams analysis were outer dynein arm assembly and axonemal dynein complex assembly (Table 49). There were, however, several significant GO terms that appeared related to chromosome rearrangement, such as DNA recombination, DNA repair, and resolution of recombination intermediates. In contrast, those genes near the Control Dam CNVRs were unrelated to chromosome rearrangement, with the most significant GO terms being detection of stimulus involved in sensory perception, and detection of chemical stimulus involved in sensory perception.

Table 49: Significant GO terms of genes within 50kb of Dams of carriers CNVR

Category             GO Term      GO Term Description                         Number of Genes   Genes                                         Fisher p Value
Biological Process   GO:0036158   outer dynein arm assembly                   2                 TMEM141, DNAAF5                               1.1 x 10^-05
Biological Process   GO:0070286   axonemal dynein complex assembly            2                 TMEM141, DNAAF5                               2.9 x 10^-05
Biological Process   GO:0003341   cilium movement                             2                 TMEM141, DNAAF5                               1.7 x 10^-04
Biological Process   GO:0006310   DNA recombination                           4                 SMC6, TONSL, GEN1, NSD2                       9.20 x 10^-04
Biological Process   GO:0035082   axoneme assembly                            2                 DNAAF5, TMEM141                               3.20 x 10^-04
Biological Process   GO:0000226   microtubule cytoskeleton organization       4                 FBXW5, DNAAF5, TMEM141, GEN1                  2.20 x 10^-03
Biological Process   GO:0006281   DNA repair                                  5                 TONSL, NSD2, SMC6, PRPF19, GEN1               4.00 x 10^-03
Biological Process   GO:0043623   cellular protein complex assembly           4                 TRAF2, DNAAF5, TMEM141, ANSK4B                4.20 x 10^-03
Biological Process   GO:0001578   microtubule bundle formation                2                 DNAAF5, TMEM141                               1.30 x 10^-03
Biological Process   GO:0071139   resolution of recombination intermediates   2                 GEN1, SMC6                                    3.90 x 10^-04
Biological Process   GO:0048670   regulation of collateral sprouting          2                 WNT3, EPHA7                                   9.40 x 10^-04
Biological Process   GO:0030163   protein catabolic process                   6                 USP15, TMEM129, PCSK9, PRPF19, FBXW5, TRAF2   1.80 x 10^-02
Biological Process   GO:0006259   DNA metabolic process                       5                 TONSL, GEN1, SMC6, ORC4, NSD2                 2.10 x 10^-02


We last looked at those genes found near CNVRs in the Carrier and Control Sires analyses. The most significant GO terms in the Carrier Sires analysis were striated muscle adaptation and regulation of muscle adaptation, with no terms related to chromosome rearrangement. Similarly, the Control Sires GO analysis showed no terms relevant to chromosome rearrangement, with the most significant term being detection of stimulus involved in sensory perception. From the significant GO terms identified in the Carrier Boars and Carrier Dams, we derived a list of genes with copy number losses and copy number gains for further discussion, with those genes being FANCM, GEN1, PRPF19, SMC6, and TONSL.

Discussion

In this study we applied genotyping data to perform allelic association and to identify copy number variants in boars carrying chromosome rearrangements, and their sires and dams. Several SNPs associated with the presence of chromosome rearrangements were identified, with further analysis revealing genes in the proximity of SNPs identified in the carrier boars that were associated with processes relevant to chromosome rearrangements. Meanwhile the CNV analysis appeared less informative, revealing several CNVRs, but few with any apparent association with chromosome rearrangements. Through this analysis, however, chromosome rearrangements were investigated from a genetic perspective for the first time, revealing SNP associations and nearby genes associated with DNA repair that could lead to a better understanding of the processes leading to the high rate of chromosome rearrangement in the pig genome, and how to better control this in the future.

To our knowledge this is the first GWAS on chromosome rearrangements performed in the domestic pig. Due to the uncertain origin of apparently de novo chromosome rearrangements, we performed three separate GWAS, examining the carrier boars, dams of carriers, and the sires of carriers separately. In total we revealed 15 SNP associations across the three analyses which were associated with chromosome rearrangements. The majority of significant SNP associations were identified in the Dams analysis (n = 9), with five significant SNP associations being identified in the Boars, and just one significant SNP association being identified in the Sires. SNP associations were identified on a variety of chromosomes, with a cluster of six SNP associations within 10Mb on chromosome 3 found in the Dams and Boars analyses. Four SNP associations were identified on chromosome 1 in the Boars, Dams, and Sires; however, the SNPs from different analyses were over 50Mb apart from one another. Chromosome rearrangements are particularly understudied in the pig; thus these regions are the first to be associated with the presence of chromosome rearrangements in the pig genome.

No SNP associations were common between the Boars, Dams, and Sires, suggesting that the origin of the rearrangement was perhaps less certain than we believed. In humans, the sequencing of breakpoint junctions has revealed that in 90% of cases the father of the carrier generated the rearrangement in their germline and passed it on to their offspring (Thomas et al., 2010; Hockner et al., 2012). Observing the significant SNP associations in the Boars revealed that both parents often carried the associated allele, with the dams surprisingly carrying the allele at a higher proportion on average (78% to 60%). This could lead to questions on the exact origin of the rearrangement, whether females may generate rearrangements at a higher proportion than males, or whether some of the dams may have actually been carriers themselves. Sequencing the breakpoint junctions of novel de novo rearrangement carriers could reveal if there is sex bias in the generation of rearrangements that could be incorporated into future analyses.

Post-GWAS bioinformatic analysis revealed the presence of several genes within 2 Mb of the SNPs; however, most had no apparent functional relevance to chromosome rearrangement. No GO terms indicating functional relevance towards the promotion of chromosome rearrangement were identified in either the Dams or the Sires. In contrast, seven genes found within 2 Mb of SNP associations from the Boars analysis had functions directly relevant to processes involving chromosome rearrangement. Although several genes near SNP associations found in the Dams of carriers were related to functions of the immune system, and many immune genes are implicated at the sites of mosaic rearrangements, there is no evidence to suggest that the activity of such genes acts to promote the generation of non-recurrent constitutional rearrangements, either at the position of those genes or in the genome at large. Analysis of the linkage disequilibrium between significant SNPs and SNPs within or near candidate genes showed no significant haplotype blocks. Thus, we found no evidence that any significant SNP was in linkage disequilibrium with any candidate gene. The low sample size available may have hampered our ability to adequately test linkage between SNPs, partially due to the mixture of breeds and the failure to procure enough samples to recreate an ideal population for analysis.

For chromosome rearrangements to occur a series of events must be initiated, starting with the generation of a DSB via exogenous or endogenous factors. Cellular machinery patrolling the cell for DNA damage must then recognize the DSB, initiate processes to stop the cell cycle, and recruit DNA repair proteins to the site of the break. These DNA repair proteins then initiate DNA repair by any number of situational mechanisms, resulting preferentially in the correct repair of the DSB. Many studies focus on the mechanism behind the DNA repair leading to the rearrangement event; however, more recent studies have failed to detect one clear mechanism of formation, and instead show that several different DNA repair mechanisms may be implicated at the breakpoint junctions of chromosome rearrangements (Nilsson et al., 2017). Thus, across different cases there appears to be no one common underlying method of aberrant repair leading to the generation of rearrangements.


Indeed, the generation of DSBs is routine in cells, with several such breaks occurring in each cell each day (Lieber et al., 2003; Martin et al., 1985). Even in those individuals that generate chromosome rearrangements the process of DNA repair appears to occur normally, as cytogenetic investigations often reveal no additional carriers amongst a particular litter. This suggests that even in animals capable of generating the rearrangements, aberrant DNA repair leading to chromosome rearrangement is not routinely occurring in every cell. Given that there appears to be little consistency in the DNA repair mechanism that generates the rearrangement, we thus considered whether any other factors associated with the DNA breakage and the resulting response were present in the carriers of rearrangements or their family members that could at least partially explain why chromosome rearrangements were occurring in those families. Each of the SNP associations found in carrier boars were within 2Mb of at least one gene which plays a role in the maintenance of genome integrity, or the identification and initiation of the DNA damage response.

Although many genes have in the past been implicated at the sites of rearrangement breakpoints, with rearrangement proposed to have occurred via aberrant recombination between genes, this is not thought to be the case for non-recurrent constitutional rearrangements such as those carried by the Boars in this study. Non-recurrent constitutional rearrangements more often appear to occur in fragile regions of the genome that are prone to breakage, with various repetitive elements, and in some cases genes, being associated with the positions of breakpoints (Nilsson et al., 2017). No single unifying genic or repetitive element, however, has been identified with the formation of a rearrangement. Instead, breakage of chromosomes is a relatively common occurrence, occurring non-randomly throughout the genome in proposed fragile regions with a common overarching genomic architecture (Donaldson et al., 2019). The routine breakage which occurs throughout the genome thus requires molecular machinery within the cells to routinely identify and repair the damage. Chromosome rearrangements appear to occur when such processes are impaired. We thus sought to determine whether any genes identified through the GO analysis played roles with functional relevance to DNA maintenance or repair, and how a change in function of those genes may result in the creation of a genomic environment more conducive to the generation of chromosome rearrangements.

Two candidate genes identified in the Boars analysis, MSH2 and MSH6, are part of a family of genes known as the mismatch repair (MMR) genes. MMR recognizes mismatched nucleotides in the DNA and performs an excision-resynthesis reaction to repair the mismatch (Spies and Fishel, 2015). The MSH2 protein complexes with MSH6 to form MutSα, which recognizes base-base mismatches and single and double nucleotide insertions (Johnson et al., 1996). Mutations in the MMR genes are associated with Lynch syndrome, a non-polyposis colorectal cancer, as well as other malignancies such as cancers of the small bowel and biliary tract (Aaltonen et al., 1998; Aarnio et al., 1999; Watson et al., 1998). Over 100 mutations of MMR genes have been reported, primarily truncations resulting in a loss of function, with mutations in the MSH2 gene thought to result in more severe phenotypes than mutations of the MSH6 gene (Vasen et al., 2001; Hendriks et al., 2004; Peltomaki and Vasen, 2004; Janavicius and Elsakov, 2012).

Recent studies of these genes have proposed an association between MMR genes and DNA repair. The protein products of MSH2 and MSH6 have been shown to accumulate near the sites of DSBs and to complex with ATR, a major initiator of the DNA damage response, helping to initiate DNA repair via HR (Hong et al., 2008; Pichierri et al., 2001). MSH2-depleted cells have a reduced ability to respond to DNA damage and a correspondingly large number of DSBs (Van Oers et al., 2014; Burdova et al., 2015). The loss of function of MSH2 or MSH6 could therefore inhibit the ability of the cell to efficiently initiate the DNA damage response, resulting in the persistence and accumulation of DSBs throughout the genome over time. Although the precise mechanism is unclear, the persistence of DSBs in cells is associated with genomic instability and the generation of structural rearrangements (Alt et al., 2013). It could therefore be proposed that animals deficient in MSH2 or MSH6 activity could have a genomic environment more conducive to the generation of chromosome rearrangements in the event that chromosome breaks are generated.

FANCM is one of eight Fanconi anemia (FA) genes whose products assemble into the FA core complex in response to the recognition of DNA damage (Whitby, 2010). This core complex ubiquitinates FANCD2 and FANCI, which in turn complex with BRCA1 and RAD51, resulting in the stabilization of DSBs and facilitating their eventual repair (Taniguchi et al., 2002). FANCM itself is a sensor molecule involved in the activation of several repair and signalling pathways. FANCM anchors the complex and heterodimerizes with another protein, FAAP24, to activate the DNA damage response and promote DNA repair (Ciccia et al., 2007; Kim et al., 2008). FANCM also promotes the repair of stalled replication forks by acting as a branch point translocase which encourages their regression (Gari et al., 2008). Mutations in FANCM are known to impair cellular responses to DNA damage, increasing the susceptibility of cells to DNA damage and increasing genomic instability (Fouquet et al., 2017). It thus appears that, as with MSH2 and MSH6, the loss of function of FANCM could lead to an inadequate response to DNA damage, allowing any DSBs generated to persist and accumulate in cells, and produce a genomic environment more conducive to the generation of chromosome rearrangements.

Another gene, ACTR5, produces the ACTR5 (or ARP5) protein, a key component of the INO80 complex. The INO80 complex is vital to DNA repair and is recruited to sites of UV radiation-induced DNA lesions, playing a role in recombinational repair (Morrison et al., 2004; van Attikum et al., 2004; Tsukuda et al., 2009). ACTR5 plays a crucial role in recruiting the INO80 complex to the chromatin, where it participates in chromatin remodelling, allowing DNA repair proteins to access the lesions for repair (Kitiyama et al., 2008). Although ACTR5 has not specifically been implicated in INO80 defects, mutations in genes that encode components of the INO80 complex are associated with hypersensitivity to UV radiation and an impaired response to DNA damage (Shen et al., 2000; Wu et al., 2007; Jiang et al., 2010). This results in INO80 mutants having DNA lesions that persist four times longer than in normal individuals (Jiang et al., 2010). As with the previous three genes, ACTR5 plays a role in the initial response to DNA damage and a major role in facilitating its faithful repair. It could therefore be proposed that a loss of function of this gene could lead to the persistence of DSBs generated throughout the genome, and result in a cellular environment more conducive to the generation of chromosome rearrangements.

SRC is another gene that acts during the initial stages of the response to DNA damage. The SRC gene encodes a member of the Src family of tyrosine kinases, which are known to target DNA damage response (DDR) proteins. The SRC protein is known to become phosphorylated in response to IR, resulting in the interruption of cell cycle progression via the silencing of the ATR/Chk1 signalling cascade (Fukumoto et al., 2014). Subsequent dephosphorylation of Src results in the degradation of checkpoint kinases and allows the cell cycle to continue (Fukumoto et al., 2014). In addition, SRC has also been suggested to be an oncogene, as its activity has been shown to result in recovery from stalled replication forks, resulting in cell proliferation (Fukumoto et al., 2014). Meanwhile, inhibitors of SRC result in prolonged cell cycle arrest and eventual apoptosis (Fukumoto et al., 2014). Inhibition of SRC delays the recovery from cell cycle arrest following DNA repair, and results in the persistent activation of ATM and ATR kinases. Although a role for the SRC gene in the generation of chromosome rearrangements is unclear, its association with the ATM and ATR signalling pathways, with which several of the above genes are also associated, indicates that a change in function could alter the way in which the cell responds to DNA damage. In this case, we could again propose that animals deficient in SRC could have a genomic environment more conducive to the generation of chromosome rearrangements through the persistence of genomic DSBs.

TOP1 is a DNA topoisomerase which helps to control and alter the topological structure of DNA during transcription. Topoisomerases assist this process by producing temporary nicks in the DNA, allowing the DNA to rotate freely and relieving the physical stress caused by the DNA pulling against supercoils (Champoux, 2001). The loss of function of TOP1 is quite damaging to cells, as the trapping of topoisomerase via topoisomerase inhibitors disrupts this process, stabilizing the cleavage complexes and preventing the re-ligation of the single strand breaks. This leads to the impairment of transcription, as supercoiling prevents the progression of replication forks, which subsequently collapse and in some cases form DSBs (Koster et al., 2007). For this reason, topoisomerase inhibitors are one of the methods used to treat cancer, as they primarily act on dividing cells and suppress their division (Xu and Her, 2015). Although there has been little study of TOP1 deficient cells, it could be proposed that cells deficient in TOP1 are at increased risk of generating DSBs during transcription due to the failure to properly unwind DNA, leading to the stalling of replication forks and the formation of DSBs. This could lead to an increase in the number of DSBs in each cell, increasing the odds that multiple DSBs exist in a cell simultaneously and providing the substrate necessary for reciprocal translocations to occur. This could in turn result in the accumulation of DSBs beyond the capacity of cells to repair in a timely manner, leading to a genomic environment more susceptible to the mis-repair of chromosome breaks and resulting in the production of chromosome rearrangements.

The retinoblastoma transcriptional co-repressor like 1 gene (RBL1) encodes for a protein which is similar in nature to the product of the retinoblastoma 1 (RB1) gene. Although little is known about the precise function of RBL1, it appears to have similar activity to RB1, and is thought to serve as a tumour suppressor. RB1 is associated with several cellular functions including the cell cycle and DNA repair, with cells deficient in RB1 having deficiencies in DNA repair, and an increase in free DSBs (Cook et al., 2015). The loss of RB1 is associated with loss of cNHEJ and delayed DSB clearance, with a concomitant increase in the use of error prone aNHEJ (Rothkamm et al., 2003). It is also thought that RB1 may facilitate chromatin modification via recruiting chromatin modifiers (Manning and Dyson, 2012). Overall it appears that RB1 plays a supporting role in cNHEJ, which is diminished in its absence. It is unclear whether the use of aNHEJ leads to an increase in rearrangements itself, or if it tends to be active in environments already conducive to the generation of rearrangements. The associated delay in the processing of DSBs in the absence of RB1 however as with several of the other candidate genes may lead to a cellular environment conducive to rearrangement. Although the exact role of RBL1 has yet to be elucidated, if it does indeed have similar activity to RB1, then there is a strong case that a change in function of RBL1 could promote chromosome rearrangement in some animals.

Using the genotyping data from the GWAS, we called CNV using PennCNV, which is typically favoured for porcine CNV calling (Winchester et al., 2009; Ramayo-Caldas et al., 2010). The quality of CNV calls could not be assessed, as the SNP arrays in most instances were performed by the farms, with the genotyping data being supplied later. Thus, we had few suitable DNA samples in our possession to perform quality assurance. To compensate, we applied the same quality control of samples and SNPs as we did for the GWAS and applied additional filters in PennCNV to remove samples with excessive Log-R Ratio noise. In total we identified 137 CNVR across 51 cases and controls. The majority of CNVR were exclusive to one or two samples, with several of those being cases of inheritance between a parent and offspring. Case animals tended to have more CNV than control animals, with 1.54x more CNV on average. There was also a clear bias for case animals to have more copy number losses than gains, especially relative to control animals.

CNVs, like reciprocal translocations, are formed from a rearrangement of genetic material; however, the mechanisms by which they form are thought to be different. Whereas reciprocal translocations result from the breakage of chromosomes, followed by their mis-repair using DNA repair mechanisms such as NHEJ, CNVs are thought to be generated by aberrant NAHR between highly homologous chromosome regions. Although the mechanisms behind these events differ, the tendency for pigs within families associated with chromosome rearrangements to have more CNVs could suggest that similar cellular environments may underlie both types of rearrangements, with animals from such families having less stable genomic environments more conducive to the generation of rearrangements of genetic material.

GO analysis of the genes in close proximity to these CNVRs showed few genes involved in processes relevant to chromosome rearrangement amongst the case samples. Just five genes related to significant GO terms appeared relevant to chromosome rearrangement: GEN1, PRPF19, SMC6, TONSL, and FANCM. Notably, FANCM has already been discussed amongst the candidate genes from the Boars GWAS. As most CNVR were limited to a handful of samples rather than being associated with cases at large, we opted to consider these genes from the perspective of each individual carrying the CNVR of interest. In those cases we considered how a loss or gain of activity of the particular gene could in theory result in the carrier being more susceptible to generating chromosome rearrangements. In the case of FANCM, a copy number loss was apparent near the site of this gene, suggesting a loss of function could be present. In that case we would expect an impairment in the ability to initiate the DNA damage response, leading to the persistence of DSBs and creating an environment more conducive to generating chromosome rearrangements.

The TONSL gene overlapped with a copy number loss region in a carrier boar. Similar to other candidate genes identified in carrier boars, the protein product of TONSL plays no direct role in DNA repair, but instead complexes with another protein, MMS22L, forming the MMS22L-TONSL complex. TONSL functions as a scaffold for the assembly of larger DNA repair complexes (Piwko et al., 2011). The MMS22L-TONSL complex recruits recombination substrates and stimulates the assembly of RAD51 filaments at the sites of collapsed replication forks, helping to stabilize them and promote repair via HR (Duro et al., 2010; O’Donnell et al., 2010). A loss of function of TONSL could lead to a reduced ability to form this complex, and thus delay the repair of replication forks, possibly leading to the generation of DSBs. In such cells, the tendency to generate more DSBs could increase the likelihood that multiple DSBs exist at the same time, increasing the odds of a rearrangement occurring.

The GEN1 gene overlapped with a CNV region in the dam of a rearrangement carrier, exhibiting a copy number loss. GEN1 encodes a protein that participates in the resolution of Holliday junctions, four-way structures which covalently link DNA during homologous recombination and DNA repair (Boddy et al., 2001). The Gen1 protein acts by producing nicks in the DNA across the junction, releasing the four-way conformation and allowing the resulting products to be ligated, resolving the junction into separate DNA strands (Wang et al., 2017). The resolution of Holliday junctions is essential for the proper recombination of chromosomes, as they result in crossover events. If left unresolved, Holliday junctions interfere with proper chromosome segregation, which may lead to the generation of DSBs. Gen1 is essential for the HR repair of Holliday junctions (Wang et al., 2017). The loss of function seen at this locus could impair the ability of Gen1 to promote the repair of Holliday junctions, allowing them to persist and increasing genomic instability in these cells. Over time this could increase the likelihood of chromosome rearrangements occurring through the accumulation of DSBs.

The SMC6 gene is located in the vicinity of GEN1 and is implicated in the same dam. SMC6 produces a protein which is a key component of the SMC5-SMC6 complex. The SMC5-SMC6 complex is poorly defined; however, it is thought to play a role in DNA repair by homologous recombination. Studies have shown that the SMC5-SMC6 complex is required to initiate a coordinated response to DNA damage by homologous recombination (Lehmann, 2005). The loss of function seen at the site of this copy number variant could indicate that carriers of such a variant face an impaired ability to initiate the DNA damage response. As discussed above, an impaired ability to repair DNA increases genomic instability over time and increases the likelihood that DSBs from two chromosomes will exist at the same point in time, providing substrates for chromosome rearrangement to occur.

The PRPF19 gene overlapped with a CNV region exhibiting a copy number loss in the dam of a carrier. PRPF19 plays a role in pre-mRNA splicing, and complexes with CDC5L to form the PRP19-CDC5L complex, which plays a role in the DNA damage response (Marechal et al., 2014). The complex is recruited to the sites of DNA damage by the RPA complex, where PRPF19 directly ubiquitinates RPA1 and RPA2, which in turn promotes the recruitment of ATRIP, a regulatory partner of ATR kinase, promoting ATR activation (Marechal et al., 2014). RPA preferentially localizes to transcribed genes in response to DNA damage, suggesting that PRPF19 may act as a sensor of genomic instability during transcription (Jiang and Sancar, 2006). In keeping with the theme of other candidate genes, PRPF19 plays a role in the initiation of the DNA damage response, suggesting that a loss of function may hinder the ability of cells to respond to DNA damage, increasing genomic instability and promoting rearrangement.

As many genes nearby significant SNP associations and CNV in carrier boars, and a handful of dams play roles in the initiation of DNA repair, it appears that a mechanism of chromosome rearrangement is more dependent on the ability to respond to collapsed replication forks and DSBs in a timely manner, rather than on impairment of the DNA repair process itself.

Although a precise mechanism of formation is unclear at this time, it could be suggested that these boars and at least one parent carry genomic variants which impair the ability of their cells to respond to DNA damage in a timely manner. This allows DSBs to persist in cells for a longer period of time than normal and allows multiple DSBs to exist in the cells at the same time, providing an opportunity for rearrangement. This persistence in turn may push the cell towards alternative pathways associated with the generation of errors, such as aNHEJ (Rodgers and McVey, 2017). In such a cellular environment the DNA repair process itself is not impaired, allowing routine DSBs to be faithfully repaired. Increased susceptibility to rearrangement would thus result not from impairment of the repair process itself, but from impairment of the initiation of the response. This would increase the longevity of DSBs within the cell, making it more likely that two or more DSBs exist simultaneously within the cell, providing substrate for rearrangement, and perhaps stressing the cell into using more error-prone DNA repair methods in order to reduce the number of DSBs.
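
The link between DSB persistence and the coexistence of multiple DSBs can be illustrated with a deliberately simple probability sketch. It is not a model used in this study: it assumes that breaks arise independently at a constant rate and that each persists for a fixed average time, so that the number of unrepaired breaks present at any moment follows a Poisson distribution. The function name and all numerical values are hypothetical and chosen only for illustration.

```python
import math


def prob_two_or_more_dsbs(breaks_per_day: float, persistence_days: float) -> float:
    """Probability that at least two DSBs coexist, under a Poisson assumption.

    Assumes breaks arise independently at a constant rate and each persists
    for `persistence_days` on average, so the number of unrepaired breaks
    present at a given moment is Poisson with mean rate * persistence.
    """
    mean_unrepaired = breaks_per_day * persistence_days
    # P(N >= 2) = 1 - P(N = 0) - P(N = 1)
    return 1.0 - math.exp(-mean_unrepaired) * (1.0 + mean_unrepaired)


# Hypothetical values for illustration only (not measured in this study).
BREAKS_PER_DAY = 10.0
NORMAL_PERSISTENCE_DAYS = 0.01
DELAYED_PERSISTENCE_DAYS = 0.04  # a four-fold slower response to damage

normal = prob_two_or_more_dsbs(BREAKS_PER_DAY, NORMAL_PERSISTENCE_DAYS)
delayed = prob_two_or_more_dsbs(BREAKS_PER_DAY, DELAYED_PERSISTENCE_DAYS)

print(f"normal response:  {normal:.4f}")
print(f"delayed response: {delayed:.4f}")
print(f"fold increase:    {delayed / normal:.1f}")
```

Under these assumptions the chance of two or more coexisting breaks rises faster than linearly with persistence time when breaks are rare, which is the intuition behind proposing that a delayed response, rather than defective repair, could raise the likelihood of rearrangement.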

Despite the large number of genes near the sites of SNP associations and CNV, it is unclear whether they actually play a role in promoting chromosome rearrangement in these animals, or whether these associations were found by chance. A larger number of samples should help to alleviate this concern. It is also unclear to what extent changes in the activity of these genes may increase the likelihood of rearrangement. Given the large numbers of pathways that respond to DNA damage and initiate repair, any genetic association with chromosome rearrangements is likely multi-genic in nature, with several variants producing small additive effects to raise the likelihood of chromosome rearrangement. Despite this, even with a small number of carrier boars we were able to find associated SNPs near relevant genes, a promising development in understanding how the genomes of individuals may contribute to rearrangement. Although it is currently unclear whether such information could be incorporated into SNP arrays to judge the relative risk of pigs within families acquiring rearrangements, these results do suggest that carriers of rearrangements may have genetic variants that could feasibly promote rearrangement through an inefficient response to DNA damage, allowing for the persistence of DSBs and increasing the likelihood for rearrangement to occur. Further study which expands on these efforts and is able to determine precise genomic variants in the pig genome associated with chromosome rearrangement, as well as their relative effect on the formation of chromosome rearrangements, could identify SNP associations that could be incorporated into existing herd management infrastructure. Such SNP associations could identify boars and families most at risk of generating chromosome rearrangements, allowing such animals to be bred out of herds and reducing the overall risk of de novo rearrangement in the population.

Through our study of twelve Boar carriers of non-recurrent constitutional translocations, we identified five SNP associations, four of which were near genes which played roles in the maintenance of genome integrity, or processes relevant to DNA repair. The identification of these genes, as well as others identified near the sites of CNVR identified in the parents of Boar carriers, provides promising evidence that chromosome rearrangements are not truly random events in the genome, but may instead be influenced by the activity of genes. Indeed, the positions of rearrangement breakpoints, which have long been studied, appear dependent on the genomic architecture, with more fragile genomic regions appearing to break more often. This, however, appears to play no role in the actual generation of rearrangements within populations. Instead it appears that a subset of the population may carry SNPs or CNVs which alter the activity of genes relevant to DNA maintenance and repair, which may in turn increase the likelihood that any given chromosome break goes unrepaired or mis-repaired, regardless of genomic position, resulting in such animals acquiring more chromosome rearrangements over time. Future studies which show more concrete associations between these SNPs and carriers of chromosome rearrangements could allow these SNPs to be incorporated into existing genetic management programs, allowing animals more at risk of acquiring chromosome rearrangements to be excluded from breeding eligibility, which in turn may further lower the prevalence of chromosome rearrangements in swine herds below the de novo rate of formation.


General Discussion

An investigation into the chromosome composition of 6491 reproductively unproven young boars in Canada led to the observation of 101 carriers of chromosome rearrangements. Amongst these were 59 carriers of 32 distinct constitutional reciprocal translocations, as well as 39 carriers of mosaic chromosome rearrangements, the first report of such rearrangements in the pig through routine cytogenetic screening. Investigations of the gross genomic landscape surrounding breakpoints revealed a general architecture associated with rearrangement. This indicates that less condensed, more transcriptionally active chromatin regions are more susceptible to breakage; however, the resolution of this analysis proved too low to evaluate the exact causes of chromosome breakage in susceptible regions. Lastly, we attempted to find genomic variants in carriers of rearrangements and their parents, identifying five significant SNPs in carrier boars, and seven nearby genes with apparent functional relevance to chromosome rearrangement. Carrier boars and their parents also had an elevated number of CNVR, with a strong tendency towards loss of function, suggesting that these pigs may be predisposed to non-reciprocal rearrangements as well. As with the significant SNPs in the carrier boars, several genes that were functionally relevant to the promotion of chromosome rearrangements were associated with specific CNVR.

Cytogenetic screening revealed several novel rearrangements in the pig genome that had yet to be reported by other laboratories. These included over 30 novel reciprocal translocations, a new inversion, and for the first time a constitutional deletion of a whole chromosome arm, a del(Yq-). This reveals that the pig genome may be subject to a breadth of rearrangement well beyond what has previously been observed, and it is likely that just a small percentage of possible rearrangements in the pig genome are known. Indeed, recurrence of constitutional rearrangements is quite rare in mammalian genomes, with just a handful of cases reported in humans (Ou et al., 2011). Thus it was surprising when we observed a boar carrying a rcp(12;14)(q13;q21) rearrangement that had previously been reported in France by Pinton et al. (2005). Although we could not sequence both rearrangements in order to confirm that the breakpoints were in close proximity on the chromosomes, similar rearrangements in humans tend to occur within a small range, and we thus propose this is likely the case here. Most recurrent rearrangements in the human genome, for example, are thought to occur via NAHR between highly homologous LCRs and at sites particularly susceptible to breakage such as PATRRs (Kurahashi et al., 2001; Ou et al., 2011). This provides a future topic for investigations into porcine rearrangements in order to determine if similar repetitive elements may be present at the sites of breakpoints and if they may promote recurrent rearrangements.

In addition to the constitutional rearrangements, for the first time in the domestic pig we observed mosaic rearrangements through routine cytogenetic screening. Mosaic rearrangements are not a new phenomenon in the pig and have been previously reported in a pig exhibiting physical malformations, and through the intensive screening of thousands of cells in order to observe a hypothesized rearrangement (Hansen-Melander and Melander, 1970; Musilova et al., 2014). In this case we found the mosaic rearrangements by chance and were not specifically looking for their presence. Several of the mosaic rearrangements were unique cases, not having previously been identified in a mosaic or constitutional form. Meanwhile we also observed several cases of recurrent rearrangements, a t(7;9) and a t(7;18), which were present in 21 and 3 boars respectively. These rearrangements, along with a t(9;18) which we also observed, were previously reported by Musilova et al. (2014) as the apparent result of aberrant recombination between genes of the T-cell receptor. This indicates that mosaic rearrangements in the pig genome are influenced by different factors than constitutional rearrangements, such as less extreme selection pressures, allowing rearrangements resulting in gene fusions or interruption to persist in a mosaic form. Indeed, the prevalence of these rearrangements also appears to be quite high, with an estimated 0.53% of peripheral blood lymphocytes appearing to carry a mosaic rearrangement. This is the first estimation of the prevalence of mosaic rearrangements in porcine peripheral blood lymphocytes and provides a reference for future investigations.

With the addition of 34 new cases of distinct constitutional rearrangements, and 41 new mosaic rearrangements, there are now over 270 chromosome rearrangements reported in the domestic pig (Table S5). These rearrangements include 201 constitutional reciprocal translocations, 42 mosaic rearrangements, 2 complex reciprocal translocations (three or more breakpoints), 15 paracentric and pericentric inversions, 6 Robertsonian translocations, 5 unbalanced rearrangements, and 1 deletion. With the rearrangements reported in this study Canada has now reported the second largest number of chromosome rearrangements in the domestic pig, behind only France. Nearly 1 in 4 (23%) of all described constitutional reciprocal translocations, and 1/3 of all chromosome rearrangements described in the domestic pig originated from Canada.
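
As a quick consistency check on the tally above, the category counts listed in the text can be summed to confirm the "over 270" total. This is only an arithmetic sketch; the dictionary keys are labels chosen here for readability, and the counts are those quoted in the paragraph above.

```python
# Counts of reported rearrangements by category, as quoted in the text above.
reported = {
    "constitutional reciprocal translocations": 201,
    "mosaic rearrangements": 42,
    "complex reciprocal translocations": 2,
    "paracentric and pericentric inversions": 15,
    "Robertsonian translocations": 6,
    "unbalanced rearrangements": 5,
    "deletions": 1,
}

total = sum(reported.values())
print(f"Total reported rearrangements: {total}")

# Share of each category in the overall catalogue, as a percentage.
for category, count in reported.items():
    print(f"{category}: {100.0 * count / total:.1f}%")
```

The categories sum to 272, in agreement with the statement that over 270 rearrangements have now been reported.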

The introduction of a routine cytogenetic screening program revealed novel constitutional rearrangements in pigs, allowing carriers to be removed from breeding eligibility. A total of 34 distinct rearrangements were identified in 59 carriers, and the removal of carriers from our largest population resulted in a 54% decline in the prevalence of rearrangements, from 1.14% to 0.52%, over a five-year period. This 0.52% figure appears to reflect the de novo rate of formation for translocations in Canadian swine herds, as it reflects both the prevalence in the last three generations of that herd (after existing rearrangements were screened out) and the prevalence of distinct (first-observation) rearrangements in general. This figure is close to the 0.47% reported in France, indicating that within just a few years of introducing screening the prevalence of rearrangements may be reduced to approximately 0.5%, the proposed de novo rate of formation (Ducos et al., 2007). The identification and removal of carriers has clear financial benefits for breeders, as it removes one of the most common causes of hypoprolificacy from herds; cytogenetic screening is estimated to save $7400 over a two-year period (King et al., 2019). Screening must, however, be continued in perpetuity to ensure that all prospective breeding boars carrying chromosome rearrangements are removed; otherwise the prevalence will creep back up over subsequent generations.
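The arithmetic behind these prevalence figures can be reproduced in a few lines. The following is a minimal sketch in Python and is not part of the original analysis; the function names are hypothetical, and the overall counts (101 carriers among 6491 boars) are the totals stated in the abstract.

```python
def prevalence_pct(carriers: int, screened: int) -> float:
    """Prevalence as a percentage of screened animals."""
    if screened <= 0:
        raise ValueError("screened must be positive")
    return 100.0 * carriers / screened


def relative_decline_pct(before_pct: float, after_pct: float) -> float:
    """Relative decline between two prevalence figures, as a percentage."""
    if before_pct <= 0:
        raise ValueError("before_pct must be positive")
    return 100.0 * (before_pct - after_pct) / before_pct


# Totals from the abstract: 101 carriers among 6491 screened boars.
overall = prevalence_pct(101, 6491)
# Prevalence in the largest population before and after carrier removal.
decline = relative_decline_pct(1.14, 0.52)

print(f"overall prevalence: {overall:.2f}%")
print(f"relative decline:   {decline:.0f}%")
```

Run as written, this gives an overall prevalence of 1.56% and a relative decline of 54%, matching the figures quoted in the text.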

Given the high number of reciprocal translocations available for analysis in the pig genome, and the substantial re-use of breakpoints, it appeared relevant to explore whether breakpoints were distributed non-randomly and whether some regions of the genome were more susceptible to rearrangement. Although chromosomes and chromosome arms initially appeared to rearrange in proportion to their length, this did not hold once length was accounted for. Chromosomes such as 7, 10, 12, and 14 had very high translocation frequencies that were independent of length. Hence, the structure of chromosomes appeared to have very little to do with rearrangement: factors previously proposed to influence the number of rearrangements on porcine chromosomes, including length, gene density, the presence of chromosome arms, and the length of chromosome arms, appeared to have no impact on the translocation frequency of the whole chromosome (Yu et al., 1978; Aurias et al., 1978; Warburton et al., 1991; Bickmore et al., 2002; Lin et al., 2018). Instead, plotting breakpoints on an ideogram of the pig karyotype revealed a large degree of clustering on a handful of cytogenetic bands.

Those chromosomes and chromosome arms with more of these clusters relative to their length were the same chromosomes with higher translocation frequencies. This indicated that rearrangement in the pig genome was highly concentrated in just a handful of areas, which we then explored in more detail.
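One common way to ask whether breakpoint counts simply track chromosome length is to compare observed counts with the counts expected under a length-proportional model. The sketch below illustrates that idea in Python; it is not necessarily the test used in this thesis, the function names are hypothetical, and the chromosome labels, lengths, and counts are invented toy values rather than data from this study.

```python
def expected_by_length(lengths, total_breaks):
    """Expected breakpoints per chromosome if breaks were proportional to length."""
    total_len = sum(lengths.values())
    return {c: total_breaks * lengths[c] / total_len for c in lengths}


def chi_square_stat(observed, expected):
    """Pearson chi-square statistic summed over chromosomes."""
    return sum((observed[c] - expected[c]) ** 2 / expected[c] for c in observed)


# Toy values only (hypothetical): lengths in Mb and observed breakpoint counts.
lengths_mb = {"A": 300.0, "B": 150.0, "C": 50.0}
observed = {"A": 20, "B": 15, "C": 15}

expected = expected_by_length(lengths_mb, sum(observed.values()))
stat = chi_square_stat(observed, expected)
# Ratio above 1 means more breakpoints than length alone would predict.
enrichment = {c: observed[c] / expected[c] for c in observed}

print(f"chi-square statistic: {stat:.2f}")
for chrom in sorted(enrichment):
    print(chrom, f"{enrichment[chrom]:.2f}")
```

In this toy case the short chromosome "C" carries three times as many breakpoints as its length predicts, the kind of length-independent excess described above for chromosomes such as 7, 10, 12, and 14.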

Although the resolution with which we could explore the genome was limited to the cytogenetic band level, we were able to identify a general architecture of cytogenetic bands that appeared more susceptible to rearrangement. The cytogenetic bands most susceptible to rearrangement were, on average, more euchromatic and more gene dense, implying a less condensed, more transcriptionally active chromatin environment. Those bands were also enriched for simple repeats and for tRNA elements such as PREs, tRNAGlu-derived SINEs specific to the porcine lineage, which appear to have a similar organization to Alu repeats in primates (Groenen et al., 2012; Funkhouser et al., 2017).

Although together these factors explain only a small amount of the variation in translocation frequency between cytogenetic bands, they do provide an overall picture of what a band more susceptible to rearrangement generally looks like. It is apparent, however, that much more specific factors within each cytogenetic band, associated with the features above, are responsible for the susceptibility to rearrangement.

This led to the identification of cytogenetic bands in the pig genome that appeared to be hotspots for chromosome rearrangement. As the resolution with which we could view these bands was limited, we considered whether hotspots for rearrangement in the human genome were homologous to hotspots in the pig genome. Investigations of rearrangement in the human genome have revealed cytogenetic bands with a similar architecture associated with susceptibility to rearrangement (Yu et al., 1978; Aurias et al., 1978; Koduru et al., 1988; Warburton, 1991; Cohen et al., 1996; Marechal et al., 2016; Li et al., 2018). Comparing the homologous regions between the two genomes, however, revealed that porcine and human rearrangement hotspots were rarely homologous to one another. This suggests that although the general environment of rearrangement is consistent across the two species, the factors promoting rearrangement likely differ to some degree. It is difficult to ascertain the factors most associated with chromosome breakage, as repetitive elements once thought to mediate many rearrangements have been shown to appear less often than expected at breakpoint junctions (Nilsson et al., 2017). As such, there is still much to learn about the genomic landscape surrounding breakpoint junctions in order to better understand the factors that may predispose these regions to breakage.

Surprisingly, however, the opposite was true of mosaic rearrangements in the pig genome. The general architecture of bands susceptible to rearrangement remained largely the same, but the breakpoints of mosaic rearrangements were often homologous to those of rearrangements in the human genome. In these cases, the porcine mosaic rearrangements were often homologous to mosaic rearrangements in the human genome that are the result of apparent aberrant recombination between two genes, or of a translocation bringing two genes together. In the human genome those rearrangements are largely known due to their associations with various types of cancers, particularly of the blood (Freed et al., 2014). This suggests that the same mechanisms of rearrangement are present in both species and are actively generating rearrangements. Many genes are highly conserved over the course of evolution, which likely promotes the same rearrangements in different species, even ones that diverged millions of years ago. Examples are the Philadelphia chromosome in humans and the Raleigh chromosome in canids, both of which bring together the BCR and ABL genes (Breen and Modiano, 2008). The remainder of the genome, however, is not nearly as conserved, particularly repetitive elements, suggesting that these more variable regions are likely driving the differences in rearrangement position between the human and pig genomes.

Although the exact factors promoting individual chromosome breakage events in the pig genome remain unknown, we know that each rearrangement is preceded by the generation of a DSB. The cell must then detect the break and initiate the DNA repair response, bringing the cell cycle to a halt, remodelling the local chromatin, and recruiting DNA repair proteins. A mistake must then be made in the process of DNA repair, permanently ligating two non-homologous chromosome segments together and resulting in a reciprocal translocation. Investigating the genomes of rearrangement carrier boars revealed five SNP associations on three chromosomes that met the significance threshold. In the vicinity of four of these SNP associations lay genes whose functions are associated with aspects of DNA damage and repair thought to be relevant to the generation of chromosome rearrangements. The functions of the protein products of those genes suggest that defects in the process of DNA repair itself may not necessarily drive the high prevalence of chromosome rearrangements seen in the pig; instead, some pigs may carry genomic variants that inhibit the ability of cells to respond to DSBs in a timely manner, allowing DSBs to persist and accumulate in cells. Although this does not appear to influence in any way the propensity of a given cell to experience chromosome breaks, it is likely to impair the ability of cells to respond to chromosome breaks once they occur. This increases the overall genomic instability of the cell, raising the likelihood that chromosome rearrangements are generated.

Given that DSBs are routinely generated in cells, defects in the process of DNA repair are likely to be quite deleterious to an individual. Cells with defects in a DNA repair mechanism often undergo an increase in mutations, which increases genomic instability and is associated with the development of cancer (O’Driscoll, 2012). As such, it appears that such defects may not be compatible with life if held in a constitutional state. Indeed, it is rare to observe a litter with more than one de novo rearrangement carrier, suggesting that most cells from the parent who generated the rearrangement are still chromosomally normal. Thus it could be proposed that genomic variants which play a small but additive role in inhibiting the cell from responding quickly or efficiently to DSBs could lead to an increased risk of developing chromosome rearrangements, without greatly increasing the mutagenic potential of the cells, by increasing the number of unbalanced events resulting from defective or alternative DNA repair methods.

It has long been suggested that the persistence of DSBs is associated with chromosome rearrangements, but it is unclear exactly how this occurs (Alt et al., 2013). DSB persistence has been proposed to push the cell towards alternative mechanisms of DNA repair that are often associated with chromosome rearrangement, such as aNHEJ. It is, however, unclear why this would occur, and whether it is the persistence of the DSB itself or the use of aNHEJ that promotes rearrangement.

Indeed, the exact mechanism by which any chromosome rearrangements occur is largely unknown.

Although the general events necessary for the generation of rearrangements are known, the individual factors that may promote rearrangement have yet to be elucidated. No single DNA repair mechanism has been shown to cause rearrangement; instead, a variety of canonical (NHEJ) and non-canonical (aNHEJ, MMBIR, and FoSTeS) mechanisms have been implicated at breakpoint junctions (Nilsson et al., 2017). Thus, future research should determine whether DSB persistence can push cells towards error-prone methods of DNA repair, and how this occurs.

The rarity of recurrence among constitutional rearrangements in the pig genome suggests that, whichever DNA repair mechanisms generate these rearrangements, there is little preference for specific chromosome regions to rearrange together. At the chromosome level, most chromosome-to-chromosome rearrangements appear almost randomly assorted, with few rearrangement partners displaying any preference for one another. The same holds at the cytogenetic band level: 15q13, which has participated in 11 distinct rearrangement events, has done so with a different cytogenetic band each time. Thus, the rearrangements generated in the pig genome appear largely to be a function of which chromosome regions experience breakage at any given time, rather than of specific bands preferentially rearranging with one another. Although chromosome breaks themselves occur non-randomly, with specific cytogenetic bands appearing more prone to breakage, this does not appear to determine which rearrangements form, and any two DSBs that exist at the same time within a cell may be just as likely to undergo rearrangement as any other pair. As such, we have little evidence to suggest that the pig genome is any more susceptible to breakage than that of other species such as humans. The evidence does suggest, however, that genomic variants exist in some pigs that reduce their ability to respond efficiently to DNA damage, so that the DSBs that do occur more frequently undergo rearrangement.
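The distinction drawn here, between a band being re-used as a breakpoint and the same pair of bands recurring together, can be made concrete with a simple tally. The following Python sketch is illustrative only: the function name is hypothetical, and the list of translocations is a toy example with placeholder partner labels, not the catalogue of rearrangements analysed in this thesis.

```python
from collections import Counter, defaultdict


def summarize(translocations):
    """Tally breakpoint re-use per band and recurrence per band pair."""
    band_use = Counter()          # how often each band appears as a breakpoint
    pair_use = Counter()          # how often each unordered band pair appears
    partners = defaultdict(set)   # distinct partner bands seen for each band
    for a, b in translocations:
        band_use[a] += 1
        band_use[b] += 1
        pair_use[frozenset((a, b))] += 1
        partners[a].add(b)
        partners[b].add(a)
    return band_use, pair_use, partners


# Toy input (hypothetical): one band re-used with a new partner each time,
# and one band pair observed twice.
toy = [
    ("15q13", "partner_1"),
    ("15q13", "partner_2"),
    ("15q13", "partner_3"),
    ("12q13", "14q21"),
    ("12q13", "14q21"),
]

band_use, pair_use, partners = summarize(toy)
recurrent = {tuple(sorted(p)): n for p, n in pair_use.items() if n > 1}

print("15q13 used", band_use["15q13"], "times with",
      len(partners["15q13"]), "distinct partners")
print("recurrent pairs:", recurrent)
```

A band whose usage count equals its number of distinct partners shows re-use without recurrence, the pattern described above for 15q13; a pair with a count above one is a recurrent rearrangement.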

Regions of the pig genome appear more susceptible to breakage; however, the same two bands rarely translocate together. This suggests that rearrangements are generated from whichever DSBs are available rather than routinely from specific DSBs. Repeated constitutional breakpoints are therefore likely a function of the likelihood of a break occurring at that particular region, rather than of a programmed rearrangement event in which two regions rearrange only with one another.

This results in what we observe through screening, where breakpoints are continually re-used because they are more susceptible to breakage, while their partners are in a sense random, reflecting other breaks which occurred simultaneously rather than a band with a specific architecture uniquely compatible with rearrangement. This is also consistent with the idea that some pigs are less able to respond effectively to DNA breakage and so allow DSBs to persist in cells: some chromosome regions would break more often, but the combination of breaks present at any one time would be largely random, accounting for the randomness of the rearrangements seen. However, this does not account for the ability of the DSBs to come together and generate chromosome rearrangements. How repair mechanisms respond to an excess of DSBs in a cell is rarely studied, and it is difficult to determine exactly how such an event would be conducive to the development of rearrangements.

In such individuals it is highly likely that otherwise rare chromosome rearrangements would occur more often, in turn raising the likelihood of germline rearrangements and of carrier offspring.

In summary, routine cytogenetic screening reveals a high prevalence of chromosome rearrangements in Canadian boars, similar to figures reported in France, indicating that the domestic pig has a high prevalence of chromosome rearrangements regardless of breed or herd.

The pig genome itself, at the resolution of cytogenetic bands, is shown to have distinct hotspots for rearrangement; however, the exact reasons why those regions are susceptible to breakage, and whether the pig genome itself is especially susceptible to breakage, remain unknown. In turn, a GWAS of carrier boars revealed several significant SNP associations. Analysis of the genomic regions surrounding these SNPs revealed several genes which play roles in the maintenance of DNA and in the DNA repair process. The functions of these genes suggest that some pigs may carry genomic variants that reduce the ability of cells to respond efficiently to DNA damage, although a direct causal link between these genes and animals more susceptible to acquiring rearrangements has yet to be established. Such a reduced response would result in the persistence of DSBs, which is associated with pushing the cell towards alternative, error-prone mechanisms of DNA repair, and may in turn lead to the generation of reciprocal translocations rather than faithfully repaired chromosomes.

Although there are many questions left to be answered, this study has revealed a number of avenues for future investigation to better understand the factors and mechanisms leading to chromosome rearrangement in the pig genome.

Summary, Conclusions, and Future Directions

Summary

Routine cytogenetic screening of reproductively unproven young boars resulted in the observation of 101 carriers of 74 novel chromosome rearrangements. For the first time, an apparently recurrent constitutional rearrangement, a constitutional chromosome arm deletion, and mosaic rearrangements were identified in the domestic pig through routine cytogenetic screening. Over a five-year period, the prevalence of constitutional rearrangements dropped 54% in a population which successfully removed carriers from breeding eligibility, resulting in a prevalence close to the proposed de novo rate of formation. Subsequent analysis of the breakpoints of rearrangements revealed substantial re-use in both constitutional and mosaic rearrangements. Translocation breakpoints in constitutional rearrangements appeared to be non-randomly distributed: bands characterized by higher gene density, a more euchromatic landscape, and a higher density of two classes of repetitive elements had the highest translocation frequencies. Chromosomes and cytogenetic bands appeared to demonstrate no preference for breakage together, suggesting that rearrangements resulted from whichever DSBs were available rather than from particular DSBs. The opposite was true of mosaic rearrangements, where recurrence was frequent and cytogenetic bands associated with particular genes most often underwent rearrangement together, the result of apparent aberrant recombination between those genes. Subsequent GWAS and CNV analysis revealed several candidate genes associated with carrier boars, and a loss of function in a carrier boar and a handful of carrier dams. These genes had common elements, with many playing roles in initiating the DNA damage response and in preventing DSBs. These results indicate that genomic variants may exist in the pig genome which impair the ability of cells to respond to DSBs, allowing them to persist in cells, which is thought to create a cellular environment more conducive to generating chromosome rearrangements via alternative or error-prone DNA repair mechanisms.

Overall, this work has established 101 new carriers of chromosome rearrangements, including 33 novel constitutional rearrangements, 41 novel mosaic rearrangements, and, for the first time, recurrent constitutional and mosaic rearrangements in the domestic pig. We have also shown that the prevalence of constitutional rearrangements can decrease to the de novo rate of formation within five years of beginning routine screening, with this de novo rate being consistent across different populations of pigs. We have established a distinct genomic architecture of cytogenetic bands associated with a higher frequency of translocation. We have also shown that constitutional and mosaic rearrangements in mammalian genomes show consistently different patterns of formation between species: constitutional rearrangements in pigs and humans appear more associated with repetitive elements, so that hotspots for rearrangement rarely appear in syntenic regions between the two species, while mosaic rearrangements often occur between the same genes in both pigs and humans. Lastly, through a GWAS of rearrangement carrier boars we revealed for the first time SNP associations with carrier boars, as well as CNVR present in carrier boars and their parents.

Genes identified near these SNPs and CNVR appear to play roles in the maintenance of DNA or in DNA repair, suggesting that genomic variants may exist in the pig genome that make some animals more susceptible to acquiring genomic rearrangements through the impairment of DSB repair. This provides distinct avenues for future research into genomic variants linked to the promotion of chromosome rearrangements in the pig genome.

Conclusions

The routine cytogenetic screening of prospective breeding boars in Canada revealed 101 carriers of chromosome rearrangements or abnormalities. This included 31 novel constitutional reciprocal translocations yet to be described in the literature, and the first apparently recurrent constitutional reciprocal translocation identified in the domestic pig, an rcp(12;14)(q13;q21) also identified in France. Amongst these cases we identified two carriers of a Y chromosome rearrangement, the second and fourth such cases ever identified in pigs. We also observed a rare occurrence of two full brothers carrying different reciprocal translocations. Notably, we also identified a deletion of the long arm of chromosome Y in an apparently phenotypically normal boar, the first such deletion ever identified in the domestic pig. Throughout our analysis we also made the first routine observation of mosaic chromosome rearrangements in commercial swine herds, allowing the first estimate of the frequency of somatic rearrangement in peripheral blood lymphocytes, approximately 1/200 cells. Comparison of the mosaic rearrangement breakpoint regions with their human homologous counterparts revealed that many of the rearrangements appear in genomic locations similar to those of human somatic rearrangements associated with various cancers.

The routine cytogenetic screening of over 6000 boars over a five-year period showed a high prevalence of chromosome rearrangements in Canadian swine herds. Over time, those breeders that reliably removed identified carriers from breeding eligibility saw the prevalence of constitutional reciprocal translocations fall to a level resembling the proposed de novo rate of formation. The same decrease, however, was not observed for mosaic rearrangements, showing that different selection pressures and/or mechanisms likely influence their formation.

Comprehensive analysis of translocation breakpoints in the domestic pig revealed that they were non-randomly distributed across chromosomes and cytogenetic bands. This indicates that particular genomic regions are more susceptible to breakage than others and are consistently re-used. Those cytogenetic bands with the highest number of breakpoints and/or highest translocation frequencies were referred to as hotspots for rearrangement in the pig genome. Several chromosomal features were found to be associated with higher translocation frequency, including a more euchromatic chromatin conformation, higher gene density, and the presence of common fragile sites on euchromatic bands. These features, however, explained just a small portion of the variation in translocation frequency and breakpoint number between cytogenetic bands. Analysis of repetitive elements associated with breakpoints in the human genome identified several that were significantly associated with higher translocation frequency, but these still explained little of the overall variation between cytogenetic bands.

More specific analysis of breakpoint locations in the pig genome revealed that they were not more likely to occur in the same homologous regions as human rearrangement breakpoints than would be expected by chance. This indicates that rearrangement breakpoints in the pig genome appear to be influenced by an independent set of factors, perhaps unique to the pig genome, and that the genomic landscape underlying human rearrangements is not necessarily the same in the pig genome, perhaps pointing towards the independent evolution of repetitive elements between those genomes. Surprisingly however the opposite was true when considering somatic rearrangements, as the breakpoints of somatic rearrangements in the pig were routinely found to occur in the same homologous regions as human rearrangements associated with various cancers.

In these cases, it can be proposed that the underlying genomic architecture at these breakpoint junctions is shared, with similar factors, such as gene architecture or aberrant V(D)J recombination, promoting rearrangement at these genomic coordinates in both species.

A GWAS of translocation carrier boars, their sires, and their dams revealed several significant SNPs in the carrier boars and in the dams of carriers. GO analysis of the genes within 2Mb of these SNPs revealed several genes with functions possibly relevant to the promotion of chromosome rearrangement. In the case of the boars, five candidate genes were proposed: MSH2, MSH6, ACTR5, FANCM, and TOP1. Although linkage analysis showed no significant haplotype blocks between these genes and the significant SNPs, the identification of genes near these SNPs with functions relevant to the promotion of translocations is quite promising. These genes are related to the recognition of DNA damage, the initiation of the DNA damage response, and the relief of stress from supercoiling during DNA transcription. Although these genes do not play direct roles in DNA repair, their functions allow a model of rearrangement to be proposed which is consistent with rearrangement being elevated in some individuals, but not affecting every gamete.
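The step of collecting genes within 2Mb of a significant SNP amounts to a simple distance filter on genomic coordinates. A minimal Python sketch of that filter is given below under stated assumptions: the `Gene` type and `genes_within` function are hypothetical names, the gene labels and coordinates are invented placeholders rather than positions from the pig reference genome, and distance is measured from the SNP to the nearest edge of each gene, which may differ from the convention used in this thesis.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # bp, inclusive
    end: int    # bp, inclusive


def genes_within(snp_chrom: str, snp_pos: int, genes: List[Gene],
                 window_bp: int = 2_000_000) -> List[Tuple[str, int]]:
    """Return (gene name, distance in bp) for genes within window_bp of a SNP."""
    hits = []
    for g in genes:
        if g.chrom != snp_chrom:
            continue
        if g.start <= snp_pos <= g.end:
            dist = 0  # SNP lies inside the gene
        else:
            dist = min(abs(g.start - snp_pos), abs(g.end - snp_pos))
        if dist <= window_bp:
            hits.append((g.name, dist))
    return sorted(hits, key=lambda h: h[1])


# Placeholder annotation (hypothetical names and coordinates).
annotation = [
    Gene("GENE_A", "1", 1_000_000, 1_050_000),
    Gene("GENE_B", "1", 4_500_000, 4_600_000),
    Gene("GENE_C", "2", 1_000_000, 1_100_000),
]

print(genes_within("1", 2_000_000, annotation))
```

Only GENE_A is returned for this toy SNP: GENE_B lies 2.5 Mb away and GENE_C is on a different chromosome. Candidate genes found this way would still need functional annotation, such as the GO analysis described above, before being proposed as relevant.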

The loss or change in the function of the protein products encoded by these genes could each increase genomic instability over time, leading to the generation of DSBs, or could fail to initiate timely DSB repair, allowing DSBs to persist in cells longer than is normal. This could result in a higher likelihood that multiple DSBs will exist in the same cell at any given time, providing substrate for reciprocal translocations to occur. In these individuals, this would increase the likelihood that any meiotic cell would acquire a chromosome rearrangement, creating gametes carrying the abnormality that could go on to produce live offspring who will be carriers. It is unclear whether these genes are truly causative; establishing this will require more samples from translocation carriers and more in-depth analysis of their genomes. Nevertheless, in this small sample the identification of five genes near significant SNPs that could feasibly increase the likelihood of rearrangement formation is remarkable.

Appropriate candidate genes could not be identified in the dams or sires of carriers: no significant SNPs were identified in the sires, while the significant SNPs identified in the dams were near few genes that could be suggested to play roles in rearrangement. Those genes that did play roles in relevant processes appeared weak as candidate genes overall. Analysis of CNV, however, told a slightly different story. Although no single CNVR could be associated with chromosome rearrangements, several individual carrier boars and dams of carriers had proposed CNV which overlapped or were very near to genes playing roles relevant to the formation of chromosome rearrangements. Although it is unclear whether these CNV are truly present, and whether they played a role in the formation of chromosome rearrangements in those individuals or genetic lines, this provides another avenue for investigation. It could be determined whether the generation of chromosome rearrangements has an overarching general component, with a subset of animals in swine herds carrying a set of genomic variants which increase the likelihood of acquiring rearrangements. Alternatively, it could be explored whether the generation of chromosome rearrangements is a largely private event, with the specific genome of the originator promoting rearrangement in a unique way shared by few other members of the herd.

Future Directions

This small sample of carriers of chromosome rearrangements and their parents revealed unexpected associations with SNPs whose allelic frequencies differed significantly between carriers of chromosome rearrangements and control animals. In these cases, genes were identified within 2Mb of these SNPs which play roles in the suppression of DSB generation or in the response to DNA damage, providing promising evidence that functional variants may exist in the genomes of rearrangement carriers that may be linked to the promotion of chromosome rearrangement events. Although we could find no evidence of direct causal linkage between these genes and the presence of chromosome rearrangements in the pig genome, this still suggests that there may be associations between genes producing proteins related to DNA damage and carriers of chromosome rearrangements. The inability to determine which parent contributed the rearrangement in each case, however, limits our ability to make concrete associations, as at least one of the alleles contributed to the carrier in each case is derived from a parent that did not contribute the rearrangement.
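A difference in allelic frequency between carriers and controls at a single SNP is commonly assessed with a 2x2 allelic chi-square test. The Python sketch below shows that generic test only; it is not presented as the GWAS model used in this thesis, the function name is hypothetical, the allele counts are invented, and a genome-wide analysis would additionally require correction for multiple testing.

```python
import math


def allelic_chi_square(case_alt: int, case_ref: int,
                       ctrl_alt: int, ctrl_ref: int):
    """Pearson chi-square (1 df) on a 2x2 table of allele counts.

    Returns (statistic, p_value). For 1 df the upper-tail probability
    is erfc(sqrt(statistic / 2)).
    """
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    stat = n * (a * d - b * c) ** 2 / denom
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Invented allele counts: 50 carriers (100 alleles), 100 controls (200 alleles).
stat, p_value = allelic_chi_square(case_alt=30, case_ref=70,
                                   ctrl_alt=20, ctrl_ref=180)
print(f"chi-square = {stat:.1f}, p = {p_value:.2e}")
```

Because each carrier inherits one allele from a parent that did not contribute the rearrangement, as noted above, a test of this kind on carriers alone can at best flag candidate regions rather than establish causation.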

Expanding the study of chromosome rearrangements and applying NGS technologies may be imperative to fully understanding the genetic and genomic landscape of chromosome rearrangements in swine herds. The application of NGS to sequence breakpoint junctions would provide the ability to determine the mechanism of formation of each rearrangement by analysing the DNA repair mechanism signatures at breakpoint junctions. In addition, this would also allow the determination of the parental origin of the rearrangement in each case, allowing us to study those parents who contributed the rearrangement in order to determine associations between SNPs and the chromosome rearrangements without interference of the genotypes of the other parent. The expansion in the number of DNA samples from carrier boars and their parents could lead to more concrete determinations of associations between SNPs and the presence of chromosome rearrangements. As the formation of chromosome rearrangements is likely to be influenced in a multi-genic way, an increase in the number of samples would allow better analysis of the genomic profiles of rearrangement carriers and their parents who initially generated the rearrangement.

Identification of SNPs significantly associated with the presence of chromosome rearrangements, either directly or indirectly via linkage disequilibrium, could open the door to additional genomic analysis of these animals. Those genes with proposed relevant functions near the sites of these SNP associations could be sequenced, allowing the determination of any single nucleotide variants or copy number variants that may change the function of these genes. This would allow for a more comprehensive proposal of the factors and mechanisms leading to the high prevalence of chromosome rearrangements in swine herds. In tandem with this, if genes near significant SNPs are shown to have some functional relevance, with animals of interest having apparent changes in function that could be linked to the formation of chromosome rearrangements, this information could be applied to SNP arrays already prevalent in the swine industry to identify prospective breeding boars carrying the associated alleles. Such boars could then be experimentally removed from breeding, and over time, as routine cytogenetic screening continues, we could determine whether excluding boars carrying associated alleles has the effect of reducing the prevalence of chromosome rearrangements in swine herds.

Chromosome rearrangements are routinely spontaneously generated in swine herds at a rate of 1/200 live births. In both France and Canada, this de novo rate of formation is quite consistent, and has failed to be lowered over the years, even as carriers are excluded from breeding.

This is possibly due in part to siblings of carriers being permitted to breed, which helps to pass on alleles associated with rearrangements and allows rearrangements to continually arise in swine herds. Given the high spontaneous rate of formation, proper herd management and the maintenance of litter sizes require that routine cytogenetic screening be performed in perpetuity, an expensive endeavour for swine breeders who often work within tight margins.

The identification of existing SNPs associated with the presence of chromosome rearrangements that could be incorporated into existing breeding values could potentially lead to the identification of those animals from genetic lines most at risk of acquiring chromosome rearrangements. The removal of such animals from breeding eligibility could in turn reduce the prevalence of chromosome rearrangements in swine herds below the 1/200 spontaneous rate of formation seen in Canada and France, at no additional cost to breeders already implementing SNP array technologies in their herds.

This approach could also be applied in other capacities to determine whether similar mechanisms are at work in animals at increased risk of generating somatic rearrangements. Somatic rearrangements in humans are particularly associated with the development of cancers; the ability to identify individuals at greater risk of acquiring somatic rearrangements could therefore be quite useful in a healthcare setting, indicating which individuals should be screened for cancers more often. Despite chromosome rearrangements being a complicated disease state, we have identified some evidence suggesting that there may be a genetic component to the generation of chromosome rearrangements. SNP associations in carrier boars of non-recurrent constitutional rearrangements lie in the proximity of several genes that play roles in the maintenance of DNA or in DNA repair. Although we cannot establish a direct causal link between these genes and the promotion of rearrangement in the genome, this does suggest that genomic variants may exist in these genes that promote the generation of chromosome rearrangements in the pig genome. Much additional work is needed to confirm such an association within the population; however, for the first time there is distinct evidence of such an association that could be used to spur future genomic investigations of rearrangement carriers.

References

Aaltonen, L. A., Salovaara, R., Kristo, P., Canzian, F., Hemminki, A., Peltomäki, P., Chadwick, R.B., Kääriäinen, H., Eskelinen, M., Järvinen, H., & Mecklin, J. P. (1998). Incidence of hereditary nonpolyposis colorectal cancer and the feasibility of molecular screening for the disease. New England Journal of Medicine, 338(21), 1481-1487.

Aarnio, M., Sankila, R., Pukkala, E., Salovaara, R., Aaltonen, L. A., de la Chapelle, A., Peltomäki, P., Mecklin, J.P., & Järvinen, H. J. (1999). Cancer risk in mutation carriers of DNA‐mismatch‐repair genes. International journal of cancer, 81(2), 214-218.

Abyzov, A., Li, S., Kim, D. R., Mohiyuddin, M., Stütz, A. M., Parrish, N. F., Mu, X.J., Clark, W., Chen, K., Hurles, M., & Korbel, J. O. (2015). Analysis of deletion breakpoints from 1,092 humans reveals details of mutation mechanisms. Nature communications, 6(1), 1-12.

Agarwal, S., Tafel, A. A., & Kanaar, R. (2006). DNA double-strand break repair and chromosome translocations. DNA repair, 5(9-10), 1075-1081.

Akesson, A., & Henricson, B. (1972). Embryonic death in pigs caused by unbalanced karyotype. Acta veterinaria Scandinavica, 13(2), 151.

Almire, C., Bertrand, P., Ruminy, P., Maingonnat, C., Wlodarska, I., Martín‐Subero, J. I., Siebert, R., Tilly, H., & Bastard, C. (2007). PVRL2 is translocated to the TRA@ locus in t (14; 19)(q11; q13)‐positive peripheral T‐cell lymphomas. Genes, Chromosomes and Cancer, 46(11), 1011-1018.

Alonso, R.A. & Cantu, J.M. (1982). A Robertsonian translocation in the domestic pig (Sus scrofa) 37,XX,-13,-17,t rob(13;17). Ann. Genet, 25(1), 50–52.

Arakaki, D. T., & Sparkes, R. S. (1963). Microtechnique for culturing leukocytes from whole blood. Cytogenetic and Genome Research, 2(2-3), 57-60.

Astachova, N. M., Vysotskaya, L. V., & Graphodatsky, A. S. (1991). Detailed analysis of a new translocation in pig.

Aurias, A., Prieur, M., Dutrillaux, B., & Lejeune, J. (1978). Systematic analysis of 95 reciprocal translocations of autosomes. Human genetics, 45(3), 259-282.

Aylon, Y., Liefshitz, B., & Kupiec, M. (2004). The CDK regulates repair of double‐strand breaks by homologous recombination during the cell cycle. The EMBO journal, 23(24), 4868-4875.

Bacolla, A., Tainer, J. A., Vasquez, K. M., & Cooper, D. N. (2016). Translocation and deletion breakpoints in cancer genomes are associated with potential non-B DNA-forming sequences. Nucleic acids research, 44(12), 5673-5688.

Bailey, J. A., Liu, G., & Eichler, E. E. (2003). An Alu transposition model for the origin and expansion of human segmental duplications. The American Journal of Human Genetics, 73(4), 823-834.

Barrett, J. C., Fry, B., Maller, J. D. M. J., & Daly, M. J. (2005). Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics, 21(2), 263-265.

Basrur, P. K., & Stranzinger, G. (2008). Veterinary cytogenetics: past and perspective. Cytogenetic and genome research, 120(1-2), 11-25.

Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal statistical society: series B (Methodological), 57(1), 289-300.

Bickmore, W. A. (2001). Karyotype analysis and chromosome banding. eLS.

Bickmore, W. A., & Teague, P. (2002). Influences of chromosome size, gene density and nuclear position on the frequency of constitutional translocations in the human population. Chromosome research, 10(8), 707-715.

Bloom, S. E., & Goodpasture, C. (1976). An improved technique for selective silver staining of nucleolar organizer regions in human chromosomes. Human Genetics, 34(2), 199-206.

Boddy, M. N., Gaillard, P. H. L., McDonald, W. H., Shanahan, P., Yates 3rd, J. R., & Russell, P. (2001). Mus81-Eme1 are essential components of a Holliday junction resolvase. Cell, 107(4), 537-548.

Bösch, V. B., Hohn, H., & Rieck, G. W. (1985). Hermaphroditismus verus bei einem graviden Mutterschwein mit einem 39, XX, 14+‐Mosaik. Reproduction in Domestic Animals, 20(3), 161-168.

Bouters, R., Bonte, P., & Vandeplassche, M. (1974, October). Anomalies chromosomiques et mortalite: embryonnaire chez le porc. In Ist World Congress on Genet. Applied to Livestock Production (pp. 7-11).

Breen, M., & Modiano, J. F. (2008). Evolutionarily conserved cytogenetic changes in hematological malignancies of dogs and humans–man and his best friend share more than companionship. Chromosome Research, 16(1), 145-154.

Breeuwsma, A. J. (1968). A case of XXY sex chromosome constitution in an intersex pig. Reproduction, 16(1), 119-NP.

Breeuwsma, A. J. (1970). Studies on intersexuality in pigs. Ph.D. thesis, Research Institute for Animal Husbandry “Schoonoord”. Zeist, The Netherlands.

Bruere, A. N., Fielden, E. D., & Hutchings, H. (1968). XX/XY mosaicism in lymphocyte cultures from a pig with freemartin characteristics. New Zealand Veterinary Journal, 16(3), 31-38.

Bumgarner, R. (2013). Overview of DNA microarrays: types, applications, and their future. Current protocols in molecular biology, 101(1), 22-1.

Burdova, K., Mihaljevic, B., Sturzenegger, A., Chappidi, N., & Janscak, P. (2015). The mismatch-binding factor MutSβ can mediate ATR activation in response to DNA double-strand breaks. Molecular cell, 59(4), 603-614.

Canadian Centre for Swine Improvement (CCSI) 2019 Annual Report. https://www.ccsi.ca/meetings/annual/2019AnnualReportEnglish.pdf

Carvalho, C. M., Pehlivan, D., Ramocki, M. B., Fang, P., Alleva, B., Franco, L. M., Belmont, J.W., Hastings, P.J., & Lupski, J. R. (2013). Replicative mechanisms for CNV formation are error prone. Nature genetics, 45(11), 1319.

Carvalho, C. M., Ramocki, M. B., Pehlivan, D., Franco, L. M., Gonzaga-Jauregui, C., Fang, P., McCall, A., Pivnick, E.K., Hines-Dowell, S., Seaver, L.H., & Friehling, L. (2011). Inverted genomic segments and complex triplication rearrangements are mediated by inverted repeats in the human genome. Nature genetics, 43(11), 1074.

Caspersson, T., Zech, L., Johansson, C., & Modest, E. J. (1970). Identification of human chromosomes by DNA-binding fluorescent agents. Chromosoma, 30(2), 215-227.

Cauwelier, B., Cavé, H., Gervais, C., Lessard, M., Barin, C., Perot, C., Van den Akker, J., Mugneret, F., Charrin, C., Pagès, M.P., & Grégoire, M. J. (2007). Clinical, cytogenetic and molecular characteristics of 14 T-ALL patients carrying the TCRβ-HOXA rearrangement: a study of the Groupe Francophone de Cytogenetique Hematologique. Leukemia, 21(1), 121-128.

Champoux, J. J. (2001). DNA topoisomerases: structure, function, and mechanism. Annual review of biochemistry, 70(1), 369-413.

Chance, B., Sies, H., & Boveris, A. (1979). Hydroperoxide metabolism in mammalian organs. Physiological reviews, 59(3), 527-605.

Chiang, C., Jacobsen, J. C., Ernst, C., Hanscom, C., Heilbut, A., Blumenthal, I., Mills, R.E., Kirby, A., Lindgren, A.M., Rudiger, S.R., & McLaughlan, C. J. (2012). Complex reorganization and predominant non-homologous repair following chromosomal breakage in karyotypically balanced germline rearrangements and transgenic integration. Nature genetics, 44(4), 390.

Christensen, K., & Nielsen, P. B. (1980). A case of blood chimerism (XX, XY) in pigs. Animal blood groups and biochemical genetics, 11(1), 55-57.

Ciccia, A., Ling, C., Coulthard, R., Yan, Z., Xue, Y., Meetei, A. R., Laghmani, E.H., Joenje, H., McDonald, N., de Winter, J.P., & Wang, W. (2007). Identification of FAAP24, a Fanconi anemia core complex protein that interacts with FANCM. Molecular cell, 25(3), 331-343.

Clarkson, B. G., Fisher, K. R. S., & Partlow, G. D. (1995). Agonadal presumptive XX/XY leukochimeric pig. The Anatomical Record, 242(2), 195-199.

Cohen, O., Cans, C., Gilardi, J. L., Roth, H., Mermet, M. A., Jalbert, P., Demongoet, J., & Cuillel, M. (1996). Cartographic study: breakpoints in 1574 families carrying human reciprocal translocations. Human genetics, 97(5), 659-667.

Conrad, D. F., Pinto, D., Redon, R., Feuk, L., Gokcumen, O., Zhang, Y., Aerts, J., Andrews, T. D., Barnes, C., Campbell, P., et al. (2010). Origins and functional impact of copy number variation in the human genome. Nature, 464(7289), 704-712.

Cook, R., Zoumpoulidou, G., Luczynski, M. T., Rieger, S., Moquet, J., Spanswick, V. J., Hartley, J.A., Rothkamm, K., Huang, P.H., & Mittnacht, S. (2015). Direct involvement of retinoblastoma family proteins in DNA repair by non-homologous end-joining. Cell reports, 10(12), 2006-2018.

Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2017. Available online: https://www.R-project.org/ (17-07-19).

COSMIC; http://www.sanger.ac.uk/genetics/CGP/cosmic/

Dagorn, R. (1978). Note aux établissements Départementaux de l’Elevage. Institut Technique du Porc, Paris.

Daley, G. Q., Van Etten, R. A., & Baltimore, D. (1990). Induction of chronic myelogenous leukemia in mice by the P210bcr/abl gene of the Philadelphia chromosome. Science, 247(4944), 824-830.

Daniel, K., Lange, J., Hached, K., Fu, J., Anastassiadis, K., Roig, I., Cooke, H.J., Stewart, A.F., Wassmann, K., Jasin, M., & Keeney, S. (2011). Meiotic homologue alignment and its quality surveillance are controlled by mouse HORMAD1. Nature cell biology, 13(5), 599-610.

Danielak-Czech, B., Kozubska-Sobocińska, A., & Rejduch, B. (2016). Molecular cytogenetics in the diagnostics of balanced chromosome mutations in the pig (Sus scrofa)–a review. Annals of Animal Science, 16(3), 679-699.

Danielak-Czech, B., Kozubska-Sobocinska, A., Slota, E., Rejduch, B. and Kwaczynska, A. (1996). Preliminary identification of pair 1 chromosome rearrangement in the Polish Landrace sow. Archivos de Zootecnia (Cordoba), 45, 215–219.

De Lorenzi, L., Morando, P., Planas, J., Zannotti, M., Molteni, L., & Parma, P. (2012). Reciprocal translocations in cattle: frequency estimation. Journal of Animal Breeding and Genetics, 129(5), 409-416.

Deininger, P. L., Moran, J. V., Batzer, M. A., & Kazazian Jr, H. H. (2003). Mobile elements and mammalian genome evolution. Current opinion in genetics & development, 13(6), 651-658.

Dennis, G., Sherman, B. T., Hosack, D. A., Yang, J., Gao, W., Lane, H. C., & Lempicki, R. A. (2003). DAVID: database for annotation, visualization, and integrated discovery. Genome biology, 4(9), R60.

Ducos, A., Berland, H. M., Bonnet, N., Calgaro, A., Billoux, S., Mary, N., Garnier-Bonnet, A., Darré, R., & Pinton, A. (2007). Chromosomal control of pig populations in France: 2002–2006 survey. Genetics Selection Evolution, 39(5), 583.

Ducos, A., Berland, H. M., Pinton, A., Guillemot, E., Seguela, A., Blanc, M. F., Darre, A., & Darre, R. (1998). Nine new cases of reciprocal translocation in the domestic pig (Sus scrofa domestica L.). Journal of Heredity, 89(2), 136-142.

Ducos, A., Berland, H. M., Pinton, A., Séguéla, A., Brun-Baronnat, C., & Darré, R. (2000). Contrôle chromosomique des populations animales d'élevage. Productions Animales, 1(13), 25-35.

Ducos, A., Calgaro, A., Mouney-Bonnet, N., Loustau, A. M., Revel, C., Barasc, H., Mary, N., & Pinton, A. (2017). Chromosomal control of pig populations in France: a 20-year perspective. Journées de la Recherche Porcine en France, 49, 49-50.

Ducos, A., Pinton, A., Berland, H. M., Seguela, A., Blanc, M. F., Darre, A., & Darre, R. (1998). Five new cases of reciprocal translocation in the domestic pig. Hereditas, 128(3), 221-229.

Ducos, A., Revay, T., Kovacs, A., Hidas, A., Pinton, A., Bonnet-Garnier, A., Molteni, L., Slota, E., Switonski, M., Arruga, M. V., van Haeringen, W. A., Nicolae, I., Chaves, R., Guedes-Pinto, H., Andersson, M., & Iannuzzi, L. (2008). Cytogenetic screening of livestock populations in Europe: an overview. Cytogenetic and genome research, 120(1-2), 26-41.

Duro, E., Lundin, C., Ask, K., Sanchez-Pulido, L., MacArtney, T. J., Toth, R., Ponting, C.P., Groth, A., Helleday, T., & Rouse, J. (2010). Identification of the MMS22L-TONSL complex that promotes homologous recombination. Molecular cell, 40(4), 632-644.

Dutrillaux, B. (1973). Coloration des chromosomes humains par l'acridine orange après traitement par le 5 bromodéoxyuridine.

Dutrillaux, B. (1973). Nouveau système de marquage chromosomique: les bandes T. Chromosoma, 41(4), 395-402.

Dutrillaux, B., & Lejeune, J. (1971). Sur une nouvelle technique d'analyse du caryotype humain. CR Acad. Sci.(Paris), 272, 2638-2640.

Edwards, J. H., Harnden, D. G., Cameron, A. H., Crosse, V. M., & Wolf, O. H. (1960). A new trisomic syndrome. The lancet, 275(7128), 787-790.

Elefanty, A. G., Hariharan, I. K., & Cory, S. (1990). bcr‐abl, the hallmark of chronic myeloid leukaemia in man, induces multiple haemopoietic neoplasms in mice. The EMBO journal, 9(4), 1069-1078.

Elliott, B., Richardson, C., & Jasin, M. (2005). Chromosomal translocation mechanisms at intronic alu elements in mammalian cells. Molecular cell, 17(6), 885-894.

El‐Zimaity, M. M., Kantarjian, H., Talpaz, M., O'Brien, S., Giles, F., Garcia‐Manero, G., Verstovsek, S., Thomas, D., Ferrajoli, A., Hayes, K., & Nebiyou Bekele, B. (2004). Results of imatinib mesylate therapy in chronic myelogenous leukaemia with variant Philadelphia chromosome. British journal of haematology, 125(2), 187-195.

Falk, M., Lukášová, E., & Kozubek, S. (2008). Chromatin structure influences the sensitivity of DNA to γ-radiation. Biochimica et Biophysica Acta (BBA)-Molecular Cell Research, 1783(12), 2398-2414.

Fantes, J. A., Boland, E., Ramsay, J., Donnai, D., Splitt, M., Goodship, J. A., Stewart, H., Whiteford, M., Gautier, P., Harewood, L., & Holloway, S. (2008). FISH mapping of de novo apparently balanced chromosome rearrangements identifies characteristics associated with phenotypic abnormality. The American Journal of Human Genetics, 82(4), 916-926.

Fechheimer, N. S. (1979). Cytogenetics in animal production. Journal of dairy science, 62(5), 844-853.

Feuk, L., Carson, A. R., & Scherer, S. W. (2006). Structural variation in the human genome. Nature Reviews Genetics, 7(2), 85-97.

Ford, C. E., & Hamerton, J. L. (1956). The chromosomes of man. Nature, 178(4541), 1020-1023.

Ford, C. E., Jones, K. W., Polani, P. E., De Almeida, J. C., & Briggs, J. H. (1959). A sex-chromosome anomaly in a case of gonadal dysgenesis (Turner's syndrome). The Lancet, 711-713.

Ford, C. E., Pollock, D. L., & Gustavsson, I. (1980). Proceedings of the First International Conference for the Standardisation of Banded Karyotypes of Domestic Animals, University of Reading, Reading, England, 2nd‐6th August 1976. Hereditas, 92(1), 145-162.

Foster, H. A., Griffin, D. K., & Bridger, J. M. (2012). Interphase chromosome positioning in in vitro porcine cells and ex vivo porcine tissues. BMC cell biology, 13(1), 30.

Fouquet, B., Pawlikowska, P., Caburet, S., Guigon, C., Mäkinen, M., Tanner, L., Hietala, M., Urbanska, K., Bellutti, L., Legois, B., & Bessieres, B. (2017). A homozygous FANCM mutation underlies a familial case of non-syndromic primary ovarian insufficiency. Elife, 6, e30490.

Friedberg, E. C., Walker, G. C., Siede, W., & Wood, R. D. (Eds.). (2005). DNA repair and mutagenesis. American Society for Microbiology Press.

Fries, R., & Stranzinger, G. (1982). Chromosomal mutations in pigs derived from X-irradiated semen. Cytogenetic and Genome Research, 34(1-2), 55-66.

Fukumoto, Y., Morii, M., Miura, T., Kubota, S., Ishibashi, K., Honda, T., Okamoto, A., Yamaguchi, N., Iwama, A., Nakayama, Y., & Yamaguchi, N. (2014). Src family kinases promote silencing of ATR-Chk1 signaling in termination of DNA damage checkpoint. Journal of Biological Chemistry, 289(18), 12313-12329.

Funkhouser, S. A., Steibel, J. P., Bates, R. O., Raney, N. E., Schenk, D., & Ernst, C. W. (2017). Evidence for transcriptome-wide RNA editing among Sus scrofa PRE-1 SINE elements. BMC genomics, 18(1), 360.

Gabrea, A., Martelli, M. L., Qi, Y., Roschke, A., Barlogie, B., Shaughnessy Jr, J. D., Sawyer, J.R., & Kuehl, W. M. (2008). Secondary genomic rearrangements involving immunoglobulin or MYC loci show similar prevalences in hyperdiploid and nonhyperdiploid myeloma tumors. Genes, Chromosomes and Cancer, 47(7), 573-590.

Gabriel, S. B., Schaffner, S. F., Nguyen, H., Moore, J. M., Roy, J., Blumenstiel, B., Higgins, J., DeFelice, M., Lochner, A., Faggart, M., & Liu-Cordero, S. N. (2002). The structure of haplotype blocks in the human genome. Science, 296(5576), 2225-2229.

Gari, K., Décaillet, C., Stasiak, A. Z., Stasiak, A., & Constantinou, A. (2008). The Fanconi anemia protein FANCM can promote branch migration of Holliday junctions and replication forks. Molecular cell, 29(1), 141-148.

Ghezraoui, H., Piganeau, M., Renouf, B., Renaud, J. B., Sallmyr, A., Ruis, B., Oh, S., Tomkinson, A.E., Hendrickson, E.A., Giovannangeli, C., & Jasin, M. (2014). Chromosomal translocations in human cells are generated by canonical nonhomologous end-joining. Molecular cell, 55(6), 829-842.

Giglio, S., Calvari, V., Gregato, G., Gimelli, G., Camanini, S., Giorda, R., Ragusa, A., Guerneri, S., Selicorni, A., Stumm, M., & Tonnies, H. (2002). Heterozygous submicroscopic inversions involving olfactory receptor–gene clusters mediate the recurrent t (4; 8)(p16; p23) translocation. The American Journal of Human Genetics, 71(2), 276-285.

Goetze, S., Mateos-Langerak, J., Gierman, H. J., de Leeuw, W., Giromus, O., Indemans, M. H., Koster, J., Ondrej, V., Versteeg, R., & van Driel, R. (2007). The three-dimensional structure of human interphase chromosomes is related to the transcriptome map. Molecular and cellular biology, 27(12), 4475-4487.

Gustavsson, I. (1988). Standard karyotype of the domestic pig: Committee for the Standardized Karyotype of the Domestic Pig. Hereditas, 109(2), 151-157.

Gustavsson, I. (1990). Chromosomes of the pig. In Advances in veterinary science and comparative medicine (Vol. 34, pp. 73-107). Academic Press.

Gustavsson, I., Hageltorn, M., Zech, L., & Reiland, S. (1973). Identification of the chromosomes in a centric fusion/fission polymorphic system of the pig (Sus scrofa L.). Hereditas, 75(1), 153-155.

Gustavsson, I., Switoński, M., Iannuzzi, L., Plöen, L., & Larsson, K. (1989). Banding studies and synaptonemal complex analysis of an X-autosome translocation in the domestic pig. Cytogenetic and Genome Research, 50(4), 188-194.

Hageltorn, M., & Gustavsson, I. (1973). Giemsa staining patterns for identification of the pig mitotic chromosomes. Hereditas, 75(1), 144-146.

Hageltorn, M., Gustavsson, I., & Zech, L. (1976). Detailed analysis of a reciprocal translocation (13q‐; 14q+) in the domestic pig by G‐and Q‐staining techniques. Hereditas, 83(2), 268-272.

Hagemeijer, A., Kroeze, W. S., & Abels, J. (1980). Cytogenetic follow-up of patients with nonlymphocytic leukemia I. Philadelphia chromosome-positive chronic myeloid leukemia. Cancer Genetics and Cytogenetics, 2(4), 317-326.

Hancock, J. L., & Daker, M. G. (1981). Testicular hypoplasia in a boar with abnormal sex chromosome constitution (39 XXY). Reproduction, 61(2), 395-397.

Hansen-Melander, E., & Melander, Y. (1970). Mosaicism for translocation heterozygosity in a malformed pig. Hereditas, 64(2), 199-202.

Hassold, T. J. (1980). A cytogenetic study of repeated spontaneous abortions. American journal of human genetics, 32(5), 723.

Hassold, T., & Hunt, P. (2001). To err (meiotically) is human: the genesis of human aneuploidy. Nature Reviews Genetics, 2(4), 280-291.

Hassold, T., Warburton, D., Kline, J., & Stein, Z. (1984). The relationship of maternal age and trisomy among trisomic spontaneous abortions. American journal of human genetics, 36(6), 1349.

Hastings, P. J., Ira, G., & Lupski, J. R. (2009). A microhomology-mediated break-induced replication model for the origin of human copy number variation. PLoS genetics, 5(1).

Hecht, F., & Hecht, B. K. (1984). Fragile sites and chromosome breakpoints in constitutional rearrangements I. Amniocentesis. Clinical genetics, 26(3), 169-173.

Hecht, F., & Hecht, B. K. (1984). Fragile sites and chromosome breakpoints in constitutional rearrangements II. Spontaneous abortions, stillbirths and newborns. Clinical genetics, 26(3), 174-177.

Heisterkamp, N., Groffen, J., Stephenson, J. R., Spurr, N. K., Goodfellow, P. N., Solomon, E., Carritt, B., & Bodmer, W. F. (1982). Chromosomal localization of human cellular homologues of two viral oncogenes. Nature, 299(5885), 747-749.

Helenglass, G., Testa, J. R., & Schiffer, C. A. (1987). Philadelphia chromosome‐positive acute leukemia: Morphologic and clinical correlations. American journal of hematology, 25(3), 311-324.

Hendriks, Y. M., Wagner, A., Morreau, H., Menko, F., Stormorken, A., Quehenberger, F., ... & Tops, C. (2004). Cancer risk in hereditary nonpolyposis colorectal cancer due to MSH6 mutations: impact on counseling and surveillance. Gastroenterology, 127(1), 17-25.

Henricson, B., & Backstrom, L. (1964). Translocation heterozygosity in a boar. Hereditas, 52(2), 166-170.

Höckner, M., Spreiz, A., Frühmesser, A., Tzschach, A., Dufke, A., Rittinger, O., Kalscheuer, V., Singer, S., Erdel, M., Fauth, C., & Grossmann, V. (2012). Parental origin of de novo cytogenetically balanced reciprocal non-Robertsonian translocations. Cytogenetic and genome research, 136(4), 242-245.

Hoffelder, D. R., Luo, L., Burke, N. A., Watkins, S. C., Gollin, S. M., & Saunders, W. S. (2004). Resolution of anaphase bridges in cancer cells. Chromosoma, 112(8), 389-397.

Holmquist, G. P. (1992). Chromosome bands, their chromatin flavors, and their functional features. American journal of human genetics, 51(1), 17.

Hong, Z., Jiang, J., Hashiguchi, K., Hoshi, M., Lan, L., & Yasui, A. (2008). Recruitment of mismatch repair proteins to the site of DNA damage in human cells. Journal of cell science, 121(19), 3146-3154.

https://www.ensembl.org/Sus_scrofa/Info/Index

https://www.ncbi.nlm.nih.gov/

Iannuzzi, L., & Di Berardino, D. (2008). Tools of the trade: diagnostics and research in domestic animal cytogenetics. Journal of applied genetics, 49(4), 357-366.

Ira, G., Pellicioli, A., Balijja, A., Wang, X., Fiorani, S., Carotenuto, W., Liberi, G., Bressan, D., Wan, L., Hollingsworth, N.M., & Haber, J. E. (2004). DNA end resection, homologous recombination and DNA damage checkpoint activation require CDK1. Nature, 431(7011), 1011-1017.

Jacobs, P. A., & Hassold, T. J. (1995). The origin of numerical chromosome abnormalities. In Advances in genetics (Vol. 33, pp. 101-133). Academic Press.

Jacobs, P. A., & Strong, J. A. (1959). A case of human intersexuality having a possible XXY sex-determining mechanism. Nature, 183(4657), 302-303.

Janavicius, R., & Elsakov, P. (2012). Novel germline MSH2 mutation in lynch syndrome patient surviving multiple cancers. Hereditary cancer in clinical practice, 10(1), 1.

Ji, J., Clegg, N. J., Peterson, K. R., Jackson, A. L., Laird, C. D., & Loeb, L. A. (1996). In vitro expansion of GGC: GCC repeats: identification of the preferred strand of expansion. Nucleic acids research, 24(14), 2835-2840.

Jiang, G., & Sancar, A. (2006). Recruitment of DNA damage checkpoint proteins to damage in transcribed and nontranscribed sequences. Molecular and cellular biology, 26(1), 39-49.

Jiang, Y., Wang, X., Bao, S., Guo, R., Johnson, D. G., Shen, X., & Li, L. (2010). INO80 chromatin remodeling complex promotes the removal of UV lesions by the nucleotide excision repair pathway. Proceedings of the National Academy of Sciences, 107(40), 17274-17279.

Johansson, B., Fioretos, T., & Mitelman, F. (2002). Cytogenetic and molecular genetic evolution of chronic myeloid leukemia. Acta haematologica, 107(2), 76-94.

Khanna, K. K., & Jackson, S. P. (2001). DNA double-strand breaks: signaling, repair and the cancer connection. Nature genetics, 27(3), 247-254.

Kim, J. H., Hu, H. J., Yim, S. H., Bae, J. S., Kim, S. Y., & Chung, Y. J. (2012). CNVRuler: a copy number variation-based case–control association analysis tool. Bioinformatics, 28(13), 1790-1792.

Kim, J. M., Kee, Y., Gurtan, A., & D'Andrea, A. D. (2008). Cell cycle–dependent chromatin loading of the Fanconi anemia core complex by FANCM/FAAP24. Blood, The Journal of the American Society of Hematology, 111(10), 5215-5222.

King, W. A., Gustavsson, I., Popescu, C. P., & Linares, T. (1981). Gametic products transmitted by rcp (13q‐; 14q+) translocation heterozygous pigs, and resulting embryonic loss. Hereditas, 95(2), 239-246.

Kitayama, K., Kamo, M., Oma, Y., Matsuda, R., Uchida, T., Ikura, T., Tashiro, S., Ohyama, T., Winsor, B., & Harata, M. (2009). The human actin-related protein hArp5: nucleo-cytoplasmic shuttling and involvement in DNA repair. Experimental cell research, 315(2), 206-217.

Koduru, P., & Chaganti, R. (1988). Congenital chromosome breakage clusters within Giemsa-light bands and identifies sites of chromatin instability. Cytogenetic and Genome Research, 49(4), 269-274.

Kogo, H., Tsutsumi, M., Ohye, T., Inagaki, H., Abe, T., & Kurahashi, H. (2012). HORMAD1‐dependent checkpoint/surveillance mechanism eliminates asynaptic oocytes. Genes to Cells, 17(6), 439-454.

Konfortova, G. D., Miller, N. G. A., & Tucker, E. M. (1995). A new reciprocal translocation (7q+; 15q–) in the domestic pig. Cytogenetic and Genome Research, 71(3), 285-288.

Koster, D. A., Palle, K., Bot, E. S., Bjornsti, M. A., & Dekker, N. H. (2007). Antitumour drugs impede DNA uncoiling by topoisomerase I. Nature, 448(7150), 213-217.

Krallinger, H. F. (1927). Über die Chromosomenzahl beim Rinde sowie einige allgemeine Bemerkungen über die Chromosomenforschung in der Säugetierklasse. Verh. anat. Ges., Erg.-H. z. Anat. Anz, 63.

Krallinger, H. F. (1931). Cytologische studien an einigen haussäugetieren. Springer.

Kubickova, S., Cernohorska, H., Musilova, P., & Rubes, J. (2002). The use of laser microdissection for the preparation of chromosome-specific painting probes in farm animals. Chromosome Research, 10(7), 571-577.

Kurahashi, H., & Emanuel, B. S. (2001). Long AT-rich palindromes and the constitutional t (11; 22) breakpoint. Human Molecular Genetics, 10(23), 2605-2617.

Kurahashi, H., Inagaki, H., Ohye, T., Kogo, H., Kato, T., & Emanuel, B. S. (2006). Palindrome-mediated chromosomal translocations in humans. DNA repair, 5(9-10), 1136-1145.

Kurahashi, H., Kogo, H., Tsutsumi, M., Inagaki, H., & Ohye, T. (2012). Failure of homologous synapsis and sex-specific reproduction problems. Frontiers in genetics, 3, 112.

Kurahashi, H., Shaikh, T. H., Hu, P., Roe, B. A., Emanuel, B. S., & Budarf, M. L. (2000). Regions of genomic instability on 22q11 and 11q23 as the etiology for the recurrent constitutional t (11; 22). Human Molecular Genetics, 9(11), 1665-1670.

LaFramboise, T. (2009). Single nucleotide polymorphism arrays: a decade of biological, computational and technological advances. Nucleic acids research, 37(13), 4181-4193.

Lahbib-Mansais, Y., Karlskov-Mortensen, P., Mompart, F., Milan, D., Jørgensen, C. B., Cirera, S., Gorodkin, J., Faraut, T., Yerle, M., & Fredholm, M. (2005). A high-resolution comparative map between pig chromosome 17 and human chromosomes 4, 8, and 20: identification of synteny breakpoints. Genomics, 86(4), 405-413.

Lahbib-Mansais, Y., Mompart, F., Milan, D., Leroux, S., Faraut, T., Delcros, C., & Yerle, M. (2006). Evolutionary breakpoints through a high-resolution comparative map between porcine chromosomes 2 and 16 and human chromosomes. Genomics, 88(4), 504-512.

Langford, C. F., Miller, N. G. A., Tucker, E. M., Telenius, H., & Thomsen, P. D. (1993). Preparation of chromosome‐specific paints and complete assignment of chromosomes in the pig flow karyotype. Animal genetics, 24(4), 261-267.

Larkin, D. M., Pape, G., Donthu, R., Auvil, L., Welge, M., & Lewin, H. A. (2009). Breakpoint regions and homologous synteny blocks in chromosomes have different evolutionary histories. Genome research, 19(5), 770-777.

Le Noir, S., Ben Abdelali, R., Lelorch, M., Bergeron, J., Sungalee, S., Payet-Bornet, D., Villarese, P., Petit, A., Callens, C., Lhermitte, L., & Baranger, L. (2012). Extensive molecular mapping of TCRα/δ-and TCRβ-involved chromosomal translocations reveals distinct mechanisms of oncogene activation in T-ALL. Blood, The Journal of the American Society of Hematology, 120(16), 3298-3309.

Lee, J. A., Carvalho, C. M., & Lupski, J. R. (2007). A DNA replication mechanism for generating nonrecurrent rearrangements associated with genomic disorders. Cell, 131(7), 1235-1247.

Lehmann, A. R. (2005). The role of SMC proteins in the responses to DNA damage. DNA repair, 4(3), 309-314.

Lejeune, J., Gautier, M., & Turpin, R. (1959). Etude des chromosomes somatiques de neuf enfants mongoliens. CR Acad Sci (Paris), 248, 1721-1722.

Lepretre, S., Buchonnet, G., Stamatoullas, A., Lenain, P., Duval, C., Anjou, J., Callat, M.P., Tilly, H., & Bastard, C. (2000). Chromosome abnormalities in peripheral T-cell lymphoma. Cancer genetics and cytogenetics, 117(1), 71-79.

Lieber, M. R. (2010). The mechanism of double-strand DNA break repair by the nonhomologous DNA end-joining pathway. Annual review of biochemistry, 79, 181-211.

Lieber, M. R., Ma, Y., Pannicke, U., & Schwarz, K. (2003). Mechanism and regulation of human non-homologous DNA end-joining. Nature reviews Molecular cell biology, 4(9), 712-720.

Lin, C. Y., Shukla, A., Grady, J. P., Fink, J. L., Dray, E., & Duijf, P. H. (2018). Translocation breakpoints preferentially occur in euchromatin and acrocentric chromosomes. Cancers, 10(1), 13.

Ločniškar, F., Gustavsson, I., Hageltorn, M., & Zech, L. (1976). Cytological origin and points of exchange of a reciprocal chromosome translocation (1p‐; 6q+) in the domestic pig. Hereditas, 83(2), 272-275.

Lojda, L. (1975). The cytogenetic pattern in pigs with hereditary intersexuality similar to the syndrome of testicular feminization in man.

Long, S. E. (1991). Reciprocal translocations in the pig (Sus scrofa): a review. The Veterinary record, 128(12), 275-278.

Lu, S., Wang, G., Bacolla, A., Zhao, J., Spitser, S., & Vasquez, K. M. (2015). Short inverted repeats are hotspots for genetic instability: relevance to cancer genomes. Cell reports, 10(10), 1674-1680.

Lupski, J. R. (2015). Structural variation mutagenesis of the human genome: Impact on disease and evolution. Environmental and molecular mutagenesis, 56(5), 419-436.

Machiela, M. J., Jessop, L., Zhou, W., Yeager, M., & Chanock, S. J. (2017). Characterization of breakpoint regions of large structural autosomal mosaic events. Human molecular genetics, 26(22), 4388-4394.

Madan, K., Ford, C. E., & Polge, C. (1978). A reciprocal translocation, t (6p+; 14q−), in the pig. Reproduction, 53(2), 395-398.

Mäkinen, A., Andersson, M., & Nikunen, S. (1998). Detection of the X chromosomes in a Klinefelter boar using a whole human X chromosome painting probe. Animal reproduction science, 52(4), 317-323.

Manning, A. L., & Dyson, N. J. (2012). RB: mitotic implications of a tumour suppressor. Nature reviews Cancer, 12(3), 220-226.

Manuelidis, L., & Chen, T. L. (1990). A unified model of eukaryotic chromosomes. Cytometry: The Journal of the International Society for Analytical Cytology, 11(1), 8-25.

Manzini, G., Yathindra, N., & Xodo, L. E. (1994). Evidence for intramolecularly folded i-DNA structures in biologically relevant CCC-repeat sequences. Nucleic acids research, 22(22), 4634-4640.

Maréchal, A., Li, J. M., Ji, X. Y., Wu, C. S., Yazinski, S. A., Nguyen, H. D., Liu, S., Jiménez, A.E., Jin, J., & Zou, L. (2014). PRP19 transforms into a sensor of RPA-ssDNA after DNA damage and drives ATR activation via a ubiquitin-mediated circuitry. Molecular cell, 53(2), 235-246.

Martin, G. M., Smith, A. C., Ketterer, D. J., Ogburn, C. E., & Disteche, C. M. (1985). Increased chromosomal aberrations in first metaphases of cells isolated from the kidneys of aged mice. Israel journal of medical sciences, 21(3), 296-301.

Massip, K., Bonnet, N., Calgaro, A., Billoux, S., Baquié, V., Mary, N., Bonnet-Garnier, A., Ducos, A., Yerle, M., & Pinton, A. (2009). Male meiotic segregation analyses of peri- and paracentric inversions in the pig species. Cytogenetic and genome research, 125(2), 117-124.

McFeely, R. A. (1966). A direct method for the display of chromosomes from early pig embryos. Reproduction, 11(1), 161-163.

McFeely, R. A. (1967). Chromosome abnormalities in early embryos of the pig. Reproduction, 13(3), 579-581.

Mehta, A., & Haber, J. E. (2014). Sources of DNA double-strand breaks and models of recombinational DNA repair. Cold Spring Harbor perspectives in biology, 6(9), a016428.

Milligan, J. R., Wu, C. C., Aguilera, J. A., Fahey, R. C., & Ward, J. F. (1995). DNA repair by thiols in air shows two radicals make a double-strand break. Radiation research, 143(3), 273-280.

Mitelman, F., Johansson, B., & Mertens, F. (2007). The impact of translocations and gene fusions on cancer causation. Nature Reviews Cancer, 7(4), 233.

Mitelman Database; https://mitelmandatabase.isb-cgc.org/

Miyake, Y. I., Kawata, K., Ishikawa, T., & Umezu, M. (1977). Translocation heterozygosity in a malformed piglet and its normal littermates. Teratology, 16(2), 163-167.

Moore, J. K., & Haber, J. E. (1996). Cell cycle and genetic requirements of two pathways of nonhomologous end-joining repair of double-strand breaks in Saccharomyces cerevisiae. Molecular and cellular biology, 16(5), 2164-2173.

Moorhead, P. S., Nowell, P. C., Mellman, W. J., Battips, D. T., & Hungerford, D. A. (1960). Chromosome preparations of leukocytes cultured from human peripheral blood. Experimental cell research, 20(3), 613-616.

Morrison, A. J., Highland, J., Krogan, N. J., Arbel-Eden, A., Greenblatt, J. F., Haber, J. E., & Shen, X. (2004). INO80 and γ-H2AX interaction links ATP-dependent chromatin remodeling to DNA damage repair. Cell, 119(6), 767-775.

Musilova, P., Drbalova, J., Kubickova, S., Cernohorska, H., Stepanova, H., & Rubes, J. (2014). Illegitimate recombination between T cell receptor genes in humans and pigs (Sus scrofa domestica). Chromosome research, 22(4), 483-493.

National Center for Biotechnology Information (NCBI)[Internet]. Bethesda (MD): National Library of Medicine (US), National Center for Biotechnology Information; (1988) – (cited 2020 Feb 23). Available from: https://www.ncbi.nlm.nih.gov/

Nilsson, D., Pettersson, M., Gustavsson, P., Förster, A., Hofmeister, W., Wincent, J., Zachariadis, V., Anderlid, B., Nordgren, A., Makitie, O., et al. (2017). Whole‐genome sequencing of cytogenetically balanced chromosome translocations identifies potentially pathological gene disruptions and highlights the importance of microhomology in the mechanism of formation. Human mutation, 38(2), 180-192.

Nowell, P. C., & Hungerford, D. A. (1960). Chromosome studies on normal and leukemic human leukocytes. Journal of the National Cancer Institute, 25(1), 85-109.

O’Driscoll, M. (2012). Diseases associated with defective responses to DNA damage. Cold Spring Harbor perspectives in biology, 4(12), a012773.

O'Connor, R. E., Fonseka, G., Frodsham, R., Archibald, A. L., Lawrie, M., Walling, G. A., & Griffin, D. K. (2017). Isolation of subtelomeric sequences of porcine chromosomes for translocation screening reveals errors in the pig genome assembly. Animal genetics, 48(4), 395-403.

O'Donnell, L., Panier, S., Wildenhain, J., Tkach, J. M., Al-Hakim, A., Landry, M. C., Escribano-Diaz, C., Szilard, R.K., Young, J.T., Munro, M., & Canny, M. D. (2010). The MMS22L-TONSL complex mediates recovery from replication stress and homologous recombination. Molecular cell, 40(4), 619-631.

Opheim, K. E., Brittingham, A., Chapman, D., & Norwood, T. H. (1995). Balanced reciprocal translocation mosaicism: how frequent?. American journal of medical genetics, 57(4), 601-604.

Ou, Z., Stankiewicz, P., Xia, Z., Breman, A. M., Dawson, B., Wiszniewska, J., Szafranski, P., Cooper, M. L., Rao, M., Shao, L. et al. (2011). Observation and prediction of recurrent human translocations mediated by NAHR between nonhomologous chromosomes. Genome research, 21(1), 33-46.

Padula, A. M. (2005). The freemartin syndrome: an update. Animal Reproduction Science, 87(1-2), 93-109.

Pannunzio, N. R., Li, S., Watanabe, G., & Lieber, M. R. (2014). Non-homologous end joining often uses microhomology: implications for alternative end joining. DNA repair, 17, 74-80.

Patau, K. A., Smith, D. W., Therman, E. M., Inhorn, S. L., & Wagner, H. P. (1960). Multiple congenital anomaly caused by an extra autosome. The Lancet, 790-793.

Peiffer, D. A., Le, J. M., Steemers, F. J., Chang, W., Jenniges, T., Garcia, F., Haden, K., Li, J., Shaw, C.A., Belmont, J., & Cheung, S. W. (2006). High-resolution genomic profiling of chromosomal aberrations using Infinium whole-genome genotyping. Genome research, 16(9), 1136-1148.

Peltomäki, P., & Vasen, H. (2004). Mutations associated with HNPCC predisposition—update of ICG-HNPCC/INSiGHT mutation database. Disease markers, 20(4-5), 269-276.

Pfeiffer, P., Goedecke, W., & Obe, G. (2000). Mechanisms of DNA double-strand break repair and their potential to induce chromosomal aberrations. Mutagenesis, 15(4), 289-302.

Pichierri, P., Franchitto, A., Piergentili, R., Colussi, C., & Palitti, F. (2001). Hypersensitivity to camptothecin in MSH2 deficient cells is correlated with a role for MSH2 protein in recombinational repair. Carcinogenesis, 22(11), 1781-1787.

Pinton, A., Calgaro, A., Bonnet, N., Ferchaud, S., Billoux, S., Dudez, A. M., Mary, N., Massip, K., Bonnet-Garnier, A., Yerle, M., & Ducos, A. (2009). Influence of sex on the meiotic segregation of a t(13;17) Robertsonian translocation: a case study in the pig. Human reproduction, 24(8), 2034-2043.

Pinton, A., Ducos, A., & Yerle, M. (2003). Chromosomal rearrangements in cattle and pigs revealed by chromosome microdissection and chromosome painting. Genetics Selection Evolution, 35(7), 685.

Pinton, A., Ducos, A., Berland, H., Seguela, A., Brun‐Baronnat, C., Darré, A., Darré, R., Schmitz, A., & Yerle, M. (2000). Chromosomal abnormalities in hypoprolific boars. Hereditas, 132(1), 55-62.

Pinton, A., Ducos, A., Séguéla, A., Berland, H. M., Darré, R., Darré, A., Pinton, P., Schmitz, A., Cribiu, E.P., & Yerle, M. (1998). Characterization of reciprocal translocations in pigs using dual-colour chromosome painting and primed in situ DNA labelling. Chromosome Research, 6(5), 361-366.

Pinton, A., Faraut, T., Yerle, M., Gruand, J., Pellestor, F., & Ducos, A. (2005). Comparison of male and female meiotic segregation patterns in translocation heterozygotes: a case study in an animal model (Sus scrofa domestica L.). Human Reproduction, 20(9), 2476-2482.

Piwko, W., Buser, R., & Peter, M. (2011). Rescuing stalled replication forks: MMS22L-TONSL, a novel complex for DNA replication fork repair in human cells.

Popescu, C. P., & Legault, C. (1979). A new reciprocal translocation t (4q+; 14q-) in the domestic pig (Sus scrofa domesticus)[cytogenetics, chromosomes, reduced prolificacy, France]. Annales de Genetique et de Selection Animale (France).

Popescu, C. P., Bonneau, M., Tixier, M., Bahri, I., & Boscher, J. (1984). Reciprocal translocations in pigs: their detection and consequences on animal performance and economic losses. Journal of Heredity, 75(6), 448-452.

Popescu, C. P., Boscher, J., & Tixier, M. (1983). Une nouvelle translocation réciproque t, rcp (7 q-; 15 q+) chez un verrat «hypoprolifique». Génétique Sélection Évolution, 15(4), 479-488.

Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A., Bender, D., Maller, J., Sklar, P., De Bakker, P.I., Daly, M.J., & Sham, P. C. (2007). PLINK: a tool set for whole-genome association and population-based linkage analyses. The American journal of human genetics, 81(3), 559-575.

Quach, A. T., Revay, T., Villagomez, D. A., Macedo, M. P., Sullivan, A., Maignel, L., Wyss, S., Sullivan, B., & King, W. A. (2016). Prevalence and consequences of chromosomal abnormalities in Canadian commercial swine herds. Genetics Selection Evolution, 48(1), 66.

Quilter, C. R., Wood, D., Southwood, O. I., & Griffin, D. K. (2003). X/XY/XYY mosaicism as a cause of subfertility in boars: a single case study. Animal genetics, 34(1), 51-54.

Ramayo-Caldas, Y., Castelló, A., Pena, R. N., Alves, E., Mercadé, A., Souza, C. A., Fernández, A.I., Perez-Enciso, M., & Folch, J. M. (2010). Copy number variation in the porcine genome inferred from a 60 k SNP BeadChip. BMC genomics, 11(1), 593.

Raudsepp, T., & Chowdhary, B. P. (2011). Cytogenetics and chromosome maps. The genetics of the pig, (Ed. 2), 134-178.

Rejduch, B., Slota, E., Rozycki, M., & Koscielny, M. (2003). Chromosome number polymorphism in a litter of European wild boar [Sus scrofa scrofa L.]. Animal Science Papers and Reports, 1(21).

Richardson, C., & Jasin, M. (2000). Frequent chromosomal translocations induced by DNA double-strand breaks. Nature, 405(6787), 697-700.

Riggs, P. K., Kuczek, T., Chrisman, C. L., & Bidwell, C. A. (1993). Analysis of aphidicolin-induced chromosome fragility in the domestic pig (Sus scrofa). Cytogenetic and Genome Research, 62(2-3), 110-116.

Risch, N., Stein, Z., Kline, J., & Warburton, D. (1986). The relationship between maternal age and chromosome size in autosomal trisomy. American journal of human genetics, 39(1), 68.

Robberecht, C., Voet, T., Esteki, M. Z., Nowakowska, B. A., & Vermeesch, J. R. (2013). Nonallelic homologous recombination between retrotransposable elements is a driver of de novo unbalanced translocations. Genome research, 23(3), 411-418.

Robinson, J. A. B., & Buhr, M. M. (2005). Impact of genetic selection on management of boar replacement. Theriogenology, 63(2), 668-678.

Robinson, J. A. B., & Quinton, V. M. (2002, August). Genetic parameters of early neo-natal piglet survival and number of piglets born. In 7th World Congress on Genetics Applied to Livestock Production.

Rønne, M. (1990). Chromosome preparation and high resolution banding. In vivo (Athens, Greece), 4(6), 337-365.

Rønne, M. (1995). Localization of fragile sites in the karyotype of Sus scrofa domestica: present status. Hereditas, 122(2), 153-162.

Rothkamm, K., Krüger, I., Thompson, L. H., & Löbrich, M. (2003). Pathways of DNA double-strand break repair during the mammalian cell cycle. Molecular and cellular biology, 23(16), 5706-5715.

Sánchez‐Sánchez, R., Gómez‐Fidalgo, E., Pérez‐Garnelo, S., Martín‐Lluch, M., & De la Cruz‐Vigo, P. (2019). Prevalence of chromosomal aberrations in breeding pigs in Spain. Reproduction in Domestic Animals, 54, 98-101.

Sankaranarayanan, K. (1979). The role of non-disjunction in aneuploidy in man an overview. Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis, 61(1), 1-28.

Sawyer, J. R., Lukacs, J. L., Munshi, N., Desikan, K. R., Singhal, S., Mehta, J., Siegel, D., Shaughnessy, J. & Barlogie, B. (1998). Identification of new nonrandom translocations in multiple myeloma with multicolor spectral karyotyping. Blood, The Journal of the American Society of Hematology, 92(11), 4269-4278.

Sax, K. (1938). Chromosome aberrations induced by X-rays. Genetics, 23(5), 494.

Schwerin, M., Golisch, D., & Ritter, E. (1986). A Robertsonian translocation in swine. Genetique, selection, evolution, 18(4), 367.

Scriven, P. N., Handyside, A. H., & Ogilvie, C. M. (1998). Chromosome translocations: segregation modes and strategies for preimplantation genetic diagnosis. Prenatal Diagnosis: Published in Affiliation With the International Society for Prenatal Diagnosis, 18(13), 1437-1449.

Seabright, M. (1971). A rapid banding technique for human chromosomes. The Lancet, 2, 971-972.

Séguéla-Arnaud, M., Crismani, W., Larchevêque, C., Mazel, J., Froger, N., Choinard, S., Lemhemdi, A., Macaisne, N., Van Leene, J., Gevaert, K., & De Jaeger, G. (2015). Multiple mechanisms limit meiotic crossovers: TOP3α and two BLM homologs antagonize crossovers in parallel to FANCM. Proceedings of the National Academy of Sciences, 112(15), 4713-4718.

Shaw, G. (2013). Polymorphism and single nucleotide polymorphisms (SNPs). BJU international, 112(5), 664-665.

Shen, X., Mizuguchi, G., Hamiche, A., & Wu, C. (2000). A chromatin remodelling complex involved in transcription and DNA processing. Nature, 406(6795), 541-544.

Shin, S. Y., Jang, S., Park, C. J., Chi, H. S., Lee, K. H., Huh, J., & Seo, E. J. (2012). A rare case of Lennert’s type peripheral T‐cell lymphoma with t(14;19)(q11.2;q13.3). International journal of laboratory hematology, 34(3), 328-332.

Shin, Y. H., Choi, Y., Erdin, S. U., Yatsenko, S. A., Kloc, M., Yang, F., Wang, P.J., Meistrich, M.L., & Rajkovic, A. (2010). Hormad1 mutation disrupts synaptonemal complex formation, recombination, and chromosome segregation in mammalian meiosis. PLoS genetics, 6(11).

Skinner, B. M., Sargent, C. A., Churcher, C., Hunt, T., Herrero, J., Loveland, J. E., Dunn, M., Louzada, S., Fu, B., Chow, W., et al. (2016). The pig X and Y Chromosomes: structure, sequence, and evolution. Genome research, 26(1), 130-139.

Smit, AFA, Hubley, R & Green, P. RepeatMasker Open-4.0. 2013-2015.

Somlev, B., Hansen-Melander, E., Melander, Y., & Holm, L. (1970). XX/XY chimerism in leucocytes of two intersexual pigs. Hereditas, 64(2), 203-210.

Soulas-Sprauel, P., Rivera-Munoz, P., Malivert, L., Le Guyader, G., Abramowski, V., Revy, P., & De Villartay, J. P. (2007). V (D) J and immunoglobulin class switch recombinations: a paradigm to study the regulation of DNA end-joining. Oncogene, 26(56), 7780-7791.

Southwood, O. I., & Kennedy, B. W. (1991). Genetic and environmental trends for litter size in swine. Journal of animal science, 69(8), 3177-3182.

Stankiewicz, P., & Lupski, J. R. (2010). Structural variation in the human genome and its role in disease. Annual review of medicine, 61, 437-455.

Statistics Canada. Canadian Farm Cash Receipts. 2018.

Statistics Canada. Table 32-10-0160-01 Hogs statistics, number of hogs on farms at end of semi-annual period (x 1,000)

Storey, J. D., & Tibshirani, R. (2003). Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences, 100(16), 9440-9445.

Sumner, A. T. (1972). A simple technique for demonstrating centromeric heterochromatin. Exp. Cell Res., 75, 304-306.

Sumner, A. T., Evans, H. J., & Buckland, R. A. (1971). New technique for distinguishing between human chromosomes. Nature New Biology, 232(27), 31-32.

Syeda, A. H., Hawkins, M., & McGlynn, P. (2014). Recombination and replication. Cold Spring Harbor perspectives in biology, 6(11), a016550.

Tarocco, C., Franchi, F., & Croci, G. (1987). A new reciprocal translocation involving chromosomes 1/14 in a boar. Genetique, selection, evolution, 19(3), 381.

Tate, J. G., Bamford, S., Jubb, H. C., Sondka, Z., Beare, D. M., Bindal, N., ... & Fish, P. (2019). COSMIC: the catalogue of somatic mutations in cancer. Nucleic acids research, 47(D1), D941-D947. (cancer.sanger.ac.uk)

Taylor, T. H., Gitlin, S. A., Patrick, J. L., Crain, J. L., Wilson, J. M., & Griffin, D. K. (2014). The origin, mechanisms, incidence and clinical consequences of chromosomal mosaicism in humans. Human reproduction update, 20(4), 571-581.

Telenius, H., Ponder, B. A., Tunnacliffe, A., Pelmear, A. H., Carter, N. P., Ferguson‐Smith, M. A., Nordenskjold, M., Pfragner, R., Ponder, B. A. (1992). Cytogenetic analysis by chromosome painting using DOP‐PCR amplified flow‐sorted chromosomes. Genes, Chromosomes and Cancer, 4(3), 257-263.

Thomas, N. S., Morris, J. K., Baptista, J., Ng, B. L., Crolla, J. A., & Jacobs, P. A. (2010). De novo apparently balanced translocations in man are predominantly paternal in origin and associated with a significant increase in paternal age. Journal of medical genetics, 47(2), 112-115.

Tjio, J. H., & Levan, A. (1956). The chromosome number of man. Hereditas, 42(1‐2), 1-6.

Toyama, Y. (1974). Sex chromosome mosaicisms in five swine intersexes. Japanese Journal of Zootechnical Science.

Treff, N. R., Tao, X., Schillings, W. J., Bergh, P. A., Scott Jr, R. T., & Levy, B. (2011). Use of single nucleotide polymorphism microarrays to distinguish between balanced and normal chromosomes in embryos from a translocation carrier. Fertility and sterility, 96(1), e58-e65.

Truong, L. N., Li, Y., Shi, L. Z., Hwang, P. Y. H., He, J., Wang, H., Razavian, N., Berns, M.W., & Wu, X. (2013). Microhomology-mediated End Joining and Homologous Recombination share the initial end resection step to repair DNA double-strand breaks in mammalian cells. Proceedings of the National Academy of Sciences, 110(19), 7720-7725.

Tsukuda, T., Lo, Y. C., Krishna, S., Sterk, R., Osley, M. A., & Nickoloff, J. A. (2009). INO80-dependent chromatin remodeling regulates early and late stages of mitotic homologous recombination. DNA repair, 8(3), 360-369.

Uimari, P., & Tapio, M. (2011). Extent of linkage disequilibrium and effective population size in Finnish Landrace and Finnish Yorkshire pig breeds. Journal of animal science, 89(3), 609-614.

USDA, 2020 Livestock and Poultry: World Markets and Trade https://downloads.usda.library.cornell.edu/usda-esmis/files/73666448x/sb397r25n/0z709c25b/livestock_poultry.pdf

van Attikum, H., Fritsch, O., Hohn, B., & Gasser, S. M. (2004). Recruitment of the INO80 complex by H2A phosphorylation links ATP-dependent chromatin remodeling with DNA double-strand break repair. Cell, 119(6), 777-788.

Van Holde, K. E. (2012). Chromatin. Springer Science & Business Media.

van Oers, J. M., Edwards, Y., Chahwan, R., Zhang, W., Smith, C., Pechuan, X., Schaetzlein, S., Jin, B., Wang, Y., Bergman, A., & Scharff, M. D. (2014). The MutSβ complex is a modulator of p53-driven tumorigenesis through its functions in both DNA double-strand break repair and mismatch repair. Oncogene, 33(30), 3939-3946.

Vasen, H. F. A., Stormorken, A., Menko, F. H., Nagengast, F. M., Kleibeuker, J. H., Griffioen, G., Taal, B.G., Moller, P., & Wijnen, J. T. (2001). MSH2 mutation carriers are at higher risk of cancer than MLH1 mutation carriers: a study of hereditary nonpolyposis colorectal cancer families. Journal of Clinical Oncology, 19(20), 4074-4080.

Verardo, L. L., Sevón-Aimonen, M. L., Serenius, T., Hietakangas, V., & Uimari, P. (2017). Whole-genome association analysis of pork meat pH revealed three significant regions and several potential genes in Finnish Yorkshire pigs. BMC genetics, 18(1), 13.

Villagomez, D. A. F., Gustavsson, I., Jönsson, L., & Plöen, L. (1995). Reciprocal chromosome translocation, rcp (7; 17)(q26; q11), in a boar giving reduced litter size and increased rate of piglets dying in the early life. Hereditas, 122(3), 257-267.

Villagómez, D. A., Gustavsson, I., & Plöen, L. (1995). Synaptonemal complex analysis in a boar with tertiary trisomy, product of a rcp (7; 17)(q26; q11) translocation. Hereditas, 122(3), 269-277.

Villagomez, D. A., Revay, T., Donaldson, B., Rezaei, S., Pinton, A., Palomino, M., ... & King, W. A. (2017). Azoospermia and testicular hypoplasia in a boar carrier of a novel Y-autosome translocation. Sexual Development, 11(1), 46-51.

Visscher, P. M., Wray, N. R., Zhang, Q., Sklar, P., McCarthy, M. I., Brown, M. A., & Yang, J. (2017). 10 years of GWAS discovery: biology, function, and translation. The American Journal of Human Genetics, 101(1), 5-22.

Vogt, D. W., Arakaki, D. T., & Brooks, C. C. (1974). Reduced litter size associated with aneuploid cell lines in a pair of full-brother Duroc boars. Am. J. Vet. Res., 35, 1127-1130.

Vorsanova, S. G., Yurov, Y. B., & Iourov, I. Y. (2010). Human interphase chromosomes: a review of available molecular cytogenetic technologies. Molecular cytogenetics, 3(1), 1.

Wang, C., Higgins, J. D., He, Y., Lu, P., Zhang, D., & Liang, W. (2017). Resolvase OsGEN1 mediates DNA repair by homologous recombination. Plant physiology, 173(2), 1316-1329.

Wang, H. C., & Fedoroff, S. (1972). Banding in human chromosomes treated with trypsin. Nature New Biology, 235(54), 52-54.

Wang, K., Li, M., Hadley, D., Liu, R., Glessner, J., Grant, S. F., Hakonarson, H., & Bucan, M. (2007). PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome research, 17(11), 1665-1674.

Wang, L., Xu, L., Liu, X., Zhang, T., Li, N., Zhang, Y., Yan, H., Zhao, K., Liu, G.E., Zhang, L., & Wang, L. (2015). Copy number variation-based genome wide association study reveals additional variants contributing to meat quality in Swine. Scientific reports, 5, 12535.

Wang, Y., Ding, X., Tan, Z., Ning, C., Xing, K., Yang, T., ... & Wang, C. (2017). Genome-wide association study of piglet uniformity and farrowing interval. Frontiers in genetics, 8, 194.

Warburton, D. (1991). De novo balanced chromosome rearrangements and extra marker chromosomes identified at prenatal diagnosis: clinical significance and distribution of breakpoints. American journal of human genetics, 49(5), 995.

Watson, P., Vasen, H. F., Mecklin, J. P., Bernstein, I., Aarnio, M., Järvinen, H. J., Myrhøj, T., Sunde, L., Wijnen, J.T., & Lynch, H. T. (2008). The risk of extra‐colonic, extra‐endometrial cancer in the Lynch syndrome. International journal of cancer, 123(2), 444-449.

Whitby, M. C. (2010). The FANCM family of DNA helicases/translocases. DNA repair, 9(3), 224-236.

Winchester, L., Yau, C., & Ragoussis, J. (2009). Comparing CNV detection methods for SNP arrays. Briefings in functional genomics and proteomics, 8(5), 353-366.

Winiwarter, H. V. (1912). Etudes sur la spermatogenese humaine. Arch Biol (Liege), 27(91), 189.

Wolstenholme, J. (1995). An audit of trisomy 16 in man. Prenatal diagnosis, 15(2), 109-121.

Wu, P., Wang, K., Zhou, J., Yang, Q., Yang, X., Jiang, A., Jiang, Y., Li, M., Zhu, L., Bai, L. & Li, X. (2019). A genome wide association study for the number of animals born dead in domestic pigs. BMC genetics, 20(1), 4.

Wu, S., Shi, Y., Mulligan, P., Gay, F., Landry, J., Liu, H., Lu, J., Qi, H.H., Wang, W., Nickoloff, J.A., & Wu, C. (2007). A YY1–INO80 complex regulates genomic stability through homologous recombination–based repair. Nature structural & molecular biology, 14(12), 1165-1172.

Wyrobek, A. J., Schmid, T. E., & Marchetti, F. (2005). Relative susceptibilities of male germ cells to genetic defects induced by cancer chemotherapies. JNCI Monographs, 2005(34), 31-35.

Xu, Y., & Her, C. (2015). Inhibition of topoisomerase (DNA) I (TOP1): DNA damage repair and anticancer therapy. Biomolecules, 5(3), 1652-1670.

Yamaguchi, H., Uchihori, Y., Yasuda, N., Takada, M., & Kitamura, H. (2005). Estimation of yields of OH radicals in water irradiated by ionizing radiation. Journal of radiation research, 46(3), 333-341.

Yang, C. P., Wu, J. H., Hung, I. J., & Jaing, T. H. (2000). Cytogenetic pattern of childhood leukemia in Taiwan. Journal of the Formosan Medical Association= Taiwan yi zhi, 99(4), 281-289.

Yang, M. Y., & Long, S. E. (1993). Folate sensitive common fragile sites in chromosomes of the domestic pig (Sus scrofa). Research in veterinary science, 55(2), 231-235.

Yerle, M., Galman, O., & Echard, G. (1991). The high-resolution GTG-banding pattern of pig chromosomes. Cytogenetic and Genome Research, 56(1), 45-47.

Yerle, M., Lahbib‐Mansais, Y., Gellin, J., & Thomsen, P. D. (1993). Localization of the porcine growth hormone gene to chromosome 12p1.2→p1.5. Animal Genetics, 24(2), 129-131.

Yu, C. W., Borgaonkar, D. S., & Bolling, D. R. (1978). Break points in human chromosomes. Human heredity, 28(3), 210-225.

Zhang, F., Gu, W., Hurles, M. E., & Lupski, J. R. (2009). Copy number variation in human health, disease, and evolution. Annual review of genomics and human genetics, 10, 451-481.

Appendix I Cases of Novel Constitutional Reciprocal Translocation in Canadian Boars

Case #1: A Canadian Yorkshire boar carrying a rcp(3;13)(q21;q21)

Figure 12: A. GTG-banded karyotype of a Yorkshire boar carrying a t(3;13)(q21;q21). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #2: A Duroc boar carrying a rcp(1;7)(q21;p11)

Figure 13: A. GTG-banded karyotype of a Duroc boar carrying a t(1;7)(q21;p11). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #3: A Canadian Yorkshire boar carrying a rcp(5;12)(q11;q12)

Figure 14: A. GTG-banded karyotype of a Duroc boar carrying a t(5;12)(q11;q12). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #4: A Canadian Yorkshire boar carrying a rcp(1;3)(p23;q25)

Figure 15: A. GTG-banded karyotype of a Yorkshire boar carrying a t(1;3)(p23;q25). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #5: A Duroc boar carrying a rcp(4;12)(p11;p15)

Figure 16: A. GTG-banded karyotype of a Duroc boar carrying a t(4;12)(p11;p15). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #6: A French Landrace boar carrying a rcp(4;9)(p13;p24)

Figure 17: A. GTG-banded karyotype of a Landrace boar carrying a t(4;9)(p13;p24). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #7: A Yorkshire boar carrying a rcp(14;15)(q13;q15)

Figure 18: A. GTG-banded karyotype of a Landrace boar carrying a t(14;15)(q13;q15). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #8: A French Yorkshire boar carrying a rcp(Y;13)(p13;q33)

Figure 19: A. GTG-banded karyotype of a Yorkshire boar carrying a t(Y;13)(p13;q33). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #9: A Duroc boar carrying a rcp(9;13)(q24;q31)

Figure 20: A. GTG-banded karyotype of a Duroc boar carrying a t(9;13)(q24;q31). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #10: A French Landrace boar carrying a rcp(3;6)(q25;q11)

Figure 21: A. GTG-banded karyotype of a Landrace boar carrying a t(3;6)(q25;q11). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #11: A Yorkshire boar carrying a rcp(4;6)(q11;q27)

Figure 22: A. GTG-banded karyotype of a Yorkshire boar carrying a t(4;6)(q11;q27). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #12: A Yorkshire boar carrying a rcp(2;15)(q13;q24)

Figure 23: A. GTG-banded karyotype of a Yorkshire boar carrying a t(2;15)(q13;q24). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #13: A Yorkshire boar carrying a rcp(1;14)(q21;q14)

Figure 24: A. GTG-banded karyotype of a Yorkshire boar carrying a t(1;14)(q21;q14). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #14: A French Landrace boar carrying a rcp(3;6)(q13;p13)

Figure 25: A. GTG-banded karyotype of a Landrace boar carrying a t(3;6)(q13;p13). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #15: A Duroc boar carrying a rcp(12;14)(q13;q21)

Figure 26: A. GTG-banded karyotype of a Duroc boar carrying a t(12;14)(q13;q21). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #16: A French Yorkshire boar carrying a rcp(6;7)(q33;q22)

Figure 27: A. GTG-banded karyotype of a Yorkshire boar carrying a t(6;7)(q33;q22). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #17: A Duroc boar carrying a rcp(13;18)(q21;q13)

Figure 28: A. GTG-banded karyotype of a Duroc boar carrying a t(13;18)(q21;q13). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #18: A Duroc boar carrying a rcp(10;13)(p13;q31)

Figure 29: A. GTG-banded karyotype of a Duroc boar carrying a t(10;13)(p13;q31). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #19: A Landrace boar carrying a rcp(12;14)(q15;q23)

Figure 30: A. GTG-banded karyotype of a Landrace boar carrying a t(12;14)(q15;q23). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #20: A French Yorkshire boar carrying a rcp(10;13)(q13;q21)

Figure 31: A. GTG-banded karyotype of a Yorkshire boar carrying a t(10;13)(q13;q21). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #21: A Duroc boar carrying a rcp(2;10)(p17;q13)

Figure 32: A. GTG-banded karyotype of a Duroc boar carrying a t(2;10)(p17;q13). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #22: A French Yorkshire boar carrying a rcp(15;18)(q24;q24)

Figure 33: A. GTG-banded karyotype of a Yorkshire boar carrying a t(15;18)(q24;q24). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #23: A French Yorkshire boar carrying a rcp(6;15)(q33;q13)

Figure 34: A. GTG-banded karyotype of a Yorkshire boar carrying a t(6;15)(q33;q13). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #24: A French Yorkshire boar carrying a rcp(2;17)(p17;q13)

Figure 35: A. GTG-banded karyotype of a Yorkshire boar carrying a t(2;17)(p17;q13). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #25: A French Yorkshire boar carrying a rcp(4;15)(q21;q11)

Figure 36: A. GTG-banded karyotype of a Yorkshire boar carrying a t(4;15)(q21;q11). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #26: A Yorkshire boar carrying a rcp(Y;1)(q11;q17)

Figure 37: A. GTG-banded karyotype of a Yorkshire boar carrying a t(Y;1)(q11;q17). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #27: A French Yorkshire boar carrying a rcp(9;14)(p13;q11)

Figure 38: A. GTG-banded karyotype of a Yorkshire boar carrying a t(9;14)(p13;q11). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #28: A Canadian Landrace boar carrying a rcp(1;14)(q2.11;q25)

Figure 39: A. GTG-banded karyotype of a Landrace boar carrying a t(1;14)(q2.11;q25). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #29: A French Yorkshire boar carrying a rcp(5;18)(q21;q11)

Figure 40: A. GTG-banded karyotype of a Yorkshire boar carrying a t(5;18)(q21;q11). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #30: A Yorkshire boar carrying a rcp(5;13)(q21;q43)

Figure 41: A. GTG-banded karyotype of a Yorkshire boar carrying a t(5;13)(q21;q43). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #31: A French Yorkshire boar carrying a rcp(7;9)(q15;p24)

Figure 42: A. GTG-banded karyotype of a French Yorkshire boar carrying a t(7;9)(q15;p24). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #32: A Crossbred boar carrying a rcp(13;14)(q31;q29)

Figure 43: A. GTG-banded karyotype of a Crossbred boar carrying a t(13;14)(q31;q29). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Repeated Cases of Constitutional Reciprocal Translocations Acquired via Inheritance

Case #2b through 2i: Duroc boars carrying identical rcp(1;7)(q21;p11)

Case #2, a carrier of a rcp(1;7)(q21;p11), was bred experimentally after the initial cytogenetic investigation revealed the rearrangement, producing fifteen litters. Peripheral blood samples from fifteen boars resulting from these litters entered our laboratory over a three-week period for cytogenetic analysis. Eight of these boars, referred to as Cases #2b through #2i, were found to have inherited the rcp(1;7)(q21;p11) rearrangement from Case #2. The karyotype of Case #2b is presented below (Figure 44).

Figure 44: A. GTG-banded karyotype of a Duroc boar carrying a t(1;7)(q21;p11). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #3b and 3c: Canadian Yorkshire boars carrying identical rcp(5;12)(q11;q12)

Case #3, a carrier of a rcp(5;12)(q11;q12), was bred experimentally after initial cytogenetic reporting, producing fifteen litters. Peripheral blood samples from five boars resulting from these litters entered our laboratory in the same week for cytogenetic analysis. Two of these boars, referred to as Cases #3b and #3c, were found to have inherited the rcp(5;12)(q11;q12) rearrangement from Case #3. The karyotype of Case #3b is presented below (Figure 45).

Figure 45: A. GTG-banded karyotype of a Yorkshire boar carrying a t(5;12)(q11;q12). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Cases #15b through 15g: Duroc boars and gilts/sows carrying identical rcp(12;14)(q13;q21)

Case #15, a carrier of a rcp(12;14)(q13;q21), was identified in our laboratory. After the rearrangement was reported, six more carriers of the rcp(12;14)(q13;q21) were identified. These carriers were all Duroc boars and gilts/sows from the same farm as Case #15, and were later revealed to be directly related to the original carrier. The karyotype of Case #15b is presented below (Figure 46).

Figure 46: A. GTG-banded karyotype of a Duroc boar carrying a t(12;14)(q13;q21). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Cases #17b through 17g: Duroc boars and gilts carrying identical rcp(13;18)(q21;q13)

Case #17, a carrier of a rcp(13;18)(q21;q13), was identified in our laboratory. After the rearrangement was reported, six additional carriers of the rcp(13;18)(q21;q13) were identified. Cases #17b, #17c, #17d, and #17g were boars, while Cases #17e and #17f were gilts. All were the offspring of the original carrier of the rcp(13;18)(q21;q13), and inherited this rearrangement. The karyotype of Case #17b is presented below (Figure 47).

Figure 47: A. GTG-banded karyotype of a Duroc boar carrying a t(13;18)(q21;q13). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Cases #18b through 18f: Duroc boars and gilts carrying identical rcp(10;13)(p13;q31)

Case #18, a carrier of a rcp(10;13)(p13;q31), was identified in our laboratory. After the rearrangement was reported, five additional carriers of the rcp(10;13)(p13;q31) were identified. Case #18b was a boar, while Cases #18c through #18f were gilts. All were the offspring of the original carrier, Case #18, and inherited this rearrangement directly. The karyotype of Case #18b is presented below (Figure 48).

Figure 48: A. GTG-banded karyotype of a Duroc boar carrying a t(10;13)(p13;q31). Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Cases of Mosaic Reciprocal Translocation in Canadian Boars

Cases #65 through #67: Boars carrying recurrent mos t(7;9)(q24;q24)

Peripheral blood samples from three unrelated boars (two Duroc and one of unknown breed) entered our laboratory for the purpose of cytogenetic control. Standard GTG-banding and karyotyping were performed for each animal. In each case, one of the two karyotypes arranged carried a mos t(7;9)(q24;q24). In each case additional karyotypes (n = 25) were arranged, and no further abnormal karyotypes were observed, indicating that this rearrangement was present in a mosaic state. This particular mosaic rearrangement has been found recurrently in Canadian swine herds, with 18 cases previously described in detail by Rezaei et al. (2020). With these three cases included, a total of 21 cases of mos t(7;9)(q24;q24) have now been described during the routine cytogenetic screening of Canadian swine herds.

Figure 49: A. GTG-banded karyotype of a Pietrain boar carrying a t(7;9)(q24;q24) in a mosaic state. Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #68: A Duroc boar carrying a recurrent mos t(7;18)(q22;q11)

A peripheral blood sample from a Duroc boar entered our laboratory for the purpose of cytogenetic control. Standard GTG-banding and the arrangement of two karyotypes revealed one karyotype presenting with a mos t(7;18)(q22;q11). Additional karyotypes were arranged, bringing the total number to 25, and no additional rearrangements were observed, indicating that this rearrangement was in a mosaic state. This rearrangement has been described previously in two other boars by our laboratory, and is proposed to occur recurrently in Canadian swine herds (Rezaei et al., 2020).

Figure 50: A. GTG-banded karyotype of a Duroc boar carrying a t(7;18)(q22;q11) in a mosaic state. Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #69: A Landrace boar carrying a mos t(8;16)(q21;q21)

Figure 51: A. GTG-banded karyotype of a Landrace boar carrying a t(8;16)(q21;q21) in a mosaic state. Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #70: A Yorkshire boar carrying a mos t(1;2)(p23;q22)

Figure 52: A. GTG-banded karyotype of a French Yorkshire boar carrying a t(1;2)(p23;q22) in a mosaic state. Derivative chromosomes are placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of each chromosome pair indicates the normal chromosome structure. The ideogram to the right of each chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #71: A Duroc boar carrying a mos t(1;1)(q2.11;q21)

Figure 53: A. GTG-banded karyotype of a Duroc boar carrying a t(1;1)(q2.11;q21) in a mosaic state. Derivative chromosomes are indicated with arrows positioned at the presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the far left of the chromosome pair indicates the normal chromosome structure. The ideogram directly left of the chromosome pair indicates the first derivative chromosome structure, and the ideogram to the right indicates the second derivative chromosome structure. Arrows indicate presumptive breakpoints.

Cases of Chromosomal Deletion, Inversion, and Chimerism in Canadian Boars

Case #72: A Boar carrying a del(Y)

A peripheral blood sample from a boar entered our laboratory for the purpose of cytogenetic control. GTG-banding and subsequent karyotyping revealed an apparent deletion of the long arm of chromosome Y. Subsequent karyotypes revealed that this was not an artifact but a constitutional deletion of the long arm of chromosome Y (Yq-). The deletion appears to be at or near the centromere, with the entire long arm missing. This is the first case of a Y-chromosome deletion identified in pigs, and is just the second case of a boar carrying a constitutional aneuploidy ever identified.

Figure 54: A. GTG-banded karyotype of a boar carrying a del(Y). The derivative chromosome is highlighted by an arrow. B. GTG-banded chromosomes. The chromosome to the right is the derivative chromosome. The chromosome to the left is a normal Y chromosome for reference. The ideogram to the left of the chromosome pair indicates the normal chromosome structure. The ideogram to the right of the chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #73: A Duroc boar carrying an inv(9)(p11;p22)

Figure 55: A. GTG-banded karyotype of a Duroc boar carrying an inv(9)(p11;p22). The derivative chromosome is placed to the right. Arrows indicate presumptive breakpoints. B. GTG-banded chromosomes. The ideogram to the left of the chromosome pair indicates the normal chromosome structure. The ideogram to the right of the chromosome pair indicates the derivative chromosome structure. Arrows indicate presumptive breakpoints.

Case #74: A Duroc boar carrying two distinct cell lines, XX and XY

A peripheral blood sample from a Duroc boar entered our laboratory for the purpose of cytogenetic control. GTG-banding and the subsequent arrangement of multiple karyotypes revealed that the boar carried two distinct cell lines with different sex chromosome compositions. The arrangement of additional karyotypes, bringing the total number to 10, indicated that nine cells had an XY composition and one cell had an XX composition. The boar is thus described as 38,XX[1]/38,XY[9].

Figure 56: Karyotypes of a boar exhibiting two distinct cell lines. A. GTG-banded karyotype of a boar with XY sex chromosomes. B. GTG-banded karyotype of the same boar with a different cell line with XX sex chromosomes.

Table 50: List of Constitutional Reciprocal Translocations in the Domestic Pig

Year | Rearrangement | Country | Breed | Author
2017 | t(1;2) | United Kingdom | Not Indicated | O'Connor et al., 2017
1987 | t(1;3)(p;q) | Russia | Large White | Konovalov et al., 1987
2019 | t(1;3)(p23;q25) | Canada | Yorkshire | Donaldson et al., 2019
2007 | t(1;4)(q27;q21) | Germany | Pietrain | Ducos et al., 2007
1994 | t(1;5)(p21;q21) | Poland | Not Indicated | Danielak-Czech et al., 1994
2016 | t(1;5)(q21;q23) | Canada | Landrace | Quach et al., 2016
2009 | t(1;6)(p22;q12) | Canada | Large White x Duroc | Quach et al., 2009
1992 | t(1;6)(p11;q11) | Switzerland | Landrace | Yang et al., 1992
1976 | t(1;6)(p11;q35) | Sweden | Large White | Locniskar et al., 1976
1998 | t(1;6)(q12;q22) | France | Gascon | Ducos et al., 1998a
2014 | t(1;6)(q17;p11) | Spain | Not Indicated | Martin-Iluch et al., 2014
2000 | t(1;6)(q17;q35) | France | Synthetic | Pinton et al., 2000
2007 | t(1;7)(q17;q13) | The Netherlands | Not Indicated | Ducos et al., 2007
2019 | t(1;7)(q21;p11) | Canada | Duroc | Donaldson et al., 2019
1988 | t(1;7)(q2.13;q24) | Sweden | Landrace | Gustavsson et al., 1988

1982 | t(1;8)(p13;q27) | Sweden | Yorkshire | Gustavsson et al., 1982
1998 | t(1;9)(p-;p+) | France | Large White | Ducos et al., 1998a
1992 | t(1;10)(q2.11;p15) | Sweden | Not Indicated | Ravaoarimanana et al., 1992
2010 | t(1;11)(q-;p+) | Spain | Not Indicated | Rodriguez et al., 2010
1988 | t(1;11)(p23;q15) | Finland | Landrace | Kuokkanen and Makinen, 1988
1994 | t(1;11)(p21;q15) | Bulgaria | Landrace | Tzocheva et al., 1994
2007 | t(1;11)(q11;q11) | France | Pietrain | Ducos et al., 2007
2007 | t(1;11)(q24;p13) | Spain | Not Indicated | Ducos et al., 2007
2007 | t(1;13)(q27;q41) | The Netherlands | Duroc | Ducos et al., 2007
1984 | t(1;14)(p25;q15) | Sweden | Yorkshire | Gustavsson, 1984
1987 | t(1;14)(q17;q21) | Italy | Large White | Tarocco et al., 1987
2019 | t(1;14)(q21;q14) | Canada | Yorkshire | Donaldson et al., 2019
1982 | t(1;14)(q23;q21) | Germany | Synthetic | Golisch et al., 1982
2016 | t(1;14)(q2.11;q12) | France | Landrace x Duroc | Barasc et al., 2016
2019 | t(1;14)(q2.11;q25) | Canada | Landrace | Donaldson et al., 2019
1992 | t(1;14)(q2.12;q22) | U.S.A | Not Indicated | Zhang et al., 1992
2019 | t(1;15)(q-;q+) | Spain | Synthetic | Sanchez-Sanchez et al., 2019
1988 | t(1;15)(p25;q13) | Finland | Landrace | Kuokkanen and Makinen, 1988
2007 | t(1;15)(q17;q22) | The Netherlands | Synthetic Line | Ducos et al., 2007
1988 | t(1;15)(q27;q26) | France | Large White | Popescu et al., 1988
2016 | t(1;15)(q2.11;q13) | Canada | Yorkshire | Quach et al., 2016
1982 | t(1;16)(p11;q22) | Switzerland | Landrace | Fries and Strazinger, 1982
1981 | t(1;16)(q11;q11) | Germany | Landrace | Forster et al., 1981
2007 | t(1;17)(p11;q11) | The Netherlands | Pietrain | Ducos et al., 2007
1984 | t(1;17)(q21;q11) | Sweden | Yorkshire | Gustavsson, 1984
1991 | t(1;18)(q2.13;q21) | Sweden | Hampshire | Villagomez et al., 1991
1982 | t(2;4)(p17;q11) | Sweden | Yorkshire | Gustavsson et al., 1982
2016 | t(2;5)(p16;p11) | Canada | Landrace | Quach et al., 2016
2002 | t(2;6)(p17;q27) | France | Yorkshire | Ducos et al., 2002
2007 | t(2;8)(p11;p13) | France | Large White | Ducos et al., 2007
2011 | t(2;8)(q-;q+) | Spain | Not Indicated | Garcia-Vazquez et al., 2011
2007 | t(2;9)(q13;q24) | France | Large White | Ducos et al., 2007
2019 | t(2;10)(q-;q+) | Spain | Not Indicated | Sanchez-Sanchez et al., 2019
2019 | t(2;10)(p17;q13) | Canada | Duroc | Donaldson et al., 2019
2007 | t(2;14)(p15;q26) | France | Crossbred | Ducos et al., 2007
1993 | t(2;14)(p14;q23) | Sweden | Hampshire | Villagomez et al., 1993
1998 | t(2;14)(q13;q27) | France | Landrace | Ducos et al., 1998b
2007 | t(2;14)(q21;q24) | The Netherlands | Not Indicated | Ducos et al., 2007
1982 | t(2;15)(p14;q11) | Switzerland | Landrace | Fries and Strazinger, 1982
2019 | t(2;15)(q13;q24) | Canada | Yorkshire | Donaldson et al., 2019
2007 | t(2;15)(q28;q24) | Germany | Not Indicated | Ducos et al., 2007
2007 | t(2;16)(q28;q21) | France | Sino-European | Ducos et al., 2007

2019 | t(2;17)(p17;q13) | Canada | Yorkshire | Donaldson et al., 2019
2007 | t(2;17)(p12;q14) | France | Duroc | Ducos et al., 2007
2016 | t(3;4)(p15;q13) | Canada | Landrace | Quach et al., 2016
2017 | t(3;4)(p13;q15) | France | Large White | Feve et al., 2017
1998 | t(3;5)(p13;q23) | France | Landrace | Ducos et al., 1998b
2008 | t(3;6)(p14;q21) | Mexico | Mexican Hairless | Villagomez et al., 2008
2019 | t(3;6)(q13;p13) | Canada | Landrace | Donaldson et al., 2019
2019 | t(3;6)(q25;q11) | Canada | Landrace | Donaldson et al., 2019
1984 | t(3;7)(p13;q21) | France | Not Indicated | Bahri et al., 1984
2007 | t(3;8)(q25;p21) | France | Synthetic Line | Ducos et al., 2007
2007 | t(3;11)(q13;p11) | France | Synthetic Line | Ducos et al., 2007
2016 | t(3;12)(p13;q15) | Canada | Yorkshire | Quach et al., 2016
1998 | t(3;13)(p15;q31) | France | Large White | Ducos et al., 1998a
2019 | t(3;13)(q21;q21) | Canada | Yorkshire | Donaldson et al., 2019
2019 | t(3;14)(p14;q23) | Spain | Duroc | Sanchez-Sanchez et al., 2019
2002 | t(3;15)(q27;q13) | France | Large White | Ducos et al., 2002
2007 | t(3;16)(q23;q22) | France | Crossbred | Ducos et al., 2007
2019 | t(3;18)(q14;q21) | Spain | Duroc | Sanchez-Sanchez et al., 2019
2019 | t(3;18)(p14;q24) | Spain | Large White | Sanchez-Sanchez et al., 2019
2007 | t(4;5)(p13;q21) | France | Crossbred | Ducos et al., 2007
2019 | t(4;6)(q11;q27) | Canada | Yorkshire | Donaldson et al., 2019
2002 | t(4;6)(q21;p14) | France | Pietrain | Ducos et al., 2002
1998 | t(4;6)(q21;q28) | France | Large White | Ducos et al., 1998b
2011 | t(4;7)(p+;q-) | Spain | Not Indicated | Garcia-Vazquez et al., 2011
2019 | t(4;9)(p13;p24) | Canada | Landrace | Donaldson et al., 2019
2000 | t(4;12)(p13;q13) | France | Crossbred | Pinton et al., 2000
2019 | t(4;12)(p11;p15) | Canada | Duroc | Donaldson et al., 2019
2007 | t(4;12)(q21;q13) | France | Crossbred | Ducos et al., 2007
2007 | t(4;13)(p15;q41) | The Netherlands | Not Indicated | Ducos et al., 2007
1986 | t(4;13)(q24;q41) | Finland | Landrace | Makinen and Remes, 1986
1979 | t(4;14)(q25;q22) | France | Large White x Landrace | Popescu and Legault, 1979
1988 | t(4;15)(q11;qter) | France | Pietrain | Popescu et al., 1988
2019 | t(4;15)(q21;q11) | Canada | Yorkshire | Donaldson et al., 2019
2007 | t(4;15)(q25;q11) | France | Landrace, French | Ducos et al., 2007
2007 | t(4;16)(q25;q21) | The Netherlands | Synthetic Line | Ducos et al., 2007
2017 | t(5;6) | United Kingdom | Not Indicated | O'Connor et al., 2017
2007 | t(5;7)(q23;p11) | France | Sino-European | Ducos et al., 2007
2002 | t(5;8)(p12;q21) | France | Large White | Ducos et al., 2002
2002 | t(5;8)(p11;p23) | France | Synthetic | Ducos et al., 2002
1984 | t(5;8)(q12;q27) | Sweden | Yorkshire | Gustavsson, 1984
2000 | t(1;7)(q17;q26) | France | Large White | Pinton et al., 2000
2007 | t(5;9)(p11;p24) | France | Landrace | Ducos et al., 2007
2007 | t(5;9)(q21;p13) | France | Landrace | Ducos et al., 2007
2019 | t(5;12)(q11;q12) | Canada | Yorkshire | Donaldson et al., 2019

2019 | t(5;13)(q21;q41) | Canada | Yorkshire | Donaldson et al., 2019
1984 | t(5;14)(p11;q11) | France | Hampshire x Duroc | Popescu and Tixier, 1984
2007 | t(5;14)(q21;q12) | France | Duroc | Ducos et al., 2007
1993 | t(5;15)(q25;q25) | Czech Republic | Landrace | Parkanyi et al., 1993
2002 | t(5;17)(p12;q13) | France | Yorkshire | Ducos et al., 2002
2019 | t(5;18)(q21;q11) | Canada | Yorkshire | Donaldson et al., 2019
2016 | t(6;7)(p15;q13) | Canada | Duroc | Quach et al., 2016
2019 | t(6;7)(q33;q22) | Canada | Yorkshire | Donaldson et al., 2019
2007 | t(6;8)(p15;q27) | France | Pietrain | Ducos et al., 2007
1991 | t(6;8)(q33;q26) | France | Gascon x Meishan | Bonneau et al., 1991
1998 | t(6;13)(p15;q41) | France | Pietrain x Large White | Ducos et al., 1998b
2007 | t(6;13)(p13;q49) | France | Large White | Ducos et al., 2007
1978 | t(6;14)(p11;q13) | United Kingdom | Large White x Essex | Madan et al., 1978
1998 | t(6;14)(q27;q21) | France | Pietrain x Large White | Ducos et al., 1998a
1974 | t(6;15) | Belgium | Landrace | Bouters et al., 1974
1991 | t(6;15)(p15;q13) | France | Pietrain x Large White | Bonneau et al., 1991
2019 | t(6;15)(q33;q13) | Canada | Yorkshire | Donaldson et al., 2019
2014 | t(6;16)(p13;q23) | Poland | Large White | Kociucka et al., 2014
1998 | t(6;16)(q11;q11) | France | Synthetic | Ducos et al., 1998a
1992 | t(7;8)(q13;q27) | Sweden | Not Indicated | Ravaoarimanana et al., 1992
2002 | t(7;8)(q24;p21) | France | Landrace x Meishan | Ducos et al., 2002
2007 | t(7;9)(q11;q26) | France | Large White | Ducos et al., 2007
2019 | t(7;9)(q15;p24) | Canada | Yorkshire | This Thesis
2007 | t(7;9)(q15;q15) | France | Large White | Ducos et al., 2007
2007 | t(7;10)(q13;q11) | The Netherlands | Not Indicated | Ducos et al., 2007
2017 | t(7;10) | United Kingdom | Not Indicated | O'Connor et al., 2017
1982 | t(7;11)(q21;q11) | Sweden | Yorkshire | Gustavsson et al., 1982
2017 | t(7;12) | United Kingdom | Not Indicated | O'Connor et al., 2017
2007 | t(7;12)(q11;p15) | France | Pietrain | Ducos et al., 2007
1987 | t(7;12)(q24;q15) | Finland | Yorkshire | Kuokkanen and Makinen, 1987
1988 | t(7;13)(p13;q21) | Sweden | Hampshire | Gustavsson et al., 1988
1997 | t(7;13)(q13;q46) | Poland | Duroc | Danielak-Czech et al., 1997
2007 | t(7;14)(q15;q27) | France | Crossbred | Ducos et al., 2007
2007 | t(7;14)(q26;q25) | The Netherlands | Not Indicated | Ducos et al., 2007
2016 | t(7;15)(q13;q13) | Canada | Duroc | Quach et al., 2016
1995 | t(7;15)(24;q12) | United Kingdom | Large White | Konfortova et al., 1995
1997 | t(7;15)(q24;q26) | Finland | Yorkshire | Makinen et al., 1997
1983 | t(7;15)(q25;q25) | France | Large White | Popescu et al., 1983
1991 | t(7;17)(q26;q11) | Sweden | Hampshire | Villagomez et al., 1991
1999 | t(8;10)(p11;q13) | Finland | Landrace | Makinen et al., 1999
2007 | t(8;12)(p11;p11) | The Netherlands | Not Indicated | Ducos et al., 2007

2016 | t(8;13)(p21;q41) | Canada | Landrace | Quach et al., 2016
1992 | t(8;13)(q27;q36) | Sweden | Not Indicated | Ravaoarimanana et al., 1992
1992 | t(8;14)(p23;q27) | Sweden | Not Indicated | Ravaoarimanana et al., 1992
1997 | t(8;14)(p21;q25) | Poland | Landrace | Danielak-Czech et al., 1997
1983 | t(9;11)(p24;q11) | Sweden | Yorkshire | Gustavsson et al., 1983
2007 | t(9;11)(q14;p13) | Germany | Meishan | Ducos et al., 2007
2019 | t(9;13)(q24;q31) | Canada | Duroc | Donaldson et al., 2019
2007 | t(9;14)(p24;q15) | France | Synthetic Line | Ducos et al., 2007
2009 | t(9;14)(p24;q27) | Canada | Not Indicated | Quach et al., 2009
2019 | t(9;14)(p13;q11) | Canada | Yorkshire | Donaldson et al., 2019
2003 | t(9;14)(q14;q23) | Poland | Not Indicated | Rejduch et al., 2003
1998 | t(9;15)(p24;q13) | France | Landrace x Large White | Ducos et al., 1998a
2007 | t(9;17)(p24;q23) | France | Crossbred | Ducos et al., 2007
2007 | t(10;11)(q16;q13) | France | Polish | Ducos et al., 2007
2019 | t(10;13)(p15;q31) | Canada | Duroc | Donaldson et al., 2019
2019 | t(10;13)(p13;q31) | Canada | Duroc | Donaldson et al., 2019
1996 | t(10;13)(q11;q11) | Poland | Landrace | Danielak-Czech et al., 1996
1996 | t(10;13)(q16;q21) | Poland | Landrace | Danielak-Czech et al., 2007
2009 | t(10;13)(q11;q11) | Canada | Large White x Duroc | Quach et al., 2009
2019 | t(10;13)(q13;q21) | Canada | Yorkshire | Donaldson et al., 2019
2007 | t(10;13)(q13;q22) | The Netherlands | Synthetic Line | Ducos et al., 2007
1982 | t(10;15)(p15;q13) | Switzerland | Landrace | Fries and Strazinger, 1982
2007 | t(10;17)(q11;q21) | The Netherlands | Large White | Ducos et al., 2007
2007 | t(10;18)(p11;q24) | France | Pietrain | Ducos et al., 2007
1998 | t(11;13)(q+;q-) | France | Large White | Ducos et al., 1998a
1964 | t(11;15)(p15;q13) | Sweden | Landrace | Henricson and Backstrom, 1964
1998 | t(11;16)(p14;q14) | France | Pietrain x Large White | Ducos et al., 1998a
2007 | t(11;17)(p13;q21) | France | Synthetic Line | Ducos et al., 2007
1988 | t(12;13)(q13;q11) | Russia | Mini Siberian | Astakhova et al., 1988
2007 | t(12;14)(q13;q15) | France | Duroc | Ducos et al., 2007
2005 | t(12;14)(q13;q21) | France | Not Indicated | Pinton et al., 2005
2019 | t(12;14)(q13;q21) | Canada | Duroc | Donaldson et al., 2019
2007 | t(12;14)(q15;q13) | France | Pietrain | Ducos et al., 2007
2016 | t(12;14)(q15;q23) | Canada | Duroc | Quach et al., 2016
2019 | t(12;14)(q15;q23) | Canada | Landrace | Donaldson et al., 2019
1987 | t(12;15)(q;q) | Russia | Large White | Konovalov et al., 1987
1976 | t(13;14)(q21;q25) | Sweden | Yorkshire | Hageltorn et al., 1976
2007 | t(13;14)(q31;q21) | France | Synthetic Line | Ducos et al., 2007
2019 | t(13;14)(q31;q29) | Canada | Crossbred | This Thesis
2017 | t(13;15) | United Kingdom | Not Indicated | O'Connor et al., 2017
2007 | t(13;15)(q31;q26) | France | Pietrain | Ducos et al., 2007
2007 | t(13;16)(q41;q21) | The Netherlands | Not Indicated | Ducos et al., 2007

1998 | t(13;17)(q41;q11) | France | Pietrain x Large White | Ducos et al., 1998b
2019 | t(13;18)(q21;q13) | Canada | Duroc | Donaldson et al., 2019
2019 | t(14;15)(q13;q15) | Canada | Large White | Donaldson et al., 2019
2007 | t(14;15)(q28;q13) | France | Synthetic Line | Ducos et al., 2007
1982 | t(14;15)(q29;q24) | Germany | Not Indicated | Golisch et al., 1982
2007 | t(14;16)(q13;q21) | France | Pietrain | Ducos et al., 2007
1988 | t(15;16)(q26;q21) | Sweden | Yorkshire | Gustavsson et al., 1988
1998 | t(15;17)(q13;q21) | France | Large White | Ducos et al., 1998a
2002 | t(15;17)(q24;q21) | France | Landrace | Ducos et al., 2002
2019 | t(15;18)(q24;q24) | Canada | Yorkshire | Donaldson et al., 2019
1986 | t(16;17)(q23;q21) | France | Landrace | Popescu and Boscher, 1986
2007 | t(17;18)(q21;q11) | France | Pietrain | Ducos et al., 2007
1989 | t(X;13)(q24;q21) | Sweden | Hampshire | Gustavsson et al., 1989
1994 | t(X;14)(p21;q11) | Canada | Yorkshire | Singh et al., 1994
2012 | t(Y;1) | France | Not Indicated | Barasc et al., 2012
2019 | t(Y;1)(q11;q17) | Canada | Yorkshire | Donaldson et al., 2019
2019 | t(Y;13)(p13;q33) | Canada | Yorkshire | Donaldson et al., 2019
2007 | t(Y;14)(q11;q11) | France | Duroc | Ducos et al., 2007
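
The rearrangement designations listed above share a regular textual pattern: an optional "mos" prefix, a rearrangement type (t, rcp, or inv), a parenthesised list of chromosomes, and an optional parenthesised list of breakpoint bands. As an illustration only, and not part of the analysis performed in this thesis, the following Python sketch splits such a designation into its parts. The names Rearrangement and parse_rearrangement are hypothetical, and the pattern is an assumption that covers only the simple forms shown in this appendix; entries such as 38X,del(Y) are not handled.

```python
import re
from typing import NamedTuple, Tuple


class Rearrangement(NamedTuple):
    """Hypothetical container for one parsed designation."""
    kind: str                      # "t", "rcp", or "inv"
    mosaic: bool                   # True when the designation starts with "mos"
    chromosomes: Tuple[str, ...]   # e.g. ("1", "7") or ("Y", "1")
    breakpoints: Tuple[str, ...]   # e.g. ("q21", "p11"); empty if not reported


# Assumed pattern: [mos ]kind(chromosomes)[(bands)]
_PATTERN = re.compile(
    r"^(?P<mosaic>mos\s+)?"
    r"(?P<kind>t|rcp|inv)"
    r"\((?P<chroms>[^()]+)\)"
    r"(?:\((?P<bands>[^()]+)\))?$"
)


def parse_rearrangement(text: str) -> Rearrangement:
    """Split a designation such as 't(1;7)(q21;p11)' into its components."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised designation: {text!r}")
    chroms = tuple(part.strip() for part in match.group("chroms").split(";"))
    bands = match.group("bands")
    breakpoints = (
        tuple(part.strip() for part in bands.split(";")) if bands else ()
    )
    return Rearrangement(
        kind=match.group("kind"),
        mosaic=match.group("mosaic") is not None,
        chromosomes=chroms,
        breakpoints=breakpoints,
    )
```

Under these assumptions, parse_rearrangement("mos t(7;9)(q24;q24)") gives kind "t", mosaic True, chromosomes ("7", "9"), and breakpoints ("q24", "q24"), while an entry reported without bands, such as "t(1;2)", gives an empty breakpoint tuple.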

Table 51: List of Mosaic Reciprocal Translocations and Non-Reciprocal Rearrangements in the Domestic Pig

Year Rearrangement Country Breed Author

2014 t(1;2;7)(q11;q17;p16;q13) Spain Not Indicated De la Cruz-Vigo et al., 2014
1987 t(2;9;14)(q23;q22;q25) Finland Not Indicated Makinen et al., 1987
2018 38X,del(Y) Canada Not Indicated Donaldson et al., 2019
1996 inv(1)(p22;q11) Poland Not Indicated Danielak-Czech et al., 1996
2002 inv(1)(p24;q29) France Landrace Ducos et al., 2007
2005 inv(1)(q18;q24) France Large White Ducos et al., 2007
2002 inv(2)(p11;q21) France Pietrain Ducos et al., 2007
2000 inv(2)(p13;q11) France Not Indicated Pinton et al., 2000
2005 inv(2)(p13;q12) France Synthetic Ducos et al., 2007
2006 inv(2)(q13;q25) The Netherlands Duroc Ducos et al., 2007
1982 inv(2;2)(p+;q28) Switzerland Landrace Fries and Strazinger, 1982
2019 inv(4)(p15;q24) Spain Not Indicated Sanchez-Sanchez et al., 2019
2004 inv(6)(p14;q12) France Sino-European Ducos et al., 2007
2002 inv(8)(p11;q25) France Large White Ducos et al., 2007
2004 inv(8)(p21;q11) France Pietrain Ducos et al., 2007
2016 inv(8)(q11;q25) Canada Duroc Quach et al., 2016
1982 inv(9;9)(p17;q14) Switzerland Landrace Fries and Strazinger, 1982

2019 inv(9)(p11;p21) Canada Duroc Donaldson et al., 2019
2019 mos t(1;1)(q2.11;q21) Canada Duroc Donaldson et al., 2019
1970 mos t(1;11)(q-;q+) Sweden Landrace Hansen-Melander and Melander, 1970
2019 mos t(1;2)(p23;q22) Canada Yorkshire Donaldson et al., 2019
2018 mos t(2;8)(q23;q21) Canada Duroc Rezaei et al., 2019
2017 mos t(3;10)(q23;p13) Canada Yorkshire Rezaei et al., 2019
2015 mos t(3;13)(q21;q49) Canada Duroc Rezaei et al., 2019
2017 mos t(3;7)(p15;q13) Canada Not Indicated Rezaei et al., 2019
2017 mos t(3;7)(q23;q26) Canada Landrace Rezaei et al., 2019
2018 mos t(5;9)(q21;p22) Canada Duroc Rezaei et al., 2019
2018 mos t(6;16)(p15;q21) Canada Yorkshire Rezaei et al., 2019
2017 mos t(6;7)(q21;q22) Canada Not Indicated Rezaei et al., 2019
2019 mos t(7;13)(q22;q21) Canada Pietrain Rezaei et al., 2019
2019 mos t(7;18)(q15;q22) Canada Duroc Rezaei et al., 2019
2014 mos t(7;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2018 mos t(7;18)(q22;q11) Canada Duroc Rezaei et al., 2019
2018 mos t(7;18)(q22;q11) Canada Large White Rezaei et al., 2019
2018 mos t(7;7)(q24;q15) Canada Yorkshire Rezaei et al., 2019
2018 mos t(7;9)(q15;q15) Canada Duroc Rezaei et al., 2019
2014 mos t(7;9)(q21;q21) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;9)(q21;q21) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;9)(q21;q21) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;9)(q21;q21) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;9)(q21;q21) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;9)(q21;q21) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;9)(q21;q21) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;9)(q21;q21) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;9)(q21;q21) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;9)(q21;q21) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;9)(q21;q21) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;9)(q21;q21) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(7;9)(q21;q21) Czech Republic Crossbred Musilova et al., 2014

2015 mos t(7;9)(q24;q24) Canada Duroc Rezaei et al., 2019
2016 mos t(7;9)(q24;q24) Canada Large White Rezaei et al., 2019
2017 mos t(7;9)(q24;q24) Canada Landrace Rezaei et al., 2019
2017 mos t(7;9)(q24;q24) Canada Landrace Rezaei et al., 2019
2017 mos t(7;9)(q24;q24) Canada Not Indicated Rezaei et al., 2019
2017 mos t(7;9)(q24;q24) Canada Landrace Rezaei et al., 2019
2017 mos t(7;9)(q24;q24) Canada Duroc Rezaei et al., 2019
2018 mos t(7;9)(q24;q24) Canada Yorkshire Rezaei et al., 2019
2018 mos t(7;9)(q24;q24) Canada Landrace Rezaei et al., 2019
2018 mos t(7;9)(q24;q24) Canada Duroc Rezaei et al., 2019
2018 mos t(7;9)(q24;q24) Canada Yorkshire Rezaei et al., 2019
2018 mos t(7;9)(q24;q24) Canada Duroc Rezaei et al., 2019
2018 mos t(7;9)(q24;q24) Canada Not Indicated Rezaei et al., 2019
2018 mos t(7;9)(q24;q24) Canada Yorkshire Rezaei et al., 2019
2018 mos t(7;9)(q24;q24) Canada Landrace Rezaei et al., 2019
2018 mos t(7;9)(q24;q24) Canada Duroc Rezaei et al., 2019
2018 mos t(7;9)(q24;q24) Canada Yorkshire Rezaei et al., 2019
2018 mos t(7;9)(q24;q24) Canada Duroc Rezaei et al., 2019
2019 mos t(7;9)(q24;q24) Canada Unknown Rezaei et al., 2019
2019 mos t(7;9)(q24;q24) Canada Duroc Donaldson et al., 2019
2019 mos t(7;9)(q24;q24) Canada Duroc Donaldson et al., 2019
2019 mos t(7;9)(q24;q24) Canada Yorkshire Donaldson et al., 2019
2019 mos t(8;16)(q21;q21) Canada Landrace Donaldson et al., 2019
2017 mos t(8;9)(q21;q24) Canada Landrace Rezaei et al., 2019
2017 mos t(9;13)(p32;q41) Canada Not Indicated Rezaei et al., 2019
2014 mos t(9;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(9;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(9;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(9;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(9;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(9;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(9;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2014 mos t(9;18)(q21;q11) Czech Republic Crossbred Musilova et al., 2014
2018 mos t(9;18)(q22;q11) Canada Landrace Rezaei et al., 2019
1986 rob(13;17) Germany Not Indicated Schwerin et al., 1986
2006 rob(13;17) France Landrace Ducos et al., 2007
2016 rob(13;17) Canada Landrace Quach et al., 2016
2005 rob(14;15) France Landrace Ducos et al., 2007
2006 rob(14;17) The Netherlands Not Indicated Ducos et al., 2007
1991 rob(16;17) Russia Landrace x Black Astakhova et al., 1991
1982 t(1;12)(q39;q+) Switzerland Landrace Fries and Strazinger, 1982
1982 t(1;16)(p+;q14) Switzerland Landrace Fries and Strazinger, 1982
1982 t(2;4)(q21;p+) Switzerland Landrace Fries and Strazinger, 1982
1982 t(2;7)(p14;q+) Switzerland Landrace Fries and Strazinger, 1982

1982 t(5;15)(p15;q15) Switzerland Landrace Fries and Strazinger, 1982

Table 52: List of Common Fragile Sites in the Pig Genome

Common Fragile Sites Author

1p25 Rønne, 1995
1p23 Rønne, 1995
1p22 Rønne, 1995
1p14 Rønne, 1995
1p13 Rønne, 1995
1p11 Rønne, 1995
1q11 Rønne, 1995
1q17 Rønne, 1995
1q21 Rønne, 1995
1q22 Rønne, 1995
1q26 Rønne, 1995
1q27 Rønne, 1995
1q29 Rønne, 1995
1q2.11 Rønne, 1995
2p14 Rønne, 1995
2q12 Rønne, 1995
2q21 Rønne, 1995
2q23 Rønne, 1995
3p14 Rønne, 1995
3q14 Rønne, 1995
3q25 Rønne, 1995
4p14 Rønne, 1995
4q21 Rønne, 1995
4q25 Rønne, 1995
6p15 Rønne, 1995
6p12 Rønne, 1995
6q28 Rønne, 1995
6q31 Rønne, 1995
6q32 Rønne, 1995
7q21 Rønne, 1995
7q23 Rønne, 1995
7q24 Rønne, 1995
8q12 Rønne, 1995
8q21 Rønne, 1995
8q22 Rønne, 1995
9q21 Rønne, 1995

10p15 Rønne, 1995
11q12 Rønne, 1995
12q11 Rønne, 1995
13q21 Rønne, 1995
13q31 Rønne, 1995
13q33 Rønne, 1995
13q34 Rønne, 1995
13q41 Rønne, 1995
13q46 Rønne, 1995
13q47 Rønne, 1995
14q15 Rønne, 1995
14q21 Rønne, 1995
14q25 Rønne, 1995
14q26 Rønne, 1995
14q27 Rønne, 1995
15q14 Rønne, 1995
15q15 Rønne, 1995
15q24 Rønne, 1995
16q21 Rønne, 1995
17q21 Rønne, 1995
18q21 Rønne, 1995
Xp21 Rønne, 1995
Xq21 Rønne, 1995
Xq22 Rønne, 1995

Table 53: List of Inferred and Established Evolutionary Breakpoint Regions in the Pig Genome

Band Author

1p12 Inferred
1p21 Inferred
1q11 Inferred
1q13 Inferred
1q15 Inferred
1q17 Inferred
1q18 Inferred
1q21 Inferred
1q22 Inferred
1q23 Inferred
1q24 Inferred
1q27 Inferred
1q29 Inferred

2p11 Lahbib-Mansais et al., 2006
2p16 Lahbib-Mansais et al., 2006
2p17 Lahbib-Mansais et al., 2006
2q11 Lahbib-Mansais et al., 2006
2q13 Lahbib-Mansais et al., 2006
2q21 Lahbib-Mansais et al., 2006
3p11 Inferred
3p15 Inferred
3p16 Inferred
3p17 Inferred
4q15 Inferred
4q21 Inferred
5p11 Inferred
5p14 Inferred
5q21 Inferred
5q23 Inferred
6p12 Inferred
6p14 Inferred
6p15 Inferred
6q21 Inferred
6q27 Inferred
6q31 Inferred
7p11 Inferred
7q13 Inferred
7q14 Inferred
7q15 Inferred
7q21 Inferred
7q22 Inferred
8p11 Inferred
8p12 Inferred
8q12 Inferred
9p24 Inferred
9q11 Inferred
9q15 Inferred
9q21 Inferred
9q22 Inferred
9q23 Inferred
9q25 Inferred
9q26 Inferred
10p13 Inferred
10p15 Inferred
10p16 Inferred
10q11 Inferred
10q12 Inferred
12p12 Inferred

12p13 Inferred
12q12 Inferred
12q13 Inferred
13q11 Inferred
13q22 Inferred
13q24 Inferred
13q41 Inferred
13q46 Inferred
14q11 Inferred
14q12 Inferred
14q14 Inferred
14q16 Inferred
14q21 Inferred
14q24 Inferred
14q25 Inferred
15q12 Inferred
15q13 Inferred
15q15 Inferred
16q21 Lahbib-Mansais et al., 2006
16q23 Lahbib-Mansais et al., 2006
17q11 Lahbib-Mansais et al., 2005
17q12 Lahbib-Mansais et al., 2005
17q21 Lahbib-Mansais et al., 2005
18q11 Inferred
18q22 Inferred
18q24 Inferred
