Iowa State University Capstones, Theses and Graduate Theses and Dissertations Dissertations

2015 Molecular and quantitative genetic basis and control of severe combined immunodeficiency and porcine reproductive and respiratory syndrome in pigs Emily Hannah Waide Iowa State University

Follow this and additional works at: https://lib.dr.iastate.edu/etd Part of the Agriculture Commons, Animal Diseases Commons, Animal Sciences Commons, and the Genetics Commons

Recommended Citation Waide, Emily Hannah, "Molecular and quantitative genetic basis and control of severe combined immunodeficiency and porcine reproductive and respiratory syndrome in pigs" (2015). Graduate Theses and Dissertations. 14878. https://lib.dr.iastate.edu/etd/14878

This Dissertation is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Graduate Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected]. Molecular and quantitative genetic basis and control of severe combined immunodeficiency and porcine reproductive and respiratory syndrome in pigs

by

Emily Hannah Waide

A dissertation submitted to the graduate faculty

in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Major: Genetics

Program of Study Committee: Jack C.M. Dekkers, Co-major Professor Christopher K. Tuggle, Co-major Professor Dorian Garrick Peng Liu Joan Cunnick

Iowa State University

Ames, Iowa

2015

Copyright © Emily Hannah Waide, 2015. All rights reserved. ii

TABLE OF CONTENTS Page LIST OF FIGURES iv LIST OF TABLES viii ACKNOWLEDGEMENTS x ABSTRACT xiv CHAPTER 1: GENERAL INTRODUCTION 1 Introduction 1 Thesis Organization 7 Literature Cited 8 CHAPTER 2: LITERATURE REVIEW 16 Genetics of Disease 16 Porcine Reproductive and Respiratory Syndrome 17 Severe Combined Immunodeficiency 23 Genome Wide Association Studies 25 Genomic Prediction and Genomic Selection 35 Literature Cited 39 CHAPTER 3. NOT ALL SCID PIGS ARE CREATED EQUALLY: TWO INDEPENDENT MUTATIONS IN THE ARTEMIS CAUSE SEVERE COMBINED IMMUNE DEFICIENCY (SCID) IN PIGS 50 Abstract 50 Introduction 51 Materials and Methods 53 Results 59 Discussion 67 Conclusions 72 Literature Cited 72 Figures 77 CHAPTER 4. GENOME WIDE ASSOCIATION OF PIGLET RESPONSE TO ONE OF TWO ISOLATES OF THE PORCINE REPRODUCTIVE AND RESPIRATORY SYNDROME VIRUS 86 Abstract 86 Introduction 87 Materials and Methods 88 Results 94 Discussion 101 Conclusions 110 Literature Cited 111 Figures 118 Tables 126 CHAPTER 5: GENOMIC PREDICTION OF PIGLET RESPONSE TO ONE OF TWO ISOLATES OF THE PORCINE REPRODUCTIVE AND RESPIRATORY SYNDROME VIRUS 145 Abstract 145 Introduction 146 Materials and Methods 149 Results 154 Discussion 160

iii

Conclusions 168 Literature Cited 169 Figures 172 Tables 178 CHAPTER 6: GENERAL DISCUSSION 179 Severe Combined Immunodeficient Pigs 179 Porcine Reproductive and Respiratory Syndrome 185 Literature Cited 196

APPENDIX A. RESULTS FROM PHASING OF SNP GENOTYPES FROM COMMERCIAL ANIMALS TO PREDICT SCID CARRIER STATUS. 200 APPENDIX B. MOLECULAR GENETIC TESTS USED TO DETERMINE SCID STATUS IN THE IOWA STATE UNIVERSITY RFI POPULATION. 201

iv

LIST OF FIGURES

Figure 3.1 Flow cytometry shows very few CD4+ CD8+ and γδ+ cells in SCID pig thymus. The number in each diagram is the percentage of each cell type in thymocyte populations of a SCID pig or a non- SCID carrier littermate (A and B). CD8a (x-axis) and CD4 (y-axis) (A) and γδ (B) are T cell surface markers. The least square means of percentages of each thymocyte subpopulation are shown in (C). Error bars represent the SE of the estimates. Bars with different letters (a and b) indicate statistically significant (p < 0.01) differences between SCID and non-SCID cell number within the cellular subset. 77

Figure 3.2 Effect of ionizing radiation on fibroblasts from SCID and normal pigs. Fibroblasts from SCID piglets (n = 10) and normal littermates (n = 10) were exposed to increasing doses of gamma rays. Colonies (>2 mm) were stained and counted after 14 d. Survival proportion was calculated as the average number of surviving colonies for three replicates at each radiation dose divided by the number of colonies from nonirradiated cells for each animal. Error bars represent the SE of the least squares means. *p < 0.0001, difference between SCID and normal pigs within radiation dose. 78

Figure 3.3 Manhattan plot of the genome-wide association study for SCID status. Results show the -log10(p-value) of the association of ordered SNPs on S. scrofa 1–18, X, Y, and unknown (U) with SCID status based on the dfam option in PLINK. 79

Figure 3.4 Pedigree of SCID ancestors showing carriers of mutated haplotypes based on the Illumina porcine SNP60 panel. Circles represent females and squares represent males. Green symbols are h16 haplotype carriers and pink symbols are h12 haplotype carriers, with lines of each color tracking the respective SCID haplotype through the pedigree to the founder generation. Beige symbols are pigs that were genotyped with the SNP60 panel and did not carry either SCID haplotype. White symbols in generations 0–7 represent nongenotyped individuals. Boxes in generation 8 give information on the numbers of SCID (Affected) and non-SCID (Unaffected) piglets and the number of piglets that died before their SCID phenotype was determined (Unknown) from each parent pair. 80

Figure 3.5 Two independent mutations were found in Artemis. (A) The coding region of the Artemis transcript is indicated by slanted lines and genotypes at mutated positions are shown for chromosomes that carry normal, h12, and h16 haplotypes. Mutant alleles that cause SCID are shown in black lettering. (B) Genomic sequence of the h16 haplotype shows a splice donor site mutation (g.51578763 G→A) responsible for the lack of exon 8 in all h16 transcripts. Capital letters denote exonic sequence whereas lowercase letters denote intronic sequence. (C) A nonsense point mutation (g.51584489 G→A) in exon 10 changes the tryptophan at position 267 to a stop codon in the h12 haplotype. 81

Figure 3.6 Rescue of sensitivity to ionizing radiation. Fibroblasts from compound heterozygous SCID (n = 2) and normal (n = 3) littermates were transfected with human Artemis-expressing plasmid (5 mg, Artemis), with a molar equivalent of pExodus plasmid without the Artemis gene (3.45 mg, Empty Vector), or shocked without plasmid added. Fibroblasts were exposed to a 4 Gy radiation dose 24 h after transfection. Colonies (>2 mm) were counted after 14 d of growth. The average number of surviving colonies for three replicates for a given plasmid was divided by the number of colonies from shock only cells for each animal. Error bars represent the SE of the least squares means. Dots show individual observations. Bars with different letters (a and b) within affected status represent statistical differences between means with p < 0.01. 82

Supplemental Figure 3.1 cDNA structures for the Artemis gene from normal, h16, and h12 haplotypes. Exons are represented by blocks, with exon number shown above the exons present in each transcript. Fractions show the number of sequences for that transcript over the total number of transcripts sequenced for that haplotype. For h12 transcripts, purple indicates that a portion of

v

intron 4 was present in the transcript sequence. The red color shows splicing inside of exons, with the size of the red box indicating the amount of the exon present in the sequenced transcript. Dashed lines show the normal size of the improperly spliced exon for some transcripts. 83 Supplemental Figure 3.2 Flow cytometry shows no differences between the lymphocyte counts of SCID pigs with different genotypes. There were no differences between the numbers of any lymphocytes seen among the SCID genotypes (p>0.10). SCID and carrier littermates had the same numbers of NK cells (CD16; p>0.10), but all SCID pigs had lower numbers of B (CD21) and T (CD3e) cells (p<0.01). Error bars represent the standard errors of the least square means. Bars with different letters (a and b) represent statistically significant differences between least square means within marker type. 84

Supplemental Figure 3.3 Radiosensitivity of fibroblasts from all SCID affected genotypes and normal piglets. Fibroblasts from h12/h12, h16/h16, h12/h16, and normal animals (n=4 per genotype) were exposed to indicated doses of gamma-rays (Gy). Survival proportion was calculated as the average number of surviving colonies for three replicates at each radiation dose was divided by the number of colonies from non-irradiated cells for each animal. Error bars represent the standard error of the least squares means. * indicates statistically significant (p<0.05) difference between all SCID genotypes and normal pigs. 85

Figure 4.1 Genome wide association results for VL using single SNP and Bayes-B analyses. (A) Bayes-B results showing the percent of genetic variance explained by 1-Mb non-overlapping windows of SNPs across chromosomes for KS06 (top) and NVSL (bottom). (B) Single SNP analysis results showing the -log10(p-value) of ordered SNPs across chromosomes and for unmapped (U) SNPs for KS06 (top) and NVSL (bottom). Dashed lines indicate genome wide significance at a Bonferroni corrected p-value of 0.05. 118

Figure 4.2 Genome wide association results for WG using single SNP and Bayes-B analyses. (A) Bayes-B results showing the percent of genetic variance explained by 1-Mb non-overlapping windows of SNPs across chromosomes for KS06 (top) and NVSL (bottom). (B) Single SNP analysis results showing the -log10(p-value) of ordered SNPs across chromosomes and for unmapped (U) SNPs for KS06 (top) and NVSL (bottom). Dashed lines indicate genome wide significance at a Bonferroni corrected p-value of 0.05. 119

Figure 4.3 Manhattan plots of –log10(P) of the main (top) and interaction (bottom) of SNP genotype with PRRSV isolate for VL (A) and WG (B) across chromosomes and unmapped (U) SNPs. Dashed lines indicate genome wide significance at a Bonferroni corrected P of 0.05. 120

Figure 4.4 Manhattan plots of –log10(P) of the main (top) and interaction (bottom) of SNP genotype with genotype at the WUR SNP for VL (A) and WG (B) across chromosomes and unmapped (U) SNPs. Dashed lines indicate genome wide significance at a Bonferroni corrected P of 0.05. 121

Supplemental Figure 4.1 Comparison plot of the largest –log10(P) in each 1-Mb window based on the single-SNP analysis for KS06 Viral Load against the percent of genetic variance explained by that window from Bayes-B. 122

Supplemental Figure 4.2 Comparison plot of the largest –log10(P) in each 1-Mb window based on the single-SNP analysis for NVSL Viral Load against the percent of genetic variance explained by that window from Bayes-B. 123

Supplemental Figure 4.3 Comparison plot of the largest –log10(P) in each 1-Mb window based on the single-SNP analysis for KS06 Weight Gain against the percent of genetic variance explained by that window from Bayes-B. 124

vi

Supplemental Figure 4.4 Comparison plot of the largest –log10(P) in each 1-Mb window based on the single-SNP analysis for NVSL Weight Gain against the percent of genetic variance explained by that window from Bayes-B. 125

Figure 5.1 Principal components analysis of the SNP genotype data. Each point represents a single animal, with each color representing one of the 8 breeding companies used in this study. The breed makeup of the animals for each breeding company are shown in the same color. LR=Landrace, LW=Large White, and Y=Yorkshire. Breeds are presented as breed of sire x breed of dam. 172

Figure 5.2 Scenarios used for training and validation for genomic prediction. Pink, purple, and blue colored rectangles represent each of the 3 breeding companies with pigs in both NVSL and KS06 trials, with the trial number indicated inside the rectangle. Gray colored rectangles represent individual breeding companies with only one trial of PRRSV infected pigs. Arrows indicate direction of genomic prediction, with the tail originating from the trials used in training and the head pointing towards the trial(s) used for validation. (A) Genomic prediction across PRRSV isolates using all data indicated by black arrows; across breeding company using both PRRSV isolates in training indicated by red arrows. (B) Genomic prediction across PRRSV isolate including validation breeding company in training indicated by purple arrow; Genomic prediction across breeding company within PRRSV isolate and across breeding companies and isolate indicated by small and large blue arrows, respectively. 173

Figure 5.3 Accuracy of genomic prediction across PRRSV isolates for Viral Load. Accuracy, presented as the average correlation between genomic estimated breeding values and adjusted phenotypes divided by the square root of heritability in the validation population, of genomic prediction for VL in the given validation population. (A) Accuracy of genomic prediction across PRRSV isolates for the whole genome, genome when we account for WUR genotype (Genome - WUR), and using only the SNPs in the 1 Mb window containing the WUR SNP (WUR only). (B) Average accuracy of genomic prediction across PRRSV isolates without validation breeding company in training (Excluding Breed Co) and with validation breeding company in training (Including Breed Co). (C) Average accuracy of prediction across breeding companies with indicated PRRSV isolate validation population when training within the same PRRSV isolate, across isolate, or using both isolates. Individual points represent the accuracy of prediction for one trial with breeding companies represented in both PRRSV isolates having the same black shape and gray diamonds representing the trials which were not replicated. 174

Figure 5.4 Genomic prediction accuracy for Viral Load using SNP subsets based on single-SNP GWAS and functional analysis. Accuracy of genomic prediction using all SNPs (Whole Genome), those SNPs with -log10(P)>2.5 from single-SNP analysis, or SNPs located near in enriched biologically relevant terms (Gene Ontology). 175

Figure 5.3 Accuracy of genomic prediction across PRRSV isolates for Weight Gain. Accuracy, presented as the average correlation between genomic estimated breeding values and adjusted phenotypes divided by the square root of heritability in the validation population, of genomic prediction for WG in the given validation population. (A) Accuracy of genomic prediction across PRRSV isolates for the whole genome, genome when we account for WUR genotype (Genome - WUR), and using only the SNPs in the 1 Mb window containing the WUR SNP (WUR only). (B) Average accuracy of genomic prediction across PRRSV isolates without validation breeding company in training (Excluding Breed Co) and with validation breeding company in training (Including Breed Co). (C) Average accuracy of prediction across breeding companies with indicated PRRSV isolate validation population when training within the same PRRSV isolate, across isolate, or using both isolates. Individual points represent the accuracy of prediction for one trial with breeding companies represented in both PRRSV isolates having the same black shape and gray diamonds representing the trials which were not replicated. 176

vii

Figure 5.6 Genomic prediction accuracy for Weight Gain using SNP subsets based on single-SNP GWAS and functional analysis. Accuracy of genomic prediction using all SNPs (Whole Genome), those SNPs with -log10(P)>2.5 from single-SNP analysis, or SNPs located near genes in enriched biologically relevant gene ontology terms (Gene Ontology). 177

viii

LIST OF TABLES

Table 4.1 General information on each PRRS infection trial. 126 Table 4.2 Estimates (SE) of heritability and litter effects for Viral Load (VL) and Weight Gain (WG) for KS06 and NVSL PRRSV isolates 127

Table 4.3 Top 1-Megabase (Mb) genomic regions associated with Viral Load in the KS06 and NVSL trials. 128

Table 4.4 Top 1-Megabase (Mb) genomic regions associated with Weight Gain in the KS06 and NVSL trials. 129

Table 4.5 GWAS results for the interaction of SNP genotype and PRRSV isolate or genotype at the WUR SNP. 131

Table 4.6 Information for gene lists created from genes located within 100, 250, or 100 kb of SNPs associated with the trait at three –log10(P) thresholds (2, 2.5 and 3) for Viral Load and Weight Gain for the KS06 and NVLS isolates. 132

Table 4.7 Most Enriched GO Terms for the gene lists for Viral Load for the KS06 and NVSL isolates. 133

Table 4.8 Most Enriched GO Terms for the gene lists for Weight Gain for the KS06 and NVSL isolates. 134

Table 4.10 Most enriched GO terms for the interaction with PRRSV isolate for Viral Load and Weight Gain. 136

Table 4.11 Most enriched GO terms for the interaction with genotype at the WUR SNP for Viral Load (VL). 137

Supplemental Table 4.1 Number of independent principal components for Sus scrofa chromosomes 1 through 18, X, and unknown (U) from Principal Components Analysis. 138

Supplemental Table 4.2 Top enriched Biological Process Slim GO terms for KS06 Viral Load. Green colored cells contain GO terms considered to be biologically relevant to VL after PRRSV infection. Bolded GO terms were found to be enriched in genes differentially expressed after PRRSV infection (Schroyen et al., 2015). Underlined GO terms were enriched in genes differentially expressed in porcine alveolar macrophages after co-infection with PRRSV and Mycoplasma hyopneumoniae (Li et al, 2015). 139

Supplemental Table 4.3 Top enriched Biological Process Slim GO terms for NVSL VL. Green colored cells contain GO terms considered to be biologically relevant to VL after PRRSV infection. Bolded GO terms were found to be enriched in genes differentially expressed after PRRSV infection (Schroyen et al., 2015). Underlined GO terms were enriched in genes differentially expressed in porcine alveolar macrophages after co-infection with PRRSV and Mycoplasma hyopneumoniae (Li et al, 2015). 140

Supplemental Table 4.4 Top enriched Biological Process Slim GO terms for KS06 Weight Gain. Green colored cells contain GO terms considered to be biologically relevant to WG after PRRSV infection. Bolded GO terms were found to be enriched in genes differentially expressed in blood

ix

after PRRSV infection (Schroyen et al., 2015). Underlined GO terms contain genes shown to be involved in feed efficiency traits in beef cattle (Oliveria et al., 2014). 141

Supplemental Table 4.5 Top enriched Biological Process Slim GO terms for NVSL Weight Gain. Green colored cells contain GO terms considered to be biologically relevant to WG after PRRSV infection. Bolded GO terms were found to be enriched in genes differentially expressed in blood after PRRSV infection (Schroyen et al., 2015). Underlined GO terms contain genes shown to be involved in feed efficiency traits in beef cattle (Oliveria et al., 2014). 142

Supplemental Table 4.6 Most Enriched GO Terms in Panther Complete Categories for Viral Load. 143

Supplemental Table 4.7 Most Enriched GO Terms in Panther Complete Categories for Weight Gain. 144

x

ACKNOWLEDGEMENTS

First and foremost, I would like to acknowledge the love, support, and guidance of my family; without you all, I would have never made it to where I am today. Mama, you are my confidant, therapist, and most importantly, my mama. I will forever be grateful for everything that you have done for me since the day I was born. Your selflessness is endless. Dad, thank you for your guidance, advice, and help through the years. John and Ben, thank you both for always putting a smile on my face, even when times are tough. Your confidence in me is something that holds a very special place in my heart. To my sweet sister, Nancy, even though you are no longer here, the love that we shared is constantly with me. My memories of you have gotten me through the worst struggles, and I hope that you are proud of the woman I have become since we last saw one another.

I would like to thank my advisors and their wives. Dr. Jack Dekkers, I will never forget our first meeting for breakfast when I came to interview. You were so wise, so accepting, so honest. I’m very glad that you didn’t tell me how cold it would be in January when I moved to

Iowa (it was much warmer when I interviewed in June). Your unwavering calm demeanor has taught me so much in the years since our first meeting. You are such an admirable man. Thank you for your academic advising, as well as personal. Mrs. Sue Dekkers, I’m not going to go too much into our first meeting, road-tripping to Chicago and refusing to touch Jack’s new iPhone after Nick Boddicker nicked it up. But, I will say that your bright personality has brought many smiles to my face! You are a wonderful GradMa to all of us. I’ve enjoyed our times spent together. Dr. Chris Tuggle, I tear up just trying to think of what to say to thank you for everything. I hope that one day, I will have half of the finesse that you exhibit daily. You are such a deep scientific thinker, yet you never fail to notice the most subtle emotional change. I am

xi eternally grateful for the time I’ve spent in your office, and I believe I may still have a box of your facial tissues at my desk. Thank you for picking me up when I was down and for your confidence in me. Mrs. Maureen Tuggle, you are such a kind spirit, always thoughtful and willing to lend a helping hand no matter the circumstances. Your kindness towards me has taught me the importance of grace and love. I have such great admiration and respect for all of you. And to all four of you, I would get in the hot tub with you now.

I would also like to show my appreciation for my committee members for their assistance and guidance through my graduate studies. Dr. Joan Cunnick, thank you for all of your help with the endless immunology questions that have come along with the SCID project. Dr. Peng Liu, thank you for sharing your knowledge of statistics with me. You are such a wonderful teacher.

Dr. Dorian Garrick, I am very grateful for all of our interactions. You have taught me so much about the animal breeding industry, and I know that will serve me well throughout my career.

I would like to acknowledge Dr. Jason Ross, Theresa Johnson, and Candi Hager for their help with the cell culture work and other aspects of the SCID project, Dr. Martine Schroyen and past and present Tuggle group undergraduates for their help with lab work, Dr. Ania Wolc, Dr.

Jian Zeng, and Dr. Dinesh Thekkoot for technical assistance, Dr. Matthew Ellinwood, Jackie

Jens, Elizabeth Snella, and the Ellinwood colony for their help with the SCID project, the SCID,

Genomic Selection, PRRS, Tuggle, and Dekkers groups for their advice and comments on my research, and Gary, Tom, and other staff at the Lauren Christian Swine Research Center for their help with the animals. I am very grateful to those at Kansas State Univeristy that spent countless hours working on both the SCID and PRRS projects; thank you Dr. Bob Rowland, Dr. Catherine

Ewen, Maureen Kerrigan, Carol Wyatt, Becky Eaves, Ben Trible, and others in the Rowland group for their work on both the SCID and PRRS projects.

xii

I am very grateful to the USDA for the National Needs grant 2010-38420-20328 for the fellowship that funded my first 3 years of graduate school. Special thanks to the Iowa State

University Vice President of Research and Development for funding my SCID work. I would like to thank the USDA NIFA Award 2013-68004-20362, USDA NIFA PRRS CAP Award

2008-55620-19132, the National Pork Board, and the NRSP-8 Swine Genome and

Bioinformatics Coordination projects, and the PRRS Host Genetics Consortium consisting of

USDA ARS, Kansas State University, Iowa State University, Michigan State University,

Washington State University, Purdue University, University of Nebraska-Lincoln, PIC/Genus,

Newsham Choice Genetics, Fast Genetics, Genetiporc, Inc., Genesus, Inc., PigGen Canada, Inc.,

IDEXX Laboratories, Tetracore, Inc for their contributions. I would also like to acknowledge the

United States Army for the veterans benefits that aided me throughout my studies.

If I named each friend who has supported me during my time at Iowa State University, my thesis would double in length, so I will only give names to a handful. Drs. Nick and Rebecca

Boddicker, you were critical to my emotional, mental, and academic well-being my first years here. Thank you for sharing your lives and your sweet daughter’s life with me. Drs. Nick and

Mariana Serão, there aren’t words enough to thank you both for everything. You were always there when I needed help. Nick, you were my best friend. You taught me so much about life, animal breeding, and myself. I will never forget that. Thank you both for opening your homes and hearts to me. To my 11 best friends in the dark, know that I could not have survived the past years without your support and unwavering love. I can’t imagine life without you all.

I am very grateful and excited to have accepted the Jane H. Booker Chair of Canine

Genetics at The Seeing Eye, Inc., in Morristown, New Jersey. I am looking forward to applying all of the genetic knowledge that I have gained during my time at Iowa State University to such

xiii an amazing cause as selecting the best dogs to be used for guides for people who are blind.In

2011, I moved across the country and fell in love with Iowa, and I know that 2015 will bring the same thing with my cross-country move to New Jersey. And, I promise not to attempt to talk the team into breeding seeing eye pigs. I will be happy to stick with dogs.

Last, but most certainly not least, I would like to thank the most adorable and attitude- filled interruption of my graduate studies, Nancy Tilden Waide. As proud as I am to earn my

Ph.D., there are no letters that could ever follow my name more important than M-O-M. You are my love, my life, my future. Everything that I do is for you, and there will never be an acheivement that I am more proud of than being your mama. I am so excited for our next journey in life.

xiv

ABSTRACT

Disease causes large economic losses in the swine industry, not only through the cost of medical treatment, but also due to reduced production performance of sick pigs. Expanding our knowledge of the genetic basis of diseases will allow more conscientious breeding programs to alleviate some of the economic impact. The overall objective of this thesis was to identify the molecular and quantitative genetic basis of two diseases in pigs, severe combined immunodeficiency (SCID) and porcine reproductive and respiratory syndrome (PRRS). SCID is a naturally occuring primary immunodeficiency in humans, horses, dogs, and, as we discovered, also in pigs. Immunological assays determined that these pigs had low/no B or T cells, but NK cells were present, resulting in T- B- NK+ SCID. Genetic mapping of the lesion and molecular characterization revealed two independent mutations in the Artemis gene that cause SCID in these pigs. Each of these mutations were associated with a distinct haplotype of single nucleotide polymorphisms (SNPs), both of which were traced back to the founder generation of this pig population, which was sourced from commercial pig populations in the midwestern United

States. This suggests that the deleterious mutations repsonsible for causing SCID in these pigs may be present in the swine industry; although, likely at a low frequency. The presence of SCID in commercial swine operations may go undetected, but a PRRS outbreak certainly will not.

Thirteen trials, each with ~200 piglets from commercial breeding programs, were experimentally infected with one of two PRRS virus (PRRSV) isolates. Phenotypes analyzed were viral load

(VL) in blood during the first 21 days post infection (dpi) and weight gain (WG) from 0 to 42 dpi. We utilized genotypes determined on this population using the Illumina Porcine SNP60 panel to perform genome wide association studies (GWAS) and found several genomic regions associated with each trait. None of these genomic regions were associated in both PRRSV isolate

xv trials, except for the QTL on Sus Scrofa (SSC) 4 that was previously identified, which is associated with VL in both isolates and with WG in the NVSL isolate. These results contradicted the previously estimated high genetic correlations of each trait between the two

PRRSV isolates and lead us to believe that response to each PRRSV isolate is controlled by different loci, at least for loci with detectable associations with these traits. Gene ontology (GO) annotation information of genes in SNP-associated regions showed enrichment of immunology- related GO terms in VL-associated regions and metabolism-related GO terms in WG-associated regions. We then used these data to assess the accuracy of genomic prediction across PRRSV isolates and across breeding companies. Incorporation of annotation information enabled the identification of subsets of SNPs that could be used for genomic prediction of response to

PRRSV in pigs that have not been exposed to the virus. However, predictions based on these

SNP subsets were not as accurate as predictions using SNPs across the whole genome. The work described in this thesis presents opportunities for disease reduction through genetic testing for

SCID carrier status and genomic selection for response to PRRSV infection. Preventing the mating of SCID carriers will reduce losses in of nursery piglets, and genomic selection of pigs that are predicted to have more desirable response to PRRSV infection would lessen the impact of this disease.

1

CHAPTER 1: GENERAL INTRODUCTION

Introduction

Disease is a topic of great concern in the swine industry, as diseased pigs have reduced production performance and require costly medical treatment. In addition to the economic impact, the welfare of diseased animals is of concern to farmers. The Pig Site

(http://www.thepigsite.com/diseaseinfo/) lists over 140 diseases or health conditions that may affect pigs. Swine diseases caused by viruses include swine influenza (Nelson et al., 2015),

African and classical swine fever (Grau et al., 2015), foot and mouth disease (Grau et al., 2015), porcine epidemic diarrhea virus (PEDV; Huang et al., 2013; Stevenson et al., 2013), porcine circovirus type 2 (PCV2) (Harding, 2004; Rose et al., 2012; Trible et al., 2012), and porcine reproductive and respiratory syndrome (PRRS; Lunney et al., 2010). Bacterial pathogens that often infect pigs include Eschericia coli (McLamb et al., 2013; Glass-Kaastra et al., 2014),

Salmonella (Pires et al., 2014; Cortés et al., 2015), Streptococcus suis (Dee et al., 1993; Fittipaldi et al., 2011; Glass-Kaastra et al., 2014), and Mycoplasma hyopneumonia (Young et al., 1983;

Bargen, 2004). Communicable diseases are of great concern in the swine industry, as transmission via direct pig-to-pig contact leads to quick spread within a farm, and various transmission routes, such as aerosol or mechanical transmission by vehicle or human traffic, cause spread from farm to farm (Ribbens et al., 2004; Cho and Dee, 2006; Alvarez-Ordóez et al.,

2013; Dvorak et al., 2013; Tobias et al., 2014; Hall and Neumann, 2015).

The inherent susceptibility of pigs to these diseases may also play a role in the economic impact of disease in the swine industry. Piglets may be more susceptible to pathogens commonly found on farms due to an inability to mount an appropriate immune response. The genetic basis of susceptibility to pathogens may involve variation in many genes and have a complex

2 inheritance pattern (Reiner et al., 2002; Boddicker et al., 2012; Reiner et al., 2014; Ramis et al.,

2015) or may result from mutations in a single gene (Goetstouwers et al., 2014; Wang et al.,

2014). The latter scenario is the case for piglets affected with Severe Combined

Immunodeficiency (SCID; Cino-Ozuna et al., 2012). These SCID piglets lack B and T cells

(Ewen et al., 2014), which are indispensable to both the adaptive and innate immune systems. B and T cells play a role in regulation of the innate immune system through cytokine production

(Zhao et al., 2009). These piglets fare well while nursing from the sow, but succumb to opportunistic infections once they have been weaned. In the swine industry, attributing deaths in the nursery barn to genetic disorders such as SCID is unlikely, as parentage information is typically not tracked past weaning. A solution to this possible problem is to avoid mating SCID carriers to one another, which can be achieved by genetic testing for the causative mutations described in this thesis.

SCID is a primary immune deficiency that naturally occurs in humans (Cossu, 2010), horses (Wiler et al., 1995; Bernoco and Bailey, 1998; Perryman, 2004), dogs (Meek et al., 2001), and mice (Bosma et al., 1983). Recently, we discovered the first cases of naturally occuring

SCID in pigs (Cino-Ozuna et al., 2012) during the necropsy of four piglets that died soon after arrival at Kansas State University as part of a PRRSV infection trial. These piglets were from a line of pigs selected for residual feed intake at Iowa State University (Cai et al., 2008). The inheritance of SCID in pigs followed a simple Mendelian recessive pattern, which agrees with the majority of SCID cases in humans (Cossu, 2010). Flow cytometry was used to quantify the numbers of B, T, and NK cells in circulating blood of these SCID piglets and showed that they were affected with T- B- NK+ SCID (Ewen et al., 2014). It was also shown that these SCID pigs were unable to reject cells from two human cancer lines after injection in the SCID piglets’ ears

3

(Basel et al., 2012). These piglets lack the ability to fight foreign cells or pathogens, such as viruses, bacteria, and even human cancer cells.

SCID pigs provide a valuable model for biomedical research, and can be used to study many aspects of human immune function, disease, treatment methods, vaccination protocols, and transplantation, among others. In addition to studies investigating human immune function,

SCID pigs also provide a model that would be useful in examining pig diseases. Understanding the host response to the PRRS virus (PRRSV) would benefit greatly from infection of SCID pigs, as there is much to learn about the contribution of B and T cells to clearance of the virus

(Murtaugh et al., 2002; Mateu and Diaz, 2008; Darwich et al., 2010; Loving et al., 2015).

Recently, a group at Kansas State University conducted a study in which SCID and non-SCID littermates were infected with the PRRSV, then followed for 21 dpi (Chen et al., 2015). This study showed that SCID piglets had lower virus levels in blood up to and including 11 dpi, at which time this relationship reversed and non-SCID piglets began clearing the PRRSV, while the

SCID piglets were unable to control replication. The differences in amount of PRRSV between

SCID and non-SCID pigs in this study can be attributed to the actions of B and T cells. Chen et al. (2015) hypothesized that cytokines secreted by T cells were responsible for both the decreased (IL-10; Patton et al., 2009; Cecere et al., 2012) and increased (IFN-γ; Rowland et al.,

2001) levels of PRRSV in the SCID pigs at early and later time points, respectively. Further studies are needed to examine the mechanisms responsible for the results found in this study. In particular, measuring the levels of IL-10 and IFN- γ would give information on the hypotheses in

Chen et al. (2015). These valuable SCID pigs were used as a model to study the PRRSV, because this virus causes great economic losses to the swine industry (Holtkamp et al., 2013) and understanding the immunobiology of PRRSV infection may help to mitigate its impact.

4

The cost of PRRSV outbreaks to the swine industry can be categorized into three main groups: prevention, resolution of the disease, and biological effects on the diseased herd.

Prevention efforts include strict biosecurity protocols to prevent introduction of the virus to a farm (Dee et al., 2003; Dee et al., 2004; Dee et al., 2010; Alonso et al., 2013) and vaccination against the PRRSV, which provides little protection, if any (Bourry et al., 2015; Kim et al., 2015;

Lyoo, 2015; Renukaradhya et al., 2015; Rose et al., 2015). Methods used to resolve the PRRSV infection include veterinarian treatment through administration of antiviral medication (Liu et al.,

2015) or other compounds (Sun et al., 2014), or a complete depopulation and repopulation of the infected farm (Dee and Joo, 1994; Dee and Joo, 1997). PRRSV infected pigs have poorer growth rates than uninfected pigs (Mengeling et al., 1998; M.M. Li et al., 2015). All of these results of

PRRSV outbreaks contribute to the economic impact of PRRS, which has been estimated to be over $664 million in the US each year (Holtkamp et al., 2013). Although these methods help to reduce the impact of PRRS, the combination of all have not been completely effective in eliminating the impact of PRRSV outbreaks. Therefore, there is value in additional tools that can be used to reduce the effects of PRRSV infection, such as genetic selection of pigs that are less affected by infection with the PRRSV (Lewis et al., 2007; Lunney et al., 2011; Rowland et al.,

2012).

The Immune Response Annotation Group annotated 1,369 genes involved in the immune response of pigs (Dawson et al., 2013), each presenting an opportunity for genetic variation in immune response to pathogens. Genetic variability in susceptibility to several diseases has been previously described, including diseases caused by bacteria (Sellwood et al., 1975; Van Diemen et al., 2002) and viruses (Halbur et al., 1998; Reiner et al., 2002; Petry et al., 2005; Vincent et al.,

2005; Vincent et al., 2006; Ait-Ali et al., 2007; Nakajima et al., 2007; Petry et al., 2007;

5

Doeschl-Wilson et al., 2009). In light of these and other findings, the PRRS Host Genetics

Consortium (PHGC) was established with the aim to study the genetic basis of piglet response to

PRRS infection (Lunney et al., 2011). Genomic analysis of the first 8 trials of the PHGC project showed that there was a major quantitative trait (QTL) on Sus scrofa chromosome (SSC) 4 that affected both the amount of PRRSV in the blood of infected piglets and the weight gain from 0 to 42 days post infection (dpi) (Boddicker et al., 2012; Boddicker et al., 2014a; Boddicker et al., 2014b). Nine PHGC trials were carried out using the NVSL 97-7985 (NVSL; Osorio et al.,

2002) PRRSV isolate. In order to assess the consistency of results with a more recent PRRSV isolate, KS2006-72109 (KS06) was used to infect an additional 5 trials of piglets. All of these data were available for determining the association of genetic variation in these pigs to VL and

WG.

Genome wide association studies (GWAS) employ statistical methods to identify regions of the genome that are associated with the trait of interest (Dean, 2003; Balding, 2006; Zeng et al., 2015). Bionformatic tools can then be used to determine if they contain candidate genes that have been shown to have a function related to the trait of interest. GWAS results of complex polygenic traits, such as response to infection with one of two isolates of the Porcine

Reproductive and Respiratory Syndrome virus (PRRSV), may reveal one or two genomic regions with relatively large effects if such loci exist (Boddicker et al., 2012) with many other genomic regions having much smaller effects. In an effort to reduce the time required to identify candidate genes in these numerous genomic regions with small effects on this complex trait, genes located near moderately associated SNPs can be compiled into gene lists. These lists can then be assessed for overrepresenatation of gene ontology (GO) annotation terms to reveal functional processes or pathways that may be involved in the trait. In Chapter 4 of this thesis, we

6 identified genomic regions associated with VL and WG after infection with one of two PRRSV isolates, then perfomed GO term enrichment analyses on these GWAS results.

Another genomic analysis technique employed in this thesis is genomic prediction, which involves the use of only genotypes of genetic markers that have been observed in both the training and validation populations to predict the genetic value of animals in the validation population for a phenotype of interest (Meuwissen et al., 2001). Genomic prediction estimates an animal’s genetic value without that animal having an observed phenotype for that trait. The accuracy of genomic prediction is typically quite high when the animals in the population for which phenotypes are being predicted (validation) are closely related to the animals in the population used to estimate the effects of each SNP (training; Habier et al., 2007; Goddard,

2008; Hayes et al., 2009; Daetwyler et al., 2013). Genomic prediction accuracy is also affected by the similarity of phenotypes observed versus those that are predicted; slightly different conditions in the training versus the validation population could alter genomic prediction accuracy, for example running speed on flat ground versus an inclined plane or weight gained after infection with one virus isolate versus another isolate. In these examples, the observed phenotypes are the same, running speed or weight gain after virus infection, but the conditions are slightly altered, flat versus incline or infection with different virus isolates. The data used in this study allowed us to assess both of these scenarios; we assessed differences in accuracy of genomic prediction across genetic backgrounds and PRRSV isolates. Furthermore, we examined the contribution of different subsets of SNPs based on GWAS or GO annotation enrichment results, only SNPs near the SSC 4 QTL, or the whole genome (Chapter 5).

The work presented in this thesis is based on the following hypotheses: 1) A Mendelian recessive mutation in a single gene causes the spontaneous SCID phenotype in pigs from a

7 selection line at Iowa State University; 2) Gene ontology (GO) annotation information can be used to aid identification of genomic regions that are associated with viral and weight gain traits in pigs infected with one of two PRRSV isolates; 3) Genomic prediction of VL and WG in response to PRRSV infection will be most accurate when the same virus isolates are used to infect training and validation populations, and genomic prediction across PRRSV isolate will be more accurate with increased genetic relationships between training and validation populations; and 4) Genomic prediction of VL and WG in response to PRRSV infection will be moderately accurate when the PRRSV isolate used to infect the training population is different from that of the validation population, and SNP subsets based on association information or gene ontology information will increase the accuracy of genomic prediction across PRRSV isolates. The objectives of this thesis were to test these hypotheses as follows:

1) Perform genetic mapping, targeted gene sequencing, and functional validation to identify

the causative mutation of SCID in pigs.

2) Identify genomic regions associated with response to infection with one of two PRRSV

isolates using genome-wide association analyses.

3) Assess the enrichment of gene ontology annotation terms in genes near regions shown to

be associated with PRRSV response by the genome-wide association analyses.

4) Evaluate the accuracy of genomic prediction across PRRSV isolates, across genetic lines,

and using SNP subsets based on association information or gene ontology information.

Thesis Organization

Three manuscripts were prepared for publication in scientific journals and are included as

Chapters in this thesis. I, Emily Waide, was the major contributor and writer of each of the three manuscripts, and Drs. Jack Dekkers and Christopher Tuggle supervised the research presented. A

8 review of literature relevant to this research can be found in Chapter 2. The mutations discovered to cause the only known naturally occuring cases of SCID in pigs are described in Chapter 3. An application for a U.S. patent has been published pertaining to these two mutations and other work concerning the SCID pig discovery, Publication No. US-2015-0216147. The results of genome wide association studies of piglets infected with one of two isolates of the PRRSV are described in Chapter 4. Genomic prediction of response to experimental PRRSV infection using various training and validation populations as well as different SNP subsets is detailed in Chapter

5. Chapter 6 includes general conclusions of this research, as well as the applicability of the findings to the swine industry.

Literature Cited

Ait-Ali, T., A. D. Wilson, D. G. Westcott, M. Clapperton, M. Waterfall, M. A. Mellencamp, T. W. Drew, S. C. Bishop, and A. L. Archibald. 2007. Innate immune responses to replication of porcine reproductive and respiratory syndrome virus in isolated Swine alveolar macrophages. Viral Immunol. 20:105–118.

Alonso, C., M. P. Murtaugh, S. A. Dee, and P. R. Davies. 2013. Epidemiological study of air filtration systems for preventing PRRSV infection in large sow herds. Prev. Vet. Med. 112:109– 117.

Alvarez-Ordóez, A., F. Martínez-Lobo, H. Arguello, A. Carvajal, and P. Rubio. 2013. Swine Dysentery: Aetiology, Pathogenicity, Determinants of Transmission and the Fight against the Disease. Int. J. Environ. Res. Public Health 10:1927–1947.

Balding, D. J. 2006. A tutorial on statistical methods for population association studies. Nat. Rev. Genet. 7:781–91.

Bargen, L. E. 2004. A system response to an outbreak of enzootic pneumonia in grow/finish pigs. Can. Vet. J. 45:856–9.

Basel, M. T., S. Balivada, A. P. Beck, M. A. Kerrigan, M. M. Pyle, J. C. M. Dekkers, C. R. Wyatt, R. R. R. Rowland, D. E. Anderson, S. H. Bossmann, and D. L. Troyer. 2012. Human xenografts are not rejected in a naturally occurring immunodeficient porcine line: a human tumor model in pigs. Biores. Open Access 1:63–68.

Bernoco, D., and E. Bailey. 1998. Frequency of the SCID gene among Arabian horses in the USA. Anim. Genet. 29:41–2.

9

Boddicker, N. J., A. Bjorkquist, R. R. R. Rowland, J. K. Lunney, J. M. Reecy, and J. C. Dekkers. 2014a. Genome-wide association and genomic prediction for host response to porcine reproductive and respiratory syndrome virus infection. Genet Sel Evol 46:18.

Boddicker, N. J., D. J. Garrick, R. R. R. Rowland, J. K. Lunney, J. M. Reecy, and J. C. M. Dekkers. 2014b. Validation and further characterization of a major quantitative trait locus associated with host response to experimental infection with porcine reproductive and respiratory syndrome virus. Anim. Genet. 45:48–58.

Boddicker, N., E. H. Waide, R. R. R. Rowland, J. K. Lunney, D. J. Garrick, J. M. Reecy, and J. C. M. Dekkers. 2012. Evidence for a major QTL associated with host response to Porcine reproductive and respiratory syndrome virus challenge. J. Anim. Sci. 90:1733–1746.

Bosma, G. C., R. P. Custer, and M. J. Bosma. 1983. A severe combined immunodeficiency mutation in the mouse. Nature 301:527–30.

Bourry, O., C. Fablet, G. Simon, and C. Marois-Créhan. 2015. Efficacy of combined vaccination against Mycoplasma hyopneumoniae and porcine reproductive and respiratory syndrome virus in dually infected pigs. Vet. Microbiol.

Cai, W., D. S. Casey, and J. C. M. Dekkers. 2008. Selection response and genetic parameters for residual feed intake in Yorkshire swine. J. Anim. Sci. 86:287–298.

Cecere, T. E., S. M. Todd, and T. Leroith. 2012. Regulatory T cells in arterivirus and coronavirus infections: do they protect against disease or enhance it? Viruses 4:833–46.

Chen, N., J. C. M. Dekkers, C. L. Ewen, and R. R. R. Rowland. 2015. Porcine reproductive and respiratory syndrome virus replication and quasispecies evolution in pigs that lack adaptive immunity. Virus Res. 195:246–9.

Cho, J. G., and S. A. Dee. 2006. Porcine reproductive and respiratory syndrome virus. Theriogenology 66:655–62.

Cino-Ozuna, A., R. Rowland, J. Nietfeld, M. Kerrigan, J. Dekkers, and C. Wyatt. 2012. Preliminary Findings of a Previously Unrecognized Porcine Primary Immunodeficiency Disorder. Vet. Pathol. 50:144–146.

Cortés, P., D. A. Spricigo, C. Bardina, and M. Llagostera. 2015. Remarkable diversity of Salmonella bacteriophages in swine and poultry. FEMS Microbiol. Lett. 362:1–7.

Cossu, F. 2010. Genetics of SCID. Ital. J. Pediatr. 36:76.

Darwich, L., I. Díaz, and E. Mateu. 2010. Certainties, doubts and hypotheses in porcine reproductive and respiratory syndrome virus immunobiology. Virus Res. 154:123–132.

Dawson, H. D., J. E. Loveland, G. Pascal, J. G. R. Gilbert, H. Uenishi, K. M. Mann, Y. Sang, J. Zhang, D. Carvalho-Silva, T. Hunt, M. Hardy, Z. Hu, S.-H. Zhao, A. Anselmo, H. Shinkai, C. Chen, B. Badaoui, D. Berman, C. Amid, M. Kay, D. Lloyd, C. Snow, T. Morozumi, R. P.-Y.

10

Cheng, M. Bystrom, R. Kapetanovic, J. C. Schwartz, R. Kataria, M. Astley, E. Fritz, C. Steward, M. Thomas, L. Wilming, D. Toki, A. L. Archibald, B. Bed’Hom, D. Beraldi, T.-H. Huang, T. Ait-Ali, F. Blecha, S. Botti, T. C. Freeman, E. Giuffra, D. a Hume, J. K. Lunney, M. P. Murtaugh, J. M. Reecy, J. L. Harrow, C. Rogel-Gaillard, and C. K. Tuggle. 2013. Structural and functional annotation of the porcine immunome. BMC Genomics 14:332.

Dean, M. 2003. Approaches to identify genes for complex human diseases: lessons from Mendelian disorders. Hum. Mutat. 22:261–74.

Dee, S. A., A. R. Carlson, N. L. Winkelman, and M. M. Corey. 1993. Effect of management practices on the Streptococcus suis carrier rate in nursery swine. J. Am. Vet. Med. Assoc. 203:295–9.

Dee, S. A., J. Deen, S. Otake, and C. Pijoan. 2004. An experimental model to evaluate the role of transport vehicles as a source of transmission of porcine reproductive and respiratory syndrome virus to susceptible pigs. Can. J. Vet. Res. 68:128–33.

Dee, S. A., and H. S. Joo. 1994. Prevention of the spread of porcine reproductive and respiratory syndrome virus in endemically infected pig herds by nursery depopulation. Vet. Rec. 135:6–9.

Dee, S. A., and H. Joo. 1997. Strategies to control PRRS: a summary of field and research experiences. Vet. Microbiol. 55:347–53.

Dee, S., J. Deen, K. Rossow, C. Weise, R. Eliason, S. Otake, H. S. Joo, and C. Pijoan. 2003. Mechanical transmission of porcine reproductive and respiratory syndrome virus throughout a coordinated sequence of events during warm weather. Can. J. Vet. Res. 67:12–9.

Dee, S., S. Otake, and J. Deen. 2010. Use of a production region model to assess the efficacy of various air filtration systems for preventing airborne transmission of porcine reproductive and respiratory syndrome virus and Mycoplasma hyopneumoniae: Results from a 2-year study. Virus Res. 154:177–184.

Van Diemen, P. M., M. B. Kreukniet, L. Galina, N. Bumstead, and T. S. Wallis. 2002. Characterisation of a resource population of pigs screened for resistance to salmonellosis. Vet. Immunol. Immunopathol. 88:183–196.

Doeschl-Wilson, A. B., I. Kyriazakis, A. Vincent, M. F. Rothschild, E. Thacker, and L. Galina- Pantoja. 2009. Clinical and pathological responses of pigs from two genetically diverse commercial lines to porcine reproductive and respiratory syndrome virus infection. J. Anim. Sci. 87:1638–1647.

Dvorak, C. M. T., M. P. Lilla, S. R. Baker, and M. P. Murtaugh. 2013. Multiple routes of porcine circovirus type 2 transmission to piglets in the presence of maternal immunity. Vet. Microbiol. 166:365–374.

Ewen, C., A. Cino-Ozuna, H. He, M. Kerrigan, J. Dekkers, C. Tuggle, R. Rowland, and C. Wyatt. 2014. Analysis of blood leukocytes in a naturally occurring immunodeficiency of pigs shows the defect is localized to B and T cells. Vet. Immunol. Immunopathol. 162:174–179.

11

Fittipaldi, N., J. Xu, S. Lacouture, P. Tharavichitkul, M. Osaki, T. Sekizaki, D. Takamatsu, and M. Gottschalk. 2011. Lineage and virulence of Streptococcus suis serotype 2 isolates from North America. Emerg. Infect. Dis. 17:2239–44.

Glass-Kaastra, S. K., D. L. Pearl, R. J. Reid-Smith, B. McEwen, D. Slavic, S. A. McEwen, and J. Fairles. 2014. Antimicrobial susceptibility of Escherichia coli F4, Pasteurella multocida, and Streptococcus suis isolates from a diagnostic veterinary laboratory and recommendations for a surveillance system. Can. Vet. J. 55:341–8.

Goetstouwers, T., M. Van Poucke, W. Coppieters, V. U. Nguyen, V. Melkebeek, A. Coddens, K. Van Steendam, D. Deforce, E. Cox, and L. J. Peelman. 2014. Refined candidate region for F4ab/ac enterotoxigenic Escherichia coli susceptibility situated proximal to MUC13 in pigs. PLoS One 9:e105013.

Grau, F. R., M. E. Schroeder, E. L. Mulhern, M. T. McIntosh, and M. A. Bounpheng. 2015. Detection of African swine fever, classical swine fever, and foot-and-mouth disease viruses in swine oral fluids by multiplex reverse transcription real-time polymerase chain reaction. J. Vet. Diagnostic Investig. 27:140–149.

Halbur, P., M. F. Rothschild, B. Thacker, X.-J. Meng, P. Paul, and J. Bruna. 1998. Differences in susceptibility of Duroc , Hampshire , and Meishan reproductive and respiratory syndrome virus ( PRRSV ). 115:181–189.

Hall, W., and E. Neumann. 2015. Fresh Pork and Porcine Reproductive and Respiratory Syndrome Virus: Factors Related to the Risk of Disease Transmission. Transbound. Emerg. Dis. 62:350–366.

Harding, J. C. S. 2004. The clinical expression and emergence of porcine circovirus 2. Vet. Microbiol. 98:131–5.

Holtkamp, D. J., J. B. Kliebenstein, E. J. Neumann, J. J. Zimmerman, H. F. Rotto, T. K. Yoder, C. Wang, P. E. Yeske, C. L. Mowrer, and C. a. Haley. 2013. Assessment of the economic impact of porcine reproductive and respiratory syndrome virus on United States pork producers. J. Swine Heal. Prod. 21:72–84.

Huang, Y.-W., A. W. Dickerman, P. Pineyro, L. Li, L. Fang, R. Kiehne, T. Opriessnig, and X.-J. Meng. 2013. Origin, Evolution, and Genotyping of Emergent Porcine Epidemic Diarrhea Virus Strains in the United States. MBio 4:e00737–13–e00737–13.

Kim, T., C. Park, K. Choi, J. Jeong, I. Kang, S.-J. Park, and C. Chae. 2015. Comparison of Two Commercial Type 1 Porcine Reproductive and Respiratory Syndrome Virus (PRRSV) Modified Live Vaccines against Heterologous Type 1 and Type 2 PRRSV Challenge in Growing Pigs.D. W. Pascual, editor. Clin. Vaccine Immunol. 22:631–640.

Lewis, C. R. G., T. Ait-Ali, M. Clapperton, A. L. Archibald, and S. Bishop. 2007. Genetic perspectives on host responses to porcine reproductive and respiratory syndrome (PRRS). Viral Immunol. 20:343–358.

12

Li, M. M., K. M. Seelenbinder, M. A. Ponder, L. Deng, R. P. Rhoads, K. D. Pelzer, J. S. Radcliffe, C. V Maxwell, J. A. Ogejo, and M. D. Hanigan. 2015. Effects of porcine reproductive and respiratory syndrome virus on pig growth, diet utilization efficiency, and gas release from stored manure. J. Anim. Sci. 93:4424–35.

Liu, X., C. Guo, Y. Huang, X. Zhang, and Y. Chen. 2015. Inhibition of porcine reproductive and respiratory syndrome virus by Cecropin D in vitro. Infect. Genet. Evol. 34:7–16.

Loving, C. L., F. A. Osorio, M. P. Murtaugh, and F. A. Zuckermann. 2015. Innate and adaptive immunity against Porcine Reproductive and Respiratory Syndrome Virus. Vet. Immunol. Immunopathol. 167:1–14.

Lunney, J. K., D. A. Benfield, and R. R. R. Rowland. 2010. Porcine reproductive and respiratory syndrome virus: an update on an emerging and re-emerging viral disease of swine. Virus Res. 154:1–6.

Lunney, J. K., J. P. Steibel, J. M. Reecy, E. Fritz, M. F. Rothschild, M. Kerrigan, B. Trible, and R. R. Rowland. 2011. Probing genetic control of swine responses to PRRSV infection: current progress of the PRRS host genetics consortium. BMC Proc. 5 Suppl 4:S30.

Lyoo, Y. S. 2015. Porcine reproductive and respiratory syndrome virus vaccine does not fit in classical vaccinology. Clin. Exp. Vaccine Res. 4:159.

Mateu, E., and I. Diaz. 2008. The challenge of PRRS immunology. Vet. J. 177:345–351.

McLamb, B. L., A. J. Gibson, E. L. Overman, C. Stahl, and A. J. Moeser. 2013. Early Weaning Stress in Pigs Impairs Innate Mucosal Immune Responses to Enterotoxigenic E. coli Challenge and Exacerbates Intestinal Injury and Clinical Disease.C. Kanellopoulos-Langevin, editor. PLoS One 8:e59838.

Meek, K., L. Kienker, C. Dallas, W. Wang, M. J. Dark, P. J. Venta, M. L. Huie, R. Hirschhorn, and T. Bell. 2001. SCID in Jack Russell terriers: a new animal model of DNA-PKcs deficiency. J. Immunol. 167:2142–2150.

Mengeling, W. L., K. M. Lager, and A. C. Vorwald. 1998. Clinical effects of porcine reproductive and respiratory syndrome virus on pigs during the early postnatal interval. Am. J. Vet. Res. 59:52–5.

Murtaugh, M. P., Z. Xiao, and F. Zuckermann. 2002. Immunological responses of swine to porcine reproductive and respiratory syndrome virus infection. Viral Immunol. 15:533–547.

Nakajima, E., T. Morozumi, K. Tsukamoto, T. Watanabe, G. Plastow, and T. Mitsuhashi. 2007. A naturally occurring variant of porcine Mx1 associated with increased susceptibility to influenza virus in vitro. Biochem. Genet. 45:11–24.

Nelson, M. I., J. Stratton, M. L. Killian, A. Janas-Martindale, and A. L. Vincent. 2015. Continual Reintroduction of Human Pandemic H1N1 Influenza A Viruses into Swine in the United States, 2009 to 2014. J. Virol. 89:6218–26.

13

Osorio, F. A, J. A Galeota, E. Nelson, B. Brodersen, a Doster, R. Wills, F. Zuckermann, and W. W. Laegreid. 2002. Passive transfer of virus-specific antibodies confers protection against reproductive failure induced by a virulent strain of porcine reproductive and respiratory syndrome virus and establishes sterilizing immunity. Virology 302:9–20.

Patton, J. B., R. R. Rowland, D. Yoo, and K.-O. Chang. 2009. Modulation of CD163 receptor expression and replication of porcine reproductive and respiratory syndrome virus in porcine macrophages. Virus Res. 140:161–171.

Perryman, L. E. 2004. Molecular Pathology of Severe Combined Immunodeficiency in Mice, Horses, and Dogs. Vet. Pathol. 41:95–100.

Petry, D. B., J. W. Holl, J. S. Weber, A. R. Doster, F. a. Osorio, and R. K. Johnson. 2005. Biological responses to porcine respiratory and reproductive syndrome virus in pigs of two genetic populations. J. Anim. Sci. 83:1494–1502.

Petry, D. B., J. Lunney, P. Boyd, D. Kuhar, E. Blankenship, and R. K. Johnson. 2007. Differential immunity in pigs with high and low responses to porcine reproductive and respiratory syndrome virus infection. J. Anim. Sci. 85:2075–2092.

Pires, A. F. A., J. A. Funk, and C. Bolin. 2014. Risk factors associated with persistence of Salmonella shedding in finishing pigs. Prev. Vet. Med. 116:120–8.

Ramis, G., E. Marco, V. Magaña, P. González-Contreras, G. Swierczynski, J. M. Abellaneda, A. Sáez-Acosta, A. Mrowiec, and F. J. Pallarés. 2015. Evidence that periweaning failure-to-thrive syndrome (PFTS) has a genetic predisposition. Vet. Rec. 176:596.

Reiner, G., N. Bertsch, D. Hoeltig, M. Selke, H. Willems, G. F. Gerlach, B. Tuemmler, I. Probst, R. Herwig, M. Drungowski, and K. H. Waldmann. 2014. Identification of QTL affecting resistance/susceptibility to acute Actinobacillus pleuropneumoniae infection in swine. Mamm. Genome 25:180–191.

Reiner, G., E. Melchinger, M. Kramarova, E. Pfaff, M. Büttner, A. Saalmüller, and H. Geldermann. 2002. Detection of quantitative trait loci for resistance/susceptibility to pseudorabies virus in swine. J. Gen. Virol. 83:167–172.

Renukaradhya, G. J., X.-J. Meng, J. G. Calvert, M. Roof, and K. M. Lager. 2015. Live porcine reproductive and respiratory syndrome virus vaccines: Current status and future direction. Vaccine 33:4069–80.

Ribbens, S., J. Dewulf, F. Koenen, H. Laevens, and A. de Kruif. 2004. Transmission of classical swine fever. A review. Vet. Q. 26:146–155.

Rose, N., T. Opriessnig, B. Grasland, and A. Jestin. 2012. Epidemiology and transmission of porcine circovirus type 2 (PCV2). Virus Res. 164:78–89.

14

Rose, N., P. Renson, M. Andraud, F. Paboeuf, M. F. Le Potier, and O. Bourry. 2015. Porcine reproductive and respiratory syndrome virus (PRRSv) modified-live vaccine reduces virus transmission in experimental conditions. Vaccine 33:2493–9.

Rowland, R. R. R., J. Lunney, and J. Dekkers. 2012. Control of porcine reproductive and respiratory syndrome (PRRS) through genetic improvements in disease resistance and tolerance. Front. Genet. 3:1–6.

Rowland, R. R. R., B. Robinson, J. Stefanick, T. S. Kim, L. Guanghua, S. R. Lawson, and D. A. Benfield. 2001. Inhibition of porcine reproductive and respiratory syndrome virus by interferon- gamma and recovery of virus replication with 2-aminopurine. Arch. Virol. 146:539–555.

Sellwood, R., R. A Gibbons, G. W. Jones, and J. M. Rutter. 1975. Adhesion O F Enteropathogenic Escherichia Coli To Pig Intestinal Brush Borders : the Existence of Two P I G Phenotypes. 8.

Stevenson, G. W., H. Hoang, K. J. Schwartz, E. R. Burrough, D. Sun, D. Madson, V. L. Cooper, A. Pillatzki, P. Gauger, B. J. Schmitt, L. G. Koster, M. L. Killian, and K. J. Yoon. 2013. Emergence of Porcine epidemic diarrhea virus in the United States: clinical signs, lesions, and viral genomic sequences. J. Vet. Diagn. Invest. 25:649–54.

Sun, N., Z.-W. Wang, C.-H. Wu, E. Li, J.-P. He, S.-Y. Wang, Y.-L. Hu, H.-M. Lei, and H.-Q. Li. 2014. Antiviral activity and underlying molecular mechanisms of Matrine against porcine reproductive and respiratory syndrome virus in vitro. Res. Vet. Sci. 96:323–327.

Tobias, T. J., A. Bouma, J. van den Broek, A. van Nes, A. J. J. M. Daemen, J. A. Wagenaar, J. A. Stegeman, and D. Klinkenberg. 2014. Transmission of Actinobacillus pleuropneumoniae among weaned piglets on endemically infected farms. Prev. Vet. Med. 117:207–14.

Trible, B. R., A. Ramirez, A. Suddith, A. Fuller, M. Kerrigan, R. Hesse, J. Nietfeld, B. Guo, E. Thacker, and R. R. R. Rowland. 2012. Antibody responses following vaccination versus infection in a porcine circovirus-type 2 (PCV2) disease model show distinct differences in virus neutralization and epitope recognition. Vaccine 30:4079–4085.

Vincent, A L., B. J. Thacker, P. G. Halbur, M. F. Rothschild, and E. L. Thacker. 2005. In vitro susceptibility of macrophages to porcine reproductive and respiratory syndrome virus varies between genetically diverse lines of pigs. Viral Immunol. 18:506–512.

Vincent, A., B. Thacker, P. Halbur, M. F. Rothschild, and E. L. Thacker. 2006. An investigation of susceptibility to porcine reproductive and respiratory syndrome virus between two genetically diverse commercial lines of pigs. J. Anim. Sci. 84:49–57.

Wang, L., Y. C. Chen, D. J. Zhang, H. T. Li, D. Liu, and X. Q. Yang. 2014. Functional characterization of genetic variants in the porcine TLR3 gene. Genet. Mol. Res. 13:1348–57.

Wiler, R., R. Leber, B. B. Moore, L. F. VanDyk, L. E. Perryman, and K. Meek. 1995. Equine severe combined immunodeficiency: a defect in V(D)J recombination and DNA-dependent kinase activity. Proc. Natl. Acad. Sci. U. S. A. 92:11485–11489.

15

Young, T. F., R. F. Ross, and J. Drisko. 1983. Prevalence of antibodies to Mycoplasma hyopneumoniae in Iowa swine. Am. J. Vet. Res. 44:1946–8.

Zeng, P., Y. Zhao, C. Qian, L. Zhang, R. Zhang, J. Gou, J. Liu, L. Liu, and F. Chen. 2015. Statistical analysis for genome-wide association study. J. Biomed. Res. 29:285–97.

Zhao, J, X. Yang, S.L. Aug, K.D. Kim, H. Tang, and Y-X. Fu. 2009. Do adaptive immune cells suppress or activate innate immunity? Trends in Immunol. 30(1): 8-12.

16

CHAPTER 2: LITERATURE REVIEW

Genetics of Disease

Genetic control of diseases in animals ranges from the effect of one gene to the actions of hundreds or even thousands of genes. Disorders that segregate in a Mendelian recessive manner are often caused by mutations in one gene, and the associated phenotype may be categorical; the animal is affected or unaffected by the disease. Single gene mutations have also been shown to have major effects on susceptibility to one of several diseases; for example, a mutation in FUT1 is responsible for postweaning diarrhea edema disease (Wang et al., 2012) and MX1 (Pastoret et al., 2012) and RIG-I (Barber et al., 2010) play important roles in susceptibility to the influenza virus in pigs and ducks, respectively. On the other hand, complex immunological traits are affected by genetic variants in many genes; for example, many genomic regions are involved in antibody response to vaccination for Bovine Respiratory Syncytial Virus (Glass et al., 2012), the top 20 SNPs associated with Johne’s disease in cattle were located on 8 different chromosomes

(Zare et al., 2014), and a major QTL was found to be involved in viral load and weight gain following infection with the Porcine Reproductive and Respiratory Syndrome virus but with many additional regions showing small effects (Boddicker et al., 2012; Boddicker et al., 2014a;

Boddicker et al., 2014b). The aims of this thesis were as follows: 1) discover the mutations that cause the Mendelian recessive spontaneous SCID that was discovered in pigs (Cino-Ozuna et al.,

2012), 2) identify genomic regions associated with response to one of two isolates of PRRSV, 3) evaluate the overrepresentation of gene ontology terms of genes in regions associated with response to one of two PRRSV isolates, and 4) assess the accuracy of genomic prediction of response to PRRSV infection with different isolates or genetic backgrounds in training and validation populations when using the whole genome or smaller subsets of SNPs. The remainder

17 of Chapter 2 of this thesis is a review of scientific literature that give background information on the diseases that were investigated in this research and the tools that were used to address the objectives provided above.

Porcine Reproductive and Respiratory Syndrome

Porcine Reproductive and Respiratory Syndrome (PRRS) is caused by an enveloped single-stranded positive sense RNA arterivirus (Wensvoort et al., 1991; Conzelmann et al.,

1993). The primary cell type infected by the PRRS virus (PRRSV) in vitro is the alveolar macrophage (Duan et al, 1997). PRRSV infection causes abortions, stillborn and mummified piglets, and those piglets that are born alive are weak (Keffaber, 1989). The repiratory symptoms of PRRSV infection cause increased morbidity and mortality in young pigs and leaves these nursery-aged piglets more susceptible to secondary respiratory infections (Zimmerman et al.,

1997). Research on the pathogenesis of PRRSV may provide knowledge on methods that can be used to reduce the economic impact of PRRSV infection in the swine industry (Holtkamp et al.,

2013).

Immune response to PRRS. The PRRSV must overcome several hurdles after entering its host before it is able to replicate and go on to infect its next target. An initial challenge to the virus is finding a host cell that will permit the virus to enter. Porcine alveolar macrophages are widely thought to be the predominant cell type infected with PRRS virus (Duan et al., 1997); however, the virus has been shown to proliferate in other cell types or lines, such as monkey kidney cells in vitro (Calvert et al., 2007). The permissiveness of cells to entry of the PRRS virus is affected by the expression of cell surface receptors and cytokine activity (Darwich et al.,

2010). Once inside the cell, the virus must hijack the replication machinery of the host to use for

18 its own multiplication. In order to successfully replicate, while avoiding detection by the host immune system, the virus will need to alter the host’s expression of cytokines, such as interferons (Bautista and Molitor, 1997). Changes in expression of other cytokines, including IL-

1, IL-6, IL-8, IL-10, and TNF-α, have also been detected during PRRS infection studies

(Darwich et al., 2010). Antibodies to PRRS virus antigens have been detected as early as 5 days post infection (Yoon et al., 1995), while neutralizing antibodies are not present until several weeks after infection (Murtaugh et al., 2002; Molina et al., 2008). This early production of non- neutralizing antibodies is actually thought to exacerbate PRRSV infection, as it may increase internalization of an antibody coated PRRSV particle by macrophages (Yoon et al., 1996; Gu et al., 2015). The exact mechanisms of PRRSV pathogenesis are unclear and confusing at best

(Mateu and Diaz, 2008; Darwich et al., 2010; Loving et al., 2015), which makes development of methods to control the disease extremely difficult. In addition, the PRRSV is a rapidly mutating virus (Fang et al., 2007; Murtaugh et al., 2010; Shi et al., 2010; Mu et al., 2013), which provides added difficulty in prevention, especially in creation of an effective vaccine (Hu and Zhang,

2014; Li et al., 2014; Park et al., 2014; Renukaradhya et al., 2015).

Transmission of PRRSV. Many excretions from infected pigs have been shown to contain PRRS virus particles, which increases the incidence of spread (Zimmerman et al., 1997).

The ability of the virus to persist in infected pigs for long periods of time has been implicated in transmission of the PRRSV (Wills et al., 2003). Several methods have been attempted in order to eliminate PRRS from commercial populations, including removal of animals following positive diagnosis, depopulation of entire herds (Dee and Joo, 1994; Dee and Joo, 1997), and postponing introduction of new animals into a herd that has broken with the virus (Corzo et al., 2010). These practices may be effective in temporarily eliminating the PRRS virus but are costly and require

19 intense management practices. The most successful strategy to control PRRS virus infections in the swine industry will include a combination of proper management, improved biosecurity on each farm, as well as during transportation, effective vaccination and veterinary care, and genetic improvement of pigs for resistance to the PRRS virus.

Genetic control of response to PRRS. Host immune response to PRRSV infection involves both the innate and the adaptive parts of the immune system. With this, there are many opportunities for genetic control of the response (Lewis et al., 2007). Breed differences have been noted in lung lesions, rectal temperatures, and viremia during PRRSV infections (Halbur et al., 1998; Petry et al., 2005); Petry et al. (2005) also noted within breed differences of these response to PRRSV infection phenotypes. Antibody titers, lung lesion scores, and viral tissue distribution after infection with highly pathogenic PRRSV were markedly different between wild and domestic pigs (Do et al., 2015). Do et al. (2015) found that infection with highly pathogenic

PRRSV in wild pigs caused higher mortality, hemorrhagic lesions in internal organs, anti-

PRRSV IgG antibody titers, and lung lesion scores at 7 dpi (lower at 10 dpi), and a wider viral tissue distribution compared to domestic pigs. Several estimates of heritabilities for response to

PRRSV infection have been published. Lewis et al. (2009) showed that the heritability of reproductive traits of sows was increased during PRRSV outbreak compared to normal health conditions; heritability of mummified piglets was 0.10 and heritability of matings per conception was 0.46 during a PRRSV outbreak. The majority of reproductive traits investigated by Serão et al. (2014) also had higher heritabilities during PRRSV infection compared to before the PRRSV outbreak. This increased heritability after PRRSV infection was also seen in the survivability of nursery age piglets, with an estimated heritability of 0.26 (Vukasinovic and Clutter, 2010). The heritability of PRRS viremia measured as a presence or absence of virus binary trait was shown

20 to be very low, 0.10 (Biffani et al., 2010). On the other hand, the first 8 trials of NVSL data have been previously analyzed, and heritabilities were estimated to be 0.44 for VL and 0.29 for WG for these quantitative traits (Boddicker et al., 2014a). For response to the KS06 PRRSV isolate, the estimates of heritability were higher, 0.65 for VL and 0.44 for WG (Hess et al., 2014).

To date, few studies have employed the use of genotyping platforms, such as SNP chips, to investigate host genetics in response to PRRSV. In one of these, two major QTL were shown to affect the amount of PRRSV-specific antibodies in commercial sows (Serão et al., 2014). The

PRRS Host Genetics Consortium (PHGC; Lunney et al., 2011) was established with the intention of exploring the genetic basis of host response to PRRSV. In total, 15 trials of ~200 commercial nursery age piglets each were experimentally infected with the PRRSV, then followed for 42 days post infection (dpi). In 10 of these trials, pigs were infected with the NVSL 97-7985

(NVSL) virus isolate (Osorio et al., 2002), which is more virulent than the KS2006-72109

(KS06) isolate, which was used to inoculate pigs in the remaining 5 trials. Blood samples were taken at 0, 4, 7, 11, and 14 dpi, then weekly until the completion of the trial at 42 dpi. Individual body weights were observed weekly throughout the trial. Two of the phenotypes that were observed during each of these trials were viral load (VL), calculated as the area under the curve of log-viremia up to and including 21 dpi, and weight gain (WG), calculated as the difference between body weight at 42 and 0 dpi. Each pig was genotyped using the Illumina Porcine SNP60

Beadchip (Ramos et al., 2009), enabling investigation of specific genomic regions associated with response to PRRSV infection.

Boddicker et al. (2012) discovered several genomic regions associated with response to infection with the NVSL PRRSV isolate in the first 3 PHGC trials, including a major QTL on

Sus scrofa chromosome (SSC) 4, in which the favorable allele was associated with decreased VL

21 and increased WG. VL was shown to have a heritability of 0.39, with the SSC 4 QTL explaining approximately 15% of the genetic variation in the initial 5 trials of the PHGC study (Boddicker et al., 2012; Boddicker et al., 2014a). These results were expanded to include the first 8 trials of

NVSL infected pigs (Boddicker et al., 2014b). Several additional genomic regions were shown to have smaller effects on each trait, but the focus of these papers was the SSC 4 QTL effect. It is important to further investigate the effects of this and other QTL in the KS06 trials, to determine if there are QTL controlling response to more than one PRRSV isolate. PRRSV is a rapidly mutating virus, and pigs are unlikely to be naturally exposed to the same isolate in multiple years

(Fang et al., 2007; Kimman et al., 2009; Mu et al., 2013). Ideally, identified QTL would improve response to multiple PRRSV isolates; furthermore, these QTL would ideally have a positive effect on response to other pathogens. A review on genetic control of mice to Leishmania discussed co-localization of QTL that are involved in susceptibility to Leishmania and QTL that influence susceptibility to other pathogens, including bacteria, parasites, and viruses (Lipoldova and Demant, 2006). High genetic correlations of response traits between the two PRRSV isolates used in this study have been reported; 0.95 for VL and 0.78 for WG (Hess et al., 2015).

Furthermore, Hess et al. (2015) showed that the SSC 4 QTL was associated with VL in both isolates, but WG in only the NVSL isolate. Although this QTL has a large effect on the phenotypes, over 80% of the genetic variance is not explained by this region. In this thesis, we aim to identify additional regions associated with response to PRRSV infection.

Although PRRS is an economically devastating disease that has plagued the swine industry for over two and a half decades (Benfield et al., 1992), host immune response to

PRRSV infection is poorly understood (Loving et al., 2015). Interferon-gamma (IFN-γ) is a cytokine that plays an important role in immune responses to viruses (Novelli and Casanova,

22

2004; Carrero, 2013), and is secreted by T cells, NK cells, macrophages, and neurons (Meier et al., 2003; Sun et al., 2012; Li et al., 2014). PRRS-specific IFN-γ response has been shown to be delayed (Meier et al., 2003; Díaz et al., 2005; Díaz et al., 2006), erratic (Mateu and Diaz, 2008;

Dotti et al., 2013), or even nonexistent (Dotti et al., 2011; Dotti et al., 2013). Antibody production occurs within the first two weeks after PRRSV infection (Yoon et al., 1995), while neutralizing antibodies are not observed until later in the infection (Nelson et al., 1994; Lopez and Osorio, 2004). The role of neutralizing antibodies in clearance of the PRRSV is debatable, though, as several studies have reported the presence of PRRSV alongside neutralizing antibodies in pigs (Murtaugh et al., 2002; Wills et al., 2003). Furthermore, antibodies may also act to exacerbate PRRSV infections through antibody-dependent enhancement of infectivity of the virus (Yoon et al., 1996; Gu et al., 2015). Infection of animals that lack an important component of the immune system may help to answer some of the questions that remain about immune response to PRRSV infection. This is challenging, though, as the PRRSV has only been shown to naturally infect pigs (Albina, 1997), and, until recently (Cino-Ozuna et al., 2012;

Suzuki et al., 2012; Watanabe et al., 2013; Huang et al., 2014; Ito et al., 2014; Lee et al., 2014), immunodeficient pigs had not been available for use as research models.

The recent identification of pigs without functional adaptive immune systems (Cino-

Ozuna et al., 2012) provided a unique model to study the pathogenesis of the PRRSV. These pigs were affected with severe combined immunodeficiency (SCID) and lacked both B and T cells, while NK cells were present in circulating blood (Ewen et al., 2014). Infection of SCID piglets and their non-SCID littermates gave insight into the role of the adaptive immune system in response to PRRSV infection (Chen et al., 2015). This study showed that up to and including 11 dpi, SCID piglets had lower levels of viremia in circulating blood compared to non-SCID

23 littermates. This was hypothesized to be due to a decreased number of macrophages present for infection in the SCID pigs. Although there is not expected to be any difference in the raw numbers of macrophages in SCID pigs, the lack of an adaptive immune system may reduce the cytokine signals that affect macrophages. In particular, IL-10, which is a cytokine produced by B and T cells, macrophages, and dendritic cells, may act to increase infectivity of macrophages and lead to increased numbers of permissive macrophages for PRRSV infection (Couper et al., 2008;

Patton et al., 2009; Cecere et al., 2012). At 21 dpi, this relationship between PRRSV levels in

SCID and non-SCID pigs reversed; non-SCID piglets began clearing the virus as expected, while viremia in the SCID pigs continued to increase (Chen et al., 2015). While T cells were blamed for the increased PRRSV levels in non-SCID pigs at 11 dpi, they were credited for reducing viremia by 21 dpi through production of IFN-γ protecting macrophages from infection (Rowland et al., 2001). This study took advantage of pigs lacking B and T cells to study the pathogenesis of the PRRSV, but the utility of SCID pigs extends past pig-specific pathogens into biomedical research to further studies of human immune function.

Severe Combined Immunodeficiency

Severe Combined Immunodeficiency (SCID) is a primary immunodeficiency that has been documented as naturally occuring in humans (Cossu, 2010), horses (Wiler et al., 1995;

Bernoco and Bailey, 1998; Perryman, 2004), dogs (Meek et al., 2001), mice (Bosma et al.,

1983), and recently in pigs (Cino-Ozuna et al., 2012). Genetic modification techniques have also been utilized to introduce the disease into mice, creating a widely used biomedical model . SCID is a large group of diseases, which are subdivided by the presence/functionality of B, T, and NK cells (Cossu, 2010; van der Burg and Gennery, 2011). Further divisions of these subgroups can

24 be made based on the genetic lesion that causes the disease (Cossu, 2010). Cossu (2010) describes the following general classifications of typical SCID and their inheritance patterns:

Table 2.1. General classifications of typical SCID, as described in Cossu et al. (2010). Cellular phenotype of selected SCID Inheritance pattern Number of genes with (with + or - indicating presence or known mutations absence of cell types, respectively) T- B- NK+ Autosomal recessive 6 T- B+ NK+ Autosomal recessive 10 Autosomal dominant >37 T- B- NK- Autosomal recessive 3 T- B+ NK- Autosomal recessive 2 X-linked 1

In general, SCID infants appear normal at birth but quickly fail to thrive in the post-natal months. In humans, maternal antibody transfer occurs in utero by placental passage (Chucri et al., 2010). This is not the case with pigs (Chucri et al., 2010), but piglets acquire maternal antibodies through suckling of colostrum. Therefore, SCID piglets are not likely to show ill effects of the disease until maternal antibodies decay to a critical level, often occuring after they are weaned from the sow. At weaning, which occurs around 2 or 3 weeks of age, piglets at commercial farms are typically moved from the farrowing house, where they were born, to a nursery barn, where they are generally mixed in pens with non-littermates. This increased density of piglets increases the likelihood of becoming infected with transmissible diseases

(Vicca et al., 2002). Normal piglets are generally able to mount an appropriate immune response to combat the disease, whereas the SCID piglets discovered by Cino-Ozuna et al. (2012) can not and succumb to the infection. We could assume that production of any SCID piglets in the swine industry would be caused by recessive mutations, as autosomal dominant inheritance would require one parent to be SCID affected. The loss of piglets from SCID carrier-by-carrier matings

25 would not be desirable in the swine industry; however SCID pigs could be a valuable commodity in biomedical research settings.

SCID mice have been used extensively in biomedical research (Pearson et al., 2008); for example, aiding in human vaccine development (Koo et al., 2009), studies of human hepatitis virus infections (Chayama et al., 2011), human immunodeficiency virus infection (Denton and

García), autoimmune diseases (Morel, 2004), and human hematopoietic stem cell transplantation

(Tanner et al., 2014). Although mice provide a smaller vehicle to study human immune function, the fact remains that the immune systems of mice are more dissimilar to the human than pig immune systems (Mestas and Hughes, 2004).

After many years of dedicated work towards creating a SCID pig, transgenic technology aided in the development of several SCID pig models, a RAG1 knockout pig (Ito et al., 2014), a

RAG1/2 knockout pig (Huang et al., 2014; Lee et al., 2014) and two Il2rg knockout pigs (Suzuki et al., 2012; Watanabe et al., 2013). We (Cino-Ozuna et al., 2012) discovered naturally occuring

SCID in a line of pigs selected for residual feed intake at Iowa State University (Cai et al., 2008).

These pigs were shown to lack both B and T cells, but have NK cells (Ewen et al., 2014). The third Chapter of this thesis discusses the work that we did to discover the mutation causing this immunodeficiency in pigs.

Genome Wide Association Studies

Genome wide association studies (GWAS) are methods that assess the statistical association of genetic markers with phenotypes (Dean, 2003; Balding, 2006; Zeng et al., 2015).

The overall goal of GWAS is to identify specific genomic regions that have an effect on the trait of interest. To perform GWAS, animals with both phenotypes and genotypes are required.

26

Genotypes may be microsatellites, SNPs, or sequence based, and the nature of phenotypes may be categorical (for example, affected vs. unaffected, or white vs. brown vs. black) or quantitative, numerically continuous, measurements (for example, weight or height). The sample size required for sufficient power of detection in GWAS depends on the genetic architecture of the phenotype of interest, LD between DNA markers, and the population structure of the sample (Laurie et al.,

2010; Hong and Park, 2012). In light of the various phenotypes and species of interest in scientific research, many methods have been developed to perform GWAS (Dean, 2003;

Balding, 2006; Zeng et al., 2015).

Prior to the current SNP chip era, where platforms are available to genotype individuals for thousands to hundreds of thousands of SNPs, QTL mapping was performed using a limited number of genetic markers across the genome. Highly polymorphic microsattelite markers were used extensively for QTL mapping experiments. The first example of QTL mapping in pigs utilized a wild boar x Large White cross (Andersson et al., 1994). Many QTL mapping studies followed, resulting in the identification of just under 14,000 QTL for over 650 traits in pigs, which have been gathered and are presented on the Animal Genome website

(http://www.animalgenome.org/cgi-bin/QTLdb/SS/index). These QTL are generally not finely mapped, often spanning half of a chromosome (Georges et al., 1995; Goddard and Hayes, 2009), and are likely to provide little to no information outside of the specific breed crosses in which they were identified (Dekkers, 2012). Furthermore, these microsatellites are very expensive and difficult to genotype, and the small number of genotyped individuals reduced the power of QTL detection (Weller et al., 1990). Currently, the most common practice in agriculture is to use single nucleotide polymorphism (SNP) chips, which provide genotypes for thousands or hundreds of thousands of genetic loci covering the entire genome of the target species. As

27 described by Ramos et al. (2009), the biallelic SNPs on these widely used chips are chosen based on the amount of information the SNPs provide; SNPs should be easily genotyped, providing clear genotype results, SNPs should segregate in multiple populations, the frequency of the minor allele should not be too low, and other criteria. These criteria often exclude causative mutations from SNP chips. In order to identify causative mutations, GWAS is first used to identify genomic regions that are associated with the trait of interest. Once chromosomal locations that are associated with the trait of interest have been identified, bioinformatic analyses may be done to determine if there are annotated genes in these regions that may play a role in altering the phenotype of interest. For pigs, this can be done by searching for the associated chromosomal location in the most recent Sus scrofa genome assembly using Ensembl

(http://www.ensembl.org/Sus_scrofa/Info/Index?db=core). Finally, if desired, targeted sequencing of candidate genes in the associated chromosomal location can be performed to determine the specific causative mutation.

Several methods can be used to identify regions of the genome that are associated with a trait of interest, including Bayesian variable selection (Meuwissen et al., 2001; Fernando and

Garrick, 2013; Garrick and Fernando, 2013), in which all markers are simultaneously fitted as random effects, and single SNP methods (Balding, 2006), in which the genotype at each locus is fitted as a fixed effect one at a time. Both of these methods rely on linkage disequilibrium, which is the non-random assortment of alleles at loci, between observable markers and the causative mutations that cause phenotypic differences (Lande and Thompson, 1990; Devlin and Risch,

1995). Due to LD between SNPs on the chip, GWAS often shows a block of linked SNPs to be associated with the phenotype of interest; therefore, it may be advisable to assess the association of genomic regions with the trait of interest as opposed to each individual SNP. In Chapter 4 of

28 this thesis, we performed GWAS using two methods, single SNP and Bayesian GWAS, which are discussed further below.

Single SNP GWAS method. Single SNP association methods, as implemented in

ASReml4 (Gilmour et al., 2014), test the association of SNPs with the phenotype one at a time by fitting them as fixed effects in a linear mixed model, such as the one below:

� = �� + �!�� + �� + �� + � where y = vector of phenotypic observations, X = design matrix associating phenotypes with fixed effects, b = vector of fixed effects, Zi = matrix of SNP genotypes for SNP i, for the AA,

AB, and BB genotypes, respectively, gi = vector of fixed genotype class effects for SNP i, W = incidence matrix associating phenotypes with a random effect (for example, the pen to which an animal was randomly assigned to in a study), r = vector of random effects, V = incidence matrix associating phenotypes with random animal polygenic effects, u = vector of random animal polygenic effects with variance-covariance matrix based on the pedigree relationship matrix, and

! e = vector of residual errors assumed to be i.i.d. ~N(0,�! ). SNP genotypes are fitted as class effects in this model and results may be presented as the –log10(p-value) of the combined additive and effects at each SNP. P-values are measures of strength of association between the SNP and the phenotype of interest, and range from 0 to 1, with smaller p-values reflecting stronger association with the phenotype. The -log10 of the p-value is calculated for use in Manhattan plots so that SNPs with stronger associations (smaller p-values), have higher peaks due to larger -log10(p-value) values. The animal polygenic effect, u, is used to account for correlations between observations due to genetic relationships that are picked up by the provided pedigree (Gilmour et al., 2014).

29

In the single-SNP method, each test is carried out as if it were an independent test and, thus, it is important to perform multiple testing correction to results. The following table shows the possible outcomes when testing multiple null hypotheses (Benjamini and Hochberg, 1995), with a total number of tests equal to n.

Table 2.2. Possible outcomes when testing multiple null hypotheses, as described by Benjamini and Hochberg (1995). Hypothesis Accept Null Reject Null Total

True Null U V n0

False Null T S n - n0 Total n - R R n

A type I error occurs when we reject the null hypothesis (in single SNP GWAS, this means that we declare that the SNP has a significant effect on the phenotype of interest), but the null hypothesis is true (in single SNP GWAS, this means that the SNP does NOT have an effect on the phenotype with which the SNP was associated). In general, we determine 5% or a proportion of 0.05 false positives to be acceptable. This means that V/n0 = 0.05, which is also called α.

The problem with high dimensional data is that we are performing tens of thousands of tests, which leads to a very large number of false positives when we are controlling only at the single test level, as described above. In order to reduce the total number of false positives, we must correct the p-value of each test using a mutliple testing correction method. Several methods can be utilized to correct p-values for multiple testing. Bonferroni correction is the simplest method of controlling the family-wise error rate (FWER; Bonferroni, 1935). FWER is the probability of at least one false positive result when all n null hypotheses are true (Sham and

Purcell, 2014). The following formula is used to calculate the statistical significance threshold at a Bonferroni corrected p-value of 0.05:

0.05 � = ! �

30

where TB = the Bonferroni corrected significance threshold and n = the number of tests performed. For example, to calculate a significance threshold at a Bonferroni corrected p-value of 0.05 when 1,000 tests have been performed, TB = 0.05/1,000 = 0.00005. In this example, any p-value less than 0.00005 would be considered to be a statistically significantly association between the dependent and independent variables. Alternatively, each p-value could be corrected by multiplying the p-value by the number of tests performed. For example, a raw p-value of

0.00005 when 1,000 tests were performed can be Bonferroni corrected as follows: 0.00005 x

1,000 = 0.05. Traditional Bonferroni correction, is considered stringent, as it sets a significance threshold based on the total number of tests, assumed to be independent of one another, without consideration of correlations between tests (Sham and Purcell, 2014); in the case of single SNP

GWAS, this can be caused by LD between SNPs. Bonferroni correction for multiple testing can be modified to take correlations between tests through the use of Principal Components Analysis

(PCA; Gao et al., 2008).

As described by Gao et al. (2008), when performing single SNP GWAS on genotype data from a large number of SNPs located near one another, a test of association of one SNP with the phenotype of interest is not independent of that for a closely linked SNP. These two SNPs are in

LD with one another, resulting in the association of one SNP with the phenotype to be correlated with that of the other SNP. In this case, traditional Bonferroni correction, as described above, overcorrects the significance threshold by assuming no correlation between the tests (Sham and

Purcell, 2014). In Chapter 4 of this thesis, we performed PCA on SNP genotypes directly to determine the number of effective loci, or the number of independent tests, as described by Gao et al. (2008). In our PCA of SNP genotypes, we calculated the number of effective loci as the number of principal components required to explain 99.5% of the genetic variance (Chapter 4).

31

This number of effective loci was then used to determine the multiple test corrected significance threshold at a Bonferroni p-value equal to 0.05.

While single SNP GWAS methods require correction of results for multiple testing, other methods, such as Bayesian variable selection, do not. Bayesian methods applied to GWAS are described in the following.

Bayesian variable selection. Bayesian methods are useful when performing GWAS, especially on datasets containing a greater number of marker genotypes than the number of animals with phenotypic information, as explained in Meuwissen et al. (2001), Fernando and

Garrick (2013), and Garrick and Fernando (2013). The following is an example model that can be used to perform Bayesian GWAS:

!

� = �� + ���!�! + � !!! where y = vector of phenotypic observations, X = incidence matrix relating phenotypes to fixed effects, b = vector of fixed effects, zi = vector of genotype covariates for SNP i, αi = allele substitution effect for SNP i, δi = indicator for whether the effect of SNP i is included (δi = 1) or excluded from (δi = 0) the model for a given iteration of the Monte Carlo Markov Chain, and e = vector of residual errors.

In the Bayesian GWAS mixed model shown above, all SNPs are simultaneously fitted as random effects in an iterative manner. Bayesian methods incorporate information from two sources, the prior information and the data itself. The user may specify a number of intial iterations that will be thrown out, called the burn-in period (Fernando and Garrick, 2008).

Following this burn-in, iterations sample from the posterior distribution of the previous iterations. If a marker is fitted in the Bayesian model and explains a larger proportion of the

32 genetic variance than other markers that were fitted, that SNP will be more likely to be included in the model in the following iterations. An important parameter that is specified for Bayesian

GWAS methods is π, which is equal to the proportion of SNPs that are expected to have no effect on the phenotype. Bayesian variable selection in GWAS assumes that a SNP either has an effect equal to 0, with probability equal to π, or has an effect that is sampled from a normal distribution, with probability equal to 1-π (Fernando and Garrick, 2013). If we set π=0.99, we are informing the Bayesian model that we expect only 1% of the SNPs in the genome to be associated with the phenotype. With this example, if our data includes genotypes for 50,000

SNPs, each iteration of Bayesian variable selection will fit approximately 500 SNPs in the model.

As discussed above, often SNPs located close to one another are in high LD. Therefore, in Bayesian methods, if two SNPs in high LD with one another are also in high LD with a causative muation, only one of the markers may be fitted into the model in each iteration. In light of this, it is important to consider the association of all SNPs within a chromosomal segment or window (Hayes et al., 2010; Sahana et al., 2010; Fernando and Garrick, 2013). This window- based approach is implemented in GenSel (Fernando and Garrick, 2008) by summing the effects of all SNPs within a 1 Megabase (Mb) window.

Several important differences between the single SNP and Bayesian GWAS methods used in this thesis are described in the following. First, in single SNP GWAS, the association of one SNP has no effect on the statistical association of another SNP, as association of each SNP is estimated in individual tests. In Bayesian GWAS models, the proportion of SNPs associated with the trait of interest is assumed known and is equal to (1-π), where the value of π is given by the user. This is related to the second difference, in that there is no prior assumption on the number

33 of SNPs that have an effect on the phenotype in single SNP methods, as there is in Bayesian

GWAS. Third, the software in which single SNP GWAS was implemented in this thesis,

ASReml4 (Gilmour et al., 2014), allows for additional random effects to be fitted in the model, whereas the Bayesian GWAS software used, GenSel (Fernando and Garrick, 2008), only allows for fixed effects to be fitted in the model beyond the random effects of SNPs.

Functional Analysis of GWAS Results

Complex traits are controlled by the action and interaction of many gene products and, therefore, may be altered by mutations in one of any of these genes. In other words, these traits are polygenic, and the action of many genes result in the observed phenotype. GWAS methods test the effects of genetic loci as independently acting components, which is often not ideal for identifying loci with small effects. This problem is compounded by the thresholds considered to be statistically significant when testing effects of over 50,000 SNPs. These small effects could be discovered through browsing each region with moderate associations in the genome (such as the

Ensembl search described above) to identify candidate genes, but this would be very time consuming, as there could be hundreds of such regions. Another approach would be to create lists of genes located near moderately associated SNPs in these regions through tools such as

BioMart (www.biomart.org), then perform gene ontology (GO) annotation enrichment analyses of the lists. Many genes are annotated with GO terms that describe their function or activity in certain processes, functions, or pathways (The Gene Ontology Consortium, 2001). This type of functional analysis narrows the candidate gene list to those contained in biologically relevant GO terms.

Several studies have used functional analysis of GWAS results to add information to

GWAS results. Serão et al. (2013) identified several GO terms that were enriched in genes near

34

SNPs involved in feed efficiency in cattle. This study also identified several genes along the

MAPK pathway that were near associated SNPs. Another study found that genes in several metabolic-related GO terms were near SNPs associated with feed efficiency in Nelore cattle, a

Bos indicus breed (Oliveira et al., 2014). This approach was also used to identify GO terms enriched in genes near SNPs shown to be associated with average daily gain in pigs (Fontanesi et al., 2014). These functional analyses of GWAS results give information about the biological function of genes near associated SNPs.

Functional analyses of GWAS results add valuable information to complex traits, such as those mentioned above, or for investigation of genomic regions associated with immune response. These traits are very polygenic, meaning that there are many genes that have generally small effects on the overall phenotype measured. Exploring overrepresentation of GO terms for genes in regions shown to be associated with the complex trait of interest may reveal biological pathways or processes that are known to have an effect on the phenotype. For example, we found that USP18 was one of the 1,390 genes within 250 kb of SNPs associated with VL in the KS06 data with P<0.003 (Chapter 4). It may not be immediately clear how the USP18 gene, which encodes an enzyme responsible for cleavage of ubiquitin from , may be involved in response to viral infection (B. Li et al., 2015). When we look at enriched GO terms, we see that

USP18 is annotated with the immune response GO term. Further investigation of this gene shows that it plays an important role in interferon signaling (Francois-Newton et al., 2012). Mounting an immune response involves a complicated network of gene products that interact with one another. Therefore, GO term enrichment, pathway analysis, or other similar methods can provide additional information to genomic analysis results. These functional analyses also greatly reduce the amount of time required for bioinformatic analyses of GWAS results, particularly for

35 complex traits with many small effects, as this would result in a large number of SNPs that have moderate associations with the phenotype of interest. Creating lists of genes located near these moderately associated SNPs and performing GO annotation enrichment analyses is less time intensive compared to browsing the genome for candidate genes in each region with moderately associated SNP.

Genomic Prediction and Genomic Selection

The overall goal of animal breeding is the production of animals with superior performance for economically valuable traits. Genetic progress, or the increase in the genetic value of animals from one generation to the next, was the focus of animal breeding programs long before breeders had the ability to observe genetic markers. Prior to genotyping capabilities, superior animals were selected to produce the next generation based on phenotypic observations of the traits used as selection criteria. Phenotypic selection of a heritable trait leads to genetic improvement, but requires phenotypes for each trait used as selection criteria on selection candidates or close relatives of those candidates. For traits that are only expressed in one sex or later in life, or those that involve response to disease, obtaining phenotypic observations may take a very long time (increasing the generation interval), be very costly, or may be impossible altogether. Genetic progress for these types of traits would benefit greatly from the ability to observe and select upon genetic markers, without having phenotypic observations on selection candidates or their relatives, through genomic selection techniques. Genomic selection practices require the prediction of total genetic merit of an individual using only genotypic information for the selection candidate, which is known as genomic prediction (Meuwissen et al., 2001).

The first era of genetic selection in animals used blood group markers as a phenotypic indicator of the inheritance of the percentage of fat in milk (Neimann-Sorensen and Robertson,

36

1961; Rendel, 1961). The simple inheritance pattern of blood groups gave limited insight into the polygenic inheritance pattern of fat percentage and would only be useful for traits with QTL in specific chromosomal locations. Marker assisted selection (MAS) was the second era of direct selection on DNA markers (Smith and Simpson, 1986; Lande and Thompson, 1990). MAS relied on relatively few DNA markers associated with QTL with large effects, and were generally based on tests developed using data from specific breed crosses (Andersson, 2001).

The theory of genomic selection was introduced by Meuwissen, Hayes, and Goddard

(2001), interestingly before dense genome-wide genotyping technologies were even available.

The invention of high throughput genotyping arrays, known as SNP chips, gives animal breeders an opportunity to realize large scale, genomic prediction of the genetic merit of selection candidates. This technology increases the rate of genetic gain acheivable in herds by impacting two components of the breeder’s equation described by Falconer and Mackay (1996). The breeder’s equation is used to predict the rate of genetic gain in the trait being selected upon, as shown below:

��� ∆� = ! � where ΔG is the rate of genetic change per unit of time, i is the intensity of selection, r is the accuracy of selection, σA is the additive genetic standard deviation in the pool of selection candidates, and L is the generation interval, or the average age of a parent when their offspring are born. Genomic selection can increase the rate of genetic change (ΔG) in several ways: 1) increasing the accuracy of selection, 2) decreasing the generation interval, and 3) increasing the genetic standard deviation. Selection accuracy can be defined as the correlation between the estimated breeding value and the true breeding value of each selection candidate. Addition of molecular breeding values through genotypic information provides information on Mendelian

37 segregation of alleles among full-siblings, which can not be calculated using only pedigree for breeding value estimation. Generation interval reduction may have a greater contribution to increasing the rate of genetic progress in some species, such as dairy cattle, where young bulls with high genomic values are selected before extensive progeny data is available (Schaeffer,

2006). In pigs, the contribution of generation interval reduction may be little to none, as the generation interval is already quite small (Wellmann et al., 2013). The use of genomic information allows distinction between the genetic value of full-siblings, reducing within-family selection (Lillehammer et al., 2011), which in turn may lead to reduced inbreeding and less of a decrease in the genetic variation available for selection.

To predict the genetic merit of a selection candidate using only the candidate’s genotypic information, the phenotype for that particular animal is not necessary, but phenotypes must be collected on animals in the training population, or close relatives of those animals in the training population. The training population contains animals for which both phenotypic and genotypic information is available. SNPs for which genotypic information is available on the training population must be the same SNPs for which genotypic information is observed on animals in the validation population, or the selection candidates. Phenotypes and genotypes for animals in the training population are used to estimate SNP effects that may then be used to predict the genetic value of animals in the validation population.

The basis of prediction of an animal’s genetic merit is a mixed linear model (Henderson,

1975), such as the one that follows:

� = �� + �� + � + � where y = vector of phenotypic observations, X = design matrix associating phenotypes with fixed effects, b = vector of fixed effects, W = incidence matrix associating phenotypes with

38 random effects, r = vector of random effects (for example, the pen to which an animal was randomly assigned in a study), u = vector of random animal polygenic effects with variance- covariance matrix based on the pedigree relationship matrix, and e = vector of residual errors. In this equation, the genetic value of an animal for the trait observed in y is predicted using only pedigree information and is given in u; these values are the best linear unbiased predictions

(BLUP). If the aim is to predict the genetic merit of an individual that does not have phenotypic information for the trait being selected upon, this individual should be in the pedigree and should have relatives with phenotypic observations for this trait. EBVs for all animals in the pedigree, including those without phenotypes for the trait being selected upon, are predicted in BLUP.

In the case of MAS, the linear mixed model given above can be expanded upon, as follows:

� = �� + �� + �� + � + � where Z is a matrix of genotypes at the marker and g is a vector of effects for each allele for the given marker. This equation can also be used when incorporating multiple markers in the prediction equation.

Bayesian models may also be used in genomic prediction (Fernando and Garrick, 2008); the general Bayesian model used for genomic training is the same model provided for Bayesian

GWAS methods above, with the genomic EBV (GEBV) calculated with the following equation:

!

����! = ���! !!! where the GEBV for individual i is equal to the sum of zj = vector of genotype covariates for SNP j and �j = allele substitution effect for SNP j across all k SNPs. Meuwissen et al. (2001) used simulations to show that Bayesian methods predicted GEBV with higher accuracy than a least

39 squares model, and the least squares model was unable to estimate effects of a large number of markers.

A prime example of the application of genomic selection can be found in Wolc et al.

(2015). This study described simulations and real data for 16 traits comparing genomic selection of layer chickens to traditional pedigree-based selection. Simulations showed that genomic selection would increase the selection response and decrease the amount of inbreeding compared to pedigree-based selection. Experimental results showed that on average, the genomic selection line significantly outperformed the pedigree-based selection line. Furthermore, this study demonstrated that retraining the allele effect estimates increased the accuracy of genomic prediction in future generations.

Literature Cited

Albina, E. 1997. Epidemiology of porcine reproductive and respiratory syndrome (PRRS): An overview. Vet. Microbiol. 55:309–316.

Andersson, L., C. Haley, H. Ellegren, S. Knott, M. Johansson, K. Andersson, L. Andersson- Eklund, I. Edfors-Lilja, M. Fredholm, I. Hansson, and al. et. 1994. Genetic mapping of quantitative trait loci for growth and fatness in pigs. Science (80-. ). 263:1771–1774.

Andersson, L. 2001. Genetic dissection of phenotypic diversity in farm animals. Nat. Rev. Genet. 2:130–8.

Balding, D. J. 2006. A tutorial on statistical methods for population association studies. Nat. Rev. Genet. 7:781–91.

Barber, M. R. W., J. R. Aldridge Jr., R. G. Webster, and K. E. Magor. 2010. Association of RIG- I with innate immunity of ducks to influenza. Proc. Natl. Sci. 107(13):5913-5918.

Bautista, E. M., and T. W. Molitor. 1997. Cell-mediated immunity to porcine reproductive and respiratory syndrome virus in swine. Viral Immunol. 10:83–94.

Benfield, D. A., E. Nelson, J. E. Collins, L. Harris, S. M. Goyal, D. Robison, W. T. Christianson, R. B. Morrison, D. Gorcyca, and D. Chladek. 1992. Characterization of swine infertility and respiratory syndrome (SIRS) virus (isolate ATCC VR-2332). J. Vet. Diagn. Invest. 4:127–33.

40

Benjamini, Y., and Y. Hochberg. 1995. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J. R. Stat. Soc. Ser. B 57:289–300.

Bernoco, D., and E. Bailey. 1998. Frequency of the SCID gene among Arabian horses in the USA. Anim. Genet. 29:41–2.

Biffani, S., S. Botti, a. Caprera, E. Giuffra, and a. Stella. 2010. Genetic Susceptibility to Porcine Reproductive and Respiratory Syndrome (PRRS) virus in commercial pigs in Italy. Proc. Ninth World Congr. Genet. Appl. to Livest. Prod.:592–595.

Boddicker, N. J., A. Bjorkquist, R. R. R. Rowland, J. K. Lunney, J. M. Reecy, and J. C. Dekkers. 2014a. Genome-wide association and genomic prediction for host response to porcine reproductive and respiratory syndrome virus infection. Genet Sel Evol 46:18.

Boddicker, N. J., D. J. Garrick, R. R. R. Rowland, J. K. Lunney, J. M. Reecy, and J. C. M. Dekkers. 2014b. Validation and further characterization of a major quantitative trait locus associated with host response to experimental infection with porcine reproductive and respiratory syndrome virus. Anim. Genet. 45:48–58.

Boddicker, N., E. H. Waide, R. R. R. Rowland, J. K. Lunney, D. J. Garrick, J. M. Reecy, and J. C. M. Dekkers. 2012. Evidence for a major QTL associated with host response to Porcine reproductive and respiratory syndrome virus challenge. J. Anim. Sci. 90:1733–1746.

Bosma, G. C., R. P. Custer, and M. J. Bosma. 1983. A severe combined immunodeficiency mutation in the mouse. Nature 301:527–30.

Van der Burg, M., and A. R. Gennery. 2011. Educational paper. Eur. J. Pediatr. 170:561–571.

Cai, W., D. S. Casey, and J. C. M. Dekkers. 2008. Selection response and genetic parameters for residual feed intake in Yorkshire swine. J. Anim. Sci. 86:287–298.

Calvert, J. G., D. E. Slade, S. L. Shields, R. Jolie, R. M. Mannan, R. G. Ankenbauer, and S.-K. W. Welch. 2007. CD163 expression confers susceptibility to porcine reproductive and respiratory syndrome viruses. J. Virol. 81:7371–7379.

Carrero, J. A. 2013. Confounding roles for type I interferons during bacterial and viral pathogenesis. Int. Immunol. 25:663–9.

Cecere, T. E., S. M. Todd, and T. Leroith. 2012. Regulatory T cells in arterivirus and coronavirus infections: do they protect against disease or enhance it? Viruses 4:833–46.

Chayama, K., C. N. Hayes, N. Hiraga, H. Abe, M. Tsuge, and M. Imamura. 2011. Animal model for study of human hepatitis viruses. J. Gastroenterol. Hepatol. 26:13–8.

Chen, N., J. C. M. Dekkers, C. L. Ewen, and R. R. R. Rowland. 2015. Porcine reproductive and respiratory syndrome virus replication and quasispecies evolution in pigs that lack adaptive immunity. Virus Res. 195:246–9.

41

Chucri, T. M., J. M. Monteiro, A. R. Lima, M. L. B. Salvadori, J. R. K. Junior, and M. A. Miglino. 2010. A review of immune transfer by the placenta. J. Reprod. Immunol. 87:14–20.

Cino-Ozuna, A., R. Rowland, J. Nietfeld, M. Kerrigan, J. Dekkers, and C. Wyatt. 2012. Preliminary Findings of a Previously Unrecognized Porcine Primary Immunodeficiency Disorder. Vet. Pathol. 50:144–146.

Conzelmann, K. K., N. Visser, P. Van Woensel, and H. J. Thiel. 1993. Molecular characterization of porcine reproductive and respiratory syndrome virus, a member of the arterivirus group. Virology 193:329–339.

Corzo, C. A., E. Mondaca, S. Wayne, M. Torremorell, S. Dee, P. Davies, and R. B. Morrison. 2010. Control and elimination of porcine reproductive and respiratory syndrome virus. Virus Res. 154:185–192.

Cossu, F. 2010. Genetics of SCID. Ital. J. Pediatr. 36:76.

Couper, K. N., D. G. Blount, and E. M. Riley. 2008. IL-10: the master regulator of immunity to infection. J. Immunol. 180(9):5771-5777.

Darwich, L., I. Díaz, and E. Mateu. 2010. Certainties, doubts and hypotheses in porcine reproductive and respiratory syndrome virus immunobiology. Virus Res. 154:123–132.

Dean, M. 2003. Approaches to identify genes for complex human diseases: lessons from Mendelian disorders. Hum. Mutat. 22:261–74.

Dee, S. A., and H. S. Joo. 1994. Prevention of the spread of porcine reproductive and respiratory syndrome virus in endemically infected pig herds by nursery depopulation. Vet. Rec. 135:6–9.

Dee, S. A., and H. Joo. 1997. Strategies to control PRRS: a summary of field and research experiences. Vet. Microbiol. 55:347–53.

Dekkers, J. 2012. Application of Genomics Tools to Animal Breeding. Curr. Genomics 13:207– 212.

Denton, P. W., and J. V. García. Humanized mouse models of HIV infection. AIDS Rev. 13:135–48.

Devlin, B., and N. Risch. 1995. A Comparison of Linkage Disequilibrium Measures for Fine- Scale Mapping. Genomics 29:311–322.

Díaz, I., L. Darwich, G. Pappaterra, J. Pujols, and E. Mateu. 2005. Immune responses of pigs after experimental infection with a European strain of Porcine reproductive and respiratory syndrome virus. J. Gen. Virol. 86:1943–1951.

Díaz, I., L. Darwich, G. Pappaterra, J. Pujols, and E. Mateu. 2006. Different European-type vaccines against porcine reproductive and respiratory syndrome virus have different immunological properties and confer different protection to pigs. Virology 351:249–259.

42

Do, T. D., C. Park, K. Choi, J. Jeong, M. K. Vo, T. T. Nguyen, and C. Chae. 2015. Comparison of pathogenicity of highly pathogenic porcine reproductive and respiratory syndrome virus between wild and domestic pigs. Vet. Res. Commun. 39:79–85.

Dotti, S., G. Guadagnini, F. Salvini, E. Razzuoli, M. Ferrari, G. L. Alborali, and M. Amadori. 2013. Time-course of antibody and cell-mediated immune responses to Porcine Reproductive and Respiratory Syndrome virus under field conditions. Res. Vet. Sci. 94:510–517.

Dotti, S., R. Villa, E. Sossi, G. Guadagnini, F. Salvini, M. Ferrari, and M. Amadori. 2011. Comparative evaluation of PRRS virus infection in vaccinated and naïve pigs. Res. Vet. Sci. 90:218–225.

Duan, X., H. J. Nauwynck, and M. B. Pensaert. 1997. Virus quantification and identification of cellular targets in the lungs and lymphoid tissues of pigs at different time intervals after inoculation with porcine reproductive and respiratory syndrome virus (PRRSV). Vet. Microbiol. 56:9–19.

Ewen, C., A. Cino-Ozuna, H. He, M. Kerrigan, J. Dekkers, C. Tuggle, R. Rowland, and C. Wyatt. 2014. Analysis of blood leukocytes in a naturally occurring immunodeficiency of pigs shows the defect is localized to B and T cells. Vet. Immunol. Immunopathol. 162:174–179.

Falconer, D., and T. Mackay. 1996. Introduction to quantitative genetics. 4th ed. Longmans Green, Harlow, Essex, UK.

Fang, Y., P. Schneider, W. P. Zhang, K. S. Faaberg, E. a. Nelson, and R. R. R. Rowland. 2007. Diversity and evolution of a newly emerged North American Type 1 porcine arterivirus: Analysis of isolates collected between 1999 and 2004. Arch. Virol. 152:1009–1017.

Fernando, R. L., and D. J. Garrick. 2008. GenSel-User manual for a portfolio of genomic selection related analyses. Anim. Breed. Genet. Iowa State Univ. Ames:0–24.

Fernando, R. L., and D. Garrick. 2013. Bayesian methods applied to GWAS. Methods Mol. Biol. 1019:237–74.

Fontanesi, L., G. Schiavo, G. Galimberti, D. G. Calò, and V. Russo. 2014. A genomewide association study for average daily gain in Italian large white pigs. J. Anim. Sci. 92:1385–1394.

Francois-Newton, V., M. Livingstone, B. Payelle-Brogard, G. Uzé, and S. Pellegrini. 2012. USP18 establishes the transcriptional and anti-proliferative interferon α/β differential. Biochem. J. 446:509–16.

Garrick, D. J., and R. L. Fernando. 2013. Implementing a QTL detection study (GWAS) using genomic prediction methodology. Methods Mol. Biol. 1019:275–298.

Georges, M., D. Nielsen, M. Mackinnon, A. Mishra, R. Okimoto, A. T. Pasquino, L. S. Sargeant, A. Sorensen, M. R. Steele, and X. Zhao. 1995. Mapping quantitative trait loci controlling milk production in dairy cattle by exploiting progeny testing. Genetics 139:907–20.

43

Gilmour, A. R., B. J. Gogel, B. R. Cullis, and R. Thompson. 2014. ASReml User Guide.

Glass, E. J., R. Baxter, R. J. Leach, and O. C. Jann. 2012. Genes controlling vaccine responses and disease resistance to respiratory viral pathogens in cattle. Vet. Immunol. Immunopathol. 148:90–99.

Goddard, M. E., and B. J. Hayes. 2009. Mapping genes for complex traits in domestic animals and their use in breeding programmes. Nat. Rev. Genet. 10:381–91.

Gu, W., L. Guo, H. Yu, J. Niu, M. Huang, X. Luo, R. Li, Z. Tian, L. Feng, and Y. Wang. 2015. Involvement of CD16 in antibody-dependent enhancement of porcine reproductive and respiratory syndrome virus infection. J. Gen. Virol. 96:1712–1722.

Halbur, P., M. F. Rothschild, B. Thacker, X.-J. Meng, P. Paul, and J. Bruna. 1998. Differences in susceptibility of Duroc , Hampshire , and Meishan reproductive and respiratory syndrome virus ( PRRSV ). 115:181–189.

Hayes, B. J., J. Pryce, A. J. Chamberlain, P. J. Bowman, and M. E. Goddard. 2010. Genetic architecture of complex traits and accuracy of genomic prediction: coat colour, milk-fat percentage, and type in Holstein cattle as contrasting model traits. PLoS Genet. 6:e1001139.

Henderson, C. R. 1975. Best linear unbiased estimation and prediction under a selection model. Biometrics 31:423–447.

Hess, A. S., N. J. Boddicker, R. R. R. Rowland, J. K. Lunney, G. S. Plastow, and J. C. M. Dekkers. 2014. Genetic Parameters and Effects for a Major QTL of Piglets Experimentally Infected with a Second Porcine Reproductive and Respiratory Syndrome Virus. In: The Tenth World Congress of Genetics Applied to Livestock Production. Vancouver, BC, Canada.

Hess, A. S., N. J. Boddicker, R. R. R. Rowland, J. K. Lunney, G. S. Plastow, and J. C. M. Dekkers. 2015. A comparison of genetic parameters and effects for a major QTL between piglets infected with one of two isolates of porcine reproductive and respiratory syndrome virus. In: North American PRRS Symposium. Chicago, Illinois, USA.

Hong, E. P., and J. W. Park. 2012. Sample size and statistical power calculation in genetic association studies. Genomics Inform. 10:117–22.

Hu, J., and C. Zhang. 2014. Porcine reproductive and respiratory syndrome virus vaccines: current status and strategies to a universal vaccine. Transbound. Emerg. Dis. 61:109–20.

Huang, J., X. Guo, N. Fan, J. Song, B. Zhao, Z. Ouyang, Z. Liu, Y. Zhao, Q. Yan, X. Yi, A. Schambach, J. Frampton, M. a Esteban, D. Yang, H. Yang, and L. Lai. 2014. RAG1/2 Knockout Pigs with Severe Combined Immunodeficiency. J. Immunol. 193:1496–503.

Ito, T., Y. Sendai, S. Yamazaki, M. Seki-Soma, K. Hirose, M. Watanabe, K. Fukawa, and H. Nakauchi. 2014. Generation of recombination activating gene-1-deficient neonatal piglets: a model of T and B cell deficient severe combined immune deficiency. PLoS One 9:e113833.

44

Keffaber, K. K. 1989. Reproductive failure of unknown etiology. Am. Assoc. Swine Practitioners Newslett. 1:1-10.

Kimman, T. G., L. a Cornelissen, R. J. Moormann, J. M. J. Rebel, and N. Stockhofe-Zurwieden. 2009. Challenges for porcine reproductive and respiratory syndrome virus (PRRSV) vaccinology. Vaccine 27:3704–3718.

Koo, G. C., A. Hasan, and R. J. O’Reilly. 2009. Use of humanized severe combined immunodeficient mice for human vaccine development. Expert Rev. Vaccines 8:113–20.

Lande, R., and R. Thompson. 1990. Efficiency of marker-assisted selection in the improvement of quantitative traits. Genetics 124:743–756.

Laurie, C. C., K. F. Doheny, D. B. Mirel, E. W. Pugh, L. J. Bierut, T. Bhangale, F. Boehm, N. E. Caporaso, M. C. Cornelis, H. J. Edenberg, S. B. Gabriel, E. L. Harris, F. B. Hu, K. B. Jacobs, P. Kraft, M. T. Landi, T. Lumley, T. A. Manolio, C. McHugh, I. Painter, J. Paschall, J. P. Rice, K. M. Rice, X. Zheng, and B. S. Weir. 2010. Quality control and quality assurance in genotypic data for genome-wide association studies. Genet. Epidemiol. 34:591–602.

Lee, K., D.-N. Kwon, T. Ezashi, Y.-J. Choi, C. Park, A. C. Ericsson, A. N. Brown, M. S. Samuel, K.-W. Park, E. M. Walters, D. Y. Kim, J.-H. Kim, C. L. Franklin, C. N. Murphy, R. M. Roberts, R. S. Prather, and J.-H. Kim. 2014. Engraftment of human iPS cells and allogeneic porcine cells into pigs with inactivated RAG2 and accompanying severe combined immunodeficiency. Proc. Natl. Acad. Sci. U. S. A. 111:7260–7265.

Lewis, C. R. G., T. Ait-Ali, M. Clapperton, A. L. Archibald, and S. Bishop. 2007. Genetic perspectives on host responses to porcine reproductive and respiratory syndrome (PRRS). Viral Immunol. 20:343–358.

Lewis, C. R. G., M. Torremorell, L. Galina-Pantoja, and S. C. Bishop. 2009. Genetic parameters for performance traits in commercial sows estimated before and after an outbreak of porcine reproductive and respiratory syndrome. J. Anim. Sci. 87:876–884.

Li, B., L. Du, X. Xu, B. Sun, Z. Yu, Z. Feng, M. Liu, Y. Wei, H. Wang, G. Shao, and K. He. 2015. Transcription analysis on response of porcine alveolar macrophages to co-infection of the highly pathogenic porcine reproductive and respiratory syndrome virus and Mycoplasma hyopneumoniae. Virus Res. 196:60–69.

Li, X., A. Galliher-Beckley, L. Pappan, B. Trible, M. Kerrigan, A. Beck, R. Hesse, F. Blecha, J. C. Nietfeld, R. R. Rowland, and J. Shi. 2014. Comparison of host immune responses to homologous and heterologous type II porcine reproductive and respiratory syndrome virus (PRRSV) challenge in vaccinated and unvaccinated pigs. Biomed Res. Int. 2014.

Lillehammer, M., T. H. E. Meuwissen, and A. K. Sonesson. 2011. Genomic selection for maternal traits in pigs. J. Anim. Sci. 89:3908–16.

Lipoldova, M. and P. Demant. 2006. Genetic susceptibility to infectious disease: lessons from mouse models of leishmaniasis. Nat. Rev. Genet. 7(4):294-305.

45

Lopez, O. J., and F. A. Osorio. 2004. Role of neutralizing antibodies in PRRSV protective immunity. Vet. Immunol. Immunopathol. 102:155–163.

Loving, C. L., F. A. Osorio, M. P. Murtaugh, and F. A. Zuckermann. 2015. Innate and adaptive immunity against Porcine Reproductive and Respiratory Syndrome Virus. Vet. Immunol. Immunopathol. 167:1–14.

Lunney, J. K., J. P. Steibel, J. M. Reecy, E. Fritz, M. F. Rothschild, M. Kerrigan, B. Trible, and R. R. Rowland. 2011. Probing genetic control of swine responses to PRRSV infection: current progress of the PRRS host genetics consortium. BMC Proc. 5 Suppl 4:S30.

Mateu, E., and I. Diaz. 2008. The challenge of PRRS immunology. Vet. J. 177:345–351.

Meek, K., L. Kienker, C. Dallas, W. Wang, M. J. Dark, P. J. Venta, M. L. Huie, R. Hirschhorn, and T. Bell. 2001. SCID in Jack Russell terriers: a new animal model of DNA-PKcs deficiency. J. Immunol. 167:2142–2150.

Meier, W. A., J. Galeota, F. A. Osorio, R. J. Husmann, W. M. Schnitzlein, and F. A. Zuckermann. 2003. Gradual development of the interferon-gamma response of swine to porcine reproductive and respiratory syndrome virus infection or vaccination. Virology 309:18–31.

Mestas, J., and C. C. W. Hughes. 2004. Of mice and not men: differences between mouse and human immunology. J. Immunol. 172:2731–2738.

Meuwissen, T. H. E., B. J. Hayes, and M. E. Goddard. 2001. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157:1819–1829.

Molina, R. M., S. H. Cha, W. Chittick, S. Lawson, M. P. Murtaugh, E. a. Nelson, J. Christopher- Hennings, K. J. Yoon, R. Evans, R. R. R. Rowland, and J. J. Zimmerman. 2008. Immune response against porcine reproductive and respiratory syndrome virus during acute and chronic infection. Vet. Immunol. Immunopathol. 126:283–292.

Morel, L. 2004. Mouse models of human autoimmune diseases: Essential tools that require the proper controls. PLoS Biol. 2:2–5.

Mu, C. L., X. Y. Lu, E. Z. Duan, J. Chen, W. J. Li, F. H. Zhang, D. P. Martin, M. F. Yang, P. A Xia, and B. A Cui. 2013. Molecular evolution of porcine reproductive and respiratory syndrome virus isolates from central China. Res Vet Sci 95:908–912.

Murtaugh, M. P., T. Stadejek, J. E. Abrahante, T. T. Y. Lam, and F. C. C. Leung. 2010. The ever-expanding diversity of porcine reproductive and respiratory syndrome virus. Virus Res. 154:18–30.

Murtaugh, M. P., Z. Xiao, and F. Zuckermann. 2002. Immunological responses of swine to porcine reproductive and respiratory syndrome virus infection. Viral Immunol. 15:533–547.

46

Neimann-Sorensen, A., and A. Robertson. 1961. The Association between Blood Groups and Several Production Characteristics in Three Danish Cattle Breeds. Acta Agric. Scand. 11:163– 196.

Nelson, E. A, J. Christopher-Hennings, and D. a Benfield. 1994. Serum immune responses to the proteins of porcine reproductive and respiratory syndrome (PRRS) virus. J. Vet. Diagn. Invest. 6:410–415.

Novelli, F., and J.-L. Casanova. 2004. The role of IL-12, IL-23 and IFN-gamma in immunity to viruses. Cytokine Growth Factor Rev. 15:367–77.

Oliveira, P., A. Cesar, M. Nascimento, A. Chaves, P. Tizioto, T. Rr, L. Dpd, and R. L. Rosa AN, Sonstegard T, Mourão GB, Reecy JM, Garrick DJ, Mudadu MA, Coutinho LL. 2014. Identification of genomic regions associated with feed efficiency in Nelore cattle. :1–10.

Osorio, F. A, J. A Galeota, E. Nelson, B. Brodersen, a Doster, R. Wills, F. Zuckermann, and W. W. Laegreid. 2002. Passive transfer of virus-specific antibodies confers protection against reproductive failure induced by a virulent strain of porcine reproductive and respiratory syndrome virus and establishes sterilizing immunity. Virology 302:9–20.

Park, C., H. W. Seo, K. Han, I. Kang, and C. Chae. 2014. Evaluation of the efficacy of a new modified live porcine reproductive and respiratory syndrome virus (PRRSV) vaccine (Fostera PRRS) against heterologous PRRSV challenge. Vet. Microbiol. 172:432–42.

Pastoret, S., H. Ameels, F. Bossiroy, a. Decreux, F. De Longueville, A. Thomas, and D. Desmecht. 2012. Detection of disease resistance and susceptibility alleles in pigs using oligonucleotide microarray hybridization. J. Vet. Diagnostic Investig. 24:479–488.

Patton, J. B., R. R. Rowland, D. Yoo, and K.-O. Chang. 2009. Modulation of CD163 receptor expression and replication of porcine reproductive and respiratory syndrome virus in porcine macrophages. Virus Res. 140:161–171.

Pearson, T., D. L. Greiner, and L. D. Shultz. 2008. Humanized SCID mouse models for biomedical research. Curr. Top. Microbiol. Immunol. 324:25–51.

Perryman, L. E. 2004. Molecular Pathology of Severe Combined Immunodeficiency in Mice, Horses, and Dogs. Vet. Pathol. 41:95–100.

Petry, D. B., J. W. Holl, J. S. Weber, a. R. Doster, F. A. Osorio, and R. K. Johnson. 2005. Biological responses to porcine respiratory and reproductive syndrome virus in pigs of two genetic populations. J. Anim. Sci. 83:1494–1502.

Ramos, A. M., R. P. M. A. Crooijmans, N. A. Affara, A. J. Amaral, L. Alan, J. E. Beever, C. Bendixen, C. Churcher, R. Clark, P. Dehais, M. S. Hansen, J. Hedegaard, Z. Hu, H. H. Kerstens, A. S. Law, D. Milan, D. J. Nonneman, G. A. Rohrer, M. F. Rothschild, and T. P. L. Smith. 2009. Design of a High Density SNP Genotyping Assay in the Pig Using SNPs Identified and Characterized by Next Generation Sequencing Technology. 4:e6524.

47

Rendel, J. 1961. Relationship between blood groups nd the fat percentage of the milk in cattle. Nature 189:408–409.

Renukaradhya, G. J., X.-J. Meng, J. G. Calvert, M. Roof, and K. M. Lager. 2015. Live porcine reproductive and respiratory syndrome virus vaccines: Current status and future direction. Vaccine 33:4069–80.

Rowland, R. R. R., B. Robinson, J. Stefanick, T. S. Kim, L. Guanghua, S. R. Lawson, and D. A. Benfield. 2001. Inhibition of porcine reproductive and respiratory syndrome virus by interferon- gamma and recovery of virus replication with 2-aminopurine. Arch. Virol. 146:539–555.

Sahana, G., B. Guldbrandtsen, L. Janss, and M. S. Lund. 2010. Comparison of association mapping methods in a complex pedigreed population. Genet. Epidemiol. 34:455–62.

Schaeffer, L. R. 2006. Strategy for applying genome-wide selection in dairy cattle. J. Anim. Breed. Genet. 123:218–223.

Serão, N., O. Matika, R. A. Kemp, J. C. S. Harding, S. C. Bishop, G. S. Plastow, and J. C. M. Dekkers. 2014. Genetic analysis of reproductive traits and antibody response in a PRRS outbreak herd. J. Anim. 92:2905–2921.

Serão, N. V., D. González-Peña, J. E. Beever, D. B. Faulkner, B. R. Southey, and S. L. Rodriguez-Zas. 2013. Single nucleotide polymorphisms and haplotypes associated with feed efficiency in beef cattle. BMC Genet. 14:94.

Shi, M., T. T. Y. Lam, C. C. Hon, R. K. H. Hui, K. S. Faaberg, T. Wennblom, M. P. Murtaugh, T. Stadejek, and F. C. C. Leung. 2010. Molecular epidemiology of PRRSV: A phylogenetic perspective. Virus Res. 154:7–17.

Smith, C., and S. P. Simpson. 1986. The use of genetic polymorphisms in livestock improvement. J. Anim. Breed. Genet. 103:205–217.

Sun, Y., M. Han, C. Kim, J. G. Calvert, and D. Yoo. 2012. Interplay between Interferon- Mediated Innate Immunity and Porcine Reproductive and Respiratory Syndrome Virus. Viruses 4:424–446.

Suzuki, S., M. Iwamoto, Y. Saito, D. Fuchimoto, S. Sembon, M. Suzuki, S. Mikawa, M. Hashimoto, Y. Aoki, Y. Najima, S. Takagi, N. Suzuki, E. Suzuki, M. Kubo, J. Mimuro, Y. Kashiwakura, S. Madoiwa, Y. Sakata, A. C. F. Perry, F. Ishikawa, and A. Onishi. 2012. Il2rg gene-targeted severe combined immunodeficiency pigs. Cell Stem Cell 10:753–758.

Tanner, A., S. E. Taylor, W. Decottignies, and B. K. Berges. 2014. Humanized mice as a model to study human hematopoietic stem cell transplantation. Stem Cells Dev. 23:76–82.

The Gene Ontology Consortium. 2001. Creating the Gene Ontology Resource: Design and Implementation. Genome Res. 11:1425–1433.

48

Vicca, J., D. Maes, L. Thermote, J. Peeters, F. Haesebrouck, and A. Kruif. 2002. Patterns of Mycoplasma hyopneumoniae Infections in Belgian Farrow-to-Finish Pig Herds with Diverging Disease-Course. J. Vet. Med. Ser. B 49:349–353.

Vukasinovic, N., and a C. Clutter. 2010. Analysis of Survival of Pigs Challenged with the Porcine Reproductive and Respiratory ( PRRS ) Virus. In: Proceedings of the Ninth World Congress on Genetics Applied to Livestock Production. Leipzig, Germany. p. 647–650.

Wang, S. J., W. J. Liu, L. G. Yang, C. A. Sargent, H. B. Liu, C. Wang, X. D. Liu, S. H. Zhao, N. A. Affara, A. X. Liang, and S. J. Zhang. 2012. Effects of FUT1 gene mutation on resistance to infectious disease. Mol. Biol. Rep. 39:2805–2811.

Watanabe, M., K. Nakano, H. Matsunari, T. Matsuda, M. Maehara, T. Kanai, M. Kobayashi, Y. Matsumura, R. Sakai, M. Kuramoto, G. Hayashida, Y. Asano, S. Takayanagi, Y. Arai, K. Umeyama, M. Nagaya, Y. Hanazono, and H. Nagashima. 2013. Generation of Interleukin-2 Receptor Gamma Gene Knockout Pigs from Somatic Cells Genetically Modified by Zinc Finger Nuclease-Encoding mRNA. PLoS One 8:1–8.

Weller, J. I., Y. Kashi, and M. Soller. 1990. Power of daughter and granddaughter designs for determining linkage between marker loci and quantitative trait loci in dairy cattle. J. Dairy Sci. 73:2525–37.

Wellmann, R., S. Preuß, E. Tholen, J. Heinkel, K. Wimmers, and J. Bennewitz. 2013. Genomic selection using low density marker panels with application to a sire line in pigs. Genet. Sel. Evol. 45:28.

Wensvoort, G., C. Terpstra, J. M. Pol, E. a ter Laak, M. Bloemraad, E. P. de Kluyver, C. Kragten, L. van Buiten, a den Besten, and F. Wagenaar. 1991. Mystery swine disease in The Netherlands: the isolation of Lelystad virus. Vet. Q. 13:121–130.

Wiler, R., R. Leber, B. B. Moore, L. F. VanDyk, L. E. Perryman, and K. Meek. 1995. Equine severe combined immunodeficiency: a defect in V(D)J recombination and DNA-dependent protein kinase activity. Proc. Natl. Acad. Sci. U. S. A. 92:11485–11489.

Wills, R. W., A. R. Doster, J. A. Galeota, J.-H. Sur, and F. A. Osorio. 2003. Duration of infection and proportion of pigs persistently infected with porcine reproductive and respiratory syndrome virus. J. Clin. Microbiol. 41:58–62.

Wolc, A., H. H. Zhao, J. Arango, P. Settar, J. E. Fulton, N. P. O’Sullivan, R. Preisinger, C. Stricker, D. Habier, R. L. Fernando, D. J. Garrick, S. J. Lamont, and J. C. M. Dekkers. 2015. Response and inbreeding from a genomic selection experiment in layer chickens. Genet. Sel. Evol. 47:59.

Yoon, K. J., L. L. Wu, J. J. Zimmerman, H. T. Hill, and K. B. Platt. 1996. Antibody-dependent enhancement (ADE) of porcine reproductive and respiratory syndrome virus (PRRSV) infection in pigs. Viral Immunol. 9:51–63.

49

Yoon, K.-J., L.-L. Wu, J. J. Zimmerman, H. T. Hill, and K. B. Platt. 1996. Antibody-Dependent Enhancement (ADE) of Porcine Reproductive and Respiratory Syndrome Virus (PRRSV) Infection in Pigs. Viral Immunol. 9:51–63.

Yoon, K.-J., J. J. Zimmerman, S. L. Swenson, M. J. McGinley, K. a. Eernisse, a. Brevik, L. L. Rhinehart, M. L. Frey, H. T. Hill, and K. B. Platt. 1995. Characterization of the Humoral Immune Response to Porcine Reproductive and Respiratory Syndrome (PRRS) Virus Infection. J. Vet. Diagnostic Investig. 7:305–312.

Zare, Y., G. E. Shook, M. T. Collins, and B. W. Kirkpatrick. 2014. Genome-wide association analysis and genomic prediction of Mycobacterium avium subspecies paratuberculosis infection in US Jersey cattle. PLoS One 9:e88380.

Zeng, P., Y. Zhao, C. Qian, L. Zhang, R. Zhang, J. Gou, J. Liu, L. Liu, and F. Chen. 2015. Statistical analysis for genome-wide association study. J. Biomed. Res. 29:285–97.

Zimmerman, J. J., K. J. Yoon, R. W. Wills, and S. L. Swenson. 1997. General overview of PRRSV: A perspective from the United States. Vet. Microbiol. 55:187–196.

50

CHAPTER 3. NOT ALL SCID PIGS ARE CREATED EQUALLY: TWO

INDEPENDENT MUTATIONS IN THE ARTEMIS GENE CAUSE SEVERE

COMBINED IMMUNE DEFICIENCY (SCID) IN PIGS1

A paper published in the Journal of Immunology 2015, 195(7):3171-3179.

Emily H Waide2,3, Jack CM Dekkers2, Jason W Ross2, Raymond RR Rowland4, Carol R Wyatt4,

Catherine L Ewen5, Alyssa B Evans2, Dinesh M Thekkoot6, Nicholas J Boddicker6, Nick VL

Serão7, N Matthew Ellinwood2, Christopher K Tuggle2,8

Abstract

Mutations in over 30 genes are known to result in impairment of the adaptive immune system, causing a group of disorders collectively known as severe combined immunodeficiency

(SCID). SCID disorders are split into groups based on their presence and/or functionality of B, T, and NK cells. Piglets from a line of Yorkshire pigs at Iowa State University were shown to be affected by T- B- NK+ SCID, representing the first example of naturally occurring SCID in pigs.

Here, we present evidence for two spontaneous mutations as the molecular basis for this SCID

1 This work was supported by the Iowa State University Bailey Faculty Development Award and by Presidential Initiative for Interdisciplinary Research Grants funded by the Iowa State University Research Foundation. E.H.W. was a Fellow supported by U.S. Department of Agriculture/National Institute of Food and Agriculture National Needs Grant 2010- 38420-20328. 2 Department of Animal Science, Iowa State University, Ames, Iowa 3 Current address: The Seeing Eye, Inc., Morristown, New Jersey 4 College of Veterinary Medicine, Kansas State University, Manhattan, Kansas 5 STEMCELL Technologies, Vancouver, British Columbia, Canada 6 Genesus, Inc., Oakville, Manitoba, Canada 7 Department of Animal Science, North Carolina State University, Raleigh, North Carolina 8 Correspondence: 2255 Kildee Hall (phone: 515-294-4252, email: [email protected])

51 phenotype. Flow cytometry analysis of thymocytes showed an increased frequency of immature

T cells in SCID pigs. Fibroblasts from these pigs were more sensitive to ionizing radiation than non-SCID piglets, eliminating the RAG1 and RAG2 genes. Genetic and molecular analyses showed two mutations were present in the Artemis gene, which in homozygous or compound heterozygous state cause the immunodeficient phenotype. Rescue of SCID fibroblast radiosensitivity by human Artemis protein demonstrated that the identified Artemis mutations are the direct cause of this cellular phenotype. The work presented here reveals two mutations in the

Artemis gene that cause T- B- NK+ SCID in pigs. The SCID pig can be an important biomedical model, but these mutations would be undesirable in commercial pig populations. The identified mutations and associated genetic tests can be used to address both of these issues.

Introduction

Severe Combined Immunodeficiency (SCID) is a diverse group of primary immunodeficiencies that involve impaired development or function of both B and T cells

(Notarangelo, 2010). Phenotypes causing SCID can be classified based on the presence or absence of B, T, and Natural Killer (NK) cells and further categorized by the genetic defect that causes the disease. Mutations in genes necessary for V(D)J recombination at the T cell receptor

(TCR) or Immunoglobulin (IG) clusters, which are required for receptor maturation and T and B cell survival, respectively, lead to T- B- NK+ SCID (Cossu, 2010). T- B- NK+ SCID defects can be split into two groups based on sensitivity of affected cells to ionizing radiation. Lesions in the

RAG1 and RAG2 genes result in immunodeficient animals that are resistant to radiation (Schwarz et al., 1996). Cells with defects in DNA-PKcs, DNA ligase IV, Cernunnos, or Artemis are

52 radiosensitive, as they are unable to repair damage to DNA caused by ionizing radiation (De

Villartay, 2009).

SCID animals are valuable biomedical models and have been used to study many aspects of human immune function, disease progression, as well as vaccine and drug development and testing. SCID mice, in particular, have been widely used as a vehicle to study the human immune system and play important roles in pre-clinical studies of the stability and function of human stem cells and their differentiated counterparts as potential therapies (Blum and Benvenisty,

2008; Cunningham et al., 2012). In order to improve SCID mice as biomedical models, genetic engineering has been utilized to overcome some of the differences between the immune environment of mice and humans (Shultz et al., 2012). Much progress has been made in creating immunodeficient mice models used in biomedical research (Garcia and Freitas, 2012; Shultz et al., 2012) through selection of appropriate mice strains and conditioning regimens (Brehm and

Shultz, 2012). However, regardless of these advances, the translatability of rodent immunology and inflammatory studies to humans is imperfect (Mestas and Hughes, 2004; Seok et al., 2013) and other models may be beneficial (Meurens et al., 2012; Plews et al., 2013).

Anatomical, genetic, and immunological similarities underpin the usefulness of swine as a large animal model for humans (Meurens et al., 2012; Prather et al., 2013). Recently, transgenic methods have been used to create SCID pigs, including mutations in the X-linked

IL2RG gene (Suzuki et al., 2012; Watanabe et al., 2013) and mutations in RAG1 or in both RAG1 and RAG2 (Huang et al., 2014; Lee et al., 2014). Recently, our group (Cino-Ozuna et al., 2012) reported the first identification of a naturally occurring form of SCID in a line of purebred

Yorkshire pigs that had been selected for increased feed efficiency at Iowa State University (Cai et al., 2008). Necropsy of four piglets that died abnormally early in a viral challenge study

53 showed they lacked functional adaptive immune systems. In subsequent work, we (Basel et al.,

2012) demonstrated that these immunodeficient pigs failed to reject human cancer cells, and

Ewen et al. (2014) reported minimal circulating B and T cells but normal amounts of NK cells in a preliminary analysis of these SCID pigs. We present in this paper the first report of the characterization of the genetic lesions that cause the SCID phenotype in this pig population.

Quantitative real-time PCR (qRT-PCR) and flow cytometry assays confirmed that the SCID pigs lack B and T cells, but do have NK cells; we also demonstrate T cell maturation defects in the thymus. Irradiation of fibroblasts from these SCID pigs and normal controls showed that the genetic defect is in the group of genes involved in DNA damage repair. Genetic mapping, phasing of genetic markers, and targeted sequencing revealed two independent mutations in the

Artemis gene that cause SCID in this line of pigs. We confirmed that mutations in Artemis are responsible for the radiosensitivity of fibroblasts by rescuing this phenotype with expression of human Artemis in fibroblasts derived from the SCID pigs. These Artemis deficient pigs provide a large animal model to study various aspects of human disease and immune function.

Materials and Methods

All animal experiments were performed under a protocol approved by the Iowa State

University Institutional Animal Care and Use Committee (Ames, Iowa).

Flow cytometry to determine cellular phenotype

Thymic tissues were collected into ice-cold Minimal Essential Media, supplemented with

HEPES, and penicillin/streptomycin. Tissues were minced into 2x2 cm2 pieces and treated with

100 U/ml collagenase (collagenase I; Life Technologies, Grand Island, New York) in HBSS

(+calcium and magnesium; Life Technologies) for 30 min at 37°C, with gentle agitation. Cells

54 were washed in PBS, counted, and 1,000,000 cells were stained using the following fluorescently conjugated antibodies: Alexa Fluor647 conjugated anti-pig CD8a (clone 76-2-11; BD

Biosciences, San Diego, California), FITC conjugated anti-pig CD4 (clone 74-12-4; BD

Biosciences), PE-conjugated anti-pig γδ (clone MAC320; BD Biosciences), or the appropriate isotype controls (BD Biosciences), or a with FITC-conjugated mouse anti-pig CD3ε (clone

PPT3; Southern Biotech, Birmingham, Alabama). Cells were washed and resuspended in 400 µL

0.2% BSA/PBS solution prior to acquisition on an EC800 flow cytometer (Sony Biotechnology,

Champaign, Illinois). Data was analyzed using FCS Express 4 software (DeNovo Software, Los

Angeles, California).

For samples used in the homozygous h12 and h16 comparison, an aliquot of 100 µL of whole blood, collected in ACD, was transferred to a 12x75 mm polystyrene tube. Fc receptors were blocked with 10 µL of normal mouse serum (Equitech-Bio, Inc, Kerrville, Texas). Directly- conjugated antibodies to CD3ε (FITC conjugated, clone PPT3; Southern Biotech, Birmingham,

Alabama), CD16 (PE-conjugated, clone G7; Serotec, Raleigh, North Carolina), and CD21 (APC- conjugated, clone B-ly4; BD Biosciences) were added and incubated for 30 minutes. Red blood cells were lysed using Multi-species RBC Lysis Buffer (eBiosciences, San Diego, California), and then washed twice in PBS-1% BSA (Fraction V bovine serum albumin; GE Healthcare Life

Sciences). Cells were acquired on a Sony EC800 cytometer, equipped with counts/µL capabilities, and those detected within the lymphocyte gate were analyzed for the expression of

CD3ε, CD21, or CD16 lineage-specific markers.

Radiosensitivity testing of fibroblasts from SCID and non-SCID piglets

Tissue processing and fibroblast collection was performed similarly as described by Ross et al. (2010). Ear snip tissue of 20 young piglets from five litters from carrier-by-carrier matings

55 was collected into PBS with Gentamycin and stored at 4°C until processing, three to 24 hours.

Tissue was then soaked in 100% ethanol for 2-3 minutes, washed twice with PBS, then transferred to a dry dish and minced. Trypsin was added to the minced tissue and incubated at

37°C for 30 minutes. Trypsin was deactivated with cell culture medium, containing low glucose

DMEM, 15% Fetal Bovine Serum (FBS), and Gentamycin, then centrifuged for 10 minutes at

600 x g. Cell culture media was removed from tissue and replaced with fresh media and cultured in the same medium to allow fibroblasts to migrate from tissue pieces. Fibroblasts were passaged to purify the culture, then stored in freezing media made up of FBS with 10% DMSO in liquid nitrogen. Frozen cultures were thawed and cultured in cell culture medium for approximately 7 days prior to further analysis.

Radiosensitivity assay. For irradiation, 150,000 fibroblast cells were suspended in 15 mL fresh cell culture medium with four tubes per biological replicate. Samples were irradiated using a gamma irradiation machine (Mary Greeley Medical Center, Ames, Iowa) at 0, 2, 4, or 8 gamma rays (Gy). Each sample was then plated in triplicate at a density of 1,000 cells per plate and cultured for two weeks, with one media change 7 days post irradiation. After 14 days of culture, plates were washed with PBS, stained with methyl blue, and the number of individual colonies with a diameter greater than 2 mm determined by visual inspection were counted(Darroudi et al., 2007).

Statistical analysis. The average number of colonies of three replicates for each animal and radiation dose was used as observations. Percentage survival for each animal was calculated by dividing the average number of colonies at each radiation dose (2, 4, and 8 Gy) by the average number of colonies at 0 Gy irradiation for that animal. Survival percentage was analyzed using

Proc Mixed in SAS 9.3 (SAS, Inc.; Cary, North Carolina), with the fixed effects of SCID status,

56 radiation dose, and their interaction. Litter and animal within litter were fitted as random effects.

Different error variances were allowed for each radiation dose.

Data from irradiation of fibroblasts from pigs homozygous for each haplotype, and from compound heterozygous and normal pigs were also analyzed as described above, with the additional fixed effects of haplotype within SCID status and the interaction of haplotype and radiation dose.

Genome wide association analysis

Subjects. Genomic DNA from 20 SCID affected pigs, 50 unaffected littermates, their six parents, and 96 ancestors from the previous seven generations of the experimental residual feed intake (RFI) selection line at Iowa State University (Cai et al., 2008) (ISU) was isolated from tail or ear tissue using the Qiagen DNeasy Blood and Tissue Kit (Hilden, Germany). DNA samples were sent to GeneSeek Inc. (Lincoln, Nebraska) and genotyped using the Illumina Porcine

SNP60 Beadchip (Ramos et al., 2009). SCID status was determined by plotting lymphocyte against white blood cell numbers obtained from complete blood cell analysis of whole blood from each piglet (data not shown).

Statistical analysis. Initial attempts to map the genomic region harboring the causative mutation using runs of homozygosity (Drögemüller et al., 2011) in the affected piglets were unsuccessful. Associations of SNPs with the binary phenotype of affected versus unaffected

SCID status were then analyzed with the dfam, disease within family, option of Plink (Purcell et al., 2007) to determine the genomic region most strongly associated with the categorical disease phenotype. The dfam option examines differential transmission of alleles from parents to affected versus unaffected offspring. This analysis assumes homogeneity of the associated allele within each family, but not between families.

57

Haplotype analysis. Following the results from the association study, genotypes of subjects used for the GWAS for 21 SNPs located in a 1-Mb region surrounding Artemis from the

SNP60 panel were separated into haplotypes using PHASE 2.1.1(Stephens et al., 2001). This revealed two haplotypes that segregated with SCID status when present in a homozygous or compound heterozygous state. Carrier status of these haplotypes was added to the pedigree of the six parents of the SCID piglets, which showed that both could be traced back to the founders of the experimental population (Figure 3.5).

Candidate gene sequencing

Total RNA was extracted from either ear tissue or fibroblasts cultured from ear tissue using the TriZol extraction method (Gibco BRL, Life Technologies, New York). mRNA was reverse transcribed using SuperScript VILO (Invitrogen, Carlsbad, California). To examine the entire Artemis coding sequence for variation, the full coding region of the Artemis cDNA was

PCR amplified using primers in the first and last exon: forward 5’-

GGATCCGTGTTCGCCAACGCT-3’ and reverse 5’-

GCGGCCGCAGAGCTGCCTTTTAGGTTAT-3’, using an annealing temperature of 60°C and an elongation time of 2 minutes and 30 seconds. Artemis cDNA PCR products were cloned into a

TOPO vector and grown in E. coli bacteria on LB plates with Ampicillin. Individual colonies were then picked into LB-Ampicillin media and PCR amplified to ensure that each colony contained a vector with Artemis cDNA. Positive PCR products were cleaned using Exo-Sap and sent to the ISU DNA Facility for sequencing.

To examine the genomic region for sequence variation, primers were designed to amplify targeted exons of Artemis and portions of each surrounding intron. The genomic region containing exons 7 and 8 was PCR amplified using primers: forward 5’-

58

CTCAGTGGGTTAGGGACCTG-3’ and reverse 5’-GCCATCTGATAGGGTTTCCA-3’, using an annealing temperature of 54°C and an elongation time of 60 seconds. The genomic region containing exons 10 and 11 was PCR amplified using primers: forward 5’-

GCTAAAGTCCAGGCCAGTTG-3’ and reverse 5’-CAAGAGTCCCCACCAGTCTT-3’, using an annealing temperature of 56°C and an elongation time of 60 seconds. PCR products were cleaned using Exo-Sap and sent to the ISU DNA Facility for sequencing. Sequences for each haplotype were compared to those of normal littermates and to the reference sequence obtained from Sus scrofa build 10.2.

Rescue of radiosensitive phenotype

Fibroblasts for h12/h16 compound heterozygous SCID (n=2) and normal animals (n=3) from 3 litters were cultured in the same manner as described above. Fibroblasts were transfected with either 5 µg of the human Artemis plasmid (Certo et al., 2012; pExodus CMV.Artemis, which was procured through deposition to Addgene [plasmid #40211] by Andrew Scharenberg), with 3.45 µg of the empty pExodus vector construct (also deposited by A. Scharenberg.

Addgene; plasmid #39991), or shocked with no plasmid added. Cells were electroporated using electroporation media and conditions described by Ross et al. (2010). Briefly, 1,000,000 cells in

200 µL were electroporated using three, 1 ms square-wave pulses of 300 volts in a 2 mm gap cuvette using a BTX ECM 2001 electroporator. Transfected cells were incubated overnight in cell culture media. One day after transfection, fibroblasts were subjected to 4 Gy radiation and plated in triplicate as described above. Cells were cultured for 14 days, with one media change seven days post irradiation, stained with methyl blue, and colonies determined by visual inspection and counted.

59

Statistical Analysis. The average number of colonies of three replicates for each animal and plasmid was used as observations. Data were analyzed using Proc Mixed in SAS 9.3 (SAS,

Inc.; Cary, North Carolina), including the fixed effects of SCID status, plasmid, and their interaction. Litter and animal within litter were fitted as random effects. Different error variances were allowed for data from normal versus SCID animals.

Results

SCID phenotype segregates as an autosomal recessive trait in Yorkshire pig population

Weaned piglets of the eighth generation of a line of purebred Yorkshires selected for increased feed efficiency (residual feed intake; RFI) at Iowa State University (ISU; Cai et al.,

2008) were sent to Kansas State University (KSU) to undergo experimental infection with the

Porcine Reproductive and Respiratory Syndrome virus to evaluate their ability to respond to an infectious disease stressor (Dunkelberger et al., 2015). Necropsy of four of these piglets that died soon after arrival at KSU showed that they had very low numbers of circulating B and T cells, and their lymph nodes and thymus were small (Cino-Ozuna et al., 2012). Pedigree analysis of the four litters that these piglets were from showed that the six parents of these four litters were related. To explore the genetic basis of this phenotype, matings between these six parents were repeated to produce a total of 13 litters (data not shown), each of which produced at least one affected piglet, as diagnosed by numbers of white blood cells and lymphocytes based on complete blood cell (CBC) counts of whole blood; plots of lymphocyte to white blood cell counts within litter distinguished suspected SCID piglets from non-SCID littermates (data not shown). A total of 176 piglets were born, 12 of which died before their SCID status could be determined. Of the remaining 164 piglets, 31 (19%) were confirmed to have SCID based on very

60 low lymphocyte counts as measured by standard CBC or flow cytometry. This frequency was lower (p=0.07) than the expected 25% if the immunodeficiency phenotype was caused by an autosomal recessive mutation. However, it is possible that a portion of the 12 pigs with unknown

SCID phenotype in these litters were actually SCID, thus the SCID phenotype may be under- reported in this population. Based on these results, we hypothesized that the immunodeficiency phenotype was caused by an autosomal recessive mutation.

T- B- NK+ SCID pigs are defective in thymocyte maturation

Flow cytometry analysis, recently published by Ewen et al. (2014), demonstrated that concentrations of circulating leukocytes in peripheral blood of normal and affected pigs were similar among granulocytes, monocytes, and NK cell populations; however, B and T cell populations were significantly reduced in SCID piglets. Consistent findings were produced when quantitative real-time PCR (qRT-PCR) was used to measure the relative expression of B, T, and

NK cell marker genes in whole blood of SCID and non-SCID piglets; non-SCID piglets were shown to have significantly higher levels of B and T cell marker gene expression compared to

SCID littermates (p<0.01; data not shown).

To help ascertain the mechanism underlying the marked T cell paucity in the SCID pigs, thymocytes were prepared from thymic tissues and stained with CD3ε, CD8a, CD4, and γδ lineage markers to monitor thymocyte development in these animals. Flow cytometric analysis revealed that the frequency of immature CD4- CD8a- double negative thymocytes was significantly greater in the SCID pigs, 82.5% (±2.94%), compared to their non-SCID littermates,

30.4% (±1.8%; p=0.001; Figure 3.1), indicating an arrest of differentiation at or prior to TCR rearrangement (Šinkora and Butler, 2009). Furthermore, the frequencies of CD4+ CD8a+ double positive and CD4+ and CD8+ single positive cells, which are usually observed following TCR

61 rearrangement (Sinkora et al., 2000; Sinkora et al., 2007), were also significantly reduced in thymocytes derived from SCID piglets (p<0.01). These cells were almost exclusively CD3ε negative, as CD3+ cells are approximately10-fold reduced in abundance in SCID thymocytes compared to those in non-SCID (Figure 3.1C). Aberrant thymocyte development was not restricted to αβ-TCR-bearing cells, as γδ+ thymocytes were also significantly decreased in SCID affected pigs compared to non-SCID pigs (p=0.0054; Figure 3.1). These findings from flow cytometry and qRT-PCR of lymphocyte marker expression indicated that the SCID phenotype may be caused by a lesion early in the lymphocyte developmental pathway, as both B and T cells were absent or low in number, but late enough that NK cells were unaffected, resulting in T- B-

NK+ SCID.

SCID fibroblasts are sensitive to ionizing radiation

Human SCID patients with a T- B- NK+ phenotype have defects in genes in the somatic recombination pathway required for TCR and IG maturation. This involves two groups of genes; the first group includes RAG1 and RAG2, whose proteins catalyze the first step of TCR and IG maturation, creating a double-strand DNA break (DSDB) in pre-T and pre-B cells(Oettinger et al., 1990). The second group of genes repairs these breaks in the process of forming functional

TCR and IG receptors (Xu, 2006). Defects in RAG1 and RAG2 affect somatic DNA recombination, which occurs only in B and T lymphocytes. However, the ubiquitously expressed genes in the second group are involved in the non-homologous end joining (NHEJ) DNA damage repair pathway, which is important in all cells (Bassing et al., 2002). Cells with lesions in DNA damage repair genes show increased sensitivity to ionizing radiation (Dvorak and

Cowan, 2010). To determine if the genetic defect was in the first or second group of genes, we tested the ability of fibroblasts from SCID and normal pigs to repair DSDB caused by ionizing

62 radiation. Fibroblasts cultured from skin tissue of SCID and non-SCID littermates from five litters were gamma-irradiated. Fibroblast survival was significantly decreased for cells from the

SCID pigs compared to cells from normal littermates at all radiation doses tested (p<0.0001;

Figure 3.2). Even at the lowest dose of radiation, 2 gamma-rays (Gy), the proportion of fibroblasts that survived and developed colonies was significantly lower for those derived from

SCID pigs, 0.28 ± 0.03, compared to those of normal pigs, 0.73 ± 0.03 (p<0.0001). This radiosensitivity in fibroblasts suggested that the mutation causing SCID in these pigs was in one of the ubiquitously expressed genes involved in NHEJ repair of DSDB.

Genetic mapping of the locus mutated in SCID pigs

Genotypes for over 60,000 single nucleotide polymorphisms (SNP) represented on the

Illumina Porcine SNP60 panel (Ramos et al., 2009), covering the entire porcine genome, were obtained on the six carrier parents, 20 SCID affected piglets, 50 unaffected littermates of the

SCID piglets, and 96 ancestors of these animals. A genome wide association study was performed to identify the region harboring the causative mutation. We identified a 5.6 Mb region on Sus scrofa chromosome 10 that was differentially transmitted (p=2.7 x 10-7 for SNP with strongest association) to SCID affected versus unaffected piglets (Figure 3.3). This region was found to contain the porcine Artemis gene, which, when mutated in humans (Moshous et al.,

2001) or mice (Rooney et al., 2002), causes a T- B- NK+ immunodeficiency phenotype. Lesions in Artemis have also been shown to cause increased sensitivity to ionizing radiation (Li et al.,

2002).

To confirm the suspected autosomal recessive manner of transmission, a haplotype analysis was performed using 21 SNPs from the SNP60 panel in the 1-Mb region surrounding

Artemis. Results showed two distinct haplotypes (h12 and h16), which when present in a

63 compound heterozygous or homozygous recessive state, were associated with the SCID phenotype. Both of these SNP haplotypes could be traced back to the founder generation of the

RFI selection line (Cai et al., 2008), which was sourced from commercial Yorkshire populations in the Midwest region of the USA (Figure 3.4).

We obtained SNP genotypes from animals from two sources to evaluate the segregation of SCID haplotypes in commercial pigs. The National Swine Registry (NSR) provided us with

SNP60 data on 985 Yorkshire pigs, and a swine breeding company provided genotypes of over

15,000 pigs from 3 of their breeding lines. Results from phasing of the genotypes of 55 SNPs located in the 2 Mb surrounding the Artemis gene into haplotypes are presented in Appendix A.

Results showed that all analyzed datasets except for one line from the breeding company contained animals carrying haplotypes identical to the SCID affected animals from the ISU population. There were 6 animals in the NSR data that were homozygous for SCID haplotypes, which would indicate that these animals were SCID affected; this is not possible, though, as each of these animals were used for breeding purposes and SCID animals are not able to survive to breeding age. Overall, these results show that it is possible that the mutations in Artemis that cause SCID may be segregating in commercial pigs, but if they are, they are present at a low frequency.

Candidate gene sequencing identifies independent mutations in Artemis

To identify the causal mutation, the coding portion of the Artemis cDNA from fibroblast mRNA was sequenced for affected piglets and normal littermates (Figure 3.5). Selected exons and flanking portions of introns were also amplified and sequenced from genomic DNA.

Multiple examples of alternative splicing events were seen in cDNA of both SCID and normal pigs (Supplemental Figure 3.1). The most abundant transcript observed in cDNA from normal

64 pigs contained all exons present in the human Artemis cDNA, but is missing the second exon of an Artemis porcine cDNA sequence isolated from alveolar macrophages that was reported by others (NM_001258445.1). This exon was not present in any of the Artemis transcripts sequenced in this study, although it exists in the pig genome (Sus scrofa 10.2) between exon 1 and what we refer to as exon 2 and may represent a bona fide exon. However, based on BLAST analysis of sequences that are present in dbEST (17 February 2015), cDNAs with the top six alignments did not contain this exon. In addition, this exon did not align to any portion of human

Artemis cDNA sequences (NM_001033855.2). Both genome-guided transcriptome assembly and de novo transcriptome assembly using whole blood RNAseq data from 31 pigs in an unrelated project was found to contain only one Artemis transcript, which did not contain this exon (H. Liu and C.K. Tuggle, unpublished data). Based on these findings, we believe the swine reference transcript that contains this extra exon (NM_001258445.1) is a minor transcript. For clarity and consistency with the human literature, we denoted this extra exon as exon 1A and considered the second exon that is present in the transcripts observed in our study as exon 2. Translation of the longest transcript we sequenced from non-SCID pigs would produce a protein containing 712 amino acids, with 81% sequence identity to the human encoded Artemis protein.

Haplotype 16 has a lesion causing a predicted splice defect. The majority of cDNAs sequenced from normal pigs contained exons 1-14. The most complete transcript sequenced from cDNA from the h16 SCID haplotype contained all these exons except for exon 8, resulting in the loss of 141 nucleotides (Supplemental Figure 3.1). In fact, none of the cDNA sequences obtained from h16 chromosomes contained exon 8. Genomic sequencing of exon 8 and a portion of the surrounding introns revealed a splice donor site mutation in intron 8 (g.51578763 GàA; Figure

3.5b). This GàA mutation was only seen in the h16 genomic sequence, as a G nucleotide at this

65 position was seen in both the h12 and normal sequences; to distinguish h16 from h12 transcripts in compound heterozygous animals, we used a synonymous point mutation in exon 13 that we identified to be present only in h16 transcripts (position g.51587796). This GàA nucleotide conversion disrupts the signals required to correctly splice exon 8 to exon 9, which explains the observed lack of exon 8 in cDNAs expressed from h16 haplotype chromosomes. An Artemis transcript with no exon 8 would retain the normal reading frame but would be expected to produce a protein missing 47 amino acids of the predicted full length 712-amino acid Artemis protein.

Haplotype 12 has a nonsense point mutation and a predicted severely truncated protein.

Multiple examples of alternative splicing, including aberrant splicing within exons and one intron, were observed for transcripts from chromosomes that carried the h12 haplotype

(Supplemental Figure 3.1). All transcripts sequenced from homozygous h12 animals were found to lack the 137 long exon 10, which would cause a frameshift that results in a stop codon shortly after the missing exon. The predicted protein translated from transcripts missing exon 10 would be severely truncated; at 277 amino acids long, it would be < 50% of normal length. To investigate the cause of this lack of exon 10 in all h12 transcripts, exons 10 and 11 and portions of the surrounding introns were amplified from genomic DNA of homozygous h12 animals. Signal sequences required for splicing of exon 10 were identical to those seen in the sequence of normal pigs, as well as in the reference sequence (Sscrofa 10.2, www.ensembl.org/Sus_scrofa/Info/Index). However, sequence analysis of the exon 10 genomic region in homozygous h12 pigs identified a GàA nonsense point mutation at g.51584489 that changes the Tryptophan amino acid codon at position 267 to a stop codon (Figure 3.5C). Our interpretation of these data is that any transcript that contained exon 10 was not stable enough for

66 cDNA analysis to detect. Further, while not identified as part of any cDNA from homozygous h12 pigs, the predicted translation of this presumably unstable transcript containing exon 10 with the g.51584489 GàA mutation would also produce a severely truncated protein of 266 amino acids.

Animals homozygous for either mutation are not different in numbers of specific lymphocytes or in radiosensitivity

The flow cytometry and radiosensitivity assays described earlier were performed on a combination of compound heterozygous and homozygous h12 or h16 SCID animals. To determine if homozygosity for either mutation would result in altered severity of phenotypes, flow cytometry of circulating blood and assays of fibroblast irradiation were performed.

Peripheral blood from SCID piglets that were homozygous for either mutation or compound heterozygous had greatly decreased percentages of CD21+ and CD3+ cells compared to carrier littermates (p<0.01), but no differences were seen among pigs with different SCID genotypes

(p>0.10; Supplemental Figure 3.2). Fibroblasts from each SCID genotype were equally sensitive to ionizing radiation at all radiation doses (p>0.10; Supplemental Figure 3.3), and fibroblasts from each SCID genotype were significantly more sensitive than fibroblasts from normal pigs at

2 and 4 Gy (p<0.05) and tended to be more sensitive at 8 Gy (p=0.087; Supplemental Figure

3.3).

Transfection with human Artemis rescues radiosensitivity of fibroblasts

To further confirm the role of Artemis in the observed SCID phenotype, we assessed the complementation of the radiosensitivity of SCID pig fibroblasts by expression of human Artemis.

Fibroblasts cultured from compound heterozygous SCID (n=2) and normal (n=3) piglets from three litters were transfected with a human Artemis containing plasmid (pArt), with the empty

67 vector construct (pExo), or shocked without the addition of either plasmid. Cells were then subjected to 4 Gy radiation and survival measured. Addition of pArt increased the radioresistance of fibroblasts from SCID pigs (p=0.001), while cells from normal littermates were unaffected by the addition of pArt (p=0.11; Figure 3.6), and pExo had no effect on the radiosensitivity of fibroblasts of SCID or non-SCID genotypes.

Discussion

Through immunophenotyping, genetic mapping, cDNA and genomic DNA sequencing, fibroblast radiation sensitivity testing, and cDNA rescue of radiosensitivity phenotypes of fibroblasts, we provide substantial evidence that a novel T- B- NK+ immunodeficiency trait in pigs is caused by two independent mutations in the same gene that codes for a DNA repair enzyme, which is known to cause this type of SCID phenotype in human patients. We document that the identified mutations in the Artemis gene are necessary to observe a SCID phenotype in directed matings and are sufficient to cause the radiosensitivity phenotype observed in SCID fibroblasts from this population.

Novel immunodeficiency phenotype in pigs is a recessive Mendelian trait that shows defects in thymocyte maturation consistent with flawed somatic recombination

Cino Ozuno et al. (2012) reported the initial analysis of the first cases of naturally occurring SCID in pigs, which was serendipitously found within a line selected for a commercially relevant trait at ISU (Cai et al., 2008). In our expanded analysis of litters born within the pedigree of this line, a Mendelian and non-sex-linked segregation of SCID affected status in litters born to carrier-by-carrier matings was observed, indicating that the SCID phenotype was an autosomal recessive trait (Cossu, 2010). Findings from qRT-PCR of lymphocyte marker expression in whole blood showed that the SCID pigs had very low numbers

68 of B and T cells, while NK cells were present. These results suggested that the ability of B and T lymphocytes to recombine genetic material to form functional receptors was compromised in the

SCID affected piglets. Flow cytometry of thymus tissue showed an increased population of CD4-

CD8a- and γδ- T cells in SCID pigs compared to non-SCID littermates. This suggested an arrest in early thymocyte development and TCR rearrangement in the SCID pigs (Sinkora et al., 2007;

Šinkora and Butler, 2009). The increased sensitivity to ionizing radiation in fibroblasts of SCID pigs further narrowed the list of candidate genes to be evaluated for mutations to those involved in the NHEJ DNA damage repair system.

Two distinct mutations in the Artemis gene found in SCID pigs

Genomic association analysis of the population segregating the SCID phenotype pointed to a region containing the Artemis gene, while SNP genotype phasing results indicated that there were two distinct haplotypes that each carried a mutation postulated to cause the SCID phenotype. Both of these haplotypes could be traced back to the founders of this population, which suggests that these deleterious mutations may be present in commercial Yorkshire pigs in the US. This is an important implication for the swine industry, as SCID piglets thrive while suckling from their dam, but succumb to opportunistic infections after weaning. Parentage information is generally not tracked into the nursery, at least in part due to cross fostering practices, the lack of permanent individual identification and the frequent use of mixed semen from more than one boar for artificial insemination. Therefore, although the death of several littermates represents a concentrated loss for that litter, this clue to a genetic defect is not likely to be recognized on commercial farms. In addition, if the frequency of the SCID mutations is low, carrier-by-carrier matings may be rare in industry practice.

69

Sequencing of Artemis cDNA from non-SCID and SCID pigs carrying each haplotype identified multiple examples of alternative splicing. The cause for absence of specific exons in sequences of each SCID haplotype was explored using targeted genomic sequencing. A point mutation at the splice donor site for intron 8 in the h16 genomic sequence explained the lack of exon 8 in all cDNA sequences from the h16 haplotype. A compendium of 48 different causative mutations in Artemis in human SCID patients has cataloged six splice site mutations, although none involve exon 8 (Pannicke et al., 2010). Translation of the most complete h16 transcript would produce a protein of 665 amino acids, lacking a 47 amino acid long portion of the beta-

CASP region, which is required for Artemis function (Callebaut et al., 2002; Poinsignon et al.,

2004). Genomic deletions of Artemis exons 5 through 8 (Moshous et al., 2001) and exons 7 and 8

(Li et al., 2002) have been shown to cause SCID in human patients. For h12, no cDNAs were identified that contained exon 10, but the most complete cDNAs from homozygous h12 cells contained all other exons that were present in the non-SCID cDNA sequence. Because the lack of exon 10 causes a frameshift and the open reading frame stops in exon 11, the longest protein that is predicted to be encoded by the observed cDNAs from haplotype h12 is 277 amino acids long, containing only 39% of the normal protein. Interestingly, genomic sequence of the h12 haplotype identified a nonsense point mutation in exon 10, which would result in a protein of only 266 amino acids if this exon 10 were present in mRNA. This severely truncated protein would lack over half of the total amino acid sequence, including a highly conserved part of the beta-CASP motif (Callebaut et al., 2002). Therefore, in both observed cDNAs and RNA transcripts predicted from genomic sequence, h12 can encode only a greatly shortened protein, with 277 or 266 amino acids, respectively. In humans, an 8 base pair insertion in exon 14 of

Artemis that changed a cysteine to a stop codon at position 330 of the protein was also shown to

70 cause SCID (Tomashov-Matar et al., 2012). Pannicke et al. (2010) lists several small deletions causing SCID, the majority of which lead to a frameshift that introduces novel nonsense codons that cause truncations that are distal to exon 10, in exons 12, 13, and 14. This suggests that either version of the h12 haplotype protein is highly unlikely to be functional in DNA damage repair.

We thus hypothesized that the h12 mutation may cause a more severe immunodeficient phenotype than the h16 mutation, in which only 47 internal amino acids are predicted to be lost.

However, flow cytometry and fibroblast irradiation showed no phenotypic differences between

SCID pigs that were homozygous for either mutation or compound heterozygous. While single missense mutations, as well as deletions of single amino acids without a frameshift, can lead to a

SCID or Omenn syndrome phenotype in humans (Pannicke et al., 2010), examples exist for frameshift mutations producing a truncated protein with partial activity. For example, the

D451fsX10 mutation is a large truncation of >200 amino acids that retains partial nuclease activity(Huang et al., 2009). In a mouse model, the hypomorphic D451fsX10 mutation is associated with aberrant DNA joining that causes chromosomal rearrangements (Jacobs et al.,

2011). Thus, underlying cellular differences in DNA repair activity between the h12 and h16 encoded proteins may exist, and further studies should be conducted to determine the functional differences between these two mutations in pigs.

Value of Artemis SCID pigs in preclinical testing

Similarities between humans and pigs emphasize the value of the pig as a biomedical model (Meurens et al., 2012), and pigs as large animal models for many diseases are being created (Walters et al., 2012). Body, tissue and organ size make them physiologically very similar, allowing surgical and imaging techniques to translate well (Swindle, 2007). Size similarity also makes the pig a more suitable model for testing delivery of cells (i.e., heart) and

71 for measuring location/stability of transplanted cells under orthopedic stress (i.e., joint tissue).

However, the use of pig models in regenerative medicine has been minimal, due to the heretofore lack of a porcine immuno-compromised model that could be used as a xenogenic transplant model. As recently demonstrated for human cancer cells, the SCID pigs we describe here cannot reject a xenograft (Basel et al., 2012). Immunodeficient porcine models have long been sought, as is evident by the recently reported transgenic SCID pigs with targeted IL2RG (Suzuki et al.,

2012; Watanabe et al., 2013) and RAG1 and/or RAG2 mutations (Huang et al., 2014; Lee et al.,

2014). Here, we have described two naturally occurring genetic defects in Artemis that cause

SCID in pigs. These porcine models will be valuable in studies involving cellular and organ transplantation, human cancer progression and treatment, vaccine development, and many other immunological questions. Specifically, defects in Artemis (as well as all genes encoding proteins needed for VDJ recombination) have been shown to be associated with poor outcome of stem cell transplantation in humans (reviewed in Cossu et al., 2010). In the case of Artemis, affected patients have substantially higher risk of late toxicity following hematopoietic cell transplants

(HCT) as compared to patients with RAG lesions (Schuetz et al., 2014). This was attributed to the use of alkylating agents in conditioning regimens. A separate study also found Artemis defects were associated with negative clinical events following HCT, which included graft vs. host disease and a range of infections (Neven et al., 2009). While Xiao et al. (2009) generated an

Artemis deficient mouse model that replicates the phenotype observed in human Artemis deficient SCID patients, these Artemis deficient pigs provide an additional and perhaps more relevant preclinical model to test and improve clinical therapies used in treating these patients.

Future studies will focus on defining optimal husbandry practices to increase the lifespan of

SCID pigs so that such modeling can be performed.

72

Conclusions

We provide evidence of two mutations in the Artemis gene that result in the only known cases of naturally occurring SCID in pigs. Artemis deficient SCID pigs provide a large animal biomedical model to investigate numerous aspects of immunology and cancer biology and would be valuable in efforts to improve therapies for human Artemis deficient SCID patients.

Literature Cited

Basel, M. T., S. Balivada, A. P. Beck, M. A. Kerrigan, M. M. Pyle, J. C. M. Dekkers, C. R. Wyatt, R. R. R. Rowland, D. E. Anderson, S. H. Bossmann, and D. L. Troyer. 2012. Human xenografts are not rejected in a naturally occurring immunodeficient porcine line: a human tumor model in pigs. Biores. Open Access 1:63–68.

Bassing, C. H., W. Swat, and F. W. Alt. 2002. The mechanism and regulation of chromosomal V(D)J recombination. Cell 109:45–55.

Blum, B., and N. Benvenisty. 2008. The Tumorigenicity of Human Embryonic Stem Cells. Adv. Cancer Res. 100:133–158.

Brehm, M. A, and L. D. Shultz. 2012. Human allograft rejection in humanized mice: a historical perspective. Cell. Mol. Immunol. 9:225–231.

Cai, W., D. S. Casey, and J. C. M. Dekkers. 2008. Selection response and genetic parameters for residual feed intake in Yorkshire swine. J. Anim. Sci. 86:287–298.

Callebaut, I., D. Moshous, J.-P. Mornon, and J.-P. de Villartay. 2002. Metallo-beta-lactamase fold within nucleic acids processing enzymes: the beta-CASP family. Nucleic Acids Res. 30:3592–3601.

Certo, M. T., K. S. Gwiazda, R. Kuhar, B. Sather, G. Curinga, T. Mandt, M. Brault, A. R. Lambert, S. K. Baxter, K. Jacoby, B. Y. Ryu, H.-P. Kiem, A. Gouble, F. Paques, D. J. Rawlings, and A. M. Scharenberg. 2012. Coupling endonucleases with DNA end-processing enzymes to drive gene disruption. Nat. Methods 9:973–975.

Cino-Ozuna, A., R. Rowland, J. Nietfeld, M. Kerrigan, J. Dekkers, and C. Wyatt. 2012. Preliminary Findings of a Previously Unrecognized Porcine Primary Immunodeficiency Disorder. Vet. Pathol. 50:144–146.

Cossu, F. 2010. Genetics of SCID. Ital. J. Pediatr. 36:76.

73

Cunningham, J. J., T. M. Ulbright, M. F. Pera, and L. H. J. Looijenga. 2012. Lessons from human teratomas to guide development of safe stem cell therapies. Nat. Biotechnol. 30:849–857.

Darroudi, F., W. Wiegant, M. Meijers, A. A. Friedl, M. van der Burg, J. Fomina, J. J. M. van Dongen, D. C. van Gent, and M. Z. Zdzienicka. 2007. Role of Artemis in DSB repair and guarding chromosomal stability following exposure to ionizing radiation at different stages of cell cycle. Mutat. Res. - Fundam. Mol. Mech. Mutagen. 615:111–124.

Drögemüller, C., U. Reichart, T. Seuberlich, A. Oevermann, M. Baumgartner, K. K. Boghenbor, M. H. Stoffel, C. Syring, M. Meylan, S. Müller, M. Müller, B. Gredler, J. Sölkner, and T. Leeb. 2011. An unusual splice defect in the mitofusin 2 gene (mfn2) is associated with degenerative axonopathy in tyrolean grey cattle. PLoS One 6:e18931.

Dunkelberger, J., N. Boddicker, N. Serão, J. Young, R. Rowland, and J. Dekkers. 2015. Response of pigs divergently selected for residual feed intake to experimental infection with the PRRS virus. Livest. Sci. 177:132–141.

Dvorak, C. C., and M. J. Cowan. 2010. Radiosensitive Severe Combined Immunodeficiency Disease. Immunol. Allergy Clin. North Am. 30:125–142.

Ewen, C., A. Cino-Ozuna, H. He, M. Kerrigan, J. Dekkers, C. Tuggle, R. Rowland, and C. Wyatt. 2014. Analysis of blood leukocytes in a naturally occurring immunodeficiency of pigs shows the defect is localized to B and T cells. Vet. Immunol. Immunopathol. 162:174–179.

Garcia, S., and A. A. Freitas. 2012. Humanized mice: Current states and perspectives. Immunol. Lett. 146:1–7.

Huang, J., X. Guo, N. Fan, J. Song, B. Zhao, Z. Ouyang, Z. Liu, Y. Zhao, Q. Yan, X. Yi, A. Schambach, J. Frampton, M. a Esteban, D. Yang, H. Yang, and L. Lai. 2014. RAG1/2 Knockout Pigs with Severe Combined Immunodeficiency. J. Immunol. 193:1496–503.

Huang, Y., W. Giblin, M. Kubec, G. Westfield, J. St Charles, L. Chadde, S. Kraftson, and J. Sekiguchi. 2009. Impact of a hypomorphic Artemis disease allele on lymphocyte development, DNA end processing, and genome stability. J. Exp. Med. 206:893–908.

Jacobs, C., Y. Huang, T. Masud, W. Lu, G. Westfield, W. Giblin, and J. M. Sekiguchi. 2011. A hypomorphic Artemis human disease allele causes aberrant chromosomal rearrangements and tumorigenesis. Hum. Mol. Genet. 20:806–819.

Lee, K., D.-N. Kwon, T. Ezashi, Y.-J. Choi, C. Park, A. C. Ericsson, A. N. Brown, M. S. Samuel, K.-W. Park, E. M. Walters, D. Y. Kim, J.-H. Kim, C. L. Franklin, C. N. Murphy, R. M. Roberts, R. S. Prather, and J.-H. Kim. 2014. Engraftment of human iPS cells and allogeneic porcine cells into pigs with inactivated RAG2 and accompanying severe combined immunodeficiency. Proc. Natl. Acad. Sci. U. S. A. 111:7260–7265.

Li, L., D. Moshous, Y. Zhou, J. Wang, G. Xie, E. Salido, D. Hu, J.-P. de Villartay, and M. J. Cowan. 2002. A founder mutation in Artemis, an SNM1-like protein, causes SCID in Athabascan-speaking Native Americans. J. Immunol. 168:6323–6329.

74

Mestas, J., and C. C. W. Hughes. 2004. Of mice and not men: differences between mouse and human immunology. J. Immunol. 172:2731–2738.

Meurens, F., A. Summerfield, H. Nauwynck, L. Saif, and V. Gerdts. 2012. The pig: A model for human infectious diseases. Trends Microbiol. 20:50–57.

Moshous, D., I. Callebaut, R. De Chasseval, B. Corneo, M. Cavazzana-Calvo, F. Le Deist, I. Tezcan, O. Sanal, Y. Bertrand, N. Philippe, A. Fischer, and J. P. De Villartay. 2001. Artemis, a novel DNA double-strand break repair/V(D)J recombination protein, is mutated in human severe combined immune deficiency. Cell 105:177–186.

Neven, B., S. Leroy, H. Decaluwe, F. Le Deist, C. Picard, D. Moshous, N. Mahlaoui, M. Debré, J. L. Casanova, L. Dal Cortivo, Y. Madec, S. Hacein-Bey-Abina, G. De Saint Basile, J. P. De Villartay, S. Blanche, M. Cavazzana-Calvo, and A. Fischer. 2009. Long-term outcome after hematopoietic stem cell transplantation of a single-center cohort of 90 patients with severe combined immunodeficiency. Blood 113:4114–4124.

Notarangelo, L. D. 2010. Primary immunodeficiencies. J. Allergy Clin. Immunol. 125:S182– S194.

Oettinger, M. A., D. G. Schatz, C. Gorka, and D. Baltimore. 1990. RAG-1 and RAG-2, adjacent genes that synergistically activate V(D)J recombination. Science (80-. ). 248:1517–1523.

Pannicke, U., M. Hönig, I. Schulze, J. Rohr, G. A. Heinz, S. Braun, I. Janz, E. M. Rump, M. G. Seidel, S. Matthes-Martin, J. Soerensen, J. Greil, D. K. Stachel, B. H. Belohradsky, M. H. Albert, A. Schulz, S. Ehl, W. Friedrich, and K. Schwarz. 2010. The most frequent DCLRE1C (ARTEMIS) mutations are based on homologous recombination events. Hum. Mutat. 31:197– 207.

Plews, J., M. Gu, M. Longaker, and J. Wu. 2013. Large animal induced pluripotent stem cells as pre-clinical models for studying human disease. J Cell Mol Med 16:1196–1202.

Poinsignon, C., D. Moshous, I. Callebaut, R. de Chasseval, I. Villey, and J.-P. de Villartay. 2004. The metallo-beta-lactamase/beta-CASP domain of Artemis constitutes the catalytic core for V(D)J recombination. J. Exp. Med. 199:315–321.

Prather, R. S., M. Lorson, J. W. Ross, J. J. Whyte, and E. Walters. 2013. Genetically Engineered Pig Models for Human Diseases. Annu. Rev. Anim. Biosci. 1:203–219.

Purcell, S., B. Neale, K. Todd-Brown, L. Thomas, M. A. Ferreira, D. Bender, J. Maller, P. Sklar, P. I. W. de Bakker, M. J. Daly, and P. C. Sham. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81:559–575.

Ramos, A. M., R. P. M. A. Crooijmans, N. A. Affara, A. J. Amaral, L. Alan, J. E. Beever, C. Bendixen, C. Churcher, R. Clark, P. Dehais, M. S. Hansen, J. Hedegaard, Z. Hu, H. H. Kerstens, A. S. Law, D. Milan, D. J. Nonneman, G. A. Rohrer, M. F. Rothschild, and T. P. L. Smith. 2009. Design of a High Density SNP Genotyping Assay in the Pig Using SNPs Identified and Characterized by Next Generation Sequencing Technology. 4:e6524.

75

Rooney, S., J. Sekiguchi, C. Zhu, H. L. Cheng, J. Manis, S. Whitlow, J. DeVido, D. Foy, J. Chaudhuri, D. Lombard, and F. W. Alt. 2002. Leaky Scid Phenotype Associated with Defective V(D)J Coding End Processing in Artemis-Deficient Mice. Mol. Cell 10:1379–1390.

Ross, J. W., J. J. Whyte, J. Zhao, M. Samuel, K. D. Wells, and R. S. Prather. 2010. Optimization of square-wave electroporation for transfection of porcine fetal fibroblasts. Transgenic Res. 19:611–620.

Schuetz, C., B. Neven, C. C. Dvorak, S. Leroy, M. J. Ege, U. Pannicke, K. Schwarz, A. S. Schulz, M. Hoenig, M. Sparber-Sauer, S. a. Gatz, C. Denzer, S. Blanche, D. Moshous, C. Picard, B. N. Horn, J. P. De Villartay, M. Cavazzana, K. M. Debatin, W. Friedrich, A. Fischer, and M. J. Cowan. 2014. SCID patients with ARTEMIS vs RAG deficiencies following HCT: Increased risk of late toxicity in ARTEMIS-deficient SCID. Blood 123:281–289.

Schwarz, K., G. H. Gauss, L. Ludwig, U. Pannicke, Z. Li, D. Lindner, W. Friedrich, R. a Seger, T. E. Hansen-Hagge, S. Desiderio, M. R. Lieber, and C. R. Bartram. 1996. RAG mutations in human B cell-negative SCID. Science 274:97–99.

Seok, J., H. S. Warren, A. G. Cuenca, M. N. Mindrinos, H. V Baker, W. Xu, D. R. Richards, G. P. McDonald-Smith, H. Gao, L. Hennessy, C. C. Finnerty, C. M. López, S. Honari, E. E. Moore, J. P. Minei, J. Cuschieri, P. E. Bankey, J. L. Johnson, J. Sperry, A. B. Nathens, T. R. Billiar, M. a West, M. G. Jeschke, M. B. Klein, R. L. Gamelli, N. S. Gibran, B. H. Brownstein, C. Miller- Graziano, S. E. Calvano, P. H. Mason, J. P. Cobb, L. G. Rahme, S. F. Lowry, R. V Maier, L. L. Moldawer, D. N. Herndon, R. W. Davis, W. Xiao, and R. G. Tompkins. 2013. Genomic responses in mouse models poorly mimic human inflammatory diseases. Proc. Natl. Acad. Sci. U. S. A. 110:3507–3512.

Shultz, L. D., M. A. Brehm, J. V. Garcia, and D. L. Greiner. 2012. Humanized mice for immune system investigation: Progress, promise and challenges. Nat. Rev. Immunol. 12:786–798.

Šinkora, M., and J. E. Butler. 2009. The ontogeny of the porcine immune system. Dev. Comp. Immunol. 33:273–283.

Sinkora, M., J. Sinkora, Z. Reháková, and J. E. Butler. 2000. Early ontogeny of thymocytes in pigs: sequential colonization of the thymus by T cell progenitors. J. Immunol. 165:1832–1839.

Sinkora, M., J. Sinkorová, Z. Cimburek, and W. Holtmeier. 2007. Two groups of porcine TCRgammadelta+ thymocytes behave and diverge differently. J. Immunol. 178:711–719.

Stephens, M., N. J. Smith, and P. Donnelly. 2001. A new statistical method for haplotype reconstruction from population data. Am. J. Hum. Genet. 68:978–989.

Suzuki, S., M. Iwamoto, Y. Saito, D. Fuchimoto, S. Sembon, M. Suzuki, S. Mikawa, M. Hashimoto, Y. Aoki, Y. Najima, S. Takagi, N. Suzuki, E. Suzuki, M. Kubo, J. Mimuro, Y. Kashiwakura, S. Madoiwa, Y. Sakata, A. C. F. Perry, F. Ishikawa, and A. Onishi. 2012. Il2rg gene-targeted severe combined immunodeficiency pigs. Cell Stem Cell 10:753–758.

76

Swindle, M. M. 2007. Swine in the laboratory: Surgery, anesthesia, imaging, and experimental techniques, Second Edition. CRC Press, Boca Raton, Florida.

Tomashov-Matar, R., G. Biran, I. Lagovsky, N. Kotler, A. Stein, B. Fisch, O. Sapir, and M. Shohat. 2012. Severe combined immunodeficiency (SCID): From the detection of a new mutation to preimplantation genetic diagnosis. J. Assist. Reprod. Genet. 29:687–692.

De Villartay, J. P. 2009. V(D)J recombination deficiencies. Adv. Exp. Med. Biol. 650:46–58.

Walters, E. M., E. Wolf, J. J. Whyte, J. Mao, S. Renner, H. Nagashima, E. Kobayashi, J. Zhao, K. D. Wells, J. K. Critser, L. K. Riley, and R. S. Prather. 2012. Completion of the swine genome will simplify the production of swine as a large animal biomedical model. BMC Med. Genomics 5:55.

Watanabe, M., K. Nakano, H. Matsunari, T. Matsuda, M. Maehara, T. Kanai, M. Kobayashi, Y. Matsumura, R. Sakai, M. Kuramoto, G. Hayashida, Y. Asano, S. Takayanagi, Y. Arai, K. Umeyama, M. Nagaya, Y. Hanazono, and H. Nagashima. 2013. Generation of Interleukin-2 Receptor Gamma Gene Knockout Pigs from Somatic Cells Genetically Modified by Zinc Finger Nuclease-Encoding mRNA. PLoS One 8:1–8.

Xiao, Z., E. Dunn, K. Singh, I. S. Khan, S. M. Yannone, and M. J. Cowan. 2009. A non leaky Artemis-deficient mouse that accurately models the human SCID phenotype including resistance to hematopoietic stem cell transplantation. Biol Blood Marrow Transpl. 15:1–11.

Xu, Y. 2006. DNA damage: a trigger of innate immunity but a requirement for adaptive immune homeostasis. Nat. Rev. Immunol. 6:261–270.

77

Figures

Figure 3.1 Flow cytometry shows very few CD4+ CD8+ and γδ+ cells in SCID pig thymus. The number in each diagram is the percentage of each cell type in thymocyte populations of a SCID pig or a non-SCID carrier littermate (A and B). CD8a (x-axis) and CD4 (y-axis) (A) and γδ (B) are T cell surface markers. The least square means of percentages of each thymocyte subpopulation are shown in (C). Error bars represent the SE of the estimates. Bars with different letters (a and b) indicate statistically significant (p < 0.01) differences between SCID and non- SCID cell number within the cellular subset.

78

Figure 3.2 Effect of ionizing radiation on fibroblasts from SCID and normal pigs. Fibroblasts from SCID piglets (n = 10) and normal littermates (n = 10) were exposed to increasing doses of gamma rays. Colonies (>2 mm) were stained and counted after 14 d. Survival proportion was calculated as the average number of surviving colonies for three replicates at each radiation dose divided by the number of colonies from nonirradiated cells for each animal. Error bars represent the SE of the least squares means. *p < 0.0001, difference between SCID and normal pigs within radiation dose.

79

Figure 3.3 Manhattan plot of the genome-wide association study for SCID status. Results show the -log10(p-value) of the association of ordered SNPs on S. scrofa chromosomes 1–18, X, Y, and unknown (U) with SCID status based on the dfam option in PLINK.

80

Figure 3.4 Pedigree of SCID ancestors showing carriers of mutated haplotypes based on the Illumina porcine SNP60 panel. Circles represent females and squares represent males. Green symbols are h16 haplotype carriers and pink symbols are h12 haplotype carriers, with lines of each color tracking the respective SCID haplotype through the pedigree to the founder generation. Beige symbols are pigs that were genotyped with the SNP60 panel and did not carry either SCID haplotype. White symbols in generations 0–7 represent nongenotyped individuals. Boxes in generation 8 give information on the numbers of SCID (Affected) and non-SCID (Unaffected) piglets and the number of piglets that died before their SCID phenotype was determined (Unknown) from each parent pair.

81

Figure 3.5 Two independent mutations were found in Artemis. (A) The coding region of the Artemis transcript is indicated by slanted lines and genotypes at mutated positions are shown for chromosomes that carry normal, h12, and h16 haplotypes. Mutant alleles that cause SCID are shown in black lettering. (B) Genomic sequence of the h16 haplotype shows a splice donor site mutation (g.51578763 G→A) responsible for the lack of exon 8 in all h16 transcripts. Capital letters denote exonic sequence whereas lowercase letters denote intronic sequence. (C) A nonsense point mutation (g.51584489 G→A) in exon 10 changes the tryptophan at position 267 to a stop codon in the h12 haplotype.

82

Figure 3.6 Rescue of sensitivity to ionizing radiation. Fibroblasts from compound heterozygous SCID (n = 2) and normal (n = 3) littermates were transfected with human Artemis-expressing plasmid (5 mg, Artemis), with a molar equivalent of pExodus plasmid without the Artemis gene (3.45 mg, Empty Vector), or shocked without plasmid added. Fibroblasts were exposed to a 4 Gy radiation dose 24 h after transfection. Colonies (>2 mm) were counted after 14 d of growth. The average number of surviving colonies for three replicates for a given plasmid was divided by the number of colonies from shock only cells for each animal. Error bars represent the SE of the least squares means. Dots show individual observations. Bars with different letters (a and b) within affected status represent statistical differences between means with p < 0.01.

83

Supplemental Figure 3.1 cDNA structures for the Artemis gene from normal, h16, and h12 haplotypes. Exons are represented by blocks, with exon number shown above the exons present in each transcript. Fractions show the number of sequences for that transcript over the total number of transcripts sequenced for that haplotype. For h12 transcripts, purple indicates that a portion of intron 4 was present in the transcript sequence. The red color shows splicing inside of exons, with the size of the red box indicating the amount of the exon present in the sequenced transcript. Dashed lines show the normal size of the improperly spliced exon for some transcripts.

84

a

a a a a a b b b b b b

Supplemental Figure 3.2 Flow cytometry shows no differences between the lymphocyte counts of SCID pigs with different genotypes. There were no differences between the numbers of any lymphocytes seen among the SCID genotypes (p>0.10). SCID and carrier littermates had the same numbers of NK cells (CD16; p>0.10), but all SCID pigs had lower numbers of B (CD21) and T (CD3e) cells (p<0.01). Error bars represent the standard errors of the least square means. Bars with different letters (a and b) represent statistically significant differences between least square means within marker type.

85

*

*

Supplemental Figure 3.3 Radiosensitivity of fibroblasts from all SCID affected genotypes and normal piglets. Fibroblasts from h12/h12, h16/h16, h12/h16, and normal animals (n=4 per genotype) were exposed to indicated doses of gamma-rays (Gy). Survival proportion was calculated as the average number of surviving colonies for three replicates at each radiation dose was divided by the number of colonies from non-irradiated cells for each animal. Error bars represent the standard error of the least squares means. * indicates statistically significant (p<0.05) difference between all SCID genotypes and normal pigs.

86

CHAPTER 4. GENOME WIDE ASSOCIATION OF PIGLET RESPONSE TO ONE OF

TWO ISOLATES OF THE PORCINE REPRODUCTIVE AND RESPIRATORY

SYNDROME VIRUS1

A paper prepared for publication in Journal of Animal Science

Emily H Waide2,3, Christopher K Tuggle2, and Jack CM Dekkers2,4

Abstract

Porcine Reproductive and Respiratory Syndrome (PRRS) is a devastating disease in the swine industry, and identification of host genetic factors that enable selection for improved performance during PRRS virus (PRRSV) infection would reduce the economic impact of this disease. We conducted genome-wide association study (GWAS) analyses of data from 13 trials of ~200 nursery age piglets that were experimentally infected with one of two isolates of the

PRRSV (NVSL and KS06). Phenotypes analyzed were: viral load (VL) in blood during the first

21days post infection (dpi) and weight gain (WG) from 0 to 42 dpi. We accounted for the previously identified SSC 4 QTL in our models to increase power to identify additional regions.

Bayes-B and single SNP analyses identified the same regions on SSC 3 and 5 to be associated

1 This project was supported by the USDA NIFA PRRS CAP Award 2008-55620-19132, the National Pork Board, and the NRSP-8 Swine Genome and Bioinformatics Coordination projects, and the PRRS Host Genetics Consortium consisting of USDA ARS, Kansas State Univ., Iowa State Univ., Michigan State Univ., Washington State Univ., Purdue Univ., University of Nebraska-Lincoln, PIC/Genus, Newsham Choice Genetics, Fast Genetics, Genetiporc, Inc., Genesus, Inc., PigGen Canada, Inc., IDEXX Laboratories, and Tetracore, Inc. Nick VL Serão, Martine Schroyen, Andrew Hess, Raymond RR Rowland, Joan K Lunney, and Graham Plastow contributed to the work presented in this chapter and are expected to be co-authors when this manuscript has their approval. 2 Department of Animal Science, Iowa State University, Ames, Iowa 3 Current address: The Seeing Eye, Inc., Morristown, New Jersey 4 Correspondence: 239D Kildee Hall (phone: 515-294-7509, email: [email protected])

87 with VL in the KS06 trials and on SSC 6 in the NVSL trials (P<5x10-6); for WG, regions on SSC

5 and 17 were associated in the NVSL trials (P<3x10-5). No regions were identified with both methods for WG in the KS06 trials. Many regions identified by single SNP analyses were not identified using Bayes-B. Except for the GBP5 region on SSC 4, which was associated with VL for both isolates (but only with WG for NVSL), identified regions did not overlap between the two PRRSV isolate datasets, which disagrees with the high genetic correlations of traits between isolates estimated from these data. We also identified genomic regions whose associations with

VL or WG had interactions with either PRRSV isolate or genotype at the SSC 4 QTL. Gene ontology (GO) annotation terms for genes located near moderately associated SNPs (P<0.003) were enriched for multiple immunologically (VL) and metabolism (WG) related GO terms. The biological relevance of these regions suggests that, although use of single SNP analyses and a relaxed threshold may increase the number of false positives, it also increased the identification of true positives. Although only the SSC 4 QTL was associated with response to both PRRSV isolates, genes near associated SNPs were enriched for the same GO terms across PRRSV isolates, suggesting they were affected by similar biological processes.

Introduction

The PRRS Host Genetics Consortium (PHGC) investigates host genetic control of response to Porcine Reproductive and Respiratory Syndrome Virus (PRRSV) infection (Lunney et al., 2011). Analysis of PHGC infection trials showed weight gain (WG) and viral load (VL) responses to the NVSL 97-7985 (NVSL; Osorio et al., 2002) PRRSV isolate were moderately heritable (~0.30) and controlled by a major QTL (Boddicker et al., 2012). One SNP in this QTL,

WUR10000125 (WUR), captured 99.3% of the genetic variance explained and was validated using independent NVSL trials (Boddicker et al., 2014a; Boddicker et al., 2014b).

88

The diversity of PRRSV in the field emphasizes the importance of research involving multiple virus isolates (Murtaugh et al., 2010). To expand the findings from NVSL trials, the

KS2006-72109 (KS06) isolate, which shares 89% ORF5 sequence similarity with NVSL, was used in additional infection trials. Hess et al. (2015) showed that the SSC 4 QTL had an effect on

VL in both isolates but on WG in only the NVSL isolate. High genetic correlations for VL

(0.86±0.19) and WG (0.86±0.27) between the two PRRSV isolates (Hess et al., 2015) led us to hypothesize that specific genomic regions control response to both PRRSV isolates.

Genomic evaluation of complex traits, such as host response to PRRSV, may benefit from the inclusion of biological information, such as gene ontology (GO) annotation (Fortes et al., 2011; Serão et al., 2013). Thus, we aimed to use genome-wide association to find regions associated with response to PRRSV and to use functional analyses to support these findings.

Genomic regions associated with response to both PRRSV isolates may prove useful in industry application and may also control response to other pathogens. Regions associated with response to only one PRRSV isolate may give insight into isolate-specific pathogenicity. Enrichment of relevant GO terms in genes near associated SNPs adds functional information to support statistical associations.

Materials and Methods

All experimental protocols used in this study were approved by the Kansas State

University (KSU) Animal Care and Use Committee.

89

Study design

Lunney et. al (2011) provided a detailed description of the study design employed in the

PHGC trials used in this study. Briefly, a total of 15 trials of approximately 200 commercial crossbred piglets each were sent to KSU at weaning, given one week to acclimate, then innoculated intramuscularly and intranasally with 105 tissue culture infectious dose50 of either the NVSL 97-7985 (Osorio et al., 2002) or KS2006-72109 PRRSV isolate. In total, 2,289 commercial crossbred piglets from 8 genetic backgrounds were used. A more detailed description of the populations used in each trial is in Table 4.1. Trial 9 involved pigs from the

ISU RFI selection lines (Cai et al. 2008) and was excluded from these analyses. Trial 13 was also excluded because piglets from this trial had much lower and more variable viremia profiles compared to the other KS06 trials.

For each trial, blood samples were collected at 0, 4, 7, 11, and 14 days post-infection

(dpi) and then weekly until the completion of the trial at 42 dpi. Individual weights were observed weekly throughout the trials. At 42 dpi, piglets were euthanized and ear tissue was collected for genomic DNA extraction, which was sent to GeneSeek, Inc. (Lincoln, Nebraska,

USA) and genotyped using the Illumina Porcine SNP60 Beadchip (Ramos et al., 2009).

Genotypes with a gene call score lower than 0.5 were set to missing. After removal of SNPs with minor allele frequencies less than 0.01 across all trials and genotyping call rates less than 0.80,

52,386 SNPs remained for analysis, with an overall genotype rate of 99.2%.

Phenotypes

The two phenotypes analyzed in this study were described by Boddicker et al. (2012).

Briefly, the amount of PRRSV RNA in blood samples was quantified using quantitative PCR

90 and reported as the log10 of PRRSV RNA copies relative to a standard curve. Viral load (VL) was calculated as the area under the curve of viremia up to and including 21 dpi. Weight gain

(WG) was calculated as the difference between body weight at 42 and 0 dpi.

Genome wide association studies

Associations of SNP genotypes with each trait were assessed using Bayesian variable selection and single SNP methods, as described in the following.

Single SNP analyses. The following linear mixed model was used in ASReml 4

(Gilmour et al., 2014) to associate SNP genotypes with phenotypes:

� = �� + �!�� + �� + �� + �� + �

where y = vector of phenotypic observations, X = design matrix associating phenotypes with fixed effects, b = vector of fixed effects of the interaction of parity and trial, covariates of initial age and weight, and genotype at the WUR SNP, Zi = design matrix associating phenotypes with class effects for SNP i, gi = vector of genotype class effects for SNP i, W = incidence matrix associating phenotypes with the interaction of pen and trial, p = vector of random effect of pen within trial, S = incidence matrix associating phenotypes with litter effect, l = vector of random effects of litter, V = incidence matrix associating phenotypes with random animal polygenic effects, u = vector of random animal polygenic effects with variance-covariance matrix based on

! the pedigree relationship matrix, and e = vector of residual errors assumed to be i.i.d. ~N(0,�! ).

A one generation pedigree, including 4807 animals, was used. Results are presented as the – log10(P) of the combined additive and dominance effects at each SNP. The interaction of genotype at each SNP with PRRSV isolate and with WUR genotype was also tested on the

91 combined dataset, using the same model described above but including virus isolate or its interaction with SNP genotype as additional fixed effects.

Bayesian methods. All SNP genotypes were simultaneously fitted as random effects using the Bayes B method (Habier et al., 2011), as implemented in GenSel version 4.90

(Fernando and Garrick, 2008). Missing genotypes were replaced with the mean genotype for that individual’s genetic background for that SNP. We used the following mixed model, a modified form of the model used in Zeng et al. (2013), to estimate associations of SNPs with each phenotype:

! !

� = �� + ����!�!" + ����!�!" + � !!! !!! where y and X are as described above, b = vector of fixed effects of sex, the interactions of pen and parity with trial, genotype at the WUR SNP, and covariates of initial age and weight, zai = vector of the additive genotype covariates coded as -10, 0, and 10 for the AA, AB, and BB genotypes, respectively, for SNP i, ai = additive effect for SNP i, δai = indicator for whether the additive effect of SNP i was included (δai = 1) or excluded from (δai = 0) the model for a given iteration of the Markov Chain Monte Carlo (MCMC), zdi = vector of the dominance genotype covariates coded as 10, 0, and 10 for the AA, AB, and BB genotypes, respectively, for SNP i, di

= dominance effect for SNP i, δdi = indicator for whether the dominance effect of SNP i was included (δdi = 1) or excluded from (δdi = 0) the model for a given iteration of the MCMC, and e

! = vector of residual errors assumed to be i.i.d. ~N(0,�! ). GenSel version 4.90 (Fernando and

Garrick, 2008) does not allow for random effects other than SNPs to be fitted in the model, therefore litter was not included as an effect in the Bayes-B model. In order to increase power to identify genomic regions other than the SSC 4 QTL (Boddicker et al., 2012) with effects on

92 response to PRRSV infection, genotype at the WUR SNP was fitted as a fixed effect in our models and the WUR SNP and all SNPs within 2.5 Mb on either side of the WUR SNP were masked from the GWAS. The prior probability that a given SNP was excluded from the model

(δai = 0 and δdi = 0) was set equal to π = 0.99, fitting approximately 1% or 524 additive effects and 524 dominance effects in each of 51,000 iterations of the MCMC, with the first 1,000 iterations designated as burn in. Results are presented as the percent of genetic variance (GV) explained by sum of the additive and dominance effects of SNPs within non-overlapping 1-Mb windows (Wolc et al., 2012) of the genome based on Sus scrofa genome build 10.2

(http://www.ensembl.org/Sus_scrofa/Info/Index?db=core). Windows explaining greater than 1%

GV were further discussed. Alternative methods of evaluating Bayes-B results, such as the proportion of iterations in which a window explained greater than average % GV, identified the same associated regions.

Principal Component Analysis (PCA) of genotypes to determine the number of independent tests. The number of independent single-SNP tests was based on the number of

PCA needed to capture 99.5% of the genetic variance (Gao et al., 2008). Genotype data from each PRRSV isolate dataset were split between chromosomes, then further divided to create chromosomal segments containing a number of SNPs equal to half of the number of animals in the respective dataset. We then used R (R Core Team, 2015) to determine the number of PCA required to explain 99.5% of the variance of SNP genotypes in each chromosomal segment. The summed number of PCA across all chromosomes was considered to be the number of independent tests and used to designate Bonferroni corrected p-value thresholds to determine significance of associations in the single SNP analyses. PCA results by chromosome are in

93

Supplemental Table 4.1. An alternative genome-wide significance threshold was calculated based on a modified Bonferroni correction as !.!" = 0.0002 (Mantel, 1980). ! !"#$

Further analysis of specific genomic regions. Genes located within 500, 250, or 100 kb of SNPs based on Sus scrofa genome build 10.2

(http://www.ensembl.org/Sus_scrofa/Info/Index?db=core) with p-value (P) <0.001, P<0.003, or

P<0.01, as determined by single SNP analyses, with the addition of the WUR region on SSC 4, were compiled into gene lists to perform functional analyses. Resulting gene lists were analyzed for enrichment of GO annotation terms using the GO Slim categories in the Panther software (Mi et al., 2013). Both the complete GO terms and a “Slim” set of terms in both the Biological

Process (BP) and Molecular Function (MF) components of the GO were used for annotation enrichment analysis. The Slim approach uses a limited set of GO terms that provides a broad coverage of the GO annotation (Mi et al., 2013). Panther GO Slim is a selected subset of the complete gene ontologies, which give an overview of the ontologies and remove the very specific detailed terms. Overrepresentation of GO terms was tested using the binomial distribution function (Cho and Campbell, 2000). Lists of genes within 500, 250, or 100 kb of all

52,386 SNPs used in the GWAS analyses were used as backgrounds for the selected SNP lists for each respective window size. Resulting p-values for the enrichment of genes within a GO term were Bonferroni corrected by multiplying the p-value for each term within a category by the total number of terms in the category (Sham and Purcell, 2014); 257 for biological process

Slim, 223 for molecular function Slim, and 177 for pathways.

The 10 most overrepresented GO terms for each gene list (500 kb surrounding SNPs with

P<0.001, 500 kb surrounding SNPs with P<0.003, etc.) were compared to determine the

94 significance threshold and window size that resulted in gene lists that were most overrepresented for functionally relevant biological process slim (BPS) GO terms (Supplemental tables 4.2 to

4.5). The top 10 enriched BPS GO terms in lists of genes within 250 kb of SNPs with P<0.003 contained 5 (KS06 VL), 8 (NVSL VL), 6 (KS06 WG), and 6 (NVSL WG) terms that were considered biologically relevant or were found to be enriched in similar phenotypes analyzed in other studies (Oliveira et al., 2014; Li et al., 2015; Schroyen et al., 2015). Overrepresentation of

GO terms in Panther GO Slim and Complete categories of biological process and molecular function and Panther pathways of genes within 250 kb of SNPs with P<0.003 was assessed using

Panther (Mi et al., 2013).

Results

In total, 2,288 commercial crossbred piglets from 8 genetic backgrounds were infected with one of two PRRSV isolates. Piglets averaged 26.6 (±2.4) days of age and weighed 7.2

(±1.4) kg at the time of infection. Of this total, 1,557 piglets were infected with NVSL, with an average weight of 7.3 (±1.4) kg and 26.6 (±2.6) days of age at inoculation. VL in the NVSL trials was on average 107.0 (±8.4) units and average WG was 14.9 (±5.0) kg. The 731 piglets infected with KS06 averaged 26.7 (±1.9) days of age and weighed an average 6.8 (±1.3) kg at the time of inoculation. VL in the KS06 trials was lower, at an average of 104.6 (±8.9) units, and

WG was higher, at an average of 19.5 (±4.3) kg, in comparison to that of NVSL trials. A more detailed description of the populations used in each trial is in Table 4.1.

Trait heritabilities

Using Bayes-B, the 52,386 SNPs together explained 53% and 48% of the phenotypic variance for VL in the KS06 and NVSL data, respectively, and 37% and 39% of the phenotypic

95 variance for WG in these respective datasets. When genotype at the WUR SNP was fitted as a fixed effect and SNPs within 2.5 Mb on either side of the WUR SNP excluded from the analysis, marker based heritabilities decreased 2% on average. Excluding the SSC 4 QTL region, the top

50, 22, 62, and 26 1-Mb windows explained 10% of the genetic variance in the KS06 VL, NVSL

VL, KS06 WG, and NVSL WG datasets, respectively. The 1 Mb window containing the SSC 4

QTL explained 8.2%, 14.4%, 0.05%, and 10.4% of the genetic variance when genotype at the

WUR SNP was not fitted as a fixed effect but is included as a random effect in Bayes-B. Table

4.2 shows heritability and litter effect estimates from ASReml analyses of WG and VL in the two isolate datasets. Pedigree based heritability estimates were moderate and slightly higher for

VL than for WG for both PRRSV isolates. These were estimated with the model used for GWAS in this study. When genotype at the WUR SNP was removed from the model, heritability estimates increased slightly. Litter effects were around 0 for WG and for VL in the KS06 data, and were larger for VL in the NVSL data (Table 4.2).

Genome wide association studies

Principal components analysis (PCA) determined that the number of principal components required to explain 99.5% of the genetic variation was 20,492, 21,658, and 22,708 for KS06, NVSL, and the combined datasets, respectively (Supplemental Table 4.1). The resulting 5% Bonferroni-corrected genome-wide statistical significance thresholds were 2.4x10-6,

2.3 x10-6, and 2.2 x10-6 for the KS06, NVSL, and combined datasets, respectively.

Manhattan plots of the Bayesian and single SNP GWAS results for VL are shown in

Figure 4.1, and information on the top genomic regions identified by each method are in Table

4.3. Genotype at the WUR SNP was fitted as a fixed effect in all models, which explains the

96 absence of peaks on SSC 4. To compare the identified regions from the two GWAS methods, we plotted the largest –log10(P) for each 1-Mb window from the single SNP analysis against the %

GV explained by that window using the Bayesian method (Supplemental Figures 4.1 to 4.4).

These comparison plots showed that several of the most significantly associated windows were identified by both methods but, overall, more regions were shown to be moderately associated based on the single-SNP method.

For VL in the KS06 data, Bayes-B and single SNP analyses identified the 57 Mb window of SSC 3 (3_57) as the most strongly associated genomic region (apart from WUR), explaining

1.2% GV in the Bayes-B analysis and having a p-value of 5x10-6 in the single SNP analysis.

Windows 7_35 and 7_30 explained the second and third highest percentage of GV in Bayes-B,

0.82 and 0.32%, respectively; these regions were also associated with VL in the KS06 data in single SNP analysis (P=9x10-6; Table 3). Single SNP analysis also identified associations in the

9_80, 11_28, 12_26, 14_17, and 15_84 regions (P<2.2x10-5), although none of these reached genome-wide significance after Bonferroni correction; these regions explained between 0.03 and

0.13% GV in the Bayes-B analysis.

For VL in the NVSL data, the two most significantly associated regions, 6_79 and 6_80, were identified by both Bayes-B (1% GV; Figure 4.1A) and single SNP analysis (P<1.6x10-7;

Figure 4.1B); other regions were not consistently identified by both GWAS methods. A third region on SSC 6 at Mb 67 was identified only in the single SNP analysis (P=1.6x10-7; Figure

4.1B). Results from Bayesian analysis of VL in the NVSL data also showed a region on 1_292 that explained 1.76% GV (Figure 4.1A). Single SNP analysis also identified a genomic region on

SSC 1 at 247 Mb to be associated with VL in the NVSL data (P=2.5x10-7; Figure 4.1B), 45 Mb upstream of the Bayes-B identified region.

97

Figure 4.2 shows Manhattan plots of the Bayesian and single SNP GWAS for WG; information on the top regions is in Table 4.4. For WG in the KS06 data, there were no regions that were identified by both Bayes-B and single SNP analysis. In fact, Bayes-B results showed no regions that explained more than 1% GV (Figure 4.2A) and no SNP reached genome-wide significance in the single SNP results (Figure 4.2B). The two most significantly associated regions from single SNP analysis were 13_59 and 10_52 (P=1.5x10-5), which explained 0.4 and

0.06% GV in the Bayes-B analysis, respectively. The region identified by Bayes-B as explaining the largest % GV, 0.86%, was at 7_113; this region was not found to be strongly associated in the single SNP results (P=0.02).

For WG in the NVSL data, figure 4.2A shows that windows on 5_68 and 7_35 explained more than 1% GV in the Bayes-B analysis, but these regions were not strongly associated in the single SNP analysis (P>0.005; Figure 4.2B). The most significant single SNP association was at

17_22 (P=6.5x10-6); this window explained 0.8% GV in the Bayes-B analysis. 5_71 was shown to be moderately associated with WG in the NVSL data based on both Bayes-B (0.55% GV) and single SNP methods (P=3.3x10-5).

Interaction GWAS

There was no overlap of significantly associated regions between the two PRRSV isolates analyzed in this study. Supplemental Figures 4.5 and 4.6 show the max -log10(P) from single

SNP analysis for each 1-Mb window corrected for the number of SNPs in that window for VL and WG, respectively, in the NVSL data versus the KS06 data. For VL, the most significantly associated regions in the NVSL data had very weak associations in the KS06 data, and vice

98 versa. For WG, 9_80 was associated in both isolates, but did not reach genome-wide significance in either dataset.

We used single SNP analysis to assess the interaction of SNP genotypes with PRRSV isolate and with genotype at the WUR SNP in the combined dataset (Figure 4.3A); information on regions with strongest interactions is in Table 5. Genomic regions with the most significant interaction effects between PRRSV isolate and SNP genotype on VL were at 14_16, 7_33, 7_39, and 12_32 (Figure 4.3A). The 12_32 and 14_16 regions were found to be associated with VL in the KS06 data (P=0.0001 and 0.24% GV; P=5x10-5 and 0.31% GV, respectively) but not in the

NVSL data (P=0.01 and 0.11% GV; P=0.01 and 0.11% GV, respectively). None of these regions reached genome wide significance for interaction with PRRSV isolate after Bonferroni correction.

Figure 4.3B shows the Manhattan plot for the interaction of genotype at each SNP with

PRRSV isolate for WG. SSC 17_22 was shown to have a genome-wide significant interaction with PRRSV isolate for WG (P=1.6x10-6). This region had the most significant association with

WG in the NVSL data (P=6.5x10-6 and 0.8% GV) but was not associated with WG in the KS06 data (P=0.01 and 0.07% GV). Although not genome-wide significant, peaks for interactions of

SNP genotype with PRRSV isolate were seen on 10_53, 11_43, and 11_52 (P>1x10-5). The

10_52 region, one Mb upstream of the isolate interaction region, was associated with WG in the

KS06 data (P=1.5x10-5 and 0.4% GV) but was not strongly associated with WG in the NVSL data (P=0.001 and 0.16% GV). Neither region on SSC 11 with strong isolate interaction effects was strongly associated with WG in either isolate dataset (P>0.01 and between 0.03 and 0.16%

GV).

99

Figure 4.4A shows the Manhattan plot of the interaction of each SNP with WUR genotype for VL. There were 22 SNPs, in 13 different 1-Mb windows, that reached genome- wide significance for their interaction with genotype at the WUR SNP for VL. These 13 regions were on 5 chromosomes, SSC 1_78, 156, and 171, SSC 4_1, 96, 97, 126, 131, and 132, SSC

9_138, SSC 14_125 and 137, and SSC X_11. Interactions of SNPs with WUR genotype for WG were identified on 1_35, 6_28, 13_199, 18_42, and X_98 (P<8x10-5; Figure 4.4B). None of these reached genome-wide significance for their interaction with WUR genotype after Bonferroni correction.

GO term enrichment

Panther (Mi et al., 2013) was used to analyze the enrichment of GO terms in lists of genes within 500, 250, or 100 kb of SNPs associated with the trait in question with P less than

0.0001, 0.003, or 0.01 based on single SNP analyses. Details for gene lists for each scenario are in Table 4.6. The ten most enriched GO terms from the Panther Biological Process Slim (BPS) category for each scenario were compared to determine the SNP significance threshold and window size for further analyses (Supplemental Tables 4.2 to 4.5). Based on the consistent enrichment of biologically relevant BPS terms (Supplemental Tables 4.2 to 4.5), all further analyses were performed using gene lists created genes within 250 kb of SNPs with P<0.003.

Results of overrepresentation of Panther GO Slim terms for each gene list are presented in tables

4.7 to 4.10. Results from enrichment testing using the complete GO categories are in

Supplemental Tables 4.6 and 4.7 for VL and WG, respectively.

The most enriched GO Slim terms for VL for each PRRSV isolate are in Table 4.7; the complete GO analysis is available in Supplemental Table 4.6. The 3 BPS and 1 Molecular

100

Function Slim (MFS) GO terms most enriched in the gene list for VL in the KS06 data were also shown to be enriched in the NVSL VL gene list. The consistently enriched BPS GO terms were natural killer cell activation, immune response, and B cell mediated immunity; the MFS GO term enriched in both PRRSV isolate gene lists was binding. An additional 7 BPS and 4 MFS GO terms were enriched in the NVSL VL gene list.

The most enriched GO terms for WG are in Table 4.8, with the enriched Complete GO terms in Supplemental Table 4.7. After Bonferroni correction, there were no GO terms that were significantly enriched in the KS06 WG gene list. Several metabolic process terms were enriched in the NVSL WG gene list, although only nucleobase-containing compound metabolic process was significantly enriched after Bonferroni correction (P=1 x10-4). Genes involved in the molecular function of nucleic acid binding were also enriched in the NVSL WG list (P=1 x10-4).

The CCKR signaling map pathway was also enriched in the NVSL WG gene list (P=9.5x10-4).

Genes near SNPs whose association with VL had an interaction with PRRSV isolate were enriched for several metabolic process BPS GO terms (Table 4.10), while genes near SNPs whose association with WG had an interaction with PRRSV isolate were enriched for several immune function related GO terms. There were no GO terms that were consistently enriched in gene lists based on SNPs with interaction effects with PRRSV isolate for VL and WG.

There were over 1,700 SNPs whose association with VL had an interaction with genotype at the WUR SNP at –log10(P)>2.5 and there were 4,599 genes within 250 kb of these SNPs, which was approximately 4 times as large as the gene lists created with these criteria for VL and

WG in each PRRSV isolate dataset (Table 4.6). We would not expect such a large number of genes to have differential effects on VL in animals with different genotypes at the WUR SNP,

101 and this large list likely contains many false positives. Furthermore, an extremely large gene list decreases power to detect GO term enrichment. Therefore, we increased the significance threshold of SNP association for the interaction with genotype at the WUR SNP for VL and created a list of genes within 250 kb of the 383 SNPs with -log10(P)>4, resulting in a similar number of genes (1,048) as the other lists analyzed in this study. This list was enriched for enzymatic GO terms (Table 4.11). For WG, genes near SNPs with interactions with WUR genotype at –log10(P)>2.5 were not enriched for any BPS or MF GO terms.

Discussion

We used Bayesian and single SNP GWAS methods to identify genomic regions associated with host response to experimental infection with two different isolates of PRRSV.

Several genomic regions were found to be consistently associated with the disease traits across analysis methods, but single SNP analysis was better able to detect small effects than Bayes-B.

This result is in part due to the fact that Bayes-B analysis was performed with a prior probability of association for each SNP equal to 0.01. Bayes-B analysis fitted all SNPs simultaneously as random effects, whereas single-SNP analysis fitted each SNP at a time as a fixed effect. The single SNP method that was employed in this study also allowed for other random effects, whereas the current version of GenSel, which was used for the Bayes-B analyses does not allow fitting of random effects, other than those of SNP effects. Neither Bayes-B nor single SNP analysis found regions that were associated with either trait in both PRRSV isolates. This result was unexpected, due to the high genetic correlations between isolates estimated by Hess et al.

(2015). This discrepancy may be due to the genetic correlation resulting from many very small pleiotropic effects that are not detected by GWAS, as shown by the mass of small associations in

Supplemental Figures 4.5 and 4.6.

102

Genotype at the WUR SNP was fitted as a fixed effect to account for the highly significant effect of the SSC 4 QTL. Very few SNPs in other genomic regions reached genome-wide significance in the single SNP analysis and the genetic variance explained by 1-Mb windows in the Bayesian analyses was very small; their association may be considered spurious if not for results from functional analysis of neighboring genes. Immune and weight gain related traits are very complex and involve actions and interactions of a large number of genetic loci. Creating gene lists from genes nearby SNP associated with each trait and performing gene ontology term enrichment of these lists provided further evidence of associations.

Viral Load

GWAS revealed regions on 3_57, 7_30, 7_35, 9_80, 11_28, 12_26, 14_17, and 15_84 to be associated with VL in the KS06 data; 1_247, 1_292, 3_68, 3_72, 3_128, 6_67, 6_69, 6_80, and 15_7 were associated with VL in the NVSL data (Table 4.3). Immune related QTL were found in 19 of the top 21 regions identified for VL in either PRRSV isolate; selected regions are further described in the following.

The region on 1_292 was the only genomic region previously identified by Boddicker et al. (2014b) as being associated with VL in the first 8 trials of NVSL infected piglets, apart from the WUR region. Boddicker et al. (2014b) identified 1_292 using the Bayes-B method, which is the same GWAS method that identified this QTL in this study; single-SNP GWAS did not show a strong association for this region. Genes in the 1_247 region include DOCK8, dedicator of cytokinesis 8, which, when mutated in humans, causes combined immunodeficiency (Engelhardt et al., 2009). DOCK8 deficiency in a mouse knockout model leads to a loss of circulating NKT cells (Crawford et al., 2013). DOCK8 was also identified as a member of a gene module whose

103 expression was correlated with viremia at 4 and 7 dpi (Schroyen et al., 2015). A QTL for white blood cell counts was located in the region on 1_247 (Wattrang et al., 2005; Cho et al., 2011).

GNG11, located at 9_80, plays a role in cellular senescence and shows expression changes during cellular stress (Hossain et al., 2006). GNG11 was shown to have differential expression in response to PRRSV infection (Badaoui et al., 2013). The 9_80 region also contained a QTL for immunoglobulin G level (Okamura et al., 2012). STAMBP (AMSH), located at 3_72, interacts with STAM and other proteins involved in cytokine signaling (Tanaka et al., 1999). QTL for C3c concentration (Wimmers et al., 2009) and IFN-γ and IL-10 levels (Uddin et al., 2011) were found in the 3_72 region. The 11_28 region identified for KS06 VL using single SNP analyses contains a QTL for IFN-γ, IL-10, and TLR9 levels in blood after vaccination (Uddin et al., 2010;

Uddin et al., 2011).

Genomic regions associated with VL did not overlap between PRRSV isolates, but we did see that genes near associated SNPs were enriched for the same GO terms. This indicates that associated SNPs are in LD with variants involved in similar biological processes or functions across the PRRSV isolates. For both PRRSV isolates, genes near SNPs associated with VL were enriched for natural killer cell activation, immune response, and B cell mediated immunity BPS

GO terms. Additional GO terms were identified as being overrepresented in gene lists from each

PRRSV isolate, without overlap. The activity of NK cells, specifically via production of IFN-γ, are thought to play an important role in clearance of PRRSV and reduction of viral load (Wesley et al., 2006). Nineteen and 15 genes in the gene lists for VL in KS06 and NVSL, respectively, were involved in NK cell activation. A region on SSC 5_64-65 contained several genes in one or both VL lists; including LY49 (KLRA1), KLRD1, KLRF1, KLRG1, CLECL1, CLEC1A, CLEC1B,

CLEC7A, and CLEC12A. LY49 encodes a receptor found on the NK cell that is involved in

104 response to viral infection (Brown et al., 2001), possibly via their role in apoptosis (Berry et al.,

2013). KLRD1, KLRF1, and KLRG1 are molecules expressed on the NK cell (Li and Mariuzza,

2014) that have the ability to recognize non-self cells that do not express MHC I (Li et al., 2009).

The CLEC gene family are C-type lectin receptors that are found on antigen presenting cells and act to recognize pathogens and promote cell-cell interactions for effective immune responses

(McGreal et al., 2005; Kanazawa, 2007).

VL gene lists also included SLAMF9, JAK3, FCAR, FCRL3, IKBK, PRKRA (PACT),

PECAM1, and the GBP family genes, all annotated with the immune response GO term. The seven genes in the porcine GBP family are in the SSC 4 QTL region, which was previously shown to have an effect on VL (Boddicker et al., 2012). Recently, the putative causative mutation for the SSC4 QTL was identified in the GBP5 gene (Koltes et al., 2015). IKBK is a kinase that is part of the pathway responsible for initiation of the NF-κB signaling cascade, which serves as one of the host’s first responses after viral invasion (Amaya et al., 2014). IKBK was also shown to be differentially expressed in response to infection with the NVSL PRRSV isolate (Schroyen et al., 2015). PACT recognizes RNA viruses and acts with RIG-I to stimulate

IFN production during an immune response against a virus (Kok et al., 2011; Kok and Jin, 2013).

SLAMF9 is a member of the Signaling Lymphocyte Activation Molecule receptor family, which are NK cell receptors that affect T cell responses to viral infection (Waggoner and Kumar, 2012).

SLAMF9 was shown to be differentially expressed in response to PRRSV infection in an independent dataset (Badaoui et al., 2013). PECAM1 mediates adhesion of leukocytes to other cells and permits movement of leukocytes through endothelial walls (Proust et al., 2014).

105

Weight Gain

GWAS identified regions on 7_113, 10_52, 13_59, and 13_214 were found to be associated with WG in the KS06 data; 1_15, 5_68, 5_71, 7_35, 7_39, 10_46, and 17_22 were associated with WG in the NVSL data. Production related QTL, such as average daily again and weight gain, were found in 24 of the top 25 regions associated with WG in either PRRSV isolate.

The MX1 gene is located in the 13_214 region and has been shown to be associated with the initial defense mechanisms employed by macrophages upon PRRSV infection (Chung et al.,

2004). Furthermore, variations in the MX1 promoter were associated with increased MX1 gene expression, which could be expected to result in heightened immune response to PRRSV and possibly more resistant pigs (Y. Li et al., 2015). The 1_15 region contains the VIP (vasoactive intestinal peptide) gene. The VIP pathway was shown to be associated with obesity in humans

(Liu et al., 2010). GLO1, located in 7_39, was associated with body weight in mice (Wuschke et al., 2007). The 17_22 region contains an interesting candidate gene, MKKS, as MKKS null mice were shown to weigh more and consume a greater amount of food compared to wild-type littermates (Fath et al., 2005). In addition, certain MKKS haplotypes were associated with obesity and metabolic syndrome in a Greek human population (Rouskas et al., 2008). QTL for pig weaning weight (Guo et al., 2008) and live weight at slaughter (Pierzchala et al., 2003) have also been identified in this region on SSC17. Nagamine and others (2003) mapped several growth related QTL to SSC7.

Similar to results for VL associations, no genomic regions were identified to be associated with WG for both PRRSV isolates; however, we did not see overlap of overrepresented GO terms across PRRSV isolates for WG. The WG phenotype analyzed in this

106 study is more accurately be described as WG after infection with the PRRSV; therefore, we could expect the severity of PRRSV infection to have a negative relationship with WG. Genes near SNPs associated with WG in the NVSL data were enriched for several metabolic process

BPS, several binding related MFS GO terms, and the CCKR signaling map pathway. Genes in the metabolic process GO term included FOXO3, INPP5F, PRKCQ, LIPC, BMP7, HMGA1, and

NLK. The gene FOXO3 has been shown to be involved in growth retardation of piglets born to sows fed low protein diets during their gestation and lactation (Jia et al., 2015). Different

INPP5F genotypes were associated with ADG in purebred Yorkshire gilts (Zhou et al., 2009). In addition, INPP5F was shown to have differential expression during PRRSV infection in a meta- analysis (Badaoui et al., 2013). PRKCQ has been shown to be associated with obesity in mice, through insulin resistance (Serra et al., 2003; Gao et al., 2007). LIPC is an enzyme that is important in energy homeostasis, and was shown to affect weight gain in mice (Chiu et al.,

2010). BMP7 affects brown adipose tissue development and plays a role in obesity (Tseng et al.,

2008; Townsend et al., 2012). HMGA1 was shown to be associated with body mass index in

Hispanic women (Graff et al., 2013). NLK variants were shown to be associated with fat body mass (Pei et al., 2014). The CCKR (cholecystokinin receptor) pathway is activated in response to food consumption and act to regulate gastrointestinal secretions and motility (Wank, 1995).

Boddicker et al. (2012, 2014a, 2014b) estimated moderately negative phenotypic and genetic correlations between WG and VL in the NVSL data; phenotypic correlation estimates ranged from -0.25 to -0.29 and genotypic correlation estimates ranged from -0.31 to -0.46. We, however, did not identify any associated genomic regions that overlapped between WG and VL for either PRRSV isolate; however, we did find that the NVSL gene lists for WG and VL were both enriched for the binding molecular function GO term. In addition, the KS06 WG gene list

107 was enriched for antigen processing and presentation of peptide or polysaccharide via MHC class II.

Interaction GWAS

Regions associated with VL on 14_16, 7_33, 7_39, 12_32, and 9_36 had the strongest interaction effect with PRRSV isolate; for association with WG, SNPs at 17_22, 10_52, 11_43,

11_52, and 7_113 showed interaction effects with PRRSV isolate. Several of these regions were associated with these respective traits in GWAS results using data from only one PRRSV isolate.

The most notable result from the interaction GWAS was the identification of 13 regions with genome-wide significant associations for the interaction with genotype at the WUR SNP for

VL. Candidate genes in these regions included DEDD, SLAMF6 (NTBA), CD48, CD244 (2B4), and LY9 at SSC 4_96 and LY6H at SSC 4_1. The DEDD (death effector domain containing) gene is involved in apoptosis (Alcivar et al., 2003). SLAMF6 plays a role in the activation of NK and

T cells (Flaig et al., 2004; Valdez et al., 2004). CD244 is a receptor of NK cells whose ligand is

CD48, expressed on T cells; both of these molecules are involved in lymphocyte activity

(Pacheco et al., 2013; Kis-Toth and Tsokos, 2014). LY9 is another member of the SLAM family involved in mediating innate T cell function (Sintes et al., 2013).

We expected to see enrichment of genes in GO terms related to GBP family protein functions in the gene lists created from the interactions of SNP genotypes with genotype at the

WUR SNP, which is in LD with a putative causative mutation in the GBP5 gene (Koltes et al.,

2015). Instead, results showed enrichment for blood coagulation, proteolysis, and several peptidase activity GO terms. The interferon-inducible GBP family genes are annotated instead with cytokine-signaling, interferon signaling, cellular response to interferon gamma, defense

108 response to virus, and metabolic process GO terms. Thus, it is not clear how genes involved in blood coagulation, proteolysis, and peptidase activity functions could have differential effects based on genotype at the WUR SNP. Furthermore, there were no genes in the VL or WG gene list for interaction with genotype at the WUR SNP that were known to have protein interactions with any member of the GBP protein family.

We also assessed the enrichment of WUR interaction gene lists for genes shown to be DE based on genotype at the WUR SNP in RNA sequencing data collected on 18 pigs infected with the NVSL isolate (Koltes et al., 2015). Seventeen and 13 of the 516 genes that were DE by WUR genotype were within 250 kb of SNPs whose association with VL and WG, respectively, had an interaction with genotype at the WUR SNP at –log10(P)>4 (VL) or 2.5 (WG; data not shown).

DE genes by WUR genotype in the RNA sequencing data were not overrepresented in either of our WUR interaction gene lists (P>0.05).

Value of functional annotation of GWAS results

Single SNP analyses revealed 0, 2, 0, and 0 mapped genomic regions that reached genome-wide significance for KS06 VL, NVSL VL, KS06 WG, and NVSL WG, respectively.

Few candidate genes were identified by looking at genes located in regions surrounding the most significantly associated SNPs. Thus, we created gene lists using a relaxed significance threshold for SNPs and assessed the enrichment of genes assigned to GO annotation terms in these lists.

Initially, we analyzed genes within 100, 250, or 500 kb of SNPs associated with the trait at thresholds of P<0.001, 0.003, or 0.01. We looked at the top 10 enriched BPS GO terms for each scenario and chose the middle stringency threshold (P<0.003) and medium sized window (250

109 kb) to use for further analysis based on consistency of enriched terms across combinations of variables and based on the biological relevance of the enriched GO terms.

Several biologically relevant GO terms were shown to be enriched in our lists, which provided a larger set of candidate genes consisting of genes with the given GO term annotation within 250 kb of SNPs associated with the trait at P<0.003. Several of the genes identified through this method were differentially expressed in response to PRRSV infection in other studies. The use of annotation information provided greater insight into genomic regions that may play a role in piglet response to PRRSV. This method was especially useful in interpretation of the complicated immune response related traits analyzed here, for which many genes may have small effects on phenotype, which may be undetectable when considered individually but by grouping the effects of genes that play a role in the same biological function or pathway, larger effects on the phenotype may be detected. These results may be useful for creating subsets of SNPs to be used in genomic prediction or selection, in an effort to reduce noise of unrelated or unassociated SNPs, while keeping SNPs that have small effects on the phenotype.

We also tested the enrichment of our lists for genes that were shown to be differentially expressed (DE) in the blood of pigs selected from three NVSL trials at 4 or 7 dpi compared to 0 dpi by Schroyen et al. (2015). The Pigoligoarray (Steibel et al., 2009) was used to measure gene expression in the blood of 100 pigs that were selected based on their assignment to one of four phenotypic groups after infection with the NVSL isolate (high VL with high WG, high VL with low WG, low VL with high WG, and low VL with low WG). Schroyen et al. (2015) found that

239 and 165 genes were differentially expressed (FDR=0.10) at 4 and 7 dpi, respectively, compared to 0 dpi (Supplemental tables 1 and 2 in Schroyen et al., 2015). We did not find a statistically significant overrepresentation of DE genes in any of our analyzed gene lists, but

110 several of the candidate genes that we identified (Table 4.9) were present in the DE list.

Furthermore, there were genes in our gene lists that were members of gene modules whose overall expression was correlated with either VL or WG, as shown in Table 4.9 (Figure 3 in

Schroyen et al., 2015).

We also compared our gene lists to results from two meta-analyses of pig immune response: 1. Dawson et al. (2013) expanded the pig immunome annotation through analysis of data from 188 Affymetrix 24K microarray chips from several studies of in vivo or in vitro immune stimulation, 2. Badaoui et al. (2013) used 779 chips from 29 studies to identify genes, pathways, and biological functions involved in immune response of pigs. Several genes in the gene lists analyzed in the current study were also in the gene lists from Dawson et al. (2013) or

Badaoui et al. (2013), though not more than expected at P=0.05. The gene lists analyzed in the current study contained 1 (VL) and 10 (WG) genes that were members of a cluster of 511 gene transcripts that were DE in response to immune stimuli and had enriched GO terms related to immune function (Supplemental Table 6 in Dawson et al., 2013). Two (VL) and 2 (WG) and genes in our lists were contained in the 139 DE genes for general immune response, and 1 (VL) and 17 (WG) genes in our lists were contained in the 537 DE genes for response to PRRSV infection (Supplemental file 10, Table S8 in Badaoui et al., 2013).

Conclusions

We found several genomic regions associated with response to experimental infection with one of two isolates of the PRRSV that were consistently identified using both single SNP and Bayesian GWAS methods. We did not find genomic regions that were significantly associated with piglet response for both PRRSV isolates; however, we did find that genes near

111 associated SNPs were enriched for the same GO terms for both PRRSV isolates. This indicates similar biological mechanisms are involved in the response to both PRRSV isolates.

Literature Cited

Alcivar, A., S. Hu, J. Tang, and X. Yang. 2003. DEDD and DEDD2 associate with caspase-8/10 and signal cell death. Oncogene 22:291–297.

Amaya, M., F. Keck, C. Bailey, and a Narayanan. 2014. The role of the IKK complex in viral infections. Pathog. Dis. 72:32–44.

Arceo, M. E., C. W. Ernst, J. K. Lunney, I. Choi, N. E. Raney, T. Huang, C. K. Tuggle, R. R. R. Rowland, and J. P. Steibel. 2012. Characterizing differential individual response to porcine reproductive and respiratory syndrome virus infection through statistical and functional analysis of gene expression. Front. Genet. 3:321.

Badaoui, B., C. K. Tuggle, Z. L. Hu, J. M. Reecy, T. Ait-Ali, A. Anselmo, and S. Botti. 2013. Pig immune response to general stimulus and to porcine reproductive and respiratory syndrome virus infection: a meta-analysis approach. BMC Genomics 14:220.

Berry, R., N. Ng, P. M. Saunders, J. P. Vivian, J. Lin, F. A Deuss, A. J. Corbett, C. A Forbes, J. M. Widjaja, L. C. Sullivan, A. D. McAlister, M. a Perugini, M. J. Call, A. A Scalzo, M. A Degli- Esposti, J. D. Coudert, T. Beddoe, A. G. Brooks, and J. Rossjohn. 2013. Targeting of a natural killer cell receptor family by a viral immunoevasin. Nat. Immunol. 14:699–705.

Boddicker, N. J., A. Bjorkquist, R. R. R. Rowland, J. K. Lunney, J. M. Reecy, and J. C. Dekkers. 2014a. Genome-wide association and genomic prediction for host response to porcine reproductive and respiratory syndrome virus infection. Genet Sel Evol 46:18.

Boddicker, N. J., D. J. Garrick, R. R. R. Rowland, J. K. Lunney, J. M. Reecy, and J. C. M. Dekkers. 2014b. Validation and further characterization of a major quantitative trait locus associated with host response to experimental infection with porcine reproductive and respiratory syndrome virus. Anim. Genet. 45:48–58.

Boddicker, N., E. H. Waide, R. R. R. Rowland, J. K. Lunney, D. J. Garrick, J. M. Reecy, and J. C. M. Dekkers. 2012. Evidence for a major QTL associated with host response to Porcine reproductive and respiratory syndrome virus challenge. J. Anim. Sci. 90:1733–1746.

Brown, M. G., A O. Dokun, J. W. Heusel, H. R. Smith, D. L. Beckman, E. A Blattenberger, C. E. Dubbelde, L. R. Stone, A A Scalzo, and W. M. Yokoyama. 2001. Vital involvement of a natural killer cell activation receptor in resistance to viral infection. Science 292:934–937.

112

Chiu, H. K., K. Qian, K. Ogimoto, G. J. Morton, B. E. Wisse, N. Agrawal, T. O. McDonald, M. W. Schwartz, and H. L. Dichek. 2010. Mice lacking hepatic lipase are lean and protected against diet-induced obesity and hepatic steatosis. Endocrinology 151:993–1001.

Cho, I.-C., H. B. Park, C. K. Yoo, G. J. Lee, H. T. Lim, J. B. Lee, E. J. Jung, M. S. Ko, J. H. Lee, and J. T. Jeon. 2011. QTL analysis of white blood cell, platelet and red blood cell-related traits in an F2 intercross between Landrace and Korean native pigs. Anim. Genet. 42:621–626.

Cho, R. J., and M. J. Campbell. 2000. Transcription, genomes, function. Trends Genet. 16:409– 415.

Chung, H. K., J. H. Lee, S. H. Kim, and C. Chae. 2004. Expression of interferon-?? and Mx1 protein in pigs acutely infected with porcine reproductive and respiratory syndrome virus (PRRSV). J. Comp. Pathol. 130:299–305.

Crawford, G., A. Enders, U. Gileadi, S. Stankovic, Q. Zhang, T. Lambe, T. L. Crockford, H. E. Lockstone, A. Freeman, P. D. Arkwright, J. M. Smart, C. S. Ma, S. G. Tangye, C. C. Goodnow, V. Cerundolo, D. I. Godfrey, H. C. Su, K. L. Randall, and R. J. Cornall. 2013. DOCK8 is critical for the survival and function of NKT cells. Blood 122:2052–2061.

Engelhardt, K. R., S. Mcghee, S. Winkler, A. Sassi, C. Woellner, G. Lopez-herrera, A. Chen, H. S. Kim, M. G. Lloret, I. Schulze, S. Ehl, J. Thiel, D. Pfeifer, H. Veelken, T. Niehues, K. Siepermann, S. Weinspach, I. Reisli, S. Keles, and F. Genel. 2009. Large deletions and point mutations involving DOCK8 in the autosomal recessive form of the hyper-IgE syndrome. J. Allergy 124:1289.

Fath, M. a., R. F. Mullins, C. Searby, D. Y. Nishimura, J. Wei, K. Rahmouni, R. E. Davis, M. K. Tayeh, M. Andrews, B. Yang, C. D. Sigmund, E. M. Stone, and V. C. Sheffield. 2005. Mkks- null mice have a phenotype resembling Bardet - Biedl syndrome. Hum. Mol. Genet. 14:1109– 1118.

Fernando, R. L., and D. J. Garrick. 2008. GenSel-User manual for a portfolio of genomic selection related analyses. Anim. Breed. Genet. Iowa State Univ. Ames:0–24.

Flaig, R. M., S. Stark, and C. Watzl. 2004. Cutting edge: NTB-A activates NK cells via homophilic interaction. J. Immunol. 172:6524–6527.

Fortes, M. R. S., A. Reverter, S. H. Nagaraj, Y. Zhang, N. N. Jonsson, W. Barris, S. Lehnert, G. B. Boe-Hansen, and R. J. Hawken. 2011. A single nucleotide polymorphism-derived regulatory gene network underlying puberty in 2 tropical breeds of beef cattle. J. Anim. Sci. 89:1669–1683.

Fortes, M. R. S., A. Reverter, Y. Zhang, E. Collis, S. H. Nagaraj, N. N. Jonsson, K. C. Prayaga, W. Barris, and R. J. Hawken. 2010. Association weight matrix for the genetic dissection of puberty in beef cattle. Proc. Natl. Acad. Sci. U. S. A. 107:13642–13647.

113

Gao, X., J. Starmer, and E. R. Martin. 2008. A multiple testing correction method for genetic association studies using correlated single nucleotide polymorphisms. Genet. Epidemiol. 32:361– 369.

Gao, Z., Z. Wang, X. Zhang, A. A Butler, A. Zuberi, B. Gawronska-kozak, M. Lefevre, D. York, E. Ravussin, H. Berthoud, O. Mcguinness, W. T. Cefalu, J. Ye, Z. Gao, Z. Wang, X. Zhang, B. Aa, a Zuberi, B. Kozak, M. Lefevre, D. York, E. Ravussin, B. H-r, O. Mcguinness, C. Wt, and Y. J. Inactivation. 2007. Inactivation of PKC theta leads to increased susceptibility to obesity and dietary insulin resistance in mice. Am. J. Physiol. Endocrinol. Metab. 70808:84–91.

Gilmour, A. R., B. J. Gogel, B. R. Cullis, and R. Thompson. 2014. ASReml User Guide.

Graff, M., L. Fernández-Rhodes, S. Liu, C. Carlson, S. Wassertheil-Smoller, M. Neuhouser, a Reiner, C. Kooperberg, E. Rampersaud, J. E. Manson, L. H. Kuller, B. V Howard, H. M. Ochs- Balcom, K. C. Johnson, M. Z. Vitolins, L. Sucheston, K. Monda, and K. E. North. 2013. Generalization of adiposity genetic loci to US Hispanic women. Nutr. Diabetes 3:e85.

Guo, Y. M., G. J. Lee, A. L. Archibald, and C. S. Haley. 2008. Quantitative trait loci for production traits in pigs: A combined analysis of two Meishan x Large White populations. Anim. Genet. 39:486–495.

Habier, D., R. L. Fernando, K. Kizilkaya, and D. J. Garrick. 2011. Extension of the bayesian alphabet for genomic selection. BMC Bioinformatics 12:186.

Hossain, M. N., R. Sakemura, M. Fujii, and D. Ayusawa. 2006. G-protein subunit GNG11 strongly regulates cellular senescence. Biochem. Biophys. Res. Commun. 351:645–650.

Jia, Y., G. Gao, H. Song, D. Cai, X. Yang, and R. Zhao. 2015. Low-protein diet fed to crossbred sows during pregnancy and lactation enhances myostatin gene expression through epigenetic regulation in skeletal muscle of weaning piglets. Eur. J. Nutr.

Kanazawa, N. 2007. Dendritic cell immunoreceptors: C-type lectin receptors for pattern- recognition and signaling on antigen-presenting cells. J. Dermatol. Sci. 45:77–86.

Kis-Toth, K., and G. C. Tsokos. 2014. Engagement of SLAMF2/CD48 Prolongs the Time Frame of Effective T Cell Activation by Supporting Mature Dendritic Cell Survival. J. Immunol. 192:4436–42.

Kok, K. H., and D. Y. Jin. 2013. Balance of power in host-virus arms races. Cell Host Microbe 14:5–6.

Kok, K. H., P. Y. Lui, M. H. J. Ng, K. L. Siu, S. W. N. Au, and D. Y. Jin. 2011. The double- stranded RNA-binding protein PACT functions as a cellular activator of RIG-I to facilitate innate antiviral response. Cell Host Microbe 9:299–309.

114

Koltes, J. E., E. Fritz-Waters, C. J. Eisley, I. Choi, H. Bao, A. Kommadath, N. V. L. Serão, N. J. Boddicker, S. M. Abrams, M. Schroyen, H. Loyd, C. K. Tuggle, G. S. Plastow, L. Guan, P. Stothard, J. K. Lunney, P. Liu, S. Carpenter, R. R. R. Rowland, J. C. M. Dekkers, and J. M. Reecy. 2015. Identification of a putative quantitative trait nucleotide in guanylate binding protein 5 for host response to PRRS virus infection. BMC Genomics 16:1–13.

Li, Y., M. Hofmann, Q. Wang, L. Teng, L. K. Chlewicki, H. Pircher, and R. A. Mariuzza. 2009. Structure of natural killer cell receptor KLRG1 bound to E-cadherin reveals basis for MCH- independent missing self recognition. Immunity 31:35–46.

Li, Y., S. Liang, H. Liu, Y. Sun, L. Kang, and Y. Jiang. 2015. Identification of a short interspersed repetitive element insertion polymorphism in the porcine MX1 promoter associated with resistance to porcine reproductive and respiratory syndrome virus infection. Anim. Genet.:n/a–n/a.

Li, Y., and R. A. Mariuzza. 2014. Structural basis for recognition of cellular and viral ligands by NK cell receptors. Front. Immunol. 5:1–20.

Liu, Y., Y. Guo, L. Zhang, Y. Pei, N. Yu, and P. Yu. 2010. Biological pathway-based genome- wide association analysis identified the vasoactive intestinal peptide (VIP) pathway important for obesity. Obesity 18:2339–2346.

Lunney, J. K., J. P. Steibel, J. M. Reecy, E. Fritz, M. F. Rothschild, M. Kerrigan, B. Trible, and R. R. Rowland. 2011. Probing genetic control of swine responses to PRRSV infection: current progress of the PRRS host genetics consortium. BMC Proc. 5 Suppl 4:S30.

McGreal, E. P., J. L. Miller, and S. Gordon. 2005. Ligand recognition by antigen-presenting cell C-type lectin receptors. Curr. Opin. Immunol. 17:18–24.

Mi, H., A. Muruganujan, and P. D. Thomas. 2013. PANTHER in 2013: Modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees. Nucleic Acids Res. 41:377–386.

Murtaugh, M. P., T. Stadejek, J. E. Abrahante, T. T. Y. Lam, and F. C. C. Leung. 2010. The ever-expanding diversity of porcine reproductive and respiratory syndrome virus. Virus Res. 154:18–30.

Nagamine, Y., C. S. Haley, A. Sewalem, and P. M. Visschert. 2003. Quantitative trait loci variation for growth and obesity between and within lines of pigs (Sus scrofa). Genetics 164:629–635.

Okamura, T., W. Onodera, T. Tayama, H. Kadowaki, C. Kojima-Shibata, E. Suzuki, Y. Uemoto, S. Mikawa, T. Hayashi, T. Awata, N. Fujishima-Kanaya, a. Mikawa, H. Uenishi, and K. Suzuki. 2012. A genome-wide scan for quantitative trait loci affecting respiratory disease and immune capacity in Landrace pigs. Anim. Genet. 43:721–729.

115

Osorio, F. A, J. A Galeota, E. Nelson, B. Brodersen, A Doster, R. Wills, F. Zuckermann, and W. W. Laegreid. 2002. Passive transfer of virus-specific antibodies confers protection against reproductive failure induced by a virulent strain of porcine reproductive and respiratory syndrome virus and establishes sterilizing immunity. Virology 302:9–20.

Pacheco, Y., A. P. McLean, J. Rohrbach, F. Porichis, D. E. Kaufmann, and D. G. Kavanagh. 2013. Simultaneous TCR and CD244 signals induce dynamic downmodulation of CD244 on human antiviral T cells. J. Immunol. 191:2072–81.

Pei, Y. F., L. Zhang, Yongjun Liu, J. Li, H. Shen, Y. Z. Liu, Q. Tian, H. He, S. Wu, S. Ran, Y. Han, R. Hai, Y. Lin, J. Zhu, X. Z. Zhu, C. J. Papasian, and H. W. Deng. 2014. Meta-analysis of genome-wide association data identifies novel susceptibility loci for obesity. Hum. Mol. Genet. 23:820–830.

Pierzchala, M., D. Cieslak, G. Reiner, H. Bartenschlager, G. Moser, and H. Geldermann. 2003. Linkage and QTL mapping for Sus scrofa chromosome 17. J. Anim. Breed. Genet. 120:132–137.

Proust, R., C. Crouin, L. Y. Gandji, J. Bertoglio, and F. Gesbert. 2014. The adaptor protein SAP directly associates with PECAM-1 and regulates PECAM-1-mediated-cell adhesion in T-like cell lines. Mol. Immunol. 58:206–213.

Ramos, A. M., R. P. M. A. Crooijmans, N. A. Affara, A. J. Amaral, L. Alan, J. E. Beever, C. Bendixen, C. Churcher, R. Clark, P. Dehais, M. S. Hansen, J. Hedegaard, Z. Hu, H. H. Kerstens, A. S. Law, D. Milan, D. J. Nonneman, G. A. Rohrer, M. F. Rothschild, and T. P. L. Smith. 2009. Design of a High Density SNP Genotyping Assay in the Pig Using SNPs Identified and Characterized by Next Generation Sequencing Technology. 4:e6524.

Rouskas, K., K. Paletas, a Kalogeridis, M. Sarigianni, E. Ioannidou-Papagiannaki, a Tsapas, and a Kouvatsi. 2008. Association between BBS6/MKKS gene polymorphisms, obesity and metabolic syndrome in the Greek population. Int. J. Obes. (Lond). 32:1618–1625.

Schroyen, M., J. P. Steibel, J. E. Koltes, I. Choi, N. E. Raney, C. Eisley, E. Fritz-Waters, J. M. Reecy, J. C. M. Dekkers, R. R. R. Rowland, J. K. Lunney, C. W. Ernst, and C. K. Tuggle. 2015. Whole blood microarray analysis of pigs showing extreme phenotypes after a porcine reproductive and respiratory syndrome virus infection. BMC Genomics 16:516.

Serão, N. V., D. González-Peña, J. E. Beever, D. B. Faulkner, B. R. Southey, and S. L. Rodriguez-Zas. 2013. Single nucleotide polymorphisms and haplotypes associated with feed efficiency in beef cattle. BMC Genet. 14:94.

Serra, C., M. Federici, a. Buongiorno, M. I. Senni, S. Morelli, E. Segratella, M. Pascuccio, C. Tiveron, E. Mattei, L. Tatangelo, R. Lauro, M. Molinaro, a. Giaccari, and M. Bouché. 2003. Transgenic mice with dominant negative PKC-theta in skeletal muscle: A new model of insulin resistance and obesity. J. Cell. Physiol. 196:89–97.

116

Sham, P. C., and S. M. Purcell. 2014. Statistical power and significance testing in large-scale genetic studies. Nat. Rev. Genet. 15:335–46.

Shi, M., T. T. Y. Lam, C. C. Hon, R. K. H. Hui, K. S. Faaberg, T. Wennblom, M. P. Murtaugh, T. Stadejek, and F. C. C. Leung. 2010. Molecular epidemiology of PRRSV: A phylogenetic perspective. Virus Res. 154:7–17.

Sintes, J., M. Cuenca, X. Romero, R. Bastos, C. Terhorst, A. Angulo, and P. Engel. 2013. Cutting edge: Ly9 (CD229), a SLAM family receptor, negatively regulates the development of thymic innate memory-like CD8+ T and invariant NKT cells. J. Immunol. 190:21–26.

Stadejek, T., A. Stankevicius, M. P. Murtaugh, and M. B. Oleksiewicz. 2013. Molecular evolution of PRRSV in Europe: Current state of play. Vet. Microbiol. 165:21–28.

Steibel, J. P., M. Wysocki, J. K. Lunney, a. M. Ramos, Z. L. Hu, M. F. Rothschild, and C. W. Ernst. 2009. Assessment of the swine protein-annotated oligonucleotide microarray. Anim. Genet. 40:883–893.

Tanaka, N., K. Kaneko, H. Asao, H. Kasai, Y. Endo, T. Fujita, T. Takeshita, and K. Sugamura. 1999. Possible involvement of a novel STAM-associated molecule “AMSH” in intracellular signal transduction mediated by cytokines. J. Biol. Chem. 274:19129–19135.

Townsend, K. L., R. Suzuki, T. L. Huang, E. Jing, T. J. Schulz, K. Lee, C. M. Taniguchi, D. O. Espinoza, L. E. McDougall, H. Zhang, T.-C. He, E. Kokkotou, and Y.-H. Tseng. 2012. Bone morphogenetic protein 7 (BMP7) reverses obesity and regulates appetite through a central mTOR pathway. FASEB J. 26:2187–2196.

Tseng, Y.-H., E. Kokkotou, T. J. Schulz, T. L. Huang, J. N. Winnay, C. M. Taniguchi, T. T. Tran, R. Suzuki, D. O. Espinoza, Y. Yamamoto, M. J. Ahrens, A. T. Dudley, A. W. Norris, R. N. Kulkarni, and C. R. Kahn. 2008. New role of bone morphogenetic protein 7 in brown adipogenesis and energy expenditure. Nature 454:1000–1004.

Uddin, M. J., M. U. Cinar, C. Große-Brinkhaus, D. Tesfaye, E. Tholen, H. Juengst, C. Looft, K. Wimmers, C. Phatsara, and K. Schellander. 2011. Mapping quantitative trait loci for innate immune response in the pig. Int. J. Immunogenet. 38:121–131.

Uddin, M. J., C. Grosse-Brinkhaus, M. U. Cinar, E. Jonas, D. Tesfaye, E. Tholen, H. Juengst, C. Looft, S. Ponsuksili, K. Wimmers, C. Phatsara, and K. Schellander. 2010. Mapping of quantitative trait loci for mycoplasma and tetanus antibodies and interferon-gamma in a porcine F2 Duroc × Pietrain resource population. Mamm. Genome 21:409–418.

Valdez, P. A., H. Wang, D. Seshasayee, M. Van Lookeren Campagne, A. Gurney, W. P. Lee, and I. S. Grewal. 2004. NTB-A, a New Activating Receptor in T Cells That Regulates Autoimmune Disease. J. Biol. Chem. 279:18662–18669.

117

Waggoner, S. N., and V. Kumar. 2012. Evolving role of 2B4/CD244 in T and NK cell responses during virus infection. Front. Immunol. 3:377.

Wank, S. A. 1995. Cholecystokinin receptors. Am J Physiol 269:G628–G646.

Wattrang, E., M. Almqvist, A. Johansson, C. Fossum, P. Wallgren, G. Pielberg, L. Andersson, and I. Edfors-Lilja. 2005. Confirmation of QTL on porcine chromosomes 1 and 8 influencing leukocyte numbers, haematological parameters and leukocyte function. Anim. Genet. 36:337– 345.

Wesley, R. D., K. M. Lager, and M. E. Kehrli. 2006. Infection with Porcine reproductive and respiratory syndrome virus stimulates an early gamma interferon response in the serum of pigs. Can. J. Vet. Res. 70:176–182.

Wimmers, K., E. Murani, K. Schellander, and S. Ponsuksili. 2009. QTL for traits related to humoral immune response estimated from data of a porcine F2 resource population. Int. J. Immunogenet. 36:141–151.

Wuschke, S., S. Dahm, C. Schmidt, H.-G. Joost, and H. Al-Hasani. 2007. A meta-analysis of quantitative trait loci associated with body weight and adiposity in mice. Int. J. Obes. (Lond). 31:829–841.

Zeng, J., A. Toosi, R. L. Fernando, J. C. M. Dekkers, and D. J. Garrick. 2013. Genomic selection of purebred animals for crossbred performance in the presence of dominant gene action. Genet. Sel. Evol. 45(1):11.

Zhou, Q. Y., J. N. Huang, M. J. Zhu, and S. H. Zhao. 2009. Molecular characterization and association analysis with production traits of the porcine INPP5F Gene. Mol. Biol. Rep. 36:1095–1098.

118

Figures

Figure 4.1 Genome wide association results for VL using single SNP and Bayes-B analyses. (A) Bayes-B results showing the percent of genetic variance explained by 1-Mb non-overlapping windows of SNPs across chromosomes for KS06 (top) and NVSL (bottom). (B) Single SNP analysis results showing the -log10(p-value) of ordered SNPs across chromosomes and for unmapped (U) SNPs for KS06 (top) and NVSL (bottom). Dashed lines indicate genome wide significance at a Bonferroni corrected p-value of 0.05.

119

Figure 4.2 Genome wide association results for WG using single SNP and Bayes-B analyses. (A) Bayes-B results showing the percent of genetic variance explained by 1-Mb non-overlapping windows of SNPs across chromosomes for KS06 (top) and NVSL (bottom). (B) Single SNP analysis results showing the -log10(p-value) of ordered SNPs across chromosomes and for unmapped (U) SNPs for KS06 (top) and NVSL (bottom). Dashed lines indicate genome wide significance at a Bonferroni corrected p-value of 0.05.

120

Figure 4.3 Manhattan plots of –log10(P) of the main (top) and interaction (bottom) of SNP genotype with PRRSV isolate for VL (A) and WG (B) across chromosomes and unmapped (U) SNPs. Dashed lines indicate genome wide significance at a Bonferroni corrected P of 0.05.

121

Figure 4.4 Manhattan plots of –log10(P) of the main (top) and interaction (bottom) of SNP genotype with genotype at the WUR SNP for VL (A) and WG (B) across chromosomes and unmapped (U) SNPs. Dashed lines indicate genome wide significance at a Bonferroni corrected P of 0.05.

122

Supplemental Figure 4.1 Comparison plot of the largest –log10(P) in each 1-Mb window based on the single-SNP analysis for KS06 Viral Load against the percent of genetic variance explained by that window from Bayes-B.

123

2

1 GenSel Percent GenSelof Percent Genetic Variance 0 0 2 4 6 Largest log p value − 10( − )

Supplemental Figure 4.2 Comparison plot of the largest –log10(P) in each 1-Mb window based on the single-SNP analysis for NVSL Viral Load against the percent of genetic variance explained by that window from Bayes-B.

124

2

1 GenSel Percent GenSelof Percent Genetic Variance 0 0 2 4 6 Largest log p value − 10( − )

Supplemental Figure 4.3 Comparison plot of the largest –log10(P) in each 1-Mb window based on the single-SNP analysis for KS06 Weight Gain against the percent of genetic variance explained by that window from Bayes-B.

125

2

1 GenSel Percent GenSelof Percent Genetic Variance 0 0 2 4 6 Largest log p value − 10( − )

Supplemental Figure 4.4 Comparison plot of the largest –log10(P) in each 1-Mb window based on the single-SNP analysis for NVSL Weight Gain against the percent of genetic variance explained by that window from Bayes-B.

126

Tables

127

Litter 0.06(0.05) 0.06(0.05) 0.04(0.03) 0.04(0.03)

Weight GainWeight Heritability 0.24(0.12) 0.25(0.12) 0.28(0.08) 0.30(0.08)

Litter 0.0(0.0) 0.0(0.0) 0.18(0.04) 0.19(0.04)

Viral Viral Load Heritability 0.38(0.09) 0.40(0.09) 0.33(0.1) 0.34(0.1)

uded V V isolates. WUR genotype WUR genotype or included in fromexcluded model KS06 Included Excluded NVSL Incl Excluded Table 4.2 Estimates (SE) of heritability and litter effects for andfor (VL) littereffects andTable Viral and KS064.2 Estimates (WG) heritability for Gain Load Weight (SE) of NVSL PRRS

128

Table 4.3 Top 1-Megabase (Mb) genomic regions associated with Viral Load in the KS06 and NVSL trials. Single SNP GWAS Bayes-B GWAS Chr. Pos. Largest Rank Genetic Rank Candidate QTL (Mb) –log10(P) Variance(%) Genes KS06 3 57 5.3 1 1.2 1 WBC counts1 7 30 5.3 1 0.32 3 Lymphocyte number2 7 35 5.0 2 0.82 2 Lymphocyte number2 3 53 5.0 2 0.03 25 WBC counts1 14 17 5.0 3 0.09 19 C3c concentration3 11 28 5.0 3 0.03 25 IFN-γ level, IL-10 level, IL-2 level, TLR 2 level, TLR 9 level4 12 26 4.9 4 0.13 15 IL-10 level TLR 2 level TLR 9 level4 15 84 4.7 5 0.06 22 CD4+/CD8+ leukocyte ratio2 9 80 4.7 5 0.02 26 GNG11 Immunoglobulin G1 14 16 4.3 8 0.31 4 C3c concentration3 12 32 3.9 15 0.24 5 MMD IL-10 level TLR 2 level TLR 9 level4 7 32 3.5 36 0.24 5 Lymphocyte number2 NVSL 6 79 7.6 1 1.0 3 IL-2 level4 6 80 6.8 2 1.0 2 IL-2 level4 6 67 6.8 2 0.3 34 IL-2 level4 1 247 6.6 3 0.2 35 DOCK8 WBC counts6 3 68 5.4 4 0.3 34 C3c concentration3 IFN-γ level4 IL-10 level4 3 128 4.7 5 0.16 21 1 292 1.9 428 1.76 1 C3c concentration3 PRRSV susceptibility7 3 72 3.1 32 0.65 4 STAMBP C3c concentration3 IFN-γ level4 IL-10 level4 15 7 2.3 200 0.6 5

1 Okamura et al., 2012; 2 Reiner et al., 2008; 3 Wimmers et al., 2009; 4 Uddin et al., 2011; 5 Lu et al., 2011; 6 Wattrang et al., 2005; Cho et al., 2011; 7 Boddicker et al., 2011

129

Table 4.4 Top 1-Megabase (Mb) genomic regions associated with Weight Gain in the KS06 and NVSL trials. Single SNP Bayes-B GWAS GWAS Chr. Pos. Largest – Rank Genetic Rank Candidate QTL (Mb) log10(P) Variance (%) Genes KS06 13 59 4.82 1 0.06 22 ADG1,2 10 52 4.82 1 0.4 3 ADG2,3 Body weight4,5,6 FCR7 4 6 4.80 2 0.09 19 ADG2,4,8 Body weight4,5,9,10 9 90 4.80 2 0.03 25 ADG11 Body weight6 15 140 4.61 3 0.04 24 ADG4,12,13 12 46 4.61 3 0.11 17 Body weight4,14 ADG2 4 5 4.43 4 0.24 6 ADG2,4,8 Body weight4,5,9 FCR7 9 88 4.43 4 0.03 25 ADG11 Body weight6 7 2 4.41 5 0.04 24 ADG8 8 26 4.41 5 0.16 12 ADG2,15,16 Body weight6,14,15 7 113 1.77 572 0.86 1 ADG8 Body weight19 13 214 3.9 14 0.43 2 MX1 Body weight21 13 6 2.43 150 0.28 4 ADG1,3 11 23 3.41 28 0.25 5 NVSL 17 22 5.19 1 0.8 4 MKKS Body weight19,25 7 39 4.89 2 0.17 20 GLO1 ADG4,8,12,16,17,20 Body weight4,6,17,19,20,21 1 15 4.89 2 0.04 33 VIP Body weight15,25 ADG2,3 9 90 4.72 3 0.33 10 Body weight6 9 73 4.72 3 0.04 33 ADG11 9 85 4.56 4 0.13 18 9 84 4.56 4 0.19 24 9 145 4.56 4 0.04 33 ADG27 Body weight14,27 10 46 4.56 4 0.03 34 ZEB1 ADG1,2,3 Body weight4,5,6 FCR7

130

Table 4.4 Continued

5 71 4.48 5 0.55 5 ADIPOR2 Body weight14 ADG2 1 22 4.48 5 0.05 32 ADG2,3,12 Body weight14,19 5 68 2.28 273 1.19 1 ADG2 Body weight14,26 7 35 3.69 21 1.18 2 ADG4,8,12,16,20 Body weight4,6,17,19,20 3 138 4.16 9 0.84 3 Body weight19 PRRSV susceptibility28

1Knott et al., 1998; 2de Koning et al., 2001; 3Liu et al, 2007; 4Liu et al., 2008; 5Edwards et al., 2008; 6Ai et al, 2012; 7Jiao et al, 2014; 8Paszek et al., 1999; 10Choi et al., 2011; 11Duthie et al., 2008; 12Ruckert et al., 2010; 13Soma et al., 2011; 14Yoo et al., 2014; 15Beeckmann et al., 2003; 16Evans et al., 2003; 17Bidanel et al., 2001; 18Quintanilla et al., 2002; 19Guo et al., 2008; 20Sanchez et al., 2006; 21Yue et al., 2001; 22Munoz et al., 2009; 23Harmengnies et al., 2006; 24Stratil et al., 2006; 25Pierzchala et al., 2003; 26Schneider et al., 2012; 27Cepica et al., 2003; 28Boddicker et al., 2014b

131

Table 4.5 GWAS results for the interaction of SNP genotype and PRRSV isolate or genotype at the WUR SNP. Pos. Largest GWAS Interaction Chr. Candidate Genes (Mb) –log10(P) Rank Viral Load 14 16 4.93 1 7 33 4.82 2 PRRSV 7 39 4.70 3 isolate 12 32 4.58 4 9 36 4.21 5 4 126 8.29 1 4 131 7.87 2 14 125 6.93 3 1 78 6.57 4 14 137 6.06 5 X 11 6.05 6 WUR 4 132 5.95 7 genotype 4 96 5.93 8 DEDD, SLAMF6, CD48, CD244, LY9 1 171 5.81 9 4 97 5.78 10 4 1 5.76 11 LY6H 9 138 5.71 12 1 156 5.69 13 Weight Gain 17 22 5.80 1 MKKS 10 53 4.93 2 PRRSV 11 43 4.71 3 isolate 11 52 4.65 4 7 113 4.38 5 13 199 4.88 1 X 98 4.48 2 WUR 18 42 4.25 3 genotype 6 28 4.25 3 1 35 4.09 4 13 19 3.98 5

132

Table 4.6 Information for gene lists created from genes located within 100, 250, or 100 kb of SNPs associated with the trait at three –log10(P) thresholds (2, 2.5 and 3) for Viral Load and Weight Gain for the KS06 and NVLS isolates.

n pig Ensembl IDs n SNPs with – n non-overlapping PRRSV -log (P) (n Ensembl IDs with 10 log (P) > regions Isolate threshold 10 GO annotation) threshold 500 250 100 500 250 100 Viral Load 1576 844 401 3 215 157 166 187 (1284) (706) (332) 3304 937 443 KS06 2.5 456 305 347 393 (2723) (781) (369) 6445 3828 1184 2 1053 567 686 849 (5305) (3170) (1557) 1014 556 263 3 115 88 95 102 (835) (461) (218) 2418 1390 660 NVSL 2.5 303 208 239 267 (1991) (1140) (553) 5347 3166 1542 2 810 467 566 667 (4441) (2645) (1323) Weight Gain 931 514 254 3 116 93 98 106 (762) (423) (210) 2048 1108 510 KS06 2.5 245 193 208 223 (1686) (916) (430) 5269 3024 1451 2 720 476 554 625 (4306) (2466) (1194) 1173 592 287 3 169 118 127 137 (942) (486) (238) 2812 1510 722 NVSL 2.5 378 259 282 311 (2291) (1254) (613) 6075 3555 1745 2 958 516 639 770 (4908) (2895) (1439)

133

Table 4.7 Most Enriched GO Terms for the gene lists for Viral Load for the KS06 and NVSL isolates.

Category GO Term n Genes in Fold P-value Bonferroni List Enrichment P-value KS06 Biological natural killer cell 19 4.9 2.79 x 10-8 6.28 x 10-6 Process activation Slim immune response 37 1.97 1.11 x 10-4 0.025 B cell mediated 15 2.64 7.91 x 10-4 0.18 immunity Molecular binding 244 1.25 5.22 x 10-5 9.08 x 10-3 Function Slim NVSL Biological regulation of 13 4.92 4.1 x 10-6 9.23 x 10-4 Process vasoconstriction Slim immune response 51 1.86 2.99 x 10-5 6.73 x 10-3 metabolic process 484 1.15 6.2 x 10-5 0.014 lysosomal transport 11 4.27 7.85 x 10-5 0.0177 primary metabolic 407 process 1.17 9.57 x 10-5 0.0215 endocytosis 35 1.93 2.44 x 10-4 0.0549 receptor-mediated 23 endocytosis 2.27 3.26 x 10-4 0.0734 biological regulation 246 1.23 3.43 x 10-4 0.0772 B cell mediated 20 immunity 2.41 3.78 x 10-4 0.0851 natural killer cell 15 activation 2.65 7.68 x 10-4 0.173 Molecular serine-type 20 Function endopeptidase Slim inhibitor activity 4.89 1.30 x 10-8 2.26 x 10-6 binding 356 1.25 1.36 x 10-6 2.37 x 10-4 peptidase inhibitor 28 activity 2.73 3.11 x 10-6 5.41 x 10-4 enzyme inhibitor 35 activity 2.24 1.37 x 10-5 2.38 x 10-3 receptor binding 74 1.53 2.79 x 10-4 4.85 x 10-2

134

Table 4.8 Most Enriched GO Terms for the gene lists for Weight Gain for the KS06 and NVSL isolates.

Category GO Term Fold n Genes in P-value Bonferroni Enrichment List P-value KS06 Biological antigen processing 4.95 7 6.25 x 10-4 0.147 Process and presentation of Slim peptide or polysaccharide antigen via MHC class II NVSL Biological nucleobase- 1.25 238 1.38 x 10-4 0.031 Process containing compound Slim metabolic process metabolic process 1.11 514 1.5 x 10-3 0.34 Molecular nucleic acid binding 1.28 200 1.6 x 10-4 0.028 Function DNA binding 1.35 130 3.52 x 10-4 0.061 Slim binding 1.16 365 5.67 x 10-4 0.099 Panther CCKR signaling map 2.29 19 9.52 x 10-4 0.15 Pathway

135

Table 4.9 Selected candidate genes identified by GO term enrichment.

Candidate Gene Evidence Reference KS06 Viral Load MYO5A, CFI, FCRL1, Members of gene modules Schroyen et al. (2015) FCRL4, SLA-DQA1 correlated with VL TXNDC9, IKBKB Differentially expressed in blood Schroyen et al. (2015) SLAMF9 PRRS specific response Badaoui et al. (2013) GBP1 General immune response Badaoui et al. (2013) NVSL Viral Load CCRL1, CLEC1A, HP, Members of gene modules Schroyen et al. (2015) SEMA6C, SEZ6L, correlated with VL TMEM176A, TMEM176B, CCRL1, CLEC1A RHOC Dawson et al. (2013) HSPB1 General immune response Badaoui et al. (2013) KS06 Weight Gain CDIPT, SLAMF9, NMT1 PRRS specific response Badaoui et al. (2013) NMT1, TAP1 Dawson et al. (2013) SBF1, TAP1 Differentially expressed Schroyen et al. (2015) NVSL Weight Gain SSU72, TOP2A General immune response Badaoui et al. (2013) TOP2A, CDA, CDK6, EPS15, PRRS Specific response Schroyen et al. (2015) FKBP3, IER3, INPP5F, NFIL3, PSMD12, RGS10, RHOQ, TAB2, TCEB3, WBSCR22 ALPK2, CEBPB, CPSF3, Differentially expressed Schroyen et al. (2015) NEK9, TFAP2A, TXNDC9, DDX6 EIF5B, UBE2V1, CITED2, Members of modules correlated Schroyen et al. (2015) LAPTM4A, LCN2, PLA2G4A with weight gain

AIFM1, ATP11C, CNOT7, Dawson et al. (2013) FUNDC1, HIST1H2BH, MOBKL3, MTRF1L, PLA2G6

136

Table 4.10 Most enriched GO terms for the interaction with PRRSV isolate for Viral Load and Weight Gain.

Category GO Term Fold n Genes in P-value Bonferroni Enrichment List P-value Viral Load Biological nucleobase- Process containing compound -6 -4 Slim metabolic process 1.32 260 1.47 x 10 3.31 x 10 primary metabolic -5 -3 process 1.18 468 1.15 x 10 2.59 x 10 -5 -3 metabolic process 1.15 552 1.82 x 10 4.10 x 10 transcription from RNA polymerase II -5 -2 promoter 1.42 129 5.31 x 10 1.19 x 10 transcription, DNA- -4 -2 dependent 1.36 139 1.83 x 10 4.12 x 10 regulation of transcription from RNA polymerase II -4 -2 promoter 1.43 98 3.13 x 10 7.04 x 10 Molecular nucleic acid binding 1.31 212 2.7 x 10-5 4.63 x 10-3 Function nucleotide kinase Slim activity 3.94 11 1.57 x 10-4 0.027 sequence-specific DNA binding transcription factor activity 1.41 107 2.81 x 10-4 0.05 Weight Gain -6 -4 Biological antigen processing > 5 11 2.36 x 10 5.31 x 10 Process and presentation -5 -3 Slim pattern specification 3.02 19 2.94 x 10 6.62 x 10 process -5 response to > 5 8 7.24 x 10 0.016 interferon-gamma -4 antigen processing > 5 7 1.41 x 10 0.032 and presentation of peptide or polysaccharide antigen via MHC class II -4 skeletal system 2.63 18 2.52 x 10 0.057 development -4 DNA binding 1.35 130 3.52 x 10 0.061 -4 binding 1.16 365 5.67 x 10 0.099

137

Table 4.11 Most enriched GO terms for the interaction with genotype at the WUR SNP for Viral Load (VL).

Category GO Term Fold n Genes P-value Bonferroni Enrichment in List P-value Viral Load -6 -3 Biological blood coagulation 3.42 18 9.75 x 10 2.19 x 10 Process -4 Slim proteolysis 1.73 45 3.45 x 10 0.078 Molecular serine-type 3.17 x 10-9 Function endopeptidase Slim inhibitor activity > 5 18 5.52 x 10-7 peptidase inhibitor activity 3.45 26 9.34 x 10-8 1.63 x 10-5 amylase activity > 5 6 2.53 x 10-6 4.4 x 10-4 hydrolase activity, hydrolyzing O- glycosyl compounds > 5 9 1.46 x 10-5 2.54 x 10-3 enzyme inhibitor activity 2.45 28 2.15 x 10-5 3.74 x 10-3 peptidase activity 1.9 41 9.9 x 10-5 0.017

138

Supplemental Table 4.1 Number of independent principal components for Sus scrofa chromosomes 1 through 18, X, and unknown (U) from Principal Components Analysis.

Chro- Dataset # SNPs mosome KS06 NVSL Combined 1 6031 1990 1990 1986 2 3201 1129 1258 1317 3 2808 1120 1241 1283 4 3360 1269 1317 1382 5 2331 1028 1101 1125 6 3481 1275 1330 1397 7 3161 1295 1356 1430 8 2715 1058 1096 1217 9 3056 1212 1299 1365 10 1725 846 887 976 11 1851 815 835 934 12 1578 748 801 875 13 3908 1230 1327 1316 14 3683 1191 1279 1284 15 2770 1074 1163 1204 16 1811 760 801 889 17 1647 725 778 858 18 1269 577 627 604 X 1146 499 514 577 U 854 651 658 689 Total 52386 20,492 21,658 22,708

139

Supplemental Table 4.2 Top enriched Biological Process Slim GO terms for KS06 Viral Load. Green colored cells contain GO terms considered to be biologically relevant to VL after PRRSV infection. Bolded GO terms were found to be enriched in genes differentially expressed after PRRSV infection (Schroyen et al., 2015). Underlined GO terms were enriched in genes differentially expressed in porcine alveolar macrophages after co-infection with PRRSV and Mycoplasma hyopneumoniae (Li et al, 2015).

Genes within given kb of SNPs Genes within given kb of SNPs with P Genes within given kb of SNPs with P < 0.001 < 0.003 with P < 0.01 500 kb 250 kb 100 kb 500 kb 250 kb 100 kb 500 kb 250 kb 100 kb segment segment NK cell antigen antigen cell-cell NK cell NK cell NK cell specificati specificati activatio proc and proc and signaling activation activation activation on on n pres pres ectoderm B cell immune NK cell NK cell immune immune NK cell cation developme mediated respons activation activation response response activation transport nt immunity e response cellular aa B cell ion primary cell-cell to immune antigen proc ion metabolic mediated transpor metabolic signaling interferon- response and pres transport process immunity t process gamma reg of B cell antigen trans macropha cellular aa antigen proc mediate proc and metabolic antigen proc immune from RNA ge catabolic and pres via d pres via process and pres response pol II activation process MHC class II immunit MHC promoter y class II mammary reg of trans pattern induction immune cation NK cell gland from RNA metabolic cell-cell specificati of system transpor activation developme pol II process signaling on process apoptosis process t nt promoter regulation embryo ectoderm female antigen proc cellular aa antigen segment of immune developme developme gamete and pres via metabolic proc and specification molecular response nt nt generation MHC class II process pres function trans RNA metabol reg of cellular immune immune macrophage from RNA spermatogene metabolic ic catalytic defense response response activation pol II sis process process activity response promoter reg of trans trans muscle primary pattern cellular cellular aa from RNA from organ metabol ion NK cell specificati defense metabolic pol II RNA pol developme ic transport activation on process response process promoter II nt process promoter reg of reg of digestive digestive phospha skeletal segment protein trans lipid carbohydr tract tract te system specificati phosphorylati from RNA metabolic ate mesoderm mesoderm metabol developme on on pol II process transport dev dev ic nt promoter process digestive transcripti transcripti female immune B cell synaptic cell-cell tract on, DNA- ion transport on, DNA- gamete system mediated transmissi signaling mesoderm dependent dependent generation process immunity on dev

140

Supplemental Table 4.3 Top enriched Biological Process Slim GO terms for NVSL VL. Green colored cells contain GO terms considered to be biologically relevant to VL after PRRSV infection. Bolded GO terms were found to be enriched in genes differentially expressed after PRRSV infection (Schroyen et al., 2015). Underlined GO terms were enriched in genes differentially expressed in porcine alveolar macrophages after co-infection with PRRSV and Mycoplasma hyopneumoniae (Li et al, 2015). Genes within given kb of SNPs Genes within given kb of SNPs Genes within given kb of SNPs with P < 0.001 with P < 0.003 with P < 0.01 500 kb 250 kb 100 kb 500 kb 250 kb 100 kb 500 kb 250 kb 100 kb reg of seq- specific reg of reg of reg of reg of reg of NK cell DNA metabolic metabolic molecular vasoconstric vasoconstric vasoconstric vasoconstric activation binding process process function tion tion tion tion trans factor activity B cell response to reg of primary primary NK cell metabolic immune immune mediated external vasoconstric metabolic metabolic activation process response response immunity stimulus tion process process reg of response to reg of primary reg of immune metabolic lysosomal metabolic molecular external molecular metabolic vasoconstric response process transport process function stimulus function process tion digestive response to receptor- cellular aa blood lysosomal tract biological immune external proteolysis mediated metabolic coagulation transport mesoderm regulation response stimulus endocytosis process devt neuron- neuron- pyrimidine reg of reg of primary B cell primary neuron neuron nucleobase lysosomal catalytic catalytic metabolic mediated metabolic synaptic synaptic metabolic transport activity activity process immunity process transmissio transmissio process n n reg of seq- reg of seq- specific specific respiratory receptor- protein DNA DNA lysosomal Endocy- electron Proteo-lysis mediated cell killing metabolic binding binding transport tosis transport endocytosis process trans factor trans factor chain activity activity pyrimidine generation primary protein receptor- primary B cell reg of metabolic nucleobase of precursor metabolic metabolic mediated metabolic mediated biological process metabolic metabolites process process endocytosis process immunity process process and energy reg of trans respiratory protein primary cellular aa immune NK cell biological metabolic from RNA electron metabolic metabolic metabolic response activation regulation process pol II transport process process process promoter chain neuron- neuron- pyrimidine pyrimidine B cell regulation B cell neuron neuron nucleobase angiogenesi nucleobase mediated of catalytic endocytosis mediated synaptic synaptic metabolic s metabolic immunity activity immunity transmissio transmissio process process n n nitrogen digestive cellular aa cellular aa mRNA 3'- compound immune catabolic tract NK cell endocytosis biosynthetic metabolic end metabolic response process mesoderm activation process process processing process dev

141

Supplemental Table 4.4 Top enriched Biological Process Slim GO terms for KS06 Weight Gain. Green colored cells contain GO terms considered to be biologically relevant to WG after PRRSV infection. Bolded GO terms were found to be enriched in genes differentially expressed in blood after PRRSV infection (Schroyen et al., 2015). Underlined GO terms contain genes shown to be involved in feed efficiency traits in beef cattle (Oliveria et al., 2014). Genes within given kb of SNPs Genes within given kb of SNPs Genes within given kb of SNPs with P < 0.001 with P < 0.003 with P < 0.01 500 kb 250 kb 100 kb 500 kb 250 kb 100 kb 500 kb 250 kb 100 kb antigen antigen cellular anterior/pos digestive antigen proc and proc and antigen antigen component terior axis nuclear tract proc and pres via pres via proc and proc and morphogen specificatio transport mesoderm pres MHC class MHC class pres pres esis n dev II II antigen antigen anatomical skeletal protein proc and antigen antigen proc and pattern structure RNA system complex pres via proc and proc and pres via specificatio morphogen localization developme assembly MHC class pres pres MHC class n process esis nt II II antigen anterior/pos anatomical regulation protein cellular proc and pattern terior axis structure of nuclear complex defense cell killing pres via specificatio specificatio morphogen vasoconstri transport biogenesis response MHC class n process n esis ction II cellular cellular anterior/pos anterior/pos female segment protein protein protein terior axis mitochondri terior axis fertilization gamete specificatio modificatio modificatio lipidation specificatio al transport specificatio generation n n process n process n n regulation anterior/pos protein protein mitochond cellular protein protein of terior axis glycosylatio complex rial defense complex cell killing lipidation vasoconstri specificatio n assembly transport response assembly ction n neuron- digestive mitochond protein phospholipi mitochond protein neuron female reproductio tract rial metabolic d metabolic rial complex synaptic gamete n mesoderm transport process process transport biogenesis transmissio generation dev n skeletal protein sensory phospholipi cellular pattern antigen extracellula metabolic system complex perception d metabolic defense specificatio proc and r transport process developmen biogenesis of smell process response n process pres t mitochondri mammary anatomical mitochondri embryo ectoderm visual extracellula on gland structure on protein developmen developme perception r transport organizatio developmen morphogen organizatio lipidation t nt n t esis n neuron- purine cellular neuron B cell nucleobase component protein peroxisoma nitrogen angiogenesi synaptic mediated cell killing metabolic morphogen folding l transport utilization s transmissio immunity process esis n cellular pyrimidine skeletal muscle primary secondary segment component reproductio amino acid nucleobase system organ metabolic metabolic specificatio organizatio n transport metabolic developmen developme process process n n process t nt

142

Supplemental Table 4.5 Top enriched Biological Process Slim GO terms for NVSL Weight Gain. Green colored cells contain GO terms considered to be biologically relevant to WG after PRRSV infection. Bolded GO terms were found to be enriched in genes differentially expressed in blood after PRRSV infection (Schroyen et al., 2015). Underlined GO terms contain genes shown to be involved in feed efficiency traits in beef cattle (Oliveria et al., 2014). Genes within given kb of SNPs Genes within given kb of SNPs Genes within given kb of SNPs with P < 0.001 with P < 0.003 with P < 0.01 500 kb 250 kb 100 kb 500 kb 250 kb 100 kb 500 kb 250 kb 100 kb nb- cellular protein extracellular response to metabolic containing catabolic metabolic localization component phosphoryla transport stimulus process compound process process organization tion met process cellular chromatin defense defense metabolic metabolic component transport localization locomotion organizatio response to response to process process org or n bacterium bacterium biogenesis nb- cellular cellular response to primary extracellular ion chromatin chromatin containing component protein toxic metabolic transport transport assembly assembly compound morphogene modificatio substance process met process sis n process anatomical nb- chromatin nitrogen pyrimidine ion protein structure containing transport cell killing organizatio compound nb met transport transport morphogene compound n met process process sis met process nb- intracellular primary response to cation containing biological extracellular metabolic behavior protein metabolic biotic transport compound regulation transport process transport process stimulus met process reg of nb- cell DNA primary primary anion anion metabolic containing communicat localization recombinati metabolic metabolic transport transport process compound ion on process process met process nb- primary regulation cytoskeleto amino acid anion extracellular containing cellular behavior metabolic of catalytic n transport transport transport compound process process activity organization met process immune pyrimidine DNA DNA cellular protein chromatin cation chromatin system nb met recombinati recombinati component metabolic assembly transport assembly process process on on organization process neuron- reg of nb- neuron cell nitrogen reg of defense transcriptio regulation protein containing synaptic communicat compound molecular response to n, DNA- of catalytic transport compound transmissio ion met process function bacterium dependent activity met process n reg of trans reg of polysacchar nitrogen protein reg of cell-cell cell-cell response to from RNA biological ide met compound phosphoryla molecular signaling signaling stress pol II process process met process tion function promoter

143

Supplemental Table 4.6 Most Enriched GO Terms in Panther Complete Categories for Viral Load.

PRRSV Category GO Term Fold n P-value Bonfer- Isolate Enrich- Genes roni ment in List P-value NVSL Biological adenylate cyclase- Process activating adrenergic Complete receptor signaling -7 -3 pathway >5 9 6.83 x 10 7.96 x 10 adrenergic receptor -7 -3 signaling pathway >5 10 8.5 x 10 9.91 x 10 negative regulation of -6 -2 peptidase activity 2.94 27 1.22 x 10 1.42 x 10 negative regulation of -6 -2 proteolysis 2.64 32 1.32 x 10 1.54 x 10 negative regulation of cellular protein -6 -2 metabolic process 1.94 57 2.87 x 10 3.35 x 10 negative regulation of protein metabolic -6 -2 process 1.88 59 5.28 x 10 6.15 x 10 Molecular peptidase inhibitor 26 -8 -4 Function activity 3.59 4.60 x 10 1.48 x 10 Complete enzyme inhibitor 34 -8 -4 activity 2.89 7.92 x 10 2.55 x 10 peptidase regulator 27 -7 -4 activity 3.2 2.43 x 10 7.82 x 10 endopeptidase 22 -6 -3 inhibitor activity 3.33 1.65 x 10 5.31 x 10 endopeptidase 22 regulator activity 3.24 2.57 x 10-6 8.27 x 10-3 serine-type 15 endopeptidase inhibitor activity 4.11 6.56 x 10-6 0.0211 molecular function 71 regulator 1.66 3.35 x 10-5 0.108

144

Supplemental Table 4.7 Most Enriched GO Terms in Panther Complete Categories for Weight Gain.

PRRSV Category GO Term Fold n P-value Bonferroni Isolate Enrichment Genes P-value in List KS06 Biological negative regulation Process of retinoic acid Complete receptor signaling pathway > 5 8 8.11E-07 9.45E-03 regulation of retinoic acid receptor signaling pathway > 5 8 2.16E-06 2.52E-02 NVSL Molecular protein 1.34 x -4 -4 Function heterodimerization 2.5 42 10 4.31 x 10 Complete DNA binding 2.02 x -6 3 1.52 131 10 6.5 x 10 nucleic acid binding 2.38 x -5 -2 1.29 236 10 7.66 x 10

145

CHAPTER 5: GENOMIC PREDICTION OF PIGLET RESPONSE TO ONE OF TWO

ISOLATES OF THE PORCINE REPRODUCTIVE AND RESPIRATORY SYNDROME

VIRUS1

Prepared for publication in Genetics, Selection, and Evolution

Emily H Waide2,3, Christopher K Tuggle2, and Jack CM Dekkers2,4

Abstract

Background : Genomic prediction of the pig’s response to the Porcine Reproductive and

Respiratory Syndrome (PRRS) virus (PRRSV) would be a useful tool in the swine industry.

Increasing accuracy of genomic prediction through addition of biological information of SNPs is gaining interest in the genetics community. This study investigated the accuracy of genomic prediction using training and validation datasets from populations with different genetic backgrounds that were challenged with different PRRSV isolates, as well as the accuracy of these predictions using annotation-based subsets of SNPs.

1 This project was supported by the USDA NIFA PRRS CAP Award 2008-55620-19132, the National Pork Board, and the NRSP-8 Swine Genome and Bioinformatics Coordination projects, and the PRRS Host Genetics Consortium consisting of USDA ARS, Kansas State Univ., Iowa State Univ., Michigan State Univ., Washington State Univ., Purdue Univ., University of Nebraska-Lincoln, PIC/Genus, Newsham Choice Genetics, Fast Genetics, Genetiporc, Inc., Genesus, Inc., PigGen Canada, Inc., IDEXX Laboratories, and Tetracore, Inc. Nick VL Serão, Martine Schroyen, Andrew Hess, Raymond RR Rowland, Joan K Lunney, Graham Plastow made significant contributions to the work presented in this chapter and are expected to be listed as co-authors when this manuscript has their approval. 2 Department of Animal Science, Iowa State University, Ames, Iowa 3 Current address: The Seeing Eye, Inc., Morristown, New Jersey 4 Correspondence: 239D Kildee Hall (phone: 515-294-7509, email: [email protected])

146

Results : Genomic prediction accuracy averaged 0.36 for viral load (VL) and 0.23 for weight gain (WG) following experimental PRRSV challenge, which demonstrates that genomic selection could be used to improve response to PRRSV infection. Training on weight gain (WG) data during infection with a less virulent PRRSV, KS06, resulted in poor accuracy of prediction for WG during infection with a more virulent PRRSV, NVSL. Inclusion of SNPs that are in linkage disequilibrium with a major QTL on chromosome 4 was vital for accurate prediction of viral load (VL). Overall, neither SNPs that were significantly associated with either trait in single

SNP genome-wide association analysis, nor SNPs near genes enriched for biologically relevant

GO terms were able to predict the phenotypes with as high an accuracy as using SNPs across the whole genome. Inclusion of data from related animals into the training population increased whole genome prediction accuracy by 33% for VL and by 37% for WG. Accuracy of prediction using only the SNPs in the major QTL was not changed with inclusion of related animals into the training population.

Conclusions : Results show that genomic prediction of response to PRRSV infection is moderately accurate and, when using all SNPs in the genome, is not extremely sensitive to differences in virulence of the PRRSV in training and validation populations. Including related animals into the training population increases prediction of accuracy when using the whole genome or SNPs other than those in a major QTL.

Introduction

Improving phenotypic performance of livestock is the overall goal of animal breeding programs. For some economically important traits, phenotypes on selection candidates or close relatives are difficult to obtain due to high cost of phenotyping, strict biosecurity measures, or

147 age at which phenotypes are measurable. Genomic prediction provides an attractive alternative to select for these traits. Porcine Reproductive and Respiratory Syndrome (PRRS) is an economically devastating disease in the swine industry caused by a rapidly mutating virus

(PRRSV; Fang et al., 2007). Genomic prediction for response to the Porcine Reproductive and

Respiratory Syndrome (PPRS) virus (PRRSV) in pigs would be highly valuable to the swine industry, as most selection takes place in high health nucleus farms that are unlikely to face

PRRSV outbreaks. Conducting experimental PRRSV infection trials requires the use of strictly regulated biocontainment facilities and is very expensive and labor intensive. Therefore, the ability to combine data from pigs with different genetic backgrounds which were infected with different PRRSV isolates to use as a training population for genomic prediction of response of unrelated piglets to other PRRSV isolates would be very beneficial.

Genomic prediction involves the use of genotypes at single nucleotide polymorphisms

(SNPs) across the genome to predict phenotypes that have not been observed in the selection candidates (Meuwissen et al., 2001). SNP chips contain thousands to hundreds of thousands of genetic markers that cover the whole genome of the species (Ramos et al., 2009). Recently, several studies (Morota et al., 2014; Do et al., 2015) have investigated the effect of addition of annotation information on the accuracy of genomic prediction by grouping SNPs based on their genic location - intronic, exonic, intergenic, etc. We aimed to expand on this concept by using the functional annotation information of genes near SNPs to create biological SNP subsets, which has been explored in few studies to date (Snelling et al., 2013; Maltecca, 2014).

The data used in this study were from the PRRS Host Genetics Consortium (PHGC,

Lunney et al., 2011) and included phenotypes on weight gain (WG) and viral load (VL) from 9 trials of ~200 piglets that were infected with the NVSL 97-7985 (NVSL) PRRSV isolate (Osorio

148 et al., 2002) and from 4 similarly sized trials of piglets infected with the KS2006-72109 (KS06) isolate. Using data from the first 8 NVSL trials, Boddicker et al. (Boddicker et al., 2012;

Boddicker et al., 2014a; Boddicker et al., 2014b) discovered and validated a QTL on Sus scrofa chromosome (SSC) 4 for VL and WG and showed that a single SNP in this region,

WUR10000125 (WUR), explained most of the genetic variance of the QTL. Furthermore,

Boddicker et al. (2014a) showed that accuracy of genomic prediction across breeds within the

NVSL data was maximized when only the SNPs within the 1 Mb window containing the WUR

SNP were used. Our study expands these questions to prediction across PRRSV isolates.

Genotype at the WUR SNP was shown to be associated with VL in both PRRSV isolates and with WG in the NVSL isolate (Hess et al., 2015). Genetic correlations for VL and WG between the two isolates were both estimated to be 0.86, which indicates that accurate genomic prediction across isolates is possible. Also, using field data, which likely contained multiple PRRSV isolates or strains of infection, Serão et al. (2014) showed that genomic prediction of PRRSV antibody response was moderately accurate. These findings are important to the swine industry, as PRRSV is a rapidly mutating virus (Fang et al., 2007; Kimman et al., 2009) and different strains are infecting industry populations and are likely to evolve from one outbreak to the next.

Except for the SSC 4 QTL, which was associated with VL in both isolates and with WG in the NVSL isolate, genome-wide association studies (GWAS) in the data used in this study did not identify genomic regions that overlapped between the two PRRSV isolate datasets (Chapter

4), despite the high genetic correlations that have been estimated for traits between isolates (Hess et al., 2015). However, although the most strongly associated regions were inconsistent between

PRRSV isolates, we found that genes near these SNPs were enriched for several of the same GO terms. Thus, we hypothesized that selection of SNPs near genes involved in biological processes

149 relevant to the phenotype could improve the accuracy of genomic prediction by reducing the number of uninformative SNPs used for prediction (Chapter 4).

Against this background, the objectives of this study were to assess the accuracy of genomic prediction for VL and WG in the following scenarios: 1) across PRRSV isolates and genetic sources, 2) using all SNPs, all SNPs other than those in the SSC 4 QTL region, or only

SNPs in the 1 megabase (Mb) SSC 4 QTL region, and 3) using smaller subsets of SNPs chosen based on GWAS analyses and GO annotation.

Materials and Methods

All experimental protocols used in this study were approved by the Kansas State

University (KSU) Animal Care and Use Committee.

Study design and animal populations

Lunney et. al (2011) provided a detailed description of the study design employed in the

PHGC trials used in this study. Briefly, a total of 13 trials of approximately 200 commercial crossbred piglets each were sent to Kansas State University at weaning, given one week to acclimate, then inoculated intramuscularly and intranasally with 105 tissue culture infectious dose of either the NVSL 97-7985 (NVSL) (Osorio et al., 2002) or KS2006-72109 (KS06)

PRRSV isolate. Blood samples were collected at 0, 4, 7, 11, and 14 days post-infection (dpi) and then weekly until termination of the trial at 42 dpi. Individual weights were observed weekly throughout the trial. At 42 dpi, piglets were euthanized and ear tissue was collected for genomic

DNA extraction, which was sent to GeneSeek, Inc. (Lincoln, Nebraska, USA) and genotyped using the Illumina Porcine SNP60 Beadchip (Ramos et al., 2009). Quality control of genotype data has been previously described (Waide et al., 2015). Briefly, after filtering genotypes with

150

GenCall Scores lower than 0.5, minor allele frequencies less than 0.01, and genotyping call rates less than 0.80, 52,386 SNP remained, with an overall genotype rate of 99.2%.

In total, data on 2,288 commercial crossbred piglets from 8 genetic backgrounds were used. Piglets averaged 26.6 (±2.4) days of age and weighed 7.17 (±1.4) kg at the time of infection. Of this total, 1,557 piglets were infected with NVSL, with an average weight of 7.34

(±1.4) kg and 26.6 (±2.6) days of age at inoculation. The remaining 731 piglets averaged 26.7

(±1.9) days of age and weighed 6.80 (±1.29) kg on average at the time of inoculation with KS06.

A more detailed description of the populations used in each trial is in Waide et al. (2015). Trial 9 involved pigs from the ISU RFI selection lines (Cai et al. 2008) and were excluded from these analyses. Trial 13 was also excluded from these analyses because piglets from this trial had much lower and more variable viremia profiles compared to the other KS06 trials. Relationships between piglets used in each trial were investigated using principal component analysis (PCA) of genotypes for all SNPs using the R function prcomp (R Core Team, 2015).

Phenotypes

The two phenotypes analyzed in this study were described by Boddicker et al. (2012).

Briefly, the amount of PRRSV RNA in blood samples was estimated using quantitative PCR and reported as the log10 of PRRSV RNA copies relative to a standard curve. Viral load (VL) was calculated as the area under the curve of viremia up to and including 21 dpi. Weight gain (WG) was calculated as the difference between body weight at 42 and 0 dpi.

Genomic prediction analyses

Bayesian variable selection, Bayes-B, was used to estimate additive and dominance effects for 52,386 SNPs for genomic prediction using the following equation:

151

! !

� = �� + ����!�!" + ����!�!" + � !!! !!! where y = vector of phenotypic observations, X = incidence matrix relating phenotypes to fixed effects, b = vector of fixed effects of sex, the interactions of pen and parity with trial, genotype at the WUR SNP, and covariates of initial age and weight, zai = vector of the additive genotype covariates coded as -10, 0, and 10 for the AA, AB, and BB genotypes, respectively, for SNP i, ai

= additive effect for SNP i, δai = indicator for whether the additive effect of SNP i was included

(δai = 1) or excluded from (δai = 0) the model for a given iteration of the Markov Chain Monte

Carlo (MCMC) chain, zdi = vector of the dominance genotype covariates coded as 10, 0, and 10 for the AA, AB, and BB genotypes, respectively, for SNP i, di = dominance effect for SNP i, δdi

= indicator for whether the dominance effect of SNP i was included (δdi = 1) or excluded from

(δdi = 0) the model for a given iteration of the MCMC chain, and e = vector of residual errors.

The prior probability that a given SNP was excluded from the model (δai = 0 and δdi = 0) was set equal to π = 0.99, fitting approximately 1% or 524 additive effects and 524 dominance effects in each of 51,000 iterations of the MCMC chain, with the first 1,000 iterations designated as burn in. We also performed these analyses with addition of genotype at the WUR SNP as a fixed effect and SNPs within 2.5 Mb on either side of the WUR SNP excluded from the analysis.

Using the estimates obtained from the Bayes-B analysis, the genomic estimated genotypic value (GEGV) was obtained for each animal in the validation population using the following equation:

! !

����! = �����! + �����! !!! !!!

152

where GEGVj = the genotypic value for individual j, zaij and zdij are as described above, �i = estimate of the additive effect for SNP i, and �i = dominance effect for SNP i. To estimate the prediction accuracy of the SSC 4 QTL alone, we used the additive and dominance effects from the whole genome Bayes-B analysis of the training population for the 39 SNPs in the 1 Mb window that contains the WUR SNP to estimate GEGVs in the validation population. To estimate the predictive accuracy of the rest of the genome aside from the SSC 4 QTL (Genome-

WUR), genotype at the WUR SNP was fitted as a fixed effect in the model and all SNPs within

2.5 Mb on either side of the WUR SNP were excluded from analysis.

Training and validation datasets. Figure 5.2 shows examples of how the data were split into training and validation groups for the 5 main genomic prediction scenarios described in the following, followed by an example of each scenario. 1) Genomic prediction across PRRSV isolate (black arrows in Figure 5.1A): e.g., all data from the NVSL isolate used for training and all data from the KS06 isolate used for validation (NTàKV). 2) Genomic prediction across breeding company with both PRRSV isolates in training (red arrow in Figure 5.1A): e.g., trials 4-

15 from the NVSL isolate and trials 10, 12, and 14 from the KS06 isolate used for training and trials 1-3 from the NVSL isolate used for validation (NKTàNV). 3) Genomic prediction across breeding company within PRRSV isolate (small blue arrow in Figure 5.1B): e.g., trials 11, 12, and 14 from the KS06 isolate used for training and trial 10 from the KS06 isolate used for validation (KTàKV). 4) Genomic prediction across breeding company and PRRSV isolate (large blue arrow in Figure 5.1B): e.g., trials 11, 12, and 14 from the KS06 isolate used for training and trial 15 from the NVSL isolate used for validation (KTàNV). 5) Including breeding company across PRRSV isolate (purple arrow in Figure 5.1B): e.g., trials 1-8 and 15 from the NVSL isolate used for training and trial 14 from the KS06 isolate used for validation (NTàKV).

153

Accuracy of genomic prediction was calculated as the correlation between the GEGV and phenotypes adjusted for the fixed effects included in the genomic prediction model, divided by the square root of the heritability of the trait for the PRRSV isolate of infection in the validation population, as estimated by Hess et al. (2014). For example, when we trained on VL from one or more NVSL infection trials and validated on VL from one or more KS06 infection trials, genomic prediction accuracy was calculated as the correlation between the GEGV and adjusted phenotypes for VL in the KS06 trial(s) divided by the square root of heritability for VL in all of the KS06 data. Phenotypes were adjusted for estimates of fixed effects (sex and interactions of pen and parity with trial; genotype at the WUR SNP was included as a fixed effect in Genome-

WUR analyses only) within the validation population.

SNP subset selection

Genomic prediction between PRRSV isolates was also performed using subsets of SNPs selected based on results of GWAS and functional annotation information. To identify subsets of SNPs selected to have an effect on phenotype, single SNP GWAS were performed using ASReml 4

(Gilmour et al., 2014) with the following linear mixed model:

� = �� + �!�� + �� + �� + � + �

where y, X, and b are as described above, Zi = matrix of genotypes coded as 0, 1, and 2 for the

AA, AB, and BB genotypes, respectively, for SNP i, gi = vector of genotype class effects for

SNP i, W = incidence matrix associating phenotypes with the interaction of pen and trial, p = vector of random interaction effects of pen and trial, S = incidence matrix associating phenotypes with litter effects, l = vector of random effects of litter, u = vector of random animal polygenic effects with variance-covariance matrix based on the pedigree relationship matrix, and

154 e = vector of residual errors. A two generation pedigree of 4807 animals was used. Single SNP analyses were performed on data from each PRRSV isolate separately.

SNPs with P<0.003 within PRRSV isolate, as determined by single SNP analyses described above, were used as the largest SNP subset of potentially informative SNPs for genomic prediction. In order to add functional annotation information to this subset selection, genes within 250 kb of SNPs associated with the trait at P<0.003 were compiled into gene lists to perform gene ontology (GO) term enrichment analyses using the Panther software (Mi et al.,

2013). We then created subsets of the list of SNPs with P<0.003 consisting of SNPs that were within 250 kb of genes in biologically relevant enriched GO terms; for example, SNPs with

P<0.003 that were within 250 kb of genes in the natural killer cell activation GO term were grouped into a SNP subset. We added the WUR SNP to each of these lists.We also performed genomic prediction using all SNPs within 250 kb of all genes in biologically relevant enriched

GO terms, regardless of the strength of association.

In order to evaluate the accuracy of genomic prediction using these subsets, we estimated the effects of the SNPs in these lists using Bayes-A (i.e. Bayes-B with π=0). For these analyses, we trained on the PRRSV isolate with which the SNPs were associated in single SNP analysis and validated on the data from pigs infected with the other PRRSV isolate.

Results

The data used in this study consisted of 2,288 pigs from 8 breeding companies that were infected with one of two PRRSV isolates. Information on each trial is in Table 5.1. Principal components analysis (PCA) of SNP genotypes from all animals used in this study showed that pigs from each breeding company clustered together (Figure 5.1). The first principal component

155

(PC) explained 7.2% of the variance in genotypes and PC2 explained 6.4%. When we plotted

PC1 against PC2, pigs generally clustered by breeding company. PC1 distinguished pigs in breeding company 1 from pigs in breeding company 7, and each were separated from pigs in the other breeding companies. PC2 separated pigs based on their actual breed make-up, with the progeny of Duroc sires bred to white crossbred dams clustered together and those without Duroc sires clustered together. This breed separation agrees with previous reports that show clustering of Large White (LW), Landrace (LR), and Pietrain pigs together and separate from Duroc pigs based on genomic data (Ramírez et al., 2015).

Genomic prediction of Viral Load (VL)

Prediction across PRRSV isolates. Although genomic regions associated with VL were not consistent across PRRSV isolate (Chapter 4), except for the SSC 4 QTL, we found that genomic prediction across PRRSV isolate (black arrows in Figure 5.2A) was moderately accurate. Accuracy of whole genome prediction training on the NVSL data and validating on the

KS06 data (NTàKV) was 0.34, while training on the KS06 data predicted genomic estimated genotypic values (GEGVs) in the NVSL data (KTàNV) with an accuracy of 0.44 (Figure 5.3A).

When we accounted for the WUR SNP in the analysis (Genome-WUR) – in other words, when we fitted genotype at the WUR SNP as a fixed effect and removed SNPs within 2.5 Mb on either side of the WUR SNP from the prediction – these accuracies were reduced to 0.12 and 0.13, respectively (Figure 5.3A). Accuracy of prediction using only the 39 SNPs in the 1 Mb window containing the WUR SNP (WUR only) was less than that of the whole genome, but greater than the accuracy of Genome-WUR (Figure 5.3A). For the whole genome and WUR-only predictions,

KTàNV was more accurate than NTàKV, but accuracy was equivalent between KTàNV and

NTàKV when using Genome-WUR.

156

Effect of genetic relationships between training and validation. In order to determine the contribution of genetic relationships between animals in the training and validation datasets, the accuracy of genomic prediction across PRRSV isolates when validating on data from one breeding company was compared using two training scenarios: 1) excluding the validation breeding company from training (large blue arrow in Figure 5.2B) and 2) including the validation breeding company in training (purple arrow in Figure 5.2B). These analyses were only conducted for breeding companies that had data for both PRRSV isolates (5 trials of NVSL data and 3 trials of KS06 data; Figure 5.2). Including the validation breeding company in the training population increased accuracy of prediction most for the Genome-WUR prediction (Figure 5.3B; trial specific genomic prediction accuracies are in Figure 5.4). The average prediction accuracy for Genome-WUR when the validation breeding company was excluded from training was 0.01; this prediction accuracy increased to an average of 0.17 when the validation breeding company was included in training (Figure 5.3B). This could indicate that the Genome-WUR predictions are primarily based on genetic relationships instead of biological effects. The accuracy of genome-WUR prediction for one breeding company in the NVSL data (indicated by triangles in

Figure 5.3B) was very low even with inclusion of related animals in the training population. If this accuracy is ignored, the average accuracy of KTàNV genome-WUR prediction increased from 0.19 to 0.26 when related animals are included in training. For WUR only prediction, inclusion of the validation breeding company increased accuracy by 48% for trial 15 but did not have an effect on prediction accuracy for other trials (Figure 5.4C). The favorable allele at the

WUR SNP had a larger effect on VL for the NVSL isolate than for the KS06 isolate, and also has a larger effect on VL in trial 10 than in the other KS06 trials.

157

Prediction across breeding companies with both PRRSV isolates in training. Since the PRRSV mutates rapidly (Fang et al., 2007), it would be valuable to be able to use data from response to one PRRSV isolate to predict the ability of pigs to respond to another PRRSV isolate. Furthermore, collection of response to PRRSV infection is expensive and difficult. Thus, it would be beneficial to use all available data for genomic prediction, resulting in a training dataset consisting of pigs infected with one of several different PRRSV isolates. To assess the effect of PRRSV isolates in the training dataset, we compared genomic prediction across breeding company within PRRSV isolate (small blue arrow in Figure 5.2B), across PRRSV isolates (large blue arrow in Figure 5.2B), and using data from both PRRSV isolate infections in training (red arrow in Figure 5.2A). Results showed that whole genome prediction was moderately accurate within isolate, across isolate, and with both isolates in training (Figure

5.3C). Training on VL in the KS06 data gave the only positive accuracy for KTàNV and

KTàKV Genome-WUR prediction. On average, whole genome and WUR-only prediction were not sensitive to the PRRSV isolate used in training. The trial with zero WUR-only prediction in within isolate, across isolate, and both isolates in training scenarios (indicated by a gray diamond) was trial 8, which has been shown to have very low predictive accuracy using on the

SSC 4 QTL (Boddicker et al., 2014b) and genotype at the WUR SNP was shown to not have an effect on VL in that trial (Hess et al., 2015).

Prediction using SNP subsets. Results of single SNP GWAS were used to perform gene ontology (GO) annotation enrichment analyses, as previously described (Waide et al., 2015). The largest SNP subsets that were used for genomic prediction contained SNPs with associations of

P<0.003 within trait and PRRSV isolate. SNP effects were estimated using data from the PRRSV isolate with which they were associated in single SNP analysis using Bayes-A. The resulting

158

SNP effect estimates were then used to predict GEGVs of pigs infected with the other PRRSV isolate. There were 930 and 616 SNPs associated with VL at P<0.003 in the KS06 and NVSL data, respectively. When we used only these SNPs for genomic prediction of VL across PRRSV isolates, accuracy decreased to 0.18 (from 0.44) for KTàNV and to 0.27 (from 0.34) for NTàKV

(Figure 5.4). Genes near each of these SNP subsets were shown to be enriched for several immune function related GO terms (Chapter 4). However,when we used only the 108 (KS06) and 166 (NVSL) SNPs near genes in these immune related GO terms, prediction accuracy was

0.19 for KTàNV and 0.24 for NTàKV (Figure 5.4).

Genomic prediction of Weight Gain (WG)

Prediction across PRRSV isolates. Accuracies for genomic prediction of WG across

PRRSV isolates are in Figure 5.5A. Whole genome or Genome-WUR prediction was most accurate for NT_KV ; however, for KT_NV, WUR only prediction was over twice as accurate as whole genome and Genome-WUR prediction (Figure 5.6A). The SSC 4 QTL was shown to have a large effect on WG in the NVSL data (Boddicker et al., 2014a) but was not significantly associated with WG in the KS06 data (Hess et al., 2015). Bayes-B results showed that the 1 Mb window containing the WUR SNP explained 10.4% of genetic variance for WG in the NVSL data, but only 0.05% in the KS06 data (data not shown). GWAS of WG in the KS06 data detected no QTL with large effects but many very small effects spread out across the genome

(Chapter 4). Furthermore, genes near SNPs moderately associated with WG in the KS06 were not enriched for metabolic GO terms, unlike those for WG in the NVSL data. The NVSL isolate is more virulent than the KS06 isolate; piglets infected with the KS06 isolate had lower viral load and higher weight gain than pigs infected with the NVSL isolate (Chapter 4). Additional KS06 infection trials may be needed to accurately estimate SNP effects for WG.

159

Effect of genetic relationships between training and validation. Including the validation breeding company in training had no effect on the average accuracy of genomic prediction of WG for either PRRSV isolate or for any SNP subset, but these accuracies had large standard deviations (Figure 5.5B). Figure 5.5B shows the trial specific accuracies that were averaged to get each of these estimates. When training on WG in the KS06 trials and predicting trials 1-3 or trial 5 of the NVSL data, whole genome and Genome-WUR prediction yielded negative accuracies but the accuracy of WUR only prediction was positive.Whole genome or

Genome-WUR prediction of WG in trial 15 of the NVSL data was moderately accurate, 0.28 and

0.31, respectively, but WUR only prediction accuracy was low, 0.11 (Figure 5.5B). When SNP effects were trained on WG in the NVSL data, whole genome prediction most accurately predicted GEGV for WG in trials 10 and 11 of the KS06 data, whereas Genome-WUR most accurately predicted GEGV for WG in trial 14 of the KS06 data (Figure 5.5B). Across breed prediction of WG for KT_NV using the whole genome ranged from -0.39 to 0.46. Trials with low or negative accuracies using the whole genome tended to have higher accuracies when using the

WUR region for prediction, while trials with high whole genome accuracy (Figure 5.5B) had lower WUR-only prediction accuracies (Figure 5.5B).

Prediction across breeding companies with both PRRSV isolates in training. Figure

5.5C shows the average accuracy for WG of whole genome, Genome-WUR, and WUR-only prediction within isolate, across isolates, and when training on both isolates. Including WG from both isolates in training increased the accuracy of whole genome and Genome-WUR prediction of WG in the KS06 data compared to within isolate prediction. Furthermore, although accuracies had large standard deviations, the average accuracy of prediction for WG in the KS06 data was more accurate when NVSL data were used for training compared to within-isolate prediction.

160

WUR-only prediction in the KS06 data had zero accuracy regardless of the PRRSV isolate used for training. This could be an effect of the size of the training population for within-isolate prediction of KS06. There also was no predictive ability of the whole genome or Genome-WUR for NVSL data when SNP effects were trained on WG in the KS06 isolate.

Prediction using SNP subsets. Prediction accuracy of WG for KTàNV increased from

0.10 when using the whole genome to 0.14 when using only the 496 SNPs with P<0.003 and decreased slightly to 0.08 when using the 134 SNPs near genes in metabolism-related GO terms

(Figure 5.6). For NTàKV, the accuracy of prediction decreased from 0.34 when using the whole genome to 0.24 when using the 770 SNPs with P<0.003. Interestingly, when the 484 SNPs near genes in biologically relevant GO terms were used for prediction, accuracy increased slightly to

0.29. Prediction accuracies for WG were lower for each scenario involving KTàNV, which may be caused by the decreased virulence of the KS06 PRRSV isolate, which had a diminished effect on WG, when compared to infection with the NVSL isolate.

Discussion

The results presented in this study show that even without overlap of genomic regions identified in GWAS, we can still obtain moderately accurate genomic predictions across two isolates of the PRRSV. Genomic prediction using the whole genome was more accurate that

Genome-WUR or WUR-only prediction for all scenarios in both VL and WG, except for excluding related animals in training in KTàNV for WG which was most accurate using WUR- only. For VL, the most accurate genomic prediction scenario involved using the whole genome to predict across PRRSV isolates when the validation breeding company was included in the

161 training data. For WG, the most accurate genomic prediction scenario involved whole genome prediction with both PRRSV isolates included in the training data.

Overall, genomic prediction of WG was less accurate than genomic prediction of VL.

This may be in part due to the differences in the effect of the SSC 4 QTL on WG in these two

PRRSV isolates. Furthermore, weight gain or a related trait (such as feed efficiency or average daily gain) is a common trait selected upon in the commercial pig industry (Rauw et al., 1998), whereas VL after PRRSV infection is not selected upon. Although the WG measured in this study was in piglets infected with PRRSV, as opposed to healthy piglets, prediction of the trait may be affected by the selection in the industry.

Genomic prediction across breeding companies

Prediction within PRRSV isolate. Whole genome prediction of VL across breeding companies within PRRSV isolate was more accurate than WUR-only prediction, and WUR-only was more accurate than Genome-WUR prediction. There was very little to no predictive ability of Genome-WUR for VL within isolate, indicating that the effects of the SSC 4 QTL contributed most of the prediction accuracy in this scenario. Genome-WUR accuracy was moderately negative for NVSL trial 6, which may be due to the fact that the pigs in this trial were purebred

Landrace while pigs in other trials were crossbred. Accuracy of within isolate prediction for VL in NVSL trial 8 was negative for all three SNP sets, which is not explained by PCA or the average VL in this trial. The negative accuracy of WUR-only prediction for trial 8 can be explained by the lack of effects of the WUR SNP on VL in this trial (Boddicker et al., 2014b).

Accuracy of WG prediction within PRRSV isolate was more accurate for the NVSL trials compared to the KS06 trials. In NVSL trials, whole genome prediction accuracy was higher than

162

WUR-only, and WUR-only was more accurate than Genome-WUR prediction. In KS06 trials, whole genome and Genome-WUR prediction accuracies were equivalent and higher than WUR- only prediction. WUR-only prediction was more accurate in NVSL trials compared to KS06 trials, which is due to the lack of effect of the WUR SNP on WG in the KS06 isolate (Hess et al.,

2015). For KS06 trials 10, 11, and 12, WUR-only prediction accuracy was low (range 0.09-

0.13), yet positive. However, there was no accuracy of WUR-only prediction in KS06 trial 14, which can be explained by the fact that effects of the WUR SNP on WG in trial 14 are in the opposite direction (AA animals had numerically higher WG than AB animals) of the effects in other KS06 trials (Hess et al., 2015).

Prediction including related animals in training. The most notable increase in prediction accuracy due to inclusion of related animals in training was for VL in NVSL trial 15.

Including related animals in training increased whole genome prediction accuracy by 100%,

Genome-WUR by 200%, and WUR-only by 50% for NVSL trial 15. Trial 15 is from breeding company 7, which is separated from all other breeding companies by PCA, which may explain these large increases in accuracy. However, prediction accuracy for WG in trial 15 did not show dramatic increases from inclusion of related animals in training. In fact, the largest increases in accuracy due to inclusion of related animals in training were for KS06 trial 10, which is from the same breeding company as NVSL trial 15. For WG in KS06 trial 10 inclusion of related animals in training led to increases in accuracies of whole genome prediction by 40% and Genome-WUR by 700%, while WUR-only prediction was equivalent between excluding and including related animals in training.

On average, inclusion of the validation breeding company in the training population increased whole genome and Genome-WUR prediction accuracy for both VL and WG, as is

163 expected with increased relationships between the training and validation animals (Habier et al.,

2007; Goddard, 2008; Hayes et al., 2009; Daetwyler et al., 2013). Although other studies have found that genomic prediction has very low or no accuracy when training and validating across breeds (Hayes et al., 2009; Ibánẽz-Escriche et al., 2009; de Roos et al., 2009; Toosi et al., 2009;

Daetwyler et al., 2012), we showed that there was predictive ability across genetic backgrounds in our data when using the whole genome. Although the training and validation populations used in this study were from different breeding companies, the commercial crossbred pigs were likely more genetically similar across breeding companies than the purebred Holstein and Jersey populations used by Hayes et al. (2009). Ibánẽz-Escriche et al. (2009), de Roos et al. (2009), and

Toosi et al. (2009) examined the accuracy of genomic prediction across breeds using simulation, which requires many assumptions that are often violated in real data, such as that used in this study. Another likely explanation for differences in predictive ability across genetic backgrounds in our study was the large predictive contribution of the SSC 4 QTL. Inclusion of related animals in training increased accuracy of WUR-only predictions less than compared to whole genome, and increase in accuracy of whole genome prediction was less than compared to Genome-WUR prediction. Therefore, accuracy of genomic prediction across genetic backgrounds without related animals in training and validation may be higher when there is one or several QTL with large effects on the phenotype of interest compared to scenarios in which no major QTL affect the trait and effects are small and spread across the genome.

Genomic prediction across PRRSV isolates.

Prediction across breeding companies. Prediction accuracies of VL across isolates and breeding companies followed the same pattern as within isolate prediction across breeding companies, with whole genome prediction having the highest accuracy followed by WUR-only

164 and Genome-WUR having the least accurate prediction. For prediction of WG, accuracy patterns switched between the two isolates of within isolate prediction. For KTàNV, WUR-only prediction was more accurate than whole genome and Genome-WUR, while whole genome and

Genome-WUR prediction accuracies were equivalent to one another. For NTàKV, whole genome prediction was slightly more accurate than Genome-WUR, and Genome-WUR accuracy was 200% higher than WUR-only prediction.

Prediction within isolate, across isolate, and both isolates in training. For prediction of VL, average accuracies were the same for within isolate, across isolate, and both isolate in training for each PRRSV isolate in validation. For WG, on average, whole genome and Genome-

WUR predictions were most accurate when NVSL trials were included in training; NTàNV was more accurate than KTàNV, NTàKV was more accurate than KTàKV, and NKTàKV was more accurate than both NTàKV and KTàKV. This may be due to the absence of peaks seen in the

GWAS for WG in the KS06 data (Chapter 4), which spreads very small effects out over many

SNPs. For WUR-only prediction of WG, accuracies were similar among within isolate, across isolate, and both isolates in training for each validation isolate, with WUR-only prediction of

WG in NVSL trials having higher accuracy compared to that of KS06 trials. Genotype at the

WUR SNP was shown to be associated with WG in the NVSL data, but not in the KS06 data, which explains the increased accuracy for WUR-only prediction in NVSL trials. Effects of the

WUR SNP are in the same direction in 3 of 4 KS06 trials as in the NVSL trials, therefore WUR- only prediction accuracy of NTàNV and KTàNV are equivalent.

165

Genomic prediction using SNP subsets based on functional analyses

When we used SNPs that were associated with VL in the NVSL data based on single

SNP analysis, we found that NTàKV accuracy was 81% of that using all SNPs in the genome; when only SNPs near genes with GO terms shown to be enriched in associated regions were used for prediction, accuracy was 72% of whole genome accuracy. Using SNPs associated with VL in the KS06 data based on single SNP analysis or associated SNPs near enriched GO terms for

KTàNV resulted in accuracies 40 and 44% of the whole genome accuracy for prediction of VL, respectively.

Using only SNPs associated with WG in the KS06 data in single SNP analysis increased

KTàNV prediction accuracy was 143% of accuracy using the whole genome, but was less accurate than WUR only prediction. Accuracy of KTàNV prediction for WG decreased to 81% of the accuracy of whole genome prediction when using only SNPs near GO enriched genes. For

NTàKV, using SNPs associated with WG in the NVSL data or associated SNPs near enriched

GO terms resulted in accuracies 72 and 86% of the whole genome accuracy, respectively.

Overall, prediction using SNPs near GO enriched genes was more accurate than using all

SNPs with P<0.003, but both were still less accurate than the whole genome. The results from this study agreed with a study conducted by Lucot et al. (2015), in which it was shown that fewer windows most significantly associated with age at puberty were required to explain the phenotypic variance in training, but using all SNPs in the genome explained the largest proportion of phenotypic variance in validation. This study also showed that the accuracy of genomic prediction was maximized when the whole genome was used (Lucot et al., 2015). Do et al. (2015) performed genomic prediction of feed efficiency traits in pigs using SNP subsets based

166 on genic locations of SNPs, such as intergenic, intron, exon, etc. None of the SNP subsets had an effect on prediction accuracy, and were equivalent in all cases except for one category (missense

SNPs) to a randomized set of SNPs (Do et al., 2015). In this study, we saw that SNP subsets based on association or GO term enrichment information gave lower accuracies than the whole genome. When we used all SNPs other than those in the association or GO term enriched categories used in this study, genomic prediction accuracy was the same as the accuracy of the whole genome, which would indicate that these SNP subsets did not contribute significantly to the prediction accuracy. However, when we randomly selected the same number of SNPs in each of the association or enriched GO term categories in this study and found that the randomly selected subsets had zero predictive accuracy.

Addition of functional annotation of GWAS associations may increase the accuracy of genomic prediction through alternative methods, though, such as by assigning SNPs to subsets based on annotation of genes and allowing different π values for the subsets, such as is implemented in Bayes-R (Erbe et al., 2012). In this manner, the whole genome is used for prediction, with the probability of association for SNP subsets determined by possible biological relevance. Bayes-N is another Bayesian GWAS method in which a nested model is used, where association of groups of SNPs with the phenotype is analyzed first, then associations of SNPs within the associated groups are examined (Zeng, 2015).

Comparison to previous studies

Our results for within PRRSV isolate Genome-WUR prediction for VL in the NVSL data were similar to the results of Boddicker et al. (2014b) using data from trials 1 to 8; however our

WUR only prediction accuracy was lower than Boddicker et al. (2014b) estimates. For trial 4,

167

Boddicker et al. (2014b) showed a negative accuracy when using SNPs across the genome except for the SSC 4 QTL (similar to our Genome-WUR scenario), whereas our analysis showed a moderately positive prediction accuracy. There are several possible reasons for these differences. Boddicker et al (2014b) used the Bayes-C method of GenSel (Fernando and Garrick,

2008); we fitted the covariates of initial age and weight in a Bayes-B model, while these effects were not in their model. Third, our training dataset included an additional trial of data (trial 15).

Boddicker et al. (2014b) predicted breeding values, while we estimated genotypic values using both additive and dominance effects for each SNP. Furthermore, Boddicker et al. (2014b) adjusted phenotypes in the validation population using estimates of fixed effects obtained from

ASReml analysis on all 8 trials of NVSL data used in their study. The fixed effect estimates used to adjust validation phenotypes in this study were estimated using only the validation data.

We used Bayes-B with π=0.99 for all analyses in this study. For the Genome-WUR analyses, it may have been more appropriate to utilize Bayes-C. Generally, Bayes-B would be a more appropriate method to analyze a trait in which there are one or several major QTL and remaining effects are small, as Bayes-B allows for each SNP to have locus-specific variance

(Meuwissen et al., 2001). In contrast, with Bayes-C and Bayes-Cπ, all SNPs have a common variance (Habier et al., 2011). We repeated the across isolate Genome-WUR analyses using

Bayes-C, but genomic prediction accuracies were the same for VL and declined by 18% for WG

(data not shown). Bayes-B with π=0.99 provided more accurate estimates of GEGV compared to the Bayes-Cπ method for both VL and WG. Furthermore, accuracy obtained with Bayes-B when we set π equal to the estimates from Bayes-Cπ analysis were the same as using Bayes-B with

π=0.99. For VL and WG in the KS06 data, Bayes-Cπ estimates of π were 0.99 for both the whole genome and Genome-WUR analyses. For VL and WG in the NVSL data, π was estimated to be

168

0.69 and 0.77 for the whole genome, respectively, and 0.56 and 0.42 for Genome-WUR, respectively.

Our estimates of genotypic value include both additive and dominance effects, whereas genomic estimated breeding values (GEBV) are calculated based on only additive effects

(Meuwissen et al., 2001). When we evaluated the whole genome accuracy of GEBV prediction across PRRSV isolates, average accuracy for VL was unchanged, but prediction accuracy for

WG increased by 30%, from 0.34 to 0.4 for NTàKV and from 0.1 to 0.16 for KTàNV, compared to the additive and dominance combination model.

Conclusions

Genomic prediction of response to PRRSV isolate was moderately accurate on average, including genomic prediction across PRRSV isolates and across breeding companies. Overall, the Bayes-B method yielded the most accurate genomic predictions in this study. Whole genome prediction across PRRSV isolates and breeding companies was moderately accurate, but accuracy was greatly reduced when SNPs in the SSC 4 QTL were removed from the prediction.

The previously identified QTL on SSC 4 (Boddicker et al., 2012) had a large effect on VL for both PRRSV isolates and on WG for the NVSL isolate (Hess et al., 2015), and therefore, had a large contribution to prediction accuracy. Greater relationships between training and validation populations had larger effects on genomic prediction accuracy for scenarios in which there were many QTL with small effects on the trait of interest. Especially for VL, inclusion of related animals in training yielded the largest increase in Genome-WUR prediction.

We found that using the whole genome for prediction was most accurate. The use of

SNPs shown to be associated with each trait in single SNP analysis decreased prediction

169 accuracy. Furthermore, SNPs near genes annotated with biologically relevant GO terms had less predictive ability compared to the whole genome.

Literature Cited

Boddicker, N. J., A. Bjorkquist, R. R. R. Rowland, J. K. Lunney, J. M. Reecy, and J. C. Dekkers. 2014a. Genome-wide association and genomic prediction for host response to porcine reproductive and respiratory syndrome virus infection. Genet Sel Evol 46:18.

Boddicker, N. J., D. J. Garrick, R. R. R. Rowland, J. K. Lunney, J. M. Reecy, and J. C. M. Dekkers. 2014b. Validation and further characterization of a major quantitative trait locus associated with host response to experimental infection with porcine reproductive and respiratory syndrome virus. Anim. Genet. 45:48–58.

Boddicker, N., E. H. Waide, R. R. R. Rowland, J. K. Lunney, D. J. Garrick, J. M. Reecy, and J. C. M. Dekkers. 2012. Evidence for a major QTL associated with host response to Porcine reproductive and respiratory syndrome virus challenge. J. Anim. Sci. 90:1733–1746.

Daetwyler, H. D., M. P. L. Calus, R. Pong-Wong, G. de Los Campos, and J. M. Hickey. 2013. Genomic prediction in animals and plants: simulation of data, validation, reporting, and benchmarking. Genetics 193:347–65.

Daetwyler, H. D., K. E. Kemper, J. H. J. van der Werf, and B. J. Hayes. 2012. Components of the Accuracy of Genomic Prediction in a Multi-Breed Sheep Population. J. Anim. Sci.

Do, D. N., L. L. G. Janss, J. Jensen, and H. N. Kadarmideen. 2015. SNP annotation-based whole genomic prediction and selection : An application to feed efficiency and its component traits in pigs. J. Anim. Sci.:2056–2063.

Fang, Y., P. Schneider, W. P. Zhang, K. S. Faaberg, E. a. Nelson, and R. R. R. Rowland. 2007. Diversity and evolution of a newly emerged North American Type 1 porcine arterivirus: Analysis of isolates collected between 1999 and 2004. Arch. Virol. 152:1009–1017.

Garrick, D. J. 2011. The nature, scope and impact of genomic prediction in beef cattle in the United States. Genet. Sel. Evol. 43:17.

Gilmour, A. R., B. J. Gogel, B. R. Cullis, and R. Thompson. 2014. ASReml User Guide.

Goddard, M. 2008. Genomic selection: prediction of accuracy and maximisation of long term response. Genetica 136:245–257.

Habier, D., R. L. Fernando, and J. C. M. Dekkers. 2007. The impact of genetic relationship information on genome-assisted breeding values. Genetics 177:2389–97.

170

Habier, D., R. L. Fernando, K. Kizilkaya, and D. J. Garrick. 2011. Extension of the bayesian alphabet for genomic selection. BMC Bioinformatics 12:186.

Hayes, B. J., P. J. Bowman, A. C. Chamberlain, K. Verbyla, and M. E. Goddard. 2009. Accuracy of genomic breeding values in multi-breed dairy cattle populations. Genet. Sel. Evol. 41:51.

Hayes, B. J., P. M. Visscher, and M. E. Goddard. 2009. Increased accuracy of artificial selection by using the realized relationship matrix. Genet. Res. (Camb). 91:47–60.

Hess, A. S., N. J. Boddicker, R. R. R. Rowland, J. K. Lunney, G. S. Plastow, and J. C. M. Dekkers. 2015. A comparison of genetic parameters and effects for a major QTL between piglets infected with one of two isolates of porcine reproductive and respiratory syndrome virus. In: North American PRRS Symposium. Chicago, Illinois, USA.

Ibánẽz-Escriche, N., R. L. Fernando, A. Toosi, and J. C. Dekkers. 2009. Genomic selection of purebreds for crossbred performance. Genet. Sel. Evol. 41:12.

Kimman, T. G., L. A. Cornelissen, R. J. Moormann, J. M. J. Rebel, and N. Stockhofe- Zurwieden. 2009. Challenges for porcine reproductive and respiratory syndrome virus (PRRSV) vaccinology. Vaccine 27:3704–3718.

Lunney, J. K., J. P. Steibel, J. M. Reecy, E. Fritz, M. F. Rothschild, M. Kerrigan, B. Trible, and R. R. Rowland. 2011. Probing genetic control of swine responses to PRRSV infection: current progress of the PRRS host genetics consortium. BMC Proc. 5 Suppl 4:S30.

Maltecca, C. 2014. A multifaceted approach to the use of genomic selection in new traits. In: The Tenth World Congress of Genetics Applied to Livestock Production. Vancouver, BC, Canada.

Meuwissen, T. H. E., B. J. Hayes, and M. E. Goddard. 2001. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157:1819–1829.

Mi, H., A. Muruganujan, and P. D. Thomas. 2013. PANTHER in 2013: Modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees. Nucleic Acids Res. 41:377–386.

Morota, G., R. Abdollahi-Arpanahi, A. Kranis, and D. Gianola. 2014. Genome-enabled prediction of quantitative traits in chickens using genomic annotation. BMC Genomics 15:109.

Osorio, F. A., J. A. Galeota, E. Nelson, B. Brodersen, A. Doster, R. Wills, F. Zuckermann, and W. W. Laegreid. 2002. Passive transfer of virus-specific antibodies confers protection against reproductive failure induced by a virulent strain of porcine reproductive and respiratory syndrome virus and establishes sterilizing immunity. Virology 302:9–20.

Ramos, A. M., R. P. M. A. Crooijmans, N. A. Affara, A. J. Amaral, L. Alan, J. E. Beever, C. Bendixen, C. Churcher, R. Clark, P. Dehais, M. S. Hansen, J. Hedegaard, Z. Hu, H. H. Kerstens, A. S. Law, D. Milan, D. J. Nonneman, G. A. Rohrer, M. F. Rothschild, and T. P. L. Smith. 2009. Design of a High Density SNP Genotyping Assay in the Pig Using SNPs Identified and Characterized by Next Generation Sequencing Technology. 4:e6524.

171

De Roos, A. P. W., B. J. Hayes, and M. E. Goddard. 2009. Reliability of genomic predictions across multiple populations. Genetics 183:1545–53.

Serão, N., O. Matika, R. A. Kemp, J. C. S. Harding, S. C. Bishop, G. S. Plastow, and J. C. M. Dekkers. 2014. Genetic analysis of reproductive traits and antibody response in a PRRS outbreak herd. J. Anim. Sci. 92:2905–2921.

Snelling, W.M., R.A. Cushman, J.W. Keele, C. Maltecca, M.G. Thomas, M.R.S. Fortes, and A. Reverter. 2013. Breeding and Genetics Symposium: Networks and pathways to guide genomic selection. J. Anim. Sci. 91:537-552.

Toosi, A., R. L. Fernando, and J. C. M. Dekkers. 2009. Genomic selection in admixed and crossbred populations. J. Anim. Sci. 88:32–46.

Waide, E. H., C. K. Tuggle, N. V. L. Serão, M. Schroyen, A. Hess, R. R. R. Rowland, J. K. Lunney, G. S. Plastow, and J. C. M. Dekkers. 2015. Genome wide association analysis of piglet response to experimental infection with one of two isolates of the Porcine Reproductive and Respiratory Syndrome virus. In preparation.

172

Figures

Figure 5.1 Principal components analysis of the SNP genotype data. Each point represents a single animal, with each color representing one of the 8 breeding companies used in this study. The breeding company numbers match those shown in Table 5.1. The breed makeup of the animals for each breeding company are shown in the same color. LR=Landrace, LW=Large White, and Y=Yorkshire. Breeds are presented as breed of sire x breed of dam.

173

Figure 5.2 Scenarios used for training and validation for genomic prediction. Pink, purple, and blue colored rectangles represent each of the 3 breeding companies with pigs in both NVSL and KS06 trials, with the trial number indicated inside the rectangle. Gray colored rectangles represent individual breeding companies with only one trial of PRRSV infected pigs. Arrows indicate direction of genomic prediction, with the tail originating from the trials used in training and the head pointing towards the trial(s) used for validation. (A) Genomic prediction across PRRSV isolates using all data indicated by black arrows; across breeding company using both PRRSV isolates in training indicated by red arrows. (B) Genomic prediction across PRRSV isolate including validation breeding company in training indicated by purple arrow; Genomic prediction across breeding company within PRRSV isolate and across breeding companies and isolate indicated by small and large blue arrows, respectively.

174

(A)

(B)

(C)

Figure 5.3 Accuracy of genomic prediction across PRRSV isolates for Viral Load. Accuracy, presented as the average correlation between genomic estimated breeding values and adjusted phenotypes divided by the square root of heritability in the validation population, of genomic prediction for VL in the given scenario. (A) Accuracy of genomic prediction across PRRSV isolates for the whole genome, genome when we account for WUR genotype (Genome - WUR), and using only the SNPs in the 1 Mb window containing the WUR SNP (WUR only). (B) Average accuracy of genomic prediction across PRRSV isolates without validation breeding company in training (Excluding) and with validation breeding company in training (Including). (C) Average accuracy of prediction across breeding companies with indicated PRRSV isolate validation population when training within the same PRRSV isolate, across isolate, or using both isolates. Individual points represent the accuracy of prediction for one trial with breeding companies represented in both PRRSV isolates having the same black shape and gray diamonds representing the trials which were not replicated.

175

Figure 5.4 Genomic prediction accuracy for Viral Load using SNP subsets based on single- SNP GWAS and functional analysis. Accuracy of genomic prediction using all SNPs (Whole Genome), those SNPs with P<0.003 from single-SNP analysis, or SNPs located near genes in enriched biologically relevant gene ontology terms (Gene Ontology).

176

(A)

(B)

(C) Figure 5.3 Accuracy of genomic prediction across PRRSV isolates for Weight Gain. Accuracy, presented as the average correlation between genomic estimated breeding values and adjusted phenotypes divided by the square root of heritability in the validation population, of genomic prediction for WG in the given scenario. (A) Accuracy of genomic prediction across PRRSV isolates for the whole genome, genome when we account for WUR genotype (Genome - WUR), and using only the SNPs in the 1 Mb window containing the WUR SNP (WUR only). (B) Average accuracy of genomic prediction across PRRSV isolates without validation breeding company in training (Excluding) and with validation breeding company in training (Including). (C) Average accuracy of prediction across breeding companies with indicated PRRSV isolate validation population when training within the same PRRSV isolate, across isolate, or using both isolates. Individual points represent the accuracy of prediction for one trial with breeding companies represented in both PRRSV isolates having the same black shape and gray diamonds representing the trials which were not replicated.

177

Figure 5.6 Genomic prediction accuracy for Weight Gain using SNP subsets based on single-SNP GWAS and functional analysis. Accuracy of genomic prediction using all SNPs (Whole Genome), those SNPs with P<0.003 from single-SNP analysis, or SNPs located near genes in enriched biologically relevant gene ontology terms (Gene Ontology).

178

Tables

Table 5.1 General information on each PRRSV infection trial.

Viral Load Weight Gain Breeding PRRSV Trial(s) Breed Average Average Isolate Company n n (s.d.) (s.d.) 1-3 1 LW1 x LR2 504 108.3 (8.1) 487 12.1 (4.4) 4 2 Duroc x LW/LR 192 113.2 (6.3) 190 15.9 (4.0) 5 3 Duroc x LR/LW 184 101.4 (7.2) 183 19.1 (2.9) 6 4 LR x LR 123 109.6 (8.0) 106 14.8 (5.6) NVSL 7 5 Pietrain x LW/LR 189 104.5 (6.2) 189 14.5 (3.2) 8 6 Duroc x Y3/LR 188 107.9 (6.6) 182 10.2 (4.6) 15 7 LR x LW 171 107.6 (10.9) 165 19.1 (4.0) NVSL - - 1551 107.0 (8.4) 1502 14.9 (5.0) 10 7 LR x LW 174 93.9 (6.7) 179 19.1 (4.2) 11 1 LW x LR 170 100.4 (6.4) 178 18.6 (4.4) KS06 12 8 LR x LW 171 104.7 (6.3) 170 19.0 (4.1) 14 3 Duroc x LR/LW 180 98.6 (7.7) 171 21.3 (4.1) KS06 - - 695 99.4 (7.8) 641 19.5 (4.3) - Total - - 2246 104.6 (8.9) 2200 16.4 (5.2) 1 Large White (LW); 2 Landrace (LR); 3Yorkshire (Y).

179

CHAPTER 6: GENERAL DISCUSSION

Diseased pigs cause great economic losses in the swine industry. Inherited diseases, such as immunodeficiencies, are not generally a main focus of producers or animal breeders; however ignoring an issue will not resolve it. Death loss in the nursery phase is likely to be attributed to post-weaning multisystemic wasting syndrome (López-Soria et al., 2011) or other communicable diseases, while Mendelian segregating immune deficiencies have not been considered to date.

Severe Combined Immunodeficient Pigs

The first objective of this thesis was to discover the mutations that caused naturally occuring Severe Combined Immunodeficiency (SCID) in pigs from a selection line of pigs at

Iowa State University. In Chapter 3 of this thesis, we describe the discovery of two novel mutations in the Artemis gene that cause SCID in these pigs. The SCID phenotype of these pigs also closely mimics humans with Artemis deficiency (also known as Radio-Sensitive SCID;

Dvorak and Cowan, 2010), as fibroblasts from SCID pigs were shown to have increased sensitivity to ionizing radiation. We provided further evidence of mutations in Artemis as causing

SCID in these pigs by rescuing this cellular phenotype by transfection of human Artemis cDNA containing plasmid into the SCID fibroblasts, which resulted in reduced radiosensitivity. We interpreted this results as indicating the human ARTEMIS protein expressed from the plasmid had rescued the lack of porcine ARTEMIS protein. These mutations were associated with two haplotypes based on SNPs on the commercially available Illumina Porcine SNP60 Beadchip

(Ramos et al., 2009). Both of these haplotypes were traced back through the pedigree to the founders of the population in which SCID was discovered, which were sourced from commercial

180 pigs in the Midwestern United States. This indicates that these deleterious mutations may be segregating in the swine industry.

SCID carrier status of commercial pigs

To determine if commercial pigs carried the SCID haplotypes, we performed a haplotype analysis on a total of 17,324 breeding-age pigs from the National Swine Registry dataset, a commercial pig breeding company, and the RFI selection lines at ISU. The pigs from the

National Swine Registry and the RFI selection lines at ISU were purebred Yorkshire pigs, while the pigs from the breeding company were from multiple lines with unknown breed composition.

The results of these haplotype analyses are in Appendix A. Out of the total 14,974 commercial pigs for which SNP genotypes were available (those from the National Swine Registry and the breeding company), 359 pigs were shown to carry haplotypes identical to one of the SCID haplotypes identified in Chapter 3 of this thesis. Out of the 2,350 pigs from the RFI selection lines at ISU, 312 animals carried one of the identified SCID haplotypes. However, there were 14 pigs that were either homozygous for one SCID haplotype or compound heterozygous, carrying one of each SCID haplotype, yet survived into adulthood, presumably displaying a normal immunosufficeint phenotype. These 14 pigs were from the low RFI line, which is the line in which the first 4 known SCID piglets were identified. These 14 pigs would be expected to be

SCID affected, meaning that they would not have been able to survive past approximately 6 weeks of age without sterile housing conditions. As these pigs were of breeding age (over 8 months old), this shows that the SNP60 genotypes in this region do not fully track transmission of the causative mutations for SCID, and it is possible that the SCID SNP haplotypes, although they are closely associated with the SCID phenotype in the ISU population, do not track the causative mutations in all populations.

181

The true SCID mutation carrier status could be obtained by genotyping these suspected

SCID carriers and those supposedly SCID affected pigs for the causative mutations at Artemis, as described in Chapter 3. Furthermore, genetic tests for the SCID causitive mutations can be used to avoid carrier-by-carrier matings. SCID carrier-by-carrier matings would result in litters in which one quarter of the piglets weaned would not survive the nursery phase. It is not absolutely necessary that SCID carriers are completely eliminated in the commercial pig industry. One option would be to ensure that all parents of one sex do not carry either SCID mutation. In this way, no matings would be able to produce SCID piglets; only, at most, SCID carriers could be produced, which have none of the health symptoms of SCID affected pigs. Since haplotypes based on the commercially available porcine SNP60 chip (Ramos et al., 2009) do not reliably track the SCID causative mutations, tests for these specific mutations need to be used to make sure there are no carrier-by-carrier matings. We have begun the development of such tests (see

Appendix B).

SCID pig as a biomedical model

In commercial swine operations, SCID pigs would be undesirable, causing economic losses and increased piglet disease and suffering. On the other hand, this immunodeficient pig provides a very attractive model to the biomedical community. In fact, several groups have spent years attempting to create SCID pigs. This hard work paid off with the creation of transgenic

SCID pigs, including RAG1 or RAG1/RAG2 knockouts (Huang et al., 2014; Lee et al., 2014) and others with mutations in the X-linked IL2RG gene (Suzuki et al., 2012; Watanabe et al., 2013).

Pigs are anatomically, physiologically, and immunologically more similar to humans than mice

(Mestas and Hughes, 2004; Meurens et al., 2012), and these SCID pig models provide an alternative to the widely used biomedical model, SCID mice. SCID pigs could be used in studies

182 investigating many aspects of human immune function, stem cell-based therapeutics and vaccine testing, etc.

Studies are ongoing to develop this biomedical model, as several questions remain with regards to the use of Artemis deficient SCID pigs. We discovered two independent mutations in the Artemis gene that cause SCID in these pigs (Chapter 3). One mutation, in haplotype 16 (h16), is a splice site variant that results in Artemis transcripts lacking the 141 base pair exon 8. This transcript would maintain its reading frame, but would lack a portion of the beta-CASP region, which is required for proper Artemis function (Callebaut et al., 2002; Poinsignon et al., 2004).

The other mutation, in h12, is a point mutation in exon 10, which changes a Tryptophan amino acid codon to a stop codon. Translation of a transcript with this mutation would be expected to produce a truncated protein missing 61% of the amino acids in the complete Artemis protein.

There may be functional differences in these two mutated proteins, which remain to be fully investigated. Results presented in Chapter 3 have shown that there are no differences in the numbers of B or T cells or sensitivity of fibroblasts to ionizing radiation between animals or cells homozygous for either mutation. But, differences have been noted in fatal cancers of bone marrow rescued pigs with different combinations of these haplotypes, as described in the following.

Allogenic bone marrow transfer rescues immunodeficient phenotype in pigs

Allogenic bone marrow transfer (BMT) has been employed to rescue the immunodeficiency in a total of 9 piglets to date (Powell et al., In preparation). Five of these had to be euthanized at an early age due to severe graft versus host disease (GvHD). The remaining 4

BMT-rescued piglets survived the initial months following BMT. They were housed in a non-

183 sterile environment, indicating that these pigs were exposed to pathogens and were able to mount an immune repsonse to maintain their health. Two of these surviving piglets were shown to respond to vaccination, as well (Powell et al., In preparation). Three of these 4 BMT pigs succumbed to cancers at approximately 12 months of age. Interestingly, 2 of these pigs were homozygous for the h16 mutation and both developed immune related cancers, lymphoma and/or leukemia. The third BMT pig, who was a compound heterozygous h12/h16 SCID, developed nephroblastoma, which is a fairly common cancer affecting swine (Engström and Granerus,

2009). As of October, 2015, the last BMT SCID pig has survived to 3 years and 5 months of age.

This boar, Wayne, is a compound heterozygous h12/h16 SCID pig who was BMT-rescued.

There could be many reasons for the differences in life span of SCID pigs; but, the current hypothesis is that the protein translated from the h16 mutant transcript may have some residual enzymatic activity, leading to abberant rearrangement of lymphocyte receptors. Hypomorphic

Artemis mutations have been shown to result in oncogenic chromosomal rearrangements in mice

(Jacobs et al., 2011). Also, clinical manifestations vary based on the specific mutations in

Artemis deficient human SCID patients (Lee et al., 2013), including lymphoma arising from several types of mutations (Felgentreff et al., 2015). Further studies are required to determine the functional differences in vitro and in vivo between these two Artemis mutations.

As described above, our group was able to rescue this immunodeficiency in pigs with allogenic BMT (giving a pig immune system to SCID pigs). One large question that remains is whether or not we will be able to successfully humanize the SCID pigs. Humanization involves transplantation of human immune cells into a non-human animal (Pearson et al., 2008), such as our SCID pigs. SCID mice have been widely used in biomedical reasearch (Mestas and Hughes,

2004; De Villartay, 2009) and much research has been done to improve the SCID mouse model,

184 including combining mutations to increase the success rate of humanization (Shultz et al., 2007;

Pearson et al., 2008; Tanner et al., 2014). The SCID pig described in this thesis and other publications (Cino-Ozuna et al., 2012; Ewen et al., 2014) was not able to reject human cancer cells (Basel et al., 2012), but a large question remains on whether or not human stem cells would survive in this model, resulting in a humanized SCID pig.

SCID pigs as a model to study disease in swine

SCID pigs are a valuable model in biomedical research due to the similarity of pigs and humans (Lunney, 2007; Kuzmuk and Schook, 2011; Meurens et al., 2012; Swindle et al., 2012), but these pigs would also be useful in studies of pig-specific diseases. The SCID pigs described in Chapter 3 lack both B and T cells (Ewen et al., 2014), two very important cellular components of the adaptive immune system, which could help answer questions about the pathogenesis of diseases in swine. The Porcine Reproductive and Respiratory Syndrome (PRRS) is one example of an economically important swine disease (Holtkamp et al., 2013) for which the viral pathogenesis is not well understood. Recently, a study was conducted to determine the differences in response to PRRS virus (PRRSV) infection between these SCID pigs and their non-SCID littermates (Chen et al., 2015). While the non-SCID piglets responded in a typical fashion to the PRRSV infection, reaching peak viremia around 11 days post infection (dpi), then decreasing the levels of virus in the blood until the end of the study, PRRSV viremia of the SCID pigs showed a different profile. Up to and including 11 dpi, the SCID pigs had lower levels of

PRRSV in their circulating blood compared to non-SCID littermates. Subsequently, this relationship reversed because SCID pigs were unable to control PRRSV replication; while the non-SCID pig immune system was combatting and clearing the PRRSV, the PRRSV levels in

SCID pigs continued increasing. T cell function may play a role in the low PRRSV levels during

185 the first 11 dpi, as well as the high PRRSV levels after 11 dpi. These results led Chen et al.

(2015) to speculate that IL-10, a cytokine produced by suppressor T cells, could be responsible for increasing the permissivity of macrophages available to be infected by the PRRSV (Patton et al., 2009; Cecere et al., 2012), in turn allowing for more viral replication at the early time point in non-SCID pigs. Later in the PRRSV infection period, T helper cells produce another cytokine,

IFN-γ, which acts to protect macrophages from infection by the PRRSV (Rowland et al., 2001).

While no cytokine measurements are available to support this hypothesis, Chen et al. (2015) used these immunodeficient pigs and their non-SCID littermates to examine the immune response to

PRRSV infection, providing valuable information on the immunobiology behind this viral disease.

Porcine Reproductive and Respiratory Syndrome

Maintenance of herd health is a topic of great importance in the swine industry, and includes the implementation of many continuously updated protocols, including strict biosecurity measures, vaccination, and constant vigilance. All of this is done to prevent introduction or spread of communicable diseases, such as Porcine Reproductive and Respiratory Syndrome

(PRRS). PRRS virus (PRRSV) outbreaks are estimated to cause a $664 million loss to the United

States swine industry annually (Holtkamp et al., 2013). Vaccines have been developed to combat

PRRSV outbreaks but their success is limited due to the lack of protection against heterologous infections (Park et al., 2014) and the hesitation of farmers to implement a PRRS vaccination protocol because the vaccine is a modified live virus and can cause clinical symptoms (Hu and

Zhang, 2014; Renukaradhya et al., 2015). Genetic selection of pigs with improved response to

PRRSV infection is a valuable tool that can be added to the current PRRSV fight toolbox.

Studies have shown that there is genetic variation in response to PRRSV (Lewis et al., 2007;

186

Boddicker et al., 2012), indicating that it is possible to select animals with improved genetics for

PRRSV response traits such as reproductive performance (Lewis et al., 2009; Serão et al., 2014), survivability (Vukasinovic and Clutter, 2010), antibody levels (Serão et al., 2014), viremia in blood (Biffani et al., 2010; Boddicker et al., 2012; Boddicker et al., 2014a; Hess et al., 2014;

Boddicker et al., 2014b), and weight gain under infection (Boddicker et al., 2012; Boddicker et al., 2014a; Hess et al., 2014; Boddicker et al., 2014b). A major quantitative trait locus (QTL) on

Sus scrofa chromosome (SSC) 4 was identified to be associated with both the amount of virus in the blood (VL) of PRRSV infected pigs up to 21 days post infection (dpi) and weight gain of piglets from 0 to 42 dpi (WG; Boddicker et al. 2012; Boddicker, Bjorkquist, et al. 2014;

Boddicker, Garrick, et al. 2014). These studies focused on the effects of the SSC 4 QTL and included data from infection of pigs with only one PRRSV isolate. The work presented in

Chapter 4 of this thesis examined the genomic regions that account for over 80% of the genetic variation of each trait that was not explained by the SSC 4 QTL. Furthermore, our data included the response of piglets infected with both the NVSL PRRSV isolate that was used in the

Boddicker et al. (2012; 2014a; 2014b) studies and a more recent PRRSV isolate, KS06.

Genome wide association studies

The second and third objectives of this thesis were to identify genomic regions associated with experimental infection with one of two isolates of the PRRSV and to assess enrichment of gene ontology (GO) annotation terms of genes in these identified SNP-associated regions. In

Chapter 4 of this thesis, we identified several previously unidentified genomic regions that were associated with response to PRRSV; but there were no regions that were consistently associated with VL or WG in both PRRSV isolate datasets. We utilized two GWAS methods in our analyses, single SNP and Bayesian variable selection GWAS (Bayes-B). Several regions were

187 shown to be associated with WG or VL by both single SNP and Bayes-B GWAS analyses, although single SNP analysis identified a larger number of associated regions than Bayes-B.

Single SNP analysis performed tests on each SNP individually, with no constraints on the number of SNP associated with the phenotype of interest, while our Bayes-B analyses assumed that only 1% of the SNPs in the genotype data had an effect on the phenotype. To determine whether Bayes-B would be able to identify additional regions with a relaxed assumption of SNP association, we lowered π values to those obtained by the Bayes-Cπ method (Chapter 5). Using these lower values of π (ranging from 0.42 to 0.77) did, however, not reveal additional genomic regions to be associated with either trait. Single SNP GWAS was better able to identify genomic regions with smaller effects on each phenotype than Bayes-B, as evidenced by the functional results described in the following. In an effort to determine if these SNPs with small effects

(those with less than genome-wide statistical significance) were true associations rather than spurious statistical associations, we performed gene ontology (GO) annotation enrichment tests to assess the biological relevance of the genes located near SNPs associated with the phenotype.

Quantitative traits, such as response to viral infection, are controlled by the action of many genes, and each of them may have small effects on the trait. When genes with small effects interact in pathways or biological processes, their cumulative effects may result in larger effects on the overall phenotype, assuming additive gene action without complementation. SNPs with moderate associations may be in LD with these minor QTL. Relaxation of the significance threshold in the single SNP results will increase the number of regions identified, but functional annotation or GO term enrichment analyses can be used to increase confidence of true association.

188

Functional analysis of GWAS results

As functional analysis of SNP-based GWAS results has not been implemented in our traits of interest, we first assessed two criteria for gene list creation - 1) statistical significance threshold for identification of associated SNPs and 2) the chromosomal distance (in kilo base pairs; kb) from the associated SNPs for identification of positional candidate genes. We used

BioMart (www.biomart.org) to create lists of genes within window sizes of 1,000, 500, and 200 kb (500, 250, and 100 kb 5’ and 3’) surrounding SNPs associated with the trait of interest in the data from one of two PRRSV isolates at p-value thresholds of 0.01, 0.003, and 0.001, and investigated the enrichment of GO terms in the biological process slim category from the Panther software (Mi et al., 2013). The number of genes in the lists increased with an increase in the significance threshold and with larger window size. Across traits and PRRSV isolates, lists of genes within 250 kb of SNPs with p-value<0.003 were most enriched for biologically relevant

GO terms. Thus, we performed more in depth GO term enrichment analyses on these lists.

We found that several GO terms were enriched in the genes that were located near SNPs associated with each trait. Furthermore, although genomic regions were not consistently associated across the two PRRSV isolates used in this study, we found several of the same GO terms to be enriched for genes in the associated regions across isolates. Genes near SNPs associated with VL were enriched for natural killer cell activation, B cell activation, and immune response GO terms. Genes near SNPs associated with WG were enriched for an antigen processing and presentation GO term. Genes near SNPs associated with WG in the NVSL data were enriched for several metabolically-related GO terms, although enrichment for these metabolic GO terms was not statistically significant for WG in the KS06 data. Our GO term

189 enrichment analysis also aided in the identification of functionally relevant candidate genes, as shown in the following table 6.1.

Table 6.1. Candidate genes identified in genomic regions with only the most significant statistical associations (GWAS identified) or those identified through GO term enrichment analyses (Annotation identified).

We found that GO term enrichment analyses of the single SNP GWAS results identified more candidate genes compared to looking at genes located near only the most significantly associated SNPs (Chapter 4). One value of GO term enrichment analysis is in the ease with which these additional candidate genes with relevant function in viral immune response were identified. To identify candidate genes from GWAS results, we could search for genes located in each SNP-associated region at a time, using a genome browser, such as Ensembl

(http://www.ensembl.org/Sus_scrofa/Info/Index?db=core). In addition to decreasing the time required for detailed analysis of GWAS results, GO term enrichment of genes near SNP associated with the phenotype of interest adds valuable biological information to the statistical association of SNPs. Functional annotation of genes near SNPs that have moderate associations

190 with the phenotype of interest increase the confidence of true association. Furthermore, functional analysis identifies possible pathways involved in the phenotype of interest, which in turn aids in the discovery of candidate genes.

Genomic prediction

GWAS results help to identify genomic regions associated with traits of interest; another use of the associations between genetic markers and phenotypes is genomic prediction. Genomic prediction involves the use of statistical methods to predict the genetic merit, or genomic estimated breeding value (GEBV), of a selection candidate for a phenotype that has not been observed on that individual (Meuwissen et al., 2001). Prediction of GEBVs require a training population of individuals that have both phenotypic observations for the trait being selected upon and genotypic observations for the same markers for which the selection candidates in the validation population have been genotyped. The effects of alleles at each genetic marker are estimated using the data from the training population and then used to predict the GEBVs of the validation population using only genotypes for those markers (Meuwissen et al., 2001). These

GEBVs can then be used by animal breeders for genomic selection, or selecting animals with the highest genetic merit for the trait or traits of interest using genotypic information (Dekkers,

2010). Marker assisted selection (MAS) was one of the first eras of genomic selection and involved the use of relatively few genetic markers to predict genetic merit of the selection candidate (Smith and Simpson, 1986; Lande and Thompson, 1990). One of the major drawbacks of MAS was that many associations between markers and phenotypes were unable to be replicated, possibly due to the fact that the associations were detected using specific breed crosses and were not applicable in commercial crosses (Andersson, 2001; Dekkers, 2012).

Genetic based prediction of GEBVs for complex polygenic traits, those controlled by many

191 genes, benefitted from the advent of dense genome-wide SNP chips, which provide genotypes for thousands to hundreds of thousands of genetic markers (Ramos et al., 2009).

Genomic selection is particularly useful for traits that are difficult or expensive to measure, expressed only in one sex or later in life, or highly polygenic traits controlled by many

QTL with small effects. Response to PRRSV infection is a good example of a highly polygenic trait that is very hard and expensive to measure, even impossible on selection candidates or their close relatives; this is because breeding animals are housed on nucleus farms with extremely strict biosecurity protocols to maintain high herd health. The aim of Chapter 5 of this thesis was to assess the accuracy of genomic prediction of PRRSV response traits across breeding companies and PRRSV isolates.

Even with inconsistency of associated regions across PRRSV isolates (Chapter 4), we showed that using data from one PRRSV isolate to estimate SNP effects (training) and using these estimates to predict the phenotype of the other isolate using only genotypes (validation) was moderately accurate (Chapter 5). We also found that genomic prediction was moderately accurate across breeding companies, which disagrees with previous reports of very low to no genomic prediction accuracy for different traits across breeds (Hayes et al., 2009; Ibánẽz-

Escriche et al., 2009; de Roos et al., 2009; Toosi et al., 2009; Daetwyler et al., 2012). Although the training and validation populations used in this study were from different breeding companies, the commercial crossbred pigs were likely more genetically similar across breeding companies than the purebred Holstein and Jersey populations used by Hayes et al. (2009).

Ibánẽz-Escriche et al. (2009), de Roos et al. (2009), and Toosi et al. (2009) examines the accuracy of genomic prediction across breeds using simulations, which requires many assumptions that are often violated in real data, such as that used in this study. For instance,

192 population simulation often assumes random mating of animals, and the pigs used in this study were commercial animals which have been selected on many economically important traits and matings are made to capitalize on the genetic merit of the parents without large increases in inbreeding (Silió et al., 2013).

When we increased the genetic relationship between the training and validation populations, by including the validation breeding company in training for across PRRSV isolate genomic prediction, on average, genomic prediction accuracy increased, which agrees with previous studies that find increased accuracy of genomic predictions with increased relationships between training and validation populations (Habier et al., 2007; Goddard, 2008; Hayes et al.,

2009; Daetwyler et al., 2013).

SNP subsets used in genomic prediction

We also used different subsets of SNPs for genomic prediction, specifically excluding the

SNPs near the SSC 4 QTL from prediction and fitting WUR genotype as a fixed effect in the model (Genome-WUR) and only the SNPs in the 1 Mb window that includes the WUR SNP

(WUR only). Our results showed that genomic prediction was most accurate when using the effects of all SNPs in the whole genome. In most scenarios, the contribution of the WUR region to the accuracy of genomic prediction was large, as evidenced by the marked decrease in genomic prediction accuracy using the Genome-WUR SNPs from the whole genome (Chapter

5), which agreed with previous results by Boddicker et al. (2014a). Boddicker et al. (2014a) found that genomic prediction based on only the SSC 4 QTL SNPs was generally more accurate than using all SNPs in the genome; however, we found that using all SNPs in the genome was the most accurate. We also hypothesized that using only the SNPs associated with the trait of

193 interest in single SNP analyses or associated SNPs near genes whose function is likely involved in the trait increases genomic prediction accuracy compared to using all SNPs in the genome by removing some noisy SNPs. However, we found that genomic prediction using the whole genome was generally more accurate than either subset of SNPs (Chapter 5). Overall, these results show that genomic prediction for the scenarios used in this study is most accurate when using the estimates of all SNPs in the genome.

Alternative strategies that could be used to incorporate GO term enrichment information into genomic prediction would be to use all SNPs in the genome, but weight different subsets of

SNPs based on the annotation of genes close to the SNPs. In this way, genomic prediction takes advantage of the information from all genotyped SNPs with the additional information provided by the annotation of genes near these SNPs. For example, we could assume that SNPs near genes involved in natural killer cell activation would be more likely to have an effect on VL than SNPs near genes annotated with the mammary gland development GO term. In this example, we would allow SNPs in the natural killer cell activation subset to have a higher likelihoood of association with VL than SNPs in the mammary gland development subset by setting a higher pi value for the latter than the former. Several issues may complicate the implementation of this strategy including the following: 1) Genes may be annotated with multiple GO terms, and determining one specific subset for neighboring SNPs may not be clear. 2) The porcine genome annotation is imperfect, and the locations of over 5% of the SNPs on the current Illumina Porcine SNP60 chip are unknown. 3) The values at which to set pi for each SNP subset are unknown and depend on many factors, such as the number of SNPs in each subset and the underlying genetic architecture of the trait.

194

Application of this research

The research presented in Chapters 4 and 5 of this thesis provide valuable information about the genomics of piglet’s response to experimental infection with one of two isolates of the

PRRSV. We used GO term enrichment of GWAS results to identify several biological pathways that were overrepresented in genes near SNP associated with response to PRRSV infection. This time efficient method could be used to provide information about the biological roles of genes near SNPs associated with any trait and in any species with an annotated genome. Functional analysis of GWAS results is useful in reducing the time required to identify functionally relevant genes near SNPs with moderate or smaller sized effects and also provides biological information to the statistical association of SNPs. Chapter 5 of this thesis showed that genomic prediction of response to the PRRSV was moderately accurate when predicting across PRRSV isolate datasets and when predicting across genetic backgrounds. These results would be useful for selection of pigs in high health nucleus herds, since these animals are unlikely to be exposed to the PRRSV, while it is possible that their progeny or grand-progeny may be on commercial swine farms during a PRRSV outbreak. Therefore, the increased genetic value of response to PRRSV infection may not benefit the nucleus animal, but favorable alleles may be passed to future generations and prove valuable. The moderate accuracies of genomic prediction across breeding companies are a valuable result, as this shows that the training population does not need to contain close relatives of selection candidates. We also showed that the PRRSV isolate of infection of pigs in the training does not need to be the same isolate of infection of pigs in the validation, which is important to the swine industry because the PRRSV rapidly mutates and infection with the exact same PRRSV strain is unlikely to occur in multiple PRRSV outbreaks

(Fang et al., 2007; Kimman et al., 2009).

195

Unanswered questions

There remain several questions to be addressed with regards to the work presented in

Chapters 4 and 5 of this thesis. First, the data used in these studies were from piglets that were experimentally challenged with the PRRSV in a clean and controlled biosecure environment

(Lunney et al., 2011). It is not clear whether the same genomic regions would be identified to be associated with response to natural PRRSV infection in field conditions, where pigs would likely be coinfected with multiple pathogens during a PRRSV outbreak (Murtaugh et al., 2002).

Furthermore, although our data includes responses to two PRRSV isolates, there are many more

PRRSV strains in the swine industry (Fang et al., 2007; Kimman et al., 2009). Other than the

SSC 4 QTL, no regions were identified across the two isolates of infection in this study, but there may be some consistency of SNP-region associations in future studies with other PRRSV isolates. When performing genomic selection, it is wise to ensure that SNPs with larger effects on the trait of interest do not have negative effects on other economically important traits in the swine industry; for example, if the favorable genotype at a SNP associated with WG during

PRRSV infection has unfavorable association with WG in healthy conditions, it would not be profitable to select on this SNP unless piglets will always be infected with the PRRSV.

Furthermore, selecting for favorable alleles for response to PRRSV infection that have unfavorable effects on response to other diseases would not be profitable. Care should always be taken when selecting on immune response traits, even without unfavorable associations as mentioned previously. The immune system should be reactive enough to protect the host from the infecting pathogen without overreacting to the pathogen, as this can cause damage to the host, also. In addition to efficient, the immune response should also be effective; if antibodies do

196 not aid in eliminating a specific pathogen from the host, then production of antibodies will only result in wasting of energy by the host without positive effects on pathogen clearance.

Literature Cited

Basel, M. T., S. Balivada, A. P. Beck, M. A. Kerrigan, M. M. Pyle, J. C. M. Dekkers, C. R. Wyatt, R. R. R. Rowland, D. E. Anderson, S. H. Bossmann, and D. L. Troyer. 2012. Human xenografts are not rejected in a naturally occurring immunodeficient porcine line: a human tumor model in pigs. Biores. Open Access 1:63–68.

Biffani, S., S. Botti, a. Caprera, E. Giuffra, and a. Stella. 2010. Genetic Susceptibility to Porcine Reproductive and Respiratory Syndrome (PRRS) virus in commercial pigs in Italy. Proc. Ninth World Congr. Genet. Appl. to Livest. Prod.:592–595.

Boddicker, N. J., A. Bjorkquist, R. R. R. Rowland, J. K. Lunney, J. M. Reecy, and J. C. Dekkers. 2014a. Genome-wide association and genomic prediction for host response to porcine reproductive and respiratory syndrome virus infection. Genet Sel Evol 46:18.

Boddicker, N. J., D. J. Garrick, R. R. R. Rowland, J. K. Lunney, J. M. Reecy, and J. C. M. Dekkers. 2014b. Validation and further characterization of a major quantitative trait locus associated with host response to experimental infection with porcine reproductive and respiratory syndrome virus. Anim. Genet. 45:48–58.

Boddicker, N., E. H. Waide, R. R. R. Rowland, J. K. Lunney, D. J. Garrick, J. M. Reecy, and J. C. M. Dekkers. 2012. Evidence for a major QTL associated with host response to Porcine reproductive and respiratory syndrome virus challenge. J. Anim. Sci. 90:1733–1746.

Callebaut, I., D. Moshous, J.-P. Mornon, and J.-P. de Villartay. 2002. Metallo-beta-lactamase fold within nucleic acids processing enzymes: the beta-CASP family. Nucleic Acids Res. 30:3592–3601.

Cecere, T. E., S. M. Todd, and T. Leroith. 2012. Regulatory T cells in arterivirus and coronavirus infections: do they protect against disease or enhance it? Viruses 4:833–46.

Chen, N., J. C. M. Dekkers, C. L. Ewen, and R. R. R. Rowland. 2015. Porcine reproductive and respiratory syndrome virus replication and quasispecies evolution in pigs that lack adaptive immunity. Virus Res. 195:246–9.

Cino-Ozuna, A., R. Rowland, J. Nietfeld, M. Kerrigan, J. Dekkers, and C. Wyatt. 2012. Preliminary Findings of a Previously Unrecognized Porcine Primary Immunodeficiency Disorder. Vet. Pathol. 50:144–146.

Dvorak, C. C., and M. J. Cowan. 2010. Radiosensitive Severe Combined Immunodeficiency Disease. Immunol. Allergy Clin. North Am. 30:125–142.

197

Engström, W., and M. Granerus. 2009. The expression of the insulin-like growth factor II, JIP-1 and WT1 genes in porcine nephroblastoma. Anticancer Res. 29:4999–5003.

Ewen, C., A. Cino-Ozuna, H. He, M. Kerrigan, J. Dekkers, C. Tuggle, R. Rowland, and C. Wyatt. 2014. Analysis of blood leukocytes in a naturally occurring immunodeficiency of pigs shows the defect is localized to B and T cells. Vet. Immunol. Immunopathol. 162:174–179.

Fang, Y., P. Schneider, W. P. Zhang, K. S. Faaberg, E. a. Nelson, and R. R. R. Rowland. 2007. Diversity and evolution of a newly emerged North American Type 1 porcine arterivirus: Analysis of isolates collected between 1999 and 2004. Arch. Virol. 152:1009–1017.

Hess, A. S., N. J. Boddicker, R. R. R. Rowland, J. K. Lunney, G. S. Plastow, and J. C. M. Dekkers. 2014. Genetic Parameters and Effects for a Major QTL of Piglets Experimentally Infected with a Second Porcine Reproductive and Respiratory Syndrome Virus. In: The Tenth World Congress of Genetics Applied to Livestock Production. Vancouver, BC, Canada.

Holtkamp, D. J., J. B. Kliebenstein, E. J. Neumann, J. J. Zimmerman, H. F. Rotto, T. K. Yoder, C. Wang, P. E. Yeske, C. L. Mowrer, and C. a. Haley. 2013. Assessment of the economic impact of porcine reproductive and respiratory syndrome virus on United States pork producers. J. Swine Heal. Prod. 21:72–84.

Hu, J., and C. Zhang. 2014. Porcine reproductive and respiratory syndrome virus vaccines: current status and strategies to a universal vaccine. Transbound. Emerg. Dis. 61:109–20.

Huang, J., X. Guo, N. Fan, J. Song, B. Zhao, Z. Ouyang, Z. Liu, Y. Zhao, Q. Yan, X. Yi, A. Schambach, J. Frampton, M. a Esteban, D. Yang, H. Yang, and L. Lai. 2014. RAG1/2 Knockout Pigs with Severe Combined Immunodeficiency. J. Immunol. 193:1496–503.

Jacobs, C., Y. Huang, T. Masud, W. Lu, G. Westfield, W. Giblin, and J. M. Sekiguchi. 2011. A hypomorphic Artemis human disease allele causes aberrant chromosomal rearrangements and tumorigenesis. Hum. Mol. Genet. 20:806–819.

Kimman, T. G., L. A. Cornelissen, R. J. Moormann, J. M. J. Rebel, and N. Stockhofe-Zurwieden. 2009. Challenges for porcine reproductive and respiratory syndrome virus (PRRSV) vaccinology. Vaccine 27:3704–3718.

Kuzmuk, K. N., and L. B. Schook. 2011. Pigs as a Model for Biomedical Sciences. Genet. Pig Sec2:426–444.

Lee, K., D.-N. Kwon, T. Ezashi, Y.-J. Choi, C. Park, A. C. Ericsson, A. N. Brown, M. S. Samuel, K.-W. Park, E. M. Walters, D. Y. Kim, J.-H. Kim, C. L. Franklin, C. N. Murphy, R. M. Roberts, R. S. Prather, and J.-H. Kim. 2014. Engraftment of human iPS cells and allogeneic porcine cells into pigs with inactivated RAG2 and accompanying severe combined immunodeficiency. Proc. Natl. Acad. Sci. U. S. A. 111:7260–7265.

Lee, P. P., L. Woodbine, K. C. Gilmour, S. Bibi, C. M. Cale, P. J. Amrolia, P. A. Veys, E. G. Davies, P. A. Jeggo, and A. Jones. 2013. The many faces of Artemis-deficient combined

198 immunodeficiency — Two patients with DCLRE1C mutations and a systematic literature review of genotype–phenotype correlation. Clin. Immunol. 149:464–474.

Lewis, C. R. G., T. Ait-Ali, M. Clapperton, A. L. Archibald, and S. Bishop. 2007. Genetic perspectives on host responses to porcine reproductive and respiratory syndrome (PRRS). Viral Immunol. 20:343–358.

Lewis, C. R. G., M. Torremorell, L. Galina-Pantoja, and S. C. Bishop. 2009. Genetic parameters for performance traits in commercial sows estimated before and after an outbreak of porcine reproductive and respiratory syndrome. J. Anim. Sci. 87:876–884.

López-Soria, S., M. Nofrarías, M. Calsamiglia, A. Espinal, O. Valero, H. Ramírez-Mendoza, A. Mínguez, J. M. Serrano, O. Marín, A. Callén, and J. Segalés. 2011. Post-weaning multisystemic wasting syndrome (PMWS) clinical expression under field conditions is modulated by the pig genetic background. Vet. Microbiol. 149:352–7.

Lunney, J. K. 2007. Advances in swine biomedical model genomics. Int. J. Biol. Sci. 3:179–184.

Lunney, J. K., J. P. Steibel, J. M. Reecy, E. Fritz, M. F. Rothschild, M. Kerrigan, B. Trible, and R. R. Rowland. 2011. Probing genetic control of swine responses to PRRSV infection: current progress of the PRRS host genetics consortium. BMC Proc. 5 Suppl 4:S30.

Mestas, J., and C. C. W. Hughes. 2004. Of mice and not men: differences between mouse and human immunology. J. Immunol. 172:2731–2738.

Meurens, F., A. Summerfield, H. Nauwynck, L. Saif, and V. Gerdts. 2012. The pig: A model for human infectious diseases. Trends Microbiol. 20:50–57.

Murtaugh, M. P., Z. Xiao, and F. Zuckermann. 2002. Immunological responses of swine to porcine reproductive and respiratory syndrome virus infection. Viral Immunol. 15(4):533-547.

Park, C., H. W. Seo, K. Han, I. Kang, and C. Chae. 2014. Evaluation of the efficacy of a new modified live porcine reproductive and respiratory syndrome virus (PRRSV) vaccine (Fostera PRRS) against heterologous PRRSV challenge. Vet. Microbiol. 172:432–42.

Patton, J. B., R. R. Rowland, D. Yoo, and K.-O. Chang. 2009. Modulation of CD163 receptor expression and replication of porcine reproductive and respiratory syndrome virus in porcine macrophages. Virus Res. 140:161–171.

Pearson, T., D. L. Greiner, and L. D. Shultz. 2008. Humanized SCID mouse models for biomedical research. Curr. Top. Microbiol. Immunol. 324:25–51.

Poinsignon, C., D. Moshous, I. Callebaut, R. de Chasseval, I. Villey, and J.-P. de Villartay. 2004. The metallo-beta-lactamase/beta-CASP domain of Artemis constitutes the catalytic core for V(D)J recombination. J. Exp. Med. 199:315–321.

199

Powell, E. J., N. M. Ellinwood, C. Loving, E. H. Waide, E. Snella, J. Jens, J. C. M. Dekkers, and C. K. Tuggle. Brotherly love: MHC matched allogenic bone marrow transfer rescues spontaneous immunodeficiency in SCID pigs. In Preparation.

Ramos, A. M., R. P. M. A. Crooijmans, N. A. Affara, A. J. Amaral, L. Alan, J. E. Beever, C. Bendixen, C. Churcher, R. Clark, P. Dehais, M. S. Hansen, J. Hedegaard, Z. Hu, H. H. Kerstens, A. S. Law, D. Milan, D. J. Nonneman, G. A. Rohrer, M. F. Rothschild, and T. P. L. Smith. 2009. Design of a High Density SNP Genotyping Assay in the Pig Using SNPs Identified and Characterized by Next Generation Sequencing Technology. 4:e6524.

Renukaradhya, G. J., X.-J. Meng, J. G. Calvert, M. Roof, and K. M. Lager. 2015. Live porcine reproductive and respiratory syndrome virus vaccines: Current status and future direction. Vaccine 33:4069–80.

Rowland, R. R. R., B. Robinson, J. Stefanick, T. S. Kim, L. Guanghua, S. R. Lawson, and D. A. Benfield. 2001. Inhibition of porcine reproductive and respiratory syndrome virus by interferon- gamma and recovery of virus replication with 2-aminopurine. Arch. Virol. 146:539–555.

Serão, N., O. Matika, R. A. Kemp, J. C. S. Harding, S. C. Bishop, G. S. Plastow, and J. C. M. Dekkers. 2014. Genetic analysis of reproductive traits and antibody response in a PRRS outbreak herd. J. Anim. 92:2905–2921.

Shultz, L. D., F. Ishikawa, and D. L. Greiner. 2007. Humanized mice in translational biomedical research. Nat. Rev. Immunol. 7:118–30.

Suzuki, S., M. Iwamoto, Y. Saito, D. Fuchimoto, S. Sembon, M. Suzuki, S. Mikawa, M. Hashimoto, Y. Aoki, Y. Najima, S. Takagi, N. Suzuki, E. Suzuki, M. Kubo, J. Mimuro, Y. Kashiwakura, S. Madoiwa, Y. Sakata, A. C. F. Perry, F. Ishikawa, and A. Onishi. 2012. Il2rg gene-targeted severe combined immunodeficiency pigs. Cell Stem Cell 10:753–758.

Swindle, M. M., A. Makin, A. J. Herron, F. J. Clubb, and K. S. Frazier. 2012. Swine as Models in Biomedical Research and Toxicology Testing. Vet. Pathol. 49:344–356.

Tanner, A., S. E. Taylor, W. Decottignies, and B. K. Berges. 2014. Humanized mice as a model to study human hematopoietic stem cell transplantation. Stem Cells Dev. 23:76–82.

De Villartay, J. P. 2009. V(D)J recombination deficiencies. Adv. Exp. Med. Biol. 650:46–58.

Vukasinovic, N., and A. C. Clutter. 2010. Analysis of Survival of Pigs Challenged with the Porcine Reproductive and Respiratory ( PRRS ) Virus. In: Proceedings of the Ninth World Congress on Genetics Applied to Livestock Production. Leipzig, Germany. p. 647–650.

Watanabe, M., K. Nakano, H. Matsunari, T. Matsuda, M. Maehara, T. Kanai, M. Kobayashi, Y. Matsumura, R. Sakai, M. Kuramoto, G. Hayashida, Y. Asano, S. Takayanagi, Y. Arai, K. Umeyama, M. Nagaya, Y. Hanazono, and H. Nagashima. 2013. Generation of Interleukin-2 Receptor Gamma Gene Knockout Pigs from Somatic Cells Genetically Modified by Zinc Finger Nuclease-Encoding mRNA. PLoS One 8:1–8.

200

APPENDIX A. RESULTS FROM PHASING OF SNP GENOTYPES FROM

COMMERCIAL ANIMALS TO PREDICT SCID CARRIER STATUS.

Table A.1. Number of animals carrying SCID haplotype 16 (h16) and h12 from 14,974 commercial pigs or pigs from the RFI selection line at ISU, which were sourced from commercial Yorkshire pigs. The breeds of pigs from the breeding company were unknown, with non-descript line identifiers (1, 2, and 3) provided. Haplotypes were based on 55 SNPs in the 2 Mb surrounding the Artemis gene.

Dataset Breed n h16 n h12 n n n n animals carriers carriers h16/h16 h12/h12 h12/h16 total

National Yorkshire 38 34 1 1 4 985 Swine Registry

Breeding 1 0 0 0 0 0 5,665 Company 2 0 209 0 0 0 5,492

3 0 78 0 0 0 2,832

Total 38 321 1 1 4 14,974

RFI - ISU Yorkshire 111 201 0 5 3 2,350

201

APPENDIX B. MOLECULAR GENETIC TESTS USED TO DETERMINE SCID

STATUS IN THE IOWA STATE UNIVERSITY RFI POPULATION.

Haplotype tests

In order to identify affected, carrier, and homozygous normal pigs in the ISU population that do not have genotype data, allele-specific primer sets were designed for seven SNPs in the region surrounding the Artemis gene, using Web-based Allele-Specific PCR (WASP;

Wangkumhang et al, 2007). PCR cycling conditions for each test included denaturation at 94°C for 30 seconds (30s), annealing at temperatures indicated for each primer pair in table A.1 for 30 seconds, and elongation at 72°C for the time indicated in Table B.1 (E-Time). Each PCR was begun with an initial hot start of 95°C for 2 minutes (2m), 40 cycles of amplification, and ended with a final elongation step at 72°C for 5m. Table B.1 gives the primer sequences, PCR condition information, location, and genotypes for the seven SNPs for which allele specific primers were designed. The bottom portion of the table indicates genotypes, locations of the

SNPs, and the number of the SNP in the 1 Mb haplotype from Figure B.1 in parentheses. Three of these SNPs are part of the 21 SNPs in the 1 Mb used for haplotype analysis, as indicated in

Table B.1. The remaining allele-specific primer sets were designed for SNPs in the 1 Mb upstream of Artemis. These SNPs were chosen as they segregated with the phenotype and the primer design software was able to produce primers sets that were very specific for each allele.

202

Table B.1. Primer and PCR conditions, genotypes, location and number of the SNPs. For reference, the Artemis gene (DCLRE1C ENSSSCG00000011049) is located at Chromosome 10: 51,553,277-51,597,276:

SNP Product Anneal E- Name Allele Size FWD Primer Sequence REV Primer Sequence Temp Time A 177 TCCTCTGACCAAGCCTCTG TCGTCCATGTACCAGAGC 56 30s ASGA- T (SEQ ID NO:11) CT (SEQ ID NO:12) 0048074 C 177 TCCTCTGACCAAGCCTCTG CGTCCATGTACCAGAGCC 56 30s T (SEQ ID NO:13) G (SEQ ID NO:14) A 659 AACCAGTCCCTGACCAAC TCCATATTTGTTAAGGGCA 54 30s ASGA- TG (SEQ ID NO:15) GTAATCT (SEQ ID NO:16) 0048114 G 131 TGCTCAGAGCTTTACATG GGCCCATGTTGACATAAA 54 30s GATTTAG (SEQ ID NO:17) GC (SEQ ID NO:18) A 643 TCCTCTGCAGGGTTTCAAA CAGGGTGTGGGACTTTGT 54 30s ALGA- G (SEQ ID NO:19) T (SEQ ID NO:20) 0059043 G 127 TCAGCTTGGGCAGCTAGG CCACAGGCACATTGATCT 54 30s (SEQ ID NO:21) TG (SEQ ID NO:22) A 576 AGTTGAAATCAAAGTATC AACTGTAACAAGCGTCCC 55 30s H3GA- CCAA (SEQ ID NO:23) TTTCT (SEQ ID NO:24) 0030245 C 576 AGTTGAAATCAAAGTATC AACTGTAACAAGCGTCCC 55 30s CCAA (SEQ ID NO:25) TTTCG (SEQ ID NO:26) ALGA- A 593 GGTATTCTCCTCCTCTACC CTGGATTGGCAGAGGCTC 55 30s 059061 TCT (SEQ ID NO:27) TTTAT (SEQ ID NO:28) G 593 GGTATTCTCCTCCTCTACC CTGGATTGGCAGAGGCTC 55 30s TCT (SEQ ID NO:29) TTTAC (SEQ ID NO:30) ALGA- A 475 GAATGGGAGGTGAGTAAG CCAGCTGCAAGGGAGACT 55 30s 0059066 TAAA (SEQ ID NO:31) (SEQ ID NO:32) C 475 GAATGGGAGGTGAGTAAG CCAGCTGCAAGGGAGACG 55 30s TAAA (SEQ ID NO:33) (SEQ ID NO:34) ALGA- A 425 AGCATTAAGACTGTGTGT GGTCAAAGTCGTGGGTGT 55 30s 0059080 GTGT (SEQ ID NO:35) GTTT (SEQ ID NO:36) G 425 AGCATTAAGACTGTGTGT GGTCAAAGTCGTGGGTGT 55 30s GTGT (SEQ ID NO:37) GTTC (SEQ ID NO:38) ASGA- ASGA- ALGA- H3GA- ALGA- ALGA- ALGA- 0048074 0048114 0059043 0030245 059061 0059066 0059080 51,153,137 51,812,252 51,975,024 52,066,694 52,086,867 52,109,172 52,174,549 Location (5) (16) (21) h12 C G G C A C A h16 C A G C A C A Normal A G A A G A G

203

Figure B.1. The 12 and 16 haplotypes that are associated with the SCID phenotype. Sire 15801 carries the 12 haplotype and sire 11403 and the 4 dams that produced the initial four SCID piglets, carry the 16 haplotype.

DNA and cDNA sequencing to identify the causative mutations

Total RNA was extracted from two sources: whole tissue from ears of 1-day-old piglets, and fibroblasts that were cultured from ear snip tissue. mRNA was reverse transcribed to cDNA.

Artemis transcripts were then amplified from the cDNA using the following primers: forward 5’-

GGATCCGTGTTCGCCAACGCT-3’ (SEQ ID NO:39 in Patent No. US-2015-0216147) and reverse 5’-GCGGCCGCAGAGCTGCCTTTTAGGTTAT-3’ (SEQ ID NO:40 in Patent No. US-

2015-0216147), using an annealing temperature of 60°C and an extension time of 2m 30s. The full Artemis cDNA was amplified using PCR technique, cloned into a TOPO vector, then plasmids transferred into E. coli bacteria which were then grown on LB plates with Ampicillin.

Individual colonies were then picked into LB-Ampicillin media and PCR amplified to ensure that each colony contained a vector with Artemis cDNA. Positive PCR products were cleaned

204 using Exo-Sap and sent to the Iowa State University DNA Facility for sequencing.

Supplemental Figure 3.1 shows the different transcripts observed in cells from affected pigs.

While many types of transcripts were observed, specific parts of the gene are missing for transcripts from the 12 and the 16 haplotypes.

For haplotype 16, all transcripts were found to lack exon 8, which results in a transcript that is 141 bases shorter than the expected full transcript. The sequence and translation of the longest cDNA clone from haplotype 16 is shown in Sequence and Translation 1 (SEQ ID NOS:1 and 2, respectively, in Patent No. US-2015-0216147). Exon 8 and the surrounding region were amplified from genomic DNA extracted from ear snip tissue to investigate the presence or absence of signal sequences required for normal splicing. Exons 7 and 8 and portions of the surrounding introns were amplified using the following primers: forward 5’-

CTCAGTGGGTTAGGGACCTG-3’ (SEQ ID NO:41 in Patent No. US-2015-0216147) and reverse 5’- GCCATCTGATAGGGTTTCCA-3’ (SEQ ID NO:42 in Patent No. US-2015-

0216147), with annealing temperature of 54°C and elongation time of 1m. Animals homozygous for the 16 haplotype were found to have a point mutation of GàA at the 615th base of the PCR product (Sequence 4; SEQ ID NO:7 in Patent No. US-2015-0216147). This base is part of the signal sequence for a splice donor site for exon 8, and is required for proper splicing. The A mutation was seen only in homozygous 16 haplotype animals (Sequence 4; SEQ ID NO:7 in

Patent No. US-2015-0216147), but a G at this position was seen in the 12 haplotype and all normal haplotypes (Sequence 5; SEQ ID NO:8 in Patent No. US-2015-0216147) that were sequenced.

Transcripts from the 12 haplotype were found to lack a 137 base pair long exon 10, which would cause a frameshift mutation, resulting in a stop codon shortly after the missing

205 exon. Sequence and translation of the longest cDNA clone from haplotype 12 are shown in

Sequence and Translation 2 (SEQ ID NOS:3 and 4 in Patent No. US-2015-0216147). Exons 10 and 11 and portions of the surrounding introns were amplified from genomic DNA of homozygous haplotype 12 animals using the following primers: forward 5’-

GCTAAAGTCCAGGCCAGTTG-3’ (SEQ ID NO:43 in Patent No. US-2015-0216147) and reverse 5’- CAAGAGTCCCCACCAGTCTT-3’ (SEQ ID NO:44 in Patent No. US-2015-

0216147), with annealing temperature of 56°C and elongation time of 1m. The signal sequence required for splicing of exon 10 was the same as that seen for normal and reference sequences.

However, sequencing of the genomic region around exon 10 identified a nonsense point mutation in the 12 haplotype exon 10 sequence at base 116 of the PCR product (Sequence 6; SEQ ID

NO:9 in Patent No. US-2015-0216147), with a mutation from GàA. Under normal transcription, splicing and translation, this mutation from GàA would change a Tryptophan amino acid codon to a stop codon, resulting in a protein that ends with 266 of the total 712 amino acids of the normal Artemis protein. We have not observed such a transcript in 12/12 animal cells thus far. For the 12 haplotype transcript we do observe, the predicted protein translation would be 277 amino acids long (Translation 2; SEQ ID NO:2 in Patent No. US-2015-0216147).

The genomic sequences for exons 10 and 11 are shown in sequence 7 (SEQ ID NO:10 in Patent

No. US-2015-0216147).

These results for the 12 and 16 haplotype sequencing did not show any transcripts containing all exons expressed in pig cells that were homozygous for 12 or the 16 through rt-

PCR amplification using primers at the beginning and end of the normal transcript. Thus a normal Artemis protein cannot be produced from any of the RNAs observed. For the 16 haplotype, a normal amount of RNA should be present in these cells, which will make a protein

206 that missing 47 amino acids. For the 12 haplotype, the protein produced would be severely truncated if a transcript containing exon 10 was produced; however, no evidence that such a transcript is stably produced in 12/12 homozygous animals was found. Translation of the transcript observed would also result in a truncated protein.

While >150 different mutated alleles of Artemis have been found through 2010 (Pannicke et al. 2010), none of the reported mutant alleles are exactly the same as the two mutants we have identified or as their predicted RNAs/proteins. Thus, it is difficult to be sure of the actual resulting residual level of activity expressed by these mutations. We anticipate that the protein predicted to be expressed from the 16 mutation may have some residual activity, while the protein predicted to be expressed from the 12 mutation would be completely non-functional.

Genotyping Methods Used for Crossbred Animals

Known carrier pigs were mated to Duroc, Landrace, or crossbred animals in order to introduce a more diverse genetic background into the SCID carrier population. Offspring that resulted from these matings were tested for carrier status (50% of each litter produced expected to be SCID carriers) using the seven sets of allele-specific primers described above. The genetic tests performed were not able to accurately distinguish carriers from homozygous normal piglets.

To genotype these crossbred animals, genomic regions containing exons with causative mutations (exon 8 for suspected haplotype 16 carriers and exon 10 for suspected haplotype 12 carriers) were amplified, cleaned using Exo-Sap, then sequenced. This method of genotyping is most accurate in animals outside of the ISU Yorkshire population, such as animals of other breeds or commercial animals.

207

Direct Genetic Tests for the SCID mutations

Figure 3.5 shows the orientation to the two different mutations, their relationship to each other along the chromosome, and the new genetics tests for SCID in pigs.

Specific tests were developed for the mutant sequence at each of the two variants in the

Artemis/DCLRE1C gene. For the causative mutation in haplotype 16, found at position 615 of

Sequence 4 (SEQ ID NO:7), the test was established for all animals in which it was assessed.

For the causative mutation in haplotype 12, found at position 116 of Sequence 6 (SEQ ID NO:9 in Patent No. US-2015-0216147), the test was able to genotype most animals but not all (see information below).

The ABI PRISM® SNaPshot™ Multiplex Kit (Lifetechnologies, P/N 4323151) technology was used for the tests. SNaPshot can be used to distinguish four nucleotides at a specific position in a sequence by color. First, a PCR is performed to amplify the region surrounding the SNP of interest, i.e. one PCR for the region surrounding haplotype 12 (h12) SNP and one for the region surrounding haplotype 16 (h16) SNP. The primers used are called the outside primers for the h12 SNP and for the h16 SNP in Table B.2. The melting temperature for both PCRs is Tm=54°C. Next, the PCR products are cleaned from remaining primers and dNTPs through adding 5 units of Shrimp Alkaline Phosphatase (SAP, USB Corporation, P/N 70092) and

2 units of Exonuclease I (ExoI, New England BioLabs, P/N M0293) to every 15ul of PCR product. To get to the right concentration, ExoI is diluted in a buffer containing 80mM Tris-HCl and 2mM MgCl2. The mixture of PCR product, SAP and ExoI is incubated at 37°C for 1 hour and deactivated at 75°C for 15 minutes. The cleaned PCR products (one for each SNP) of one animal can thereafter be pooled and analyzed together with the SNaPshot™ Multiplex Kit. This

208 kit encloses a SNaPshot Multiplex Ready Reaction Mix that contains fluorescently labeled ddNTPs and AmpliTaq® DNA polymerase in a reaction buffer. In a reaction volume of 10ul, we have 5ul of the SNaPshot Multiplex Ready Reaction Mix, 3ul of pooled PCR products and

0.2uM of each extension primer (for sequence see Table B.1). The extension reaction goes as followed: 25 cycles of 10 minutes at 96°C, 5 minutes at 50°C and 30 minutes at 60°C. 1 unit of

SAP is added after this reaction to remove unincorporated fluorescent ddNTPs from the reaction.

This SAP incubation is done at 37°C for 1 hour and deactivated at 75°C for 15 minutes. To read the fluorescent signal, this product needs to run on an ABI Prism 3700 DNA Analyzer (or 310 or

3100 Genetic Analyzer), using the GeneScan 120 LIZ size standard, with dye set E5.

Table B.2. PCR Primers and Oligonucleotides used in SNaPshot Direct Genetic Tests for the Causative Mutations

Haplotype Primer Primer Primer Sequence SCID Normal Use Directio 5’ – 3’ allele allele n PCR Forward CTCAGTGGGTTAGGGACCTG (SEQ ID NO:46) Reverse GCCATCTGATAGGGTTTCCA (SEQ ID NO:47) h16 SNaPshot Forward GAGGAGTTCGGAGTCCAG (SEQ ID NO:48) A G Reverse TCTCTGGGAGAAAGAGCCCTCAGGTA (SEQ ID T C NO:49) PCR Forward GCATTCACTCAGGCTGCTTT (SEQ ID NO:50) Reverse CCCAGGAAATACTGGCTCAT (SEQ ID NO:51) h12 SNaPshot Forward CCCCATCTTTTTTAGGCAGAGGAATATTTTCATT (SEQ ID A G NO:52) Reverse CTCTCTCTCTCTCCGGATGTTATTCCACAGGGTAATTTAT T C TC (SEQ ID NO:53)

Both tests are performed in matings in which the parents contain both the h16 and h12 alleles, in order to correctly determine the combined genotype at both positions. In matings where only one SCID haplotype is segregating, only one of the two tests needs to be performed.

We have not observed an animal that carries both mutations on the same chromosome, the two mutations appear to always be on separate chromosomes.

209

H16 test (position 615 of Sequence 4; SEQ ID NO:7 in Patent No. US-2015-0216147)

The h16 test correctly determines the allele at the h16 position (either G/C or A/T) for all possible genotypes. These include a GG result for normal animals at both the h12 and h16 positions, as well as h12/h12 animals, a GA result for h12/h16 and normal/h16 animals, and a

AA result for h16/h16 animals.

H12 test (position 116 of Sequence 6; SEQ ID NO:9 in Patent No. US-2015-0216147)

The h12 test detects the h12 mutant sequence at position 116 of SEQ ID NO:9 (in Patent

No. US-2015-0216147) but will not always detect the normal sequence on the other chromosome in a h12/h16 or a h12/normal animal. Recent results from SNaPshot genotyping of h12 mutation carriers have shown that the primers given above may preferentially amplify the h12 mutated allele when present. In the absence of the h12 mutated allele, SNaPshot results show the expected GG (normal/normal at this position) genotype. In certain cases, SNaPshot genotyping of animals with the known genotype of h12/h16 or h12/normal have shown only the h12 mutant allele (AA).

Combining technologies to obtain exact genotype at the Artemis locus

Correct identification of genotypes for animals that have resulted in these SNaPshot genotyping errors have been found through the use of the primers discussed in Table B.1 in addition to SNaPshot results from h16 allele genotyping. The primers of Table B.1 of are capable of determining whether an animal is SCID affected, carrier, or normal. This information, combined with the genotype from h16 SNaPshot will show the exact genotype of that animal.