Mapping DNA Structural Variation in Dogs
Downloaded from genome.cshlp.org on October 1, 2021 - Published by Cold Spring Harbor Laboratory Press

Resource

Wei-Kang Chen,1,4 Joshua D. Swartz,1,4,5 Laura J. Rush,2 and Carlos E. Alvarez1,3,6

1Center for Molecular and Human Genetics, The Research Institute at Nationwide Children’s Hospital, Columbus, Ohio 43205, USA
2Department of Veterinary Biosciences, The Ohio State University, Columbus, Ohio 43210, USA
3Department of Pediatrics, The Ohio State University College of Medicine, Columbus, Ohio 43210, USA

DNA structural variation (SV) comprises a major portion of genetic diversity, but its biological impact is unclear. We propose that the genetic history and extraordinary phenotypic variation of dogs make them an ideal mammal in which to study the effects of SV on biology and disease. The hundreds of existing dog breeds were created by selection of extreme morphological and behavioral traits. And along with those traits, each breed carries increased risk for different diseases. We used array CGH to create the first map of DNA copy number variation (CNV) or SV in dogs. The extent of this variation, and some of the gene classes affected, are similar to those of mice and humans. Most canine CNVs affect genes, including disease and candidate disease genes, and are thus likely to be functional. We identified many CNVs that may be breed or breed class specific. Cluster analysis of CNV regions showed that dog breeds tend to group according to breed classes. Our combined findings suggest many CNVs are (1) in linkage disequilibrium with flanking sequence, and (2) associated with breed-specific traits. We discuss how a catalog of structural variation in dogs will accelerate the identification of the genetic basis of canine traits and diseases, beginning with the use of whole genome association and candidate-CNV/gene approaches.

[Supplemental material is available online at www.genome.org.]
Recent findings revealed that DNA structural variation (SV) is common in normal rodents (Graubert et al. 2007; Guryev et al. 2008) and primates (Iafrate et al. 2004; Sebat et al. 2007; Lee et al. 2008; for review, see Freeman et al. 2006). The term SV, and its subset copy number variation (CNV), refer to alterations from 1 kb to multi-Mb. The development of high-resolution array comparative genome hybridization (aCGH) allowed the genome-wide discovery of submicroscopic copy number changes (Ishkanian et al. 2004). More recently, other approaches have allowed other types of SV (e.g., inversions) and polymorphisms to be surveyed. For example, high-density single nucleotide polymorphism (SNP) platforms can allow detection of subsets of SNPs and SVs (Conrad et al. 2006; McCarroll et al. 2006). Most recently, paired end approaches have evolved to yield massive numbers of new SVs at ultra-high resolution (although requiring significant cost and effort) (Tuzun et al. 2005; Korbel et al. 2007).

Many human CNVs appear to be old because they are in linkage disequilibrium with nearby genetic markers (Hinds et al. 2006). Others are recent or recurrent. Notably, CNVs are enriched in regions that are rearrangement hotspots, such as those rich in segmental duplications (or “low copy repeats”) (Sharp et al. 2005). In humans, more than 5000 CNVs have been identified in normal individuals, and at least several hundred are common in the population (Redon et al. 2006; Korbel et al. 2007; Wong et al. 2007; Kidd et al. 2008). Most known CNVs span genes and appear likely to affect genetic networks. Thus, while SNPs and point mutations are more frequent, CNVs seem more likely to have an impact on phenotype. CNVs can have various effects, including changes in gene expression levels, disruption of gene dosage, unmasking of recessive alleles or regulatory polymorphisms, and loss of regulatory elements (for reviews, see Lupski and Stankiewicz 2005; Reymond et al. 2007).

Disease relevance of DNA copy number alteration was already appreciated from recurrent deletions of tumor suppressors (e.g., TP53 and RB1) and amplifications of oncogenes (e.g., MYC and EGFR) in cancer. However, since the recent discovery of CNV in germline DNA, several affected genes were shown to be associated with disease: CCL3L1 with HIV/AIDS, FCGR3 with glomerulonephritis, DEFB2 with Crohn’s, C4A/C4B with lupus, and PRSS1 with pancreatitis (Gonzalez et al. 2005; Aitman et al. 2006; Fellermann et al. 2006; Le Marechal et al. 2006; Fanciulli et al. 2007; Yang et al. 2007). CNV also provides material and mechanism for creating new genes. CNV thus accounts for a great deal of the genetic variation in animals and humans. For example, it was recently shown that the CNV landscape in flies is under natural selection (Emerson et al. 2008). Little is known about CNV in mammalian evolution, as only Euarchontoglires, the clade of primates and rodents, has been surveyed for CNV.

Dogs have extraordinary advantages as animal models (Lindblad-Toh et al. 2005): (1) they are numerous, (2) they share many diseases that are similar to those of humans, (3) they have a greater than fivefold reduced lifespan compared to humans, (4) they often have a high level of health care, and (5) they live in an environment that is very similar to that of their owners. Humans are more closely related to mice than to dogs (divergence times of 80 million years ago [Mya] and 90 Mya, respectively). However, dogs share ~650 Mb of ancestral sequence that is absent in mice, and the sequence divergence rate of dogs is half that of mice (Lindblad-Toh et al. 2005). This suggests that a subset of human biology is more similar to that of dogs than to that of mice. But the main advantage of canine models is their evolutionary history. Largely based on 20 yr of studying dog and pigeon breeding, Darwin (1859) began On the Origin of Species by saying that variation under domestication is the best clue to understanding evolution. He believed that selection is the most important aspect of evolution, and that it is exaggerated in domestication. In many cases, it also offers documentation of parental breeds (or subspecies), of the selection

4Co-first authors.
5Present address: Department of Chemistry, Vanderbilt University, Station B 351822, Nashville, TN 37235, USA.
6Corresponding author. E-mail [email protected]; fax (614) 722-2817.
Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.083741.108.
Genome Research 19:00–00 ©2009 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/09; www.genome.org

began our study by conducting aCGH on the well-characterized OSW T-cell lymphoma cell line from an Airedale Terrier (Kisseberth et al. 2007; Supplemental Table S1). The aCGH reference DNA for this hybridization, and for all others reported here, was from a pedigreed Boxer. Notably, OSW was previously analyzed on a low resolution BAC array, and several copy number alterations were validated (Kisseberth et al. 2007). Each spot of such BAC arrays is composed of many fragments 0.5–2 kb in size, but our spots correspond to single unique-sequence oligos. We thus consider these two platforms to cross-validate each other (as essentially different methodologies). This comparison also confirmed the mapping of the arrayed oligos. (In-depth analysis and discussion of the extensive copy number alteration present in this single lymphoma cell line are outside the scope of this work, and will be included in a second publication.) Figure 1 compares the linearized whole genome karyogram of OSW to that of a normal dog (Rottweiler), and shows a close-up of all of chr 11.
There is very strong correlation between the published low resolution BAC aCGH results and our largest alterations. This is most apparent for the copy number gain of nearly all of chr 13 and the one copy loss of almost all of chr 22 (cf. Fig. 1 in this study with Fig. 3 of Kisseberth et al. 2007). Kisseberth et al. (2007) additionally confirmed the DNA copy number changes of four genes within alterations detected in both studies (P16, MYC, RB1, PTEN). Beyond validating our methods, we also identified alterations previously implicated in canine lymphoma, such as the whole chromosome amplification of chr 13 (containing MYC) and loss of chr 11 and regions of chr 22 (containing RB1) and chr 26 (containing PTEN) (Hahn et al. 1994; Thomas et al. 2003; Modiano et al. 2007). The copy number alteration of some of those regions/genes, and others, are known to be associated with human lymphoma (e.g., P16 loss and MYC gain). Finally, we also identified novel candidate cancer genes (e.g., see SLITRK1 below).

Figure 1. Whole-genome DNA copy number quantification of the OSW dog lymphoma cell line by aCGH. DNA was cohybridized with a reference pedigreed Boxer (in two colors) to an oligo array with <5 kb mean spacing. Copy number is shown as log2 ratios, with gains above the midline and losses below. The top panel shows the linearized whole-genome copy number of a normal Rottweiler and the middle panel that of OSW. (There is no Y chromosome data, and the last segment before the X is the unmapped genome assembly.) Note the complete or nearly complete gain of chr 13 and losses of chr 22 and chr 32. The bottom panel shows all of OSW chr 11, with seven large single-copy losses and one small two-copy loss. The P16 tumor suppressor gene lies within the two-copy loss region.
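
The log2 ratios reported for Figure 1 can be related to copy number with simple arithmetic. The following is a minimal sketch and not the authors' analysis pipeline. It assumes an idealized hybridization in which the reference DNA carries two copies of an autosomal locus and signal scales linearly with copy number. The function names and the 0.3 calling threshold are hypothetical, chosen only for illustration.

```python
import math


def expected_log2_ratio(test_copies: int, reference_copies: int = 2) -> float:
    """Idealized log2(test/reference) ratio at a single locus.

    Assumption: signal is proportional to copy number and the reference
    is diploid (two copies) unless stated otherwise.
    """
    if reference_copies <= 0:
        raise ValueError("reference copy number must be positive")
    if test_copies < 0:
        raise ValueError("test copy number cannot be negative")
    if test_copies == 0:
        # A two-copy (homozygous) loss has no test signal in the ideal case.
        return float("-inf")
    return math.log2(test_copies / reference_copies)


def classify(log2_ratio: float, threshold: float = 0.3) -> str:
    """Label a log2 ratio as gain, loss, or no change.

    The threshold is a hypothetical illustration value, not one taken
    from the study.
    """
    if log2_ratio >= threshold:
        return "gain"
    if log2_ratio <= -threshold:
        return "loss"
    return "no change"


if __name__ == "__main__":
    for copies in (0, 1, 2, 3):
        ratio = expected_log2_ratio(copies)
        print(copies, ratio, classify(ratio))
```

Under these assumptions a single-copy loss gives a ratio of -1, a single-copy gain gives about 0.58, and a two-copy loss is unbounded below. Measured array values are noisier and are often compressed toward zero, so real calls would depend on the platform and on segmentation across neighboring probes.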