<<

Journal of Heredity 2010:101(2):246–250 Ó The American Genetic Association. 2009. All rights reserved. doi:10.1093/jhered/esp091 For permissions, please email: [email protected]. Advance Access publication November 25, 2009 Characterization of a Minimal Microsatellite Set for Whole Genome Scans Informative in and Coldblood Breeds

EVELYN H. MITTMANN,VIRGINIE LAMPE,STEFANIE MO¨ MKE,ALEXANDRA ZEITZ, AND OTTMAR DISTL

From the Institute of Animal Breeding and Genetics, University of Veterinary Medicine Hanover, Bu¨nteweg 17p, 30559 Hannover, Germany.

Address correspondence to Dr Ottmar Distl at the address above, or e-mail: [email protected].

The availability of a high-quality draft sequence of the horse reported by Swinburne et al. (2006) and Penedo et al. (2005), makes known the physical location of microsatellites. The Tozaki et al. (2004, 2007) and the radiation hybrid (RH) aim of the present study was to establish a highly maps of Chowdhary et al. (2003) and Raudsepp et al. (2008) polymorphic minimal screening set of microsatellite markers have provided diverse sets of microsatellite markers. for (MSSH) annotated on the The aim of the present study was to establish a highly assembly EquCab2.0. We have used the previously polymorphic microsatellite marker set which covers the reported linkage and radiation hybrid maps and have whole equine genome with evenly spaced markers and which extended these marker sets by filling in gaps as noted from is anchored on the horse genome assembly EquCab2.0. This annotation on the horse sequence. This MSSH covers all marker set should be highly informative and facilitate autosomes and the X chromosome with 322 evenly spaced mapping studies in the horse. Therefore, the number of microsatellites whose positions were determined on the alleles, heterozygosity (HET), and polymorphism information horse genome assembly (EquCab2.0). The average chro- content (PIC) were determined in a random sample of the mosomal distance among markers amounts to 7.44 Mb. worldwide distributed Hanoverian warmblood (HW) horse The characteristics established for this microsatellite set and a panel of several German coldblood horse breeds. were the number of alleles, the observed heterozygosity (HET), and the polymorphism information content (PIC) for Materials and Methods Hanoverian warmblood (HW) and several German cold- blood horse breeds (CB). The average number of alleles Development of the Marker Set and Animals for was 7.3 and 8.0 in HW and CB, respectively. HET was at Genotyping 71% for HW and CB, PIC at 65% (HW) and 67% (CB). This The Horsemap database at the INRA Biotechnology MSSH allows scanning of the whole horse genome at close Laboratories Home Page (http://locus.jouy.inra.fr) contains to 7- to 10-Mb resolution. information on 1538 microsatellites. At the NCBI nucleo- Key words: genome, heteroyzygosity, horse, microsatellite, PIC tide database (http://www.ncbi.nlm.nih.gov/), 24 124 equine microsatellites are available since July 2009 (queries on 8/19/2009). The MSSH was developed using micro- The equine genome is estimated to contain approximately satellites from the HORSEMAP homepage, the linkage 2.7 billion base pairs distributed on 31 autosomes and the X maps of Swinburne et al. (2006), Penedo et al. (2005), and chromosome. The identification of genetic loci and markers Tozaki et al. (2004, 2007) and the RH maps of Chowdhary as well as their spacing on the chromosomes is one of the et al. (2003) and Raudsepp et al. (2008). For all of these necessary preconditions for linkage studies including fine markers their physical location on the horse genome mapping of quantitative trait loci, parentage identification, assembly EquCab2.0 was determined using BLAST- or genetic diversity analyses. We used the latest horse like alignment tool (BLAT) and an e-value of 10À5 as genome assembly (EquCab2.0) to develop this minimal threshold. The expected size range of the markers was screening set of microsatellites markers for horses (MSSH). determined using the polymerase chain reaction (PCR) in The majority of microsatellites reported in the horse was silico function and the genome version ‘‘September 2007’’ of dinucleotide repeats (CA). The comprehensive linkage maps the University of California, Santa Cruz Genome Browser

246 Brief Communications

Table 1. Distribution of the number of alleles for the MSSH group of horses comprised several German coldblood horse breeds (CB), such as South German, Rhenish German, Number of markers Saxon–Thuringian, and Schleswig coldblood. Each of these Hanoverian German coldblood breeds has been influenced and refined by a number horses horses Number of European coldblood lines. Rhenish German and Saxon– of alleles n % n % Thuringian coldblood were strongly influenced by Belgian and Ardenner coldblood horses (Aberle et al. 2004). ,4 6 1.9 6 2.3 4–6 119 38.3 83 31.1 The markers were genotyped on an average number of 7–9 138 44.4 126 47.2 362 HW horses and 299 German coldblood horses. 10–12 39 12.5 30 11.2 13–15 5 1.6 13 4.9 PCR and Genotyping of Markers .15 4 1.3 9 3.4 Genomic DNA was isolated from ethylenediaminetetra- acetic acid blood samples using the QIAamp 96 DNA Spin (http://genome.ucsc.edu/cgi-bin/hgPcr?command5start). Blood Kit (Qiagen, Hilden, Germany). PCR was performed The distances between the markers were calculated and for all on PTC 100, PTC 200 (MJ Research, Watertown, MA), or chromosomal regions being not covered by microsatellites professional thermocyclers (Biometra, Go¨ttingen, Ger- within a distance of 10–20 Mb, we developed a total of many). A PCR program with varying annealing temperatures 54 new markers to bridge these gaps in this preliminary (Ta) and a general procedure has been used as follows: after marker set. 4 min of initial denaturation at 94 °C; 36 cycles of 30 s at 94 Therefore, we generated permutation sequences with all °C, 60 s at optimum annealing temperature, 30 s at 72 °C; variations of di-, tri-, and tetrarepeat motifs with a minimum and the final cooling to 4 °C for 10 min were carried out. All length of 15 repeats and a maximum length of 30 repeats. For PCR reactions were performed in 12 ll reaction volumes these sequences, alignments were identified in the EquCab2.0 using 10 ng DNA, 1.2 ll10Â incubation buffer containing assembly within 1–2 Mb of the targeted location. These newly 15 mM MgCl2, 0.5 ll dimethyl sulfoxide, 0.3 ll each dNTP developed markers are distributed on 14 autosomes (ECA1, (100 lM each) and 0.5 U Taq polymerase (Qbiogene, 2, 4, 5, 7, 10, 11, 15, 16, 18, 21, 22, 26, and 31). Heidelberg, Germany). All forward primers were labeled Each of the chosen markers in the MSSH has been with the IRD700 or IRD800 at the 5# end. characterized using the number of alleles, the observed To increase efficiency, primer pairs were pooled into HET and the PIC. HET shows the proportion of PCR multiplex groups of 2–9 markers. The remaining heterozygote individuals. PIC is defined as the probability primer pairs were amplified separately. The multiplex groups that the marker genotype of a given offspring will, in the and the separately amplified PCR products were pooled absence of crossing over, allow to deduce which one of the according to their size and labeling and diluted with 2 marker alleles the offspring received from its parents. formamide loading buffer in ratios from 1:10 to 1:30. For The values for HET and PIC were used to test the the analysis of the marker genotypes, the PCR products information content of all markers in this preliminary set. were size-fractionated by gel electrophoresis on an auto- Markers which did not show a minimum value of 4 alleles mated sequencer (LI-COR 4200/S-2, LI-COR 4300, and a HET and PIC greater than 0.50 were removed and Lincoln, NE) using 4% and 6% polyacrylamide denaturing replaced by more polymorphic markers as far as possible. gels (Rotiphorese Gel40, Carl Roth, Karlsruhe, Germany). Particularly in genomic regions with a low marker density or Allele sizes were scored against IRD700- and IRD800- with markers exhibiting low information content, new labeled DNA ladders. SAS/Genetics, version 9.2 (SAS microsatellites had to be developed. We chose several horse Institute, Cary, NC) was employed to calculate number of breeds for evaluation of the information content of the alleles, HET, and PIC. marker set. The reason for testing the degree of poly- morphisms of the microsatellites in several breeds was that HET and PIC vary among breeds and through genotyping Results of horses from several breeds, markers with a breed specific abundance of polymorphisms should be excluded. We have The mean number of alleles of the whole equine marker set chosen the HW as a representative breed for warmblood including 322 microsatellite markers was 7.3 in HW and 8.0 horses. This breed may also exhibit a similar degree of in CB (Table 1). Seven microsatellites had ,4 alleles for the polymorphisms like or breeds with larger HW and for the CB there were 6 markers with ,4 alleles. proportions of genes because the average HET and PIC values with values less than 0.50 were seen in proportion of Thoroughbred genes amounts to 35% in HW. 7 and 19 markers (HW) and in 4 and 11 markers (CB). Further significant contributions to Hanoverians have been However, these markers with a lower HET were retained as made by Arabians, Anglo-Arabians, , and other no other suitable markers were found to cover these strains (Hamann and Distl 2008). The regions. HW is the largest warmblood breed worldwide and has The marker with the lowest number of alleles was influenced many European warmblood breeds. The second UMNe244 which is located in a region on ECA14 where

247 Journal of Heredity 2010:101(2)

Table 2. Distribution of the observed HET for the MSSH on ECA1. On ECA3, we found the markers with the highest average number of alleles (8.6) in the HW. For the Number of markers CB, ECA31 showed an average number of alleles of 12.5. Hanoverian German coldblood The highest marker HET was observed on ECA21 for the horses horses HW (HET: 0.80) and for the CB on ECA31 (HET: 0.81). HET (%) n % n % The highest PIC values for HW were on ECA26 and 31 (PIC: 0.72) and for CB on ECA30 and 31 (PIC: 0.77). 30–40 2 0.6 0 0 40–50 5 1.6 4 1.5 50–60 37 11.9 32 12.0 60–70 98 31.5 83 31.1 Discussion 70–80 117 37.6 98 36.7 80–90 46 14.8 48 18.0 The aim of this study was to establish a MSSH that benefits 90–100 6 1.9 2 0.7 scientists using a microsatellite-based approach. Considering the 2 groups of horse breeds genotyped, the mean values for number of alleles, observed HET and PIC of this marker set differed only slightly. This may indicate that the markers only very few polymorphic microsatellites were identified. chosen may also be similarly polymorphic in other horse The marker ABGe001 had the highest number of alleles with breeds. However, a number of microsatellites exhibited large 24 for HW and 32 for CB. differences in their polymorphisms. Eleven markers had The mean values for HET and PIC slightly differed high information content in the HW or in the CB but not among HW and CB but overall the CB showed a higher vice versa. These markers were distributed on 10 chromo- degree of polymorphisms. The distribution of the HET and somes (ECA1, 2, 5, 14, 15, 17, 19, 22, 27, and the X PIC values for the MSSH is shown in Tables 2 and 3. chromosome). The characteristics of the MSSH such as average When comparing the positions of the microsatellites in chromosomal marker coverage, number of markers per the MSSH with the positions of the markers in the linkage chromosome, mean number of alleles per chromosome, map of Swinburne et al. (2006), the order of the markers on HET and PIC of the horses genotyped are given in Table 4. most chromosomes was consistent with the annotation on The mean spacing among the markers of the MSSH the horse genome assembly EquCab2.0. Marker order had according to the horse genome assembly EquCab 2.0 was been switched between TKY601 and COR020 on chromo- 7.44 Mb. There were 2 gaps of more than 20 Mb on ECA3 some 10. A less subtle difference in marker position was (21.8 Mb) and on the X chromosome (21.67 Mb). All evident for COR062, which had been located on the microsatellites of the MSSH including marker names and proximal end of ECA19 on EquCab1.0 in agreement with references, primer sequences, chromosomal location of the the linkage map of Swinburne et al. (2006). Using the BLAT markers on EquCab2.0, accession numbers for markers, results from EquCab2.0, it turned out that COR062 has to number of alleles, HET, and PIC for HW and CB are given be relocated on ECA5. Swinburne et al. (2006) could not in Supplementary Table 1. Multiplex groups for the MSSH make a distinction between the location of COR068 and are provided in Supplementary Table 2. All the markers of COR073 on ECA21 nor between TKY011 and LEX051 on the MSSH had amplicons in the expected size ranges, strong ECA15. In the present, physical map distances of 1.8 Mb and clear bands to distinguish the alleles, as low as possible and 3.0 Mb were found in between these 2 microsatellites. stutter bands, and no multiple products. On ECA25, the orientation of the markers in the linkage Due to the different lengths of the chromosomes, the maps of Swinburne et al. (2006) and Penedo et al. (2005) as number of markers ranged from 3 on ECA13 to 25 markers well is completely reversed in comparison to the present annotation on EquCab2.0. The order of the markers in the linkage map of Penedo et al. (2005) on ECA1, 4, 6, 10, 16, 18, 20, and 26 reflects Table 3. Distribution of the PIC for the MSSH minor discrepancies with the ordering in the MSSH. Some Number of markers adjacent markers switched their positions, like LEX058 and COR046 on ECA1, UCD465 and COR070 on ECA6, Hanoverian German coldblood TKY867 and LEX017 on ECA10, as well as TKY101 and horses horses TKY016 on ECA18. On ECA4, 10, and 20, 1 or 2 markers PIC (%) n % n % changed their positions in comparison with the MSSH and 20–30 1 0.3 0 0 were annotated more distally or proximally on the respective 30–40 2 0.6 2 0.7 chromosome. Chromosome 26 is oriented in opposite 40–50 16 5.1 9 3.4 directions in the linkage maps of Penedo et al. (2005) and 50–60 67 21.5 52 19.5 Swinburne et al. (2006). The comparative RH map of 60–70 111 35.7 99 37.1 Raudsepp et al. (2008) and the linkage map of Swinburne 70–80 100 32.2 80 30.0 et al. (2006) would support the marker order found in the 80–90 14 4.5 25 9.4 horse genome assembly EquCab2.0.

248 Brief Communications

Table 4. Average chromosomal coverage through the minimal screening set, number of markers per chromosome, average number of alleles (NA), observed HET, and PIC per horse chromosome (ECA)

Hanoverian horses German coldblood horses Average Number of ECA distance (Mb) markers NA HET PIC NA HET PIC 1 7.15 25 7.8 0.69 0.65 9.0 0.73 0.69 2 5.11 23 8.1 0.69 0.65 9.4 0.75 0.70 3 9.53 12 8.6 0.76 0.71 8.9 0.73 0.69 4 4.72 22 8.2 0.71 0.64 10.4 0.76 0.73 5 5.25 18 6.7 0.70 0.65 7.4 0.71 0.67 6 7.06 11 6.4 0.70 0.64 7.1 0.70 0.66 7 14.08 5 6.8 0.64 0.59 6.6 0.69 0.66 8 11.76 8 5.9 0.66 0.62 7.8 0.72 0.67 9 9.28 8 5.9 0.68 0.61 6.6 0.66 0.61 10 4.67 17 7.6 0.72 0.66 8.4 0.74 0.70 11 7.66 7 7.9 0.73 0.68 8.1 0.69 0.65 12 6.62 4 7.8 0.69 0.69 9.3 0.70 0.67 13 10.64 3 5.7 0.71 0.63 7.3 0.75 0.67 14 9.81 9 5.4 0.65 0.55 7.0 0.65 0.61 15 4.58 19 7.7 0.72 0.67 7.7 0.68 0.66 16 4.60 18 6.5 0.73 0.68 7.2 0.69 0.67 17 8.97 8 5.9 0.63 0.59 7.4 0.66 0.63 18 4.13 19 7.3 0.74 0.67 7.2 0.75 0.68 19 8.57 6 6.3 0.63 0.57 7.3 0.66 0.64 20 8.02 7 6.9 0.68 0.64 7.2 0.68 0.58 21 4.81 11 7.6 0.80 0.70 7.0 0.67 0.61 22 5.55 8 7.8 0.66 0.65 7.8 0.72 0.67 23 11.14 4 6.5 0.72 0.61 6.8 0.75 0.67 24 7.79 5 7.2 0.76 0.70 7.4 0.71 0.66 25 7.91 4 6.8 0.65 0.59 7.3 0.69 0.60 26 4.65 8 7.9 0.76 0.72 8.4 0.74 0.71 27 6.66 5 7.4 0.68 0.61 7.0 0.68 0.67 28 9.24 4 6.5 0.75 0.66 8.3 0.77 0.68 29 4.81 6 6.8 0.68 0.60 7.2 0.69 0.64 30 6.01 4 6.8 0.69 0.66 9.0 0.78 0.77 31 5.00 4 8.3 0.77 0.72 12.5 0.81 0.77 X 10.34 11 7.2 0.70 0.70 7.6 0.65 0.67

Between the most recently published RH map of phisms (SNPs) like the Illumina equine 50K beadchip, Raudsepp et al. (2008) and the horse genome assembly microsatellite-based linkage studies may often be an EquCab2.0, there is strong collinearity. Only a few minor alternative and can be employed as a first option for discrepancies were found in marker positions due to local disease and trait mapping. When the approach for disease rearrangements on ECA17, 18, and 25. The marker and quantitative trait mapping is mainly based on affected UMNe448 changed its location from ECA1 in the RH individuals or animals with extreme phenotypes, linkage map to ECA2 on horse sequence EquCab2.0. The marker studies can be performed, and in the case of the UCD464 was located on the proximal end of ECA25 using availability of a very informative marker set, these studies EquCab2.0, whereas its position has been reversed to the will be very worthwhile and attractive. distal end of ECA25 on the RH map. In summary, we have developed a resource for This MSSH presents a useful tool for linkage studies conducting genome-wide scans at about 7- to 10-Mb in many horse breeds including warmblood and cold- resolution in the horse. Previously published linkage and blood breeds, Thoroughbreds, and breeds with a signif- RH maps and especially the horse genome assembly enabled icant proportion of Thoroughbred blood. In us to develop this 322 microsatellite-based marker set for standardbreds, ponies, and primitive horse breeds this horses. The integration of this marker set with EquCab2.0 marker set was not yet tested for its information content. should further serve fine mapping, identification of positional candidate genes, and follow-up studies using A similar marker set comprising 507 markers at 5 Mb SNP panels or SNP chips. spacing was developed for the dog (Sargan et al. 2007). However, this MSSH does not allow identification of Supplementary Material disease-associated haplotypes by linkage disequilibrium (LD) mapping. Due to the high costs for LD mapping Supplementary material can be found at http://www. using beadchips based on single nucleotide polymor- jhered.oxfordjournals.org/.

249 Journal of Heredity 2010:101(2)

References Sargan DR, Aguirre-Hernandez J, Galibert F, Ostrander EA. 2007. An extended microsatellite set for linkage mapping in the domestic dog. Aberle K, Wrede J, Distl O. 2004. Analysis of relationships between J Hered. 98:221–231. German heavy horse breeds based on pedigree information. Berl Mu¨nch Swinburne JE, Boursnell M, Hill G, Pettitt L, Allen T, Chowdhary B, Tiera¨rztl Wschr. 117:72–75. Hasegawa T, Kurosawa M, Leeb T, Mashima S, et al. 2006. Single linkage Chowdhary BP, Raudsepp T, Kata SR, Goh G, Millon LV, Allan V, Piumi group per chromosome genetic linkage map for the horse, based on two F, Gue´rin G, Swinburne J, Binns M, et al. 2003. The first-generation whole- three-generation, full-sibling, crossbred horse reference families. Genomics. genome radiation hybrid map in the horse identifies conserved segments in 87:1–29. human and mouse genomes. Genome Res. 13:742–751. Tozaki T, Penedo MCT, Oliveira RP, Katz JP, Millon LV, Ward T, Hamann H, Distl O. 2008. Genetic variability in Hanoverian warmblood Pettigrew DC, Brault LS, Tomita M, Kurosawa M, et al. 2004. Isolation, horses using pedigree analysis. J Anim Sci. 86:1503–1513. characterization and chromosome assignment of 341 newly isolated equine Penedo MCT, Millon LV, Bernoco D, Bailey E, Binns M, Cholewinski G, microsatellite markers. Anim Genet. 35:462–504. Ellis N, Flynn J, Gralak B, Guthrie A, et al. 2005. International equine gene Tozaki T, Swinburne J, Hirota K, Hasegawa T, Ishida N, Tobe T. 2007. mapping workshop report: a comprehensive linkage map constructed with Improved resolution of the comparative horse-human map: investigating data from new markers and by merging four mapping resources. Cytogenet markers with in silico and linkage mapping approaches. Gene. 392:181–186. Genome Res. 111:5–15. Raudsepp T, Gustafson-Seabury A, Durkin K, Wagner ML, Goh G, Received June 7, 2009; Revised September 12, 2009; Seabury CM, Brinkmeyer-Langford C, Lee EJ, Agarwala R, Stallknecht-Rice Accepted September 25, 2009 E, et al. 2008. A 4,103 marker integrated physical and comparative map of the horse genome. Cytogenet Genome Res. 122:28–36. Corresponding Editor: Dr. Ernest Bailey

250