Microsatellites and Their Genomic Distribution, Evolution, Function and Applications: a Review with Special Reference to Fish Genetics ⁎ Dimitry A
Total Page:16
File Type:pdf, Size:1020Kb
Aquaculture 255 (2006) 1–29 www.elsevier.com/locate/aqua-online Review Microsatellites and their genomic distribution, evolution, function and applications: A review with special reference to fish genetics ⁎ Dimitry A. Chistiakov a,b, , Bart Hellemans b, Filip A.M. Volckaert b a Department of Pathology and Laboratory Medicine, University of Cincinnati Medical Center, 231 Albert Sabin Way, Cincinnati, OH 45267-0529, USA b Laboratory of Aquatic Ecology, Katholieke Universiteit Leuven, Ch. de Bériotstraat 32, B-3000 Leuven, Belgium Received 31 March 2005; received in revised form 17 November 2005; accepted 25 November 2005 Abstract Microsatellites represent codominant molecular genetic markers, which are ubiquitously distributed within genomes. Due to their high level of polymorphism, relatively small size and rapid detection protocols, these markers are widely used in a variety of fundamental and applied fields of life and medical sciences. In the field of aquaculture, microsatellites represent workhorse markers, which are useful for the characterization of genetic stocks, broodstock selection, constructing dense linkage maps, mapping economically important quantitative traits, identifying genes responsible for these traits and application to marker-assisted breeding programmes. In this review, genomic distribution, function, evolution and practical applications of microsatellites are considered, with special emphasis on fish genetics and aquaculture. © 2005 Elsevier B.V. All rights reserved. Keywords: Evolution; Functional relevance; Genomic distribution; Microsatellite; SSR; Teleostei Contents 1. Introduction ....................................................... 2 2. Genomic distribution of microsatellites ......................................... 2 3. Evolution of microsatellites ............................................... 3 4. Function of microsatellites ................................................ 5 4.1. DNA structure................................................... 5 4.2. DNA recombination................................................ 5 4.3. DNA replication.................................................. 5 4.4. Gene expression.................................................. 5 5. Development of type I (coding) and type II (non-coding) markers................................................... 7 ⁎ Corresponding author. Department of Pathology and Laboratory Medicine, University of Cincinnati Medical Center, 231 Albert Sabin Way, Cincinnati, OH 45267-0529, USA. Tel.: +1 513 558 4402; fax: +1 513 558 2141. E-mail address: [email protected] (D.A. Chistiakov). 0044-8486/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.aquaculture.2005.11.031 2 D.A. Chistiakov et al. / Aquaculture 255 (2006) 1–29 6. Applications of microsatellites ..............................................9 6.1. Genetic mapping ..................................................9 6.2. Individual DNA identification and parentage assignment ............................ 12 6.3. Phylogeny, population and conservation genetics ................................ 13 6.4. Molecular epidemiology and pathology ..................................... 15 6.5. Quantitative trait loci mapping.......................................... 16 6.6. Marker-assisted selection ............................................ 18 7. Prospects ........................................................ 20 Acknowledgements ...................................................... 21 References .......................................................... 21 1. Introduction in the smallest bacterial genomes (Gur-Arie et al., 2000). The existence of SSRs in eukaryotic genomes Microsatellites, or simple sequence repeats (SSRs), has been known since the 1970s (Bruford et al., 1996). represent a unique type of tandemly repeated genomic Hamada et al. (1982) demonstrated a large number and sequences, which are abundantly distributed across gen- wide occurrence of these sequences from yeast through omes and demonstrate high levels of allele polymor- to vertebrates. Tautz and Renz (1984) hybridized differ- phism. They are codominant markers of relatively small ent microsatellite sequences to genomic DNA from a size, which can be easily amplified with the polymerase variety of organisms and reported many types of simple chain reaction. These features provide the foundation sequences. for their successful application in a wide range of fun- The majority of microsatellites (30–67%) found are damental and applied fields of biology and medicine, dinucleotides. In the genome of vertebrates, (AC)n is the including forensics, molecular epidemiology, parasito- most common dinucleotide motif. It is 2.3-fold more logy, population and conservation genetics, genetic map- frequent than (AT)n, the second most general type of ping and genetic dissection of complex traits. In the field dinucleotides (Toth et al., 2000). Interestingly, in pri- of fisheries and aquaculture, microsatellites are useful for mates mononucleotide repeats are mostly represented the characterization of genetic stocks, broodstock selec- by poly (A/T) tracts, which are the most frequent classes tion, constructing dense linkage maps, mapping econom- of SSRs (Beckmann and Weber, 1992). In total, higher- ically important quantitative traits and identifying genes order SSR classes (tri-, tetra-, penta-and hexanucluo- responsible for these traits and application in marker- tides) are about 1.5-fold less common in genomic DNA assisted breeding programmes. Although microsatel- of vertebrates than dinucleotides (Toth et al., 2000). In lites are considered selectively neutral markers, they the genome of Japanese pufferfish or fugu Takifugu often represent functionally relevant polymorphisms. rubripes, dinucleotide repeats have the highest relative SSRs contribute to DNA structure, chromatin organi- frequency (34%) followed by tetranucleotides (21%), zation, regulation of DNA recombination, transcription trinucleotides (19%), mononucleotides (16.5%), hexa- and translation, gene expression and cell cycle dyna- nucleotides (6%) and pentanucleotides (3%) (Edwards mics. To date, microsatellites are far from being com- et al., 1998). In total, 1.29% of the genome of Japanese pletely identified and characterised, providing intriguing pufferfish consists of microsatellites. For the closely perspectives for discovery of new properties and cha- related spotted green pufferfish Tetraodon nigroviridis, racteristics of SSRs, which will help the design of SSRs cover 3.21% of the genome (Crollius et al., 2000). new research fields and the practical use of these In fugu, one microsatellite is found every 1.87 kilobases markers. (kb) of DNA, and a CA repeat (the most common type of tandem repeat) occurs every 6.56 kb of DNA. In 2. Genomic distribution of microsatellites three-spined stickleback Gasterosteus aculeatus,CA dinucleotides also are the most common type of micro- Microsatellites are stretches of DNA consisting of satellites, occuring approximately once every 14 kb tandemly repeated short units of 1–6 base pairs (bp) in (Peichel et al., 2001). For comparison, in the human length. SSRs typically span between twenty and a few genome, one microsatellite was found every 6 kb and hundred bases (Beckmann and Weber, 1992). They are one CA repeat occurred every 30 kb of DNA (Beck- ubiquitous in prokaryotes and eukaryotes, present even mann and Weber, 1992). D.A. Chistiakov et al. / Aquaculture 255 (2006) 1–29 3 Microsatellites can be found anywhere in the ge- flounder and Japanese pufferfish (Venkatesh and Bren- nome, both in protein-encoding and noncoding DNA ner, 1997). (Toth et al., 2000). In eukaryotic organisms, SSRs have In contrast to other types of repeat motifs, triplets are been shown to be in excess in noncoding regions com- found in both coding and non-coding genomic regions pared to a random distribution pattern (Metzgar et al., with a high frequency (Wren et al., 2000; Morgante et 2000). They are relatively rare in coding DNA, ranging al., 2002). In all vertebrates, (G+C)-rich motifs (e.g., between 7–10% in higher plants (Wang et al., 1994; CCG, CAG) are the most common among trinucleo- Varshney et al., 2002) and between 9–15% in verte- tides. These repeats dominate in exons, whereas they brates (Moran, 1993; Jurka and Pethiyagoda, 1995; Van are less common in intronic sequences (Toth et al., Lith and Van Zutphen, 1996). Only 11.6% of a total of 2000). In humans, the expansion of trinucleotides, 6042 microsatellites were found in protein-coding encoding polyproline (CCG)n, polyarginine (CGG)n, regions in the genome of Japanese pufferfish (Edwards polyalanine [(GCC)n and (GCG)n] and polyglutamine et al., 1998). A relatively low frequency of SSRs in (CAG)n tracts within exons has been described. Such coding regions could be explained by negative selection expansions can lead to various neurodegenerative and against frameshift mutations in the translated sequences neuromuscular disorders, including myotonic distrophy, (Metzgar et al., 2000; Li et al., 2004). fragile X syndrome, Huntington's disease and spinocer- Di- and tetranucleotide motifs are mostly clustered ebellar ataxia (Jasinska et al., 2003; Brown and Brown, in noncoding regions. In vertebrates, they are distrib- 2004). Naturally occurring triplet repeat instability of uted 42- and 30-fold less frequently in exons than in transcribed sequences occurs not only in humans, but intronic sequences and intergenic regions, respectively also in lower vertebrates.