
Natural insertions in rice commonly form tandem duplications indicative of patch-mediated double-strand break induction and repair

Justin N. Vaughn and Jeffrey L. Bennetzen¹
Department of Genetics, University of Georgia, Athens, GA 30602
Contributed by Jeffrey L. Bennetzen, December 4, 2013 (sent for review September 13, 2013)

The insertion of DNA into a genome can result in the duplication and dispersal of functional sequences through the genome. In addition, a deeper understanding of insertion mechanisms will inform methods of genetic engineering and plant transformation. Exploiting structural variations in numerous rice accessions, we have inferred and analyzed intermediate length (10–1,000 bp) insertions in plants. Insertions in this size class were found to be approximately equal in frequency to deletions, and compound insertion–deletions comprised only 0.1% of all events. Our findings indicate that, as observed in humans, tandem or partially tandem duplications are the dominant form of insertion (48%), although short duplications from ectopic donors account for a sizable fraction of insertions in rice (38%). Many nontandem duplications contain insertions from nearby DNA (within 200 bp) and can contain multiple donor sources, some distant, in single events. Although replication slippage is a plausible explanation for tandem duplications, the end homology required in such a model is most often absent and rarely is >5 bp. However, end homology is commonly longer than expected by chance. Such findings lead us to favor a model of patch-mediated double-strand-break creation followed by nonhomologous end-joining. Additionally, a striking bias toward 31-bp partially tandem duplications suggests that errors in nucleotide excision repair may be resolved via a similar, but distinct, pathway. In summary, the analysis of recent insertions in rice suggests multiple underappreciated causes of structural variation in eukaryotes.

double-strand break repair | structural DNA variation

Significance
Very short insertions are usually attributable to replication slippage. Another class of longer insertions (>10 bp) creates tandem duplications even in the absence of preexisting repeats. This work provides analysis into the properties and mechanistic implications of such insertion polymorphisms segregating in rice. To our knowledge, this work is the first comprehensive analysis of this major class of natural mutations in any plant. Inspired by the prior experiments of Stéphane Vispé and Masahiko Satoh, we propose a model for how a substantial number of double-strand breaks are created and how they might result in tandem duplications. The model, based on patch-mediated nick repair, is indirectly supported by recently published experiments using a modified CRISPR-associated 9 nicking enzyme.

Author contributions: J.N.V. and J.L.B. designed research; J.N.V. performed research; J.N.V. analyzed data; and J.N.V. and J.L.B. wrote the paper.
The authors declare no conflict of interest.
Freely available online through the PNAS open access option.
¹To whom correspondence should be addressed. E-mail: [email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1321854111/-/DCSupplemental.
6684–6689 | PNAS | May 6, 2014 | vol. 111 | no. 18
www.pnas.org/cgi/doi/10.1073/pnas.1321854111

Genomic DNA insertion causes genome expansion and, potentially, the rearrangement and diffusion of protein domains and regulatory elements throughout the genome (1, 2). Additionally, genetic engineers generally aim to integrate specific DNA into the nuclear genome, so the natural mechanisms by which this integration occurs may serve as a starting point to elaborate and improve genome modification (3, 4). Common causes of gene-sized insertions are unequal recombination (5), transposable element replication (1), and ectopic recombination stimulated by double-strand breaks (DSBs) in the genome (2, 6). Shorter events are less well characterized, but it appears that these can be created by similar processes (7). Still, high-throughput sequencing of DSB repair events in humans (8) and plants (9) suggests that insertions related to induced breaks are very rare and very short.

Although the processes described above can produce duplications at distant genetic loci, the most common form of nonmicrosatellite-associated insertions in humans is tandem duplications (10). Once created, tandem duplications can be dramatically expanded by unequal recombination or replication slippage. Such duplications may be deleterious, or they may be promoted by selection for a novel or expanded function (11, 12).

Although tandem repeats are ubiquitous in eukaryotic genomes, the mechanisms for their origin are still in question. Early analysis of human indel mutations indicated that replication slippage was the most effective model to explain the origin of assorted repeats (13). In other studies, longer, de novo tandem duplications were also hypothesized to be caused by slippage because, out of 85 insertions producing such duplications, 50 were associated with flanking repeats >2 bp (14). Replication slippage would presumably require a preexisting short repeat because priming must occur between the end of the loop that will become the duplication and the position to where replication slips. Authors of more recent work investigating insertions across the human genome suggest alternatives to replication slippage on the grounds that homology is often either nonexistent or very short, whereas the length of homology and the length of insertion are not correlated (10). These researchers favor a model based on DSBs being repaired by nonhomologous end-joining (NHEJ). However, conventional models of DSB repair are strained to predict tandem duplications >10 bp, much less >100 bp. Such models require extensive single-stranded, complementary ends to be preserved during the break. Moreover, DSBs produced by Tal-effector nucleases in humans do not yield insertions that form tandem repeats, despite the fact that the breaks generate a 5′ overhang (15). Thus, this common class of mutations currently lacks a firm molecular explanation.

Similar to tandem duplications, short duplications are commonly found within 100 bp of one another, but with unique intervening DNA (16). By comparing human polymorphisms with chimp sequence, Thomas et al. (16) inferred that the repeats were recent insertions. As discussed by the authors and herein, a mechanism for such duplications is even less forthcoming than for tandem duplications.

In this study, we used extensive population-scale rice resequencing data to confirm that tandem duplications are also abundant natural polymorphisms in the plant kingdom. Additionally, we found that many insertions in rice, although not perfectly tandem, are from a ∼50-bp window around the insertion site. We rarely found the end homology in tandem repeats that is expected for replication slippage, although we did note a bias toward short microhomology between insertion ends and insertion site. These data led us to elaborate on the DSB model of tandem duplication, proposing that long patch base excision repair (BER) on complementary strands commonly leads to such patterns (17). Additionally, we characterized common forms of nontandem, but local, duplication.

Results

Inferring Insertions. Recent mutational events can be inferred by comparing orthologous sequences between two lineages with an orthologous sequence in a known outgroup lineage (1). The extant state in the sister lineages matching the outgroup state is inferred to be the ancestral state. Although such inferences may be false due to segregating polymorphisms in the ancestral population, they are generally valid for Oryza sativa (rice) comparisons using Oryza glaberrima as an outgroup (1). Insertion variations segregating in an extant population are more likely to be recent mutations and, hence, are less likely to complicate interpretation resulting from multiple events and sequence divergence. For these reasons, we chose to use recently published data regarding genetic variation in a sample of 50 rice accessions (18). In that study, indels

[...]

DSB repair outcomes, which commonly produce deletions (6, 9), whereas we inferred approximately equivalent frequencies of insertions relative to deletions. As expected, the number of structural variations correlated with the size of a chromosome. Chromosome 3 had the highest percentage of inferable events, likely because of its high-quality assembly.

In the human genome, short insertions (8–100 bp) commonly create tandem duplications that are in the same orientation and have no unique spacer sequence between the resultant repeats (10). Such insertions are impossible to position exactly (Fig. 1 A–C), and so we used the trace extension metric, d, to first characterize the inferred insertions. As illustrated in Fig. 1 A–C and Messer and Arndt (10), d can characterize whether an insertion is a tandem duplication (l ∼ d in Fig. 1B) or comes from a more distant site (d = 0 in Fig. 1A). Also, d allows one to determine whether an insertion and its donor have similarity that extends beyond the boundaries of the insertion, even when the insertion creates a tandem duplication (d > l in Fig. 1C).

Tandem Duplications Account for More Than Half of >9-bp Insertions and Rarely Exhibit Extensive End Homology. The sharp diagonal in Fig. 1D demonstrates that, as in humans, insertions >9 bp