Proc. Natl. Acad. Sci. USA Vol. 92, pp. 11125-11129, November 1995

Mg-SINE: A short interspersed nuclear element from the rice blast , Magnaporthe grisea (transposon/repeated DNA) PRADEEP KACHROO*, SALLY A. LEONGt, AND BHARAT B. CHATTOO*t *Department of Microbiology and Biotechnology Center, Faculty of Science, M. S. University of Baroda, Baroda 390 002, India; and tU.S. Department of Agriculture-Agricultural Research Service Disease Research Unit, 1630 Linden Drive, University of Wisconsin, Madison, WI 53706 Communicated by M. S. Swaminathan, M. S. Swaminathan Research Foundation, Madras, India, June 28, 1995

ABSTRACT A short interspersed nuclear element, Mg- Magnaporthe grisea, an ascomycetous fungus responsible for SINE, was isolated and characterized from the genome of the the blast disease of rice (Oryza sativa), is known to generate rice blast fungus, Magnaporthe grisea. Mg-SINE was isolated new pathogenic variants at a high frequency (15). Various as an insertion element within Pot2, an inverted-repeat trans- mechanisms have been postulated to explain pathogenic vari- poson from M. grisea and shows typical features of a mam- ability in this fungus (15-17), although its molecular basis malian SINE. Mg-SINE is present as a 0.47-kb interspersed remains elusive. We initiated an analysis of repeated DNA sequence at "100 copies per haploid genome in both rice and sequences of M. grisea in order to delineate their role in non-rice isolates of M. grisea, indicating a common evolution- genome organization and generation of variability. We report ary origin. Secondary structure analysis ofMg-SINE revealed here the presence of a SINE sequence in the genome of M. a tRNA-related region at the 5' end which folds into a grisea and further show that these elements are capable of cloverleaf structure. Genomic fusions resulting in chimeric causing insertions and possibly other genomic rearrange- Mg-SINEs (Ch-SINEs) composed of a sequence homologous ments.§ to Mg-SINE at the 3' end and an unrelated sequence at its 5' end were also isolated, indicating that this and other DNA rearrangements mediated by these elements may have a major MATERIALS AND METHODS effect on the genomic architecture of this fungus. Bacterial Strains and Fungal Isolates. Escherichia coli strains JM101 and DH5a were used for bacterial transforma- Genomes of diverse harbor interspersed repeated tions and propagation. M. grisea isolates (35 isolates, sequences (1). These repetitive elements may be mobile or 18 of which were rice pathogens) representing diverse geo- maintained stably in the genome (2). Mobile DNA elements graphical regions were used in the present study. are classified broadly into two categories, based on the inter- Nucleic Acid Analysis. Genomic DNA isolation, Southern mediates formed during their replication (3). The elements hybridizations, contour-clamped homogeneous electric field that replicate via an RNA intermediate are subgrouped further (CHEF) analysis, and copy-number determination were car- into either , which have structural features ried out as described (18). Plasmid DNA was prepared by the similar to retroelements, or retroposons, which duplicate via mini-maxi procedure (19). The complete sequence of 5-BlOl RNA intermediates but do not have -like structure (1.4 kb), 32-5-E H13 (1.1 kb), 23-4-D H7 (2.3 kb) was (4). The majority of mammalian interspersed retroposons are determined by making subclones in pUC19 and carrying out classified as short interspersed nuclear elements (SINEs) or double-strand sequencing with Sequenase version 2.0 (United long interspersed nuclear elements (LINEs) (5, 6). States Biochemical). Sequence of Mg-SINE was determined Typical features of a generic SINE are the presence of an from 5-BlOl (PCR amplicon), 9-rep (BamHI genomic clone of RNA polymerase III promoter, an adenine-rich 3' end varying pUC19), and 32-12-A (20) by using the primers P1 to from 8 to 50 bp, and direct repeats at the ends (6). The P4 (see Fig. 2). Sequence of Ch-SINE was determined from dispersion and amplification of SINEs are thought to occur by cosmid clones 32-5-E and 23-4-D. Sequence comparisons were retroposition in which the RNA polymerase III-dependent carried out with DNA INSPECTOR version 3.25 (Textco, West SINE transcripts are reverse transcribed into cDNA which Lebanon, NH) and SEQAID II version 3.70 (Rhoads and Roufa, then integrates into new genomic sites (7). Since SINEs lack Molecular Genetics Laboratory, Kansas State University) open reading frames, they have been hypothesized to depend software programs. on function in trans in order to make PCR analysis used a 5' primer (5'-GTCATTGAAA- cDNA (6). 13 upstream of a Pot2 insertion in of elements and GAGAAGTAAAA-3') bp SINEs form a unique group transposable a single-copy region (CRP) (18) and a 3' primer (5'- although no single functional role has been demonstrated for GAAATTGCCAATTATCG-3') located 0.9 kb downstream them unequivocally, they have been shown to be actively was carried out as described involved in insertional inactivation of genes (8, 9) and in the from the start of Pot2. PCR (21). Insertion of these Total RNA was extracted from isolate Guy 11 by an acid formation of chimeric sequences (10). on elements at previously unoccupied sites has been shown to phenol method (22). RNA was fractionated glyoxal/1.7% result in mRNA truncation, altered polyadenylylation, and agarose gels in 10 mM phosphate buffer and blotted onto MSI modified protein structure (11-13). These elements can also nylon membranes. Northern blotting was carried out as de- undergo homologous and nonhomologous recombination scribed (23). which can contribute to genomic flux and may even result in genetic disorders (14). Abbreviations: CHEF, contour-clamped homogeneous electric field; LINE, long interspersed nuclear element; SINE, short interspersed nuclear element. The publication costs of this article were defrayed in part by page charge tTo whom reprint requests should be addressed. payment. This article must therefore be hereby marked "advertisement" inE §The sequences reported in this paper have been deposited in the accordance with 18 U.S.C. §1734 solely to indicate this fact. GenBank database (accession nos. U35313 and U35230). 11125 Downloaded by guest on September 27, 2021 11126 Genetics: Kachroo et al. Proc. Natl. Acad. Sci. USA 92 (1995)

Rice Path2gens Non Rice Pathogens ab c d e f g h i j k m no pq r s t u vwxyz z" kb *EcO3ou0 ofo co10 M1 1I O l C M C 12 1

0.1 -~~~~~~:, -s 0_ __ _._=. : X -~~~~~~~~~~~~~~7.1 -~~~~~~~~~~~~6.11.0 _~E *_ _SF_-4~~~~~~~~~~~~~~.0I = o - 3~~~~~~~~~~~.0

FIG. 1. PCR analysis of CRP, an insertion of Pot2 within a *~~~~ ~ ~~~~~~~~~~~~~~~~~~ . 0_ single-copy region. The isolates are listed above each lane. Isolate B101 amplified a polymorphic band of 1.4 kb (shown by arrowhead), while all other isolates amplified a monomorphic band of 0.9 kb. (A) Ethidium bromide-stained gel. The 1-kb ladder (BRL) was used as molecular size markers. (B) Autoradiogram of the Southern blot ofthe gel shown inA, probed with an internal 0.9-kb BamHI fragment ofPot2 (18). FIG. 3. Southern hybridization analysis of genomic DNA of rice- RESULTS and non-rice-infecting isolates of M. grisea probed with Mg-SINE. Genomic DNA (2 jig) was digested with Pst I, electrophoresed through Sequence Analysis. The distribution of specific insertions of a 0.8% agarose gel, transferred to MSI nylon membrane and hybrid- Pot2, an transposon ofM. grisea (18), in various ized with the radiolabeled 258-bp PCR amplicon of Mg-SINE (pre- rice- and non-rice-infecting isolates was analyzed by PCR. pared as in Fig. 2B). Pst I does not have any site within Mg-SINE. The Amplification of a Pot2 insertion within a single-copy region, lanes contain genomic DNA from isolate F (India), lane a; B (India), lane b; B101 (India), lane c; B157 (India), lane d; 4360-17-1 (United CRP, revealed the expected 0.9-kb band in most of the isolates States), lane e; P06-6 (Philippines), lane f; JMB840610 (Philippines), tested except for the isolate B101, in which a band of 1.4 kb was lane g; 101 (Philippines), lane h; 102 (Philippines), lane i; 103 identified (arrowhead, Fig. 1). Sequence analysis of this frag- (Philippines), lane j, V8601 (Philippines), lane k; 104 (Philippines), ment (p5-B101) showed the presence of an additional 0.5 kb of lane 1; 91-A-58 (United States), lane m; Pd 8824 (Philippines), lane n; DNA within Pot2, resulting in a duplication of 16 bp at the Ec522 (Philippines), lane o; Pr886 (Philippines), lane p; Pr342 (Phil- target site (Pot2 region flanking the insertion site). The ippines), lane q; Pr8988 (Philippines), lane r; Cb334 (Philippines), lane duplications were present as direct repeats and were A+T- s; Pg3393 (India), lane t; Cd88215 (Philippines), lane u; Lc454 or (Philippines), lane v; Ei 476 (Philippines), lane w; 4091-5-8 (laboratory rich. (13 of 16 nt being A T). The 0.5-kb DNA showed many strain), lane x; Bd8401 (Philippines), lane y; Dc88428 (Philippines), features common to SINE sequences, such as the presence of lane z; Lh88490 (Philippines), lane z. A- and B-box consensus sequences which showed an exact match to the tRNA polymerase III promoter consensus se- sequence analysis of four individual Mg-SINEs from isolate quence (Fig. 2A) (6). The A box starts 8 bp downstream of the B101 (5-B3lOl) and 2539 (9-rep, 35-12-A, and 42-1-B) revealed 5' end of the element, and the two conserved regions are only a single base difference, a transition mutation (T -*- C) at separated by 32 nt. This element was called Mg-SINE (M. position 419 in 9-rep, 35-12-A, and 42-1-B3, indicative of the grisea SINE). conserved nature of this sequence. Mg-SINEs terminate with Mg-SINE, excluding the 16-bp direct repeats, is 472 bp in a characteristic trinucleotide repeat, TAC, which is reiterated, size in the insertion clone p5-B101 (Fig. 2B). The target-site with different clones showing variable numbers of iterations. duplication for these elements was found to vary from 1 bp in Sequence analysis of the 3' end of 5-BlOl, 9-rep, 35-12-A, insertion 9-rep (isolate 2539) or 2 bp in 35-12-A (isolate 2539) 7-3-B3, and 42-1-B revealed five, five, nine, six, and six trinu- to 16 bp in insertion 5-BlOl (isolate B101) (Fig. 2C). Complete cleotide repeats, respectively. Mg-SINE did not show any

A 5' A box boz 3

t-RNA Pol III Consensus: 7 bp--TGGCNNAGTNGG ----17-60 bP--- GGTTCGANNCC FIG. 2. Sequence analysis and target-site charac- Mg-SINE: 8 bp--TGGCGCAGT-GG ----- 32 bp --- GGTTCGAATCC terization of Mg-SINE. (A) Alignment of conserved sequence of RNA polymerase III A- and B-box B regions of Mg-SINE with tRNA polymerase III consensus. of GGG GGG AAT GGC GCA GTG GTT AAG CGC CAT GGC TGT TAC TTG AGT AAC GTA GGT TCG 57 promoter (B) Nucleotide sequence AAT CCT GCT CCC TCC ACC TCC TCC CCG CTT ACC GTG GCA AGG ATA ACC CGG CTG GCT 114 Mg-SINE primers P1, P2, P3, and P4 were used for 4 P1 sequence analysis. P2 and P3 were also used for PCR ATC GAC AAC AAC CAC CCG CCT GTG CCA TCG AGC CCA CAC TTG TTG GGA ACT TCG TCT 171 P2 amplification of a 258-bp fragment which was used as CCA TGG CTA CAA ACG ACC GAC AGC TCC GCT GTT TGG ACA CAG GCC CCT CTC GAT GAA 228 a probe for Southern and Northern analysis. Se- GAG CCG TCA ACT GCC GTC AGA CGA ATC CTC CGT GCG CCT GAA GGC ACG GGG TCG CGT 285 quence comparison of three individual Mg-SINE CAG TGA ACA AAC CTG TAA GCA GGA CCG GGC ACG AAC CCG GTC AGG CTC GAT TCG CTT 342 and a CTA TGC CCT TGT TTT CCG CTG TGT AAA TAG AGA AAA TAG AAA GCG CGC CGA GAT ACC 399 clones-5-B1O1, 35-12-A, 9-rep-revealed 4 P3 transition mutation from T (marked as bold type) to CCT CGG AGG TTG CTA ACG GTC GGC TAA TGA GCC GGG CCG AGC CCG GCG TTA AAT AAT 456 C at position 419 in 35-12-A and 9-rep. Numbers at P4 right mark the nucleotide positions. (C) Target site AC TAC TAC TAC TAC TAC TA 472 and the flanking region of Mg-SINE insertions. Sequences 1, 2, and 3 represent insertion-site se- C quence of 5-BlOl, 35-12-A, and 9-rep, respectively. The target-site duplication, conserved dinucleotide 1. CTATrATAAATATCTC-Mg-SINECTATlrATAAATATCTCGAmAATc-AclTGGAlTTAcT TrGATTTIlAATTCrCA CCA 2. CT--Mg-SINE--CTGT &Tmrr CT, and the poly(T) stretch downstream of the 3. C--Mg-SNE-CTTC insertion site are shown in bold type. Downloaded by guest on September 27, 2021 Genetics: Kachroo et al. Proc. Natl. Acad. Sci. USA 92 (1995) 11127 A B 1 2 1 2 3 1 2 3

FIG. 5. Northern hybridization analysis of total RNA from M. -4.6 Mb grisea with Mg-SINE as probe. To- tal RNA from a rice-infecting iso- --0.5 kb late of M. grisea (Guyll) was frac- tionated in a 1.7% agarose gel, blotted onto MSI nylon membrane, and hybridized with the 258-bp PCR amplicon of Mg-SINE (pre- pared as in Fig. 2B). Lane 1 and 2 represent 30 and 18 ,ug of total FIG. 4. Southern hybridization analysis of electrophoretically sep- RNA, respectively. arated of M. grisea probed with Mg-SINE. Chromo- somal DNA from isolate A, lane 1; B101, lane 2; and B157, lane 3, was conserved nucleotide positions-e.g., those involved in the fractionated in a 0.8% agarose gel, transferred to MSI nylon mem- formation of the stem structure (C.G, C-G, and U-A) of the brane, and hybridized with the radiolabeled 258-bp PCR amplicon of to be conserved in Mg-SINE. This Mg-SINE (prepared as in Fig. 2B). The molecular size 4.6 Mb TIC loop were also found corresponds to 6 of isolate Guy 11 (24). (A) Ethidium analysis also revealed an unpaired A+T-rich end toward the 3' bromide-stained gel. (B) Autoradiogram of the Southern blot of the end of the element and a tRNA-unrelated region (data not gel shown in A. shown). Isolation and Characterization of Chimeric Elements. prominent open reading frames, as commonly observed in When a genomic library of isolate 2539 in the cosmid vector other SINEs (6). pMLF1 (20) was probed with Mg-SINE, hybridization signals DNA and RNA Analysis. The presence and distribution of of varied intensities were observed. Cosmid DNA digested Mg-SINE within the genomes of various isolates of M. grisea with several restriction enzymes and probed with Mg-SINE were studied by Southern hybridization and CHEF analysis. indicated that the weakly hybridizing cosmid clones may Genomic DNA of 35 isolates representing rice pathogens, represent either a divergent family of a similar class of non-rice pathogens, and laboratory strains of M grisea was elements or defective Mg-SINEs (data not shown). Sequence digested with Pst I or EcoRV (data are shown for 27 isolates) analysis of fragments from the cosmid clones 35-12-A, 7-3-B, and hybridized to a 258-bp internal fragment generated by and 42-1-B, which hybridize intensely to Mg-SINE, revealed PCR amplification of Mg-SINE with primers P2 and P3 (Fig. to that of whereas the cosmid 2B). The genomic DNA of all the rice pathogens and 10 out sequences identical Mg-SINE, of 15 non-rice pathogens showed intense hybridization signals, clones 32-5-E and 23-4-D, which hybridized weakly, showed indicating Mg-SINE to be a multicopy, repeated DNA element only a partial homology to Mg-SINE, restricted to the 3' end. (Fig. 3). This repeat element was present in a very low copy Sequencing of these partially homologous regions and their number in 5 non-rice pathogens-Pd8824, Pr8988, 4091-5-8, flanking sequences revealed that homology to Mg-SINE re- Bd8401, and Dc88428, the last three isolates showing a copy sides in the 240 bp at the 3' end (Fig. 7A and B, shown within number of <5. These possibly represent a group of isolates box). The 452-bp sequence preceding the region homologous which may have escaped amplification of Mg-SINE during the to Mg-SINE was identical in both 32-5-E and 23-4-D cosmid evolution of M grisea. subclones; beyond this region, the two subclones showed Distribution of Mg-SINE on various chromosomes was divergent sequence. Both the cosmid subclones terminated studied by CHEF analysis of two rice isolates and one non-rice with the trinucleotide repeat TAC at the 3' end, a character- isolate of M grisea. All the isolates used in the present study istic feature of Mg-SINE, followed by a divergent sequence. showed the presence of B chromosomes which were polymor- phic in size as well as number (Fig. 4A). A CHEF gel (Fig. 4A) probed with Mg-SINE showed hybridization to all resolvable chromosomes and B chromosomes ofM grisea, indicating it to OL IL be a dispersed repeated DNA sequence (Fig. 4B). The copy C number of Mg-SINE as determined by quantitative dot blot CG analysis was '100 for isolates B157 and Pg3393. ) G Transcription of Mg-SINE was studied by Northern blot analysis. When hybridized to the 258-bp PCR amplicon from Mg-SINE, total RNA from isolate Guy 11 revealed a transcript of 0.5 kb size in the lanes containing 18 and 30 ,tg of total RNA (Fig. 5). Secondary Structure. The possible secondary structure of Mg-SINE derived by use of the Genetics Computer Group programs FOLD and SQUIGGLE (25) showed a tRNA-related region, a tRNA-unrelated region, and an A+T-rich region. The 5' end of Mg-SINE (bp 1-73) folds in a tRNA-like iT 40 structure (Fig. 6). The cloverleaf structure terminates with nucleotides CCA at the 3' end, a characteristic feature of in tRNAs, although unlike tRNA, the cytosine of CCA Mg- FIG. 6. Secondary structure analysis of Mg-SINE. The structural SINE shows pairing with guanosine at the 5' end, adenine analysis was carried out with the SQUIGGLE and FOLD programs of the staying unpaired. The positions of the D loop and TPC loops University of Wisconsin Genetics Computer Group (25). Nucleotide of tRNA correspond, respectively, to the A and B boxes of numbers are given from the 5' end of Mg-SINE. A and B designate the Mg-SINE, both of which also form a loop-like structure, while conserved polymerase III consensus promoter sequence. The 5' region the anticodon arm forms the middle loop. Several highly (bp 1-73) folds into a cloverleaf structure typical of tRNA. Downloaded by guest on September 27, 2021 11128 Genetics: Kachroo et al. Proc. Natl. Acad. Sci. USA 92 (1995)

A ' --1.2kb, 1.7kb 1.8 kb-_m

S B EV HHBs A Bs H B P C 1 2 3 4 5 6 7 8 kb 4-1 .6 kb ---2 kb ---40.8 kb-b-

f Ch-SIN E - ff 7 7 T11 ---- Bg HP HHBs A Bs H P B CGAGGAACTGCGTTTGCCAAGtCCCCTACTGGGCCATTATTTGGCCATGAGGACCGGCCACGGCGATTTCA- 71 PS AGGCCTACCATGACCGTTTCAACCACCAGGATGCAAACACCTCGTGTGCCTGGTGCTGGAAGCGGACCTCC- 142 CCTGAGCACCCGGTGCACTGCCGCTATTCGCGGGCGGTGTGGAGAAACTGGCCGTGGCCTAATAACGACCG- 213 1 P6 0.5 GCCGGCCGGGCCGCCAAATCGCGCCCAACGCTGGAAATTCTTCCAGACAAGCTTCGGGCAACCGAAAAGCT- 284 TTGAGGCGTTTTCGATAGCCACCAACTACTTCAGCGCCCGCCCCAGAGCGGCCCGCCAGCGCCCCGCGCGC- 355 ACGAACGCACTTTACGCCTAGGGACACCGATCGTAAACGACTCGAACTCTGACGAGGAATAGACTTACTGC- 426 CTTTTCACCCACACTTCACGGCAC d CCGTCAGACGAACCCTCCGTGCACCTTAGGCACGGGGTCGCGTC- 497 AGTGAACTAACCCCCTAAGCAGGACCGGGCCCGAACCCGGTCAGGCACGATCCGCCTCTGCCCTCCTTGTT- 568 TTCCCCCTGTGTAAATAAAGAAGATAGAACGCGCGCCGAGATACCCCTCGGGAGGTTGCTAACGGCCGGCT- 639 AACAAGCCGGCCGAGCCGCCGGTTAACTAATACTACTACTACTACTACTACTA - 692

FIG. 7. Analysis of Ch-SINE from cosmid clones 23-4-D and 32-5-E. Both of these clones revealed a weak hybridization signal, as compared with cosmid clones which contain complete Mg-SINEs (data not shown). (A) Restriction map of fragments hybridizing to Mg-SINE, from cosmid clones 23-4-D (Upper) and 32-5-E (Lower). Box represents the 692-bp region of Ch-SINE which is common to both the cosmid clones. Stippled and filled boxes indicate the 3' SINE-like and the trinucleotide repeat regions, respectively. Open box indicates the 5' non-SINE-like region of Ch-SINE. The thick line indicates the region which was sequenced from both the cosmid clones. Restriction sites: A, Apa I; B, BamHI; Bs, BssHII; EV, EcoRV; H, HindIll; P, Pst I; S, Sac I. (B) Nucleotide sequence of Ch-SINE. The sequence within the box represents the region of Ch-SINE showing homology to the 3' end of Mg-SINE. P5 and P6 are the primers used to amplify 207 bp of non-SINE-like region from different cosmid clones as shown in Fig. 8C, which was also used as probe for Southern hybridization analysis. Sequence in bold type marks the region which showed compressions when sequenced from either strand and is the best possible representation of this region. Numbers at right mark the nucleotide positions. (C) PCR analysis of Ch-SINE from cosmid clones which revealed a less intense hybridization signal to the Mg-SINE probe, indicating that weak hybridization signal corresponds to the presence of Ch-SINE in these . PCR amplification was carried out with primers P5 and P6 as shown in Fig. 8B. PCR products are shown for cosmid 7-3-B, lane 1; cosmid 36-7-E, lane 2; cosmid 32-1-B, lane 3; cosmid 47-12-G, lane 4; cosmid 5-7-C, lane 5; cosmid 32-5-E, lane 6; and cosmid 23-4-D, lane 7. Cosmid clone 43-6-G, lane 8, did not hybridize to Mg-SINE and was used as a negative control. The 1-kb ladder (BRL) was used as a molecular size marker. The trinucleotide repeats present at the 3' end were reiterated as the presence of A- and B-box consensus sequences of RNA six and seven times, respectively, in cosmids 32-5-E and 23-4-D. polymerase III promoters, direct duplications of the target site, The 692-bp region is referred to as Ch-SINE, since it is a and an A+T-rich region at the 3' end. The element is chimera of SINE- and non-SINE-like sequences. Southern comparable to mammalian SINEs, suggesting that this group analysis of rice and non-rice isolates using a 207-bp PCR of elements may not be restricted to higher eukaryotes. amplicon derived from the non-SINE-like region of Ch-SINE Sequence comparisons of three individual Mg-SINEs from (primers P5 and P6; Fig. 7 B and C) as probe revealed that isolates B101 and 2539 revealed the conserved nature of the Ch-SINE is a moderately repeated DNA element ubiquitous sequence. The 3' trinucleotide repeat showed a variable num- among M. grisea isolates (data not shown). Analysis of Ch- ber of reiterations in individual elements, which could arise as SINE did not show any sequence similarity to the A and B a result of slippage of DNA polymerase during replication, a boxes of the RNA polymerase III promoter, indicating that the phenomenon commonly observed among the short reiterated 0.7-kb region represented a non-SINE repetitive DNA ele- repeat units (26, 27). Target sites of three Mg-SINE insertions ment. Sequence comparison of 240 bp of the SINE-like region showed a very high level of sequence conservation at the 3' end of the 0.7-kb Ch-SINE to Mg-SINE revealed partial sequence of the target site (Fig. 2C). The dinucleotide CT was found to identity between them (Fig. 8). The comparison indicated that be conserved in all the insertions, whereas the nucleotides divergence may be due to point mutations, insertions, and flanking CT were found to be divergent. Analysis of the 3' end deletions. of Mg-SINE from additional cosmid clones (42-1-B and 7-3-B) also showed the presence of CT at the 3' end of the target site, indicating that there may be target-site specificity for Mg- DISCUSSION SINE insert.ions. It is possible that the insertion of Mg-SINE The insertion element described in this report is -0.5 kb in size at a genomic site is coupled to the recognition of the target site and shows all the features of a typical SINE retroposon, such by a DNA-nicking enzyme which recognizes a conserved

GCCGTCAGACGAATCCTCCGTGCGCCTGAAGGCACGGGGTCGC Mg-SINE 283 GCCGTCAGACGAACCCTCCGTGCACCTT AGGCACGGGGTCGC Ch-SINE 494 GTCAGTGAACAAACCTG TAAGCAGGACCGGGCACGAACCCGGTCAGGCTCGATTCGCTTCTATGCC Mg-SINE 349 FIG. 8. DNA sequence alignment of Mg- SINE and Ch-SINE regions that show sequence GTCAGTGAACTAACCCCCTAAGCAGGACCGGGCCCGAACCCGGTCAGGCACGATCCGCCTCT GCCCTC Ch-SINE 562 similarity. The alignment was carried out with the CTTGTTTTCCO CTGTGTAAAT AGAGAAAATAGAAAGCGCGCCGAGATACCCCTCGG AGGTTGCTAA Mg-SINE 415 SEQAID II version 3.70 software program. Dots mark identical bases; gaps indicate a sequence CTTGTTTTCCCCCTGTGTAAATAAAGAAG ATAGAACGCGCGCCGAGATACCCCTCGGGAGGTTGCTAA Ch-SINE 630 divergence due to point mutations (both transi- Mg-SINE 472 tion and transversions, shown by bold type), CGGTCGGCTAATQAGCCGGGCCGAGCC CGGCQTTAAATAATACTACTACTACTACTA insertions, or deletions (shown by bold type). CGGCCGGCTAACAAGCCGG CCGAGCCGCCGG TTAACTAATACTACTACTACTACTACTACTA Ch-SINE 692 Numbers at right mark the nucleotide positions. Downloaded by guest on September 27, 2021 Genetics: Kachroo et al. Proc. Natl. Acad. Sci. USA 92 (1995) 11129

cytosine residue in the upper strand, the cleavage site for the manuscript. We thank Prof. Marvin Edelman and Ms. Irit Orr bottom strand being variable. Several topoisomerase II en- (Weizmann Institute of Science, Rehovot, Israel) for help with the zymes possess specific asymmetric recognition/cleavage sites secondary structure analysis. We thank Prof. Prescott Deininger and are known to cleave the top strand 3- to 10-fold more (Louisiana State University, New Orleans) for useful suggestions. This strand work was supported by grants from the Department of Biotechnology, efficiently than the bottom (28). Government of India, and by the Rockefeller Foundation to B.B.C. The presence of a transcript of -0.5 kb strongly suggests and to S.A.L. by the U.S. Department of Agriculture and the Rock- that Mg-SINE may be a functional retroposon. The diffuse efeller Foundation. nature of the signal possibly indicates transcripts of various sizes produced by several active Mg-SINE sequences. In vivo 1. Berg, D. E. & Howe, M. M. (1989) Mobile DNA, eds. Berg, E. & Howe, and in vitro transcription experiments have shown that Alu M. M. (Am. Soc. Microbiol., Washington, DC). families from mammals are tran- 2. Jeffreys, A. J., Wilson, V. & Thein, S. L. (1985) Nature (London) 314, and other related SINE 67-73. scribed by RNA polymerase III (29, 30). Since SINEs do not 3. Bingham, P. M. & Zachar, Z. (1989) in Mobile DNA, eds. Berg, E. & Howe, possess any open reading frames, the reverse transcriptase M. M. (Am. Soc. Microbiol., Washington, DC), pp. 485-502. function should be available in trans for the synthesis of 4. Hutchison, C. A., III, Hardies, S. C., Loeb, D. D., Shehee, W. R. & Edgell, cDNA to take place (6). The M. grisea genome has been M. H. (1989) in Mobile DNA, eds. Berg, E. & Howe, M. M. (Am. Soc. Microbiol., Washington, DC), pp. 593-617. shown to possess a functional retroelement (M. Farman, Y. 5. Singer, M. (1982) Cell 28, 433-434. Tosa, N. Nitta, and S.L., unpublished results; refs. 31 and 32) 6. Deininger, P. L. (1989) in Mobile DNA, eds. Berg, E. & Howe, M. M. (Am. and a LINE-like sequence hybridizing to various size tran- Soc. Microbiol., Washington, DC), pp. 619-636. scripts in a Northern blot (33). Our unpublished results 7. Jagadeeswaran, P., Forget, B. G. & Weissman, S. M. (1981) Cell 26, 141-142. indicate the presence of more than one class of retroelements 8. Wallace, M. R., Andersen, L. B., Saulino, A. M., Gregory, P. E., Glover, in the genome of M. grisea, and it seems likely that Mg-SINE T. W. & Collins, F. S. (1991) Nature (London) 353, 864-866. utilizes the reverse transcriptase produced by any of the 9. Goldberg, Y. P., Rommens, J. M., Andrew, S. E., Hutchinson, G. B., Lin, retroelements or LINE sequences to generate a cDNA copy B., Theilmann, J., Graham, R., Glaves, M. L., Starr, E., McDonald, H., Naris, J., Schappert, K., Kalchman, M. A., Clarke, L. A. & Hayden, M. R. which could be inserted at a region containing a double- (1993) Nature (London) 362, 370-373. strand break in the genome. 10. Daniels, G. L. & Deininger, P. L. (1991) Nucleic Acids Res. 19, 1649-1656. The high copy number of Ch-SINE in the genome of M. 11. Oliviero, S. & Monaci, P. (1988) Nucleic Acids Res. 16, 1285-1293. grisea indicates that a fusion/insertion event followed by 12. Vidal, F., Mougneau, E., Glaichenhaus, N., Vaigot, P., Darmon, M. & Cuzin, F. (1993) Proc. Natl. Acad. Sci. USA 90, 208-212. deletion of a major region of inserted sequence or a recom- 13. White, R. J. (1994) RNA Polymerase III Transcription (Landes, Austin, TX). bination event leading to generation of Ch-SINE must be 14. Lehrman, M. A., Goldstein, J. L., Russell, D. W. & Brown, M. S. (1987) followed by the amplification of this repeat. The fusion event Cell 48, 827-835. could be the consequence of premature termination of reverse 15. Ou, S. H. (1985) Rice Diseases (Commonwealth Mycological Institute, Kew, Surrey, U.K.), 2nd Ed., pp. 109-201. transcriptase followed by joining to a genomic sequence and 16. Giatgong, P. & Frederiksen, R. A. (1969) Phytopathology 59, 1152-1157. subsequent amplification of the fusion structure. This could 17. Yamasaki, Y. & Niizeki, H. (1965) Bull. Natl. Inst. Agric. Sci. 13, 231-274. also represent a possible mechanism for the generation of 18. Kachroo, P., Leong, S. A. & Chattoo, B. B. (1994) Mol. Gen. Genet. 245, repeated DNA sequences in the genome of eukaryotes. The 339-348. 19. Noguchi, T. (1991) Technique 3, 33-36. presence of Ch-SINE in both rice and non-rice pathogens ofM. 20. Leong, S. A., Farman, M. L., Budde, A. D., Smith, R. J., Tosa, Y. & Nitta, grisea indicates that these elements evolved before the diver- N. (1994) in Rice Blast Disease, eds. Zeigler, R. S., Leong, S. A. & Teng, gence of the two host-specific forms. Ch-SINE resembles the P. S. (CAB, Wallingford, U.K.), pp. 87-110. type II Galago SINE family (10), which is a chimeric element 21. Kachroo, P. (1995) Ph.D. dissertation (Maharaja Sayajirao University of of sequences related to the Monomer in its Baroda, Baroda, India). composed family 22. Kohrer, K. & Domdey, H. (1991) Methods Enzymol. 194, 398-405. left half and sequences identical to the Alu family in the right 23. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989) Molecular Cloning: A half. Laboratory Manual (Cold Spring Harbor Lab. Press, Plainview, NY), 2nd The dispersed nature of the repetitive element as evi- Ed. denced CHEF the insertion of into 24. Skinner, D. Z., Budde, A. D., Farman, M. L., Smith, R. J., Leung, H. & by analysis, Mg-SINE Leong, S. A. (1993) Theor. Appl. Genet. 87, 545-557. Pot2, the detection of a transcript in a Northern blot, and the 25. Devereux, J., Haeberli, P. & Smithies, 0. (1984) Nucleic Acids Res. 12, restriction fragment length polymorphisms seen in the Mg- 387-389. SINE-hybridizing bands indicate that Mg-SINE is possibly 26. Nelson, D. L. (1993) in Genome Rearrangement and Stability, eds. Davis, functional. PCR analysis of 30 different isolates showed that K. E. & Warren, S. T. (Cold Spring Harbor Lab. Press, Plainview, NY), Vol. 7, pp. 1-24. only B101 carried a Mg-SINE insertion within Pot2, which in 27. Lustig, A. J. & Petes, T. D. (1993) in Genome Rearrangement and Stability, turn is inserted within a single-copy region, CRP. This eds. Davis, K. E. & Warren, S. T. (Cold Spring Harbor Lab. Press, suggests that the Mg-SINE insertion within Pot2 may be a Plainview, NY), Vol. 7, pp. 79-106. recent event. The and presence of 28. Osheroff, N., Zechiedrich, E. L. & Gale, K. C. (1991) BioEssays 13, amplification widespread 269-275. these elements throughout the genome may have major 29. Fuhrman, S., Deininger, P. L., laPorte, P., Friedmann, T. & Geiduschek, effects on genome structure and function and could be one E. P. (1981) Nucleic Acids Res. 9, 6439-6457. of the ways by which pathogenic variability is generated 30. Slagel, V. K. & Deininger, P. L. (1989) Nucleic Acids Res. 17, 8669-8682. among isolates of M. grisea. 31. LeBrun, M.-H., Chumley, F. & Valent B. (1994) in Rice Blast Disease, eds. Zeigler, R. S., Leong, S. A. & Teng, P. S. (CAB, Wallingford, U.K.), p. 593. 32. Schull, V. & Hamer, J. (1994) in Rice Blast Disease, eds. Zeigler, R. S., We thank Dr. Autar Mattoo (U.S. Department of Agriculture, Leong, S. A. & Teng, P. S. (CAB, Wallingford, U.K.), pp. 65-86. Beltsville, MD) and Drs. Hira L. Nakhasi and Gary Swergold (Na- 33. Hamer, J., Farrall, L., Orbach, M. J., Valent, B. & Chumley, F. G. (1989) tional Institutes of Health, Bethesda, MD) for critical reading of the Proc. Natl. Acad. Sci. USA 86, 9981-9985. Downloaded by guest on September 27, 2021