Journal of Biomedicine and Biotechnology

LINE-1 Retrotransposition: Impact on Genome Stability and Diversity and Human Disease

Guest Editors: Nina Luning Prak and Abdelali Haoudi


Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

This is a special issue published in volume 2006 of “Journal of Biomedicine and Biotechnology.” All articles are open access articles distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Founding Managing Editor Abdelali Haoudi, Eastern Virginia Medical School, USA

Editors-in-Chief: H. N. Ananthaswamy, USA; Marc Fellous, France; Peter M. Gresshoff, Australia

Associate Editors: Halima Bensmail, USA; Vladimir Larionov, USA; Wolfgang A. Schulz, Germany; Marìa A. Blasco, Spain; George Perry, USA; O. John Semmes, USA; Shyam K. Dube, USA; Steffen B. Petersen, Denmark; James L. Sherley, USA; Mauro Giacca, Italy; Nina Luning Prak, USA; Mark A. Smith, USA; James Huff, USA; Annie J. Sasco, France; Lisa Wiesmuller, Germany

Editors: Claude Bagnis, France; Shahid Jameel, India; Allal Ouhtit, USA; Mohamed Boutjdir, USA; Celina Janion, Poland; Kanury V. S. Rao, India; Douglas Bristol, USA; Pierre Lehn, France; Gerald G. Schumann, Germany; Ronald E. Cannon, USA; Nan Liu, USA; Pierre Tambourin, France; V. Singh Chauhan, India; Yan Luo, USA; Michel Tibayrenc, France; Jean Dausset, France; John Macgregor, France; Leila Zahed, Lebanon; John W. Drake, USA; James M. Mason, USA; Steven L. Zeichner, USA; Hatem El Shanti, USA; Ali Ouaissi, France

Contents

LINE-1 Retrotransposition: Impact on Genome Stability and Diversity and Human Disease, Nina Luning Prak and Abdelali Haoudi Volume 2006 (2006), Article ID 37049, 2 pages

The ORF1 Protein Encoded by LINE-1: Structure and Function During L1 Retrotransposition, Sandra L. Martin Volume 2006 (2006), Article ID 45621, 6 pages

Do LINEs Have a Role in X-Inactivation?, Mary F. Lyon Volume 2006 (2006), Article ID 59746, 6 pages

LINE-1 Endonuclease-Dependent Retrotranspositional Events Causing Human Genetic Disease: Mutation Detection Bias and Multiple Mechanisms of Target Disruption, Jian-Min Chen, Claude Férec, and David N. Cooper Volume 2006 (2006), Article ID 56182, 9 pages

L1 Antisense Promoter Drives Tissue-Specific Transcription of Human Genes, Kert Mätlik, Kaja Redik, and Mart Speek Volume 2006 (2006), Article ID 71753, 16 pages

Links Between Repeated Sequences, Sachiko Matsutani Volume 2006 (2006), Article ID 13569, 3 pages

The Genomic Distribution of L1 Elements: The Role of Insertion Bias and Natural Selection, Todd Graham and Stephane Boissinot Volume 2006 (2006), Article ID 75327, 5 pages

L1 Retrotransposons in Human Cancers, Wolfgang A. Schulz Volume 2006 (2006), Article ID 83672, 12 pages

LINE-1 Hypomethylation in a Choline-Deficiency-Induced Liver Cancer in Rats: Dependence on Feeding Period, Kiyoshi Asada, Yashige Kotake, Rumiko Asada, Deborah Saunders, Robert H. Broyles, Rheal A. Towner, Hiroshi Fukui, and Robert A. Floyd Volume 2006 (2006), Article ID 17142, 6 pages

DNA Damage and L1 Retrotransposition, Evan A. Farkash and Eline T. Luning Prak Volume 2006 (2006), Article ID 37285, 8 pages

Do Small RNAs Interfere With LINE-1?, Harris S. Soifer Volume 2006 (2006), Article ID 29049, 8 pages

The Potential Regulation of L1 Mobility by RNA Interference, Shane R. Horman, Petr Svoboda, and Eline T. Luning Prak Volume 2006 (2006), Article ID 32713, 8 pages

Hindawi Publishing Corporation
Journal of Biomedicine and Biotechnology
Volume 2006, Article ID 37049, Pages 1–2
DOI 10.1155/JBB/2006/37049

Editorial

LINE-1 Retrotransposition: Impact on Genome Stability and Diversity and Human Disease

Nina Luning Prak and Abdelali Haoudi

Department of Microbiology and Molecular Cell Biology, Eastern Virginia Medical School, Norfolk, Virginia 23501, USA

Received 6 February 2006; Accepted 6 February 2006

Copyright © 2006 N. L. Prak and A. Haoudi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

When we started thinking about devoting an issue of the Journal of Biomedicine and Biotechnology to LINE-1s, we were not sure it would fly. LINE-1s are long interspersed elements that account for 17% of the mass of the human genome (1), but far fewer than 17% of geneticists work on them! Nevertheless, L1s have received considerable press lately, including a number of high-profile stories featuring retrotransposition in L1 transgenic mice, functional studies of L1, and the potential contributions of L1s to the transcriptome (2–13). Based on these exciting developments in the L1 field and the collective enthusiasm and expertise of the contributors to this edition, we are pleased to present this special L1 issue.

The LINE-1 element (long interspersed element, L1) is an autonomous retrotransposon that propagates in the genome via retrotransposition. During retrotransposition, L1 DNA is transcribed to RNA and processed. The processed RNA is reverse-transcribed by the L1-encoded reverse transcriptase and the cDNA copy is inserted into a new chromosomal location. Over 500,000 L1 copies reside in the human genome, but L1 retrotransposition can also mobilize Alu elements (short interspersed elements) and contribute to processed pseudogene formation (1, 14–16). Much has been learned in recent years about L1 structure, function, and contribution to the genome, but much more remains to be understood, particularly how L1 insertions influence the expression and function of neighboring genes and how L1 mobility is kept in check.

This L1 issue is organized into two parts. The first part consists of six papers describing L1 biology (two papers) and the interactions of L1s with the genome (four papers). In a minireview, Sandra L Martin describes the structure and function of the L1 ORF1 protein during L1 retrotransposition. Her minireview article includes an update on recent studies describing L1 ORF1 protein-protein interactions, nucleic acid binding, and nucleic acid chaperone activity. In her paper, Mary F Lyon discusses the possible role of L1s in X-chromosome inactivation (XCI). Lyon presents evidence and a possible mechanism for the accumulation of L1s on the human X-chromosome in such a manner that they could fulfill the potential function as booster elements in XCI. Whether L1s are part of the mechanism of XCI or a result of it remains enigmatic. Shifting gears from L1 biology to L1 effects on the genome, Jian-Min Chen et al point to the challenges of detecting autosomal L1-mediated insertions in their review article. In addition, Chen et al discuss the manner in which target genes are disrupted by L1-mediated retrotranspositional events and comment that these are likely to depend upon several different factors such as the type of insertion (ie, L1 direct, L1 trans-driven Alu, or L1 trans-driven SVA), the precise locations of the inserted sequences within the target gene regions, the length of the inserted sequences, and possibly also their orientation. In their research article, Kert Mätlik et al identify and characterize 49 chimeric L1 mRNAs, continuing the theme of L1 effects on genes. These chimeric transcripts are due to L1 sense or antisense promoter activity arising from within or nearby existing genes. In 45 out of the 49 cases, the chimeric transcript is in the same transcriptional orientation as the neighboring/surrounding gene. In addition, Mätlik et al show that the L1 antisense promoter (ASP) can give rise to a chimeric transcript whose coding region is identical to the ORF of mRNA of several genes such as KIAA1797, CLCN5, and SLCO1A2. Finally and most provocatively, they provide evidence that the L1 ASP can alter the tissue-specific pattern of transcription of some genes. Their study provides another dimension to the ways in which L1 can influence gene expression. In a minireview article, Sachiko Matsutani discusses the links between LINE-1 and SINE (Alu) elements and how L1-encoded proteins contribute to the mobilization of other mobile elements including Alu and processed pseudogenes and even cellular genes. In their minireview article, Todd Graham and Stephane Boissinot discuss factors affecting how L1s are distributed in the genome. L1 elements do not appear to be randomly distributed in the genome. Graham and Boissinot discuss factors that could skew the distribution of L1s in the genome, including L1 insertion bias and selection (either negative or positive) after insertion. The notion of negative selection arising when an L1 insertion has especially deleterious consequences (including increased recombination, altered transcription of neighboring genes, and continued retrotransposition) provides a bridge to the second section of the issue, which deals with the regulation of L1s.

The regulation of L1s can occur on many levels. L1 regulation can occur before the element has a chance to get going (pre- or posttranscriptional silencing, inefficient full-length transcription due to premature polyadenylation, inefficient translation due to RNA editing), during retrotransposition (sequestration of L1 machinery in certain intracellular compartments, competition for L1 machinery by other substrates, blocking or modification of insertions by cellular DNA repair machinery), or after retrotransposition (apoptosis of a cell with a "disastrous" insertion, an immune response due to neoantigens created by the insertion, or silencing of chromatin containing the insertion). This section on L1 regulation begins with a broad and learned overview by Wolfgang A Schulz. In this review article, Schulz describes why L1s might be afforded greater mobility in certain kinds of cancer cells and how changes in DNA repair machinery and epigenetic alterations could contribute to altered expression and activity of L1s. He critically reviews evidence for and against involvement of L1s in chromosomal breakage and recombination. In their research article, Kiyoshi Asada et al describe an animal model in which epigenetic alterations contribute to the development of hepatocellular carcinoma. Asada et al monitor the amount of cytosine methylation in the L1 5′UTR using combined bisulfite restriction analysis in rats fed a choline-deficient diet. They find that levels of L1 5′UTR methylation decrease with increasing age and with increasing length of time on the choline-deficient diet, and appear to be lower in tumor than in nontumor tissue. They discuss the potential utility of L1 methylation status as an indicator of genome-wide methylation and the potential contribution of altered L1 activity to genomic instability in tumors. The notion that L1s can contribute to genomic instability is further explored by Evan A Farkash, who reviews the literature on mobile element activation and DNA damage. In his review article, Farkash describes ways in which L1s may be mobilized in the setting of genotoxic stress. Both of the final papers in this section focus on RNA interference (RNAi), an evolutionarily conserved process of sequence-specific, posttranscriptional gene silencing, as a potential mechanism for regulating L1s. Harris S Soifer reviews the published evidence of how RNAi controls mobile elements in other eukaryotes and provides a series of arguments for why RNAi would be a reasonable mechanism to constrain L1s in humans. In their review article, Shane R Horman et al describe how L1 RNA could be targeted by RNAi, with an emphasis on different forms of double-stranded and hairpin RNA. Horman et al also point out that the conventional cell-culture-based L1 retrotransposition assay (which relies on the expression of an antisense marker cassette) may induce RNAi.

We thank the contributors for their thought-provoking manuscripts and hope that readers will enjoy this special issue of the Journal of Biomedicine and Biotechnology on L1s.

Nina Luning Prak
Abdelali Haoudi

Nina Luning Prak is an Assistant Professor at the Department of Pathology and Laboratory Medicine at the University of Pennsylvania Health System. Nina Luning Prak conducted her PhD research on antibody gene rearrangement under the supervision of Martin Weigert. After completing medical school and a residency in clinical pathology at the hospital of the University of Pennsylvania, Nina joined the Laboratory of Haig Kazazian for a postdoctoral fellowship. With members of the Kazazian Laboratory, Nina created an EGFP-tagged L1 transgenic mouse that her lab currently uses to study how L1 mobility is regulated. Of particular interest to the Luning Prak lab are how L1s mobilize in the setting of cell stress and whether RNAi limits L1 mobility. These and other topics pertaining to L1s and their ecology in the genome are highlighted in this special issue on L1 retrotransposons.

Abdelali Haoudi received his PhD degree in cellular and molecular genetics jointly from Pierre & Marie Curie University and Orsay University, Paris, France. He then joined the National Institutes of Health (NIEHS, NIH) for a period of four years after winning the competitive and prestigious NIH Fogarty International Award. Dr Haoudi then joined the Myles Thaler Center for AIDS and Human Retroviruses at the University of Virginia Medical School, Charlottesville, and shortly after that, he joined the faculty at the Department of Microbiology and Molecular Cell Biology at Eastern Virginia Medical School in Norfolk, Va, in 2001. Dr Haoudi is interested in uncovering mechanisms by which mobile genetic retroelements, both retroviruses and retrotransposons, induce genetic instability and apoptosis in human cells and the molecular basis of cancer including cell cycle checkpoints and DNA repair mechanisms. Dr Haoudi is also the Codirector of the Cancer Biology and Virology Focal Group. He has founded the Journal of Biomedicine and Biotechnology (http://www.j-biomed-biotech.org) and is also the Founder and President of the International Council of Biomedicine and Biotechnology (http://www.i-council-biomed-biotech.org).

Hindawi Publishing Corporation
Journal of Biomedicine and Biotechnology
Volume 2006, Article ID 45621, Pages 1–6
DOI 10.1155/JBB/2006/45621

Mini Review

The ORF1 Protein Encoded by LINE-1: Structure and Function During L1 Retrotransposition

Sandra L. Martin

Department of Cell and Developmental Biology, School of Medicine, University of Colorado, Fitzsimons Campus, PO Box 6511, Mail Stop 8108, Aurora, CO 80045, USA

Received 17 September 2005; Revised 10 December 2005; Accepted 13 December 2005

LINE-1 or L1 is an autonomous non-LTR retrotransposon in mammals. Retrotransposition requires the function of the two L1-encoded polypeptides, ORF1p and ORF2p. Early recognition of regions of homology between the predicted amino acid sequence of ORF2 and known endonuclease and reverse transcriptase enzymes led to testable hypotheses regarding the function of ORF2p in retrotransposition. As predicted, ORF2p has been demonstrated to have both endonuclease and reverse transcriptase activities. In contrast, no homologs of known function have contributed to our understanding of the function of ORF1p during retrotransposition. Nevertheless, significant advances have been made such that we now know that ORF1p is a high-affinity RNA-binding protein that forms a ribonucleoprotein particle together with L1 RNA. Furthermore, ORF1p is a nucleic acid chaperone and this nucleic acid chaperone activity is required for L1 retrotransposition.

Copyright © 2006 Sandra L. Martin. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

LINE-1 BACKGROUND

L1 is an interspersed repeated DNA found in mammalian genomes that attained its high copy number by retrotransposition. It belongs to a large family of mobile elements that replicate via reverse transcription of an RNA intermediate. These elements, non-LTR retrotransposons, are distinct from retroviruses and LTR-containing retrotransposons, which also replicate via reverse transcription of an RNA intermediate. Non-LTR retrotransposons are widely distributed throughout eukaryotes and likely all share the novel mechanism of replication known as target-site-primed reverse transcription, or TPRT, whereby reverse transcription of the L1 RNA template is primed from a 3′ hydroxyl at the genomic insertion site (reviewed in [1]).

Close inspection of sequences of the >500,000 copies of either mouse or human L1 reveals a multigene family comprised mainly of truncated, mutated, or rearranged copies of a small number of functional, full-length elements; only a subset of the full-length elements is capable of retrotransposition (see [2, 3], for recent reviews). The functional retrotransposons are 6–7 kb in length and contain two long open reading frames (ORFs), both of which encode proteins that are required for retrotransposition [3]. ORF2 encodes a 146 kd polypeptide which provides the two known enzymatic activities that are required to convert the RNA intermediate into a new genomic DNA copy of L1 during TPRT: endonuclease [4] and reverse transcriptase [5]. These two activities of ORF2p were predicted from sequence similarity between L1 and known apurinic/apyrimidinic endonuclease [4] and reverse transcriptase [6, 7] proteins, then verified biochemically [4, 8]. In contrast, the role of ORF1p during retrotransposition has remained far more elusive because the amino acid sequence predicted by ORF1 lacks homology with any protein of known function (see [9], and results of October 2005 NCBI protein and nucleotide searches).

ORF1 AND RELATED SEQUENCES

The first intact coding sequence for an ORF1 protein was found by sequence analysis of a mouse L1 element called L1Md-A2 [7]. Comparison of the theoretical translation of this 41.2 kd protein from L1Md-A2 to that of a consensus primate L1 sequence revealed that the C-terminal half of ORF1 was evolving under selective pressure, whereas the N-terminal half was not. This early analysis also noted that the predicted ORF1 protein is quite basic, a common feature of nucleic acid-binding proteins [7]. Subsequently, many ORF1-like sequences have been determined from the L1 elements of different mammals, and from related elements found in fish [16]. The C-terminal, homologous, basic domain is a general feature of the ORF1 proteins from all of these elements (the conserved domain in Figure 1).

[Figure 1 rows: Mouse A, Mouse TF, Rat, Human, Rabbit, Fish; marked regions: coiled-coil domain, conserved domain.]

Figure 1: Schematic representation of the ORF1 protein. The thin bar for each species represents the entire length of the protein. Thicker bars represent the coiled-coil (gray, based upon coils analysis [10]) and conserved domains (black, based upon multiple-sequence alignments using T-Coffee [11]). These two domains overlap in the human, rabbit, and fish, but not the two mouse or rat ORF1 protein sequences, as indicated. Sequences used were mouse A101 (Q91V68l, [12]), mouse L1spa (O54849, [13]), rat (Q63303), human L1rp (AAD39214, [14]), rabbit ([15], not in GenBank), and fish swimmer (AAD02927, [16]). The two mouse and the human ORF1 protein sequences are from retrotransposition-competent elements; the other sequences are from untested elements. A model for the trimeric structure of mouse L1 and its role in TPRT appeared in [17] by Martin et al.

A second predicted feature of all of these ORF1 amino acid sequences is the presence of a long coiled-coil domain upstream of the conserved domain (Figure 1). In human L1, this coiled-coil domain encompasses a leucine zipper [18]. In rabbit ORF1, this coiled-coil region appears similar to keratin [15]. The most likely explanation for the poor sequence similarity among the different ORF1 sequences in this region with one another and with other coiled-coil-containing proteins (eg, keratin) is that all share a coiled-coil domain with distinct evolutionary origins, probably brought into proximity with the conserved, basic C-terminal domain via recombination. Recombination to create novel sequence variants is often evident in L1 lineages [19–23]. With this scenario, the constraints imposed by a requirement for protein-protein interaction via a coiled-coil domain in ORF1 protein force a small degree of apparent similarity in the absence of homology among these diverse sequences. Conversely, it is possible that all of these ORF1 sequences evolved from a common ancestor, but extremely rapid divergence of the sequences towards the N-terminus of the protein has obscured the evidence for this homology. Interestingly, sequence variation in this N-terminal region of ORF1p is particularly great within subtypes of human [24], rat (see [2], and references therein), and mouse (see [25, 26], and references therein) L1. Positive selection acting within this portion of the ORF1 protein is associated with the evolutionary success/extinction of human L1 lineages, perhaps reflecting drive for ORF1p to either attract or avoid an interacting factor [24]. Additional sequences from L1 elements in other species may shed light on whether the amino acid sequence of the N-terminal region is undergoing strong selective pressure for rapid sequence divergence by accumulating replacement substitutions, or if novel sequences are often acquired from nonhomologous sources.

An unexpected feature of the L1 ORF1 sequence is revealed when its amino acid sequence is used as the query in a BLASTP search. The program reports that it has detected a putative conserved domain. This conserved domain is essentially the entire mammalian ORF1 protein sequence and has been annotated "transposase 22." Given that transposase is the enzyme responsible for the DNA breaking-joining reactions that occur during transposition of a wide variety of DNA elements [27], it seems likely to be a misnomer to call this domain a transposase for several reasons. Most significantly, the TPRT reaction used by L1 and the other non-LTR retrotransposons does not require a transposase enzymatic activity because cDNA is synthesized in situ using chromosomal primers [28]. Thus, L1 replication lacks any intermediate equivalent to the double-strand DNA substrate of transposases and the related integrases [27]. Furthermore, biochemical and mutational analyses demonstrate that the endonuclease activity of L1 ORF2p is responsible for the DNA cleavages that occur during TPRT [4, 29, 30]. Finally, the conserved domain in ORF1p of known functional mouse and human L1 elements lacks an apparent DDE motif [31], which is conserved in the active sites of transposases and integrases. Due to vast sequence divergence among members of the transposase/integrase superfamily of proteins, their DDE motifs are best recognized in structure [32] rather than sequence alignments; hence absolute resolution of the question of whether L1 ORF1p should be annotated "transposase 22" awaits atomic-level resolution of its structure.

ORF1p IS REQUIRED FOR RETROTRANSPOSITION

Even the relatively conserved C-terminal domain of ORF1 is more divergent than ORF2 when the sequences of human and mouse L1s are compared [7]. Hence, when it was finally possible to measure L1 retrotransposition activity using an autonomous retrotransposition assay [5, 33], it was unexpected to learn that mutations in ORF1 were at least as severe as, if not more severe than, those that abolish reverse transcriptase activity. No retrotransposition events were detected in human L1 mutants in which either the serine at position 119 of ORF1p was replaced with a stop codon, or a highly conserved diarginine at 261/262 was replaced by dialanine; in both of these cases, the frequency of retrotransposition was less than 0.06% of the wild-type parental element. In contrast, mutation of a critical active-site residue in the reverse transcriptase domain of ORF2 (D702Y), which abolishes in vitro enzymatic activity [8], knocked retrotransposition down to 0.15% of wild type [5]. The other known enzymatic activity of ORF2 in L1 is endonuclease, which is also required for TPRT [34]. Several mutations that eliminate detectable endonuclease activity in an in vitro nicking assay again knock retrotransposition down to 0.2–1%, but do not eliminate it [34]. We observe similar effects of mutations in the ORF1 conserved domain compared to the endonuclease and reverse transcriptase domains in mouse L1.
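
To make the comparison of mutant severities easier to read, the residual retrotransposition frequencies quoted above (as percent of wild type) can be converted to fold reductions. The following is only an illustrative sketch for the reader, not part of the original analyses: the function name `fold_reduction` and the dictionary labels are hypothetical, and the 0.06% value for the ORF1 mutants is an upper bound (no events were detected), so the corresponding fold reduction is a lower bound.

```python
def fold_reduction(percent_of_wild_type: float) -> float:
    """Convert a residual frequency (percent of wild type) to a fold reduction."""
    if percent_of_wild_type <= 0:
        raise ValueError("percent_of_wild_type must be positive")
    return 100.0 / percent_of_wild_type


# Residual retrotransposition frequencies quoted in the text (percent of wild type).
# Labels are illustrative; the ORF1 value is an upper bound, not a measurement.
residual_percent = {
    "ORF1 S119stop or RR261/262AA (upper bound)": 0.06,
    "ORF2 reverse transcriptase D702Y": 0.15,
    "ORF2 endonuclease mutants (low end of range)": 0.2,
    "ORF2 endonuclease mutants (high end of range)": 1.0,
}

for label, pct in residual_percent.items():
    print(f"{label}: {pct}% of wild type, about {fold_reduction(pct):.0f}-fold reduction")
```

On these numbers the ORF1 mutants show the largest reduction of the mutants listed.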

Thus, to date, the most stringent mutations of L1 are those in ORF1. As noted when the leakiness of the ORF2 mutations was originally observed, it is likely that ORF2p is more readily supplied in trans (albeit with substantially reduced efficiency, [5]), whereas ORF1p appears to be more stringently required in cis with the L1 RNA. These findings imply that ORF1 is critical for an earlier step in the retrotransposition cycle than reverse transcription itself, for example, regulating expression of ORF2 [35] or recruitment of ORF2p into the L1 ribonucleoprotein complex, and/or delivering the L1 RNP to the chromosomal DNA and facilitating the strand exchanges that are required during TPRT [17, 36].

In light of the stringent cis-requirement for ORF1p during L1 retrotransposition, it is interesting that ORF1p appears to be dispensable when the L1 machinery provided by ORF2p is usurped by the human SINE, Alu, for its amplification. This surprising finding may be explained if the SRP9/14 protein can replace ORF1p function [37]. In contrast, ORF1p is required along with ORF2p for processed pseudogene formation by L1 [38, 39].

FUNCTIONAL ANALYSIS OF ORF1p: PROTEIN-PROTEIN INTERACTION

Leucine zippers and coiled-coil domains are typically associated with protein-protein interactions. In cytoplasmic extracts from human cells that express high levels of L1, NTera2D1, the ORF1p (also called p40) partitions into a 160,000 x g pellet. Treatment of this pellet with increasing concentrations of glutaraldehyde to cross-link the protein shifts increasing amounts of the 40 kd ORF1 protein into complexes that run at 78, 89, 100, and 200 kd on SDS-PAGE, suggesting that the ORF1ps in these cytoplasmic particles are interacting closely with one another, or with other cellular proteins. This study also examined full-length p40 and various truncations expressed in E coli for protein-protein interactions, thereby mapping the multimerization domain to the N-terminal half of the protein, in the region of the predicted coiled-coil [9].

Similar findings were obtained with mouse L1 ORF1p using somewhat different experimental approaches. Recombinant protein purified from E coli coimmunoprecipitated 35S-labeled protein synthesized in vitro in rabbit reticulocyte lysate, demonstrating that mouse ORF1p is able to self-associate [40]. A combination of yeast 2-hybrid and GST pull-down assays was later used to map the region in mouse ORF1p responsible for multimerization; the predicted coiled-coil domain is both necessary and sufficient for protein-protein interaction [41]. More recently, overexpression of soluble ORF1p in baculovirus permitted analysis of its multimerization state by analytical ultracentrifugation. These studies revealed that the full-length protein forms a highly stable homotrimer, whereas a truncated ORF1p containing just the carboxy-terminal C-1/3 does not self-associate, even at relatively high protein concentrations [17]. All of the above findings consistently support the conclusion that the coiled-coil domain is wholly responsible for multimerization in both mouse and human L1 ORF1ps, with the trimer being the biologically relevant form of mouse ORF1p [17].

FUNCTIONAL ANALYSIS OF ORF1p: NUCLEIC ACID BINDING

The bias towards highly basic amino acids in ORF1p led to the hypothesis that this protein interacts with nucleic acids [7]. Early evidence for such interaction was provided by cosedimentation of ORF1p with L1 RNA in sucrose gradients loaded with cytoplasmic extracts prepared from the mouse embryonal carcinoma cell line F9. The heavy complexes that formed were termed L1 ribonucleoprotein particles, or L1 RNP. L1 RNPs are not sensitive to disruption by EDTA, but are sensitive to proteolysis [42]. Exposure of the RNPs to UV light rapidly cross-links the RNA to protein, indicating a close association between L1 RNA and protein [21]. The human ORF1p (p40) also associates with L1 RNA based upon a series of cosedimentation experiments. p40 remains in the supernatant upon centrifugation at 800 and 12,000 x g, but pellets at 160,000 x g. Treatment of the cytoplasmic extract (800 x g supernatant) with RNase but not DNase prior to centrifugation shifts p40 from the 160,000 x g pellet to the supernatant, indicating that the protein is pelleting because it is in a large complex with RNA, or an RNP. The human L1 RNPs are not dependent on divalent cations or disturbed by 10 mM EDTA; thus, it appears that human ORF1p is bound to RNA in an RNP that is quite similar to the mouse L1 RNP. Further experiments indicated that the RNA present in these RNPs was L1 RNA and not actin or G3PDH RNA [9]. The presence of ORF1p in RNP was found to be sensitive to high concentrations of monovalent cations as well as RNase treatment [43, 44], leading to an enrichment procedure for RNA-free ORF1p from human cells [43], which was then used to provide evidence for one or two relatively high-affinity binding sites for ORF1p in L1 RNA [35]. It is important to note that all of the above experiments examined the interaction of L1 RNA with ORF1p in extracts from animal cells where L1 RNA and ORF1p were present as minority components of a complex mixture.

A more direct assessment of the nucleic acid-binding properties of the ORF1 protein is provided by studies of highly purified protein prepared after overexpression in either E coli or baculovirus-infected insect cells. As with ORF1p from mammalian cells, it is critical to take precautions against copurification of RNA with the protein: when protein is purified in standard, nondenaturing conditions without high concentrations of monovalent cation, RNA is coenriched and the protein is heavily contaminated with nucleic acid. This is readily apparent on a wavelength scan, or by examining purified protein by ethidium bromide staining after electrophoresis through agarose gels [41]. Our earliest experiments with protein expressed in E coli used denaturing conditions (8 M urea) to purify the protein from the insoluble inclusion body fraction, which simultaneously removed bound nucleic acid. That protein was used for UV cross-linking and electrophoretic mobility-shift assays (EMSAs), which demonstrated that ORF1p binds RNA and single-stranded DNA [40]. The affinities observed in those experiments, however, were lower than those obtained with subsequent experiments that were done using protein purified from the soluble fraction rather than the refolded denatured protein from inclusion bodies, probably because most of the protein was not correctly refolded to its native form after the denaturation. Interestingly, the RNA-binding region of the full-length ORF1p was mapped by simply examining various GST-ORF1p fusion constructs (containing full-length and a variety of truncated regions of ORF1p) for the presence of copurifying RNA. As long as the E coli extracts and affinity purification steps were kept in physiological concentrations of monovalent cation, RNA copurified with the protein if it contained the RNA-binding domain. All deletions containing the C-1/3 basic domain were contaminated with RNA and those that lacked it were free of RNA contamination. This same region of mouse ORF1p was found to be both necessary and sufficient for binding nucleic acid based upon transfer of 32P from RNA to protein by UV cross-linking [41].

The RNA-binding properties of the full-length mouse ORF1 protein purified from baculovirus were further assessed using coimmunoprecipitation and filter-binding assays. These experiments examined the affinity of ORF1p for a variety of transcripts, and tested whether a specific cis-acting sequence in mouse L1 RNA recruits ORF1p. The presence of a high-affinity site in human L1 RNA was suggested based upon preferential coimmunoprecipitation of a 41 nt T1 nuclease-resistant fragment with ORF1 antibody [35]. The mouse L1 RNA coimmunoprecipitation experiments revealed that efficient recovery of the 32P-labeled RNA required at least 38 nt, suggesting a length effect rather than a sequence requirement. All longer RNAs tested precipitated efficiently, independent of sequence. Further evidence that ORF1p is a nonsequence-specific RNA-binding protein was provided by results of nitrocellulose filter-binding assays using highly purified mouse ORF1p expressed in baculovirus. Transcripts that contained specifically the 38 nt sequence in either the sense or antisense orientation both bound with high affinity.

... pattern of L1 as well as experimental evidence from the autonomous retrotransposition assay (see [38, 39], and references therein).

FUNCTIONAL ANALYSIS OF ORF1p: NUCLEIC ACID CHAPERONE ACTIVITY

Non-LTR retrotransposons are present throughout Eukaryota, but diverged long ago into five groups based upon the phylogenetic relationships of their reverse transcriptase region (the only sequence feature conserved among all non-LTR retrotransposons) and the type and organization of their protein domains [1]. Three of these groups, L1, I, and Jockey, each named for the first element of that group described, have a separate ORF upstream of their reverse transcriptase-containing ORF. In two of these three groups, I and Jockey, the upstream ORFs (ie, ORF1s) both contain zinc-finger domains, making their ORF1 proteins reminiscent of retroviral gag proteins. An important function associated with the zinc-finger-containing, nucleocapsid domain of gag is that of nucleic acid chaperone, which is critical for retroviral replication [46]. Nucleic acid chaperones are proteins that facilitate rearrangements of nucleic acids to their thermodynamically most stable form. A combination of at least three protein features contributes to nucleic acid chaperone activity: charge neutralization due to an excess of basic amino acids, a higher affinity for single-stranded than for double-stranded nucleic acids, and the ability to lower the cooperativity of the helix:coil transition [47]. These properties must be exquisitely balanced so that the chaperone can promote both melting and annealing of nucleic acids. The ORF1 protein from the non-LTR retrotransposon, I factor, shares several biochemical properties with retroviral nucleocapsid proteins, including the ability to accelerate annealing of complementary single-strand DNA sequences; these observations led to the suggestion that the I factor ORF1 protein functions as a nucleic acid chaperone during replication [48].

Mouse L1 ORF1 protein also accelerates annealing of complementary oligonucleotides. In addition, it lowers the
Although there is a slight increase in the apparent bind- Tm of mispaired duplex DNA, accelerates a strand displace- ing affinity of ORF1p to RNA containing the sense 38 nt se- ment reaction if an imperfect duplex is challenged by the quence compared to the same sequence in antisense orienta- addition of the perfect complement, and alters the force re- tion, it is only 4- to 7-fold and therefore too small to be con- quired for the helix: coil transition in single-molecule studies sidered specific binding for sense versus antisense L1 RNA using optical forceps [36, 49]. Significantly, the nucleic acid [45]. chaperone activity of ORF1p is required for retrotransposi- This discrepancy between the results with mouse and hu- tion. A single-point mutation that destroys effective chap- man L1 ORF1ps regarding the existence of a high-affinity erone activity (R297K) without affecting RNA or single- binding site within L1 RNA has not been resolved. Possibly, it stranded DNA binding affinity also destroys retrotransposi- is due to differences between mouse and human L1, or, more tion activity [49]. Consistent with this observation, the anal- likely, between the reagents used for the assays. For example, ogous mutation in human L1 also destroys retrotransposi- it is possible that another protein that is critical for the site- tion, but not RNP formation [44]. specificity was present in the partially purified preparation from human cells, but missing when the protein was purified SUMMARY from baculovirus-infected insect cells or Ecoli.Thequestion of whether L1 RNA contains a specific, high-affinity bind- L1 is arguably the most significant dynamic force currently ing site for ORF1p is important for L1 biology because it of- operating upon the mammalian genome. 
Retrotransposition fers an attractive explanation for the cis-preference of ORF1p is just one of many facets of L1’s contribution to genetic for L1 RNA that is evident from both the evolutionary plasticity and diversity [50], although it lies at the root of Sandra L. Martin 5 all of the others. Retrotransposition requires the proteins [11] Notredame C, Higgins DG, Heringa J. T-Coffee: a novel encoded by both of the two open reading frames in L1. The method for fast and accurate multiple sequence alignment. two known functions of the protein encoded by ORF2, en- Journal of Molecular Biology. 2000;302(1):205–217. donuclease and reverse transcriptase, were readily predicted [12] DeBerardinis RJ, Goodier JL, Ostertag EM, Kazazian HH Jr. baseduponsequencehomology,whereashomologyhasso Rapid amplification of a retrotransposon subfamily is evolving far failed to provide clues regarding the function of the ORF1 the mouse genome. Nature Genetics. 1998;20(3):288–290. [13] Naas TP, DeBerardinis RJ, Moran JV, et al. An actively retro- protein. In spite of this disadvantage, however, several sig- transposing, novel subfamily of mouse L1 elements. The EM- nificant advances have been made in establishing the struc- BO Journal. 1998;17(2):590–597. ture and function of this critical retrotransposition protein [14] Schwahn U, Lenzner S, Dong J, et al. Positional cloning of though a series of in vivo and in vitro experiments. The the gene for X-linked retinitis pigmentosa 2. Nature Genetics. protein binds both RNA and DNA, with a higher affinity 1998;19(4):327–332. for single-stranded than double-stranded nucleic acids. The [15] Demers GW, Matunis MJ, Hardison RC. The L1 family of long RNA-binding function leads to RNP formation and safe de- interspersed repetitive DNA in rabbits: sequence, copy num- livery of the RNP to genomic DNA so that it can undergo ber, conserved open reading frames, and similarity to keratin. TPRT. 
The nucleic acid chaperone activity of ORF1p likely Journal of Molecular Evolution. 1989;29(1):3–19. contributes more directly to reverse transcription by TPRT, [16] Duvernell DD, Turner BJ. Swimmer 1, a new low-copy- number LINE family in teleost genomes with sequence sim- perhaps by facilitating the strand exchanges that place the ilarity to mammalian L1. Molecular Biology and Evolution. DNA primer onto the RNA or cDNA template, or by melt- 1998;15(12):1791–1793. ing secondary structure in the RNA, or both. [17] Martin SL, Branciforte D, Keller D, Bain DL. Trimeric struc- ture for an essential protein in L1 retrotransposition. Proceed- ACKNOWLEDGMENTS ings of the National Academy of Sciences of the United States of America. 2003;100(24):13815–13820. I thank D Branciforte, E Epperson, P W-l Li, C Rose, and A [18] Holmes SE, Singer MF, Swergold GD. Studies on p40, Walker for helpful comments on the manuscript and NIH the leucine zipper motif-containing protein encoded by GM40367 for support. the first open reading frame of an active human LINE-1 transposable element. The Journal of Biological Chemistry. REFERENCES 1992;267(28):19765–19768. [19] Adey NB, Schichman SA, Hutchison CA III, Edgell MH. Com-  [1] Eickbush TH, Malik HS. Origins and evolution of retrotrans- posite of A and F-type 5 terminal sequences defines a subfam- posons. In: Craig NL, Craigie R, Gellert M, Lambowitz AM, ily of mouse LINE-1 elements. JournalofMolecularBiology. eds. Mobile DNA II. Washington, DC: American Society of Mi- 1991;221(2):367–373. crobiology Press; 2002:1111–1144. [20] Adey NB, Schichman SA, Graham DK, Peterson SN, Edgell [2] Furano AV. The biological properties and evolutionary dy- MH, Hutchison CA III. Rodent L1 evolution has been driven namics of mammalian LINE-1 retrotransposons. Progress in by a single dominant lineage that has repeatedly acquired new Nucleic Acid Research and Molecular Biology. 2000;64:255–294. transcriptional regulatory sequences. 
Molecular Biology and [3] Moran JV, Gilbert N. Mammalian LINE-1 retrotransposons Evolution. 1994;11(5):778–789. and related elements. In: Craig NL, Craigie R, Gellert M, Lam- [21] Saxton JA, Martin SL. Recombination between subtypes cre- bowitz AM, eds. Mobile DNA II. Washington, DC: American ates a mosaic lineage of LINE-1 that is expressed and actively Society of Microbiology Press; 2002:836–869. retrotransposing in the mouse genome. JournalofMolecular [4] Feng Q, Moran JV, Kazazian HH Jr, Boeke JD. Human L1 Biology. 1998;280(4):611–622. retrotransposon encodes a conserved endonuclease required [22] Hayward BE, Zavanelli M, Furano AV. Recombination creates for retrotransposition. Cell. 1996;87(5):905–916. novel L1 (LINE-1) elements in Rattus norvegicus. Genetics. [5] Moran JV, Holmes SE, Naas TP, DeBerardinis RJ, Boeke JD, 1997;146(2):641–654. Kazazian HH Jr. High frequency retrotransposition in cul- [23] Cabot EL, Angeletti B, Usdin K, Furano AV. Rapid evolution tured mammalian cells. Cell. 1996;87(5):917–927. of a young L1 (LINE-1) clade in recently speciated Rattus taxa. [6] Hattori M, Kuhara S, Takenaka O, Sakaki Y. L1 family of repet- Journal of Molecular Evolution. 1997;45(4):412–423. itiveDNAsequencesinprimatesmaybederivedfromase- [24] Boissinot S, Furano AV. Adaptive evolution in LINE-1 retro- quence encoding a reverse transcriptase-related protein. Na- transposons. Molecular Biology and Evolution. 2001;18(12): ture. 1986;321(6070):625–628. 2186–2194. [7] Loeb DD, Padgett RW, Hardies SC, et al. The sequence of a [25] Goodier JL, Ostertag EM, Du K, Kazazian HH Jr. A novel ac- large L1Md element reveals a tandemly repeated 5 end and tive L1 retrotransposon subfamily in the mouse. Genome Re- several features found in retrotransposons. Molecular and Cel- search. 2001;11(10):1677–1685. lular Biology. 1986;6(1):168–182. [26] Mears ML, Hutchison CA III. The evolution of modern lin- [8] Mathias SL, Scott AF, Kazazian HH Jr, Boeke JD, Gabriel A. 
Re- eages of mouse L1 elements. Journal of Molecular Evolution. verse transcriptase encoded by a human transposable element. 2001;52(1):51–62. Science. 1991;254(5039):1808–1810. [27] Craig NL. Unity in transposition reactions. Science. 1995; [9] Hohjoh H, Singer MF. Cytoplasmic ribonucleoprotein com- 270(5234):253–254. plexes containing human LINE-1 protein and RNA. The [28] Luan DD, Korman MH, Jakubczak JL, Eickbush TH. Reverse EMBO Journal. 1996;15(3):630–639. transcription of R2Bm RNA is primed by a nick at the chro- [10] Lupas A, Van Dyke M, Stock J. Predicting coiled coils from mosomal target site: a mechanism for non-LTR retrotranspo- protein sequences. Science. 1991;252(5010):1162–1164. sition. Cell. 1993;72(4):595–605. 6 Journal of Biomedicine and Biotechnology

[29] Cost GJ, Boeke JD. Targeting of human retrotransposon integration is directed by the specificity of the L1 endonuclease for regions of unusual DNA structure. Biochemistry. 1998;37(51):18081–18093.
[30] Morrish TA, Gilbert N, Myers JS, et al. DNA repair mediated by endonuclease-independent LINE-1 retrotransposition. Nature Genetics. 2002;31(2):159–165.
[31] Kulkosky J, Jones KS, Katz RA, Mack JP, Skalka AM. Residues critical for retroviral integrative recombination in a region that is highly conserved among retroviral/retrotransposon integrases and bacterial insertion sequence transposases. Molecular and Cellular Biology. 1992;12(5):2331–2338.
[32] Davies DR, Mahnke Braam L, Reznikoff WS, Rayment I. The three-dimensional structure of a Tn5 transposase-related protein determined to 2.9-Å resolution. The Journal of Biological Chemistry. 1999;274(17):11904–11913.
[33] Ostertag EM, Prak ETL, DeBerardinis RJ, Moran JV, Kazazian HH Jr. Determination of L1 retrotransposition kinetics in cultured cells. Nucleic Acids Research. 2000;28(6):1418–1423.
[34] Cost GJ, Feng Q, Jacquier A, Boeke JD. Human L1 element target-primed reverse transcription in vitro. The EMBO Journal. 2002;21(21):5899–5910.
[35] Hohjoh H, Singer MF. Sequence-specific single-strand RNA binding protein encoded by the human LINE-1 retrotransposon. The EMBO Journal. 1997;16(19):6034–6043.
[36] Martin SL, Bushman FD. Nucleic acid chaperone activity of the ORF1 protein from the mouse LINE-1 retrotransposon. Molecular and Cellular Biology. 2001;21(2):467–475.
[37] Dewannieux M, Esnault C, Heidmann T. LINE-mediated retrotransposition of marked Alu sequences. Nature Genetics. 2003;35(1):41–48.
[38] Wei W, Gilbert N, Ooi SL, et al. Human L1 retrotransposition: cis preference versus trans complementation. Molecular and Cellular Biology. 2001;21(4):1429–1439.
[39] Esnault C, Maestre J, Heidmann T. Human LINE retrotransposons generate processed pseudogenes. Nature Genetics. 2000;24(4):363–367.
[40] Kolosha VO, Martin SL. In vitro properties of the first ORF protein from mouse LINE-1 support its role in ribonucleoprotein particle formation during retrotransposition. Proceedings of the National Academy of Sciences of the United States of America. 1997;94(19):10155–10160.
[41] Martin SL, Li J, Weisz JA. Deletion analysis defines distinct functional domains for protein-protein and nucleic acid interactions in the ORF1 protein of mouse LINE-1. Journal of Molecular Biology. 2000;304(1):11–20.
[42] Martin SL. Ribonucleoprotein particles with LINE-1 RNA in mouse embryonal carcinoma cells. Molecular and Cellular Biology. 1991;11(9):4804–4807.
[43] Hohjoh H, Singer MF. Ribonuclease and high salt sensitivity of the ribonucleoprotein complex formed by the human LINE-1 retrotransposon. Journal of Molecular Biology. 1997;271(1):7–12.
[44] Kulpa DA, Moran JV. Ribonucleoprotein particle formation is necessary but not sufficient for LINE-1 retrotransposition. Human Molecular Genetics. 2005;14(21):3237–3248.
[45] Kolosha VO, Martin SL. High-affinity, non-sequence-specific RNA binding by the open reading frame 1 (ORF1) protein from long interspersed nuclear element 1 (LINE-1). The Journal of Biological Chemistry. 2003;278(10):8112–8117.
[46] Rein A, Henderson LE, Levin JG. Nucleic-acid-chaperone activity of retroviral nucleocapsid proteins: significance for viral replication. Trends in Biochemical Sciences. 1998;23(8):297–301.
[47] Williams MC, Rouzina I, Wenner JR, Gorelick RJ, Musier-Forsyth K, Bloomfield VA. Mechanism for nucleic acid chaperone activity of HIV-1 nucleocapsid protein revealed by single molecule stretching. Proceedings of the National Academy of Sciences of the United States of America. 2001;98(11):6121–6126.
[48] Dawson A, Hartswood E, Paterson T, Finnegan DJ. A LINE-like transposable element in Drosophila, the I factor, encodes a protein with properties similar to those of retroviral nucleocapsids. The EMBO Journal. 1997;16(14):4448–4455.
[49] Martin SL, Cruceanu M, Branciforte D, et al. LINE-1 retrotransposition requires the nucleic acid chaperone activity of the ORF1 protein. Journal of Molecular Biology. 2005;348(3):549–561.
[50] Han JS, Boeke JD. LINE-1 retrotransposons: modulators of quantity and quality of mammalian gene expression? BioEssays. 2005;27(8):775–784.

Hindawi Publishing Corporation
Journal of Biomedicine and Biotechnology
Volume 2006, Article ID 59746, Pages 1–6
DOI 10.1155/JBB/2006/59746

Review Article Do LINEs Have a Role in X-Chromosome Inactivation?

Mary F. Lyon

Mammalian Genetics Unit, Medical Research Council (MRC), Harwell, Oxfordshire OX11 0RD, UK

Received 31 May 2005; Revised 22 November 2005; Accepted 4 December 2005

There is longstanding evidence that X-chromosome inactivation (XCI) travels less successfully in autosomal than in X-chromosomal chromatin. The interspersed repeat elements LINE1s (L1s) have been suggested as candidates for “boosters” which promote the spread of XCI in the X-chromosome. The present paper reviews the current evidence concerning the possible role of L1s in XCI. Recent evidence, accruing from the human genome sequencing project and other sources, confirms that mammalian X-chromosomes are indeed rich in L1s, except in regions where there are many genes escaping XCI. The density of L1s is the highest in the evolutionarily oldest regions. Recent work on X; autosome translocations in human and mouse suggested failure of stabilization of XCI in autosomal material, so that genes are reactivated, but resistance of autosomal genes to the original silencing is not excluded. The accumulation of L1s on the X-chromosome may have resulted from reduced recombination or late replication. Whether L1s are part of the mechanism of XCI or a result of it remains enigmatic.

Copyright © 2006 Mary F. Lyon. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

INTRODUCTION

At an early stage in the discovery of X-chromosome inactivation (XCI), the phenomena seen in mouse X; autosome translocations provided important clues. The first relevant observation that led to the discovery of XCI was that female mice heterozygous for X-linked colour genes showed characteristic patterns of colour variegation in their coats different from the patterns produced by autosomal colour genes [1]. Similar variegated coat colour patterns occurred in females heterozygous for X; autosome translocations when the translocated autosomal segment included loci of coat colour genes [2]. This was attributed to the travel of the inactivating signal from the X-chromosome into the attached autosome. However, the inactivation of colour genes occurred in only one of the two segments into which the autosome was broken. This led to the concept of the X-inactivation centre (XIC) on the X-chromosome, from which the silencing signal spread in a cis-limited manner along the chromosome [3, 4]. Only X-chromosomal or autosomal segments in physical continuity with the XIC underwent inactivation. Another feature seen in X; autosome translocations was that the autosomal segment involved did not undergo complete inactivation. Genes distant from the translocation break showed less variegation, and therefore less inactivation, than genes nearer to the break [4]. It appeared that the silencing signal travelled less successfully in autosomal than in X-chromosomal chromatin.

The concept of the X-inactivation centre has led to major advances. The XIST/Xist gene (XIST is the human gene symbol and Xist the mouse gene symbol) cloned from the XIC has been shown to be necessary and sufficient for the initiation of XCI in the early embryo [5, 6]. Xist codes for an untranslated mRNA which coats the inactive X-chromosome (Xi) [7] and initiates gene silencing. This is followed by a process of stabilization of the inactive state, which is independent of Xist and involves a complex series of chromatin changes, including histone modifications and methylation of CpG islands (reviewed in [8]). The Xi becomes late replicating, its histones H3 and H4 are hypoacetylated, histone H3 is methylated at lysines K9 and K27, its CpG islands are methylated, and it is associated with the variant histone macro-H2A.1. These are the so-called hallmarks of XCI.

Thus, much is known concerning the initiation of XCI but the basis for the apparently less successful travel of the silencing in autosomal than in X-chromosomal chromatin remains enigmatic. Riggs [9, 10] suggested the existence of “way stations” or “boosters” along the X-chromosome which enhanced the spread of XCI. I suggested that interspersed repeat elements, specifically LINE1s, were candidates for the boosters [11, 12]. This paper provides a review of the evidence for and against this suggestion.

EVIDENCE FOR CANDIDATURE OF LINE1s AS BOOSTER ELEMENTS

Richness of LINE1s on the X-chromosome

Although XCI is incomplete in autosomal material, it seems, from many human X; autosome translocations and from mouse transgenes, that it can travel to some extent in any autosome. Therefore, to be candidates for boosters, elements must be present throughout the genome but must be particularly dense on the X-chromosome. The mouse and human X-chromosomes were first described as being rich in L1s on the basis of FISH (fluorescent in situ hybridization) evidence with antibodies to L1s [13, 14]. This was later extended to other species [15, 16]. In the fruit bat Carollia, an X; autosome translocation forms part of the normal karyotype. Parish et al [17] found that the X-chromosomal part of the translocation was rich in L1s but the autosomal part was no richer than other autosomes. Thus, the data suggest that richness in L1s is a common feature of mammalian X-chromosomes.

Data accruing from the human genome project have yielded further detailed information on the distribution of L1s along the human X-chromosome. Bailey et al [18] compared the human X-chromosome with several autosomes and found the X-chromosome to be indeed rich in L1s, having 26% L1s compared with only 13% in autosomal DNA. On the human X-chromosome, some genes escape XCI and remain active on the Xi. These escaping genes tend to be clustered in certain regions, particularly in the distal part of the short arm, Xp. Bailey et al found that the concentration of L1s in this distal part of Xp was lower than in the vicinity of the XIC. They also studied the age of the L1s and found that they were predominantly of a younger age, dating from the time in evolution of the eutherian radiation. Ross et al [19] studied the complete annotated sequence of the human X-chromosome and were able to provide further detail on the distribution of L1s. They confirmed that the X-chromosome was indeed rich in L1s, having 29% L1s compared with a genome average of 17%. The density of L1s was high around the XIC but the XIST gene itself lay in a 60 kb region not rich in L1s. Distal Xp, where there are many genes escaping XCI, had an L1 density not greater than that in autosomes. Graves and Watson [20] had studied the evolutionary history of the X-chromosome and divided it into an older X-conserved region (XCR), which had evolved from an autosome at an early stage, and a region more recently added from an autosome, the X-added region, XAR. These regions were further divided into five subregions according to their evolutionary age [21]. Ross et al found that the density of L1s in the various regions was strongly correlated with evolutionary age, being highest in the evolutionarily oldest regions.

Thus, these data are compatible with L1s having a role as boosters in XCI. However, they do not reveal whether L1s are part of the mechanism of XCI or whether the density of L1s is a consequence of XCI.

Evidence that appears to conflict with the data of Bailey et al and Ross et al comes from the work of Ke and Collins [22]. They studied the density of L1s at points from 1 kb to 100 kb upstream or downstream of 19 human genes that escaped XCI and 73 normally inactivated genes and found no significant differences. They did, however, find a lower density of CpG islands around escaping genes. This work in turn was contradicted by Carrel and Willard [23] who studied a much larger number of escaping and nonescaping genes, and found no difference in CpG island content but, like Ross et al, found that regions with many escaping genes did have a lower L1 content. Two groups have studied chicken transgenes introduced into the mouse X-chromosome, to determine if they would be subject to XCI, given that they are unlikely to contain recognition sequences important for XCI (such as young L1s that are absent in the chicken genome). Riggs [24] introduced a chicken transgene into the mouse Hprt locus. The transgene had complete flanking sequences but no L1s and yet it was inactivated when on the Xi. However, Goldman et al [25, 26] introduced a larger construct, consisting of 11 copies of the chicken 17 kb transferrin gene (187 kb in all), into the mouse X-chromosome and the transgene escaped inactivation. This is consistent with L1s acting over relatively large distances.

As in the human, data from the mouse genome sequencing project have confirmed that the mouse X-chromosome is rich in L1s, having 28.5% L1s compared with 14.6% in autosomes [27]. The mouse Y-chromosome also is rich in L1s. Tsuchiya et al [28] studied a region of the human X, Xp11.2, that contains a domain of escaping genes and compared it with a conserved region in the mouse, which has only one escaping gene. They found no association of L1 density with escape from XCI, except that L1 density was reduced in the SMCX/Smcx gene region. In contrast, the density of long terminal repeats (LTRs) was decreased in the human escaping region.

Recent data on X; autosome translocations

Incomplete XCI in autosomal segments attached to the X-chromosome in X; autosome translocations could have various causes. The silencing signal could fail to travel in autosomal chromatin, or autosomal genes might be resistant to the signal, or the stabilization of XCI might fail so that genes undergo reactivation. It is difficult to tell these causes apart in specific cases, but some clues are available. In some cases, genes at a long distance from the translocation break in the autosome are inactivated, while some other interstitial genes are not. In such a case, the escape of the interstitial genes from XCI must be either due to resistance to the original silencing signal or to initial silencing followed by reactivation. To distinguish between these two possibilities requires observations at different times. For example, observations might be made of mouse embryos and liveborn young of various ages, or cultured cells might be followed over different passages.

In early work on the mouse, there were several translocations involving and the coat colour genes albino, c, and pink-eye, p [29]. Both the c and p genes showed variegation, resulting from the inactivation of the wild-type alleles in the translocated segment, but the level of the variegation and hence of the inactivation varied from one translocation to another. Hence, incomplete XCI of the two colour genes cannot have been solely due to their resistance to the silencing signal or the stabilizing mechanism. The degree of travel of the signals into the autosome must have been at least partly responsible, and this is confirmed by the fact that variegation was lower with greater distance of the genes concerned from the XIC. Evidence that reactivation was also involved was provided by Cattanach [30] who showed that animals with inactivation of the wild-type allele of the albino gene due to the translocation Is(In7;X)1Ct darkened with age, as the wild-type allele regained expression. The distribution of the pigmented cells was such that they could not have arisen by movement of pigmented cells around the body, but must have arisen by change in expression of pigment genes. Thus, in this group of translocations, it appears that both failure of travel of the signal and failure of stabilization were involved.

Recently there has been further work with mouse and human translocations. In the mouse, Duthie et al [31] studied two translocations resulting in variegation for coat colour genes, T(4;X)37H which gives variegation for the gene brown, b, and Is(In7;X)1Ct (previously studied by Cattanach) which gives variegation for albino, c, pink-eye, p, and ruby2, ru2. They found only limited coating of the autosomal material with Xist RNA and variation from cell to cell. Similarly, the spread of hypoacetylation of histone H4 into the autosome was limited and variable. In the X-chromosome itself, Xist RNA was restricted to light G-bands. These results are consistent with initial spread of Xist RNA to a point beyond that of the inactivated genes, followed by the failure of the stabilization mechanism so that hallmarks of XCI, such as hypoacetylation of histones, did not appear, and in some cells genes were reactivated. Resistance of some genes to the original silencing signal is also possible. As Cattanach [30] had previously shown the reactivation of the wild-type allele of the c gene in Is(In7;X)1Ct, it is tempting to attribute all the effects seen to reactivation, but in fact the failure of travel of the silencing signal and the resistance of some genes to it are also possible.

Recent results with human X; autosome translocations have also suggested the occurrence of reactivation of autosomal material. These studies have used unbalanced translocation products. In individuals with balanced X; autosome translocations the translocated X-chromosome is typically the active one (Xa) in all cells. Inactivation of the translocated X leads to genetic imbalance, because of the inactivation of the attached autosome and failure of inactivation of the X-chromosome segment lacking the XIC. Cells with the translocated X-chromosome as the Xi are thus selected against. Conversely, in an unbalanced translocation, inactivation of the translocated segment tends to restore genetic balance and is selected. White et al [32] studied a female with an unbalanced X; 4 translocation and a normal phenotype, suggesting that most of the excess chromosome 4 segment was inactive. They found that 14/20 genes and expressed sequence tags (ESTs) were inactivated and 6/20 were expressed. The inactivated genes were distributed widely along the autosomal segment, indicating that the inactivating signal had travelled a long distance along the segment, over 100 Mb. Five of the six expressed sequences were interstitial, having inactivated sequences on both sides. These results were consistent either with the resistance of these genes to the original silencing signal or with their reactivation.

A similar picture consistent with either resistance to silencing, or with reactivation, or with both has emerged from other studies with human X; autosome translocations, in which various hallmarks of XCI have been studied. Keohane et al [33] found no spread into the autosome of XIST RNA, hypoacetylation, or late replication in two unbalanced translocations, but others obtained different results. Sharp et al [34] studied five unbalanced translocations, one of which had also been used by Keohane et al, and found considerable variation in the spread of XCI. The overall picture was of discontinuous spread of XCI, with some interstitial genes not inactivated. Hallmarks of XCI, including hypoacetylation of histones, late replication, and CpG island methylation, varied from case to case and from cell to cell. In contrast to the results of Keohane et al, they found some spreading of XCI into the autosome in the case also studied by these authors. Hall et al [35] studied two unbalanced translocations in which the relatively normal phenotype suggested considerable gene inactivation. In both translocations, only part of the autosomal segment was coated with XIST RNA. In one, the autosome showed hypoacetylation, and in both, late replication varied from cell to cell. The autosomal segments seemed to show more hallmarks of XCI than was indicated by the XIST RNA coverage, and the authors suggested that the XIST RNA had originally extended further. The work was done with cultured cells, and at earlier passages of the same cells, complete late replication of the autosomal segment had been observed, whereas it was now only partial, suggesting a loss of late replication with time. Failure of stabilization of inactivation seemed a better explanation of the effects seen than resistance to the original silencing signal.

Thus, the evidence as a whole from X; autosome translocations is that incomplete XCI in autosomal material involves a combination of factors. The early work on mouse translocations indicated incomplete travel of the original silencing signal in different translocations. The more recent data suggest that, in addition, there may be resistance of some genes to the silencing signal, and also failure of the stabilizing signal, leading to reactivation. The possibility of reactivation would fit with the earlier work of Cattanach [30] on the reactivation of the albino gene with age in the insertion Is1Ct. It is also consistent with the studies of Hall et al [36] who introduced an XIST transgene into an autosome in human cultured cells, and found diminution of signs of XCI with time in culture. The general picture at present is that both the original silencing and also the stabilizing mechanism are liable to failure in autosomal material.

Possible mechanisms of action of LINE1s

When the Xist gene is introduced as a transgene into an autosome, effects similar to those in X; autosome translocations are seen. The Xist RNA coats the autosome, but not completely, and the genes are silenced, but those at a distance from the transgene may escape silencing [37]. Wutz and Jaenisch [38] constructed an inducible Xist cDNA transgene and introduced it into an autosome in mouse embryonic stem (ES) cells. In undifferentiated ES cells, the Xist RNA travelled along the chromosome and brought about silencing of genes. This silencing was not accompanied by hallmarks of XCI and was reversed when the agent inducing the cDNA expression was removed. When the ES cells were allowed to differentiate in the presence of Xist cDNA, hallmarks of XCI appeared and the silencing became irreversible and independent of expression of Xist by the inducing agent. Thus, there is some developmental factor, unidentified at present, required for the start of the stabilizing process. The timing of travel of Xist RNA is hence very important. If this RNA is not in place when the developmental factor initiates stabilization, the process will fail. It is not yet known how long is required for the process of coating of the chromosome with Xist RNA. However, data of Latham [39] and of Huynh and Lee [40], that in early mouse embryos X-linked genes at a distance from the XIC remain active at later stages than those

MECHANISM OF ACCUMULATION OF LINE1s ON THE X-CHROMOSOME

Consideration of the possible means of accumulation of L1s on the X-chromosome might yield some clues as to whether this accumulation is a part of the mechanism of XCI or is a result of it. One possibility is that L1s have accumulated as a result of reduction in recombination on the X-chromosome as the X- and Y-chromosomes have differentiated. The Y-chromosome is thought to have differentiated from the X-chromosome by a series of inversions [19, 44]. Each inversion will have led to an absence in males of recombination in the inverted segment. This will have been followed by the degeneration of the isolated segment of the Y-chromosome. It has been suggested that the level of L1s in mammalian genomes is a result of competition between insertion and excision, and excision is thought to involve recombination. Thus, as recombination on the X-chromosome decreased in evolution, L1s would accumulate [16, 45]. In the Drosophila species D. miranda retrotransposons were seen to accumulate during the evolution of a neo-Y chromosome [42], although they did not accumulate on the X.
Another possi- near the XIC, suggest that it is quite a slow process. Possi- bility is that late replication of the X-chromosome when in- bly, L1s could affect the speed of travel of Xist RNA. Another activemightleadtoanaccumulationofL1s[16]. Thus, ac- possibility is that they could promote the attachment of Xist cumulation of L1s might be due to differentiation of the X- RNA to the chromatin, and hence could aid in the stabiliza- and Y-chromosomes in evolution and in this sense inciden- tion. tal to XCI. It should be noted, however, that the age of the The process of stabilization might also be affected in oth- L1s dates them from the time of the eutherian radiation [18], er ways. L1s might affect the binding of histone-modifying well after the time of the differentiation of the X- and Y- enzymes or might affect CpG island methylation. Hansen chromosomes. Even if the accumulation of L1s is inciden- [41] studied methylation of L1s in females suffering from tal to XCI, they could still be part of the mechanism. There the autosomal recessive ICF syndrome (immunodeficiency, could be a self-sustaining system. It is possible that as the centromeric instability, and facial anomalies) caused by the X- and Y-chromosomes differentiated, L1s accumulated and deficiency of the DNMT3B de novo DNA methyltransferase this favoured the spread of XCI, which presumably began at gene. These females showed normal methylation of CpG the XIC. Further accumulation of L1s in evolution could then islands and L1s on autosomes and the active X, but re- have been selected. duced methylation of both L1s and CpG islands on the in- active X-chromosome. This indicated that L1s and CpG is- CONCLUSION lands on the inactive X are methylated by DNMT3B, and that autosomal and active X L1s are methylated by some At present, there is good evidence for an accumulation of other enzyme. 
Hansen concluded that X-chromosomal L1s L1s on the human X-chromosome in such a distribution are probably unmethylated at the time of initiation of XCI that they could fulfil the function of booster elements in in the early embryo. This is consistent with them having a XCI. Whether or not they indeed have such a function is role in XCI. Another possibility is that L1s tend to change less clear. Information from the sequence of the human X- the state of chromatin towards the heterochromatic state chromosome showing that regions with many escaping genes and this again might tend to assist stabilization of XCI have a lower density of L1s is very provocative. In addi- [42]. tion to the more limited coverage of Tsuchiya et al [28], it An interesting point is that Allen et al [43]foundanex- would be very valuable to have similar data on the mouse X- cess of L1s in the flanking regions of autosomal monoallel- chromosome, which has been much rearranged in evolution ically expressed genes, both imprinted genes and randomly and where there are fewer escaping genes. monoallelically expressed genes. In human and mouse, the L1s were primarily of an evolutionarily young, species- REFERENCES specific type, suggesting that they had accumulated after the genes became monoallelic, and that the genes concerned had [1] Lyon MF. Gene action in the X-chromosome of the mouse developed a strategy for monoallelic expression involving L1s (Mus musculus L.). Nature. 1961;190(4773):372–373. after first becoming monoallelic. As with the X-chromosome, [2] Russell LB, Bangham JW. Variegated-type position effects in the role of these L1s is not clear. the mouse. Genetics. 1961;46(5):509–525. Mary F. Lyon 5

[3] Lyon MF. Cytogenetics, discussion. In: Proceedings of the Second International Conference on Congenital Malformations; July 1963; New York, NY. 67–68.
[4] Russell LB. Mammalian X-chromosome action: inactivation limited in spread and region of origin. Science. 1963;140:976–978.
[5] Penny GD, Kay GF, Sheardown SA, Rastan S, Brockdorff N. Requirement for Xist in X chromosome inactivation. Nature. 1996;379(6561):131–137.
[6] Wutz A, Jaenisch R. A shift from reversible to irreversible X inactivation is triggered during ES cell differentiation. Molecular Cell. 2000;5(4):695–705.
[7] Clemson CM, McNeil JA, Willard HF, Lawrence JB. XIST RNA paints the inactive X chromosome at interphase: evidence for a novel RNA involved in nuclear/chromosome structure. The Journal of Cell Biology. 1996;132(3):259–275.
[8] Heard E. Recent advances in X-chromosome inactivation. Current Opinion in Cell Biology. 2004;16(3):247–255.
[9] Gartler SM, Riggs AD. Mammalian X-chromosome inactivation. Annual Review of Genetics. 1983;17:155–190.
[10] Riggs AD. Marsupials and mechanisms of X-chromosome inactivation. Australian Journal of Zoology. 1990;37(2):419–441.
[11] Lyon MF. Epigenetic inheritance in mammals. Trends in Genetics. 1993;9(4):123–128.
[12] Lyon MF. X-chromosome inactivation: a repeat hypothesis. Cytogenetics and Cell Genetics. 1998;80(1–4):133–137.
[13] Boyle AL, Ballard SG, Ward DC. Differential distribution of long and short interspersed element sequences in the mouse genome: chromosome karyotyping by fluorescence in situ hybridization. Proceedings of the National Academy of Sciences of the United States of America. 1990;87(19):7757–7761.
[14] Korenberg JR, Rykowski MC. Human genome organization: Alu, lines, and the molecular structure of metaphase chromosome bands. Cell. 1988;53(3):391–400.
[15] Baker RJ, Wichman HA. Retrotransposon Mys is concentrated on the sex chromosomes: implications for copy number containment. Evolution. 1990;44(8):2083–2088.
[16] Wichman HA, Van den Bussche RA, Hamilton MJ, Baker RJ. Transposable elements and the evolution of genome organization in mammals. Genetica. 1992;86(1–3):287–293.
[17] Parish DA, Vise P, Wichman HA, Bull JJ, Baker RJ. Distribution of LINEs and other repetitive elements in the karyotype of the bat Carollia: implications for X-chromosome inactivation. Cytogenetic and Genome Research. 2002;96(1–4):191–197.
[18] Bailey JA, Carrel L, Chakravarti A, Eichler EE. Molecular evidence for a relationship between LINE-1 elements and X chromosome inactivation: the Lyon repeat hypothesis. Proceedings of the National Academy of Sciences of the United States of America. 2000;97(12):6634–6639.
[19] Ross MT, Grafham DV, Coffey AJ, et al. The DNA sequence of the human X chromosome. Nature. 2005;434(7031):325–337.
[20] Graves JAM, Watson JM. Mammalian sex chromosomes: evolution of organization and function. Chromosoma. 1991;101(2):63–68.
[21] Lahn BT, Page DC. Four evolutionary strata on the human X chromosome. Science. 1999;286(5441):964–967.
[22] Ke X, Collins A. CpG islands in human X-inactivation. Annals of Human Genetics. 2003;67(3):242–249.
[23] Carrel L, Willard HF. X-inactivation profile reveals extensive variability in X-linked gene expression in females. Nature. 2005;434(7031):400–404.
[24] Riggs AD. X chromosome inactivation, differentiation, and DNA methylation revisited, with a tribute to Susumu Ohno. Cytogenetic and Genome Research. 2002;99(1–4):17–24.
[25] Goldman MA, Stokes KR, Idzerda RL, et al. A chicken transferrin gene in transgenic mice escapes X-chromosome inactivation. Science. 1987;236(4801):593–595.
[26] Goldman MA, Reeves PS, Wirth CM, et al. Comparative methylation analysis of murine transgenes that undergo or escape X-chromosome inactivation. Chromosome Research. 1998;6(5):397–404.
[27] Waterston RH, Lindblad-Toh K, Birney E, et al. Initial sequencing and comparative analysis of the mouse genome. Nature. 2002;420(6915):520–562.
[28] Tsuchiya KD, Greally JM, Yi Y, Noel KP, Truong JP, Disteche CM. Comparative sequence and X-inactivation analyses of a domain of escape in human Xp11.2 and the conserved segment in mouse. Genome Research. 2004;14(7):1275–1284.
[29] Russell LB, Montgomery CS. Comparative studies on X-autosome translocations in the mouse. II. Inactivation of autosomal loci, segregation, and mapping of autosomal breakpoints in five T(X;1)'S. Genetics. 1970;64(2):281–312.
[30] Cattanach BM. Position effect variegation in the mouse. Genetical Research. 1974;23(3):291–306.
[31] Duthie SM, Nesterova TB, Formstone EJ, et al. Xist RNA exhibits a banded localization on the inactive X chromosome and is excluded from autosomal material in cis. Human Molecular Genetics. 1999;8(2):195–204.
[32] White WM, Willard HF, Van Dyke DL, Wolff DJ. The spreading of X inactivation into autosomal material of an X;autosome translocation: evidence for a difference between autosomal and X-chromosomal DNA. The American Journal of Human Genetics. 1998;63(1):20–28.
[33] Keohane AM, Barlow AL, Waters J, Bourn D, Turner BM. H4 acetylation, XIST RNA and replication timing are coincident and define X;autosome boundaries in two abnormal X chromosomes. Human Molecular Genetics. 1999;8(2):377–383.
[34] Sharp AJ, Spotswood HT, Robinson DO, Turner BM, Jacobs PA. Molecular and cytogenetic analysis of the spreading of X inactivation in X;autosome translocations. Human Molecular Genetics. 2002;11(25):3145–3156.
[35] Hall LL, Clemson CM, Byron M, Wydner K, Lawrence JB. Unbalanced X;autosome translocations provide evidence for sequence specificity in the association of XIST RNA with chromatin. Human Molecular Genetics. 2002;11(25):3157–3165.
[36] Hall LL, Byron M, Sakai K, Carrel L, Willard HF, Lawrence JB. An ectopic human XIST gene can induce chromosome inactivation in postdifferentiation human HT-1080 cells. Proceedings of the National Academy of Sciences of the United States of America. 2002;99(13):8677–8682.
[37] Lee JT, Jaenisch R. Long-range cis effects of ectopic X-inactivation centres on a mouse autosome. Nature. 1997;386(6622):275–279.
[38] Wutz A, Jaenisch R. A shift from reversible to irreversible X inactivation is triggered during ES cell differentiation. Molecular Cell. 2000;5(4):695–705.
[39] Latham KE. X chromosome imprinting and inactivation in preimplantation mammalian embryos. Trends in Genetics. 2005;21(2):120–127.
[40] Huynh KD, Lee JT. Inheritance of a pre-inactivated paternal X chromosome in early mouse embryos. Nature. 2003;426(6968):857–862.
[41] Hansen RS. X inactivation-specific methylation of LINE-1 elements by DNMT3B: implications for the Lyon repeat hypothesis. Human Molecular Genetics. 2003;12(19):2559–2567.
[42] Steinemann M, Steinemann S. Enigma of Y chromosome degeneration: neo-Y and neo-X chromosomes of Drosophila miranda a model for sex chromosome evolution. Genetica. 1998;102-103(1):409–420.
[43] Allen E, Horvath S, Tong F, et al. High concentrations of long interspersed nuclear element sequence distinguish monoallelically expressed genes. Proceedings of the National Academy of Sciences of the United States of America. 2003;100(17):9940–9945.
[44] Graves JAM. The origin and function of the mammalian Y chromosome and Y-borne genes—an evolving understanding. BioEssays. 1995;17(4):311–320.
[45] Langley CH, Montgomery E, Hudson R, Kaplan N, Charlesworth B. On the role of unequal exchange in the containment of transposable element copy number. Genetical Research. 1988;52(3):223–235.

Hindawi Publishing Corporation Journal of Biomedicine and Biotechnology Volume 2006, Article ID 56182, Pages 1–9 DOI 10.1155/JBB/2006/56182

Review Article LINE-1 Endonuclease-Dependent Retrotranspositional Events Causing Human Genetic Disease: Mutation Detection Bias and Multiple Mechanisms of Target Gene Disruption

Jian-Min Chen,1, 2, 3 Claude Férec,1, 2, 3, 4 and David N. Cooper5

1 INSERM U613, Génétique Moléculaire et Génétique Epidémiologique, 29220 Brest, France
2 Faculté de Médecine de Brest et des Sciences de la Santé, Université de Bretagne Occidentale, 29238 Brest, France
3 Etablissement Français du Sang-Bretagne, 35000 Rennes, France
4 Hôpital Morvan, CHRU Brest, Laboratoire de Génétique Moléculaire et d'Histocompatibilité, 29200 Brest, France
5 Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park Campus, Cardiff CF14 4XN, UK

Received 20 April 2005; Revised 11 October 2005; Accepted 13 October 2005

LINE-1 (L1) elements are the most abundant autonomous non-LTR retrotransposons in the human genome. Having recently performed a meta-analysis of L1 endonuclease-mediated retrotranspositional events causing human genetic disease, we have extended this study by focusing on two key issues, namely, mutation detection bias and the multiplicity of mechanisms of target gene disruption. Our analysis suggests that whereas an ascertainment bias may have generally militated against the detection of autosomal L1-mediated insertions, autosomal L1 direct insertions could have been disproportionately overlooked owing to their unusually large size. Our analysis has also indicated that the mechanisms underlying the functional disruption of target genes by L1-mediated retrotranspositional events are likely to be dependent on several different factors such as the type of insertion (L1 direct, L1 trans-driven Alu, or SVA), the precise locations of the inserted sequences within the target gene regions, the length of the inserted sequences, and possibly also their orientation.

Copyright © 2006 Jian-Min Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

INTRODUCTION

LINE-1 (long interspersed element-1) or L1 elements account for ∼17% of the human genome sequence [1]. However, of >500,000 L1 copies, only ∼80–100 are capable of active retrotransposition [2]. Retrotranspositionally competent L1 elements are typically 6.0 kb in length and L1 retrotransposition is thought to occur by target-site-primed reverse transcription [3–5]. Apart from simple self-insertion, L1 retrotransposition can alter the primary structure of the human genome in a variety of different ways (6,7,8,9).

Recently, we performed a comprehensive meta-analysis of 48 L1 endonuclease-mediated retrotranspositional events that cause human genetic disease. This analysis explored the sequence features associated with the different L1-mediated human retrotransposons (ie, L1 direct insertions, L1 trans-driven Alu insertions, and L1 trans-driven SVA (short interspersed nucleotide elements-R, variable number of tandem repeats, and Alu insertions)), the frequency of genomic deletions created upon L1-mediated retrotransposition, and the process of L1-mediated insertion [6]. Here, we have extended this analysis by focusing on two key issues namely, mutation detection bias and the multiple mechanisms of target gene disruption. Note that during the preparation of this review, three further examples of simple Alu insertions causing human disease have been reported; these were also included in the analysis (Table 1).

MUTATION DETECTION DISPLAYS A SIGNIFICANT BIAS

Since the first report that de novo L1 insertions into the factor VIII gene (F8) had caused severe haemophilia A [7], numerous examples of simple L1-mediated retrotranspositional events (ie, those involving no loss of target gene material; n = 42) have been identified as a cause of human genetic disease (Table 1). Based upon results from in vitro studies [8, 9], we have systematically annotated disease-causing

Table 1: L1 endonuclease-mediated retrotranspositional events known to cause human genetic disease.∗

Disrupted gene | Chrom. location | Inserted element | Insertion size (bp)/orientation(a) | Length of poly (A) tail (bp) | Location of insert within the target gene(b) | Target gene disruption(c) | Original detection method(d) | Reference

Simple insertions
APC | 5q | L1 Ta | 520/S | 222 | E15 | —(e) | Southern blotting | [56]
CHM | Xq | L1 Ta | 6017/AS | 71 | E6 (+35; −82) | Skipping of E6 | Initially failed to amplify exon 6 | [22]
CYBB | Xp | L1 Ta | 836/S | 69 | I5 (+1864; −278) | Complex splicing | RT-PCR | [23]
CYBB | Xp | L1 Ta | 1722/S | 101 | E4 | — | Southern blotting | [57]
DMD | Xp | L1 Ta | 1400/S | 38 | E48 | — | Southern blotting | [58]
DMD | Xp | L1 Ta | 530/AS | 73 | 5'-UTR (see text) | No expression of muscle form transcript | RT-PCR | [45]
F8 | Xq | L1 Ta | 3800/S | 54 | E14 | — | Southern blotting | [7]
F8 | Xq | L1 preTa | 2300/AS | 77 | E14 | — | Southern blotting | [7]
F9 | Xq | L1 Ta | 463/S | 68 | E5 | — | Not specified | [59]
F9 | Xq | L1 Ta | 163/S | 125 | E7 | — | PCR | [60]
HBB | 11p | L1 Ta | 6000/AS | 107 | I2 (+765; −85) | Reduced mRNA expression (15%) | Southern blotting | [41, 61]
RP2 | Xp | L1 Ta | 6000/S | 64 | I1 (+633; −15641) | No mRNA expression | Southern blotting | [24]
RPS6KA3 | Xp | L1 HS | 2800/AS | Yes(f) | I3 (+5177; −8) | Skipping of E4 | Initially failed to amplify exon 4 | [40]
APC | 5q | AluYb8 | 278/S | 40 | E15 | — | PCR | [62]
BCHE | 3q | AluYb9 | 289/S | 38 | E2 | — | Southern blotting | [63]
BRCA1 | 17q | AluS | 286/S | Yes(f) | E11 | — | Protein truncation test | [21]
BRCA2 | 13q | AluYc1 | 281/S | 62 | E22 (+36; −163) | Skipping of E22 | PCR | [25]
BRCA2 | 13q | AluYa5 | 285/S | Yes(f) | E3 | Skipping of E3 | Southern blotting | [21]
BTK | Xq | AluY | —/AS | — | E8 | — | PCR | [64, 65]
BTK | Xq | AluY | 281/S | 74 | E9 | — | PCR | [15]
CASR | 3q | AluYa5 | 280/AS | 93 | E7 | — | PCR | [66]
CLCN5 | Xp | AluYa5 | 281/S | 50 | E11 | Skipping of E11 | PCR | [26, 32]
CRB1 | 1q | AluY | —/AS | 70 | E7 | — | PCR | [67]
EYA1 | 8q | AluYa5 | —/AS | 97; 31(g) | E10 | — | Southern blotting | [68]
F8 | Xq | AluYb9 | 288/AS | 37 | I18 (+1734; −5) | Skipping of E19 | PCR | [35]
F9 | Xq | AluYa5a2 | 244/S | 78 | E5 | — | Southern blotting | [69]
F9 | Xq | AluYa5a2 | 237/S | 39 | E5 | — | PCR | [70]
F9 | Xq | AluY | 279/AS | 40 | E8 | — | Not specified | [59]
FGFR2 | 10q | AluYa5 | 283/AS | 69 | I8 (−2) | Skipping of E9 | PCR | [36]
FGFR2 | 10q | AluYb8 | 288/AS | 47 | E9 | — | PCR | [36]
GK | Xp | AluYc1 | 241/AS | 74 | I4 (+13629; −42) | See text | PCR | [39]
HESX1 | 3p | AluYb8 | 288/S | 30 | E3 | Complex splicing | PCR | [27]
HMBS | 11q | AluYa5 | 279/AS | 39 | E5 (+32; −18) | No mRNA expression | PCR | [28]
IL2RG | Xq | AluYa5 | —/AS | — | I7 (−17) | — | PCR | [64, 65]
NF1 | 17q | AluYa5 | 282/AS | 40 | I40 (+134; −27) | Skipping of E41 | Southern blotting | [37]
SERPING1 | 11q | AluYc1 | 285/S | 42 | I6 | — | Not specified | [71]
TNFRSF6 | 10q | AluYa5 | 281/AS | 33 | I7 (+1212; −50) | Skipping of E8 | RT-PCR | [38]
ZFHX1B | 2q | AluYa5 | 281/S | 93 | E8 | — | PCR | [72]
ARH | 1p | SVA | 2600/S | 57 | I1 (+687; −9453) | No expression | Initially failed to amplify a small region of intron 1 in a homozygous patient | [50]
BTK | Xq | SVA | 491/S | 74 | E9 (+51; −26) | Skipping of E9 | Initially failed to amplify E9 by PCR | [48]
FCMD | 9q | SVA | 3062/S | Yes(f) | 3'-UTR (see text) | Nearly no expression | Southern blotting | [51]
SPTA1 | 1q | SVA | 632/S | 50 | E5 (+60; −87) | Skipping of E5 | RT-PCR | [49]

Genomic deletions associated with L1-mediated retrotransposons
DMD | Xp | L1 Ta | 608/AS (Δ1 bp) | 16 | E44 (+145; −3) | Skipping of E44 | PCR | [54]
FCMD | 9q | L1 Ta | 1200/S (Δ6 bp) | 59 | I7 (+2527; −24) | Complex splicing | Southern blotting | [55]
ABCD1 | Xq | AluYb9 | 98/S (Δ4726 bp) | 20 | I5 | NA(h) | Initially failed to amplify several exons by PCR | [17]
APC | 5q | AluYb9 | 93/AS (Δ1599 bp) | 60 | E14 | NA | In vitro synthesized-protein assay | [52]
F8 | Xq | AluYb8 | 290/AS (Δ2 bp) | 47 | E14 | — | Southern blotting | [73]
SERPINC1 | 1q | Alu | 6/AS (Δ1444 bp) | 40 | I3b | NA | Southern blotting | [53]

Genomic deletions associated with only simple poly (A) insertions
AGA | 4q | NA | NA/AS (Δ2076 bp) | 37 | I8 | NA | RT-PCR | [74]
BRCA2 | 13q | NA | NA/S (Δ6212 bp) | 35 | I13 | NA | Southern blotting | [75]
COL4A6 | Xq | NA | NA/AS (Δ > 40 kb) | 70 | I2 | NA | Southern blotting | [76]

∗ The entries are presented in the same order as in Table 2 from Chen et al [6] for easy comparison, except for the addition of three simple Alu insertions (BRCA1 [21]; BRCA2 [21]; and HESX1 [27]) that have been reported during the preparation of this review. Data on chromosomal location, inserted element and orientation, insertion size, and length of poly (A) tail were derived from Table 2 in Chen et al [6].
(a) With respect to the sense strand of the disrupted gene. S, sense; AS, antisense. The lengths of the genomic deletions associated with L1-mediated retrotransposons and simple poly (A) insertions are indicated in parentheses.
(b) I, intron; E, exon. When an insertion occurred into an intron/exon and accompanying RNA analysis data were available, the position of the insertion's integration site was indicated in parentheses (+, relative to the first nucleotide of the intron/exon; −, relative to the last nucleotide of the intron/exon).
(c) Only the effect on the target gene's pre-mRNA splicing and/or mRNA expression was evaluated.
(d) The method that initially suggested/identified the mutation at the nucleotide level. PCR indicates all PCR-based techniques using genomic DNA as templates.
(e) Data not available.
(f) Poly (A) tail present but number of residues not specified.
(g) 97 bp in the affected mother and 31 bp in the affected daughter, respectively.
(h) Not applicable.
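
The integration-site notation used in Table 1 (+, distance from the first nucleotide of the intron/exon; −, distance from the last) can be read programmatically. The following is a minimal Python sketch, not part of the original analysis: it parses entries such as "I18 (+1734; −5)" and reports which flanking exon an intronic insertion lies nearer to. The names parse_site, nearer_exon, and INTRONIC_ALU are hypothetical; the coordinates are transcribed from the five intronic Alu rows of Table 1 that report target gene disruption.

```python
import re

# Table 1 notation: "(+1734; −5)" = 1734 nt from the first and 5 nt from the
# last nucleotide of the intron/exon; "(−2)" gives only the distance to the end.
# The minus sign may be U+2212, an en dash (U+2013), or an ASCII hyphen.
SITE_RE = re.compile(r"\(\s*(?:\+(\d+)\s*;\s*)?[\u2212\u2013-](\d+)\s*\)")


def parse_site(entry):
    """Return (distance_from_start, distance_from_end); start may be None."""
    match = SITE_RE.search(entry)
    if match is None:
        raise ValueError(f"no integration site found in {entry!r}")
    start = int(match.group(1)) if match.group(1) else None
    return start, int(match.group(2))


def nearer_exon(entry):
    """Classify an intronic insertion as nearer the upstream or downstream exon."""
    start, end = parse_site(entry)
    if start is None:
        return "unknown"  # Table 1 gives no distance from the intron start
    if end < start:
        return "downstream"
    return "upstream" if start < end else "equidistant"


# Hypothetical container; values copied from Table 1 (intronic Alu insertions).
INTRONIC_ALU = {
    "F8": "I18 (+1734; −5)",
    "FGFR2": "I8 (−2)",
    "GK": "I4 (+13629; −42)",
    "NF1": "I40 (+134; −27)",
    "TNFRSF6": "I7 (+1212; −50)",
}

if __name__ == "__main__":
    for gene, entry in INTRONIC_ALU.items():
        print(gene, parse_site(entry), nearer_exon(entry))
```

Under these assumptions, the four entries that give both coordinates are classified as nearer the downstream exon. The FGFR2 entry gives only its distance from the intron end (2 nt), so the sketch reports it as unknown rather than guessing the intron length.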

L1-mediated retrotranspositional events that have been associated with genomic deletions (n = 9; Table 1). All these events probably resulted from L1 endonuclease-dependent retrotranspositional activity because not only have all the inserts integrated at typical L1-endonuclease cleavage sites, but they also possess poly (A) tails (see [6, Tables 1 and 2 and Figure 3]). By contrast, the three L1-derived extra-short inserts (termed "hyphen elements" by Audrézet et al [10]) identified at the junctions of large genomic deletions [10–12] did not share the above two hallmark characteristics of L1 endonuclease-dependent retrotranspositional events. These three mutations have therefore been proposed to have arisen via a "repair" process for existing DNA lesions, an L1 endonuclease-independent mechanism [13] that is likely to be qualitatively different from L1 endonuclease-based insertional mutagenesis (see [6, Table 1]).

The above 51 L1 endonuclease-mediated retrotranspositional events account for ∼0.1% of known mutations (∼52,000 as of April 2005) causing human genetic disease, based upon the data collated in the Human Gene Mutation Database (http://www.hgmd.org/; [14]). The occurrence of L1-mediated simple retrotranspositional events has however long been thought to have been underestimated since large insertions may often be overlooked by routinely used PCR-based mutation detection techniques (eg, [15, 16]). In this review, we have sought to explore how this mutation detection bias could have operated. To this end, we first manually evaluated the original publications that reported the 51 L1 endonuclease-mediated retrotranspositional events with respect to the mutation detection method(s) that initially suggested/identified the presence of an insertion or deletion at the nucleotide (ie, DNA or RNA) level. The locations of these lesions within the target genes (ie, in the 5′-untranslated regions (UTRs), exons, introns, or 3′-UTRs, resp) were also systematically annotated. Then, in order to assess the likelihood of having underestimated the occurrence of this type of mutational event, we attempted to relate the chromosomal location of the affected genes, as well as the types, sizes, and precise locations of the inserted sequence within the genes, to the mutation detection methods employed (Table 1).

In the context of the analysis of possible mutation detection bias, we excluded, for reasons of simplicity, the following entries from further consideration: (i) the three large genomic deletions that were associated with only simple poly (A) insertions, since the type of L1-mediated retrotransposon involved is unknown (Table 1) and (ii) the SVA simple insertions, owing to their limited number (only 4; Table 1). Our primary focus has therefore been the L1 and Alu insertions, both of which have been frequently found to cause human genetic disease. In addition, we did not consider the 42 simple insertions separately from the 6 genomic deletions associated with L1-mediated retrotransposons, on the basis that all were considered to have resulted from the same L1 endonuclease-mediated insertional mechanism. However, it is important to emphasize that, of the latter 6 cases, three

Table 2: X-chromosome/autosome comparison with respect to gene number, disease genes, known mutations, and retrotranspositional insertion events.

Measure | A: X chromosome | B: autosomes+Y | (A/B) %
No. of Alu insertions causing genetic disease | 11 | 18 | 65.1
No. of L1 direct insertions causing genetic disease | 12 | 3 | 400
Size (Mb)(a) | 155 | 3045 | 5.1
Known/putative genes in the human genome(b) | 1,098 | ∼39,000 | 2.8
Diseases/traits in OMIM(c) | 895 | 14,977 | 6.0
Disease genes in HGMD(d) | 124 | 1877 | 6.6
Mutations in HGMD(e) | 10010 | 42155 | 23.8

(a),(b) Data from [20].
(c) Data from OMIM (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM&itool=toolbar) as of April 2005.
(d),(e) Data from HGMD (http://www.hgmd.org/) as of April 2005.

events were associated with extremely short target gene deletions (Δ1 bp in DMD; Δ6 bp in FCMD; and Δ2 bp in F8) yet contain relatively long inserts (Table 1); they were thus treated here as simple insertions. By contrast, the three Alu insertions that are associated with large genomic deletions (Δ4726 bp in ABCD1; Δ1599 bp in APC; and Δ1444 bp in SERPINC1) were treated as simple large genomic deletions since in all cases, the inserted Alu sequences plus the poly (A) tails are rather short (Table 1).

Since a typical Alu sequence is invariably <290 bp in length and the poly (A) tails associated with L1-mediated retrotransposons are usually <100 bp (see also [6, Table 4]), the length of an Alu insert plus its poly (A) tail should be <400 bp (Table 1). At first sight, it would appear unlikely that those Alu inserts which cause sex-linked disease are going to be significantly underestimated both because X-linked diseases readily come to clinical attention in males and because inserts of <400 bp into male X chromosomes are readily identifiable by routine PCR-based methods. Indeed, as is evident from Table 1, only in rare cases have the simple Alu inserts that have become integrated into the X chromosome been detected by Southern blotting or RT-PCR, and these would also have been amenable to detection by routine PCR-based methods. Whilst an electrophoretic band of larger than the expected size was demonstrated in the cases of Alu insertions when PCR products were examined, failure to PCR amplify several exons was encountered in the case of the 4726 bp deletion involving the X-linked ABCD1 gene [17].

To date, whereas 11 Alu insertions have been identified in X-linked genes as a cause of human genetic disease in male patients, the comparable figure for the autosomes is only 18 (Table 1). Although the X chromosome has been claimed to be a preferred target for retrotransposition [18, 19], it is difficult to accept that the observed chromosomal distribution of retrotranspositional mutations reflects the actual distribution since the X chromosome comprises only ∼5% of the human genome [20]. Consequently, it would appear likely that at least a proportion of Alu insertions causing human autosomal disease have been overlooked by routine PCR-based techniques. This could have been due to preferential PCR amplification of the wild-type allele which would have "masked" the Alu insertion mutant allele, an example being the failure to detect two Alu insertions by routinely used methods [21].

L1 direct inserts are usually much longer than Alu inserts (Table 1). Although, in principle, the presence of large inserts in X-linked genes in males might be initially suggested by the failure to PCR amplify the exon(s) under investigation (eg, as in the case of the 6017 bp L1 insertion in the CHM gene [22]), most of the L1 direct insertions listed in Table 1 were reported to have been initially identified by RT-PCR or Southern blotting. Given the extensive efforts devoted to screening for X-linked disease (note particularly the identification of two inserts that had become integrated into deep intronic regions (CYBB [23]; RP2 [24])), we surmise that the current figure (n = 12) of L1 direct inserts into the X chromosome may approach complete ascertainment. In this context, it is noteworthy that with respect to the insertions causing human X-linked disease, the number of reported L1 direct insertions (n = 12) is approximately the same as that of reported Alu insertions into X-linked genes (n = 11). However, by comparison with disease-causing Alu insertions that have become integrated within autosomal genes (n = 18), an apparent paucity of disease-causing autosomal L1 direct inserts (n = 3) is evident (Table 1). The reason for this finding may be quite simple: the longer the inserts are, the more easily will they be missed by routine PCR-based techniques in the presence of a wild-type allele. It is therefore not unreasonable to conclude that the occurrence of L1 direct insertions causing autosomal disease has probably been significantly underestimated.

To obtain further insights into this issue, we examined the above finding in the context of a multiple pairwise comparison (Table 2). This revealed that, in general, mutations in X-linked genes are significantly over-represented in HGMD by comparison with both the proportion of X-linked to non-X-linked genes in HGMD (4-fold; p < 0.0001), and the proportion of X-linked to non-X-linked genes in the genome as a whole (8-fold; p < 0.0001). This could be due to a number of different factors including (i) the X chromosome bearing a slightly higher proportion of genes that are "disease genes" than other chromosomes, (ii) X-linked disease may come to clinical attention more readily than autosomal disease since recessive mutations will become manifest in hemizygous males, (iii) hemizygous insertional mutations on the X-chromosome may, using currently used mutation detection techniques, be more readily detectable than heterozygous/compound heterozygous insertional mutations on the autosomes (due to the inherent limitations of PCR/"masking" of the mutant allele by the wild-type allele), (iv) greater effort may have been expended, historically, in identifying the genes and characterizing the mutational spectra underlying X-linked disease, and (v) the X-chromosome may represent a preferred retrotranspositional target as compared to other chromosomes. In reality, a combination of all these different factors has probably been operating. These considerations are also likely to apply to retrotranspositional insertions and may together account for the discrepancy in the observed prevalence of insertions into the X-chromosome as compared with the autosomes.

MULTIPLE MECHANISMS OF TARGET GENE DISRUPTION

We also systematically surveyed the original publications that reported the 51 L1 endonuclease-mediated retrotranspositional events with respect to the evidence presented for functional disruption of the target genes at the RNA level (ie, aberrant splicing and/or decreased mRNA expression). The information obtained was further evaluated in the context of the size, orientation, and integration sites of the inserts wherever possible and appropriate (Table 1).

Simple insertions

Alu insertions

Of the 18 simple Alu insertions that integrated within coding regions, only five were informative with respect to the functional disruption of the target genes at the RNA level (BRCA2 [25]; BRCA2 [21]; CLCN5 [26], HESX1 [27]; HMBS [28]).

possible mechanisms proposed by the original authors, we favour nonsense-mediated mRNA decay [33, 34].

All 5 informative Alu intronic insertions are located nearer to the downstream exons than to the preceding exons. Consequently, most of them (n = 4) were found to cause skipping of the downstream exons: whilst two most likely affect the correct recognition of the splice acceptor sites (F8 [35]; FGFR2 [36]), the other two may affect the branch site that is usually located very close to the end of the intron (NF1 [37]; TNFRSF6 [38]). The remaining intronic insertion (GK [39]) was, however, reported not to "cause any deletions, duplications, premature stop codons, or frameshifts in the individual with benign glycerol kinase deficiency, as determined by RT-PCR (data not shown)." This notwithstanding, since no other mutations were present within the coding regions and intron-exon boundaries of the gene, and since the Alu insertion does not represent a polymorphism, this insertion was concluded to be indeed disease-causing [39]. Although we concur with this conclusion, we nevertheless feel that the functional consequence(s) of the Alu insertion may have been overlooked. In this regard, it is worth pointing out that the patient's radiochemically measured GK activity was 32% (ie, not a complete loss) that of the mean normal control [39]. It is therefore possible that the Alu insertion did not completely disrupt normal pre-mRNA splicing. However, in the RT-PCR analysis, the aberrantly spliced transcripts may have been unstable and could thus have been "masked" by correctly spliced stable transcripts.

L1 insertions

As with the Alu simple insertions, only one of the 8 L1 simple insertions in coding regions was informative with respect
to target gene disruption; it caused the skipping of the exon This was in sharp contrast to the 7 simple Alu insertions that involved [22], probably through a similar mechanism to the are known to have become integrated into intronic regions, above-discussed Alu insertion into the BRCA2 gene [25]. By 5 of which were informative. The probable reason for this contrast, all 4 intronic insertions were informative: whilst phenomenon is that Alu insertions into coding regions will two insertions were associated with either the skipping of a invariably lead to the loss of a functional protein product, ir- single exon (RPS6KA3 [40]) or an extremely complex splic- respective of the precise point at which the gene expression ing pattern (CYBB [23]), the other two insertions resulted in pathway has been disrupted. a significant, or even complete, loss of the mRNA transcript The Alu insertion into exon 22 of the BRCA2 gene re- (HBB [41]; RP2 [24]). The latter two examples will now be sulted in the skipping of the exon involved through “some discussed in detail in the light of a recent report [42]. unknown mechanism” [25]. With hindsight, this insertion, Both L1 RNA and open-reading-frame-2 (ORF2) pro- which integrated fairly deeply into the exon involved (36 bp tein are very difficult to detect in mammalian cells, suggest- after the first nucleotide and 163 bp before the last nucleotide ing a mammalian-specific mechanism for negatively regu- of exon 22; Table 1), could have disrupted cis-splicing ele- lating L1 expression (see [42] and references therein). In- ments such as an exon splicing enhancer or/and could have deed, the A-rich sense strand of an active human L1 element interacted with trans-acting cellular splicing factors, resulting (ie, LINE-1.3; [43]), containing many canonical (n = 19) in the “silencing” of the upstream constitutional splice ac- and noncanonical (n = 141) polyadenylation signals, has ceptor site (for reviews, see [29–31]). 
Consistent with this been noted to be prone to generate truncated transcripts by postulate, the Alu insertion in the CLCN5 gene [32]was premature polyadenylation, at least under in vitro conditions recently suggested to interfere with splicing regulatory ele- [44]. However, using a different cell culture assay, Han et ments, resulting in exon 11 skipping [26]. However, this is al [42] have shown that poor expression of the ORF2 protein certainly not the case for the Alu insertion into exon 5 of is mainly due to the inability of RNA polymerase to elon- the HMBS gene: both in vitro expression studies and in vivo gate efficiently through L1 coding sequences (despite a minor RT-PCR analyses demonstrated that the mutant HMBS al- contribution from premature polyadenylation). Moreover, lele was not expressed at the RNA level [28]. Of the various these authors have demonstrated that an ORF2 sequence, 6 Journal of Biomedicine and Biotechnology when placed in the antisense orientation, inhibits transcrip- SPTA1 [49]), whereas the other two were reported to be as- tion primarily by promoting premature polyadenylation. sociated with virtually undetectable mRNA expression (ARH Based upon these observations, Han et al [42] predicted that [50]; FCMD [51]). In the case of the ARH mutation, “al- L1 elements which have become inserted into introns could though no mRNA was detectable by Northern blotting, small attenuate the expression of target genes either by prema- amounts of cDNA could be amplified using RT-PCR” [50]. ture truncation of RNA (in the antisense orientation) or by Similarly, “the transcript of this (FCMD) gene was nearly un- promoting transcriptional elongation (in the sense orienta- detectable in FCMD patients who carried the insertion ho- tion), both mechanisms resulting in the decreased produc- mozygously, and significantly lower than normal in patients tion of full-length pre-mRNA. 
Consistent with this postulate, heterozygous for the insertion and another mutation haplo- highly expressed genes were found to contain relatively small type” [51]. As previously discussed [6], although SVA ele- amounts of L1 sequence, whereas poorly expressed genes ments are relatively poorly characterized, they are composed contained large amounts [42]. of highly repetitive sequences (for a detailed sequence de- In particular, the full-length de novo L1 insertion into scription, see [51]; refer also to [50, Figure 2]). Importantly, intron 1 of the RP2 gene that is associated with the com- both SVA insertions are rather long (2600 and 3062 bp, resp). pletelossofRP2 mRNA synthesis [24] was cited by Han et Moreover, the SVA insertion in ARH [50] is very similar to al [42] as an example to support their thesis. As is evident the L1 insertion in RP2 [24] in the following respects: both from Table 1, the L1 insert in the HBB gene [41] shares re- were in the sense orientation and both had been inserted markable similarities with that found in the RP2 gene [24]: into the first introns of their respective genes in compara- both are full-length and both became integrated within in- ble locations (Table 1). Thus, it is tempting to speculate that trons. However, whereas the full-length RP2 L1 insertion was the 2600 bp SVA insert may also compromise transcriptional in the sense orientation and resulted in the complete loss elongation resulting in an undetectable level of mRNA (even of gene expression, the full-length HBB L1 insertion was in although it is C-rich, cf L1 which is A-rich). the antisense orientation and the amount of mRNA tran- That the 3062 bp SVA element had been inserted into the scribed from the affected allele was reduced to 30% of nor- 3-UTR of the FCMD gene [51]effectively serves to exclude mal (the mRNA transcripts from the affected and unaffected a possible effect on transcriptional initiation. 
It is also perti- alleles were distinguishable by a codon 2 polymorphism and nent to note that the normal FCMD transcript comprises a no splicing variants were detected [41]). This concurs with long 3-UTR of 5952 bp; the SVA integration site is 4375 bp the in vitro finding that “inserting ORF2 in the antisense downstream of the TGA translational termination codon and orientation produced a similar, but less potent, decrease in 1454 bp upstream of the poly (A) addition signal sequence. full-length RNA” [42]. Thus, the HBB insertion may serve as Thus, it is very likely that the 3062 bp SVA insertion (in sense an additional example of an insertion that is consistent with orientation) may either inhibit transcriptional elongation or the proposal that the insertion of L1 elements into a target cause abnormal polyadenylation resulting in the complete gene’s introns can significantly alter the expression of that loss of gene expression. gene [42]. The above notwithstanding, it would appear unlikely that Genomic deletions associated with L1-mediated the significantly 5-truncated L1 insert (only 530 bp) in the retrotranspositional events DMD gene [45] caused the complete loss of the muscle (M) isoform of dystrophin through inhibition of transcriptional In the 6 cases associated with large target gene deletions (ie, elongation and/or premature polyadenylation; this conclu- the 3 events associated with L1-mediated retrotransposons sion is based upon the in vitro observation that the level (ABCD1 [17]; APC [52]; SERPINC1 [53]) plus the 3 events of reporter RNA expression was inversely correlated with associated with only a simple poly (A) tail (Table 1)), the role the length of transfected L1 ORF2 (see [42, Figure 3]). In- played by L1-mediated short insertions in the functional dis- deed, this short insert, which had integrated just 28 bp up- ruption of the target genes cannot be independently assessed. 
stream of the ATG codon initiating translation of the M iso- Of the three events associated with extremely short genomic form encoded by the dystrophin (DMD) gene, must have af- deletions, only two are informative: whilst the 608 bp L1 in- fected transcriptional initiation and/or regulation. Although sertion in exon 44 of the DMD gene caused the skipping the expression of the M isoform was completely abolished, of the exon involved [54], the 1200 bp L1 insertion into in- there were compensatory increases in the expression of the tron 7 of the FCMD gene yielded a complex splicing pattern nonmuscle B (brain) and CP (cerebellar Purkinje) isoforms including the skipping of exons 7 and 8, the skipping of only in the patient’s skeletal muscle [45, 46]. (The M, B, and CP exon 7, and the skipping of exons 7, 8, and 9, respectively isoforms are generally considered to be functionally homolo- [55]. gous. However, the transcripts encoding these isoforms con- tain a unique first exon and are expressed from different, tissue-specific promoters, see [47] and references therein.) CONCLUSIONS

SVA insertions Mutation detection bias is a complex issue. This notwith- standing, our analysis has suggested that at least two factors Of the four SVA insertions (Table 1), two were inserted into (namely, clinical selection and the choice of mutation detec- exons causing the skipping of the exons involved (BTK [48]; tion techniques) may have contributed to a significant bias in Jian-Min Chen et al 7 detecting L1-mediated retrotranspositional events that cause [5] Moran JV, Holmes SE, Naas TP, DeBerardinis RJ, Boeke JD, human genetic disease. Although there is a general tendency Kazazian HH Jr. High frequency retrotransposition in cul- for autosomal L1-mediated insertions to be overlooked, au- tured mammalian cells. Cell. 1996;87(5):917–927. tosomal L1 direct insertions appear likely to be the most se- [6] Chen J-M, Stenson PD, Cooper DN, Ferec´ C. A systematic riously underestimated owing to their unusually large size. analysis of LINE-1 endonuclease-dependent retrotransposi- In particular, given the two examples of L1 direct inserts tional events causing human genetic disease. Human Genetics. that have integrated within deep intronic regions (CYBB 2005;117(5):411–427. [23]; RP2 [24]), it would appear that methods other than [7] Kazazian HH Jr, Wong C, Youssoufian H, Scott AF, Phillips PCR-based techniques (eg, RT-PCR and Southern blotting) DG, Antonarakis SE. Haemophilia A resulting from de novo insertion of L1 sequences represents a novel mechanism for should be employed whenever necessary and possible, with a mutation in man. Nature. 1988;332(6160):164–166. view to maximizing the mutation detection rate. [8] Gilbert N, Lutz-Prigge S, Moran JV.Genomic deletions created Our analysis has also demonstrated that the mechanisms upon LINE-1 retrotransposition. Cell. 2002;110(3):315–325. underlying the functional disruption of target genes by L1- [9] Symer DE, Connelly C, Szak ST, et al. 
Human l1 retrotrans- mediated retrotranspositional events are dependent on sev- position is associated with genetic instability in vivo. Cell. eral factors such as the type of insertion, the precise loca- 2002;110(3):327–338. tions of the inserted sequences within the target gene re- [10] Audrezet´ M-P, Chen J-M, Raguen´ es` O, et al. Genomic rear- gions, the length of the inserted sequences, and perhaps also rangements in the CFTR gene: extensive allelic heterogene- their orientation. Thus, an Alu insert might not be capable ity and diverse mutational mechanisms. Human Mutation. of efficiently inhibiting transcriptional elongation owing to 2004;23(4):343–357. its small size. Moreover, inserts that have integrated within [11] Mager DL, Henthorn PS, Smithies O. A Chinese Gγ+(Aγδβ)0 5-or3-UTRs would be likely to affect the target genes dif- thalassemia deletion: comparison to other deletions in the hu- ferently from those that have integrated within coding or in- man β-globin gene cluster and sequence analysis of the break- tronic regions. Further, the unique examples of full-length points. Nucleic Acids Research. 1985;13(18):6559–6575. L1 inserts integrated into intronic regions (HBB [41]; RP2 [12] Van de Water N, Williams R, Ockelford P, Browett P. A [24]) suggest that both the length and orientation of L1 in- 20.7 kb deletion within the factor VIII gene associated serts may be important in the context of transcriptional in- with LINE-1 element insertion. Thrombosis and Haemostasis. hibition. This notwithstanding, the precise mechanisms un- 1998;79(5):938–942. derlying certain insertions, for example, the large SVA insert [13] Morrish TA, Gilbert N, Myers JS, et al. DNA repair mediated in the deep intronic region in the ARH gene [50] still remains by endonuclease-independent LINE-1 retrotransposition. Na- ture Genetics. 2002;31(2):159–165. to be clarified. [14] Stenson PD, Ball EV, Mort M, et al. Human Gene Mu- tation Database (HGMD): 2003 update. Human Mutation. 
ACKNOWLEDGMENTS 2003;21(6):577–581. [15] Conley ME, Partain JD, Norland SM, Shurtleff SA, Kazazian We are grateful to Peter Stenson (Cardiff, UK) for provid- HH Jr. Two independent retrotransposon insertions at the ing HGMD data and to Angus Clarke (Cardiff,UK)forhelp- same site within the coding region of BTK. Human Mutation. 2005;25(3):324–325. ful discussions about the possible causes of mutation detec- tion bias. Jian-Min Chen is a Visiting Professor of genetics [16] Kazazian HH Jr. An estimated frequency of endogenous inser- tional mutations in humans. Nature Genetics. 1999;22(2):130. supported by the MinisteredelaJeunesse,del’` Education´ [17] Kutsche K, Ressler B, Katzera HG, et al. Characterization of Nationale et de la Recherche, France. This work was sup- breakpoint sequences of five rearrangements in L1CAM and ported by the Institut National de la SanteetdelaRecherche´ ABCD1 (ALD) genes. Human Mutation. 2002;19(5):526–535. Medicale´ (INSERM), France. [18]EmersonJJ,KaessmannH,BetranE,LongM.Extensive gene traffic on the mammalian X chromosome. Science. REFERENCES 2004;303(5657):537–540. [19] Khil PP, Oliver B, Camerini-Otero RD. X for intersection: [1] Lander ES, Linton LM, Birren B, et al. Initial sequencing and retrotransposition both on and off the X chromosome is more analysis of the human genome. Nature. 2001;409(6822):860– frequent. Trends in Genetics. 2005;21(1):3–7. 921. [20] Ross MT, Grafham DV, Coffey AJ, et al. The DNA sequence of [2] Brouha B, Schustak J, Badge RM, et al. Hot L1s account the human X chromosome. Nature. 2005;434(7031):325–337. for the bulk of retrotransposition in the human population. [21] Teugels E, De Brakeleer S, Goelen G, Lissens W, Sermijn E, De Proceedings of the National Academy of Sciences of the United Greve J. De novo Alu element insertions targeted to a sequence States of America. 2003;100(9):5280–5285. common to the BRCA1 and BRCA2 genes. Human Mutation. [3] Luan DD, Korman MH, Jakubczak JL, Eickbush TH. 
Reverse 2005;26(3):284. transcription of R2Bm RNA is primed by a nick at the chro- [22] van den Hurk JA, van de Pol DJ, Wissinger B, et al. Novel types mosomal target site: a mechanism for non-LTR retrotranspo- of mutation in the choroideremia (CHM) gene: a full-length L1 sition. Cell. 1993;72(4):595–605. insertion and an intronic mutation activating a cryptic exon. [4] Feng Q, Moran JV, Kazazian HH Jr, Boeke JD. Human L1 Human Genetics. 2003;113(3):268–275. retrotransposon encodes a conserved endonuclease required [23] Meischl C, de Boer M, Ahlin˚ A, Roos D. A new exon created by for retrotransposition. Cell. 1996;87(5):905–916. intronic insertion of a rearranged LINE-1 element as the cause 8 Journal of Biomedicine and Biotechnology

of chronic granulomatous disease. European Journal of Human Genetics. 2000;8(9):697–703.
[24] Schwahn U, Lenzner S, Dong J, et al. Positional cloning of the gene for X-linked retinitis pigmentosa 2. Nature Genetics. 1998;19(4):327–332.
[25] Miki Y, Katagiri T, Kasumi F, Yoshimoto T, Nakamura Y. Mutation analysis in the BRCA2 gene in primary breast cancers. Nature Genetics. 1996;13(2):245–247.
[26] Claverie-Martín F, Flores C, Antón-Gamero M, González-Acosta H, García-Nieto V. The Alu insertion in the CLCN5 gene of a patient with Dent’s disease leads to exon 11 skipping. Journal of Human Genetics. 2005;50(7):370–374.
[27] Sobrier M-L, Netchine I, Heinrichs C, et al. Alu-element insertion in the homeodomain of HESX1 and aplasia of the anterior pituitary. Human Mutation. 2005;25(5):503.
[28] Mustajoki S, Ahola H, Mustajoki P, Kauppinen R. Insertion of Alu element responsible for acute intermittent porphyria. Human Mutation. 1999;13(6):431–438.
[29] Blencowe BJ. Exonic splicing enhancers: mechanism of action, diversity and role in human genetic diseases. Trends in Biochemical Sciences. 2000;25(3):106–110.
[30] Faustino NA, Cooper TA. Pre-mRNA splicing and human disease. Genes & Development. 2003;17(4):419–437.
[31] Zheng ZM. Regulation of alternative RNA splicing by exon definition and exon sequences in viral and mammalian gene expression. Journal of Biomedical Science. 2004;11(3):278–294.
[32] Claverie-Martín F, González-Acosta H, Flores C, Antón-Gamero M, García-Nieto V. De novo insertion of an Alu sequence in the coding region of the CLCN5 gene results in Dent’s disease. Human Genetics. 2003;113(6):480–485.
[33] Baker KE, Parker R. Nonsense-mediated mRNA decay: terminating erroneous gene expression. Current Opinion in Cell Biology. 2004;16(3):293–299.
[34] Maquat LE. Nonsense-mediated mRNA decay: splicing, translation and mRNP dynamics. Nature Reviews Molecular Cell Biology. 2004;5(2):89–99.
[35] Ganguly A, Dunbar T, Chen P, Godmilow L, Ganguly T. Exon skipping caused by an intronic insertion of a young Alu Yb9 element leads to severe hemophilia A. Human Genetics. 2003;113(4):348–352.
[36] Oldridge M, Zackai EH, McDonald-McGinn DM, et al. De novo alu-element insertions in FGFR2 identify a distinct pathological basis for Apert syndrome. The American Journal of Human Genetics. 1999;64(2):446–461.
[37] Wallace MR, Andersen LB, Saulino AM, Gregory PE, Glover TW, Collins FS. A de novo Alu insertion results in neurofibromatosis type 1. Nature. 1991;353(6347):864–866.
[38] Tighe PJ, Stevens SE, Dempsey S, Le Deist F, Rieux-Laucat F, Edgar JD. Inactivation of the Fas gene by Alu insertion: retrotransposition in an intron causing splicing variation and autoimmune lymphoproliferative syndrome. Genes & Immunity. 2002;3(suppl 1):S66–S70.
[39] Zhang Y-H, Dipple KM, Vilain E, et al. AluY insertion (IVS4-52ins316alu) in the glycerol kinase gene from an individual with benign glycerol kinase deficiency. Human Mutation. 2000;15(4):316–323.
[40] Martínez-Garay I, Ballesta MJ, Oltra S, et al. Intronic L1 insertion and F268S, novel mutations in RPS6KA3 (RSK2) causing Coffin-Lowry syndrome. Clinical Genetics. 2003;64(6):491–496.
[41] Divoky V, Indrak K, Mrug M, Brabec V, Huisman THJ, Prchal JT. A novel mechanism of β thalassemia: the insertion of L1 retrotransposable element into β globin IVS II. Blood. 1996;88(suppl 1):148a.
[42] Han JS, Szak ST, Boeke JD. Transcriptional disruption by the L1 retrotransposon and implications for mammalian transcriptomes. Nature. 2004;429(6989):268–274.
[43] Dombroski BA, Mathias SL, Nanthakumar E, Scott AF, Kazazian HH Jr. Isolation of an active human transposable element. Science. 1991;254(5039):1805–1808.
[44] Perepelitsa-Belancio V, Deininger P. RNA truncation by premature polyadenylation attenuates human mobile element activity. Nature Genetics. 2003;35(4):363–366.
[45] Yoshida K, Nakamura A, Yazaki M, Ikeda S, Takeda S. Insertional mutation by transposable element, L1, in the DMD gene results in X-linked dilated cardiomyopathy. Human Molecular Genetics. 1998;7(7):1129–1132.
[46] Nakamura A, Ikeda S, Yazaki M, et al. Up-regulation of the brain and Purkinje-cell forms of dystrophin transcripts, in Becker muscular dystrophy. The American Journal of Human Genetics. 1997;60(6):1555–1558.
[47] Bastianutto C, Bestard JA, Lahnakoski K, et al. Dystrophin muscle enhancer 1 is implicated in the activation of non-muscle isoforms in the skeletal muscle of patients with X-linked dilated cardiomyopathy. Human Molecular Genetics. 2001;10(23):2627–2635.
[48] Rohrer J, Minegishi Y, Richter D, Eguiguren J, Conley ME. Unusual mutations in Btk: an insertion, a duplication, an inversion, and four large deletions. Clinical Immunology. 1999;90(1):28–37.
[49] Hassoun H, Coetzer TL, Vassiliadis JN, et al. A novel mobile element inserted in the alpha spectrin gene: spectrin dayton. A truncated alpha spectrin associated with hereditary elliptocytosis. The Journal of Clinical Investigation. 1994;94(2):643–648.
[50] Wilund KR, Yi M, Campagna F, et al. Molecular mechanisms of autosomal recessive hypercholesterolemia. Human Molecular Genetics. 2002;11(24):3019–3030.
[51] Kobayashi K, Nakahori Y, Miyake M, et al. An ancient retrotransposal insertion causes Fukuyama-type congenital muscular dystrophy. Nature. 1998;394(6691):388–392.
[52] Su LK, Steinbach G, Sawyer JC, Hindi M, Ward PA, Lynch PM. Genomic rearrangements of the APC tumor-suppressor gene in familial adenomatous polyposis. Human Genetics. 2000;106(1):101–107.
[53] Beauchamp NJ, Makris M, Preston FE, Peake IR, Daly ME. Major structural defects in the antithrombin gene in four families with type I antithrombin deficiency: Partial/complete deletions and rearrangement of the antithrombin gene. Thrombosis and Haemostasis. 2000;83(5):715–721.
[54] Narita N, Nishio H, Kitoh Y, et al. Insertion of a 5′ truncated L1 element into the 3′ end of exon 44 of the dystrophin gene resulted in skipping of the exon during splicing in a case of Duchenne muscular dystrophy. The Journal of Clinical Investigation. 1993;91(5):1862–1867.
[55] Kondo-Iida E, Kobayashi K, Watanabe M, et al. Novel mutations and genotype-phenotype relationships in 107 families with Fukuyama-type congenital muscular dystrophy (FCMD). Human Molecular Genetics. 1999;8(12):2303–2309.
[56] Miki Y, Nishisho I, Horii A, et al. Disruption of the APC gene by a retrotransposal insertion of L1 sequence in a colon cancer. Cancer Research. 1992;52(3):643–645.
[57] Brouha B, Meischl C, Ostertag EM, et al. Evidence consistent with human L1 retrotransposition in maternal meiosis I. The American Journal of Human Genetics. 2002;71(2):327–336.
[58] Holmes SE, Dombroski BA, Krebs CM, Boehm CD, Kazazian HH Jr. A new retrotransposable human L1 element from the LRE2 locus on chromosome 1q produces a chimaeric insertion. Nature Genetics. 1994;7(2):143–148.
[59] Li X, Scaringe WA, Hill KA, et al. Frequency of recent retrotransposition events in the human factor IX gene. Human Mutation. 2001;17(6):511–519.
[60] Mukherjee S, Mukhopadhyay A, Banerjee D, Chandak GR, Ray K. Molecular pathology of haemophilia B: identification of five novel mutations including a LINE 1 insertion in Indian patients. Haemophilia. 2004;10(3):259–263.
[61] Kimberland ML, Divoky V, Prchal J, Schwahn U, Berger W, Kazazian HH Jr. Full-length human L1 insertions retain the capacity for high frequency retrotransposition in cultured cells. Human Molecular Genetics. 1999;8(8):1557–1560.
[62] Halling KC, Lazzaro CR, Honchel R, et al. Hereditary desmoid disease in a family with a germline Alu I repeat mutation of the APC gene. Human Heredity. 1999;49(2):97–102.
[63] Muratani K, Hada T, Yamamoto Y, et al. Inactivation of the cholinesterase gene by Alu insertion: possible mechanism for human gene transposition. Proceedings of the National Academy of Sciences of the United States of America. 1991;88(24):11315–11319.
[64] Lester T, McMahon C, Van Regemorter N, Jones A, Genet S. X-linked immunodeficiency caused by insertion of Alu repeat sequences. Journal of Medical Genetics. 1997;34(suppl 1):S81.
[65] Ostertag EM, Kazazian HH Jr. Biology of mammalian L1 retrotransposons. Annual Review of Genetics. 2001;35:501–538.
[66] Janicic N, Pausova Z, Cole DE, Hendy GN. Insertion of an Alu sequence in the Ca2+-sensing receptor gene in familial hypocalciuric hypercalcemia and neonatal severe hyperparathyroidism. The American Journal of Human Genetics. 1995;56(4):880–886.
[67] den Hollander AI, ten Brink JB, de Kok YJ, et al. Mutations in a human homologue of Drosophila crumbs cause retinitis pigmentosa (RP12). Nature Genetics. 1999;23(2):217–221.
[68] Abdelhak S, Kalatzis V, Heilig R, et al. Clustering of mutations responsible for branchio-oto-renal (BOR) syndrome in the eyes absent homologous region (eyaHR) of EYA1. Human Molecular Genetics. 1997;6(13):2247–2255.
[69] Vidaud D, Vidaud M, Bahnak BR, et al. Haemophilia B due to a de novo insertion of a human-specific Alu subfamily member within the coding region of the factor IX gene. European Journal of Human Genetics. 1993;1(1):30–36.
[70] Wulff K, Gazda H, Schroder W, Robicka-Milewska R, Herrmann FH. Identification of a novel large F9 gene mutation: An insertion of an Alu repeated DNA element in exon e of the factor 9 gene. Human Mutation. 2000;15(3):299.
[71] Stoppa-Lyonnet D, Carter PE, Meo T, Tosi M. Clusters of intragenic Alu repeats predispose the human C1 inhibitor locus to deleterious rearrangements. Proceedings of the National Academy of Sciences of the United States of America. 1990;87(4):1551–1555.
[72] Ishihara N, Yamada K, Yamada Y, et al. Clinical and molecular analysis of Mowat-Wilson syndrome associated with ZFHX1B mutations and deletions at 2q22-q24.1. Journal of Medical Genetics. 2004;41(5):387–393.
[73] Sukarova E, Dimovski AJ, Tchacarova P, Petkov GH, Efremov GD. An Alu insert as the cause of a severe form of hemophilia A. Acta Haematologica. 2001;106(3):126–129.
[74] Jalanko A, Manninen T, Peltonen L. Deletion of the C-terminal end of aspartylglucosaminidase resulting in a lysosomal accumulation disease: evidence for a unique genomic rearrangement. Human Molecular Genetics. 1995;4(3):435–441.
[75] Wang T, Lerer I, Gueta Z, et al. A deletion/insertion mutation in the BRCA2 gene in a breast cancer family: a possible role of the Alu-polyA tail in the evolution of the deletion. Genes, Chromosomes and Cancer. 2001;31(1):91–95.
[76] Segal Y, Peissel B, Renieri A, et al. LINE-1 elements at the sites of molecular rearrangements in Alport syndrome-diffuse leiomyomatosis. The American Journal of Human Genetics. 1999;64(1):62–69.

Hindawi Publishing Corporation Journal of Biomedicine and Biotechnology Volume 2006, Article ID 71753, Pages 1–16 DOI 10.1155/JBB/2006/71753

Research Article

L1 Antisense Promoter Drives Tissue-Specific Transcription of Human Genes

Kert Mätlik, Kaja Redik, and Mart Speek

Department of Gene Technology, Tallinn University of Technology, Akadeemia tee 15, Tallinn 19086, Estonia

Received 26 July 2005; Revised 11 November 2005; Accepted 16 November 2005

Transcription of transposable elements interspersed in the genome is controlled by complex interactions between their regulatory elements and host factors. However, the same regulatory elements may be occasionally used for the transcription of host genes. One such example is the human L1 retrotransposon, which contains an antisense promoter (ASP) driving transcription into adjacent genes yielding chimeric transcripts. We have characterized 49 chimeric mRNAs corresponding to sense and antisense strands of human genes. Here we show that L1 ASP is capable of functioning as an alternative promoter, giving rise to a chimeric transcript whose coding region is identical to the ORF of mRNA of the following genes: KIAA1797, CLCN5, and SLCO1A2. Furthermore, in these cases the activity of L1 ASP is tissue-specific and may expand the expression pattern of the respective gene. The activity of L1 ASP is tissue-specific also in cases where L1 ASP produces antisense RNAs complementary to COL11A1 and BOLL mRNAs. Simultaneous assessment of the activity of L1 ASPs in multiple loci revealed the presence of L1 ASP-derived transcripts in all human tissues examined. We also demonstrate that L1 ASP can act as a promoter in vivo and predict that it has a heterogeneous transcription initiation site. Our data suggest that L1 ASP-driven transcription may increase the transcriptional flexibility of several human genes.

Copyright © 2006 Kert Mätlik et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

INTRODUCTION

Non-LTR and LTR retrotransposons are the two most abundant classes of transposable elements that contain regulatory regions (promoter, enhancer, and polyadenylation signal) necessary for their transcription and transposition [9]. Although most of the non-LTR retrotransposons and all the LTR retrotransposons in the human genome have lost their transpositional competence due to broken ORFs, a large number of them have retained regulatory sequences [10]. Scattered all over the chromosomes, retrotransposons can affect the regulation of host genes' transcription.

Recent studies carried out in several laboratories have revealed that LTR retrotransposons, such as an intracisternal A-particle in mice [11], endogenous retroviruses in humans and mice [12], and Wis 2-1A in wheat [13], can influence transcription of adjacent genes. Similarly, two families of non-LTR retrotransposons, L1 [3] and B2 SINE [14], have been shown to drive transcription of human and mouse genes, respectively. It has been shown that the effect of retrotransposons on the host gene expression depends on their epigenetic status and thus may cause phenotypic variation between genetically identical individuals [15].

Retrotransposons may provide alternative promoters for host genes. Here, the known examples include LTR-mediated transcription of Agouti [16], PTN [17], apoC-I, EDNRB [18], CYP19 [19], β3GAL-T5 [20], and SPAM1 genes [21]. We have previously shown that in transformed cells many human genes are transcribed from the L1 ASPs located in introns of these genes [2, 3].

To reveal the possible function of L1 ASP as an alternative promoter of human genes, we carried out a systematic search for additional chimeric L1 ESTs/mRNAs deposited in GenBank. Here we describe 49 chimeric mRNAs generated by L1 ASP-driven transcription. Four of these chimeras differ from the bona fide mRNAs by 5′ untranslated region (UTR) and another four (antisense RNAs) have regions complementary to exons of known mRNAs. Based on these bioinformatic data, we show that L1 ASP is capable of functioning as an alternative promoter in normal human tissues and drives tissue-specific transcription of several human genes.

METHODS

Computational analysis

The search and analysis of chimeric L1 transcript sequences derived from the human subset of EST division of GenBank,

Table 1: Primers used for the detection of mRNAs, chimeric mRNAs, and L1 splice forms.

| mRNA | Forward primer | Reverse primer | Annealing temperature |
|---|---|---|---|
| AL711955 | CTTGTGGCAGAAGGGAGAAG | GCAGCAGAGAGGACTTTGG | 65°C |
| (L1) KIAA1797 | TCTCAGACTGCTGTGCTA | GCAGCAGAGAGGACTTTGG | 60°C |
| CLCN5 (uP) | GGAGAAAACAGGGCCACATA | CATGCTCAGAGTTCCAGCAA | 60°C |
| CLCN5 (dP) | GACCCTTTTGTCTCCCTTCC | CATGCTCAGAGTTCCAGCAA | 60°C |
| L1-CLCN5 | CTGCTGTGCTAGCAATCAGC | CATGCTCAGAGTTCCAGCAA | 60°C |
| SLCO1A2 | AAAGCGTTCCAGGTATTTTTG | GCTCTTCAGGGTGTTCCAAG | 55°C |
| L1-SLCO1A2 | CTGCTGTGCTAGCAATCAGC | GCTCTTCAGGGTGTTCCAAG | 60°C |
| MET | ACGGTCCAAAGGGAAACTCT | CCTTGTAGATTGCAGGCAGAC | 60°C |
| L1-MET-A | CTGCTGTGCTAGCAATCAGC | CCTTGTAGATTGCAGGCAGAC | 60°C |
| L1-MET-B | CTAAGCAAGCCTGGGCAATG | CCTTGTAGATTGCAGGCAGAC | 60°C |
| L1-MET-C | TTCCCGGCTGCTTTGTTTAC | CCTTGTAGATTGCAGGCAGAC | 60°C |
| L1-MET-D | GGCTCCACCCAGTTCGAGCT | CCTTGTAGATTGCAGGCAGAC | 60°C |
| L1-MET-E | AGGCAGGCCTCCTTGAGCTCTG | CCTTGTAGATTGCAGGCAGAC | 60°C |
| L1-MET-F | AGGTGGAGCCTACAGAGGCAG | CCTTGTAGATTGCAGGCAGAC | 60°C |
| L1-MET-G | TGCAGAGGTTACTGCTGTCT | CCTTGTAGATTGCAGGCAGAC | 60°C |
| COL11A1 | GGATTTCAAGGCAAGACCG | TTTGCACCTTCTTTTCCTGC | 55°C |
| L1-COL11A1 | CTGCTGTGCTAGCAATCAGC | TAGGGTGATCCAGGTCCTCA | 60°C |
| BOLL | CGCAAACATCAAACCAGATG | TACTGTGTGGTGGCCTGGTA | 60°C |
| L1-BOLL | CTGCTGTGCTAGCAATCAGC | GCCTTCAAATGCAGGACTGT | 60°C |
| L1 II sp v(1) | CTCCCCCAGCCTCGCTGC | GGTTCATCTCACTGGCTC | 60°C |
| L1 IV sp v(1) | CTGCTGTGCTAGCAATCAGC | GGTTCATCTCACTGGAAA | 55°C |

(1) sp v stands for splice variant.

EMBL, and DDBJ was carried out by using the strategy described earlier [2]. The alignment of EST and mRNA sequences to genomic contigs was done with SPIDEY [1] and confirmed with the human genome browser available at University of California, Santa Cruz [5]. BLAST [6], BLAST2 sequences [7], and SPIDEY programs, used in the analysis of sequences of RT-PCR products, were run on the National Center for Biotechnology Information BLAST network service using default parameters.

The transcriptional start sites in the DBTSS [22] were mapped using BLASTN [6]. The accession numbers of the respective one-pass cDNA entries were OFR00417, CNR02292, KAR05296, TDR09332, T3R04859, TDR07820, KMR03236, HKR11044, KMR01202, COL02332, KMR02654, TDR05153, TDR04283, T3R08474, T3R07002, TDR08640, DMC04507, HKR03051, T7R06886, T3R04414, 29R05294, OFR01051, T3R00241, and HKR11121. Splice site search was done with NNSPLICE 0.9 [23] and NetGene2 [24].

RT-PCR, Southern blot, and sequence analysis

PCR amplification of the human cDNAs of the multiple tissue cDNA (MTC) panels I and II (BD Biosciences Clontech) was carried out using recombinant Taq polymerase and Taq buffer with (NH4)2SO4, 2.0 mM MgCl2, 0.2 mM dNTP (Fermentas), and 0.75 μM primers. Each reaction contained 0.5 μl cDNA and 0.5 units of Taq polymerase in a final volume of 10 μl. After cDNA denaturation at 95°C for 1 minute, amplification (35–40 cycles) was carried out by using the following cycling profile: 95°C 30 s, 55°–65°C 30 s, and 72°C 30 s for products < 0.5 Kb or 1 minute for products > 0.5 Kb. Primers and annealing temperatures used are given in the supplementary table (Table 1). The locations of primers are shown in Figures 1 and 2. PCR products were sized on 1–2% agarose gels and analyzed by restriction mapping. After gel elution, their sequences were determined from both ends using BigDye Terminator cycle sequencing kit (Applied Biosystems).

First strand L1-MET cDNA was synthesized with a reverse primer positioned in MET exon 5 (TATGGTCAGCCTTGTCCCTC) using total RNA isolated from human teratocarcinoma cell line (NTera2D1) and RevertAid H minus M-MuLV reverse transcriptase (Fermentas). This cDNA was denatured at 95°C for 1 minute and amplified (30 cycles, see above) using one of the primer pairs (L1-MET-A-G) shown in Table 1. For Southern blot analysis, the RT-PCR products obtained were sized on an agarose gel, transferred to a nylon membrane, and hybridized with a riboprobe specific to MET exons 2–5. Hybridization-positive products were detected by autoradiography.
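The cycling profile described above can be written down as a small data structure, which makes the dependence of the extension step on product size explicit. The following is only a minimal sketch: the names `Step`, `extension_seconds`, `cycling_profile`, and `total_hold_seconds` are hypothetical, the handling of a product of exactly 500 bp is an assumption (the text only gives < 0.5 Kb and > 0.5 Kb), and ramping time between temperatures is ignored.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


def extension_seconds(product_bp: int) -> int:
    # Text: 30 s for products < 0.5 Kb, 1 minute for products > 0.5 Kb.
    # Assumption: a product of exactly 500 bp is treated as "long".
    return 30 if product_bp < 500 else 60


def cycling_profile(annealing_c: float, product_bp: int) -> Tuple[Step, List[Step]]:
    """Return (initial denaturation step, steps of one amplification cycle)."""
    if not 55.0 <= annealing_c <= 65.0:
        raise ValueError("annealing temperature outside the 55-65 C range of Table 1")
    initial = Step("initial denaturation", 95.0, 60)
    cycle = [
        Step("denaturation", 95.0, 30),
        Step("annealing", annealing_c, 30),
        Step("extension", 72.0, extension_seconds(product_bp)),
    ]
    return initial, cycle


def total_hold_seconds(annealing_c: float, product_bp: int, cycles: int) -> int:
    """Sum of programmed hold times only (no ramping, no final hold)."""
    initial, cycle = cycling_profile(annealing_c, product_bp)
    return initial.seconds + cycles * sum(step.seconds for step in cycle)
```

For example, under these assumptions a 315 bp product amplified for 35 cycles at 60°C annealing gives 60 + 35 × 90 = 3210 s of programmed hold time.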

[Figure 1 panels: RT-PCR results across tissues 1–16 and locus schemes for (a) AL711955 and KIAA1797, (b) CLCN5, (c) SLCO1A2, and (d) MET.]

Figure 1: Distribution of chimeric mRNAs derived from the L1 ASP as an alternative promoter. The presence of native mRNAs derived from a gene predicted by (a) AL711955 and KIAA1797, (b) CLCN5 [25], (c) SLCO1A2, (d) MET proto-oncogene [26], and their corresponding chimeric transcripts is shown at the upper and lower RT-PCR panels. cDNAs were derived from the following human tissues: 1, thymus; 2, prostate; 3, spleen; 4, small intestine; 5, colon; 6, ovary; 7, testis; 8, peripheral blood leukocytes; 9, placenta; 10, skeletal muscle; 11, brain; 12, kidney; 13, heart; 14, lung; 15, pancreas; and 16, liver. GenBank accession numbers for each mRNA and chimeric L1 mRNA are shown. Product sizes are shown on the left of each panel and below the forward primer on the scheme. L1 (PA2 or Ta subfamily) is shown by a large box with the 5′ UTR region indicated in red and its orientation is marked by an arrow. Exons are marked by open boxes (not to scale). Splicing schemes are shown by lines. The location of the translation initiation codon is marked by ATG. Primers used in PCR are shown by arrowheads below the exons. (b) Exons transcribed from the CLCN5 upstream promoter [uP] are designated with −1a to −4a. (c) A 315 bp RT-PCR product corresponds to L1-SLCO1A2 transcript derived from the upstream L1 ASP (L1 ASP2), but not from the L1 ASP (L1 ASP1) predicted by the EST (BX955947). (d) A minor L1-MET splice variant is shown by a broken line. P stands for promoter and dP stands for downstream promoter.

RESULTS

L1 ASP is predicted to function as an alternative promoter

We have previously characterized 9 out of 25 ESTs representing the L1 ASP-driven transcription of human genes [2]. Using the strategy described earlier [2] and an updated version of the dbEST (12 May 2004), we extended our search to reveal chimeric transcripts derived from an L1 ASP acting as a sole/alternative promoter or driving antisense transcription of host gene. Our search revealed 81 ESTs containing the opposite strand of L1 5′ UTR, followed by a region identical to a cellular mRNA or random genomic sequence. Of this large number of chimeric transcripts, 49 ESTs represented mRNAs derived from the genes annotated in RefSeq database [8] (see the supplementary table (Table 2)). The remaining 32 ESTs contained noncoding or repetitive DNA sequences (Alus, MIR, LTR, L1, etc) spliced to the L1 5′ UTR. Since they contained only short ORFs (< 100 aa) and had no similarity to known proteins, as revealed by BLASTP analysis, they were not analyzed further.
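The short-ORF criterion used to set aside the 32 noncoding chimeras can be illustrated with a simple ORF scan. This is a sketch under stated assumptions, not the procedure used in the study: the function names are hypothetical, only the three forward frames are scanned (assuming the EST is given in transcript orientation), ORFs must start with ATG, an ORF that runs off the 3′ end of a truncated EST is still counted, and the BLASTP similarity check mentioned in the text is not reproduced.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_aa(seq: str) -> int:
    """Length (in amino acids, stop excluded) of the longest ATG-initiated
    ORF in the three forward reading frames of seq."""
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None  # index of the ATG of the ORF currently being read
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif codon in STOP_CODONS:
                best = max(best, (i - start) // 3)
                start = None
        if start is not None:
            # Assumption: count an ORF that reaches the end of the sequence,
            # since ESTs are often truncated before the stop codon.
            best = max(best, (len(seq) - start) // 3)
    return best


def has_only_short_orfs(seq: str, threshold_aa: int = 100) -> bool:
    """True if no forward-frame ORF reaches threshold_aa (the < 100 aa cutoff)."""
    return longest_orf_aa(seq) < threshold_aa
```

A sequence passing `has_only_short_orfs` would still need the protein-similarity check before being classed as noncoding in the sense used above.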

[Figure 2 panels: RT-PCR results across tissues 1–16 and locus schemes for (a) COL11A1 (NM_001854, 161 bp) and L1-COL11A1 (AV693621, 244 bp), and (b) BOLL (NM_033030, 583 bp) and L1-BOLL (BE866323, 211 bp and 242 bp).]

Figure 2: Distribution of antisense RNAs derived from L1 ASP. The presence of mRNAs derived from (a) COL11A1 [27], (b) BOLL [28], and their antisense RNAs is shown at the upper and lower RT-PCR panels, respectively. The exons of antisense RNAs L1-COL11A1 and L1-BOLL complementary to exon 40 of COL11A1 and exon 6 of BOLL are shown as grey boxes. (a) COL11A1 exons 38–47. (b) Two L1-BOLL splice variants and a nonspecific product, marked by an asterisk, are presented. A 211 bp product derived from the L1 ASP1 is identical to EST BE866323 (splicing scheme III). A novel 242 bp product generated from the L1 ASP2 corresponds to the splicing scheme V. For the remaining description details, see Figure 1 legend.

Because of our interest in the L1 ASP-driven transcription of human genes, we carried out a detailed analysis of the 49 chimeric ESTs (Table 2). While most of the ESTs (40 out of 49) corresponded to mRNAs generated from the L1 ASPs of full-length L1s located in introns, 7 ESTs/mRNAs (NM_017794, BP351387, BM557937, CF593264, BP358215, BX955947, and BU176833) were derived from L1 ASPs located upstream of genes. In these 7 cases, L1 ASP may function as an alternative promoter. Four of these cases (NM_017794, BP351387, BX955947, and BP358215) represented chimeric mRNAs that contained the first coding exon of the gene. Thus, their translation could produce proteins identical to those encoded by the respective gene (Table 3). These genes encoded hypothetical protein KIAA1797 (possibly involved in mitotic chromosome condensation), CLCN5 (chloride channel 5) [25], SLCO1A2 (solute carrier organic anion transporter family member 1A2), and RGS6 (regulator of G-protein signalling 6) [29]. For the remaining three ESTs, splicing occurred within the coding sequence, giving rise to the chimeric mRNA lacking bona fide translation initiation signals. Since translation initiation signals are commonly located in the second exon of mammalian mRNAs [30], an L1 ASP located in the first intron could also give rise to a translatable chimeric mRNA. Of the 3 ESTs (BM910612, BE735854, and BP352155) derived from such L1 ASPs, only one (BE735854) had translation initiation signals matching those of the bona fide mRNA.

Of the 49 ESTs/mRNAs analyzed, 45 chimeras matched the orientation of the respective gene, while 4 ESTs had regions complementary to the exons of known mRNAs and thus were derived from the opposite strand of the gene (Table 3).

Two of these ESTs (CB960713 and AV693621) were derived from the L1 ASPs located in the intron 25 of ABCA9

Table 2: Widespread L1 ASP-driven transcription of human genes revealed from ESTs/mRNAs.

| EST(1) | Source(2) | Similarity to L1 5′ UTR opposite strand(3) | Similarity to known mRNA(4) | Location in the genome(5) | Orientation(6) |
|---|---|---|---|---|---|
| Type I splicing (1 EST) | | | | | |
| BU943355++ (4 ex) | Pool of 40 cell line polyA+ | 4–59 ≡ 592–647 (96%); 60–289 ≡ 762–990 (96%); L1PA3 AC007780 | Arylsulfatase G, NM_014960; 331–649 ≡ 1342–1660 (99%) | NT_010641 (chr 17) 10/11 | Sense |
| Type II splicing (2 ESTs) | | | | | |
| CD642260 (4 ex) | Embryonic stem cell line WA01/H1 | 12–117 ≡ 542–647 (97%); 118–230 ≡ 878–990 (96%); L1PA2 AC022762 | Olfactory receptor, family 56, subfamily B, member 4, NM_001005181; 373–728 ≡ 802–443 (98%) | NT_009237 (chr 11) 3′/1 | Antisense |
| NM_017794 (46 ex) | RA-induced NT2 neuronal precursor cells | 4–150 ≡ 501–647 (93%); 151–262 ≡ 878–990 (93%); L1P AL354879 | Hypothetical protein KIAA1797, AL711955*; 331–834 ≡ 60–563 (99%) | NT_008413 (chr 9) 5′/45 | Sense |
| Type III splicing (22 ESTs) | | | | | |
| BM910612 (6 ex) | Brain, astrocytoma grade IV cell line | 1–134 ≡ 514–647 (98%); L1Ta (Hs) AC011597 | Fibronectin type III domain containing 6 (cytokine receptor), NM_144717; 268–915 ≡ 336–982 (98%) | NT_086641 (chr 3) 1/7 | Sense |
| BF676152 (3 ex) | Prostate | 4–126 ≡ 520–647 (91%); L1PA2 AC097061 | Hypothetical protein BC014608, NM_138796; 127–713 ≡ 422–1005 (91%) | NT_021877 (chr 1) 5/11 | Sense |
| AU123136++ (7 ex) | Uninduced NT2 cell line | 1–125 ≡ 523–647 (96%); L1PA2 AC079005 | Breast carcinoma amplified sequence 3, NM_017679; 126–623 ≡ 710–1208 (99%) | NT_010783 (chr 17) 9/24 | Sense |
| AA226814+ (3 ex) | Ntera-2 neuroepithelial cells | 1–111 ≡ 538–649 (93%); L1PA2 AC018470 | Secernin 3 (dipeptidase), NM_024583; 112–347 ≡ 843–1075 (96%) | NT_005403 (chr 2) 5/8 | Sense |
| BU959632 (5 ex) | Pool of 40 cell line polyA+ | 4–45 = 606–647; L1Ta (Hs) AC008496 | Cardiomyopathy associated 5, NM_153610; 46–559 ≡ 3235–3748 (97%) | NT_006713 (chr 5) 8/12 | Sense |
| BF208095+ (6 ex) | Bladder carcinoma cell line | 2–57 ≡ 592–647 (94%); L1PA2 AC002080 | Hepatocyte growth factor receptor (MET proto-oncogene), NM_000245; 132–456 ≡ 1387–1714 (99%); 462–663 ≡ 1805–2013 (92%) | NT_007927 (chr 7) 2/21 | Sense |
| AA220950+ (3 ex) | Ntera-2 neuroepithelial cells | 1–39 ≡ 609–647 (89%); L1PA3 AC022261 | Dynein, cytoplasmic, intermediate polypeptide 1, NM_004411; 40–247 ≡ 613–818 (95%) | NT_007910 (chr 7) 5/17 | Sense |
| BM557937 (7 ex) | Brain astrocytoma grade IV cell line | 1–110 ≡ 538–647 (93%); L1PA3 AC022748 | Cholinergic receptor, nicotinic, beta polypeptide 4, NM_000750; 410–713 ≡ 168–471 (99%) | NT_024654 (chr 15) 5′/6 | Sense |
| BG335812 (> 6 ex) | Placenta choriocarcinoma cell line | 2–105 ≡ 544–647 (93%); L1PA2 AC009949 | Nuclear antigen Sp100, NM_003113; 106–522 ≡ 139–556 (90%) | NT_005403 (chr 2) 2/25 | Sense |
| BE865812+ (4 ex) | Bladder carcinoma cell line | 1–43 ≡ 605–647 (97%); L1Ta (Hs) AL049838 | Chromosome 14 open reading frame 37, NM_001001872; 44–343 ≡ 933–1228 (96%) | NT_025892 (chr 14) 5/7 | Sense |
| BE866323+ (4 ex) | Bladder carcinoma cell line | 1–92 ≡ 556–647 (96%); L1PA2 AC073058; L1PA2 AC020550 | Bol, boule-like (Drosophila), NM_033030; 145–204 ≡ 790–731 (98%) | NT_005246 (chr 2) 3′/11 | Antisense |
| BP352155 (5 ex) | Well-differentiated squamous cell carcinoma cell line TE13 | 1–113 ≡ 535–647 (96%); L1PA2 AC004519 | Hypothetical protein FLJ31340, BX346336*; 114–490 ≡ 500–876 (98%) | NT_086723 (chr 7) 1/>5 | Sense |
| BP351387 (5 ex) | Well-differentiated squamous cell carcinoma cell line TE13 | 1–67 = 581–647; L1Ta (Hs) AL663118 | Chloride channel 5, NM_000084; 213–583 = 243–613 | NT_086939 (chr X) 5′/12 | Sense |
| BP351082 (> 4 ex) | Well-differentiated squamous cell carcinoma cell line TE13 | 1–71 ≡ 576–647 (95%); L1PA3 AC114734 | Hypothetical protein MGC16169 (protein kinase), NM_033115; 72–593 ≡ 1913–2433 (99%) | NT_086651 (chr 4) 17/24 | Sense |
| BP369881 (6 ex) | Testis | 1–65 ≡ 581–647 (92%); L1PA3 AL136525 | WD repeat and FYVE domain containing 2, NM_052950; 66–570 ≡ 460–963 (99%) | NT_086801 (chr 13) 3/12 | Sense |
| AA226765 (3 ex) | Brain Ntera-2 neuroepithelial cells | 1–67 ≡ 581–647 (92%); L1PA3 AC025170 | Hypothetical protein FLJ35779, NM_152408; 68–356 ≡ 480–767 (97%) | NT_086677 (chr 5) 4/11 | Sense |
| CF593264 (> 5 ex) | Placenta | 29–95 ≡ 581–647 (95%); L1PA3 AL050323 | Phospholipase C, beta 1, NM_182734; 174–769 ≡ 103–692 (98%) | NT_011387 (chr 20) 5′/33 | Sense |
| BP873102 (5 ex) | Embryonal kidney cell line = "293" | 1–67 ≡ 581–647 (95%); L1PA2 AL022400 | RAB GTPase activating protein 1-like, NM_014857; 68–583 ≡ 731–1244 (95%) | NT_086598 (chr 1) 4/21 | Sense |
| CD110319 (2 ex) | Placenta "preeclamptic placenta" | 25–92 ≡ 580–647 (97%); L1PA2 AC004452 | FLJ16237 protein, NM_001004320; 93–568 ≡ 428–900 (97%) | NT_086703 (chr 7) 2/13 | Sense |
| BX476029 (5 ex) | Pooled from different tissues | 2–77 ≡ 572–647 (93%); L1PA3 AL121946 | Polycystic kidney and hepatic disease 1, NM_138694; 78–567 ≡ 7273–7762 (99%) | NT_007592 (chr 6) 43/67 | Sense |
| CB960713 (4 ex) | Placenta | 30–107 ≡ 570–647 (96%); L1PA3 AC005922 | ATP-binding cassette, subfamily A, NM_172386; 108–208 = 3283–3183 | NT_010641 (chr 17) 25/38 | Antisense |
| CD644604 (3 ex) | Embryonic stem cells, cell line = "WA01" | 14–115 ≡ 547–647 (94%); L1PA3 AC022029 | Catenin (cadherin-associated protein), alpha 3, NM_013266; 116–736 ≡ 755–1375 (98%) | NT_086771 (chr 10) 5/19 | Sense |
| Type IV splicing (1 EST) | | | | | |
| CF594290 (9 ex) | Placenta | 29–230 ≡ 531–732 (94%); 231–340 ≡ 878–988 (95%); L1PA2 AC022306 | Hypothetical protein FLJ32800, NM_152647; 354–451 = 1305–1402; 452–780 ≡ 1642–1964 (97%) | NT_010194 (chr 15) 5/16 | Sense |
| Type V splicing (19 ESTs) | | | | | |
| BE787024++ (3 ex) | Lung large cell carcinoma cell line | 17–215 ≡ 533–732 (98%); L1Ta (Hs) AC079750 | Activin A receptor, type IC, NM_145259; 216–752 ≡ 548–1086 (95%) | NT_005403 (chr 2) 2/9 | Sense |
| BE568884+ (4 ex) | Bladder carcinoma cell line | 1–178 ≡ 554–732 (97%) | CD96 antigen, NM_005816; 179–627 ≡ 659–1113 (97%) | NT_086640 (chr 3) 2/15 | Sense |
| BE617461++ (6 ex) | Colon adenocarcinoma cell line | 8–185 ≡ 553–732 (98%); L1PA2 AC092916 | RAB3A interacting protein, NM_175625; 186–738 ≡ 998–1556 (98%) | NT_086796 (chr 12) 3/10 | Sense |
| BE568818+ (3 ex) | Bladder carcinoma cell line | 1–163 ≡ 570–732 (93%); L1PA2 AC010585 | Secretory carrier membrane protein 1, NM_052822; 164–516 ≡ 717–1063 (97%) | NT_006713 (chr 5) 6/8 | Sense |
| BU858570 (2 ex) | Pool of 40 cell line polyA+ RNAs | 4–166 ≡ 571–732 (93%); L1PA2 AL691464 | Guanylate binding protein 1, NM_002053; 167–402 ≡ 259–494 (95%) | NT_004686 (chr 1) 2/11 | Sense |
| BF028725 (3 ex) | Bladder carcinoma cell line | 2–123 ≡ 612–732 (91%); L1PA2 AC004800 | Hypothetical protein FLJ36166, NM_182634; 124–264 ≡ 3282–3424 (95%) | NT_086704 (chr 7) 2/21 | Sense |
| AA224229+ (4 ex) | 6 week, differentiated, post-mitotic hNT neurons | 1–94 ≡ 640–732 (98%); L1Ta (Hs) AL365308 | Chromosome 6 open reading frame 170, NM_152730; 95–430 ≡ 2622–2957 (99%) | NT_086697 (chr 6) 22/30 | Sense |
| BG542212++ (> 3 ex) | Lung | 2–187 ≡ 547–732 (97%); L1Ta (Hs) AC096569 | Zinc finger protein 638, NM_014497; 188–638 ≡ 3576–4013 (92%) | NT_022184 (chr 2) 18/28 | Sense |
| AV693621 (2 ex) | Hepatocellular carcinoma | 1–172 ≡ 559–732 (93%); L1PA2 AL627203 | Collagen, type XI, alpha 1, variant A, NM_001854; 187–279 = 3433–3341 | NT_004623 (chr 1) 46/67 | Antisense |
| BE735854+ (6 ex) | Pancreas adenocarcinoma cell line | 1–95 ≡ 638–732 (93%); L1PA2 AC092903 | Similar to beta-1,4-mannosyltransferase, CD708577*; 95–387 ≡ 174–466 (99%) | NT_005588 (chr 3) 1/>5 | Sense |
| R64632 (4 ex) | Soares placenta Nb2HP | 1–52 = 681–732; L1PA2 AL713859 | Hypothetical protein FLJ10986, NM_018291; 53–406 ≡ 1319–1671 (98%) | NT_029223 (chr 1) 11/14 | Sense |
| BP352672 (4 ex) | Well-differentiated squamous cell carcinoma cell line TE13 | 1–126 ≡ 608–732 (94%); L1PA2 AL354711 | Chromosome 9 open reading frame 39, NM_017738; 127–603 = 152–631 | NT_008413 (chr 9) 2/23 | Sense |
| BP358215 (7 ex) | Mammary gland tumor cell line T47D | 1–147 ≡ 586–732 (92%); L1PA2 AL391749 | Regulator of G-protein signalling 6, NM_004296; 148–581 ≡ 188–621 (99%) | NT_026437 (chr 14) 5′/17 | Sense |
| H72033 (4 ex) | Soares breast 2NbHBst | 1–107 ≡ 626–732 (97%); L1PA2 AC079005 | Breast carcinoma amplified sequence 3, NM_017679; 108–370 ≡ 710–967 (95%) | NT_010783 (chr 17) 9/24 | Sense |
| CA488981 (3 ex) | Cell line = ZR-75-1, MCF7, SK-BR-3, MDA-MB-231, hTERT-HME1, LNCaP | 1–159 ≡ 574–732 (91%); L1PA2 AC034215 | Monogenic, audiogenic seizure susceptibility 1 homolog, NM_032119; 160–736 ≡ 17956–18532 (99%) | NT_086677 (chr 5) 83/98 | Sense |
| BX955947 (3 ex) | Pooled from different tissues | 1–116 ≡ 617–732 (89%); L1PA2 AC006559 | Solute carrier organic anion transporter family, member 1A2, NM_021094; 240–342 = 186–288 | NT_009714 (chr 12) 5′/14 | Sense |
| BX477512++ (3 ex) | Pooled from different tissues | 2–129 ≡ 605–732 (93%); L1PA2 AC024061 | Hypothetical protein FLJ38736, NM_182758; 130–551 = 3191–3216 | NT_086827 (chr 15) 18/20 | Sense |
| CN412489++ (2 ex) | Embryonic stem cells, embryoid bodies from H1, 7 and H9 cell lines | 1–151 ≡ 582–732 (98%); L1PA2 AL133299 | FLJ46156 protein, NM_198499; 152–348 = 1087–1283 | NT_086806 (chr 14) 8/37 | Sense |
| CN408255 (4 ex) | Embryonic stem cells, DMSO-treated H9 cell line | 1–180 ≡ 553–732 (95%); L1PA2 AP00942 | Baculoviral IAP repeat-containing 2, NM_001166; 181–514 = 2766–3099 | NT_033899 (chr 11) 6/9 | Sense |
| Type VI splicing (4 ESTs) | | | | | |
| CD643062 (8 ex) | Embryonic stem cell line WA01/H1 | 10–220 ≡ 780–990 (97%); L1PA2 AC018741 | Hypothetical LOC388927, XM_371478; 237–744 ≡ 1–509 (99%) | NT_015926 (chr 2) ND | Sense |
| BU176833 (6 ex) | Eye retinoblastoma cell line | 1–227 ≡ 763–989 (96%); L1PA3 AC105054 | Rho GTPase activating protein 25, NM_014882; 536–878 ≡ 419–757 (97%) | NT_022184 (chr 2) 5′/10 | Sense |
| BE568192 (3 ex) | Bladder carcinoma cell line | 1–60 ≡ 931–990 (98%); L1PA2 AP005264 | Similar to hypothetical protein LOC375127, XM_496265; 95–367 ≡ 213–490 (95%) | NT_010859 (chr 18) 3/5 | Sense |
| BP245205 (3 ex) | Embryonal kidney cell line 293 | 6–135 ≡ 861–990 (95%); L1PA2 AC099512 | Monogenic, audiogenic seizure susceptibility 1 homolog, NM_032119; 138–574 ≡ 17953–18384 (98%) | NT_086677 (chr 5) 91/98 | Sense |

(1) EST/mRNA GenBank accession number and number of exons (ex) determined by SPIDEY [1]. ESTs are grouped according to 6 different splicing schemes [2]. Sixteen identical or similar ESTs described earlier by Nigumann et al [2] and Wheelan et al [44] are shown by + and ++, respectively.
(2) Source of the EST as annotated in EST division of GenBank.
(3) EST similarity (≡) or identity (=) to a representative L1 genomic clone #11A [3]. Subfamily of L1 [4] and GenBank accession number were determined by genome browser [5]. For some ESTs the 5′ nucleotides (< 28 nt) were derived either from vector/adaptor or represented as low quality sequence.
(4) Similarity/identity to known mRNA as determined by BLASTN [6] and BLAST2 sequences [7] programs. mRNA description is based on the RefSeq database [8]. If the mRNA has not been described, an EST (marked by an asterisk) is shown. This EST contains a putative first exon transcribed from the non-L1 (native) promoter.
(5) Genomic contig (accession no), chromosome (chr), and position of the L1 ASP in the intron, upstream (5′) or downstream (3′)/total number of exons, as determined with MegaBLAST and SPIDEY programs. ND stands for not determined.
(6) Orientation with respect to the gene's transcription.
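The coordinate notation in Table 2 (EST range, then ≡ for similarity or = for identity, then the range in the reference sequence, optionally followed by a percentage) can be parsed mechanically. The sketch below is illustrative only: `Segment`, `parse_segment`, and `orientation` are hypothetical names, and reading a descending reference range in the known-mRNA column as an antisense match is an inference from the four antisense rows of the table, not a rule stated in the text.

```python
import re
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    est_start: int
    est_end: int
    ref_start: int
    ref_end: int
    identical: bool           # True for "=", False for "≡"
    percent: Optional[float]  # similarity percentage, if one is given


_SEGMENT = re.compile(
    r"(\d+)\s*[–-]\s*(\d+)\s*(≡|=)\s*(\d+)\s*[–-]\s*(\d+)"
    r"(?:\s*\((\d+(?:\.\d+)?)%\))?"
)


def parse_segment(text: str) -> Segment:
    """Parse one entry such as '373–728 ≡ 802–443 (98%)'."""
    match = _SEGMENT.search(text)
    if match is None:
        raise ValueError(f"not an alignment segment: {text!r}")
    a, b, symbol, c, d, pct = match.groups()
    return Segment(int(a), int(b), int(c), int(d), symbol == "=",
                   float(pct) if pct else None)


def orientation(segment: Segment) -> str:
    # Assumption: descending mRNA coordinates mean the EST matches the
    # strand complementary to the mRNA.
    return "Antisense" if segment.ref_start > segment.ref_end else "Sense"
```

Applied to the known-mRNA column, this reading agrees with the Orientation column for the antisense entries CD642260, BE866323, CB960713, and AV693621.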

(ATP-binding cassette, subfamily A, member 9) [31] and intron 46 of COL11A1 (collagen type XI alpha 1) [27], respectively (Table 3). The remaining two ESTs (CD642260 and BE866323) were derived from L1 ASPs located downstream of the gene. One of these L1 ASPs resided 77 Kb downstream of the single exon gene encoding olfactory receptor, family 56, subfamily B, member 4 (OR56B4) [32] and the other was located 34 Kb downstream of BOLL, homologous to the bol or boule-like gene of Drosophila [28].

L1 ASP provides an alternative promoter for several human genes

To reveal the potential of L1 ASP to function as an alternative promoter, we determined the expression profile of the chimeric mRNAs (containing bona fide translation initiation signals) in 16 different human tissues. For comparison, we also determined transcription from the native promoters (genes' true promoters). Results for the three chimeric mRNAs (KIAA1797, L1-CLCN5, and L1-SLCO1A2) which were detected in the tissues studied are presented in the following section.

Figure 1(a) shows that both the chimeric KIAA1797 mRNA, derived from the L1 ASP located about 26 Kb upstream of the first exon of the gene, and the native mRNA (the 5′ end of the mRNA was predicted from EST AL711955) are expressed in lung and pancreas. In addition, native mRNA is expressed in testis, placenta, and liver.

Figure 1(b) shows that the chimeric L1-CLCN5 mRNA is expressed exclusively in placenta, while CLCN5 mRNAs derived from the upstream and downstream promoters (located about 102 Kb and 44 Kb from the L1 ASP, resp) are expressed strongly in lung. Translation of the chimeric mRNA could yield a protein identical to the one obtained from the CLCN5 mRNA derived from the downstream promoter. However, the latter is inactive in placenta, suggesting that the L1 ASP provides placenta-specific expression to one of the protein isoforms encoded by CLCN5. The other protein isoform has a 70 aa-long N-terminal extension and is derived from an mRNA generated from the CLCN5 upstream promoter. This promoter is active in a number of tissues.

Figure 1(c) shows that the chimeric L1-SLCO1A2 mRNA predicted from the EST (BX955947) is derived from the L1 ASP located 61 Kb upstream of the SLCO1A2 first exon. Surprisingly, RT-PCR yielded a 315 bp product (instead of the expected 324 bp product) derived from another L1 ASP located about 24 Kb further upstream. This novel chimeric mRNA is expressed exclusively in placenta, while SLCO1A2 mRNA is present in a number of tissues, but not in placenta. Therefore, similar to CLCN5, L1 ASP is responsible for the placenta-specific expression of SLCO1A2.

Since the multiple tissue cDNA panel has been produced using different donors for different tissues (brain and lung pooled from 2 donors and other tissues pooled from 4–45 donors, except leukocytes which were pooled from 550 donors; the total number of donors was ∼750), it is conceivable that an RT-PCR product represents a donor-specific L1 insertion rather than tissue-specific activity of the L1 ASP in that chromosomal position. Sequence analysis showed that only one of the L1 elements (L1-CLCN5), for which the tissue-specificity of L1 ASP activity was examined (Figures 1 and 2), belongs to the highly polymorphic L1Ta subfamily [33]. The rest of the L1 elements, depicted in Figures 1 and 2, belong to the L1PA2 subfamily that expanded before the divergence of hominids [34], although some polymorphic insertions have been reported in humans [35]. It is unlikely that an L1 insertion is found in only one of the ∼750 donors

Table 3: Examples of the L1 ASP functioning as an alternative promoter or driving antisense transcription of human genes.

| EST(1) | Source(2) | Similarity to L1 5′ UTR opposite strand(3) | Similarity to known mRNA(4) | Location in the genome(5) | Orientation(6) |
|---|---|---|---|---|---|
| Type II splicing | | | | | |
| CD642260 (4 ex) | Embryonic stem cell line WA01/H1 | 12–117 ≡ 542–647 (97%); 118–230 ≡ 878–990 (96%); L1PA2 AC022762 | Olfactory receptor, family 56, subfamily B, member 4, NM_001005181; 373–728 ≡ 802–443 (98%) | NT_009237 (chr 11) 3′/1 | Antisense |
| NM_017794 (46 ex) | RA-induced NT2 neuronal precursor cells | 4–150 ≡ 501–647 (93%); 151–262 ≡ 878–990 (93%); L1P AL354879 | Hypothetical protein KIAA1797, AL711955*; 331–834 ≡ 60–563 (99%) | NT_008413 (chr 9) 5′/45 | Sense |
| Type III splicing | | | | | |
| BE866323+ (4 ex) | Bladder carcinoma cell line | 1–92 ≡ 556–647 (96%); L1PA2 AC073058; L1PA2 AC020550 | Bol, boule-like (Drosophila), NM_033030; 145–204 ≡ 790–731 (98%) | NT_005246 (chr 2) 3′/11 | Antisense |
| BP351387 (5 ex) | Well-differentiated squamous cell carcinoma cell line TE13 | 1–67 = 581–647; L1Ta (Hs) AL663118 | Chloride channel 5, NM_000084; 213–583 = 243–613 | NT_086939 (chr X) 5′/12 | Sense |
| CB960713 (4 ex) | Placenta | 30–107 ≡ 570–647 (96%); L1PA3 AC005922 | ATP-binding cassette, subfamily A, NM_172386; 108–208 = 3283–3183 | NT_010641 (chr 17) 25/38 | Antisense |
| Type V splicing | | | | | |
| AV693621 (2 ex) | Hepatocellular carcinoma | 1–172 ≡ 559–732 (93%); L1PA2 AL627203 | Collagen, type XI, alpha 1, variant A, NM_001854; 187–279 = 3433–3341 | NT_004623 (chr 1) 46/67 | Antisense |
| BP358215 (7 ex) | Mammary gland tumor cell line T47D | 1–147 ≡ 586–732 (92%); L1PA2 AL391749 | Regulator of G-protein signalling 6, NM_004296; 148–581 ≡ 188–621 (99%) | NT_026437 (chr 14) 5′/17 | Sense |
| BX955947 (3 ex) | Pooled from different tissues | 1–116 ≡ 617–732 (89%); L1PA2 AC006559 | Solute carrier organic anion transporter family, member 1A2, NM_021094; 240–342 = 186–288 | NT_009714 (chr 12) 5′/14 | Sense |

(1) EST/mRNA GenBank accession number and number of exons (ex) determined by SPIDEY [1]. ESTs are grouped according to splicing schemes [2]. EST described earlier by Nigumann et al [2] is marked by +.
(2) Source of the EST as annotated in EST division of GenBank.
(3) EST similarity (≡) or identity (=) to a representative L1 genomic clone #11A [3]. Subfamily of L1 [4] and GenBank accession number were determined by genome browser [5]. For some ESTs, the 5′ nucleotides (< 28 nt) were either derived from vector/adaptor or represented as low quality sequence.
(4) Similarity/identity to known mRNA as determined by BLASTN [6] and BLAST2 sequences [7] programs. mRNA description is based on the RefSeq database [8]. If the mRNA has not been described, an EST (marked by an asterisk) is shown. This EST contains a putative first exon transcribed from the non-L1 (native) promoter.
(5) Genomic contig (accession no), chromosome (chr), and position of the L1 ASP in the intron, upstream (5′), or downstream (3′)/total number of exons, as determined with MegaBLAST and SPIDEY programs. ND stands for not determined.
(6) Orientation with respect to the gene's transcription.

represented in the MTC panel while it is present in GenBank (Table 3) and Ntera2D1 cell line (data not shown). Therefore we believe that the RT-PCR products obtained represent tissue-specific L1 ASP activity of fixed or high frequency L1 insertions.

In summary, the examples analyzed here provide evidence that L1 ASP can function as an alternative promoter in normal human tissues. Our results show that the L1 ASP-driven transcription correlates with that of the respective native promoter (Figure 1(a)) or expands the tissue-specific expression pattern of the respective gene (Figures 1(b) and 1(c)).

Although our primary goal was to reveal the potential of L1 ASP as an alternative promoter that generates translatable mRNAs, we also determined the distribution of the chimeric L1-MET mRNA derived from the L1 ASP located in the second intron of the MET proto-oncogene [26]. Figure 1(d) shows that the expression of the chimeric L1-MET mRNA correlates with that of the MET mRNA.

L1 ASP generates antisense transcripts complementary to different mRNAs

Of the 49 chimeric ESTs analyzed, only four corresponded to mRNAs that contained regions complementary to the exons of known mRNAs (see above). The expression data are presented for only those two so-called antisense RNAs which were detected in the human tissues examined.

Figure 2(a) shows that the chimeric L1-COL11A1 mRNA, derived from the L1 ASP located in the intron 46 of COL11A1, is expressed in testis and to a lesser extent in placenta. Similarly, COL11A1 mRNA is present in these tissues. It should be noted that L1-COL11A1 (EST: AV693621) contains a 90 nt region complementary to the entire exon 40 of COL11A1 (Table 3).

Figure 2(b) shows that two alternatively spliced variants of the chimeric L1-BOLL, derived from the L1 ASPs located about 34 Kb and 87 Kb downstream of BOLL, are expressed in prostate and peripheral blood leukocytes, respectively. The 5′ ends of these transcripts are spliced according to splicing schemes III and V [2]. BOLL mRNA is expressed exclusively in testis. L1-BOLL contains a 60 nt region complementary to the 3′ part of exon 6 of BOLL (Table 3). These results suggest that L1 ASP-driven antisense transcription has no general correlation with the transcription of the host gene.

L1 ASP-derived transcripts are present in all human tissues examined

Our study revealed that chimeric transcripts derived from the six unique genomic regions are present only in a few tissues. To examine the tissue specificity of L1 ASP activity more generally, we studied tissue-specific distribution of L1 ASP-derived transcripts, in which splicing occurs within the L1 5′ UTR (splice variants II and IV) [2]. The use of these splice variants allowed us to discriminate between the L1 ASP-derived spliced transcripts and transcripts passing through the whole L1 5′ UTR. Figure 3 shows that the splice variant II is expressed in most human tissues, except in thymus, skeletal muscle, and brain. The variant IV shows a more uniform expression pattern with minimal expression in placenta, skeletal muscle, and brain. In summary, these results show that L1 ASP-derived transcripts are present in all human tissues examined.

L1 ASP-driven transcription is characterized by heterogeneous start site

The fact that the sequence corresponding to the opposite strand of L1 5′ UTR is present in the EST or mRNA sequence (Table 2) does not necessarily mean that transcription is initiated in the L1 ASP region, that is, in the L1 5′ UTR around positions +400 to +600 [3]. In order to find evidence that the L1 ASP region acts as a promoter in vivo, we analyzed the database of transcriptional start sites (DBTSS) [22] for the presence of transcriptional start sites (TSS) which

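[Figure 3 image: RT-PCR gel, lanes 1–16, with products of 104 bp (L1 sp. v. II) and 149 bp (L1 sp. v. IV); schematic of the L1 5′ UTR showing the SP, the ASP, the ORFs, and the splice sites +262/+116 (variant II) and +347/+116 (variant IV).]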

Figure 3: Distribution of L1 splice variants II and IV. The presence of splice variants was estimated by RT-PCR in 16 normal human tissues (numbered as in Figure 1 legend) using a reverse primer designed to hybridize to the junction of exons 1 and 2. The schematically represented splice variants II and IV use a common splicing acceptor site at position +116 and splicing donor sites located at positions +262 and +347, respectively [2]. SP stands for L1 sense promoter; sp v stands for splice variant.

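[Figure 4 image, panel (a): consensus sequence of L1Hs from position 601 down to position 347, in rows of 51 nt, each labelled with the position of its first base. Spaces left by the TSS highlighting have been removed; the highlight colours are not recoverable. Primer marks printed with each row are given in parentheses.]

601 TCTGCAGAGGTTACTGCTGTCTTTTTGTTTGTCTGTGCCCTGCCCCCAGAG (primer mark G)
550 GTGGAGCCTACAGAGGCAGGCAGGCCTCCTTGAGCTGTGGTGGGCTCCACC (primer marks F, E)
499 CAGTTCGAGCTTCCCGGCTGCTTTGTTTACCTAAGCAAGCCTGGGCAATGG (primer marks D, C, B)
448 CGGGCGCCCCTCCCCCAGCCTCGTTGCCGCCTTGCAGTTTGATCTCAGACT
397 GCTGTGCTAGCAATCAGCGGGACTCCGTGGGCGTAGGACCCTCCGAGCCAG (primer mark A)

[Figure 4 image, panel (b): Southern blot, lanes A–G; size markers 1000 bp and 500 bp.]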

Figure 4: TSS mapped to the L1 ASP region. (a) The position of TSS present in the DBTSS is shown highlighted on the consensus sequence of L1Hs [4] between positions 347 and 601. TSS with single and multiple entries present in the database are represented by yellow and blue highlight, respectively. The letters above the sequence mark the 3′ end of the oligonucleotide primers used in RT-PCR (see Table 1). (b) Southern blot RT-PCR analysis of the L1-MET transcripts. The lanes are marked according to the primers used in the PCR. Multiple bands in each lane represent the different splice variants of the L1-MET transcript, as confirmed by sequence analysis.
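The coordinate layout of panel (a) and the enrichment figures quoted in the text for the +386 to +503 window can be checked with a short sketch. The sequence rows are those printed in Figure 4(a); the helper names `base_at` and `region` are illustrative, and the 5′ UTR length of 910 nt is an assumption that is not stated in this section.

```python
# Rows of Figure 4(a) as printed, with highlighting spaces removed.
# Each key is the L1Hs position of the first base in the row;
# positions decrease from left to right.
ROWS = {
    601: "TCTGCAGAGGTTACTGCTGTCTTTTTGTTTGTCTGTGCCCTGCCCCCAGAG",
    550: "GTGGAGCCTACAGAGGCAGGCAGGCCTCCTTGAGCTGTGGTGGGCTCCACC",
    499: "CAGTTCGAGCTTCCCGGCTGCTTTGTTTACCTAAGCAAGCCTGGGCAATGG",
    448: "CGGGCGCCCCTCCCCCAGCCTCGTTGCCGCCTTGCAGTTTGATCTCAGACT",
    397: "GCTGTGCTAGCAATCAGCGGGACTCCGTGGGCGTAGGACCCTCCGAGCCAG",
}
FIRST_POS = 601
SEQ = "".join(ROWS[pos] for pos in sorted(ROWS, reverse=True))


def base_at(pos: int) -> str:
    """Base printed at L1Hs position `pos` (347 <= pos <= 601)."""
    return SEQ[FIRST_POS - pos]


def region(high: int, low: int) -> str:
    """Bases from position `high` down to `low`, both inclusive."""
    return SEQ[FIRST_POS - high : FIRST_POS - low + 1]


# Window reported in the text as holding 24 of the 34 mapped TSS.
tss_window = region(503, 386)
tss_share = 24 / 34

# Assumption (hypothetical value, not given in this section):
# an L1Hs 5' UTR of roughly 910 nt.
ASSUMED_UTR_LEN = 910
utr_share = len(tss_window) / ASSUMED_UTR_LEN

print(f"window length: {len(tss_window)} nt")
print(f"share of TSS in window: {tss_share:.1%}")
print(f"share of assumed 5' UTR: {utr_share:.1%}")
```

Under that assumed UTR length, the window size and the TSS share agree with the rounded percentages given in the text; a different UTR length would shift only the second figure.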

map to the opposite strand of L1 5′ UTR. It has been estimated that more than 80% of the TSS in the DBTSS represent true sites of transcription initiation, that is, they correspond to the full-length cDNAs [36]. Twenty four of the 34 TSS, which mapped to the opposite strand of the L1 5′ UTR, resided between positions +386 and +503 (Figure 4(a)). The observed nonuniform distribution of the TSS (∼70% of TSS within ∼13% of the 5′ UTR) clearly shows that the region from +386 to +503, overlapping with the L1 ASP region, must contain a promoter. These results also suggest that transcription initiates at various positions within the L1 ASP region (Figure 4(a)).

To confirm the transcription initiation in the L1 ASP region, we analyzed the distribution of L1-MET chimeric transcripts (Figure 1(d)) by using RT-PCR and various oligonucleotide primers. Figure 4(b) shows that amplification of L1-MET cDNA can be carried out using primers A–F, but not by using primer G. This result indicates that the TSS is located in the L1 ASP region between the binding sites of primers A and F, while the region corresponding to primer G is absent from the L1-MET transcripts. Also, an in silico search for potential splicing signals [23, 24] did not reveal any acceptor sites in the region between primers G and E, lending support to the conclusion that transcription is initiated in the L1 ASP region rather than read through the L1 5′ UTR. The difference in band intensities (Figure 4(b)) observed for different primer pairs is consistent with the predicted start site heterogeneity. In summary, our results show that the L1 ASP can act as a promoter in vivo and its activity is characterized by start site heterogeneity.

DISCUSSION

In this paper we show that L1 ASP can cause widespread transcription of human genes and its activity correlates with that of the native promoter in some cases, while in other cases it can expand the tissue-specific expression pattern of the respective gene. It is believed that two or more genes located in a single expression domain are coexpressed [37]. Accordingly, an L1 ASP located near or within a gene may behave like a “parasite” whose activity is dependent on the transcription of the gene. This is exemplified by the simultaneous transcription from the L1 ASP and native promoter (Figures 1(a), 1(d), and 2(a)). Surprisingly, in other cases the L1 ASP activity may be regulated independently, as observed here for L1-CLCN5, L1-SLCO1A2, and L1-BOLL mRNAs (Figures 1(b), 1(c), and 2(b)). Although the L1 ASP-driven transcripts were detected in all tissues examined (Figure 3), the results described suggest that the L1 ASPs at defined loci are not active in all tissues. The different tissue-specific activity of L1 ASPs can hardly be explained by their minimal sequence divergence, but could be explained by differences in their epigenetic state. In some cases, a transcriptionally active epigenetic state could be stochastically confined to some L1s in certain tissues.

Our results show that L1 ASP acts as an alternative promoter of several human genes (Figures 1(a)–1(c)). Alternative promoters, giving rise to alternative first exons, generate variation in gene expression by increasing transcriptional flexibility and translational diversity. For example, the human NOS1 gene, encoding the neuronal isoform of nitric oxide synthase, has 9 alternative promoters, which determine its tissue-specific transcription and the translational efficiency of the resulting NOS1 mRNAs with different 5′ UTRs [38]. Another striking example is the human BDNF gene, encoding brain-derived neurotrophic factor, which has 6 promoters and first noncoding exons differentially used in different parts of the brain (A Kazantseva and T Timmusk, personal communication). The L1 ASP, acting as an alternative promoter, generates a chimeric mRNA whose translation could produce a protein identical to the genuine protein. However, the translatability of this transcript depends on the length of the 5′ UTR, the number of upstream ORFs, and the strength of initiation signals [39]. Comparison between the 5′ UTRs of the native and chimeric mRNA revealed no major differences in the above-mentioned factors that can abrogate the usage of the genuine ORF (data not shown). Therefore, it is likely that the chimeric L1 transcripts may be translated with efficiency comparable to that of the native transcripts.

Alternative promoters can also generate mRNAs with different 5′ coding exons, which may be used in the generation of N-terminal variants of the same protein [40]. Similarly, most L1 ASPs located in introns may, in principle, produce chimeric mRNAs and their translation could yield N-terminally truncated proteins. However, transcription from an L1 ASP located in an intron (39 examples described in Table 2) may be strongly inhibited because of the readthrough transcription from the upstream native promoter [41, 42]. In addition, if transcripts from the intronic L1 ASPs are produced, they may not be readily translated because of the absence of proper initiation context. Although N-terminally truncated proteins with possible dominant negative effects have been shown to exist in normal and cancer cells [40] (references therein), additional experiments are required to prove the translation of chimeric L1 transcripts.

We have detected two L1 ASP-derived antisense RNAs complementary to the exons of COL11A1 and BOLL mRNAs (Figure 2). The other two antisense RNAs predicted from the ESTs (Table 3) were not detected in the human tissues analyzed. Antisense RNAs and antisense transcription are known to cause downregulation of gene transcripts via RNAi-mediated mRNA degradation [43] and transcriptional collision [42], respectively. The possible regulatory interaction between sense and antisense RNAs or transcription may be revealed from the negative (or inverse) correlation of their expression. The partial positive correlation between COL11A1 mRNA and its antisense counterpart and the negative correlation between BOLL and L1-BOLL suggest that there is no general correlation between the L1 ASP-driven antisense transcription and the transcription of the gene.

In summary, we have demonstrated that L1 ASP is active in a wide variety of normal human tissues and it is capable of functioning as an alternative promoter by providing the tissue-specific expression of several human genes.

ACKNOWLEDGMENTS

We thank Jaanika Riiel for the help in computer analysis, Richard Tamme for critical reading of the manuscript, and Tõnis Timmusk for sharing unpublished data. This work was supported in part by the Grant no 5171 from Estonian Science Foundation.

REFERENCES

[1] Wheelan SJ, Church DM, Ostell JM. Spidey: a tool for mRNA-to-genomic alignments. Genome Research. 2001;11(11):1952–1957.
[2] Nigumann P, Redik K, Mätlik K, Speek M. Many human genes are transcribed from the antisense promoter of L1 retrotransposon. Genomics. 2002;79(5):628–634.

[3] Speek M. Antisense promoter of human L1 retrotransposon drives transcription of adjacent cellular genes. Molecular and Cellular Biology. 2001;21(6):1973–1985.
[4] Smit AF, Tóth G, Riggs AD, Jurka J. Ancestral, mammalian-wide subfamilies of LINE-1 repetitive sequences. Journal of Molecular Biology. 1995;246(3):401–417.
[5] Kent WJ, Sugnet CW, Furey TS, et al. The human genome browser at UCSC. Genome Research. 2002;12(6):996–1006.
[6] Altschul SF, Madden TL, Schäffer AA, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research. 1997;25(17):3389–3402.
[7] Tatusova TA, Madden TL. BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences. FEMS Microbiology Letters. 1999;174(2):247–250.
[8] Pruitt KD, Maglott DR. RefSeq and LocusLink: NCBI gene-centered resources. Nucleic Acids Research. 2001;29(1):137–140.
[9] Deininger PL, Batzer MA. Mammalian retroelements. Genome Research. 2002;12(10):1455–1465.
[10] Kazazian HH Jr. Mobile elements: drivers of genome evolution. Science. 2004;303(5664):1626–1632.
[11] Rakyan VK, Blewitt ME, Druker R, Preis JI, Whitelaw E. Metastable epialleles in mammals. Trends in Genetics. 2002;18(7):348–351.
[12] van de Lagemaat LN, Landry J-R, Mager DL, Medstrand P. Transposable elements in mammals promote regulatory variation and diversification of genes with specialized functions. Trends in Genetics. 2003;19(10):530–536.
[13] Kashkush K, Feldman M, Levy AA. Transcriptional activation of retrotransposons alters the expression of adjacent genes in wheat. Nature Genetics. 2003;33(1):102–106.
[14] Ferrigno O, Virolle T, Djabari Z, Ortonne JP, White RJ, Aberdam D. Transposable B2 SINE elements can provide mobile RNA polymerase II promoters. Nature Genetics. 2001;28(1):77–81.
[15] Whitelaw E, Martin DI. Retrotransposons as epigenetic mediators of phenotypic variation in mammals. Nature Genetics. 2001;27(4):361–365.
[16] Duhl DM, Vrieling H, Miller KA, Wolff GL, Barsh GS. Neomorphic agouti mutations in obese yellow mice. Nature Genetics. 1994;8(1):59–65.
[17] Schulte AM, Lai S, Kurtz A, Czubayko F, Riegel AT, Wellstein A. Human trophoblast and choriocarcinoma expression of the growth factor pleiotrophin attributable to germ-line insertion of an endogenous retrovirus. Proceedings of the National Academy of Sciences of the United States of America. 1996;93(25):14759–14764.
[18] Medstrand P, Landry J-R, Mager DL. Long terminal repeats are used as alternative promoters for the endothelin B receptor and apolipoprotein C-I genes in humans. The Journal of Biological Chemistry. 2001;276(3):1896–1903.
[19] Landry J-R, Rouhi A, Medstrand P, Mager DL. The Opitz syndrome gene Mid1 is transcribed from a human endogenous retroviral promoter. Molecular Biology and Evolution. 2002;19(11):1934–1942.
[20] Dunn CA, Medstrand P, Mager DL. An endogenous retroviral long terminal repeat is the dominant promoter for human β1,3-galactosyltransferase 5 in the colon. Proceedings of the National Academy of Sciences of the United States of America. 2003;100(22):12841–12846.
[21] Dunn CA, Mager DL. Transcription of the human and rodent SPAM1 / PH-20 genes initiates within an ancient endogenous retrovirus. BMC Genomics. 2005;6(1):47.
[22] Suzuki Y, Yamashita R, Nakai K, Sugano S. DBTSS: DataBase of human Transcriptional Start Sites and full-length cDNAs. Nucleic Acids Research. 2002;30(1):328–331.
[23] Reese MG, Eeckman FH, Kulp D, Haussler D. Improved splice site detection in Genie. Journal of Computational Biology: A Journal of Computational Molecular Cell Biology. 1997;4(3):311–323.
[24] Brunak S, Engelbrecht J, Knudsen S. Prediction of human mRNA donor and acceptor sites from the DNA sequence. Journal of Molecular Biology. 1991;220(1):49–65.
[25] Fisher SE, van Bakel I, Lloyd SE, Pearce SH, Thakker RV, Craig IW. Cloning and characterization of CLCN5, the human kidney chloride channel gene implicated in Dent disease (an X-linked hereditary nephrolithiasis). Genomics. 1995;29(3):598–606.
[26] Park M, Dean M, Kaul K, Braun MJ, Gonda MA, Vande Woude G. Sequence of MET protooncogene cDNA has features characteristic of the tyrosine kinase family of growth-factor receptors. Proceedings of the National Academy of Sciences of the United States of America. 1987;84(18):6379–6383.
[27] Bernard M, Yoshioka H, Rodriguez E, et al. Cloning and sequencing of pro-alpha 1 (XI) collagen cDNA demonstrates that type XI belongs to the fibrillar class of collagens and reveals that the expression of the gene is not restricted to cartilagenous tissue. The Journal of Biological Chemistry. 1988;263(32):17159–17166.
[28] Strausberg RL, Feingold EA, Grouse LH, et al. Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences. Proceedings of the National Academy of Sciences of the United States of America. 2002;99(26):16899–16903.
[29] Chatterjee TK, Liu Z, Fisher RA. Human RGS6 gene structure, complex alternative splicing, and role of N terminus and G protein γ-subunit-like (GGL) domain in subcellular localization of RGS6 splice variants. The Journal of Biological Chemistry. 2003;278(32):30261–30271.
[30] Lander ES, Linton LM, Birren B, et al. Initial sequencing and analysis of the human genome. Nature. 2001;409(6822):860–921.
[31] Piehler A, Kaminski WE, Wenzel JJ, Langmann T, Schmitz G. Molecular structure of a novel cholesterol-responsive A subclass ABC transporter, ABCA9. Biochemical and Biophysical Research Communications. 2002;295(2):408–416.
[32] Malnic B, Godfrey PA, Buck LB. The human olfactory receptor gene family. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(8):2584–2589.
[33] Myers JS, Vincent BJ, Udall H, et al. A comprehensive analysis of recently integrated human Ta L1 elements. The American Journal of Human Genetics. 2002;71(2):312–326.
[34] Furano AV, Duvernell DD, Boissinot S. L1 (LINE-1) retrotransposon diversity differs dramatically between mammals and fish. Trends in Genetics. 2004;20(1):9–14.
[35] Bennett EA, Coleman LE, Tsui C, Pittard WS, Devine SE. Natural genetic variation caused by transposable elements in humans. Genetics. 2004;168(2):933–951.
[36] Maruyama K, Sugano S. Oligo-capping: a simple method to replace the cap structure of eukaryotic mRNAs with oligoribonucleotides. Gene. 1994;138(1-2):171–174.
[37] Spector DL. The dynamics of chromosome organization and gene regulation. Annual Review of Biochemistry. 2003;72:573–608.
[38] Wang Y, Newton DC, Robb GB, et al. RNA diversity has profound effects on the translation of neuronal nitric oxide synthase. Proceedings of the National Academy of Sciences of the United States of America. 1999;96(21):12150–12155.
[39] Kozak M. Pushing the limits of the scanning mechanism for initiation of translation. Gene. 2002;299(1-2):1–34.
[40] Landry J-R, Mager DL, Wilhelm BT. Complex controls: the role of alternative promoters in mammalian genomes. Trends in Genetics. 2003;19(11):640–648.
[41] Eszterhas SK, Bouhassira EE, Martin DI, Fiering S. Transcriptional interference by independently regulated genes occurs in any relative arrangement of the genes and is influenced by chromosomal integration position. Molecular and Cellular Biology. 2002;22(2):469–479.
[42] Prescott EM, Proudfoot NJ. Transcriptional collision between convergent genes in budding yeast. Proceedings of the National Academy of Sciences of the United States of America. 2002;99(13):8796–8801.
[43] McManus MT, Sharp PA. Gene silencing in mammals by small interfering RNAs. Nature Reviews. Genetics. 2002;3(10):737–747.
[44] Wheelan SJ, Aizawa Y, Han JS, Boeke JD. Gene-breaking: a new paradigm for human retrotransposon-mediated gene evolution. Genome Research. 2005;15(8):1073–1078.

Hindawi Publishing Corporation
Journal of Biomedicine and Biotechnology
Volume 2006, Article ID 13569, Pages 1–3
DOI 10.1155/JBB/2006/13569

Mini-Review Article Links Between Repeated Sequences

Sachiko Matsutani

Division of Microbiology, National Institute of Health Sciences, Kamiyoga 1-18-1, Setagaya-ku, Tokyo 158-8501, Japan

Received 23 May 2005; Revised 28 September 2005; Accepted 2 October 2005

L1 and Alu elements are long and short interspersed retrotransposable elements (LINEs and SINEs) in humans, respectively. Proteins encoded in the autonomous L1 mediate retrotransposition of the nonautonomous Alu and cellular mRNAs. Alu is the only active SINE in the human genome and is derived from 7SL RNA of signal recognition particle. In the other eukaryotic genomes, various tRNA- and 5S rRNA-derived SINEs are found. Some of the tRNA- and 5S rRNA-derived SINEs have partner LINEs of which 3′ sequences are similar to those of the SINEs. One of the tRNA-derived SINEs is shown to be mobilized by its partner LINE. Many copies of tRNA and 5S rRNA pseudogenes are present in the human genome. These pseudogenes may have been generated via the retrotransposition process using L1 proteins. Although there are no sequence similarities between L1 and Alu, L1 functionally links with Alu and even cellular genes, impacting on our genome shaping.

Copyright © 2006 Sachiko Matsutani. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

INTRODUCTION

In the human genome, coding sequences are less than 5% while repeat sequences are more than 50% [1]. Most of these repeat sequences are derived from retrotransposons, which transpose through RNA intermediates. L1 and Alu elements are the most successful families of non-LTR elements, representing approximately 30% of the human genome [1]. L1 is about 6 kb long, has an internal promoter for RNA polymerase II, and encodes two essential polypeptides (ORF1 and ORF2) for retrotransposition (see, eg, [2–4]). The product of ORF1 is an RNA-binding protein, and ORF2 encodes a protein with endonuclease and reverse transcriptase activities [5–9]. While L1 moves autonomously, Alu is a nonautonomous element. Alu elements are short (about 300 bp), and have internal promoters for RNA polymerase III [10]. Since Alu elements encode no proteins, it had been presumed that Alu borrows enzymes such as reverse transcriptases from other sources for retrotransposition.

HUMAN L1 CAN MOBILIZE ALU AND PROCESSED PSEUDOGENES

The idea that SINE transposition can be mediated by the L1 element was described by Feng et al [8], Jurka [11], and Martin et al [12]. Recently, Dewannieux et al [13] showed that L1 can mobilize Alu in human cells: neomycin-marked Alu sequences transposed in Hela cells transiently expressing the L1 ORF2 proteins, and the transposition process included splicing out of the autocatalytic intron, target site duplication, and integrations into consensus A-rich sequences. Reverse transcriptase primes on the 3′ terminal polyA stretch of the L1 mRNA [7]. Also in the experiments using neomycin-marked Alu sequences, the 3′ terminal polyA tracts of the Alus were required for retrotransposition. Moreover, L1 can mediate retrotransposition of a cellular mRNA that is not associated with a retrotransposon, although the rate of retrotransposition is 100–1000 fold lower than that in the case of Alu [13–15]. Very recently, U6 snRNA was reported to be mobilized by L1 [16]. L1 mobilizes Alu and different kinds of cellular RNAs, and plays important roles in human genome shaping. Figure 1 is a schematic representation of the retrotransposition of the human L1 elements and their dependents. Retrotransposition of L1 and Alu (and processed genes) results in insertion mutations, and crossing-over between the homologous elements is one of the sources of genetic variations (see, eg, [17]). Insertions of Alu elements introduce alternative 3′ splice sites into existing genes, possibly resulting in defective splicing [18]. Alu and L1 elements can alter the distribution of methylation in the genome, and possibly transcription of genes [19, 20]. These rearrangements have a great impact on genome evolution. Most mutations may be harmless, because coding and control sequences comprise less than 5% of the human genome DNA [1]. However, for example, it is reported that Alu insertions cause neurofibromatosis, haemophilia, familial hypercholesterolaemia, breast cancer, insulin-resistant diabetes type II, and Ewing sarcoma [19].
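The properties of L1 and Alu given in the Introduction can be collected in a small lookup structure, which makes the autonomous/nonautonomous distinction explicit. This is only an illustrative sketch: the class and function names are hypothetical, and treating "encodes its own proteins" as the criterion for autonomy is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Retroelement:
    name: str
    family: str               # "LINE" or "SINE"
    approx_length_bp: int     # approximate length quoted in the text
    polymerase: str           # RNA polymerase using the internal promoter
    encoded_proteins: tuple   # names of encoded proteins, empty if none

    @property
    def autonomous(self) -> bool:
        # Simplifying assumption: an element counts as autonomous here
        # only if it encodes its own proteins.
        return bool(self.encoded_proteins)


# Values taken from the Introduction.
L1 = Retroelement("L1", "LINE", 6000, "Pol II", ("ORF1", "ORF2"))
ALU = Retroelement("Alu", "SINE", 300, "Pol III", ())


def can_supply_machinery(donor: Retroelement, recipient: Retroelement) -> bool:
    """True if `donor` encodes proteins that a nonautonomous `recipient` could borrow."""
    return donor.autonomous and not recipient.autonomous


print(can_supply_machinery(L1, ALU))
print(can_supply_machinery(ALU, L1))
```

The same record type could hold the tRNA- and 5S rRNA-derived SINEs discussed later, with their partner LINEs as the donors.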

[Figure 1 image: schematic. Genes (Pol II), LINEs (L1: Pol II promoter, ORF1, ORF2, polyA), SINEs (Alu: Pol III promoter, polyA), and genes for small RNAs (Pol III) are transcribed from +1. L1 RNA, Alu RNA, and processed RNAs, each ending in AAAAA, associate with L1 proteins and give rise to new L1 copies, new Alu copies, and pseudogenes. Annotations: similarity between a LINE and its partner SINE; SINE 5′ regions are 5S rRNA-related, tRNA-related, or (in human) 7SL-related.]

Figure 1: Retrotransposition of the human LINE-1 (L1) elements and their dependents. L1 contains two open reading frames ORF1 and ORF2. Products of these ORFs associate with the transcripts of L1, Alu, and cellular genes. The RNA-protein complexes bind to another part of the genome, and new elements of L1 and Alu, and pseudogenes which lack introns are generated. Genes for small RNAs like tRNA and 5S rRNA may also retrotranspose. In the other eukaryotic genomes, there are also tRNA- and 5S rRNA-derived SINEs. tRNA- and 5S rRNA-derived SINEs have tRNA- and 5S rRNA-related regions at the 5′ ends, respectively. The 3′ end regions of some of the tRNA- and 5S rRNA-derived SINEs show similarities to the 3′ end regions of their partner LINEs at the nucleotide sequence level.

LINKS BETWEEN LINES AND SINES IN THE OTHER EUKARYOTIC GENOMES

The Alu element, which is the only active SINE in the human genome, is thought to be derived from the 7SL RNA that is a component of signal recognition particle [21]. In the other eukaryotic genomes, there are SINEs which are derived from tRNA genes [22]. Nucleotide sequences of the 5′ regions of tRNA-derived SINEs are similar to those of tRNA genes. Some tRNA-derived SINEs have sequence similarity to LINEs in their 3′ end regions [22]: for example, HE1 SINE and HER1 LINE in higher elasmobranchs (sharks, skates, and rays), tortoise SINE and CR1-like LINE of turtle, salmonid Hpa1 SINE and RSg-1 LINE of rainbow trout, and P.s.1/SINE and Lucy-1 CR1-like LINE in Podarcis sicula. This fact leads us to think that these SINEs are mobilized by the partner LINEs. Indeed, it is demonstrated that in Hela cells, the 3′ end of a fish (eel) SINE is recognized by the reverse transcriptase of its partner LINE, and that the fish SINE can be mobilized by the partner's machinery [23]. In the zebrafish genome, 5S rRNA-derived SINEs have been found [24]: their 5′ end regions are similar to the 5S rRNA at the nucleotide sequence level, and their 3′ regions resemble the 3′ parts of their partner LINEs. One may imagine that LINEs once existed (or still exist) with the ability to provide their 3′ parts for generation of partner SINEs. SINEs, tRNA genes, and 5S rRNA genes are all transcribed by RNA polymerase III. Alu elements, tRNA-derived SINEs, and tRNA genes have type 2 internal promoters, while 5S rRNA-derived SINEs and 5S rRNA genes have type 1 internal promoters. Unlike the type 2 promoters, the type 1 promoters in 5S rRNA genes synthesize the RNAs with the DNA signals upstream of the transcribed region, which perhaps results in the restricted distribution of 5S rRNA-derived SINEs in eukaryotic species [24].

CONTRIBUTION OF L1 TO OUR GENOME EVOLUTION

Interestingly, there are many copies of pseudogenes, fragments, and paralogues of tRNA and 5S rRNA genes in the human genome [1]: 497 copies of the true tRNA genes compared with 324 copies of their related genes; and 4 copies of the true 5S rRNA gene compared with 520 copies of its related genes. As described above, in the human genome, Alu is the only active SINE, and active tRNA-derived and 5S rRNA-derived SINEs have not been found. However, it is possible that human tRNA and 5S rRNA genes retrotranspose with the L1-encoded proteins, if the transcripts accidentally contain A-rich sequences at the 3′ ends [25]. The human genome is reported to contain many types of chimeric retrogenes that were formed using the L1 integration machinery [26]. It should be noted that a new insertion of Alu to the germline is computationally estimated to occur in about 1 of every 100 births [27]. Directly and indirectly, L1 greatly contributes to the evolution of our genome.
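The copy numbers quoted above can be turned into related-to-true ratios, which shows how differently the two gene families are represented. The sketch below uses only the counts and the insertion rate given in the text; the function names and the cohort size in the usage line are illustrative assumptions.

```python
# Copy numbers quoted in the text from the human genome analysis [1].
COPIES = {
    "tRNA": {"true_genes": 497, "related": 324},
    "5S rRNA": {"true_genes": 4, "related": 520},
}


def related_per_true(family: str) -> float:
    """Related copies (pseudogenes, fragments, paralogues) per true gene."""
    counts = COPIES[family]
    return counts["related"] / counts["true_genes"]


def expected_new_alu_insertions(births: int, rate_per_birth: float = 1 / 100) -> float:
    """Expected new germline Alu insertions, using the ~1 per 100 births estimate [27]."""
    return births * rate_per_birth


for family in COPIES:
    print(f"{family}: {related_per_true(family):.2f} related copies per true gene")

# Hypothetical cohort size, chosen only for illustration.
print(expected_new_alu_insertions(1000))
```

On these figures the 5S rRNA family has far more related copies per true gene than the tRNA family, which is the contrast the paragraph draws on when suggesting L1-mediated retrotransposition of small RNA genes.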

REFERENCES

[1] Lander ES, Linton LM, Birren B, et al. Initial sequencing and analysis of the human genome. Nature. 2001;409(6822):860–921.
[2] Athanikar JN, Badge RM, Moran JV. A YY1-binding site is required for accurate human LINE-1 transcription initiation. Nucleic Acids Research. 2004;32(13):3846–3855.
[3] Moran JV, Holmes SE, Naas TP, DeBerardinis RJ, Boeke JD, Kazazian HH Jr. High frequency retrotransposition in cultured mammalian cells. Cell. 1996;87(5):917–927.
[4] Swergold GD. Identification, characterization, and cell specificity of a human LINE-1 promoter. Molecular and Cellular Biology. 1990;10(12):6718–6729.
[5] Hohjoh H, Singer MF. Cytoplasmic ribonucleoprotein complexes containing human LINE-1 protein and RNA. The EMBO Journal. 1996;15(3):630–639.
[6] Martin SL, Cruceanu M, Branciforte D, et al. LINE-1 retrotransposition requires the nucleic acid chaperone activity of the ORF1 protein. Journal of Molecular Biology. 2005;348(3):549–561.
[7] Cost GJ, Feng Q, Jacquier A, Boeke JD. Human L1 element target-primed reverse transcription in vitro. The EMBO Journal. 2002;21(21):5899–5910.
[8] Feng Q, Moran JV, Kazazian HH Jr, Boeke JD. Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition. Cell. 1996;87(5):905–916.
[9] Mathias SL, Scott AF, Kazazian HH Jr, Boeke JD, Gabriel A. Reverse transcriptase encoded by a human transposable element. Science. 1991;254(5039):1808–1810.
[10] Fuhrman SA, Deininger PL, LaPorte P, Friedmann T, Geiduschek EP. Analysis of transcription of the human Alu family ubiquitous repeating element by eukaryotic RNA polymerase III. Nucleic Acids Research. 1981;9(23):6439–6456.
[11] Jurka J. Sequence patterns indicate an enzymatic involvement in integration of mammalian retroposons. Proceedings of the National Academy of Sciences of the United States of America. 1997;94(5):1872–1877.
[12] Martin F, Olivares M, Lopez MC, Alonso C. Do non-long terminal repeat retrotransposons have nuclease activity? Trends in Biochemical Sciences. 1996;21(8):283–285.
[13] Dewannieux M, Esnault C, Heidmann T. LINE-mediated retrotransposition of marked Alu sequences. Nature Genetics. 2003;35(1):41–48.
[14] Esnault C, Maestre J, Heidmann T. Human LINE retrotransposons generate processed pseudogenes. Nature Genetics. 2000;24(4):363–367.
[15] Wei W, Gilbert N, Ooi SL, et al. Human L1 retrotransposition: cis preference versus trans complementation. Molecular and Cellular Biology. 2001;21(4):1429–1439.
[16] Gilbert N, Lutz S, Morrish TA, Moran JV. Multiple fates of L1 retrotransposition intermediates in cultured human cells. Molecular and Cellular Biology. 2005;25(17):7780–7795.
[17] Kazazian HH Jr, Goodier JL. LINE drive. retrotransposition and genome instability. Cell. 2002;110(3):277–280.
[18] Sorek R, Ast G, Graur D. Alu-containing exons are alternatively spliced. Genome Research. 2002;12(7):1060–1067.
[19] Batzer MA, Deininger PL. Alu repeats and human genomic diversity. Nature Reviews Genetics. 2002;3(5):370–379.
[20] Han JS, Szak ST, Boeke JD. Transcriptional disruption by the L1 retrotransposon and implications for mammalian transcriptomes. Nature. 2004;429(6989):268–274.
[21] Weiner AM. An abundant cytoplasmic 7S RNA is complementary to the dominant interspersed middle repetitive DNA sequence family in the human genome. Cell. 1980;22(1 pt 1):209–218.
[22] Ohshima K, Okada N. SINEs and LINEs: symbionts of eukaryotic genomes with a common tail. Cytogenetic and Genome Research. 2005;110(1–4):475–490.
[23] Kajikawa M, Okada N. LINEs mobilize SINEs in the eel through a shared 3′ sequence. Cell. 2002;111(3):433–444.
[24] Kapitonov VV, Jurka J. A novel class of SINE elements derived from 5S rRNA. Molecular Biology and Evolution. 2003;20(5):694–702.
[25] Szafranski K, Dingermann T, Glockner G, Winckler T. Template jumping by a LINE reverse transcriptase has created a SINE-like 5S rRNA retropseudogene in Dictyostelium. Molecular Genetics and Genomics. 2004;271(1):98–102.
[26] Buzdin A, Gogvadze E, Kovalskaya E, et al. The human genome contains many types of chimeric retrogenes generated through in vivo RNA recombination. Nucleic Acids Research. 2003;31(15):4385–4390.
[27] Kazazian HH Jr. An estimated frequency of endogenous insertional mutations in humans. Nature Genetics. 1999;22(2):130.

Hindawi Publishing Corporation
Journal of Biomedicine and Biotechnology
Volume 2006, Article ID 75327, Pages 1–5
DOI 10.1155/JBB/2006/75327

Mini-Review Article The Genomic Distribution of L1 Elements: The Role of Insertion Bias and Natural Selection

Todd Graham1 and Stephane Boissinot1, 2

1 Department of Biology, Queens College, City University of New York, Flushing, NY 11367, USA 2 Graduate School and University Center, City University of New York, New York, NY 10016, USA

Received 5 August 2005; Revised 6 December 2005; Accepted 13 December 2005

LINE-1 (L1) retrotransposons constitute the most successful family of retroelements in mammals and account for as much as 20% of mammalian DNA. L1 elements can be found in all genomic regions but they are far more abundant in AT-rich, gene-poor, and low-recombining regions of the genome. In addition, the sex chromosomes and some genes seem disproportionately enriched in L1 elements. Insertion bias and selective processes can both account for this biased distribution. L1 elements do not appear to insert randomly in the genome, and this insertion bias can at least partially explain the genomic distribution of L1. The contrasting distributions of L1 and Alu elements suggest that postinsertional processes play a major role in shaping L1 distribution. The most likely mechanism is the loss of recently integrated L1 elements that are deleterious (negative selection), either because they disrupt gene function or because they can mediate ectopic recombination. By comparison, the retention of L1 elements because of some positive effect is limited to a small fraction of the genome. Understanding the respective importance of insertion bias and selection will require a better knowledge of insertion mechanisms and of the dynamics of L1 inserts in populations.

Copyright © 2006 T. Graham and S. Boissinot. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

INTRODUCTION

The sequencing of several mammalian genomes has revealed that all are littered with hundreds of thousands of copies of LINE-1 (L1) retrotransposons that account for ∼20% of their mass (Lander et al [1], Waterston et al [2]). The abundance of L1 elements in mammalian genomes is specific to this class of vertebrates and should be considered a diagnostic feature of mammals to the same extent as the possession of hair and the production of milk by females. As L1 elements are also responsible for the amplification of SINEs (eg, Alu in primates, B1 and B2 in mouse) and processed pseudogenes (Esnault et al [3], Dewannieux and Heidmann [4], Dewannieux et al [5]), it is believed that L1 activity may account for as much as 50% of mammalian DNA.

Although L1 elements can be found almost anywhere in the genome, their abundance varies considerably among genomic regions. In general, L1 elements are much more abundant in AT-rich, low-recombining, and gene-poor regions of the genome. In addition to this general trend, L1 elements can be locally very rare or extremely abundant. For instance, L1 elements constitute 89% of a 100 Kb region on chromosome X while they are virtually absent from the homeobox gene clusters (Lander et al [1]). They seem to be more abundant on the sex chromosomes (Korenberg and Rykowski [6], Boyle et al [7], Bailey et al [8], Boissinot et al [9], Parish et al [10]), in genes that are expressed at low level (Han et al [11]), and in monoallelically expressed genes (Allen et al [12]). Differences exist in the distribution of L1 elements with regard to their age and size. Younger L1 elements are located on average closer to genes than older elements (Medstrand et al [13]) and full-length elements are more abundant on the sex chromosomes than on autosomes (Boissinot et al [9]). Although most of the L1 elements found in the human and mouse genomes were inserted after the split between primates and rodents, their distributions are strikingly similar (Lander et al [1], Waterston et al [2]), suggesting that some common mechanisms have shaped L1 distribution in primates and rodents. Here we review the molecular mechanisms and evolutionary processes that might have played a role in shaping the genomic distribution of L1 elements and we evaluate their relative contribution to the biased distribution of L1 elements.

L1 ELEMENTS ARE NOT INSERTED RANDOMLY

The first possible source of bias comes from the retrotransposition process itself. The retrotransposition reaction requires the target site to be cut by the L1-encoded endonuclease. As the consensus target site of L1 endonuclease is TT/AAAA (Jurka [14]), it is plausible that L1 inserts preferentially in AT-rich regions because this motif is overrepresented in these regions (Cost and Boeke [15]). It was even suggested that the preference of L1 elements for AT-rich regions could be an adaptation of L1 to its host because insertion of L1 in gene-poor regions limits the burden of L1 retrotransposition (Lander et al [1], Cost and Boeke [15]). However, the majority of L1 insertion sites differ from the insertion site consensus sequence (Jurka [14], Cost and Boeke [15]) and there is probably no shortage of insertion sites anywhere in the genome. In addition, Alu elements are more abundant in GC-rich regions of the genome despite the fact that they have the same consensus target site as L1 elements (Jurka [14]). Although it is likely that the target-site preference of L1 endonuclease is, at least in part, responsible for the distribution bias of L1, this hypothesis has not been tested rigorously.

Besides the possible bias caused by the L1 endonuclease, the analysis of de novo insertions and of recently integrated elements revealed the presence of insertional hotspots in the human genome. Of 14 de novo disease-causing insertions listed in Ostertag and Kazazian [16], three were in the factor VIII gene, four in the dystrophin gene, and two in the CYBB gene. Another set of genes was the target of multiple L1 and L1-mediated (Alu, SVA) insertions: an L1 and two Alu elements inserted in the factor IX gene, and an Alu and an SVA inserted in the BTK gene. A novel L1 insertion in the factor IX gene of dog has recently been described (Brooks et al [17]) and two L1 insertions occurred recently and independently in human and gorilla within the same 1 Kb region (DeBerardinis and Kazazian [18]). In addition, a recent analysis of the currently amplifying Ta-1 family found that a number of recently integrated Ta-1 elements were clustered in the human genome more often than expected by chance, suggesting the existence of insertional hotspots on several autosomes (Boissinot et al [19]). Together these observations indicate that some genomic regions are more likely to be the target of L1 retrotransposition events than others, and suggest that insertional hotspots may be conserved among mammalian species. It is unclear why some genomic regions are insertional hotspots but it is plausible that the transcriptional status of the target site region plays a role. If the structure of the DNA is modified during transcription in a way that makes it more hospitable for L1 retrotransposition, transcriptionally active regions would undergo a higher number of transposition events. This hypothesis requires further investigation with regard to some identified hotspots (Boissinot et al [19]). While some of these hotspots were in the vicinity of genes expressed in gonads and during early embryogenesis (Boissinot, Entezam, and Furano, unpublished observation), a genome-wide analysis of genes that are transcribed in testes failed to find a significant excess of L1 elements in those genes (Graham and Boissinot, unpublished observation). A recent analysis of L1 retrotransposition in neuronal precursor cells showed that a number of de novo L1 insertions were in neuronally expressed genes, lending support for some relationship between the transcriptional activity of a gene and its hospitability to novel L1 insertions (Muotri et al [20]). Because insertional hotspots are not particularly enriched in old L1 elements, their contribution to the biased distribution of L1 remains unclear but they could very well explain the local abundance of elements in certain genes.

The abundance of recent L1 insertions varies significantly among chromosomes, with chromosomes 4 and X apparently being prone to L1 insertions. A significantly larger number of Ta-1 insertions were found on chromosome 4 than on other autosomes, not only because chromosome 4 is relatively gene-poor, but also because it contains several detectable insertional hotspots (Boissinot et al [19]). Eleven of the 14 disease-causing insertion sites mentioned above are on the X chromosome (Ostertag and Kazazian [16]). Although X-linked deleterious mutations are in general more likely to be apparent because of male hemizygosity, this bias is not sufficiently strong to account for the high frequency of disease-causing L1 insertions on the X. Therefore, it seems that the X chromosome is unusually prone to novel L1 insertions, although an analysis of the Ta-1 family did not reveal an excess of recent L1 insertions on the X (Boissinot et al [19]). Whatever its cause, it is possible that insertion bias is, at least in part, responsible for the abundance of L1 elements on chromosomes X and 4.

NEGATIVE SELECTION ELIMINATES DELETERIOUS L1 ELEMENTS

In general, L1 insertions (like most genetic changes) are much more likely to be deleterious or neutral than favorable. An L1-containing allele is considered deleterious when it decreases the fitness of the individual that carries it, either by reducing its survival or its reproductive success. As selection against deleterious alleles will act as soon as the novel L1-containing allele is produced by retrotransposition, it is unlikely that such deleterious alleles reach high frequencies in populations. In most cases, they will be lost rapidly from populations and will never (or rarely) be observed.

L1 elements have the potential to disrupt the function of host genes in many ways. First, a novel L1 insertion in the coding sequence of a gene would most likely inactivate the protein-coding function of the gene, as exemplified by insertions in exons of the factor VIII gene and in the dystrophin gene (Kazazian et al [21], Narita et al [22]). L1 elements inserted in intronic sequences can also have a deleterious effect by introducing splice sites (Schwahn et al [23], Meischl et al [24]) and polyadenylation signals (Perepelitsa-Belancio and Deininger [25]), or by negatively affecting gene transcription (Han et al [11]). If inserted upstream of genes, L1 elements can also affect their regulation by disrupting regulatory sequences or by inserting their own regulatory sequence such as their sense or antisense internal promoters. Thus, L1 elements are significantly more abundant downstream of genes than upstream (Graham and Boissinot, unpublished observation). Finally, it has recently been demonstrated in a cell-culture assay that L1 retrotransposition can cause large (> 3 Kb) genomic deletions (Gilbert et al [26], Symer et al [27]). Such events would certainly be extremely deleterious if they occurred in a gene-rich region, but genomic deletions caused by L1 retrotransposition are, in general, small (< 500 bp) and relatively rare (Myers et al [28]) as they account for the total loss of only 18 Kb since the human-chimpanzee split (Han et al [29]). All the possible effects L1 elements can have on gene function would likely cause their selective loss from gene-rich regions.

The abundance of L1 sequences across the genome gives them the potential to be efficient mediators of ectopic (ie, nonallelic) recombination. Such events lead to chromosomal rearrangements that are, in general, very deleterious (Burwinkel and Kilimann [30], Segal et al [31]), although some have played an important role in genomic evolution (Fitch et al [32]). If we assume that the frequency of ectopic exchange correlates with the recombination rate, then we expect L1 elements to be more deleterious when they reside in highly recombining regions and therefore to be eliminated by negative selection. Because longer L1 elements are more likely to mediate ectopic recombination, this model of selection predicts a negative correlation between the length of L1 elements and the recombination rate of the genomic region where they reside. Indeed, long elements accumulate in low- and non-recombining regions of the genome (Boissinot et al [9]; Song and Boissinot, unpublished data) and are lacking from recombination hotspots (Myers et al [33]). Thus, the negative effect of ectopic recombination may cause the selective loss of L1 elements from highly recombining regions and therefore their accumulation in low-recombining regions, which are typically AT-rich and gene-poor.

POSITIVE SELECTION IN FAVOR OF L1 ELEMENTS

Ever since L1 elements were first described, scientists have wondered what benefit L1 could have for its host. So far, there is absolutely no evidence that L1 has any useful function for its host. However, recent evidence suggests that in a few cases, L1 sequences may have been coopted by the host for its own benefit. Note that the occasional recruitment of L1 sequences does not imply a function for L1. In some rare cases, ready-to-use motifs contained within the L1 sequence seem to have been retained by the host (Makałowski [34], Kazazian [35]). For instance, the 5′ UTR of modern L1 elements contains sense and antisense promoters which have occasionally been recruited as regulators of the transcription of host genes (Yang et al [36], Speek [37], Nigumann et al [38]), and fragments of L1 sequences have been incorporated within protein-coding sequences (Nekrutenko and Li [39]). However, the number of described cases of cooptation is very small and this mechanism has no significant effect on the overall distribution of L1. In addition, one should always keep in mind that the retention of an L1 element affecting the expression or sequence of a gene does not imply that this element was positively selected (ie, improved the fitness of the host); it might as well have been neutral.

Although positive selection in favor of L1 inserts is unlikely to have affected the overall genomic distribution of L1 (ie, the bias toward AT-rich regions), it is possible that the recruitment of L1 sequences in some regions could result in a local enrichment of L1. It has been proposed that L1 may affect the expression pattern of entire genomic regions or chromosomes and that this effect could be sufficiently strong to positively affect the abundance of L1 in these regions. The idea is that L1 elements would act as "boosters" that promote the expansion of heterochromatin and consequently repress the transcription of genes. This hypothesis has been proposed to explain the spread of X-inactivation along the entire X chromosome (ie, the Lyon hypothesis) (Lyon [40]). Evidence for this role includes the strong enrichment for L1 elements near the X-inactivation center on the X chromosome (Bailey et al [8]) and the observation of X:autosome translocations, showing that the failure of the X-inactivation signal to spread is often correlated with the abundance of L1 elements. In addition, genes that escape X-inactivation are located in regions with a lower abundance of L1 (Bailey et al [8]). The Lyon hypothesis would explain the abundance of L1 elements on the X chromosomes in several mammalian species, although there are important variations in the abundance of L1 elements near the X-inactivation center (Chureau et al [41]), suggesting that the evolution of X-inactivation predates the recruitment of L1 elements as boosters. The hypothesis that L1 elements can promote the inactivation of one copy of a gene is also supported by the evidence that monoallelically expressed genes are located in regions of the genome that are enriched in L1 elements (Allen et al [12]). Another way L1 elements can affect the expression of genes comes from the ability of L1 elements to reduce the amount of transcript produced when they are inserted in an intron (Han et al [11]). This observation led to the suggestion that intronic L1 elements contribute to the fine tuning of gene expression (the rheostat hypothesis) and may account for some of the differences in L1 abundance among genes (Han et al [11], Han and Boeke [42]). A negative correlation between the expression of genes and the abundance of L1 in their introns has recently been reported (Han et al [11]). Because the same observation could equally indicate that low-expressed genes are just more permissive to the presence of L1 in their introns than highly expressed genes, more data are needed to validate the rheostat hypothesis.

L1 elements may also be retained in the genome because they can reduce linkage between genes and therefore increase the efficiency of selection. In a region of low recombination, many weakly selected mutations can interfere with each other, therefore limiting the effect of selection due to tight linkage between loci. The insertion of L1 elements can mitigate this interference by simply increasing the distance between loci (Comeron [43]). Though this idea has not been tested so far, it has been proposed as a general mechanism to explain the length of introns and the amount of noncoding DNA in genomes (Comeron [43]). A prediction of this model is that longer introns and a higher proportion of noncoding DNA (including L1) will be favored in regions of low recombination.
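
The target-site argument made under L1 ELEMENTS ARE NOT INSERTED RANDOMLY (that the TT/AAAA consensus is overrepresented in AT-rich DNA) can be made concrete with a toy calculation. The Python sketch below is an illustration only and is not taken from any of the cited studies: it assumes a sequence of independent bases, picks 65% and 45% A+T as arbitrary example compositions, and all function names are hypothetical.

```python
import random

# L1 endonuclease consensus target TT/AAAA (Jurka [14]), written without the cut mark.
MOTIF = "TTAAAA"


def base_probabilities(at_fraction):
    """Per-base probabilities for a toy i.i.d. sequence with the given A+T fraction."""
    at, gc = at_fraction / 2, (1 - at_fraction) / 2
    return {"A": at, "T": at, "G": gc, "C": gc}


def expected_sites(length, at_fraction, motif=MOTIF):
    """Expected number of motif occurrences on one strand under the i.i.d. model."""
    probs = base_probabilities(at_fraction)
    p_motif = 1.0
    for base in motif:
        p_motif *= probs[base]
    return (length - len(motif) + 1) * p_motif


def random_sequence(length, at_fraction, rng):
    """Draw a random sequence from the same toy model."""
    probs = base_probabilities(at_fraction)
    bases = list(probs)
    return "".join(rng.choices(bases, weights=[probs[b] for b in bases], k=length))


def count_sites(seq, motif=MOTIF):
    """Count (possibly overlapping) occurrences of the motif in seq."""
    return sum(seq.startswith(motif, i) for i in range(len(seq) - len(motif) + 1))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the simulation is repeatable
    length = 1_000_000
    # The two compositions are assumed example values, not measured isochore data.
    for label, at_fraction in [("AT-rich (65% A+T)", 0.65), ("GC-rich (45% A+T)", 0.45)]:
        observed = count_sites(random_sequence(length, at_fraction, rng))
        print(label, "expected:", round(expected_sites(length, at_fraction)), "observed:", observed)
```

Under this model the expected site density at 65% A+T is (0.325/0.225)^6, roughly nine times that at 45% A+T. The sketch ignores the complementary strand and any real sequence structure, and, as noted above, most actual insertion sites deviate from the consensus, so it shows only the direction of the bias that target-site preference alone would produce.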

CONCLUSION

L1 distribution is affected by a number of factors that act at the time of insertion or after the element is inserted. The main difficulty in determining the relative importance of insertion bias and selection is twofold. First, different mechanisms (ie, insertion bias and the different types of selection) can have the same effect on L1 distribution, and the same observation can be explained by radically different mechanisms. For instance, the abundance of L1 elements on the X chromosome can be explained by a bias of insertion, a reduced efficiency of negative selection, or the recruitment of L1 elements as mediators of X-inactivation. Second, genomic parameters such as GC content, gene richness, and recombination rate are not independent, and using correlations between any of these parameters and the abundance of L1 is unlikely to provide a clear explanation for the distribution bias of L1. Indeed, many of the mechanisms discussed in this review were inferred from the analysis of L1 distribution, that is, from the data they were trying to explain, and have not been tested rigorously. To fully understand the genomic distribution of L1 elements, a better knowledge of the molecular mechanism of insertion and the dynamics of L1 elements in natural populations will be necessary.

ACKNOWLEDGMENT

We thank Laurence Frabotta and two anonymous reviewers for their helpful comments on the manuscript.

REFERENCES

[1] Lander ES, Linton LM, Birren B, et al. Initial sequencing and analysis of the human genome. Nature. 2001;409(6822):860–921.
[2] Waterston RH, Lindblad-Toh K, Birney E, et al. Initial sequencing and comparative analysis of the mouse genome. Nature. 2002;420(6915):520–562.
[3] Esnault C, Maestre J, Heidmann T. Human LINE retrotransposons generate processed pseudogenes. Nature Genetics. 2000;24(4):363–367.
[4] Dewannieux M, Heidmann T. L1-mediated retrotransposition of murine B1 and B2 SINEs recapitulated in cultured cells. Journal of Molecular Biology. 2005;349(2):241–247.
[5] Dewannieux M, Esnault C, Heidmann T. LINE-mediated retrotransposition of marked Alu sequences. Nature Genetics. 2003;35(1):41–48.
[6] Korenberg JR, Rykowski MC. Human genome organization: Alu, lines, and the molecular structure of metaphase chromosome bands. Cell. 1988;53(3):391–400.
[7] Boyle AL, Ballard SG, Ward DC. Differential distribution of long and short interspersed element sequences in the mouse genome: chromosome karyotyping by fluorescence in situ hybridization. Proceedings of the National Academy of Sciences of the United States of America. 1990;87(19):7757–7761.
[8] Bailey JA, Carrel L, Chakravarti A, Eichler EE. Molecular evidence for a relationship between LINE-1 elements and X chromosome inactivation: the Lyon repeat hypothesis. Proceedings of the National Academy of Sciences of the United States of America. 2000;97(12):6634–6639.
[9] Boissinot S, Entezam A, Furano AV. Selection against deleterious LINE-1-containing loci in the human lineage. Molecular Biology and Evolution. 2001;18(6):926–935.
[10] Parish DA, Vise P, Wichman HA, Bull JJ, Baker RJ. Distribution of LINEs and other repetitive elements in the karyotype of the bat Carollia: implications for X-chromosome inactivation. Cytogenetic and Genome Research. 2002;96(1–4):191–197.
[11] Han JS, Szak ST, Boeke JD. Transcriptional disruption by the L1 retrotransposon and implications for mammalian transcriptomes. Nature. 2004;429(6989):268–274.
[12] Allen E, Horvath S, Tong F, et al. High concentrations of long interspersed nuclear element sequence distinguish monoallelically expressed genes. Proceedings of the National Academy of Sciences of the United States of America. 2003;100(17):9940–9945.
[13] Medstrand P, van de Lagemaat LN, Mager DL. Retroelement distributions in the human genome: variations associated with age and proximity to genes. Genome Research. 2002;12(10):1483–1495.
[14] Jurka J. Sequence patterns indicate an enzymatic involvement in integration of mammalian retroposons. Proceedings of the National Academy of Sciences of the United States of America. 1997;94(5):1872–1877.
[15] Cost GJ, Boeke JD. Targeting of human retrotransposon integration is directed by the specificity of the L1 endonuclease for regions of unusual DNA structure. Biochemistry. 1998;37(51):18081–18093.
[16] Ostertag EM, Kazazian HH Jr. Biology of mammalian L1 retrotransposons. Annual Review of Genetics. 2001;35:501–538.
[17] Brooks MB, Gu W, Barnas JL, Ray J, Ray K. A Line 1 insertion in the Factor IX gene segregates with mild hemophilia B in dogs. Mammalian Genome. 2003;14(11):788–795.
[18] DeBerardinis RJ, Kazazian HH Jr. Full-length L1 elements have arisen recently in the same 1-kb region of the gorilla and human genomes. Journal of Molecular Evolution. 1998;47(3):292–301.
[19] Boissinot S, Entezam A, Young L, Munson PJ, Furano AV. The insertional history of an active family of L1 retrotransposons in humans. Genome Research. 2004;14(7):1221–1231.
[20] Muotri AR, Chu VT, Marchetto MCN, Deng W, Moran JV, Gage FH. Somatic mosaicism in neuronal precursor cells mediated by L1 retrotransposition. Nature. 2005;435(7044):903–910.
[21] Kazazian HH Jr, Wong C, Youssoufian H, Scott AF, Phillips DG, Antonarakis SE. Haemophilia A resulting from de novo insertion of L1 sequences represents a novel mechanism for mutation in man. Nature. 1988;332(6160):164–166.
[22] Narita N, Nishio H, Kitoh Y, et al. Insertion of a 5′ truncated L1 element into the 3′ end of exon 44 of the dystrophin gene resulted in skipping of the exon during splicing in a case of Duchenne muscular dystrophy. The Journal of Clinical Investigation. 1993;91(5):1862–1867.
[23] Schwahn U, Lenzner S, Dong J, et al. Positional cloning of the gene for X-linked retinitis pigmentosa 2. Nature Genetics. 1998;19(4):327–332.
[24] Meischl C, de Boer M, Åhlin A, Roos D. A new exon created by intronic insertion of a rearranged LINE-1 element as the cause of chronic granulomatous disease. European Journal of Human Genetics. 2000;8(9):697–703.
[25] Perepelitsa-Belancio V, Deininger P. RNA truncation by premature polyadenylation attenuates human mobile element activity. Nature Genetics. 2003;35(4):363–366.
[26] Gilbert N, Lutz-Prigge S, Moran JV. Genomic deletions created upon LINE-1 retrotransposition. Cell. 2002;110(3):315–325.
[27] Symer DE, Connelly C, Szak ST, et al. Human L1 retrotransposition is associated with genetic instability in vivo. Cell. 2002;110(3):327–338.
[28] Myers JS, Vincent BJ, Udall H, et al. A comprehensive analysis of recently integrated human Ta L1 elements. The American Journal of Human Genetics. 2002;71(2):312–326.
[29] Han K, Sen SK, Wang J, et al. Genomic rearrangements by LINE-1 insertion-mediated deletion in the human and chimpanzee lineages. Nucleic Acids Research. 2005;33(13):4040–4052.
[30] Burwinkel B, Kilimann MW. Unequal homologous recombination between LINE-1 elements as a mutational mechanism in human genetic disease. Journal of Molecular Biology. 1998;277(3):513–517.
[31] Segal Y, Peissel B, Renieri A, et al. LINE-1 elements at the sites of molecular rearrangements in Alport syndrome-diffuse leiomyomatosis. The American Journal of Human Genetics. 1999;64(1):62–69.
[32] Fitch DH, Bailey WJ, Tagle DA, Goodman M, Sieu L, Slightom JL. Duplication of the gamma-globin gene mediated by L1 long interspersed repetitive elements in an early ancestor of simian primates. Proceedings of the National Academy of Sciences of the United States of America. 1991;88(16):7396–7400.
[33] Myers S, Bottolo L, Freeman C, McVean G, Donnelly P. A fine-scale map of recombination rates and hotspots across the human genome. Science. 2005;310(5746):321–324.
[34] Makałowski W. Genomic scrap yard: how genomes utilize all that junk. Gene. 2000;259(1-2):61–67.
[35] Kazazian HH Jr. Mobile elements: drivers of genome evolution. Science. 2004;303(5664):1626–1632.
[36] Yang Z, Boffelli D, Boonmark N, Schwartz K, Lawn R. Apolipoprotein(a) gene enhancer resides within a LINE element. The Journal of Biological Chemistry. 1998;273(2):891–897.
[37] Speek M. Antisense promoter of human L1 retrotransposon drives transcription of adjacent cellular genes. Molecular and Cellular Biology. 2001;21(6):1973–1985.
[38] Nigumann P, Redik K, Mätlik K, Speek M. Many human genes are transcribed from the antisense promoter of L1 retrotransposon. Genomics. 2002;79(5):628–634.
[39] Nekrutenko A, Li W-H. Transposable elements are found in a large number of human protein-coding genes. Trends in Genetics. 2001;17(11):619–621.
[40] Lyon MF. X-chromosome inactivation: a repeat hypothesis. Cytogenetics and Cell Genetics. 1998;80(1–4):133–137.
[41] Chureau C, Prissette M, Bourdet A, et al. Comparative sequence analysis of the X-inactivation center region in mouse, human, and bovine. Genome Research. 2002;12(6):894–908.
[42] Han JS, Boeke JD. LINE-1 retrotransposons: modulators of quantity and quality of mammalian gene expression? BioEssays. 2005;27(8):775–784.
[43] Comeron JM. What controls the length of noncoding DNA? Current Opinion in Genetics & Development. 2001;11(6):652–659.

Hindawi Publishing Corporation
Journal of Biomedicine and Biotechnology
Volume 2006, Article ID 83672, Pages 1–12
DOI 10.1155/JBB/2006/83672

Review Article
L1 Retrotransposons in Human Cancers

Wolfgang A. Schulz

Department of Urology, Heinrich Heine University, Moorenstrasse 5, 40225 Düsseldorf, Germany

Received 31 August 2005; Accepted 16 October 2005

Retrotransposons like L1 are silenced in somatic cells by a variety of mechanisms acting at different levels. Protective mechanisms include DNA methylation and packaging into inactive chromatin to suppress transcription and prevent recombination, potentially supported by cytidine deaminase editing of RNA. Furthermore, DNA strand breaks arising during attempted retrotranspositions ought to activate cellular checkpoints, and L1 activation outside immunoprivileged sites may elicit immune responses. A number of observations indicate that L1 sequences nevertheless become reactivated in human cancer. Prominently, methylation of L1 sequences is diminished in many cancer types and full-length L1 RNAs become detectable, although strong expression is restricted to germ cell cancers. L1 elements have been found to be enriched at sites of illegitimate recombination in many cancers. In theory, lack of L1 repression in cancer might cause transcriptional deregulation, insertional mutations, DNA breaks, and an increased frequency of recombinations, contributing to genome disorganization, expression changes, and chromosomal instability. There is however little evidence that such effects occur at a gross scale in human cancers. Rather, as a rule, L1 repression is only partly alleviated. Unfortunately, many techniques commonly used to investigate genetic and epigenetic alterations in cancer cells are not well suited to detect subtle effects elicited by partial reactivation of retroelements like L1 which are present as abundant, but heterogeneous copies. Therefore, effects of L1 sequences exerted on the local chromatin structure, on the transcriptional regulation of individual genes, and on chromosome fragility need to be more closely investigated in normal and cancer cells.

Copyright © 2006 Wolfgang A. Schulz. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

INTRODUCTION

In normal somatic human cells, transcription of retrotransposon sequences like L1 and illegitimate recombination involving them are suppressed, restricting their activity to developing germ cells and placental tissues [1–3]. Suppression of retroelement activity prevents not only retrotransposition, but also various disturbances of transcription by retroelement promoters, interference by retroelement enhancers, the activity of retroelement-encoded enzymes, and illegitimate recombination between homologous elements. Moreover, while L1 sequences have the potential to create genomic instability, they probably exert certain beneficial, "symbiotic" effects. For instance, silencing of retrotransposons in somatic cells may help to organize the genome into macro- and microdomains with differential transcriptional activity. Failure to silence retroelements in cancer cells could therefore permit adverse activities of retroelements as well as perturb any beneficial effects.

The present paper summarizes current knowledge about L1 (LINE-1) retrotransposons in human cancer. For comparison, some observations on human endogenous retroviruses (HERV) are included [4]. Throughout the text, the emphasis will be on identifying open questions, of which there are plenty, as should become evident.

Since L1 general biology is treated in detail in recent reviews [1–3] and other contributions in this issue, only a short introduction will follow here. L1 sequences represent the major class of LTR-less retrotransposons in humans and constitute about 18% of the human genome. While they are interspersed throughout the genome, including euchromatic and heterochromatic regions, they are particularly frequent in gene-poor regions that correspond to chromosomal G-bands. Full-length elements are 6 kb in size and contain an internal promoter at the 5′ end that generates a genomic transcript which also serves as an mRNA. The RNA contains two open reading frames, ORF1 and ORF2. ORF1 encodes p40, an RNA-binding protein with cis preference for L1 RNA. ORF2 encodes the endonuclease and reverse transcriptase required for retrotransposition. Only a fraction of L1 elements in the human genome are intact. Most are truncated, usually at the 5′ end, and mutated, often at many sites. The up to 400,000 elements that are still distinctly recognizable as L1 can be categorized into several subclasses. Most and perhaps all elements still capable of retrotransposition belong to a subclass named Ta. Normally, transcriptional activity of L1 is restricted to developing germ cells and to cells of the placenta. In somatic cells, L1 transcription and retrotransposition are prevented by a variety of control mechanisms, including methylation of L1 DNA and specifically L1 promoters. In the germline, these mechanisms are relaxed, and retrotransposition does occur occasionally.

Potential dangers

Dangers resulting from L1 reactivation in cancer cells comprise the direct adverse effects of retrotransposition, enhanced illegitimate recombination, and multiple ways of disturbance of transcriptional activity and gene regulation. In the human genome, fewer than 100 L1 elements are thought to be sufficiently intact for retrotransposition [3]. However, while the danger of retrotransposition is posed only by these intact L1s, other adverse effects can be exerted by a larger number of elements. In addition, reactivation may interfere with potential "symbiotic" effects of L1 sequences such as their contribution to the global and local organization of the genome and the provision of gene regulatory sequences. Activities on the immune system can also be envisioned. These would be expected to have ambiguous consequences. It seems therefore imprecise to consider alterations of L1 in cancer solely as "reactivation"; other effects may more appropriately be characterized as "dysregulation."

Retrotransposition

The mechanisms involved in L1 retrotransposition are now quite well understood [1–3]. The endonuclease encoded by L1 ORF2 induces single-strand breaks at AT-rich DNA target regions, preferably at consensus TTTT/A sites. Following L1 ORF2 endonuclease action, the L1 RNA poly-A sequence pairs with oligo-dT sequences in the target DNA, which serve as primers for reverse transcription by the L1 ORF2 encoded enzyme. Reverse transcription yields a branched DNA structure, which is presumably resolved by cellular DNA repair systems. The retrotransposition mechanism thus requires at least one recombination and creates two DNA single-strand breaks close to each other, which can in effect behave like a double-strand break. Therefore, attempted or successful retrotranspositions carry a high risk of eliciting chromosome breaks, deletions, translocations, and recombinations [5]. Moreover, successful retrotransposition events are likely to change the activity of genes at the insertion site. Di-

Effects on transcription

Theoretically, a wide range of effects on the transcription of host genes can be exerted by L1 regulatory elements and transcriptional sequences that are located close to or within them (Figure 1). The L1 promoter is moderately strong [8, 9] and the polyadenylation signal is relatively weak, permitting a substantial amount of read-through [10, 11]. Therefore, active L1 promoters located in sense orientation 5′ to a gene could override the normal transcriptional controls of a gene to deregulate expression. Active L1 promoters located in sense orientation within the transcriptional unit could generate alternative 5′-truncated transcripts. Indeed, many regulatory elements of human genes are derived from L1 or HERV sequences [12]. Even some protein-coding sequences are derived from retroelements. A prominent example is syncytin1, a crucial protein required for the formation of syncytic cells in the placenta which has evolved from an HERV env protein [13]. The gene encoding syncytin1 is now consequently named ERVWE1 for "endogenous retrovirus W envelope protein1." As in this case, regulatory sequences derived from L1 or HERV sequences are often more active during germ cell or embryonic development than in somatic cells. In cancer cells, decreased methylation and a more open chromatin structure of such sequences could therefore allow the reexpression of genes or transcripts that are normally restricted to germ cells or the embryo, that is, oncofetal or cancer-testis gene expression.

Alternative transcripts may also be generated by the use of polyadenylation sites of intragenic L1 sequences, especially if these are 5′-truncated. As mentioned above, L1 polyadenylation signals are weak. It is not known which mechanism ensures that they are normally ignored in elements located within a transcriptional unit. Consequently, it is difficult to estimate how altered methylation and chromatin structure in cancer cells would affect their recognition.

Retroelements oriented in opposite direction inside a transcriptional unit might interfere with transcription by antisense effects, most prominently through formation of dsRNA. This mechanism is implicated in the generation of heterochromatin in some organisms [14, 15]. In mammalian cells, dsRNA ought to induce general cellular antiviral defense mechanisms, for example, by activating PKR, or lead to the production of siRNAs and gene-specific downregulation. Interestingly, transcripts containing Alu sequences in sense direction appear to be edited and consequently destabilized in human cells [16–18]. This mechanism provides an obvious means of posttranscriptional gene regulation.
It verse outcomes of insertions are conceivable, including in- is possible that a similar process acts on L1 sequences, but creased or decreased transcriptional activity and the gener- it is currently unknown to what extent intact or partial L1 ation of novel, variant transcripts (Figure 1). On a genome- sequences in pre-mRNA are edited and whether such se- wide scale, the effects of retrotranspositions might be mit- quences are employed for posttranscriptional gene regula- igated by the propensity of L1 elements to insert into AT- tion, in normal or in cancer cells. A recent study [19] sug- rich, gene-poor regions of the genome, and especially into gests that L1 RNAs are not edited, at least not by the usual or close to other elements [6]. However, even retrotransposi- APOBEC3G cytidine deaminase. tions outside transcriptional units can have catastrophic ef- A further possibility has been suggested by the recent dis- fects on a cell by inducing DNA strand breaks and initiating covery of an antisense promoter near the 5-end of intact a breakage-fusion-bridge cycles [7]. L1 sequences [20]. When active, this promoter could exert Wolfgang A. Schulz 3


several effects on cellular genes, depending on its orientation. If located in sense direction, antisense transcripts could lead to downregulation; if located in antisense direction, it might lead to overexpression of normal transcripts or the emergence of novel transcripts. Accordingly, demethylation of L1 sequences in cancer cells may not only activate their canonical sense, but also their antisense promoters [21].

Figure 1: Potential effects of L1 sequences on transcriptional regulation. (a) Schematic view of a human gene. One L1 element is located upstream of the gene and one within. Panels (b)–(g) show various disturbances that could be caused by partial or complete reactivation of L1 elements: (b) deregulation by upstream L1 promoter; (c) transcriptional interference by the promoter of an L1 in inverse direction to the gene; (d) generation of an alternative 5′-truncated transcript by an internal L1 promoter in sense direction; (e) generation of an alternative 5′-truncated transcript by the antisense promoter of an internal L1 element in inverse direction to the gene; (f) transcriptional interference by the antisense promoter of an internal L1 element oriented in sense direction; (g) generation of a truncated transcript by use of the polyadenylation site of an internal L1 element. Note that most effects do not require intact retrotransposons.

Effects of L1-encoded proteins

Intact L1 elements contain two open reading frames. ORF1 encodes a p40 RNA-binding protein supposed to act as a chaperone and transport factor for L1 RNA. ORF2 encodes an endonuclease and a reverse transcriptase. The properties of these enzymes have meanwhile been studied quite well in vitro [22–24], but their impact on normal and cancer cells remains difficult to estimate. One open question is how many L1 elements are actually capable of expressing active proteins, especially, whether only intact elements form the source. It is thought that fewer than 100 L1 sequences are capable of retrotransposition, all of which belong to the Ta family [25]. They are the most likely source of reverse transcriptase, endonuclease, and p40 protein in germ cells and the embryo as well as in cancer cells. However, many elements of other families are also intact, except for missense and stop mutations. They could still give rise to one or the other intact protein, as well as variant proteins. Alu retrotransposition uses the enzymatic machinery provided by L1 and is therefore dependent on expression of L1 proteins [26, 27]. Similarly, complementation of transposition in trans among L1 sequences is inefficient, but not impossible [28], that is, full-length L1s with mutated protein-coding sequences might still be capable of retrotransposition, if proteins are supplemented by other elements. The proteins provided by L1s are also most likely involved in the formation of pseudogenes. It is unknown, however, whether their endogenous expression levels in cancer cells are sufficient to support retrotransposition.

Importantly, the potential danger of proteins encoded by L1s depends critically on their ability to exert effects beyond aiding retrotransposition in cis or in trans. In the context of cancer, dangers posed by the endonuclease are most obvious. The endonuclease introduces single-strand breaks (“nicks”) into DNA with moderately stringent specificity [23]. Its activity is further restricted by chromatin structure [22]. The ultimate result of single-strand breaks introduced by the endonuclease in a cell depends on several factors. A first factor is the cell cycle phase. Nicks in S-phase are most problematic, because they can be converted into double-strand breaks by the replication complex. DNA repair competence and capacity constitute a second factor that may differ between normal and cancer cells. Thirdly, the presence of L1 RNA and other proteins at the nicked site would be thought to influence the type and efficiency of repair.

The potential impact of L1 reverse transcriptase and RNA-binding protein similarly depends on their specificity, actually in two respects. First, to what extent are they specific for L1 (and Alu) sequences? Second, are reverse transcription and RNA-chaperoning their sole activities? Drug inhibitors of reverse transcriptase and, more specifically, siRNA directed against L1 RT decrease the proliferation of cancer cell lines [29]. Such effects are difficult to explain by the known function of the enzyme in mediating L1 and Alu retrotranspositions.

Illegitimate recombination

Successful and abortive retrotransposition can create chromosomal instability and initiate illegitimate recombination by inducing DNA strand breaks and by generating a branched DNA structure. However, even in the absence of retrotransposition, the presence of thousands of intact, rather long (6 kb), and relatively homologous sequences in the genome plus the presence of tens of thousands of truncated and mutated sequences carries a permanent risk of illegitimate recombination between elements located at different sites. In the germline, recombination between different L1 elements contributes to human evolution, but also elicits inherited diseases. In somatic cells, recombination ought to be restricted strictly to homologous recombination repair of DNA double-strand breaks using homologous sequences from sister chromatids or at most from the homologous chromosome. Any other recombination event involves deletions, insertions, or translocations. It is generally assumed that recombination between L1 sequences in somatic cells is suppressed by dense DNA methylation and tight packaging into chromatin. Decreased methylation and relaxed chromatin structure of L1 sequences in cancer cells might therefore facilitate illegitimate recombination contributing to chromosomal instability.

Disturbance of normal genome organization

L1 sequences are thought to be involved in the organization of the human genome, their presence influencing short-range and long-range chromatin structures. L1 sequences are overrepresented in the late-replicating G-bands of human chromosomes [30]. It is plausible that their presence is responsible for their more heterochromatic character. L1s are also overrepresented on the X-chromosome [31] where they may act as “way stations” during X-chromosome inactivation [32, 33]. In a similar fashion, methylated L1 sequences on other chromosomes which are packaged into hypermethylated and deacetylated chromatin may constitute the cores of localized facultative heterochromatic regions. A fraction of centromeric heterochromatin also consists of retrotransposons, mostly of L1s [34]. Intriguingly, some L1 sequences are associated with nuclear matrix attachment regions [35] and may contribute to the organization of chromatin loops. L1 clusters located between genes may furthermore contribute to the segmentation of the genome into transcriptional units, helping to prevent interference by regulatory elements from neighboring genes (Figure 2). Such a “boundary” function would explain why HOX clusters, which require long-range interactions for their proper expression pattern, are largely devoid of retroelement sequences [30]. Importantly, the organization of the genome into subregions and loops pertains not only to transcription, but also to replication and imposes restrictions on the extent of DNA repair and recombination.

Accordingly, alterations of DNA methylation and chromatin structure at L1 sequences in cancer cells could have effects not only on transcription, but also on DNA replication timing and on the extents of recombination and DNA repair. Deregulation of gene expression could not only be caused by activation of L1 elements, but also through altered chromatin structure at inactive L1s allowing transcriptional interference by neighboring enhancers or silencers. Not only in this particular situation could DNA replication patterns be disturbed, with normally late-replicating DNA shifting towards earlier periods within S-phase. Barrier functions of repeat DNA in the genome could be alleviated, allowing DNA processing during repair and Holliday junctions formed during recombination to pass through stretches of DNA that are less accessible in normal cells [36].

Figure 2: Postulated boundary effect of intergenic L1 clusters. Being strongly methylated and tightly packed into chromatin, clustered L1 sequences might act as boundaries between genes, restricting the interaction of an enhancer (ENH) to one gene (a). In cancer cells, L1 hypomethylation could destroy this function and cause deregulation by allowing enhancer interaction with a neighboring gene (b).

Effects on cell stress and immune responses

Endogenous retroelements have been implicated in the regulation of cell stress responses, of the immune system, and in the pathogenesis of several human autoimmune diseases. The strongest data on regulation of human retroelements by cell stress concern Alu sequences [37]. Likewise, the most convincing data on regulation of the immune system by retroelements and on the involvement in autoimmune diseases implicate HERVs [4]. There are, however, indications that L1 sequences too are induced during stress responses [37], during cytotoxic chemotherapy [38], and by UV exposure of skin cells [39]. Furthermore, L1 sequences may act in a similar fashion as HERVs in at least one autoimmune disease, rheumatoid arthritis [40]. In this disease, synovial fibroblasts become aberrantly activated in a fashion that resembles in many respects fibroblast activation in the stroma of malignant tumors, with enhanced proliferation, migration, and secretion of cytokines, chemokines, and proteases. The fibroblast genomes at large and L1 promoters in particular were found to become hypomethylated. Concurrently, full-length L1 RNA could be detected [41]. Overexpression of the p40 ORF1 protein in this disease has been suggested to activate stress-induced protein kinases [42]. It is thought that L1 hypomethylation and expression provide an amplification step in the pathogenesis of the disease by enhancing immune responses [40].

The function of the activation of retroelements during cellular stress responses is poorly understood. Conceivably, it forms part of a signaling system that alerts the immune system to the presence of infected or altered cells [43]. If that proposition is true, hypomethylation and activation of L1 sequences in cancer cells are likely to influence the immune response to malignant tumors. In support of this idea, some HERV proteins have been found to behave as tumor antigens [44, 45], but it is not known whether proteins encoded by L1 do so too. A similarly open question is to what extent hypomethylated repeat DNA liberated from tumor cells elicits danger signals in cells regulating the immune response. Interestingly, L1 activation is considered as a cause of increased plasma DNA levels in tumor patients [46].

Observations

Many of the effects that can be envisioned to be exerted by activated L1 retrotransposons have indeed been observed in the human germline and during fetal development [1–3, 47, 48]. In cancer cells, mainly three phenomena point towards a reactivation of retroelements. Retroelement DNA sequences become hypomethylated, transcripts as well as protein products can be detected, and L1 sequences are located at sites of breakage and recombination. For L1 retrotransposons, the most convincing data are available for hypomethylation. Data on L1 expression are scarce, in contrast to several reports on the expression of HERV gene products. L1 sequences have been found at or near deletion ends and translocation breakpoints, but the precise frequency and the mechanisms involved remain to be determined. Intriguingly, actual retrotransposition events are exceptional.

Altered methylation

In a large number of human cancers, decreased methylation of L1 sequences has been documented (Table 1). This decrease occurs in the context of general alterations in DNA methylation patterns that accompany carcinogenesis in many human tissues. These are regarded as part of an important epigenetic mechanism driving cancer development and progression [63]. Alterations of methylation in cancer cells comprise “hypermethylation” which occurs focally and in a largely specific fashion, typically at CpG islands surrounding the transcriptional start regions of individual genes. Somewhat paradoxically, in many, but not all cancers, increased methylation at specific sites is found alongside a decrease in methylation levels of the overall genome. The decrease in methylation appears to be relatively unspecific and is therefore commonly designated as “genome-wide” or “global” hypomethylation [64, 65]. In normal somatic cells, the bulk of methylcytosine is found in repetitive sequences such as L1, HERVs, and Alus, but also at CpG-rich satellites such as SAT2 and SAT3. The overall decrease in methylation found in cancer cells therefore reflects a largely parallel decrease in the methylation of retrotransposon sequences [61]. As a rule, however, L1 and HERV sequences seem to be more strongly affected than Alus [65].

Hypomethylation of L1 sequences has traditionally been investigated by Southern blot analysis following digestion of DNA with methylation-sensitive restriction enzymes [53]. Recently, methods employing PCR following bisulfite treatment of DNA have been developed for this purpose [52, 66]. These techniques are promising, since they can also be applied to small amounts of suboptimal quality DNA. However, because of the heterogeneity of L1 sequences, the extent of their demethylation is difficult to estimate precisely, especially by PCR-based methods. Southern blot analyses suggest that in cancer cell lines up to 70%–80% of CpG sites in L1 sequences become demethylated. Decreases in L1 methylation appear to parallel those in HERVs. Accordingly, individual HERV proviruses are essentially unmethylated in cancer cell lines with strong hypomethylation [53]. Nevertheless, L1 hypomethylation is anything but uniform in different cancers, in two respects. First, different extents of hypomethylation are found in cancers of the same type. These differences are also observed in cancer cell lines and are therefore not explained by differences in the proportion of tumor cells in tissue samples. Second, L1 hypomethylation appears to develop at different stages in the development of different cancers. For instance, it is found at early stages of colon and bladder cancers [53, 60], but only in higher-stage prostate carcinomas [55, 56], while primary renal carcinomas lack significant LINE-1 hypomethylation [52, 53]. Germ cell cancers are a special case since they have generally hypomethylated

genomes, presumably due to their origin from cells with lower methylation levels [67, 68]. Accordingly, L1 [51, 69] and HERV [70] sequences are strongly hypomethylated in testicular cancers. Finally, note that very little is known on the methylation of individual L1 sequences [71], and accordingly, whether their hypomethylation in cancers is uniform [72].

Table 1: Hypomethylation and expression of L1 in human cancers.

Change reported              Cancer type        Remarks                                              References
Expression                   Teratocarcinoma    Cell lines                                           [49, 50]
Hypomethylation              Various            Cell lines                                           [51]
Hypomethylation              Various            Cell lines                                           [46]
Hypomethylation              Many               Considerable differences between cancer types        [52]
Hypomethylation, expression  Bladder cancer     Expression weaker than in teratocarcinomas           [53, 54]
Hypomethylation              Renal carcinoma    Cell lines only                                      [53]
Hypomethylation              Prostate cancer    Increases with stage and metastasis                  [55–57]
Hypomethylation              Liver carcinoma    -                                                    [58]
Hypomethylation, expression  Liver carcinoma    Hypomethylation, but not cancer-specific expression  [59]
Hypomethylation              Various cancers    Differences between cancer types                     [52]
Hypomethylation              Colon cancer       Begins in preneoplastic mucosa                       [60]
Hypomethylation              Gastric cancer     Correlates with overall hypomethylation              [61]
Hypomethylation              Ovarian carcinoma  -                                                    [62]

Although genome-wide hypomethylation in human cancers has been known for more than twenty years, the mechanisms eliciting this alteration are still unknown. Hypothetical mechanisms include insufficient levels of methyl group donors, ultimately of S-adenosylmethionine, inadequate expression or regulation of DNA methyltransferases, reexpression of DNA demethylases, and altered expression of chromatin regulators directing DNA methyltransferases [64, 65].

The last mechanism is particularly interesting in the present context. Retroelements constitute approximately 45% of the human genome [30] and contain an at least proportionate amount of methylcytosine. Moreover, they appear to be preferentially recognized by the DNA methylation machinery and, at least in some circumstances, appear to act as “centers of methylation” from which methylation spreads into adjacent sequences [73]. Therefore, genome-wide hypomethylation could theoretically arise as a consequence of a defect in the recognition of retroelements as methylation targets.

Unfortunately, it is still not known how retroelements are distinguished for silencing in mammalian genomes. The L1 promoter is as active in somatic as in embryonic cells [9]. Therefore, L1 silencing in somatic cells cannot be simply a consequence of transcriptional inactivity. Instead, silencing must have been actively established during fetal development and is faithfully maintained through cell proliferation and differentiation in normal somatic cells. DNA methylation of retroelements is established first during germ cell development and then again during gastrulation, when the genome at large becomes de novo methylated, except for sequences that are actively protected, such as CpG islands and active imprinted genes [74]. De novo methylation in the mouse embryo requires DNA methyltransferases, specifically Dnmt3A and Dnmt3B as well as Dnmt1 for maintenance of the established methylation [74, 75]. In male germ cells, Dnmt3L is required for proper L1 methylation [76]. It is not entirely clear whether methylation of L1 during development requires specific chromatin regulators directing the methyltransferases. One candidate is SMARCA6, as its mouse orthologue Lsh has been found to be required for proper methylation of L1 sequences. Inactivation of Lsh in mice causes L1 hypomethylation, but only limited disturbances of the methylation of single-copy genes [77]. In comparison, inactivation of another chromatin protein, ATRX, causes hypomethylation of rDNA, but leaves L1 methylation intact [78]. This suggests that the specificity of DNA methylation may be regulated by specific “chromatin regulator” proteins.

A variety of chromatin regulator proteins have been reported to be aberrantly expressed or even mutated in human cancers [65, 79–81]. However, many of these changes are rare or are specific to particular cancers. It is therefore difficult to envision a change in a single “master regulator” of L1 methylation as the cause of the widely distributed hypomethylation of these sequences. More likely, L1 hypomethylation could be associated with the general reorganization of chromatin structure in aneuploid cancer cells that disturbs the compartmentation of the genome [65, 79, 80]. Genome-wide alterations in histone modification have recently been described in cancer cells [82, 83]. Given the high proportion of L1 sequences in the human genome, these are likely to affect these retrotransposons and to interact with their methylation. Note that the relation between DNA methylation and histone modifications at L1 sequences is far from being understood [84].

L1 expression in cancers

The mechanisms underlying hypomethylation changes in human cancers are not understood, but even the description of these changes is fragmentary. For instance, methylation of HERVs has been studied in only a few cancers. Available data suggest that they are affected by genome-wide hypomethylation in parallel to LINE-1 sequences (Table 1). In selected cancers, endogenous retroviral sequences may be almost completely unmethylated. Expressed sequences derived from HERVs are found in germ cell cancers and antibodies directed against HERV-encoded proteins are found in the blood of patients [70]. In cancers of somatic cell origin, bona fide transcripts for envelope and auxiliary proteins have been reported, especially in breast cancer [44, 85], and recently in melanoma [86]. Some results suggest that HERV expression occurs in a wider range of cancers and even normal tissues [87, 88]. These data need further verification to exclude artifacts from genomic DNA and unspliced transcripts. Moreover, the somewhat surprising findings that different transcripts from different subfamilies may be expressed in a cancer-type-specific fashion call for a closer analysis of the mechanisms involved.

There are no sufficiently systematic studies of L1 expression in human cancers. The available data suggest that expression of full-length L1 sequences is by far the strongest in teratocarcinomas, while weaker expression is observed in a wider range of carcinomas exhibiting hypomethylation [53]. This expression pattern therefore resembles that of HERVs. Since HERVs also give rise to spliced transcripts, RNA analyses can provide a first indication of which protein products are expressed. For L1, this question needs to be addressed using antibodies. So far, no definite data have been published on the expression of the proteins encoded by the retrotransposons in human cancer. Their presence in germ cell cancers and teratocarcinoma cell lines, however, is very likely [89].

Involvement of L1 in chromosome breakage and recombination

Whereas retrotransposition events take place quite regularly in the germline, at an estimated rate of 1 event per 100 births [3, 4], very few have been reported in cancer cells [90, 91]. Similarly, although L1 sequences have been shown to become incorporated at sites of double-strand break DNA repair in model experiments [92], corresponding sequence changes have only exceptionally been observed in human cancers [93]. In spite of the caveats discussed below, it is therefore probably safe to conclude that actual retrotransposition events are rare in human cancers and do not regularly contribute to genomic instability.

The evidence is better for indirect mechanisms by which retrotransposons could promote chromosomal instability in human cancer. L1 hypomethylation and chromosomal instability correlate well with each other in several cancer types [55, 58, 94]. A similar relationship has been observed between the hypomethylation of tandem satellite sequences and alterations of the chromosomes that carry them as large juxtacentromeric regions [95–97]. In this case, hypomethylation of the satellite sequences is thought to cause decondensation of pericentromeric chromatin and an increased propensity for chromosomal breaks and rearrangements in this region. In a similar fashion, hypomethylation of retroelement sequences dispersed in the genome could facilitate illegitimate recombination. In favor of this idea, L1 sequences are enriched at the ends of 3p14.1 and 9p21 deletions in carcinomas [36, 98, 99] and homozygous deletions arise preferentially in chromosomal regions with high LINE content [100]. It has also been suggested that L1 and HERV sequences are involved in the formation of double-minute circular chromosomes in cancer cells [101, 102].

The most straightforward hypothesis accounting for these findings is that decreased methylation and presumably more open chromatin structure at L1 sequences in cancer cells favor illegitimate recombination between elements at different genomic locations, for example, during homologous recombination repair of DNA strand breaks. However, closer analyses of the deletion ends in solid tumors indicate that this hypothesis is probably incorrect. While deletion ends are indeed often located in or near L1 sequences and particularly L1 clusters, the breakpoints invariably show hallmarks of DNA double-strand break repair by nonhomologous end-joining (NHEJ). Typically, one end of the deletion is located in or close to an L1 sequence, while the other end is provided by an unrelated single copy or repeat sequence [36, 99, 103]. Such structures also appeared as occasional end products of repair of DNA double-strand breaks induced by a restriction endonuclease at a specific chromosomal site [104]. A plausible explanation for this structure is that processing of damaged DNA ends by the NHEJ protein complex is slowed down at L1 sequences by denser chromatin, favoring reannealing and ligation there [36]. If this explanation is correct, retroelement hypomethylation in cancer could paradoxically diminish the tendency of breakpoints to be located at L1 sequences. It would instead tend to increase the size of deleted and recombined sequences, because DNA processing and Holliday junctions arising during recombination repair could move further through more open chromatin.

Presently, either hypothesis remains speculative for several reasons. First of all, far too few chromosomal breakpoints have been investigated, especially in carcinomas. Secondly, it has not been established for any chromosomal alteration whether hypomethylation of repeat sequences at the affected site preceded it. Thirdly, L1 repeats are not randomly distributed in the genome. They might be associated with local structures that are particularly prone to breakage, such as fragile sites or the anchorage sites of chromatin loops.

Perspectives

Consequences of L1 activity in the human germline are well documented. Retrotranspositions in the germline take place at a significant rate of approximately 1 event per 100 births [3]. In addition, a substantial number of recombination events involving L1 elements have been detected, typically because they elicited translocations or rearrangements causing inherited diseases [2, 47, 48]. Specifically, L1 retrotransposition and illegitimate recombination in the germline are causes of inherited and congenital cancers. For instance, a germline deletion in the MLH1 gene carries hallmarks of recombination initiated by a failed L1 retrotransposition event [105].

In contrast, in spite of considerable evidence hinting at a reactivation of L1 retrotransposons in a variety of human cancers, there is limited evidence for major consequences of this process. This may be due to two very different reasons. One is technical: even typical effects expected from L1 reactivation are difficult to detect by the techniques commonly used to investigate genetic alterations in human cancers. The other is biological: reactivation may be partial and the mechanisms ensuring silencing of L1 DNA sequences and limiting the effects of transcribed sequences may remain functional to some degree. Perhaps, limited reactivation of L1 sequences may exert effects more through the loss of symbiotic functions than through direct adverse effects on genomic stability. This possibility is even more difficult to ascertain.

In general, investigations of genetic and epigenetic changes in human cancers avoid dealing with repeat sequences and focus on single-copy protein-coding genes. Mutation analysis of genes is typically restricted to coding sequences and employs PCR techniques to analyze individual exons or mRNA. Insertions or recombinations caused by L1 or other retroelements would often not be detected by this approach, unless they occur within exons. Therefore, it might not be coincidental that reports describing oncogene activation and tumor suppressor inactivation by L1 insertion date from a period when Southern blot analysis was more en vogue.

Similarly, recombination and deletions in cancers are well documented at the level accessible by cytogenetic techniques, but are not well investigated at the molecular level, with the important exception of translocations in hematological cancers. In these, retroelements have indeed been found at many translocation sites, although their role in the generation of the translocations is not clear. In contrast, very few studies have addressed the precise structure of chromosomal breakpoints in solid tumors. A recent genome-wide study of 505 cancer cell lines yielded a strong association between LINE content and the presence of homozygous deletions, but no breakpoints were characterized in detail [100]. Detailed analyses of deletion endpoints at the FRA3B fragile site [98] and around CDKN2A at 9p21 [99] revealed a preponderance of L1 sequences at or close to the deletion endpoints. Such analyses remain tedious even with the finished human genome sequence having become available. Therefore, we know little on the structure of amplicons, another category of unstable sequence in cancer cells, and next to nothing on the sites of illegitimate recombination in cancer cells. L1 sequences have [...] involved. In summary, therefore, whereas it seems unlikely that retrotransposition is common in human cancer cells, the role of L1s in recombination and chromosome breakage is probably underestimated due to a lack of studies with appropriate methodology.

A similar argument can be made for epigenetic effects of L1 sequences in cancer. In genome-wide screens for altered methylation in cancer, repeat sequences are often and understandably considered a nuisance and typically removed by prehybridization. Overall changes in L1 methylation are therefore well documented, but data on the behavior of individual sequences are lacking. Bisulfite sequencing is restricted to a few hundred bp per PCR and is prone to artifacts from template switching and target priming when applied to repeat sequences. An elegant solution may be hairpin PCR. This method has revealed that in fetal fibroblasts, the promoters of most full-length L1 sequences are densely and symmetrically methylated, while selected elements are unmethylated [71]. The obvious question is which elements are these. Accordingly, it is not clear whether the number of completely unmethylated elements increases in cancer cells or whether the decrease in methylation is distributed across all L1 sequences. These questions extend of course to the issue of chromatin structure at L1 elements.

The contribution of L1s to altered gene expression in cancer is still more difficult to ascertain. There are many unexplained instances of altered gene expression in cancers. Perhaps most striking are reports on frequent downregulation of genes (usually tumor suppressor candidates) without detectable genetic alterations in their vicinity or altered DNA methylation in regulatory sequences. Recently, increased expression of miRNA has been introduced as a potential cause of such enigmatic observations [107]. In the light of potential effects of L1 sequences, perhaps effects exerted by L1 elements in or near affected genes should also be considered. This suggestion likewise applies to the mechanisms generating aberrant transcripts in cancer cells [21]. Again, this is a highly difficult issue, especially if genes that are investigated are alternatively spliced even in normal cells. The argument can be broadened further to encompass the potential boundary function of repeat elements. Its disturbance by altered chromatin structure in cancer cells may result in more or less subtle up- and downregulation. More than two decades of intense work have been spent on a small number of selected loci to understand the mechanisms of action of long-range regulatory elements and boundary elements at all. It is understandable that very little is known on how they are altered in cancer cells.
Still, it may be advisable to consider such ef- been detected in double minutes, an important intermediate fects and others mentioned in this paragraph when encoun- in one amplification mechanism, and have been proposed, tering instances of altered gene regulation in cancer that can- but not proven to be involved in their formation [101]. By not be straightforwardly explained by mutations or altered a comparison of loss of heterozygosity analysis and cytoge- DNA methylation of gene regulatory sequences. netic techniques of chromosome 8p in bladder cancer cell The difficulties in determining the impact of L1 se- lines, recombination events were recently shown to be much quences in cancer cells resulting from methodological limita- more frequent and were shown to take place across much tions are compounded by biological factors. Several layers of smaller regions than hitherto assumed [106]. However, it mechanisms control L1 expression and activity. DNA methy- is not known and difficult to determine what initiated the lation inactivates the L1 promoter [9]. It is likely that this recombination events and which sequences precisely were restriction of transcriptional activity is aided by an inactive Wolfgang A. Schulz 9 chromatin structure [108], although this cannot be consid- ACKNOWLEDGMENTS ered proven [86]. A second set of L1 controls appears to act at the RNA level, perhaps exerted by cytidine deaminases, lead- I am grateful to Dr. Andrea R. Florl for critical reading of ing to RNA instability [11]. Alu-containing RNAs are subject the manuscript and to Sandy Fritzsche for help in compil- to editing [16–18], but the evidence for L1 RNA is scanty. ing the reference list. Work in our lab is supported by A third level of control is enacted at the retrotransposition the Deutsche Forschungsgemeinschaft (LI 1038/3-1) and the step. Active TP53 prevents retrotransposition [109], but this Deutsche Krebshilfe (70-3193 Schu 1). is very likely not the only barrier at this step. 
Last and per- haps not least, there is some evidence that retroelement ac- tivation might attract immune responses. As discussed else- REFERENCES where in detail [43], such responses are better documented [1] Yoder JA, Walsh CP, Bestor TH. Cytosine methylation and for HERVs, but they might additionally or concurrently se- the ecology of intragenomic parasites. Trends in Genetics. lect against cells with strongly activated L1 retrotransposons 1997;13(8):335–340. outside immunoprivileged organs. [2] Ostertag EM, Kazazian HH Jr. Biology of mammalian L1 There is good evidence for three of these protective retrotransposons. Annual Review of Genetics. 2001;35:501– mechanisms to be impeded in cancer cells: DNA methylation 538. is decreased, TP53 and checkpoints are often defective, and [3] Kazazian HH Jr. Mobile elements: drivers of genome evolu- immune responses to advanced cancers are muted. We know tion. Science. 2004;303(5664):1626–1632. very little about another mechanism, control of retroele- [4] Bannert N, Kurth R. Retroelements and the human genome: ments at the RNA level. In summary, therefore, the regula- new perspectives on an old relation. Proceedings of the Na- tion of L1 genomic structure, expression, and retrotranspo- tional Academy of Sciences of the United States of America. 2004;101(suppl 2):14572–14579. sition is clearly perturbed in many human cancers, but inac- [5] Symer DE, Connelly C, Szak ST, et al. Human l1 retrotrans- tivation of all tiers of control may be rare. Some cancer types position is associated with genetic instability in vivo. Cell. exhibit few changes, for example, renal cell carcinoma, where 2002;110(3):327–338. even L1 DNA methylation appears to be maintained, while [6] Boissinot S, Entezam A, Young L, Munson PJ, Furano AV.The germ cell cancers appear to represent the other end of the insertional history of an active family of L1 retrotransposons spectrum [49, 50, 52, 53]. Even in these, however, retrotrans- in humans. 
Genome Research. 2004;14(7):1221–1231. position events appear to be rare and the evidence for major [7] Gisselsson D. Chromosome instability in cancer: how, when, contributions of retrotransposon activity to the cancer phe- and why? Advances in Cancer Research. 2003;87:1–29. notype is limited. Presumably, at least one of the multiple [8] Swergold GD. Identification, characterization, and cell speci- safeguards against retrotransposition holds up. A likely can- ficity of a human LINE-1 promoter. Molecular and Cellular Biology. 1990;10(12):6718–6729. didate is TP53 [109],sincegermcelltumorsareamongthe ff few cancer types in which mutations of this tumor suppres- [9] Steinho C, Schulz WA. Transcriptional regulation of the hu- man LINE-1 retrotransposon L1.2B. Molecular Genetics and sor are rare [68]. Genomics. 2003;270(5):394–402. [10] Holmes SE, Dombroski BA, Krebs CM, Boehm CD, Kazazian CONCLUSIONS HH Jr. A new retrotransposable human L1 element from the Activation of L1 retrotransposons in cancer cells is expected LRE2 locus on chromosome 1q produces a chimaeric inser- to exert a variety of effects on the tumor phenotype, if it oc- tion. Nature Genetics. 1994;7(2):143–148. curs. Of course, this statement hinges on the “if,” and our [11] Han JS, Szak ST, Boeke JD. Transcriptional disruption by the L1 retrotransposon and implications for mammalian tran- present knowledge does not allow firm conclusions. Consid- scriptomes. Nature. 2004;429(6989):268–274. ering that L1 retrotransposons make up almost a fifth of our [12] van de Lagemaat LN, Landry JR, Mager DL, Medstrand P. genome, there are astonishingly large gaps in our knowledge Transposable elements in mammals promote regulatory vari- on their general biology, and consequentially in our knowl- ation and diversification of genes with specialized functions. edge on their behavior in cancer. As argued above, there is Trends in Genetics. 2003;19(10):530–536. 
an obvious need for more systematic investigations of DNA [13] Mi S, Lee X, Li X-P, et al. Syncytin is a captive retroviral en- methylation and chromatin structure of L1 DNA, of the ex- velope protein involved in human placental morphogenesis. pression of full-length transcripts and L1-encoded proteins Nature. 2000;403(6771):785–789. on one hand and for exemplary studies of individual ele- [14] Schramke V, Allshire R. Hairpin RNAs and retrotransposon ments and their influence on adjacent genes in cancer cells on LTRs effect RNAi and chromatin-based gene silencing. Sci- the other hand. At this stage, it is probably safe to conclude ence. 2003;301(5636):1069–1074. that L1 retrotransposons do become reactivated to various [15] Lippman Z, Gendrel A-V, Black M, et al. Role of transposable elements in heterochromatin and epigenetic control. Nature. degrees in different cancers, but that some of the many safe- 2004;430(6998):471–476. guards that prevent retrotransposition and their adverse ef- [16] Athanasiadis A, Rich A, Maas S. Widespread A-to-I RNA fectsinsomaticcellsholdupinmostcancers.Perhaps,even editing of Alu-containing mRNAs in the human transcrip- cancer cells cannot survive with fully active retrotransposons. tome. PLoS Biology. 2004;2(12):e391. It follows that more subtle effects of L1 dysregulation in can- [17] Kim DD, Kim TT, Walsh T, et al. Widespread RNA edit- cer cells, which may include adverse actions as well as loss of ing of embedded Alu elements in the human transcriptome. symbiotic functions, should be a focus of investigation. Genome Research. 2004;14(9):1719–1725. 10 Journal of Biomedicine and Biotechnology

[18] Levanon EY, Eisenberg E, Yelin R, et al. Systematic identification of abundant A-to-I editing sites in the human transcriptome. Nature Biotechnology. 2004;22(8):1001–1005.
[19] Turelli P, Vianin S, Trono D. The innate antiretroviral factor APOBEC3G does not affect human LINE-1 retrotransposition in a cell culture assay. The Journal of Biological Chemistry. 2004;279(42):43371–43373.
[20] Speek M. Antisense promoter of human L1 retrotransposon drives transcription of adjacent cellular genes. Molecular and Cellular Biology. 2001;21(6):1973–1985.
[21] Nigumann P, Redik K, Mätlik K, Speek M. Many human genes are transcribed from the antisense promoter of L1 retrotransposon. Genomics. 2002;79(5):628–634.
[22] Cost GJ, Golding A, Schlissel MS, Boeke JD. Target DNA chromatinization modulates nicking by L1 endonuclease. Nucleic Acids Research. 2001;29(2):573–577.
[23] Cost GJ, Feng Q, Jacquier A, Boeke JD. Human L1 element target-primed reverse transcription in vitro. The EMBO Journal. 2002;21(21):5899–5910.
[24] Weichenrieder O, Repanas K, Perrakis A. Crystal structure of the targeting endonuclease of the human LINE-1 retrotransposon. Structure. 2004;12(6):975–986.
[25] Myers JS, Vincent BJ, Udall H, et al. A comprehensive analysis of recently integrated human Ta L1 elements. American Journal of Human Genetics. 2002;71(2):312–326.
[26] Kajikawa M, Okada N. LINEs mobilize SINEs in the eel through a shared 3’ sequence. Cell. 2002;111(3):433–444.
[27] Hagan CR, Sheffield RF, Rudin CM. Human Alu element retrotransposition induced by genotoxic stress. Nature Genetics. 2003;35(3):219–220.
[28] Wei W, Gilbert N, Ooi SL, et al. Human L1 retrotransposition: cis preference versus trans complementation. Molecular and Cellular Biology. 2001;21(4):1429–1439.
[29] Sciamanna I, Landriscina M, Pittoggi C, et al. Inhibition of endogenous reverse transcriptase antagonizes human tumor growth. Oncogene. 2005;24(24):3923–3931.
[30] Lander ES, Linton LM, Birren B, et al. Initial sequencing and analysis of the human genome. Nature. 2001;409(6822):860–921.
[31] Ross MT, Grafham DV, Coffey AJ, et al. The DNA sequence of the human X chromosome. Nature. 2005;434(7031):325–337.
[32] Bailey JA, Carrel L, Chakravarti A, Eichler EE. Molecular evidence for a relationship between LINE-1 elements and X chromosome inactivation: the Lyon repeat hypothesis. Proceedings of the National Academy of Sciences of the United States of America. 2000;97(12):6634–6639.
[33] Hansen RS. X inactivation-specific methylation of LINE-1 elements by DNMT3B: implications for the Lyon repeat hypothesis. Human Molecular Genetics. 2003;12(19):2559–2567.
[34] Laurent AM, Puechberty J, Prades C, Gimenez S, Roizès G. Site-specific retrotransposition of L1 elements within human alphoid satellite sequences. Genomics. 1997;46(1):127–132.
[35] Khodarev NN, Bennett T, Shearing N, et al. LINE L1 retrotransposable element is targeted during the initial stages of apoptotic DNA fragmentation. Journal of Cellular Biochemistry. 2000;79(3):486–495.
[36] Raschke S, Balz V, Efferth T, Schulz WA, Florl AR. Homozygous deletions of CDKN2A caused by alternative mechanisms in various human cancer cell lines. Genes, Chromosomes and Cancer. 2005;42(1):58–67.
[37] Li TH, Schmid CW. Differential stress induction of individual Alu loci: implications for transcription and retrotransposition. Gene. 2001;276(1-2):135–141.
[38] Hagan CR, Rudin CM. Mobile genetic element activation and genotoxic cancer therapy: potential clinical implications. American Journal of PharmacoGenomics. 2002;2(1):25–35.
[39] Banerjee G, Gupta N, Tiwari J, Raman G. Ultraviolet-induced transformation of keratinocytes: possible involvement of long interspersed element-1 reverse transcriptase. Photodermatology, Photoimmunology & Photomedicine. 2005;21(1):32–39.
[40] Seemayer CA, Distler O, Kuchen S, et al. Rheumatoide Arthritis: Neue Entwicklungen in der Pathogenese unter besonderer Berücksichtigung der synovialen Fibroblasten [Rheumatoid arthritis: new developments in the pathogenesis with special reference to synovial fibroblasts]. Zeitschrift für Rheumatologie. 2001;60(5):309–318.
[41] Neidhart M, Rethage J, Kuchen S, et al. Retrotransposable L1 elements expressed in rheumatoid arthritis synovial tissue: association with genomic DNA hypomethylation and influence on gene expression. Arthritis & Rheumatism. 2000;43(12):2634–2647.
[42] Kuchen S, Seemayer CA, Rethage J, et al. The L1 retroelement-related p40 protein induces p38delta MAP kinase. Autoimmunity. 2004;37(1):57–65.
[43] Schulz WA, Steinhoff C, Florl AR. Methylation of endogenous human retroelements in health and disease. Current Topics in Microbiology and Immunology. 2006;310, in press.
[44] Armbruester V, Sauter M, Krautkraemer E, et al. A novel gene from the human endogenous retrovirus K expressed in transformed cells. Clinical Cancer Research. 2002;8(6):1800–1807.
[45] Schiavetti F, Thonnard J, Colau D, Boon T, Coulie PG. A human endogenous retroviral sequence encoding an antigen recognized on melanoma by cytolytic T lymphocytes. Cancer Research. 2002;62(19):5510–5516.
[46] Alves G, Kawamura MT, Nascimento P, et al. DNA release by line-1 (L1) retrotransposon. Could it be possible? Annals of the New York Academy of Sciences. 2000;906:129–133.
[47] Shaffer LG, Lupski JR. Molecular mechanisms for constitutional chromosomal rearrangements in humans. Annual Review of Genetics. 2000;34:297–329.
[48] Ovchinnikov I, Rubin A, Swergold GD. Tracing the LINEs of human evolution. Proceedings of the National Academy of Sciences of the United States of America. 2002;99(16):10522–10527.
[49] Bratthauer GL, Fanning TG. Active LINE-1 retrotransposons in human testicular cancer. Oncogene. 1992;7(3):507–510.
[50] Skowronski J, Fanning TG, Singer MF. Unit-length line-1 transcripts in human teratocarcinoma cells. Molecular and Cellular Biology. 1988;8(4):1385–1397.
[51] Dante R, Dante-Paire J, Rigal D, Roizes G. Methylation patterns of long interspersed repeated DNA and alphoid repetitive DNA from human cell lines and tumors. Anticancer Research. 1992;12(2):559–563.
[52] Chalitchagorn K, Shuangshoti S, Hourpai N, et al. Distinctive pattern of LINE-1 methylation level in normal tissues and the association with carcinogenesis. Oncogene. 2004;23(54):8841–8846.
[53] Florl AR, Löwer R, Schmitz-Dräger BJ, Schulz WA. DNA methylation and expression of LINE-1 and HERV-K provirus sequences in urothelial and renal cell carcinomas. British Journal of Cancer. 1999;80(9):1312–1321.
[54] Jurgens B, Schmitz-Drager BJ, Schulz WA. Hypomethylation of L1 LINE sequences prevailing in human urothelial carcinoma. Cancer Research. 1996;56(24):5698–5703.
[55] Schulz WA, Elo JP, Florl AR, et al. Genomewide DNA hypomethylation is associated with alterations on chromosome 8 in prostate carcinoma. Genes, Chromosomes and Cancer. 2002;35(1):58–65.
[56] Florl AR, Steinhoff C, Müller M, et al. Coordinate hypermethylation at specific genes in prostate carcinoma precedes LINE-1 hypomethylation. British Journal of Cancer. 2004;91(5):985–994.
[57] Santourlidis S, Florl AR, Ackermann R, Wirtz HC, Schulz WA. High frequency of alterations in DNA methylation in adenocarcinoma of the prostate. The Prostate. 1999;39(3):166–174.
[58] Takai D, Yagi Y, Habib N, Sugimura T, Ushijima T. Hypomethylation of LINE1 retrotransposon in human hepatocellular carcinomas, but not in surrounding liver cirrhosis. Japanese Journal of Clinical Oncology. 2000;30(7):306–309.
[59] Lin C-H, Hsieh S-Y, Sheen I-S, et al. Genome-wide hypomethylation in hepatocellular carcinogenesis. Cancer Research. 2001;61(10):4238–4243.
[60] Suter CM, Martin DI, Ward RL. Hypomethylation of L1 retrotransposons in colorectal cancer and adjacent normal tissue. International Journal of Colorectal Disease. 2004;19(2):95–101.
[61] Kaneda A, Tsukamoto T, Takamura-Enya T, et al. Frequent hypomethylation in multiple promoter CpG islands is associated with global hypomethylation, but not with frequent promoter hypermethylation. Cancer Science. 2004;95(1):58–64.
[62] Menendez L, Benigno BB, McDonald JF. L1 and HERV-W retrotransposons are hypomethylated in human ovarian carcinomas. Molecular Cancer. 2004;3(1):12.
[63] Jones PA, Baylin SB. The fundamental role of epigenetic events in cancer. Nature Reviews. Genetics. 2002;3(6):415–428.
[64] Ehrlich M. DNA methylation in cancer: too much, but also too little. Oncogene. 2002;21(35):5400–5413.
[65] Hoffmann MJ, Schulz WA. Causes and consequences of DNA hypomethylation in human cancer. Biochemistry and Cell Biology. 2005;83(3):296–321.
[66] Yang AS, Estécio MR, Doshi K, Kondo Y, Tajara EH, Issa J-PJ. A simple method for estimating global DNA methylation using bisulfite PCR of repetitive DNA elements. Nucleic Acids Research. 2004;32(3):e38.
[67] Smiraglia DJ, Szymanska J, Kraggerud SM, Lothe RA, Peltomäki P, Plass C. Distinct epigenetic phenotypes in seminomatous and nonseminomatous testicular germ cell tumors. Oncogene. 2002;21(24):3909–3916.
[68] Oosterhuis JW, Looijenga LH. Testicular germ-cell tumours in a broader perspective. Nature Reviews. Cancer. 2005;5(3):210–222.
[69] Alves G, Tatro A, Fanning TG. Differential methylation of human LINE-1 retrotransposons in malignant cells. Gene. 1996;176(1-2):39–44.
[70] Gotzinger N, Sauter M, Roemer K, Mueller-Lantzsch N. Regulation of human endogenous retrovirus-K Gag expression in teratocarcinoma cell lines and human tumours. Journal of General Virology. 1996;77(pt 12):2983–2990.
[71] Burden AF, Manley NC, Clark AD, Gartler SM, Laird CD, Hansen RS. Hemimethylation and non-CpG methylation levels in a promoter region of human LINE-1 (L1) repeated elements. The Journal of Biological Chemistry. 2005;280(15):14413–14419.
[72] Weber M, Davies JJ, Wittig D, et al. Chromosome-wide and promoter-specific analyses identify sites of differential DNA methylation in normal and transformed human cells. Nature Genetics. 2005;37(8):853–862.
[73] Turker MS. Gene silencing in mammalian cells and the spread of DNA methylation. Oncogene. 2002;21(35):5388–5393.
[74] Li E. Chromatin modification and epigenetic reprogramming in mammalian development. Nature Reviews. Genetics. 2002;3(9):662–673.
[75] Bestor TH. The DNA methyltransferases of mammals. Human Molecular Genetics. 2000;9(16):2395–2402.
[76] Bourc’his D, Bestor TH. Meiotic catastrophe and retrotransposon reactivation in male germ cells lacking Dnmt3L. Nature. 2004;431(7004):96–99.
[77] Huang J, Fan T, Yan Q, et al. Lsh, an epigenetic guardian of repetitive elements. Nucleic Acids Research. 2004;32(17):5019–5028.
[78] Gibbons RJ, McDowell TL, Raman S, et al. Mutations in ATRX, encoding a SWI/SNF-like protein, cause diverse changes in the pattern of DNA methylation. Nature Genetics. 2000;24(4):368–371.
[79] Ferreira R, Naguibneva I, Pritchard LL, Ait-Si-Ali S, Harel-Bellan A. The Rb/chromatin connection and epigenetic control: opinion. Oncogene. 2001;20(24):3128–3133.
[80] Geiman TM, Robertson KD. Chromatin remodeling, histone modifications, and DNA methylation - how does it all fit together? Journal of Cellular Biochemistry. 2002;87(2):117–125.
[81] Lund AH, van Lohuizen M. Epigenetics and cancer. Genes & Development. 2004;18(19):2315–2335.
[82] Fraga MF, Ballestar E, Villar-Garea A, et al. Loss of acetylation at Lys16 and trimethylation at Lys20 of histone H4 is a common hallmark of human cancer. Nature Genetics. 2005;37(4):391–400.
[83] Seligson DB, Horvath S, Shi T, et al. Global histone modification patterns predict risk of prostate cancer recurrence. Nature. 2005;435(7046):1262–1266.
[84] Martens JH, O’Sullivan RJ, Braunschweig U, et al. The profile of repeat-associated histone lysine methylation states in the mouse epigenome. The EMBO Journal. 2005;24(4):800–812.
[85] Wang-Johanning F, Frost AR, Jian B, et al. Detecting the expression of human endogenous retrovirus E envelope transcripts in human prostate adenocarcinoma. Cancer. 2003;98(1):187–197.
[86] Büscher K, Trefzer U, Hofmann M, Sterry W, Kurth R, Denner J. Expression of human endogenous retrovirus K in melanomas and melanoma cell lines. Cancer Research. 2005;65(10):4172–4180.
[87] Sugimoto J, Matsuura N, Kinjo Y, Takasu N, Oda T, Jinno Y. Transcriptionally active HERV-K genes: identification, isolation, and chromosomal mapping. Genomics. 2001;72(2):137–144.
[88] Yi J-M, Kim H-M, Kim H-S. Expression of the human endogenous retrovirus HERV-W family in various human tissues and cancer cells. Journal of General Virology. 2004;85(pt 5):1203–1210.
[89] Ergün S, Buschmann C, Heukeshoven J, et al. Cell type-specific expression of LINE-1 open reading frames 1 and 2 in fetal and adult human tissues. The Journal of Biological Chemistry. 2004;279(26):27753–27763.
[90] Morse B, Rothberg PG, South VJ, Spandorfer JM, Astrin SM. Insertional mutagenesis of the myc locus by a LINE-1 sequence in a human breast carcinoma. Nature. 1988;333(6168):87–90.
[91] Miki Y, Nishisho I, Horii A, et al. Disruption of the APC gene by a retrotransposal insertion of L1 sequence in a colon cancer. Cancer Research. 1992;52(3):643–645.
[92] Morrish TA, Gilbert N, Myers JS, et al. DNA repair mediated by endonuclease-independent LINE-1 retrotransposition. Nature Genetics. 2002;31(2):159–165.
[93] Liu J, Nau MM, Zucman-Rossi J, Powell JI, Allegra CJ, Wright JJ. LINE-I element insertion at the t(11;22) translocation breakpoint of a desmoplastic small round cell tumor. Genes, Chromosomes and Cancer. 1997;18(3):232–239.
[94] Florl AR, Franke KH, Niederacher D, Gerharz CD, Seifert HH, Schulz WA. DNA methylation and the mechanisms of CDKN2A inactivation in transitional cell carcinoma of the urinary bladder. Laboratory Investigation. 2000;80(10):1513–1522.
[95] Qu GZ, Grundy PE, Narayan A, Ehrlich M. Frequent hypomethylation in Wilms tumors of pericentromeric DNA in chromosomes 1 and 16. Cancer Genetics and Cytogenetics. 1999;109(1):34–39.
[96] Wong N, Lam WC, Lai PB, Pang E, Lau WY, Johnson PJ. Hypomethylation of chromosome 1 heterochromatin DNA correlates with q-arm copy gain in human hepatocellular carcinoma. The American Journal of Pathology. 2001;159(2):465–471.
[97] Widschwendter M, Jiang G, Woods C, et al. DNA hypomethylation and ovarian cancer biology. Cancer Research. 2004;64(13):4472–4480.
[98] Mimori K, Druck T, Inoue H, et al. Cancer-specific chromosome alterations in the constitutive fragile region FRA3B. Proceedings of the National Academy of Sciences of the United States of America. 1999;96(13):7456–7461.
[99] Florl AR, Schulz WA. Peculiar structure and location of 9p21 homozygous deletion breakpoints in human cancer cells. Genes, Chromosomes and Cancer. 2003;37(2):141–148.
[100] Cox C, Bignell G, Greenman C, et al. A survey of homozygous deletions in human cancer genomes. Proceedings of the National Academy of Sciences of the United States of America. 2005;102(12):4542–4547.
[101] Jones RS, Potter SS. L1 sequences in HeLa extrachromosomal circular DNA: evidence for circularization by homologous recombination. Proceedings of the National Academy of Sciences of the United States of America. 1985;82(7):1989–1993.
[102] Huang H, Qian J, Proffit J, Wilber K, Jenkins R, Smith DI. FRA7G extends over a broad region: coincidence of human endogenous retroviral sequences (HERV-H) and small polydispersed circular DNAs (spcDNA) and fragile sites. Oncogene. 1998;16(18):2311–2319.
[103] Sasaki S, Kitagawa Y, Sekido Y, et al. Molecular processes of chromosome 9p21 deletions in human cancers. Oncogene. 2003;22(24):3792–3798.
[104] Varga T, Aplan PD. Chromosomal aberrations induced by double strand DNA breaks. DNA Repair. 2005;4(9):1038–1046.
[105] Viel A, Petronzelli F, Della Puppa P, et al. Different molecular mechanisms underlie genomic deletions in the MLH1 Gene. Human Mutation. 2002;20(5):368–374.
[106] Adams J, Williams SV, Aveyard JS, Knowles MA. Loss of heterozygosity analysis and DNA copy number measurement on 8p in bladder cancer reveals two mechanisms of allelic loss. Cancer Research. 2005;65(1):66–75.
[107] Calin GA, Sevignani C, Dumitru CD, et al. Human microRNA genes are frequently located at fragile sites and genomic regions involved in cancers. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(9):2999–3004.
[108] Kondo Y, Issa JP. Enrichment for histone H3 lysine 9 methylation at Alu repeats in human cells. The Journal of Biological Chemistry. 2003;278(30):27658–27662.
[109] Haoudi A, Semmes OJ, Mason JM, Cannon RE. Retrotransposition-competent human LINE-1 induces apoptosis in cancer cells with intact p53. Journal of Biomedicine and Biotechnology. 2004;2004(4):185–194.

Hindawi Publishing Corporation Journal of Biomedicine and Biotechnology Volume 2006, Article ID 17142, Pages 1–6 DOI 10.1155/JBB/2006/17142

Research Article

LINE-1 Hypomethylation in a Choline-Deficiency-Induced Liver Cancer in Rats: Dependence on Feeding Period

Kiyoshi Asada,1, 2 Yashige Kotake,1 Rumiko Asada,1 Deborah Saunders,1 Robert H. Broyles,1, 3 Rheal A. Towner,1 Hiroshi Fukui,2 and Robert A. Floyd1, 3

1 Free Radical Biology and Aging Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
2 Third Department of Internal Medicine, Nara Medical University, Kashihara, Nara 634-8521, Japan
3 Department of Biochemistry and Molecular Biology, University of Oklahoma Health Sciences Center, Oklahoma City, OK 73104, USA

Received 8 June 2005; Revised 28 November 2005; Accepted 4 December 2005

Chronic feeding of a diet deficient in methyl donors (methionine, choline, folic acid, and vitamin B12) induces hepatocellular carcinoma formation in rats. Previous studies have shown that promoter CpG islands in various cancer-related genes are aberrantly methylated in this model. Moreover, the genome as a whole in livers of rats fed a methyl-donor-deficient diet contains less 5-methylcytosine than that in control livers. It is speculated that more than 90% of all 5-methylcytosines lie within the CpG islands of the transposons, including the long/short interspersed nucleotide elements (LINE and SINE). It is considered that the 5-methylcytosines in LINE-1 limit the ability of retrotransposons to be activated and transcribed; therefore, the extent of hypomethylation of LINE-1 could be a surrogate marker for aberrant methylation in other tumor-related genes as well as for genome instability. Additionally, LINE-1 methylation status has been shown to be a good indicator of genome-wide methylation. In this study, we determined cytosine methylation status in the LINE-1 repetitive sequences of rats fed a choline-deficient (CD) diet for various durations and compared these with rats fed a choline-sufficient (CS) diet. The methylation status of LINE-1 was assessed by the combined bisulfite restriction analysis (COBRA) method, where the amount of bisulfite-modified and RsaI-cleaved DNA was quantified using gel electrophoresis. Progressive hypomethylation was observed in LINE-1 of CD livers as a function of feeding time; that is, the proportion of unmethylated cytosine in total cytosine (methylated plus unmethylated) increased from 11.1% (1 week) to 19.3% (56 weeks), whereas in the control CS livers, it increased from 9.2% to 12.9%. Hypomethylation in tumor tissues was slightly higher (6%) than in the nontumorous surrounding tissue. The present result also indicates that age is a factor influencing the extent of cytosine methylation.

Copyright © 2006 Kiyoshi Asada et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

When rats are chronically fed a diet devoid of a methyl-donor source (choline, methionine, folic acid, and vitamin B12), they spontaneously develop hepatocellular carcinomas (HCCs) [1–5]. This is a unique carcinogenesis model in which no known carcinogen is involved. After the initial discovery [6] of this model, many questions were raised concerning its validity, including the possibility of carcinogenic contaminants in the diet. Later, it was definitively demonstrated that diets lacking in methionine and choline and containing no detectable level of carcinogens acted as a complete carcinogen [7, 8]. In spite of extensive phenomenological studies, the mechanism by which dietary methyl-donor deficiency causes HCC formation is not understood, but it is suggested that various concurrent carcinogenic pathways may be involved [4]. Oxidative stress appears to play a major role in this model [4] because there is a significant increase in oxidized DNA (8-hydroxy-deoxyguanosine) levels from day 1 of feeding [4] and because antioxidant cofeeding inhibits cancer formation [9]. Nevertheless, no direct connection between oxidative stress and carcinogenesis has been elucidated.

Genome-wide demethylation of 5-methylcytosine has been regarded as a common epigenetic event in malignancies and may play a crucial role in carcinogenesis. In the rat methyl-donor deficiency models, promoter CpG islands in several cancer-related genes are known to be aberrantly methylated, as noted by changes in DNA 5-methylcytosine content [10]. Methyl donors including choline and methionine are required for S-adenosyl methionine (SAM) biosynthesis [11], and SAM is the substrate for DNA cytosine methyltransferase, the enzyme responsible for maintaining

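[Figure 1 diagram: LINE-1 element drawn as 5’UTR, ORF I, ORF II, 3’UTR. Within the 5’UTR, the bisulfite PCR amplicon runs from tttggtgagtttgggata to gttaggtgggtatttttgag and contains an RsaI site (GT ACG) that divides it into 48 bp and 115 bp fragments.]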

Figure 1: Genomic structure of LINE-1 in rats. Total length is 6 kb, comprising the 5’UTR, ORF I, ORF II, and the 3’UTR. Bisulfite PCR was conducted in the 5’UTR sequence. The primer sequences are shown with arrows.
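The primer and fragment information in Figure 1 can be cross-checked in silico. The sketch below is illustrative only: the helper names are invented here, and reading the figure's "GT ACG" as an RsaI site (GTAC) followed by the G of a CpG is an assumption. It checks that the reverse complement of L1bisR equals the right-hand sequence drawn in the figure, that the two RsaI fragments sum to the 163 bp amplicon, and that, under standard bisulfite chemistry (unmethylated C read as T, methylated CpG cytosine retained), such a site would survive conversion only when its CpG is methylated. The last point agrees with the Results text, where the SssI-methylated control leaves only 5.9–6.1% undigested product.

```python
# Sketch only: sequences are copied from Figure 1 and the Methods text;
# helper names are hypothetical and not taken from the source.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def bisulfite_convert(seq: str, methylated_cpg: bool) -> str:
    """Read every C as T, except CpG cytosines when they are methylated."""
    out = []
    for i, base in enumerate(seq):
        in_cpg = base == "C" and seq[i + 1:i + 2] == "G"
        if base == "C" and not (in_cpg and methylated_cpg):
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


L1BIS_F = "TTTGGTGAGTTTGGGATA"        # forward primer (Methods)
L1BIS_R = "CTCAAAAATACCCACCTAAC"      # reverse primer (Methods)
FIGURE_RIGHT_END = "gttaggtgggtatttttgag".upper()  # right-hand end in Figure 1

RSAI_SITE = "GTAC"
SITE_CONTEXT = "GTACG"  # assumed reading of "GT ACG" in the figure

reverse_primer_target = reverse_complement(L1BIS_R)
amplicon_length = 48 + 115  # the two RsaI fragments from the figure
cut_if_methylated = RSAI_SITE in bisulfite_convert(SITE_CONTEXT, True)
cut_if_unmethylated = RSAI_SITE in bisulfite_convert(SITE_CONTEXT, False)

print(reverse_primer_target == FIGURE_RIGHT_END, amplicon_length)
print(cut_if_methylated, cut_if_unmethylated)
```

Neither primer target contains a cytosine on the amplified strand, which is what one would expect for primers designed against bisulfite-converted DNA.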

DNA cytosine methylation. Therefore, it is possible to speculate that the dysregulation of Dnmt activity is a cause of the genome-wide decrease of 5-methylcytosine and aberrant methylation of specific genes. However, the selective nature of the presence of aberrantly methylated genes suggests that other factors are also involved. More than 90% of all 5-methylcytosines lie within the CpG islands in the transposons, including long/short interspersed nucleotide elements (LINE and SINE). The presence of 5-methylcytosine in LINE-1 is considered to limit the ability of retrotransposons to be activated and transcribed; therefore, loss of 5-methylcytosine in LINE-1 could result in an increase in retrotransposon activity, leading to propagation of aberrant methylation to other genes [12, 13]. Genome-wide instability inevitably results from hypomethylation. LINE-1 methylation status has also been shown to be a good indicator of genome-wide methylation [14, 15]. In humans, LINE-1 hypomethylation was demonstrated in patients having various cancers [16–18].

In the present study, using the combined bisulfite restriction analysis (COBRA) method [14, 19], we evaluated the amount of cytosine/5-methylcytosine in the LINE-1 repetitive sequence in rats fed a choline-deficient (CD) diet for various times.

MATERIAL AND METHODS

Animals and diets

Rats were treated strictly following the animal use protocol approved by the Institutional Animal Care and Use Committee in the Oklahoma Medical Research Foundation. Weaned male Fischer 344 rats were obtained from Charles River (Indianapolis, IN, USA), divided into 2 groups each containing 3–5 animals, and fed a choline sufficient (CS) or a choline deficient (CD) diet (Dyets Inc Bethlehem, IN, USA). Feeding periods were 1, 4, 24, and 56 weeks. In these experiments, protein in the diet was substituted by a defined amino acid diet (CSAA or CDAA diet), because the CDAA diet has been shown to considerably accelerate carcinogenesis without altering cancer pathology as compared with the conventional CD (Lombardi’s CD diet) [20].

Sequence of LINE-1

The LINE-1 sequence of interest in this study is illustrated in Figure 1. The consensus sequence revealed that LINE-1 elements have a 5’ untranslated region (UTR) with internal promoter activity, two open reading frames (ORFs), a 3’UTR that ends in an AATAAA polyadenylation signal, and a poly A tail [21, 22].

Combined bisulfite restriction analysis

COBRA is a simple method of CpG methylation analysis which utilizes the cleaving ability of the restriction enzyme RsaI specifically at bisulfite-modified CpG sites [14]. Genomic DNA from rat liver tissues was extracted using a Qiagen genomic DNA extraction kit (Qiagen, Valencia, CA, USA). Bisulfite modification of genomic DNA was performed as follows: 3 µg of DNA, digested with the restriction enzyme EcoRI, was incubated with 0.3 N NaOH in a volume of 20 µL for 15 minutes, and then combined with a 120 µL portion of 3.6 M sodium bisulfite (Sigma, St Louis, MO, USA)/0.6 mM hydroquinone (Sigma) (adjusted to pH 5.0 with NaOH). The bisulfite reaction was performed in a thermocycler (Perkin Elmer 9600, Boston, MA, USA) with 15 cycles of 95°C for 30 seconds followed by 50°C for 15 minutes. The samples were desalted with a Wizard DNA Clean-Up System (Promega, Madison, WI, USA) and desulfonated by a 5-minute incubation in 0.3 N NaOH. Bisulfite-modified DNA was PCR-amplified with custom-synthesized primers (Molecular Biology Resource Facility, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA). The primer sequences, which correspond to the nucleotides in the regulatory region of the LINE-1 sequence (GenBank: U87600), are as follows: L1bisF, 5’-TTT GGT GAG TTT GGG ATA-3’; L1bisR, 5’-CTC AAA AAT ACC CAC CTA AC-3’. The PCR conditions were 30 cycles of 94°C for 30 seconds, 55°C for 30 seconds, and 72°C for 30 seconds. Subsequently, the PCR product was digested with 10 units of RsaI at 37°C for 5 hours, isolated with ethanol-chloroform precipitation, and subjected to polyacrylamide gel electrophoresis. The gel was stained with ethidium bromide, and the band intensity in the fluorogram was analyzed with an imaging workstation (NucleoTech Corp, Hayward, CA, USA). The data for densitometric analysis are presented as mean ± SE. Tests for statistical significance were evaluated using Student’s t test.

RNA extraction and semiquantitative RT-PCR

Total RNA was extracted from liver tissues of CS- or CD-diet fed rats with Qiagen RNeasy kit (Qiagen, Crawley, UK). After

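[Figure 2 gel images, panels (a) and (b). Lanes: M-control and rats 1–5, each run undigested (−) and RsaI-digested (+). Band sizes: 163, 115, and 48 bp.]

Percentages printed under the lanes (M-control, 1, 2, 3, 4, 5):
(a) 5.9, 8.6, 7.4, 9.5, 12.5, 7.3
(b) 6.1, 13.3, 12.8, 16.0, 14.6, 15.0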

Figure 2: Typical COBRA for LINE-1 cytosine methylation status in the livers of rats fed CD or CS diet for 4 weeks. Numbers shown at the bottom of each lane indicate the percentage of unmethylated cytosine (115 bp band + 48 bp band) versus total cytosine (163 bp band). Panel A: CS livers (#1–#5) after 4-week feeding; Panel B: CD livers (#1–#5) after 4-week feeding. Isolated genomic DNA was treated with sodium bisulfite, PCR-amplified with LINE-1 primers, and divided into two portions. One portion was digested with RsaI (designated with +) and the other was left undigested (designated with −); both were then run on the gel and stained with ethidium bromide. The band at 163 bp was identified as the LINE-1 sequence and those at 115 bp and 48 bp were identified as RsaI-digested LINE-1. The numbers at the top of each pair of lanes indicate the rat identification numbers. Lanes designated M-control are COBRA-treated samples obtained from SssI-hypermethylated rat DNA.
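The lane percentages in Figure 2 come from densitometry of the three bands. A minimal sketch of that arithmetic follows, assuming each percentage is one band class divided by the summed intensity of all three bands in the RsaI-digested lane. The source does not give the exact formula, the function name is hypothetical, and the intensities in the example are invented for illustration. Because the source is not fully explicit about which share is reported as unmethylated, the function returns both the digested and the undigested share.

```python
# Sketch only: formula and example intensities are assumptions, not source data.

def band_shares(i163: float, i115: float, i48: float) -> dict:
    """Return digested and undigested shares (%) of one RsaI-digested lane.

    i163 is the intensity of the uncut 163 bp band; i115 and i48 are the
    intensities of the two RsaI fragments.
    """
    total = i163 + i115 + i48
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    digested = 100.0 * (i115 + i48) / total
    return {
        "digested_pct": round(digested, 1),
        "undigested_pct": round(100.0 - digested, 1),
    }


# Invented densitometry readings for a single lane (arbitrary units).
example = band_shares(i163=900.0, i115=70.0, i48=30.0)
print(example)
```

A real analysis might also correct for fragment length, since ethidium bromide signal scales with the amount of DNA in a band; the source does not say whether such a correction was applied.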

RNA samples were treated with DNase I (Invitrogen, Carlsbad, CA, USA), RT-PCR was performed with the SuperScript III One-Step RT-PCR System (Invitrogen, Carlsbad, CA, USA). The common GenBank accession number of the ORF I and ORF II sequences in LINE-1 is DQ100473. PCR was performed under the following conditions: annealing temperatures were 60°C for ORF I and 55°C for ORF II, cycle number was 23, and primer sequences were ORF I forward, 5’-AAG AAA CAC CTC CCG TCA CA-3’; ORF I reverse, 5’-CCT CCT TAT GTT GGG CTT TAC C-3’; ORF II forward, 5’-CCC ACT CTC TCC CTA CTT A-3’; and ORF II reverse, 5’-TAT AGA GGA AGG CAA CTG AT-3’. The expression of the glyceraldehyde-3-phosphate dehydrogenase gene (GAPDH) was used to normalize the transcript band intensity.

RESULTS

LINE-1 COBRA in artificially methylated DNA

To confirm the accuracy of COBRA, control genomic DNA obtained from rats fed a regular diet was enzymatically methylated with SssI methylase (Sigma) in the presence of the substrate S-adenosylmethionine, and its LINE-1 promoter CpG island methylation was determined with COBRA. The results indicate that there was 5.9–6.1% undigested DNA (Figure 2, 163 bp band in the lanes marked with M-control +), suggesting that the amount of mutated DNA plus SssI methylation resistant cytosine is within this level.

LINE-1 COBRA in CD livers

LINE-1 methylation was analyzed with COBRA in CD and CS livers after various feeding periods, including 1, 4, 24, and 56 weeks. Figure 2 illustrates typical COBRA fluorograms obtained from livers of rats fed either CD or CS diet for 4 weeks. There was a clear tendency for the CD diet to promote hypomethylation in LINE-1 during the entire feeding period (1 week to 56 weeks). The amount of unmethylated cytosine in LINE-1 ranged from 9.2% at 1 week to 12.9% at 56 weeks in CS livers, while in the CD livers, it ranged from 11.1% at 1 week to 19.3% at 56 weeks (Figure 3). There was a statistically significant difference between CS and CD livers at 4 and 56 weeks of feeding (Figure 3).

LINE-1 hypomethylation in tumor and nontumor tissues

After 24 weeks of feeding the CD diet, rats began to have tumor nodules which were histologically identified as adenomas [19], and at 56 weeks, most of these tumor nodules had developed into HCC. COBRA of LINE-1 DNA for tumor and nontumor tissues showed a tendency for the DNA in tumor tissues to be more hypomethylated than that in nontumor tissues; however, a statistically significant difference was obtained only in the 56-week fed animals (Figure 4).

LINE-1 transcript expression

The expression of LINE-1 (ORF1 and ORF2) gene transcript was assessed with semiquantitative RT-PCR. Densitometric analysis indicated (in arbitrary units) (1) for ORF I, CS (8 weeks → 16 weeks): 1.60 ± 0.42 → 1.35 ± 0.49, and CD (8 weeks → 16 weeks): 1.40 ± 0.42 → 1.90 ± 0.28, and (2) for ORF II, CS (8 weeks → 16 weeks): 1.45 ± 0.21 → 1.35 ± 0.07, and CD (8 weeks → 16 weeks): 0.75 ± 0.21 → 0.70 ± 0.14. The comparison of the numbers of 8 weeks and 16 weeks in each group indicates that there was no increase in LINE-1 transcript

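[Figure 3 bar graph: unmethylated LINE-1 (%) on the y-axis (0–25) versus feeding period (1, 4, 24, and 56 weeks) on the x-axis, for CSAA and CDAA diets; bars carry the significance symbols listed in the table.]

Unmethylated LINE-1, % ± SE

Diet     1 week (n = 3)    4 weeks (n = 5)       24 weeks (n = 5)    56 weeks (n = 5)
CSAA     9.2 ± 1.0         9.1 ± 1.0             14.2 ± 1.6 (c)      12.9 ± 1.6
CDAA     11.1 ± 0.7        14.4 ± 0.6 (a, b)     18.0 ± 1.1 (d)      19.3 ± 0.6 (a, d)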
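The 4-week column of this table can be approximately reproduced from the lane percentages printed under Figure 2. The sketch below is a hedged illustration: it assumes the five non-control lanes of each panel are rats #1–#5, and it assumes a pooled-variance, two-sided Student t test, which the Methods do not specify beyond "Student t test". The function names are hypothetical.

```python
# Sketch only: lane values are read from the Figure 2 residue; the test
# variant (pooled variance, two-sided) is an assumption.
from math import sqrt
from statistics import mean, stdev

cs_4wk = [8.6, 7.4, 9.5, 12.5, 7.3]      # panel (a), rats 1-5
cd_4wk = [13.3, 12.8, 16.0, 14.6, 15.0]  # panel (b), rats 1-5


def mean_se(values):
    """Return (mean, standard error of the mean) using the sample SD."""
    return mean(values), stdev(values) / sqrt(len(values))


def student_t(a, b):
    """Pooled-variance t statistic for mean(b) - mean(a)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(b) - mean(a)) / sqrt(pooled_var * (1 / na + 1 / nb))


cs_mean, cs_se = mean_se(cs_4wk)
cd_mean, cd_se = mean_se(cd_4wk)
t_stat = student_t(cs_4wk, cd_4wk)

# Two-sided 5% critical value of Student's t with 8 degrees of freedom.
T_CRIT_DF8 = 2.306

print(f"CS {cs_mean:.2f} +/- {cs_se:.2f}, CD {cd_mean:.2f} +/- {cd_se:.2f}")
print(f"t = {t_stat:.2f}, significant at 5%: {abs(t_stat) > T_CRIT_DF8}")
```

The lane values give about 9.1 ± 1.0 for CS and 14.3 ± 0.6 for CD, close to the tabulated 9.1 ± 1.0 and 14.4 ± 0.6; the small difference in the CD mean is presumably due to rounding of the individual lane percentages.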

Figure 3: Percentage of unmethylated cytosine versus total (methylated plus unmethylated) cytosine in LINE-1 sequences in the livers of CSAA- and CDAA-diet fed rats. Actual data are shown in the table under the graph. Symbols: (a) significantly different from CSAA at the same feeding period; (b) significantly different from CDAA at 1 week; (c) significantly different from CSAA at 1 and 4 weeks; (d) significantly different from CDAA at 1 and 4 weeks. % ± SE; SE, standard error.

expression. However, these results may have marginal statistical significance because of the small number of samples (N = 2 for each group).

DISCUSSION

Chronic CD diet is hepatocarcinogenic in male rats and global hypomethylation has been shown to exist from early feeding times [10, 23]. In many cancers, global hypomethylation as well as hypo- or hypermethylation in specific genes are widely accepted epigenetic changes [24]; however, which gene or DNA region is responsible for aberrant methylation, especially in the methyl-deficient diet models, is not clear. Using a rat CD model, we applied the COBRA method to analyze LINE-1 methylation. COBRA requires only a small amount of DNA and was previously employed to analyze human LINE-1 hypomethylation [18].

We showed that the LINE-1 promoter was hypomethylated in the livers of rats fed a CD diet from as early as 4 weeks of feeding (14.4%) as compared to CS livers of the same feeding period (9.1%), and that hypomethylation increased as a function of feeding period up to 19.3% at 56 weeks (Figure 3). It is not unexpected that there was a significant increase in unmethylated cytosine (or a decrease in 5-methylcytosine) in LINE-1 of control CS livers as a function of feeding period (12.9% at 56 weeks) because it has been shown that aging is a major cause of genome-wide hypomethylation in mice and rats [25, 26]. For example, using Southern blotting, Mays-Hoopes et al estimated that there is an 8% decrease in LINE-1’s 5-methylcytosine content in the livers of 27-month old mice [26]. Furthermore, it is possible that the age-dependent increase of mutation in LINE-1 could provide false signals of hypomethylation in this assay. In humans, LINE-1 hypomethylation was detected only in HCC but not in nontumor liver cirrhosis [16]. Also, a recent report has indicated that in some cancers, such as lymphoma, renal cell carcinoma, and papillary carcinoma of the thyroid, the LINE-1 hypomethylation level is not significantly different from normal tissues [18]. The authors suggested that human cancers may be classified into two groups, a low (0–3.4%) LINE-1 hypomethylation group and a moderately high (6.8–9.5%) group [18]. We speculate that HCC in this model may mimic the situation in the low group. The same study also showed a linear correlation of methylation levels analyzed by COBRA with levels determined by using semiquantitative conventional Southern blotting hybridization analysis [18].

Florl et al reported that in human urothelial cancer, there were coordinate changes of LINE-1 and HERV-K DNA methylation, suggesting that hypomethylation affects a variety of retroelements to similar extents [12]. Thus, LINE-1 hypomethylation is thought to be one of the important surrogate markers of global hypomethylation [14, 15]. In the present study, LINE-1 hypomethylation in CD livers increased in a time-dependent fashion from 11.1% at 1 week to 19.3% at 56 weeks. The cytosine (demethylated 5-methylcytosine) level in LINE-1 in this experiment was much

[Figure 4 bar graph: unmethylated LINE-1 (%) on the y-axis (0–30) versus feeding period (24 and 56 weeks) on the x-axis, for CDAA-nontumor and CDAA-tumor tissue; an asterisk marks the 56-week tumor bar.]

Unmethylated LINE-1, % ± SE

Tissue        24 weeks (n = 5)    56 weeks (n = 5)
Nontumors     18.0 ± 1.1          19.3 ± 0.6
Tumors        23.6 ± 2.4          25.2 ± 1.9*

* Significant difference (P < .05). % ± SE; SE, standard error.

Figure 4: Percentage of unmethylated cytosine in total (methylated plus unmethylated) cytosine in LINE-1 sequences in tumor or nontumor liver tissue in CD-diet fed rats. Actual data are shown in the table under the graph. Statistically significant differences (∗) are seen between nontumor and tumor at 56 weeks.

lower than that of the global genome as determined by using an HpaII/MspI-based cytosine extension assay [23], which resulted in cytosine content increasing from 46% at 9 weeks up to 54% at 36 weeks. However, another study using HPLC analysis indicated that genome-wide unmethylated cytosine increased from 6% at 8 weeks to 11–14% at 22 weeks [10]. The cause of discrepancies in the three methods is unknown.

LINE-1 promoter methylation is thought to play an important role in transcriptional activation of retrotransposons [27]. Active retrotransposition can cause the movement of LINE-1 to anywhere within one chromosome and, as such, could disrupt tumor suppressor genes and/or activate oncogenes [28, 29]. Indeed, human colon cancer has been shown to be associated with APC retrotransposon activity [28]. In addition, aberrant LINE-1 methylation may cause specific gene modification as well as genomic instability [13, 30]. In rat methyl-deficiency models, CpG island aberrant methylation has been seen in several oncogenes and tumor-suppressor genes, such as alpha-fetoprotein (AFP) [31], c-Ha-ras [32], c-Ki-ras [32], c-fos [32], c-myc [33], Dnmt [34], glutathione S-transferase pi (GSTP) [35], p16 [36], and protein tyrosine phosphatase receptor O gene (PTPRO) [37]. Although LINE-1 hypomethylation should be followed by the increase in LINE-1 transcripts, our RT-PCR assessment did not show the increasing tendency. Previously, the similar discrepancy was reported in human liver cancer tissues [38].

In conclusion, the degree of hypomethylation in promoter CpG islands in LINE-1 repetitive sequences in the livers of rats fed a CD diet progresses as a function of feeding period. The level of LINE-1 hypomethylation was similar to that found in previous HPLC analysis for genome-wide hypomethylation. These results suggest that genome-wide hypomethylation occurs because of the choline deficiency diet and that LINE-1 methylation status is a good indicator of such. Moreover, because LINE-1 hypomethylation can activate its retrotransposon activity, it may be the root cause of aberrant methylation in several cancer-related genes in this model. This notion is yet to be proven.

ACKNOWLEDGMENT

Support from the National Cancer Institute, National Institutes of Health (R01 CA82506) is gratefully acknowledged.

REFERENCES

[1] Poirier LA. The role of methionine in carcinogenesis in vivo. Advances in Experimental Medicine and Biology. 1986;206:269–282.
[2] Perera MI, Betschart JM, Virji MA, Katyal SL, Shinozuka H. Free radical injury and liver tumor promotion. Toxicologic Pathology. 1987;15(1):51–59.
[3] Zeisel SH, da Costa KA, Albright CD, Shin OH. Choline and hepatocarcinogenesis in the rat. Advances in Experimental Medicine and Biology. 1995;375:65–74.
[4] Nakae D. Endogenous liver carcinogenesis in the rat. Pathology International. 1999;49(12):1028–1042.
[5] Ghoshal AK, Farber E. Liver biochemical pathology of choline deficiency and of methyl group deficiency: a new orientation and assessment. Histology and Histopathology. 1995;10(2):457–462.
[6] Copeland DH, Salmon WD. The occurrence of neoplasms in the liver, lungs and other tissues of rats as a result of prolonged choline deficiency. The American Journal of Pathology. 1946;22:1059–1076.
[7] Mikol YB, Hoover KL, Creasia D, Poirier LA. Hepatocarcinogenesis in rats fed methyl-deficient, amino acid-defined diets. Carcinogenesis. 1983;4(12):1619–1629.
[8] Ghoshal AK, Farber E. The induction of liver cancer by dietary deficiency of choline and methionine without added carcinogens. Carcinogenesis. 1984;5(10):1367–1370.
[9] Nakae D, Kotake Y, Kishida H, et al. Inhibition by phenyl N-tert-butyl nitrone of early phase carcinogenesis in the livers of rats fed a choline-deficient, L-amino acid-defined diet. Cancer Research. 1998;58(20):4548–4551.
[10] Wilson MJ, Shivapurkar N, Poirier LA. Hypomethylation of hepatic nuclear DNA in rats fed with a carcinogenic methyl-deficient diet. The Biochemical Journal. 1984;218(3):987–990.
[11] Shivapurkar N, Poirier LA. Tissue levels of S-adenosylmethionine and S-adenosylhomocysteine in rats fed methyl-deficient, amino acid-defined diets for one to five weeks. Carcinogenesis. 1983;4(8):1051–1057.
[12] Florl AR, Löwer R, Schmitz-Dräger BJ, Schulz WA. DNA methylation and expression of LINE-1 and HERV-K provirus sequences in urothelial and renal cell carcinomas. British Journal of Cancer. 1999;80(9):1312–1321.
[13] Schulz WA, Elo JP, Florl AR, et al. Genomewide DNA hypomethylation is associated with alterations on chromosome 8 in prostate carcinoma. Genes, Chromosomes and Cancer. 2002;35(1):58–65.
[14] Yang AS, Estécio MRH, Doshi K, Kondo Y, Tajara EH, Issa J-PJ. A simple method for estimating global DNA methylation using bisulfite PCR of repetitive DNA elements. Nucleic Acids Research. 2004;32(3):e38.
[15] Schulz WA. L1 retrotransposons in human cancers. to appear in Journal of Biomedicine & Biotechnology.
[16] Takai D, Yagi Y, Habib N, Sugimura T, Ushijima T. Hypomethylation of LINE1 retrotransposon in human hepatocellular carcinomas, but not in surrounding liver cirrhosis. Japanese Journal of Clinical Oncology. 2000;30(7):306–309.
[17] Kaneda A, Tsukamoto T, Takamura-Enya T, et al. Frequent hypomethylation in multiple promoter CpG islands is associated with global hypomethylation, but not with frequent promoter hypermethylation. Cancer Science. 2004;95(1):58–64.
[18] Chalitchagorn K, Shuangshoti S, Hourpai N, et al. Distinctive pattern of LINE-1 methylation level in normal tissues and the association with carcinogenesis. Oncogene. 2004;23(54):8841–8846.
[19] Xiong Z, Laird PW. COBRA: a sensitive and quantitative DNA methylation assay. Nucleic Acids Research. 1997;25(12):2532–2534.
[20] Nakae D, Yoshiji H, Mizumoto Y, et al. High incidence of hepatocellular carcinomas induced by a choline deficient L-amino acid defined diet in rats. Cancer Research. 1992;52(18):5042–5045.
[21] Ostertag EM, Kazazian HH Jr. Biology of mammalian L1 retrotransposons. Annual Review of Genetics. 2001;35:501–538.
[22] Deininger PL, Batzer MA. Mammalian retroelements. Genome Research. 2002;12(10):1455–1465.
[23] Pogribny IP, James SJ, Jernigan S, Pogribna M. Genomic hypomethylation is specific for preneoplastic liver in folate/methyl deficient rats and does not occur in non-target tissues. Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis. 2004;548(1-2):53–59.
[24] Jones PA, Baylin SB. The fundamental role of epigenetic events in cancer. Nature Reviews. Genetics. 2002;3(6):415–428.
[25] Dunn BK. Hypomethylation: one side of a larger picture. Annals of the New York Academy of Sciences. 2003;983:28–42.
[26] Mays-Hoopes L, Chao W, Butcher HC, Huang RC. Decreased methylation of the major mouse long interspersed repeated DNA during aging and in myeloma cells. Developmental Genetics. 1986;7(2):65–73.
[27] Steinhoff C, Schulz WA. Transcriptional regulation of the human LINE-1 retrotransposon L1.2B. Molecular Genetics and Genomics. 2003;270(5):394–402.
[28] Miki Y, Nishisho I, Horii A, et al. Disruption of the APC gene by a retrotransposal insertion of L1 sequence in a colon cancer. Cancer Research. 1992;52(3):643–645.
[29] Morse B, Rotherg PG, South VJ, Spandorfer JM, Astrin SM. Insertional mutagenesis of the myc locus by a LINE-1 sequence in a human breast carcinoma. Nature. 1988;333(6168):87–90.
[30] Rockwood LD, Felix K, Janz S. Elevated presence of retrotransposons at sites of DNA double strand break repair in mouse models of metabolic oxidative stress and MYC-induced lymphoma. Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis. 2004;548(1-2):117–125.
[31] Locker J, Hutt S, Lombardi B. alpha-Fetoprotein gene methylation and hepatocarcinogenesis in rats fed a choline-devoid diet. Carcinogenesis. 1987;8(2):241–246.
[32] Zapisek WF, Cronin GM, Lyn-Cook BD, Poirier LA. The onset of oncogene hypomethylation in the livers of rats fed methyl-deficient, amino acid-defined diets. Carcinogenesis. 1992;13(10):1869–1872.
[33] Tsujiuchi T, Tsutsumi M, Sasaki Y, Takahama M, Konishi Y. Hypomethylation of CpG sites and c-myc gene overexpression in hepatocellular carcinomas, but not hyperplastic nodules, induced by a choline-deficient L-amino acid-defined diet in rats. Japanese Journal of Cancer Research. 1999;90(9):909–913.
[34] Lopatina NG, Vanyushin BF, Cronin GM, Poirier LA. Elevated expression and altered pattern of activity of DNA methyltransferase in liver tumors of rats fed methyl-deficient diets. Carcinogenesis. 1998;19(10):1777–1781.
[35] Steinmetz KL, Pogribny IP, James SJ, Pitot HC. Hypomethylation of the rat glutathione S-transferase pi (GSTP) promoter region isolated from methyl-deficient livers and GSTP-positive liver neoplasms. Carcinogenesis. 1998;19(8):1487–1494.
[36] Pogribny IP, James SJ. De novo methylation of the p16INK4A gene in early preneoplastic liver and tumors induced by folate/methyl deficiency in rats. Cancer Letters. 2002;187(1-2):69–75.
[37] Motiwala T, Ghoshal K, Das A, et al. Suppression of the protein tyrosine phosphatase receptor type O gene (PTPRO) by methylation in hepatocellular carcinomas. Oncogene. 2003;22(41):6319–6331.
[38] Lin C-H, Hsieh S-Y, Sheen I-S, et al. Genome-wide hypomethylation in hepatocellular carcinogenesis. Cancer Research. 2001;61(10):4238–4243.

Hindawi Publishing Corporation
Journal of Biomedicine and Biotechnology
Volume 2006, Article ID 37285, Pages 1–8
DOI 10.1155/JBB/2006/37285

Review Article DNA Damage and L1 Retrotransposition

Evan A. Farkash1 and Eline T. Luning Prak1, 2

1 Department of Pathology and Laboratory Medicine, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA 2 405B Stellar Chance Labs, University of Pennsylvania, 422 Curie Boulevard, Philadelphia, PA 19104, USA

Received 22 August 2005; Accepted 16 October 2005

Barbara McClintock was the first to suggest that transposons are a source of genome instability and that genotoxic stress assisted in their mobilization. The generation of double-stranded DNA breaks (DSBs) is a severe form of genotoxic stress that threatens the integrity of the genome, activates cell cycle checkpoints, and, in some cases, causes cell death. Applying McClintock’s stress hypothesis to humans, are L1 retrotransposons, the most active autonomous mobile elements in the modern day human genome, mobilized by DSBs? Here, evidence that transposable elements, particularly retrotransposons, are mobilized by genotoxic stress is reviewed. In the setting of DSB formation, L1 mobility may be affected by changes in the substrate for L1 integration, the DNA repair machinery, or the L1 element itself. The review concludes with a discussion of the potential consequences of L1 mobilization in the setting of genotoxic stress.

Copyright © 2006 E. A. Farkash and E. T. Luning Prak. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

THE CELLULAR RESPONSE TO DNA DAMAGE IS COMPLEX

There are many chemical agents and natural processes that have the ability to damage DNA. UV light, X-rays, chemotherapeutic drugs, cigarette smoke, and even cell division have the potential to generate DNA lesions [1]. Depending on the source of DNA damage, the structure of the DNA break and its mechanism of repair may be different. Oxidative damage creates DNA double-strand breaks that are repaired by nonhomologous end joining [2]. Nucleotide base damage and dimer formation induced by UV rays during sun exposure are repaired by base excision repair [3]. Stalled replication forks in dividing cells are repaired by homologous recombination [4].

Shortly after the induction of a DSB, complex signaling pathways are activated [5]. These signaling cascades recruit DNA repair factors to DSBs, alter transcription, and trigger cell fate decisions. Significant damage may trigger cell cycle arrest, or even apoptosis. Various cellular events occurring secondary to DNA damage may affect L1 retrotransposition. Because the cellular response to genotoxic stress can vary depending on the type of lesion and cell type, the effects on L1 retrotransposition could depend on the context of DNA damage.

MOBILIZATION OF TRANSPOSABLE ELEMENTS BY DNA DAMAGE

While direct evidence for the activation of L1 retrotransposition by DNA damage is still sparse, there is a growing body of data that other mobile elements can be activated by DNA damage. Barbara McClintock initially observed Ac/Ds element transposition in response to chromosomal translocations [6, 7]. Indeed, some transposable elements, including P elements in Drosophila and the synthetic Sleeping Beauty element, appear to be activated by DNA damage and repair processes [8–10]. Mobilization is not limited to DNA transposons: various forms of DNA damage activate retrotransposition of long terminal repeat (LTR) and non-LTR retrotransposons including Gypsy and I factor in Drosophila and Ty1 in yeast [11–16]. Even closer to home for L1, transcription and retrotransposition of Alu elements are increased when cells are exposed to etoposide, a topoisomerase II inhibitor that produces DSBs [17, 18]. This is relevant to L1s because Alu elements are thought to co-opt L1 proteins for their mobilization, so increased Alu retrotransposition may reflect increased L1 mobility [19]. In a genome screen of mice exposed to gamma irradiation, new SINE and L1 insertions were detected, but it was unresolved if the frequency of new insertions was significantly different in irradiated compared to unirradiated controls [20].

[Figure 1 schematic: stages 1–5 of the L1 life cycle, drawn across the nucleus and the cytoplasm.]

Figure 1: DNA damage can affect multiple stages of the L1 life cycle. (1) Transcription of the L1 element is controlled by epigenetic factors and transcription factors. (2) L1 RNA is exported to the cytoplasm, where its copy number influences retrotransposition frequency. (3) Translation of ORF1 and ORF2 proteins. (4) L1 protein and mRNA are imported into the nucleus, where ORF2 endonuclease creates a DNA double-strand break. Induced breaks may be able to serve as alternative substrates for insertion. (5) ORF2 reverse transcribes a cDNA copy of L1 at the insertion site. Host factors are thought to inhibit or assist in resolution of the insertion. The dark square represents the cell nucleus, and the lighter surrounding square represents the cytoplasm.

OVERVIEW OF THE L1 LIFECYCLE

The L1 lifecycle provides ample opportunities for regulation by its host cell (Figure 1). A full-length RNA encoding the ORF1 and ORF2 proteins is transcribed from a retrotransposition-competent L1. L1 mRNA is exported to the cytoplasm where its encoded ORF1 and ORF2 proteins are translated. This protein-RNA complex returns to the nucleus, where the endonuclease domain of ORF2 nicks the target site. The reverse transcriptase domain of ORF2 creates a cDNA copy using the target site’s 5’ overhang as a primer. Subsequent displacement of the mRNA by a complementary strand of cDNA and ligation of the breaks are thought to require host machinery.

Any or all steps in the retrotransposition process could be affected by the cellular response to DNA damage. This review will focus on (1) alterations in the activity of the L1 element, primarily by regulation of L1 transcription; (2) alterations in L1 entry into the genome, with emphasis on insertion into pre-existing DSBs, and (3) alterations in cellular factors in response to DNA damage, in particular DNA repair machinery and its effect on L1 retrotransposition.

ALTERED ACTIVITY OF L1s IN THE SETTING OF DNA DAMAGE

The division of potential causes of L1 mobilization in the setting of DNA damage into L1-intrinsic versus extrinsic factors is admittedly an arbitrary one. It is unlikely that an “element intrinsic” property such as the level of L1 RNA or protein is altered without concomitant alterations in cellular factors that influence L1. This section is focused on L1 RNA not because RNA is necessarily the most likely point of regulation (although it is a reasonable target, as discussed below), but because there are more abundant data pointing to a potential role for regulation of L1 RNA in the context of genotoxic stress.

Altering L1 RNA may well alter the abundance of L1 proteins or ribonucleoproteins. Consistent with this idea, cells treated with etoposide exhibited increased Alu RNA, increased reverse transcriptase activity and increased retrotransposition [17, 18]. On the other hand, the expression of some proteins is not directly correlated with element RNA levels. For example, the TyA1 protein of the Ty1 LTR retrotransposon in yeast did not accumulate following exposure to gamma irradiation, whereas mRNA copy number and retrotransposition frequency did increase [15].

The notion that RNA or protein levels correlate with retrotransposition frequency, while direct, may be too simplistic and ignores other, more subtle forms of regulation. For example, the compartmentalization of L1 proteins may be affected by DNA damage. When tagged ORF1 and ORF2 proteins are expressed individually or together from a virus, they localize to the nucleolus [21]. This effect has also been seen in yeast retroelements, where a protein tagged with the Ty3 retrotransposon integrase domain is targeted to the nucleolus [22]. As Goodier and colleagues point out, L1 could traffic through the nucleolus; this idea is supported by the presence of chimeric transcripts of L1s fused to small RNA species such as U6, U3, U5, and 5S (reviewed in [21]). In the setting of DNA damage, nucleolar protein trafficking pathways are altered (reviewed in [23]). For example, PML and Mdm2 are sequestered in the nucleolus following DNA damage [24]. The sequestration of Mdm2 results in enhanced p53 stability [24]. Trafficking of L1 through the nucleolus therefore may be altered in the setting of genotoxic stress and represents a potential pathway for regulating L1 mobility.

L1 RNA LEVELS AND GENOTOXIC STRESS

Several lines of evidence suggest that L1 RNA abundance is critical and rate-limiting for L1 retrotransposition. L1 RNA is required for retrotransposition, not only because it encodes the machinery needed for L1 to retrotranspose, but because the RNA itself serves as a replication intermediate (Figure 1). That L1 transcript abundance is rate-limiting for retrotransposition is suggested by studies in cultured cells with tagged L1 elements showing that decreased L1 mRNA levels result in reduced retrotransposition [25]. Conversely, increased L1 RNA levels have been observed for highly active L1 elements [26]. Furthermore, the correlation between RNA levels and retrotransposition frequency is not unique to L1 retrotransposons: Ty1 elements in yeast appear to retrotranspose in direct proportion to the amount of Ty1 mRNA [15, 27, 28].

Given that RNA is important for L1 mobility, does DNA damage influence L1 transcript abundance? To our knowledge, there are no published data that compare L1 RNA levels in irradiated and unirradiated cells. However, there is evidence that RNA levels of other retrotransposons are influenced by DNA damage. For example, gamma radiation has been shown to increase Ty1 RNA in yeast [15] and IAP RNA in murine myeloid cells [29]. Furthermore, murine and human cell lines expressing the Bcl-2 survival gene exhibit an increase in endogenous Alu mRNA levels following exposure to gamma radiation, UV, etoposide, and cisplatin [17].

Since the induction of DNA damage has an extensive effect on the transcriptional profile of a cell [30], it is plausible that L1 RNA levels are differentially regulated following gamma radiation. One way to regulate L1 expression following DNA damage is to alter transcription factor levels or binding activity. The 5’UTR of the L1 contains an internal promoter element [31–33] with putative binding sites for SRY family members [34], YY1 [35], and RUNX3 [25]. DNA damage could modulate L1 activity by acting through factors that bind these sites.

transcripts while YY1’s effects on p53 might enhance the survival of cells that harbor new L1 insertions.

As is discussed elsewhere in this issue, L1 RNA levels can also be influenced by epigenetic regulation. Focusing here on CpG methylation as a mode of transcriptional silencing of L1s, negative regulation of L1 retrotransposition by this form of “methylation defense” predicts that L1s are methylated and that demethylation derepresses L1s. Consistent with methylation defense, the L1 5’UTR has been shown to undergo methylation and methylation has a negative effect on L1 promoter activity [42] and retrotransposition using a cultured cell assay [43]. This effect may be mediated by methyl-CpG-binding protein 2 (MeCP2), which inhibits retrotransposition in the cultured cell assay [44]. Oxidative damage has been shown to decrease the affinity of MeCP2 for damaged methylated DNA [45]. DNA damage near an L1 element may therefore release it from negative regulation.

DNA damage may also play a role in regulating global methylation of genomic L1s. Gamma radiation has been shown to induce hypomethylation in cell lines [46] and in mouse livers and spleens [47]. One potential mechanism for hypomethylation in the setting of irradiation is an alteration in the folate pool. Gamma radiation has been shown to reduce the activity of the enzyme methylenetetrahydrofolate reductase in the livers of mice [48]. A polymorphism associated with reduced activity of this enzyme has been linked to hypomethylation and gastric cancer susceptibility in humans [49]. Another possibility is that irradiation influences the expression of DNA methyltransferases. Hypomethylation in transformed cell lines has been associated with decreased expression of the DNA methyltransferases DNMT1, DNMT3a, and DNMT3b [50] and mobilization of retrotransposons has been linked to methyltransferase deficiency.
For example, gamma radiation has DNA damage may also play a role in regulating global been shown to increase Ty1 RNA in yeast [15]andIAPRNA methylation of genomic L1s. Gamma radiation has been in murine myeloid cells [29]. Furthermore, murine and hu- shown to induce hypomethylation in cell lines [46] and in man cell lines expressing the Bcl-2 survival gene exhibit an mouse livers, and spleens [47]. One potential mechanism for increase in endogenous Alu mRNA levels following exposure hypomethylation in the setting of irradiation is an alteration to gamma radiation, UV, etoposide, and cisplatin [17]. in the folate pool. Gamma radiation has been shown to re- Since the induction of DNA damage has an extensive ef- duce the activity of the enzyme methylenetetrahydrofolate fect on the transcriptional profile of a cell [30], it is plau- reductase in the livers of mice [48]. A polymorphism asso- sible that L1 RNA levels are differentially regulated follow- ciated with reduced activity of this enzyme has been linked ing gamma radiation. One way to regulate L1 expression fol- to hypomethylation and gastric cancer susceptibility in hu- lowing DNA damage is to alter transcription factor levels mans [49]. Another possibility is that irradiation influences or binding activity. The 5UTR of the L1 contains an inter- the expression of DNA methyltransferases. Hypomethyla- nal promoter element [31–33] with putative binding sites for tion in transformed cell lines has been associated with de- SRY family members [34], YY1 [35], and RUNX3 [25]. DNA creased expression of the DNA methyltransferases DNMT1, damage could modulate L1 activity by acting through factors DNMT3a, and DNMT3b [50] and mobilization of retro- that bind these sites. transposons has been linked to methyltransferase deficiency. 
Binding of the SRY family member, SOX11, to the L1 For example, methylation of the LTR retrotransposon IAP is 5UTR was shown to increase L1 retrotransposition, pro- diminished and transcription is activated in Dnmt1 deficient moter activity, and RNA copy number [34]. More recently, mouse embryos [51]. Mouse knockouts of Dnmt3L demethy- binding of SOX2 has been shown to inhibit L1 promoter ac- late genomic L1 insertions and exhibit greatly increased lev- tivity in rat hippocampal neuronal stem cells [36]. SOX2 and elsofL1mRNAintheirgermcells[52]. If by either or both SOX11 possess high-mobility group domains, which have mechanisms widespread demethylation occurs, L1s could be been shown to bind to cisplatin-DNA adducts [37]. If SRY globally activated following DNA damage. family members are differentially recruited to the sites of DNA damage, then this could alter the profile of transcrip- ALTEREDL1ENTRYINTOTHEGENOME tion factors at the L1 5 UTR. DURING DNA DAMAGE Another L1 transcription factor that may be affected by DNA damage is the ubiquitous YinYang1 (YY1) factor. Insertion of an L1 copy into the genome necessitates the YY1 is thought to facilitate the production of full-length L1 creation and repair of broken DNA. Based on the ele- mRNAs [38]. In response to exposure to methyl-N-nitro- gant work from Tom Eickbush’s group on the non-LTR N-nitrosoguanidine, YY1 was polyADP-ribosylated in HeLa retrotransposon R2Bm and recent findings using an in cells, decreasing its ability to bind its consensus target se- vitro L1 system, the L1 endonuclease is believed to nick quences [39]. YY1 has also been shown to be a negative regu- DNA in a staggered fashion creating overhanging single- lator of p53 activation under conditions of genomic stress in stranded DNA [53–55]. After L1 integration, the DNA primary and cancer cell lines [40]. 
This is interesting given ends are sealed and filled in, forming the target site du- that L1 activity is itself thought to be a genomic stressor plications that flank a typical L1 insertion (steps 4 and 5, that induces apoptosis using a p53-dependent mechanism Figure 1). On the other hand, what happens to L1 integra- [41]. Under conditions of DNA damage, YY1 could therefore tionifacellissubjectedtoDNAdamage(Figure 2)? The have opposing effects on the retrotransposition frequency: presence of broken DNA may allow L1 to integrate into decreased YY1 binding could result in fewer full-length L1 preformed breaks in an endonuclease-independent fashion 4 Journal of Biomedicine and Biotechnology


Figure 2: Potential ways in which DNA damage could influence L1 retrotransposition. (a) Endonuclease-dependent insertion under normal conditions. The L1 endonuclease makes staggered nicks at the target site, creating 3′ overhangs. Filling in generates 7–20 bp target site duplications flanking the insertion. (b) Endonuclease-independent insertion at the site of a double-strand break. The preexisting double-strand break shown here lacks staggered nicks or overhangs. L1 entry into this site would therefore also lack target site duplications. Genomic deletions may occur due to processing by cellular DNA repair processes (∗). (c) Endonuclease-dependent insertion potentiated by DNA damage. DNA damage may upregulate cellular cofactors of reverse transcription and integration. Insertion via pathway c is endonuclease-dependent, but occurs at an increased efficiency.
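
The structural predictions in the Figure 2 caption can be written down as a simple decision rule. The following Python sketch is only an illustration under stated assumptions: the class `InsertionSite`, the function `suggest_pathway`, and the example values are hypothetical, the 7–20 range is taken from the caption and assumed to be in base pairs, and the rule is not a validated classifier. Because pathways (a) and (c) differ in efficiency rather than in structure, the sketch cannot tell them apart.

```python
from dataclasses import dataclass

# Range of target site duplication (TSD) lengths quoted in the Figure 2
# caption, assumed here to be in base pairs.
TSD_MIN_BP = 7
TSD_MAX_BP = 20


@dataclass
class InsertionSite:
    """Hypothetical summary of one sequenced L1 insertion junction."""
    tsd_length_bp: int   # 0 means no target site duplication was found
    deleted_bp: int = 0  # genomic sequence lost at the integration site


def suggest_pathway(site: InsertionSite) -> str:
    """Return the Figure 2 pathway(s) whose predicted hallmarks match the site."""
    if TSD_MIN_BP <= site.tsd_length_bp <= TSD_MAX_BP:
        # Staggered nicks that were filled in: (a) and (c) look identical here.
        return "a or c (endonuclease-dependent)"
    if site.tsd_length_bp == 0:
        if site.deleted_bp > 0:
            return "b (endonuclease-independent, with deletion)"
        return "b (endonuclease-independent)"
    # A duplication outside the quoted range is left undecided.
    return "unclassified"


# Illustrative inputs only; these are not measurements from any study.
examples = [
    InsertionSite(tsd_length_bp=14),
    InsertionSite(tsd_length_bp=0, deleted_bp=3000),
    InsertionSite(tsd_length_bp=3),
]
for site in examples:
    print(site, "->", suggest_pathway(site))
```

The rule deliberately returns "unclassified" rather than guessing when a duplication falls outside the quoted range, since the caption makes no prediction for that case.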

(Figure 2(b)). Alternatively or in addition, enzymes used by the cell to repair damaged DNA may aid (or inhibit) L1 retrotransposition (Figure 2(c)).

Retrotransposons can use artificially induced DNA breaks as substrates for insertion. Yeast with deficiencies in homologous recombination machinery occasionally capture Ty1 cDNA during repair of breaks introduced at the MAT locus [56, 57]. Using a plasmid-based assay in which DNA breaks repaired by captured cDNA are selectively recovered, Yu and Gabriel found that 21 out of 37 captured sequences were derived from Ty1 elements [58]. Furthermore, in mouse cells both LTR retrotransposons and SINE elements were able to repair a break induced by the restriction enzyme I-SceI [59].

Collectively, these studies indicate that retrotransposons can integrate into broken DNA. But does this happen frequently? The previously described experiments used genetic screens to look for what may have been rare events. Under the conditions of the cell culture L1 retrotransposition assay, mutation of the L1 endonuclease active site reduced the retrotransposition frequency to ∼1% of wild-type levels [60]. This result suggests that L1 usually uses its own endonuclease to gain entry into the genome. However, in the setting of DNA repair enzyme deficiency (DNA-PKcs or XRCC4 deficiency in particular) L1s lacking endonuclease exhibited greatly increased rates of retrotransposition [61]. L1s lacking endonuclease generated genomic insertions in repair deficient cells with atypical structures (including large deletions at the site of integration), while fully functional L1s generated fewer of these "atypical" insertions [61–64].

CELLULAR COFACTORS AND INHIBITORS OF RETROTRANSPOSITION

Cellular proteins involved in the response to DNA damage, particularly those of the nonhomologous end joining cascade (NHEJ), may act as cofactors or inhibitors of retrotransposition. Transcription of NHEJ factors including Ku70 and its partner Ku80 is up-regulated following exposure to gamma radiation [65]. Furthermore, many of these repair factors colocalize at the sites of double-strand breaks [66] and have altered bioavailability following DNA damage [67]. Therefore it seems reasonable to propose that modulation and altered subcellular distribution of DNA repair enzymes in the setting of genotoxic stress could influence L1 retrotransposition.

The contribution of DNA repair factors to the mobilization of DNA transposons has been investigated by several groups. In Drosophila, the P element transposase possesses putative phosphorylation sites for the ataxia telangiectasia mutated protein (ATM), a master control kinase of the DNA damage response [68]. Mutation of specific ATM sites increased or in some cases decreased excision of these elements. The DNA repair protein Ku70 and the Bloom helicase, both downstream of ATM [69, 70], have been shown to be important for repair of P element excision sites [71].

Ku70 is also important for repair of Sleeping Beauty excision in mammalian cells [9]. In a survey of multiple repair factors, deficiencies in the NHEJ factors Ku80, DNA-PKcs, and XRCC4 and the homologous recombination factors Rad51C and XRCC3 decreased Sleeping Beauty mobility in mammalian cells [10]. Reconstitution of the knockout reversed the phenotype, and even increased transposition above wild-type levels for DNA-PKcs [10].

DNA repair factors also influence the mobility of retrotransposons. A mutagenesis screen for inhibitors of Ty1 retrotransposition revealed genes that help maintain genomic integrity including telomerase, a yeast homologue of Bloom, and components of the NBS complex [72]. Rad3 and Ssl2, helicases involved in nucleotide excision repair, appear to inhibit Ty1 retrotransposition post-translationally [73]. Potential cofactors for Ty1 retrotransposition are the Ku repair factors. Ku70 protein coprecipitates with Ty1 cDNA, cofractionates with Ty1 retrotransposition intermediates, and deficiency in both Ku70 and Ku80 dramatically decreases retrotransposition [74].

There is also evidence linking NHEJ machinery to the regulation of L1s. Ku70/80 binding sites have been identified in murine L1s: L1s make up 19% of the mouse genome, but account for 26% of the Ku70/80 binding sites [75]. Cell lines deficient in DNA-PKcs permit lower rates of endonuclease-dependent L1 retrotransposition than their wild-type parentals, while XRCC4 mutants permit higher rates of L1 retrotransposition [61]. Repair enzyme deficiency could affect L1 retrotransposition via multiple pathways. Increased persistence of unrepaired double-strand breaks could serve as substrates for insertion and increase endonuclease-independent insertion (Figure 2(b)). On the other hand, a dearth of DNA repair enzymes might hinder the resolution of L1 insertions. The loss or altered availability of inhibitors could, conversely, promote retrotransposition.

POTENTIAL CONSEQUENCES OF L1 ACTIVATION DURING GENOTOXIC STRESS

Increased retrotransposition in the setting of genetic damage could have a beneficial effect on the cell: L1 insertion into the site of a DSB could form a bridge between chromosome fragments, sealing an otherwise irreparable break [76]. Consistent with this idea, activity of the L1-like NL1Tc element resulted in decreased unrepaired DNA breaks and enhanced survival of Trypanosoma cruzi exposed to daunorubicin [77]. Moreover, retrotransposons may have been co-opted over the course of evolution to play a role in specialized DNA repair functions. An example of this is the preferential insertion into and maintenance of telomere ends. Mobile elements with an insertion site preference for telomere or subtelomeric regions have been identified in Saccharomyces cerevisiae, Chlorella vulgaris, Bombyx mori, Allium cepa, and Giardia lamblia [78–82]. In Drosophila, the non-LTR retrotransposons HeT-A and TART not only preferentially insert at chromosome ends, but play a direct role in telomere maintenance [83–85]. In other animals including humans, telomere maintenance relies on telomerase and DNA repair molecules such as WRN and Artemis [86, 87]. Evidence from Caenorhabditis elegans (mut7) and mammals (Artemis) indicates that deficiencies of these enzymes can mobilize transposable elements [88] (E. A. Farkash and E. T. Luning Prak, unpublished data). Enhanced mobility, coupled with opportunity, may cause mobile elements to assist with telomere maintenance under conditions of genotoxic stress.

On the other hand, with the exception of HeT-A and TART in Drosophila, preferential insertion into chromosome ends does not necessarily translate into a beneficial function for the element. Insertion into telomeres could be less disruptive than inserting elsewhere, giving elements with this insertion site preference a proliferative advantage. If, as is widely presumed, L1 integration is random, then increasing its mobility will most likely have neutral or negative consequences for the host cell. Even simply upregulating the L1 endonuclease in the absence of successful integration could be toxic to the cell by promoting the formation of additional DSBs, fostering chromosomal rearrangements and translocations. The consequences of L1 integration into preformed DNA breaks in the setting of genotoxic stress could be severe in that such insertions may be more likely to be accompanied by large deletions [61]. In this regard it is worth noting that pathogenic insertions in chimpanzees and humans have been associated with large deletions [89]. A meta-analysis of human pathogenic insertions found 6 out of 48 (12.5%) were associated with large deletions, compared to 5 out of 145 (3.4%) polymorphic genomic insertions [90] and 6 out of 100 insertions characterized in a cell-culture-based retrotransposition assay [64]. Severe DNA damage can result in cell cycle arrest and apoptosis [91]. Both cell cycle arrest and apoptosis have been seen to accompany retrotransposition in severely stressed cells [28, 41]. Retrotransposition in a cell with damaged DNA could be its final undoing. The potential lethality of genotoxic stress may help to account for the paucity of endonuclease-independent insertions among L1s present in the human genome.

ACKNOWLEDGMENT

This study was supported by NIH grants T32H10791 and CA108812.

REFERENCES

[1] Hoeijmakers JHJ. Genome maintenance mechanisms for preventing cancer. Nature. 2001;411(6835):366–374.
[2] Jackson SP. Sensing and repairing DNA double-strand breaks. Carcinogenesis. 2002;23(5):687–696.
[3] Lankinen MH, Vilpo LM, Vilpo JA. UV- and γ-irradiation-induced DNA single-strand breaks and their repair in human blood granulocytes and lymphocytes. Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis. 1996;352(1-2):31–38.
[4] McGlynn P, Lloyd RG. Recombinational repair and restart of damaged replication forks. Nature Reviews. Molecular Cell Biology. 2002;3(11):859–870.
[5] Ward I, Chen J. Early events in the DNA damage response. Current Topics in Developmental Biology. 2004;63:1–35.

[6] McClintock B. The origin and behavior of mutable loci in maize. Proceedings of the National Academy of Sciences of the United States of America. 1950;36(6):344–355.
[7] McClintock B. The significance of responses of the genome to challenge. Science. 1984;226(4676):792–801.
[8] Handler AM, Gomez SP. P element excision in Drosophila is stimulated by gamma-irradiation in transient embryonic assays. Genetical Research. 1997;70(1):75–78.
[9] Yant SR, Kay MA. Nonhomologous-end-joining factors regulate DNA repair fidelity during sleeping beauty element transposition in mammalian cells. Molecular and Cellular Biology. 2003;23(23):8505–8518.
[10] Izsvák Z, Stüwe EE, Fiedler D, Katzer A, Jeggo PA, Ivics Z. Healing the wounds inflicted by sleeping beauty transposition by double-strand break repair in mammalian somatic cells. Molecular Cell. 2004;13(2):279–290.
[11] Georgiev PG, Korochkina SE, Georgieva SG, Gerasimova TI. Mitomycin C induces genomic rearrangements involving transposable elements in Drosophila melanogaster. Molecular and General Genetics: MGG. 1990;220(2):229–233.
[12] Bradshaw VA, McEntee K. DNA damage activates transcription and transposition of yeast Ty retrotransposons. Molecular and General Genetics: MGG. 1989;218(3):465–474.
[13] Morawetz C, Hagen U. Effect of irradiation and mutagenic chemicals on the generation of ADH2- and ADH4-constitutive mutants in yeast: the inducibility of Ty transposition by UV and ethyl methanesulfonate. Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis. 1990;229(1):69–77.
[14] Morawetz C. Effect of irradiation and mutagenic chemicals on the generation of ADH2-constitutive mutants in yeast. Significance for the inducibility of Ty transposition. Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis. 1987;177(1):53–60.
[15] Sacerdot C, Mercier G, Todeschini A-L, Dutreix M, Springer M, Lesage P. Impact of ionizing radiation on the life cycle of Saccharomyces cerevisiae Ty1 retrotransposon. Yeast. 2005;22(6):441–455.
[16] Bregliano JC, Laurencon A, Degroote F. Evidence for an inducible repair-recombination system in the female germ line of Drosophila melanogaster. I. Induction by inhibitors of nucleotide synthesis and by gamma rays. Genetics. 1995;141(2):571–578.
[17] Rudin CM, Thompson CB. Transcriptional activation of short interspersed elements by DNA-damaging agents. Genes, Chromosomes and Cancer. 2001;30(1):64–71.
[18] Hagan CR, Sheffield RF, Rudin CM. Human Alu element retrotransposition induced by genotoxic stress. Nature Genetics. 2003;35(3):219–220.
[19] Dewannieux M, Esnault C, Heidmann T. LINE-mediated retrotransposition of marked Alu sequences. Nature Genetics. 2003;35(1):41–48.
[20] Asakawa J, Kuick R, Kodaira M, et al. A genome scanning approach to assess the genetic effects of radiation in mice and humans. Radiation Research. 2004;161(4):380–390.
[21] Goodier JL, Ostertag EM, Engleka KA, Seleme MC, Kazazian HH Jr. A potential role for the nucleolus in L1 retrotransposition. Human Molecular Genetics. 2004;13(10):1041–1048.
[22] Lin SS, Nymark-McMahon MH, Yieh L, Sandmeyer SB. Integrase mediates nuclear localization of Ty3. Molecular and Cellular Biology. 2001;21(22):7826–7838.
[23] Zimber A, Nguyen Q-D, Gespach C. Nuclear bodies and compartments: functional roles and cellular signalling in health and disease. Cellular Signalling. 2004;16(10):1085–1104.
[24] Bernardi R, Scaglioni PP, Bergmann S, Horn HF, Vousden KH, Pandolfi PP. PML regulates p53 stability by sequestering Mdm2 to the nucleolus. Nature Cell Biology. 2004;6(7):665–672.
[25] Yang N, Zhang L, Zhang Y, Kazazian HH Jr. An important role for RUNX3 in human L1 transcription and retrotransposition. Nucleic Acids Research. 2003;31(16):4929–4940.
[26] Han JS, Boeke JD. A highly active synthetic mammalian retrotransposon. Nature. 2004;429(6989):314–318.
[27] Curcio MJ, Hedge AM, Boeke JD, Garfinkel DJ. Ty RNA levels determine the spectrum of retrotransposition events that activate gene expression in Saccharomyces cerevisiae. Molecular and General Genetics: MGG. 1990;220(2):213–221.
[28] Staleva LS, Venkov P. Activation of Ty transposition by mutagens. Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis. 2001;474(1-2):93–103.
[29] Ishihara H, Tanaka I, Furuse M, Tsuneoka K. Increased expression of intracisternal A-particle RNA in regenerated myeloid cells after X irradiation in C3H/He inbred mice. Radiation Research. 2000;153(4):392–397.
[30] Jen K-Y, Cheung VG. Transcriptional response of lymphoblastoid cells to ionizing radiation. Genome Research. 2003;13(9):2092–2100.
[31] Swergold GD. Identification, characterization, and cell specificity of a human LINE-1 promoter. Molecular and Cellular Biology. 1990;10(12):6718–6729.
[32] Minakami R, Kurose K, Etoh K, Furuhata Y, Hattori M, Sakaki Y. Identification of an internal cis-element essential for the human L1 transcription and a nuclear factor(s) binding to the element. Nucleic Acids Research. 1992;20(12):3139–3145.
[33] Mathias SL, Scott AF. Promoter binding proteins of an active human L1 retrotransposon. Biochemical and Biophysical Research Communications. 1993;191(2):625–632.
[34] Tchénio T, Casella J-F, Heidmann T. Members of the SRY family regulate the human LINE retrotransposons. Nucleic Acids Research. 2000;28(2):411–415.
[35] Becker KG, Swergold GD, Ozato K, Thayer RE. Binding of the ubiquitous nuclear transcription factor YY1 to a cis regulatory sequence in the human LINE-1 transposable element. Human Molecular Genetics. 1993;2(10):1697–1702.
[36] Muotri AR, Chu VT, Marchetto MCN, Deng W, Moran JV, Gage FH. Somatic mosaicism in neuronal precursor cells mediated by L1 retrotransposition. Nature. 2005;435(7044):903–910.
[37] Trimmer EE, Zamble DB, Lippard SJ, Essigmann JM. Human testis-determining factor SRY binds to the major DNA adduct of cisplatin and a putative target sequence with comparable affinities. Biochemistry. 1998;37(1):352–362.
[38] Athanikar JN, Badge RM, Moran JV. A YY1-binding site is required for accurate human LINE-1 transcription initiation. Nucleic Acids Research. 2004;32(13):3846–3855.
[39] Oei SL, Shi Y. Poly(ADP-ribosyl)ation of transcription factor Yin Yang 1 under conditions of DNA damage. Biochemical and Biophysical Research Communications. 2001;285(1):27–31.
[40] Grönroos E, Terentiev AA, Punga T, Ericsson J. YY1 inhibits the activation of the p53 tumor suppressor in response to genotoxic stress. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(33):12165–12170.
[41] Haoudi A, Semmes OJ, Mason JM, Cannon RE. Retrotransposition-competent human LINE-1 induces apoptosis in cancer cells with intact p53. Journal of Biomedicine and Biotechnology. 2004;2004(4):185–194.

[42] Hata K, Sakaki Y. Identification of critical CpG sites for repression of L1 transcription by DNA methylation. Gene. 1997;189(2):227–234.
[43] Woodcock DM, Lawler CB, Linsenmeyer ME, Doherty JP, Warren WD. Asymmetric methylation in the hypermethylated CpG promoter region of the human L1 retrotransposon. The Journal of Biological Chemistry. 1997;272(12):7810–7816.
[44] Yu F, Zingler N, Schumann G, Strätling WH. Methyl-CpG-binding protein 2 represses LINE-1 expression and retrotransposition but not Alu transcription. Nucleic Acids Research. 2001;29(21):4493–4501.
[45] Valinluck V, Tsai H-H, Rogstad DK, Burdzy A, Bird A, Sowers LC. Oxidative damage to methyl-CpG sequences inhibits the binding of the methyl-CpG binding domain (MBD) of methyl-CpG binding protein 2 (MeCP2). Nucleic Acids Research. 2004;32(14):4100–4108.
[46] Kalinich JF, Catravas GN, Snyder SL. The effect of gamma radiation on DNA methylation. Radiation Research. 1989;117(2):185–197.
[47] Pogribny I, Raiche J, Slovack M, Kovalchuk O. Dose-dependence, sex- and tissue-specificity, and persistence of radiation-induced genomic DNA methylation changes. Biochemical and Biophysical Research Communications. 2004;320(4):1253–1261.
[48] Batra V, Kesavan V, Mishra KP. Modulation of enzymes involved in folate dependent one-carbon metabolism by γ-radiation stress in mice. Journal of Radiation Research. 2004;45(4):527–533.
[49] Graziano F, Kawakami K, Ruzzo A, et al. Methylenetetrahydrofolate reductase 677C/T gene polymorphism, gastric cancer susceptibility and genomic DNA hypomethylation in an at-risk Italian population. International Journal of Cancer. 2006;118(3):628–632.
[50] Raiche J, Rodriguez-Juarez R, Pogribny I, Kovalchuk O. Sex- and tissue-specific expression of maintenance and de novo DNA methyltransferases upon low dose X-irradiation in mice. Biochemical and Biophysical Research Communications. 2004;325(1):39–47.
[51] Walsh CP, Chaillet JR, Bestor TH. Transcription of IAP endogenous retroviruses is constrained by cytosine methylation. Nature Genetics. 1998;20(2):116–117.
[52] Bourc'his D, Bestor TH. Meiotic catastrophe and retrotransposon reactivation in male germ cells lacking Dnmt3L. Nature. 2004;431(7004):96–99.
[53] Luan DD, Korman MH, Jakubczak JL, Eickbush TH. Reverse transcription of R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR retrotransposition. Cell. 1993;72(4):595–605.
[54] Cost GJ, Feng Q, Jacquier A, Boeke JD. Human L1 element target-primed reverse transcription in vitro. The EMBO Journal. 2002;21(21):5899–5910.
[55] Christensen SM, Eickbush TH. R2 target-primed reverse transcription: ordered cleavage and polymerization steps by protein subunits asymmetrically bound to the target DNA. Molecular and Cellular Biology. 2005;25(15):6617–6628.
[56] Moore JK, Haber JE. Capture of retrotransposon DNA at the sites of chromosomal double-strand breaks. Nature. 1996;383(6601):644–646.
[57] Teng S-C, Kim B, Gabriel A. Retrotransposon reverse-transcriptase-mediated repair of chromosomal breaks. Nature. 1996;383(6601):641–644.
[58] Yu X, Gabriel A. Patching broken chromosomes with extranuclear cellular DNA. Molecular Cell. 1999;4(5):873–881.
[59] Lin Y, Waldman AS. Capture of DNA sequences at double-strand breaks in mammalian chromosomes. Genetics. 2001;158(4):1665–1674.
[60] Feng Q, Moran JV, Kazazian HH Jr, Boeke JD. Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition. Cell. 1996;87(5):905–916.
[61] Morrish TA, Gilbert N, Myers JS, et al. DNA repair mediated by endonuclease-independent LINE-1 retrotransposition. Nature Genetics. 2002;31(2):159–165.
[62] Gilbert N, Lutz-Prigge S, Moran JV. Genomic deletions created upon LINE-1 retrotransposition. Cell. 2002;110(3):315–325.
[63] Symer DE, Connelly C, Szak ST, et al. Human L1 retrotransposition is associated with genetic instability in vivo. Cell. 2002;110(3):327–338.
[64] Gilbert N, Lutz S, Morrish TA, Moran JV. Multiple fates of L1 retrotransposition intermediates in cultured human cells. Molecular and Cellular Biology. 2005;25(17):7780–7795.
[65] Brodsky MH, Weinert BT, Tsang G, et al. Drosophila melanogaster MNK/Chk2 and p53 regulate multiple DNA repair and apoptotic pathways following DNA damage. Molecular and Cellular Biology. 2004;24(3):1219–1231.
[66] Rapp A, Greulich KO. After double-strand break induction by UV-A, homologous recombination and nonhomologous end joining cooperate at the same DSB if both systems are available. Journal of Cell Science. 2004;117(pt 21):4935–4945.
[67] Drouet J, Delteil C, Lefrançois J, Concannon P, Salles B, Calsou P. DNA-dependent protein kinase and XRCC4-DNA ligase IV mobilization in the cell in response to DNA double strand breaks. The Journal of Biological Chemistry. 2005;280(8):7060–7069.
[68] Beall EL, Mahoney MB, Rio DC. Identification and analysis of a hyperactive mutant form of Drosophila P-element transposase. Genetics. 2002;162(1):217–227.
[69] Brown KD, Lataxes TA, Shangary S, et al. Ionizing radiation exposure results in up-regulation of Ku70 via a p53/ataxia-telangiectasia-mutated protein-dependent mechanism. The Journal of Biological Chemistry. 2000;275(9):6651–6656.
[70] Ababou M, Dutertre S, Lécluse Y, Onclercq R, Chatton B, Amor-Guéret M. ATM-dependent phosphorylation and accumulation of endogenous BLM protein in response to ionizing radiation. Oncogene. 2000;19(52):5955–5963.
[71] Min B, Weinert BT, Rio DC. Interplay between Drosophila Bloom's syndrome helicase and Ku autoantigen during nonhomologous end joining repair of P element-induced DNA breaks. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(24):8906–8911.
[72] Scholes DT, Banerjee M, Bowen B, Curcio MJ. Multiple regulators of Ty1 transposition in Saccharomyces cerevisiae have conserved roles in genome maintenance. Genetics. 2001;159(4):1449–1465.
[73] Lee BS, Lichtenstein CP, Faiola B, et al. Posttranslational inhibition of Ty1 retrotransposition by nucleotide excision repair/transcription factor TFIIH subunits Ssl2p and Rad3p. Genetics. 1998;148(4):1743–1761.
[74] Downs JA, Jackson SP. Involvement of DNA end-binding protein Ku in Ty element retrotransposition. Molecular and Cellular Biology. 1999;19(9):6260–6268.
[75] Katz DJ, Beer MA, Levorse JM, Tilghman SM. Functional characterization of a novel Ku70/80 pause site at the H19/Igf2 imprinting control region. Molecular and Cellular Biology. 2005;25(10):3855–3863.
[76] Eickbush TH. Repair by retrotransposition. Nature Genetics. 2002;31(2):126–127.

[77] Olivares M, López MC, García-Pérez JL, Briones P, Pulgar M, Thomas MC. The endonuclease NL1Tc encoded by the LINE L1Tc from Trypanosoma cruzi protects parasites from daunorubicin DNA damage. Biochimica et Biophysica Acta (BBA)/Gene Structure and Expression. 2003;1626(1–3):25–32.
[78] Zhu Y, Zou S, Wright DA, Voytas DF. Tagging chromatin with retrotransposons: target specificity of the Saccharomyces Ty5 retrotransposon changes with the chromosomal localization of Sir3p and Sir4p. Genes & Development. 1999;13(20):2738–2749.
[79] Noutoshi Y, Arai R, Fujie M, Yamada T. Structure of the Chlorella Zepp retrotransposon: nested Zepp clusters in the genome. Molecular and General Genetics: MGG. 1998;259(3):256–263.
[80] Anzai T, Takahashi H, Fujiwara H. Sequence-specific recognition and cleavage of telomeric repeat (TTAG)n by endonuclease of non-long terminal repeat retrotransposon TRAS1. Molecular and Cellular Biology. 2001;21(1):100–108.
[81] Pich U, Schubert I. Terminal heterochromatin and alternative telomeric sequences in Allium cepa. Chromosome Research. 1998;6(4):315–321.
[82] Arkhipova IR, Morrison HG. Three retrotransposon families in the genome of Giardia lamblia: two telomeric, one dead. Proceedings of the National Academy of Sciences of the United States of America. 2001;98(25):14497–14502.
[83] Biessmann H, Mason JM, Ferry K, et al. Addition of telomere-associated HeT DNA sequences "heals" broken chromosome ends in Drosophila. Cell. 1990;61(4):663–673.
[84] Levis RW, Ganesan R, Houtchens K, Tolar LA, Sheen F-M. Transposons in place of telomeric repeats at a Drosophila telomere. Cell. 1993;75(6):1083–1093.
[85] Pardue M-L, DeBaryshe PG. Retrotransposons provide an evolutionarily robust non-telomerase mechanism to maintain telomeres. Annual Review of Genetics. 2003;37:485–511.
[86] Lee JW, Harrigan J, Opresko PL, Bohr VA. Pathways and functions of the Werner syndrome protein. Mechanisms of Ageing and Development. 2005;126(1):79–86.
[87] Rooney S, Alt FW, Lombard D, et al. Defective DNA repair and increased genomic instability in Artemis-deficient murine cells. The Journal of Experimental Medicine. 2003;197(5):553–565.
[88] Ketting RF, Haverkamp TH, van Luenen HG, Plasterk RH. Mut-7 of C. elegans, required for transposon silencing and RNA interference, is a homolog of Werner syndrome helicase and RNaseD. Cell. 1999;99(2):133–141.
[89] Han K, Sen SK, Wang J, et al. Genomic rearrangements by LINE-1 insertion-mediated deletion in the human and chimpanzee lineages. Nucleic Acids Research. 2005;33(13):4040–4052.
[90] Chen J-M, Stenson PD, Cooper DN, Férec C. A systematic analysis of LINE-1 endonuclease-dependent retrotranspositional events causing human genetic disease. Human Genetics. 2005;117(5):411–427.
[91] Barzilai A, Yamamoto K-I. DNA damage responses to oxidative stress. DNA Repair. 2004;3(8-9):1109–1115.

Hindawi Publishing Corporation
Journal of Biomedicine and Biotechnology
Volume 2006, Article ID 29049, Pages 1–8
DOI 10.1155/JBB/2006/29049

Review Article

Do Small RNAs Interfere With LINE-1?

Harris S. Soifer

Beckman Research Institute of the City of Hope, Fox North 2002, 1450 East Duarte Road, Duarte, CA 91010-3011, USA

Received 2 August 2005; Revised 7 October 2005; Accepted 12 October 2005

Long interspersed elements (LINE-1 or L1) are the most active transposable elements in the human genome. Due to their high copy number and ability to sponsor retrotransposition of nonautonomous RNA sequences, unchecked L1 activity can negatively impact the genome by a number of means. Substantial evidence in lower eukaryotes demonstrates that the RNA interference (RNAi) machinery plays a major role in containing transposon activity. Despite extensive analysis in other eukaryotes, no experimental evidence has been presented that L1-derived siRNAs exist, or that RNAi plays a significant role in restricting L1 activity in the human genome. This review presents evidence showing a direct role for RNAi in suppressing the movement of transposable elements in other eukaryotes, and speculates on the role RNAi might play in protecting the human genome from LINE-1 activity.

Copyright © 2006 Harris S. Soifer. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

IMPORTANCE OF LIMITING L1 RETROTRANSPOSITION IN THE HUMAN GENOME

The majority of the human genome is composed of DNA from repetitive sequences and mobile genetic elements. Retrotransposons, mobile DNA that moves via an RNA intermediate, are the most abundant transposable elements and comprise approximately 40% of human genomic sequence. Of these retrotransposons, the non-long terminal repeat (non-LTR) long interspersed elements (LINE-1 or L1) retain a degree of autonomy, as some full-length (FL) L1s encode functional proteins necessary for retrotransposition [1]. Although over 99% of L1 sequences are inactive, either through deleterious mutations, 5′-end truncations, or internal rearrangements, bioinformatic and empirical analyses predict that 100 FL-L1s have the capacity for autonomous movement, and thus are termed retrotransposition-competent L1s (RC-L1s) [2]. The consensus RC-L1 is 6 kb and contains a 5′ untranslated region (5′ UTR) with an internal promoter, two nonoverlapping open reading frames (ORF1 and ORF2), and a 3′ UTR with its own polyadenylation signal. ORF1 encodes a 40 kd (p40) RNA binding protein that forms ribonucleoprotein particles with L1 RNA [3]. ORF2 encodes a 150 kd protein with an N-terminal endonuclease (EN) and a C-terminal reverse transcriptase (RT) domain [1].

Despite their small number, the 100 or so remaining RC-L1s continue to threaten the human genome. Recently, 82 FL-L1s with intact ORFs were cloned and their activity tested using a cell culture retrotransposition assay. Almost one-half (40/82) of the FL-L1s were shown to be retrotransposition competent, with a majority of the retrotransposition activity contributed by six “hot” L1s [2]. Although the potential for active L1s to greatly increase their copy number is limited by the propensity for truncations to occur at the 5′ end during integration, two highly active RC-L1s (L1RP and L1β-thal) have been characterized that are the result of disease-causing, full-length de novo integration events [4]. Subsequent comparison with other RC-L1s showed that both L1RP and L1β-thal exhibit high activity in cell culture and belong to a group of “hot” L1s responsible for most of the retrotransposition that occurs in our genome today [2]. In addition, characterization of cloned retrotransposition events using tagged RC-L1 constructs in cultured cancer cells indicates that ∼10% of L1 insertions are accompanied by large chromosomal rearrangements, suggesting that active L1s could also lead to genomic instability [5, 6]. Furthermore, an increasing number of reports using advanced molecular techniques illustrate that L1s continue to negatively impact the fitness of the genome, either through de novo retrotransposition resulting in insertional mutagenesis, or as the result of unequal recombination between dispersed L1s and gene sequences [7–10].

Undoubtedly, both positive and negative factors continue to regulate L1 activity. For example, experiments using tagged retrotransposition-incompetent constructs (ie, Alu, pseudogene, and mutant L1s) demonstrate that nonautonomous RNAs are mobilized in trans by the L1 machinery at a much lower frequency compared to the RC-L1 that encoded them [11, 12]. This characteristic, known as cis-preference, limits the ability of nonautonomous retrotransposons to form functional RNPs, thereby preventing the accumulation of dead-end intermediates. In fact, cis-preference helps ensure the survival of the small number of RC-L1s that would otherwise compete with nonautonomous retrotransposons for limited host factors. The idea that RC-L1s might be under purifying selection, together with the various ways that L1s can negatively impact the genome, argues in favor of multiple mechanisms to regulate L1 activity. Considerable experimental evidence exists that RNA interference (RNAi) represses the activity of many different transposable elements in other eukaryotes, leading to speculation that RNAi might act in a similar manner against human L1s.

RNAi is a conserved eukaryotic mechanism in which double-stranded RNA (dsRNA) recognizes homologous mRNA transcripts and causes sequence-specific inhibition of gene expression through a number of mechanisms (Figure 1) [13]. RNAi is initiated by cleavage of endogenous long dsRNA or short-hairpin RNA (shRNA or pre-miRNA) precursors by the RNase III enzyme Dicer into 21–25 nucleotide small interfering RNA (siRNA) or microRNA (miRNA) effector molecules [14, 15]. The siRNAs, which are perfectly complementary to their target, recognize their cognate mRNA and become associated with a large multiprotein complex referred to as the RNA-induced silencing complex (RISC) that destroys target mRNAs by endonucleolytic cleavage at regions homologous to the siRNA [16, 17] (Figure 1(a)). miRNAs, on the other hand, are imperfectly matched with their target sequences and associate with homologous mRNAs in a ribonucleoprotein complex, resulting in sequence-specific reduction of gene expression through translation inhibition [13, 18] (Figure 1(b)). In addition to gene silencing at the posttranscriptional level (ie, siRNA-mediated degradation or miRNA-mediated translation inhibition), siRNAs targeting promoter regions in genomic DNA can bring about DNA and histone methylation, resulting in promoter shutdown in a process termed transcriptional gene silencing (TGS) (Figure 1(c)) [13].

RNAi SUPPRESSES TRANSPOSABLE ELEMENTS IN MANY EUKARYOTES

The genetic link between RNAi and control of mobile genetic elements was initially established following EMS mutagenesis screens of Caenorhabditis elegans. Several C elegans mutants deficient in RNAi also show increased activity of DNA transposons, specifically Tc1, Tc3, and Tc5, as demonstrated by Southern blot analysis for Tc-directed insertions (Table 1) [19, 20]. Further screens in C elegans demonstrated that while not all genes necessary for RNAi are also required for transposon silencing, there is substantial cross-talk between the two regulatory pathways [21]. Additional evidence supporting a role for RNAi in silencing both transposons and retrotransposons has been demonstrated through genetic analysis in a number of other eukaryotes. One problem has been translating the results obtained in these model eukaryotes to the more complex human genome. Fortunately, the rich bioinformatics resources spawned from the genome sequencing efforts over the last decade permit the identification of human orthologs of essential RNAi components.

Human cells encode one Dicer (DCR) protein, an enzyme with two RNase III domains that forms an intramolecular dimer to cleave dsRNA in a processive manner, producing 21–25 nucleotide siRNAs [22]. The early embryonic lethality observed in mice with the Dicer null genotype (Dcr-1 −/−) confirms an essential role for Dicer in mammalian development. Unfortunately, the establishment of mouse embryonic fibroblast lines for further study has been hampered by the early death (E7.5) of Dicer null embryos [23]. To provide a more favorable system to study the role of Dicer in controlling mammalian retroelements, Dicer-deficient mouse embryonic stem (ES) cells were developed. Increased transcription of murine L1 elements was observed in Dicer-deficient, but not wild-type, ES cells, providing the first direct evidence that RNAi controls the expression of murine L1 retrotransposons [24]. The observed increase in L1 expression was measured by quantitative RT-PCR using primers homologous to the murine L1 5′ UTR, presumably allowing quantification of transcripts originating from the ∼3000 RC-L1s that inhabit the C57/BL6 genome. In addition, transcripts from intracisternal A particles (IAPs), an active murine LTR-retrotransposon, were also elevated in the absence of Dicer. This report supports earlier work in which IAP and murine endogenous retrovirus-L transcripts were up-regulated following injection of anti-Dicer dsRNA into 2- and 8-cell stage mouse embryos [25]. As Dicer activity is necessary for limiting transcription of both non-LTR and LTR-containing retrotransposons, one is not reaching to propose that Dicer-mediated cleavage of endogenous retrotransposon-derived dsRNA into siRNA also functions in human cells.

The siRNA produced by Dicer is handed off to the RNA-induced silencing complex (RISC). While the exact components of Homo sapiens RISC remain to be completely characterized, siRNA-mediated knockdown of AGO2 in HeLa cells, as well as gene targeting experiments in mice, demonstrate that the RISC component AGO2 is essential for target mRNA cleavage (Figure 1(a)) [26, 27]. Selective inactivation of AGO2 orthologs in lower eukaryotes demonstrates that RISC-associated Ago proteins are required for silencing both DNA transposons and retrotransposons. For example, loss of the AGO2 ortholog qde-2 in Neurospora crassa leads to increased expression of the LINE-like retrotransposon, Tad. Moreover, deletion of both Dicer genes causes an increase in Tad activity, linking the initiation step in RNAi to non-LTR retrotransposon silencing [28]. An interesting aspect of the analysis of Tad retrotransposition is that the Neurospora genome, which is devoid of active transposons through the action of efficient homology-dependent gene silencing mechanisms such as repeat-induced point mutations (RIP), requires an intact RNAi response to respond to the introduction by transformation of an active Tad element. Thus, perhaps one role of RNAi in higher eukaryotes is to permit a rapid and potent response to the sudden activation of retrotransposons.

[Figure 1 diagram, three panels. (a) shRNA or dsRNA → Dicer → RNA-induced silencing complex (RISC: AGO2, homology search engine?, RNA helicase?) → mRNA degradation. (b) Pre-miRNA → Dicer → miRNA-associated ribonucleoprotein complex (miRNP: GEMIN3, GEMIN4, AGO2) → translation inhibition. (c) dsRNA (eg, centromere dsRNA) → Dicer → RNA-induced transcriptional silencing complex (RITS: AGO2, HP1?, HDAC?, DNMT?) → DNA/histone methylation.]

Figure 1: RNAi-based gene silencing pathways in H sapiens. (a) Dicer cleaves long double-stranded RNA (dsRNA) or short-hairpin RNA (shRNA) into functional siRNA with characteristic 3′ overhangs. siRNAs are incorporated into RISC, recognize the target mRNA through an unknown subunit(s), and cleavage is performed by AGO2. (b) Precursor microRNAs (pre-miRNA), which themselves are a cleavage product of a primary microRNA transcript, are further processed by Dicer into functional miRNAs that associate with AGO2 into a miRNA ribonucleoprotein (miRNP). miRNPs recognize their target mRNAs, resulting in translation inhibition by an undefined mechanism. (c) Transcriptional gene silencing is initiated by Dicer-mediated cleavage of long dsRNA (eg, centromere dsRNA) into siRNA that associate with the RITS complex. Putative components of H sapiens RITS are depicted: AGO, argonaute; DNMT, DNA methyltransferase; HDAC, histone deacetylase; HP1, heterochromatin protein 1.
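As a purely illustrative aid to panel (a), the following toy Python sketch tiles one strand of a long dsRNA into consecutive fragments within the 21–25 nucleotide size range quoted in the text. The function name `dice`, the fixed step size, and the example sequence are assumptions made for illustration only; the sketch ignores the 3′ overhangs, strand pairing, and the actual enzymology of Dicer.

```python
def dice(strand: str, size: int = 21) -> list:
    """Cut one RNA strand into consecutive fragments of `size` nucleotides.

    Toy model only: a trailing piece shorter than `size` is discarded,
    and `size` must lie in the 21-25 nucleotide range given in the text.
    """
    if not 21 <= size <= 25:
        raise ValueError("siRNA size is expected to be 21-25 nucleotides")
    # Step through the strand in non-overlapping windows of `size`.
    return [strand[i:i + size] for i in range(0, len(strand) - size + 1, size)]


# Hypothetical 50-nucleotide strand (not a real L1 sequence).
example = "AUGGC" * 10
fragments = dice(example)
print(len(fragments), [len(f) for f in fragments])
```

Treating cleavage as fixed-width tiling is a deliberate simplification; it only conveys that a long precursor yields several short effector molecules of similar length.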

In addition to LINE-like Tad retrotransposons, increased transcript levels of the Ingi and SLACS retroposon elements are observed in cells lacking Ago1, the AGO2 ortholog of Trypanosoma brucei RISC [29]. Several other spontaneous or induced AGO mutants, such as the Arabidopsis Ago4 and Drosophila piwi mutants, also show elevated levels of retrotransposons [30–33]. Thus, genetic evidence from a variety of organisms links both the initiation step (Dicer) and RISC-mediated effector step (AGO) of RNAi to the control of mobile genetic elements. Moreover, the fact that different classes of transposable elements (DNA transposons, LTR and non-LTR retrotransposons, and endogenous retrovirus sequences) are up-regulated in the absence of the RNAi machinery supports the generalization that RNAi is part of the eukaryotic innate immune system to protect the genome from the mutational load of parasitic sequences [34] (Table 1).

DOES RNAi CONTROL LINE-1 ACTIVITY IN HUMAN CELLS?

So far, there is no direct evidence that the RNAi pathway in human cells protects the genome from the activity of L1s. Direct genetic evidence has been hard to come by in human cells because of the difficulty in inhibiting RNAi gene function. For other model eukaryotes such as C elegans and Schizosaccharomyces pombe, the high rate of homologous recombination (HR) and the ability to perform large-scale genetic screens permit the study of mutant phenotypes through insertion and/or inactivation of specific genes [35]. Moreover, the recent application of RNAi technology to selectively inhibit gene function in mammalian cells both in culture and in vivo has made it less necessary to rigorously pursue methods that enhance the efficiency of HR in mammalian

Table 1: Eukaryotic RNAi orthologs involved in silencing transposable elements. H sapiens orthologs, if present in the Homologene database, are indicated. N.D. implies not determined.

| Organism | RNAi genes implicated in silencing TEs | Human ortholog | Transposable element silenced | siRNAs detected? | Reference(s) |
|---|---|---|---|---|---|
| Caenorhabditis elegans | mut-7 | - | Tc1, Tc3, Tc5 DNA transposons | Terminal inverted repeat (TIR) of Tc1 | [19–21, 48] |
| | rde-2 | AGO2 | | | |
| | mut-16 | - | | | |
| | mut-14 | - | | | |
| Drosophila melanogaster | piwi | PIWI | Gypsy ERV | 5′ UTR | [32, 33] |
| | | | Copia retrotransposon | N.D. | |
| | | | Mdg1 retrotransposon | N.D. | |
| Trypanosoma brucei | Ago1 | AGO2 | Ingi retroposon | ORF 1 | [29, 30] |
| | | | SLACS retrotransposon | ORF 1 and 3′ UTR | |
| Neurospora crassa | qde-2 | AGO2 | Tad retrotransposon | ORF 1 and ORF 2 | [28] |
| | dcl1/dcl2 | DICER1 | | | |
| Mus musculus | Dicer-1 | DICER-1 | LINE-1 | N.D. | [24, 25] |
| | | | Intracisternal A particle | N.D. | |
| Arabidopsis thaliana | Ago4 | - | AtSN1 retroelement | AtSN1∗ | [31] |

∗AtSN1 siRNA was determined by Northern blot with a full-length 159-nucleotide sense AtSN1 RNA probe.

cells. Although several genetic screens in mammalian cells have been conducted using shRNA libraries, one can appreciate that this RNAi-mediated approach would be problematic for studying the role that RNAi plays in controlling human L1s [36, 37]. It is possible to achieve transient inhibition of the RNAi pathway by transfecting human cells with large quantities (> 50 nM) of siRNA targeting one of the RNAi components (eg, DICER or AGO2) [15, 38]. However, functional inhibition of the RNAi pathway is directly proportional to transfection efficiency and varies between cell lines (unpublished observations). In addition, some virus products are inhibitors of RNAi, either by successfully competing with endogenous dsRNA for Dicer, as is the case for the adenovirus VA1 noncoding RNA, or by sequestering siRNA in an inactive complex [39–41]. Although one group reported efficient down-regulation of Dicer in HeLa cells using a trans-cleaving hammerhead ribozyme, only transient knockdown of Dicer expression was achieved and they did not demonstrate functional inhibition of the RNAi pathway [42].

In the absence of data showing increased L1 activity in cells with an impaired RNAi pathway, the detection and cloning of L1-derived siRNAs would support a role for RNAi in controlling L1s. Efforts to clone the small RNA fraction from HeLa cells failed to find microRNAs (miRNAs) produced from LINE-1, suggesting that if endogenous L1 miRNAs are produced, they are present at low levels or in specific cell types [43]. This initial cloning effort relied on high-throughput sequencing after annealing linker molecules to the small RNA fraction purified from HeLa cells and might overlook miRNAs from repetitive elements. Indeed, endogenous siRNAs homologous to centromere repeats were not cloned using this approach, despite being detected by RNase protection and Northern blot analyses in chicken DT40 and murine ES cells, respectively [24, 44]. Restriction of L1 siRNAs to specific cell types, such as primordial germ tissue and/or gametes, would explain why earlier characterization of endogenous siRNAs in human cervical carcinoma cells failed to detect L1 siRNAs. Since L1s that retrotranspose in gametes ensure passage of their genetic information to the next generation without impacting host fitness through somatic mutagenesis, the cell might combat this threat by producing L1 siRNA at a specific time during gametogenesis. Despite the advantage for L1s to restrict their expression to germ cells, immunohistochemical analysis detected L1 ORF translation products (ORF1p and ORF2p) in adult and fetal testicular tissue, as well as Sertoli, Leydig, and vascular endothelial cells [45, 46]. Furthermore, a single case of insertional mutagenesis by L1 in somatic tissue has been reported [47]. Consequently, the threat posed by RC-L1s and functional ORF proteins is not limited to the germline, and L1 siRNAs might also be present in somatic cells. Moreover, as the amount of FL-L1 RNA in cultured somatic cells is relatively low compared to L1 expression from established germ cell tumors, somatic cells seem a fitting place for posttranscriptional degradation of FL-L1 RNA by siRNA to occur.

In lower eukaryotes where classical genetics has established a direct link between RNAi and the control of mobile genetic elements, siRNAs have been detected for both transposons and retrotransposons. For example, siRNAs derived from the LINE-like Tad retrotransposon were detected by Northern blotting of total RNA from Neurospora crassa qde-2 mutants, but not wild-type progeny [28] (Table 1). qde-2 is the AGO2 ortholog of N crassa RISC, and qde-2 mutants are viable, but defective in RNAi. Tad-specific siRNAs were

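[Figure 2 diagram: the consensus RC-L1 (5′ UTR, p40, EN, RT, Cys, 3′ UTR) drawn above three panels (a)–(c), each showing Dicer processing of L1 dsRNA. Labelled elements include the L1 5′ UTR and an L1 ORF flanked by promoters P1 and P2. Labelled outcomes: targeting of 5′ UTR sequences for siRNA-mediated DNA/histone methylation, and targeting of RC-L1 mRNA and/or L1 ORF transcripts with coding potential for siRNA-mediated degradation.]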

Figure 2: Proposed ways RNAi can control L1 activity. The consensus RC-L1 is depicted above. (a) L1 dsRNA produced from the 5′ UTR sense and antisense promoters is processed by Dicer and can target transcripts originating from RC-L1s for degradation. Alternatively, this siRNA can also initiate histone and DNA methylation, resulting in silencing of the L1 promoter. (b) Regions of L1 mRNA that form stable hairpins through intramolecular base pairing could be Dicer substrates. The resulting siRNA is capable of a number of responses. (c) L1 dsRNA produced by read-through transcription from opposing cellular promoters is converted into siRNA that can target RC-L1 or ORF transcripts for degradation.
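The arrangement in panel (a) can be pictured as an interval overlap: a sense transcript and an antisense transcript can pair only across the stretch of the element that both cover. The Python sketch below is a minimal illustration under stated assumptions. The coordinates (a sense transcript spanning positions 1 to 6000 of a 6 kb element, and an antisense transcript starting at an assumed position +500 and running back to position 1) are hypothetical values chosen for the example, and the helper names are not taken from any published tool.

```python
def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A/U/G/C only)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def dsrna_overlap(sense, antisense):
    """Return the (start, end) interval covered by both transcripts, or None.

    Each transcript is given as an inclusive (start, end) pair of positions
    on the plus strand of the element.
    """
    start = max(sense[0], antisense[0])
    end = min(sense[1], antisense[1])
    return (start, end) if start <= end else None


# Hypothetical coordinates, for illustration only.
sense_transcript = (1, 6000)     # sense promoter, assumed to cover the element
antisense_transcript = (1, 500)  # antisense transcript from an assumed +500 start
region = dsrna_overlap(sense_transcript, antisense_transcript)
if region is not None:
    length = region[1] - region[0] + 1
    # A duplex must be at least siRNA-sized (21 nt) to yield one siRNA.
    print(region, length, length >= 21)
```

The sketch says nothing about whether such transcripts actually meet in the cell; it only makes explicit that the potential duplex is confined to the shared 5′ UTR interval.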

detected with probes homologous to the Tad ORF1 or ORF2, indicating that siRNAs were produced along the length of the element. In C elegans, RNase protection analysis successfully detected Tc1 dsRNA produced by read-through transcription from endogenous promoters, as well as Tc1 siRNA in the germ line of wild-type and RNAi-deficient worms (Table 1). In contrast to Tad siRNAs, endogenous siRNAs from the C elegans DNA transposons were not derived from the transposase ORF, but were detected with probes complementary to the inverted repeats [48]. The fact that C elegans mutator strains also show increased mobility of other DNA transposons such as Tc3 and Tc5 suggests that C elegans RNAi is not specific to one element and that RNAi might be a general defense mechanism against transposon activity. Endogenous siRNAs homologous to retrotransposons have also been detected by Northern blot in Arabidopsis thaliana and Drosophila melanogaster (Table 1) [31, 32].

One requirement for the production of L1 siRNA would be transcription of antisense L1 RNA that could hybridize with L1 sense RNA to form dsRNA, followed by Dicer-mediated processing into siRNAs. An early study of L1 expression demonstrated that large quantities of both sense and antisense L1 RNA of variable size greater than 1 kb are present in total RNA of a human teratocarcinoma cell line, but not in the cytoplasmic RNA fractions where Dicer processing of L1 dsRNA might occur [49]. The expression profile of L1 sequences is particularly complicated, not only because the ∼3000 FL-L1s that reside in the human genome contain an internal Pol II promoter that could remain transcriptionally active, but also because strong cellular promoters near presumably inactive L1s could result in the expression and translation of unwanted L1 ORF products [50] (Figure 2(c)). Therefore, the production of L1 dsRNA and its conversion by Dicer into L1 siRNA might simply be a consequence of the large number (> 500 000 copies/diploid genome) of L1 sequences and their proximity to transcriptionally active, endogenous promoters (Figure 2(c)). The activity of adjacent promoters also establishes the possibility that L1 dsRNA or siRNA could form through simple diffusion of complementary L1 transcripts expressed from distant loci. In addition to the activity of cellular promoters, regions of the L1 mRNA that form stable hairpin structures greater than 21 nucleotides might also be subject to Dicer processing into siRNA (Figure 2(b)). To date, no L1 hairpin structures have been defined biochemically, although recombinant human Dicer efficiently converts in vitro transcribed L1 dsRNA into functional siRNA [51].

Instead of relying on adjacent promoters for transcription, the production of sense/antisense L1 dsRNA might take advantage of a unique feature of the L1 5′ UTR: the existence of an internal promoter that transcribes L1 sense RNA and an antisense promoter (ASP) within nucleotides +400 to +600 (with respect to the 5′-end of the L1) of the 5′ UTR that transcribes minus-strand L1 sequence in the opposite direction (Figure 2(a)) [52, 53]. In cell lines where the 5′ UTR sense promoter shows transcriptional activity, the L1 ASP is also active, albeit at lower levels [52, 54]. The resulting minus strand L1 RNA could anneal with plus strand L1 RNA originating from the same L1 5′ UTR region, or anneal with another 5′ UTR sense RNA by diffusion. Dicer could then convert the dsRNA derived from the L1’s 5′ UTR into siRNA. It is important to recognize that 5′ UTR siRNA can act on transcripts arising from the L1’s sense promoter as well as the L1’s ASP (Figure 2(a)). As the mechanism for choosing which strand of the siRNA (sense strand targeting antisense message or antisense strand targeting sense message) is incorporated into RISC along with the target is not well understood, it is possible that siRNA produced from this unique region of the L1’s 5′ UTR could generate two different RNAi responses [55, 56]. First, L1 retrotransposition could be kept in check by the antisense siRNA strand recognizing transcripts originating from RC-L1s. Additionally, the sense siRNA strand could target transcripts from the L1’s ASP, thereby regulating the expression of certain endogenous genes through the action of a single pool of L1 5′ UTR siRNAs [53].

As of yet, short duplex RNAs derived from L1s await characterization, possibly owing to low-level expression in specific cell types. Solution hybridization using radiolabelled RNA probes from conserved regions of the L1’s 5′ UTR offers a sensitive method to detect endogenous 5′ UTR siRNAs. For the detection of L1 siRNA, it will be necessary to distinguish short, single-stranded L1 RNA that might hybridize to the riboprobe and be mistakenly detected as L1 siRNA, from the real L1 siRNA duplexes, which being double-stranded are resistant to RNase A activity in the presence of high salt [48]. A further issue complicating the detection of L1 siRNAs by ribonuclease digestion is the fact that single nucleotide mismatches between endogenous L1 siRNAs and the riboprobe might cause cleavage and detection of protected fragments that are smaller than the predicted 21–25 nucleotide size for siRNA. Careful design of 5′ UTR riboprobes should limit potential problems caused by single nucleotide mismatches. For example, one could restrict detection of siRNAs to a specific L1 subfamily, such as Ta-1d, which harbors a deletion at position 72 of the 5′ UTR that distinguishes this youngest L1 subset from the slightly more divergent Ta-1nd [57].

CONCLUSION

There is ample experimental evidence, through genetic manipulation and biochemical analysis, that RNA interference controls the activity of transposable elements in a variety of eukaryotes such as A thaliana, S pombe, C elegans, and M musculus [19–21, 24, 25, 28–33]. In addition, since the RNAi response can efficiently limit retrotransposition of an RC-L1 when introduced into transformed human cells, there are no barriers per se to siRNA-mediated degradation of L1s. The inability to uncover direct evidence that RNAi may control L1 activity is not due to a lack of effort, as several groups are pursuing experiments to assess the interaction between RNAi and human L1s. The difficulty in studying the activity of endogenous human L1s in cells with an impaired RNAi pathway has slowed progress in showing a role for RNAi in suppressing L1s. As current Dicer- and Ago2-null mice show early embryonic lethality, the use of conditional gene targeting through Cre-mediated excision of floxed RNAi alleles will permit further assessment of the role of RNAi in L1 retrotransposition [23, 27]. Conditional gene targeting and deletion of Dicer in T cells causes loss of microRNA processing linked to impaired T cell differentiation [58]. These Dicer-deficient T cells are viable, but lack Dicer activity, thus providing a distinct Dicer-null population in which retrotransposon activity can be assessed. It is just a matter of time before proper experiments, combined with dogged determination, provide direct evidence that human L1s are, to some degree, constrained by the RNAi pathway.

ACKNOWLEDGMENTS

I would like to thank Lars Aagard, Kevin Morris, and John Rossi for critical review of this manuscript and helpful comments. The author is supported by the Beckman Fellowship from the Arnold and Mabel Beckman Foundation.

REFERENCES

[1] Ostertag EM, Kazazian HH Jr. Biology of mammalian L1 retrotransposons. Annual Review of Genetics. 2001;35:501–538.
[2] Brouha B, Schustak J, Badge RM, et al. Hot L1s account for the bulk of retrotransposition in the human population. Proceedings of the National Academy of Sciences of the United States of America. 2003;100(9):5280–5285.
[3] Kulpa DA, Moran JV. Ribonucleoprotein particle formation is necessary but not sufficient for LINE-1 retrotransposition. Human Molecular Genetics. 2005;14(21):3237–3248.
[4] Kimberland ML, Divoky V, Prchal J, Schwahn U, Berger W, Kazazian HH Jr. Full-length human L1 insertions retain the capacity for high frequency retrotransposition in cultured cells. Human Molecular Genetics. 1999;8(8):1557–1560.
[5] Gilbert N, Lutz-Prigge S, Moran JV. Genomic deletions created upon LINE-1 retrotransposition. Cell. 2002;110(3):315–325.
[6] Symer DE, Connelly C, Szak ST, et al. Human l1 retrotransposition is associated with genetic instability in vivo. Cell. 2002;110(3):327–338.
[7] Burwinkel B, Kilimann MW. Unequal homologous recombination between LINE-1 elements as a mutational mechanism in human genetic disease. Journal of Molecular Biology. 1998;277(3):513–517.
[8] Kumatori A, Faizunnessa NN, Suzuki S, Moriuchi T, Kurozumi H, Nakamura M. Nonhomologous recombination between the cytochrome b558 heavy chain gene (CYBB) and LINE-1 causes an X-linked chronic granulomatous disease. Genomics. 1998;53(2):123–128.
[9] Suminaga R, Takeshima Y, Yasuda K, Shiga N, Nakamura H, Matsuo M. Non-homologous recombination between Alu and LINE-1 repeats caused a 430-kb deletion in the dystrophin gene: a novel source of genomic instability. Journal of Human Genetics. 2000;45(6):331–336.
[10] Kazazian HH Jr, Moran JV. The impact of L1 retrotransposons on the human genome. Nature Genetics. 1998;19(1):19–24.
[11] Esnault C, Maestre J, Heidmann T. Human LINE retrotransposons generate processed pseudogenes. Nature Genetics. 2000;24(4):363–367.
[12] Wei W, Gilbert N, Ooi SL, et al. Human L1 retrotransposition: cis preference versus trans complementation. Molecular and Cellular Biology. 2001;21(4):1429–1439.
[13] Meister G, Tuschl T. Mechanisms of gene silencing by double-stranded RNA. Nature. 2004;431(7006):343–349.
[14] Bernstein E, Caudy AA, Hammond SM, Hannon GJ. Role for a bidentate ribonuclease in the initiation step of RNA interference. Nature. 2001;409(6818):363–366.
[15] Lee Y, Jeon K, Lee J-T, Kim S, Kim VN. MicroRNA maturation: stepwise processing and subcellular localization. The EMBO Journal. 2002;21(17):4663–4670.
[16] Hammond SM, Boettcher S, Caudy AA, Kobayashi R, Hannon GJ. Argonaute2, a link between genetic and biochemical analyses of RNAi. Science. 2001;293(5532):1146–1150.
[17] Schwarz DS, Tomari Y, Zamore PD. The RNA-induced silencing complex is a Mg2+-dependent endonuclease. Current Biology. 2004;14(9):787–791.
[18] Mourelatos Z, Dostie J, Paushkin S, et al. miRNPs: a novel class of ribonucleoproteins containing numerous microRNAs. Genes & Development. 2002;16(6):720–728.
[19] Tabara H, Sarkissian M, Kelly WG, et al. The rde-1 gene, RNA interference, and transposon silencing in C elegans. Cell. 1999;99(2):123–132.
[20] Ketting RF, Haverkamp THA, van Luenen HGAM, Plasterk RHA. Mut-7 of C elegans, required for transposon silencing and RNA interference, is a homolog of Werner syndrome helicase and RNaseD. Cell. 1999;99(2):133–141.
[21] Vastenhouw NL, Fischer SEJ, Robert VJP, et al. A genome-wide screen identifies 27 genes involved in transposon silencing in C elegans. Current Biology. 2003;13(15):1311–1316.
[22] Zhang H, Kolb FA, Jaskiewicz L, Westhof E, Filipowicz W. Single processing center models for human Dicer and bacterial RNase III. Cell. 2004;118(1):57–68.
[23] Bernstein E, Kim SY, Carmell MA, et al. Dicer is essential for mouse development. Nature Genetics. 2003;35(3):215–217.
[24] Kanellopoulou C, Muljo SA, Kung AL, et al. Dicer-deficient mouse embryonic stem cells are defective in differentiation and centromeric silencing. Genes & Development. 2005;19(4):489–501.
[25] Svoboda P, Stein P, Anger M, Bernstein E, Hannon GJ, Schultz RM. RNAi and expression of retrotransposons MuERV-L and IAP in preimplantation mouse embryos. Developmental Biology. 2004;269(1):276–285.
[26] Meister G, Landthaler M, Patkaniowska A, Dorsett Y, Teng G, Tuschl T. Human Argonaute2 mediates RNA cleavage targeted by miRNAs and siRNAs. Molecular Cell. 2004;15(2):185–197.
[27] Liu J, Carmell MA, Rivas FV, et al. Argonaute2 is the catalytic engine of mammalian RNAi. Science. 2004;305(5689):1437–1441.
[28] Nolan T, Braccini L, Azzalin G, De Toni A, Macino G, Cogoni
[29] Shi H, Djikeng A, Tschudi C, Ullu E. Argonaute protein in the early divergent eukaryote Trypanosoma brucei: control of small interfering RNA accumulation and retroposon transcript abundance. Molecular and Cellular Biology. 2004;24(1):420–427.
[30] Djikeng A, Shi H, Tschudi C, Ullu E. RNA interference in Trypanosoma brucei: cloning of small interfering RNAs provides evidence for retroposon-derived 24-26-nucleotide RNAs. RNA. 2001;7(11):1522–1530.
[31] Zilberman D, Cao X, Jacobsen SE. ARGONAUTE4 control of locus-specific siRNA accumulation and DNA and histone methylation. Science. 2003;299(5607):716–719.
[32] Sarot E, Payen-Groschêne G, Bucheton A, Pélisson A. Evidence for a piwi-dependent RNA silencing of the gypsy endogenous retrovirus by the Drosophila melanogaster flamenco gene. Genetics. 2004;166(3):1313–1321.
[33] Kalmykova AI, Klenov MS, Gvozdev VA. Argonaute protein PIWI controls mobilization of retrotransposons in the Drosophila male germline. Nucleic Acids Research. 2005;33(6):2052–2059.
[34] Vastenhouw NL, Plasterk RHA. RNAi protects the Caenorhabditis elegans germline against transposition. Trends in Genetics. 2004;20(7):314–319.
[35] Hudson DF, Morrison C, Ruchaud S, Earnshaw WC. Reverse genetics of essential genes in tissue-culture cells: ‘dead cells talking’. Trends in Cell Biology. 2002;12(6):281–287.
[36] Paddison PJ, Silva JM, Conklin DS, et al. A resource for large-scale RNA-interference-based screens in mammals. Nature. 2004;428(6981):427–431.
[37] Berns K, Hijmans EM, Mullenders J, et al. A large-scale RNAi screen in human cells identifies new components of the p53 pathway. Nature. 2004;428(6981):431–437.
[38] Hutvágner G, McLachlan J, Pasquinelli AE, Bálint E, Tuschl T, Zamore PD. A cellular function for the RNA-interference enzyme Dicer in the maturation of the let-7 small temporal RNA. Science. 2001;293(5531):834–838.
[39] Ye K, Malinina L, Patel DJ. Recognition of small interfering RNA by a viral suppressor of RNA silencing. Nature. 2003;426(6968):874–878.
[40] Lu S, Cullen BR. Adenovirus VA1 noncoding RNA can inhibit small interfering RNA and MicroRNA biogenesis. Journal of Virology. 2004;78(23):12868–12876.
[41] Bucher E, Hemmes H, de Haan P, Goldbach R, Prins M. The influenza A virus NS1 protein binds small interfering RNAs and suppresses RNA silencing in plants. Journal of General Virology. 2004;85(pt 4):983–991.
[42] Kawasaki H, Taira K. Short hairpin type of dsRNAs that are controlled by tRNAVal promoter significantly induce RNAi-mediated gene silencing in the cytoplasm of human cells. Nucleic Acids Research. 2003;31(2):700–707.
[43] Lagos-Quintana M, Rauhut R, Lendeckel W, Tuschl T. Identification of novel genes coding for small expressed RNAs. Science. 2001;294(5543):853–858.
[44] Fukagawa T, Nogami M, Yoshikawa M, et al. Dicer is essential for formation of the heterochromatin structure in vertebrate cells.
Nature Cell Biology. 2004;6(8):784–791. C. The post-transcriptional gene silencing machinery func- [45] Branciforte D, Martin SL. Developmental and cell type speci- tions independently of DNA methylation to repress a LINE1- ficity of LINE-1 expression in mouse testis: implications like retrotransposon in Neurospora crassa. Nucleic Acids Re- for transposition. Molecular and Cellular Biology. 1994;14(4): search. 2005;33(5):1564–1573. 2584–2592. 8 Journal of Biomedicine and Biotechnology

[46] Ergün S, Buschmann C, Heukeshoven J, et al. Cell type-specific expression of LINE-1 open reading frames 1 and 2 in fetal and adult human tissues. The Journal of Biological Chemistry. 2004;279(26):27753–27763.
[47] Miki Y, Nishisho I, Horii A, et al. Disruption of the APC gene by a retrotransposal insertion of L1 sequence in a colon cancer. Cancer Research. 1992;52(3):643–645.
[48] Sijen T, Plasterk RHA. Transposon silencing in the Caenorhabditis elegans germ line by natural RNAi. Nature. 2003;426(6964):310–314.
[49] Skowronski J, Singer MF. Expression of a cytoplasmic LINE-1 transcript is regulated in a human teratocarcinoma cell line. Proceedings of the National Academy of Sciences of the United States of America. 1985;82(18):6050–6054.
[50] Swergold GD. Identification, characterization, and cell specificity of a human LINE-1 promoter. Molecular and Cellular Biology. 1990;10(12):6718–6729.
[51] Soifer HS, Zaragoza A, Peyvan M, Behlke MA, Rossi JJ. A potential role for RNA interference in controlling the activity of the human LINE-1 retrotransposon. Nucleic Acids Research. 2005;33(3):846–856.
[52] Speek M. Antisense promoter of human L1 retrotransposon drives transcription of adjacent cellular genes. Molecular and Cellular Biology. 2001;21(6):1973–1985.
[53] Nigumann P, Redik K, Mätlik K, Speek M. Many human genes are transcribed from the antisense promoter of L1 retrotransposon. Genomics. 2002;79(5):628–634.
[54] Yang N, Zhang L, Zhang Y, Kazazian HH Jr. An important role for RUNX3 in human L1 transcription and retrotransposition. Nucleic Acids Research. 2003;31(16):4929–4940.
[55] Schwarz DS, Hutvágner G, Du T, Xu Z, Aronin N, Zamore PD. Asymmetry in the assembly of the RNAi enzyme complex. Cell. 2003;115(2):199–208.
[56] Khvorova A, Reynolds A, Jayasena SD. Functional siRNAs and miRNAs exhibit strand bias. Cell. 2003;115(2):209–216.
[57] Boissinot S, Chevret P, Furano AV. L1 (LINE-1) retrotransposon evolution and amplification in recent human history. Molecular Biology and Evolution. 2000;17(6):915–928.
[58] Muljo SA, Ansel KM, Kanellopoulou C, Livingston DM, Rao A, Rajewsky K. Aberrant T cell differentiation in the absence of Dicer. The Journal of Experimental Medicine. 2005;202(2):261–269.

Hindawi Publishing Corporation
Journal of Biomedicine and Biotechnology
Volume 2006, Article ID 32713, Pages 1–8
DOI 10.1155/JBB/2006/32713

Review Article The Potential Regulation of L1 Mobility by RNA Interference

Shane R. Horman,1 Petr Svoboda,2 and Eline T. Luning Prak1

1 Department of Pathology and Laboratory Medicine, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104-6055, USA 2 Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, 4058 Basel, Switzerland

Received 6 August 2005; Revised 12 December 2005; Accepted 20 December 2005

The hypothesis that RNA interference constrains L1 mobility seems inherently reasonable: L1 mobility can be dangerous and L1 RNA, the presumed target of RNAi, serves as a critical retrotransposition intermediate. Despite its plausibility, proof for this hypothesis has been difficult to obtain. Studies attempting to link the L1 retrotransposition frequency to alterations in RNAi activity have been hampered by the long times required to measure retrotransposition frequency, the pleiotropic and toxic effects of altering RNAi over similar time periods, and the possibility that other cellular machinery may contribute to the regulation of L1s. Another problem is that the commonly used L1 reporter cassette may serve as a substrate for RNAi. Here we review the L1-RNAi hypothesis and describe a genetic assay with a modified reporter cassette that detects approximately 4 times more L1 insertions than the conventional retrotransposition assay.

Copyright © 2006 Shane R. Horman et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

RNAi SILENCING OF TRANSPOSABLE ELEMENTS

RNAi is an evolutionarily conserved process of sequence-specific posttranscriptional gene silencing (reviewed in [1]). Double-stranded RNA (dsRNA) is cleaved by the ribonuclease DICER into small interfering RNA species (siRNAs). SiRNA molecules, in turn, target complementary RNA sequences for destruction (reviewed in [2]). RNAi is postulated to play a role in the silencing of transposable elements and viruses that produce dsRNA [3, 4]. One line of evidence linking RNAi to repressed transposition comes from the nematode, C elegans [5, 6]. Tc1 elements, a class of DNA transposons, mobilize in somatic cells, but are silenced in the germ line of C elegans. A number of mutant C elegans strains that have lost this silencing have also lost the ability to execute RNAi (though there were also RNAi mutants that lacked this transposon mobilization phenotype) [5]. The identification of specific genes, which when mutated show activation of germline transposition, indicates that an active transposon-silencing process exists in the germline [5, 6]. Another line of evidence linking RNAi (or a mechanism similar to RNAi) to the regulation of transposable elements involves the I-factor in Drosophila. Mobilization of the I-factor (an L1-like non-LTR retrotransposon) is regulated at least in part by a homology-dependent silencing mechanism in the female germline [7, 8]. This silencing mechanism has been linked to a series of molecules that are implicated in the RNAi pathway, including the Argonaute protein PIWI [9, 10].

By analogy, perhaps a sequence-dependent process of mobile element silencing, such as RNAi, is used to regulate L1 mobility. As with the above-mentioned examples, the regulation of L1 mobility may be particularly relevant in the germline and in embryos. Mobility in the germline or in embryos could result in inheritance of the new insertion. These sites are also where L1s are believed to be most active [11–14]. Other mechanisms for recognizing and responding to dsRNA, such as RNase L and PKR-mediated responses, can cause apoptosis. While apoptosis seems like a reasonable strategy for dealing with a wayward somatic cell, in the germline or early embryo, apoptosis could be detrimental to the fitness of the organism [14, 15]. Here we explore the thesis that the mobility of human L1s is regulated by RNAi.

L1 RETROTRANSPOSITION: HAZARDS AND CONSTRAINTS

The human genome contains roughly half a million long interspersed elements (L1s) that collectively account for 17% of its mass [16]. Most new L1 insertions are “dead on arrival” due to 5′ truncation and nearly all but perhaps 60–100 L1 sequences in the human genome are inactive due to truncation, inversion, or mutation [17].

As discussed elsewhere in this issue, retrotransposition can be hazardous because L1s can insert into genes, alter gene expression, shuffle exons, transduce 3′ flanking sequences, and mobilize Alu elements, and because their replicative mobilization adds significant DNA mass to the genome [18–24]. L1 insertions and recombination events involving genomic L1 and Alu insertions have been reported in a number of genetic disorders (reviewed in [25]). Although it is possible that some functions of L1 are beneficial to mammals (a most interesting recent demonstration involves the potential role of L1s as diversity generators in the CNS [26]), most germline L1 insertions are likely to be neutral or negatively selected. Negative selection of L1s is suggested by the higher frequencies of full-length human L1 insertions on the sex chromosomes than the autosomes (the former not being as able as the latter to undergo purifying selection) and by the dominance and limited periods of activity of single L1 subfamilies in some primate lineages [27, 28].

L1 mobility in mammals appears to be actively constrained. An indirect line of evidence for this constraint is that different cell types exhibit different rates of retrotransposition, ranging from 30% or higher in some transformed cell lines to fewer than one per million cells. In the mouse, the rate of germline retrotransposition events using an L1-EGFP transgene is approximately one event in 100 offspring [11, 13]. Analysis of L1 transcription, protein production, and retrotransposition reveals different levels of L1 activity in different cell types, with highest levels of activity noted in germ cells, embryonal cells, and recently neuronal cells [11–13, 26, 29, 30]. The factor(s) that assist L1 mobilization in some cell lines, but not others, are not known.

L1 RNA IS A LOGICAL TARGET FOR LIMITING L1 MOBILITY

RNA is a logical target for cellular machinery to protect against unwanted L1 proliferation. L1 RNA is required and may be rate-limiting for retrotransposition. In cell-culture-based assays with tagged human L1 elements, it has been shown that a decrease in L1 mRNA leads to a decrease in L1 retrotransposition frequencies, lending support to the idea that L1 activity can be limited by regulating L1 transcript abundance [31, 32]. L1 RNA is critical for retrotransposition because it encodes the necessary ORF1 and ORF2 proteins, which act preferentially upon the RNA that encoded them [33, 34]. This effect, termed cis preference, may allow active L1s a greater proliferative advantage than retroelements that mobilize in trans because trans-mobilization can result in the expansion of mutated rather than active elements. RNAi may be able to counter this potential advantage of cis preference by using nonfunctional L1 RNAs to inhibit functional L1s. On the other hand, the high copy number of L1 insertions in mammals may have selected for L1s that are inefficiently regulated by RNAi. If RNAi silences L1s, it does not do so with perfect efficiency since L1 transcripts are detected, and some L1s can still mobilize in the human genome.

How (or even if) L1 RNA is recognized by cellular machinery is unknown. If RNAi limits human L1 retrotransposition, the most obvious possibility is that RNAi posttranscriptionally targets L1 mRNA. The presumed trigger for RNAi is double-stranded RNA (dsRNA), although other forms of sequence-specific recognition or unusual RNA secondary structure are possible. DsRNA has been documented to be the target of RNAi-induced transposon silencing in other species, most notably the Tc1 DNA transposon in C elegans [5, 6]. Read-through transcription of dispersed Tc1 copies can form dsRNA as a result of “snap-back” of their terminal inverted repeats (TIRs), which are complementary in sequence. Human L1 retrotransposons are not flanked by complementary TIRs; however, there is considerable nucleotide sequence similarity between active L1s [35]. This high level of sequence similarity amongst active human L1 elements might allow only a few L1 dsRNA molecules to silence many genomic L1s. Sense and antisense L1 transcripts have been documented in human teratocarcinoma cells [36]. There are two reports suggesting the presence of long L1 dsRNA [37, 38], although thus far an unequivocal demonstration of Dicer-derived L1 siRNAs or miRNAs from mammalian cells has remained elusive [39–41].

There are several ways in which L1 dsRNA could be formed (see Figure 1). First, antisense L1 RNA could arise as a read-through transcript from a heterologous promoter element (Figure 1(a)). If sense and antisense transcripts originating from different loci could form dsRNA, even inactive copies of L1 could contribute to the loss of L1 mobility (Figure 1(b)). As the genomic burden of L1 copies increases, the level of L1 repression might also increase. On the other hand, highly efficient silencing of all L1 copies in trans could be problematic since L1s may influence the human transcriptome significantly (reviewed in [20]). An alternative is to selectively target L1 dsRNA that arises in cis. In this connection, the L1 5′UTR has antisense promoter activity at positions 400–600, providing an additional source of antisense L1 RNA (Figure 1(c)) [42, 43]. RNAi targeting the L1 5′UTR would be expected to selectively restrain full-length (and therefore more likely to be active) elements. Finally, L1 dsRNA can originate from the transcription of inverted repeats (Figure 1(d)). A high copy number of L1 sequences in mammalian genomes increases the likelihood of generating inverted repeats either by genome rearrangements or by insertions of L1 elements into or near themselves [44]. We conducted a simple search for long inverted repeats (>200 bp) in the human genome, and a preliminary analysis of several chromosomes indicates that the whole genome contains tens of perfect inverted repeats of L1 sequences. Transcription of such inverted repeats results in efficient dsRNA formation because this dsRNA folding is a first-order reaction. Although the focus here is on dsRNA, there may be other forms of L1 RNA that are recognized. L1 RNA species are heterogeneous due to variable 5′ truncation, premature polyadenylation, and inversion [45, 46].
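The text does not describe how the search for long inverted repeats was carried out. As a purely illustrative sketch, the idea of a perfect inverted repeat (a stretch of sequence followed, after some spacer, by its own reverse complement) can be expressed as below. The function names, the `arm_len` and `max_spacer` parameters, and the toy sequence are all assumptions made for this example, and the naive window index shown here would need to be replaced by a far more memory-efficient method for chromosome-scale sequences and arms longer than 200 bp.

```python
from collections import defaultdict

# Translation table for DNA complementation (both cases handled).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, arm_len: int, max_spacer: int):
    """Find perfect inverted repeats in a single DNA sequence.

    Returns (left_start, right_start) pairs such that the window of
    length arm_len at right_start is the reverse complement of the
    window at left_start, the arms do not overlap, and the gap
    between them is at most max_spacer bases. Hypothetical helper,
    not the method used in the article.
    """
    seq = seq.upper()
    n_windows = len(seq) - arm_len + 1

    # Index every window of length arm_len by its start position.
    index = defaultdict(list)
    for i in range(n_windows):
        index[seq[i:i + arm_len]].append(i)

    hits = []
    for j in range(n_windows):
        partner = reverse_complement(seq[j:j + arm_len])
        for i in index.get(partner, ()):
            spacer = j - (i + arm_len)
            if 0 <= spacer <= max_spacer:
                hits.append((i, j))
    return hits


if __name__ == "__main__":
    # Toy example: a 10-base arm, a 4-base spacer, and the arm's reverse complement.
    arm = "GATTACAGGC"
    toy = arm + "TTTT" + reverse_complement(arm)
    print(find_inverted_repeats(toy, arm_len=10, max_spacer=50))  # [(0, 14)]
```

A transcript that reads through both arms of such a hit carries two self-complementary regions on one RNA strand, which is the situation in which intramolecular folding into a hairpin becomes possible.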

Figure 1: Generation of L1 dsRNA. L1 dsRNA could arise from different transcripts (shown in Figures 1(a), 1(b), and 1(c)) or from the same transcript (shown in Figure 1(d)). (a) An L1 transcript originates from the internal 5′UTR promoter, producing a sense-strand RNA. Neighboring the same element, an antisense-oriented heterologous promoter produces a read-through transcript that includes antisense L1 RNA. (b) An L1 is transcribed producing sense RNA and another L1 insertion, elsewhere in the genome, is transcribed off of a heterologous promoter yielding an antisense RNA. (c) An L1 is transcribed off of its 5′UTR producing a sense transcript, while antisense promoter activity of the 5′UTR produces an antisense transcript. (d) If an L1 inserts near another L1 sequence in the genome, it may be possible to create a hairpin. The figure shows two full-length L1 sequences facing each other, although it should be noted that hairpins could also form between truncated L1 copies that face each other, as long as there is a transcript that extends between the copies. DNA strands are shown with solid lines and RNA with dashed lines. In the scenario depicted, transcripts off of either DNA strand extending through the two L1 sequences will give rise to self-complementary regions: the forward facing L1 and the reverse complementary sequence of L1 on the same RNA strand can base pair, forming hairpins (dashed vertical lines) with stretches of dsRNA.

HOW MIGHT RNAi SILENCE L1s?

The most obvious possibility is that RNAi limits L1 retrotransposition by decreasing the amount of L1 RNA. In this case, disruption of RNAi should increase L1 RNA levels and result in an increased retrotransposition frequency. The relationship between RNAi and repression of the LTR retrotransposons MuERV-L and intracisternal A-particle (IAP) was recently investigated in early mouse embryos [47]. Knocking down DICER (with siRNA or dsRNA) resulted in a 50% increase in the abundance of MuERV-L and IAP transcripts [47]. Recently, conditional dicer knock-out ES cells were shown to exhibit slightly increased levels of IAP and L1 transcripts compared to dicer wild-type cells [48]. In further support of this theory, L1 retrotransposons can form dsRNA that is cleaved into siRNAs by DICER in cultured cells [49, 50]. This analysis reveals that L1s can serve as targets for RNAi, but does not address whether they do so in nature.

Another possibility is that one or more components of the RNAi machinery act by silencing L1 insertions in chromatin via methylation of L1 DNA [47]. Methylation has been proposed as a genomic defense against transposable elements and may function in an RNAi-dependent or independent manner to limit L1 transcription [53, 54]. Methylation of the L1 5′UTR has been demonstrated in different cell types [55, 56]. Treatment of 3T3 cells with 5-azacytidine, a pyrimidine analog that inhibits DNA methyltransferase, increases L1 transcript abundance [57]. In mice, inactivation of methylases can result in mobilization of retrotransposons including IAP elements and L1s [58, 59]. On the other hand, methylation was not observed in response to stable dsRNA expression in murine oocytes [47]. Consistent with the latter observation, a recent analysis in human cancer cells suggests that RNAi-mediated transcriptional silencing can arise independently of methylation [60].

Another nonmutually exclusive possibility is that RNAi participates in altering chromatin accessibility. Heterochromatic silencing and histone methylation have been tied to the RNAi pathway in S pombe [61, 62]. Moreover, in the filamentous fungus, Neurospora crassa, repression of the L1-like retrotransposon Tad is dependent upon the Argonaute protein QDE2 and DICER [63]. DNA and histone methylation have also been implicated in transposon control in

Arabidopsis [64]. In certain yeast and plant species, heterochromatin formation may be directed by siRNAs in an Argonaute complex with similarities to the RNA-induced silencing complex, suggesting that the processes of PTGS and transcriptional gene silencing are intertwined [63]. Although plant L1-like elements differ from mammalian L1 elements, a similar means of mammalian L1-associated chromatin silencing may be at work.

It is possible that RNAi acts upon L1s using all of these pathways: degradation of L1 RNA (which limits the production of new insertions), modification of L1 DNA sequences and chromatin silencing (which should limit the activity of new or existing functional L1s). The containment of L1s in regions of silenced chromatin provides protection by suppressing their transcription, mobility, and recombinational activity [65–67].

ESTABLISHING A FUNCTIONAL LINK BETWEEN RNAi AND L1 RETROTRANSPOSITION

Currently, the only direct evidence linking RNAi to the repression of L1 elements in mammals is a slightly increased level of L1 transcripts in dicer deficient mouse ES cells [48]. Current efforts to explore L1 regulation by RNAi in mammals are focused on three areas: (i) demonstration of siRNAs derived from native L1 elements; (ii) determining whether L1 dsRNA is assembled in cis (from the same L1) or if sense and antisense transcripts originating from two different elements (assembly in trans) can also trigger RNAi; and (iii) perturbing components of the RNAi pathway and seeing if there are corresponding alterations in the L1 retrotransposition frequency. Exploration of the first area is under active investigation and is discussed in detail elsewhere in this issue. Concerning the second area, it has been assumed that dsRNA formation in trans is minimal because xenogeneic L1s (eg, a human element in a mouse cell) do not appear to be more active than syngeneic L1s (a human element in a human cell) [32, 68]. However, the permissiveness for L1 retrotransposition in these different cell types is not controlled for. It is intriguing that L1 elements that have genetically modified RNA sequences, but identical protein coding sequences, can be far more active for retrotransposition [32, 46]. While there are other potential reasons for this (such as decreased premature polyadenylation and alterations in RNA structure), it will be interesting to see if some of the enhanced activity of synthetic L1s is due to different levels of RNAi. The third area of investigation attempts to establish a mechanistic link between RNAi and L1 retrotransposition.

Analyzing L1 retrotransposition in cells with altered RNAi activity is challenging. The first obstacle is to monitor the mobilization of an active L1 in a sea of L1 sequences in the genome. To get around this “needle-in-a-haystack” problem, L1 elements were tagged with antisense marker cassettes interrupted in the sense direction by an intron [52]. These tagged elements could then be monitored for retrotransposition by scoring for expression of the marker (which could only occur after a cycle of transcription, processing, reverse transcription, and integration into a transcriptionally permissive region of the genome, see Figure 2(a)). Because the L1 retrotransposition construct contains sense and antisense promoters, it may induce RNAi regardless of whether the L1 element induces RNAi naturally.

To circumvent the potential problem of having bidirectional transcription in the L1-EGFP construct, we created a series of EGFP-tagged L1 elements that lacked antisense promoter activity (Figure 2(b)). Using a genetic assay to monitor retrotransposition, our preliminary data reveal an approximately 4-fold increase in retrotransposition when constructs lacking the antisense promoter in the EGFP marker were used, compared to the conventional L1-EGFP construct (Figure 2(b)). To control for length effects (better detection of retrotransposition events due to a shorter marker cassette), we created a construct with a “stuffer fragment” in place of the EGFP promoter. This L1-stuffer construct also exhibited increased retrotransposition compared to the standard L1-EGFP construct, suggesting that the increased retrotransposition frequency was due to the absence of promoter activity rather than to differences in marker length. Because the retrotransposition frequency in this assay approaches 90%, we may be underestimating the true retrotransposition frequency (there may be more than one insertion per clone). The mechanistic basis for the increased retrotransposition activity of constructs lacking the antisense promoter driving the EGFP marker cassette is unresolved. Perhaps these constructs will be helpful in future studies that attempt to link RNAi to the regulation of L1 retrotransposition.

ALTERNATIVE SILENCING PATHWAYS MEDIATED BY DsRNA

DsRNA can induce several different pathways in mammals. One of them is RNA editing, a process in which adenosines are converted to inosine in nuclear dsRNA by the enzyme adenosine deaminase (ADAR). Editing of dsRNA can occur in a site-selective or promiscuous fashion. The latter results in the generation of a series of variably mutated RNA species. DsRNA longer than 50 bp in which > 20% A-to-I editing has occurred is referred to as hyperedited [69]. Based largely on work with polyoma virus, hyperedited RNA may be retained and/or sequestered in the nucleus [69].

L1 RNA can serve as a substrate for RNA editing [70, 71]. However, the effects of RNA editing on L1 activity are unknown. Since RNA editing affects dsRNA without targeting homologous copies of single-stranded RNA, editing may have a smaller impact on L1 retrotransposition than RNAi. If L1 RNA editing is similar to Alu editing, most RNA duplexes would be formed intramolecularly due to base pairing between two oppositely oriented Alus residing in the same RNA molecule [72]. Such duplexes would be expected to have imperfect base pairing between neighboring oppositely oriented L1 elements and could promote editing rather than RNAi. RNA editing may further help L1 to evade RNAi because hyperedited L1 dsRNA would probably be processed less efficiently into siRNAs and such siRNAs would not base pair as well with their targets. This idea is consistent with the observation that RNAi is antagonized by hyperediting [73]

[Figure 2 graphic not reproduced. (a) Schematic of the L1-EGFP construct (CMV promoter, 5′UTR, ORF1, ORF2, antisense EGFP cassette interrupted by an intron and carrying its own CMV promoter and pA, 3′UTR, pA) and of a 5′ truncated insertion. (b) Bar chart of the percentage of transfected clones with spliced EGFP DNA (axis ticks 1, 15, 30, 45, 60, 75, 90) for each construct: L1-EGFP (n = 66), L1-EGFP-delP (n = 46), L1-EGFP-stuffer (n = 72), and L1-EGFP-RIC (n = 48).]

Figure 2: (a) The standard L1 reporter construct contains opposing promoters. The standard L1 reporter construct used in our laboratory (L1-EGFP) consists of the CMV promoter, the human L1RP element, and the antisense EGFP gene cloned into the L1 3 UTR followed by theSV40latepoly-AsequenceinpCEP4(constructdescribedinmoredetailin[51]). When L1-EGFP retrotransposes, a full-length L1 RNA is transcribed, the intron interrupting EGFP is spliced out, and the processed RNA is reverse transcribed, and a cDNA copy is inserted into the genome. If the insertion is of sufficient length and enters the genome in a transcriptionally permissive region, retrotransposition can be detected phenotypically by screening for EGFP expression. Retrotransposition can also be assayed genetically by performing PCR with primers that flank the EGFP intron. Because the EGFP marker is driven off of an antisense-oriented promoter relative to the L1, the potential exists for creating dsRNA. L1 and EGFP transcripts are given by dashed horizontal lines, promoters are denoted with black arrows, and blue arrows indicate intron-flanking primers used to distinguish new insertions from the parental L1. (b) Loss of an antisense promoter increases L1 retrotransposition in a cultured cell assay. 143B osteosarcoma cells were transfected with one of the following constructs as shown in Figure 2(a): L1-EGFP (the same wild-type L1 retrotransposition construct shown in Figure 2(a)), L1-EGFP-DelP (identical to L1-EGFP except that the CMV promoter driving EGFP was deleted), L1-EGFP-Stuffer (identical to L1-EGFP except that the CMV promoter driving EGFP was replaced with a piece of DNA lacking promoter activity or polyadenylation signals), or L1-EGFP-RIC (retrotransposition incompetent due to two missense mutations (marked with a red X over the L1 coding sequence) derived from the JM111 L1 mutant [52]). 
Boxes indicate coding sequences except for the yellow box in the L1-EGFP-Stuffer construct that denotes the stuffer sequence. Arrows denote the promoters and the black line separating the EGFP cassette denotes the intron. Cells were selected in hygromycin for two weeks and individual clones were picked and expanded. PCR using primers that flank the intron/exon splice site in EGFP (as described in [11]) was used to monitor individual clones of antibiotic-resistant cells for L1 retrotransposition (loss of the intron in EGFP). The percentages of clones that had the spliced EGFP are shown in Figure 2(b). The number of clones surveyed for each genotype is given to the right of each of the bars. None of the retrotransposition incompetent L1 transfectants had a spliced EGFP product.

and that the phenotype of ADAR mutants can be rescued by mutations in RNAi [74].

In addition to siRNA and RNA editing, longer L1 dsRNA molecules can induce additional cellular responses [75]. Longer dsRNA molecules can be recognized by the dsRNA-dependent protein kinase PKR, which, when activated, results in interferon-mediated activation of the Jak-Stat pathway and cellular upregulation of interferon-regulated genes [76]. This mechanism of cytokine defense is an innate immune response that likely arose to combat viruses, which frequently produce dsRNA. Activation of PKR by dsRNA results in its autophosphorylation and subsequent phosphorylation of the eukaryotic initiation factor 2α (eIF2α), causing general inhibition of cellular protein synthesis [76].

Another pathway of dsRNA regulation involves RNaseL, a potent riboendonuclease. RNaseL can be indirectly triggered by dsRNA through an increase in 2′-5′ oligoadenylates. 2′-5′ oligoadenylates are produced from ATP by 2′-5′ oligoadenylate synthetases, which are activated by dsRNA [77]. In addition to the nucleolytic properties of RNaseL, the enzyme also upregulates type I interferon genes by sequestering NFκB transcription factors [77].

Certain cell types, for example those of myeloid origin, constitutively express receptors that recognize dsRNA [78]. The toll-like receptor (TLR)3 recognizes and binds to dsRNA [79]. TLR3 is expressed on the cell surface as well as in intracellular vesicles [80]. Thus, dsRNA can be recognized by TLR3 internally, as an intermediate in viral replication, or externally, as dsRNA leaks from dying cells [78]. Recognition of dsRNA by TLR3 initiates the binding of NFκB and IRF-3 transcription factors to the promoters of type I interferon genes leading to their upregulation, which can eventually cause cell death via apoptosis [80].

Mammalian oocytes, embryos, and embryonic stem cells do not induce a dsRNA-mediated interferon response, but utilize the RNAi pathway to respond to long dsRNA [14, 15, 81]. In contrast, somatic cells might be more likely to use an interferon pathway when confronted with long dsRNA species [82]. The rationale for using different dsRNA recognition pathways in progenitor cells versus somatic cells is that embryos may not be able to afford the luxury of shutting down individual cells if trouble arises. In contrast, adult mammalian cells can apoptose with little to no effect on the organism as a whole. It may be that the secondary products of L1 dsRNA dicer-mediated endonucleolytic cleavage activate the PKR-interferon pathway in adult differentiated cells, which induces cell death.

CONCLUSION

L1 retrotransposons have shaped the mammalian genome and contribute significantly to its mass, yet their mobility appears to be actively constrained. Along with other cellular defense mechanisms, RNAi may participate in cell-type-specific, multifaceted defense against L1 mobility that includes RNA destruction, DNA methylation, and heterochromatin formation. The development of new genetic models of RNAi deficiency in mammals, coupled with a genetic assay for monitoring L1 retrotransposition events, may help to advance our understanding of how L1 mobility is regulated.

ACKNOWLEDGMENTS

We thank Richard Schultz, Greg Hannon, and members of the Luning Prak Lab for helpful discussions and we gratefully acknowledge Sarah Fox and Janet Sallit for skilled technical assistance. This work was supported by NIH R01 CA108812 to Eline T Luning Prak and training grant support to Shane R Horman (T32 CA 09140).

REFERENCES

[1] Tomari Y, Zamore PD. Perspective: machines for RNAi. Genes & Development. 2005;19(5):517–529.
[2] Carmell MA, Hannon GJ. RNase III enzymes and the initiation of gene silencing. Nature Structural & Molecular Biology. 2004;11(3):214–218.
[3] Plasterk RHA. RNA silencing: the genome’s immune system. Science. 2002;296(5571):1263–1265.
[4] Bagasra O, Prilliman KR. RNA interference: the molecular immune system. Journal of Molecular Histology. 2004;35(6):545–553.
[5] Sijen T, Plasterk RHA. Transposon silencing in the Caenorhabditis elegans germ line by natural RNAi. Nature. 2003;426(6964):310–314.
[6] Vastenhouw NL, Plasterk RHA. RNAi protects the Caenorhabditis elegans germline against transposition. Trends in Genetics. 2004;20(7):314–319.
[7] Jensen S, Gassama M-P, Heidmann T. Taming of transposable elements by homology-dependent gene silencing. Nature Genetics. 1999;21(2):209–212.
[8] Jensen S, Gassama M-P, Heidmann T. Cosuppression of I transposon activity in Drosophila by I-containing sense and antisense transgenes. Genetics. 1999;153(4):1767–1774.
[9] Aravin AA, Naumova NM, Tulin AV, Vagin VV, Rozovsky YM, Gvozdev VA. Double-stranded RNA-mediated silencing of genomic tandem repeats and transposable elements in the D. melanogaster germline. Current Biology. 2001;11(13):1017–1027.
[10] Kalmykova AI, Klenov MS, Gvozdev VA. Argonaute protein PIWI controls mobilization of retrotransposons in the Drosophila male germline. Nucleic Acids Research. 2005;33(6):2052–2059.
[11] Prak ETL, Dodson AW, Farkash EA, Kazazian HH Jr. Tracking an embryonic L1 retrotransposition event. Proceedings of the National Academy of Sciences of the United States of America. 2003;100(4):1832–1837.
[12] Branciforte D, Martin SL. Developmental and cell type specificity of LINE-1 expression in mouse testis: implications for transposition. Molecular and Cellular Biology. 1994;14(4):2584–2592.
[13] Ostertag EM, DeBerardinis RJ, Goodier JL, et al. A mouse model of human L1 retrotransposition. Nature Genetics. 2002;32(4):655–660.
[14] Svoboda P. Long dsRNA and silent genes strike back: RNAi in mouse oocytes and early embryos. Cytogenetic and Genome Research. 2004;105(2–4):422–434.
[15] Stein P, Zeng F, Pan H, Schultz RM. Absence of non-specific effects of RNA interference triggered by long double-stranded RNA in mouse oocytes. Developmental Biology. 2005;286(2):464–471.
[16] Lander ES, Linton LM, Birren B, et al. Initial sequencing and analysis of the human genome. Nature. 2001;409(6822):860–921.
[17] Brouha B, Schustak J, Badge RM, et al. Hot L1s account for the bulk of retrotransposition in the human population. Proceedings of the National Academy of Sciences of the United States of America. 2003;100(9):5280–5285.
[18] Luning Prak ET, Kazazian HH Jr. Mobile elements and the human genome. Nature Reviews. Genetics. 2000;1(2):134–144.
[19] Han JS, Szak ST, Boeke JD. Transcriptional disruption by the L1 retrotransposon and implications for mammalian transcriptomes. Nature. 2004;429(6989):268–274.
[20] Han JS, Boeke JD. LINE-1 retrotransposons: modulators of quantity and quality of mammalian gene expression? BioEssays. 2005;27(8):775–784.
[21] Moran JV, DeBerardinis RJ, Kazazian HH Jr. Exon shuffling by L1 retrotransposition. Science. 1999;283(5407):1530–1534.
[22] Pickeral OK, Makalowski W, Boguski MS, Boeke JD. Frequent human genomic DNA transduction driven by LINE-1 retrotransposition. Genome Research. 2000;10(4):411–415.

[23] Goodier JL, Ostertag EM, Kazazian HH Jr. Transduction of 3′-flanking sequences is common in L1 retrotransposition. Human Molecular Genetics. 2000;9(4):653–657.
[24] Dewannieux M, Esnault C, Heidmann T. LINE-mediated retrotransposition of marked Alu sequences. Nature Genetics. 2003;35(1):41–48.
[25] Kazazian HH Jr. Mobile elements and disease. Current Opinion in Genetics & Development. 1998;8(3):343–350.
[26] Muotri AR, Chu VT, Marchetto MC, Deng W, Moran JV, Gage FH. Somatic mosaicism in neuronal precursor cells mediated by L1 retrotransposition. Nature. 2005;435(7044):903–910.
[27] Boissinot S, Entezam A, Furano AV. Selection against deleterious LINE-1-containing loci in the human lineage. Molecular Biology and Evolution. 2001;18(6):926–935.
[28] Boissinot S, Roos C, Furano AV. Different rates of LINE-1 (L1) retrotransposon amplification and evolution in New World monkeys. Journal of Molecular Evolution. 2004;58(1):122–130.
[29] Goto T, Jones GM, Lolatgis N, Pera MF, Trounson AO, Monk M. Identification and characterisation of known and novel transcripts expressed during the final stages of human oocyte maturation. Molecular Reproduction and Development. 2002;62(1):13–28.
[30] Ergun S, Buschmann C, Heukeshoven J, et al. Cell type-specific expression of LINE-1 open reading frames 1 and 2 in fetal and adult human tissues. The Journal of Biological Chemistry. 2004;279(26):27753–27763.
[31] Yang N, Zhang L, Zhang Y, Kazazian HH Jr. An important role for RUNX3 in human L1 transcription and retrotransposition. Nucleic Acids Research. 2003;31(16):4929–4940.
[32] Han JS, Boeke JD. A highly active synthetic mammalian retrotransposon. Nature. 2004;429(6989):314–318.
[33] Wei W, Gilbert N, Ooi SL, et al. Human L1 retrotransposition: cis preference versus trans complementation. Molecular and Cellular Biology. 2001;21(4):1429–1439.
[34] Esnault C, Maestre J, Heidmann T. Human LINE retrotransposons generate processed pseudogenes. Nature Genetics. 2000;24(4):363–367.
[35] Boissinot S, Chevret P, Furano AV. L1 (LINE-1) retrotransposon evolution and amplification in recent human history. Molecular Biology and Evolution. 2000;17(6):915–928.
[36] Skowronski J, Fanning TG, Singer MF. Unit-length line-1 transcripts in human teratocarcinoma cells. Molecular and Cellular Biology. 1988;8(4):1385–1397.
[37] Kramerov DA, Bukrinsky MI, Ryskov AP. DNA sequences homologous to long double-stranded RNA. Transcription of intracisternal A-particle genes and major long repeat of the mouse genome. Biochimica et Biophysica Acta. 1985;826(1):20–29.
[38] Martens JH, O’Sullivan RJ, Braunschweig U, et al. The profile of repeat-associated histone lysine methylation states in the mouse epigenome. The EMBO Journal. 2005;24(4):800–812.
[39] Houbaviy HB, Murray MF, Sharp PA. Embryonic stem cell-specific MicroRNAs. Developmental Cell. 2003;5(2):351–358.
[40] Suh MR, Lee Y, Kim JY, et al. Human embryonic stem cells express a unique set of microRNAs. Developmental Biology. 2004;270(2):488–498.
[41] Lagos-Quintana M, Rauhut R, Lendeckel W, Tuschl T. Identification of novel genes coding for small expressed RNAs. Science. 2001;294(5543):853–858.
[42] Speek M. Antisense promoter of human L1 retrotransposon drives transcription of adjacent cellular genes. Molecular and Cellular Biology. 2001;21(6):1973–1985.
[43] Nigumann P, Redik K, Matlik K, Speek M. Many human genes are transcribed from the antisense promoter of L1 retrotransposon. Genomics. 2002;79(5):628–634.
[44] Szak ST, Pickeral OK, Makalowski W, Boguski MS, Landsman D, Boeke JD. Molecular archeology of L1 insertions in the human genome. Genome Biology. 2002;3(10):research0052.
[45] Ostertag EM, Kazazian HH Jr. Twin priming: a proposed mechanism for the creation of inversions in L1 retrotransposition. Genome Research. 2001;11(12):2059–2065.
[46] Perepelitsa-Belancio V, Deininger P. RNA truncation by premature polyadenylation attenuates human mobile element activity. Nature Genetics. 2003;35(4):363–366.
[47] Svoboda P, Stein P, Anger M, Bernstein E, Hannon GJ, Schultz RM. RNAi and expression of retrotransposons MuERV-L and IAP in preimplantation mouse embryos. Developmental Biology. 2004;269(1):276–285.
[48] Kanellopoulou C, Muljo SA, Kung AL, et al. Dicer-deficient mouse embryonic stem cells are defective in differentiation and centromeric silencing. Genes & Development. 2005;19(4):489–501.
[49] Soifer HS, Zaragoza A, Peyvan M, Behlke MA, Rossi JJ. A potential role for RNA interference in controlling the activity of the human LINE-1 retrotransposon. Nucleic Acids Research. 2005;33(3):846–856.
[50] Yang N, Zhang L, Kazazian HH Jr. L1 retrotransposon-mediated stable gene silencing. Nucleic Acids Research. 2005;33(6):e57.
[51] Ostertag EM, Prak ET, DeBerardinis RJ, Moran JV, Kazazian HH Jr. Determination of L1 retrotransposition kinetics in cultured cells. Nucleic Acids Research. 2000;28(6):1418–1423.
[52] Moran JV, Holmes SE, Naas TP, DeBerardinis RJ, Boeke JD, Kazazian HH Jr. High frequency retrotransposition in cultured mammalian cells. Cell. 1996;87(5):917–927.
[53] Bestor TH. The host defence function of genomic methylation patterns. Novartis Foundation Symposium. 1998;214:187–195; discussion 195–199, 228–232.
[54] Bestor TH. Cytosine methylation mediates sexual conflict. Trends in Genetics. 2003;19(4):185–190.
[55] Woodcock DM, Lawler CB, Linsenmeyer ME, Doherty JP, Warren WD. Asymmetric methylation in the hypermethylated CpG promoter region of the human L1 retrotransposon. The Journal of Biological Chemistry. 1997;272(12):7810–7816.
[56] Burden AF, Manley NC, Clark AD, Gartler SM, Laird CD, Hansen RS. Hemimethylation and non-CpG methylation levels in a promoter region of human LINE-1 (L1) repeated elements. The Journal of Biological Chemistry. 2005;280(15):14413–14419.
[57] Tchenio T, Segal-Bendirdjian E, Heidmann T. Generation of processed pseudogenes in murine cells. The EMBO Journal. 1993;12(4):1487–1497.
[58] Walsh CP, Chaillet JR, Bestor TH. Transcription of IAP endogenous retroviruses is constrained by cytosine methylation. Nature Genetics. 1998;20(2):116–117.
[59] Bourc’his D, Bestor TH. Meiotic catastrophe and retrotransposon reactivation in male germ cells lacking Dnmt3L. Nature. 2004;431(7004):96–99.
[60] Ting AH, Schuebel KE, Herman JG, Baylin SB. Short double-stranded RNA induces transcriptional gene silencing in human cancer cells in the absence of DNA methylation. Nature Genetics. 2005;37(8):906–910.
[61] Verdel A, Jia S, Gerber S, et al. RNAi-mediated targeting of heterochromatin by the RITS complex. Science. 2004;303(5658):672–676.

[62] Cam HP, Sugiyama T, Chen ES, Chen X, FitzGerald PC, Grewal SI. Comprehensive analysis of heterochromatin- and RNAi-mediated epigenetic control of the fission yeast genome. Nature Genetics. 2005;37(8):809–819.
[63] Nolan T, Braccini L, Azzalin G, De Toni A, Macino G, Cogoni C. The post-transcriptional gene silencing machinery functions independently of DNA methylation to repress a LINE1-like retrotransposon in Neurospora crassa. Nucleic Acids Research. 2005;33(5):1564–1573.
[64] Lippman Z, Gendrel AV, Black M, et al. Role of transposable elements in heterochromatin and epigenetic control. Nature. 2004;430(6998):471–476.
[65] Birchler JA, Bhadra MP, Bhadra U. Making noise about silence: repression of repeated genes in animals. Current Opinion in Genetics & Development. 2000;10(2):211–216.
[66] Bender J. DNA methylation and epigenetics. Annual Review of Plant Biology. 2004;55:41–68.
[67] Lippman Z, May B, Yordan C, Singer T, Martienssen R. Distinct mechanisms determine transposon inheritance and methylation via small interfering RNA and histone modification. PLoS Biology. 2003;1(3):E67.
[68] Naas TP, DeBerardinis RJ, Moran JV, et al. An actively retrotransposing, novel subfamily of mouse L1 elements. The EMBO Journal. 1998;17(2):590–597.
[69] DeCerbo J, Carmichael GG. Retention and repression: fates of hyperedited RNAs in the nucleus. Current Opinion in Cell Biology. 2005;17(3):302–308.
[70] Blow M, Futreal PA, Wooster R, Stratton MR. A survey of RNA editing in human brain. Genome Research. 2004;14(12):2379–2387.
[71] Levanon EY, Eisenberg E, Yelin R, et al. Systematic identification of abundant A-to-I editing sites in the human transcriptome. Nature Biotechnology. 2004;22(8):1001–1005.
[72] Athanasiadis A, Rich A, Maas S. Widespread A-to-I RNA editing of Alu-containing mRNAs in the human transcriptome. PLoS Biology. 2004;2(12):e391.
[73] Scadden AD, Smith CW. RNAi is antagonized by A → I hyper-editing. EMBO Reports. 2001;2(12):1107–1111.
[74] Tonkin LA, Bass BL. Mutations in RNAi rescue aberrant chemotaxis of ADAR mutants. Science. 2003;302(5651):1725.
[75] Billy E, Brondani V, Zhang H, Müller U, Filipowicz W. Specific interference with gene expression induced by long, double-stranded RNA in mouse embryonal teratocarcinoma cell lines. Proceedings of the National Academy of Sciences of the United States of America. 2001;98(25):14428–14433.
[76] Sledz CA, Holko M, de Veer MJ, Silverman RH, Williams BR. Activation of the interferon system by short-interfering RNAs. Nature Cell Biology. 2003;5(9):834–839.
[77] Castelli JC, Hassel BA, Maran A, et al. The role of 2′-5′ oligoadenylate-activated ribonuclease L in apoptosis. Cell Death and Differentiation. 1998;5(4):313–320.
[78] Guillot L, Le Goffic R, Bloch S, et al. Involvement of toll-like receptor 3 in the immune response of lung epithelial cells to double-stranded RNA and influenza A virus. The Journal of Biological Chemistry. 2005;280(7):5571–5580.
[79] Matsumoto M, Kikkawa S, Kohase M, Miyake K, Seya T. Establishment of a monoclonal antibody against human Toll-like receptor 3 that blocks double-stranded RNA-mediated signaling. Biochemical and Biophysical Research Communications. 2002;293(5):1364–1369.
[80] Matsumoto M, Funami K, Oshiumi H, Seya T. Toll-like receptor 3: a link between toll-like receptor, interferon and viruses. Microbiology and Immunology. 2004;48(3):147–154.
[81] Paddison PJ, Caudy AA, Hannon GJ. Stable suppression of gene expression by RNAi in mammalian cells. Proceedings of the National Academy of Sciences of the United States of America. 2002;99(3):1443–1448.
[82] Saunders LR, Barber GN. The dsRNA binding protein family: critical roles, diverse cellular functions. The FASEB Journal. 2003;17(9):961–983.