US 20120244131A1
(19) United States
(12) Patent Application Publication
Delacote et al.
(10) Pub. No.: US 2012/0244131 A1
(43) Pub. Date: Sep. 27, 2012

(54) METHOD FOR MODULATING THE EFFICIENCY OF DOUBLE-STRAND BREAK-INDUCED MUTAGENESIS

(75) Inventors: Fabien Delacote, Paris (FR); Philippe Duchateau, Livry Gargan (FR); Christophe Perez-Michaut, Paris (FR)

(73) Assignee: CELLECTIS SA, Romainville Cedex (FR)

(21) Appl. No.: 13/367,098

(22) Filed: Feb. 6, 2012

Related U.S. Application Data

(60) Provisional application No. 61/439,739, filed on Feb. 4, 2011.

Publication Classification

(51) Int. Cl.: A61K 31/7088 (2006.01); G01N 21/76 (2006.01); C12N 15/63 (2006.01); C12N 15/79 (2006.01); A61P 35/00 (2006.01); C12N 1/19 (2006.01); C12N 1/15 (2006.01); C07H 21/02 (2006.01); C12N 15/11 (2006.01); A61K 35/00 (2006.01); C40B 30/06 (2006.01); C12N 5/10 (2006.01)

(52) U.S. Cl.: 424/93.21; 506/10; 435/6.13; 435/440; 435/320.1; 435/414; 435/417; 435/411; 435/412; 435/419; 435/254.3; 435/254.5; 435/254.6; 435/254.23; 435/254.2; 435/254.11; 435/325; 435/348; 435/366; 435/353; 435/354; 435/351; 435/350; 435/349; 536/24.5; 536/23.1; 514/44 A; 514/44 R

(57) ABSTRACT

A method for modulating double-strand break-induced mutagenesis at a genomic locus of interest in a cell, thereby giving new tools for genome engineering, including therapeutic applications and cell line engineering. A method for modulating double-strand break-induced mutagenesis concerns the identification of effectors that modulate double-strand break-induced mutagenesis by use of interfering agents; these agents are capable of modulating double-strand break-induced mutagenesis through their respective direct or indirect actions on said effectors. Methods of using these effectors, interfering agents and derivatives, respectively, by introducing them into a cell in order to modulate and more particularly to increase double-strand break-induced mutagenesis. Specific derivatives of identified effectors and interfering agents, vectors encoding them, compositions and kits comprising such derivatives for modulating or increasing double-strand break-induced mutagenesis.

[Drawing sheets 1 to 12 (FIGS. 1 to 16): images not reproduced. Legible labels: FIG. 1, DNA ends binding and protection, DNA ends processing, DNA ends ligation; vector maps, SV40 early promoter, SV40 polyA.]

METHOD FOR MODULATING THE EFFICIENCY OF DOUBLE-STRAND BREAK-INDUCED MUTAGENESIS

CROSS-REFERENCE TO RELATED APPLICATION

[0001] This application claims priority under 35 U.S.C. §119(e) to U.S. Provisional Application No. 61/439,739, filed Feb. 4, 2011, which is hereby incorporated by reference in its entirety.

FIELD OF THE INVENTION

[0002] The present invention relates to a method for modulating double-strand break-induced mutagenesis at a genomic locus of interest in a cell, thereby giving new tools for genome engineering, including therapeutic applications and cell line engineering. More specifically, the method of the present invention for modulating double-strand break-induced mutagenesis (DSB-induced mutagenesis) concerns the identification of effectors that modulate said DSB-induced mutagenesis by use of interfering agents; these agents are capable of modulating DSB-induced mutagenesis through their respective direct or indirect actions on said effectors. The present invention also concerns the uses of these effectors, interfering agents and derivatives, respectively, by introducing them into a cell in order to modulate and more particularly to increase DSB-induced mutagenesis. The present invention also relates to specific derivatives of identified effectors and interfering agents, vectors encoding them, compositions and kits comprising such derivatives in order to modulate and more particularly to increase DSB-induced mutagenesis.

BACKGROUND OF THE INVENTION

[0003] Mutagenesis is induced by physical and chemical means that provoke DNA damage which, when incorrectly repaired, leads to mutations. Several chemicals are known to cause DNA lesions and are routinely used. Radiomimetic agents work through free radical attack on the sugar moieties of DNA (Povirk 1996). A second group of drugs inducing DNA damage includes inhibitors of topoisomerase I (TopoI) and II (TopoII) (Teicher 2008) (Burden and N. 1998). Other classes of chemicals bind covalently to the DNA and form bulky adducts that are repaired by the nucleotide excision repair (NER) system (Nouspikel 2009). Chemicals inducing DNA damage have a diverse range of applications and are widely used. However, although certain agents are more commonly applied in studying a particular repair pathway (e.g. cross-linking agents are favored for NER studies), most drugs simultaneously provoke a variety of lesions (Nagy and Soutoglou 2009). The physical means to generate mutagenesis is through the exposure of cells to ionizing radiation of one of three classes: X-rays, gamma rays, or neutrons (Green and Roderick 1966). However, using these classical strategies, the overall yield of induced mutations is quite low, and the DNA damage leading to mutagenesis cannot be targeted to a precise genomic DNA sequence.

[0004] The most widely used in vivo site-directed mutagenesis strategy is gene targeting (GT) via homologous recombination (HR). Efficient GT procedures have been available for more than 20 years in yeast (Rothstein 1991) and mouse (Capecchi 1989). Successful GT has also been achieved in Arabidopsis and rice plants (Hanin, Volrath et al. 2001; Terada, Urawa et al. 2002; Endo, Osakabe et al. 2006; Endo, Osakabe et al. 2007). Typically, GT events occur in a fairly small proportion of treated mammalian cells, while GT efficiency is extremely low in higher plant cells and ranges between 0.01 and 0.1% of the total number of random integration events (Terada, Johzuka-Hisatomi et al. 2007). The low GT frequencies reported in various organisms are thought to result from competition between HR and non homologous end joining (NHEJ) for repair of dsDNA breaks (DSBs). As a consequence, the ends of a donor molecule are likely to be joined by NHEJ rather than participating in HR, thus reducing GT frequency. There is extensive data indicating that DSB repair by NHEJ is error-prone. Often, DSBs are repaired by end joining processes that generate insertions and/or deletions (Britt 1999). Thus, these NHEJ-based strategies might be more effective than HR-based strategies for targeted mutagenesis into cells. Indeed, expression of I-Sce I, a rare cutting restriction enzyme, has been shown to introduce mutations at I-Sce I cleavage sites in Arabidopsis and tobacco (Kirik, Salomon et al. 2000). Nevertheless, the use of restriction enzymes is limited to rarely occurring natural recognition sites or to artificial target sites. To overcome this problem, meganucleases with engineered specificity towards a chosen sequence have been developed. Meganucleases show high specificity to their DNA target, these proteins being able to cleave a unique chromosomal sequence and therefore do not affect global genome integrity. Natural meganucleases are essentially represented by homing endonucleases, a widespread class of proteins found in eukaryotes, bacteria and archaea (Chevalier and Stoddard 2001). Early studies of the I-Sce I and HO homing endonucleases have illustrated how the cleavage activity of these proteins can be used to initiate HR events in living cells and have demonstrated the recombinogenic properties of chromosomal DSBs (Dujon, Colleaux et al. 1986; Haber 1995). Since then, meganuclease-induced HR has been successfully used for genome engineering purposes in bacteria (Posfai, Kolisnychenko et al. 1999), mammalian cells (Sargent, Brenneman et al. 1997; Cohen-Tannoudji, Robine et al. 1998; Donoho, Jasin et al. 1998), mice (Gouble, Smith et al. 2006) and plants (Puchta, Dujon et al. 1996; Siebert and Puchta 2002). Meganucleases have emerged as scaffolds of choice for deriving genome engineering tools cutting a desired target sequence (Paques and Duchateau 2007).

[0005] Combinatorial assembly processes allowing the engineering of meganucleases with modified specificities have been described by Arnould et al. (Arnould, Chames et al. 2006; Smith, Grizot et al. 2006; Arnould, Perez et al. 2007; Grizot, Smith et al. 2009). Briefly, these processes rely on the identification of locally engineered variants with a substrate specificity that differs from the substrate specificity of the wild-type meganuclease by only a few nucleotides. Another type of specific endonuclease is based on zinc finger nucleases (ZFNs). ZFNs are chimeric proteins composed of a synthetic zinc finger-based DNA binding domain and a DNA cleavage domain. By modification of the zinc finger DNA binding domain, ZFNs can be specifically designed to cleave virtually any long stretch of dsDNA sequence (Kim, Cha et al. 1996; Cathomen and Joung 2008). An NHEJ-based targeted mutagenesis strategy was developed recently in several organisms by using synthetic ZFNs to generate DSBs at specific genomic sites (Lloyd, Plaisier et al. 2005; Beumer, Trautman et al. 2008; Doyon, McCammon et al. 2008; Meng, Noyes et al. 2008). Subsequent repair of the DSBs by NHEJ frequently produces deletions and/or insertions at the joining site. For example, in zebrafish embryos, the injection of mRNA coding for engineered ZFN led to animals carrying the desired heritable mutations (Doyon, McCammon et al. 2008). In plants, the same NHEJ-based targeted mutagenesis has also been successful (Lloyd, Plaisier et al. 2005). Although these powerful tools are available, there is still a need to further improve double-strand break-induced mutagenesis.

[0006] As mentioned above, two mechanisms for the repair of DSBs have been described, involving either homologous recombination or non-homologous end-joining (NHEJ). NHEJ consists of at least two genetically and biochemically distinct processes (Feldmann, Schmiemann et al. 2000). The major and best characterized "classic" end-joining pathway (C-NHEJ) involves rejoining of what remains of the two DNA ends through direct religation (Critchlow and Jackson 1998). A scheme for this pathway is shown in FIG. 1. NHEJ can be divided in three major steps: detection and protection of DNA ends, DNA end-processing and finally DNA ligation. Detection and protection of DNA ends are mediated by DNA-PK, which is composed of Ku70 and Ku80 proteins that form a heterodimer (Ku) binding DNA ends and recruiting the DNA-PK catalytic subunit (DNA-PKcs). This DNA-PKcs/Ku/DSB interaction stimulates DNA-PKcs kinase activity, maintains the broken ends in close proximity and prevents extended degradation. Ku also recruits other components of the C-NHEJ repair process. Candidates for DNA end processing are Artemis, DNA polymerases mu (μ) and lambda (λ), polynucleotide kinase (PNK) and Werner's syndrome (WRN) (for review (Mahaney, Meek et al., 2009)). The ligation process is mediated by DNA ligase IV and its cofactors XRCC4 and XLF/Cernunnos. Finally, other proteins or complexes modulating NHEJ activity have been described, such as BRCA1, the Rad50-Mre11-Nbs1 complex (Williams, Williams et al. 2007; Shrivastav, De Haro et al. 2008), CtIP or FANCD2 (Bau, Man et al. 2006; Pace, Mosedale et al. 2010). NHEJ is thought to be effective at all times in the cell cycle (Essers, van Steeg et al. 2000; Takata, Sasaki et al. 1998). NHEJ also plays an important role in DSB repair during V(D)J recombination (Blunt, Finnie et al. 1995) (Taccioli, Rathbun et al. 1993).

[0007] The second mechanism, referred to as microhomology-mediated end joining (MMEJ), alternative NHEJ (A-NHEJ) or back up NHEJ (B-NHEJ), is associated with significant 5'-3' resection of the end and uses microhomologies to anneal DNA, allowing repair. Little is known about the components of this machinery. DNA ligase 3 with XRCC1 proteins are candidates for the ligase activity (Audebert, Salles et al. 2004; Wang, Rosidi et al. 2005). PARP also seems to be an important factor of this mechanism (Audebert, Salles et al. 2004) (Wang, Wu et al. 2006).

[0008] Theoretically, both classical and alternative NHEJ could lead to mutagenesis, although the A-NHEJ mechanism would represent the main pathway to favour when one wants to increase DSB-induced mutagenesis. Several methods have been described in order to modulate NHEJ. For example, US 2004/029130 A1 concerns a method of stimulating NHEJ of DNA, the method comprising performing NHEJ of DNA in the presence of inositol hexakisphosphate (IP6) or other stimulatory inositol phosphate. The invention also provides screening assays for compounds which may modulate NHEJ and DNA-PK and related protein kinases and which may be therapeutically useful. WO98/30902 relates to modulation of the NHEJ system via regulation (using protein and/or natural or synthetic compounds) of the interactions of XRCC4 and DNA ligase IV, and XRCC4 and DNA-PK, to effect cellular DNA repair activity. It also relates to screens for individuals predisposed to conditions in which XRCC4 and/or DNA ligase IV are deficient. Sarkaria et al. (Sarkaria, Tibbetts et al. 1998) describe the inhibition of phosphoinositide 3-kinase related kinases (such as DNA-dependent protein kinase, ATR and ATM) by the radiosensitizing agent wortmannin.

[0009] In an attempt to define in molecular detail the mechanism of NHEJ, an in vitro system for end-joining was recently developed (Baumann and West 1998). The reactions exhibited an apparent requirement for DNA-PKcs, Ku70/80, XRCC4 and DNA ligase IV, consistent with the in vivo requirements. Preliminary fractionation and complementation assays, however, revealed that these factors were not sufficient for efficient end-joining, and that other components of the reaction remained to be identified.

[0010] RNA interference is an endogenous gene silencing pathway that responds to dsRNAs by silencing homologous genes (Meister and Tuschl 2004). First described in Caenorhabditis elegans by Fire et al., the RNAi pathway functions in a broad range of eukaryotic organisms (Hannon 2002). Silencing in these initial experiments was triggered by introduction of long dsRNA. The enzyme Dicer cleaves these long dsRNAs into short-interfering RNAs (siRNAs) of approximately 21-23 nucleotides. One of the two siRNA strands is then incorporated into an RNA-induced silencing complex (RISC). RISC compares these "guide RNAs" to RNAs in the cell and efficiently cleaves target RNAs containing sequences that are perfectly, or nearly perfectly, complementary to the guide RNA.

[0011] For many years it was unclear whether the RNAi pathway was functional in cultured mammalian cells and in whole mammals. However, Elbashir S. M. et al., 2001 (Elbashir, Harborth et al. 2001), triggered RNAi in cultured mammalian cells by transfecting them with 21 nucleotide synthetic RNA duplexes that mimicked endogenous siRNAs. McCaffrey et al. (McCaffrey, Meuse et al., 2002) also demonstrated that siRNAs and shRNAs could efficiently silence genes in adult mice.

[0012] Introduction of chemically synthesized siRNAs can effectively mediate post-transcriptional gene silencing in mammalian cells without inducing interferon responses. Synthetic siRNAs targeted against a variety of genes have been successfully used in mammalian cells to prevent expression of target mRNA (Harborth, Elbashir et al. 2001). These discoveries of RNAi and siRNA-mediated gene silencing have led to a spectrum of opportunities for functional genomics, target validation, and the development of siRNA-based therapeutics, making it a potentially powerful tool for therapeutics and in vivo studies.

[0013] The authors of the present invention have developed a new approach to increase the efficiency of DSB-induced mutagenesis. This new approach relies on the identification of new effectors that modulate said DSB-induced mutagenesis by use of interfering agents in an in vivo assay. These agents being capable of modulating DSB-induced mutagenesis through their respective direct or indirect actions on respective effectors, introduction of these interfering agents and/or derivatives into a cell, respectively, will lead to a cell wherein said DSB-induced mutagenesis is modulated.

BRIEF SUMMARY OF THE INVENTION

[0014] The present invention relates to a method for modulating DSB-induced mutagenesis at a genomic locus of interest in a cell, thereby giving new tools for genome engineering, including therapeutic applications and cell line engineering.

[0015] More specifically, in a first aspect, the present invention concerns a method for identifying effectors that modulate DSB-induced mutagenesis, thereby allowing the increase or decrease of DSB-induced mutagenesis in a cell. As described elsewhere, this method allows screening of interfering agents libraries covering an unlimited number of molecules. As a non-limiting example, the method of the present invention allows screening for interfering RNAs, which in turn allow identifying the genes which they silence, through their capacities to stimulate or to inhibit DSB-induced mutagenesis, based on at least one reporter system.

[0016] In a second aspect, the present invention concerns a method for modulating DSB-induced mutagenesis in a cell by using interfering agents.

[0017] In a third aspect, the present invention concerns specific interfering agents, their derivatives such as polynucleotide derivatives or other molecules as non-limiting examples.

[0018] In a fourth aspect, the present invention further encompasses cells in which DSB-induced mutagenesis is modulated. It refers, as a non-limiting example, to an isolated cell, obtained and/or obtainable by the method according to the present invention.

[0019] In a fifth aspect, the present invention also relates to compositions and kits comprising the interfering agents, polynucleotide derivatives, vectors and cells according to the present invention.

[0020] In a sixth aspect, the present invention concerns the uses of specific interfering agents, their derivatives such as polynucleotide derivatives or other molecules as non-limiting examples, for modulating DSB-induced mutagenesis.

[0021] The above objects highlight certain aspects of the invention. Additional objects, aspects and embodiments of the invention are found in the following detailed description of the invention.

BRIEF DESCRIPTION OF THE FIGURES

[0022] In addition to the preceding features, the invention further comprises other features which will emerge from the description which follows, as well as from the appended drawings. A more complete appreciation of the invention and many of the attendant advantages thereof will be readily obtained as the same becomes better understood by reference to the following Figures in conjunction with the detailed description below.

[0023] FIG. 1: Scheme of the "classic" end-joining pathway (C-NHEJ).

[0024] FIG. 2: Plasmid construction maps to quantify NHEJ repair events by SC GS (pCLS6883; SEQ ID NO: 1) or I-Sce I (pCLS6884; SEQ ID NO: 2); these constructions can be targeted to the RAG1 locus.

[0025] FIG. 3: Z-score values of an extrachromosomal assay screening of siRNAs targeting 696 genes coding for kinases.

[0026] FIG. 4: Validation of a stable cellular model to quantify NHEJ repair events induced by SC GS via a luciferase reporter gene, after integration of pCLS6883 (SEQ ID NO: 1) at the RAG1 locus of 293H cells. Panel 4A: Examples of four siRNAs increasing NHEJ repair events induced by SC GS at the RAG1 locus of 293H cells after targeting WRN, MAPK3, FANCD2 and LIG4 genes. Panel 4B: Examples of 10 siRNAs increasing NHEJ repair events induced by SC GS at the RAG1 locus of 293H cells (8 siRNAs identified with an extrachromosomal assay targeting CAMK2G, SMG1, PRKCE, CSNK1D, AK2, AKT2, MAPK12 and EIF2AK2 genes and two siRNAs targeting the PRKDC gene).

[0027] FIG. 5: Deep sequencing experiment for monitoring of NHEJ repair events induced by SC-RAG meganuclease at the endogenous RAG1 locus of 293H cells in the presence or absence of siRNAs targeting WRN, MAPK3, FANCD2, ATR, BRCA1 and XRCC6 genes.

[0028] FIG. 6: EGFP plasmid construction maps to monitor the frequency of NHEJ repair events induced by SC GS (pCLS6810, SEQ ID NO: 5) or I-Sce I (pCLS6663, SEQ ID NO: 6) meganucleases. The vectors can be targeted at the RAG1 endogenous locus to obtain an established cell line.

[0029] FIG. 7: Extrachromosomal transfection assay in the 293H cell line to validate induction of NHEJ repair events of the EGFP reporter gene of the pCLS6810 (SEQ ID NO: 5) plasmid with the expression vector pCLS2690 (SEQ ID NO: 3) for the SC GS meganuclease in comparison with a control vector pCLS0002 (SEQ ID NO: 41).

[0030] FIG. 8: Vector map of pCLS2690.

[0031] FIG. 9: Vector map of pCLS2222.

[0032] FIG. 10: Vector map of pCLS0099.

[0033] FIG. 11: Vector map of pCLS0002.

[0034] FIG. 12: Normalized luciferase activity of the high-throughput screening of the siRNA library. Hits stimulating or inhibiting the SC GS-induced non homologous end joining repair activity are indicated by plain or hatched squares respectively.

[0035] FIG. 13: Vector map of pCLS1853.

[0036] FIG. 14: Vector map of pCLS8054.

[0037] FIG. 15: Graph of the correlation between the percentage of GFP+ cells induced by the meganucleases SC GS and Trex2 SC GS and the frequency of NHEJ mutagenesis analyzed by deep sequencing. Striated triangle: negative control of transfection with pCLS0002 (SEQ ID NO: 41). Striated circle: cotransfection of SC GS (SEQ ID NO: 4) with siRNA control AS. Dark circles: cotransfections of SC GS with siRNAs CAP1 (SEQ ID NO: 367), TALDO1 (SEQ ID NO: 111) and DUSP1 (SEQ ID NO: 106). Striated square: cotransfection of Trex2/SC GS (SEQ ID NO: 1049) with siRNA control AS. Dark squares: cotransfections of SC GS with siRNAs TALDO1 (SEQ ID NO: 111), DUSP1 (SEQ ID NO: 106) and PTPN22 (SEQ ID NO: 283).

[0038] FIG. 16: Vector map of pCLS9573.

DETAILED DESCRIPTION OF THE INVENTION

[0039] Unless specifically defined herein below, all technical and scientific terms used herein have the same meaning as commonly understood by a skilled artisan in the fields of gene therapy, biochemistry, genetics, and molecular biology.

[0040] All methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, with suitable methods and materials being described herein. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. Further, the materials, methods, and examples are illustrative only and are not intended to be limiting, unless otherwise specified.

[0041] The practice of the present invention will employ, unless otherwise indicated, conventional techniques of cell biology, cell culture, molecular biology, transgenic biology,
microbiology, recombinant DNA, and immunology, which are within the skill of the art. Such techniques are explained fully in the literature. See, for example, Current Protocols in Molecular Biology (Frederick M. AUSUBEL, 2000, Wiley and son Inc, Library of Congress, USA); Molecular Cloning: A Laboratory Manual, Third Edition (Sambrook et al., 2001, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press); Oligonucleotide Synthesis (M. J. Gait ed., 1984); Mullis et al. U.S. Pat. No. 4,683,195; Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984); Transcription And Translation (B. D. Hames & S. J. Higgins eds. 1984); Culture Of Animal Cells (R. I. Freshney, Alan R. Liss, Inc., 1987); Immobilized Cells And Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide To Molecular Cloning (1984); the series, Methods In ENZYMOLOGY (J. Abelson and M. Simon, eds.-in-chief, Academic Press, Inc., New York), specifically, Vols. 154 and 155 (Wu et al. eds.) and Vol. 185, "Gene Expression Technology" (D. Goeddel, ed.); Gene Transfer Vectors For Mammalian Cells (J. H. Miller and M. P. Calos eds., 1987, Cold Spring Harbor Laboratory); Immunochemical Methods In Cell And Molecular Biology (Mayer and Walker, eds., Academic Press, London, 1987); Handbook Of Experimental Immunology, Volumes I-IV (D. M. Weir and C. C. Blackwell, eds., 1986); and Manipulating the Mouse Embryo (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986).

[0042] In a first aspect, the present invention concerns a method for identifying effectors that modulate double-strand break-induced mutagenesis, thereby allowing the increase or decrease of double-strand break-induced mutagenesis in a cell. As described elsewhere, this method allows screening of interfering agents libraries covering an unlimited number of molecules. As a non-limiting example, the method of the present invention allows screening for interfering RNAs, which in turn allow identifying the genes which they silence, through their capacities to stimulate or to inhibit double-strand break-induced mutagenesis, based on at least one reporter system.

[0043] This first aspect of the method of the invention is based on two successive screening steps.

[0044] The first screening step is a highly sensitive high-throughput assay measuring double-strand break-induced mutagenesis based on a compatible reporter gene, for example the luciferase gene. This method allows, in a few runs, the screening of several thousands of interfering agents for their capacities to modulate double-strand break-induced mutagenesis (compared to negative, neutral or positive interfering agents taken as controls) by measuring the restoration of a functional reporter gene originally rendered inactive by a frameshift introduced via a double-strand break creating agent target site. It is easily understandable that the target sequence for double-strand break-induced mutagenesis can be, as a non-limiting example, any double-strand break-induced mutagenesis site. For this identification step, said interfering agents are co-transfected with a delivery vector containing said reporter gene rendered inactive by a frameshift mutation inserted via a double-strand break-induced target site and a delivery vector containing a double-strand break creating agent; said double-strand break creating agent provokes a mutagenic double-strand break that can be repaired by NHEJ, leading to the restoration of said reporter gene and to the increase in said reporter signal.

[0045] Interfering agents that modulate double-strand break-induced mutagenesis can be divided into candidates that stimulate or inhibit said double-strand break-induced mutagenesis. Effectors whose interfering agents increase or decrease the expression of the reporter gene detected, and thus double-strand break-induced mutagenesis, can also be classified as effectors stimulating or inhibiting double-strand break-induced mutagenesis.

[0046] In the second screening step of this aspect of the invention, a similar system as in the first screening step is used, except for the reporter gene employed. In this second step, the reporter gene is preferably selected to allow a qualitative and/or quantitative measurement of the modulation seen during the first screening step.

[0047] The invention therefore relates to a method for identifying effectors that modulate double-strand break-induced mutagenesis in a cell comprising the steps of:

[0048] (a) providing a cell expressing a reporter gene rendered inactive by a frameshift in its coding sequence, due to the introduction in said sequence of a DSB-creating agent target site;

[0049] (b) providing an interfering agent;

[0050] (c) contacting said cell with:

[0051] i. an interfering agent;

[0052] ii. a delivery vector comprising a double-strand break creating agent, wherein said double-strand break creating agent provokes a mutagenic double-strand break that can be repaired by NHEJ leading to a functional restoration of said reporter gene;

[0053] (d) detecting expression of the functional reporter gene in the cell obtained at the end of step (c);

[0054] (e) repeating steps (c) and (d) at least one time for each interfering agent;

[0055] (f) identifying effectors whose interfering agent increases or decreases the expression of the reporter gene detected at step (d) as compared to a negative control; and

[0056] (g) for the effectors identified at step (f), repeating steps (a), (c), (d) and (f) with a cell line expressing a different inactive reporter gene than the inactive reporter gene previously used;

whereby the effectors identified at the end of step (f) are effectors that modulate double-stranded break-induced mutagenesis in a cell.

[0057] In a preferred embodiment, the present invention concerns a method for identifying effector genes that modulate endonuclease-induced mutagenesis, thereby allowing the increase or decrease of double-strand break-induced mutagenesis in a cell. As elsewhere described, this method allows screening of an interfering agents library, wherein, in a non-limitative example, this library is an interfering RNA library covering an unlimited number of genes. The method of the present invention allows screening for interfering RNAs, which in turn allow identifying the genes which they silence, through their capacities to stimulate or to inhibit endonuclease-induced mutagenesis, based on at least one reporter system.

[0058] In this preferred embodiment, the method of the invention is based on two successive screening steps.

[0059] The first screening step is a highly sensitive high-throughput assay measuring endonuclease-induced mutagenesis based on a compatible reporter gene, for example the luciferase gene. This method allows, in a few runs, the screening of several thousands of interfering RNAs for their capacities to modulate the reparation of an endonuclease-induced mutagenesis substrate coupled to said reporter system, compared to negative, neutral or positive interfering RNAs taken as controls. Said endonuclease-induced mutagenesis substrate is rendered inactive by a frameshift in its coding sequence due to the introduction in said sequence of an endonuclease-specific target site, like an I-Sce I or an engineered meganuclease target site. It is easily understandable that the endonuclease-specific target site can be any endonuclease-specific target site. For this identification step, said interfering RNAs are co-transfected with a delivery vector containing said reporter gene rendered inactive by a frameshift mutation due to the insertion of a double-strand break-induced target site and a delivery vector containing an endonuclease expression cassette; said endonuclease provokes a mutagenic double-strand break that can be repaired by NHEJ, leading to the functional restoration of said reporter gene and to the increase in said reporter gene-associated signal.

[0060] Interfering RNAs that modulate endonuclease-induced mutagenesis can be divided into candidates that stimulate or inhibit said endonuclease-induced mutagenesis. Genes from which these interfering RNAs are derived can also be classified as genes stimulating or inhibiting endonuclease-induced mutagenesis. Therefore, genes related to interfering RNAs that stimulate endonuclease-induced mutagenesis can be classified as genes whose products inhibit double-strand break-induced mutagenesis. Conversely, genes related to interfering RNAs that inhibit endonuclease-induced mutagenesis can be classified as genes whose products are necessary for or stimulate double-strand break-induced mutagenesis.

[0061] In the second screening step of this aspect of the invention, a similar system as in the first screening step is used, except for the reporter gene used. In this second step, the reporter gene is preferably selected to allow a qualitative

line expressing a different inactive reporter gene than the inactive reporter gene previously used;

whereby the genes identified at the end of step (i) and/or (g) are genes that modulate endonuclease-induced mutagenesis in a cell.

[0072] The eukaryotic cell line used at step (a) can be constructed by stably transfecting a cell line with a vector (hereafter referred to as the first vector) comprising an inactive reporter gene, i.e. a reporter gene rendered inactive by a frameshift mutation in its coding sequence, said frameshift mutation being due to the introduction in said sequence of a target sequence for an endonuclease. In other terms, such an inactive reporter gene is not capable of emitting any relevant detectable signal upon transfection into a cell. On the vector, the inactive reporter gene is placed under the control of expression signals allowing its expression. Thus, upon stable transfection of the cell line with the first vector, the cell line expresses the inactive reporter gene which is integrated in its genome.

[0073] This first vector can for example consist of, or be derived from, the pCLS6883 vector of SEQ ID NO: 1, or of the pCLS6884 vector of SEQ ID NO: 2.

[0074] The interfering RNA library used in the frame of this method is preferably representative of an entire eukaryotic transcriptome. In addition, it preferably comprises at least two different interfering RNAs for each gene of the eukaryotic transcriptome. Most preferably, it is constituted by iRNAs capable of targeting human genes, although it may also be constituted by iRNAs capable of targeting genes from common animal models such as mice, rats or monkeys. In a preferred embodiment, the interfering RNA library used in the frame of the present invention can be restricted to a part of a eukaryotic transcriptome.
Said restricted interfering and/or quantitative measurement of the modulation seen dur RNA library can be focused and representative of certain ing the first screening step, such as the gene encoding the classes of genes, such as genes encoding for protein kinases Green Fluorescent Protein (GFP) as non-limiting example. as a non-limiting example. 0062. The invention therefore relates to a method for iden 0075. At step (c), in addition to being transfected with the tifying genes that modulate endonuclease-induced mutagen iRNA, the eukaryotic cell is transfected with a second vector. esis in a cell comprising the steps of: 0076. The second, vector comprises an endonuclease 0063 (a) providing a cell expressing a reporter gene expression cassette (i.e. an endonuclease under the control of rendered inactive by a frameshift in its coding sequence, expression signals allowing its expression upon transfection due to the introduction in said sequence of a target into the cell). Therefore, a functional copy of the reporter sequence for an endonuclease; gene (and thus a detectable signal) can only be obtained upon 0064 (b) providing an interfering RNA comprised in an endonuclease-induced mutagenesis in the transfected interfering RNA library; eukaryotic cell. 0065 (c) transiently co-transfecting said cell with: 0077. The second vector can for example consist of, or be 0066 i. said interfering RNA; derived from, the pCLS2690 vector of SEQ ID NO: 3. The 0067 ii. a delivery vector comprising an endonu second vector can also for example encode for I-Sce mega clease expression cassette wherein said endonuclease nuclease (SEQID NO: 40). provokes a mutagenic double-strand break that can be 0078. The endonuclease present in the second vector can repaired by NHEJ leading to a functional restoration for example correspond to a a homing endonuclease such as of said reporter gene; I-SceI, I-CreI, I-Ceul, I-MsoI, and I-DmoI. 
It may be a wild 0068 (d) detecting the signal emitted by the reporter type or a variant endonuclease. In a preferred embodiment, gene in the co-transfected cell obtained at the end of step the endonuclease is an engineered meganuclease Such as, in a (c); non-limiting example, an engineered SC GS meganuclease 0069 (e) repeating step (c) and (d) at least, one time for (SEQID NO: 4). each interfering RNA of said interfering RNA library; 007.9 The first and second vectors may further comprise 0070 (f) identifying genes whose silencing through selection markers such as genes conferring resistance to an RNA interference increases or decreases the signal antibiotic in order to select cells co-transfected with both detected at step (d) as compared to a negative control; VectOrS. and 0080. In a preferred embodiment, the reporter gene used at 0071 (g) optionally, for the genes identified at step (f), step (c) is a high throughput Screening-compatible reporter providing an interfering RNA capable of silencing said gene Such as e.g. the gene encoding luciferase (including gene, and repeating steps (a), (c), (d) and (f) with a cell variants of this gene Such as firefly or renilla luciferase genes) US 2012/0244131 A1 Sep. 27, 2012 or other reporter genes that allow measuring a defined param puromycin N-acetyl gene pac, the blasticidin S eter in a large number of samples (relying on the use of deaminase resistant gene bSr and the bleomycin resistant, multiwell plates, typically with 96, 384 or 1536 wells) as genesh ble, as non-limiting examples. quickly as possible. Other reporter genes include in a non limitative way, the beta-galactosidase and the phosphatase I0087. In this second screening, the reporter gene is pref alkaline genes, which are well-known in the art. erably a gene allowing an accurate detection of the signal and 0081. 
In step (d), the signal emitted by the reporter gene in a precise qualitative and/or quantitative measurement of the the co-transfected cell is detected using assays well-known in endonuclease-induced mutagenesis modulation, Such as e.g. the art. the genes encoding the Green Fluorescent Protein (GPP), the 0082 Step (e) comprises repeating steps (c) and (d) at least Red Fluorescent Protein (RFP), the Yellow Fluorescent Pro one time for each interfering RNA of the interfering RNA tein (YFP) and the Cyano Fluorescent Protein (CFP), respec library. For example, if the iRNA library comprises two dif tively. The reporter gene of the second screening can also be ferent interfering RNAs for each gene of the eukaryotic tran any protein antigen that can be detected using a specific Scriptome, each gene of the transcriptome will be tested antibody conjugated to a fluorescence-emitting probe or twice. tagged by Such a fluorescent probe usable in Fluorescent 0083. At step (f), genes whose silencing through RNA Activated Cell Sorting (FACS). For example cell surface interference increases or decreases, preferably significantly expressing molecule like CD4 can be used as an expression increases or decreases, the signal detected at step (d) as com pared to a negative control are identified. In particular, the reporter molecule detectable with a specific anti-CD4 anti signal detected at step (d) is compared with the signal body conjugated to a fluorescent protein. FACS technology detected in the same conditions with at least one interfering and derivated applications to measure expression of reporter RNA taken as a negative control. The interfering RNA taken genes are well known in the art. as a negative control corresponds to a iRNA known not to I0088 As shown in Examples 1 to 4, the above method hybridize and thus not to be involved in endonuclease-in according to the invention was successfully applied to iden duced mutagenesis such as e.g. 
the “All Star (AS) iRNA tify several genes that modulate endonuclease-induced (Qiagen #1027280). For example, if a two-fold increase of the mutagenesis in a cell. signal detected upon transfection with an iRNA targeting a given gene, compared to the signal detected with a negative I0089. In a second aspect, the present invention concerns a control, said given gene is identified as a gene that modulates method for modulating double-strand break-induced endonuclease-induced mutagenesis in said cell. mutagenesis in a cell by using interfering agents. The infor 0084. In a preferred embodiment, the method of the mation obtained when carrying out the above method for present invention further comprises Supplementary steps of identifying effectors that modulate double-strand break-in selection. In other terms, the interfering RNAs identified at duced mutagenesis in a cell can be used to increase or step (f) are further selected through another Succession of decrease mutagenesis in cells. Depending on the envisioned steps (a), (c), (d) and (t), wherein inactive reporter gene is application, interfering agents that increase or interfering different from the one previously used. agents that decrease double-strand break-induced mutagen 0085. In a most preferred embodiment, steps (a) to (f) the esis in a cell can be used. above methods are first carried out using a cell line expressing an inactive luciferase reporter gene. This cell line can for 0090 indeed, interfering agents that modulate double example correspond to a cell line obtained through stable Strand break-induced mutagenesis through their respective transfection of a cell line with pCLS6883 vector of SEQ ID effectors can be used directly. For a given interfering agent, it NO: 1, or of the pCLS6884 vector of SEQ ID NO: 2 or is easily understood that other interfering agents derived from plasmids derived from those. 
This cell line is then co-trans said given interfering agent (equivalent interfering RNAS) fected with iRNAs and pCLS2690 vector of SEQID NO: 3, can be synthetized and used with the same objectives and Once genes whose silencing through. RNA interference results. increases or decreases the signal detected at step (d) as com pared to a negative control are identified, steps (a), (c), (d) and 0091 Interfering agents or derivatives can be used to (f) may then be repeated with iRNAs silencing these genes. modulate double-strand break-induced mutagenesis in a cell The cell line used at the second selection round may for by introducing them with at least, one delivery vector con example express an inactive GFP reporter gene (due to a taining at least one double-strand break creating agent frameshift mutation after insertion of an endonuclease target expression cassette. It is easily understood that these interfer site), and may e.g. be obtained through stable transfection of ing agents or derivatives can be introduced by all methods a cell line with the pCLS inactive GFP-encoding vector known in the art, as part or not of a vector, unique or not, under (pCLS6810 of SEQID NO: 5 or pCLS6663 of SEQID NO: the control of an inducible promoter or not. Therefore, the 6. The pCLS2690 vector of SEQ ID NO: 3 and the pCLS effects of these interfering agents or derivatives in the cell can inactive GFP-encoding vector of SEQID NO: 5 can then be be permanent or transitory. used for co-transfection with iERNAs. This second screening allows confirming that the genes identified at Step (f) are 0092. Therefore, the second aspect of the invention per genes that modulate endonuclease-induced mutagenesis in a tains to a method for modulating double-strand break-in cell. duced mutagenesis in a cell, comprising the steps of I0086. 
In the second screening, the reporter gene used can 0.093 (a) identifying an effector that is capable of be a gene that when active, confers resistance to an antibiotic modulating double-strand break-induced mutagenesis Such as the neomycin phosphotransferase resistant gene mptl, in a cell by a method according to the first aspect of the the hygromycin phosphotransferase resistant gene hph, the invention; and US 2012/0244131 A1 Sep. 27, 2012

[0094] (b) introducing into a cell:

[0095] i. at least one interfering agent capable of modulating said effector;

[0096] ii. at least one delivery vector comprising at least one double-strand break creating agent;

thereby obtaining a cell in which double-strand break-induced mutagenesis is modulated.

[0097] Therefore, in the second aspect of the invention is comprised a method for increasing double-strand break-induced mutagenesis in a cell, comprising the steps of:

[0098] (a) identifying a gene that is capable of stimulating double-strand break-induced mutagenesis in a cell by a method according to the first aspect of the invention or providing a gene selected from the group of genes listed in table I or II; and

[0099] (b) introducing into a eukaryotic cell:

[0100] i. at least one interfering agent, wherein said interfering agent is a polynucleotide silencing or encoding said gene, wherein said polynucleotide is an interfering RNA capable of silencing said gene if the signal detected at step (d) of the method according to claim 1 is increased as compared to the negative control, and is a cDNA transcribed from said gene if the signal detected at step (d) of the method according to claim 1 is decreased as compared to the negative control;

[0101] ii. at least one delivery vector comprising at least one double-strand break creating agent;

thereby obtaining a eukaryotic cell in which double-strand break-induced mutagenesis is increased.

[0102] In another embodiment, is a method for increasing double-strand break-induced mutagenesis in a cell comprising the steps of introducing into said cell:

[0103] i. at least one interfering agent, wherein said interfering agent is a polynucleotide silencing at least one gene selected from the group of genes listed in tables I, II, IV and VII;

[0104] ii. at least one delivery vector comprising at least one double-strand break creating agent;

[0105] thereby obtaining a eukaryotic cell in which double-strand break-induced mutagenesis is increased.

[0106] More preferably, the interfering RNA used according to the present invention for increasing double-strand break-induced mutagenesis in a cell targets a sequence selected from the group consisting of SEQ ID NO: 13-35, SEQ ID NO: 37-39, SEQ ID NO: 44-76 and SEQ ID NO: 80-555. More preferably, the interfering RNA used according to the present invention for increasing double-strand break-induced mutagenesis in a cell targets a sequence selected from the group consisting of SEQ ID NO: 106, 15, 16, 20, 33, 45, 80, 83, 85, 89, 96, 97, 98, 102, 103, 104, 108, 109, 110, 111, 113, 114, 115, 118, 121, 122, 126, 127, 128, 135, 137, 138, 139, 140, 141, 143, 146, 149, 151, 153, 162, 163, 167, 168, 174, 175, 177, 178, 180, 181, 184, 185, 186, 187, 188, 189, 193, 195, 196, 198, 201, 203, 204, 215, 221, 222, 223, 225, 226, 227, 228, 229, 232, 233, 235, 236, 237, 238, 239, 243, 244, 247, 249, 250, 251, 252, 254, 256, 257, 258, 265, 267, 268, 269, 271, 277, 278, 282, 283, 285, 299, 308, 309, 315, 328, 331, 335, 338, 340, 353, 367, 368, 385, 399, 416.

[0107] In the above methods, "at least one interfering agent" means that only one interfering agent, but also more than one interfering agent, can be used. In a preferred embodiment, 2 interfering RNAs can be used at the same time in the above methods; in a most preferred embodiment, 3, 4, 5, 6, 7, 8, 9 or 10 interfering RNAs can be used at the same time; in another most preferred embodiment, more than 10 interfering RNAs can be used. When several interfering RNAs are used in the above methods, they can be mixed or not, i.e. introduced into the cell at the same moment or not. In another embodiment, more than one interfering agent means 2 different interfering agents as described in the "Definitions" paragraph below; as a non-limiting example, one interfering RNA targeting one gene can be used at the same time as one cDNA overexpressing another gene. As another non-limiting example of using different kinds of interfering agents (as described in the "Definitions" paragraph below), at least one interfering RNA can be used at the same time as at least one small compound.

[0108] In the above methods, the endonuclease encoded by the vector comprising at least one endonuclease expression cassette may either be the same endonuclease as the one used in the method for identifying genes that modulate endonuclease-induced homologous recombination, or a different endonuclease. This endonuclease can correspond to any of the endonucleases described in the "Definitions" paragraph below. It may for example be a homing endonuclease such as I-SceI, I-CreI, I-CeuI, I-MsoI, and I-DmoI. It may be a wild type or a variant endonuclease. In a preferred embodiment, the endonuclease is a variant I-CreI endonuclease.

[0109] By increasing double-strand break-induced mutagenesis is understood the increase of its efficiency, i.e. any statistically significant increase of double-strand break-induced mutagenesis in a cell when compared to an appropriate control, including for example, at least 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500% or greater increase in the efficiency of a double-strand break-induced mutagenesis event for a polynucleotide of interest (i.e. a transgene).

[0110] In a preferred embodiment according to the invention, the gene that modulates endonuclease-induced mutagenesis is a gene that decreases endonuclease-induced mutagenesis efficiency. In such a case, an interfering RNA capable of silencing said gene, introduced into said considered cell, is able to increase endonuclease-induced mutagenesis. The interfering RNA may for example be a siRNA, a miRNA or a shRNA.

[0111] In an extrachromosomal assay transiently expressing the vectors of the above method of the invention in a eukaryotic cell, the inventors have found that the genes listed in table I herebelow are capable of decreasing endonuclease-induced mutagenesis, particularly GS engineered meganuclease-induced mutagenesis (see Example 2). Indeed, siRNAs respectively targeting those genes (sequences listed in Table I) are able to stimulate GS engineered meganuclease-induced mutagenesis. Therefore, a gene that is capable of modulating endonuclease-induced mutagenesis in a eukaryotic cell can be selected from the group of genes listed in Table I below.
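The selection rule of paragraphs [0097] to [0101] (silence a gene whose knock-down raised the reporter signal, overexpress a gene whose knock-down lowered it) can be sketched in a few lines of Python. This is only an illustrative sketch: the names `ScreenResult` and `choose_agent` are hypothetical, the two-fold threshold is borrowed from the example given for step (f), and applying the same factor symmetrically to decreases is an assumption, not something the text specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScreenResult:
    """Reporter read-out for one silenced gene (hypothetical structure)."""
    gene: str
    signal: float          # reporter signal with the iRNA targeting `gene`
    control_signal: float  # reporter signal with the negative-control iRNA


def fold_change(result: ScreenResult) -> float:
    """Ratio of the reporter signal to the negative-control signal."""
    if result.control_signal <= 0:
        raise ValueError("control signal must be positive")
    return result.signal / result.control_signal


def choose_agent(result: ScreenResult, threshold: float = 2.0) -> Optional[str]:
    """Pick the interfering agent expected to increase mutagenesis.

    Silencing raised the signal  -> the gene product inhibits mutagenesis,
                                    so an interfering RNA is used.
    Silencing lowered the signal -> the gene product stimulates mutagenesis,
                                    so a cDNA (overexpression) is used.
    The symmetric 1/threshold cut-off is an assumption of this sketch.
    """
    fc = fold_change(result)
    if fc >= threshold:
        return "interfering RNA"
    if fc <= 1.0 / threshold:
        return "cDNA"
    return None  # no clear modulation relative to the negative control


if __name__ == "__main__":
    # Made-up signal values, for illustration only.
    for r in (ScreenResult("geneA", 6.0, 2.0),
              ScreenResult("geneB", 1.0, 4.0),
              ScreenResult("geneC", 2.2, 2.0)):
        print(r.gene, round(fold_change(r), 2), choose_agent(r))
```

In practice the cut-off would be replaced by whatever statistical criterion the screen uses; the sketch only makes the direction of the rule explicit.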

TABLE I
siRNA hits stimulating GS SC-induced luciferase signal.

Gene      Gene    siRNA                    SEQ ID   Mean Z          Stimulation
targeted  ID      target sequence          NO:      Score    Std    factor

CSNK1D    1453    CCGGTCTAGGATCGAAATGTT    13       3.14     0.75   3.40
AK2       204     CGGCAGAACCCGAGTATCCTA    14       5.51     0.20   5.08
AKT2      208     CAAGCGTGGTGAATACATCAA    15       3.65     0.23   2.94
CAMK2G    818     GAGGAAGAGATCTATACCCTA    16       5.01     0.23   3.66
GK2       2712    TACGTTAGAAGAGCACTGTAA    17       3.33     1.49   2.75
PFKFB4    5210    CAGAAAGTGTCTGGACTTGTA    18       3.92     1.14   2.18
MAPK12    6300    CTGGACGTATTCACTCCTGAT    19       3.84     0.06   3.22
PRKCE     5581    CCCGACCATGGTAGTGTTCAA    20       4.00     0.43   2.91
EIF2AK2   5610    CGGAAAGACTTACGTTATTAA    21       4.50     0.22   3.15
WEE1      7465    CAGGGTAGATTACCTCGGATA    22       3.20     0.08   5.01
CDK5R1    8851    CCGGAAGGLCACGCTGTTTGA    23       4.01     0.26   6.03
LIG4      3981    CACCGTTTATTTGGACTCGTA    24       4.11     0.41   6.15
AKAP1     8165    AGCGCTGAACTTGATTGGGAA    25       4.97     0.32   7.24
MAP3K6    9064    TCAGAGGAGCTGAGTAATGAA    26       5.99     0.22   5.41
DYRK3     8444    TCGACAGTACGTGGCCCTAAA    27       3.54     0.22   3.61
RPS6KA4   8986    CGCCACCTTCATGGCATTCAA    28       3.56     0.73   3.61
STK17A    9263    CACACTCGTGATGTAGTTCAT    29       3.26     0.43   2.07
GNE       10020   CCCGATCATGTTTGGCATTAA    30       3.31     0.25   2.20
ERN2      10595   CTGGTTCGGCGGGAAGTTCAA    31       3.47     1.47   2.30
HUNK      30811   CACGGGCAAAGTGCCCTGTAA    32       3.63     1.30   1.97
SMG1      23049   CACCATGGTATTACAGGTTCA    33       3.22     0.46   2.05
WNK4      65266   CAGCTTGTTGGGCGTTTCCAA    34       5.58     0.70   4.15
MAGI2     9863    CAGGCCCAACTTGGGATATCA    35       3.07     0.63   2.07
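The target sequences of Table I are given as DNA. As a minimal sketch of how a sense and an antisense RNA strand could be written down from such a target, the Python snippet below transcribes the DNA target into the sense strand and takes its reverse complement as the antisense strand. The function name `sirna_duplex` is hypothetical, the 17 to 29 nucleotide bounds come from the strand lengths stated for the siRNAs of the invention, and real siRNA designs (overhangs, chemical modifications, strand selection) are deliberately left out.

```python
# RNA complement of each DNA base of the target sequence.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def sirna_duplex(target_dna: str, min_len: int = 17, max_len: int = 29):
    """Return (sense, antisense) RNA strands, both written 5'->3'.

    sense     : the DNA target with T replaced by U
    antisense : the reverse complement of the target, as RNA
    """
    target = target_dna.strip().upper()
    if not min_len <= len(target) <= max_len:
        raise ValueError(f"target length {len(target)} outside "
                         f"{min_len}-{max_len} nucleotides")
    unknown = set(target) - set(COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected characters in target: {sorted(unknown)}")
    sense = target.replace("T", "U")
    antisense = "".join(COMPLEMENT[base] for base in reversed(target))
    return sense, antisense


if __name__ == "__main__":
    # CSNK1D target sequence from Table I (SEQ ID NO: 13).
    sense, antisense = sirna_duplex("CCGGTCTAGGATCGAAATGTT")
    print("sense    5'-" + sense + "-3'")
    print("antisense 5'-" + antisense + "-3'")
```

The input check rejects any character other than A, C, G or T, so a target containing an unreadable base is refused rather than silently converted.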

[0112] More preferably, the interfering RNA targets used in the frame of the method according to the present invention target a sequence selected from the group consisting of SEQ ID NO: 13-35, SEQ ID NO: 37-39, SEQ ID NO: 44-76 and SEQ ID NO: 80-1041. More preferably, the interfering RNA targets used in the frame of the method according to the present invention target a sequence selected from the group consisting of SEQ ID NO: 13-35, SEQ ID NO: 37-39, SEQ ID NO: 44-76 and SEQ ID NO: 80-555. More preferably, the interfering RNA targets used in the frame of the method according to the present invention target a sequence selected from the group consisting of SEQ ID NO: 106, 15, 16, 20, 33, 45, 80, 83, 85, 89, 96, 97, 98, 102, 103, 104, 108, 109, 110, 111, 113, 114, 115, 118, 121, 122, 126, 127, 128, 135, 137, 138, 139, 140, 141, 143, 146, 149, 151, 153, 162, 163, 167, 168, 174, 175, 177, 178, 180, 181, 184, 185, 186, 187, 188, 189, 193, 195, 196, 198, 201, 203, 204, 215, 221, 222, 223, 225, 226, 227, 228, 229, 232, 233, 235, 236, 237, 238, 239, 243, 244, 247, 249, 250, 251, 252, 254, 256, 257, 258, 265, 267, 268, 269, 271, 277, 278, 282, 283, 285, 299, 308, 309, 315, 328, 331, 335, 338, 340, 353, 367, 368, 385, 399, 416.

[0113] As shown in example 3, the above method according to the invention was successfully applied to stimulate endonuclease-induced mutagenesis in a cellular model stably expressing at an endogenous locus (RAG1) the construction that allows measuring GS engineered meganuclease-induced mutagenesis. Indeed, siRNAs targeting genes involved in NHEJ (LIG4; SEQ ID NO: 24) or in NHEJ and other DNA repair pathways (WRN; SEQ ID NO: 37) or in DNA repair (FANCD2; SEQ ID NO: 39) or in DNA repair regulation (MAPK3; SEQ ID NO: 38) were able to increase GS engineered meganuclease-induced luciferase signal. Moreover, 8 siRNAs identified with the extrachromosomal assay of example 2, targeting CAMK2G (SEQ ID NO: 16), SMG1 (SEQ ID NO: 33), PRKCE (SEQ ID NO: 20), CSNK1D (SEQ ID NO: 13), AK2 (SEQ ID NO: 14), AKT2 (SEQ ID NO: 15),

MAPK12 (SEQ ID NO: 19) and EIF2AK2 (SEQ ID NO: 21) genes and also two siRNAs targeting the PRKDC gene (PRKDC 5, SEQ ID NO: 75 and PRKDC 8, SEQ ID NO: 76) involved in DNA repair regulation were able to increase GS engineered meganuclease-induced luciferase signal. As shown in example 4, the above method according to the invention was successfully applied to stimulate endonuclease-induced mutagenesis at an endogenous locus (RAG1). siRNAs targeting XRCC6 (SEQ ID NO: 44), BRCA1 (SEQ ID NO: 45), FANCD2 (SEQ ID NO: 39), WRN (SEQ ID NO: 37) and MAPK3 (SEQ ID NO: 38) were able to enhance the percentage of mutagenic NHEJ repair as measured by Deep Sequencing analysis at the endogenous RAG1 locus (see Table II below).

TABLE II
siRNA stimulating endonuclease-induced mutagenesis at RAG1 locus.

Gene      Gene    siRNA                    SEQ ID   NHEJ Stimulation
targeted  ID      target sequence          NO:      factor

XRCC6     2547    ACCGAGGGCGATGAAGAAGCA    44       1.6
BRCA1     672     ACCATACAGCTTCATAAATAA    45       2.1
FANCD2    2177    AAGCAGCTCTCTAGCACCGTA    39       2.5
WRN       7486    CGGATTGTATACGTAACTCCA    37       2.4
MAPK3     5595    CCCGTCTAATATATAAATATA    38       1.9

[0114] As also shown in example 3, the screen of a siRNA collection from Qiagen led to the identification of 481 siRNA hits that stimulate SC-GS-induced mutagenesis as listed in table IV (SEQ ID NO: 80-555) and to the identification of 486 siRNA hits that inhibit SC-GS-induced mutagenesis as listed in table V (SEQ ID NO: 556-1041). Interfering RNA capable of silencing a given gene can easily be obtained by the person skilled in the art. Such iRNAs may for example be purchased from a provider. Alternatively, commercially available tools allow designing iRNAs targeting a given gene.

[0115] Useful interfering RNAs can be designed with a number of software programs, e.g., the OligoEngine siRNA design tool available at the oligoengine.com worldwide web site. Database RNAi Codex (available at the codex.cshl.edu website) publishes available RNAi resources, and provides the most complete access to this growing resource.

[0116] The iRNAs used in the frame of the present invention can for example be a shRNA. shRNAs can be produced using a wide variety of well-known RNAi techniques. shRNAs that are synthetically produced as well as miRNA that are found in nature can for example be redesigned to function as synthetic silencing shRNAs. DNA vectors that express perfect complementary shRNAs are commonly used to generate functional siRNAs.

[0117] iRNAs can be produced by chemical synthesis (e.g. in the case of siRNAs) or can be produced by recombinant technologies through an expression vector (e.g. in the case of shRNAs).

[0118] The iRNAs according to the invention may optionally be chemically modified.

[0119] In another preferred embodiment according to the invention, the gene that modulates endonuclease-induced mutagenesis is a gene that increases endonuclease-induced mutagenesis (i.e. the presence of which increases double-strand break-induced mutagenesis in a cell). In such a case, a cDNA leading to increased expression of said gene is introduced into said cell.

[0120] cDNA usually refers to a double-stranded DNA that is derived from mRNA which can be obtained from prokaryotes or eukaryotes by reverse transcription. cDNA is a more convenient way to work with the coding sequence than mRNA because RNA is very easily degraded by omnipresent RNases. Methods and advantages to work with cDNA are well known in the art (1989, Molecular cloning: a laboratory manual, 2nd edition and further ones, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.). Particularly in the context of the present invention, the availability of a cDNA clone allows the corresponding protein to be expressed in a variety of contexts. The cDNA can be inserted into a variety of expression vectors for different purposes. Perhaps the most obvious use of such an approach in the present invention is to drive the expression of a defined protein involved in a protein transduction cascade to levels that allow higher frequency of endonuclease-induced mutagenesis and so, mutagenesis events. As well-known in the art, one can express not only the wild type protein but also mutant proteins, said particular mutations having consequences in structure-function relationships within a protein itself (improved catalytic activity) or for association with another endogenous protein.

[0121] As used herein, the term "cDNA" encompasses both full-length cDNAs naturally transcribed from the gene and biologically active fragments thereof, such as e.g. cDNAs encoding the mature protein encoded by the gene or biologically active fragments thereof. The biologically active fragments thereof can for example code for maturation products of the protein encoded by the gene.

[0122] In a third aspect, the present invention concerns specific interfering agents, their derivatives such as polynucleotide derivatives or other molecules as non-limiting examples. In this aspect, the present invention concerns specific interfering agents for modulating double-strand break-induced mutagenesis in a cell, wherein said interfering agents modulate effectors representative of an entire eukaryotic transcriptome. In a preferred embodiment, said interfering agents modulate effectors which are part of a restricted library representative of certain classes of effectors. In a most preferred embodiment, said interfering agents modulate effectors from the group listed in Table I and Table II. In a preferred embodiment of this third aspect, the present invention concerns specific polynucleotide derivatives identified for effector genes, which increase endonuclease-induced mutagenesis.

[0123] In a preferred embodiment of this aspect of the invention, these polynucleotide derivatives are interfering RNAs, more preferably siRNAs or shRNAs.

[0124] As indicated in the definitions hereabove, the siRNAs according to the invention are double-stranded RNAs, each RNA of the duplex comprising for example between 17 and 29 nucleotides, e.g. 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28 or 29 nucleotides.

[0125] Such siRNAs can be formed from two RNA molecules that hybridize together or can alternatively be generated from a single RNA molecule that includes a self-hybridizing portion, referred to as shRNAs. The duplex portion of a siRNA can include one or more unpaired and/or mismatched nucleotides in one or both strands of the duplex (bulges) or can contain one or more noncomplementary nucleotide pairs. The duplex of a siRNA is composed of a sense strand and of an antisense strand. Given a target transcript, only one strand of the siRNA duplex is supposed to hybridize with one strand of said target transcript. In certain embodiments, one strand (either sense or antisense) is perfectly complementary with a region of the target transcript, either on the entire length of the considered siRNA strand (comprised between 17 and 29 nucleotides, including 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, and 29 nucleotides), or on only a part of the considered siRNA strand, 17 to 29 or 19 to 29 nucleotides matching for example, or 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28 from 29 nucleotides. In one embodiment it is intended that the considered strand of the siRNA duplex (either sense or antisense) hybridizes the target transcript without a single mismatch over that length. In another embodiment, one or more mismatches between the considered strand of the siRNA duplex (either sense or antisense) can exist.

[0126] Therefore, an aspect of the invention is drawn to an interfering RNA for increasing endonuclease-induced mutagenesis in a cell, wherein said interfering RNA comprises a sense RNA nucleic acid and an antisense RNA nucleic acid, and wherein said interfering RNA down-regulates the expression (most preferably silences the expression) of gene transcripts part of a library representative of an entire transcriptome. In a preferred embodiment, the interfering RNA library used in the frame of the present invention can be representative of only a part of a eukaryotic transcriptome. Said restricted interfering RNA library can be representative of certain classes of transcripts, such as those encoding kinases as a non-limiting example. In a preferred embodiment, said interfering RNA library can be obtained from a provider; as a non-limiting example, said interfering RNA library can be a library purchased from Qiagen and covering 19121 genes with two different siRNAs per gene. In a preferred embodiment of this aspect of the invention, the interfering RNA targets a gene selected from this library. In a most preferred embodiment of this aspect of the invention, the interfering RNA targets a gene selected from the group of genes listed in Table I, II, IV and Table VII. More preferably, the interfering RNA according to the invention targets a sequence selected from the group consisting of SEQ ID NO: 13-35, SEQ ID NO: 37-39, SEQ ID NO: 44-76 and SEQ ID NO: 80-1041. More preferably, the interfering RNA targets used in the frame of the method according to the present invention target a sequence selected from the group consisting of SEQ ID NO: 13-35, SEQ ID NO: 37-39, SEQ ID NO: 44-76 and SEQ ID NO: 80-555. More preferably, the interfering RNA targets used in the frame of the method according to the present invention target a sequence selected from the group consisting of SEQ ID NO: 106, 15, 16, 20, 33, 45, 80, 83, 85, 89, 96, 97, 98, 102, 103, 104, 108, 109, 110, 111, 113, 114, 115, 118, 121, 122, 126, 127, 128, 135, 137, 138, 139, 140, 141, 143, 146, 149, 151, 153, 162, 163, 167, 168, 174, 175, 177, 178, 180, 181, 184, 185, 186, 187, 188, 189, 193, 195, 196, 198, 201, 203, 204, 215, 221, 222, 223, 225, 226, 227, 228, 229, 232, 233, 235, 236, 237, 238, 239, 243, 244, 247, 249, 250, 251, 252, 254, 256, 257, 258, 265, 267, 268, 269, 271, 277, 278, 282, 283, 285, 299, 308, 309, 315, 328, 331, 335, 338, 340, 353, 367, 368, 385, 399, 416.

[0127]

and SEQ ID NO: 80-555, again more preferably from the group consisting of SEQ ID NO: 106, 15, 16, 20, 33, 45, 80, 83, 85, 89, 96, 97, 98, 102, 103, 104, 108, 109, 110, 111, 113, 114, 115, 118, 121, 122, 126, 127, 128, 135, 137, 138, 139, 140, 141, 143, 146, 149, 151, 153, 162, 163, 167, 168, 174, 175, 177, 178, 180, 181, 184, 185, 186, 187, 188, 189, 193, 195, 196, 198, 201, 203, 204, 215, 221, 222, 223, 225, 226, 227, 228, 229, 232, 233, 235, 236, 237, 238, 239, 243, 244, 247, 249, 250, 251, 252, 254, 256, 257, 258, 265, 267, 268, 269, 271, 277, 278, 282, 283, 285, 299, 308, 309, 315, 328, 331, 335, 338, 340, 353, 367, 368, 385, 399, 416 with or without mismatch. Preferably, there is no mismatch, meaning that one strand of this iRNA (either sense or antisense) comprises or consists of the RNA sequence corresponding to a DNA sequence selected from the group consisting of SEQ ID NO: 13-35, SEQ ID NO: 37-39, SEQ ID NO: 44-76 and SEQ ID NO: 80-1041.

[0128] In the iRNAs according to the invention, the sense RNA nucleic acid may for example have a length comprised between 19 and 29.

[0129] In the frame of the present invention, the interfering RNA according to the invention may further comprise a hairpin sequence, wherein the sense RNA nucleic acid and the antisense RNA nucleic acid are covalently linked by the hairpin sequence to produce a shRNA molecule.

[0130] In a preferred embodiment according to the invention, the interfering RNA according to the invention as defined hereabove down-regulates the expression (most preferably silences the expression) of the genes listed in Table I, Table II, Table IV, Table V and Table VII. Indeed, as respectively shown in examples 2 and 4, introducing such an iRNA selected from the group consisting of SEQ ID NO: 13-35, SEQ ID NO: 37-39, SEQ ID NO: 44-46 and SEQ ID NO: 75-76 in a cell leads to approximately a 2 to 7 fold increase of the endonuclease-induced mutagenesis signal of an extrachromosomal reporter assay in this cell and to a 1.6 to 2.5 increase of the endonuclease-induced mutagenesis events at an endogenous locus of this cell. Other results and fold increases are shown in examples 3 and 5 for iRNA listed in Tables IV, V and VII.

[0131] In a preferred embodiment, these iRNA down-regulating the expression of their respective targeted genes comprise a sense RNA nucleic acid consisting of a sequence at least 70%, 75%, 80%, 85%, 90%, 95%, 99% or 100% identical to a fragment of at least 17 consecutive nucleotides of the respective mRNA sequences of the genes listed in Tables I, II, IV, V and VII. These fragments of at least 17 consecutive nucleotides of the respective mRNA sequences of the genes listed in Tables I and II may for example include 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, and 29 consecutive nucleotides of the respective mRNA sequences of the genes listed in Tables I, II, IV, V and VII.

[0132] The antisense RNA nucleic acid of such an iRNA above from the mRNA sequence of a given gene listed in Tables I, II, IV and V may as a non-limiting example consist of a sequence at least 70%, 75%, 80%, 85%, 90%, 95%, 99% or 100% identical to a fragment complementary to at least 17
In other terms, one strand of this iRNA (either sense, consecutive nucleotides of the considered mRNA sequence. either antisense) comprises a sequence hybridizing to a This fragment of at least 17 consecutive nucleotides comple sequence selected from the group consisting of SEQID NO: mentary of the respective mRNA sequences of the genes 13-35, SEQID NO:37-39, SEQID NO: 44-76 and SEQID listed in Tables I, II, IV, V and VII may for example include NO: 80-1041, more preferably from the group consisting of 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, and 29 consecu SEQID NO: 13-35, SEQID NO:37-39, SEQID NO. 44-76 tive nucleotides complementary of this sequence. US 2012/0244131 A1 Sep. 27, 2012
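The strand relationships described in paragraphs [0126] to [0132] can be illustrated with a minimal Python sketch. It takes one 21-nucleotide DNA target sequence (the CSNK1D entry, SEQ ID NO: 47, as printed in Table III), derives the corresponding sense RNA strand and the perfectly complementary antisense strand, checks the 17 to 29 nucleotide length range, and counts mismatches. All function names are hypothetical and are not taken from the application. The identity calculation is a simple ungapped, position-by-position comparison, which is only one possible reading of "at least 70% ... identical"; bulges and gapped alignments are not modelled.

```python
# Illustrative sketch only; helper names are hypothetical, not from the application.

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def dna_to_rna(dna: str) -> str:
    """Return the RNA sequence corresponding to a DNA sequence (T -> U)."""
    return dna.upper().replace("T", "U")


def antisense_of(sense_rna: str) -> str:
    """Return the antisense strand, i.e. the reverse complement of the sense strand."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(sense_rna))


def length_in_range(strand: str, low: int = 17, high: int = 29) -> bool:
    """Check the 17 to 29 nucleotide strand length mentioned in the text."""
    return low <= len(strand) <= high


def mismatches(antisense_rna: str, target_rna: str) -> int:
    """Count positions where the antisense strand does not pair with the target.

    Assumption: both sequences have the same length and are aligned end to end
    in antiparallel orientation, with no gaps or bulges.
    """
    if len(antisense_rna) != len(target_rna):
        raise ValueError("sequences must have the same length")
    paired = zip(reversed(antisense_rna), target_rna)
    return sum(1 for a, t in paired if RNA_COMPLEMENT[a] != t)


def percent_identity(a: str, b: str) -> float:
    """Ungapped, position-by-position identity of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


# CSNK1D target sequence, SEQ ID NO: 47, as printed in Table III.
target_dna = "CTCCCTGACGATTCCACTGTA"
sense = dna_to_rna(target_dna)
antisense = antisense_of(sense)

print(sense, antisense, length_in_range(sense), mismatches(antisense, sense))
```

In this sketch a strand "without mismatch" is one for which `mismatches` returns 0 against the chosen region of the transcript; a non-zero count corresponds to the embodiment in which one or more mismatches are tolerated.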

[0133] The iRNAs down-regulating the expression of a given gene listed in Tables I and II may correspond to a different sequence targeting the same given genes listed in Table III below.

TABLE III
Other siRNA target sequences for targeted genes of Tables I and II

Gene targeted   Gene ID   siRNA target sequence    SEQ ID NO:
CSNK1D          1453      CTCCCTGACGATTCCACTGTA    47
AK2             204       CTGCAAGCCTACCACACTCAA    48
AKT2            208       ACGGGCTAAAGTGACCATGAA    49
CAMK2G          818       CCGATGAGAAACCTCGTGTTA    50
GK2             2712      CTCGGGTGTGCCATAATAATA    51
PFKFB4          5210      ACGGAGAGCGACCATCTTTAA    52
MAPK12          6300      TGGAAGCGTGTTACTTACAAA    53
PRKCE           5581      CACGGAAACACCCGTACCTTA    54
EIF2AK2         5610      TACATAGGCCTTATCAATAGA    55
WEE1            7465      ACAATTACGAATAGAATTGAA    56
CDK5R1          8851      TGAGCTGGTTTGACTCATTAA    57
LIG4            3981      ATCTGGTAAGCTCGCATCTAA    58
AKAP1           8165      CACGCAGAGATGACAGTACAA    59
MAP3K6          9064      CACCATCCAAATGCTGTTGAA    60
DYRK3           8444      AGCCAATAAGCTTAAAGCTAA    61
RPS6KA4         8986      CAGGCTGTGCCTTTGACTTTA    62
STK17A          9263      TCCATTGTAACCGAAGAGTTA    63
GNE             10020     ATGGAAATACATATCGAATGA    64
ERN2            10595     AAGGATGAAACTGGCTTCTAT    65
HUNK            30811     TCGGACCAAGATCAAACCAAA    66
SMG1            23049     ATCGATGTTGCCAGACTACTA    67
WNK4            65266     CAGGAGGAGCCAGCACCATTA    68
MAGI2           9863      ATGGACCGATGGGAGAATCAA    69
XRCC6           2547      TTTGTACTATATACTGTTAAA    70
BRCA1           672       AACCTATCGGAAGAAGGCAAG    71
FANCD2          2177      CAGAGTTTGCTTCACTCTCTA    72
WRN             7486      TCCGCTGTAGCAATTGGAGTA    73
MAPK3           5595      TGGACCGGATGTTAACCTTTA    74

[0134] The iRNA down-regulating the expression of the MAPK3 gene (Gene ID 5595) may for example target a sequence consisting of SEQ ID NO: 38 and/or SEQ ID NO: 74. In other terms, one strand of this iRNA (either sense or antisense) comprises a sequence hybridizing to a sequence selected from the group consisting of SEQ ID NO: 38 and/or SEQ ID NO: 74, with or without mismatch. Preferably, there is no mismatch, meaning that one strand of this iRNA (either sense or antisense) comprises or consists of the RNA sequence corresponding to a DNA sequence selected from the group consisting of SEQ ID NO: 38 and/or SEQ ID NO: 74. In a preferred embodiment, two interfering RNAs can be used at the same time in the methods of the present invention; in a preferred embodiment, these iRNAs are siRNAs; in a most preferred embodiment, combinations of two siRNAs used at the same time in the methods of the present invention encompass siRNAs targeting the CAMK2G (SEQ ID NO: 16), SMG1 (SEQ ID NO: 33), PRKCE (SEQ ID NO: 20), FANCD2 (SEQ ID NO: 39) and LIG4 (SEQ ID NO: 24) genes. In another most preferred embodiment, the combinations of two siRNAs that are used in the present invention are selected from the group consisting of CAMK2G+SMG1, CAMK2G+PRKCE, CAMK2G+FANCD2, CAMK2G+LIG4, SMG1+PRKCE, SMG1+FANCD2, SMG1+LIG4, PRKCE+FANCD2, PRKCE+LIG4, FANCD2+LIG4. In another preferred embodiment, several combinations of two siRNAs, i.e. 2, 3, 4, 5, 6, 7, 8, 9 or 10 combinations of two interfering RNAs, can be used at the same time; in another most preferred embodiment, more than 10 combinations of two interfering RNAs can be used. When several combinations of two interfering RNAs are used in the above methods, they can be mixed or not, i.e. introduced into the cell at the same moment or not. In another embodiment, one combination of two interfering RNAs can be used at the same time as one cDNA overexpressing another gene. In another embodiment, one combination of two interfering RNAs can be used at the same time as at least one small compound.

[0135] The invention further pertains to a viral vector for producing the interfering RNA according to the invention, wherein said viral vector comprises a polynucleotide sequence encoding the sense RNA nucleic acid of said interfering RNA and a polynucleotide sequence encoding the antisense RNA nucleic acid of said interfering RNA.

[0136] In such vectors, the polynucleotide sequence encoding the sense RNA nucleic acid may be under the control of a first promoter, and the polynucleotide sequence encoding the antisense RNA nucleic acid may be under the control of a second promoter. These promoters may for example be selected from the group consisting of an inducible promoter, a tissue specific promoter and a RNA polymerase III promoter.

[0137] Alternatively, when the sense and the antisense nucleic acids are covalently linked by a hairpin sequence to produce a shRNA molecule, they are under the control of a single promoter.

[0138] Another aspect of the invention is drawn to an isolated DNA polynucleotide coding for the interfering RNA according to the invention, wherein said DNA polynucleotide comprises a polynucleotide sequence encoding the sense RNA nucleic acid of said interfering RNA and a polynucleotide sequence encoding the antisense RNA nucleic acid of said interfering RNA. In such a DNA polynucleotide, the sense and the antisense nucleic acids may be covalently linked by a hairpin sequence to produce a shRNA molecule upon transcription.

[0139] Still another aspect of the invention relates to a plasmidic vector comprising the DNA polynucleotide according to the invention.

[0140] Such a plasmidic vector preferably comprises a promoter, wherein the polynucleotide sequence encoding the sense RNA nucleic acid is under control of said promoter. Said promoter may for example be selected from the group consisting of an inducible promoter, a tissue specific promoter and a RNA polymerase III promoter.

[0141] In a fourth main aspect, the present invention encompasses cells in which double-strand break-induced mutagenesis is modulated. It refers, as a non-limiting example, to an isolated cell obtained and/or obtainable by the method according to the present invention. Cells in which double-strand break-induced mutagenesis is increased are useful for genome engineering, including therapeutic applications and cell line engineering.

[0142] The invention therefore relates to an isolated cell obtained and/or obtainable by the methods according to the invention as defined in the above paragraphs. As shown in example 3, a cellular model has been established which stably expresses at an endogenous locus (RAG1) the construction that allows measurement of GS engineered meganuclease-induced mutagenesis. Moreover, in this cell line, different siRNAs were shown to increase GS engineered meganuclease-induced mutagenesis via a reporter signal. According to this fourth aspect of the invention, a cell in which endonuclease-induced mutagenesis is increased can be directly or indirectly derived from this cellular model.

[0143] The invention further relates to a cell, wherein said cell is stably transformed with at least one interfering RNA, viral vector, isolated DNA polynucleotide or plasmidic vector as described in the previous paragraphs.

[0144] The eukaryotic cell can be any type of cell such as e.g. a CHO cell (for example a CHO-K1 or a CHO-S cell), a HEK293 cell, a Caco2 cell, a U2-OS cell, a NIH 3T3 cell, a NS0 cell, a SP2 cell, and a DG44 cell.

[0145] In a preferred embodiment, the cell is a cell suitable for production of recombinant proteins.

[0146] Said cell is preferably an immortalized and/or a transformed cell, although primary cells are contemplated by the present invention, in particular in the frame of gene therapy.

[0147] In a fifth main aspect, the present invention also relates to compositions and kits comprising the interfering agents, polynucleotide derivatives, vectors and cells according to the present invention.

[0148] The invention further pertains to compositions and kits comprising the iRNAs, DNA polynucleotides, cDNAs, vectors and cells according to the invention described hereabove.

[0149] In this aspect of the invention, the present invention concerns a composition for modulating double-strand break-induced mutagenesis in a cell, wherein said composition comprises at least an interfering agent that modulates an effector from a group of effectors representative of an entire eukaryotic transcriptome. In a preferred embodiment, said interfering agent modulates an effector which is part of a restricted library representative of certain classes of effectors. In a most preferred embodiment, said interfering agent modulates an effector from the group listed in Table I, Table II, Table IV, Table V and Table VII.

[0150] In a preferred embodiment of this aspect of the invention, the invention pertains to a composition for increasing mutagenesis and/or endonuclease-induced mutagenesis in a cell comprising at least one interfering RNA, viral vector, isolated DNA polynucleotide or plasmidic vector as defined in the above paragraphs, and/or an isolated cell as defined in the above paragraphs.

[0151] The composition preferably further comprises a carrier. The carrier can for example be a buffer, such as e.g. a buffer allowing storage of the iRNAs, DNA polynucleotides, vectors and cells according to the invention, or a pharmaceutically acceptable carrier.

[0152] In another aspect of the invention, the present invention concerns a kit for modulating double-strand break-induced mutagenesis in a cell, wherein said kit comprises at least an interfering agent that modulates an effector from a group of effectors representative of an entire eukaryotic transcriptome. In a preferred embodiment, said interfering agent modulates an effector which is part of a restricted library representative of certain classes of effectors. In a most preferred embodiment, said interfering agent modulates an effector from the group listed in Tables I, II, IV, V and VII.

[0153] In a preferred embodiment of this aspect of the invention, the invention also pertains to a kit for increasing mutagenesis and/or endonuclease-induced mutagenesis in a cell, wherein said kit comprises at least one interfering RNA, viral vector, isolated DNA polynucleotide or plasmidic vector as defined in the above paragraphs, and/or an isolated eukaryotic cell as defined in the above paragraphs.

[0154] The kit may further comprise instructions for use in increasing mutagenesis efficiency and/or for use in increasing endonuclease-induced mutagenesis.

[0155] In a sixth main aspect, the present invention concerns the uses of specific interfering agents for modulating double-strand break-induced mutagenesis in a cell, wherein said interfering agent modulates an effector from a group of effectors representative of an entire eukaryotic transcriptome. In a preferred embodiment, said interfering agent modulates an effector which is part of a restricted library representative of certain classes of effectors. In a most preferred embodiment, said interfering agent modulates an effector from the group listed in Tables I, II, IV, V and VII. In a preferred embodiment of this sixth aspect, the present invention concerns the uses of specific polynucleotide derivatives identified for effector genes, which increase double-strand break-induced mutagenesis efficiency.

[0156] Indeed, the polynucleotide derivatives according to the invention, which include the iRNAs, DNA polynucleotides, cDNAs and vectors described hereabove, can be used to increase mutagenesis in a cell and/or to increase double-strand break-induced mutagenesis in a cell.

[0157] Therefore, an aspect of the invention is directed to an in vitro or ex vivo use of at least one interfering agent, such as but not limited to an interfering RNA, DNA polynucleotide, viral vector or plasmidic vector as defined in the above paragraphs, for increasing mutagenesis in a cell and/or endonuclease-induced mutagenesis in a cell, tissue or organ.

[0158] Modulating double-strand break-induced mutagenesis is also useful in animal models, for which it is often desired to construct knock-in or knock-out animals, as a non-limiting example.

[0159] Therefore, the invention relates to the use of specific interfering agents for modulating double-strand break-induced mutagenesis in a non-human model, wherein said interfering agent modulates an effector from a group of effectors representative of an entire eukaryotic transcriptome. In a preferred embodiment, said interfering agent modulates an effector which is part of a restricted library representative of certain classes of effectors. In a most preferred embodiment, said interfering agent modulates an effector from the group listed in Tables I, II, IV, V and VII. The invention also relates to the use of an interfering RNA according to the invention for increasing mutagenesis efficiency and/or endonuclease-induced mutagenesis in a non-human animal model. The animal models thus obtained are also part of the invention.

[0160] It is further desirable to modulate double-strand break-induced mutagenesis or endonuclease-induced mutagenesis in the frame of treatments of subjects by therapy.

[0161] Therefore, the invention further pertains to an interfering agent according to the invention for use as a medicament.

[0162] A preferred embodiment of the invention is drawn to an interfering agent or an interfering RNA according to the invention for use as an adjuvant in the treatment of a genetic disease by gene therapy. For purposes of therapy, an interfering agent or an interfering RNA according to the invention can be administered with a DSB-creating agent with a pharmaceutically acceptable excipient in a therapeutically effective amount. Such a combination is said to be administered in a "therapeutically effective amount" if the amount administered is physiologically significant. An agent is physiologically significant if its presence results in a detectable change in the physiology of the recipient. In the present context, an agent is physiologically significant if its presence results in a decrease in the severity of one or more symptoms of the targeted disease and in a genome correction of the lesion or abnormality. (See Current Protocols in Human Genetics: Chapter 12 "Vectors For Gene Therapy" & Chapter 13 "Delivery Systems for Gene Therapy"). In other words, the term adjuvant refers to a compound administered in addition to the active principle aiming at treating the patient, said adjuvant increasing the efficiency of the treatment. In a preferred embodiment, said interfering agent according to the invention can be administered at the same time as a DSB-creating agent. In another preferred embodiment, said interfering agent according to the invention can be administered before a DSB-creating agent; in another preferred embodiment, said interfering agent according to the invention can be administered after a DSB-creating agent.

[0163] Gene therapy is a technique for the treatment of genetic disorders in man whereby the absent or faulty gene is replaced by a working gene, so that the body can make the correct enzyme or protein and consequently eliminate the root cause of the disease.

[0164] In the present case, the interfering agent, such as but not limited to an interfering RNA, modulates the endonuclease-induced mutagenesis to increase the efficiency of the treatment by gene therapy.

[0165] Examples of genetic disorders that can be treated by gene therapy include but are not limited to the Lesch-Nyhan syndrome, retinoblastoma, thalassaemia, the sickle cell disease, adenosine deaminase-deficiency, severe combined immune deficiency (SCID), Huntington's disease, adrenoleukodystrophy, the Angelman syndrome, the Canavan disease, the Celiac disease, the Charcot-Marie-Tooth disease, color blindness, Cystic fibrosis, the Down syndrome, Duchenne muscular dystrophy, Haemophilia, the Klinefelter's syndrome, Neurofibromatosis, Phenylketonuria, the Prader-Willi syndrome, the Sickle-cell disease, the Tay-Sachs disease and the Turner syndrome.

[0166] As a non-limiting example, an interfering agent according to the invention can be used to modulate the endonuclease-induced mutagenesis in the treatment of a genetic disorder where a dominant nonfunctional allele is targeted by at least one DSB-creating agent to knock out such dominant nonfunctional allele; in this case, an interfering agent according to the invention is used to increase the endonuclease-induced mutagenesis in the treatment of a genetic disorder.

[0167] As another non-limiting example, an interfering agent according to the invention can be used to modulate the endonuclease-induced mutagenesis in the treatment of a genetic disorder where an absent or faulty gene is targeted by at least one DSB-creating agent and replaced by a working gene via gene targeting for example; in this case, an interfering agent according to the invention is used to decrease the endonuclease-induced mutagenesis in the treatment of a genetic disorder.

[0168] An interfering agent according to the present invention may also be used in cancer therapy. A way to improve the killing of cancer cells can be to increase their mutagenesis rate using an interfering agent according to the invention, either in association with radiotherapy, as a non-limiting example, or by increasing endonuclease-induced mutagenesis according to the invention. As known in the art, radiotherapy is also called radiation therapy. This approach allows the treatment of cancers and other diseases with ionizing radiation that injures or destroys cancer cells in the area being treated by damaging their genetic material. The approach according to the present invention makes it possible to improve such radiotherapeutic treatments by increasing the mutagenesis rate in the cells of the treated area, either by adding in the treated cells an interfering agent according to the invention and/or targeting a gene with a specific endonuclease, thereby obtaining cancer cells with increased rate of mutagenesis and increased rate of mortality. In a parallel approach, an interfering agent according to the present invention may also be used to improve cancer treatment by chemotherapy.

DEFINITIONS

[0169] The terms "effector" and "effectors" refer to any cellular target, of nucleic or protein origin, that can be targeted to directly or indirectly modulate double-strand break-induced mutagenesis; it encompasses any molecule that binds to nucleic acid to modulate gene transcription or protein translation, any molecule that binds to another protein to alter or modify at least one property of that protein, such as its activity, or any gene or gene products that could play a role directly or indirectly in the process of double-strand break-induced mutagenesis.

[0170] The terms "interfering agent" and "interfering agents" refer to any molecule or compound likely to interact with effectors. They encompass small chemicals, small molecules, or small compounds, composite chemicals or molecules, from synthetic or natural origin, encompassing amino acids or nucleic acid derivatives, synthons, Active Pharmaceutical Ingredients, any chemical of industrial interest, used in the manufacturing of drugs, industrial chemicals or agricultural products. These interfering agents are part or not of molecular libraries dedicated to particular screening, commercially available or not. These interfering agents encompass polynucleotide derivatives as a non-limiting example.

[0171] The term "endonuclease" refers to any wild-type or variant enzyme capable of catalyzing the hydrolysis (cleavage) of bonds between nucleic acids within a DNA or RNA molecule, preferably a DNA molecule. Endonucleases do not cleave the DNA or RNA molecule irrespective of its sequence, but recognize and cleave the DNA or RNA molecule at specific polynucleotide sequences, further referred to as "target sequences" or "target sites". Endonucleases can be classified as rare-cutting endonucleases when having typically a polynucleotide recognition site of about 12-45 base pairs (bp) in length, more preferably of 14-45 bp. Rare-cutting endonucleases significantly increase HR by inducing DNA double-strand breaks (DSBs) at a defined locus (Rouet, Smih et al. 1994; Rouet, Smih et al. 1994; Choulika, Perrin et al. 1995; Pingoud and Silva 2007). Rare-cutting endonucleases can for example be a homing endonuclease (Paques and Duchateau 2007), a chimeric Zinc-Finger nuclease (ZFN) resulting from the fusion of engineered zinc-finger domains with the catalytic domain of a restriction enzyme such as FokI (Porteus and Carroll 2005) or a chemical endonuclease (Eisenschmidt, Lanio et al. 2005; Arimondo, Thomas et al. 2006; Simon, Cannata et al. 2008). In chemical endonucleases, a chemical or peptidic cleaver is conjugated either to a polymer of nucleic acids or to another DNA recognizing a specific target sequence, thereby targeting the cleavage activity to a specific sequence. Chemical endonucleases also encompass synthetic nucleases like conjugates of orthophenanthroline, a DNA cleaving molecule, and triplex-forming oligonucleotides (TFOs), known to bind specific DNA sequences (Kalish and Glazer 2005). Such chemical endonucleases are comprised in the term "endonuclease" according to the present invention. Rare-cutting endonucleases can also be for example TALENs, a new class of chimeric nucleases using a FokI catalytic domain and a DNA binding domain derived from Transcription Activator Like Effector (TALE), a family of proteins used in the infection process by plant pathogens of the Xanthomonas genus (Boch, Scholze et al. 2009; Moscou and Bogdanove 2009; Christian, Cermak et al. 2010; Li, Huang et al. 2010).

[0172] A rare-cutting endonuclease can be a homing endonuclease, also known under the name of meganuclease. Such homing endonucleases are well-known to the art (Stoddard 2005). Homing endonucleases recognize a DNA target sequence and generate a single- or double-strand break. Homing endonucleases are highly specific, recognizing DNA target sites ranging from 12 to 45 base pairs (bp) in length, usually ranging from 14 to 40 bp in length. The homing endonuclease according to the invention may for example correspond to a LAGLIDADG endonuclease, to a HNH endonuclease, or to a GIY-YIG endonuclease.

[0173] In the wild, meganucleases are essentially represented by homing endonucleases. Homing Endonucleases (HEs) are a widespread family of natural meganucleases including hundreds of protein families (Chevalier and Stoddard 2001). These proteins are encoded by mobile genetic elements which propagate by a process called "homing": the endonuclease cleaves a cognate allele from which the mobile element is absent, thereby stimulating a homologous recombination event that duplicates the mobile DNA into the recipient locus. Given their exceptional cleavage properties in terms of efficacy and specificity, they could represent ideal scaffolds to derive novel, highly specific endonucleases. HEs belong to four major families. The LAGLIDADG family, named after a conserved peptidic motif involved in the catalytic center, is the most widespread and the best characterized group. Seven structures are now available. Whereas most proteins from this family are monomeric and display two LAGLIDADG motifs, a few have only one motif, and thus dimerize to cleave palindromic or pseudo-palindromic target sequences.

[0174] Although the LAGLIDADG peptide is the only conserved region among members of the family, these proteins share a very similar architecture. The catalytic core is flanked by two DNA-binding domains with a perfect two-fold symmetry for homodimers such as I-CreI (Chevalier, Monnat et al. 2001), I-MsoI (Chevalier, Turmel et al. 2003) and I-CeuI (Spiegel, Chevalier et al. 2006) and with a pseudo symmetry for monomers such as I-SceI (Moure, Gimble et al. 2003), I-DmoI (Silva, Dalgaard et al. 1999) or I-AniI (Bolduc, Spiegel et al. 2003). Both monomers and both domains (for monomeric proteins) contribute to the catalytic core, organized around divalent cations. Just above the catalytic core, the two LAGLIDADG peptides also play an essential role in the dimerization interface. DNA binding depends on two typical saddle-shaped αββαββα folds, sitting on the DNA major groove. Other domains can be found, for example in inteins such as PI-PfuI (Ichiyanagi, Ishino et al. 2000) and PI-SceI (Moure, Gimble et al. 2002), whose protein splicing domain is also involved in DNA binding.

[0175] The making of functional chimeric meganucleases, by fusing the N-terminal I-DmoI domain with an I-CreI monomer (Chevalier, Kortemme et al. 2002; Epinat, Arnould et al. 2003; International PCT Applications WO 03/078619 (Cellectis) and WO 2004/031346 (Fred Hutchinson Cancer Research Center, Stoddard et al.)), has demonstrated the plasticity of LAGLIDADG proteins.

[0176] Different groups have also used a semi-rational approach to locally alter the specificity of I-CreI (Seligman, Stephens et al. 1997; Sussman, Chadsey et al. 2004; International PCT Applications WO 2006/097784, WO 2006/097853, WO 2007/060495 and WO 2007/049156 (Cellectis); Arnould, Chames et al. 2006; Rosen, Morrison et al. 2006; Smith, Grizot et al. 2006), I-SceI (Doyon, Pattanayak et al. 2006), PI-SceI (Gimble, Moure et al. 2003) and I-MsoI (Ashworth, Havranek et al. 2006).

[0177] In addition, hundreds of I-CreI derivatives with locally altered specificity were engineered by combining the semi-rational approach and High Throughput Screening:

[0178] Residues Q44, R68 and R70 or Q44, R68, D75 and I77 of I-CreI were mutagenized and a collection of variants with altered specificity at positions ±3 to 5 of the DNA target (5NNN DNA target) were identified by screening (International PCT Applications WO 2006/097784 and WO 2006/097853 (Cellectis); Arnould, Chames et al. 2006; Smith, Grizot et al. 2006).

[0179] Residues K28, N30 and Q38 or N30, Y33 and Q38 or K28, Y33, Q38 and S40 of I-CreI were mutagenized and a collection of variants with altered specificity at positions ±8 to 10 of the DNA target (10NNN DNA target) were identified by screening (Arnould, Chames et al. 2006; Smith, Grizot et al. 2006; International PCT Applications WO 2007/060495 and WO 2007/049156 (Cellectis)).

[0180] Two different variants were combined and assembled in a functional heterodimeric endonuclease able to cleave a chimeric target resulting from the fusion of two different halves of each variant DNA target sequence (Arnould, Chames et al. 2006; Smith, Grizot et al. 2006; International PCT Applications WO 2006/097854 and WO 2007/034262).

[0181] Furthermore, residues 28 to 40 and 44 to 77 of I-CreI were shown to form two partially separable functional subdomains, able to bind distinct parts of a homing endonuclease target half-site (Smith, Grizot et al. 2006; International PCT Applications WO 2007/049095 and WO 2007/057781 (Cellectis)).
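Two properties mentioned in the definitions above lend themselves to a short illustration: the recognition site length of about 12-45 bp used to classify rare-cutting endonucleases, and the palindromic or pseudo-palindromic targets cleaved by dimeric LAGLIDADG proteins. The Python sketch below checks a DNA sequence for both. The function names are hypothetical, the example target is built arbitrarily from a half-site and its reverse complement rather than taken from the application, and the `max_mismatches` tolerance for "pseudo-palindromic" targets is an assumption, since the text does not give a threshold.

```python
# Illustrative sketch only; names and example sequences are hypothetical.

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def is_rare_cutter_site(site: str, low: int = 12, high: int = 45) -> bool:
    """Check the 12 to 45 bp recognition site length given in the text."""
    return low <= len(site) <= high


def palindrome_mismatches(site: str) -> int:
    """Count positions where a site differs from its own reverse complement.

    The comparison is symmetric, so one non-complementary pair of positions
    contributes two mismatches (one in each half of the site).
    """
    site = site.upper()
    return sum(a != b for a, b in zip(site, reverse_complement(site)))


def is_palindromic(site: str, max_mismatches: int = 0) -> bool:
    """True if the site equals its reverse complement within the tolerance.

    max_mismatches > 0 is an assumed way to model pseudo-palindromic targets.
    """
    return palindrome_mismatches(site) <= max_mismatches


# Arbitrary 22 bp palindromic target built from an 11 bp half-site.
half_site = "CAAAACGTCGT"
palindromic_target = half_site + reverse_complement(half_site)

print(palindromic_target,
      is_rare_cutter_site(palindromic_target),
      is_palindromic(palindromic_target))
```

A typical six-base restriction site such as GAATTC is also palindromic but falls outside the 12-45 bp range, which is the distinction the length check is meant to capture.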

[0182] The combination of mutations from the two subdomains of I-CreI within the same monomer allowed the design of novel chimeric molecules (homodimers) able to cleave a palindromic combined DNA target sequence comprising the nucleotides at positions ±3 to 5 and ±8 to 10 which are bound by each subdomain (Smith, Grizot et al. 2006; International PCT Applications WO 2007/049095 and WO 2007/057781 (Cellectis)).

[0183] The method for producing meganuclease variants and the assays based on cleavage-induced recombination in mammal or yeast cells, which are used for screening variants with altered specificity, are described in the International PCT Application WO 2004/067736 (Epinat, Arnould et al. 2003; Chames, Epinat et al. 2005; Arnould, Chames et al. 2006). These assays result in a functional LacZ reporter gene which can be monitored by standard methods.

[0184] The combination of the two former steps allows a larger combinatorial approach, involving four different subdomains. The different subdomains can be modified separately and combined to obtain an entirely redesigned meganuclease variant (heterodimer or single-chain molecule) with chosen specificity. In a first step, couples of novel meganucleases are combined in new molecules ("half-meganucleases") cleaving palindromic targets derived from the target one wants to cleave. Then, the combination of such "half-meganucleases" can result in a heterodimeric species cleaving the target of interest. The assembly of four sets of mutations into heterodimeric endonucleases cleaving a model target sequence or a sequence from different genes has been described in the following Cellectis International patent applications: XPC gene (WO2007/093918), RAG gene (WO2008/010093), HPRT gene (WO2008/059382), beta-2 microglobulin gene (WO2008/102274), Rosa26 gene (WO2008/152523), Human hemoglobin beta gene (WO2009/13622) and Human interleukin-2 receptor gamma chain gene (WO2009019614).

[0185] These variants can be used to cleave genuine chromosomal sequences and have paved the way for novel perspectives in several fields, including gene therapy.

[0186] Examples of such endonuclease include I-Sce I, I-Chu I, I-Cre I, I-Csm I, PI-Sce I, PI-Tli I, PI-Mtu I, I-Ceu I, I-Sce II, I-Sce III, HO, PI-Civ I, PI-Ctr I, PI-Aae I, PI-Bsu I, PI-Dha I, PI-Dra I, PI-Mav I, PI-Mch I, PI-Mfu I, PI-Mfl I, PI-Mga I, PI-Mgo I, PI-Min I, PI-Mka I, PI-Mle I, PI-Mma I, PI-Msh I, PI-Msm I, PI-Mth I, PI-Mtu I, PI-Mxe I, PI-Npu I, PI-Pfu I, PI-Rma I, PI-Spb I, PI-Ssp I, PI-Fac I, PI-Mja I, PI-Pho I, PI-Tag I, PI-Thy I, PI-Tko I, PI-Tsp I, I-MsoI.

[0187] A homing endonuclease can be a LAGLIDADG endonuclease such as I-SceI, I-CreI, I-CeuI, I-MsoI, and I-DmoI.

[0188] Said LAGLIDADG endonuclease can be I-Sce I, a member of the family that contains two LAGLIDADG motifs and functions as a monomer, its molecular mass being approximately twice the mass of other family members like I-CreI, which contains only one LAGLIDADG motif and functions as a homodimer.

[0189] Endonucleases mentioned in the present application encompass both wild-type (naturally-occurring) and variant endonucleases. Endonucleases according to the invention can be a "variant" endonuclease, i.e. an endonuclease that does not naturally exist in nature and that is obtained by genetic engineering or by random mutagenesis, i.e. an engineered endonuclease. This variant endonuclease can for example be obtained by substitution of at least one residue in the amino acid sequence of a wild-type, naturally-occurring, endonuclease with a different amino acid. Said substitution(s) can for example be introduced by site-directed mutagenesis and/or by random mutagenesis. In the frame of the present invention, such variant endonucleases remain functional, i.e. they retain the capacity of recognizing and specifically cleaving a target sequence to initiate gene targeting process.

[0190] The variant endonuclease according to the invention cleaves a target sequence that is different from the target sequence of the corresponding wild-type endonuclease. Methods for obtaining such variant endonucleases with novel specificities are well-known in the art.

[0191] Endonuclease variants may be homodimers (meganuclease comprising two identical monomers) or heterodimers (meganuclease comprising two non-identical monomers). It is understood that the scope of the present invention also encompasses endonuclease variants per se, including heterodimers (WO2006097854), obligate heterodimers (WO2008093249) and single chain meganucleases (WO03078619 and WO2009095793) as non-limiting examples, able to cleave one target of interest in a polynucleotide sequence or in a genome. The invention also encompasses hybrid variant per se composed of two monomers from different origins (WO03078619).

[0192] Endonucleases with novel specificities can be used in the method according to the present invention for gene targeting and thereby integrating a transgene of interest into a genome at a predetermined location.

[0193] Endonucleases according to the invention or rare-cutting endonucleases according to the invention can be mentioned or defined as one double-strand break creating agent amongst other double-strand break creating agents well known in the art. Double-strand break creating agent means any agent or chemical or molecule able to create DNA (or double-stranded nucleic acids) double-strand breaks (DSBs). As previously mentioned, endonucleases can be considered as double-strand break creating agents targeting specific DNA sequences, in other terms, a double-strand break creating agent targeting a double-strand break creating agent target site. Under "double-strand break creating agent" are also encompassed variants or derivatives of endonucleases such as engineered variants or engineered derivatives of meganucleases, zinc-finger nucleases or TALENs; these variants or derivatives can be chimeric rare-cutting endonucleases, i.e. fusion proteins comprising additional protein catalytic domains, displaying one or several enzymatic activities amongst nuclease, endonuclease or exonuclease, or a fusion protein with a polymerase activity, a kinase activity, a phosphatase activity, a methylase activity, a topoisomerase activity, an [...] activity, a [...] activity, a ligase activity, a helicase activity, or a recombinase activity, as non-limiting examples, or fusion proteins with other proteins implicated in DNA processing. In a more precise non-limiting example, said "double-strand break creating agent" according to the present invention can be a fusion protein between a single-chain meganuclease obtained according to previously published methods (Grizot et al. 2009) and an exonuclease Trex2 as shown in example 4.

[0194] Other agents or chemicals or molecules are double-strand break creating agents whose DNA sequence targets are non-specific or non-predictable such as, in a non-limiting list, alkylating agents (Methyl Methane Sulfonate or dimethane sulfonates family and analogs), Zeocin, enzyme inhibitors such as topoisomerase inhibitors (types I and II, such as, as non-limiting examples, quinolones, fluoroquinolones, ciprofloxacin, irinotecan, lamellarin D, doxorubicin, etoposide) and ionizing radiations (X-rays, ultraviolet, gamma-rays).

[0195] The term "reporter gene", as used herein, refers to a nucleic acid sequence whose product can be easily assayed, for example, colorimetrically as an enzymatic reaction product, such as the lacZ gene which encodes β-galactosidase. Examples of widely-used reporter molecules include enzymes such as β-galactosidase, β-glucuronidase, β-glucosidase; luminescent molecules such as green fluorescent protein and firefly luciferase; and auxotrophic markers such as His3p and Ura3p. (See, e.g., Chapter 9 in Ausubel, F. M., et al. Current Protocols in Molecular Biology, John Wiley & Sons, Inc. (1998)). The expressions "inactive reporter gene" or "reporter gene rendered inactive" refer to a reporter gene wherein one part of said reporter gene has been replaced for the purpose of the present invention; in this inactive state, said inactive reporter gene is not capable of emitting any relevant detectable signal upon transfection in a cell. In the present invention, reporter genes such as Luciferase and Green Fluorescent Protein genes have been rendered inactive by, respectively, the introduction of a frameshift mutation in their respective coding sequence. Said frameshift mutation can be due, as a non-limiting example, to the introduction in said coding sequence of a target sequence for an endonuclease. Upon cellular co-transfection of said inactive reporter gene and endonuclease, said endonuclease provokes a double-strand break in its target that is repaired by NHEJ, leading to a functional restoration of said reporter gene. The expressions "functional restoration" of a reporter gene or "functional reporter gene" refer to the recovering of a reporter gene capable of emitting a relevant detectable signal upon transfection in a cell. "RNA interference" refers to a sequence-specific post-transcriptional gene silencing mechanism triggered by dsRNA, during which process the target RNA is degraded. RNA degradation occurs in a sequence-specific manner rather than by a sequence-independent dsRNA response, like PKR response.

[0196] The terms "interfering RNA" and "iRNA" refer to double stranded RNAs capable of triggering RNA interference of a gene. The gene thus silenced is defined as the gene targeted by the iRNA. Interfering RNAs include, e.g., siRNAs and shRNAs; an interfering RNA is also an interfering agent as described above.

[0197] "iRNA-expressing construct" and "iRNA construct" are generic terms which include small interfering RNAs (siRNAs), shRNAs and other RNA species, and which can be cleaved in vivo to form siRNAs. As mentioned before, it has been shown that the enzyme Dicer cleaves long dsRNAs into short-interfering RNAs (siRNAs) of approximately 21-23 nucleotides. One of the two siRNA strands is then incorporated into an RNA-induced silencing complex (RISC). RISC compares these "guide RNAs" to RNAs in the cell and efficiently cleaves target RNAs containing sequences that are perfectly, or nearly perfectly, complementary to the guide RNA. "iRNA construct" also includes nucleic acid preparation designed to achieve an RNA interference effect, such as expression vectors capable of giving rise to transcripts which form dsRNAs or hairpin RNA in cells, and/or transcripts which can produce siRNAs in vivo.

[0198] A "short interfering RNA" or "siRNA" comprises an RNA duplex (double-stranded region) and can further comprise one or two single-stranded overhangs, 3' or 5' overhangs. Each molecule of the duplex can comprise between 17 and 29 nucleotides, including 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, and 29 nucleotides. siRNAs can additionally be chemically modified.

[0199] "MicroRNAs" or "miRNAs" are endogenously encoded RNAs that are about 22 nucleotides long, that post-transcriptionally regulate target genes and are generally expressed in a highly tissue-specific or developmental-stage-specific fashion. More than 200 distinct miRNAs have been identified in plants and animals. These small regulatory RNAs are believed to serve important biological functions by two predominant modes of action: (1) by repressing the translation of target mRNAs, and (2) through RNA interference, that means cleavage and degradation of mRNAs. In this latter case, miRNAs function analogously to siRNAs. miRNAs are first transcribed as part of a long, largely single-stranded primary transcript (pri-miRNA) (Lee, Jeon et al. 2002). This pri-miRNA transcript is generally, and possibly invariably, synthesized by RNA polymerase II and therefore is polyadenylated and may be spliced. It contains an approximately 80-nucleotide-long hairpin structure that encodes the mature, approximately 22-nucleotide miRNA as part of one arm of the stem. In animal cells, this primary transcript is cleaved by a nuclear RNaseIII-type enzyme called Drosha (Lee et al., 2003, Nature 425:415-419) to liberate a hairpin miRNA precursor, or pre-miRNA, about 65 nucleotides long. This pre-miRNA is then exported to the cytoplasm by exportin-5 and the GTP-bound form of the Ran cofactor (Yi, Qin et al. 2003). Once in the cytoplasm, the pre-miRNA is further processed by Dicer, another RNaseIII enzyme, to produce a duplex about 22 base pairs long that is structurally identical to a siRNA duplex (Hutvagner, McLachlan et al. 2001). The binding of protein components of the RISC, or RISC cofactors, to the duplex results in incorporation of the mature, single-stranded miRNA into a RISC or RISC-like protein complex, while the other strand of the duplex is degraded (Bartel et al., 2004, Cell 116: 281-297).

[0200] Thus, one can design and express artificial miRNAs based on the features of existing miRNA genes. The miR-30 (microRNA30) architecture can be used to express miRNAs (or siRNAs) from RNA polymerase II promoter-based expression plasmids (Zeng, Cai et al. 2005). In some instances the precursor miRNA molecules may include more than one stem-loop structure. The multiple stem-loop structures may be linked to one another through a linker, such as, for example, a nucleic acid linker, a miRNA flanking sequence, other molecules, or some combination thereof.

[0201] A "short hairpin RNA (shRNA)" refers to a segment of RNA that is complementary to a portion of a target gene (complementary to one or more transcripts of a target gene), and has a stem-loop (hairpin) structure, and which can be used to silence gene expression.

[0202] A "stem-loop structure" refers to a nucleic acid having a secondary structure that includes a region of nucleotides which are known or predicted to form a double strand (stem portion) that is linked on one side by a region of predominantly single-stranded nucleotides (loop portion). The term "hairpin" is also used herein to refer to stem-loop structures.

[0203] By "double-strand break-induced target sequence" or "double-strand break creating agent target site", or "DSB creating agent target site" is intended a sequence that is recognized by any double-strand break creating agent.

[0204] The expression "polynucleotide derivatives" refers to polynucleotide sequences that can be deduced and constructed from the respective sequence or a part of the respective sequence of identified-effector genes according to the present invention. These derivatives can refer to mRNAs, siRNAs, dsRNAs, miRNAs, cDNAs. These derivatives can be used directly or as part of a delivery vector or vector/plasmid/construct, by introducing them in an eukaryotic cell to increase gene targeting efficiency and/or endonuclease-induced homologous recombination.

[0205] "Transfection" means "introduction" into a live cell, either in vitro or in vivo, of certain nucleic acid construct, preferably into a desired cellular location of a cell, said nucleic acid construct being functional once in the transfected cell. Such presence of the introduced nucleic acid may be stable or transient. Successful transfection will have an intended effect or a combination of effects on the transfected cell, such as silencing and/or enhancing a gene target and/or triggering target physiological event, like enhancing the frequency of mutagenesis following an endonuclease-induced DSB as a non-limiting example.

[0206] "Modulate" or "modulation" is used to qualify the up- or down-regulation of a pathway like NHEJ consecutive to an endonuclease-induced DSB in particular conditions or not, compared to a control condition, the level of this modulation being measured by an appropriate method. More broadly, it can refer to any phenomenon "modulation" is associated with, like the expression level of a gene, a polynucleotide or derivative thereof (DNA, cDNA, plasmids, RNA, mRNA, interfering RNA), polypeptides, etc.

[0207] "Amino acid residues" in a polypeptide sequence are designated herein according to the one-letter code, in which, for example, Q means Gln or Glutamine residue, R means Arg or Arginine residue and D means Asp or Aspartic acid residue.

[0208] "Amino acid substitution" means the replacement of one amino acid residue with another, for instance the replacement of an Arginine residue with a Glutamine residue in a peptide sequence is an amino acid substitution.

[0209] "Altered/enhanced/increased/improved cleavage activity" refers to an increase in the detected level of meganuclease cleavage activity, see below, against a target DNA sequence by a second meganuclease in comparison to the activity of a first meganuclease against the target DNA sequence. Normally the second meganuclease is a variant of the first and comprises one or more substituted amino acid residues in comparison to the first meganuclease.

[0210] "Nucleotides" are designated as follows: one-letter code is used for designating the base of a nucleoside: a is adenine, t is thymine, c is cytosine, and g is guanine. For the degenerated nucleotides, r represents g or a (purine nucleotides), k represents g or t, s represents g or c, w represents a or t, m represents a or c, y represents t or c (pyrimidine nucleotides), d represents g, a or t, v represents g, a or c, b represents g, t or c, h represents a, t or c, and n represents g, a, t or c.

[0211] By "meganuclease" is intended an endonuclease having a double-stranded DNA target sequence of 12 to 45 bp, more preferably 22 to 24 bp when said meganuclease is an I-CreI variant. Said meganuclease is either a dimeric enzyme, wherein each domain is on a monomer, or a monomeric enzyme comprising the two domains on a single polypeptide.

[0212] By "meganuclease domain" is intended the region which interacts with one half of the DNA target of a meganuclease and is able to associate with the other domain of the same meganuclease which interacts with the other half of the DNA target to form a functional meganuclease able to cleave said DNA target.

[0213] By "meganuclease variant" or "variant" it is intended a meganuclease variant or a "DSB-creating agent" variant, a "rare-cutting endonuclease" variant, a "chimeric rare-cutting endonuclease" variant obtained by replacement of at least one residue in the amino acid sequence of the parent meganuclease or parent "DSB-creating agent", parent "rare-cutting endonuclease", parent "chimeric rare-cutting endonuclease" with a different amino acid.

[0214] By "peptide linker" it is intended to mean a peptide sequence of at least 10 and preferably at least 17 amino acids which links the C-terminal amino acid residue of the first monomer to the N-terminal residue of the second monomer and which allows the two variant monomers to adopt the correct conformation for activity and which does not alter the specificity of either of the monomers for their targets.

[0215] By "subdomain" it is intended the region of a LAGLIDADG homing endonuclease core domain which interacts with a distinct part of a homing endonuclease DNA target half-site.

[0216] By "selection" or "selecting" it is intended to mean the isolation of one or more meganuclease variants based upon an observed specified phenotype, for instance altered cleavage activity. This selection can be of the variant in a peptide form upon which the observation is made or alternatively the selection can be of a nucleotide coding for selected meganuclease variant.

[0217] By "screening" it is intended to mean the sequential or simultaneous selection of one or more meganuclease variant(s) which exhibits a specified phenotype such as altered cleavage activity.

[0218] By "derived from" it is intended to mean a "DSB-creating agent" variant, a "rare-cutting endonuclease" variant, a "chimeric rare-cutting endonuclease" variant or a meganuclease variant which is created from a parent "DSB-creating agent", "rare-cutting endonuclease", "chimeric rare-cutting endonuclease" or meganuclease and hence the peptide sequence of the resulting "DSB-creating agent" variant, "rare-cutting endonuclease" variant, "chimeric rare-cutting endonuclease" variant or meganuclease variant is related to (primary sequence level) but derived from (mutations) the peptide sequence of the parent meganuclease. By "I-CreI" is intended the wild-type I-CreI having the sequence of pdb accession code 1g9y, corresponding to the sequence SEQ ID NO: 7 in the sequence listing.

[0219] By "I-CreI variant with novel specificity" is intended a variant having a pattern of cleaved targets different from that of the parent meganuclease. The terms "novel specificity", "modified specificity", "novel cleavage specificity", "novel substrate specificity", which are equivalent and used indifferently, refer to the specificity of the variant towards the nucleotides of the DNA target sequence. In the present patent application all the I-CreI variants described comprise an additional Alanine after the first Methionine of the wild type I-CreI sequence (SEQ ID NO: 7). These variants also comprise two additional Alanine residues and an Aspartic Acid residue after the final Proline of the wild type I-CreI sequence. These additional residues do not affect the properties of the enzyme and to avoid confusion these additional residues do not affect the numeration of the residues in I-CreI or a variant referred in the present patent application, as these references exclusively refer to residues of the wildtype I-CreI

enzyme (SEQ ID NO: 7) as present in the variant, so for instance residue 2 of I-CreI is in fact residue 3 of a variant which comprises an additional Alanine after the first Methionine.

[0220] By "I-CreI site" is intended a 22 to 24 bp double-stranded DNA sequence which is cleaved by I-CreI. I-CreI sites include the wild-type non-palindromic I-CreI homing site and the derived palindromic sequences such as the sequence 5'-t₋₁₂c₋₁₁a₋₁₀a₋₉a₋₈a₋₇c₋₆g₋₅t₋₄c₋₃g₋₂t₋₁a₊₁c₊₂g₊₃a₊₄c₊₅g₊₆t₊₇t₊₈t₊₉t₊₁₀g₊₁₁a₊₁₂ (SEQ ID NO: 8), also called C1221.

[0221] By "domain" or "core domain" is intended the "LAGLIDADG homing endonuclease core domain" which is the characteristic α1β1β2α2β3β4α3 fold of the homing endonucleases of the LAGLIDADG family, corresponding to a sequence of about one hundred amino acid residues. Said domain comprises four beta-strands (β1β2β3β4) folded in an anti-parallel beta-sheet which interacts with one half of the DNA target. This domain is able to associate with another LAGLIDADG homing endonuclease core domain which interacts with the other half of the DNA target to form a functional endonuclease able to cleave said DNA target. For example, in the case of the dimeric homing endonuclease I-CreI (163 amino acids), the LAGLIDADG homing endonuclease core domain corresponds to the residues 6 to 94.

[0222] By "beta-hairpin" is intended two consecutive beta-strands of the antiparallel beta-sheet of a LAGLIDADG homing endonuclease core domain (β1β2 or β3β4) which are connected by a loop or a turn.

[0223] By "single-chain meganuclease", "single-chain chimeric meganuclease", "single-chain meganuclease derivative", "single-chain chimeric meganuclease derivative" or "single-chain derivative" is intended a meganuclease comprising two LAGLIDADG homing endonuclease domains or core domains linked by a peptidic spacer as described in WO03078619 and WO2009095793. The single-chain meganuclease is able to cleave a chimeric DNA target sequence comprising one different half of each parent meganuclease target sequence. By "SC-GS meganuclease" or "engineered SC-GS meganuclease" is meant an engineered single chain meganuclease as described in WO03078619 and WO2009095793 capable of cleaving a target sequence according to SEQ ID NO: 9, and having a polypeptidic sequence corresponding as a non-limiting example to SEQ ID NO: 4. By "SC-RAG meganuclease" or "meganuclease SC-RAG" or "SC-RAG" is meant an engineered single chain meganuclease as described in WO03078619 and WO2009095793 capable of cleaving a target sequence according to SEQ ID NO: 10, and having a polypeptidic sequence corresponding as a non-limiting example to SEQ ID NO: 11.

[0224] By "DNA target", "DNA target sequence", "target sequence", "target-site", "target", "site", "site of interest", "recognition site", "polynucleotide recognition site", "recognition sequence", "homing recognition site", "homing site", "cleavage site", "endonuclease-specific target site" is intended a 20 to 24 bp double-stranded palindromic, partially palindromic (pseudo-palindromic) or non-palindromic polynucleotide sequence that is recognized and cleaved by a LAGLIDADG homing endonuclease such as I-CreI, or a variant, or a single-chain chimeric meganuclease derived from I-CreI. Said DNA target sequence is qualified as "cleavable" by an endonuclease when recognized within a genomic sequence and known to correspond to the DNA target sequence of a given endonuclease or a variant of such endonuclease. These terms refer to a distinct DNA location, preferably a genomic location, at which a double stranded break (cleavage) is to be induced by the meganuclease. The DNA target is defined by the 5' to 3' sequence of one strand of the double-stranded polynucleotide, as indicated above for C1221. Cleavage of the DNA target occurs at the nucleotides at positions +2 and -2, respectively for the sense and the antisense strand. Unless otherwise indicated, the position at which cleavage of the DNA target by an I-CreI meganuclease variant occurs corresponds to the cleavage site on the sense strand of the DNA target. By "an I-SceI target site" is meant a target sequence for the endonuclease I-SceI; by "an engineered meganuclease target site" is meant a target sequence for a variant endonuclease that has been engineered as previously mentioned and as described in WO2006097854, WO2008093249, WO03078619, WO2009095793, WO03078619 and WO 2004/067736.

[0225] By "DNA target half-site", "half cleavage site" or "half-site" is intended the portion of the DNA target which is bound by each LAGLIDADG homing endonuclease core domain.

[0226] By "chimeric DNA target" or "hybrid DNA target" is intended the fusion of different halves of two parent meganuclease target sequences. In addition at least one half of said target may comprise the combination of nucleotides which are bound by at least two separate subdomains (combined DNA target).

[0227] By "parent meganuclease" it is intended to mean a wild type meganuclease or a variant of such a wild type meganuclease with identical properties or alternatively a meganuclease with some altered characteristic in comparison to a wild type version of the same meganuclease.

[0228] By "delivery vector" or "delivery vectors" is intended any delivery vector which can be used in the present invention to put into cell contact or deliver inside cells or subcellular compartments agents/chemicals and molecules (proteins or nucleic acids) needed in the present invention. It includes, but is not limited to, liposomal delivery vectors, viral delivery vectors, drug delivery vectors, chemical carriers, polymeric carriers, lipoplexes, polyplexes, dendrimers, microbubbles (ultrasound contrast agents), nanoparticles, emulsions or other appropriate transfer vectors. These delivery vectors allow delivery of molecules, chemicals, macromolecules (genes, proteins), or other vectors such as plasmids, peptides developed by Diatos. In these cases, delivery vectors are molecule carriers. By "delivery vector" or "delivery vectors" is also intended delivery methods to perform transfection.

[0229] The terms "vector" or "vectors" refer to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. A "vector" in the present invention includes, but is not limited to, a viral vector, a plasmid, a RNA vector or a linear or circular DNA or RNA molecule which may consist of chromosomal, non chromosomal, semi-synthetic or synthetic nucleic acids. Preferred vectors are those capable of autonomous replication (episomal vector) and/or expression of nucleic acids to which they are linked (expression vectors). Large numbers of suitable vectors are known to those of skill in the art and commercially available.

[0230] Viral vectors include retrovirus, adenovirus, parvovirus (e.g. adenoassociated viruses), coronavirus, negative strand RNA viruses such as orthomyxovirus (e.g., influenza virus), rhabdovirus (e.g., rabies and vesicular stomatitis virus), paramyxovirus (e.g. measles and Sendai), positive strand RNA viruses such as picornavirus and alphavirus, and double-stranded DNA viruses including adenovirus, herpesvirus (e.g. Herpes Simplex virus types 1 and 2, Epstein-Barr virus, cytomegalovirus), and poxvirus (e.g., vaccinia, fowlpox and canarypox). Other viruses include Norwalk virus, togavirus, flavivirus, reoviruses, papovavirus, hepadnavirus, and hepatitis virus, for example. Examples of retroviruses include: avian leukosis-sarcoma, mammalian C-type, B-type viruses, D type viruses, HTLV-BLV group, lentivirus, spumavirus (Coffin, J. M., Retroviridae: The viruses and their replication, In Fundamental Virology, Third Edition, B. N. Fields, et al., Eds., Lippincott-Raven Publishers, Philadelphia, 1996).

[0231] By "lentiviral vector" is meant HIV-based lentiviral vectors that are very promising for gene delivery because of their relatively large packaging capacity, reduced immunogenicity and their ability to stably transduce with high efficiency a large range of different cell types. Lentiviral vectors are usually generated following transient transfection of three (packaging, envelope and transfer) or more plasmids into producer cells. Like HIV, lentiviral vectors enter the target cell through the interaction of viral surface glycoproteins with receptors on the cell surface. On entry, the viral RNA undergoes reverse transcription, which is mediated by the viral [...] complex. The product of reverse transcription is a double-stranded linear viral DNA, which is the substrate for viral integration in the DNA of infected cells.

[0232] By "integrative lentiviral vectors (or LV)" is meant such vectors, as a non-limiting example, that are able to integrate the genome of a target cell. At the opposite, by "non integrative lentiviral vectors (or NILV)" is meant efficient gene delivery vectors that do not integrate the genome of a target cell through the action of the virus integrase.

[0233] One type of preferred vector is an episome, i.e., a nucleic acid capable of extra-chromosomal replication. Preferred vectors are those capable of autonomous replication and/or expression of nucleic acids to which they are linked. Vectors capable of directing the expression of genes to which they are operatively linked are referred to herein as "expression vectors". A vector according to the present invention comprises, but is not limited to, a YAC (yeast artificial chromosome), a BAC (bacterial artificial chromosome), a baculovirus vector, a phage, a phagemid, a cosmid, a viral vector, a plasmid, a RNA vector or a linear or circular DNA or RNA molecule which may consist of chromosomal, non chromosomal, semi-synthetic or synthetic DNA. In general, expression vectors of utility in recombinant DNA techniques are often in the form of "plasmids" which refer generally to circular double-stranded DNA loops which, in their vector form, are not bound to the chromosome. Large numbers of suitable vectors are known to those of skill in the art. Vectors can comprise selectable markers, for example: neomycin phosphotransferase, histidinol dehydrogenase, dihydrofolate reductase, hygromycin phosphotransferase, herpes simplex virus thymidine kinase, adenosine deaminase, glutamine synthetase, and hypoxanthine-guanine phosphoribosyl transferase for eukaryotic cell culture; TRP1 for S. cerevisiae; tetracycline, rifampicin or ampicillin resistance in E. coli. Preferably said vectors are expression vectors, wherein a sequence encoding a polypeptide of interest is placed under control of appropriate transcriptional and translational control elements to permit production or synthesis of said polypeptide. Therefore, said polynucleotide is comprised in an expression cassette. More particularly, the vector comprises a replication origin, a promoter operatively linked to said encoding polynucleotide, a ribosome binding site, a RNA-splicing site (when genomic DNA is used), a polyadenylation site and a transcription termination site. It also can comprise enhancer or silencer elements. Selection of the promoter will depend upon the cell in which the polypeptide is expressed. Suitable promoters include tissue specific and/or inducible promoters. Examples of inducible promoters are: eukaryotic metallothionein promoter which is induced by increased levels of heavy metals, prokaryotic lacZ promoter which is induced in response to isopropyl-β-D-thiogalacto-pyranoside (IPTG) and eukaryotic heat shock promoter which is induced by increased temperature. Examples of tissue specific promoters are skeletal muscle [...], prostate-specific antigen (PSA), α-antitrypsin protease, human surfactant (SP) A and B proteins, β-casein and acidic whey protein genes.

[0234] Inducible promoters may be induced by pathogens or stress, more preferably by stress like cold, heat, UV light, or high ionic concentrations (reviewed in (Potenza, Aleman et al. 2004)). Inducible promoters may be induced by chemicals (reviewed in (Moore, Samalova et al. 2006); (Padidam 2003); (Wang, Zhou et al. 2003); (Zuo and Chua 2000)).

[0235] Delivery vectors and vectors can be associated or combined with any cellular permeabilization techniques such as sonoporation or electroporation or derivatives of these techniques.

[0236] By "cell" or "cells" is intended any prokaryotic or eukaryotic living cells, cell lines derived from these organisms for in vitro cultures, primary cells from animal or plant origin.

[0237] By "primary cell" or "primary cells" are intended cells taken directly from living tissue (i.e. biopsy material) and established for growth in vitro, that have undergone very few population doublings and are therefore more representative of the main functional components and characteristics of tissues from which they are derived, in comparison to continuous tumorigenic or artificially immortalized cell lines. These cells thus represent a more valuable model to the in vivo state they refer to.

[0238] In the frame of the present invention, "eukaryotic cells" refer to a fungal, plant or animal cell or a cell line derived from the organisms listed below and established for in vitro culture. More preferably, the fungus is of the genus Aspergillus, Penicillium, Acremonium, Trichoderma, Chrysosporium, Mortierella, Kluyveromyces or Pichia; more preferably, the fungus is of the species Aspergillus niger, Aspergillus nidulans, Aspergillus oryzae, Aspergillus terreus, Penicillium chrysogenum, Penicillium citrinum, Acremonium chrysogenum, Trichoderma reesei, Mortierella alpina, Chrysosporium lucknowense, Kluyveromyces lactis, Pichia pastoris or Pichia ciferrii.

[0239] More preferably the plant is of the genus Arabidopsis, Nicotiana, Solanum, Lactuca, Brassica, Oryza, Asparagus, Pisum, Medicago, Zea, Hordeum, Secale, Triticum, Capsicum, Cucumis, Cucurbita, Citrullus, Citrus, Sorghum; more preferably, the plant is of the species Arabidopsis thaliana, Nicotiana tabacum, Solanum lycopersicum, Solanum tuberosum, Solanum melongena, Solanum esculentum, Lactuca sativa, Brassica napus, Brassica oleracea, Brassica rapa, Oryza glaberrima, Oryza sativa, Asparagus officinalis, Pisum sativum, Medicago sativa, Zea mays, Hordeum vulgare, Secale cereale, Triticum aestivum, Triticum durum, Capsicum sativus, Cucurbita pepo, Citrullus lanatus, Cucumis melo, Citrus aurantifolia, Citrus maxima, Citrus medica, Citrus reticulata.

[0240] More preferably the animal cell is of the genus Homo, Rattus, Mus, Sus, Bos, Danio, Canis, Felis, Equus, Salmo, Oncorhynchus, Gallus, Meleagris, Drosophila, Caenorhabditis; more preferably, the animal cell is of the species Homo sapiens, Rattus norvegicus, Mus musculus, Sus scrofa, Bos taurus, Danio rerio, Canis lupus, Felis catus, Equus caballus, Salmo salar, Oncorhynchus mykiss, Gallus gallus, Meleagris gallopavo, Drosophila melanogaster, Caenorhabditis elegans.

[0241] By "homologous" is intended a sequence with enough identity to another one to lead to homologous recombination between sequences, more particularly having at least 95% identity, preferably 97% identity and more preferably 99%.

[0242] "Identity" refers to sequence identity between two nucleic acid molecules or polypeptides. Identity can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base, then the molecules are identical at that position. A degree of similarity or identity between nucleic acid or amino acid sequences is a function of the number of identical or matching nucleotides at positions shared by the nucleic acid sequences. By a polynucleotide having a sequence at least, for example, 95% "identical" to a query sequence of the present invention, it is intended that the sequence of the polynucleotide is identical

a reporter gene, leading to an inactive reporter gene that can only be restored after an NHEJ event. In the frame of the present invention, the expression "double-strand break-induced mutagenesis" (DSB-induced mutagenesis) refers to a mutagenesis event consecutive to an NHEJ event following an endonuclease-induced DSB, leading to insertion/deletion at the cleavage site of an endonuclease.

[0244] By "gene" is meant the basic unit of heredity, consisting of a segment of DNA arranged in a linear manner along a chromosome, which codes for a specific protein or segment of protein. A gene typically includes a promoter, a 5' untranslated region, one or more coding sequences (exons), optionally introns, a 3' untranslated region. The gene may further comprise a terminator, enhancers and/or silencers.

[0245] As used herein, the term "transgene" refers to a sequence encoding a polypeptide. Preferably, the polypeptide encoded by the transgene is either not expressed, or expressed but not biologically active, in the cell, tissue or individual in which the transgene is inserted. Most preferably, the transgene encodes a therapeutic polypeptide useful for the treatment of an individual.

[0246] The term "gene of interest" or "GOI" refers to any nucleotide sequence encoding a known or putative gene product.

[0247] As used herein, the term "locus" is the specific physical location of a DNA sequence (e.g. of a gene) on a chromosome. The term "locus" usually refers to the specific physical location of an endonuclease's target sequence on a chromosome.
Such a locus, which comprises a target to the query sequence except that the sequence may include sequence that is recognized and cleaved by an endonuclease up to five nucleotide alterations per each 100 nucleotides of according to the invention, is referred to as "locus according the query sequence. In otherwords, to obtain a polynucleotide to the invention'. Also, the expression "genomic locus of having a sequence at least 95% identical to a query sequence, interest' is used to qualify a nucleic acid sequence in a up to 5% (5 of 100) of the nucleotides of the sequence may be genome that can be a putative target for a double-strand break inserted, deleted, or substituted with another nucleotide. Vari according to the invention. By "endogenous genomic locus of ous alignment algorithms and/or programs may be used to interest' is intended a native nucleic acid sequence in a calculate the identity between two sequences, including genome, i.e., a sequence or allelic variations of this sequence FASTA, or BLAST which are available as a part of the GCG that is naturally present at this genomic locus. It is understood sequence analysis package (University of Wisconsin, Madi that the considered genomic locus of interest of die present son, Wis.), and can be used with, e.g., default setting. The invention can be between two overlapping genes the consid program, which uses the Needleman-Wunsch ered endonuclease's target sequences are located in two dif global alignment algorithm (Needleman and Wunseh 1970) ferent genes. “Genomic locus of interest” in the present appli to find the optimum alignment (including gaps) of two cation, encompasses nuclear genetic material but also a sequences when considering their entire length, may for portion of genetic material that can exist independently to the example be used. The needle program is for example avail main body of genetic material. Such as plasmids, episomes, able on the ebi.ac.uk worldwide web site. 
The percentage of virus, transposons or in organelles Such as mitochondria or identity in accordance with the invention is preferably calcu chloroplasts as non-limiting examples, at which a double lated using the EMBOSS::needle (global) program with a stranded break (cleavage) can be induced by the DSB-creat “Gap Open parameter equal to 10.0, a "Gap Extend param ing agent, i.e endonuclease, rare-cutting endonuclease and/or eter equal to 0.5, and a Blosumó2 matrix. chimeric rare-cutting endonuclease of the invention. 0243. By “mutation' is intended the substitution, deletion, 0248. By “RAG1 locus is intended the RAG1 gene posi insertion of one or more nucleotides/amino acids in a poly tion in a mammalian genome. For example, the human RAG1 nucleotide (cDNA, gene) or a polypeptide sequence. Said gene is available in the NCBI database, under the accession mutation can affect the coding sequence of a gene or its number NC 000011.8 (GeneID:5896) and its locus is posi regulatory sequence. It may also affect the structure of the tioned from position 36546139 to 36557877. genomic sequence or the structure/stability of the encoded 0249. The above written description of the invention pro mRNA. By “frameshift mutation is intended a genetic muta vides a manner and process of making and using it such that tion caused by insertions or deletions in a DNA sequence of a any person skilled in this art is enabled to make and use the number of nucleotides that is not evenly divisible by three. same, this enablement being provided in particular for the Due to the triplet nature of the genetic code, Suchinsertions or Subject matter of the appended claims, which make up a part deletions can change the reading frame of the considered of the original description. gene, resulting in a completely different translation of this 0250) As used above, the phrases “selected from the group gene. 
For the purpose of the present invention, Such frame consisting of “chosen from.” and the like include mixtures shift mutations have been inserted in the coding sequence of of the specified materials. US 2012/0244131 A1 Sep. 27, 2012
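The identity and frameshift definitions given in paragraphs 0242 and 0243 reduce to simple arithmetic, which the following sketch illustrates. It is not part of the application: the function names are hypothetical, the two sequences are assumed to be already globally aligned (for instance by the needle program) and supplied as equal-length strings with "-" marking gaps, and the denominator is assumed to be the full alignment length.

```python
def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same residue.

    Assumption: inputs are pre-aligned, equal-length strings; '-' marks a gap.
    Gap columns count toward the length but never count as matches.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


def is_frameshift(indel_length: int) -> bool:
    """True when an insertion or deletion of this many nucleotides shifts the reading frame."""
    return indel_length % 3 != 0


# A 100-nucleotide query with 5 substituted positions is 95% identical.
query = "ACGT" * 25
subject = "TTTTT" + query[5:]  # positions 1-4 differ, plus the forced change at 0
```

Under these assumptions, a 1- or 2-nucleotide indel at the cleavage site changes the frame, whereas a 3-nucleotide indel does not, which is why only some NHEJ repair events can restore a frame-shifted reporter.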

[0251] Where a numerical limit or range is stated herein, the endpoints are included. Also, all values and subranges within a numerical limit or range are specifically included as if explicitly written out.

[0252] The above description is presented to enable a person skilled in the art to make and use the invention, and is provided in the context of a particular application and its requirements. Various modifications to the preferred embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the invention. Thus, this invention is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.

[0253] Having generally described this invention, a further understanding can be obtained by reference to certain specific examples, which are provided herein for purposes of illustration only, and are not intended to be limiting unless otherwise specified.

EXAMPLES

Example 1

Constructions Monitoring Meganuclease-Induced Mutagenesis

[0254] Plasmids measuring meganuclease-induced mutagenesis at their target site were constructed. They are based on activation of a reporter gene after cleavage of a target by a meganuclease followed by a mutagenic repair of this double strand break (DSB). The present invention uses Luciferase reporter gene and I-SceI and GS meganucleases but other reporter genes such as GFP and other meganucleases can be used.

[0255] a) Materials and Methods

Cell Culture

[0256] Cell line 293H was cultured at 37° C. with 5% CO2 in Dulbecco's modified Eagle's medium (DMEM) Glutamax supplemented with 10% fetal calf serum, 2 mM L-glutamine, 100 UI/ml penicillin, 100 μg/ml streptomycin, 0.25 μg/ml amphotericin B (Fungizone).

Transient Transfection in 96 Well Plate Format

[0257] Twenty thousand cells per well were seeded in white 96 well plates one day before transfection. Per well, cells were transfected with 200 ng of total DNA with 100 ng pCLS6883 (SEQ ID NO: 1) and either 100 ng of pCLS2690 (SEQ ID NO: 3) or 100 ng of pCLS0099 (SEQ ID NO: 12) using 1.35 μl of Polyfect transfection reagent (QIAGEN). Ninety-six hours post transfection 50 μl per well of ONEGlo (Promega) were added, cells were incubated in dark for 3 minutes before luciferase activity analysis (1 second/well) using PHERAStar luminometer (BMG Labtech).

[0258] b) Results:

[0259] The plasmids pCLS6883 (SEQ ID NO: 1) and pCLS6884 (SEQ ID NO: 2) were constructed to quantify NHEJ repair events induced by SC GS or I-SceI respectively. These constructions can be targeted at RAG1 site locus and are presented in FIG. 2.

[0260] The sequence used to measure meganuclease-induced mutagenesis is made of an ATG start codon followed by i) 2 codons for alanine ii) the tag HA sequence iii) GS or I-SceI recognition site iv) a glycine serine stretch v) the same 2 codons for alanine as in i) and finally vi) luciferase reporter gene lacking its ATG start codon. Luciferase reporter gene is inactive due to a frame-shift introduced by GS or I-SceI recognition site. Thus induction of a DNA double strand break (DSB) by SC GS (SEQ ID NO: 4) encoded by vector pCLS2690 (SEQ ID NO: 3) or I-SceI meganuclease (SEQ ID NO: 40) followed by a mutagenic DSB repair by NHEJ can lead to the restoration of Luciferase gene in frame with ATG start codon.

[0261] These sequences were placed in a plasmid used to target the final construct at RAG1 locus using hsRAG1 Integration Matrix CMV Neo from cGPS® Custom Human Full Kit DD (Cellectis Bioresearch).

[0262] The plasmid pCLS6883 (SEQ ID NO: 1) was tested in an extrachromosomal assay by co-transfection with either pCLS2690 (SEQ ID NO: 3) encoding SC GS meganuclease or pCLS0099 (SEQ ID NO: 12) encoding GFP. Luciferase signal was analysed 72 hours post transfection. Co-transfection with pCLS6883 (SEQ ID NO: 1) and pCLS0099 (SEQ ID NO: 12) led to a 1,800 R.L.U. signal whereas co-transfection with pCLS6883 (SEQ ID NO: 1) and pCLS2690 (SEQ ID NO: 3) led to a 20,000 R.L.U. signal. This result demonstrates that the presence of meganuclease SC GS induced luciferase by a factor of 11 and that the pCLS6883 (SEQ ID NO: 1) construct is thus measuring mutagenic NHEJ DSB-repair induced by SC GS.

Example 2

Screening with Extrachromosomal Assay of siRNAs Targeting Genes Coding for Kinases

[0263] The construct measuring SC GS-induced mutagenesis at GS locus with Luciferase reporter gene (pCLS6883, SEQ ID NO: 1) was used to screen in an extrachromosomal assay siRNAs targeting genes coding for kinases. This screen identified 23 siRNAs that led to stimulation of Luciferase signal induced by meganuclease SC GS.

[0264] a) Materials and Methods

Extrachromosomal Screening

[0265] Twenty thousand 293H cells per well were seeded in white 96 well plates one day before transfection. Per well, cells were transfected with 200 ng of total DNA with 100 ng pCLS6883 (SEQ ID NO: 1) and either 100 ng of pCLS2690 (SEQ ID NO: 3) or 100 ng of pCLS0099 (SEQ ID NO: 12) and with siRNA at 33 nM final concentration using 1.35 μl of Polyfect transfection reagent (QIAGEN). The siRNAs kinase set (QIAGEN) is targeting 696 genes with 2 siRNAs per gene. Ninety-six hours post transfection 50 μl per well of ONEGlo (Promega) were added, cells were incubated in dark for 3 minutes before luciferase activity analysis (1 second/well) using PHERAStar luminometer (BMG Labtech).

[0266] b) Results

[0267] Each plate from siRNA kinase set was co-transfected in duplicate in presence of SC GS (pCLS2690, SEQ ID NO: 3, induced condition) or in simplicate in presence of GFP (pCLS0099, SEQ ID NO: 12, not induced condition). Induction by SC GS was monitored by comparison of the Luciferase signal obtained with pCLS2690 (SEQ ID NO: 3) over the one obtained with pCLS0099 (SEQ ID NO: 12). The induction varied from 2 to 11 depending on the transfection.
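The induction ratio reported in Examples 1 and 2, and the plate-wise Z score that paragraph 0268 applies to the same luciferase readings, can be sketched as follows. This is an illustration only: the function names are hypothetical, the plate median is used as the centre because that is what the text states, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import median, stdev


def fold_induction(induced_rlu: float, uninduced_rlu: float) -> float:
    """Ratio of the signal with the meganuclease vector to the signal with the GFP control."""
    if uninduced_rlu <= 0:
        raise ValueError("uninduced signal must be positive")
    return induced_rlu / uninduced_rlu


def plate_z_scores(rlu_values: list[float]) -> list[float]:
    """Z = (x - centre) / sd for every well of one plate.

    The centre is the plate median, as stated in the text; the sample
    standard deviation is an assumption of this sketch.
    """
    centre = median(rlu_values)
    spread = stdev(rlu_values)
    if spread == 0:
        raise ValueError("all wells have the same signal; Z score is undefined")
    return [(x - centre) / spread for x in rlu_values]


# Example 1 figures: 20,000 R.L.U. with SC GS versus 1,800 R.L.U. with GFP.
ratio = fold_induction(20_000, 1_800)  # about 11.1, reported as a factor of 11

# Wells whose Z score exceeds 3 would be called stimulating hits.
plate = [100.0] * 9 + [1000.0]
hits = [i for i, z in enumerate(plate_z_scores(plate)) if z > 3]
```

Centring on the median rather than the mean keeps a few very strong wells from shifting the baseline of the plate, although they still inflate the standard deviation.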

[0268] To normalize the different transfections, a Z score was calculated for each plate with the following equation: $Z = (x - \mu)/\sigma$, where $x$ is the R.L.U. value of a given siRNA, $\mu$ is the median R.L.U. value of the plate and $\sigma$ its standard deviation. A siRNA was considered as a stimulating hit when its Z score value was higher than 3 (FIG. 3).

[0269] This screen led to the identification of 23 positive hits that stimulate luciferase signal by a factor ranging from 2 to 7 (cf. Table I below).

TABLE I
siRNA hits stimulating GS SC-induced luciferase signal

Gene       Gene     siRNA target             SEQ ID   Mean Z          Stimulation
targeted   ID       sequence                 NO:      Score    Std    factor

CSNK1D     1453     CCGGTCTAGGATCGAAATGTT    13       3.14     0.75   3.40
AK2        204      CGGCAGAACCCGAGTATCCTA    14       5.51     0.20   5.08
AKT2       208      CAAGCGTGGTGAATACATCAA    15       3.65     0.23   2.94
CAMK2G     818      GAGGAAGAGATCTATACCCTA    16       5.01     0.23   3.66
GK2        2712     TACGTTAGAAGAGCACTGTAA    17       3.33     1.49   2.75
PFKFB4     5210     CAGAAAGTGTCTGGACTTGTA    18       3.92     1.14   2.18
MAPK12     6300     CTGGACGTATTCACTCCTGAT    19       3.84     0.06   3.22
PRKCE      5581     CCCGACCATGGTAGTGTTCAA    20       4.00     0.43   2.91
EIF2AK2    5610     CGGAAAGACTTACGTTATTAA    21       4.50     0.22   3.15
WEE1       7465     CAGGGTAGATTACCTCGGATA    22       3.20     0.08   5.01
CDK5R1     8851     CCGGAAGGCCACGCTGTTTGA    23       4.01     0.26   6.03
LIG4       3981     CACCGTTTATTTGGACTCGTA    24       4.11     0.41   6.15
AKAP1      8165     AGCGCTGAACTTGATTGGGAA    25       4.97     0.32   7.24
MAP3K6     9064     TCAGAGGAGCTGAGTAATGAA    26       5.99     0.22   5.41
DYRK3      8444     TCGACAGTACGTGGCCCTAAA    27       3.54     0.22   3.61
RPS6KA4    8986     CGCCACCTTCATGGCATTCAA    28       3.56     0.73   3.61
STK17A     9263     CACACTCGTGATGTAGTTCAT    29       3.26     0.43   2.07
GNE        10020    CCCGATCATGTTTGGCATTAA    30       3.31     0.25   2.20
ERN2       10595    CTGGTTCGGCGGGAAGTTCAA    31       3.47     1.47   2.30
HUNK       30811    CACGGGCAAAGTGCCCTGTAA    32       3.63     1.30   1.97
SMG1       23049    CACCATGGTATTACAGGTTCA    33       3.22     0.46   2.05
WNK4       65266    CAGCTTGTTGGGCGTTTCCAA    34       5.58     0.70   4.15
MAGI2      9863     CAGGCCCAACTTGGGATATCA    35       3.07     0.63   2.07

Example 3

Establishment of Cellular Model Measuring Meganuclease-Induced Mutagenesis

[0270] Stable cell lines measuring meganuclease-induced mutagenesis at targeted locus were established. The different constructions were introduced at RAG1 locus in a single copy using cGPS kit. The cell line measuring SC GS-induced mutagenesis can be used to screen an siRNA collection covering 19,121 genes to identify new genes regulating mutagenic DSB repair.

[0271] a) Materials and Methods

Cell Culture

[0272] Cell line 293H was cultured at 37° C. with 5% CO2 in Dulbecco's modified Eagle's medium (DMEM) Glutamax supplemented with 10% fetal calf serum, 2 mM L-glutamine, 100 UI/ml penicillin, 100 μg/ml streptomycin, 0.25 μg/ml amphotericin B (Fungizone). The clones measuring meganuclease-induced mutagenesis were maintained with 200 μg/ml of G418 (Invitrogen).

Stable Transfection to Generate Cell Line Measuring I-Meganuclease-Induced Mutagenic NHEJ Repair

[0273] One million 293H cells were seeded one day prior to transfection, 3 μg of SC RAG encoding vector

(pCLS2222, SEQ ID NO: 36) and 2 μg of plasmid measuring SC GS-induced mutagenic NHEJ repair (pCLS6883, SEQ ID NO: 1) were co-transfected on cells using 25 μl of lipofectamine (Invitrogen) according to the manufacturer's instructions. Three days following transfection, 2000 cells were seeded in 10 cm petri. One week after seeding 400 μg/ml of G418 (Invitrogen) were added on cells. Neomycin resistant clones were transferred in 96 well plate using ClonePix (Genetix) and cultured in presence of 400 μg/ml of G418 (Invitrogen) and 50 μM of Gancyclovir (Sigma). Genomic DNA of Neomycin and Gancyclovir resistant clones was extracted in order to perform a PCR specific of RAG1 targeted integration (cGPS® Custom Human Full Kit DD, Cellectis Bioresearch).

Transient Transfection in 96 Well Plate Format for siRNA Screening

[0274] Twenty thousand cells per well were seeded in white 96 well plates one day before transfection. Per well, cells were transfected with 200 ng of DNA (SC GS encoding vector pCLS2690, SEQ ID NO: 3 or GFP encoding vector pCLS0099, SEQ ID NO: 12) and with or without 33 nM final concentration of siRNA using 1.35 μl of Polyfect transfection reagent (QIAGEN). Seventy two to ninety six hours post transfection 50 μl per well of ONEGlo (Promega) were added, cells were incubated in dark for 3 minutes before luciferase activity analysis (1 second/well) using PHERAStar luminometer (BMG Labtech). siRNAs targeting the gene WRN (SEQ ID NO: 37), MAPK3 (SEQ ID NO: 38), FANCD2 (SEQ ID NO: 39), PRKDC (PRKDC 5, CTCGTGTATTACAGAAGGAAA=SEQ ID NO: 75 and PRKDC 8, GACCCTGTTGACAGTACTTTA=SEQ ID NO: 76) and LIG4 (SEQ ID NO: 24) were used to extinct genes involved in DNA repair or regulation in order to analyze their potential for mutagenic NHEJ stimulation. Moreover, 8 siRNAs identified with an extrachromosomal assay and targeting CAMK2G (SEQ ID NO: 16), SMG1 (SEQ ID NO: 33), PRKCE (SEQ ID NO: 20), CSNK1D (SEQ ID NO: 13), AK2 (SEQ ID NO: 14), AKT2 (SEQ ID NO: 15), MAPK12 (SEQ ID NO: 19) and EIF2AK2 (SEQ ID NO: 21) genes were used. All experiments carried out in 96-well plates (cell seeding, cell transfection, incubation and luciferase detection) were performed with a Velocity 11 robot (Velocity, Palo Alto, Calif.).

[0275] b) Results:

[0276] A cell line measuring mutagenic NHEJ repair induced by SC GS was created. This cell line contains a single copy of the reporter system integrated at RAG1 locus and was validated by comparison of Luciferase signal obtained after transfection with GFP encoding vector pCLS0099, SEQ ID NO: 12 to SC GS encoding vector pCLS2690, SEQ ID NO: 3 (see FIG. 4A). Indeed, transfection with GFP (SEQ ID NO: 12) encoding vector gave a 60 R.L.U. luciferase signal, similar to that of untreated cells, whereas transfection with SC GS encoding vector (SEQ ID NO: 3) with no siRNA or with siRNA control AS induced a 600 R.L.U. luciferase signal. Moreover siRNAs targeting genes involved in classical NHEJ (LIG4) or in classical NHEJ and other DNA repair pathway (WRN and FANCD2) or in DNA repair regulation (MAPK3) increased SC GS-induced luciferase signal from 725 up to 1,200 R.L.U. (see FIG. 4A). Moreover, 8 siRNAs identified with an extrachromosomal assay, targeting CAMK2G (SEQ ID NO: 16), SMG1 (SEQ ID NO: 33), PRKCE (SEQ ID NO: 20), CSNK1D (SEQ ID NO: 13), AK2 (SEQ ID NO: 14), AKT2 (SEQ ID NO: 15), MAPK12 (SEQ ID NO: 19) and EIF2AK2 (SEQ ID NO: 21) genes and also two siRNAs targeting PRKDC gene (siRNA target sequence PRKDC 5, CTCGTGTATTACAGAAGGAAA=SEQ ID NO: 75 and siRNA target sequence PRKDC 8, GACCCTGTTGACAGTACTTTA=SEQ ID NO: 76) involved in DNA repair regulation increased SC GS-induced luciferase signal from 6,000 up to 14,000 R.L.U. (see FIG. 4B). This result demonstrates that inhibition of these genes stimulates SC GS-induced mutagenic NHEJ repair signal.

HTS Screening Measuring SC GS-Induced Non Homologous End Joining Repair Activity:

[0277] Screening of a siRNA collection covering 19,121 genes (Qiagen, with two siRNAs targeting each gene) using this cell line will lead to identification of other siRNAs that could modulate SC GS-induced mutagenic NHEJ repair. For that purpose, a high-throughput screening was set up to cotransfect each siRNA of the collection with pCLS2690 (SEQ ID NO: 3) in duplicate. This screen led to identification of 481 and 486 hits stimulating and inhibiting the luciferase signal respectively.

a) Materials and Methods

[0278] siRNA Dilution

[0279] The siRNA collection from QIAGEN was received in 96 well plate format in solution at 10 μM concentration. On each plate columns 1 and 12 were empty allowing controls addition. During dilution process of siRNA, siRNA AS (Qiagen #1027280), a negative control, siRNA RAD51 (SEQ ID NO: 77) and siRNA LIG4 (SEQ ID NO: 78), two siRNAs targeting proteins involved in recombination process, and siRNA Luc2 (SEQ ID NO: 79) targeting the expression of the reporter gene used were added at 333 nM final concentration.

[0280] Fourteen thousand cells per well were seeded in white 96 well plates one day before transfection. Per well cells were co-transfected with 200 ng of DNA pCLS2690 (SEQ ID NO: 3) and with 33 nM final concentration of siRNA using 1.35 μl of Polyfect transfection reagent (QIAGEN) per well. Seventy two hours post transfection 50 μl per well of ONEGlo (Promega) were added, cells were incubated in dark for 3 minutes before analysis of luciferase activity (1 second/well) using PHERAStar luminometer (BMG Labtech).

b) Results:

[0281] Seventeen runs were performed to screen the entire collection. For each run the mean luciferase intensity of the whole run and of siRNA Luc2 (SEQ ID NO: 79) and their standard deviations were calculated. A siRNA hit stimulating luciferase signal was defined for each run when its luciferase intensity was above the run mean intensity plus 2.5 times the run standard deviation. A siRNA hit inhibiting luciferase signal was defined as follows: its luciferase signal is less than the mean luciferase activity obtained with siRNA Luc2 (SEQ ID NO: 79) plus 0.5 times its standard deviation. On each run SC GS-induced mutagenic NHEJ repair was checked by comparison of induced luciferase signal between co-transfection of pCLS2690 (SEQ ID NO: 3) with either the siRNA control AS (Qiagen #1027280) or with the siRNA screened. Effect of siRNA was also verified by analyzing the decrease of luciferase signal with co-transfection of pCLS2690 (SEQ ID NO: 3) with siRNA Luc2 (SEQ ID NO: 79).

[0282] To compare the screen from run to run, normalization was applied on each run to get the run mean luciferase signal equal to 100 R.L.U. FIG. 12 represents data of all runs after normalization and shows the hits stimulating (with at least a normalized luciferase activity superior to 183) or hits inhibiting (with at least a normalized luciferase activity inferior to 37.5) SC GS-induced mutagenic NHEJ repair luciferase signal.
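The per-run hit definitions and the run-to-run normalization described for the high-throughput screen can be sketched as below. This is a hedged illustration, not the procedure actually used: the function names are hypothetical, the sample standard deviation is assumed, and “its standard deviation” in the inhibiting-hit rule is read as the standard deviation of the siRNA Luc2 control wells.

```python
from statistics import mean, stdev


def normalize_run(rlu_values: list[float], target_mean: float = 100.0) -> list[float]:
    """Rescale one run so that its mean luciferase signal equals target_mean."""
    run_mean = mean(rlu_values)
    if run_mean <= 0:
        raise ValueError("run mean must be positive")
    return [x * target_mean / run_mean for x in rlu_values]


def classify_sirna(rlu: float, run_values: list[float], luc2_values: list[float]) -> str:
    """Label one siRNA well as 'stimulating', 'inhibiting' or 'none'.

    Stimulating: signal above run mean + 2.5 x run standard deviation.
    Inhibiting: signal below Luc2 control mean + 0.5 x Luc2 standard deviation
    (assumed reading of the text).
    """
    stimulating_threshold = mean(run_values) + 2.5 * stdev(run_values)
    inhibiting_threshold = mean(luc2_values) + 0.5 * stdev(luc2_values)
    if rlu > stimulating_threshold:
        return "stimulating"
    if rlu < inhibiting_threshold:
        return "inhibiting"
    return "none"


# Invented readings for one run, used only to exercise the functions.
run = [95.0, 105.0] * 10 + [300.0, 20.0]
luc2_controls = [20.0, 30.0, 25.0]
labels = [classify_sirna(x, run, luc2_controls) for x in (300.0, 100.0, 20.0)]
```

Because the stimulating threshold depends on the spread of each run, rescaling every run to a mean of 100 R.L.U. is what allows thresholds such as 183 and 37.5 to be compared across all seventeen runs.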

[0283] As indicated in Table IV below, this screen led to the identification of 481 siRNAs hits that stimulate SC GS-induced mutagenic NHEJ repair luciferase signal with at least a stimulation factor of 1.83.

TABLE IV
siRNA hits stimulating GS SC-induced luciferase signal with at least a fold increase of 1.83

Gene Targeted   Gene ID   siRNA target sequence   SEQ ID NO

TAF6 6878 CTGGGAGTGTCCAGAAGTACA
EIF4A3 9775 CCGCATCTTGGTGAAACGTGA
CXorf59 286464 CTGTGAGTTCCTGTACACCTA
TALDO1 6888 CCGGGCCGAGTATCCACAGAA
C16orf59 CGGGATGAACCTGCAGTCTGA
H2AFY 9555 CAAGTTTGTGATCCACTGTAA

LCMT2 9836 CAGGCGCGGTACAGAACACCA 80 SAMD 5 389-432 CTGCTCATAGGAGTTCAGTAA

SNORD115-10 1 OOO33447 GAGAACCTTATATTGTCTGAA 81 PROK2 TCGCTCTGGAGTAGAAACCAA

WNK4 652 66 CAGCTTGTTGGGCGTTTCCAA 82 BCL9L. 283149 ACCCACAATTGTAATGTAGCA

HMX2 3.167 CGGGCGCGTACTGTACTGTAA 83 WDR5 11091 AAGCAGCACCGCAGACTGTAA

TAL 1. 6886. TCCGTCAACGTTGTACTGTAT 84 ADAMTSL5 3393 66 ATGCCTAACCAGGCACTGTAA

WAW3 10451 CACGACTTTCTCGAACAC CTA 85 BASP1 104 O9 TTCCAAGATCCGCGTCTGAAA

APOA1BP 12824. O ATGACGATTGATGAACTGTAT 86 ALX1 8092 TAGAGCTATGGACAACTGTAA

SNORD115-14 10 OO33 451 GAGAAACTTATATTGTCTGAA. 87 KLHL34 25724 O CTCGGCAGTCGTGGAAACCAA 21

REM1 28954 CGCTGTGGTGTTCGACTGTAA 88 CYFIP2 26.999 CGCCCACGTCATGGAGGTGTA 22

MTHFD2L 44.1024 CAGCGGTATATTAGTTCAGTT 89 MID1 4281 TAGAACGTGATGAGTCATCAT 23

OR8H2 39 O151 CTCAACTGTCGTCACACCTAA 90 SGCD 6444 TAAATCTATAGAAACACCTAA 24

UBAC2 337867 CACGCTGGACATCCAGAGACA 91 RING1 CAGGGTCAGATCAGACCACAA 25

HGFAC 3O83 AAGGACTGCGGCACAGAGAAA. 92 CAW3 859 TTGCGTTCACTTGTACTGTAA 26

GOLT1A 127845. CACTAGCTCGATGGTCTGAAA 93 KIAAO28O 232O1 CAGCATTCCCTCTGCTATCTA 27

FAM24A 11867O CTGGATGGTTGAACTGTAGCA 94. ISCU 234.79 CTCCAGCATGTGGTGACGTAA 28

NEFH 4744 CTCGCTGGACACGCTGAGCAA 95 ARWCF 421 AAGACTATTGGTAAACACCTA 29

SPRN 503542 CAGGAACATTCCCAAGCAGGA 96 BBC3 27113 CAGCCTGTAAGATACTGTATA 3 O

INTS12 571 17 CAGGACCTAGTGGAAGTACTA 97 ZMYND11 1. Of 71 CCGGATGAAGTCGGACCACAA 31

PAF1 54.623 CTCCACTGAGTTCAACCGTTA 98 PCDHGB5 5 6101 CGGGCAAATCTTTAGTCTGAA 32

ALDH8A1 64577 CTGGATAAAGCAGGTGTTCCA 99 INTS12 sf117 AACCTGCTACTTCGTCAGCTA 33

ELF2 1998 AAGCATCAGTTCACAGCAGTA 1 OO CTGGATGGCGTTATGATTTCA 34

TSSC1 726O CAGCTGCGGAGACGACTGTAA 101 NUP35 1294 O1 CAGGACTTGGATCAACACCTT 35

TMEM13 O 22.2865. CCCGCTGGTGCTTACTGGCAA 102 OR2M7 391-196 AAGGGCAAGTCTGGAGATTGA 36

TCCGTCAACAGTAGTTCCTTA 1 O3 SLC6A14 11254 ACCAATAGTAACT CACTGTAA 37

SNORD114-17 7675.95 ATGAATGATATGTGTCTGAAA 104 CHP2 63928 CAGGGCGACAATAAACTGTAT 38

FGL2 10875. CAGGATCGAGGAGGTGTTCAA 105 TRIM61 391712 TAGGGTATGTATATGTTCCTA 39

DUSP1 1843 CACGAACAGTGCGCTGAGCTA 106 AFG3L1 172 CGGCTGGAAGTCGTGAACAAA. 4 O

MYCL1 461 O AAGGCCTTGGAATACTTGCAA 107 2969 TAGGTGGTCGTGTGATGGTAA 41


TMPRSS11A 339967 ATCCACATCAATGGACTGTTA 42 CPLX2 10814 CAGATAGGTAGCAGAGACCAA

KRTAP13-3 337960 CAGGACT CACATGCTCTGCAA 43 CAMK2G 818 GAGGAAGAGATCTATACCCTA 16

ZNF1 Of 51427 TACCTCGGACCAGCTCTGTAA 44 KCTD15 CAGGATAAGCCGCCTCTTCAA 76

NCOR2 961.2 AACGAGATTGCTGGAAACCAA 45 CSH1 1442 ACGGGCTGCTCTACTGCTTCA 77

SMG1 CACCATGGTATTACAGGTTCA 33 EDIL3 1 OO85 CCCAAGTTTGTCGAAGACATT 78

NKX2-3 159296 CAGGTACAAGTGCAAGAGACA 46 PAX3 SO77 AACGCCTGACGTGGAGAAGAA 79

NFKBIA 4792 CAGCCAGAAATTGCTGAGGCA 47 KCNC3 3748 CAGCGGCAAGATCGTGATCAA

FAM69B 138311 CCGGCGGGAGCTGGTACTGTT 48 LRRC8C 84230 TACCTTATACTGGCTGTTCTA 81

CARD9 6417O CAGCGACAACACCGACACTGA 49 LOC9 O586 CTGGGCATGGGTATGCTGTAA 82

TMEM105. 284.1.86 AACGAGGTATGGAACTGTTCA SO LONRF2 164832 CCGACGGATATTAGTCATCAT 83

LARP6 5.5323 ATGGTGTCTTGTAGGACCAAA 51 DDX3X 1654 AACGAGAGAGTTGGCAGTACA 84

WSIG8 3.91123 CCGGCGTATAGGCGTGATCAT 52 GNG4 2786 CCGAAGTCAACTTGACTGTAA 85

TXNRD1 7296 CCGACTCAGAGTAGTAGCTCA 53 SAP3 OL 79685 CAAGAGCGTAAGGCAC CTATA 86

USP36 576 O2 CAAGAGCGTCTCGGACACCTA 54 KCTD15 AACCTTGGAGATTCACGGCAA 87

MYEOW2 15 O678 GTCAGCGAAGACAGCACAATA 55 LOC1651.86 1651.86 CAACGTCTCTATAGAGACCAA 88

NES 10763 CGCGCCGTCGAGGCAGAGAAA 56 FAM59A 64762 AAGGGCAGATTTAGCACCCGA 89

PRDX3 10935 AAGGCGTTCCAGTATGTAGAA st ARF5 381 TTCGCGGATCTTCGGGAAGAA 9 O

USP39 10713 CAGGCTCTATCTAATGTTCCT 58 ESSPL 345O 62 CAGCCTACACTTTGACCACAA 91

KRT16 38.68 TACGAGCAGATGGCAGAGAAA 59 SPON1 104.1.8 ATCGCACGGAAGGGTGAACAA 92

MAGA1 31.59 CACCACAACTCCAGGAAGGAA 60 BCL11B 64919 CAGAGGTGGGTTAAACTGTAA 93

SLCO3A1 28232 CAGCATCGCCATCGCGCTCAA 61 ZNF826 6647 O1 TCAGATGGTCCTCACACCTAA 94

SFN 2810 CCGGGAGAAGGTGGAGACTGA 62 C2 lorf 62 5 6245 CAACCTGATGTGCAACTGTAA 95

DPP4 ATCGGGAAGTGGCGTGTTCAA 63 OGFOD1 55239 TCGGACGCTGTTACGGAAGAA 96

SPTBN2 6712 CAGCGTCAACATCCTGCTCAA 64 LYST 1130 CACATCATTGTCAACACCTAA 97

PRR3 79. Of AAGGTCAACCCTTGGTTCTTA 65 LIPJ 142910 AGGGTTGTTGTATACTTGCAA 98

SARM1 23098 CTGGTGGTTAAGGGTAGCAAA. 66 GH2 2689 CAGCTGGCATATGACACCTAT 99

GRIN2C 29 OS CCCAGCTTTCACTATCGGCAA 67 PHF21B 112885 CACCGTGGTCAGCGTCAAGAA 2OO

TMEM179 388O21 CGGGCCGGCCATGGCGCTCAA 68 ZFYWE28 sff32 CAAGCCTGAAACAGACGACAA

TWIST1 7291 CACCTCTGCATTCTGATAGAA 69 TMEM183A 927 O3 CTGTACCTATAACACCAGTAA

CCCGTCTGAATTCCTCAGGAA 70 HIST1H2AE ATCCCGAGTCCCAGAAACCAA

C10orf SO 118611 CAGATCCGTCCTGTCGCTCAA 71. XKRX 4O2415 CACCCATAATGTAGTAGACTA 2O4.

1535.79 AAGGACATTATTAGTTTGACA 72 CTR9 9646 AAGGGTAGTGGCAGTGAACAA 2O5

CES2 8824 CTGCATGATGTTAGTTACCAA 73 POLDIP2 CACGTGAGGTTTGATCAGTAA

AOF1 221 656 ATCGATGCGGTATGAAACCAA 74 10827 CGCCCACGAATGGATCAGGAA


FGFR3 2261 ACCCTACGTTACCGTGCTCAA FYCO1 79443 AAGCCACGTCATATAACTCAA 242

DIRAS2 54769 CCCGACGGTGGAAGACACCTA 209 ABHD9 79852 CAGCTCAGTGCTACTGTGAAT 243

TM4SF5 9032 CGCCCTCCTGCTGGTACCTAA CGB 1082 ACCAAGGATGGAGATGTTCCA 244

CENPH 64946 ATGGATAACATGAAACACCTA HES5 388585 CACGCAGATGAAGCTGCTGTA 245

NAGK 55577 CCCGGTCTTGTTCCAGGGCAA RAB10 10890 AAGGGACAAACTAGTAGGTTT 246

LGI2 55203 TACGACGAGAGTTGGACCAAA CCCGTTAGTGCTACACTCATT 247

TERT 7015 CCAGAACGTTCCGCAGAGAAA BZW2 28969 CAGGAGCGTCTTTCTCAGGAA 248

PEAR1 CTGCACGCTGCTCATGTGAAA FAM100B 283991 CACGTTCTTCCAAGAAACCAA 249

FBLIM1 54751 CTCCACAATTGTTATAACCAA MGC23985 389336 ACCAGACAAGCCAGACGACAA 250

PDE7A 5150 CAGATAGGTGCTCTGATACTA CSAG2 728461 CCAGCCGAACGAGGAACTCAA 251

ETV4 2118 ACCGGAGTCATTGGGAAGGAA CTCCTTTATCTTCCAAACCAA 252

SNORD9 692053 CTGTGATGAGTTGCCATGCTA GGA1 26088 CCCGCCATGTGACGACACCAA 253

TMEM25 84866 CAGGCGATGAGTCTAGTAGCA 220 C10orf53 282966 CTGGAATGTGGTGGAACTCAT 254

SETD5 55209 AACGCGCTTGAACAACACCTA 221 TBL3 10607 CTGCGTCACGTGGAACACCAA 255

SART1 9092 CAGCATCGAGGAGACTAACAA 222 EEF1A1 1915 CAGAATAGGAACAAGGTTCTA 256

IL17RB 55540 CCGCTTGTTGAAGGCCACCAA 223 OTOP3 347741 TTGCCAGTACTTCACCCTCTA 257

NEU1 4758 CAGGTCTAGTGAGCTGTAGAA 224 BANP 54971 CAGCGACATCCAGGTTCAGTA 258

ST6GALNAC1 55808 CCCACGACGCAGAGAAACCAA 225 FOXP2 93986 AAGGCGACATTCAGACAAATA 259

KCNIP2 CAGCTGCAGGAGCAAACCAAA 226 BAHD1 22893 CAAGAATTACCCACTTCGTAA 260

PPP3CA 5530 TCGGCCTGTATGGGACTGTAA 227 ZNF416 55659 GAGGCCTTTGCCAGAGTTAAA 261

PRPF4 9128 TCCGGTCGTGAAGAAACCACA 228 VPS4A 27183 CTCAAAGACCGAGTGACATAA 262

SYNGAP1 8831 CAGAGCAGTGGTACCCTGTAA 229 FAM38B 63895 CACCTAGTGATTCTAACTCAA 263

PTBP1 5725 GCGCGTGAAGATCCTGTTCAA 230 LRRC24 441381 CCACGAGATGTTCGTCATCAA 264

IGFL4 444882 CCAGACAGTTGTGAGGTTCAA 231 PRKCE 5581 CCCGACCATGGTAGTGTTCAA

HES1 3280 CACGACACCGGATAAACCAAA 232 SECISBP2 79048 TCCCAGTATCTTTATAACCAA 265

UNQ830 389084 CACAGACGATGTTCCACAGGA 233 C6orf1 221491 CAGATGTATAGTATTCAGTAT 266

CTCGGGAAACGTGGACGACAA 234 SEMA3G 56920 CCCTGCCCTATTGAAACTCAA 267

PLG 5340 AAGTGCGGTGGGAGTACTGTA 235 C16orf3 750 CTGGGACAACGCAGTGTTCAA 268

IK 3550 CAGGCGCTTCAAGGAAACCAA 236 ANKRD12 23253 CCGGAGCGGATTAAACCACCA 269

UBE2D4 51619 CCGAATGACAGTCCTTACCAA 237 TEX28 1527 CAGCGAAGAGAGAATGGCCTA 270

NEK3 4752 CAGAGATATCAAGTCCAAGAA 238 MAPK8IP3 23162 CAGCCGCAACATGGAAGTACA 271

ATF7IP2 80063 TAGGACGACTGAAATAACCAA 239 FAM9A 171482 AAAGCTCAGTTGGAAGCTCAA 272

SRRM2 23524 CGCCACCTAAACAGAAATCTA 240 ACTR8 93973 TACTACCAACTTAGTCATCAA 273

SRRM2 23524 CTCGATCATCTCCGGAGCTAA 241 23642 CAGCGTTACAGTAATGTTCCA 274

TABLE IV - continued: siRNA hits stimulating SC GS-induced luciferase signal with at least a fold increase of 1.83

Gene Targeted, Gene ID, siRNA target sequence, SEQ ID NO (left column) Gene Targeted, Gene ID, siRNA target sequence, SEQ ID NO (right column)

SNORD114-10 767588 ATGATGAATACATGTCTGAAA 275 RBPJL 11317 CTCAAAGGTCTCCCTCTTCAA 309

LYPD1 116372 CACGGTGAACGTTCAAGACAT 276 DPY19L2P1 554236 GTCCATTGTCTAAGTGTTCTA

SSBP1 6742 AGCCTAAAGATTAGACTGTAA 277 C21orf2 AAGGGCCGTTTCTCCACAGAA

TRIM32 22954 CAGCACTCCAGGAATGTTCAA 278 C11orf76 148753 TACGGTGATCCTCCTCTGCAT

POLR3H 171568 AACAAACGGCACAGACACCAA 279 TBC1D5 9779 AGGAAGGTTGTTGGCCAACAA

SNORA71B 26776 TGCCTTTGCCCTGGTCATTGA 280 TMEM203 94107 AACAGGTGTCAGATACTCATA

KRT6C 286887 CAAGTCAACGTCTCTGTAGTA 281 PTPRA 5786 CCGGAGAATGGCAGACGACAA

CYP2A7 1549 CCCAAGCTAGGTGGCATTCAT 282 CSNK1A1L 122011 CTGCTTACCTGTGAAGACATA

PTPN22 26191 TGGGATGTACGTTGTTACCAA 283 GSTM1L 2945 ATCCTTGACCTGAACTGTATA

P2RX5 5026 CTGATAAAGAAGGGTTACCAA 284 LOXL2 4017 CCGGAGTTGCCTGCTCAGAAA

FANCE 2178 TCGAATCTGGATGATGCTAAA 285 ACSL5 51703 CAAGGGTACAAACGTGTTCAA

PLS3 5358 AACGGATTCATTTGTGACTAT 286 MT4 84560 ATGCACAACCTGCAACTGTAA

HNRNPA1 3178 CAGGGTGATGCCAGGTTCTAT 287 DNTTIP1 116092 CCGGCATGGTATGGAAACCAA 321

CD72 971 ATCACCTACGAGAATGTTCAA 288 OR2A14 135941 CACCTGGCCATTGTTGACATA 322

PRC1 TCGAGTGGAGCTGGTTCAGTA 289 ASB2 51676 CAAGTACGGTGCTGACATCAA 323

50509 CCCGGGCATCCAGGTCTGAAA 290 SREBF2 6721 CCGCAGTGTCCTGTCATTCGA 324

ZNF649 65251 AACGCTATGAACACGGAAGAA 291 ISG20L1 64782 CACGGGCACTCATCAGTAGAA 325

HCP5 10866 TAGGAGGGAGTCAGTACTGTT 292 SPTBN2 6712 CTCCGCGGATCTAGTCATCAA 326

PXDNL 137902 TACCGACTGAATGCCACCTTA 293 LYNX1 CACCAGGATGAAGGTCAGTAA 327

CDYL2 124359 AATGATCATGTTGGAGAGCAA 294 LTBR 4055 TACATCTACAATGGACCAGTA 328

C17orf28 283987 CCCGTGGAAGCCACCGATGAT 295 ZNF295 49854 CAGGTTGAAGTCCATAATCAG 329

10848 AAGGAGTAAAGTCTAGCAGGA 296 SOX11 6664 CTCCGACCTGGTGTTCACATA 330

LINCR CTGGGCCGTGATGGACGTGTA 297 PANK2 80025 CTGTGTGTGAACTTACTGTAA 331

PRTN3 5657 CAACTACGACGCGGAGAACAA 298 RBM47 54502 CACGGTGGCTCCAAACGTTCA 332

SLC25A19 60386 CTCCCTGTGATCAGTTACCAA 299 CCDC13 152206 CCCAACCGGGAGCGAGAAGAA 333

C14orf45 TTCCGTCTTCCAAGTTACCAA 300 RTF1 23168 ACCGCTCATCACGAACATCAT 334

PPHLN1 51535 CAAGAGATACTTCACCCTCAA ZRANB2 CACGATCTTCATCACGCTCAT 335

ENDOGL1 9941 AAGAAGCTAGAAGAACTCAAA FAM83E 54854 CTCGGCGTCTGTCAAGCAGAA 336

989 CAGAATCTCATTACTGCTTCA RCP 27297 GAGGAATTTCCTCGAGAACAA 337

CHRNB4 1143 CAGCAAGTCATGCGTGACCAA NMBR 4829 CCCGCGGACAGTAAACTTGCA 338

SPACA3 124912 AAGCTCTACGGTCGTTGTGAA 305 CPNE2 221184 CAGGACAGAAACCGCGATCAA 339

DCI 1632 CAGGTACTGCATAGGACTCAA XRCC6 2547 GAGGATCATGCTGTTCACCAA 44

221545 CTCATTTGTCGCCATCGTCTA 307 PRDM14 63978 ACCGGCCTCACAAGTGTTCTA 340

CLIC6 54102 CCGAATCTAATTCCGCAGGAA CCCGTGACGGACTTCAGAGAA 341

TABLE IV - continued

NRARP 441478 TTCGCTGTTGCTGGTGTTCTA 342 ADAMTS7 11173 CTGCATCAACGGCATCTGTAA 375

HNRNPC 3183 CTCCCGTGTATTCATTGGGAA 343 BAG4 9530 ACCCAAGTACATATCCTGTAA 376

GAS2L2 246176 CTCCGGAACCATGTGATGGTA 344 GRID2 2895 AAGCAATGGATCGGAGAACAA 377

KIFC2 90990 AGGGCGGCTGCCAGAACTCAA 345 GDF15 9518 CTGGGAAGATTCGAACACCGA 378

SFRS7 6432 CCCGACGTCCCTTTGATCCAA 346 RASAL2 9462 CTCGTGGGCTGCCTAAACTAA 379

FERMT2 10979 AAGCTAGATGACCAGTCTGAA 347 UGT2B28 54490 CACCCAGGTAATGGTTAGAAA

FRMD4A 55691 CTGGATTCTGTTCAACTGTAA 348 FLJ20254 54867 CCCGATTCCGTGAATCAGCTA 381

SNORD13 AGCGTGATGATTGGGTGTTCA 349 DPF2 5977 CCGGAGTAGCCCAGAGCAATT 382

TTLL9 164395 TGCGTCAACGATCGGAAGAAA 350 CNGB1 1258 CAGAAGTTACTCCGGAAGAAA 383

ZNF691 51058 TTGCTGCTACCTTGACCTCAA 351 CAPN13 92291 ACGAAGGATGGTCCCAAATAA 384

TRAPPC6A 79090 CTGTGTGTGTGGAATCTGAAA 352 CAP1 10487 AAGCCTGGCCCTTATGTGAAA 385

EIF4A3 9775 AAAGAGCAGATTTACGATGTA 353 PDIA6 10130 ACGGGATTAGAGGATTTCCTA 386

PRR16 51334 AACCTGCAGATTTCACCTATT 354 TMEM63A 9725 CAGGGACTCTCTTGACGCTGA 387

RPIA 22934 AACCACGTGAGGAATAACCAA 355 SMARCC1 6599 CAGCGGATTTCAACCAAGAAT 388

FAM26E 254228 CTGCCGATCTAAAGTTAGCTA 356 TBKBP1 CACTGCTTACGGAGACATCAA 389

C11orf75 56935 CCGCGGGCAGGAATAACTCAA 357 LGALS2 3957 CACCATTGTCTGCAACTCATT 390

WBSCR19 AAGGACTTCAACAGTCAGCTT 358 CACNG1 786 CACCGTCTGGATCGAGTACTA 391

CTSA 5476 CCGGCCCTGGTTAGTGAAGTA 359 ZBTB7C GCCACTGGATCTGGTCATCAA 392

NEK10 152110 CAGAAGGTATCTACTCTGAAA 360 PREB 10113 CCGGGCTCCGTTCCCGTTGTA 393

WIPF3 644150 CTCCGGATGAATATAAACCAT 361 CBX8 57332 AAGGAAAGTAACACGGACCAA 394

TRAFD1 10906 CCAGGTCTCTCAGTGACATAA 362 RBM8A 9939 ACACGACAAATTCGCAGAATA 395

TH1L 51497 CCGGTTGAACTTATCCGCGTT 363 MDM1 56890 ATGAGGGTGTAACAAACCATA 396

ZNF658 26149 CTCAGCCCATATAGTACATCA 364 TPD52L3 89882 CAGGCCAGGTCGTCAACTCAA 397

STXBP2 6813 CACGGACAAGGCGAACATCAA 365 C3orf59 151963 AAGGGCAAGTAACGTGTTCAT 398

TRIM71 131405 CAGGATCGTGGTGGCTGACAA 366 LRP5 4041 CTGGACGGACTCAGAGACCAA 399

CAP1 10487 CAACACGACATTGCAAATCAA 367 DLD 1738 CAGCCGATTGATGCTGATGTA 400

BRCA1 672 CTGCAGATAGTTCTACCAGTA 45 TCF20 6942 CAGGAGTTGCACGTAGAGAAA

VLDLR 7436 CAAGATCGTAGGATAGTACTA 368 KIAA0644 9865 CGGCGGCAACTTCATAACCAA

CREBZF 58487 CAGGAGGAGAGTCGCTACCTA 369 ARFRP1 10139 CGGCGTCATCTACGTCATTGA

FAM120C 54954 CTGCGTGAGGCTAGCACTCAT STAU1 CTCGGATGCAGTCCACCTATA

C4BPA 722 AACTCAGACGCTTACCTGTAA 371 GRIP2 80852 CAGGAGTGATCTGCTGAACAT 405

TBL3 10607 CTGCCATGATAAGGACATCAA 372 UHRF1BP1 54887 CCGCGTGAGGCTTGACCACTA

OLFML2A 169611 TCCAGTCATATTTAGAACAAA 373 CPNE2 221184 CACGATCGTCTCCAGCAAGAA 407

54885 GAGAAGGGTACTCACAGCTTA 374 ICA1 3382 CAGGATCGATATGCTCAAGAT 408

TABLE IV - continued

CDH20 28316 TACTACGAAGTGATTATCCAA CNGA3 1261 CCCGTCCAGCAACCTGTACTA 443

BPIL2 254240 CCGGAGTCTACTTTACCGGTA MTMR6 9107 CCCGGATAGCAAGCAAACCAA 444

ADRBK1 156 CGGCTGGAGGCTCGCAAGAAA GRAMD1A 57655 CACGATCTCCATCCAGCTGAA 445

MOG 4340 CAGAGTGATAGGACCAAGACA MRLC2 103910 GAGGGTGTAAATTGTATTGAA 446

HOXD3 3232 CTCGCCATAAATCAGCCGCAA LOC285908 285908 CCCGACGGCCTTGACAGACCA 447

STRA6 64220 CTGGAAGATACTGGGACTGTT FAM18B 51030 TCAGTGGACCTTGAGCTAATA 448

SNORA40 677822 CCCAGAACTCATTGTTCAGTA WDFY2 115825 AGCCACCTTCCATGACAGTAA 449

GK5 256356 TACCATCTTGTACGAGCAATA GAS1 2619 ATGGATTTATGAAGACACTCA 450

LZTR1 8216 CAAGATCAAATACCCACGGAA C3orf35 339883 ATGGCCCACGTGAAATCTGAA 451

TRIM42 287015 CAGCGCCATCGCCAAGTTCAA DEFB106A 245909 TAAAGGGACATGCAAGAACAA 452

C1orf38 9473 AAGTTGTAAGTGACTAACCAA IDS 3423 CCCGAGGTCCCTGATGGCCTA 453

ANKRD20A3 441425 ATCCCTCACTGAATTCAGTAA CCDC19 25790 AAGGCTCGCTATCGGACCAAA 454

ATG9B CAGCCGCGGCCTGGCGCTCAA 421 TAGATTCCAGTTGATGAAGAA 455

PI4KA 5297 AAGCGGCTGCGTGAAGACATA 422 SNORD116-10 100033422 CAGTACCATCATCCTCATCTA 456

SDHB 6390 CTGGTGGAACGGAGACAAATA 423 KIF22 3835 CAGGACATCTATGCAGGTTCA 457

RAB43 339122 CCGAGCGTGGGTCCCAGTCTA 424 SNF1LK2 23235 CAGGATTACATCCGTTTATTA 458

ZNF443 10224 AAGCATTATCTCATCGCTCAA 425 GTF2A1 297 TCCATTGGTCTTACAAGTTGA 459

IL17C 27189 CCGCGAGACAGCTGCGCTCAA 426 PMF1 11243 CTGCGGCGCCATGTGCAGAAA 460

GPR149 344758 CACCGTGAGCGTAGCGCAGAA 427 MC5R 4161 CGGCATTGTCTTCATCCTGTA 461

OR56A3 390083 AACTCCGTTATTGTGGAAGAA 428 CELSR1 CGCCAACAGTGTGATTACCTA 462

TECTB 6975 CAGGGCAACCTTCCAATTCAA 429 LRRC32 2615 CGCCGGCAGAAGTTTAACCAA 463

BLK 640 CTGGTAAGCGACTGTCATCAA 430 TGM6 343641 CAGCATCGCTGGCAAGTTCAA 464

ZNF718 255403 CCGCAACTCAATCTGTTCTAA 431 EPHA7 2045 CAGGCTGCGAAGGAAGTACTA 465

SEC11B 1577 OS GTGGGAGAAATCGCTGTTCTA 432 LIMA1 51474 CAGGTTAAGAGTGAGGTTCAA 466

FGD2 221472 CAGGGTCATCTTCTCCAACAT 433 PHLDA1 22822 AGGAGCGATGATGTACTGTAA 467

TIFA 92610 CTGGGTGTGCCCAATTGATCA 434 CELSR1 CGGGATCCTGGATGTGATCAA 468

PRB3 5544 AAGAAGGTGGTCATAGCTCTA 435 ETNK1 55500 TCGATCGAGATGAGGAAGTAA 469

TREML1 CAGGCGTACGTTTCTCACAGA 436 TSPAN14 81619 CGGGACGATATCGATCTGCAA 470

PHF10 55274 ATGGCAGTGTATGGAATGTAA 437 ANGEL2 90806 CTGACGCAATTGGCAATGCTA 471

PTDSS2 81490 CTGGTGGATGTGCATGATCAT 438 441024 TACGTCTGATATGGTTAAAGA 472

MGC12966 84792 CACCACTGTACTTGGCGTTAA 439 B3GALT4 8705 ATCCTGCGGTGTCGAGCAATA 473

FAM3D 131177 TACGACGATCCAGGGACCAAA 440 VKORC1 79001 GAGGGAAGGTTCTGAGCAATA 474

799 iss CCGGCGCATCGTCCACCTATA 441 DCD 117159 CTGGTCTGTGCCTATGATCCA

NOX5 79400 TTGCCCTATTTGACTCCGATA 442 ETHE1 23474 CACGATTACCATGGGTTCACA 476

TABLE IV - continued

HOOK1 51361 CAGGGTTACTTCTGTTGACTA 477 OR 138883 CTGCGTTGTTTGTGTGTTCTA

NDST1 3340 CCCAGCGATGTCTGCTATCTA 478 NPLOC4 55666 CAGTCGAAATAAGGACACCTA

COX4I2 84701 CTGCACAGAACTCAACGCTGA 479 CDON 50937 AAGCATGTTATTACAGCAGAA

C22orf16 400916 CCCAGATAGCTGGGATTGGAA FAM46C 54855 CAGACTGATCGCCACCAAGAA

ACTA2 59 TACGAGTTGCCTGATGGGCAA 481 OR271 284383 ACCACAGTCCACAGCAGGATA

TNRC18 84629 CTCGGTCATCCGCTCGCTCAA 482 598 CTGCTTGGGATAAAGATGCAA

HCFC1 3054 ACCGTTCACTATTGTAGAGTA 483 9372 AACTATAGTTGGGATGATCAA

PIAS4 51588 CACCGAATTAGTCCCACAGAA 484 ST AACCCTCAGCTTATGTAGCTA

NPY1R 4886 ACGACATCAGCTGATAATCAA 485 FABP6 2172 CACCATCGGAGGCGTGACCTA

DEFB125 245938 CTCAGACAGCTCTTACTCATA 486 USF2 7392 CCGGGAGTTGCGCCAGACCAA

C1orf128 57095 TACGGGCAATGTCAAGCTCAA 487 TSPYL1 729 TAGAACCGGTTGCAAGTTCAA 521

TAGLN2 8407 CAGCTGAGCGCTATGGCATTA 488 9906 CCAGCGTCCCTTTGTTGTGAA 522

TRAPPC6A 79090 GACCTACGTCCTGCAAGACAA 489 RFXANK 8625 CTCAGTCTTTGCGGACAAGAA 523

SFTPA2B 6436 CTCCACGACTTCAGACATCAA 490 PLEKHG2 64857 CAGGTTCAGCCAGACCCTCAA 524

RGS11 8786 CCCAAGGTTCCTGAAGTCTGA 491 283948 CGCAATGTAGTTAGGTGCTCA 525

CYFIP2 26999 CACGCATCGGCTGCTCTGTAA 492 MEGF11 84465 AAGAATCCGTGTGCAGTTCTA 526

MAP6 4135 TACCACCAAGCCAGACGACAA 493 MLXIP 22877 CAGGACGATGACATGCTGTAT 527

LIG4 3981 ATCTGGTAAGCTCGCATCTAA 494 TTLL12 23170 CACGGTGAGCTGCCCAGTACA 528

TBXA2R 6915 CCCGCAGATGAGGTCTCTGAA 495 ACOT4 122970 CAAACAGTCTCTGAACGGTTA 529

OR1L14 254973 CACTGTAGTGGTCCTGTTCTA 496 26276 CAGCGTTGGATCAACACTGTA 530

UQCRC2 7385 TACATCCAGTCTGACGACAAA 497 FNS 145581 CTCGGTTAGATGTGACATCAA 531

DLL1 28514 CACGCAGATCAAGAACACCAA 498 CLDN17 26285 TAGTAAGACCTCCACCAGTTA 532

TAS2R45 259291 CACCGAGTGGGTGAAGAGACA 499 6323 ACGCATCAATCTGGTGTTCAT 533

SETBP1 CAGCGTTGCTCTGAAGGCAAA 500 F521 25925 CAGCGCTTAAATCCAAGACTA 534

RND3 390 AACGTTAAGCGGAACAAATCA 501 TMEM167 153339 TTCAGAGTCTATTGACTGTAA 535

C12orf62 84987 CAGCGCCAGGCCGCAGAAGAA CO 51226 CGGTGTGATTCTGGAGAGTGA 536

CPA5 93979 CCGCTTATGGCGGAAGAACAA 503 CNOT2 4848 CAGCAGCGTTTCATAGGAAGA 537

KCNK15 60598 CCGGTGGAAGTCCATCTGACA 504 NUP210L 91181 CTGGCTGTCCGGCGTCATCAA 538

GLT8D1 55830 TAGCTGGTACAGATAATTCAA 505 KCTD17 79734 CCCGGGCCTGAGAAGGAAGAA 539

GPRS 8 CAGATGGTTTATCGTGTTCAA CACGCTAAGGAGGGTGCTGGA 540

MAST2 23139 CAGGAGTGTGCTGTCTGGCAA 507 TTC31 64427 TGCGATGGCGCCGATTCCAAA 541

TAF6 6878 CAGCGTGCAGCCCATCGTCAA 508 GLTSCR1 29998 CCGCATCGGGCTCAAGCTCAA 542

SCT 6343 CAGCGAGCAGGACGCAGAGAA 509 PDLIM3 27295 CAGGACGGGAACTACTTTGAA 543

BAHD1 22893 CCGCCACGGGCGCATCCTTAA 510 KRT78 196374 CAGCCTGTTCTGCTCGCTCAA 544

TABLE IV - continued

C6orf21 221481 CAGGTTCACTCCAACTTCCTA 545 SAR1B 51128 CACATTGGTTCCAGGTCTCAA 552

PTPN23 25930 CCGCCAGATCCTTACGCTCAA 546 FLJ40243 133558 CACTGCGAAAGTGCTGACAAA 553

EPC2 26122 CAGCAGTTAGTTCAGATGCAA 547 SNORA27 619499 CAAACTGGGTGTTTGTCTGTA 554

RBP1 5947 TAGGAACTACATCATGGACTT 548 LOC55908 55908 CTGGGTCTCTATGGCCGCACA 555

3.8-1 352961 TTGGATGTCTTTGGGAACCAT 549

C17orf79 55352 TTCCTTATTGACAGTGTTCAA 550

SYT3 84258 TAGGGCGTAGTTGGTGCTGGA 551

[0284] This screen also led to the identification of 486 siRNA hits that inhibit the SC GS-induced mutagenic NHEJ repair luciferase signal to a normalized luciferase activity below 37.5 (see Table V below).
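The two cutoffs quoted for this screen (a fold increase of at least 1.83 for stimulating hits, a normalized luciferase activity below 37.5 for inhibiting hits) can be illustrated with a minimal sketch. The function name, the sample values and the choice to pass both metrics as separate inputs are assumptions made for illustration; the text does not state how the two metrics are normalized or related.

```python
# Cutoffs quoted in the captions of Table IV and Table V.
STIMULATION_MIN_FOLD = 1.83      # fold increase of the luciferase signal
INHIBITION_MAX_ACTIVITY = 37.5   # normalized luciferase activity


def classify_sirna(fold_increase, normalized_activity):
    """Label one siRNA from its two (hypothetical) screening metrics."""
    if fold_increase >= STIMULATION_MIN_FOLD:
        return "stimulating"
    if normalized_activity < INHIBITION_MAX_ACTIVITY:
        return "inhibiting"
    return "not a hit"


# Hypothetical readings, not data from the screen:
# name -> (fold increase, normalized activity)
wells = {
    "siRNA_A": (2.10, 210.0),
    "siRNA_B": (0.30, 30.0),
    "siRNA_C": (1.00, 100.0),
}
calls = {name: classify_sirna(*values) for name, values in wells.items()}
```

With these made-up inputs, `calls` labels the three siRNAs as stimulating, inhibiting and not a hit respectively.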

TABLE V siRNA hits inhibiting SC GS-induced luciferase signal to a normalized luciferase activity below 37.5

Gene Targeted, Gene ID, siRNA target sequence, SEQ ID NO

SNX3 8724 ATCGATGTGAGCAACCCGCAA 556

MYO1E 4643 CACAGACGAACTCAGCTTTAA 557

MEGF9 1955 CAGGATGCCATCAGTCCTTTA 558

NOC3L 64318 AAGCATGAACGCATTATAGAT 559

HHIPL2 79802 CCCGTTCAGACCACTCGCCAA 560

ITIH4 3700 ATGGATCGAAGTGACCTTCAA 561

DPY19L2 283417 CTCCGTAATCAATGGAGCATA 562

ZNF454 285676 ATAATCCGTTCTAGAGAATAA 563

TTC21A 199223 CAAGGCGGTACAGTCTTATAA 564

TTC21A 199223 CTGCTACTGGGCGATGCCTTA 565

MFSD11 79157 CCCGCGGCTCTGACTACCGAA 566

MYO1E 4643 CAGGGTAAAGCATCAAGTCGA 567

ARL5B 221079 TAGACGGTGCTGATTGGGAAA 568

SPZ1 84654 TACCATTGCCTTATTCGAAAT 569

MFSD11 79157 CAGCAACTACCTTCTCCTTCA 570

LHX6 26468 ATGCTTGACGTTGGCACTTAA 571

GABARAPL1 23710 CAGCTGCTAGTTAGAAAGGTT 572

OBFC1 79991 TCAGCTTAACCTCACAACTTA 573

BCAS2 10286 CTCGCAGATACCGACCTACTA 574

CCDC62 84660 ACCTACGAGTTTGTTAATCTA 575

STC1 6781 CCAGAGAATCTTAAGGTCTAA 576

TTC23 64927 CAGGGTGATATATGCTATAAA 577

TABLE V - continued

MYST1 84148 CAGATGACCAGTATCACCCAA 649

PLD3 23646 CCGGTTCTATGACACCCGCTA 650

EFNA2 1943 CCGCGCCAACTCGGACCGCTA 651

GP9 2815 CAGACAGGAGCACCTGACCAA 652

ELA2B 51032 TGGCGTGATATGCACCTGCAA 653

DNAI1 27019 AAGAAGGCACATATAAGCCTA 654

SLK 9748 TAGCATCTTGTGATCACCCAA 655

ZNF74 7625 CAGGGTGCCTCCTCTAGTTAA 656

OGDH 4967 CAGGATCAATCGTGTCACCGA 657

MMRN1 22915 CAGGGTCGTGATGATGCCTTA 658

AGPAT1 10554 TGGCTCCATGCTGCCCTTCAA 659

CYP3A4 1576 CTCGATGCAATGAACACTTAA 660

GNAL 2774 ATGGGTTTAATCCCGAGGAAA 661

MYCBP2 23077 CTCGATATATTGCCATAACAA 662

PSIP1 11168 AGGCAGCAACTAAACAATCAA 663

SLC13A2 9058 CCCGCTAATCCTGGGCTTCAT 664

HMBS 3145 CAGCTTAACGATGCCCATTAA 665

MED14 9282 CGGGTGAAGTTTCGTGTTGAA 666

PLXNA4 91584 CCGCATCGTCCAGACCTGCAA 667

TINF2 26277 TCCTGTGGATTTGGCCTCGAA 668

POLR3C 10623 CCGGTACATCTATACTACCAA 669

POLR2A 5430 CAGCGGTTGAAGGGCAAGGAA 670

CLDN12 9069 CTCCTCAGTGTGGGCGAGTAA 671

ZNF559 84527 TCCCGAGAGATGGCTAATGAA 672

WDR3 10885 CCGGGATGTTATCGGCTTCAA 673

KIF2A 3796 CAGCAAGCAAATCAACCCGAA 674

ARHGAP17 55114 CAGACCAGCGATGTGAATAAA 675

IPO11 51194 ATGGGTCGAGTTCTACTACAA 676

GYPA 2993 ACCGGACATGCAGGTGAATAT 677

TDGF3 6998 CTGCCCGTTTACATATAACAA 678

INTS4 92105 CAGATACGTCTCATGGTGTAA 679

KRBA1 84626 CCGACAAACCGTGGCCTACAA 680

KIAA1853 84530 CGCCAGTATCACGGCCCGCAA 681

TNNT2 7139 CAGGTCGTTCATGCCCAACTT 682

PLOD1 5351 CACCATCAACATCGCCCTGAA 683

SERPINA4 5267 TCGCCACATCCTGCGATTCAA 684

TABLE V - continued

PTTG3 26255 AGGCATCCTTGTGGCTACAAA 933

MBD6 114785 TTCCACTGTAGTGATGCCTTA 934

OR13C3 138803 ATGGGTGAGATTAACCAGACA 935

OR2T27 403239 CACGGACACATCAGCCTACGA 936

TNNC1 7134 CGCCAGCATGGATGACATCTA 937

TCP11L2 255394 CAAGCTAATCTTATAGGTCAA 938

APOBEC4 403314 TACCATATTCGAACAGGTGAA 939

CPN1 1369 CCGGTGGATGCACTCCTTCAA 940

FRAP1 2475 CCGGAGTGTTAGAATATGCCA 941

PTBP1 5725 CACGCACATTCCGTTGCCTTA 942

LGR5 8549 CAGCAGTATGGACGACCTTCA 943

ZNF567 163081 TACCACTTCCGTAGCCTATAA 944

CHMP4C 92421 TGGCAGCTTGGGCTACCTAAA 945

NOL9 79707 ATCCGGGTTCATCCTACATTT 946

KIAA0831 22863 CTCGGTGACCTCCTGGTTTAA 947

STRN 6801 CTGGAATACCACTAATCCCAA 948

ZNF576 79177 CGGGCTGGTGCGACTATACTA 949

RPLP0 6175 CAAGAACACCATGATGCGCAA 950

CMTM3 123920 CTCCATCACGGCCATCGCCAA 951

ARHGEF1 9138 CACCGATCACAAAGCCTTCTA 952

FOXD4 2298 CAGCGGCATCTGCGCCTTCAT 953

P76 196463 GTGGATGATCGTGGACTACAA 954

FTH1 2495 CGCCATCAACCGCCAGATCAA 955

C12orf53 196500 CACAATTACCATCTCCATCAT 956

RPS11 6205 CCGAGACTATCTGCACTACAT 957

RHBDF2 79651 CACGGCTATTTCCATGAGGAA 958

ALOX15B 247 TTGGACCTTATGGTCACCCAA 959

UNC13D 201294 CTGGTGTACTGCAGCCTTATA 960

PLEKHB1 58473 CAGACCGTGGTGGGCCTTCAA 961

PCNXL2 80003 CCGAAGGATCCTCATCCGCTA 962

DGCR5 26220 TACGTTCTAGCATCCATTCAA 963

FARSA 2193 CCGCTTCAAGCCAGCCTACAA 964

AGPAT1 10554 ACGCAACGTCGAGAACATGAA 965

C19orf63 284361 CAAGACGGTCCTGATGTACAA 966

C18orf51 125704 AGCGCAGCGCGTAAACAACAA 967

TMEM31 203562 CACGTAGGACACCTACAACAT 968

TABLE V - continued

TMEM54 113452 CCACTAGGACCCTGCAAGCAA 969

PML 5371 CAGGAGCAGGATAGTGCCTTT 970

GABRD 2563 CACCTTCATCGTGAACGCCAA 971

UNQ9391 203074 CACCTCGTTGGTGAACTACAA 972

ITGA9 3680 ACAGGTCACTGTCTACATCAA 973

PDZD8 118987 ACCGATCTCGTAGAACCTTCA 974

GPX4 2879 GTGGATGAAGATCCAACCCAA 975

GPBAR1 151306 CAGGACCAAGATGACGCCCAA 976

NME2 4831 TACATTGACCTGAAAGACCGA 977

ZFP106 64397 AGGCGACATAGTGCACAATTA 978

TCAM1 146771 CACGCTCGCCTGCGTCCCAAA 979

LOC374443 374443 CCCATCGCATTTGGAAATGGA 980

HRG 3273 TTGGACTTGGAAAGCCCGAAA 981

TMEM166 84141 ATGGAGGTGATTCTGATTCAA 982

RICH2 9912 CAAACGCTAATAGAAGTGCAA 983

LAMC3 10319 ATCGCGTATCTCACTGGAGAA 984

APOC1 341 CAGCCGCATCAAACAGAGTGA 985

OR2G3 81469 AGCACTCATCTCCATCTCCTA 986

PLCXD1 55344 CACGATGACGTACTGCCTGAA 987

FAM83H 286077 CAGGTGCTCCATAATGAGTCA 988

TREML2 79865 CCGCTACTTGCTGCAGGACGA 989

PATZ1 23598 CCCGTCTGGCTGCTACACATA 990

BANF1 8815 CCGGAAAGGAGCGCCTACTAA 991

KLHL30 377007 CTGGCATAACAGGGACAGGAA 992

CA11 770 CCGGCTCGGAACATCAGATCA 993

ECE2 9718 CAGACACTATGCCCAAGCCTA 994

TMEM87A 25963 AGCGCTGATTGTTACAATGAA 995

PMS2 5395 TGGATGTTGAAGGTAACTTAA 996

TDRD3 81550 AAGCATCGAGGCAAGCTCTTA 997

SHC1 6464 CACCTGACCATCAGTACTATA 998

DNMT3B 1789 CTCACGGTTCCTGGAGTGTAA 999

ITCH 83737 CACGGGCGAGTTTACTATGTA 1000

MAT1A 4143 TTGGCTCACACTCGACATGAA 1001

RALA 5898 CGAGCTAATGTTGACAAGGTA 1002

DEF6 50619 CTGGACGCTGACGGCCAAGAA 1003

TABLE V - continued

C19orf23 148046 CACGACGTGGCAGACGAGGAA 1040

TRPM2 7226 CAGGCCTATGTCTGTGAGGAA 1041

Example 4

NHEJ GFP Reporter Gene Based Model in HEK293 Cell Line

[0285] In order to validate the siRNA hits issued from the primary high-throughput screening using the detection of a luciferase signal, it was also useful to derive a new construct based on a different reporter gene, allowing the establishment of a correlation between the efficiency of the NHEJ activity induced by a meganuclease and the effect of the siRNA hits. After its functional validation in a transient transfection assay in the 293H cell line, such a plasmid may be further used to establish a cellular model with a single copy of the substrate for NHEJ recombination at the RAG1 locus, to measure at a chromosomal location the frequency of SC GS-induced mutagenesis and to validate novel effectors increasing NHEJ efficiency.

a) Material and Methods

Design and Construction of Vector Monitoring GFP Meganuclease-Induced NHEJ Mutagenesis

[0286] The plasmids pCLS6810 (SEQ ID NO: 5) and pCLS6663 (SEQ ID NO: 6) were designed to quantify NHEJ repair frequency induced by the SC GS or I-SceI meganucleases respectively. These plasmids, depicted in FIG. 6, are derived from the hsRAG1 Integration Matrix CMV Neo used in the cGPS® Custom Human Full Kit DD of Cellectis Bioresearch. pCLS6810 (SEQ ID NO: 5) and pCLS6663 (SEQ ID NO: 6) contain all the characteristics needed to obtain by homologous recombination a highly efficient insertion event of a transgene DNA sequence of interest at the RAG1 natural endogenous locus. They are composed of two homology arms of 1.8 kb and 1.2 kb separated by i) an expression cassette of the neomycin resistance gene driven by the mammalian CMV promoter and ii) an expression cassette for the substrate of recombination monitoring NHEJ of the GFP reporter gene, also driven by the CMV promoter. As for the vectors pCLS6883 (SEQ ID NO: 1) and pCLS6884 (SEQ ID NO: 2) described in FIG. 2, the sequence used to measure meganuclease-induced mutagenesis is made of an ATG start codon followed by i) 2 codons for alanine, ii) the HA tag sequence, iii) GS or I-SceI recognition sites, iv) a glycine serine stretch, v) the same 2 codons for alanine as in i) and finally vi) a GFP reporter gene lacking its ATG start codon. Since by itself the GFP reporter gene is inactive due to a frame-shift introduced by the GS or I-SceI recognition sites, creation of a DNA double-strand break (DSB) by the SC GS or I-SceI meganuclease (SEQ ID NO: 4 and SEQ ID NO: 40 respectively) followed by a mutagenic DSB repair event by NHEJ can lead to restoration of GFP gene expression in frame with the ATG start codon.

Cell Culture

[0287] Cell line 293H was cultured at 37° C. with 5% CO2 in Dulbecco's modified Eagle's medium (DMEM) Glutamax supplemented with 10% fetal calf serum, 2 mM L-glutamine, 100 UI/ml penicillin, 100 µg/ml streptomycin and 0.25 µg/ml amphotericin B (Fungizone).

Cellular Transient Transfection for Functional Validation of NHEJ GFP Reporter Plasmid

[0288] One day prior to transfection, the 293H cell line was seeded in 96-well plates at a density of 15000 cells per well in 100 µl. The next day, cells were transfected with Polyfect transfection reagent (Qiagen). Briefly, a quantity of total DNA of 200 ng or 250 ng was diluted in 30 µl of RNAse-free water. Separately, 1.33 µl of Polyfect was resuspended in 20 µl of DMEM without serum. Then the DNA was added to the Polyfect mix and incubated for 20 min at room temperature. After the incubation period, the total transfection mix (50 µl) was added over the plated cells. After 96 h of incubation at 37° C., cells were trypsinized and the percentage of EGFP positive cells was monitored by flow cytometry analysis (Guava Instrument) and corrected by the transfection efficiency.

Stable Transfection to Generate 293H Based Cellular Model Measuring Efficiency of Chromosomal Meganuclease-Induced Mutagenic NHEJ Repair

[0289] One day prior to transfection, 293H cells were seeded in 10 cm tissue culture dishes (10^6 cells per dish) in complete medium. The next day, 3 µg of the SC RAG encoding vector pCLS2222 (SEQ ID NO: 36) and 2 µg of the plasmid measuring SC GS-induced GFP mutagenic NHEJ repair (pCLS6810, SEQ ID NO: 5) were cotransfected using 25 µl of Lipofectamine 2000 reagent (Invitrogen) for 6 hours according to the instructions of the manufacturer. Three days following transfection, 2000 cells were seeded and G418 selection was added at 400 µg/ml one week after seeding. Neomycin-resistant clones were transferred to 96-well plates using ClonePix (Genetix) and cultured in the presence of 400 µg/ml of G418 and 50 µM of ganciclovir (Sigma). Genomic DNA of neomycin- and ganciclovir-resistant clones was extracted, and targeted integration of a single copy of the transgene at the RAG1 locus was identified by specific PCR amplification (cGPS® Custom Human Full Kit DD, Cellectis Bioresearch).

b) Results

A) Extrachromosomal Validation of the NHEJ GFP Reporter Vector

[0290] In order to test the ability of the vector pCLS6810 (SEQ ID NO: 5) to efficiently achieve NHEJ mutagenesis of the GFP reporter gene induced by the SC GS expression plasmid, transient transfections in 96-well plate format were set up. FIGS. 7A and B present the functional assays corresponding to cotransfections of 100 ng of pCLS6810 (SEQ ID NO: 5) with 150 ng of the SC GS expression vector pCLS2690 (SEQ ID NO: 3) or the pCLS0002 (SEQ ID NO: 41) control plasmid. As presented in FIGS. 7A and B, there is a measurable increase of the percentage of EGFP positive cells with the pCLS2690 (SEQ ID NO: 3) expression plasmid in comparison with the transfection performed with the vector control pCLS0002 (SEQ ID NO: 41): the percentage of EGFP positive cells is 13.3% vs 6.2%, a fold increase ratio of 2.1. These data imply that pCLS6810 can be used to further establish a cellular model allowing testing of the potential effect of different siRNA hits issued from the high-throughput luciferase primary screening on the modulation of the efficiency of the NHEJ repair mechanism induced by a custom meganuclease.

B) Functional Validation of the siRNA Hits on the NHEJ GFP Reporter Gene Based HEK293 Cell Line

[0291] The high-throughput screening of the siRNA human genome-wide library has allowed the identification of several hundred potential hits (cf Table IV) able to increase SC GS-induced mutagenic NHEJ repair of a luciferase reporter gene. To correlate such an effect with an improvement of the frequency of the NHEJ activity, siRNAs were tested in a new cellular model described in this example with the readout of a different reporter gene, EGFP.

Material and Methods:

a) Culture Conditions of the NHEJ GFP Reporter Gene Based HEK293 Cell Model

[0292] Same protocol as for the culture of the 293H cell line except that the complete culture medium, DMEM Glutamax medium with penicillin (100 UI/ml), streptomycin (100 µg/ml), amphotericin B (Fungizone) (0.25 µg/ml) and 10% FBS, is supplemented with 0.25 mg/ml of G418 sulfate (Invitrogen-Life Science).

b) Making of Trex2/SC GS Fusion Protein

[0293] The Trex2 protein was fused to the N-terminus of the SC GS meganuclease using a ten amino acid glycine-serine stretch (GGGGS)2 (SEQ ID NO: 1042) as linker. Both SC GS and Trex2 were initially cloned into the AscI/XhoI restriction sites of pCLS1853 (FIG. 13, SEQ ID NO: 1043), a derivative of pcDNA3.1 (Invitrogen), which drives the expression of a gene of interest under the control of the CMV promoter. The fusion protein construct was obtained by amplifying the two ORFs separately, each with a specific primer and the primer CMVfor (5'-CGCAAATGGGCGGTAGGCGT-3'; SEQ ID NO: 1044) or V5reverse (5'-CGTAGAATCGAGACCGAGGAGAGG-3'; SEQ ID NO: 1045), which are located on the plasmid backbone. Then, after gel purification of the two PCR fragments, a PCR assembly was performed using the CMVfor/V5reverse oligonucleotides. The final PCR product was then digested by AscI and XhoI and ligated into pCLS1853 digested with these same enzymes to generate the pCLS8054 (FIG. 14, SEQ ID NO: 1046) expression vector encoding the fused protein Trex2/SC GS (SEQ ID NO: 1049). The following Table VI gives the oligonucleotides that were used to create the construct.

TABLE VI Oligonucleotides used to create the Trex2/SC GS construct

Construct, Amplified ORF, Forward primer, SEQ ID NO, Reverse primer, SEQ ID NO

Trex2/SC GS Trex2 CMVfor 1044 Link10TrexRev 1048

Trex2/SC GS SC GS Link10GSFor 1047 V5reverse 1045

c) Cellular Transfection in 96 Well Format for Functional Validation of the siRNA Hits

[0294] Same protocol of cotransfection with Polyfect as described in Example 3, with 200 ng of pCLS2690 (SEQ ID NO: 3) or pCLS8054 (SEQ ID NO: 1046) plasmid DNA and siRNA at a final concentration of 33 nM. After 96 h of incubation at 37° C., cells were trypsinized and the percentage of EGFP positive cells was monitored by flow cytometry analysis (Guava Instrument) and corrected by the transfection efficiency.

Results:

[0295] The new cell line containing a single copy of the GFP reporter system integrated at the RAG1 locus was first validated by comparing the frequency of EGFP positive cells obtained after transfection with the empty vector pCLS0002 (SEQ ID NO: 41) to the one obtained with the SC GS encoding vector pCLS2690 (SEQ ID NO: 3). Typically, transfection with pCLS0002 (SEQ ID NO: 41) gave no EGFP positive cells, as for untreated cells, whereas transfections with the SC GS encoding vector (SEQ ID NO: 3) with no siRNA or with siRNA control AS led to detection of 0.5% +/- 0.1 of EGFP positive cells (data not shown). This result implies that, in comparison with the high-throughput cellular model monitoring the effect of the siRNA hits using the detection of a luciferase signal, this new NHEJ GFP cell line is useful to establish a correlation between a percentage of GFP+ cells and a frequency of the NHEJ mutagenesis induced by SC GS in the presence of different siRNAs.

[0296] In this example, the effect of 223 different siRNAs (220 siRNAs identified with the high-throughput screening (cf Example 3) and three siRNAs issued from the results of the extrachromosomal screening (cf Example 3) and targeting the genes FANCD2 (SEQ ID NO: 39), AKT2 (SEQ ID NO: 15) and LIG4 (SEQ ID NO: 24)) was monitored using the same siRNAs as those used during the primary screening. They were chosen based on the high luciferase signal stimulation obtained. Cotransfections with the SC GS encoding vector (SEQ ID NO: 3) were performed in 96-well format at least in triplicate, and the potential effect of siRNA hits was assessed using Student's t-test to eliminate siRNAs that do not have a robust effect. The ratio of the percentage of EGFP positive cells between a siRNA hit and siRNA control AS gives the stimulation factor of each siRNA.

[0297] In parallel, using the same functional assay and the statistical analysis method described previously, functional validation of the 223 siRNAs was also performed in the context of a cotransfection with the expression vector pCLS8054 (FIG. 14, SEQ ID NO: 1046) encoding the Trex2/SC GS (SEQ ID NO: 1049) protein, which consists of an N-terminal fusion between the meganuclease SC GS (SEQ ID NO: 4) and a 236 amino acid functional version (SEQ ID NO: 1050) of the exonuclease Trex2 (SEQ ID NO: 1051). The human Trex2 protein (SEQ ID NO: 1051) was chosen since it is known to exhibit a 3' to 5' non-processive exonuclease activity (Mazur and Perrino, 2001) that might be compatible with the degradation of the 3' DNA overhangs generated by the meganuclease GS and with an improvement of its NHEJ mutagenesis in the presence or absence of siRNAs. In comparison with the transfection of the NHEJ GFP reporter cell line with the SC GS expression vector pCLS2690 (SEQ ID NO: 3), the percentage of EGFP+ cells induced by the fused meganuclease Trex2/SC GS encoded by pCLS8054 (SEQ ID NO: 1046) was typically enhanced from 0.5% +/- 0.1 to 1.8% +/- 0.7 (data not shown), demonstrating the increased efficiency (3.6 fold induction) of the fusion protein Trex2/SC GS in obtaining mutagenic repair of the reporter gene.

[0298] As indicated in Table VII below, among the 223 hits tested, 115 siRNAs are able to increase the percentage of EGFP positive cells induced by the SC GS (SEQ ID NO: 4) or Trex2/SC GS (SEQ ID NO: 1046) expression vectors with a stimulation factor of at least 2. A group of 15 siRNAs corresponding to Class I have an effect detected specifically in the context of a transfection with the SC GS meganuclease, whereas another group of 63 siRNAs corresponding to Class II have an activity detected only in the presence of the Trex2/SC GS fused meganuclease. Finally, Class III is a group of 37 siRNAs that increase the percentage of GFP+ cells in the presence of either the SC GS or the Trex2/SC GS meganuclease.

[0299] Altogether, these data confirm the pertinence of the potential hits identified with the cellular model based on detection of the luciferase signal, confirming the robustness of the methodology applied to determine the cellular genes able to increase the efficiency of double-strand break-induced mutagenesis by a meganuclease.
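The validation logic described above (a stimulation factor computed against the control siRNA, a Student test on replicate wells, and assignment to Class I, II or III) can be sketched as follows. This is only an illustrative sketch: the function names, the replicate readings and the use of a pooled-variance t statistic with a tabulated critical value are assumptions, since the text does not specify the exact test settings.

```python
from math import sqrt
from statistics import mean, variance


def stimulation_factor(hit_pct, control_pct):
    """Ratio of mean % EGFP+ cells: siRNA hit over control siRNA."""
    return mean(hit_pct) / mean(control_pct)


def student_t(a, b):
    """Two-sample Student t statistic with pooled variance (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def hit_class(effect_sc_gs, effect_trex2_sc_gs):
    """Class I: SC GS only; Class II: Trex2/SC GS only; Class III: both."""
    if effect_sc_gs and effect_trex2_sc_gs:
        return "III"
    if effect_sc_gs:
        return "I"
    if effect_trex2_sc_gs:
        return "II"
    return None


# Hypothetical triplicate readings (% EGFP+ cells), not data from the study.
control = [0.4, 0.5, 0.6]
hit = [1.0, 1.1, 1.2]

factor = stimulation_factor(hit, control)
t_value = student_t(hit, control)
T_CRITICAL = 2.776  # two-sided 5% critical value for 4 degrees of freedom
is_validated = factor >= 2 and abs(t_value) > T_CRITICAL

# Figures quoted in the text, reproduced as plain arithmetic.
fold_extrachromosomal = round(13.3 / 6.2, 1)  # 13.3% vs 6.2% EGFP+ cells
fold_trex2_fusion = round(1.8 / 0.5, 1)       # 1.8% vs 0.5% EGFP+ cells
class_total = 15 + 63 + 37                    # Class I + Class II + Class III
```

The two rounded ratios match the 2.1 and 3.6 fold increases reported above, and the three class sizes sum to the 115 validated siRNAs.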

TABLE VII
Validation of siRNA hits stimulating SC GS- or Trex2/SC GS-induced EGFP activity with at least a 2-fold increase

Gene targeted  Gene ID  siRNA target sequence  SEQ ID NO  Effect with SC GS  Effect with Trex2/SC GS  Class of hits
AFG3L1P 172 CGGCTGGAAGTCGTGAACAAA 140 (-) (+)
AKT2 208 CAAGCGTGGTGAATACATCAA 15 (-) (+)
BRCA1 672 CTGCAGATAGTTCTACCAGTA 45 (+) (+) I
C16orf3 750 CTGGGACAACGCAGTGTTCAA 268 (+) (+) I
CAMK2G 818 GAGGAAGAGATCTATACCCTA 16 (+) (+) I
CAV3 859 TTGCGTTCACTTGTACTGTAA 126 (-) (+)
CSH1 1442 ACGGGCTGCTCTACTGCTTCA 177 (+) (+) I
CYP2A7 1549 CCCAAGCTAGGTGGCATTCAT 282 (-) (+)
DDX3X 1654 AACGAGAGAGTTGGCAGTACA 184 (-) (+)
DPP4 1803 ATCGGGAAGTGGCGTGTTCAA 163 (+) (+) I
DUSP1 1843 CACGAACAGTGCGCTGAGCTA 106 (+) (+) I
EEF1A1 1915 CAGAATAGGAACAAGGTTCTA 256 (-) (+)
FANCE 2178 TCGAATCTGGATGATGCTAAA 285 (-) (+)
GNG4 2786 CCGAAGTCAACTTGACTGTAA 185 (-) (+)
SFN 2810 CCGGGAGAAGGTGGAGACTGA 162 (+) (-)
GRIN2C 2905 CCCAGCTTTCACTATCGGCAA 167 (+) (-)
GTF2I 2969 TAGGTGGTCGTGTGATGGTAA 141 (-) (+) I
HIST1H2AE 3012 ATCCCGAGTCCCAGAAACCAA 203 (+) (+) II
HMX2 3167 CGGGCGCGTACTGTACTGTAA 83 (-) (+) I
HES1 3280 CACGACACCGGATAAACCAAA 232 (-) (+) I
IK 3550 CAGGCGCTTCAAGGAAACCAA 236 (-) (+) I
KCNC3 3748 CAGCGGCAAGATCGTGATCAA 180 (-) (+) I
LRP5 4041 CTGGACGGACTCAGAGACCAA 399 (-) (+) I
LTBR 4055 TACATCTACAATGGACCAGTA 328 (+) (+)


TABLE VII-continued
Validation of siRNA hits stimulating SC GS- or Trex2/SC GS-induced EGFP activity with at least a 2-fold increase

Gene targeted  Gene ID  siRNA target sequence  SEQ ID NO  Effect with SC GS  Effect with Trex2/SC GS  Class of hits
UBE2D4 51619 CCGAATGACAGTCCTTACCAA 237 (+) I
CLIC6 54102 CCGAATCTAATTCCGCAGGAA
PAF1 54623 CTCCACTGAGTTCAACCGTTA 98
BANP 54971 CAGCGACATCCAGGTTCAGTA 258
SETD5 55209 AACGCGCTTGAACAACACCTA 221
OGFOD1 55239 TCGGACGCTGTTACGGAAGAA 196
LARP6 55323 ATGGTGTCTTGTAGGACCAAA 151
IL 55540 CCGCTTGTTGAAGGCCACCAA 223
TMEM130 55769 TCCGTCAACAGTAGTTCCTTA 103
ST6GALNAC1 CCCACGACGCAGAGAAACCAA 225
C12orf62 56245 CAACCTGATGTGCAACTGTAA 195
SEMA3G 56920 CCCTGCCCTATTGAAACTCAA 267
INTS12 57117 CAGGACCTAGTGGAAGTACTA 97
sff.2 CAAGCCTGAAACAGACGACAA
SLC25A19 60386 CTCCCTGTGATCAGTTACCAA 299
PROK2 60675 TCGCTCTGGAGTAGAAACCAA 15
CHP2 63928 CAGGGCGACAATAAACTGTAT 38
PRDM14 63978 ACCGGCCTCACAAGTGTTCTA 340
CARD9 64170 CAGCGACAACACCGACACTGA 49
FAM59A 64762 AAGGGCAGATTTAGCACCCGA 89
BCL11B 64919 CAGAGGTGGGTTAAACTGTAA 93
KCTD15 AACCTTGGAGATTCACGGCAA 87
SECISBP2 79048 TCCCAGTATCTTTATAACCAA 265
SAP30L 79685 CAAGAGCGTAAGGCACCTATA 86
EPHX3 79852 CAGCTCAGTGCTACTCTGAAT 243
PANK2 80025 CTGTGTGTGAACTTACTGTAA 331
AT TAGGACGACTGAAATAACCAA 239 II
RCBC 84230 TACCTTATACTGGCTGTTCTA 181 II
CGB 94027 ACCAAGGATGGAGATGTTCCA 244 II
NUP35 129401 CAGGACTTGGATCAACACCTT 135
LI 142910 AGGGTTGTTGTATACTTGCAA 198

TABLE VII-continued
Validation of siRNA hits stimulating SC GS- or Trex2/SC GS-induced EGFP activity with at least a 2-fold increase

Gene targeted  Gene ID  siRNA target sequence  SEQ ID NO  Effect with SC GS  Effect with Trex2/SC GS  Class of hits
CSAG2 152667 CTCCTTTATCTTCCAAACCAA 252 (+) I
NKX2-3 159296 CAGGTACAAGTGCAAGAGACA 146 (+)
FAM179A 165186 CAACGTCTCTATAGAGACCAA 188 (+)
KDM1B 221656 ATCGATGCGGTATGAAACCAA 174 (+)
TMEM130 222865 CCCGCTGGTGCTTACTGGCAA 102 (+)
GK5 256356 TACCATCTTGTACGAGCAATA 416 (+)
KLHL34 257240 CTCGGCAGTCGTGGAAACCAA 121 (+) I
C10orf53 282966 CTGGAATGTGGTGGAACTCAT 254 (+) I
FAM100B 283991 CACGTTCTTCCAAGAAACCAA 249 (+) I
CXorf59 286464 CTGTGAGTTCCTGTACACCTA 110
KRTAP13-3 337960 CAGGACTCACATGCTCTGCAA 143
ADAMTSL5 339366 ATGCCTAACCAGGCACTGTAA 118
OTOP3 347741 TTGCCAGTACTTCACCCTCTA 257
PEAR1 375033 CTGCACGCTGCTCATGTGAAA 215
TMEM179 388021 CGGGCCGGCCATGGCGCTCAA 168
C2orf32 389084 CACAGACGATGTTCCACAGGA 233
C5orf46 389336 ACCAGACAAGCCAGACGACAA 250
SAMD5 389432 CTGCTCATAGGAGTTCAGTAA 114
TRIM61 391712 TAGGGTATGTATATGTTCCTA 139
RAM10 401123 CCCGTTAGTGCTACACTCATT 247
XKRX 402415 CACCCATAATGTAGTAGACTA 204
MTHFD2L 441024 CAGCGGTATATTAGTTCAGTT 89
SPRN 503542 CAGGAACATTCCCAAGCAGGA 96
CSAG2 728461 CCAGCCGAACGAGGAACTCAA 251
SNORD114-17 767595 ATGAATGATATGTGTCTGAAA 104

(+) indicates detection of at least a 2-fold increase of the percentage of GFP+ cells
(-) indicates absence of detection of at least a 2-fold increase of the percentage of GFP+ cells
siRNAs Class I: effect detected with meganuclease SC GS
siRNAs Class II: effect detected with meganuclease Trex2/SC GS
siRNAs Class III: effect detected with meganucleases SC GS and Trex2/SC GS

C) Effect of the siRNAs on the NHEJ Repair Mutagenesis Induced by the SC GS and Trex2/SC GS Meganucleases

[0300] In order to correlate the increase of the EGFP+ cells induced by SC GS or Trex2/SC GS in the presence of the siRNA hits identified previously (cf. Table VII) with an increase of the frequency of the NHEJ repair activity of the reporter gene, deep sequencing analysis was performed to quantify the frequency of mutagenesis occurring at the site of the meganuclease after its cleavage.

Material and Methods:

Transfection in the Cellular Model NHEJ EGFP Monitoring Meganuclease-Induced Mutagenesis

[0301] One million cells of the NHEJ GFP model were seeded one day prior to transfection. Cells were cotransfected with 3 µg of plasmid encoding either SC GS (pCLS2690, SEQ ID NO: 3) or Trex2/SC GS (pCLS8054, SEQ ID NO: 1046), made up to 5 µg of total DNA with the empty vector pCLS0003 (SEQ ID NO: 1052), in the presence or absence of siRNAs at final concentrations of 5 nM, 10 nM or 20 nM depending on the siRNA used, and with 25 µl of lipofectamine (Invitrogen) according to the manufacturer's instructions.

[0302] Three to four days following transfection, cells were harvested for flow cytometry analysis using Guava instrumentation and for genomic DNA extraction. Locus-specific PCR around the GS target site was performed using the following primers: 5'-CCATCTCATCCCTGCGTGTCTCCGACTCAG (forward adaptor sequence)-10N (sequences needed for PCR product identification)-GCTCTCTGGCTAACTAGAGAACCC (transgenic locus specific forward sequence)-3' (SEQ ID NO: 1053) and 5'-CCTATCCCCTGTGTGCCTTGGCAGTCTCAG (reverse adaptor sequence)-TCGATCAGCACGGGCACGATGCC (transgenic locus specific reverse sequence) (SEQ ID NO: 1054). PCR products were sequenced by a 454 sequencing system (454 Life Sciences). Approximately 10,000 sequences were obtained per PCR product and then analyzed for the presence of site-specific insertion or deletion events.

Results

[0303] This example is focused on testing whether siRNA hits known to stimulate the percentage of EGFP+ cells induced by SC GS or Trex2/SC GS are also able to increase the frequency of the NHEJ mutagenic repair of the reporter gene. For that purpose, the cell line described in this example was co-transfected either with the plasmid pCLS2690 expressing SC GS (SEQ ID NO: 3) together with the siRNA control AS or the siRNAs targeting the genes CAP1 (SEQ ID NO: 367), TALDO1 (SEQ ID NO: 111) and DUSP1 (SEQ ID NO: 106), or with the expression vector pCLS8054 encoding Trex2/SC GS (SEQ ID NO: 1046) together with the siRNA control AS or the siRNAs targeting the genes TALDO1 (SEQ ID NO: 111), DUSP1 (SEQ ID NO: 106) and PTPN22 (SEQ ID NO: 283). The percentage of GFP+ cells was determined by flow cytometry 4 days post transfection and the frequency of mutagenesis was determined by deep sequencing analysis.

[0304] As shown in Table VIII, the percentages of 0.96% and 8.86% of GFP+ cells induced by SC GS or Trex2/SC GS, respectively, in the presence of the siRNA control AS were increased with the different siRNAs tested. In the case of the transfection with SC GS, the percentage of GFP+ cells was stimulated to 1.47%, 1.85% and 1.45% with the siRNAs targeting respectively the genes CAP1 (SEQ ID NO: 367), TALDO1 (SEQ ID NO: 111) and DUSP1 (SEQ ID NO: 106), corresponding to stimulation factors of the GFP+ cells of 1.53, 1.93 and 1.51. Likewise, compared with the co-transfection with Trex2/SC GS and the siRNA control AS, we also observed an increase of the percentage of GFP+ cells to 18.06%, 15.07% and 16.14% with the siRNAs targeting respectively the genes TALDO1 (SEQ ID NO: 111), DUSP1 (SEQ ID NO: 106) and PTPN22 (SEQ ID NO: 283), leading to stimulation factors of the GFP+ cells of 2.04, 1.70 and 1.82.

[0305] This phenotypic stimulation of GFP+ cells was also confirmed at a molecular level (cf. Table VIII). SC GS (SEQ ID NO: 4) led to 4.7% of targeted mutagenesis, whereas co-transfection of the SC GS expressing plasmid with the siRNAs CAP1 (SEQ ID NO: 367), TALDO1 (SEQ ID NO: 111) and DUSP1 (SEQ ID NO: 106) stimulated this mutagenic DSB repair to 5.9%, 8.8% and 6.2% respectively. A similar result was obtained with the transfection with Trex2/SC GS (SEQ ID NO: 1049). In this case, the frequency of mutagenesis of 19.3% with the siRNA control AS was increased to 32.6%, 30% and 37% with the siRNAs TALDO1 (SEQ ID NO: 111), DUSP1 (SEQ ID NO: 106) and PTPN22 (SEQ ID NO: 283), respectively.

[0306] Altogether, these data and the result presented in FIG. 15 demonstrate that after transfection of this NHEJ GFP cell line with SC GS or Trex2/SC GS meganuclease expressing plasmids and siRNA hits, the percentage of GFP-positive cells is increased and directly correlated with the mutagenic NHEJ repair frequency at the meganuclease targeted site, implying that siRNA hits may be useful to improve targeted mutagenesis at different chromosomal loci cleaved by distinct custom meganucleases.

TABLE VIII
Deep sequencing analysis of the effect of siRNA hits on NHEJ repair mutagenesis induced by the SC GS and Trex2/SC GS meganucleases.

Meganuclease used       siRNA tested  SEQ ID NO  % GFP+ cells  GFP+ stimulation factor  % NHEJ mutagenesis  NHEJ stimulation factor
Ctrl (pCLS0002)                                  0.01          0.01                     0.00                0.00
SC GS (pCLS2690)        Ctrl AS                  0.96          1.00                     4.70                1.00
                        CAP1          367        1.47          1.53                     5.86                1.25
                        TALDO1        111        1.85          1.93                     8.77                1.87
                        DUSP1         106        1.45          1.51                     6.19                1.32
Trex2/SC GS (pCLS8054)  Ctrl AS                  8.86          1.00                     19.26               1.00
                        TALDO1        111        18.06         2.04                     32.60               1.69
                        DUSP1         106        15.07         1.70                     30.00               1.56
                        PTPN22        283        16.14         1.82                     37.01               1.92

Example 5

Stimulation of Meganuclease-Induced Mutagenesis at an Endogenous Locus Using siRNAs Targeting Specific Genes

[0307] In order to verify that a defined siRNA could stimulate mutagenic DSB repair at an endogenous locus, siRNAs targeting genes involved in DSB repair or siRNAs identified during the screenings with the two cellular models were co-transfected in 293H cells with a plasmid encoding the meganuclease SC RAG (SEQ ID NO: 11 encoded by pCLS2222, SEQ ID NO: 36) or SCTrex2/SC RAG (SEQ ID NO: 1056 encoded by pCLS9573, FIG. 16 and SEQ ID NO: 1055), the latter being the meganuclease SC RAG fused at its N-terminus to a single chain version of the Trex2 exonuclease. Mutagenic DSB repair was monitored at the molecular level by deep sequencing.
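As a rough illustration of how deep sequencing reads can be turned into a mutagenesis frequency, the Python sketch below trims each read to the region between the two locus-specific primers and counts the reads whose length differs from the wild-type amplicon. It rests on loud assumptions: the helper names (`revcomp`, `locus_insert`, `indel_frequency`) and the toy sequences are hypothetical and are not the primers of SEQ ID NO: 1053 to 1058, and a length comparison is a simplification, since a real analysis would align reads and require the insertion or deletion to lie at the cleavage site.

```python
from typing import Iterable, Optional

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def locus_insert(read: str, fwd_primer: str, rev_primer: str) -> Optional[str]:
    """Sequence between the forward primer and the reverse primer site, or None."""
    start = read.find(fwd_primer)
    if start == -1:
        return None
    begin = start + len(fwd_primer)
    end = read.find(revcomp(rev_primer), begin)
    if end == -1:
        return None
    return read[begin:end]


def indel_frequency(reads: Iterable[str], fwd_primer: str, rev_primer: str,
                    wild_type_insert: str) -> float:
    """Fraction of usable reads whose insert length differs from wild type."""
    inserts = [locus_insert(r, fwd_primer, rev_primer) for r in reads]
    usable = [s for s in inserts if s is not None]
    if not usable:
        raise ValueError("no read contains both primer sites")
    mutated = sum(1 for s in usable if len(s) != len(wild_type_insert))
    return mutated / len(usable)


# Toy data (hypothetical sequences, for illustration only).
FWD, REV, WT = "ACGTAC", "CCATGA", "TTTTGGGGCCCC"
toy_reads = [
    FWD + WT + revcomp(REV),                # wild type
    FWD + WT + revcomp(REV),                # wild type
    FWD + "TTTTGGCCCC" + revcomp(REV),      # 2 bp deletion
    FWD + "TTTTGGGGAACCCC" + revcomp(REV),  # 2 bp insertion
    "AAAAAA",                               # unusable read, ignored
]
toy_frequency = indel_frequency(toy_reads, FWD, REV, WT)
```

Dividing the frequency obtained with a given siRNA by the frequency obtained with the control siRNA would then give a stimulation factor of the kind reported in the tables.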

Materials and Methods

Cellular Transfection of 293H Cell Line and PCR Analysis of Mutagenic DSB Repair

[0308] The 293H cell line was plated at a density of 1x10 cells per 10 cm dish in complete medium (DMEM supplemented with 2 mM L-glutamine, penicillin (100 IU/ml), streptomycin (100 mg/ml), amphotericin B (Fungizone: 0.25 mg/ml, Invitrogen-Life Science) and 10% FBS). The next day, cells were transfected in the presence of 25 µl of lipofectamine reagent (Invitrogen) according to the manufacturer's protocol. Typically, cells were co-transfected with 2 µg of empty vector pCLS0002 (SEQ ID NO: 41) and 3 µg of meganuclease expression vector pCLS2222 (SEQ ID NO: 36) or pCLS9573 (SEQ ID NO: 1055) in the presence of siRNAs at a final concentration of 1 nM, 7.5 nM, 10 nM or 20 nM depending on the siRNA used. After 48 h to 72 h of incubation at 37° C., cells were harvested for genomic DNA extraction with the Blood and Cell culture DNA midikit (QIAGEN) according to the manufacturer's protocol. PCR amplification reactions were performed using primers designed to obtain a fragment of the RAG1 locus flanked by specific adaptor sequences. The forward primer contains the following sequence: 5'-CCATCTCATCCCTGCGTGTCTCCGACTCAG (forward adaptor sequence)-4-10N (sequences needed for PCR product identification)-GGCAAAGATGAATCAAAGATTCTGTCCT (RAG1 locus specific sequence)-3' (SEQ ID NO: 1057), and the reverse primer contains the following sequence: 5'-CCTATCCCCTGTGTGCCTTGGCAGTCTCAG (reverse adaptor sequence)-GATCTCACCCGGAACAGCTTAAATTTC (RAG1 locus specific sequence)-3' (SEQ ID NO: 1058). 4-10N is a fixed sequence of four to ten nucleotides (depending on the protocol of the manufacturer) included in the primer used to perform the PCR amplification; it makes it possible to assign each sequence obtained from a deep sequencing reaction on a mix of different PCR fragments to its experimental condition. PCR products were sequenced by a 454 sequencing system (454 Life Sciences). Approximately 10,000 exploitable sequences were obtained per PCR pool and then analyzed for the presence of site-specific insertion or deletion events.

[0309] a) Results:

[0310] siRNAs targeting genes known to be involved in the DSB repair mechanism or its regulation were tested to estimate their potential for increasing mutagenic DSB repair by NHEJ. 293H cells were co-transfected with 3 µg of SC RAG meganuclease encoding vector (pCLS2222, SEQ ID NO: 36), 2 µg of empty vector (pCLS0002, SEQ ID NO: 41) and 1 nM of siRNA targeting XRCC6 (SEQ ID NO: 44), BRCA1 (SEQ ID NO: 45), ATR (SEQ ID NO: 46), FANCD2 (SEQ ID NO: 39), WRN (SEQ ID NO: 37), MAPK3 (SEQ ID NO: 38) or AS (a siRNA control with no known human targets). Those genes were chosen because of their implication in classical NHEJ (XRCC6), in NHEJ and other DNA repair pathways (BRCA1, FANCD2, WRN), in a DNA repair pathway (ATR) or in DNA repair regulation (MAPK3). Genomic DNA was extracted 2-3 days after transfection and was used to perform a PCR with primers compatible with 454 sequencing technology. The sequences obtained per PCR were analyzed to determine the frequency and the nature of mutagenic DSB repair (insertion and/or deletion) at the RAG1 locus.

[0311] Mutagenic DSB repair at the RAG1 locus in the presence of siRNA AS appeared in 0.66% +/- 0.13 of the events analyzed. When the siRNA (SEQ ID NO: 46) targeting the ATR gene (Gene ID No. 545) was added, the percentage of NHEJ was in the same range as with siRNA AS: 0.81%. The presence of siRNAs XRCC6 (SEQ ID NO: 44), BRCA1 (SEQ ID NO: 45), FANCD2 (SEQ ID NO: 39), WRN (SEQ ID NO: 37) or MAPK3 (SEQ ID NO: 38) enhanced the percentage of mutagenic NHEJ repair up to 1.13%, 1.88%, 2.06%, 2.15% and 1.6%, corresponding respectively to stimulation factors of 1.7, 2.8, 3.1, 3.2 and 2.4 (Table IX). Moreover, the nature of the deletions was also modified for all those stimulating siRNAs, since they all presented larger deletion events (greater than 100 bp) than the deletions observed with the other siRNAs (the control AS and the siRNA ATR, cf. FIG. 5). Altogether, these results demonstrate that siRNAs targeting genes involved in the DNA repair mechanism or its regulation can be used to increase and modulate the efficiency and the nature of mutagenic NHEJ repair induced by an I-CreI meganuclease with a modified specificity and at a natural locus (cf. Table IX below).

TABLE IX
siRNA stimulating endonuclease-induced mutagenesis at RAG1 locus.

Gene targeted  Gene ID  siRNA target sequence  SEQ ID NO  NHEJ stimulation factor
XRCC6          2547     ACCGAGGGCGATGAAGAAGCA  44         1.7
BRCA1          672      ACCATACAGCTTCATAAATAA  45         2.8
FANCD2         2177     AAGCAGCTCTCTAGCACCGAT  39         3.1
WRN            7486     CGGATTGTATACGTAACTCCA  37         3.2
MAPK3          5595     CCCGTCTAATATATAAATATA  38         2.4

[0312] Moreover, the siRNAs CAP1 (SEQ ID NO: 367), VAV3 (SEQ ID NO: 85), PTPN22 (SEQ ID NO: 283), MTHFD2L (SEQ ID NO: 89), TALDO1 (SEQ ID NO: 111) and DUSP1 (SEQ ID NO: 106), identified with the screenings of the two cellular models and belonging to the three different classes of siRNAs defined in Table VII, were also tested for their capacity to increase the frequency of mutagenic repair at the endogenous RAG locus. As previously described in this example, the 293H cell line was cotransfected with the expression plasmid pCLS2222 (SEQ ID NO: 36) or pCLS9573 (SEQ ID NO: 1055), encoding the meganucleases SC RAG (SEQ ID NO: 11) and SCTrex2/SC RAG (SEQ ID NO: 1056) respectively, in the presence of the siRNA control AS or the different siRNAs tested. The frequency of mutagenesis at the RAG locus was analyzed by deep sequencing to monitor the efficiency of each siRNA in increasing the mutagenic repair induced by each type of meganuclease.

[0313] As shown in Table X below, in agreement with their belonging to the different classes defined in Table VII, the two siRNAs CAP1 (SEQ ID NO: 367) and VAV3 (SEQ ID NO: 85) are able to increase the frequency of mutagenesis of the SC RAG meganuclease with stimulation factors of 1.43 and 1.25 respectively, while the siRNA PTPN22 (SEQ ID NO: 283) enhances the NHEJ mutagenic repair of the SCTrex2/SC RAG meganuclease with a 1.39-fold increase. Moreover, the three siRNAs MTHFD2L (SEQ ID NO: 89), TALDO1 (SEQ ID NO: 111) and DUSP1 (SEQ ID NO: 106), known to have an effect with SC GS or Trex2/SC GS, are also able to increase the targeted mutagenesis induced by the meganucleases SC RAG (SEQ ID NO: 11 encoded by pCLS2222, SEQ ID NO: 36) or SCTrex2/SC RAG (SEQ ID NO: 1056 encoded by pCLS9573, SEQ ID NO: 1055) with stimulation factors of 1.22, 1.23 and 1.43 respectively.

[0314] Altogether, these data imply that siRNAs targeting genes involved in double-strand break repair or other cellular processes can be useful effectors to enhance the efficiency of NHEJ mutagenesis at a natural endogenous locus targeted with distinct custom meganucleases, fused or not to the Trex2 exonuclease.

TABLE X
Effect of siRNA hits on NHEJ repair mutagenesis induced by the SC RAG and SCTrex2/SC RAG meganucleases; ND: not determined.

                                      Stimulation factor of NHEJ mutagenesis
siRNA class  siRNA tested  SEQ ID NO  SC RAG (SEQ ID NO: 11)  SCTrex2/SC RAG (SEQ ID NO: 1056)
             Ctrl AS                  1.00                    1.00
I            CAP1          367        1.43                    ND
I            VAV3          85         1.25                    ND
II           PTPN22        283        ND                      1.39
III          MTHFD2L       89         1.22                    ND
III          TALDO1        111        ND                      1.23
III          DUSP1         106        ND                      1.43

LIST OF CITED REFERENCES

[0315] Arimondo, P. B., C. J. Thomas, et al. (2006). “Exploring the cellular activity of camptothecin-triple-helix-forming oligonucleotide conjugates.” Mol Cell Biol 26(1): 324-33.
[0316] Arnould, S., P. Chames, et al. (2006). “Engineering of large numbers of highly specific homing endonucleases that induce recombination on novel DNA targets.” J Mol Biol 355(3): 443-58.
[0317] Arnould, S., C. Perez, et al. (2007). “Engineered I-CreI derivatives cleaving sequences from the human XPC gene can induce highly efficient gene correction in mammalian cells.” J Mol Biol 371(1): 49-65.
[0318] Ashworth, J., J. J. Havranek, et al. (2006). “Computational redesign of endonuclease DNA binding and cleavage specificity.” Nature 441(7093): 656-9.
[0319] Audebert, M., B. Salles, et al. (2004). “Involvement of poly(ADP-ribose) polymerase-1 and XRCC1/DNA ligase III in an alternative route for DNA double-strand breaks rejoining.” J Biol Chem 279(53): 55117-26.
[0320] Bau, D. T., Y. C. Mau, et al. (2006). “The role of BRCA1 in non-homologous end-joining.” Cancer Lett 240(1): 1-8.
[0321] Baumann, P. and S. C. West (1998). “DNA end-joining catalyzed by human cell-free extracts.” Proc Natl Acad Sci USA 95(24): 14066-70.
[0322] Beumer, K. J., J. K. Trautman, et al. (2008). “Efficient gene targeting in Drosophila by direct embryo injection with zinc-finger nucleases.” Proc Natl Acad Sci USA 105(50): 19821-6.
[0323] Blunt, T., N. J. Finnie, et al. (1995). “Defective DNA-dependent protein kinase activity is linked to V(D)J recombination and DNA repair defects associated with the murine scid mutation.” Cell 80(5): 813-23.
[0324] Boch, J., H. Scholze, et al. (2009). “Breaking the code of DNA binding specificity of TAL-type III effectors.” Science 326(5959): 1509-12.
[0325] Bolduc, J. M., P. C. Spiegel, et al. (2003). “Structural and biochemical analyses of DNA and RNA binding by a bifunctional homing endonuclease and group I intron splicing factor.” Genes Dev 17(23): 2875-88.
[0326] Britt, A. B. (1999). “Molecular genetics of DNA repair in higher plants.” Trends Plant Sci 4(1): 20-25.
[0327] Burden and O. N. (1998). “Mechanism of action of eukaryotic topoisomerase II and drugs targeted to the enzyme.” Biochim Biophys Acta 1400(1-3): 139-154.
[0328] Capecchi, M. R. (1989). “The new mouse genetics: altering the genome by gene targeting.” Trends Genet 5(3): 70-6.
[0329] Cathomen, T. and J. K. Joung (2008). “Zinc-finger nucleases: the next generation emerges.” Mol Ther 16(7): 1200-7.
[0330] Chames, P., J. C. Epinat, et al. (2005). “In vivo selection of engineered homing endonucleases using double-strand break induced homologous recombination.” Nucleic Acids Res 33(20): e178.
[0331] Chevalier, B., M. Turmel, et al. (2003). “Flexible DNA target site recognition by divergent homing endonuclease isoschizomers I-CreI and I-MsoI.” J Mol Biol 329(2): 253-69.
[0332] Chevalier, B. S., T. Kortemme, et al. (2002). “Design, activity, and structure of a highly specific artificial endonuclease.” Mol Cell 10(4): 895-905.
[0333] Chevalier, B. S., R. J. Monnat, Jr., et al. (2001). “The homing endonuclease I-CreI uses three metals, one of which is shared between the two active sites.” Nat Struct Biol 8(4): 312-6.
[0334] Chevalier, B. S. and B. L. Stoddard (2001). “Homing endonucleases: structural and functional insight into the catalysts of intron/intein mobility.” Nucleic Acids Res 29(18): 3757-74.
[0335] Choulika, A., A. Perrin, et al. (1995). “Induction of homologous recombination in mammalian chromosomes by using the I-SceI system of Saccharomyces cerevisiae.” Mol Cell Biol 15(4): 1968-73.
[0336] Christian, M., T. Cermak, et al. (2010). “Targeting DNA double-strand breaks with TAL effector nucleases.” Genetics 186(2): 757-61.
[0337] Cohen-Tannoudji, M., S. Robine, et al. (1998). “I-SceI-induced gene replacement at a natural locus in embryonic stem cells.” Mol Cell Biol 18(3): 1444-8.
[0338] Critchlow, S. E. and S. P. Jackson (1998). “DNA end-joining: from yeast to man.” Trends Biochem Sci 23(10): 394-8.
[0339] Donoho, G., M. Jasin, et al. (1998). “Analysis of gene targeting and intrachromosomal homologous recombination stimulated by genomic double-strand breaks in mouse embryonic stem cells.” Mol Cell Biol 18(7): 4070-8.
[0340] Doyon, J. B., V. Pattanayak, et al. (2006). “Directed evolution and substrate specificity profile of homing endonuclease I-SceI.” J Am Chem Soc 128(7): 2477-84.
[0341] Doyon, Y., J. M. McCammon, et al. (2008). “Heritable targeted gene disruption in zebrafish using designed zinc-finger nucleases.” Nat Biotechnol 26(6): 702-8.
[0342] Dujon, B., L. Colleaux, et al. (1988). “Mitochondrial introns as mobile genetic elements: the role of intron-encoded proteins.” Basic Life Sci 40: 5-27.
[0343] Eisenschmidt, K., T. Lanio, et al. (2005). “Developing a programmed restriction endonuclease for highly specific DNA cleavage.” Nucleic Acids Res 33(22): 7039-47.

[0344] Elbashir, S. M., J. Harborth, et al. (2001). “Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells.” Nature 411(6836): 494-8.
[0345] Endo, M., K. Osakabe, et al. (2005). “Molecular characterisation of true and ectopic gene targeting events at the acetolactate synthase gene in Arabidopsis.” Plant Cell Physiol 47(3): 372-9.
[0346] Endo, M., K. Osakabe, et al. (2007). “Molecular breeding of a novel herbicide-tolerant rice by gene targeting.” Plant J 52(1): 157-66.
[0347] Epinat, J. C., S. Arnould, et al. (2003). “A novel engineered meganuclease induces homologous recombination in yeast and mammalian cells.” Nucleic Acids Res 31(11): 2952-62.
[0348] Essers, J., H. van Steeg, et al. (2000). “Homologous and non-homologous recombination differentially affect DNA damage repair in mice.” Embo J 19(7): 1703-10.
[0349] Feldmann, E., V. Schmiemann, et al. (2000). “DNA double-strand break repair in cell-free extracts from Ku80-deficient cells: implications for Ku serving as an alignment factor in non-homologous DNA end joining.” Nucleic Acids Res 28(13): 2585-96.
[0350] Gimble, F. S., C. M. Moure, et al. (2003). “Assessing the plasticity of DNA target site recognition of the PI-SceI homing endonuclease using a bacterial two-hybrid selection system.” J Mol Biol 334(5): 993-1008.
[0351] Gouble, A., J. Smith, et al. (2006). “Efficient in toto targeted recombination in mouse liver by meganuclease-induced double-strand break.” J Gene Med 8(5): 616-22.
[0352] Green, E. L. and T. H. Roderick (1966). “Radiation Genetics.” In Biology of the Laboratory Mouse, Green, E. L., ed. (McGraw-Hill, New York): 165-185.
[0353] Grizot, S., J. Smith, et al. (2009). “Efficient targeting of a SCID gene by an engineered single-chain homing endonuclease.” Nucleic Acids Res 37(16): 5405-19.
[0354] Haber, J. E. (1995). “In vivo biochemistry: physical monitoring of recombination induced by site-specific endonucleases.” Bioessays 17(7): 609-20.
[0355] Hanin, M., S. Volrath, et al. (2001). “Gene targeting in Arabidopsis.” Plant J 28(6): 671-7.
[0356] Hannon, G. J. (2002). “RNA interference.” Nature 418(6894): 244-51.
[0357] Harborth, J., S. M. Elbashir, et al. (2001). “Identification of essential genes in cultured mammalian cells using small interfering RNAs.” J Cell Sci 114(Pt 24): 4557-65.
[0358] Hutvagner, G., J. McLachlan, et al. (2001). “A cellular function for the RNA-interference enzyme Dicer in the maturation of the let-7 small temporal RNA.” Science 293(5531): 834-8.
[0359] Ichiyanagi, K., Y. Ishino, et al. (2000). “Crystal structure of an archaeal intein-encoded homing endonuclease PI-PfuI.” J Mol Biol 300(4): 889-901.
[0360] Kalish, J. M. and P. M. Glazer (2005). “Targeted genome modification via triple helix formation.” Ann N Y Acad Sci 1058: 151-61.
[0361] Kim, Y. G., J. Cha, et al. (1996). “Hybrid restriction enzymes: zinc finger fusions to FokI cleavage domain.” Proc Natl Acad Sci USA 93(3): 1156-60.
[0362] Kirik, A., S. Salomon, et al. (2000). “Species-specific double-strand break repair and genome evolution in plants.” Embo J 19(20): 5562-6.
[0363] Lee, Y., K. Jeon, et al. (2002). “MicroRNA maturation: stepwise processing and subcellular localization.” Embo J 21(17): 4663-70.
[0364] Li, T., S. Huang, et al. (2010). “TAL nucleases (TALNs): hybrid proteins composed of TAL effectors and FokI DNA-cleavage domain.” Nucleic Acids Res 39(1): 359-72.
[0365] Lloyd, A., C. L. Plaisier, et al. (2005). “Targeted mutagenesis using zinc-finger nucleases in Arabidopsis.” Proc Natl Acad Sci USA 102(6): 2232-7.
[0366] Mahaney, B. L., K. Meek, et al. (2009). “Repair of ionizing radiation-induced DNA double-strand breaks by non-homologous end-joining.” Biochem J 417(3): 639-50.
[0367] Mazur, D. J. and F. W. Perrino (2001). “Excision of 3' termini by the Trex1 and TREX2 3'->5' exonucleases. Characterization of the recombinant proteins.” J Biol Chem 276(20): 17022-9.
[0368] McCaffrey, A. P., L. Meuse, et al. (2002). “RNA interference in adult mice.” Nature 418(6893): 38-9.
[0369] Meister, G. and T. Tuschl (2004). “Mechanisms of gene silencing by double-stranded RNA.” Nature 431(7006): 343-9.
[0370] Meng, X., M. B. Noyes, et al. (2008). “Targeted gene inactivation in zebrafish using engineered zinc-finger nucleases.” Nat Biotechnol 26(6): 695-701.
[0371] Moore, I., M. Samalova, et al. (2006). “Transactivated and chemically inducible gene expression in plants.” Plant J 45(4): 651-83.
[0372] Moscou, M. J. and A. J. Bogdanove (2009). “A simple cipher governs DNA recognition by TAL effectors.” Science 326(5959): 1501.
[0373] Moure, C. M., F. S. Gimble, et al. (2002). “Crystal structure of the intein homing endonuclease PI-SceI bound to its recognition sequence.” Nat Struct Biol 9(10): 764-70.
[0374] Moure, C. M., F. S. Gimble, et al. (2003). “The crystal structure of the gene targeting homing endonuclease I-SceI reveals the origins of its target site specificity.” J Mol Biol 334(4): 685-95.
[0375] Nagy, Z. and E. Soutoglou (2009). “DNA repair: easy to visualize, difficult to elucidate.” Trends Cell Biol 19(11): 617-29.
[0376] Needleman, S. B. and C. D. Wunsch (1970). “A general method applicable to the search for similarities in the amino acid sequence of two proteins.” J Mol Biol 48(3): 443-53.
[0377] Nouspikel, T. (2009). “DNA repair in mammalian cells: Nucleotide excision repair: variations on versatility.” Cell Mol Life Sci 66(6): 994-1009.
[0378] Pace, P., G. Mosedale, et al. (2010). “Ku70 corrupts DNA repair in the absence of the Fanconi anemia pathway.” Science 329(5988): 219-23.
[0379] Padidam, M. (2003). “Chemically regulated gene expression in plants.” Curr Opin Plant Biol 6(2): 169-77.
[0380] Paques, F. and P. Duchateau (2007). “Meganucleases and DNA double-strand break-induced recombination: perspectives for gene therapy.” Curr Gene Ther 7(1): 49-66.
[0381] Pingoud, A. and G. H. Silva (2007). “Precision genome surgery.” Nat Biotechnol 25(7): 743-4.
[0382] Porteus, M. H. and D. Carroll (2005). “Gene targeting using zinc finger nucleases.” Nat Biotechnol 23(8): 967-73.
[0383] Posfai, G., V. Kolisnychenko, et al. (1999). “Markerless gene replacement in Escherichia coli stimulated by a double-strand break in the chromosome.” Nucleic Acids Res 27(22): 4409-15.

[0384] Potenza, C., L. Aleman, et al. (2004). “Targeting transgene expression in research, agricultural, and environmental applications: Promoters used in plant transformation.” In vitro cellular & developmental biology plant 40(1): 1-22.
[0385] Povirk, L. F. (1996). “DNA damage and mutagenesis by radiomimetic DNA-cleaving agents: bleomycin, neocarzinostatin and other enediynes.” Mutat Res 355(1-2): 71-89.
[0386] Puchta, H., B. Dujon, et al. (1996). “Two different but related mechanisms are used in plants for the repair of genomic double-strand breaks by homologous recombination.” Proc Natl Acad Sci USA 93(10): 5055-60.
[0387] Rosen, L. E., H. A. Morrison, et al. (2006). “Homing endonuclease I-CreI derivatives with novel DNA target specificities.” Nucleic Acids Res.
[0388] Rothstein, R. (1991). “Targeting, disruption, replacement, and allele rescue: integrative DNA transformation in yeast.” Methods Enzymol 194: 281-301.
[0389] Rouet, P., F. Smih, et al. (1994). “Expression of a site-specific endonuclease stimulates homologous recombination in mammalian cells.” Proc Natl Acad Sci USA 91(13): 6064-8.
[0390] Rouet, P., F. Smih, et al. (1994). “Introduction of double-strand breaks into the genome of mouse cells by expression of a rare-cutting endonuclease.” Mol Cell Biol 14(12): 8096-106.
[0391] Sargent, R. G., M. A. Brenneman, et al. (1997). “Repair of site-specific double-strand breaks in a mammalian chromosome by homologous and illegitimate recombination.” Mol Cell Biol 17(1): 267-77.
[0392] Sarkaria, J. N., R. S. Tibbetts, et al. (1998). “Inhibition of phosphoinositide 3-kinase related kinases by the radiosensitizing agent wortmannin.” Cancer Res 58(19): 4375-82.
[0393] Seligman, L. M., K. M. Stephens, et al. (1997). “Genetic analysis of the Chlamydomonas reinhardtii I-CreI mobile intron homing system in Escherichia coli.” Genetics 147(4): 1653-64.
[0394] Shrivastav, M., L. P. De Haro, et al. (2008). “Regulation of DNA double-strand break repair pathway choice.” Cell Res 18(1): 134-47.
[0395] Siebert, R. and H. Puchta (2002). “Efficient Repair of Genomic Double-Strand Breaks by Homologous Recombination between Directly Repeated Sequences in the Plant Genome.” Plant Cell 14(5): 1121-31.
[0396] Silva, G. H., J. Z. Dalgaard, et al. (1999). “Crystal structure of the thermostable archaeal intron-encoded endonuclease I-DmoI.” J Mol Biol 286(4): 1123-36.
[0397] Simon, P., F. Cannata, et al. (2008). “Sequence-specific DNA cleavage mediated by bipyridine polyamide conjugates.” Nucleic Acids Res 36(11): 3531-8.
[0398] Smith, J., S. Grizot, et al. (2006). “A combinatorial approach to create artificial homing endonucleases cleaving chosen sequences.” Nucleic Acids Res 34(22): e149.
[0399] Spiegel, P. C., B. Chevalier, et al. (2006). “The structure of I-CeuI homing endonuclease: Evolving asymmetric DNA recognition from a symmetric protein scaffold.” Structure 14(5): 869-80.
[0400] Stoddard, B. L. (2005). “Homing endonuclease structure and function.” Q Rev Biophys 38(1): 49-95.
[0401] Sussman, D., M. Chadsey, et al. (2004). “Isolation and characterization of new homing endonuclease specificities at individual target site positions.” J Mol Biol 342(1): 31-41.
[0402] Taccioli, G. E., G. Rathbun, et al. (1993). “Impairment of V(D)J recombination in double-strand break repair mutants.” Science 260(5105): 207-10.
[0403] Takata, M., M. S. Sasaki, et al. (1998). “Homologous recombination and non-homologous end-joining pathways of DNA double-strand break repair have overlapping roles in the maintenance of chromosomal integrity in vertebrate cells.” Embo J 17(18): 5497-508.
[0404] Teicher, B. A. (2008). “Next generation topoisomerase I inhibitors: Rationale and biomarker strategies.” Biochem Pharmacol 75(6): 1262-71.
[0405] Terada, R., Y. Johzuka-Hisatomi, et al. (2007). “Gene targeting by homologous recombination as a biotechnological tool for rice functional genomics.” Plant Physiol 144(2): 846-56.
[0406] Terada, R., H. Urawa, et al. (2002). “Efficient gene targeting by homologous recombination in rice.” Nat Biotechnol 20(10): 1030-4.
[0407] Wang, H., B. Rosidi, et al. (2005). “DNA ligase III as a candidate component of backup pathways of nonhomologous end joining.” Cancer Res 65(10): 4020-30.
[0408] Wang, M., W. Wu, et al. (2006). “PARP-1 and Ku compete for repair of DNA double strand breaks by distinct NHEJ pathways.” Nucleic Acids Res 34(21): 6170-82.
[0409] Wang, R., X. Zhou, et al. (2003). “Chemically regulated expression systems and their applications in transgenic plants.” Transgenic Res 12(5): 529-40.
[0410] Williams, R. S., J. S. Williams, et al. (2007). “Mre11-Rad50-Nbs1 is a keystone complex connecting DNA repair machinery, double-strand break signaling, and the chromatin template.” Biochem Cell Biol 85(4): 509-20.
[0411] Yi, R., Y. Qin, et al. (2003). “Exportin-5 mediates the nuclear export of pre-microRNAs and short hairpin RNAs.” Genes Dev 17(24): 3011-6.
[0412] Zeng, Y., X. Cai, et al. (2005). “Use of RNA polymerase II to transcribe artificial microRNAs.” Methods Enzymol 392: 371-80.
[0413] Zuo, J. and N. H. Chua (2000). “Chemical-inducible systems for regulated expression of plant genes.” Curr Opin Biotechnol 11(2): 146-51.

SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS: 1058

<210> SEQ ID NO 1
<211> LENGTH: 12467
<212> TYPE: DNA
<213> ORGANISM: Artificial


<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: SC-GS

<400> SEQUENCE: 4

Met Ala Asn Thr Lys Tyr Asn Glu Glu Phe Leu Leu Tyr Leu Ala Gly
1               5                   10                  15
Phe Val Asp Ala Asp Gly Ser Ile Ile Ala Gln Ile Lys Pro Arg Gln
            20                  25                  30
Ser Arg Lys Phe Lys His Glu Leu Ser Leu Thr Phe Asp Val Thr Gln
        35                  40                  45
Lys Thr Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly
    50                  55                  60
Val Gly Tyr Val Tyr Asp Ser Gly Ser Val Ser Tyr Tyr Gln Leu Ser
65                  70                  75                  80
Glu Ile Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu
                85                  90                  95
Glu Leu Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln
            100                 105                 110
Leu Pro Ser Ala Lys Glu Ser Pro Ala Lys Phe Leu Glu Val Cys Thr
        115                 120                 125
Trp Val Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr
    130                 135                 140
Thr Ser Glu Thr Val Arg Ala Val Leu Asp Ser Leu Ser Glu Lys Lys
145                 150                 155                 160
Lys Ser Ser Pro Ala Ala Gly Gly Ser Asp Lys Tyr Asn Gln Ala Leu
                165                 170                 175
Ser Lys Tyr Asn Gln Ala Leu Ser Lys Tyr Asn Gln Ala Leu Ser Gly
            180                 185                 190
Gly Gly Gly Ser Asn Lys Lys Phe Leu Leu Tyr Leu Ala Gly Phe Val
        195                 200                 205
Asp Gly Asp Gly Ser Ile Ile Ala Gln Ile Lys Pro Arg Gln Gly Tyr
    210                 215                 220
Lys Phe Lys His Gln Leu Ser Leu Thr Phe Gln Val Thr Gln Lys Thr
225                 230                 235                 240
Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Arg Ile Gly Val Gly
                245                 250                 255
Tyr Val Ala Asp Arg Gly Ser Val Ser Asp Tyr Arg Leu Ser Glu Ile
            260                 265                 270
Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu
        275                 280                 285
Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro
    290                 295                 300
Ser Ala Lys Glu Ser Leu Asp Lys Phe Leu Glu Val Cys Thr Trp Val
305                 310                 315                 320
Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser
                325                 330                 335
Glu Thr Val Arg Ala Val Leu Asp Ser Leu Ser Glu Lys Lys Lys Ser
            340                 345                 350
Ser Pro


aggatgatct ggacgaagag catcaggggc tcgcgccagc cgaactgttc gccaggctca     860
aggcgcgcat gcccgacggc gaggatctcg tcgtgaccca tggcgatgcc tgcttgccga     920
atatcatggt ggaaaatggc cgcttttctg gattcatcga ctgtggccgg ctgggtgtgg     980
cggaccgcta tcaggacata gcgttggcta cccgtgatat tgctgaagag cttggcggcg    1040
aatgggctga ccgcttcctc gtgctttacg gtatcgccgc tcccgattcg cagcgcatcg    1100
ccttctatcg ccttcttgac gagttcttct gattaattaa caggactgac cgtgctacga    1160
gatttcgatt ccaccgccgc cttctatgaa aggttgggct tcggaatcgt tttccgggac    1220
gccggctgga tgatcctcca gcgcggggat ctcatgctgg agttcttcgc ccaccccaac    1280
ttgtttattg cagcttataa tggttacaaa taaagcaata gcatcacaaa tttcacaaat    1340
aaagcatttt tttcactgca ttctagttgt ggtttgtcca aactcatcaa tgtatcttat    1400
catgtctgta taccgtcgac ctctagctag agcttggcgt aa                       1442

<210> SEQ ID NO 7
<211> LENGTH: 163
<212> TYPE: PRT
<213> ORGANISM: Chlamydomonas reinhardtii

<400> SEQUENCE: 7

Met Asn Thr Lys Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe
1               5                   10                  15
Val Asp Gly Asp Gly Ser Ile Ile Ala Gln Ile Lys Pro Asn Gln Ser
            20                  25                  30
Tyr Lys Phe Lys His Gln Leu Ser Leu Thr Phe Gln Val Thr Gln Lys
        35                  40                  45
Thr Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val
    50                  55                  60
Gly Tyr Val Arg Asp Arg Gly Ser Val Ser Asp Tyr Ile Leu Ser Glu
65                  70                  75                  80
Ile Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys
                85                  90                  95
Leu Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu
            100                 105                 110
Pro Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp
        115                 120                 125
Val Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr
    130                 135                 140
Ser Glu Thr Val Arg Ala Val Leu Asp Ser Leu Ser Glu Lys Lys Lys
145                 150                 155                 160
Ser Ser Pro

<210> SEQ ID NO 8
<211> LENGTH: 22
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: C1221 target

<400> SEQUENCE: 8

caaaacgtcg tacgacgttt tg                                               22

<210> SEQ ID NO 9
<211> LENGTH: 24
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: SC-GS target

<400> SEQUENCE: 9

ctgccccagg gtgagaaagt ccaa                                             24

<210> SEQ ID NO 10
<211> LENGTH: 22
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: SC-RAG target

<400> SEQUENCE: 10

tgttctcagg tacctcagcc ag                                               22

<210> SEQ ID NO 11
<211> LENGTH: 354
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: SC-RAG

<400> SEQUENCE: 11

Met Ala Asn Thr Lys Tyr Asn Glu Glu Phe Leu Leu Tyr Leu Ala Gly
1               5                   10                  15
Phe Val Asp Gly Asp Gly Ser Ile Ile Ala Gln Ile Asn Pro Asn Gln
            20                  25                  30
Ser Ser Lys Phe Lys His Arg Leu Arg Leu Thr Phe Tyr Val Thr Gln
        35                  40                  45
Lys Thr Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly
    50                  55                  60
Val Gly Tyr Val Arg Asp Ser Gly Ser Val Ser Gln Tyr Val Leu Ser
65                  70                  75                  80
Glu Ile Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu
                85                  90                  95
Glu Leu Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln
            100                 105                 110
Leu Pro Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr
        115                 120                 125
Trp Val Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr
    130                 135                 140
Thr Ser Glu Thr Val Arg Ala Val Leu Asp Ser Leu Ser Gly Lys Lys
145                 150                 155                 160
Lys Ser Ser Pro Ala Ala Gly Gly Ser Asp Lys Tyr Asn Gln Ala Leu
                165                 170                 175
Ser Lys Tyr Asn Gln Ala Leu Ser Lys Tyr Asn Gln Ala Leu Ser Gly
            180                 185                 190
Gly Gly Gly Ser Asn Lys Lys Phe Leu Leu Tyr Leu Ala Gly Phe Val
        195                 200                 205
Asp Ser Asp Gly Ser Ile Ile Ala Gln Ile Lys Pro Arg Gln Ser Asn
    210                 215                 220
Lys Phe Lys His Gln Leu Ser Leu Thr Phe Ala Val Thr Gln Lys Thr


aggatcttac cgctgttgag atccagttcg atgtaaccca ctcgtgcacc caactgatct    5880
tcagcatctt ttactttcac cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc    5940
gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt cctttttcaa    6000
tattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt    6060
tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctgacgt     6119

<210> SEQ ID NO 13
<211> LENGTH: 21
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Target sequence in the gene CSNK1D

<400> SEQUENCE: 13

ccggtctagg atcgaaatgt t                                                21

<210> SEQ ID NO 14
<211> LENGTH: 21
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Target sequence in the gene AK2

<400> SEQUENCE: 14

cggcagaacc cagtatic ct a                                               21

<210> SEQ ID NO 15
<211> LENGTH: 21
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Target sequence in the gene AKT2

<400> SEQUENCE: 15

caagcgtggt gaatacatca a                                                21

<210> SEQ ID NO 16
<211> LENGTH: 21
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Target sequence in the gene CAMK2G

<400> SEQUENCE: 16

gaggaagaga tctataccct a                                                21

<210> SEQ ID NO 17
<211> LENGTH: 21
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Target sequence in the gene GK2

<400> SEQUENCE: 17

tacgttagaa gagcactgta a                                                21

<210> SEQ ID NO 18
<211> LENGTH: 21
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Target sequence in the gene PFKFB4

<400> SEQUENCE: 18

cagaaagtgt ctggacttgt a                                                21

<210> SEQ ID NO 19
<211> LENGTH: 21
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Target sequence in the gene MAPK12

<400> SEQUENCE: 19

ctggacgtat tcactcctga t                                                21

<210> SEQ ID NO 20
<211> LENGTH: 21
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Target sequence in the gene PRKCE

<400> SEQUENCE: 20

cccgaccatg gtagtgttca a                                                21

<210> SEQ ID NO 21
<211> LENGTH: 21
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Target sequence in the gene EIF2AK2

<400> SEQUENCE: 21

cggaaagact tacgttatta a                                                21

<210> SEQ ID NO 22
<211> LENGTH: 21
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Target sequence in the gene WEE1

<400> SEQUENCE: 22

cagggtagat tacctcggat a                                                21

<210> SEQ ID NO 23
<211> LENGTH: 21
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Target sequence in the gene CDK5R1

<400> SEQUENCE: 23

ccggaaggcc acgctgtttg a                                                21

<210> SEQ ID NO 24
<211> LENGTH: 21
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Target sequence in the gene LIG4

<400> SEQUENCE: 24