WO 2019/226953 A1

(51) International Patent Classification: C12N 9/22 (2006.01)
(21) International Application Number: PCT/US2019/033848
(22) International Filing Date: 23 May 2019 (23.05.2019)
(25) Filing Language: English
(26) Publication Language: English
(30) Priority Data:
62/675,726 23 May 2018 (23.05.2018) US
62/677,658 29 May 2018 (29.05.2018) US
(71) Applicants: THE BROAD INSTITUTE, INC. [US/US]; 415 Main Street, Cambridge, MA 02142 (US). PRESIDENT AND FELLOWS OF HARVARD COLLEGE [US/US]; 17 Quincy Street, Cambridge, MA 02138 (US).
(72) Inventors: LIU, David, R.; 3 Whitman Circle, Lexington, MA 02420 (US). KOBLAN, Luke, W.; 71 Monmouth Street, Brookline, MA 02446 (US). WILSON, Christopher, Gerard; 696 Main Street, Apartment 311, Waltham, MA 02451 (US). DOMAN, Jordan, Leigh; 25 Avon Street, Somerville, MA 02143 (US).
(74) Agent: HEBERT, Alan, M. et al.; Wolf, Greenfield, Sacks, P.C., 600 Atlantic Avenue, Boston, MA 02210-2206 (US).
(81) Designated States (unless otherwise indicated, for every kind of national protection available): AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BN, BR, BW, BY, BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DJ, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IR, IS, JO, JP, KE, KG, KH, KN, KP, KR, KW, KZ, LA, LC, LK, LR, LS, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PA, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SA, SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW.
(84) Designated States (unless otherwise indicated, for every kind of regional protection available): ARIPO (BW, GH, GM, KE, LR, LS, MW, MZ, NA, RW, SD, SL, ST, SZ, TZ, UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, RU, TJ, TM), European (AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU, LV, MC, MK, MT, NL, NO, PL, PT, RO, RS, SE, SI, SK, SM, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, KM, ML, MR, NE, SN, TD, TG).
Published: with international search report (Art. 21(3)); with sequence listing part of description (Rule 5.2(a))

(54) Title: BASE EDITORS AND USES THEREOF

FIG. 2A

(57) Abstract: Some aspects of this disclosure provide strategies, systems, reagents, methods, and kits that are useful for the targeted editing of nucleic acids, including editing a single site within the genome of a cell or subject, e.g., within the human genome. The disclosure provides fusion proteins of nucleic acid programmable DNA binding proteins (napDNAbp), e.g., Cas9 or variants thereof, and nucleic acid editing proteins such as cytidine deaminase domains (e.g., novel cytidine deaminases generated by ancestral sequence reconstruction), and adenosine deaminases that deaminate adenine in DNA. Aspects of the disclosure relate to fusion proteins (e.g., base editors) that have improved expression and/or localize efficiently to the nucleus. In some embodiments, base editors are codon optimized for expression in mammalian cells. In some embodiments, base editors include multiple nuclear localization sequences (e.g., bipartite NLSs), e.g., at least two NLSs. In some embodiments, methods for targeted nucleic acid editing are provided.

BASE EDITORS AND USES THEREOF

FEDERALLY SPONSORED RESEARCH

[0001] This invention was made with government support under Grant No. HR0011-17-2-0049 awarded by the Department of Defense, and Grant Nos. HG009490, EB022376, GM118062, CA014051, and GM095450 awarded by the National Institutes of Health. The government has certain rights in the invention.
BACKGROUND OF THE INVENTION

[0002] Targeted editing of nucleic acid sequences, for example, the targeted cleavage or the targeted introduction of a specific modification into genomic DNA, is a highly promising approach for the study of gene function and also has the potential to provide new therapies for human genetic diseases, for example, those caused by point mutations. Point mutations represent the majority of known human genetic variants associated with disease (1). Developing robust methods to introduce and correct point mutations is therefore an important challenge in understanding and treating diseases with a genetic component.

[0003] Engineered base editors have been recently developed (2, 3). Base editors are fusions of a catalytically disabled Cas moiety and a nucleobase modification enzyme (e.g., natural or evolved nucleobase deaminases). In some cases, base editors may also include proteins that alter cellular DNA repair processes to increase the efficiency and stability of the resulting single-nucleotide change, e.g., a UGI domain (2, 3).

[0004] Two classes of base editors have been generally described to date: cytidine base editors convert target C•G base pairs to T•A base pairs, and adenine base editors convert A•T base pairs to G•C base pairs. Collectively, these two classes of base editors enable the targeted installation of all four transition mutations (C-to-T, G-to-A, A-to-G, and T-to-C), which together account for about 61% of known human pathogenic single nucleotide polymorphisms (SNPs) in the ClinVar database. In addition, base editors have been used widely in organisms ranging from prokaryotes to plants to amphibians to mammals, and have even been used to correct pathogenic mutations in human embryos (4-18).

[0005] However, the utility of base editing is limited by several constraints, including the PAM requirement imposed by the particular Cas moiety used (e.g., naturally occurring Cas9 from S. pyogenes, or a modified version thereof, or a homolog thereof), off-target base editing of non-target nucleotides near the desired editing site, the production of undesired edited genomic byproducts (e.g., indels), and overall low editing efficiencies.

[0006] The development of “next-generation” base editors has begun to address some of these limitations, including base editors with different or expanded PAM compatibilities (19-21), high-fidelity base editors with reduced off-target activity (20, 22-25), base editors with narrower editing windows (the window is normally ~5 nucleotides wide) (19), and a cytidine base editor (BE4) with reduced by-products (6).

[0007] Nevertheless, despite these recent advances, the efficiency of base editing by base editors varies widely by, among other factors, cell type and target locus. Thus, there continues to be a significant need in the art for the development of base editors with improved editing efficiencies, and in particular, wherein the improvements address those fundamental underlying biological aspects which restrict the genome editing efficiencies of base editor systems. The present disclosure provides improved base editors which overcome the problems in the art.

SUMMARY OF THE INVENTION

[0008] The instant specification provides for improved base editors which overcome deficiencies of those in the art. In particular, the specification provides base editors with improved editing efficiencies, for example, wherein the improvements address underlying biological aspects that limit the efficiency of genome editing achieved by existing base editor systems, including, for example, improved expression and/or nuclear localization.
In addition, the instant specification provides for nucleic acid molecules encoding and/or expressing the improved base editors disclosed herein, as well as vectors for cloning and/or expressing the improved base editors described herein, host cells comprising said nucleic acid molecules and cloning and/or expression vectors, and compositions for delivering and/or administering nucleic acid-based embodiments described herein. In addition, the disclosure provides for improved base editors as described herein, as well as compositions comprising said improved base editors. Still further, the present disclosure provides for methods of making the base editors, as well as methods of using the improved base editors or nucleic acid molecules encoding the improved base editors in applications including editing a nucleic acid molecule, e.g., a genome, with improved efficiency as compared to base editors that form the state of the art. The specification also provides methods for efficiently editing a target nucleic acid molecule, e.g., a single nucleobase of a genome, by contacting it with a base editing system described herein (e.g., in the form of an improved base editor protein as described herein or a vector encoding same) and conducting base editing. Still further, the specification provides therapeutic methods for treating a genetic disease and/or for altering or changing a genetic trait or condition by contacting a target nucleic acid molecule, e.g., a genome, with a base editing system (e.g., in the form of an isolated improved base editor protein or a vector encoding same) and conducting base editing to treat the genetic disease and/or change the genetic trait (e.g., eye color).
[0009] The present inventors have surprisingly discovered various ways to improve the efficiency of base editing by recognizing that the fraction of cells expressing active base editors, and/or the amount of functional base editor protein produced by each cell, restricts the efficiency of base editing. In particular, the inventors have surprisingly discovered that (a) improving nuclear localization of the expressed base editor or component thereof, (b) optimizing codon usage of the sequence encoding the base editor or component thereof, and (c) enhancing the expression of the sequence encoding the base editor or component thereof, e.g., by ancestral protein reconstruction (ASR), or a combination thereof, significantly improves the editing efficiencies of previously known base editors, e.g., cytidine base editors. Ancestral protein reconstruction uses an alignment of known protein sequences, an evolutionary model, and a resulting phylogenetic tree to infer ancestral protein sequences at the nodes of the phylogeny.
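By way of illustration only, the inference of an ancestral sequence from an alignment and a phylogenetic tree may be sketched as follows. The sketch is not the reconstruction procedure of this disclosure: it substitutes the bottom-up pass of Fitch maximum parsimony for the evolutionary model referred to above, it handles only gap-free alignments and strictly binary trees, and it reports candidate residues at the root node only. The function names, the tree encoding, and the toy sequences are hypothetical and are chosen solely for the example.

```python
from typing import Dict, List, Set, Tuple, Union

# Hypothetical tree encoding: a leaf is a sequence name (str);
# an internal node is a pair (left_subtree, right_subtree).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_states(tree: Tree, column: Dict[str, str]) -> Set[str]:
    """Bottom-up Fitch pass for one alignment column.

    Returns the set of candidate residues at the node `tree`:
    the intersection of the children's sets if it is non-empty,
    otherwise their union.
    """
    if isinstance(tree, str):
        return {column[tree]}
    left = fitch_states(tree[0], column)
    right = fitch_states(tree[1], column)
    common = left & right
    return common if common else left | right


def infer_root_states(tree: Tree, alignment: Dict[str, str]) -> List[Set[str]]:
    """Apply the Fitch pass to every column and return the root candidate sets."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_columns = lengths.pop()
    return [
        fitch_states(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_columns)
    ]


def format_states(states: List[Set[str]]) -> str:
    """Render one residue per position; ambiguous positions appear as [..]."""
    return "".join(
        next(iter(s)) if len(s) == 1 else "[" + "".join(sorted(s)) + "]"
        for s in states
    )


if __name__ == "__main__":
    # Toy four-taxon example (invented sequences, not data from this disclosure).
    toy_tree: Tree = (("A", "B"), ("C", "D"))
    toy_alignment = {"A": "MKR", "B": "MKR", "C": "MRR", "D": "MKH"}
    print(format_states(infer_root_states(toy_tree, toy_alignment)))  # prints MKR
```

A likelihood-based reconstruction would replace the set intersection and union with per-residue probabilities computed under a substitution model and branch lengths; the parsimony pass is used here only because it shows, in a few lines, how information flows from the aligned extant sequences to a node of the phylogeny.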
