WO 2017/205336 Al 30 November 2017 (30.11.2017) W !P O PCT
Total Page:16
File Type:pdf, Size:1020Kb
(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) (19) World Intellectual Property Organization International Bureau (10) International Publication Number (43) International Publication Date WO 2017/205336 Al 30 November 2017 (30.11.2017) W !P O PCT (51) International Patent Classification: Zengmin; 29-10 137th Street, Flushing, NY 11354 (US). A61K 31/70 (2006.01) C07H 19/04 (2006.01) HSIEH, Min-Kang; 521 W. 122nd Street, Apt.34, New A61K 31/7042 (2006.01) C07H 21/00 (2006.01) York, NY 10027 (US). CHIEN, Minchen; 273 Tenafly A61K 31/7052 (2006.01) C07H 21/04 (2006.01) Road, Tenafly, NJ 07670 (US). SHI, Shundi; 7705 95th Ave, Ozone Park, NY 11416 (US). REN, Jianyi; 536 W (21) International Application Number: 113th Street, Apt. 35, New York, NY 10025 (US). GUO, PCT/US2017/033939 Cheng; 23 Bay 8th Street, 1st Floor, Brooklyn, NY 11228 (22) International Filing Date: (US). KUMAR, Shiv; 2 1 Muirhead Court, Belle Mead, NJ 23 May 2017 (23.05.2017) 08502 (US). RUSSO, James, J.; 60 Haven Avenue, Apt. 25E, New York, NY 10032 (US). TAO, Chuanjuan; 3 (25) Filing Language: English Horizon Road, Apt. 1211, Fort Lee, NJ 07024 (US). JOCK- (26) Publication Langi English USCH, Steffen; 604 W . 115th Street, New York, NY 10025 (US). KALACHIKOV, Sergey; 5425 Valles Avenue, Apt. (30) Priority Data: 2J, Bronx, NY 10471 (US). 62/340,419 23 May 2016 (23.05.2016) US 62/365,321 2 1 July 2016 (21 .07.2016) US (74) Agent: GERSHIK, Gary, J.; Cooper & Dunham LLP, 30 62/477,945 28 March 2017 (28.03.2017) us Rockefeller Plaza, 20th Floor, New York, NY 101 12 (US). (71) Applicant: THE TRUSTEES OF COLUMBIA (81) Designated States (unless otherwise indicated, for every UNIVERSITY IN THE CITY OF NEW YORK kind of national protection available): AE, AG, AL, AM, [US/US]; 412 Low Memorial Library, 535 W. 116th Street, AO, AT, AU, AZ, BA, BB, BG, BH, BN, BR, BW, BY, BZ, New York, NY 10027 (US). CA, CH, CL, CN, CO, CR, CU, CZ, DE, DJ, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, (72) Inventors: JU, Jingyue; 267 Marietta Street, Englewood HR, HU, ID, IL, IN, IR, IS, JP, KE, KG, KH, KN, KP, KR, Cliffs, NJ 07632 (US). CHEN, Xin; 567 W . 125th Street, KW, KZ, LA, LC, LK, LR, LS, LU, LY, MA, MD, ME, MG, #6, New York, NY 10027 (US). LI, Xiaoxu; 403 W MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, 115th Street, Apt. 22, New York, NY 10025 (US). LI, PA, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SA, SC, (54) Title: NUCLEOTIDE DERIVATIVES AND METHODS OF USE THEREOF (57) Abstract: Disclosed herein, inter alia, are compounds, composi tions, and methods of thereof in the sequencing a nucleic acid. o o FIG. 1 o [Continued on nextpage] WO 2017/205336 Al llll II II 11III I II 11llll II II II III II I II SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW. (84) Designated States (unless otherwise indicated, for every kind of regional protection available): ARIPO (BW, GH, GM, KE, LR, LS, MW, MZ, NA, RW, SD, SL, ST, SZ, TZ, UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, RU, TJ, TM), European (AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU, LV, MC, MK, MT, NL, NO, PL, PT, RO, RS, SE, SI, SK, SM, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, KM, ML, MR, NE, SN, TD, TG). Published: — with international search report (Art. 21(3)) — before the expiration of the time limit for amending the claims and to be republished in the event of receipt of amendments (Rule 48.2(h)) — with sequence listing part of description (Rule 5.2(a)) NUCLEOTIDE DERIVATIVES AND METHODS OF USE THEREOF CROSS-REFERENCES TO RELATED APPLICATIONS [0001] This application claims the benefit of U.S. Provisional Application No. 62/340,419, filed May 23, 2016, U.S. Provisional Application No. 62/365,321, filed July 21, 2016, and U.S. Provisional Application No. 62/477,945, filed March 28, 2017, each of which are incorporated herein by reference in entirety and for all purposes. REFERENCE TO A "SEQUENCE LISTING," A TABLE, OR A COMPUTER PROGRAM LISTING APPENDIX SUBMITTED AS AN ASCII FILE [0002] The Sequence Listing written in file 51385-502001WO_ST25.txt, created March 27, 2017, 5,636 bytes, machine format IBM-PC, MS Windows operating system, is hereby incorporated by reference. BACKGROUND [0003] DNA sequencing is a fundamental tool in biological and medical research; it is an essential technology for the paradigm of personalized precision medicine. Among various new DNA sequencing methods, sequencing by synthesis (SBS) is the leading method for realizing the goal of the $1,000 genome. Currently, the widely used high-throughput SBS technology (Bentley DR, et al. Nature, 2008, 456, 53-59) determines DNA sequences during the polymerase reaction using cleavable fluorescently labeled nucleotide reversible terminator (NRT) sequencing chemistry that has been previously developed (Ju J et al. 2003, US Patent 6664079; Ju J et al. Proc Natl Acad Sci USA, 2006, 103, 19635-19640). These cleavable fluorescent NRTs were designed based on the rationale that each of the nucleotides is modified by attaching a unique cleavable fluorophore to the specific location of the base and capping the 3'-OH group with a small reversible-blocking moiety so they are still recognized by DNA polymerase as substrates. A disadvantage of the abovementioned SBS approach is the production of a small molecular "scar" (e.g., a propargylamine or a modified propargylamino moiety) at the nucleotide base after cleavage of the fluorescent dye from the incorporated nucleotide in the polymerase reaction. The growing DNA chain accumulates these scars through each successive round of SBS. At some point, the residual scars may be significant enough to interfere with the DNA double helix structure, thereby negatively affecting DNA polymerase recognition and consequently limiting the read length. Accumulated research efforts indicated that the major challenge for this approach is that DNA polymerase has difficulty accepting V-0 bulky-dye- modified nucleotides as substrates, because the 3' position on the deoxyribose of the nucleotides is very close to the amino acid residues in the active site of the DNA polymerase while in the ternary complex formed by the polymerase with the complementary nucleotide and the primed template. Accordingly, there is a need for the use in scarless SBS, and synthesis of, 3'-0 modified nucleotides and nucleosides that are effectively recognized as substrates by DNA polymerases, are efficiently and accurately incorporated into growing DNA chains during SBS, have a 3'-0 blocking group that is cleavable under mild conditions wherein cleavage results in a 3' -OH, and permit long SBS read-lengths. Disclosed herein, inter alia, are solutions to these and other problems in the art. BRIEF SUMMARY OF THE INVENTION [0004] In an aspect is provided a nucleotide analogue having the formula: [0005] B is a base or analogue thereof. L is covalent linker. L2 is covalent linker. L4 is covalent linker. X is a bond, O, NR , or S. R3 is -OH, monophosphate, polyphosphate or a 4A 6 nucleic acid. R and R are independently hydrogen, -OH, -CF 3, -CC1 , -CBr3, - CI , -CHF 2, -CHCI2, -CHBr 2, -CHI2, -CH F, -CH2C1, -CH Br, -CH2I, -CN, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl. R5 is a detectable label, anchor moiety, or affinity 6 anchor moiety. R is hydrogen, -CF3, -CC13, -CBr 3, -CI3, -CHF2, -CHCI2, -CHBr 2, - CHI2, -CH2F, -CH2CI, -CH2Br, -CH2I, -CN, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl. R7 is hydrogen or -OR , wherein R is hydrogen or a polymerase-compatible moiety. R 12 is a complementary affinity anchor moiety binder. R 3 is a detectable label. The symbol "— " is a non-covalent bond. [0006] In an aspect is provided a thermophilic nucleic acid polymerase complex, wherein the thermophilic nucleic acid polymerase is bound to a nucleotide analogue having the formula: [0007] B is a base or analogue thereof. L1 is covalent linker. L2 is covalent linker. L4 is covalent linker. R3 is -OH, monophosphate, polyphosphate or a nucleic acid. R A and R are independently is hydrogen, -OH, -CF3, -CCI3, -CBr3, -CI3, -CHF2, -CHCI2, -CHBr2, - CHI2, -CH2F, -CH2CI, -CH2B1-, -CH2I, -CN, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl. 4B R is hydrogen , -OH, -CF3, -CC13, -CBr3, -CI3, -CHF2, -CHC1 2, -CHBr2, -CHI2, -CH2F, - 6 CH2CI, -CH2Br, -CH2I, -CN, -X-R , substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl. X is a bond, O, NR A, or S .R5 is a detectable label, anchor moiety, or affinity anchor moiety. 6 R is hydrogen, -CF3, -CC13, -CBr3, -CI3, -CHF2, -CHCI2, -CHBr , -CHI2, -CH2F, -CH2C1, - CH2Br, -CH2I, -CN, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl. R7 is hydrogen or -OR 7 , wherein R is hydrogen or a polymerase-compatible moiety.