USOO6613508B1 (12) United States Patent (10) Patent No.: US 6,613,508 B1 NeSS et al. (45) Date of Patent: *Sep. 2, 2003

(54) METHODS AND COMPOSITIONS FOR FOREIGN PATENT DOCUMENTS ANALYZING NUCLEIC ACID MOLECULES UTILIZING SIZING TECHNIQUES CA 2062454 A1 9/1992 (75) Inventors: sy. YENSA. WA S. J (List continued on next page.) Jeffry Howbert, Bellevue, WA (US); John T. Mulligan, Seattle, WA (US) OTHER PUBLICATIONS (73) Assignee: Qiagen Genomics, Inc., Bothell, WA AeberSold et al., “Design, Synthesis, and characterization of (US) a protein Sequencing reagent yielding amino acid derivatives (*) Notice: This patent issued on a continued pros- with enhanced detectability by mass spectrometry,” Protein ecution application filed under 37 CFR Science 1:494-503, 1992. span its. th ty ny Amankwa and Kuhr, “Trypsin-Modified Fused-Silica Cap 154(a)(2). a a- - illary Microreactor for Peptide Mapping by Capillary Zone Electrophoresis,” Anal. Chem. 64:1610–1613, 1992. Subject to any disclaimer, the term of this patent is extended or adjusted under 35 Baldwin et al., “Synthesis of a Small Molecule Combina U.S.C. 154(b) by 0 days. torial Library Encoded with Molecular Tags,” J. Am. Chem. Soc. 117:5588-5589, 1995. (+ Supplementary Material, 21 This patent is Subject to a terminal dis- pages). claimer. Borchardt and Still, “Synthetic Receptor Binding Elucidated (21) Appl. No.: 08/898,564 with an Encoded Combinatorial Library,” J. Am. Chem. Soc. (22) Filed: Jul. 22, 1997 116:373-374, 1994. Related U.S. Application Data Brown et al., “A Single-bead decode Strategy using electro spray ionization mass spectrometry and a new photolabile (63) Continuation-in-part of application No. 08/786,834, filed on linker: 3-Amino-3-(2-nitrophenyl)propionic acid,' (60) Jan.Provisional 22, 1997, application now abandoned. No. 60/014,536, filed on Jan. 23, Molecular Diversity 1:4–12,Al 1995. E. approvisional application No. 60/020,487, filed on Ching et al., “Polymers as Surface-Based Tethers with 2 7 Photolyticy Triggers99. Enabling9. Laser-Induced Release/Des (51) Int. Cl.' ...... C12O 1/68 orption of Convalently Bound Molecules,” Bioconjugate (52) U.S. Cl 435/6 y Jug (58) FieldO X O of-- O Search------...... ------435/6;------536/26.6;- - - - Chemistryemistry 7(5):525-528,7(5) s 1996. 436/501 Ching et al., “Surface Chemistries Enabling Photoinduced Uncoupling/Desorption of Covalently Tethered Biomol (56) References Cited ecules,” J. Org. Chem. 61:3582–3583, 1996. U.S. PATENT DOCUMENTS Church et al., “New Technologies For Genome Sequencing 2.5 A : 31, Si ------. And Analysis,” in DOE Human Genome Program Contrac 4705016 A 11/1987 s ------530/389 tor-Grantee Workshop V, Sante Fe, New Mexico, Jan. 4762.770 A 8/1988 Smitman ...... 435/6 28-Feb. 1, 1996, available NTIS, CONF-960143, 1996, p. 4,775,619 A 10/1988 Urdea ...... 435/6 51. 4.942,124 A 7/1990 Church ...... 435/6 4,962,020 A 10/1990 Tabor et al...... 435/6 Li d 4,965,349 A 10/1990 Woo et al...... 536/27 (List continued on next page.) 4,994,372 A 2/1991 Tabor et al...... 435/6 4997928. A 3/1991 Hobbs, Jr...... 536/27 5,118,605 A 6/1992 Urdea ...... 435/6 Primary Examiner Scott W. Houtteman 5,135,870 A 8/1992 Williams et al...... 436/173 (74) Attorney, Agent, or Firm SEED Intellectual Property 5,149.625 A 9/1992 Church et al...... 435/6 Law Group PLLC 5,262,536 A 11/1993 Hobbs, Jr...... 546/25 5,266.466 A 11/1993 Tabor et al...... 435/91.5 (57) ABSTRACT 5,288,644 A 2/1994 Beavis et al...... 436/94 S.C. A S. ------i. 535: Tags and linkerS Specifically designed for a wide variety of 5,302,500 A 3.E. 4 EN 4. 56 nucleic acid reactions are disclosed, which are Suitable for a

5324,631 A 6/1994 Helentaris et al... 435/6 wide variety of nucleic acid reactions wherein Separation of 5,346,670 A 9/1994 Renzoni et al...... 422/52 nucleic acid molecules based upon size is required. 5,360,819 A 11/1994 Giese ...... 514/538 (List continued on next page.) 40 Claims, 44 Drawing Sheets US 6,613,508 B1 Page 2

U.S. PATENT DOCUMENTS OTHER PUBLICATIONS 5,403,708 A 4/1995 Brennan et al...... 435/6 Church and Kieffer-Higgins, “Multiplex DNA Sequencing, 5,409,811 A 4/1995 Tabor et al...... 435/6 Science 240:185-188, 1988. 5,451,463 A 9/1995 Nelson et al. . ... 428/402 Cobb and Novotny, “High-Sensitivity Peptide Mapping by 5,516,931 A 5/1996 Giese et al...... 560/59 Capillary Zone Electrophoresis and Microcolumn Liquid 5,547,835 A 8/1996 Köster ------... 435/6 Chromatography, Using Immobilized Trypsin for Protein 5,602.273 A 2/1997 Giese et al...... 560/60 5,604,097 A 2/1997 Brenner ...... Digestion." Anal. Chen. 61.2226-2231, 1989. 5,604,104 A 2/1997 Giese et al...... 4357. Colombo, Roberto, “Liquid-Phase Synthesis of Naturally 5,610,020 A 3/1997 Giese et al...... 435/7. Occurring Peptides, II. Syntheses of Three Mast Cell 5,622,824. A 4/1997 Köster ...... 435/6 Degranulating Tetradecapeptide Amides from Wasp Ven 5,635,400 A 6/1997 Brenner ...... 435/320.1 oms.” Hoppe-Seyler's Z. Physiol. Chem. 362: 1393-1403, 5,650.270 A 7/1997 Giese et al...... 435/6 1981. 5,654,413 A 8/1997 Brenner ...... 536/22.1 Covey et al., “The Determination of Protein, Oligonucle 5,674,716 A 10/1997 Tabor et al...... 435/91.1 otide and Peptide Molecular Weights by Ion-spray Mass 5,691,141. A 11/1997 Köster ...... 435/6 Spectrometry,” Rapid Communications. In Mass Spectrom 5,695,934. A 12/1997 Brenner ...... 435/6 etry 2(11):249-256, 1988. 5,700,921. A 12/1997 Westling et al...... 536/22.1 Geysen et al., “Isotope or mass encoding of combinatorial s: A E. E. Jr. et al...... E. libraries,” Chemistry & Biology 3(8):679-688, 1996. 582105s A 10/1998 Smith et al ------435/6 Giese et al., “Electrophore Mass Labels For TOF-MS DNA 5,851,7652 2 A 12/1998 Köster ...... ------... - - 435/6 Sequencing,”ss in Proceedings of the 44ti ASMS Conference 5,856,097 A 1/1999 Pinkel et al...... 435/6 On Mass Spectrometry and Allied Topic, Portland, Oregon, 5,863,722 A 1/1999 Brenner ...... 435/6 May 12–16, 1996, p. 673. 5,872,003 A 2/1999 Köster ...... 435/283.1 Roger W. Giese, “Electrophoric release tags: ultrasensitive 5,908,745 A 6/1999 Mirzabekov et al...... 435/6 molecular labels providing multiplicity,” Trends in Analyti 5,952,654 A 9/1999 Giese ...... 250/288 cal Chemistry 2(7):166-168, 1983. 5,962,223. A 10/1999 Whiteley et al...... 435/6 Greenberg and Gilmore, “Cleavage of Oligonucleotides 6,013,431 A 1/2000 Söderlund et al...... 435/5 from Solid-Phase Supports Using o-Nitrobenzyl Photo 6,033,909 A 3/2000 Uhlmann et al...... 435/375 chemistry,” J. Org. Chem. 59:746–753, 1994. 6,087,095 A 7/2000 Rosenthal et al. ------435/6 Marc M. Greenberg, “Photochemical Cleavage of Oligo 6,087,186 A 7/2000 Cargill et al...... 436/518 nucleotides From Solid Phase Supports.” Tetrahedron Let 6,312,893 B1 11/2001 Van Ness ...... 435/6 ters 34(2):251-254, 1993. s Holmes and Jones, “Reagents for Combinatorial Organic FOREIGN PATENT DOCUMENTS Synthesis: Development of a New O-Nitrobenzyl Photo EP 127 154 A2 12/1984 labile Linker for Solid Phase Synthesis,” J. Org. Chem. EP 251786 A2 1/1988 60:2318-2319, 1995. (+ Supplementary Material, 7 pages). EP 3OO 730 A1 1/1989 Jacobson et al., “Applications of Mass Spectrometry to DNA EP O 351 138 : 1/1990 Sequencing,' Genetic Analysis Techniques and Applications EP 36O 940 A2 4/1990 EP 401821 A1 12/1990 8(8):223–229, 1991. EP 5O2 595 9/1992 Jane et al., “High-Performance Liquid Chromatographic EP 514927 A1 11/1992 Analysis. Of Basic Drugs on Silica Columns Using Non EP 539 343 A1 4/1993 Aqueous Ionic Eluents,” Journal of Chromatography EP 639 647 2/1995 323:191-225, 1985. EP 711362 5/1996 Kremsky et al., “Immobilization of DNA via oligonucle JP 6-289.018 10/1994 otides containing an aldehyde or carboxylic acid group at the WO WO 89/12694 12/1989 5'terminus,” Nucleic Acids Research 15(7):2891-2909, WO WO 90/13666 11/1990 1987. WO WO 92/O2528 2/1992 WO WO 92/O2638 2/1992 Little et al., “Rapid Sequencing of Oligonucleotides by WO WO 92/10587 6/1992 High-Resolution Mass Spectrometry,” J. Am. Chem. Soc. WO WO 93/20233 10/1993 116:4893-4897, 1994. WO WO 93/20236 10/1993 Lloyd-Williams et al., “Convergent Solid-Phase Peptide WO WO 93/21340 10/1993 Synthesis,” Tetrahedron 49(48): 11065-11133, 1993. WO WO 94/08051 4/1994 WO 94/16101 7/1994 Musch et al., “Expert System For Pharmaceutical Analysis. WO WO 94/16101 7/1994 I. Selection Of The Detection System. In High-Performance WO WO 94/21822 9/1994 Liquid Chromatographic Analysis: UV Versus Amperomet WO WO 94/28418 12/1994 ric Detection,” Journal of Chromatography 348:97-110, WO WO95/04160 2/1995 1985. WO WO95/06752 3/1995 Nashabeh and El Rassi, “Enzymorphoresis of nucleic acids WO WO95/11961 5/1995 by tandem capillary enzyme reactor-capillary Zone electro WO WO95/14108 5/1995 WO WO95/25737 9/1995 phoresis,” Journal of Chromatography 596:251-264, 1992. WO WO95/28640 10/1995 Nestler et al., “A General Method for Molecular Tagging of WO WO 96/OO378 1/1996 Encoded Combinatorial Chemistry Libraries,” J. Org. WO WO 96/22966 8/1996 Chem. 59:4723-4724, 1994. US 6,613,508 B1 Page 3

Ordoukhanian and Taylor, “Design and Synthesis of a Ver Thiele and Fahrenholz, “Photocleavable Biotinylated Lig satile Photocleavable DNA Building Block. Application to nads for Affinity Chromatography,” Analytical Biochemistry Phototriggered Hybridization,” J. Am. Chem. Soc. 117: 218:330–337, 1994. 9570-9571, 1995. Valaskovic et al., “Attomole-Sensitivity Electrospray V. N. Rajasekharan Pillai, “Photoremovable Protecting Source for Large-Molecule Mass Spectrometry,” Anal. Groups in Organic Synthesis,” Synthesis 1:1-26, 1980. Chem, 67:3802–3805, 1995. Rich and Gurwara, “Preparation of a New O-Nitrobenzyl Voivodov et al., “Surface Arrays of Energy Absorbing Resin for Solid-Phase Synthesis of tert-Butyloxycarbonyl Polymers Enabling Covalent Attachment of Biomolecules Protected Peptide Acids,” J. Am. Chem. Soc. 97.1575-1579, for Subsequent Laser-Induced Uncoupling/DeSorption,” 1975. Rock and Chan, “Synthesis and Photolysis Properties of a Tetrahedron Letters 37(32):5669–5672, 1996. Photolabile Linker Based on 3'-Methoxybenzoin,” J. Org. Charles Hignite, “Nucleic Acids And Derivatives,” in Bio Chem. 61:1526–1529, 1996. chemical Applications Of Mass Spectrometry, First Supple Sanger et al., “DNA sequencing with chain-terminating mentary Volume, Waller et al. (eds.), John Wiley & Sons, inhibitors.” Proc. Natl. Acad. Sci. USA 74(12):5463-5467, New York, 1981, pp. 527-566. 1977. Yoo and Greenberg, “Synthesis of Oligonucleotides Con Senter et al., “Novel Photocleavable Protein Crosslinking taining 3'-Alkyl Carboxylic Acids. Using Universal, Photo Reagents And Their Use In The Preparation Of Antibody labile Solid Phase Synthesis Supports,” J. Org. Chem. Toxing Conjugates, Photochemistry and Photobiology 60:3358-3364, 1995. 42(3):231-237, 1985. Zablocki et al., “Potent in Vitro and in Vivo Inhibitors of Sumer et al., “Factors Determining Relative Sensitivity of Platlet Aggregation Based Upon the Arg-Gly-Asp Analytes in Positive Mode Atmospheric Pressure Ionization Sequence of Fibronogen. (Aminobenzamidino)Succinyl Mass Spectrometry,” Anal. Chem. 60:1300–1307, 1988. (ABAS) Series of Orally Active Fibrinogen Receptor Simon J. Teague, “Facile Synthesis of a o-Nitrobenzyl Antagonists,” J. Med. Chem. 38:2378-2394, 1995. Photolabile Linker for Combinatorial Chemistry,” Tetrahe dron Letters 37(32):5751-5754, 1996. * cited by examiner

U.S. Patent US 6,613,508 B1

W U.S. Patent Sep. 2, 2003 Sheet 3 of 44 US 6,613,508 B1 U.S. Patent US 6,613,508 B1 U.S. Patent Sep. 2, 2003 Sheet S of 44 US 6,613,508 B1

n Co P

2. w

O 2. v 1. d O vo > Cd er C CD eye se 5 t se > O 3. d as wer lu Od H Ol 2. Og w O ce R l C ce O l H O ? -a-

d D l

s Cd

O O

a. rf CP O c li U.S. Patent Sep. 2, 2003 Sheet 6 of 44 US 6,613,508 B1 •!!!Cae, ,W0

U.S. Patent Sep. 2, 2003 Sheet 8 of 44 US 6,613,508 B1

99-||IIIA ZON 99–|} ==,======`N O-O ZON 00dBIS

HN

ZON 0HN 99-||IA

O*: 0 0NH00uu3 U.S. Patent Sep. 2, 2003 Sheet 9 of 44 US 6,613,508 B1

HN

|dBIS

U.S. Patent US 6,613,508 B1

U.S. Patent Sep. 2, 2003 Sheet 14 of 44 US 6,613,508 B1 O-O 0 ZON 99-||IIIA

NH

99-||IA NH U.S. Patent Sep. 2, 2003 Sheet 15 of 44 US 6,613,508 B1

Z 99–|XI ZON

|d'HIS NH

HHººZON 0 0°,(NH 99–|x U.S. Patent Sep. 2, 2003 Sheet 16 of 44 US 6,613,508 B1

III

II

WWN‘DIWH

N() ZON•ººo AHN0 IA NH00uu3 U.S. Patent Sep. 2, 2003 Sheet 17 of 44 US 6,613,508 B1

d OC : r O O w n O2 d D 2

C a

O

SeS is CO5 -P M i v 1. -( f 2 O d nry O o on O 2. D Z

O

O 2. ; O O s

U.S. Patent Sep. 2, 2003 Sheet 20 of 44 US 6,613,508 B1

Q–N H

99-||IIIA NH008

99-||IIA NH008 U.S. Patent Sep. 2, 2003 Sheet 21 of 44 US 6,613,508 B1

C C 2 A co o al Cld a O

p d d P . 21 d SS a r C V n C >< same O s 21 t r d

C 21 rear 5 1. d O 5 & d

s Cld als, L. 9. MPIO - H

on A O 2.

Z

C 2 2 on O Cd rere o >< 2 Z r 21 re n C O 5-&C O s U.S. Patent Sep. 2, 2003 Sheet 22 of 44 US 6,613,508 B1

III AI

TR?JIS ZONZONQ–(o3. 0O-Od31S U.S. Patent Sep. 2, 2003 Sheet 23 of 44 US 6,613,508 B1

99–|XI

Su04008)99

IIIA

9d31S ==,======

{{{} IIA

U.S. Patent Sep. 2, 2003 Sheet 25 of 44 US 6,613,508 B1

F H

C O F O. NH 9 NH -

1 O

O 4. R -36 N Q Oligonucleotide 1-36 X1-36 5'-Aminohexyl-tailed oligonculeotides FROM EXAMPLE A XI 1-36

STEP A

100 mM Sodium Borate pH 8.5

NO 2 HN-(CH2)5-0--0 BASE O O O NH -N NH O sHN O Oligonucleotidet -36 N 0 R1-56 N 5'-CMST-Oligonucleotide Conjugate XII 1-36 Fig. 9

U.S. Patent Sep. 2, 2003 Sheet 27 of 44 US 6,613,508 B1

L w o a. n ?

O R N o o- DS C C- o 9. Sp P O S Ol-O d d S d SS . ca S g Y w S o sS 2 n N.so d E N se

s a / - O a C 2 se M d n t al r on a O O 2. O

O 2

2 n

Z U.S. Patent US 6,613,508 B1

09.Z

0|Z

| U.S. Patent US 6,613,508 B1

‘ŽIHŽI

ZIMO

0|Z U.S. Patent Sep. 2, 2003 Sheet 30 of 44 US 6,613,508 B1

H0–g–NUO—(0,0—30)– U.S. Patent Sep. 2, 2003 Sheet 31 of 44 US 6,613,508 B1

A: 100mM dimethyloctylammonium trifuoroacetate B: 100mM DMOATFA/90% acetonitrile 18%B to 85%B in 35 minutes 500

100bp and 200bp standards

0.0 60.0 Fig. 14A

A: 50mM Dimethylheptylammonium Acetate pH 6.6 B: 50mM DMHepAA/95% ACN 25% B to 95%B in 60 minutes 50C pBR522 Hael digest

15 20 25 Minutes Fig. 14B U.S. Patent Sep. 2, 2003 Sheet 32 of 44 US 6,613,508 B1

O 9 8 A: 50mM Dimethylhexylammonium Acetate

B: 50mM DMHAA/99.5% acetonitrile 7 45C 6 pBR522 Hael digest 5 4 3 2 1 O 5 10 15 20 Minutes Fig. 14C

A: 150mM dimethylaminobutyl acetate B: 150mM DMABA/85% acetonitrile 20%B to 58%B in 60 minutes 40C 10- pBR322 Haeli digest

O 25 50 Minutes Fig. 14D U.S. Patent Sep. 2, 2003 Sheet 33 of 44 US 6,613,508 B1

A: 100mM dimethylisopropylammonium acetate B: 100mM DMIPAA/90% dicetonitrile 5%B to 12%B in 55 minutes 45C pBR522 Hoell digest

0.0 25.0 Fig. 14E

A: 200mM dimethylcyclohexylammonium acetate B: 200mM DMCHAA/90% acetonitrile 11%B to 40%B in 35 minutes 450 pBR322 Hael digest

0.0 40,0 Fig. 14F U.S. Patent Sep. 2, 2003 Sheet 34 of 44 US 6,613,508 B1

A: 100mM 1-methylpiperidine acetate B: 100mM mepipac/90% acetonitrile 11%B to 51%B in 40 minutes 50C 100bp and 200bp standards

0.0 45.0 Fig. 140

A: 100mM 1-methylpyrrolidone B: 100mM "methylpyrr"/70% acetonitrile 14%B to 100%B in 60 minutes 500 100bp and 200bp standards

0.0 65.0 Fig. 14.H U.S. Patent Sep. 2, 2003 Sheet 35 of 44 US 6,613,508 B1

A: 100mM TEAA B: 100mM TEAA/25% acetonitrile 35%B to 65%B in 60 minutes 50C pBR322 Hael digest

0.0 65.0 Fig. 14I U.S. Patent Sep. 2, 2003 Sheet 36 of 44 US 6,613,508 B1 ~~'

9/9/

80103130}}010ETT003.0|A30 ff| U.S. Patent Sep. 2, 2003 Sheet 37 of 44 US 6,613,508 B1 ~~'

OTdH U.S. Patent Sep. 2, 2003 Sheet 38 of 44 US 6,613,508 B1

~^

SSWW SW3||SÅSÅWSSW 30WAWET0010Hd

U.S. Patent Sep. 2, 2003 Sheet 41 of 44 US 6,613,508 B1

5

5. 2

O

s U.S. Patent Sep. 2, 2003 Sheet 42 of 44 US 6,613,508 B1

HN-B00 H-BOC

Br FMOC OH 2N-121 FMOC 0N-1s H DEA H O O 9 10 H-BOC

pyrrolidine 10 re-as-a- O

O 11

O NH

OH O -N O

HATU,NMM -N O 12

NH, HC

O HCl 1,4-dioxane O -N O 13 Fig. 20A U.S. Patent Sep. 2, 2003 Sheet 43 of 44 US 6,613,508 B1

OE HO O OE O HATU,NMM -N O 14

14. --N NOOH O OH -N 0 15

o-na OE H2N O 15 + O O NO HATU N-1S NMM -N O O N02 16 rolO O OE 16 1 N NOOH O's OH -N H O O Fig. 20B N0, 17 U.S. Patent Sep. 2, 2003 Sheet 44 of 44 US 6,613,508 B1

HN-B00 O N-B00

O EO co O O -a- N-1S HN N1s HATUNMM N N 0 1 EO 0 18 NH, HC

O HC 1,4-dioxOne 18 o H 0N-1s EO O 19 O

O ONa O OH O -N N-1N —- Elo 0 20 HATU,NMM

O

HN NS O 1 N NOOH OH 20 NN- EO ors 0 21 Fig. 21 US 6,613,508 B1 1 2 METHODS AND COMPOSITIONS FOR A more Sensitive detection technique than dyes uses a ANALYZING NUCLEIC ACID MOLECULES radioactive (or nonradioactive) label. Typically, either a UTILIZING SIZING TECHNIQUES radiolabeled nucleotide or a radiolabeled primer is included in the PCR reaction. Following separation, the radiolabel is CROSS-REFERENCE TO RELATED “visualized' by autoradiography. Although more Sensitive, APPLICATIONS the detection Suffers from film limitations, Such as reciproc ity failure and non-linearity. These limitations can be over This application is a continuation-in-part of U.S. patent come by detecting the label by phosphor image analysis. application Ser. No. 08/786,834, filed Jan. 22, 1997, now However, radiolabels have Safety requirements, increasing abandoned; which application claims the benefit of provi resource utilization and necessitating Specialized equipment sional application No. 60/014,536, filed Jan. 23, 1996, and and perSonnel training. For Such reasons, the use of nonra of provisional application No. 60/020,487, filed Jun. 4, dioactive labels has been increasing in popularity. In Such 1996, all of which are incorporated by reference herein in Systems, nucleotides contain a label, Such as a fluorophore, their entirety. biotin or digoxin, which can be detected by an antibody or 15 other molecule (e.g., other member of a ligand pair) that is TECHNICAL FIELD labeled with an enzyme reactive with a chromogenic Sub The present invention relates generally to methods and Strate. These Systems do not have the Safety concerns as compositions for analyzing nucleic acid molecules, and described above, but use components that are often labile more Specifically to tags which may be utilized in a wide and may yield nonspecific reactions, resulting in high back variety of nucleic acid reactions, wherein Separation of ground (i.e., low signal-to-noise ratio). nucleic acid molecules based on Size is required. The present invention provides novel compositions and methods which may be utilized in a wide variety of nucleic BACKGROUND OF THE INVENTION acid reactions, and further provides other related advantages. Detection and analysis of nucleic acid molecules are among the most important techniques in biology. Such 25 SUMMARY OF THE INVENTION techniques are at the heart of molecular biology and play a Briefly Stated, the present invention provides composi rapidly expanding role in the rest of biology. tions and methods which may be utilized in a wide variety Generally, one type of analysis of nucleic acid reactions of ligand pair reactions wherein Separation of molecules of involves Separation of nucleic acid molecules based on interest, Such as nucleic acid molecules, based on size is length. For example, one widely used technique, polymerase required. Representative examples of methods which may chain reaction (PCR) (see, U.S. Pat. Nos. 4,683,195, 4,683, be enhanced given the disclosure provided herein include 202, and 4,800,159) has become a widely utilized technique PCR, differential display, RNA fingerprinting, PCR-SSCP, to both identify Sequences present in a Sample and to oligo litations assays, nuclease digestion methods (e.g., exo synthesize DNA molecules for further manipulation. 35 and endo-nuclease based assays), and dideoxy fingerprint Briefly, in PCR, DNA sequences are amplified by enzy ing. The methods described herein may be utilized in a wide matic reaction that Synthesizes new DNA Strands in either a array of fields, including, for example, in the development of geometric or linear fashion. Following amplification, the clinical or research-based diagnostics, the determination of DNA sequences must be detected and identified. Because of polymorphisms, and the development of genetic maps. non-specific amplifications, which would otherwise confuse 40 Within one aspect of the present invention, methods are analysis, or the need for purity, the PCR reaction products provided for determining the identity of a nucleic acid are generally Subjected to Separation prior to detection. molecule, comprising the Steps of (a) generating tagged Separation based on the size (i.e., length) of the products nucleic acid molecules from one or more Selected target yields the most useful information. The method giving the nucleic acid molecules, wherein a tag is correlative with a highest resolution of nucleic acid molecules is electro 45 particular nucleic acid fragment and detectable by non phoretic separation. In this method, each individual PCR fluorescent spectrometry or potentiometry, (b) separating the reaction is applied to an appropriate gel and Subjected to a tagged fragments by size, (c) cleaving the tags from the Voltage potential. The number of Samples that can be pro tagged fragments, and (d) detecting tags by non-fluorescent cessed is limited by the number of wells in the gel. On most Spectrometry or potentiometry, and therefrom determining gel apparatus, from approximately 10 to 64 Samples can be 50 the identity of the nucleic acid molecules. Separated in a single gel. Thus, processing large numbers of Within a related aspect of the invention, methods are Samples is both labor and material intensive. provided for detecting a Selected nucleic acid molecule, Electrophoretic Separation must be coupled with Some comprising the steps of (a) combining tagged nucleic acid detection System in order to obtain data. Detection Systems probes with target nucleic acid molecules under conditions of nucleic acids commonly, and almost exclusively, utilize 55 and for a time Sufficient to permit hybridization of a tagged an intercalating dye or radioactive label, and less frequently, nucleic acid probe to a complementary Selected target a nonradioactive label. Intercalating dyes, Such as ethidium nucleic acid Sequence, wherein a tagged nucleic acid probe bromide, are simple to use. The dye is included in the gel is detectable by non-fluorescent Spectrometry or matrix during electrophoresis or, following electrophoresis, potentiometry, (b) altering the size of hybridized tagged the gel is Soaked in a dye-containing Solution. The dye can 60 probes, unhybridized probes or target molecules, or the be directly visualized in Some cases, but more often, and for probe: target hybrids, (c) separating the tagged probes by ethidium bromide in particular, is excited by light (e.g., UV) size, (d) cleaving tags from the tagged probes, and (e) to fluoresce. In spite of this apparent ease of use, Such dyes detecting the tags by non-fluorescent Spectrometry or have Some notable disadvantages. First, the dyes are insen potentiometry, and therefrom detecting the Selected nucleic Sitive and there must be a large mass amount of nucleic acid 65 acid molecule. molecules in order to visualize the products. Second, the Within further aspects methods are provided for genotyp dyes are typically mutagenic or carcinogenic. ing a Selected organism, comprising the Steps of (a) gener US 6,613,508 B1 3 4 ating tagged nucleic acid molecules from a Selected target Spectrometry, electron ionization mass spectrometry, fast molecule, wherein a tag is correlative with a particular atom bombard ionization mass spectrometry, MALDI mass fragment and may be detected by non-fluorescent spectrom Spectrometry, photo-ionization time-of-flight mass etry or potentiometry, (b) separating the tagged molecules by Spectrometry, laser droplet mass spectrometry, MALDI Sequential length, (c) cleaving the tag from the tagged 5 TOF mass spectrometry, APCI maSS Spectrometry, nano molecule, and (d) detecting the tag by non-fluorescent Spray mass spectrometry, nebulised Spray ionization mass Spectrometry or potentiometry, and therefrom determining Spectrometry, chemical ionization mass spectrometry, reso nance ionization mass spectrometry, Secondary ionization the genotype of the organism. mass Spectrometry and thermospray mass spectrometry. Within another aspect, methods are provided for geno Within yet other embodiments of the invention, the target typing a selected organism, comprising the steps of (a) molecules, hybridized tagged probes, unhybridized probes combining a tagged nucleic acid molecule with a Selected or target molecules, probe: target hybrids, or tagged nucleic target molecule under conditions and for a time Sufficient to acid probes or molecules may be separated from other permit hybridization of the tagged molecule to the target molecules utilizing methods which discriminate between the molecule, wherein a tag is correlative with a particular Size of molecules (either actual linear size, or three fragment and may be detected by non-fluorescent spectrom 15 dimensional size). Representative examples of Such methods etry or potentiometry, (b) separating the tagged fragments by include gel electrophoresis, capillary electrophoresis, micro Sequential length, (c) cleaving the tag from the tagged channel electrophore SiS, HPLC, Size eXclusion fragment, and (d) detecting the tag by non-fluorescent spec chromatography, filtration, polyacrylamide gel trometry or potentiometry, and therefrom determining the electrophoresis, liquid chromatography, reverse size exclu genotype of the organism. Sion chromatography, ion-exchange chromatography, Within the context of the present invention it should be reverse phase liquid chromatography, pulsed-field understood that “biological Samples' include not only electrophoresis, field-inversion electrophoresis, dialysis, and Samples obtained from living organisms (e.g., mammals, fluorescence-activated liquid droplet Sorting. Alternatively, fish, bacteria, parasites, viruses, fungi and the like) or from the target molecules, hybridized tagged probes, unhybrid the environment (e.g., air, water or Solid Samples), but 25 ized probes or target molecules, probe: target hybrids, or biological materials which may be artificially or Syntheti tagged nucleic acid probes or molecules may be bound to a cally produced (e.g., phage libraries, organic molecule Solid Support (e.g., hollow fibers (Amicon Corporation, libraries, pools of genomic clones, cDNA clones, RNA Danvers, Mass.), beads (Polysciences, Warrington, Pa.), clones, or the like). Representative examples of biological magnetic beads (Robbin Scientific, Mountain View, Calif.), Samples include biological fluids (e.g., blood, Semen, cere plates, dishes and flasks (Coming Glass Works, Coming, bral spinal fluid, urine), biological cells (e.g., stem cells, B N.Y.), meshes (Becton Dickinson, Mountain View, Calif.), or T cells, liver cells, fibroblasts and the like), and biological tissues. Finally, representative examples of organisms that screens and solid fibers (see Edelman et al., U.S. Pat. No. may be genotyped include virtually any unicellular or mul 3.843,324; see also Kuroda etyal., U.S. Pat. No. 4.416,777), 35 membranes (Millipore Corp., Bedford, Mass.), and ticellular organism, Such as warm-blooded animals, mam dipsticks). If the first or Second member, or exposed nucleic mals or vertebrates (e.g., humans, chimps, macaques, acids are bound to a Solid Support, within certain embodi horses, cows, pigs, sheep, dogs, cats, rats and mice, as well ments of the invention the methods disclosed herein may as cells from any of these), bacteria, parasites, viruses, fungi further comprise the Step of Washing the Solid Support of and plants. 40 unbound material. Within various embodiments of the above-described Within other embodiments, the tagged nucleic acid mol methods, the nucleic acid probes and or molecules of the ecules or probes may be cleaved by a methods Such as present invention may be generated by, for example, a chemical, oxidation, reduction, acid-labile, base labile, ligation, cleavage or extension (e.g., PCR) reaction. Within enzymatic, electrochemical, heat and photolabile methods. other related aspects the nucleic acid probes or molecules 45 Within further embodiments, the Steps of Separating, cleav may be tagged by non-3' tagged oligonucleotide primers ing and detecting may be performed in a continuous manner, (e.g., 5'-tagged oligonucleotide primers) or dideoxynucle for example, on a single device which may be automated. otide terminators. Within certain embodiments of the invention, the size of Within other embodiments of the invention, 4, 5, 10, 15, the hybridized tagged probes, unhybridized probes or target 20, 25, 30, 35, 40, 45,50, 60, 70, 80,90, 100, 200, 250, 300, 50 molecules, or probe: target hybrids are altered by a method 350, 400, 450, or greater than 500 different and unique Selected from the group consisting of polymerase extension, tagged molecules may be utilized within a given reaction ligation, exonuclease digestion, endonuclease digestion, Simultaneously, wherein each tag is unique for a Selected restriction enzyme digestion, Site-specific recombinase nucleic acid molecule or fragment, or probe, and may be digestion, ligation, mismatch Specific nuclease digestion, Separately identified. 55 methylation-specific nuclease digestion, covalent attach Within further embodiments of the invention, the tag(s) ment of probe to target and hybridization. may be detected by fluorometry, mass spectrometry, infrared The methods an compositions described herein may be Spectrometry, ultraViolet spectrometry, or, potentioStatic utilized in a wide variety of applications, including for amperometry (e.g., utilizing coulometric or amperometric example, identifying PCR amplicons, RNA fingerprinting, detectors). Representative examples of Suitable spectromet 60 differential display, Single-Strand conformation polymor ric techniques include time-of-flight mass Spectrometry, phism detection, dideoxyfingerprinting, restriction maps and quadrupole mass spectrometry, magnetic Sector mass Spec restriction fragment length polymorphisms, DNA trometry and electric Sector mass spectrometry. Specific fingerprinting, genotyping, mutation detection, oligonucle embodiments of Such techniques include ion-trap mass otide ligation assay, Sequence Specific amplifications, for Spectrometry, electrospray ionization mass Spectrometry, 65 diagnostics, forensics, identification, developmental ion-spray maSS Spectrometry, liquid ionization mass biology, biology, molecular medicine, toxicology, animal Spectrometry, atmospheric pressure ionization maSS breeding, US 6,613,508 B1 S 6 These and other aspects of the present invention will wherein Separation of nucleic acid molecules based on size become evident upon reference to the following detailed is required. The present methods permit the Simultaneous description and attached drawings. In addition, various detection of molecules of interest, which include nucleic references are set forth below which describe in more detail acids and fragments, proteins, peptides, etc. certain procedures or compositions (e.g., plasmids, etc.), and are therefore incorporated by reference in their entirety. Briefly Stated, in one aspect the present invention pro vides compounds wherein a molecule of interest, or precur BRIEF DESCRIPTION OF THE DRAWINGS sor thereto, is linked via a labile bond (or labile bonds) to a FIGS. 1A, 1B, and 1C depicts the flowchart for the tag. Thus, compounds of the invention may be viewed as Synthesis of pentafluorophenyl esters of chemically cleav having the general formula: able mass spectroScopy tags, to liberate tags with carboxyl amide termini. FIGS. 2A, 2B, and 2C depicts the flowchart for the Synthesis of pentafluorophenyl esters of chemically cleav 15 wherein T is the tag component, L is the linker component able mass spectroScopy tags, to liberate tags with carboxyl that either is, or contains, a labile bond, and X is either the acid termini. molecule of interest (MOI) component or a functional group FIGS. 3A, 3B, and 3C; 4A, 4B, and 4C; 5A, 5B, and component (L.) through which the MOI may be joined to 5C;-6A, 6B, and 6C; and 8A, 8B, and 8C depict the T L. Compounds of the invention may therefore be rep flowchart for the synthesis of tetrafluorophenyl esters of a Set of 36 photochemically cleavable mass spectroscopy tags. resented by the more Specific general formulas: FIGS. 7A, 7B, and 7C depicts the flowchart for the T-L-MOI Synthesis of a Set of 36 amine-terminated photochemically cleavable mass spectroscopy tags. and FIG. 9 depicts the synthesis of 36 photochemically cleav 25 able mass SpectroScopy tagged oligonucleotides made from the corresponding Set of 36 tetrafluorophenyl esters of photochemically cleavable mass spectroscopy tag acids. T-L-L FIGS. 10A and 10B depicts the synthesis of 36 photo For reasons described in detail below, sets of T-L-MOI chemically cleavable mass spectroScopy tagged oligonucle compounds may be purposely Subjected to conditions that otides made from the corresponding Set of 36 amine cause the labile bond(s) to break, thus releasing a tag moiety terminated photochemically cleavable mass spectroscopy from the remainder of the compound. The tag moiety is then tags. characterized by one or more analytical techniques, to FIG. 11 illustrates the simultaneous detection of multiple 35 thereby provide direct information about the structure of the tags by mass Spectrometry. tag moiety, and (most importantly) indirect information FIG. 12 shows the mass spectrogram of the alpha-cyano about the identity of the corresponding MOI. matrix alone. AS a simple illustrative example of a representative com FIG. 13 depicts a modularly-constructed tagged nucleic pound of the invention wherein L is a direct bond, reference acid fragment. 40 is made to the following structure (i): FIGS. 14A-14I show the separation of DNA fragments by HPLC using a variety of different buffer solutions. Structure (i) FIG. 15 is a Schematic representation of genetic finger printing and differential display Systems in accordance with an exemplary embodiment of the present invention. 45 N N -(Nucleic Acid Fragment) FIG. 16 is a Schematic representation of genetic finger 21 printing and differential display Systems in accordance with an exemplary embodiment of the present invention. FIG. 17 is a Schematic representation of assay Systems in | accordance with an exemplary embodiment of the present 50 Linker (L) component invention. FIG. 18 is a Schematic representation of assay Systems in accordance with an exemplary embodiment of the present Tag component Molecule of Interest invention. component FIGS. 19A and 19B illustrate the preparation of a cleav 55 able tag of the present invention. FIGS. 20A and 20B illustrate the preparation of a cleav In structure (i), T is a nitrogen-containing polycyclic aro able tag of the present invention. matic moiety bonded to a carbonyl group, X is a MOI (and FIG. 21 illustrates the preparation of an intermediate 60 Specifically a nucleic acid fragment terminating in an amine compound useful in the preparation of a cleavable tag of the group), and L is the bond which forms an amide group. The invention. amide bond is labile relative to the bonds in T because, as recognized in the art, an amide bond may be chemically DETAILED DESCRIPTION OF THE cleaved (broken) by acid or base conditions which leave the INVENTION 65 bonds within the tag component unchanged. Thus, a tag AS noted above, the present invention provides compo moiety (i.e., the cleavage product that contains T) may be Sitions and methods for analyzing nucleic acid molecules, released as shown below: US 6,613,508 B1 8 -continued Structure (i) O NO

1. (Nucleic Acid Fragment) 5 Sn (Nucleic Acid N N Fragment) H Tag Moiety Remainder of the Compound

sis or base The invention thus provides compounds which, upon exposure to appropriate cleavage conditions, undergo a O cleavage reaction So as to release a tag moiety from the N (Nucleic Acid Fragment) remainder of the compound. Compounds of the invention 21 OH HN1 15 may be described in terms of the tag moiety, the MOI (or precursor thereto, L), and the labile bond(s) which join the two groups together. Alternatively, the compounds of the invention may be described in terms of the components from N C which they are formed. Thus, the compounds may be described as the reaction product of a tag reactant, a linker Tag Moiety Reminder of the Compound reactant and a MOI reactant, as follows. The tag reactant consists of a chemical handle (T) and a However, the linker L. may be more than merely a direct variable component (T), So that the tag reactant is seen to bond, as shown in the following illustrative example, where have the general Structure: reference is made to another representative compound of the 25 T-T, invention having the structure (ii) shown below: To illustrate this nomenclature, reference may be made to Structure (iii), which shows a tag reactant that may be used Structure (ii) to prepare the compound of Structure (ii). The tag reactant having structure (iii) contains a tag variable component and a tag handle, as shown below: Structure (iii) (Nucleic Acid O Fragment) 35 N 21 A N

40 It is well-known that compounds having an ortho nitrobenzylamine moiety (see boxed atoms within structure Tag Variable Tag (ii)) are photolytically unstable, in that exposure of Such Component Handle compounds to actinic radiation of a Specified wavelength 45 will cause Selective cleavage of the benzylamine bond (see In structure (iii), the tag handle (-C(=O)-A) simply provides an avenue for reacting the tag reactant with the bond denoted with heavy line in structure (ii)). Thus, struc linker reactant to form a T-L moiety. The group “A” in ture (ii) has the same T and MOI groups as Structure (i), Structure (iii) indicates that the carboxyl group is in a however the linker group contains multiple atoms and bonds 50 chemically active State, So it is ready for coupling with other within which there is a particularly labile bond. Photolysis handles. “A” may be, for example, a hydroxyl group or of Structure (ii) thus releases a tag moiety (T-containing pentafluorophenoxy, among many other possibilities. The moiety) from the remainder of the compound, as shown invention provides for a large number of possible tag below. handles which may be bonded to a tag variable component, 55 as discussed in detail below. The tag variable component is Structure (ii) thus a part of “T” in the formula T-L-X, and will also be part of the tag moiety that forms from the reaction that cleaves L. AS also discussed in detail below, the tag variable com ponent is So-named because, in preparing Sets of compounds N (Nucleic Acid according to the invention, it is desired that members of a Set N Fragment) have unique variable components, So that the individual H members may be distinguished from one another by an analytical technique. AS one example, the tag variable 65 component of structure (iii) may be one member of the y following Set, where members of the Set may be distin guished by their UV or mass spectra: US 6,613,508 B1 10 activated carboxyl group of the tag reactant (iii) may cleanly react with the amine group of the linker reactant (iv) to form an amide bond and give a compound of the formula T-L- Lh. The MOI reactant is a suitably reactive form of a molecule of interest. Where the molecule of interest is a nucleic acid fragment, a Suitable MOI reactant is a nucleic acid fragment bonded through its 5’ hydroxyl group to a phosphodiester group and then to an alkylene chain that terminates in an amino group. This amino group may then react with the carbonyl group of structure (iv), (after, of course, deprotect ing the carbonyl group, and preferably after Subsequently activating the carbonyl group toward reaction with the amine group) to thereby join the MOI to the linker. 15 When viewed in a chronological order, the invention is Seen to take a tag reactant (having a chemical tag handle and a tag variable component), a linker reactant (having two chemical linker handles, a required labile moiety and 0-2 optional labile moieties) and a MOI reactant (having a molecule of interest component and a chemical molecule of interest handle) to form T L-MOI. Thus, to form T -L- MOI, either the tag reactant and the linker reactant are first reacted together to provide T-L-L, and then the MOI reactant is reacted with T-L-L, So as to provide T-L- 25 MOI, or else (less preferably) the linker reactant and the Likewise, the linker reactant may be described in terms of MOI reactant are reacted together first to provide L-L- its chemical handles (there are necessarily at least two, each MOI, and then L-L-MOI is reacted with the tag reactant of which may be designated as L.) which flank a linker to provide T-L-MOI. For purposes of convenience, com labile component, where the linker labile component con pounds having the formula T L-MOI will be described in sists of the required labile moiety (L) and optional labile terms of the tag reactant, the linker reactant and the MOI moieties (L' and L), where the optional labile moieties reactant which may be used to form Such compounds. Of effectively serve to separate Li from the handles L., and the course, the same compounds of formula T-L-MOI could required labile moiety serves to provide a labile bond within be prepared by other (typically, more laborious) methods, the linker labile component. Thus, the linker reactant may be and still fall within the scope of the inventive T-L-MOI Seen to have the general formula: 35 compounds. L-L-L-L-L In any event, the invention provides that a T-L-MOI compound be Subjected to cleavage conditions, Such that a The nomenclature used to describe the linker reactant may tag moiety is released from the remainder of the compound. be illustrated in View of structure (iv), which again draws The tag moiety will comprise at least the tag variable from the compound of Structure (ii): 40 component, and will typically additionally comprise Some or all of the atoms from the tag handle, Some or all of the atoms Structure (iv) from the linker handle that was used to join the tag reactant NO2 to the linker reactant, the optional labile moiety L' if this group was present in T-L-MOI, and will perhaps contain 45 Some part of the required labile moiety Li depending on the N H Y precise structure of L and the nature of the cleavage Linker chemistry. For convenience, the tag moiety may be referred Handle Linker to as the T-containing moiety because T will typically 2 Handle L constitute the major portion (in terms of mass) of the tag 50 moiety. L3 Given this introduction to one aspect of the present invention, the various components T, L and X will be AS structure (iv) illustrates, atoms may serve in more than described in detail. This description begins with the follow one functional role. Thus, in structure (iv), the benzyl ing definitions of certain terms, which will be used herein nitrogen functions as a chemical handle in allowing the 55 after in describing T, L and X. linker reactant to join to the tag reactant via an amide AS used herein, the term “nucleic acid fragment’ means forming reaction, and Subsequently also serves as a neces a molecule which is complementary to a Selected target sary part of the structure of the labile moiety L in that the nucleic acid molecule (i.e., complementary to all or a portion benzylic carbon-nitrogen bond is particularly Susceptible to thereof), and may be derived from nature or Synthetically or photolytic cleavage. Structure (iv) also illustrates that a 60 recombinantly produced, including non-naturally occurring linker reactant may have an Li group (in this case, a molecules, and may be in double or Single Stranded form methylene group), although not have an L' group. Likewise, where appropriate; and includes an oligonucleotide (e.g., linker reactants may have an L' group but not an Ligroup, DNA or RNA), a primer, a probe, a nucleic acid analog (e.g., or may have L' and Li groups, or may have neither of L' nor PNA), an oligonucleotide which is extended in a 5' to 3' Ligroups. In structure (iv), the presence of the group “P” 65 direction by a polymerase, a nucleic acid which is cleaved next to the carbonyl group indicates that the carbonyl group chemically or enzymatically, a nucleic acid that is termi is protected from reaction. Given this configuration, the nated with a dideoxy terminator or capped at the 3' or 5' end US 6,613,508 B1 11 12 with a compound that prevents polymerization at the 5' or 3' The term “aryl” refers to a carbocyclic (consisting entirely end, and combinations thereof. The complementarity of a of carbon and hydrogen) aromatic group Selected from the nucleic acid fragment to a Selected target nucleic acid group consisting of phenyl, naphthyl, indenyl, indanyl, molecule generally means the exhibition of at least about aZulenyl, fluorenyl, and anthracenyl, or a heterocyclic aro 70% specific base pairing throughout the length of the matic group Selected from the group consisting of furyl, fragment. Preferably the nucleic acid fragment exhibits at thienyl, pyridyl, pyrrolyl, oxazolyly, thiazolyl, imidazolyl, least about 80% specific base pairing, and most preferably at pyrazolyl, 2-pyrazolinyl, pyrazolidinyl, isoxazolyl, least about 90%. Assays for determining the percent mis isothiazolyl, 1,2,3-oxadiazolyl, 1,2,3-triazolyl, 1,3,4- match (and thus the percent specific base pairing) are well thiadiazolyl, pyridaZinyl, pyrimidinyl, pyrazinyl, 1,3,5- known in the art and are based upon the percent mismatch triazinyl, 1,3,5-trithianyl, indolizinyl, indolyl, isolindolyl, as a function of the Tm when referenced to the fully base 3H-indolyl, indolinyl, ben Zob fur any 1, 2,3- paired control. dihydrobenzofuranyl, benzobthiophenyl, 1H-indazolyl, AS used herein, the term “alkyl, alone or in combination, benzimidazolyl, benzthiazolyl, purinyl, 4H-quinolizinyl, refers to a Saturated, Straight-chain or branched-chain hydro quinolinyl, isoquinolinyl, cinnolinyl, phthalazinyl, carbon radical containing from 1 to 10, preferably from 1 to quinazolinyl, quinoxalinyl, 1,8-naphthyridinyl, pteridinyl, 6 and more preferably from 1 to 4, carbon atoms. Examples 15 carbazolyl, acridinyl, phenazinyl, phenothiazinyl, and phe of Such radicals include, but are not limited to, methyl, ethyl, noxazinyl. n-propyl, iso-propyl, n-butyl, iso-butyl, Sec-butyl, tert-butyl, "Aryl groups, as defined in this application may inde pentyl, iso-amyl, hexyl, decyl and the like. The term “alky pendently contain one to four Substituents which are inde lene' refers to a Saturated, Straight-chain or branched chain pendently Selected from the group consisting of hydrogen, hydrocarbon diradical containing from 1 to 10, preferably halogen, hydroxyl, amino, nitro, trifluoromethyl, from 1 to 6 and more preferably from 1 to 4, carbon atoms. trifluoromethoxy, alkyl, alkenyl, alkynyl, cyano, carboxy, Examples of Such diradicals include, but are not limited to, carboalkoxy, 1,2-dioxyethylene, alkoxy, alkenoxy or methylene, ethylene (-CH-CH-), propylene, and the alkynoxy, alkylamino, alkenylamino, alkynylamino, ali like. phatic or aromatic acyl, alkoxy-carbonylamino, The term “alkenyl,” alone or in combination, refers to a 25 alkylsulfonylamino, morpholino carbonylamino, Straight-chain or branched-chain hydrocarbon radical having thiomorpholinocarbonylamino, N-alkyl guanidino, aralky at least one carbon-carbon double bond in a total of from 2 lamino Sulfonyl, aralkoxyalkyl, N-aralkoxyurea; to 10, preferably from 2 to 6 and more preferably from 2 to N-hydroxylurea; N-alkenylurea; N,N-(alkyl, hydroxyl)urea; 4, carbon atoms. Examples of Such radicals include, but are heterocyclyl; thioaryloxy-substituted aryl; N,N-(aryl, alkyl) not limited to, ethenyl, E- and Z-propenyl, isopropenyl, E hydrazino, Ar"-Substituted Sulfonylheterocyclyl, aralkyl and Z-butenyl, E- and Z-isobutenyl, E- and Z-pentenyl, Substituted heterocyclyl cycloalkyl and cycloakenyl decenyl and the like. The term “alkenylene' refers to a Substituted heterocyclyl, cycloalkyl-fused aryl; aryloxy Straight-chain or branched-chain hydrocarbon diradical hav Substituted alkyl; heterocyclylamino; aliphatic or aromatic ing at least one carbon-carbon double bond in a total of from acylaminocarbonyl, aliphatic or aromatic acyl-Substituted 2 to 10, preferably from 2 to 6 and more preferably from 2 35 alkenyl, Ar"-Substituted aminocarbonyloxy, Ar", Ar"- to 4, carbon atoms. Examples of Such diradicals include, but disubstituted aryl; aliphatic or aromatic acyl-Substituted are not limited to, methylidene (=CH-), ethylidene acyl, cycloalkylcarbonylalkyl, cycloalkyl-Substituted (-CH=CH-), propylidene (-CH-CH=CH-) and amino, aryloxycarbonylalkyl, phosphorodiamidyl acid or the like. ester, The term “alkynyl,” alone or in combination, refers to a 40 “Ar' is a carbocyclic or heterocyclic aryl group as Straight-chain or branched-chain hydrocarbon radical having defined above having one to three Substituents Selected from at least one carbon-carbon triple bond in a total of from 2 to the group consisting of hydrogen, halogen, hydroxyl, amino, 10, preferably from 2 to 6 and more preferably from 2 to 4, nitro, trifluoromethyl, trifluoromethoxy, alkyl, alkenyl, carbon atoms. Examples of Such radicals include, but are not alkynyl, 1,2-dioxymethylene, 1,2-dioxyethylene, alkoxy, limited to, ethynyl (acetylenyl), propynyl (propargyl), 45 alkenoxy, alkynoxy, alkylamino, alkenylamino or butynyl, hexynyl, decy nyl and the like. The term alkynylamino, alkylcarbonyloxy, aliphatic or aromatic acyl, “alkynylene', alone or in combination, refers to a Straight alkyl carbonyl amino, alko Xy carbony la mino, chain or branched-chain hydrocarbon diradical having at alkylsulfonylamino, N-alkyl or N,N-dialkyl urea. least one carbon-carbon triple bond in a total of from 2 to 10, The term “alkoxy,” alone or in combination, refers to an preferably from 2 to 6 and more preferably from 2 to 4. 50 alkyl ether radical, wherein the term “alkyl is as defined carbon atoms. Examples of Such radicals include, but are not above. Examples of suitable alkyl ether radicals include, but limited, ethynylene (-C=C- ), propynylene (-CH are not limited to, methoxy, ethoxy, n-propoxy, iSo-propoxy, C=C ) and the like. n-butoxy, iso-butoxy, Sec-butoxy, tert-butoxy and the like. The term “cycloalkyl,” alone or in combination, refers to The term “alkenoxy, alone or in combination, refers to a a Saturated, cyclic arrangement of carbon atoms which 55 radical of formula alkenyl-O-, wherein the term “alkenyl number from 3 to 8 and preferably from 3 to 6, carbon is as defined above provided that the radical is not an enol atoms. Examples of Such cycloalkyl radicals include, but are ether. Examples of Suitable alkenoxy radicals include, but not limited to, cyclopropyl, cyclobutyl, cyclopentyl, cyclo are not limited to, allyloxy, E- and Z-3-methyl-2-propenoxy hexyl and the like. The term “cycloalkylene' refers to a and the like. diradical form of a cycloalkyl. 60 The term “alkynyloxy,” alone or in combination, refers to The term “cycloalkenyl,” alone or in combination, refers a radical of formula alkynyl-O-, wherein the term “alky to a cyclic carbocycle containing from 4 to 8, preferably 5 nyl' is as defined above provided that the radical is not an or 6, carbon atoms and one or more double bonds. Examples ynol ether. Examples of Suitable alkynoxy radicals include, of Such cycloalkenyl radicals include, but are not limited to, but are not limited to, propargyloxy, 2-butynyloxy and the cyclopentenyl, cyclohexenyl, cyclopentadienyl and the like. 65 like. The term “cycloalkenylene' refers to a diradical form of a The term “thioalkoxy' refers to a thioether radical of cycloalkenyl. formula alkyl-S-, wherein alkyl is as defined above. US 6,613,508 B1 13 14 The term “alkylamino,” alone or in combination, refers to gen atoms in order to be an independent Stable molecule. a mono- or di-alkyl-substituted amino radical (i.e., a radical Thus, a hydrocarbon radical has two open Valence Sites on of formula alkyl-NH- or (alkyl)-N-), wherein the term one or two carbon atoms, through which the hydrocarbon “alkyl” is as defined above. Examples of suitable alkylamino radical may be bonded to other atom(s). Alkylene, radicals include, but are not limited to, methylamino, alkenylene, alkynylene, cycloalkylene, etc. are examples of ethylamino, propylamino, isopropylamino, t-butylamino, hydrocarbon diradicals. N,N-diethylamino and the like. The term “hydrocarbyl” refers to any stable arrangement The term “alkenylamino,” alone or in combination, refers consisting entirely of carbon and hydrogen having a single Valence Site to which it is bonded to another moiety, and thus to a radical of formula alkenyl-NH- or (alkenyl)-N-, includes radicals known as alkyl, alkenyl, alkynyl, wherein the term “alkenyl' is as defined above, provided cycloalkyl, cycloalkenyl, aryl (without heteroatom incorpo that the radical is not an enamine. An example of Such ration into the aryl ring), arylalkyl, alkylaryl and the like. alkenylamino radicals is the allylamino radical. Hydrocarbon radical is another name for hydrocarbyl. The term “alkynylamino,” alone or in combination, refers The term “hydrocarbylene' refers to any stable arrange to a radical of formula alkynyl-NH- or (alkynyl)-N-, ment consisting entirely of carbon and hydrogen having two wherein the term “alkynyl' is as defined above, provided 15 Valence Sites to which it is bonded to other moieties, and thus that the radical is not an ynamine. An example of Such includes alkylene, alkenylene, alkynylene, cycloalkylene, alkynylamino radicals is the propargyl amino radical. cycloalkenylene, arylene (without heteroatom incorporation The term "amide" refers to either-N(R)-C(=O)- or into the arylene ring), arylalkylene, alkylarylene and the -C(=O)-N(R)- where R is defined herein to include like. Hydrocarbon diradical is another name for hydrocar hydrogen as well as other groups. The term “Substituted bylene. amide” refers to the situation where R is not hydrogen, The term “hydrocarbyl-O-hydrocarbylene' refers to a while the term "unsubstituted amide” refers to the situation hydrocarbyl group bonded to an oxygen atom, where the where R is hydrogen. oxygen atom is likewise bonded to a hydrocarbylene group The term “aryloxy,” alone or in combination, refers to a at one of the two valence sites at which the hydrocarbylene radical of formula aryl-O-, wherein aryl is as defined 25 group is bonded to other moieties. The terms “hydrocarbyl above. Examples of aryloxy radicals include, but are not S-hydrocarbylene”, “hydrocarbyl-NH-hydrocarbylene' and limited to, phenoxy, naphthoxy, pyridyloxy and the like. “hydrocarbyl-amide-hydrocarbylene' have equivalent The term “arylamino,” alone or in combination, refers to meanings, where oxygen has been replaced with Sulfur, a radical of formula aryl-NH-, wherein aryl is as defined -NH- or an amide group, respectively. above. Examples of arylamino radicals include, but are not The term N-(hydrocarbyl)hydrocarbylene refers to a limited to, phenylamino (anilido), naphthylamino, 2-, 3- and hydrocarbylene group wherein one of the two Valence Sites 4-pyridylamino and the like. is bonded to a nitrogen atom, and that nitrogen atom is The term “aryl-fused cycloalkyl, alone or in Simultaneously bonded to a hydrogen and a hydrocarbyl combination, refers to a cycloalkyl radical which shares two group. The term N,N-di(hydrocarbyl)hydrocarbylene refers adjacent atoms with an aryl radical, wherein the terms 35 to a hydrocarbylene group wherein one of the two Valence “cycloalkyl” and “aryl” are as defined above. An example of Sites is bonded to a nitrogen atom, and that nitrogen atom is an aryl-fused cycloalkyl radical is the benzofused cyclobutyl Simultaneously bonded to two hydrocarbyl groupS. radical. The term “hydrocarbylacyl-hydrocarbylene' refers to a The term “alkylcarbonylamino,” alone or in combination, hydrocarbyl group bonded through an acyl (-C(=O) ) refers to a radical of formula alkyl-CONH, wherein the term 40 group to one of the two Valence sites of a hydrocarbylene “alkyl” is as defined above. grOup. The term “alkoxycarbonylamino,” alone or in The terms "heterocyclylhydrocarbyl” and “heterocylyl” combination, refers to a radical of formula alkyl-OCONH-, refer to a stable, cyclic arrangement of atoms which include wherein the term “alkyl is as defined above. carbon atoms and up to four atoms (referred to as The term “alkylsulfonylamino,” alone or in combination, 45 heteroatoms) Selected from Oxygen, nitrogen, phosphorus refers to a radical of formula alkyl-SONH-, wherein the and Sulfur. The cyclic arrangement may be in the form of a term “alkyl” is as defined above. monocyclic ring of 3–7 atoms, or a bicyclic ring of 8–11 The term “arylsulfonylamino,” alone or in combination, atoms. The rings may be Saturated or unsaturated (including refers to a radical of formula aryl-SONH-, wherein the aromatic rings), and may optionally be benzofused. Nitrogen term “aryl' is as defined above. 50 and Sulfur atoms in the ring may be in any oxidized form, The term "N-alkylurea,” alone or in combination, refers to including the quaternized form of nitrogen. A heterocyclyl a radical of formula alkyl-NH-CO-NH-, wherein the hydrocarbyl may be attached at any endocyclic carbon or term “alkyl” is as defined above. heteroatom which results in the creation of a stable Structure. The term “N-arylurea,” alone or in combination, refers to Preferred heterocyclylhydrocarbyls include 5-7 membered a radical of formula aryl-NH-CO-NH-, wherein the 55 monocyclic heterocycles containing one or two nitrogen term “aryl' is as defined above. heteroatoms. The term “halogen” means fluorine, chlorine, bromine A substituted heterocyclylhydrocarbyl refers to a hetero and iodine. cyclylhydrocarbyl as defined above, wherein at least one The term “hydrocarbon radical” refers to an arrangement ring atom thereof is bonded to an indicated Substituent of carbon and hydrogen atoms which need only a single 60 which extends off of the ring. hydrogen atom to be an independent Stable molecule. Thus, In referring to hydrocarbyl and hydrocarbylene groups, a hydrocarbon radical has one open Valence Site on a carbon the term "derivatives of any of the foregoing wherein one or atom, through which the hydrocarbon radical may be more hydrogens is replaced with an equal number of fluo bonded to other atom(s). Alkyl, alkenyl, cycloalkyl, etc. are rides' refers to molecules that contain carbon, hydrogen and examples of hydrocarbon radicals. 65 fluoride atoms, but no other atoms. The term “hydrocarbon diradical” refers to an arrange The term “activated ester is an ester that contains a ment of carbon and hydrogen atoms which need two hydro “leaving group” which is readily displaceable by a US 6,613,508 B1 15 16 nucleophile, Such as an amine, and alcohol or a thiol those properties because they have been designed into the nucleophile. Such leaving groups are well known and tag variable component, which will typically constitute the include, without limitation, N-hydroxy Succinimide, major portion of the tag moiety. In the following discussion, N-hydroxybenzotriazole, halogen (halides), alkoxy includ the use of the word “tag” typically refers to the tag moiety ing tetrafluorophenolates, thioalkoxy and the like. The term (i.e., the cleavage product that contains the tag variable “protected ester” refers to an ester group that is masked or component), however can also be considered to refer to the otherwise unreactive. See, e.g., Greene, "Protecting Groups tag variable component itself because that is the portion of In Organic Synthesis.” the tag moiety which is typically responsible for providing In view of the above definitions, other chemical terms used throughout this application can be easily understood by the uniquely detectable properties. In compounds of the those of skill in the art. Terms may be used alone or in any formula T-L-X, the “T” portion will contain the tag combination thereof. The preferred and more preferred variable component. Where the tag variable component has chain lengths of the radicals apply to all Such combinations. been designed to be characterized by, e.g., maSS A. Generation of Tagged Nucleic Acid Fragments spectrometry, the “T” portion of T L-X may be referred AS noted above, one aspect of the present invention 15 to as T". Likewise, the cleavage product from T-L-X provides a general Scheme for DNA sequencing which that contains T may be referred to as the T"-containing allows the use of more than 16 tags in each lane, with moiety. The following spectroscopic and potentiometric continuous detection, the tags can be detected and the methods may be used to characterize T"-containing moi Sequence read as the size Separation is occurring, just as with eties. conventional fluorescence-based Sequencing. This Scheme is a. Characteristics of MS Tags applicable to any of the DNA sequencing techniques-based Where a tag is analyzable by mass spectrometry (i.e., is a on size Separation of tagged molecules. Suitable tags and linkers for use within the present invention, as well as MS-readable tag, also referred to herein as a MS tag or methods for Sequencing nucleic acids, are discussed in more “T"-containing moiety”), the essential feature of the tag is detail below. 25 that it is able to be ionized. It is thus a preferred element in 1. Tags the design of MS-readable tags to incorporate therein a "Tag”, as used herein, generally refers to a chemical chemical functionality which can carry a positive or nega moiety which is used to uniquely identify a "molecule of tive charge under conditions of ionization in the MS. This interest', and more Specifically refers to the tag variable feature conferS improved efficiency of ion formation and component as well as whatever may be bonded most closely greater overall Sensitivity of detection, particularly in elec to it in any of the tag reactant, tag component and tag moiety. trospray ionization. The chemical functionality that Supports A tag which is useful in the present invention possesses an ionized charge may derive from T" or L or both. Factors Several attributes: that can increase the relative Sensitivity of an analyte being 1) It is capable of being distinguished from all other tags. detected by mass spectrometry are discussed in, e.g., Sunner, This discrimination from other chemical moieties can be 35 J., et al., Anal. Chem. 60:1300–1307 (1988). based on the chromatographic behavior of the tag A preferred functionality to facilitate the carrying of a (particularly after the cleavage reaction), its spectroscopic or negative charge is an organic acid, Such as phenolic potentiometric properties, or Some combination thereof. hydroxyl, carboxylic acid, phosphonate, phosphate, Spectroscopic methods by which tags are usefully distin tetrazole, Sulfonyl urea, perfluoro alcohol and Sulfonic acid. guished include mass spectroscopy (MS), infrared (IR), 40 ultraviolet (UV), and fluorescence, where MS, IR and UV Preferred functionality to facilitate the carrying of a are preferred, and MS most preferred spectroscopic meth positive charge under ionization conditions are aliphatic or ods. Potentiometric amperometry is a preferred potentio aromatic amines. Examples of amine functional groups metric method. which give enhanced detectability of MS tags include qua 2) The tag is capable of being detected when present at 45 ternary amines (i.e., amines that have four bonds, each to 10° to 10 mole. carbon atoms, see Aebersold, U.S. Pat. No. 5,240,859) and 3) The tag possesses a chemical handle through which it tertiary amines (i.e., amines that have three bonds, each to can be attached to the MOI which the tag is intended to carbon atoms, which includes C=N-C groupS Such as are uniquely identify. The attachment may be made directly to present in pyridine, see Hess et al., Anal. Biochem. 224:373, the MOI, or indirectly through a “linker' group. 50 1995; Bures et al., Anal. Biochem. 224:364, 1995). Hindered 4) The tag is chemically stable toward all manipulations tertiary amines are particularly preferred. Tertiary and qua to which it is Subjected, including attachment and cleavage ternary amines may be alkyl or aryl. A T"-containing from the MOI, and any manipulations of the MOI while the moiety must bear at least one ionizable species, but may tag is attached to it. possess more than one ionizable Species. The preferred 5) The tag does not significantly interfere with the 55 charge State is a single ionized Species per tag. Accordingly, manipulations performed on the MOI while the tag is it is preferred that each T"-containing moiety (and each tag attached to it. For instance, if the tag is attached to an variable component) contain only a single hindered amine or oligonucleotide, the tag must not significantly interfere with organic acid group. any hybridization or enzymatic reactions (e.g., PCR Suitable amine-containing radicals that may form part of Sequencing reactions) performed on the oligonucleotide. 60 the T"-containing moiety include the following: Similarly, if the tag is attached to an antibody, it must not Significantly interfere with antigen recognition by the anti body. A tag moiety which is intended to be detected by a certain Spectroscopic or potentiometric method should possess 65 properties which enhance the Sensitivity and Specificity of (C1-C10); detection by that method. Typically, the tag moiety will have US 6,613,508 B1 17 18 -continued for mass Spectrometers to distinguish among moieties hav ing parentions below about 200–250 daltons (depending on the precise instrument), and thus preferred T"-containing moieties of the invention have masses above that range. AS explained above, the T"-containing moiety may con tain atoms other than those present in the tag variable component, and indeed other than present in T" itself. Accordingly, the mass of T' itself may be less than about 250 daltons, So long as the T"-containing moiety has a mass of at least about 250 daltons. Thus, the mass of T" may range from 15 (i.e., a methyl radical) to about 10,000 daltons, and preferably ranges from 100 to about 5,000 daltons, and more preferably ranges from about 200 to about N-(C-Clo); 1,000 daltons. 15 It is relatively difficult to distinguish tags by mass spec trometry when those tags incorporate atoms that have more than one isotope in Significant abundance. Accordingly, (C-Co)-N preferred T groups which are intended for mass Spectro Scopic identification (T" groups), contain carbon, at least one of hydrogen and fluoride, and optional atoms Selected N from Oxygen, nitrogen, Sulfur, phosphorus and iodine. While other atoms may be present in the T", their presence can co-( ) render analysis of the mass spectral data Somewhat more difficult. Preferably, the T" groups have only carbon, nitro \ 25 gen and oxygen atoms, in addition to hydrogen and/or fluoride. co-( ) Fluoride is an optional yet preferred atom to have in a T" N group. In comparison to hydrogen, fluoride is, of course, much heavier. Thus, the presence of fluoride atoms rather co-( ) than hydrogen atoms leads to T' groups of higher mass, thereby allowing the T" group to reach and exceed a mass of greater than 250 daltons, which is desirable as explained (C-Co)-N O : above. In addition, the replacement of hydrogen with fluo ride conferS greater Volatility on the T"-containing moiety, (C1-C10) 35 and greater Volatility of the analyte enhances Sensitivity /- when mass spectrometry is being used as the detection (C-Co)-N method. The molecular formula of T" falls within the scope of (C1-C10) C1-soo No-oooo-ooSo-oPo-oHFI's wherein the Sum of C., f: 40 and 8 is sufficient to satisfy the otherwise unsatisfied valen cies of the C, N, O, S and P atoms. The designation C1-soo No-oooo-oo So-oPo-oHFI's means that T" con tains at least one, and may contain any number from 1 to 500 carbon atoms, in addition to optionally containing as many 45 as 100 nitrogen atoms (“No” means that T" need not contain any nitrogen atoms), and as many as 100 oxygen atoms, and as many as 10 Sulfur atoms and as many as 10 phosphorus atoms. The Symbols C, B and ö represent the number of hydrogen, fluoride and iodide atoms in T", 50 where any two of these numbers may be Zero, and where the Sum of these numbers equals the total of the otherwise unsatisfied valencies of the C, N, O, S and P atoms. Preferably, T" has a molecular formula that falls within the Scope of Clso No.oOo.o.H.F where the Sum of C. and f 55 equals the number of hydrogen and fluoride atoms, respectively, present in the moiety. b. Characteristics of IR Tags There are two primary forms of IR detection of organic chemical groups: Raman Scattering IR and absorption IR. 60 Raman Scattering IR spectra and absorption IR spectra are The identification of a tag by mass spectrometry is complementary Spectroscopic methods. In general, Raman preferably based upon its molecular mass to charge ratio excitation depends on bond polarizability changes whereas (m/z). The preferred molecular mass range of MS tags is IR absorption depends on bond dipole moment changes. from about 100 to 2,000 daltons, and preferably the T"- Weak IR absorption lines become strong Raman lines and containing moiety has a mass of at least about 250 daltons, 65 vice versa. Wavenumber is the characteristic unit for IR more preferably at least about 300 daltons, and still more Spectra. There are 3 spectral regions for IR tags which have preferably at least about 350 daltons. It is generally difficult separate applications: near IR at 12500 to 4000 cm, mid IR US 6,613,508 B1 19 20 at 4000 to 600 cm, far IR at 600 to 30 cm. For the uses maximal resolution. Fluorescence intensity per probe mol described herein where a compound is to Serve as a tag to ecule is proportional to the product of e and QY. The range identify an MOI, probe or primer, the mid spectral regions of these parameters among fluorophores of current practical would be preferred. For example, the carbonyl stretch (1850 importance is approximately 10,000 to 100,000 cm'M' to 1750 cm) would be measured for carboxylic acids, for e and 0.1 to 1.0 for QY. Compounds that can serve as carboxylic esters and amides, and alkyl and aryl carbonates, fluorescent tags are as follows: fluorescein, rhodamine, carbamates and ketones. N-H bending (1750 to 160 cm) lambda blue 470, lambda green, lambda red 664, lambda red would be used to identify amines, ammonium ions, and 665, acridine orange, and propidium iodide, which are amides. At 1400 to 1250 cm, R-OH bending is detected commercially available from Lambda Fluorescence Co. as well as the C-N stretch in amides. Aromatic Substitution (Pleasant Gap, Pa.). Fluorescent compounds Such as nile red, patterns are detected at 900 to 690 cm (C-H bending, Texas Red, lissamine TM, BODIPYTMs are available from N-H bending for ArNH). Saturated C-H, olefins, aro Molecular Probes (Eugene, Oreg.). matic rings, double and triple bonds, esters, acetals, ketals, e. Characteristics of Potentiometric Tags ammonium Salts, N-O compounds Such as OXimes, nitro, The principle of electrochemical detection (ECD) is based N-oxides, and nitrates, azo, hydrazones, quinones, carboxy 15 on Oxidation or reduction of compounds which at certain lic acids, amides, and lactams all possess Vibrational infrared applied Voltages, electrons are either donated or accepted correlation data (see Pretsch et al., Spectral Data for Struc thus producing a current which can be measured. When ture Determination of Organic Compounds, Springer certain compounds are Subjected to a potential difference, Verlag, New York, 1989). Preferred compounds would the molecules undergo a molecular rearrangement at the include an aromatic nitrile which exhibits a very Strong working electrodes Surface with the loss (oxidation) or gain nitrile stretching vibration at 2230 to 2210 cm. Other (reduction) of electrons, Such compounds are said to be useful types of compounds are aromatic alkynes which have electronic and undergo electrochemical reactions. EC detec a strong Stretching vibration that gives rise to a Sharp tors apply a Voltage at an electrode Surface over which the absorption band between 2140 and 2100 cm. A third HPLC eluent flows. Electroactive compounds eluting from compound type is the aromatic azides which exhibit an 25 the column either donate electrons (oxidize) or acquire intense absorption band in the 2160 to 2120 cm region. electrons (reduce) generating a current peak in real time. Thiocyanates are representative of compounds that have a Importantly the amount of current generated depends on strong absorption at 2275 to 2263 cm. both the concentration of the analyte and the Voltage applied, c. Characteristics of UV Tags with each compound having a Specific Voltage at which it A compilation of organic chromophore types and their begins to oxidize or reduce. The currently most popular respective UV-visible properties is given in Scott electrochemical detector is the amperometric detector in (Interpretation of the UV Spectra of Natural Products, which the potential is kept constant and the current produced Permagon Press, New York, 1962). A chromophore is an from the electrochemical reaction is then measured. This atom or group of atoms or electrons that are responsible for type of Spectrometry is currently called “potentiostatic the particular light absorption. Empirical rules exist for the 35 amperometry”. Commercial amperometers are available It to t maxima in conjugated Systems (see Pretsch et al., from ESA, Inc., Chelmford, Mass. Spectral Data for Structure Determination of Organic When the efficiency of detection is 100%, the specialized Compounds, p. B65 and B70, Springer-Verlag, New York, detectors are termed “coulometric'. Coulometric detectors 1989). Preferred compounds (with conjugated systems) are Sensitive which have a number of practical advantages would possess n to at and at to It transitions. Such 40 with regard to Selectivity and Sensitivity which make these compounds are exemplified by Acid Violet 7, Acridine types of detectors useful in an array. In coulometric Orange, Acridine Yellow G, Brilliant Blue G, Congo Red, detectors, for a given concentration of analyte, the Signal Crystal Violet, Malachite Green oxalate, Metanil Yellow, current is plotted as a function of the applied potential Methylene Blue, Methyl Orange, Methyl Violet B, Naphtol (voltage) to the working electrode. The resultant sigmoidal Green B, Oil Blue N, Oil Red O, 4-phenylazophenol, 45 graph is called the current-Voltage curve or hydrodynamic Safranie O, Solvent Green 3, and Sudan Orange G, all of voltammagram (HDV). The HDV allows the best choice of which are commercially available (Aldrich, Milwaukee, applied potential to the working electrode that permits one Wis.). Other Suitable compounds are listed in, e.g., Jane, I., to maximize the observed signal. A major advantage of ECD et al., J. Chrom. 323:191-225 (1985). is its inherent sensitivity with current levels of detection in d. Characteristic of a Fluorescent Tag 50 the Subfemtomole range. Fluorescent probes are identified and quantitated most Numerous chemicals and compounds are electrochemi directly by their absorption and fluorescence emission wave cally active including many biochemicals, pharmaceuticals lengths and intensities. Emission spectra (fluorescence and and pesticides. Chromatographically coeluting compounds phosphorescence) are much more Sensitive and permit more can be effectively resolved even if their half-wave potentials Specific measurements than absorption Spectra. Other pho 55 (the potential at half signal maximum) differ by only 30-60 tophysical characteristics Such as excited-State lifetime and mV. fluorescence anisotropy are less widely used. The most Recently developed coulometric Sensors provide generally useful intensity parameters are the molar extinc Selectivity, identification and resolution of co-eluting com tion coefficient (e) for absorption and the quantum yield pounds when used as detectors in liquid chromatography (QY) for fluorescence. The value of e is specified at a single 60 based Separations. Therefore, these arrayed detectors add wavelength (usually the absorption maximum of the probe), another Set of Separations accomplished in the detector itself. whereas QY is a measure of the total photon emission over Current instruments possess 16 channels which are in prin the entire fluorescence Spectral profile. A narrow optical ciple limited only by the rate at which data can be acquired. bandwidth (<20 nm) is usually used for fluorescence exci The number of compounds which can be resolved on the EC tation (via absorption), whereas the fluorescence detection 65 array is chromatographically limited (i.e., plate count bandwidth is much more variable, ranging from full Spec limited). However, if two or more compounds that chro trum for maximal sensitivity to narrow band (-20 nm) for matographically co-elute have a difference in half wave US 6,613,508 B1 21 22 potentials of 30–60 mV, the array is able to distinguish the Suitable compounds are Set forth in, e.g., Jane, I., et al. J. compounds. The ability of a compound to be electrochemi Chrom. 323:191-225 (1985) and Musch, G., et al., J. Chrom. cally active relies on the possession of an EC active group 348:97-110 (1985). These compounds can be incorporated (i.e., -OH, -O, -N, -S). into compounds of formula T-L-X by methods known in Compounds which have been Successfully detected using the art. For example, compounds having a carboxylic acid coulometric detectors include 5-hydroxytryptamine, group may be reacted with amine, hydroxyl, etc. to form 3-methoxy-4-hydroxyphenyl-glycol, homogentisic acid, amide, ester and other linkages between T and L. , , 3-hydroxykynure ninr, In addition to the above properties, and regardless of the a c e to mino phen, 3-hydroxy tryptop hol, intended detection method, it is preferred that the tag have 5-hydroxyindoleacetic acid, octaneSulfonic acid, phenol, a modular chemical Structure. This aids in the construction o-creSol, pyrogallol, 2-nitrophenol, 4-nitrophenol, 2,4- of large numbers of Structurally related tags using the dinitrophenol, 4,6-dinitrocreSol, 3-methyl-2-nitrophenol, techniques of combinatorial chemistry. For example, the T" 2,4-dichlorophenol, 2,6-dichlorophenol, 2,4,5- group desirably has Several properties. It desirably contains trichlorophenol, 4-chloro-3-methylphenol, 5-methylphenol, a functional group which Supports a single ionized charge 4-methyl-2-nitrophe no 1, 2-hydroxyani line, 15 State when the T"-containing moiety is Subjected to mass 4-hydroxyaniline, 1,2-phenylenediamine, benzocatechin, spectrometry (more simply referred to as a “mass spec buturon, chlortholuron, diuron, isoproturon, linuron, sensitivity enhancer” group, or MSSE). Also, it desirably methobromuron, metoxuron, monolinuron, monuron, can Serve as one member in a family of T"-containing methionine, , , 4-aminobenzoic acid, moieties, where members of the family each have a different 4-hydroxybenzoic acid, 4-hydroxycoumaric acid, mass/charge ratio, however have approximately the same 7-methoxycoumarin, apigenin baicalein, caffeic acid, Sensitivity in the mass spectrometer. Thus, the members of catechin, centaurein, chlorogenic acid, daidzein, datiscetin, the family desirably have the same MSSE. In order to allow dioSmetin, epicatechin gallate, epigallo catechin, epigallo the creation of families of compounds, it has been found catechin gallate, eugenol, eupatorin, ferulic acid, fisetin, convenient to generate tag reactants via a modular Synthesis galangin, gallic acid, gardenin, genistein, gentisic acid, 25 Scheme, So that the tag components themselves may be hesperidin, irigenin, kaemferol, leucoyanidin, luteolin, Viewed as comprising modules. mangostin, morin, my ricetin, naringin, narirutin, In a preferred modular approach to the structure of the T" pelargondin, peonidin, phloretin, pratensein, protocatechuic group, T" has the formula acid, rhamnetin, quercetin, Sakuranetin, Scutellarein, Scopoletin, Syringaldehyde, Syringic acid, tangeritin, troXerutin, umbelliferone, Vanillic acid, 1,3-dimethyl wherein T is an organic moiety formed from carbon and tetrahydroisoquinoline, 6-hydroxydopamine, r-Salsolinol, one or more of hydrogen, fluoride, iodide, Oxygen, nitrogen, N-methyl-r-SalSolinol, tetrahydro isoquinoline, Sulfur and phosphorus, having a mass range of 15 to 500 amitriptyline, apomorphine, capsaicin, chlordiazepoxide, daltons; T is an organic moiety formed from carbon and one chlorpromazine, daunorubicin, desipramine, doxepin, 35 or more of hydrogen, fluoride, iodide, Oxygen, nitrogen, fluoxetine, flurazepam, imipramine, isoprote renol, sulfur and phosphorus, having a mass range of 50 to 1000 methoxamine, morphine, morphine-3-glucuronide, daltons, J is a direct bond or a functional group Such as nortriptyline, oxazepam, phenylephrine, trimipramine, amide, ester, amine, Sulfide, ether, thioester, disulfide, a Scorbic acid, N-acetyl , 3,4- thioether, urea, thiourea, carbamate, thiocarbamate, Schiff dihydroxybenzylamine, 3,4-dihydroxy mandelic acid 40 base, reduced Schiff base, imine, Oxime, hydrazone, (DOMA), 3,4-dihydroxyphenylacetic acid (DOPAC), 3,4- phosphate, phosphonate, phosphoramide, phosphonamide, dihydroxyphenylalanine (L-DOPA), 3,4- Sulfonate, Sulfonamide or carbon-carbon bond; and n is an dihydroxyphenylglycol (DHPG), 3-hydroxyanthranilic acid, integer ranging from 1 to 50, Such that when n is greater than 2-hydroxyphenylacetic acid (2HPAC), 4-hydroxybenzoic 1, each T and J is independently selected. acid (4HBAC), 5-hydroxyindole-3-acetic acid (5HIAA), 45 The modular structure T-(J-T), provides a conve 3-hydroxykynurenine, 3-hydroxymandelic acid, 3-hydroxy nient entry to families of T-L-X compounds, where each 4-methoxyphenylethylamine, 4-hydroxyphenylacetic acid member of the family has a different T group. For instance, (4HPAC), 4-hydroxyphenyllactic acid (4HPLA), when T is T', and each family member desirably has the 5-hydroxytryptophan (5HTP), 5-hydroxytryptophol same MSSE, one of the T groups can provide that MSSE (5HTOL), 5-hydroxy trypt a mine (5HT), 50 structure. In order to provide variability between members 5-hydroxy tryptamine Sulfate, 3-methoxy-4- of a family in terms of the mass of T", the T group may hydroxyphenylglycol (MHPG), 5-methoxytryptamine, be varied among family members. For instance, one family 5-methoxy tryptophan, 5-methoxy tryptophol, member may have T=methyl, while another has T=ethyl, 3-methoxytyramine (3MT), 3-methoxytyrosine (3-OM and another has T=propyl, etc. DOPA), 5-methylcysteine, 3-methylguanine, bufotenin, 55 In order to provide "gross” or large jumps in mass, a T dopamine dopamine-3-glucuronide, dopamine-3-sulfate, group may be designed which adds significant (e.g., one or dopamine-4-Sulfate, epinephrine, epinine, folic acid, glu several hundreds) of mass units to T-L-X. Such a T tathione (reduced), guanine, guanosine, homogentisic acid group may be referred to as a molecular weight range (HGA), (HVA), homovanillyl alcohol adjuster group(“WRA”). A WRA is quite useful if one is (HVOL), homoveratic acid, hva sulfate, hypoxanthine, 60 working with a single set of T groups, which will have indole, indole-3-acetic acid, indole-3-lactic acid, masses extending over a limited range. A single set of T kynurenine, , metanephrine, N-methyltryptamine, groups may be used to create T" groups having a wide N-methyltyramine, N,N-dimethyltryptamine, N,N- range of mass simply by incorporating one or more WRAT dimethyltyramine, , , groups into the T". Thus, using a simple example, if a Set Octopamine, pyridoxal, pyridoxal phosphate, pyridoxamine, 65 of T groups affords a mass range of 250-340 daltons for the Synephrine, tryptophol, tryptamine, tyramine, uric acid, T", the addition of a single WRA, having, as an exemplary (Vma), Xanthine and Xanthosine. Other number 100 dalton, as a T group provides access to the US 6,613,508 B1 23 24 mass range of 350-440 daltons while using the same set of morpholino carbonyl-substituted alkyl, T groups. Similarly, the addition of two 100 dalton MWA thiomorpholinocarbonyl-substituted alkyl, N-(alkyl, alk groups (each as a T group) provides access to the mass enyl or alkynyl)- or N,N-dialkyl, dialkenyl, dialkynyl or range of 450–540 daltons, where this incremental addition (alkyl, alkenyl)-aminocarbonyl-substituted alkyl, of WRA groups can be continued to provide access to a very he tero cyclyl a mino carbonyl, large mass range for the T" group. Preferred compounds of he tero cylyl alkyl ene a mino carbonyl, the formula T-(J-T-), L-X have the formula he terocyclylamino carbonyl-Sub Stituted alkyl, Rw-(Rwa),—Rasse-L-X where VWC is a "T" heterocylylalkyleneaminocarbonyl-substituted alkyl, N,N- group, and each of the WRA and MSSE groups are “T” dialkylalkylene amino carbonyl, N,N-dialkyl groups. This structure is illustrated in FIG. 12, and repre alkyleneaminocarbonyl-Substituted alkyl, alkyl-Substituted Sents one modular approach to the preparation of T". he tero cyclyl carbonyl, alkyl-substituted heterocyclylcarbonyl-alkyl, carboxyl-Substituted alkyl, In the formula T'-(J-T ), , T and T are prefer dialkylamino-Substituted acylaminoalkyl and amino acid ably selected from hydrocarbyl, hydrocarbyl-O- 15 Side chains Selected from arginine, asparagine, glutamine, hydrocarbylene, hydrocarby 1-S-hydrocarbylene, S-methyl cysteine, methionine and corresponding Sulfoxide hydrocarbyl-NH-hydrocarbylene, hydrocarbyl-amide and Sulfone derivatives thereof, glycine, leucine, isoleucine, hydrocarbylene, N-(hydrocarbyl)hydrocarbylene, N,N-di allo-isoleucine, tert-leucine, norleucine, phenylalanine, (hydro car by l) hydro car by lene, tyrosine, tryptophan, proline, alanine, omithine, histidine, hydrocarbylacylhydrocarbylene, heterocyclylhydrocarbyl glutamine, Valine, threonine, Serine, aspartic acid, beta wherein the heteroatom(s) are selected from oxygen, cyanoalanine, and allo threonine; aly nyl and nitrogen, Sulfur and phosphorus, Substituted heterocyclyl heterocyclylcarbonyl, aminocarbonyl, amido, mono- or hydrocarbyl wherein the heteroatom(s) are selected from dialkylaminocarbonyl, mono- or diarylaminocarbonyl, oxygen, nitrogen, Sulfur and phosphorus and the Substituents 25 alkylarylaminocarbonyl, diarylaminocarbonyl, mono- or are selected from hydrocarbyl, hydrocarbyl-O- diacylaminocarbonyl, aromatic or aliphatic acyl, alkyl hydrocarbylene, hydrocarbyl-NH-hydrocarbylene, optionally Substituted by Substituents Selected from amino, hydrocarby 1-S-hydrocarbylene, N-(hydrocarby1) carboxy, hydroxy, mercapto, mono- or dialkylamino, mono hydrocarbylene, N,N-di(hydrocarbyl)hydrocarbylene and or diarylamino, alkylarylamino, diarylamino, mono- or hydrocarbylacyl-hydrocarbylene. In addition, T and/or T diacylamino, alkoxy, alkenoxy, aryloxy, thioalkoxy, may be a derivative of any of the previously listed potential thioalkenoxy, thioalkynoxy, thioaryloxy and heterocyclyl. T/T groups, such that one or more hydrogens are replaced fluorides. A preferred compound of the formula T-(J-T ), L-X has the structure: Also regarding the formula T-(J-T-), , a pre 35 ferred T has the formula -G(R)-, wherein G is C, alkylene chain having a single R substituent. Thus, if G is ethylene (-CH-CH-) either one of the two ethylene Amide carbons may have a R substituent, and R is selected from 40 O (CH2) C alkyl, alkenyl, alkynyl, cycloalkyl, aryl-fused cycloalkyl, cycloalkenyl, aryl, aralkyl, aryl-Substituted alkenyl or T2 l . L Yx alkynyl, cycloalkyl-substituted alkyl, cycloalkenyl Substituted cycloalkyl, biaryl, alkoxy, alkenoxy, alkynoxy, R1 O aralkoxy, aryl-Substituted alkenoxy or alkynoxy, 45 alkylamino, alkenylamino or alkynylamino, aryl-Substituted alkylamino, aryl-Substituted alkenylamino or alkynylamino, wherein G is (CH). Such that a hydrogen on one and only aryloxy, arylamino, N-alkylure a-Substituted alkyl, one of the CH groups represented by a Single “G” is N-ary lure a-Substituted alkyl, alkylcarbonylamino 50 replaced with-(CH-)-Amide-T'; T and T are organic Substituted alkyl, aminocarbonyl-Substituted alkyl, moieties of the formula CasNo.oOo.H.F. Such that the heterocyclyl, heterocyclyl-Substituted alkyl, heterocyclyl Sum of C. and B is Sufficient to Satisfy the otherwise unsat Substituted amino, carboxyalkyl Substituted aralkyl, isfied valencies of the C, N, and O atoms; amide is oxocarbocyclyl-fused aryl and heterocyclylalkyl, O O cycloalkenyl, aryl-Substituted alkyl and, aralkyl, hydroxy 55 | Substituted alkyl, alkoxy-Substituted alkyl, aralkoxy -N-C- or -C-N- Substituted alkyl, alkoxy-Substituted alkyl, aralkoxy Substituted alkyl, amino-Substituted alkyl, (aryl-substituted alkyloxycarbonylamino)-substituted alkyl, thiol-substituted 60 alkyl, alkylsulfonyl-substituted alkyl, (hydroxy-substituted R" is hydrogen or Clio alkyl; c is an integer ranging from alkylthio)-substituted alkyl, thioalkoxy-substituted alkyl, 0 to 4, and n is an integer ranging from 1 to 50 Such that hydrocarby la cylamino-Sub Stituted alkyl, when n is greater than 1, G, c, Amide, R' and T are heterocyclylacylamino-Substituted alkyl, hydrocarbyl independently Selected. Substituted-heterocyclylacylamino-Substituted alkyl, 65 alkylsulfonylamino-Substituted alkyl, arylsulfonylamino In a further preferred embodiment, a compound of the Substituted alkyl, morpholino-alkyl, thiomorpholino-alkyl, formula T-(J-T), L-X has the structure: US 6,613,508 B1 25 26 -continued T4 -N- (C-Co)-N id O O s". O T2 1, -G N I-X Where the above compounds have a T group, and the R1 O st “G” group has a free carboxyl group (or reactive equivalent Amide thereof), then the following are preferred -Amide-T group, which may conveniently be prepared by reacting the appro k priate organic amine with a free carboxyl group extending from a “G” group: wherein T is an organic moiety of the formula C.2s No. 15 oOo.o.H.F. Such that the sum of C. and f is sufficient to satisfy the otherwise unsatisfied valencies of the C, N, and O atoms; and T includes a tertiary or quaternary amine or an organic acid; m is an integer ranging from 0–49, and T, T', R', L and X have been previously defined. O Another preferred compound having the formula T (J-T ), L-X has the particular structure: T4 -CNH-(C-Co) 25 O id O s". O N G N X -- 11 -CNH-(C-Co) R1 O ide O

- CNH- (C-C10) -N O : 35 wherein T is an organic moiety of the formula CasNo. I oOooH.F. Such that the sum of C. and f is Sufficient to (C1-C10) satisfy the otherwise unsatisfied valencies of the C, N, and O atoms; and T includes a tertiary or quaternary amine or - CNH- co an organic acid; m is an integer ranging from 0–49, and T, 40 T', c, R", “Amide", L and X have been previously defined. | \ In the above structures that have a T group, -Amide-T (C1-C10) is preferably one of the following, which are conveniently made by reacting organic acids with free amino groups extending from “G”: 45 o O ti- cro- 1) | O N O 50 (C1-C10);

O

--O-o-c-en-scrolO 55 -N-G co- ) O 1 NN -NHC- (Co-Cio 2 O --O- (C1-C10); and O In three preferred embodiments of the invention, T-L- MOI has the structure: US 6,613,508 B1 27 28 Selected from the following, which are exemplary only and T4 do not constitute an exhaustive list of potential organic H acids: Formic acid, Acetic acid, Propiolic acid, Propionic Amide acid, Fluoroacetic acid, 2-Butynoic acid, Cyclopropanecar O NS boxylic acid, Butyric acid, Methoxyacetic acid, Difluoro O st (C-Co)-ODN-3'-OH acetic acid, 4-Pentynoic acid, Cyclobutanecarboxylic acid, G N 3,3-Dimethylacrylic acid, Valeric acid, N,N- Dimethylglycine, N-Formyl-Gly-OH, Ethoxyacetic acid, (Methylthio)acetic acid, Pyrrole-2-carboxylic acid, 3-Furoic R1 O NO2 acid, ISOxazole-5-carboxylic acid, trans-3-Hexenoic acid, Trifluoroacetic acid, Hexanoic acid, Ac-Gly-OH, 2-Hydroxy-2-methylbutyric acid, Benzoic acid, Nicotinic acid, 2- Pyrazine carboxylic acid, 1 - Methyl-2- pyrrollecarboxylic acid, 2-Cyclopentene-1-acetic acid, or the Structure: 15 Cyclopentylacetic acid, (S)-(–)-2-Pyrrollidone-5-carboxylic acid, N-Methyl-L-proline, Heptanoic acid, Ac-b-Ala-OH, T4 2-Ethyl-2-hydroxybutyric acid, 2-(2-Methoxyethoxy)acetic acid, p-Toluic acid, 6-Methylnicotinic acid, 5-Methyl-2- pyrazinecarboxylic acid, 2,5-Dimethylpyrrole-3-carboxylic id acid, 4-Fluorobenzoic acid, 3,5-Dimethylisoxazole-4- O s". | carboxylic acid, 3-Cyclopentylpropionic acid, Octanoic G N acid, N,N-DimethylSuccinamic acid, Phenylpropiolic acid, -(- Cinnamic acid, 4-Ethylbenzoic acid, p-Anisic acid, 1,2,5- H O NO Trimethylpyrrole-3-carboxylic acid, 3-Fluoro-4- N 2 25 methylbenzoic acid, Ac-DL-Propargylglycine, 1. H 3-(Trifluoromethyl)butyric acid, 1-Piperidinepropionic acid, M N-Acetylproline, 3,5-Difluorobenzoic acid, Ac-L-Val-OH, y M Indole-2-carboxylic acid, 2-Benzofurancarboxylic acid, O (C1-C10)-OND-3'-OH Benzotriazole-5-carboxylic acid, 4-n-Propylbenzoic acid, 3-Dimethylaminobenzoic acid, 4-Ethoxybenzoic acid, 4-(Methylthio)benzoic acid, N-(2-Furoyl)glycine, or the Structure: 2-(Methylthio)nicotinic acid, 3-Fluoro-4-methoxybenzoic acid, Tfa-Gly-OH, 2-Naptholic acid, Quinaldic acid, Ac-L- O NO Ile -OH, 3-Methylind ene-2-carboxylic acid, 35 2-Quinoxalinecarboxylic acid, 1-Methylindole-2-carboxylic (Amide ls N 21 acid, 2,3,6-Trifluorobenzoic acid, N-Formyl-L-Met-OH, T21 y k N H 2-2-(2-Methoxyethoxy)ethoxyacetic acid, 4-n- Butylbenzoic acid, N-Benzoylglycine, 5-Fluoroindole-2- (h). , NAV Amide O carboxylic acid, 4-n-Propoxybenzoic acid, 4-Acetyl-3,5- 40 dimethyl-2-pyrrollecarboxylic acid, 3,5-Dimethoxybenzoic acid, 2,6-Dimethoxynicotinic acid, Cyclohexanepentanoic acid, 2-Naphthylacetic acid, 4-(1H-Pyrrol-1-yl)benzoic wherein T and T are organic moieties of the formula acid, Indole-3-propionic acid, m-Trifluoromethylbenzoic C-2s No.oOoo Sos PosHFI's Such that the Sum of C, B and acid, 5-Methoxyindole-2-carboxylic acid, 4-Pentylbenzoic 8 is sufficient to satisfy the otherwise unsatisfied valencies of 45 acid, Bz-b-Ala-OH, 4-Diethylaminobenzoic acid, 4-n- the C, N, O, S and Patoms; G is (CH), wherein one and Butoxybenzoic acid, 3-Methyl-5-CF3-isoxazole-4- only one hydrogen on the CH2 groups represented by each carboxylic acid, (3,4-Dimethoxyphenyl)acetic acid, G is replaced with -(CH2).-Amide-T"; Amide is 4-Biphenylcarboxylic acid, Pivaloyl-Pro-OH, Octanoyl Gly-OH, (2-Naphthoxy)acetic acid, Indole-3-butyric acid, O O 50 4-(Trifluoromethyl)phenylacetic acid, 5-Methoxyindole-3- acetic acid, 4-(Trifluoromethoxy)benzoic acid, Ac-L-Phe -N-C- O OH, 4-Pentyloxybenzoic acid, Z-Gly-OH, 4-Carboxy-N- R1 (fur-2-ylmethyl)pyrrolidin-2-one, 3,4-Diethoxybenzoic acid, 2,4-Dimethyl-5-COEt-pyrrole-3-carboxylic acid, R" is hydrogen or Clio alkyl; c is an integer ranging from 55 N-(2-Fluorophenyl)succinamic acid, 3,4,5- 0 to 4, "C-Co” represents a hydrocarbylene group having Trimethoxybenzoic acid, N-Phenylanthranilic acid, from 2 to 10 carbon atoms, “ODN-3'-OH' represents a 3 - Phenoxybenzoic acid, No nanoyl-Gly-OH, nucleic acid fragment having a terminal 3' hydroxyl group 2-Phenoxypyridine-3-carboxylic acid, 2,5-Dimethyl-1- (i.e., a nucleic acid fragment joined to (C-Co) at other than phenylpyrrole-3-carboxylic acid, trans-4-(Trifluoromethyl) the 3' end of the nucleic acid fragment); and n is an integer 60 cinnamic acid, (5-Methyl-2-phenyloxazol-4-yl)acetic acid, ranging from 1 to 50 Such that when n is greater than 1, then 4-(2-Cyclohexenyloxy)benzoic acid, 5-Methoxy-2- G, c, Amide, R and T are independently selected. Prefer methylindole-3-acetic acid, trans-4-Cotininecarboxylic acid, ably there are not three heteroatoms bonded to a single BZ-5-Aminovaleric acid, 4-Hexyloxybenzoic acid, N-(3- carbon atom. Methoxyphenyl)succinamic acid, Z-Sar-OH, 4-(3,4- In structures as set forth above that contain a T-C 65 Dimethoxyphenyl)butyric acid, Ac-o-Fluoro-DL-Phe-OH, (=O)-N(R)- group, this group may be formed by react N-(4-Fluorophenyl)glutaramic acid, 4'-Ethyl-4- ing an amine of the formula HN(R)- with an organic acid biphenyl carboxylic acid, 1,2,3,4- US 6,613,508 B1 29 30 Tetrahydroacridinecarboxylic acid, 3-Phenoxyphenylacetic phosphinamides, etc., and poly(Sulfur derivatives), e.g., acid, N-(2,4-Difluorophenyl)Succinamic acid, N-Decanoyl Sulfones, Sulfonates, Sulfites, Sulfonamides, Sulfenamides, Gly-OH, (+)-6-Methoxy-a-methyl-2-naphthaleneacetic etc. acid, 3-(Trifluoromethoxy)cinnamic acid, N-Formyl-DL One common type of oligomeric combinatorial library is Trp-OH, (R)-(+)-a-Methoxy-a- (trifluoromethyl) the peptide combinatorial library. Recent innovations in phenylacetic acid, Bz-DL-Leu-OH, 4-(Trifluoromethoxy) peptide chemistry and molecular biology have enabled phenoxyacetic acid, 4-Heptyloxybenzoic acid, 2,3,4- libraries consisting of tens to hundreds of millions of dif Trimethoxycinnamic acid, 2,6-Dimethoxybenzoyl-Gly-OH, ferent peptide Sequences to be prepared and used. Such 3-(3,4,5-Trimethoxyphenyl)propionic acid, 2,3,4,5,6- libraries can be divided into three broad categories. One Pentafluorophenoxyacetic acid, N-(2,4-Difluorophenyl) category of libraries involves the chemical Synthesis of glutaramic acid, N-Unde canoyl-Gly-OH, 2-(4- Soluble non-Support-bound peptide libraries (e.g., Houghten Fluorobenzoyl)benzoic acid, 5-Trifluoromethoxyindole-2- et al., Nature 354:84, 1991). A second category involves the carboxylic acid, N-(2,4-Difluorophenyl)diglycolamic acid, chemical Synthesis of Support-bound peptide libraries, pre Ac-L-Trp-OH, Tfa-L-Phenylglycine-OH, 3-Iodobenzoic Sented on Solid Supports Such as plastic pins, resin beads, or acid, 3-(4-n-Pentylbenzoyl)propionic acid, 2-Phenyl-4- 15 cotton (Geysen et al., Mol. Immunol. 23:709, 1986; Lam et quinolinecarboxylic acid, 4-Octyloxybenzoic acid, BZ-L- al., Nature 354:82, 1991; Eichler and Houghten, Biochem Met-OH, 3,4,5-Triethoxybenzoic acid, N-Lauroyl-Gly-OH, istry 32:11035, 1993). In these first two categories, the 3.5-Bis(trifluoromethyl)benzoic acid, Ac-5-Methyl-DL-Trp building blocks are typically L-amino acids, D-amino acids, OH, 2-Iodophenylacetic acid, 3-Iodo-4-methylbenzoic acid, unnatural amino acids, or Some mixture or combination 3-(4-n-Hexylbenzoyl)propionic acid, N-Hexanoyl-L-Phe thereof. A third category uses molecular biology approaches OH, 4-Nonyloxybenzoic acid, 4'-(Trifluoromethyl)-2- to prepare peptides or proteins on the Surface of filamentous biphenylcarboxylic acid, Bz-L-Phe-OH, N-Tridecanoyl phage particles or plasmids (Scott and Craig, Curr: Opinion Gly-OH, 3.5-Bis(trifluoromethyl)phenylacetic acid, 3-(4-n- Biotech. 5:40, 1994). Soluble, nonsupport-bound peptide Heptylbenzoyl)propionic acid, N-Hepytanoyl-L-Phe-OH, libraries appear to be Suitable for a number of applications, 4-Decyloxybenzoic acid, N-(C.O.C.-trifluoro-m-tolyl) 25 including use as tags. The available repertoire of chemical anthranilic acid, Niflu mic acid, 4-(2- diversities in peptide libraries can be expanded by StepS. Such Hydroxyhexafluoroisopropyl)benzoic acid, N-Myristoyl as permethylation (Ostresh et al., Proc. Natl. Acad. Sci., USA Gly-OH, 3-(4-n-Octylbenzoyl)propionic acid, N-Octanoyl 91:11138, 1994). L-Phe-OH, 4-Undecyloxybenzoic acid, 3-(3,4,5- Numerous variants of peptide combinatorial libraries are Trimethoxyphenyl)propionyl-Gly-OH, 8-Iodonaphthoic possible in which the peptide backbone is modified, and/or acid, N-Pentadecanoyl-Gly-OH, 4-Dodecyloxybenzoic the amide bonds have been replaced by mimetic groups. acid, N-Palmitoyl-Gly-OH, and N-Stearoyl-Gly-OH. These Amide mimetic groups which may be used include ureas, organic acids are available from one or more of Advanced urethanes, and carbonylmethylene groups. Restructuring the ChemTech, Louisville, Ky., Bachem Bioscience Inc., backbone Such that Sidechains emanate from the amide Torrance, Calif.; Calbiochem-Novabiochem Corp., San 35 nitrogens of each amino acid, rather than the alpha-carbons, Diego, Calif.; Farchan Laboratories Inc., Gainesville Fla.; gives libraries of compounds known as peptoids (Simon et Lancaster Synthesis, Windham N.H.; and May Bridge al., Proc. Natl. Acad Sci., USA 89:9367, 1992). Chemical Company (c/o Ryan Scientific), Columbia, S.C. Another common type of oligomeric combinatorial The catalogs from these companies use the abreviations library is the oligonucleotide combinatorial library, where which are used above to identify the acids. 40 the building blocks are Some form of naturally occurring or f. Combinatorial Chemistry as a Means for Preparing Tags unnatural nucleotide or polysaccharide derivatives, includ Combinatorial chemistry is a type of Synthetic Strategy ing where various organic and inorganic groups may Sub which leads to the production of large chemical libraries Stitute for the phosphate linkage, and nitrogen or Sulfur may (see, for example, PCT Application Publication No. WO Substitute for oxygen in an ether linkage (Schneider et al., 94/08051). These combinatorial libraries can be used as tags 45 Biochem. 34: 9599, 1995; Freier et al., J. Med. Chem. for the identification of molecules of interest (MOIs). Com 38:344, 1995; Frank, J. Biotechnology 41:259, 1995; binatorial chemistry may be defined as the Systematic and Schneider et al., Published PCT WO 942052; Ecker et al., repetitive, covalent connection of a set of different “building Nucleic Acids Res. 21:1853, 1993). blocks” of varying structures to each other to yield a large More recently, the combinatorial production of collec array of diverse molecular entities. Building blockS can take 50 tions of non-oligomeric, Small molecule compounds has many forms, both naturally occurring and Synthetic, Such as been described (DeWitt et al., Proc. Natl. Acad. Sci., USA nucleophiles, electrophiles, dienes, alkylating or acylating 90:690, 1993; Bunin et al., Proc. Natl. Acad. Sci., USA agents, diamines, nucleotides, amino acids, Sugars, lipids, 91:4708, 1994). Structures suitable for elaboration into organic monomers, Synthons, and combinations of the Small-molecule libraries encompass a wide variety of above. Chemical reactions used to connect the building 55 organic molecules, for example heterocyclics, aromatics, blockS may involve alkylation, acylation, oxidation, alicyclics, aliphatics, Steroids, antibiotics, enzyme reduction, hydrolysis, Substitution, elimination, addition, inhibitors, ligands, hormones, drugs, alkaloids, opioids, cyclization, condensation, and the like. This process can terpenes, porphyrins, toxins, catalysts, as well as combina produce libraries of compounds which are oligomeric, non tions thereof. oligomeric, or combinations thereof. If oligomeric, the com 60 g. Specific Methods for Combinatorial Synthesis of Tags pounds can be branched, unbranched, or cyclic. Examples of Two methods for the preparation and use of a diverse Set oligomeric structures which can be prepared by combinato of amine-containing MS tags are outlined below. In both rial methods include oligopeptides, oligonucleotides, methods, Solid phase Synthesis is employed to enable Simul oligosaccharides, poly lipids, polyesters, polyamides, taneous parallel Synthesis of a large number of tagged polyurethanes, polyureas, polyethers, poly(phosphorus 65 linkers, using the techniques of combinatorial chemistry. In derivatives), e.g., phosphates, phosphonate S, the first method, the eventual cleavage of the tag from the phosphoramide S, phosphonamides, phosphite S, oligonucleotide results in liberation of a carboxyl amide. In US 6,613,508 B1 31 32 the Second method, cleavage of the tag produces a carboxy lic acid. The chemical components and linking elements -continued used in these methods are abbreviated as follows: X1 . . . Xn-CONH-A-COO-1MeO-CO2H R=resin couple to n oligos (oligo1 . . . oligo(n)) (e.g., via Pfp esters) FMOC=fluorenylmethoxycarbonyl protecting group X1 ... Xn-CONH-A-COO-1MeO-CONH-oligo1 ... oligo(n) All=allyl protecting group pool tagged oligos COH=carboxylic acid group perform sequencing reaction CONH=carboxylic amide group separate different length fragments from sequencing reaction (e.g., via HPLC orCE) NH=amino group cleave tags from linkers with 25-100% TFA OH=hydroxyl group X1 . . . Xn-CONH-A-COH CONH=amide linkage COO=ester linkage analyze by mass spectrometry NH-Rink-COH=4-(C.-amino)-2,4-dimethoxybenzyl phenoxybutyric acid (Rink linker) 15 2. Linkers OH-1MeO-COH=(4-hydroxymethyl)phenoxybutyric A "linker” component (or L), as used herein, means either a direct covalent bond or an organic chemical group which acid is used to connect a “tag” (or T) to a “molecule of interest” OH-2MeO-COH=(4-hydroxymethyl-3-methoxy) (or MOI) through covalent chemical bonds. In addition, the phenoxyacetic acid direct bond itself, or one or more bonds within the linker NH-A-COOH=amino acid with aliphatic or aromatic component is cleavable under conditions which allowST to amine functionality in Side chain be released (in other words, cleaved) from the remainder of X1 . . . Xn-COOH=set of n diverse carboxylic acids with the T-L-X compound (including the MOI component). unique molecular weights The tag variable component which is present within T oligo1 . . . oligo(n)=set of n oligonucleotides should be stable to the cleavage conditions. Preferably, the HBTU=O-benzotriazol-1-yl-N,N,N',N'-tetramethyluronium 25 cleavage can be accomplished rapidly; within a few minutes hexafluorophosphate and preferably within about 15 seconds or less. In general, a linker is used to connect each of a large Set The sequence of steps in Method 1 is as follows: of tags to each of a similarly large Set of MOIS. Typically, a Single tag-linker combination is attached to each MOI (to give various T-L-MOI), but in Some cases, more than one OH-2MeO-CONH-R tag-linker combination may be attached to each individual FMOC-NH-Rink-COH: couple (e.g., HBTU) MOI (to give various (T-L)n-MOI). In another embodi FMOC-NH-Rink-COO-2MeO-CONH-R ment of the present invention, two or more tags are bonded w piperidine (remove FMOC) to a single linker through multiple, independent Sites on the NH-Rink-COO-2MeO-CONH-R FMOC-NH-A-COOH; couple (e.g., HBTU) 35 linker, and this multiple tag-linker combination is then FMOC-NH-A-CONH-Rink-COO-2MeO-CONH-R bonded to an individual MOI (to give various (T)n-L- w piperidine (remove FMOC) MOI). NH-A-CONH Rink-COO-2MeO-CONH-R After various manipulations of the Set of tagged MOIs, divide into n aliquots Special chemical and/or physical conditions are used to couple to n different acids X1 ... Xn-COOH X1. . . Xn CONH-A-CONH-Rink-COO-2MeO-CONH-R 40 cleave one or more covalent bonds in the linker, resulting in Cleave tagged linkers from resin with 1% TFA the liberation of the tags from the MOIs. The cleavable X1.. . Xn-CONH-A-CONH-Rink-COH bond(s) may or may not be Some of the same bonds that were couple to n oligos (oligo1 . . . oligo(n)) formed when the tag, linker, and MOI were connected (e.g., via Pfp esters) together. The design of the linker will, in large part, deter X1.. . Xn-CONH-A-CONH-Rink-CONH-oligo1... oligo(n) mine the conditions under which cleavage may be accom pool tagged oligos 45 perform sequencing reaction plished. Accordingly, linkers may be identified by the cleav separate different length fragments from age conditions they are particularly Susceptible too. When a sequencing reaction (e.g., via HPLC or CE) linker is photolabile (i.e., prone to cleavage by exposure to cleave tags from linkers with 25%–100% TFA actinic radiation), the linker may be given the designation X1.. . Xin-CONH-A-CONH L. Likewise, the designations L,L,L.P, LP, L', w analyze by mass spectrometry 50 L', L^ and L’ may be used to refer to linkers that are particularly Susceptible to cleavage by acid, base, chemical oxidation, chemical reduction, the catalytic activity of an The sequence of steps in Method 2 is as follows: enzyme (more simply “enzyme”), electrochemical oxidation or reduction, elevated temperature (“thermal”) and thiol 55 eXchange, respectively. OH-1MeO-CO-All Certain types of linker are labile to a single type of FMOC-NH-A-COH: couple (e.g., HBTU) cleavage condition, whereas others are labile to Several types FMOC-NH-A-COO-1MeO-CO-All of cleavage conditions. In addition, in linkers which are Palladium (remove Allyl) capable of bonding multiple tags (to give (T)n-L-MOI type FMOC-NH-A-COO-1MeO-COH OH-2MeO-CONH-R; couple (e.g., HBTU) 60 Structures), each of the tag-bonding sites may be labile to FMOC-NH-A-COO-1MeO-COO-2MeO-CONH-R different cleavage conditions. For example, in a linker piperidine (remove FMOC) having two tags bonded to it, one of the tags may be labile NH-A-COO-1MeO-COO-2MeO-CONH-R only to base, and the other labile only to photolysis. divide into n aliquots couple to n different acids X1 . . . Xn-CO2H A linker which is useful in the present invention possesses X1. . . Xin-CONH-A-COO-1MeO-COO-2MeO-CONH-R 65 several attributes: cleave tagged linkers from resin with 1% TFA 1) The linker possesses a chemical handle (L.) through which it can be attached to an MOI. US 6,613,508 B1 33 34 2) The linker possesses a second, separate chemical (CH-)-, etc.) or hydrocarbylene-(O-hydrocarbylene)- handle (L) through which the tag is attached to the linker. wherein w is an integer ranging from 1 to about 10 (e.g., If multiple tags are attached to a single linker ((T)n-L-MOI CH-O-Ar-, -CH-(O-CHCH) , etc.). type structures), then a separate handle exists for each tag. With the advent of Solid phase synthesis, a great body of 3) The linker is stable toward all manipulations to which literature has developed regarding linkers that are labile to it is Subjected, with the exception of the conditions which Specific reaction conditions. In typical Solid phase Synthesis, allow cleavage Such that a T-containing moiety is released a Solid Support is bonded through a labile linker to a reactive from the remainder of the compound, including the MOI. Site, and a molecule to be Synthesized is generated at the Thus, the linker is stable during attachment of the tag to the reactive site. When the molecule has been completely linker, attachment of the linker to the MOI, and any manipu Synthesized, the Solid Support-linker-molecule construct is lations of the MOI while the tag and linker (T-L) are Subjected to cleavage conditions which releases the mol attached to it. ecule from the solid support. The labile linkers which have 4) The linker does not significantly interfere with the been developed for use in this context (or which may be used manipulations performed on the MOI while the T-L is in this context) may also be readily used as the linker attached to it. For instance, if the T-L is attached to an 15 reactant in the present invention. oligonucleotide, the T-L must not significantly interfere Lloyd-Williams, P., et al., “Convergent Solid-Phase Pep with any hybridization or enzymatic reactions (e.g., PCR) tide Synthesis”, Tetrahedron Report No. 347, 49(48) performed on the oligonucleotide. Similarly, if the T-L is :11065-11133 (1993) provides an extensive discussion of attached to an antibody, it must not significantly interfere linkers which are labile to actinic radiation (i.e., photolysis), with antigen recognition by the antibody. as well as acid, base and other cleavage conditions. Addi 5) Cleavage of the tag from the remainder of the com tional Sources of information about labile linkers may be pound occurs in a highly controlled manner, using physical readily obtained. or chemical processes that do not adversely affect the As described above, different linker designs will confer detectability of the tag. cleavability (“lability”) under different specific physical or For any given linker, it is preferred that the linker be 25 chemical conditions. Examples of conditions which Serve to attachable to a wide variety of MOIs, and that a wide variety cleave various designs of linker include acid, base, of tags be attachable to the linker. Such flexibility is advan oxidation, reduction, fluoride, thiol eXchange, photolysis, tageous because it allows a library of T-L conjugates, once and enzymatic conditions. prepared, to be used with several different sets of MOIs. Examples of cleavable linkers that Satisfy the general AS explained above, a preferred linker has the formula criteria for linkers listed above will be well known to those in the art and include those found in the catalog available L-L-L-L-L from Pierce (Rockford, Ill). Examples include: wherein each L is a reactive handle that can be used to link ethylene glycobis(succinimidylsuccinate) (EGS), an the linker to a tag reactant and a molecule of interest amine reactive croSS-linking reagent which is cleavable reactant. Li is an essential part of the linker, because Lif 35 by hydroxylamine (1 M at 37° C. for 3–6 hours); imparts lability to the linker. L' and L are optional groups disuccinimidyl tartarate (DST) and sulfo-DST, which are which effectively serve to separate Li from the handles L. amine reactive cross-linking reagents, cleavable by L' (which, by definition, is nearer to T than is L), serves 0.015 M Sodium periodate; to separate T from the required labile moiety Lif. This bis(2-(succinimidyloxycarbonyloxy)ethylsulfone Separation may be useful when the cleavage reaction gen 40 (BSOCOES) and sulfo-BSOCOES, which are amine erates particularly reactive species (e.g., free radicals) which reactive cross-linking reagents, cleavable by base (pH may cause random changes in the Structure of the T-containing moiety. AS the cleavage Site is further separated 11.6); from the T-containing moiety, there is a reduced likelihood 1,4-di-3'-(2-pyridyldithio(propionamido))butane that reactive Species formed at the cleavage site will disrupt 45 (DPDPB), a pyridyldithiol crosslinker which is cleav the Structure of the T-containing moiety. Also, as the atoms able by thiol eXchange or reduction; in L1 will typically be present in the T-containing moiety, N-4-(p-azidosalicylamido)-butyl-3'-(2-pyridydithio) these L' atoms may impart a desirable quality to the propionamide (APDP), a pyridyldithiol crosslinker T-containing moiety. For example, where the T-containing which is cleavable by thiol eXchange or reduction; moiety is a T"-containing moiety, and a hindered amine is 50 bis-beta-4-(azidosalicylamido)ethyl-disulfide, a photo desirably present as part of the structure of the T"- reactive crosslinker which is cleavable by thiol containing moiety (to serve, e.g., as a MSSE), the hindered eXchange or reduction; amine may be present in L' labile moiety. N-Succinimidyl-(4-azidophenyl)-1,3'dithiopropionate In other instances, L' and/or Li may be present in a linker (SADP), a photoreactive crosslinker which is cleavable component merely because the commercial Supplier of a 55 by thiol eXchange or reduction; linker chooses to sell the linker in a form having Such a L' Sulfo Succinimidyl-2-(7-azido-4-methylcoumarin-3- and/or Li group. In Such an instance, there is no harm in acetamide)ethyl-1,3'-dithiopropionate (SAED), a pho using linkers having L" and/or Li groups, (So long as these toreactive crosslinker which is cleavable by thiol group do not inhibit the cleavage reaction) even though they eXchange or reduction; may not contribute any particular performance advantage to 60 SulfoSuccinimidyl-2-(m-azido-o-nitrobenzamido)-ethyl the compounds that incorporate them. Thus, the present 1,3'dithiopropionate (SAND), a photore active invention allows for L' and/or Ligroups to be present in the crosslinker which is cleavable by thiol eXchange or linker component. reduction. L' and/or Li groups may be a direct bond (in which case Other examples of cleavable linkers and the cleavage the group is effectively not present), a hydrocarbylene group 65 conditions that can be used to release tags are as follows. A (e.g., alkylene, arylene, cycloalkylene, etc.), silyl linking group can be cleaved by fluoride or under acidic -O-hydrocarbylene (e.g., -O-CH2-, O-CHCH conditions. A 3-, 4-, 5-, or 6-Substituted-2-nitrobenzyloxy or US 6,613,508 B1 35 36 2-, 3-, 5-, or 6-Substituted-4-nitrobenzyloxy linking group pionic acid. Examples of enzymatic cleavage include can be cleaved by a photon Source (photolysis). A 3-, 4-, 5-, esterases which will cleave ester bonds, nucleaseS which or 6-Substituted-2-alkoxyphenoxy or 2-, 3-, 5-, or will cleave phosphodiester bonds, proteaseS which cleave 6-Substituted-4-alkoxyphenoxy linking group can be peptide bonds, etc. cleaved by Ce(NH) (NO) (oxidation). A NCO A preferred linker component has an ortho-nitrobenzyl (urethane) linker can be cleaved by hydroxide (base), acid, structure as shown below: or LiAlH (reduction). A 3-pentenyl, 2-butenyl, or 1-butenyl linking group can be cleaved by O, O,O/IO, or KMnO, d (oxidation). A 2-3-, 4-, or 5-Substituted-furyloxy linking group can be cleaved by O, Br, MeOH, or acid. Conditions for the cleavage of other labile linking groups include: t-alkyloxy linking groups can be cleaved by acid; NO methyl(dialkyl)methoxy or 4-substitued-2-alkyl-1,3- -N dioxlane-2-yl linking groups can be cleaved by HO"; 3. 2-silylethoxy linking groups can be cleaved by fluoride or 15 acid; 2-(X)-ethoxy (where X=keto, ester amide, cyano, NO, Sulfide, Sulfoxide, Sulfone) linking groups can be cleaved wherein one carbon atom at positions a, b, c, d or e is under alkaline conditions, 2-, 3-, 4-, 5-, or 6-Substituted substituted with -Li-X, and L' (which is preferably a benzyloxy linking groups can be cleaved by acid or under direct bond) is present to the left of N(R') in the above reductive conditions, 2-butenyloxy linking groups can be Structure. Such a linker component is Susceptible to Selective cleaved by (Ph-P)RhCl(H), 3-, 4-, 5-, or 6-substituted-2- photo-induced cleavage of the bond between the carbon bromophenoxy linking groups can be cleaved by Li, Mg, or labeled “a” and N(R'). The identity of R is not typically BuLi; methylthiomethoxy linking groups can be cleaved by critical to the cleavage reaction, however R' is preferably Hg"; 2-(X)-ethyloxy (where X=a halogen) linking groups Selected from hydrogen and hydrocarbyl. The present inven can be cleaved by Zn or Mg, 2-hydroxyethyloxy linking 25 tion provides that in the above structure, -N(R)- could groups can be cleaved by oxidation (e.g., with Pb(OAc)). be replaced with -O-. Also in the above structure, one or Preferred linkers are those that are cleaved by acid or more of positions b, c, d or e may optionally be Substituted photolysis. Several of the acid-labile linkers that have been with alkyl, alkoxy, fluoride, chloride, hydroxyl, carboxylate developed for Solid phase peptide Synthesis are useful for or amide, where these Substituents are independently linking tags to MOIs. Some of these linkers are described in Selected at each occurrence. a recent review by Lloyd-Williams et al. (Tetrahedron A further preferred linker component with a chemical 49:11065-11133, 1993). One useful type of linker is based handle L, has the following structure: upon p-alkoxybenzyl alcohols, of which two, 4-hydroxy methylphenoxyacetic acid and 4-(4- d hydroxymethyl-3-methoxyphenoxy)butyric acid, are com 35 mercially available from Advanced ChemTech (Louisville, Ky.). Both linkers can be attached to a tag via an ester linkage to the benzylalcohol, and to an amine-containing NO2 MOI via an amide linkage to the carboxylic acid. Tags linked -N by these molecules are released from the MOI with varying 40 C-R2 concentrations of trifluoroacetic acid. The cleavage of these R1 linkers results in the liberation of a carboxylic acid on the O tag. Acid cleavage of tags attached through related linkers, Such as 2,4-dimethoxy-4'-(carboxymethyloxy)- wherein one or more of positions b, c, d or e is Substituted benzhydrylamine (available from Advanced ChemTech in 45 with hydrogen, alkyl, alkoxy, fluoride, chloride, hydroxyl, FMOC-protected form), results in liberation of a carboxylic carboxylate or amide, R is hydrogen or hydrocarbyl, and R' amide on the released tag. is -OH or a group that either protects or activates a The photolabile linkers useful for this application have carboxylic acid for coupling with another moiety. Fluoro also been for the most part developed for Solid phase peptide carbon and hydrofluorocarbon groups are preferred groups synthesis (see Lloyd-Williams review). These linkers are 50 that activate a carboxylic acid toward coupling with another usually based on 2-nitrobe in Zyle Sters or moiety. 2-nitrobenzylamides. Two examples of photolabile linkers 3. Molecule of Interest (MOI) that have recently been reported in the literature are 4-(4- Examples of MOIs include nucleic acids or nucleic acid (1-Fmoc-amino)ethyl)-2-methoxy-5-nitrophenoxy)butanoic analogues (e.g., PNA), fragments of nucleic acids (i.e., acid (Holmes and Jones, J. Org Chem. 60:2318-2319, 1995) 55 nucleic acid fragments), Synthetic nucleic acids or and 3-(Fmoc-amino)-3-(2-nitrophenyl)propionic acid fragments, oligonucleotides (e.g., DNA or RNA), proteins, (Brown et al., Molecular Diversity 1:4–12, 1995). Both peptides, antibodies or antibody fragments, receptors, recep linkers can be attached via the carboxylic acid to an amine tor ligands, members of a ligand pair, cytokines, hormones, on the MOI. The attachment of the tag to the linker is made oligosaccharides, Synthetic organic molecules, drugs, and by forming an amide between a carboxylic acid on the tag 60 combinations thereof. and the amine on the linker. Cleavage of photolabile linkers Preferred MOIs include nucleic acid fragments. Preferred is usually performed with UV light of 350 nm wavelength at nucleic acid fragments are primer Sequences that are intensities and times known to those in the art. Cleavage of complementary to Sequences present in vectors, where the the linkers results in liberation of a primary amide on the tag. vectors are used for base Sequencing. Preferably a nucleic Examples of photocleavable linkers include nitrophenyl 65 acid fragment is attached directly or indirectly to a tag at glycine esters, eXO- and endo-2-benzonorborneyl chlorides other than the 3' end of the fragment; and most preferably at and methane Sulfonates, and 3-amino-3(2-nitrophenyl) pro the 5' end of the fragment. Nucleic acid fragments may be US 6,613,508 B1 37 38 purchased or prepared based upon genetic databases (e.g., Of particular interest for linking tags (or tags with linkers) Dib et al., Nature 380:152–154, 1996 and CEPH Genotype to oligonucleotides is the formation of amide bonds. Primary Database, http://www.cephb.fr) and commercial vendors aliphatic amine handles can be readily introduced onto (e.g., Promega, Madison, Wis.). Synthetic oligonucleotides with phosphoramiditeS Such as AS used herein, MOI includes derivatives of an MOI that 6-monomethoxytritylhexylcyanoethyl-N,N-diisopropyl contain functionality useful in joining the MOI to a T-L- phosphoramidite (available from Glenn Research, Sterling, L., compound. For example, a nucleic acid fragment that has Va.). The amines found on natural nucleotides Such as a phosphodiester at the 5' end, where the phosphodiester is adenosine and guanosine are virtually unreactive when also bonded to an alkyleneamine, is an MOI. Such an MOI compared to the introduced primary amine. This difference is described in, e.g., U.S. Pat. No. 4,762,779 which is in reactivity forms the basis of the ability to selectively form incorporated herein by reference. A nucleic acid fragment amides and related bonding groups (e.g., ureas, thioureas, with an internal modification is also an MOI. An exemplary Sulfonamides) with the introduced primary amine, and not internal modification of a nucleic acid fragment is where the the nucleotide amines. base (e.g., adenine, guanine, cytosine, thymidine, uracil) has AS listed in the Molecular Probes catalog (Eugene, been modified to add a reactive functional group. Such 15 Oreg.), a partial enumeration of amine-reactive functional internally modified nucleic acid fragments are commercially groups includes activated carboxylic esters, isocyanates, available from, e.g., Glen Research, Herndon, Va. Another isothiocyanates, Sulfonyl halides, and dichlorotriaZenes. exemplary internal modification of a nucleic acid fragment Active esters are excellent reagents for amine modification is where an abasic phosphoramidate is used to Synthesize a Since the amide products formed are very Stable. Also, these modified phosphodiester which is interposed between a reagents have good reactivity with aliphatic amines and low Sugar and phosphate group of a nucleic acid fragment. The reactivity with the nucleotide amines of oligonucleotides. abasic phosphoramidate contains a reactive group which Examples of active esters include N-hydroxySuccinimide allows a nucleic acid fragment that contains this esters, pentafluorophenyl esters, tetrafluorophenyl esters, phosphoramidate-derived moiety to be joined to another and p-nitrophenyl esters. Active esters are useful because moiety, e.g., a T-L-L, compound. Such abasic phospho 25 they can be made from virtually any molecule that contains ramidates are commercially available from, e.g., Clonetech a carboxylic acid. Methods to make active esters are listed Laboratories, Inc., Palo Alto, Calif. in Bodansky (Principles of Peptide Chemistry (2d ed.), 4. Chemical Handles (L) Springer Verlag, London, 1993). A chemical handle is a Stable yet reactive atomic arrange 5. Linker Attachment ment present as part of a first molecule, where the handle can Typically, a single type of linker is used to connect a undergo chemical reaction with a complementary chemical particular Set or family of tags to a particular set or family handle present as part of a Second molecule, So as to form of MOIs. In a preferred embodiment of the invention, a a covalent bond between the two molecules. For example, Single, uniform procedure may be followed to create all the the chemical handle may be a hydroxyl group, and the various T-L-MOI structures. This is especially advanta complementary chemical handle may be a carboxylic acid 35 geous when the set of T L-MOI structures is large, group (or an activated derivative thereof, e.g., a hydrof because it allows the Set to be prepared using the methods of luroaryl ester), whereupon reaction between these two combinatorial chemistry or other parallel processing tech handles forms a covalent bond (specifically, an ester group) nology. In a Similar manner, the use of a Single type of linker that joins the two molecules together. allows a single, uniform procedure to be employed for Chemical handles may be used in a large number of 40 cleaving all the various T L-MOI structures. Again, this covalent bond-forming reactions that are Suitable for attach is advantageous for a large Set of T-L-MOI structures, ing tags to linkers, and linkers to MOIS. Such reactions because the Set may be processed in a parallel, repetitive, include alkylation (e.g., to form ethers, thioethers), acylation and/or automated manner. (e.g., to form esters, amides, carbamates, ureas, thioureas), There are, however, other embodiment of the present phosphorylation (e.g., to form phosphates, phosphonates, 45 invention, wherein two or more types of linker are used to phosphoramides, phosphonamides), Sulfonylation (e.g., to connect different Subsets of tags to corresponding Subsets of form Sulfonates, Sulfonamides), condensation (e.g., to form MOIS. In this case, Selective cleavage conditions may be imines, oximes, hydrazones), Sillylation, disulfide formation, used to cleave each of the linkers independently, without and generation of reactive intermediates, Such as nitrenes or cleaving the linkers present on other subsets of MOIs. carbenes, by photolysis. In general, handles and bond 50 A large number of covalent bond-forming reactions are forming reactions which are Suitable for attaching tags to Suitable for attaching tags to linkers, and linkers to MOIs. linkers are also suitable for attaching linkers to MOIs, and Such reactions include alkylation (e.g., to form ethers, Vice-versa. In Some cases, the MOI may undergo prior thioethers), acylation (e.g., to form esters, amides, modification or derivitization to provide the handle needed carbamates, ureas, thioureas), phosphorylation (e.g., to form for attaching the linker. 55 phosphate S, phosphonate S, phosphoramide S, One type of bond especially useful for attaching linkers to phosphonamides), Sulfonylation (e.g., to form Sulfonates, MOIs is the disulfide bond. Its formation requires the Sulfonamides), condensation (e.g., to form imines, OXimes, presence of a thiol group (“handle') on the linker, and hydrazones), Sillylation, disulfide formation, and generation another thiol group on the MOI. Mild oxidizing conditions of reactive intermediates, Such as nitrenes or carbenes, by then suffice to bond the two thiols together as a disulfide. 60 photolysis. In general, handles and bond-forming reactions Disulfide formation can also be induced by using an exceSS which are Suitable for attaching tags to linkers are also of an appropriate disulfide eXchange reagent, e.g., pyridyl suitable for attaching linkers to MOIs, and vice-versa. In disulfides. Because disulfide formation is readily reversible, Some cases, the MOI may undergo prior modification or the disulfide may also be used as the cleavable bond for derivitization to provide the handle needed for attaching the liberating the tag, if desired. This is typically accomplished 65 linker. under Similarly mild conditions, using an exceSS of an One type of bond especially useful for attaching linkers to appropriate thiol eXchange reagent, e.g., dithiothreitol. MOIs is the disulfide bond. Its formation requires the US 6,613,508 B1 39 40 presence of a thiol group (“handle') on the linker, and linker is bonded to a MOI, to create the structure T-L- another thiol group on the MOI. Mild oxidizing conditions MOI. Alternatively, the same structure is formed by first then suffice to bond the two thiols together as a disulfide. bonding a linker to a MOI, and then bonding the combina Disulfide formation can also be induced by using an exceSS tion of linker and MOI to a tag. An example is where the of an appropriate disulfide eXchange reagent, e.g., pyridyl MOI is a DNA primer or oligonucleotide. In that case, the disulfides. Because disulfide formation is readily reversible, tag is typically first bonded to a linker, then the T-L is the disulfide may also be used as the cleavable bond for bonded to a DNA primer or oligonucleotide, which is then liberating the tag, if desired. This is typically accomplished used, for example, in a Sequencing reaction. under Similarly mild conditions, using an exceSS of an One useful form in which a tag could be reversibly appropriate thiol eXchange reagent, e.g., dithiothreitol. attached to an MOI (e.g., an oligonucleotide or DNA Of particular interest for linking tags to oligonucleotides Sequencing primer) is through a chemically labile linker. is the formation of amide bonds. Primary aliphatic amine One preferred design for the linker allows the linker to be handles can be readily introduced onto Synthetic oligonucle cleaved when exposed to a volatile organic acid, for otide S with phosphoramidite S. Such as example, trifluoroacetic acid (TFA). TFA in particular is 6-monomethoxytritylhexyclcyanoethyl-N,N-diisopropyl 15 compatible with most methods of MS ionization, including phosphoramidite (available from Glenn Research, Sterling, electrospray. Va.). The amines found on natural nucleotides Such as As described in detail below, the invention provides adenosine and guanosine are virtually unreactive when methodology for genotyping. A composition which is useful compared to the introduced primary amine. This difference in the genotyping method comprises a a plurality of com in reactivity forms the basis of the ability to selectively form pounds of the formula: amides and related bonding groups (e.g., ureas, thioureas, Sulfonamides) with the introduced primary amine, and not T'-L-MOI the nucleotide amines. AS listed in the Molecular Probes catalog (Eugene, wherein, Oreg.), a partial enumeration of amine-reactive functional 25 T" is an organic group detectable by mass spectrometry, groups includes activated carboxylic esters, isocyanates, comprising carbon, at least one of hydrogen and isothiocyanates, Sulfonyl halides, and dichlorotriaZenes. fluoride, and optional atoms Selected from Oxygen, Active esters are excellent reagents for amine modification nitrogen, Sulfur, phosphorus and iodine. In the formula, Since the amide products formed are very Stable. Also, these L is an organic group which allows a T"-containing reagents have good reactivity with aliphatic amines and low moiety to be cleaved from the remainder of the reactivity with the nucleotide amines of oligonucleotides. compound, wherein the T"-containing moiety com Examples of active esters include N-hydroxySuccinimide prises a functional group which Supports a Single esters, pentafluorophenyl esters, tetrafluorophenyl esters, ionized charge State when the compound is Subjected to and p-nitrophenyl esters. Active esters are useful because mass spectrometry and is Selected from tertiary amine, they can be made from virtually any molecule that contains 35 quaternary amine and organic acid. In the formula, a carboxylic acid. Methods to make active esters are listed MOI is a nucleic acid fragment wherein L is conjugated in Bodansky (Principles of Peptide Chemistry (2d ed.), to the MOI at a location other than the 3' end of the Springer Verlag, London, 1993). MOI. In the composition, at least two compounds have Numerous commercial croSS-linking reagents exist which the same T" but the MOI groups of those molecules can serve as linkers (e.g., see Pierce Cross-linkers, Pierce 40 have non-identical nucleotide lengths. Chemical Co., Rockford, Ill.). Among these are homobi Another composition that is useful in the genotyping functional amine-reactive croSS-linking reagents which are method comprises a plurality of compounds of the formula: exemplified by homobifunctional imidoesters and N-hydroxysuccinimidyl (NHS) esters. There also exist het T'-L-MOI erobifunctional croSS-linking reagents possess two or more 45 wherein T" is an organic group detectable by mass different reactive groups that allows for Sequential reactions. Spectrometry, comprising carbon, at least one of hydrogen Imidoesters react rapidly with amines at alkaline pH. NHS and fluoride, and optional atoms Selected from Oxygen, esters give Stable products when reacted with primary or nitrogen, Sulfur, phosphorus and iodine. In the formula, L is Secondary amines. Maleimides, alkyl and aryl halides, an organic group which allows a T"-containing moiety to alpha-haloacyls and pyridyl disulfides are thiol reactive. 50 be cleaved from the remainder of the compound, wherein the Maleimides are specific for thiol (sulfhydryl) groups in the T"-containing moiety comprises a functional group which pH range of 6.5 to 7.5, and at alkaline pH can become amine Supports a Single ionized charge State when the compound is reactive. The thioether linkage is stable under physiological Subjected to mass spectrometry and is Selected from tertiary conditions. Alpha-haloacetyl croSS-linking reagents contain amine, quaternary amine and organic acid. In the formula, the iodoacetyl group and are reactive towards Sulfhydryls. 55 MOI is a nucleic acid fragment wherein L is conjugated to Imidazoles can react with the iodoacetyl moiety, but the the MOI at a location other than the 3' end of the MOI. In reaction is very slow. Pyridyl disulfides react with thiol the composition, at least two compounds have the same T" groups to form a disulfide bond. Carbodiimides couple but those compounds have non-identical elution times by carboxyls to primary amines of hydrazides which give rises column chromatography. to the formation of an acyl-hydrazine bond. The arylazides 60 are photoaffinity reagents which are chemically inert until Another composition that may be used in the genotyping exposed to UV or visible light. When such compounds are method comprises a plurality of compounds of the formula: photolyzed at 250-460 nm, a reactive aryl nitrene is formed. T'-L-MOI The reactive aryl nitrene is relatively non-Specific. Glyoxals are reactive towards guanidinyl portion of arginine. 65 wherein T" is an organic group detectable by mass In one typical embodiment of the present invention, a tag Spectrometry, comprising carbon, at least one of hydrogen is first bonded to a linker, then the combination of tag and and fluoride, and optional atoms Selected from Oxygen, US 6,613,508 B1 41 42 nitrogen, Sulfur, phosphorus and iodine. In the formula, L is and fluoride, and optional atoms Selected from Oxygen, an organic group which allows a T"-containing moiety to nitrogen, Sulfur, phosphorus and iodine. In the formula, L is be cleaved from the remainder of the compound, wherein the an organic group which allows a T"-containing moiety to T"-containing moiety comprises a functional group which be cleaved from the remainder of the compound, wherein the Supports a Single ionized charge State when the compound is T"-containing moiety comprises a functional group which Subjected to mass spectrometry and is Selected from tertiary Supports a Single ionized charge State when the compound is amine, quaternary amine and organic acid. In the formula, Subjected to mass spectrometry and is Selected from tertiary MOI is a nucleic acid fragment wherein L is conjugated to amine, quaternary amine and organic acid. In the formula, the MOI at a location other than the 3' end of the MOI. In MOI is a nucleic acid fragment wherein L is conjugated to the composition, no two compounds which have the same the MOI at a location other than the 3' end of the MOI; and MOI nucleotide length also have the same T". each primer pair associates with a different loci. In the kit, In the above composition, the plurality is preferably the plurality is preferably at least 3, and more preferably at greater than 2, and preferably greater than 4. Also, the least 5. nucleic acid fragment in the MOI have a Sequence comple AS noted above, the present invention provides compo mentary to a portion of a vector, wherein the fragment is 15 Sitions and methods for determining the Sequence of nucleic capable of priming polynucleotide Synthesis. Preferably, the acid molecules. Briefly, Such methods generally comprise T" groups of members of the plurality differ by at least 2 the Steps of (a) generating tagged nucleic acid fragments amu, and may differ by at least 4 amu. which are complementary to a Selected nucleic acid mol The invention also provides for a composition comprising ecule (e.g., tagged fragments) from a first terminus to a a plurality of Sets of compounds, each Set of compounds Second terminus of a nucleic acid molecule), wherein a tag having the formula: is correlative with a particular or Selected nucleotide, and may be detected by any of a variety of methods, (b) Separating the tagged fragments by Sequential length, (c) wherein T" is an organic group detectable by mass cleaving a tag from a tagged fragment, and (d) detecting the Spectrometry, comprising carbon, at least one of hydrogen 25 tags, and thereby determining the Sequence of the nucleic and fluoride, and optional atoms Selected from Oxygen, acid molecule. Each of the aspects will be discussed in more nitrogen, Sulfur, phosphorus and iodine. In the formula, L is detail below. an organic group which allows a T"-containing moiety to B. Diagnostic Methods be cleaved from the remainder of the compound, wherein the 1. Introduction T"-containing moiety comprises a functional group which AS noted above, the present invention also provides a Supports a Single ionized charge State when the compound is wide variety of methods wherein the above-described tags Subjected to mass spectrometry and is Selected from tertiary and/or linkers may be utilized in place of traditional labels amine, quaternary amine and organic acid. Also, in the (e.g., radioactive or enzymatic), in order to enhance the formula, MOI is a nucleic acid fragment wherein L is Specificity, Sensitivity, or number of Samples that may be conjugated to the MOI at a location other than the 3' end of 35 Simultaneously analyzed, within a given method. Represen the MOI. In the composition, members within a first set of tative examples of Such methods which may be enhanced compounds have identical Tms groups, however have non include, for example, RNA amplification (see Lizardi et al., identical MOI groups with differing numbers of nucleotides Bio/Technology 6:1197–1202, 1988; Kramer et al., Nature in the MOI and there are at least ten members within the first 339:401–402, 1989; Lomeli et al., Clinical Chem. 35(9) set, wherein between sets, the T" groups differ by at least 40 :1826–1831, 1989; U.S. Pat. No. 4.786,600), and DNA 2 amu. The plurality is preferably at least 3, and more amplification utilizing LCR or polymerase chain reaction preferably at least 5. (“PCR”) (see, U.S. Pat. Nos. 4,683,195, 4,683.202, and The invention also provides for a composition comprising 4,800,159). a plurality of Sets of compounds, each Set of compounds Within one aspect of the present invention, methods are having the formula 45 provided for determining the identity of a nucleic acid molecule or fragment (or for detecting the presence of a wherein, T" is an organic group detectable by mass Selected nucleic acid molecule or fragment), comprising the Spectrometry, comprising carbon, at least one of hydrogen Steps of (a) generating tagged nucleic acid molecules from and fluoride, and optional atoms Selected from Oxygen, one or more Selected target nucleic acid molecules, wherein nitrogen, Sulfur, phosphorus and iodine. In the formula, L is 50 a tag is correlative with a particular nucleic acid molecule an organic group which allows a T"-containing moiety to and detectable by non-fluorescent Spectrometry or be cleaved from the remainder of the compound, wherein the potentiometry, (b) separating the tagged molecules by size, T"-containing moiety comprises a functional group which (c) cleaving the tags from the tagged molecules, and (d) Supports a Single ionized charge State when the compound is detecting the tags by non-fluorescent Spectrometry or Subjected to mass spectrometry and is Selected from tertiary 55 potentiometry, and therefrom determining the identity of the amine, quaternary amine and organic acid. In the formula, nucleic acid molecules. MOI is a nucleic acid fragment wherein L is conjugated to Within a related aspect of the invention, methods are the MOI at a location other than the 3' end of the MOI. In provided for detecting a Selected nucleic acid molecule, the composition, the compounds within a Set have the same comprising the steps of (a) combining tagged nucleic acid elution time but non-identical T" groups. 60 probes with target nucleic acid molecules under conditions In addition, the invention provides a kit for genotyping. and for a time Sufficient to permit hybridization of a tagged The kit comprises a plurality of amplification primer pairs, nucleic acid probe to a complementary Selected target wherein at least one of the primerS has the formula: nucleic acid Sequence, wherein a tagged nucleic acid probe is detectable by non-fluroescent Spectrometry or 65 potentiometry, (b) altering the size of hybridized tagged wherein T" is an organic group detectable by mass probes, unhybridized probes or target molecules, or the Spectrometry, comprising carbon, at least one of hydrogen probe: target hybrids, (c) separating the tagged probes by US 6,613,508 B1 43 44 size, (d) cleaving tags from the tagged probes, and (e) anywhere in the RNA, including open reading frames detecting tags by non-fluorescent spectrometry or (Welsh et al., Nuc. Acids. Res. 20:4965, 1992). In addition, potentiometry, and therefrom detecting the Selected nucleic it can be used on RNAS that are not polyadenylated, Such as acid molecule. These, other related techniques are discussed many bacterial RNAS. This variant of RNA fingerprinting by in more detail below. arbitrarily primed PCR has been called RAP-PCR. 2. PCR If arbitrarily primed PCR fingerprinting of RNA is per PCR can amplify a desired DNA sequence of any origin formed on Samples derived from cells, tissueS or other (virus, bacteria, plant, or human) hundreds of millions of biological material that have been subjected to different times in a matter of hours. PCR is especially valuable experimental treatments or have different developmental because the reaction is highly specific, easily automated, and histories, differences in gene expression between the capable of amplifying minute amounts of Sample. For these Samples can be detected. For each reaction, it is assumed that reasons, PCR has had a major impact on clinical medicine, the same number of effective PCR doubling events occur genetic disease diagnostics, forensic Science and evolution and any differences in the initial concentrations of cDNA ary biology. products are preserved as a ratio of intensities in the final Briefly, PCR is a process based on a specialized 15 fingerprint. There are no meaningful relationships between polymerase, which can Synthesize a complementary Strand the intensities of bands within a Single lane on a gel, which to a given DNA strand in a mixture containing the 4 DNA are a function of match and abundance. However, the ratio bases and 2 DNA fragments (primers, each about 20 bases between lanes is preserved for each Sampled RNA, allowing long) flanking the target Sequence. The mixture is heated to differentially expressed RNAS to be detected. The ratio of Separate the Strands of double-Stranded DNA containing the Starting materials between Samples is maintained even when target Sequence and then cooled to allow (1) the primers to the number of cycles is sufficient to allow the PCR reaction find and bind to their complementary Sequences on the to Saturate. This is because the number of doublings needed Separated Strands and (2) the polymerase to extend the to reach Saturation are almost completely controlled by the primers into new complementary Strands. Repeated heating invariant products that make up the majority of the finger and cooling cycles multiply the target DNA exponentially, 25 print. In this regard, PCR fingerprinting is different from Since each new double Strand Separates to become two conventional PCR of a single product in which the ratio of templates for further synthesis. In about 1 hour, 20 PCR Starting materials between Samples is not preserved unless cycles can amplify the target by a millionfold. products are sampled in the exponential phase of amplifi Within one embodiment of the invention, methods are cation. provided for determining the identity of a nucleic acid Within one embodiment of the invention methods are molecule, or for detecting the Selected nucleic acid molecule provided for determining the identity of a nucleic acid in, for example, a biological Sample, utilizing the technique molecule, or for detecting a Selecting nucleic acid molecule, of PCR. Briefly, Such methods comprise the Steps of gen in, for example a biological Sample, utilizing the technique erating a Series of tagged nucleic acid fragments or mol of RNA fingerprinting. Briefly, Such methods generally ecules during the PCR and Separating the resulting frag 35 comprise the Steps of generating a Series of tagged nucleic ments are by size. The size separation Step can be acid fragments. The fragments generated by PCR or Similar accomplished utilizing any of the techniques described amplification Schemes and are then Subsequently Separated herein, including for example gel electrophoresis (e.g., poly by Size. The size Separation Step can be, for example, any of acrylamide gel electrophoresis) or preferably HPLC The the techniques described herein, including for example gel tags are then cleaved from the Separated fragments and 40 electrophoresis (e.g., polyacrylamide gel electrophoresis) or detected by the respective detection technology. Examples preferably HPLC. The tags are then cleaved from the of Such technologies have been described herein, and Separated fragments, and then the tags are detected by the include for example mass spectrometry, infra-red respective detection technology. Representative examples of Spectrometry, potentiostatic amperometry or UV spectrom Suitable technologies include maSS Spectrometry, infra-red etry. 45 Spectrometry, potentiostatic amperometry or UV spectrom 3. RNA Fingerprinting and Differential Display etry. The relative quantities of any given nucleic acid When the template is RNA, the first step in fingerprinting fragments are not important, but the size of the band is is reverse transcription. Liang and Pardee (Science 257:967, informative when referenced to a control Sample. 1992) were the first to describe an RNA fingerprinting 4. Fluorescence-Based PCR Single-Strand Conformation protocol, using a primer for reverse transcription based on 50 Polymorphism (PCR-SSCP) oligo (dT) but with an anchor of two bases at the 5' end A number of methods in addition to the RFLP approach (e.g., oligo 5'-(dT)CA-3". Priming occurs mainly at the 5' are available for analyzing base Substitution polymorphisms. end of the poly(rA) tail and mainly in Sequences that end Orita, et al. have devised a way of analyzing these poly 5'-UpG-poly(rA)-3', with a selectivity approaching one out morphisms on the basis of conformational differences in of 12 polyadenylated RNAs. After reverse transcription and 55 denatured DNA. Briefly, restriction enzyme digestion or denaturation, arbitrary priming is performed on the resulting PCR is used to produce relatively small DNA fragments first strand of cDNA. PCR can now be used to generate a which are then denatured and resolved by electrophoresis on fingerprint of products that best matches the primers and that non-denaturing polyacrylamide gels. Conformational differ are derived from the 3' end of the mRNAS and polyadeny ences in the Single-Stranded DNA fragments resulting from lated heterogeneous RNAS. This protocol has been named 60 base substitutions are detected by electrophoretic mobility differential display. shifts. Intra-Strand base pairing creates Single Strand confor Alternatively, an arbitrary primer can be used in the first mations that are highly Sequence-specific and distinctive in Step of reverse transcription, Selecting those regions internal electrophoretic mobility. However, detection rates in differ to the RNA that have 6-8 base matches with the 3' end of the ent studies using conventional SSCP range from 35% to primer. This is followed by arbitrary priming of the resulting 65 nearly 100% with the highest detection rates most often first strand of cDNA with the same or a different arbitrary requiring Several different conditions. In principle, the primer and then PCR. This particular protocol samples method could also be used to analyze polymorphisms based US 6,613,508 B1 45 46 on short insertions or deletions. This method is one of the followed by separation based upon size. Preferably, the size most powerful tools for identifying point mutations and Separation Step is non-denaturing and the nucleic acid frag deletions in DNA (SSCP-PCR, Dean et al., Cell 61:863, ments are denatured prior to the Separation methodology. 1990). The size Separation Step can be accomplished, for example Within one embodiment of the invention methods are gel electrophoresis (e.g., polyacrylamide gel provided for determining the identity of a nucleic acid electrophoresis) or preferably HPLC. The tags are then molecule, or for detecting a Selecting nucleic acid molecule, cleaved from the Separated fragments, and then the tags are in, for example a biological Sample, utilizing the technique detected by the respective detection technology (e.g., mass of PCR-SSP. Briefly, such methods generally comprise the Spectrometry, infra-red spectrometry, potentiostatic amper Steps of generating a Series of tagged nucleic acid fragments. ometry or UV spectrometry). The fragments generated by PCR are then separated by size. 6. Restriction Maps and RFLPs Preferably, the Size Separation Step is non-denaturing and the Restriction endonucleaseS recognize Short DNA nucleic acid fragments are denatured prior to the Separation Sequences and cut DNA molecules at those Specific Sites. methodology. The size Separation Step can be accomplished, Some restriction enzymes (rare-cutters) cut DNA very for example gel electrophoresis (e.g., polyacrylamide gel 15 infrequently, generating a Small number of very large frag electrophoresis) or preferably HPLC. The tags are then ments (Several thousand to a million bp). Most enzymes cut cleaved from the Separated fragments, and then the tags are DNA more frequently, thus generating a large number of detected by the respective detection technology (e.g., mass Small fragments (less than a hundred to more than a thou Spectrometry, infra-red spectrometry, potentiostatic amper Sand bp). On average, restriction enzymes with 4-base ometry or UV spectrometry). recognition Sites will yield pieces 256 bases long, 6-base 5. Dideoxy Fingerprinting (ddF) recognition sites will yield pieces 4000 bases long, and Another method has been described (ddF, Sarkar et al., 8-base recognition sites will yield pieces 64,000 bases long. Genomics 13:441, 1992) that detected 100% of single-base Since hundreds of different restriction enzymes have been changes in the human factor IX gene when tested in a characterized, DNA can be cut into many different small retrospective and prospective manner. In total, 84 of 84 25 fragments. different Sequence changes were detected when genomic A wide variety of techniques have been developed for the DNA was analyzed from patients with hemophilia B. analysis of DNA polymorphisms. The most widely used Briefly, in the applications of tags for genotyping or other method, the restriction fragment length polymorphism purposes, one method that can be used is dideoxy (RFPL) approach, combines restriction enzyme digestion, fingerprinting. This method utilizes a dideoxy terminator in gel electrophoresis, blotting to a membrane and hybridiza a Sanger Sequencing reation. The principle of the method is tion to a cloned DNA probe. Polymorphisms are detected as as follows: a target nucleic acid that is to be sequenced is variations in the lengths of the labeled fragments on the placed in a reaction which possesses a dideoxy-terminator blots. The RFLP approach can be used to analyze base complementary to the base known to be mutated in the target Substitutions when the Sequence change falls within a nucleic acid. For example, if the mutation results in a A->G 35 restriction enzyme site or to analyze minisatellites/VNTRs change, the reaction would be carried out in a C dideoxy by choosing restriction enzymes that cut outside the repeat terminator reaction. PCR primers are used to locate and units. The agarose gels do not usually afford the resolution amplify the target Sequence of interest. If the hypothetical necessary to distinguish minisatellite/VNTR alleles differing target Sequence contains the A->G change, the size of a by a single repeat unit, but many of the minisatellite/VNTRs population of Sequences is changed due to the incorporation 40 are So variable that highly informative markers can Still be of a dideoxy-terminator in the amplified Sequences. In this obtained. particular application of tags, a fragment would be generated Within one embodiment of the invention methods are which would possess a predictable size in the case of a provided for determining the identity of a nucleic acid mutation. The tags would be attached to the 5'-end of the molecule, or for detecting a Selecting nucleic acid molecule, PCR primers and provide a “map' to sample type and 45 in, for example a biological Sample, utilizing the technique dideoxy-terminator type. A PCR amplification reaction of restriction mapping or RFLPs. Briefly, such methods would take place, the resulting fragments would be sepa generally comprise the Steps of generating a Series of tagged rated by size by for example HPLC or PAGE. At the end of nucleic acid fragments in which the fragments generated are the Separation procedure, the DNA fragments are collected digested with restriction enzymes. The tagged fragments are in a temporal reference frame, the tags are cleaved and the 50 generated by conducting a hybridization Step of the tagged presence or absence of mutation is determined by the chain probes with the digested target nucleic acid. The hybridiza length due to premature chain terminator by the incorpora tion Step can take place prior to or after the restriction tion of a given dideoxy-terminator. nuclease digestion. The resulting digested nucleic acid frag It is important to note that didf results in the gain or loSS ments are then Separated by size. The Size Separation Step of a dideoxy-termination Segment and or a shift in the 55 can be accomplished, for example gel electrophoresis (e.g., mobility of at least one of the termination Segments or polyacrylamide gel electrophoresis) or preferably HPLC. products. Therefore, in this method, a Search is made of the The tags are then cleaved from the Separated fragments, and shift of one fragment mobility in a high background of other then the tags are detected by the respective detection tech molecular weight fragments. One advantage is the fore nology (e.g., mass spectrometry, infra-red spectrometry, knowledge of the length of fragment associated with a given 60 potentiostatic amperometry or UV spectrometry). mutation. 7. DNA Fingerprinting Within one embodiment of the invention methods are DNA fingerprinting involves the display of a set of DNA provided for determining the identity of a nucleic acid fragments from a specific DNA sample. A variety of DNA molecule, or for detecting a Selecting nucleic acid molecule, fingerprinting techniques are presently available (Jeffreys et in, for example a biological Sample, utilizing the technique 65 al., Nature 314:67–73, 1985; Zabeau and Vos, 1992); of ddF. Briefly, Such methods generally comprise the Steps “Selective Restriction Fragment Amplification: A General of generating a Series of tagged nucleic acid fragments, Method for DNA Fingerprinting.' European Patent Appli US 6,613,508 B1 47 48 cation 92402629.7.; Vos et al., “DNA FINGERPRINTING. are ligated to the restriction fragments. The PCR primers are A New Technique for DNA Fingerprinting.” Nucl. Acids Specific for the known Sequences of the adaptors and restric Res. 23: 4407-4414, 1996; Bates, S. R. E., Knorr, D. A., tion sites. The Steps of the genetic fingerprinting proceSS are Weller, J. W., and Ziegle, J. S., “Instrumentation for Auto described below. mated Molecular Marker Acquisition and Analysis.” Chap 1) Restriction and Ligation. Restriction fragments of ter 14, pp. 239-255, in The Impact of Plant Molecular genomic DNA are generated by using two different restric Genetics, edited by B. W. S. Sobral, published by tion enzymes: a rare cutter (the Six-base recognition enzyme Birkhauser, 1996. Thus, one embodiment of the invention methods are EcoRI) and a frequent cutter (the four-base recognition provided for determining the identity of a nucleic acid enzyme Mse). Three different types of fragments are pro molecule, or for detecting a Selecting nucleic acid molecule, duced: ones with EcoRI cuts at both ends, ones with Mse in, for example a biological Sample, utilizing the technique cuts at both ends, and ones with an EcoRI cut at one end and of DNA fingerprinting. Briefly, Such methods generally an Mse cut at the other end. Double-stranded adaptors are comprise the Steps of generating a Series of tagged nucleic then ligated to the sticky ends of the DNA fragments, acid fragments, followed by Separation of the fragments by generating template DNA for amplification. The adaptors size. The size Separation Step can be accomplished, for 15 are specific for either the EcoRI site or the Mse site. example gel electrophoresis (e.g., polyacrylamide gel Restriction and ligation take place in a single reaction. electrophoresis) or preferably HPLC. The tags are then Ligation of the adaptor to the restricted DNA alters the cleaved from the Separated fragments, and then the tags are restriction Site So as to prevent a Second restriction from detected by the respective detection technology (e.g., mass taking place after ligation has occurred. Spectrometry, infra-red spectrometry, potentiostatic amper 2) Preselective Amplification. The Sequences of the adap ometry or UV spectrometry). tors and restriction Sites Serve as primer binding Sites for the Briefly, DNA fingerprinting is based on the selective PCR “preselective PCR amplification.” The preselective primers amplification of restriction fragments from a total digest of each have a “selective' nucleotide that will recognize the genomic DNA. The technique involves three steps: 1) Subset of restriction fragments having the matching nucle restriction of the DNA fragments and Subsequent ligation of 25 otide downstream from the restriction site. The primary oligonucleotide adaptors, 2) Selective amplification of Sets products of the preselective PCR are those fragments having of restriction fragments, 3) gel analysis of the amplified one Mse cut and one EcoRI cut, and also having the fragments. PCR amplification of the restriction fragments is matching internal nucleotide. The preselective amplification achieved by using the adaptor and restriction Site Sequence achieves a 16-fold reduction of the complexity of the as target Sites for primer annealing. The Selective amplifi fragment mixture. cation is achieved by the use of primers that extend into the 3) Selective Amplification with CMST-Labeled Primers. restriction fragments, amplifying only those fragments in The complexity of the PCR product mixture is further which the primer extensions match the nucleotides flanking reduced (256-fold) and fragments are labeled with a set of the restriction Sites. CMSTS by carrying out a second PCR using selective This method therefore yields sets of restriction fragments 35 primers labeled with CMSTs. It is possible to choose from which may be visualized by a variety of methods (i.e., among 64 different primer pairs (resulting from all possible PAGE, HPLC, or other types of spectrometry) without prior combinations of eight MseI and eight EcoRI primers) for knowledge of the nucleotide Sequence. The method also this amplification. Each of these primers possesses three allows the co-amplification of large numbers of restriction Selective nucleotides. The first is the same as that used in the fragments. The number of fragments however is dependent 40 pre-Selective amplification; the others can be any of the 16 on the resolution of the detection system. Typically, 50-100 possible combinations of the four nucleotides. Only that restriction fragments are amplified and detected on denatur Subset of fragments having matching nucleotides at all three ing polyacrylamide gels. In the application described herein, positions will be amplified at this stage in the amplification. the separation will be performed by HPLC. 8. Application of Cleavable Tags to Genotyping and The DNA fingerprinting technique is based on the ampli 45 Polymorphism Detection fication of Subsets of genomic restriction fragments using a. Introduction PCR. DNA is cut with restriction enzymes and double strand Although a few known human DNA polymorphisms are adapters and the are ligated to the ends of the DNA frag based upon insertions, deletions or other rearrangements of ments to generate template DNA for the amplification reac non-repeated Sequences, the vast majority are based either tions. The Sequence of the ligated adapters and the adjacent 50 upon Single base Substitutions or upon variations in the restriction enzymes (sites) serve as binding sites for the number of tandem repeats. Base Substitutions are very DNA fingerprinting of primers for PCR-based amplification. abundant in the human genome, occurring on average once Selective nucleotides are included at the 3' end of the of the every 200-500 bp. Length variations in blocks of tandem PCR primers which therefore can only prime DNA synthesis repeats are also common in the genome, with at least tens of from a Subset of the restriction sites. Only restriction frag 55 thousands of interspersed polymorphic Sites (termed loci). ments in which the nucleotides flanking the restriction site Repeat lengths for tandem repeat polymorphisms range from can match the Selective nucleotide will be amplified. 1 bp in (dA),(dT), Sequences to at least 170 bp in C-satellite The DNA fingerprinting proceSS produces "fingerprint” DNA. Tandem repeat polymorphisms can be divided into patterns of different fragment lengths that are characteristic two major groups which consist of minisatelliteS/variable and reproducible for an individual organism. These finger 60 number of tandem repeats (VNTRs), with typical repeat prints can be use to distinguish even very closely related lengths of tens of base pairs and with tens to thousands of organisms, including near-isogenic lines. The differences in total repeat units, and microsatellites, with repeat lengths of fragment lengths can be traced to base changes in the up to 6 bp and with maximum total lengths of about 70 bp. restriction site or the primer extension site, or to insertions Most of the microsatellite polymorphisms identified to date or deletions in the body of the DNA fragment. 65 have been based on (dC-dA), or (dG-dT), dinucleotide Dependence on Sequence knowledge of the target genome repeat Sequences. Analysis of microSatellite polymorphisms is eliminated by the use of adaptors of known Sequence that involves amplification by the polymerase chain reaction US 6,613,508 B1 49 SO (PCR) of a small fragment of DNA containing a block of Genetics, 2, p1123) and may possess up to 24 alleles. repeats followed by electrophoresis of the amplified DNA on MicroSatellite dinucleotide repeats can be amplified using denaturing polyacrylamide gel. The PCR primers are primers complementary to the unique regions Surrounding complementary to unique Sequences that flank the blocks of the dinucleotide repeat by PCR. Following amplification, repeats. Polyacrylamide gels, rather than agarose gels, are 5 Several amplified loci and be combined (multiplexed) prior traditionally used for microsatellites because the alleles to a Size Separation Step. The process of applying the often only differ in Size by a Single repeat. amplified microSatellite fragments to a Size Separation Step Thus, within one aspect of the present invention methods and then identifying the size and therefore the allele is are provided for genotyping a Selected organism, comprising known as genotyping. Chromosome specific markers which the steps of (a) generating tagged nucleic acid molecules permit a high level of multiplexing have been reported for from a Selected target molecule, wherein a tag is correlative performing whole genome Scans for linkage analysis with a particular fragment and may be detected by non (Davies et al., 1994, Nature, 371, p130). fluorescent spectrometry or potentiometry, (b) separating the Tags can be used to great effect in genotyping with tagged molecules by Sequential length, (c) cleaving the tag microsatellites. Briefly, the PCR primers are constructed to from the tagged molecule, and (d) detecting the tag by non 15 carry tags and used in a carefully chosen PC reaction to fluorescent Spectrometry or potentiometry, and therefrom amplify di-, tri-, or tetra-nucleotide repeats. The amplifica determining the genotype of the organism. tion products are then Separated according to Size by meth Within another aspect, methods are provided for geno ods such as HPLC or PAGE. The DNA fragments are then typing a selected organism, comprising the steps of (a) collected in a temporal fashion, the tags cleaved from their combining a tagged nucleic acid molecule with a Selected respective DNA fragments and length deduced from com target molecule under conditions and for a time Sufficient to parison to internal Standards in the size Separation Step. permit hybridization of the tagged molecule to the target Allele identification is made from reference to size of the molecule, wherein a tag is correlative with a particular amplified products. fragment and may be detected by non-fluorescent spectrom With cleavable tags approach to genotyping, it is possible etry or potentiometry, (b) separating the tagged fragments by 25 to combine multiple Samples on a single Separation Step. Sequential length, (c) cleaving the tag from the tagged There are two general ways in which this can performed. fragment, and (d) detecting the tag by non-fluorescent spec The first general method for high through-put Screening is trometry or potentiometry, and therefrom determining the the detection of a single polymorphism in a large group of genotype of the organism. individuals. In this senario a single or nested set of PCR b. Application of Cleavable Tags to Genotyping primerS is used and each amplification is done with one A PCR approach to identify restriction fragment length DNA sample type per reaction. The number of samples that polymorphism (RFPL) combines gel electrophoresis and can be combined in the Separation Step is proportional to the detection of tags assoicated with specific PCR primers. In number of cleavable tags that can be generated per detection general, one PCR primer will possess one specific tag. The technology (i.e., 400-600 for mass spectrometer tags). It is tag will therefore represent one set of PCR primers and 35 therefore possible to identify 1 to Several polymorphisms in therefore a pre-determined DNA fragment length. Polymor a large group of individuals Simultaneously. The Second phisms are detected as variations in the lengths of the labeled approach is to use multiple Sets of PCR primers which can fragments in a gel or eluting from a gel. Polyacrylamide gel identify numerous polymorphisms on a Single DNA sample electrophoresis will usually afford the resolution necessary (genotyping an individual for example). In this approach to distinguish minisatellite/VNTR alleles differing by a 40 PCR primers are combined in a single amplification reaction Single repeat unit. Analysis of microSatellite polymorphisms which generate PCR products of different length. Each involves amplification by the polymerase chain reaction primer pair or nested Set is encoded by a specific cleavable (PCR) of a small fragment of DNA containing a block of Tag which implies each PCR fragment will be encoded with repeats followed by electrophoresis of the amplified DNA on a specific tag. The reaction is run on a Single Separation Step denaturing polyacrylamide gel or followed by Separation of 45 (see below). The number of samples that can be combined DNA fragments by HPLC. The amplified DNA will be in the Separation Step is proportional to the number of labeled using primers that have cleavable tags at the 5' end cleavable tags that can be generated per detection technol of the primer. The primers are incorporated into the newly ogy (i.e., 400–600 for mass spectrometer tags). synthesized strands by chain extension. The PCR primers c. Enzymatic Detection of Mutation and the Applications are complementary to unique Sequences that flank the blockS 50 of Tags of repeats. Minisatellite/VNTR polymorphisms can also be In this particular application or method, mismatches in amplified, much as with the microSatellites described above. heteroduplexes are detected by enzymatic cleavage of mis Descriptions of many types of DNA sequence polymor matched base pairs in a given nucleic acid duplex. DNA phisms have provided the fundamental basis for the under Sequences to be tested for the presence of a mutation are Standing of the structure of the human genome (Botstein et 55 amplified by PCR using a Specific Set of primers, the al., Am. J. Human Genetics 32:p314, 1980; Donis-Keller, amplified products are denatured and mixed with denatured Cell 51:319, 1981; Weissenbach et al., Nature 359:794). The reference fragments and hybridized which result in the construction of extensive framework linkage maps has been formation of heteroduplexes. The heteroduplexes are then facilitated by the use of these DNA polymorphisms and has treated with enzymes which recognize and cleave the duplex provided a practical means for localization of disease genes 60 if a mismatch is present. Such enzymes are nuclease S1, by linkage. MicroSatellite dinucleotide markers are proving Mung bean nuclease, “resolvases”, T4 endonuclease IV, etc. to be very powerful tools in the identification of human Essentially any enzyme can be used which recognizes genes which have been shown to contain mutations and in mismatches in vitro and cleave the resulting mismatch. The Some instances cause disease. Genomic dinucleotide repeats treatment with the appropriate enzyme, the DNA duplexes are highly polymorphic (Weber, 1990, Genomic Analysis, 65 are separated by size, by, for example HPLC or PAGE. The Vol 1, pp 159-181, Cold Spring Laboratory Press, Cold DNA fragments are collected temporally. Tags are cleaved Spring Harbor, N.Y.; Weber and Wong, 1993, Hum. Mol. and detected. The presence of a mutation is detected by the US 6,613,508 B1 S1 52 shift in mobility of a fragments relative to a wild-type analysis in Stage III carconoma of the colon has more reference fragment. recently been demonstrated (Pricolo et al., Am. J. Surg. d. Applications of Tags to the Oligonucleotide Ligation 171:41, 1996). It is therefore apparent that genotyping of Assay (OLA) tumors and pre-cancerous cells, and Specific mutation detec The oligonucleotide ligation assay as originally described tion will become increasingly important in the treatment of by Landegren et al. (Landegen et al., Science 241:487, 1988) cancers in humans. is a useful technique for the identification of Sequences 9. Single Nucleotide Extension Assay (known) in very large and complex genomes. The principle The primer extension technique for the detection of Single of the OLA reaction is based on the ability of ligase to nucleotide in genomic DNA was first described by Sokolov covalently join two diagnostic oligonucleotides as they in 1989 (Nucleic Acids Res. 18(12): 3671, 1989). In this hybridize adjacent to one another on a given DNA target. If paper, Sokolov described the Single nucleotide extension of the Sequences at the probe junctions are not perfectly 30-merS and 20-mers complementary to the known Sequence based-paired, the probes will not be joined by the ligase. The of the cystic fibrosis gene. It was shown that the method had ability of a thermostable ligase to discriminate potential the ability to correctly identify a Single nucleotide change Single base-pair differences when positioned at the 3' end of 15 within t the gene. The method was based on the use of the "upstream” probe provides the opportunity for Single radiolabelled deoxynucleotides for a labeling method in the base-pair resolution (Barony, PNAS USA 88:189, 1991). In Single nucleotide extension assay. Later publications the application of tags, the tags can be attached to a probe described the use of Single nucleotide extension assays for which is ligated to the amplified product. After completion genetic diseases Such as hemophilia B (factor IX) and the of the OLR, the fragments are separated on the basis of Size, cyctic fibrosis gene (Kuppuswamy et al., PNAS USA the tags cleaved and detected by mass spectrometry. 88:p1143–1147, 1991). Recently, the parameters of the e. Sequence Specific Amplification Single nucleotide extension assay in terms of the quantitative PCR primers with a 3' end complementary either to a range, variability, and multiplex analysis has been described mutant or normal oligonucleotide Sequence can be used to in detail (Greenwood, A. D. and Burke, D. T., Genome selectively amplify one or the other allele (Newton et al., 25 Research 6:336–348, 1996). Nuc. Acids Res., 17, p2503; et al., 1989, Genomics, 5, p535; The strategy is based on the fidelity of the DNA poly Okayama et al., 1989, J. Lab. Clin. Med., 114, p.105; merase to add only the correctly paired nucleotide onto the Sommer et al., 1989, Mayo Clin. Proc., 64, 1361; Wu et al., 3' end of the template hybridized primer. Since only one PNAS USA, 86, p2757). Usually the PCR products are dideoxy-terminator nucleotide is added per reaction, it is a visualized after amplification by PAGE, but the principle of Simple matter to Sort out which primer has been extended in Sequence Specific amplification can be applied to Solid phase all four types of dNTPs. formats. Hence, within one aspect of the invention methods are f. Application of Tags to Some Amplification Based provided for the detection of a single Selected nucleic acid ASSayS within a nucleic acid molecule, comprising the steps of (a) Genotyping of Viruses: One application of tags is the 35 hybridizing in at least two Separate reactions a tagged primer genotyping or identification of Viruses by hybridization with and a target nucleic acid molecule under conditions and for tagged probes. For example, F+ RNA coliphages may be a time sufficient to permit hybridization of the primer to the useful candidates as indicators for enteric virus contamina target nucleic acid molecule, wherein each reaction contains tion. Genotyping by nucleic acid hybridization methods is a an enzyme which will add a nucleotide chain terminator, reliable, rapid, Simple, and inexpensive alternative to Sero 40 and, a nucleotide chain terminator complementary to typing (Kafatos et al., Nucleic Acids Res. 7:1541, 1979). adenosine, cytosine, guanosine, thymidine or uracil, and Amplification techniques and nucleic aid hybridization tech wherein each reaction contains a different nucleotide chain niques have been Successfully used to classify a variety of terminator, (b) separating tagged primers by size, (c) cleav microorganisms including E. coli (Feng, Mol. Cell Probes ing the tag from the tagged primer, and (d) detecting the tag 7:151, 1993). Representative examples of viruses that may 45 by non-fluorescent spectrometry or potentiometry, and be detected utilizing the present invention include rotavirus therefrom determining the presence of the Selected nucle (Sethabutr et. al., J. Med Virol. 37:192, 1992), hepatitis otide within the nucleic acid molecule. viruses such as hepatitis C virus (Stuyver et. al., J. Gen Virol. AS noted herein a wide variety of Separation methods may 74:1093, 1993), herpes simplex virus (Matsumoto et al., J. be utilized, including for example liquid chromatographic Virol Methods 40:119, 1992). 50 means such as HPLC. In addition, a wide variety of detec Prognostic applications of mutational analysis in cancers: tion methodologies may be utilized, including for example, Genetic alterations have been described in a variety of maSS Spectrometry, infrared Spectrometry, ultraViolet experimental mammalian and human neoplasms and repre Spectrometry, or, potentiostatic amperometry. Also, Several Sent the morphological basis for the Sequence of morpho different enzymes may be utilized (e.g., a polymerase), as logical alterations observed in carcinogenesis (Vogelstein et 55 well as any of the tags provided herein. Within certain al., NEJM 319:525, 1988). In recent years with the advent of preferred embodiments, each primer which is utilized within molecular biology techniques, allelic losses on certain chro a reaction has a different unique tag. In this manner, multiple mosomes or mutation of tumor Suppressor genes as well as Samples (or multiple sites) may be simulataneously probed mutations in Several oncogenes (e.g., c-myc, c-jun, and the for the presence of Selected nucleotides. ras family) have been the most studied entities. Previous 60 Single nucleotide assayS. Such as those described herein work (Finkelstein et al., Arch Surg. 128:526, 1993) has may be utilized to detect polymorphic variants, or to inter identified a correlation between Specific types of point rogate a biological Sample for the presence a specific nucle mutations in the K-ras oncogene and the Stage at diagnosis otide within or near a known Sequence. Target nucleic acid in colorectal carcinoma. The results Suggested that muta molecules include not only DNA (e.g., genomic DNA), but tional analysis could provide important information of tumor 65 RNA as well. aggressiveness, including the pattern and spread of metasta In general, this method involves hybridizing a primer to sis. The prognostic value of TP53 and K-ras-2 mutational the target DNA sequence such that the 3' end of the primer US 6,613,508 B1 S3 S4 is immediately adjacent to the mutation to be detected and the entire genome is amplified in any reaction. For example, identified. The procedure is similar to the Sanger Sequencing if 2 base-pair (bp) extensions are used, only one in 256 reaction except that only the dideoxynucleotide of a given molecules is amplified. To further limit the number of nucleotide is added to the reaction mixture. Each dideoxy fragments that are actually visualized (So that a manageable nucleotide is labeled with a unique tag. Of the four reaction number is observed), only one of the primers is labeled. mixtures, Solely one will add a dideoxyterminator on to the Finally, the amplified DNAS are separated on a polyacryla primer Sequence. If the mutation is present, it will be mide gel (Sequencing type) and an autoradiograph or phos detected through the unique tag on the dideoxynucleotide phor image is generated. and its identity established. Multiple mutations can be Within one embodiment of the invention methods are ascertained simultaneously by tagging the DNA primer with provided for determining the identity of a nucleic acid a unique tag as well. Within one aspect of the invention molecule, or for detecting a Selecting nucleic acid molecule, methods are provided for analyzing Single nucleotide muta in, for example a biological Sample, utilizing the technique tions from a Selected biological Sample, comprising the Steps of genetic fingerprinting. Briefly, Such methods generally of exposing nucleic acids from a biological Sample and comprise the Steps of digesting (e.g., genomic DNA) to combining the exposed nucleic acids with one or more 15 completion by two different restriction enzymes. Specific Selected nucleic acid probes, which may or may not be double-strand oligonucleotide adapters (-25-30 bp) are tagged, under conditions and for a time Sufficient for Said ligated to the restricted DNA fragments. Optional pre probes to hybridize to Said nucleic acids, wherein the tag, if amplification utilizing primers with a 1 bp extension may be used, is correlative with a particular nucleic acid probe and performed. The PCR product is then diluted, and tagged detectable by non-fluorescent spectrometry, or potentiom primerS homologous to the adapters, but having extensions etry. The DNA fragments are reacted in four Separate reac at the 3'-end are used to amplify by PCR a Subset of the DNA tions each including a different tagged dideoxyterminator, fragments. The resulting PCR products are then Separated by wherein the tag is correlative with a particular dideoxynucle size. The size Separation Step can be accomplished by a otide and detectable by non-fluorescent spectrometry, or variety of methods, including for example, HPLC. The tags potentiometry. The DNA fragments are Separated according 25 are then cleaved from the Separated fragments, and detected to size by, for example, gel electrophoresis (e.g., polyacry by the respective detection technology (e.g., mass lamide gel electrophoresis) or preferably HPLC. The tags Spectrometry, infrared spectrometry, potentioStatic amper are cleaved from the Separated fragments and detected by ometry or UV/visible spectrophotometry). the respective detection technology (e.g., mass 11. Gene Expression Analysis Spectrometry, infrared spectrometry, potentiostatic amper One of the inventions disclosed herein is a high through ometry or UV/visible spectrophotometry). The tags detected put method for measuring the expression of numerous genes can be correlated to the particular DNA fragment under (1-2000) in a single measurement. The method also has the investigation as well as the identity of the mutant nucleotide. ability to be done in parallel with greater than one hundred 10. Amplified Fragment Length Polymorphism (AFLP) Samples per process. The method is applicable to drug AFLP was designed as a highly sensitive method for DNA 35 Screening, developmental biology, molecular medicine Stud fingerprinting to be used in a variety of fields, including ies and the like. Within one aspect of the invention methods plant and animal breeding, medical diagnostics, forensic are provided for analyzing the pattern of gene expression analysis and microbial typing, to name a few. (Vos et al., from a Selected biological Sample, comprising the Steps of Nucleic Acids Res. 23:4407-4414, 1995.) The power of (a) exposing nucleic acids from a biological Sample, (b) AFLP is based upon the molecular genetic variations that 40 combining the exposed nucleic acids with one or more exist between closely related Species, varieties or cultivars. Selected tagged nucleic acid probes, under conditions and These variations in DNA sequence are exploited by the for a time sufficient for the probes to hybridize to the nucleic genetic fingerprinting technology Such that "fingerprints of acids, wherein the tag is correlative with a particular nucleic particular genotypes can be routinely generated. These “fin acid probe and detectable by non-fluorescent spectrometry, gerprints' are simply RFLPs visualized by selective PCR 45 or potentiometry, (c) separating hybridized probes from amplification of DNA restriction fragments. Briefly genetic unhybridized probes, (d) cleaving the tag from the tagged fingerprinting technology consists of the following Steps: fragment, and (e) detecting the tag by non-fluorescent genomic DNA is digested to completion by two different Spectrometry, or potentiometry, and therefrom determining restriction enzymes. Specific double Strand oligonucleotide the pattern of gene expression of the biological Sample. adapters (~25-30 bp) are ligated to the restricted DNA 50 Within a particularly preferred embodiment of the fragments. Oligonucleotide primerS homologous to the invention, assays or methods are provided which are adapters, but having extensions at the 3'-end are used to described as follows: RNA from a target source is bound to amplify a Subset of the DNA fragments. (A pre-amplification a Solid Support through a specific hybridization step (i.e., Step can also be performed where the extension is only 1 bp capture of poly(A) mRNA by a tethered oligo(dT) capture in length. Amplification with the primer having a 3 base-pair 55 probe). The solid support is then washed and cDNA is extension would follow.) These extensions can vary in Synthesized on the Solid Support using Standard methods length from 1 to 3 base-pair, but are of defined length for a (i.e., reverse transcriptase). The RNA strand is then removed given primer. The Sequence of the extension can also vary via hydrolysis. The result is the generation of a DNA from one primer to another but is of a Single, defined population which is covalently immobilized to the solid Sequence within a given primer. The Selective nature of 60 Support which reflects the diversity, abundance, and com AFLP-PCR is based on the 3' extensions on the oligonucle plexity of the RNA from which the cDNA was synthesized. otide primers. Since these extensions are not homologous to The solid support then interrogated (hybridized) with 1 to adapter Sequence, only DNA fragments complementary to Several thousand probes that are complementary to a gene the extensions will be amplified due to the inability of Taq Sequence of interest. Each probe type is labeled with a DNA polymerase, unlike some other DNA polymerases, to 65 cleavable mass spectrometry tag or other type of cleavable extend DNAS if mismatches occur at the 3'-end of a mol tag. After the interrogation Step, exceSS or unhybridized ecule that is being Synthesized. Therefore only a Subset of probe is washed away, the Solid Support is placed, for US 6,613,508 B1 SS S6 example, in the well of a microtiter plate and the mass type of capture probe employed guide hybridization times. Spectrometry tag is cleaved from the Solid Support. The Solid Hybridization temperatures are dictated by the type of Support is removed from the well of Sample container, and chaotrope employed and the final concentration of chaotrope the contents of the well are measured with a mass spectrom (see Van Ness and Chen, Nuc. Acids Res. 1991, for general eter. The appearance of Specific mass spectrometer tags guidelines). The lysate is preferentially agitated continually indicates the presence of RNA in the sample and evidence with the Solid support to effect diffusion of the target RNA. that a specific gene is expressed in a given biological Once the Step of capturing the target nucleic acid is Sample. The method can also be quantifiable. accomplished, the lysate is washed from the Solid Support The compositions and methods for the rapid measurement and all chaotrope or hybridization solution is removed. The of gene expression using cleavable tags can be described in Solid Support is preferentially washed with Solutions con detail as follows. Briefly, tissue (liver, muscle, etc.), primary taining ionic or non-ionic detergents, buffers and Salts. The or transformed cell lines, isolated or purified cell types or next step is the synthesis of DNA complementary to the any other Source of biological material in which determining captured RNA. In this step, the tethered capture oligonucle genetic expression is useful can be used as a Source of RNA. otide Serves as the extension primer for reverse transcriptase. In the preferred method, the biological Source material is 15 The reaction is generally performed at 25 to 37 C. and lysed in the presence of a chaotrope in order to SuppreSS preferably agitated during the polymerization reaction. After nucleases and proteases and Support Stringent hybridization the cDNA is synthesized, it becomes covalently attached to of target nucleic acid to the Solid Support. Tissues, cells and the Solid Support Since the capture oligonucleotide Serves as biological Sources can be effectively lysed in 1 to 6 molar the extension primer. The RNA is then hydrolyzed from the chaotropic Salts (guanidine hydrochloride, guanidine cDNA/RNA duplex. The step can be effected by the use of thiocyanate, Sodium perchlorate, etc.). After the Source heat that denatures the duplex or the use of base (i.e., 0.1 N biological Sample is lysed, the Solution is mixed with a Solid NaOH) to chemically hydrolyze the RNA. The key result at Support to effect capture of target nucleic acid present in the this step is to make the cDNA available for Subsequent lysate. In one permutation of the method, RNA is captured hybridization with defined probes. The Solid support or set using a tethered oligo(dT) capture probe. Solid Supports can 25 of Solid supports is then further washed to remove RNA or include nylon beads, polystyrene microbeads, glass beads RNA fragments. At this point, the Solid Support contains a and glass Surfaces or any other type of Solid Support to which approximate representative population of cDNA molecules oligonucleotides can be covalently attached. The Solid Sup that represents the RNA population in terms of Sequence ports are preferentially coated with an amine-polymer Such abundance, complexity, and diversity. The next Step is to as polyethylene(imine), acrylamide, amine-dendrimers, etc. hybridize selected probes to the Solid support to identify the The amines on the polymers are used to covalently immo presence or absence and the relative abundance of Specific bilize oligonucleotides. Oligonucleotides are preferentially cDNA sequences. Probes are preferentially oligonucleotides Synthesized with a 5'-amine (generally a hexylamine that in length of 15 to 50 nucleotides. The Sequence of the probes includes a six carbon Spacer-arm and a distal amine). Oli is dictated by the end-user of the assay. For example, if the gonucleotides can be 15 to 50 nucleotides in length. Oligo 35 end-user intended to Study gene expression in an inflamma nucleotides are activated with homo-bifunctional or hetero tory response in a tissue, probes would be Selected to be bifunctional cross-linking reagents Such as cyanuric complementary to numerous cytokine mRNAS, RNAS that chloride. The activated oligonucleotides are purified from encode enzymes that modulate lipids, RNAS that encode excess cross-linking reagent (i.e., cyanuric chloride) by factors that regulate cells involved in an inflammatory exclusion chromatography. The activated oligonucleotide 40 response, etc. Once a Set of defined Sequences are defined for are then mixed with the Solid Supports to effect covalent Study, each Sequence is made into an oligonucleotide probe attachment. After covalent attachment of the and each probe is assigned a specific cleavable tag. The oligonucleotides, the unreacted amines of the Solid Support tag(s) is then attached to the respective oligonucleotide(s). are capped (i.e., with Succinic anhydride) to eliminate the The oligonucleotide(s) are hybridized to the cDNA on the positive charge of the Solid Support. The Solid Supports can 45 Solid Support under appropriate hybridization conditions. be used in parallel and are preferentially configured in a After completion of the hybridization Step; the Solid Support 96-well or 384-well format. The solid supports can be is washed to remove any unhybridized probe. The solid attached to pegs, Stems, or rods in a 96-well or, 384-well Support or array of Supports is then heated to cleave the configuration, the Solid Supports either being detachable or covalent bond between the cDNA and the solid support. The alternatively integral to the particular configuration. The 50 tagged cDNA fragments are then Separated according to size particular configuration of the Solid Supports is not of critical by gel electrophoresis (e.g., polyacrylamide gel importance to the functioning of the assay, but rather, affects electrophoresis) or preferably HPLC. The tags are then the ability of the assay to be adapted to automation. The cleaved from the DNA probe molecules, and detected by the solid supports are mixed with the lysate for 15 minutes to respective detection technology (e.g., mass spectrometry, Several hours to effect capture of the target nucleic acid onto 55 infrared spectrometry, potentiostatic amperometry or the Solid Support. In general, the “capture” of the target UV/visible spectrophotometry). Each tag present is nucleic acid is through complementary base pairing of target identified, and the presence (and abundance) or absence of RNA and the capture probe immobilized on the solid Sup an expressed mRNA is determined. port. One permutation utilizes the 3' poly(A) stretch found An alternative procedure would hybridize the tagged on most eucaryotic messengerS RNAS to hybridize to a 60 DNA probes directly to the tethered mRNA target molecules tethered oligo(dT) on the Solid Support. Another permutation under the appropriate hybridization conditions. After is to utilize a specific oligonucleotide or long probes (greater completion of the hybridization Step, the Solid Support is than 50 bases) to capture an RNA containing a defined washed to remove any unhybridized probe. The RNA is then Sequence. Another possibility is to employ degenerate prim hydrolyzed from the DNA probe/RNA duplex. The step can ers (oligonucleotides) that would effect the capture of 65 be effected by the use of heat which denatures the duplex or numerous related Sequences in the target RNA population. the use of base (i.e., 0.1 N NaOH) to chemically hydrolyze The Sequence complexity of the RNA population and the the RNA. This step will leave free mRNA and their corre US 6,613,508 B1 57 58 sponding DNA probes that can then be isolated through a comprise the Steps of generating a Series of tagged nucleic Size Separation Step generally consisting of gel electrophore acid fragments in which the fragments generated are sis (e.g., polyacrylamide gel electrophoresis) or preferably digested with restriction enzymes. The tagged fragments are HPLC. The tags are then cleaved from the DNA probe generated by conducting a hybridization Step of the tagged molecules, and detected by the respective detection technol probes with the digested target nucleic acid. The hybridiza ogy (e.g., mass spectrometry, infrared spectrometry, poten tion Step can take place prior to or after the restriction tiostatic amperometry or UV/visible spectrophotometry). nuclease digestion. The resulting digested nucleic acid frag Each tag present is identified, and the presence (and ments are then Separated by size. The Size Separation Step abundance) or absence of an expressed mRNA is deter can be accomplished, for example, by gel electrophoresis mined. (e.g. polyacrylamide gel electrophoresis) or preferably 12. Hybridization Techniques HPLC. The tags are then cleaved from the separated The Successful cloning and Sequencing of a gene leads to fragments, and then the tags are detected by the respective the investigation of its structure and expression by making detection technology (e.g., mass spectrometry, infrared it possible to detect the gene or its mRNA in a large pool of Spectrometry, potentiostatic amperometry or UV/visible unrelated DNA or RNA molecules. The amount of mRNA 15 spectrophotometry). encoding a specific protein in a tissue is an important The presence and quantification of a specific gene tran parameter for the activity of a gene and may be significantly Script and its regulation by physiological parameters can be related to the activity of function Systems. Its regulation is analyzed by means of Northern blot analysis and RNase dependent upon the interaction between Sequences within protection assay. The principle basis of these methods is the gene (cis-acting elements) and Sequence-specific DNA hybridization of a pool of total cellular RNA to a specific binding proteins (trans-acting factors), which are activated probe. In the Northern blot technique, total RNA of a tissue tissue-specifically or by hormones and Second messenger is fractionated using an HPLC or LC method, hybridized to Systems. a labeled antisense RNA (cRNA), complementary to the Several techniques are available for analysis of a particu RNA to be detected. By applying Stringent washing lar gene, its regulatory Sequences, its specific mRNA and the 25 conditions, non-specifically bound molecules are elimi regulation of its expression; these include Southern or nated. Specifically bound molecules, would Subsequently be Northern blot analysis and ribonuclease (RNase) protection detected according to the type of probe utilized (mass asSay. spectrometry, or with a electrochemical detector). In Variations in the nucleotide composition of a certain gene addition, Specificity can be controlled by comparing the size may be of great pathophysiological relevance. When local of the detected mRNA with the predicted length of the ized in the non-coding regions (5', 3'-flanking regions and mRNA of interest. intron), they can affect the regulation of gene expression, Within one embodiment of the invention methods are causing abnormal activation or inhibition. When localized in provided for determining the identity of a ribonucleic acid the coding regions of the gene (exons), they may result in molecule, or for detecting a Selecting ribonucleic acid alteration of the protein function or dysfunctional proteins. 35 molecule, in, for example a biological Sample, utilizing the Thus, a certain Sequence within a gene can correlate to a techniques similar to Northern blotting. Briefly, such meth Specific disease and can be useful as a marker of the disease. ods generally comprise the Steps of generating a Series of One primary goal of research in the medical field is, tagged RNA molecules by conducting a hybridization Step therefore, to detect those genetic variations as diagnostic of the tagged probes with the target RNA. The tagged RNA tools, and to gain important information for the understand 40 molecules are then Separated by size. The size Separation ing of pathophysiological phenomena. The basic method for step can be accomplished, for example by preferably HPLC. the analysis of a population regarding the variations within The tags are cleaved from the Separated RNA molecules, and a certain gene is DNA analysis using the Southern blot then the tags are detected by the respective detection tech technique. Briefly, prepared genomic DNA is digested with nology (e.g., mass spectrometry, infrared spectrometry, a restriction enzyme (RE), resulting in a large number of 45 potentio static amp ero me try or UV / visible DNA fragments of different lengths, determined by the spectrophotometry). presence of the Specific recognition site of the RE on the The most specific method for detection of a mRNA genome. Alleles of a certain gene with mutations inside this species is the RNase protection assay. Briefly, total RNA restriction site will be cleaved into fragments of different from a tissue or cell culture is hybridized to a tagged specific number and length. This is called restriction fragment length 50 cRNA of complete homology. Specificity is accomplished polymorphism (RFLP) and can be an important diagnostic by Subsequent RNase digestion. Non-hybridized, single marker with many applications. The fragment to be analyzed stranded RNA and non-specifically hybridized fragments has to be separated from the pool of DNA fragments and with even Small mismatches will be recognized and cleaved, distinguished from other DNA species using a specific while double-stranded RNA of complete homology is not probe. Thus, DNA is subjected to electrophoretic fraction 55 accessible to the enzyme and will be protected. After remov ation using an agarose gel, followed by transfer and fixation ing RNase by proteinase K digestion and phenol eXtraction, to a nylon or nitrocellulose membrane. The fixed, single the Specific protected fragment can be separated from deg stranded DNA is hybridized to a tagged DNA that is comple radation products, usually by HPLC. mentary to the DNA to be detected, After removing non Within one embodiment of the invention methods are specific hybridizations, the DNA fragment of interest can be 60 provided for determining the identity of a ribonucleic acid Visualized according to the probes characteristics molecule, or for detecting a Selecting ribonucleic acid (autoradiography or phosphor image analysis). molecule, in, for example a biological Sample, utilizing the Within one embodiment of the invention methods are technique of RNase protection assay. Briefly, Such methods provided for determining the identity of a nucleic acid generally comprise the Steps of total RNA from a tissue or molecule, or for detecting a Selecting nucleic acid molecule, 65 cell culture being hybridized to a tagged specific cRNA of in, for example a biological Sample, utilizing the techniques complete homology, a RNase digestion, treatment with similar to Southern blotting. Briefly, such methods generally proteinase K and a phenol eXtraction. The tagged, protected US 6,613,508 B1 59 60 RNA fragment is isolated from the degradation products. An alternative Strategy for detecting a mutation in a DNA The size Separation Step can be accomplished, for example Strand is by Substituting (during Synthesis) one of the normal by LC or HPLC. The tag is cleaved from the separated RNA nucleotides with a modified nucleotide, altering the molecu molecules, and then is detected by the respective detection lar weight or other physical parameter of the product. A technology (e.g., mass spectrometry, infrared spectrometry, Strand with an increased or decreased number of this modi potentio static amp ero metry or UV / visible fied nucleotide relative to the wild-type Sequence exhibits spectrophotometry). altered electrophoretic mobility (Naylor et al., Lancet 13. Mutation Detection Techniques The detection of diseases is increasingly important in 337:635, 1991). This technique detects the presence of a prevention and treatments. While multifactorial diseases are mutation, but does not provide the location. difficult to devise genetic tests for, more than 200 known 1O Two other Strategies visualize mutations in a DNA seg human disorders are caused by a defect in a Single gene, ment by altered gel migration. In the Single-Strand confor often a change of a single amino acid residue (Olsen, mation polymorphism technique (SSCP), mutations cause Biotechnology: An industry comes of age, National Aca denatured Strands to adopt different Secondary Structures, demic Press, 1986). Many of these mutations result in an thereby influencing mobility during native gel electrophore altered amino acid that causes a disease State. 15 Sis. Heteroduplex DNA molecules, containing internal Sensitive mutation detection techniqueS offer extraordi mismatches, can also be separated from correctly matched nary possibilities for mutation Screening. For example, molecules by electrophoresis (Orita, Genomics 5:874, 1989; analyses may be performed even before the implantation of Keen, Trends Genet. 7:5, 1991). As with the techniques a fertilized egg (Holding and Monk, Lancet 3:532, 1989). discussed above, the presence of a mutation may be deter Increasingly efficient genetic tests may also enable Screening mined but not the location. AS well, many of these tech for oncogenic mutations in cells exfoliated from the respi niques do not distinguish between a single and multiple ratory tract or the bladder in connection with health check mutations. All of the above-mentioned techniques indicate ups (Sidransky et al., Science 252:706, 1991). Also, when an the presence of a mutation in a limited Segment of DNA and unknown gene causes a genetic disease, methods to monitor Some of them allow approximate localization within the DNA sequence variants are useful to Study the inheritance of 25 Segment. However, Sequence analysis is still required to disease through genetic linkage analysis. However, detect unravel the effect of the mutation on the coding potential of ing and diagnosing mutations in individual genes poses the Segment. Sequence analysis is very powerful, allowing, technological and economic challenges. Several different approaches have been pursued, but none are both efficient for example, Screening for the Same mutation in other and inexpensive enough for truly wide-scale application. individuals of an affected family, monitoring disease pro Mutations involving a Single nucleotide can be identified gression in the case of malignant disease or for detecting in a Sample by physical, chemical, or enzymatic means. residual malignant cells in the bone marrow before autolo Generally, methods for mutation detection may be divided gous transplantation. Despite these advantages, the proce into scanning techniques, which are suitable to identify dure is unlikely to be adopted as a routine diagnostic method previously unknown mutations, and techniques designed to because of the high expense involved. detect, distinguish, or quantitate known Sequence variants. 35 A large number of other techniques have been developed Several Scanning techniques for mutation detection have to analyze known Sequence variants. Automation and been developed in heteroduplexes of mismatched comple economy are very important considerations for these types mentary DNA strands, derived from wild type and mutant of analyses that may be applied, for Screening individuals Sequences, exhibit an abnormal behavior especially when and the general population. None of the techniques dis denatured. This phenomenon is exploited in denaturing and 40 cussed below combine economy and automation with the temperature gradient gel electrophoresis (DGGE and TGGE, required specificity. respectively) methods. Duplexes mismatched in even a Mutations may be identified via their destabilizing effects Single nucleotide position can partially denature, resulting in on the hybridization of short oligonucleotide probes to a retarded migration, when electrophoresed in an increasingly target sequence (see Wetmur, Crit. Rev. Biochem. Mol. Biol. denaturing gradient gel (Myers et al., Nature 313:495, 1985; 45 26:227, 1991). Generally, this technique, allele-specific oli Abrades et al., Genomics 7:463, 1990; Henco et al., Nucl. gonucleotide hybridization, involves amplification of target Acids Res. 18:6733, 1990). Although mutations may be Sequences and Subsequent hybridization with short oligo detected, no information is obtained regarding the precise nucleotide probes. An amplified product can thus be Scanned location of a mutation. Mutant forms must be further iso for many possible Sequence variants by determining its lated and Subjected to DNA sequence analysis. Alternatively, 50 hybridization pattern to an array of immobilized oligonucle RNase A may cleave a heteroduplex of an RNA probe and otide probes. However, establishing conditions that distin a target Strand at a position where the two Strands are not guish a number of other Strategies for nucleotide Sequence properly paired. The Site of cleavage can then be determined distinction all depend on enzymes to identify Sequence by electrophoresis of the denatured probe. However, some differences (Saiki, PNAS USA 86:6230, 1989; Zhang, Nucl. mutations may escape detection because not all mismatches 55 Acids Res. 19:3929, 1991). are efficiently cleaved by RNase A. Mismatched bases in a For example, restriction enzymes recognize Sequences of dupleX are also Susceptible to chemical modification. Such about 4-8 nucleotides. Based on an average G+C content, modifications can render the Strands Susceptible to cleavage approximately half of the nucleotide positions in a DNA at the Site of the mismatch or cause a polymerase to Stop in Segment can be monitored with a panel of 100 restriction a Subsequent extension reaction. The chemical cleavage 60 enzymes. AS an alternative, artificial restriction enzyme technique allows identification of a mutation in target recognition Sequences may be created around a variable Sequences of up to 2 kb and it provides information on the position by using partially mismatched PCR primers. With approximate location of mismatched nucleotide(s) (Cotton this technique, either the mutant or the wild-type Sequence et al., PNAS USA 85.4397, 1988; Ganguly et al., Nucl. Acids alone may be recognized and cleaved by a restriction Res. 18:3933, 1991). However, this technique is labor inten 65 enzyme after amplification (Chen et al., Anal. Biochem. Sive and may not identify the precise location of the muta 195:51, 1991; Levi et al., Cancer Res. 51:3497, 1991). tion. Another method exploits the property that an oligonucle US 6,613,508 B1 61 62 otide primer that is mismatched to a target Sequence at the fluorescence for example). For assays requiring very high 3' penultimate position exhibits a reduced capacity to Serve levels of Sensitivity, the probes are concentrated and mea as a primer in PCR. However, some 3' mismatches, notably Sured. G-T, are less inhibitory than others limiting its usefulneSS Other highly sensitive hybridization protocols may be are. In attempts to improve this technique, additional mis used. The methods of the present invention enable one to matches are incorporated into the primer at the third position readily assay for a nucleic acid containing a mutation from the 3' end. This results in two mismatched positions in Suspected of being present in cells, Samples, etc., i.e., a target the three 3' nucleotides of the primer hybridizing with one nucleic acid. The target nucleic acid contains the nucleotide allelic variant, and one mismatch in the third position in sequence of deoxyribonucleic acid (DNA) or ribonucleic from the 3' end when the primer hybridizes to the other acid (RNA) whose presence is of interest, and whose pres allelic variant (Newton et al., Nucl. Acids Res. 17:2503, ence or absence is to be detected for in the hybridization 1989). It is necessary to define amplification conditions that assay. The hybridization methods of the present invention Significantly favor amplification of a 1 bp mismatch. may also be applied to a complex biological mixture of DNA polymerases have also been used to distinguish nucleic acid (RNA and/or DNA). Such a complex biological allelic Sequence variants by determining which nucleotide is 15 mixture includes a wide range of eucaryotic and procaryotic added to an oligonucleotide primer immediately upstream of cells, including protoplasts, and/or other biological materials a variable position in the target Strand. which harbor polynucleotide nucleic acid. The method is A ligation assay has been developed. In this method, two thus applicable to tissue culture cells, animal cells, animal oligonucleotide probes hybridizing in immediate juxtaposi tissue, blood cells (e.g., reticulocytes, lymphocytes), plant tion on a target Strand are joined by a DNA ligase. Ligation cells, bacteria, yeasts, viruses, mycoplasmas, protozoa, is inhibited if there is a mismatch where the two oligonucle fungi and the like. By detecting a specific hybridization otide probes abut. between nucleic acid probes of a known Source, the Specific 14. Assays for Mutation Detection presence of a target nucleic acid can be established. Atypical Mutations are a Single-base pair change in genomic DNA. hybridization assay protocol for detecting a target nucleic Within the context of this invention, most Such changes are 25 acid in a complex population of nucleic acids is described as readily detected by hybridization with oligonucleotides that follows: Target nucleic acids are Separated by size on an LC are complementary to the Sequence in question. In the or HPLC, cloned and isolated, sub-divided into pools, or left System described here, two oligonucleotides are employed as a complex population. Within one embodiment of the to detect a mutation. One oligonucleotide possesses the invention methods are provided for determining the identity wild-type Sequence and the other oligonucleotide possesses of a nucleic acid molecule, or for detecting a Selecting the mutant Sequence. When the two oligonucleotides are nucleic acid molecule, in, for example a biological Sample, used as probes on a wild-type target genomic Sequence, the utilizing the general techniques of hybridization assayS. wild-type oligonucleotide will form a perfectly based paired Briefly, Such methods generally comprise the steps of target Structure and the mutant oligonucleotide Sequence will form nucleic acids being cloned and isolated, Sub-divided into a duplex with a single base pair mismatch. AS discussed 35 pools, or left as a complex population. The target nucleic above, a 6 to 7 C. difference in the Tm of a wild type versus acids are hybridized with tagged oligonucleotide probes mismatched duplex permits the ready identification or dis under conditions described above. The target nucleic acids crimination of the two types of duplexes. To effect this are separated according to Size by LC or HPLC. The tags are discrimination, hybridization is performed at the Tm of the cleaved from the Separated fragments, and then the tags are mismatched duplex in the respective hybotropic Solution. 40 detected by the respective detection technology (e.g., mass The extent of hybridization is then measured for the set of Spectrometry, infrared spectrometry, potentioStatic amper oligonucleotide probes. When the ratio of the extent of ometry or UV/visible spectrophotometry). hybridization of the wild-type probe to the mismatched 15. Sequencing by Hybridization probe is measured, a value to 10/1 to greater than 20/1 is DNA sequence analysis is conventionally performed by obtained. These types of results permit the development of 45 hybridizing a primer to target DNA and performing chain robust assays for mutation detection. extensions using a polymerase. Specific Stops are controlled For exemplary purposes, one assay format for mutation by the inclusion of a dideoxynucleotide. The specificity of detection utilizes target nucleic acid (e.g., genomic DNA) priming in this type of analysis can be increased by includ and oligonucleotide probes that span the area of interest. The ing a hybotrope in the annealing buffer and/or incorporating oligonucleotide probes are greater or equal to 24 nt in length 50 an abasic residue in the primer and annealing at a discrimi (with a maximum of about 36 nt) and labeled with a nating temperature. fluorochrome at the 3' or 5' end of the oligonucleotide probe. Within one embodiment of the invention methods are The target nucleic acid is obtained via the lysis of tissue provided for determining the identity of a nucleic acid culture cells, tissues, organisms, etc., in the respective molecule, or for detecting a Selecting nucleic acid molecule, hybridization solution. The lysed solution is then heated to 55 in, for example a biological Sample, utilizing the general a temperature that denatures the target nucleic acid (15–25 techniques of Sequencing by hybridization using the Sanger C. above the Tm of the target nucleic acid duplex). The method. Briefly, Such methods generally comprise the Steps oligonucleotide probes are added at the denaturation of hybridizing a tagged primer to target DNA and perform temperature, and hybridization is conducted at the Tm of the ing chain extensions using a polymerase. Specific Stops are mismatched duplex for 0.5 to 24 hours. The genomic DNA 60 controlled by the inclusion of a dideoxynucleotide that may is then collected and by passage through a GF/C (GF/B, and also be tagged. The target nucleic acids are separated accord the like) glass fiber filter. The filter is then washed with the ing to size by HPLC. The tags are cleaved from the separated respective hybridization Solution to remove any non fragments, and are detected by the respective detection hybridized oligonucleotide probes (RNA, short oligos and technology (e.g., mass spectrometry, infrared spectrometry, nucleic acid does not bind to glass fiber filters under these 65 potentio static amp ero me try or UV / visible conditions). The hybridization oligo probe can then be spectrophotometry). Other sequence analysis methods thermally eluted from the target DNA and measured (by involve hybridization of the target with an assortment of US 6,613,508 B1 63 64 random, Short oligonucleotides. The Sequence is constructed from the Separated fragments, and then the tags are detected by overlap hybridization analysis. In this technique, precise by the respective detection technology (e.g., mass hybridization is essential. Use of hybotropes or abasic Spectrometry, infrared Spectrophotometry, potentioStatic residues and annealing at a discriminating temperature is amperometry or UV/visible spectrophotometry. Recent beneficial for this technique to reduce or eliminate mis advances in the OLA assay have allowed for the analysis matched hybridization. The goal is to develop automated multiple Samples and multiple mutations concurrently. hybridization methods in order to probe large arrays of (Baron et al., Nature Biotechnology 87:1279, 1996.) Briefly, oligonucleotide probes or large arrays of nucleic acid the method consists of amplifying the gene fragment con Samples. Applications of Such technologies include gene taining the mutation of interest with PCR. The PCR product mapping, clone characterization, medical genetics and gene is then hybridized with a common and two allele-Specific discovery, DNA sequence analysis by hybridization, and oligonucleotide probes (one containing the mutation while finally, Sequencing verification. Many parameters must be the other does not) such that the 3' ends of the allele-specific controlled in order to automate or multiplex oligonucleotide probes are immediately adjacent to the 5' end of the common probes. The stability of the respective probes must be probe. This Sets up a competitive hybridization-ligation Similar, the degree of mismatch with the target nucleic acid, 15 process between the two allelic probes and the common the temperature, ionic Strength, the A+T content of the probe probe at each locus. The thermostable DNA ligase then (or target), as well as other parameters when the probe is discriminates between Single-base mismatches at the junc short (i.e., 6 to 50 nucleotides) should be similar. Usually, tion site, thereby producing allele-Specific ligation products. the conditions of the experiment and the Sequence of the The common probe is labeled with one of four fluorophores probe are adjusted until the formation of the perfectly and the allele-Specific probes are each labeled with one or based-paired probe is thermodynamically favored over the more pentaethyleneoxide mobility modifying tails which any duplex that contains a mismatch. Very large-scale appli provide a sizing difference between the different allele cations of probes Such as Sequencing by hybridization Specific probes. The Samples are then Separated by gel (SBH), or testing highly polymorphic loci Such as the cystic electrophoresis based upon the length of the modifying tails fibrosis trans-membrane protein locus require a more Strin 25 and detected by the fluorescent tag on the common probe. gent level of control of multiplexed probes. Within one Through the use in Sizing differences on the allele-Specific embodiment of the invention methods are provided for probes and four fluorophores available for the common determining the identity of a nucleic acid molecule, or for probe, many Samples can be analyzed on one lane of the detecting a Selecting nucleic acid molecule, in, for example electrophoretic gel. Within one embodiment of the invention a biological Sample, utilizing the general techniques of methods are provided for determining the identity of a Sequencing by hybridization. Briefly, Such methods gener nucleic acid molecule, or for detecting a Selecting nucleic ally comprise of hybridizing a Series of tagged primers to a acid molecule, in, for example a biological Sample, utilizing DNA target or a series of target DNA fragments under the technique of oligonucleotide ligation assay for concur carefully controlled conditions. The target nucleic acids are rent multiple Sample detection. Briefly, Such methods gen Separated according to Size by HPLC. The tags are then 35 erally comprise the Steps of performing PCR on the target cleaved from the Separated fragments, and detected by the DNA followed by hybridization with the common probe respective detection technology (e.g., mass spectrometry, (untagged) and two allele-specific probes tagged according infrared spectrometry, potentiostatic amperometry or to the Specifications of the invention. The Sample is incu UV/visible spectrophotometry). bated with DNA ligase and fragments separated by, for 16. Oligonucleotide-ligation ASSay 40 example, preferably by LC or HPLC. The tags are cleaved Oligonucleotide-ligation assay is an extension of PCR from the Separated fragments, and then the tags are detected based Screening that uses an ELISA-based assay (OLA, by the respective detection technology (e.g., mass Nickerson et al., Proc. Natl. Acad. Sci. USA 1587:8923, Spectrometry, infrared Spectrophotometry, potentioStatic 1990) to detect the PCR products that contain the target amperometry or UV/visible spectrophotometry. Sequence. Thus, both gel electrophoresis and colony hybrid 45 17. Differential Display ization are eliminated. Briefly, the OLA employs two adja a. Overview cent oligonucleotides: a “reporter” probe (tagged at the 5' Mammals, such as human beings, have about 100,000 end) and a 5'-phosphorylated/3'-biotinylated “anchor" different genes in their genome, of which only a Small probe. The two oligonucleotides, which are complementary fraction, perhaps 15%, are expressed in any individual cell. to Sequences internal to the PCR primers, are annealed to 50 The choice of genes expressed determine the biochemical target DNA and, if there is perfect complementarity, the two character of any given cell or tissue. The process of normal probes are ligated by T4 DNA ligase. Capture of the cellular growth and differentiation, as well as the pathologi biotinylated anchor probe on immobilized streptavidin and cal changes that arise in diseases like cancer, are all driven analysis for the covalently linked reporter probe test for the by changes in gene expression. Differential display methods presence or absence of the target Sequences among the PCR 55 permits the identification of genes Specifically expressed in products. Within one embodiment of the invention methods individual cell types. are provided for determining the identity of a nucleic acid The differential display technique amplifies the 3' termi molecule, or for detecting a Selecting nucleic acid molecule, nal portions of corresponding cDNAS by using a primer in, for example a biological Sample, utilizing the technique designed to bind to the 5' boundary of a poly(A)tail and of oligonucleotide ligation assay. Briefly, Such methods 60 primers of, arbitrary Sequence that bind upstream. Amplified generally comprise the Steps of performing PCR on the populations with each primer pair are visualized by a size target DNA followed by hybridization with the 5' tagged separation method (PAGE, HPLC, etc.), allowing direct ireporteri DNA probe and a 5" phosphorylated/non comparison of the mRNAS between two biological samples biotinylated probe. The sample is incubated with T4 DNA of interest. The differential display method has the potential ligase. The DNA Strands with ligated probes can be sepa 65 to visualize all the expressed genes (about 10,000 to 15,000 rated from the DNA with non-ligated probes by, for mRNA species) in a mammalian cell and enables sequence example, preferably by LC or HPLC. The tags are cleaved analysis. It is possible to compare: (1) the total number of US 6,613,508 B1 65 66 peaks amplified in the parents, (2) the number of polymor incomplete hybridization of the primer leads to Smearing of phic peaks between parents, and (3) the Segregation ratios of bands on the gels. The optimal concentration of RNA is polymorphic peaks in the progeny of crosses in animals or 200-300 ng per cDNA synthesis. plants. Differential display is also used for the identification CMST-based differential display is performed essentially of up- and down-regulated genes, known or unknown, after as previously described (Liang and Pardee, Science a variety of stimuli. Differential display PCR fragments can 257:967–971, 1992) except for the design of primers used be used as probes for cDNA cloning (discovering an for reverse transcription and amplification Steps, and the unknown gene from a cDNA or genomic library). choice of radiolabeled nucleotide. A complete differential Briefly, the steps in differential display are as follows: 1) display analysis of the cDNAs from two biological samples RNA is isolated from biological sample of interest. Total of interest using nine downstream primers and 24 upstream RNA, cytoplasmic RNA or mRNA can be used. 2) first primers would generate 9x24x2 CMST-based differential Strand cDNAS are generated using an anchored oligodT display reactions Amplification products can be separated by (oligodTdN, where N is A, C, or G). 3) Amplification of HPLC and reamplified if desired. cDNA using oligodTdN and short primers with arbitrary Following incubation of the RNA at 65 C. for 5 minutes, Sequence. For a complete differential display analysis of two 15 Samples are chilled on ice, added to the reverse transcription cell populations or two Samples of interest, 9 different mix and incubated for 60 minutes at 37° C. followed by 95° primers are required. The detection limit of differential C. for 5 minutes. Duplicate cDNA samples are then ampli display for a specific mRNA is less than 0.001% of the total fied using the same 5'-primer in combination with a Series of mRNA population. 13 mers of arbitrary but defined sequence: H-AP: AAGCT Because of the Simplicity, Sensitivity, and reproducibility TCGACTGT (Seq. ID No. 2), H-AP: AAGCTTTGGTCAG of the method disclosed here, the CMST differential display (Seq. ID No. 3), H-AP4: AAGCTTCTCAACG (Seq. ID No. method is a Significant advance over traditional gel based 4), H-AP5: AAGCTTAGTAGGC (Seq. ID No. 5). systems. With the CMST-based differential display analysis Amplification is performed in reaction mixes containing of two cell types, including 64x24 PCR runs can be com 0.1x volume of reverse transcription reaction, 1xPCR buffer pleted rapidly as opposed to a labor intensive, lengthy time 25 (10xPCR buffer=100 mM TRIS-HCl, 15 mM MgCl2, 10 by the traditional method. Moreover, Sequence heterogene mM KCl pH 8.3). 2 uM dNTPs, 0.2 uM RS H-T11C ity of bands isolated from differential display gels has been anchored primer, 0.2 uM appropriate arbitrary primer, 1.5 U found to be a contributing factor to the high failure rate of ExpandTM high fidelity DNA polymerase, and water to a this technique. This is completely avoided with the CMST final volume of 20 ul. Amplification of the cDNA is per based differential display methology described here. formed under the following conditions: 94° C. (1 minute) b. CMST-based Differential Display Example. followed by 40 cycles of 94° C. (30 seconds), 40° C. (2 The starting material for differential display is RNA minutes), 72° C. (30 seconds) and finished with 72° C. (5 isolated from two different populations of cells. Generally, minutes). the cells are of Similar origin, and differ with respect to their Amplification for each gene is performed with gene treatment with drugs, their being “normal” versus 35 Specific primerS Spanning a known intron/exon boundry (see “transformed”, or their expression of various introduced tables below). All amplifications are done in 20 ul volumes geneS. containing 10 mM Tris HCl pH 8.3, 1 mM NH4C1, 1.5 M Plant tissue or animal material (2–3 g) is harvested and MgCl2, 100 mM KCl, 0.125 mM NTPs, 10 ng/ml of the minced or chopped in Sterile petri dishes. The material is respective oligonucleotide primers and 0.75 units of Taq then ground to a fine powder in a precooled pestle and 40 DNA polymerase (Gibco-BRL). Cycling parameters were mortar under liquid N. 1 g of frozen powder is transferred 94 C. preheating step for 5 minutes followed by 94 C. to a 12 ml poly-propelene tubes (1 g ca. 5 ml powder) denaturing step for 1 minute, 55 C. annealing step for 2 containing 8 ml of a hot (80° C.) 1:1 mixture of RNA minutes, and a 72 C. extension step for 30 seconds to 1 extraction buffer (100 mM Tris-HCl (pH 8.0), 100 mM LiCl, minute and a final extension at 72 C. for 10 minutes. 10 mM EDTA, 1.0% LiDS) and phenol (base vol. 4 ml). 45 Amplification cycles are generally 30–45 in number. The sample is mixed (vortexed) at high speed for at least 30 Amplification products are gel purified (Zhen and Swank, Seconds. A volume of chloroform is added (4 ml) and again BioTechniques 14:894-898, 1993) on 1% agarose gels run mixed and spun for 20 mins in a centrifuge at 5000 to in 0.04 M Tris-acetate, 0.001M EDTA (1xTEA) buffer and 10,000xg. The aqueous phase is transferred to a new 12 ml Stained with ethidium bromide. A trough is cut just in front tube and 1/3 volume of cold 8M LiCl is added to precipitate 50 of the band of interest and filled with 50–200ul of 10% PEG the RNA (3 h at 0°C.). The RNA is centrifuged at 0°C. for in 1xTAE buffer. Electrophoresis is continued until the band 20 mins and resuspended in 1 ml H2O. Residual genomic has completely entered the trough. The contents are then DNA is removed by treating the RNA sample with DNase I. removed and extracted with phenol, cholorform extracted, Reverse transcription is performed on each RNA sample and precipitated in 0.1 volume of 7.5 M ammonium acetate using 500 ng of DNA free RNA in 1x reverse transcription 55 and 2.5 volumes of 100% EtOH. Samples are washed with buffer, 10 mM DTT, 20 uM dNTPs, 0.2 uM 5'RS H-T11C 75% EtOH and briefly dried at ambient temperature. Quan (one base anchored primer with 5' restriction site), 200 U titation of yield is done by electrophoresis of a Small aliquot MoMuLV reverse transcriptase and 1.5U RNA Guard per 20 on 1% agarose gel in 1XTBE buffer with ethidium bromide All reaction Volume. Staining and comparison to a known Standard. Although using a downstream primer reduces the number 60 The products from the amplification reactions are ana of cDNA Subfractions to three, it does not reduce the number lyzed by HPLC. HPLC ias carried out using automated of PCR reactions required to display most of the cDNA HPLC instrumentation (Rainin, Emeryville, Calif., or Species present in the pool. On the contrary, it decreases the Hewlett Packard, Palo Alto, Calif.). Unpurified DNA fin theoretical chance of identifying those cDNA species which gerprinting products which are denatured for 3 minutes at 95 are present. The best results are obtained using a combina 65 prior into injection into an HPLC are eluted with linear tion of nine different primers of the type DMO-VV, where V acetonitrile (ACN, J.T. Baker, N.J.) gradient of 1.8%/minute can be A.G.C but not T. With a T in the terminal 3' position, at a flow rate of 0.9 ml/minute. The start and end points are US 6,613,508 B1 67 68 adjusted according to the size of the amplified products. The 3. Capillary Electrophoresis (CE) temperature required for the Successful resolution of the Capillary electrophoresis (CE) in its various manifesta molecules generated during the DNA fingerprinting tech tions (free Solution, isotachophoresis, isoelectric focusing, nique is 50° C. The effluent from the HPLC is then directed polyacrylamide gel, micellar electrokinetic into a mass spectrometer (Hewlett Packard, Palo Alto, "chromatography”) is developing as a method for rapid high Calif.) for the detection of tags. resolution Separations of very Small Sample Volumes of Comparison of the chromatograms (mass spectrometry complex mixtures. In combination with the inherent Sensi based) indicates that bands at 220 bp and 468 bp are tivity and selectivity of MS, CE-MS is a potential powerful observed in the stimulated Jurkat cells and not observed in the unstimulated Jurkat cells. technique for bioanalysis. In the novel application disclosed C. Separation of Nucleic Acid Fragments herein, the interfacing of these two methods will lead to A Sample that requires analysis is often a mixture of many Superior DNA sequencing methods that eclipse the current components in a complex matrix. For Samples containing rate methods of Sequencing by Several orders of magnitude. unknown compounds, the components must be separated The correspondence between CE and electrospray ioniza from each other So that each individual component can be tion (ESI) flow rates and the fact that both are facilitated by identified by other analytical methods. The Separation prop 15 (and primarily used for) ionic species in Solution provide the erties of the components in a mixture are constant under basis for an extremely attractive combination. The combi constant conditions, and therefore once determined they can nation of both capillary Zone electrophoresis (CZE) and be used to identify and quantify each of the components. capillary isotachophoresis with quadrapole mass spectrom Such procedures are typical in chromatographic and elec eters based upon ESI have been described (Olivares et al., trophoretic analytical Separations. Anal. Chem. 59:1230, 1987; Smith et al., Anal. Chem. 1. High-Performance Liquid Chromatography (HPLC) 60:436, 1988; Loo et al., Anal. Chem. 179:404, 1989; High-Performance liquid chromatography (HPLC) is a Edmonds et al., J. Chroma. 474:21, 1989; Loo et al., J. chromatographic Separations technique to Separate com Microcolumn Sep. 1:223, 1989; Lee et al., J. Chromatog. pounds that are dissolved in solution. HPLC instruments 458:313, 1988; Smith et al., J. Chromatog. 480:211, 1989; consist of a reservoir of mobile phase, a pump, an injector, 25 Grese et al., J. Am. Chem. Soc. 111:2835, 1989). Small a separation column, and a detector. Compounds are Sepa peptides are easily amenable to CZE analysis with good rated by injecting an aliquot of the Sample mixture onto the (femtomole) sensitivity. column. The different components in the mixture pass The most powerful separation method for DNA fragments through the column at different rates due to differences in their partitioning behavior between the mobile liquid phase is polyacrylamide gel electrophoresis (PAGE), generally in and the Stationary phase. a slab gel format. However, the major limitation of the Recently, IP-RO-HPLC on non-porous PS/DVB particles current technology is the relatively long time required to with chemically bonded alkyl chains have been shown to be perform the gel electrophoresis of DNA fragments produced rapid alternatives to capillary electrophoresis in the analysis in the Sequencing reactions. An increase magnitude (10 of both Single and double-Strand nucleic acids providing fold) can be achieved with the use of capillary electrophore similair degrees of resolution (Huber et al, Anal. Biochem. 35 sis which utilize ultrathin gels. In free Solution to a first 212:351, 1993; Huber et al., 1993, Nuc. Acids Res. 21:1061; approximation all DNA migrate with the same mobility as Huber et al., Biotechniques 16:898, 1993). In contrast to the addition of a base results in the compensation of mass ion-excahnge chromoatrography, which does not always and charge. In polyacrylamide gels, DNA fragments Sieve retain double-strand DNA as a function of strand length and migrate as a function of length and this approach has (Since AT base pairs intereact with the positively charged 40 now been applied to CE. Remarkable plate number per Stationary phase, more strongly than GC base-pairs), IP-RP meter has now been achieved with cross-linked polyacryla HPLC enables a strictly size-dependent separation. mide (107 plates per meter, Cohen et al., Proc. Natl. Acad. A method has been developed using 100 mM triethylam Sci., USA 85:9660, 1988). Such CE columns as described monium acetate as ion-pairing reagent, phosphodiester oli can be employed for DNA sequencing. The method of CE is gonucleotides could be Successfully Separated on alkylated 45 in principle 25 times faster than slab gel electrophoresis in non-porous 2.3 um poly(styrene-divinylbenzene) particles a Standard Sequencer. For example, about 300 bases can be by means of high performance liquid chromatography read per hour. The Separation Speed is limited in Slab gel (Oefner et al., Anal. Biochem. 223:39, 1994). The technique electrophoresis by the magnitude of the electric field which described allowed the separation of PCR products differing can be applied to the gel without excessive heat production. only 4 to 8 base pairs in length within a Size range of 50 to 50 Therefore, the greater speed of CE is achieved through the 200 nucleotides. use of higher field strengths (300 V/cm in CE versus 10 2. Electrophoresis V/cm in Slab gel electrophoresis). The capillary format Electrophoresis is a separations technique that is based on reduces the amperage and thus power and the resultant heat the mobility of ions (or DNA as is the case described herein) generation. in an electric field. Negatively charged DNA charged 55 Smith and others (Smith et al., Nuc. Acids. Res. 18:4417, migrate towards a positive electrode and positively-charged 1990) have Suggested employing multiple capillaries in ions migrate toward a negative electrode. For Safety reasons parallel to increase throughput. Likewise, Mathies and one electrode is usually at ground and the other is biased Huang (Mathies and Huang, Nature 359:167, 1992) have positively or negatively. Charged species have different introduced capillary electrophoresis in which Separations are migration rates depending on their total charge, Size, and 60 performed on a parallel array of capillaries and demon shape, and can therefore be separated. An electrode appa Strated high through-put sequencing (Huang et al., Anal. ratus consists of a high-voltage power Supply, electrodes, Chem. 64.967, 1992, Huang et al., Anal. Chem. 64:2149, buffer, and a Support for the buffer Such as a polyacrylamide 1992). The major disadvantage of capillary electrophoresis gel, or a capillary tube. Open capillary tubes are used for is the limited amount of Sample that can be loaded onto the many types of Samples and the other gel Supports are usually 65 capillary. By concentrating a large amount of Sample at the used for biological Samples Such as protein mixtures or DNA beginning of the capillary, prior to Separation, loadability is fragments. increased, and detection levels can be lowered Several orders US 6,613,508 B1 69 70 of magnitude. The most popular method of preconcentration electrophoresis by miniaturizing the lane dimension to about in CE is Sample Stacking. Sample Stacking has recently been 100 micrometers. The electronics industry routinely uses reviewed (Chien and Burgi, Anal. Chem. 64:489A, 1992). microfabrication to make circuits with features of less than Sample Stacking depends of the matrix difference, (pH, ionic one micron in size. The current density of capillary arrays is strength) between the sample buffer and the capillary buffer, limited the outside diameter of the capillary tube. Micro So that the electric field acroSS the Sample Zone is more than fabrication of channels produces a higher density of arrayS. in the capillary region. In Sample Stacking, a large Volume of Microfabrication also permits physical assemblies not poS Sample in a low concentration buffer is introduced for sible with glass fibers and links the channels directly to other preconcentration at the head of the capillary column. The devices on a chip. Few devices have been constructed on capillary is filled with a buffer of the same composition, but microchips for Separation technologies. A gas chromato at higher concentration. When the Sample ions reach the graph (Terry et al., IEEE Trans. Electron Device, capillary buffer and the lower electric field, they stack into ED-26:1880, 1979) and a liquid chromatograph (Manz et al., a concentrated Zone. Sample Stacking has increased detect Sens. Actuators B1:249, 1990) have been fabricated on abilities 1-3 orders of magnitude. Silicon chips, but these devices have not been widely used. Another method of preconcentration is to apply isota 15 Several groups have reported Separating fluorescent dyes chophoresis (ITP) prior to the free Zone CE separation of and amino acids on microfabricated devices (Manz et al., J. analytes. ITP is an electrophoretic technique which allows Chromatography 593:253, 1992, Effenhauser et al., Anal. microliter Volumes of Sample to be loaded on to the capil Chem. 65:2637, 1993). Recently Woolley and Mathies lary. in contrast to the low nL injection Volumes typically (Woolley and Mathies, Proc. Natl. Acad. Sci. 91: 11348, asSociated with CE. The technique relies on inserting the 1994) have shown that photolithography and chemical etch Sample between two buffers (leading and trailing ing can be used to make large numbers of Separation electrolytes) of higher and lower mobility respectively, than channels on glass Substrates. The channels are filled with the analyte. The technique is inherently a concentration hydroxyethyl cellulose (HEC) separation matrices. It was technique, where the analytes concentrate into pure Zones shown that DNA restriction fragments could be separated in migrating with the same Speed. The technique is currently 25 as little as two minutes. less popular than the Stacking methods described above D. Cleavage of Tags because of the need for Several choices of leading and As described above, different linker designs will confer trailing electrolytes, and the ability to Separate only cationic cleavability (“lability”) under different specific physical or or anionic Species during a separation process. chemical conditions. Examples of conditions which Serve to The heart of the DNA sequencing process is the remark cleave various designs of linker include acid, base, ably Selective electrophoretic Separation of DNA or oligo oxidation, reduction, fluoride, thiol eXchange, photolysis, nucleotide fragments. It is remarkable because each frag and enzymatic conditions. ment is resolved and differs by only nucleotide. Separations Examples of cleavable linkers that Satisfy the general of up to 1000 fragments (1000 bp) have been obtained. A criteria for linkers listed above will be well known to those further advantage of Sequencing with cleavable tags is as 35 in the art and include those found in the catalog available follows. There is no requirement to use a slab gel format from Pierce (Rockford, Ill.). Examples include: when DNA fragments are separated by polyacrylamide gel ethylene glycobis(Succinimidylsuccinate) (EGS), an electrophoresis when cleavable tags are employed. Since amine reactive croSS-linking reagent which is cleavable numerous samples are combined (4 to 2000) there is no need by hydroxylamine (1 M at 37° C. for 3–6 hours); to run Samples in parallel as is the case with current 40 disuccinimidyl tartarate (DST) and sulfo-DST, which are dye-primer or dye-terminator methods (i.e., ABI373 amine reactive cross-linking reagents, cleavable by Sequencer). Since there is no reason to run parallel lanes, 0.015 M Sodium periodate; there is no reason to use a slab gel. Therefore, one can bis(2-(succinimidyloxycarbonyloxy)ethylsulfone employ a tube gel format for the electrophoretic Separation (BSOCOES) and sulfo-BSOCOES, which are amine method. Grossman (Grossman et al., Genet. Anal. Tech. 45 reactive cross-linking reagents, cleavable by base (pH Appl. 9:9, 1992) have shown that considerable advantage is 11.6); gained when a tube gel format is used in place of a slab gel 1,4-di-3'-(2-pyridyldithio(propionamido))butane format. This is due to the greater ability to dissipate Joule (DPDPB), a pyridyldithiol crosslinker which is cleav heat in a tube format compared to a slab gel which results in able by thiol eXchange or reduction; faster run times (by 50%), and much higher resolution of 50 high molecular weight DNA fragments (greater than 1000 N-4-(p-azidosalicylamido)-butyl-3'-(2-pyridydithio) nt). Long reads are critical in genomic sequencing. propionamide (APDP), a pyridyldithiol crosslinker Therefore, the use of cleavable tags in Sequencing has the which is cleavable by thiol eXchange or reduction; additional advantage of allowing the user to employ the bis-beta-4-(azidosalicylamido)ethyl-disulfide, a photo most efficient and sensitive DNA separation method which 55 reactive crosslinker which is cleavable by thiol also possesses the highest resolution. eXchange or reduction; 4. Microfabricated Devices N-Succinimidyl-(4-azidophenyl)-1,3'dithiopropionate Capillary electrophoresis (CE) is a powerful method for (SADP), a photoreactive crosslinker which is cleavable DNA sequencing, forensic analysis, PCR product analysis by thiol eXchange or reduction; and restriction fragment sizing. CE is far faster than tradi 60 Sulfo Succinimidyl-2-(7-azido-4-methylcoumarin-3- tional slab PAGE since with capillary gels a far higher acetamide)ethyl-1,3'-dithiopropionate: (SAED), a pho potential field can be applied. However, CE has the draw toreactive crosslinker which is cleavable by thiol back of allowing only one sample to be processed per gel. eXchange or reduction; The method combines the faster separations times of CE SulfoSuccinimidyl-2-(m-azido-o-nitrobenzamido)-ethyl with the ability to analyze multiple Samples in parallel. The 65 1,3'dithiopropionate (SAND), a photore active underlying concept behind the use of microfabricated crosslinker which is cleavable by thiol eXchange or devices is the ability to increase the information density in reduction. US 6,613,508 B1 71 72 Other examples of cleavable linkers and the cleavage commercial Sources of instruments for photochemical cleav conditions that can be used to release tags are as follows. A age are Aura Industries Inc. (Staten Island, N.Y.) and Agre silyl linking group can be cleaved by fluoride or under acidic netics (Wilmington, Mass.). Cleavage of the linkers results conditions. A 3-, 4-, 5-, or 6-Substituted-2-nitrobenzyloxy or in liberation of a primary amide on the tag. Examples of 2-, 3-, 5-, or 6-Substituted-4-nitrobenzyloxy linking group photocleavable linkers include nitrophenyl glycine esters, can be cleaved by a photon Source (photolysis). A 3-, 4-, 5-, eXO- and endo-2-benzonorborneyl chlorides and methane or 6-Substituted-2-alkoxyphenoxy or 2-, 3-, 5-, or Sulfonates, and 3-amino-3(2-nitrophenyl) propionic acid. 6-Substituted-4-alkoxyphenoxy linking group can be Examples of enzymatic cleavage include esterases which cleaved by Ce(NH) (NO) (oxidation). A NCO will cleave ester bonds, nucleaseS which will cleave phos (urethane) linker can be cleaved by hydroxide (base), acid, phodiester bonds, proteases which cleave peptide bonds, etc. or LiAlH4 (reduction). A 3-pentenyl, 2-butenyl, or 1-butenyl E. Detection of Tags linking group can be cleaved by O, O,O/IO, or KMnO, Detection methods typically rely on the absorption and (oxidation). A 2-3-, 4-, or 5-Substituted-furyloxy linking emission in Some type of Spectral field. When atoms or group can be cleaved by O, Br, MeOH, or acid. molecules absorb light, the incoming energy excites a quan Conditions for the cleavage of other labile linking groups 15 tized Structure to a higher energy level. The type of excita include: t-alkyloxy linking groups can be cleaved by acid; tion depends on the wavelength of the light. Electrons are methyl(dialkyl)methoxy or 4-substituted-2-alkyl-1,3- promoted to higher orbitals by ultraviolet or visible light, dioxlane-2-yl linking groups can be cleaved by HO"; molecular vibrations are excited by infrared light, and rota 2-silylethoxy linking groups can be cleaved by fluoride or tions are excited by microwaves. An absorption spectrum is acid; 2-(X)-ethoxy (where X=keto, ester amide, cyano, NO2, the absorption of light as a function of wavelength. The Sulfide, Sulfoxide, Sulfone) linking groups can be cleaved Spectrum of an atom or molecule depends on its energy level under alkaline conditions, 2-, 3-, 4-, 5-, or 6-Substituted Structure. Absorption spectra are useful for identification of benzyloxy linking groups can be cleaved by acid or under compounds. Specific absorption spectroscopic methods reductive conditions, 2-butenyloxy linking groups can be include atomic absorption spectroscopy (AA), infrared spec cleaved by (Ph-P)-RhCl(H), 3-, 4-, 5-, or 6-substituted-2- 25 troscopy (IR), and UV-Vis spectroscopy (uV-vis). bromophenoxy linking groups can be cleaved by Li, Mg, or Atoms or molecules that are excited to high energy levels BuLi; methylthiomethoxy linking groups can be cleaved by can decay to lower levels by emitting radiation. This light Hg"; 2-(X)-ethyloxy (where X=a halogen) linking groups emission is called fluorescence if the transition is between can be cleaved by Zn or Mg, 2-hydroxyethyloxy linking States of; the same Spin, and phosphorescence if the transi groups can be cleaved by oxidation (e.g., with Pb(OAc)). tion occurs between States of different Spin. The emission Preferred linkers are those that are cleaved by acid or intensity of an analyte is linearly proportional to concentra photolysis. Several of the acid-labile linkers that have been tion (at low concentrations), and is useful for quantifying the developed for Solid phase peptide Synthesis are useful for emitting Species. Specific emission spectroscopic methods linking tags to MOIs. Some of these linkers are described in include atomic emission spectroscopy (AES), atomic fluo a recent review by Lloyd-Williams et al. (Tetrahedron 35 rescence spectroscopy (AFS), molecular laser-induced fluo 49:11065-11133, 1993). One useful type of linker is based rescence (LIF), and X-ray fluorescence (XRF). upon p-alkoxybenzyl alcohols, of which two, When electromagnetic radiation passes through matter, 4-hydroxy methylphenoxyacetic acid and 4-(4- most of the radiation continues in its original direction but hydroxymethyl-3-methoxyphenoxy)butyric acid, are com a Small fraction is Scattered in other directions. Light that is mercially available from Advanced ChemTech (Louisville, 40 Scattered at the same wavelength as the incoming light is Ky.). Both linkers can be attached to a tag via an ester called Rayleigh Scattering. Light that is Scattered in trans linkage to the benzylalcohol, and to an amine-containing parent solids due to vibrations (phonons) is called Brillouin MOI via an amide linkage to the carboxylic acid. Tags linked Scattering. Brillouin Scattering is typically shifted by 0.1 to by these molecules are released from the MOI with varying 1 wave number from the incident light. Light that is Scat concentrations of trifluoroacetic acid. The cleavage of these 45 tered due to vibrations in molecules or optical phonons in linkers results in the liberation of a carboxylic acid on the opaque Solids is called Raman Scattering. Raman Scattered tag. Acid cleavage of tags attached through related linkers, light is shifted by as much as 4000 wavenumbers from the Such as 2,4-dimethoxy-4'-(carboxymethyloxy)- incident light. Specific Scattering Spectroscopic methods benzhydrylamine (available from Advanced ChemTech in include Raman spectroscopy. FMOC-protected form), results in liberation of a carboxylic 50 IR spectroScopy is the measurement of the wavelength amide on the released tag. and intensity of the absorption of mid-infrared light by a The photolabile linkers useful for this application have sample. Mid-infrared light (2.5-50 um, 4000-200 cm) is also been for the most part developed for Solid phase peptide energetic enough to excite molecular vibrations to higher synthesis (see Lloyd-Williams review). These linkers are energy levels. The wavelength of IR absorption bands are usually based on 2-nitrobe in Zyle Sters or 55 characteristic of Specific types of chemical bonds and IR 2-nitrobenzylamides. Two examples of photolabile linkers Spectroscopy is generally most useful for identification of that have recently been reported in the literature are 4-(4- organic and organometallic molecules. (1-Fmoc-amino)ethyl)-2-methoxy-5-nitrophenoxy)butanoic Near-infrared absorption spectroscopy (NIR) is the mea acid (Holmes and Jones, J. Org. Chem. 60:2318-2319, Surement of the wavelength and intensity of the absorption 1995) and 3-(Fmoc-amino)-3-(2-nitrophenyl)propionic acid 60 of near-infrared light by a Sample. Near-infrared light spans (Brown et al., Molecular Diversity 1:4–12, 1995). Both the 800 nm-2.5 um (12,500-4000 cm) range and is linkers can be attached via the carboxylic acid to an amine energetic enough to excite overtones and combinations of on the MOI. The attachment of the tag to the linker is made molecular vibrations to higher energy levels. NIR spectros by forming an amide between a carboxylic acid on the tag copy is typically used for quantitative measurement of and the amine on the linker. Cleavage of photolabile linkers 65 organic functional groups, especially O-H, N-H, and is usually performed with UV light of 350 nm wavelength at C=O. The components and design of NIR instrumentation intensities and times known to those in the art. Examples of are Similar to uV-Vis absorption spectrometers. The light US 6,613,508 B1 73 74 Source is usually a tungsten lamp and the detector is usually be produced in pulses and/or extracted in pulses. A pulsed a PbS solid-state detector. Sample holders can be glass or electric field accelerates all ions into a field-free drift region quartz and typical Solvents are CC1 and CS. The conve with a kinetic energy of qV, where q is the ion charge and nient instrumentation of NIR spectroScopy makes it Suitable V is the applied Voltage. Since the ion kinetic energy is 0.5 for on-line monitoring and process control. mV, lighter ions have a higher velocity than heavier ions Ultraviolet and Visible Absorption Spectroscopy (uV-vis) and reach the detector at the end of the drift region Sooner. Spectroscopy is the measurement of the wavelength and The output of an ion detector is displayed on an oscilloscope intensity of absorption of near-ultraviolet and visible light as a function of time to produce the mass spectrum. by a Sample. Absorption in the vacuum UV occurs at The ion formation process is the Starting point for mass 100–200 nm; (10-50,000 cm) quartz UV at 200–350 nm; Spectrometric analyses. Chemical ionization is a method that (50,000–28,570 cm) and visible at 350–800 nm; (28, employs a reagent ion to react with the analyte molecules 570-12,500 cm) and is described by the Beer-Lambert (tags) to form ions by either a proton or hydride transfer. The Bouguet law. UltraViolet and Visible light are energetic reagent ions are produced by introducing a large excess of enough to promote outer electrons to higher energy levels. methane (relative to the tag) into an electron impact (EI) ion UV-Vis spectroScopy can be usually applied to molecules 15 Source. Electron collisions produce CH and CH." which and inorganic ions or complexes in Solution. The uV-Vis further react with methane to form CHs and C.H.". spectra are limited by the broad features of the spectra. The Another method to ionize tags is by plasma and glow light Source is usually a hydrogen or deuterium lamp for uV discharge. Plasma is a hot, partially-ionized gas that effec measurements and a tungsten lamp for visible measure tively excites and ionizes atoms. A glow discharge is a ments. The wavelengths of these continuous light Sources low-pressure plasma maintained between two electrodes. are Selected with a wavelength separator Such as a prism or Electron impact ionization employs an electron beam, usu grating monochromator. Spectra are obtained by Scanning ally generated from a tungsten filament, to ionize gas-phase the wavelength separator and quantitative measurements can atoms or molecules. An electron from the beam knocks an be made from a spectrum or at a Single wavelength. electron off analyte atoms or molecules to create ions. Mass spectrometers use the difference in the mass-to 25 Electrospray ionization utilizes a very fine needle and a charge ratio (m/z) of ionized atoms or molecules to separate Series of SkimmerS. A Sample Solution is sprayed into the them from each other. Mass spectrometry is therefore useful Source chamber to form droplets. The droplets carry charge for quantitation of atoms or molecules and also for deter when the exit the capillary and as the Solvent vaporizes the mining chemical and Structural information about mol droplets disappear leaving highly charged analyte mol ecules. Molecules have distinctive fragmentation patterns ecules. ESI is particularly useful for large biological mol that provide Structural information to identify compounds. ecules that are difficult to vaporize or ionize. Fast-atom The general operations of a mass spectrometer are as fol bombardment (FAB) utilizes a high-energy beam of neutral lowS. Gas-phase ions are created, the ions are Separated in atoms, typically Xe or Ar, that Strikes a Solid Sample causing Space or time based on their mass-to-charge ratio, and the desorption and ionization. It is used for large biological quantity of ions of each mass-to-charge ratio is measured. 35 molecules that are difficult to get into the gas phase. FAB The ion Separation power of a mass spectrometer is causes little fragmentation and usually gives a large molecu described by the resolution, which is defined as R=m/delta lar ion peak, making it useful for molecular weight deter m, where m is the ion mass and delta m is the difference in mination. The atomic beam is produced by accelerating ions mass between two resolvable peaks in a mass spectrum. For from an ion Source though a charge-exchange cell. The ions example, a mass spectrometer with a resolution of 1000 can 40 pick up an electron in collisions with neutral atoms to form resolve an ion with a m/z of 100.0 from an ion with a m/z. a beam of high energy atoms. Laser ionization (LIMS) is a of 100.1. method in which a laser pulse ablates material from the In general, a mass spectrometer (MS) consists of an ion Surface of a Sample and creates a microplasma that ionizes Source, a mass-Selective analyzer, and an ion detector. The Some of the Sample constituents. Matrix-assisted laser des magnetic-sector, quadrupole, and time-of-flight designs also 45 orption ionization (MALDI) is a LIMS method of vaporiz require extraction and acceleration ion optics to transfer ions ing and ionizing large biological molecules Such as proteins from the Source region into the mass analyzer. The details of or DNA fragments. The biological molecules are dispersed Several mass analyzer designs (for magnetic-sector MS, in a Solid matrix Such as nicotinic acid. A UV laser pulse quadrupole MS or time-of-flight MS) are discussed below. ablates the matrix which carries. Some of the large molecules Single Focusing analyzers for magnetic-sector MS utilize a 50 into the gas phase in an ionized form So they can be extracted particle beam path of 180, 90, or 60 degrees. The various into a mass spectrometer. Plasma-desorption ionization (PD) forces influencing the particle Separate ions with different utilizes the decay of Cf which produces two fission mass-to-charge ratios: With double-focusing analyzers, an fragments that travel in opposite directions. One fragment electroStatic analyzer is added in this type of instrument to Strikes the Sample knocking out 1-10 analyte ions. The other Separate particles with difference in kinetic energies. 55 fragment Strikes a detector and triggers the Start of data A quadrupole mass filter for quadrupole MS consists of acquisition. This ionization method is especially useful for four metal rods arranged in parallel. The applied Voltages large biological molecules. Resonance ionization (RIMS) is affect the trajectory of ions traveling down the flight path a method in which one or more laser beams are tuned in centered between the four rods. For given DC and AC resonance to transitions of a gas-phase atom or molecule to Voltages, only ions of a certain mass-to-charge ratio pass 60 promote it in a Stepwise fashion above its ionization poten through the quadrupole filter and all other ions are thrown tial to create an ion. Secondary ionization (SIMS) utilizes an out of their original path. A mass spectrum is obtained by ion beam; such as He', 'O", or "Ar"; is focused onto the monitoring the ions passing through the quadrupole filter as Surface of a Sample and Sputters material into the gas phase. the Voltages on the rods are varied. Spark Source is a method which ionizes analytes in Solid A time-of-flight mass spectrometer uses the differences in 65 Samples by pulsing an electric current across two electrodes. transit time through a “drift region' to Separate ions of A tag may become charged prior to, during or after different masses. It operates in a pulsed mode So ions must cleavage from the molecule to which it is attached. Ioniza US 6,613,508 B1 75 76 tion methods based on ion “desorption', the direct formation have put forth hypotheses (Blades et al., Anal. Chem. or emission of ions from Solid or liquid Surfaces have 63:2109-14, 1991; Kebarle et al., Anal. Chem. 65:A972–86, allowed increasing application to nonvolatile and thermally 1993; Fenn, J. Am. Soc. Mass. Spectrom. 4:524–35, 1993). labile compounds. These methods eliminate the need for Regardless of the ultimate process of ion formation, ESI neutral molecule Volatilization prior to ionization and gen produces charged molecules from Solution under mild con erally minimize thermal degradation of the molecular spe ditions. cies. These methods include field desorption (Becky, Prin The ability to obtain useful mass spectral data on Small amounts of an organic molecule relies on the efficient ciples of Field Ionization and Field Desorption Mass production of ions. The efficiency of ionization for ESI is Spectrometry, Pergamon, Oxford, 1977), plasma desorption related to the extent of positive charge associated with the (Sundqvist and Macfarlane, Mass Spectrom. Rev. 4:421, molecule. Improving ionization experimentally has usually 1985), laser desorption (Karas and Hillenkamp, Anal. Chem. involved using acidic conditions. Another method to 60:2299, 1988; Karas et al., Angew. Chem. 101:805, 1989), improve ionization has been to use quaternary amines when fast particle bombardment (e.g., fast atom bombardment, possible (see Aebersold et al., Protein Science 1:494-503, FAB, and secondary ion mass spectrometry, SIMS, Barber et 1992; Smith et al., Anal. Chem. 60:436-41, 1988). al., Anal. Chem. 54:645A, 1982), and thermospray (TS) 15 Electrospray ionization is described in more detail as ionization (Vestal, Mass Spectrom. Rev. 2:447, 1983). The follows. Electrospray ion production requires two steps: mospray is broadly applied for the on-line combination with dispersal of highly charged droplets at near atmospheric liquid chromatography. The continuous flow FAB methods preSSure, followed by conditions to induce evaporation. A (Caprioli et al., Anal. Chem. 58:2949, 1986) have also Solution of analyte molecules is passed through a needle that shown significant potential. A more complete listing of is kept at high electric potential. At the end of the needle, the ionization/mass spectrometry combinations is ion-trap mass Solution disperses into a mist of Small highly charged Spectrometry, electrospray ionization mass Spectrometry, droplets containing the analyte molecules. The Small drop ion-spray maSS Spectrometry, liquid ionization mass lets evaporate quickly and by a process of field desorption or Spectrometry, atmospheric pressure ionization maSS residual evaporation, protonated protein molecules are Spectrometry, electron ionization mass spectrometry, meta 25 released into the gas phase. An electrospray is generally Stable atom bombardment ionization mass spectrometry, fast produced by application of a high electric field to a Small atom bombard ionization mass spectrometry, MALDI mass flow of liquid (generally 1-10 uL/min) from a capillary tube. Spectrometry, photo-ionization time-of-flight mass A potential difference of 3-6 kV is typically applied between the capillary and counter electrode located 0.2–2 cm away Spectrometry, laser droplet mass spectrometry, MALDI (where ions, charged clusters, and even charged droplets, TOF mass spectrometry, APCI maSS Spectrometry, nano depending on the extent of desolvation, may be Sampled by Spray mass spectrometry, nebulised Spray ionization mass the MS through a small orifice). The electric field results in Spectrometry, chemical ionization mass spectrometry, reso charge accumulation on the liquid Surface at the capillary nance ionization mass spectrometry, Secondary ionization terminus; thus the liquid flow rate, resistivity, and surface mass Spectrometry, thermospray maSS Spectrometry. tension are important factors in droplet production. The high The ionization methods amenable to nonvolatile biologi 35 electric field results in disruption of the liquid Surface and cal compounds have overlapping ranges of applicability. formation of highly charged liquid droplets. Positively or Ionization efficiencies are highly dependent on matrix com negatively charged droplets can be produced depending position and compound type. Currently available results upon the capillary bias. The negative ion mode requires the indicate that the upper molecular mass for TS is about 8000 presence of an electron Scavenger Such as oxygen to inhibit daltons (Jones and Krolik, Rapid Comm. Mass Spectrom. 40 electrical discharge. 1:67, 1987). Since TS is practiced mainly with quadrapole A wide range of liquids can be sprayed electrostatically mass Spectrometers, Sensitivity typically Suffers disporpor into a vacuum, or with the aid of a nebulizing agent. The use tionately at higher mass-to-charge ratios (m/z). Time-of of only electric fields for nebulization leads to Some prac flight (TOF) mass spectrometers are commercially available tical restrictions on the range of liquid conductivity and and possess the advantage that the m/z range is limited only 45 dielectric constant. Solution conductivity of less than 10 by detector efficiency. Recently, two additional ionization ohms is required at room temperature for a Stable electro methods have been introduced. These two methods are now Spray at useful liquid flow rates corresponding to an aqueous referred to as matrix-assisted laser desorption (MALDI, electrolyte solution of <10" M. In the mode found most Karas and Hillenkamp, Anal. Chem. 60:2299, 1988; Karas et useful for ESI-MS, an appropriate liquid flow rate results in al., Angew. Chem. 101:805, 1989) and electrospray ioniza 50 dispersion of the liquid as a fine mist. A short distance from tion (ESI). Both methodologies have very high ionization the capillary the droplet diameter is often quite uniform and efficiency (i.e., very high molecular ions produced/ on the order of 1 lim. Of particular importance is that the molecules consumed). Sensitivity, which defines the ulti total electrospray ion current increases only slightly for mate potential of the technique, is dependent on Sample-size, higher liquid flow rates. There is evidence that heating is quantity of ions, flow rate, detection efficiency and actual 55 useful for manipulating the electrospray. For example, Slight ionization efficiency. heating allows aqueous Solutions to be readily Electrospray-MS is based on an idea first proposed in the electrosprayed, presumably due to the decreased Viscosity 1960s (Dole et al., J. Chem. Phys. 49:2240, 1968). Electro and Surface tension. Both thermally-assisted and gas spray ionization (ESI) is one means to produce charged nebulization-assisted electrosprays allow higher liquid flow molecules for analysis by mass spectroScopy. Briefly, elec 60 rates to be used, but decrease the extent of droplet charging. trospray ionization produces highly charged droplets by The formation of molecular ions requires conditions effect nebulizing liquids in a strong electroStatic field. The highly ing evaporation of the initial droplet population. This can be charged droplets, generally formed in a dry bath gas at accomplished at higher pressures by a flow of dry gas at atmospheric pressure, shrink by evaporation of neutral Sol moderate temperatures (<60 C.), by heating during trans vent until the charge repulsion overcomes the cohesive 65 port through the interface, and (particularly in the case of ion forces, leading to a "Coulombic explosion'. The exact trapping methods) by energetic collisions at relatively low mechanism of ionization is controversial and Several groups preSSure. US 6,613,508 B1 77 78 Although the detailed processes underlying ESI remain The DNA fragments can be collected by several methods. uncertain, the very Small droplets produced by ESI appear to The first method is that of use of an electric field wherein allow almost any Species carrying a net charge in Solution to DNA fragments are collected onto or near an electrode. A be transferred to the gas phase after evaporation of residual second method is that wherein the DNA fragments are Solvent. Mass spectrometric detection then requires that ions collected by flowing a Stream of liquid past the bottom of a have a tractable m/z range (<4000 daltons for quadrupole gel. Aspects of both methods can be combined wherein DNA instruments) after desolvation, as well as to be produced and collected into a flowing Stream which can be later concen transmitted with sufficient efficiency. The wide range of trated by use of an electric field. The end result is that DNA fragments are removed from the milieu under which the solutes already found to be amenable to ESI-MS, and the Separation method was performed. That is, DNA fragments lack of Substantial dependence of ionization efficiency upon can be "dragged' from one Solution type to another by use molecular weight, Suggest a highly non-discriminating and of an electric field. broadly applicable ionization process. Once the DNA fragments are in the appropriate Solution The electrospray ion “Source” functions at near atmo (compatible with electrospray and mass spectrometry) the Spheric pressure. The electrospray “Source' is typically a tag can be cleaved from the DNA fragment. The DNA metal or glass capillary incorporating a method for electri 15 fragment (or remnants thereof) can then be separated from cally biasing the liquid Solution relative to a counter elec the tag by the application of an electric field (preferably, the trode. Solutions, typically water-methanol mixtures contain tag is of opposite charge of that of the DNA tag). The tag is ing the analyte and often other additives Such as acetic acid, then introduced into the electrospray device by the use of an flow to the capillary terminus. An ESI source has been electric field or a flowing liquid. described (Smith et al., Anal. Chem, 62:885, 1990) which Fluorescent tags can be identified and quantitated most can accommodate essentially any Solvent System. Typical directly by their absorption and fluorescence emission wave flow rates for ESI are 1-10 uL/min. The principal require lengths and intensities. ment of an ESI-MS interface is to sample and transportions While a conventional spectrofluorometer is extremely from the high pressure region into the MS as efficiently as flexible, providing continuous ranges of excitation and emis possible. 25 Sion wavelengths (lx, ls, 1s), more Specialized instruments The efficiency of ESI can be very high, providing the basis Such as flow cytometers and laser-Scanning microScopes for extremely Sensitive measurements, which is useful for require probes that are excitable at a single fixed wave the invention described herein. Current instrumental perfor length. In contemporary instruments, this is usually the mance can provide a totalion current at the detector of about 488-nm line of the argon laser. 2x10' A or about 107 counts/s for singly charged species. Fluorescence intensity per probe molecule is proportional On the basis of the instrumental performance, concentra to the product of e and QY. The range of these parameters tions of as low as 10' M or about 10 mol/s of a singly among fluorophores of current practical importance is charged species will give detectable ion current (about 10 approximately 10,000 to 100,000 cm'M' fore and 0.1 to counts/s) if the analyte is completely ionized. For example, 1.0 for QY. When absorption is driven toward Saturation by low attomole detection limits have been obtained for qua 35 high-intensity illumination, the irreversible destruction of ternary ammonium ions using an ESI interface with capil the excited fluorophore (photobleaching) becomes the factor lary Zone electrophoresis (Smith et al., Anal. Chem. limiting fluorescence detectability. The practical impact of 59:1230, 1988). For a compound of molecular weight of photobleaching depends on the fluorescent detection tech 1000, the average number of charges is 1, the approximate nique in question. number of charge States is 1, peak width (m/z) is 1 and the 40 It will be evident to one in the art that a device (an maximum intensity (ion/s) is 1x10'. interface) may be interposed between the separation and Remarkably little Sample is actually consumed in obtain detection Steps to permit the continuous operation of size ing an ESI mass spectrum (Smith et al., Anal. Chem. Separation and tag detection (in real time). This unites the 60:1948, 1988). Substantial gains might be also obtained by Separation methodology and instrumentation with the detec the use of array detectors with Sector instruments, allowing 45 tion methodology and instrumentation forming a single Simultaneous detection of portions of the Spectrum. Since device. For example, an interface is interposed between a currently only about 10 of all ions formed by ESI are Separation technique and detection by mass spectrometry or detected, attention to the factors limiting instrument perfor potentiostatic amperometry. mance may provide a basis for improved Sensitivity. It will The function of the interface is primarily the release of the be evident to those in the art that the present invention 50 (e.g., mass spectrometry) tag from analyte. There are several contemplates and accommodates for improvements in ion representative implementations of the interface. The design ization and detection methodologies. of the interface is dependent on the choice of cleavable An interface is preferably placed between the Separation linkers. In the case of light or photo-cleavable linkers, an instrumentation (e.g., gel) and the detector (e.g., mass energy or photon Source is required. In the case of an spectrometer). The interface preferably has the following 55 acid-labile linker, a base-labile linker, or a disulfide linker, properties: (1) the ability to collect the DNA fragments at reagent addition is required within the interface. In the case discreet time intervals, (2) concentrate the DNA fragments, of heat-labile linkers, an energy heat Source is required. (3) remove the DNA fragments from the electrophoresis Enzyme addition is required for an enzyme-Sensitive linker buffers and milieu, (4) cleave the tag from the DNA Such as a specific protease and a peptide linker, a nuclease fragment, (5) separate the tag from the DNA fragment, (6) 60 and a DNA or RNA linker, a glycosylase, HRP or phos dispose of the DNA fragment, (7) place the tag in a volatile phatase and a linker which is unstable after cleavage (e.g., Solution, (8) volatilize and ionize the tag, and (9) place or Similar to chemiluminescent Substrates). Other characteris transport the tag to an electrospray device that introduces the tics of the interface include minimal band broadening, tag into mass spectrometer. Separation of DNA from tags before injection into a mass The interface also has the capability of “collecting DNA 65 Spectrometer. Separation techniques include those based on fragments as they elute from the bottom of a gel. The gel electrophoretic methods and techniques, affinity techniques, may be composed of a slab gel, a tubular gel, a capillary, etc. Size retention (dialysis), filtration and the like. US 6,613,508 B1 79 80 It is also possible to concentrate the tags (or nucleic also may map to a dideoxy terminator type (ddTTP, ddCTP, acid-linker-tag construct), capture electrophoretically, and ddGTP, ddATP) by reference into which dideoxynucleotide then release into alternate reagent Stream which is compat reaction the tagged primer is placed. The Sequencing reac ible with the particular type of ionization method Selected. tion is then performed and the resulting fragments are The interface may also be capable of capturing the tags (or Sequentially Separated by Size in time. nucleic acid-linker-tag construct) on microbeads, shooting The tags are cleaved from the fragments in a temporal the bead(s) into chamber and then preforming laser frame and measured and recorded in a temporal frame. The desorption/vaporization. Also it is possible to extract in flow Sequence is constructed by comparing the tag map to the into alternate buffer (e.g., from capillary electrophoresis temporal frame. That is, all tag identities are recorded in buffer into hydrophobic buffer across a permeable time after the sizing Step and related become related to one membrane). It may also be desirable in Some uses to deliver another in a temporal frame. The sizing Step separates the tags into the mass spectrometer intermittently which would nucleic acid fragments by a one nucleotide increment and comprise a further function of the interface. Another func hence the related tag identities are Separated by a one tion of the interface is to deliver tags from multiple columns nucleotide increment. By foreknowledge of the dideoxy into a mass Spectrometer, with a rotating time slot for each 15 terminator or nucleotide map and Sample type, the Sequence column. Also, it is possible to deliver tags from a single is readily deduced in a linear fashion. column into multiple MS detectors, Separated by time, A genetic fingerprinting System of the present invention collect each Set of tags for a few milliseconds, and then consists of, in general, a Sample introduction device, a deliver to a mass spectrometer. device to Separate the tagged Samples of interest, a splitting The following is a list of representative vendors for device to deviate a variable amount of the Sample to a Separation and detection technologies which may be used in fraction collector, a device to cleave the tags from the the present invention. Hoefer Scientific Instruments (San Samples of interest, a device for detecting the tag, and a Francisco, Calif.) manufactures electrophoresis equipment Software program to analyze the data collected and display (Two StepTM, Poker FaceTM II) for sequencing applications. it in a differential display mode. It will be evident to one of Pharmacia Biotech (Piscataway, N.J.) manufactures electro 25 ordinary skill in the art when in possession of the present phoresis equipment for DNA separations and Sequencing disclosure that this general description may have many (PhastSystem for PCR-SSCP analysis, MacroPhor System variances for each of the components listed. AS best Seen in for DNA sequencing). Perkin Elmer/Applied Biosystems FIG. 15, an exemplary genetic fingerprinting System 10 of Division (ABI, Foster City, Calif.) manufactures semi the present invention consists of a Sample introduction automated sequencers based on fluorescent-dyes (ABI373 device 12, a separation device 14 that Separates the Samples and ABI377). Analytical Spectral Devices (Boulder, Colo.) by high-performance liquid chromatography (HPLC), a manufactures UV spectrometers. Hitachi Instruments Splitting device 13, a fraction collector 15, a photocleavage (Tokyo, Japan) manufactures Atomic Absorption spectrom device 16 to cleave the tags from the Samples of interest, a eters. Fluorescence Spectrometers, LC and GC Mass detection device 18 that detects the tags by means of an Spectrometers, NMR spectrometers, and UV-VIS Spectrom 35 electrochemical detector, and a data processing device 20 eters. PerSeptive Biosystems (Framingham, Mass.) pro with a data analysis Software program that analyzes the duces Mass Spectrometers (VoyagerTM Elite). Bruker Instru results from the detection device. Each component is dis ments Inc. (Manning Park, Mass.) manufactures FTIR cussed in more detail below. Spectrometers (Vector 22), FT-Raman Spectrometers, Time The Sample introduction device 12 automatically takes a of Flight Mass Spectrometers (Reflex IITM), Ion Trap Mass 40 measured aliquot 22 of the PCR products generated in the Spectrometer (Esquire"M) and a Maldi Mass Spectrometer. genetic fingerprinting procedure and delivers it through a Analytical Technology Inc. (ATI, Boston, Mass.) makes conventional tube 24 to the separation device 14 (generally Capillary Gel Electrophoresis units, UV detectors, and an HPLC). The sample introduction device 12 of the exem Diode Array Detectors. Teledyne Electronic Technologies plary embodiment consists of a temperature-controlled (Mountain View, Calif.) manufactures an Ion Trap Mass 45 autoSampler 26 that can accommodate micro-titer plates. Spectrometer (3DQ Discovery TM and the 3DQ ApogeeTM). The autoSampler 26 is temperature controlled to maintain Perkin Elmer/Applied Biosystems Division (Foster City, the integrity of the nucleic acid Samples generated and is Calif.) manufactures a Sciex Mass Spectrometer (triple able to inject 25 ul or less of sample. Manufacturers of this quadrupole LC/MS/MS, the API 100/300) which is compat type of Sample introduction device 12 are, for example, ible with electrospray. Hewlett-Packard (Santa Clara, Calif.) 50 Gilson (Middleton, Wis.). produces Mass Selective Detectors (HP 5972A), MALDI The Sample introduction device is operatively connected TOF Mass Spectrometers (HP G2025A), Diode Array in Series to the Separation device 14 by the conventional tube Detectors, CE units, HPLC units (HP1090) as well as UV 24. The PCR products in the measured aliquot 22 received Spectrometers. Finnigan Corporation (San Jose, Calif.) in the Separation device 14 are separated temporally by high manufactures mass spectrometers (magnetic Sector (MAT 95 55 performance liquid chromatography to provide Separated STM), quadrapole spectrometers (MAT 95 SQTM) and four DNA fragments. The high-performance liquid chromato other related mass spectrometers). Rainin (Emeryville, graph may have an isocratic, binary, or quaternary pump(s) Calif.) manufactures HPLC instruments. 27 and can be purchased from multiple manufacturers (e.g., The methods and compositions described herein permit Hewlett Packard (Palo Alto, Calif.) HP 1100 or 1090 series, the use of cleaved tags to Serve as maps to particular Sample 60 Beckman Instruments Inc. (800-742-2345), Bioanalytical type and nucleotide identity. At the beginning of each Systems, Inc. (800845-4246), ESA, Inc. (508) 250-700), Sequencing method, a particular (selected) primer is Perkin-Elmer Corp. (800-762-4000), Varian Instruments assigned a particular unique tag. The tags map to either a (800-926-3000), Waters Corp. (800-254-4752)). Sample type, a dideoxy terminator type (in the case of a The separation device 14 includes an analytical HPLC Sanger Sequencing reaction) or preferably both. Specifically, 65 column 28 Suitable for use to Separate the oligonucleotides. the tag maps to a primer type which in turn maps to a vector The column 28 is an analytical HPLC, for example, non type which in turn maps to a Sample identity. The tag may porous polystyrene divinylbenzene (2.2 um particle size) US 6,613,508 B1 81 82 Solid Support modified which can operate within a pH range A photocleaving unit is available from Supelco of 2 to 12, pressures of up to 3000 psi and a temperature (Bellefonte, Pa.). Photocleaving can be performed at mul range of 10 to 70° C. A temperature-control device (e.g. a tiple wavelengths with a mercury/Xenon arc lamp. The column oven) (not shown) may be used to control the wavelength accuracy is about 2 nm with a bandwidth of 10 temperature of the column. Such temperature-control 5 nm. The area irradiated is circular and typically of an area of devices are known in the art, and may be obtained from, for 10-100 square centimeters. In alternate embodiments, other cleaving devices, which cleave by acid, base, oxidation, example, Rainin Instruments (Subsidiary of Varian reduction, fluoride, thiol eXchange, photolysis, or enzymatic Instrument, Palo Alto, Calif.). A suitable column 28 is conditions, can be used to remove the tags from the Sepa available under the commercial name of DNASep(R) and is rated DNA fragments. available from Serasep (San Jose, Calif.). Other suitable After the cleaving device 16 cleaves the tags from the analytical HPLC columns are available from other manu Separated DNA fragments, the tags flow through a conven facturers (e.g., Hewlett Packard (Palo Alto, Calif.), Beckman tional tube 32 to the detection device 18 for detection of each Industries (Brea, Calif.), Waters Corp. (Milford, Mass.), and tag. Detection of the tags can be based upon the difference Supelco (Bellefonte, Pa.). in electrochemical potential between each of the tags used to The separation device 14 in the illustrated embodiment 15 label each kind of DNA generated in the PCR step. The incorporates the Sample splitter 13, and the Sample splitter is electrochemical detector 18 can operate on either coulom connected to the flowing Stream of the Sample. The Sample etric or amperometric principles. The preferred electrome Splitter 13 is adapted to divert in a conventional manner chanical detector 18 is the coulometric detector, which variable amounts of sample to the fraction collector 15 either consists of a flow-through or porous-carbon graphite for further analysis or storage. The fraction collector 15 must amperometric detector where the column eland passes be able to accommodate Small Volumes, have temperature through the electrode resulting in 100% detection efficiency. control to low temperatures, and have adjustable Sampling To fully detect each component, an array of 16 coulometric times. Manufacturers of in-line splitters include Upchurch detectors each held at a different potential (generally at 60 (Oak Harbor, Wash.). mV increments) is utilized. Examples of manufacturers of The fraction collector 15 is attached to the HPLC/LC 25 this type of detector are ESA (Bedford, Mass.) and Bioana device via a stream-splitter line 29. Fraction collectors 15 lytical Systems Inc. (800-845-4246). permit the collection of specific peaks, DNAS, RNAS, and In an alternate embodiment illustrated Schematically in nucleic acid fragments or molecules of interest into tubes, FIG. 16, the Sample introduction device 12, the Separating Wells of microtiter plates, or containers. Additionally, frac device 14, and the cleavage device 16 are Serially connected tion collectorS 15 can collect all or part of a set of nucleic as discussed above for maintaining the flow of Sample. The acid fragments separated by HPLC or LC. Manufacturers of cleavage device 16 is connected to a detection device 18 fraction collectors include Gilson (Middleton, Wis.), and which is a mass spectrometer 40 or the like that detects the Isco (Lincoln, Nebr). The use of a fraction collector 15 in tags based upon the difference in molecular weight between this technology provides considerable Substantial advan each of the tags used to label each kind of DNA generated tages over gel based Systems. For example, it is possible to 35 in the PCR step. The best detector based upon differences in directly clone nucleic acids fragments recovered by HPLC mass is the mass Spectrometer. For this use, the mass or LC methods. In addition, it is possible to amplify nucleic Spectrometer 40 will typically have an atmospheric preSSure acids fragments recovered by HPLC or LC methods by PCR. ionization (API) interface with either electrospray or chemi These two methods permit the rapid identification of nucleic cal ionization, a quadrupole mass analyzer, and a mass range acid fragments of interest on a Sequence level. Both methods 40 of at least 50 to 2600 m/z. Examples of manufacturers of a are tedious and ineffective when used in conjunction with suitable mass spectrometer are: Hewlett Packard (Palo Alto, gel-based Systems. Calif.) HP 1100 LC/MSD, Hitachi Instruments (San Jose, In the illustrated embodiment, the fraction collector 15 is Calif.), M-1200H LC/MS, Perkin Elmer Corporation, an individual component of the genetic fingerprinting Sys Applied Biosystems Division (Foster City, Calif.) API 100 tem 10. In an alternate embodiment (not shown), the fraction 45 LC/MS or API 300 LC/MS/MS, Finnigan Corporation (San collector 15 is incorporated in the Sample introduction Jose, Calif.) LCQ, MAT 95 S, Bruker Analytical Systems, device 12. Accordingly, the Stream-splitter line 32 directs the Inc. (Billerica, Mass.) APEX, BioAPEX, and ESQUIRE and diverted sample from the sample splitter 13 back to the Micromass (U.K.). Sample introduction device 12. The detection device 18 is electrically connected to a data A stream of the separated DNA fragments (e.g., Sequenc 50 processor and analyzer 20 that receives data from the ing reaction products) flows through a conventional tube 30 detection device. The data processor and analyzer 20 from the Separation device 14 downstream of the Sample includes a Software program that identifies the detected tag. splitter 13 to the cleavage device 16. Each of the DNA The data processor and analyzer 230 in alternate embodi fragments is labeled with a unique cleavable (e.g., ments is operatively connected to the injection device 12, the photocleavable) tag. The flowing stream of separated DNA 55 separation device 14, the fraction collector 15, and/or the fragments pass through or past the cleaving device 16, where cleaving device 16 to control the different components of the the tag is removed for detection (e.g., by mass spectrometry genetic fingerprinting System 10. or with an electrochemical detector). In the exemplary The Software package maps the electrochemical Signature embodiment, the cleaving device 16 is a photocleaving unit of a given tag to a Specific primer, and a retention time. Such that flowing Stream of Sample is exposed to Selected 60 Software generated nucleic acid profiles are then compared light energy and wave length. In one embodiment, the (length to length, fragment to fragment) and the results are Sample enters the photocleaving unit 16 and is exposed to reported to the user. The software will highlight both simi the Selected light Source for a Selected duration of time. In larities and differences in the nucleic acid fragment profiles. an alternate embodiment, the flowing Stream of Sample is The Software will also be able to direct the collection of carried adjacent to the light Source along a path that provides 65 Specific nucleic acid fragments by the fraction collector 15. a Sufficient exposure to the light energy to cleave the tags The Software package maps the m/z Signature of a given from the separated DNA fragments. tag to a specific primer, and a retention time. Software US 6,613,508 B1 83 84 generated nucleic acid profiles are then compared (length to control the temperature of the column. Such temperature length, fragment to fragment) and the results are reported to control devices are known in the art, and may be obtained the user. The Software will highlight both similarities and from, for example, Rainin Instruments (Subsidiary of Varian differences in the nucleic acid fragment profiles. The Soft Instrument, Palo Alto, Calif.). A suitable column 28 is ware will also be able to direct the collection of specific available under the commercial name of DNASep(R) and is nucleic acid fragments by the fraction collector. available from Serasep (San Jose, Calif.). Other suitable The system 18 in accordance with the present invention is analytical HPLC columns are available from other manu provided by operatively interconnecting the System's mul facturers (e.g., Hewlett Packard (Palo Alto, Calif.) tiple components. Accordingly, one or more System (Beckman Industries (Brea, Calif.), Waters Corp. (Milford, components, Such as the Sample introducing device 12 and Mass.), and Supelco (Bellefonte, Pa.). the detecting device 18 that, are in operation in a lab can be In the illustrated embodiment, the fraction collector 15 is combined with the System's other components, (e.g., the an individual component of the differential display System Separating device 14, cleaving device 16, and the data 10 that is coupled to the System's other components. In an processor and analyzer 20 in order to equip the lab with the alternate embodiment, the fraction collector 15 is incorpo DNA sequencing system 10 of the present invention. 15 rated in the Sample introduction device 12. Accordingly, the Another embodiment of the present invention provides a stream-splitter line 32 directs the diverted sample from the differential display System which consists of, in general, a Sample splitter 13 back to the Sample introduction device 12. Sample introduction device, a device to Separate the tagged The separation device 14 in the illustrated embodiment Samples of interest, a splitting device to deviate a variable incorporates the Sample Splitter 13 that is connected to the amount of the sample to a fraction collector, a unit to cleave flowing stream of the sample. The sample splitter 13 is the tags from the Samples of interest, a device for detecting adapted to divert in a conventional manner variable amounts the tag, and a Software program to analyze the data collected of sample to the fraction collector 15 either for further and display it in a differential display mode. It will be analysis or storage. The fraction collector 15 must be able to evident to one of ordinary skill in possession of the present accommodate Small Volumes, have temperature control to disclosure that the general description may have many 25 low temperatures, and have adjustable Sampling times. variances for each of the components listed. The differential Manufacturers of in-line splitters include Upchurch (Oak display System of an exemplary embodiments of the present Harbor, Wash.). invention consists of Similar components illustrated in FIG. A stream of the separated DNA fragments flow through a 15, including the Sample introduction device 12, the Sepa conventional tube 30 from the separation device 14 down ration device 14 for Separating the Samples by high stream of the sample splitter 113 to the cleavage device 16. performance liquid chromatography (HPLC), the splitting Each of the PCR products is labeled with a unique cleavable device 13, the fraction collector 15, the photocleavage (e.g., photocleavable) tag. The flowing Stream of Separated device 16 to cleave the tags from the Samples of interest, the DNA fragments pass through or past the cleaving device 16 detection device 18 for detection of the tags by where the tag is removed for detection by electrochemical electrochemistry, and the data processor and analyzer 20 35 detection. In the exemplary embodiment, the cleaving with a Software program. Each component is discussed in device 16 is a photocleaving unit Such that flowing Stream of more detail below. Sample is exposed to Selected light energy. In one In the differential display System, the Sample introduction embodiment, the Sample enters the photocleaving unit 16 device 12 automatically takes a measured aliquot 22 of the and is exposed to the Selected light Source for a Selected PCR product generated in the differential display procedure 40 duration of time. In an alternate embodiment, the flowing and delivers it through a conventional tube 24 to the Sepa Stream of Sample is carried in a Suitable tube portion or the ration device 14 (generally an HPLC). The sample intro like adjacent to the light Source along a path that provides a duction device 12 of the exemplary embodiment consists of Sufficient exposure to the light Source to cleave the tags from a temperature-controlled autoSampler 26 that can accommo the Separated DNA fragments. date micro-titer plates. The autoSampler 26 must be tem 45 A photocleaving unit is available from Supelco perature controlled to maintain the integrity of the nucleic (Bellefonte, Pa.). Photocleaving can be performed at mul acid Samples generated and be able to inject 25 ul or less of tiple wavelengths with a mercury/Xenon arc lamp. The Sample. Manufacturers of this type of product are wavelength accuracy is about 2 nm with a bandwidth of 10 represented, for example, by Gilson (Middleton, Wis.). nm. The area irradiated is circular and typically of an area of The Sample introduction device is operatively connected 50 10-100 square centimeters. In alternate embodiments, other in Series to the Separation device by the conventional tube cleaving devices, which cleave by acid, base, oxidation, 24. The PCR products in the measured aliquot 22 received reduction, fluoride, thil eXchange, photolysis, or enzymatic in the Separation device 14 are separated temporally by high conditions, can be used to remove the tags from the Sepa performance liquid chromatography to provide Separated rated DNA fragments. DNA fragments. The high-performance liquid chromato 55 After the cleaving device 16 cleaves the tags from the graph may have an isocratic, binary, or quaternary pump(s) Separated DNA fragments, the tags flow through a conven 27 and can be purchased from multiple manufacturers (e.g., tional tube 32 to the detection device 18 for detection of each Hewlett Packard (Palo Alto, Calif.) HP 1100 or 1090 series, tag. Detection of the tags is based upon the difference in Analytical Technology Inc. (Madison, Wis.), Perkin Elmer, electrochemical potential between each of the tags used to Waters, etc.). The separation device 14 includes an analyti 60 label each kind of DNA generated in the PCR step. The cal HPLC column 28 suitable for use to separate the oligo electrochemical detector 18 can operate on either coulom nucleotides. The column 28 is an analytical HPLC, for etric or amperometric principles. The preferred electrome example, non-porous polystyrene divinylbenzene (2.2 lim chanical detector 18 is the coulometric detector, which particle size) Solid Support modified which can operate consists of a flow-through or porous-carbon graphite within a pH range of 2 to 12, pressures of up to 3000 psi and 65 amperometric detector where the column eluent passes a temperature range of 10 to 70° C. A temperature-control through the electrode resulting in 100% detection efficiency. device (e.g., a column oven) (not shown) may be used to To fully detect each component, an array of 16 coulometric US 6,613,508 B1 85 86 detectors each held at a different potential (generally at 60 device for detecting the tag, and a Software program to mV increments) is utilized. The manufacturers of this type analyze the data collected. It will be evident to one of of detector include ESA (Bedford, Mass.) and Bioanalytical ordinary skill in the art when in possession of the present Systems Inc. (800-845-4246). disclosure that the general description may have many In an alternate embodiment of the differential display variances for each of the components listed. AS best Seen in system illustrated schematically in FIG. 16, the sample FIG. 17, (need Figure if) a preferred single-nucleotide exten introduction device 12, the Separating device 14, and the Sion assay, oligo-ligation assay or oligonucleotide-probe cleavage device 16 are Serially connected as discussed above based assay System 200 consists of a Sample introduction for maintaining the flow of Sample. The cleavage device 16 device 212, a separation device 214 that Separates the is connected to a detection device 18 that detects the tags Samples by high-performance liquid chromatography, a based upon the difference in molecular weight between each cleaving device 216 to cleave the tags from the Samples of of the tags used to label each kind of DNA generated in the interest, a detection device 218 of the tags by mass PCR step. The best detector based upon differences in mass Spectrometry, and a data processor and analyzer 220 which is the mass spectrometer 40. For this use, the mass Spec includes a Software program. Each component is discussed trometer will typically have an atmospheric pressure ion 15 in more detail below. ization (API) interface with either electrospray or chemical The sample introduction device 212 automatically takes a ionization, a quadrupole mass analyzer, and a mass range of measured aliquot 222 of the nucleic acid fragment generated at least 50 to 2600 m/z. Examples of manufacturers of a by a variety of methods (PCR, ligations, digestion, suitable mass spectrometer are: Hewlett Packard (Palo Alto, nucleases, etc.) and delivers it through a conventional tube Calif.) HP 1100 LC/MSD, Hitachi Instruments (San Jose, 224 to the separation device 214 (generally an HPLC). The Calif.), M-1200H LC/MS, JEOL, USA, Inc. (Peabody, Sample introduction device 212 of the exemplary embodi Mass.), Perkin Elmer Corporation, Applied Biosystems ment consists of a temperature-controlled autoSampler 226 Division (Foster City, Calif.) API 100 LC/MS or API 300 that can accommodate micro-titer plates. The autoSampler LC/MS/MS, Finnigan Corporation (San Jose, Calif.) LCQ, 226 must be temperature controlled to maintain the integrity MAT 95 S, MAT 95 SO, MAT 900 S, MAT 900 SO, and 25 of the nucleic acid Samples generated and be able to inject SSQ 7000, Bruker Analytical Systems, Inc. (Billerica, 25 ul or less of Sample. Manufacturers of this product are Mass.) APEX, BioAPEX, and ESQUIRE. represented, for example, by Gilson (Middleton Wis.). The detection device 18 is electrically connected to a data The Sample introduction device is operatively connected processor and analyzer 20 that receives data from the in Series to the Separation device by the conventional tube detection device. The data processor and analyzer 20 224. The nucleic acid products (which may be produced by includes a Software program that identifies the detected tag PCR, ligation reactions, digestion, nucleases, etc.) in the and its position in the DNA sequence. The data processor measured aliquot 222 receive in the Separation device 214 and analyzer 20 in alternate embodiments is operatively are separated temporally by high performance liquid chro connected to the injection device 12, the Separation device matography. The high-performance liquid chromatograph 14, the fraction collector 15, and/or the cleaving device 16 35 may have an isocratic, binary, or quaternary pump(s) 227 to control the different components of the differential display and can be purchased from multiple manufacturers (e.g., System. Hewlett Packard (Palo Alto, Calif.) HP 1100 or 1090 series, The Software package maps the Signature of a given tag to Beckman Instruments Inc. (800-742-2345), Bioanalytical a specific primer, and a retention time. Software generated Systems, Inc. (800-845-4246), ESA, Inc. (508) 250-700), nucleic acid profiles are then compared (length to length, 40 Perkin-Elmer Corp. (800-762-4000), Varian Instruments fragment to fragment) and the results are reported to the (800-926-3000), Waters Corp. (800-254-4752)). user. The software will highlight both similarities and dif The separations device 214 includes an analytical HPLC ferences in the nucleic acid fragment profiles. The Software column 228 Suitable for use to Separate the nucleic acid will also be able to direct the collection of specific nucleic fragments. The column 228 is an analytical HPLC, for acid fragments by the fraction collector 15. 45 example, non-porous polystyrene divinylbenzene (2.2 lim The Software package maps the m/z Signature of a given particle size) Solid Support which can operate within a pH tag to a specific primer, and a retention time. Software range of 2 to 12, preSSures of up to 3000 psi and a generated nucleic acid profiles are then compared (length to temperature range of 10 to 70° C. A temperature-control length, fragment to fragment) and the results are reported to device (e.g., a column oven) (not shown) may be used to the user. The Software highlights both similarities and dif 50 control the temperature of the column. Such temperature ferences in the nucleic acid fragment profiles. The Software control devices are known in the art, and may be obtained is also able to direct the collection of Specific nucleic acid from, for example, Rainin Instruments (Subsidiary of Varian fragments by the fraction collector. Instrument, Palo Alto, Calif.). A suitable column 228 is The differential display system is provided by operatively available under the commercial name of DNASep(R) and is interconnecting the System's multiple components. 55 available from Serasep (San Jose, Calif.). A wide variety of Accordingly, one or more System components, Such as the HPLC columns 228 can be used for this particular techno Sample introducing device 12 and the detecting device 18 logical unit Since Single-base pair resolution is not neces that are in operation in a lab can be combined with the sarily required. Other suitable analytical HPLC columns are System's other components, e.g., the Separating device 14, available from other manufacturers (e.g., Hewlett Packard cleaving device 16, and the data processor and analyzer 20, 60 (Palo Alto, Calif.), Beckman Instruments, Inc. (Brea, Calif.), in order to equip the lab with a System in accordance with and Waters Corp. (Milford, Mass.)). the present invention. A stream of the separated DNA fragments (e.g., Sequenc Single nucleotide extension assay, oligo-ligation assay or ing reaction product) flows through a conventional tube 230 oligonucleotide probe based assay Systems of the present from the separation device 214 to the cleavage device 216. invention consist of, in general, a Sample introduction 65 Each of the DNA fragments is labeled with a unique device, a device to Separate the tagged Samples of interest, cleavable (e.g., photocleavable) tag. The flowing Stream of a device to cleave the tags from the Samples of interest, a Separated DNA fragments pass through or past the cleaving US 6,613,508 B1 87 88 device 216 where the tag is removed for detection by mass forms to clean, no gel to pour), and the analysis time (greater Spectrometry or with a electrochemical detector. The pho than 4 hours for a large gel versus much shorter times (5 tocleaving unit is available from Supelco (Bellefonte, Pa.). minutes to an hour) for an HPLC analysis Additionally, there Photocleaving can be performed at multiple wavelengths is a Sample throughput advantage. By utilizing the tags with a mercury/Xenon arc lamp. The wavelength accuracy is described in this invention, many Samples can be analyzed about 2 nm with a bandwidth of 10 nm. The area irradiated in one batch (potentially 384 SampleS/lane) whereas the is circular and typically of an area of 10-100 Square centi gel-based analyses are limited to the 4 fluorophores avail meters. In alternate embodiments, other cleaving devices, able or one Sample/lane. The gels used are inherently which cleave by acid, base, oxidation, reduction, fluoride, delicate and can easily break or contain an air bubble or thiol eXchange, photolysis, or enzymatic conditions, can be other flaw rendering the whole gel or Several lanes useleSS. used to remove the tags from the Separated DNA fragments. HPLC columns are rugged and, when purchased pre-packed, After the cleaving device 216 cleaves the tags from the are free of packing defects creating a consistent, generally Separated DNA fragments, the tags flow through a conven uniform separation path. The HPLC systems also lend tional tube 232 to the detection device 218 for detection of towards better quality assurance in that internal Standards each tag. Detection of the tags can be based upon the 15 can be utilized due to the reproducibility of the HPLC difference in molecular weight between each of the tags used columns. Gel quality is inconsistent both between gels as to label each kind of DNA generated in the various assay well as within a gel making use of Standards nearly impos Steps. The best detector based upon differences in mass is the Sible. Finally, both the mass Spectrometry and electrochemi mass spectrometer. For this use, the mass spectrometer will cal detectors are more Sensitive than the detectors utilized in typically have an atmospheric pressure ionization (API) the gel based Systems allowing for lower limits of detection interface with either electrospray or chemical ionization, a and analysis of less sample which would be useful for quadrupole mass analyzer, and a mass range of at least 50 to non-PCR based analyses. 2600 m/z. Examples of manufacturers of a suitable mass Tagged Probes in Array-Based ASSayS spectrometer are: Hewlett Packard (Palo Alto, Calif.) HP Arrays with covalently attached oligonucleotides have 1100 LC/MSD, Hitachi Instruments (San Jose, Calif.) 25 been made used to perform DNA sequence analysis by M-1200H LC/MS, Perkin Elmer Corporation, Applied Bio hybridization (Southern et al., Genomics 13: 1008, 1992; systems Division (Foster City, Calif.) API 100 LC/MS or Drmanac et al., Science 260: 1649, 1993), determine expres API 300 LC/MS/MS, Finnigan Corporation (San Jose, Sion profiles, Screen for mutations and the like. In general, Calif.) LCQ, Bruker Analytical Systems, Inc. (Billerica, detection for these assays uses fluorescent or radioactive Mass.), ESQUIRE, and Micromers (U.K). labels. Fluorescent labels can be identified and quantitated In an alternate embodiment illustrated Schematically in most directly by their absorption and fluorescence emission FIG. 18, the sample introduction device 212, the separating wavelengths and intensity. A microScope/camera Setup using device 214, and the cleavage device 216 are serially con a fluorescent light Source is a convenient means for detecting nected as discussed above for maintaining the flow of fluorescent label. Radioactive labels may be visualized by Sample. The cleavage device 216 is connected to a detection 35 Standard autoradiography, phosphor image analysis or CCD device 218, which is an electrochemical detector 240 that detector. For Such labels the number of different reactions detects the tags based upon the difference in electrochemical that can be detected at a single time is limited. For example, potential between each of the tags used to label each kind of the use of four fluorescent molecules, Such as commonly DNA generated in the Sequencing reaction Step. The elec employed in DNA sequence analysis, limits anaylsis to four trochemical detector 240 of the exemplary embodiment can 40 Samples at a time. ESSentially, because of this limitation, operate on either coulometric or amperometric principles. each reaction must be individually assessed when using The preferred electrochemical detector 240 is the coulom these detector methods. etric detector, which consists of a flow-through or porous A more advantageous method of detection allows pooling carbon graphite amperometric detector where the column of the Sample reactions on at least one array and Simulta eluent passes through the electrode resulting in 100% detec 45 neous detection of the products. By using a tag, Such as the tion efficiency. To fully detect each component, an array of ones described herein, having a different molecular weight 16 coulometric detectors each held at a different potential or other physical attribute in each reaction, the entire Set of (generally at 60 mV increments) is utilized. Examples of reaction products can be harvested together and analyzed. manufacturers of this type of detector are ESA (Bedford, AS noted above, the methods described herein are appli Mass.) and Bioanalytical Systems Inc. (800-845-4246). 50 cable for a variety of purposes. For example, the arrays of Additional manufacturers of electrochemical detectors can oligonucleotides may be used to control for quality of be found in the list of other manufacturers found below. making arrays, for quantitation or qualitative analysis of The electrochemical detector 240 is electrically connected nucleic acid molecules, for detecting mutations, for deter to the data processor and analyzer 220 with the software mining expression profiles, for toxicology testing, and the package discussed above. The Software package maps the 55 like. detected property (e.g., the mass or electrochemical 1. Probe Quantitation or Typing Signature) of a given tag to a specific sample ID. The In this embodiment, oligonucleotides are immobilized per Software will be able to identify the nucleic acid fragment of element in an array where each oligonucleotide in the interest and load the ID information into respective data element is a different or related Sequence. Preferably, each bases. 60 element possesses a known or related Set of Sequences. The The DNA analysis systems described herein have numer hybridization of a labeled probe to Such an array permits the ous advantages over the traditional gel based Systems. One characterization of a probe and the identification and quan of the principal advantages is that these Systems may be fully tification of the Sequences contained in a probe population. automated. By utilizing an HPLC based Separation System, A generalized assay format that may be used in the samples can be automatically injected into the HPLC where 65 particular applications discussed below is a Sandwich assay as gel based Systems require manual loading. There is also format. In this format, a plurality of oligonucleotides of a significant time Savings found in the set-up time (no gel known Sequence are immobilized on a Solid Substrate. The US 6,613,508 B1 89 90 immobilized oligonucleotide is used to capture a nucleic described herein, (c) washing away non-hybridized material, acid (e.g., RNA, rRNA, a PCR product, fragmented DNA) and (d) detecting the tag by non-fluorescent spectrometry or and then a signal probe is hybridized to a different portion potentiometry, and therefrom determining the pattern of of the captured target nucleic acid. gene expression of the biological Sample. Tag-based differ Another generalized assay format is a Secondary detection 5 ential display, especially using cleavable mass spectometry System. In this format, the arrays are used to identify and tags, on Solid Substrates allows characterization of differen quantify labeled nucleic acids that have been used in a tially expressed genes. primary binding assay. For example, if an assay results in a 4. Single Nucleotide Extension Assay labeled nucleic acid, the identity of that nucleic acid can be The primer eXtension technique may be used for the determined by hybridization to an array. These assay formats detection of Single nucleotide changes in a nucleic acid are particularly useful when combined with cleavable mass template (Sokolov, Nucleic Acids Res., 18:3671, 1989). The Spectometry tags. technique is generally applicable to detection of any Single 2. Mutation Detection base mutation (Kuppuswamy et al., Proc. Natl, Acad. Sci. Mutations involving a Single nucleotide can be identified USA, 88:1143–1147, 1991). Briefly, this method first hybrid in a Sample by Scanning techniques, which are Suitable to 15 izes a primer to a Sequence adjacent to a known Single identify previously unknown mutations, or by techniques nucleotide polymorphism. The primed DNA is then Sub designed to detect, distinguish, or quantitate known jected to conditions in which a DNA polymerase adds a Sequence variants. Several Scanning techniques for mutation labeled dNTP, typically addNTP, if the next base in the detection have been developed based on the observation that template is complementary to the labeled nucleotide in the heteroduplexes of mismatched complementary DNA reaction mixture. In a modification, cDNA is first amplified Strands, derived from wild type and mutant Sequences, for a Sequence of interest containing a Single-base difference exhibit an abnormal migratory behavior. between two alleles. Each amplified product is then ana The methods described herein may be used for mutation lyzed for the presence, absence, or relative amounts of each Screening. One Strategy for detecting a mutation in a DNA allele by annealing a primer that is 1 base 5' to the poly Strand is by hybridization of the test Sequence to target 25 morphism and extending by one labeled base (generally a Sequences that are wild-type or mutant Sequences. A mis dideoxynucleotide). Only when the correct base is available matched Sequence has a destabilizing effect on the hybrid in the reaction will a base to incorporated at the 3'-end of the ization of Short oligonucleotide probes to a target Sequence primer. Extension products are then analyzed by hybridiza (see Wetmur, Crit. Rev. Biochem. Mol. Biol., 26:227, 1991). tion to an array of oligonucleotides Such that a non-extended The test nucleic acid source can be genomic DNA, RNA, product will not hybridize. cDNA, or amplification of any of these nucleic acids. Briefly, in the present invention, each dideoxynucleotide Preferably, amplification of test Sequences is first performed, is labeled with a unique tag. Of the four reaction mixtures, followed by hybridization with short oligonucleotide probes only one will add a dideoxy-terminator on to the primer immobilized on an array. An amplified product can be Sequence. If the mutation is present, it will be detected Scanned for many possible Sequence variants by determining 35 through the unique tag on the dideoxynucleotide after its hybridization pattern to an array of immobilized oligo hybridization to the array. Multiple mutations can be simul nucleotide probes. taneously determined by tagging the DNA primer with a A label, Such as described herein, is generally incorpo unique tag as well. Thus, the DNA fragments are reacted in rated into the final amplification product by using a labeled four Separate reactions each including a different tagged nucleotide or by using a labeled primer. The amplification 40 dideoxyterminator, wherein the tag is correlative with a product is denatured and hybridized to the array. Unbound particular dideoxynucleotide and detectable by non product is washed off and label bound to the array is detected fluorescent spectrometry, or potentiometry. The DNA frag by one of the methods herein. For example, when cleavable ments are hybridized to an array and non-hybridized mate mass spectrometry tags are used, multiple products can be rial is washed away. The tags are cleaved from the Simultaneously detected. 45 hybridized fragments and detected by the respective detec 3. Expression Profiles/differential Display tion technology (e.g., mass spectrometry, infrared Mammals, such as human beings, have about, 100,000 Spectrometry, potentiostatic amperometry or UV/visible different genes in their genome, of which only a Small spectrophotometry). The tags detected can be correlated to fraction, perhaps 15%, are expressed in any individual cell. the particular DNA fragment under investigation as well as Differential display techniques permit the identification of 50 the identity of the mutant nucleotide. genes Specific for individual cell types. Briefly, in differen 5. Oligonucleotide Ligation ASSay tial display, the 3' terminal portions of mRNAS are amplified The oligonucleotide ligation assay (OLA). (Landegen et and identified on the basis of size. Using a primer designed al., Science 241:487, 1988) is used for the identification of to bind to the 5' boundary of a poly(A) tail for reverse known Sequences in very large and complex genomes. The transcription, followed by amplification of the cDNA using 55 principle of OLA is based on the ability of ligase to upstream arbitrary Sequence primers, mRNA Sub covalently join two diagnostic oligonucleotides as they populations are obtained. hybridize adjacent to one another on a given DNA target. If AS disclosed herein, a high throughput method for mea the Sequences at the probe junctions are not perfectly Suring the expression of numerous genes (e.g., 1-2000) is based-paired, the probes will not be joined by the ligase. provided. Within one embodiment of the invention, methods 60 When tags are used, they are attached to the probe, which is are provided for analyzing the pattern of gene expression ligated to the amplified product. After completion of OLA, from a Selected biological Sample, comprising the Steps of fragments are hybridized to an array of complementary (a) amplifying cDNA from a biological Sample using one or Sequences, the tags cleaved and detected by mass spectrom more tagged primers, wherein the tag is correlative with a etry. particular nucleic acid probe and detectable by non 65 Within one embodiment of the invention methods are fluorescent spectrometry or potentiometry, (b) hybridizing provided for determining the identity of a nucleic acid amplified fragments to an array of oligonucleotides as molecule, or for detecting a Selecting nucleic acid molecule, US 6,613,508 B1 91 92 in, for example a biological Sample, utilizing the technique may be directly bonded to the Substrate using, for example, of oligonucleotide ligation assay. Briefly, Such methods silylated PEI. Alternatively, a reaction product of a bifunc generally comprise the Steps of performing amplification on tional coupling agent may be disposed between the Substrate the target DNA followed by hybridization with the 5' tagged Surface and the PEI coating, where the reaction product is reporter DNA probe and a 5" phosphorylated probe. The covalently bonded to both the surface and the PEI coating, sample is incubated with T4 DNA ligase. The DNA strands and secures the PEI coating to the surface. The bifunctional with ligated probes are captured on the array by hybridiza coupling agent contains a first and a Second reactive func tion to an array, wherein non-ligated products do not hybrid tional group, where the first reactive functional group is, for ize. The tags are cleaved from the Separated fragments, and example, a tri(O-C-C-alkyl)silane, and the Second reac then the tags are detected by the respective detection tech tive functional group is, for example, an epoxide, no logy (e.g., mass Spectro metry, infrared isocyanate, isothiocyanate and anhydride group. Preferred Spectrophotometry, potentiostatic amperometry or bifunctional coupling agents include 2-(3,4- UV/visible spectrophotometry. epoxy cyclohexyl)ethyltrime thoxysilane; 3,4- 6. Other Assays epoxybutyl trime thoxy Sila ne; The methods described herein may also be used to geno 15 3-isocyanatopropyltriethoxysilane, 3-(triethoxysilyl)-2- type or identification of Viruses or microbes. For example, methylpropylsuccinic anhydride and 3-(2,3-epoxypropoxy) F+ RNA coliphages may be useful candidates as indicators propyltrimethoxysilane. for enteric virus contamination. Genotyping by nucleic acid The array of the invention contains first, biomolecule amplification and hybridization methods are reliable, rapid, containing regions, where each region has an area within the Simple, and inexpensive alternatives to Serotyping (Kafatos range of about 1,000 square microns to about 100,000 et. al., Nucleic Acids Res. 7:1541, 1979). Amplification Square microns. In a preferred embodiment, the first regions techniques and nucleic aid hybridization techniques have have areas that range from about 5,000 Square microns to been Successfully used to classify a variety of microorgan about 25,000 square microns. isms including E. coli (Feng, Mol. Cell Probes 7:151, 1993), The first regions are preferably Substantially circular, rotavirus (Sethabutr et. al., J. Med Virol. 37:192, 1992), 25 where the circles have an average diameter of about 10 hepatitis C virus (Stuyver et. al., J. Gen Virol. 74:1093, microns to 200 microns. Whether circular or not, the bound 1993), and herpes simplex virus (Matsumoto et al., J. Virol. aries of the first regions are preferably Separated from one Methods 40:119, 1992). another (by the Second region) by an average distance of at Genetic alterations have been described in a variety of least about 25 microns, however by not more than about 1 experimental mammalian and human neoplasms and repre cm (and preferably by no more than about 1,000 microns). Sent the morphological basis for the Sequence of morpho In a preferred array, the boundaries of neighboring first logical alterations observed in carcinogenesis (Vogelstein et regions are separated by an average distance of about 25 al., NEJM 319:525, 1988). In recent years with the advent of microns to 100 microns, where that distance is preferably molecular biology techniques, allelic losses on certain chro constant throughout the array, and the first regions are mosomes or mutation of tumor Suppressor genes as well as 35 preferably positioned in a repeating geometric pattern as mutations in Several oncogenes (e.g., c-myc, c-jun, and the shown in the FIGures attached hereto. In a preferred repeat ras family) have been the most studied entities. Previous ing geometric pattern, all neighboring first regions are work (Finkelstein et al., Arch Surg. 128:526, 1993) has Separated by approximately the same distance (about 25 identified a correlation between Specific types of point microns to about 100 microns). mutations in the K-ras oncogene and the Stage at diagnosis 40 In preferred arrays, there are from 10 to 50 first regions on in colorectal carcinoma. The results Suggested that muta the Substrate. In another embodiment, there are 50 to 400 tional analysis could provide important information of tumor first regions on a Substrate. In yet another preferred aggressiveness, including the pattern and spread of metasta embodiment, there are 400 to 800 first regions on the sis. The prognostic value of TP53 and K-ras-2 mutational Substrate. analysis in Stage III carconoma of the colon has more 45 The biomolecule located in the first regions is preferably recently been demonstrated (Pricolo et al., Am. J. Surg. a nucleic acid polymer. A preferred nucleic acid polymer is 171:41, 1996). It is therefore apparent that genotyping of an oligonucleotide having from about 15 to about 50 nucle tumors and pre-cancerous cells, and Specific mutation detec otides. The biomolecule may be amplification reaction prod tion will become increasingly important in the treatment of ucts having from about 50 to about 1,000 nucleotides. cancers in humans. 50 In each first region, the biomolecule is preferably present Tagged Probes in Array-Based ASSayS at an average concentration ranging from 10 to 10” bio The tagged biomolecules as disclosed herein may be used molecules per 2,000 Square microns of a first region. More to interrogate (untagged) arrays of biomolecules. Preferred preferably, the average concentration of biomolecule ranges arrays of biomolcules contain a Solid Substrate comprising a from 107 to 10” biomolecules per 2,000 square microns. In Surface, where the Surface is at least partially covered with 55 the Second region, the biomolecule is preferably present at a layer of poly(ethylenimine) (PEI). The PEI layer com an average concentration of less than 10 biomolecules per prises a plurality of discrete first regions abutted and Sur 2,000 Square microns of Said Second region, and more rounded by a contiguous Second region. The first regions are preferably at an average concentration of less than 10 defined by the presence of a biomolecule and PEI, while the biomolecules per 2,000 square microns. Most preferably, the second region is defined by the presence of PEI and the 60 Second regions does not contain any biomolecule. substantial absence of the biomolecule. Preferably, the Sub The chemistry used to adhere the layer of PEI to the Strate is a glass plate or a Silicon wafer. However, the Substrate depends, in Substantial part, upon the chemical Substrate may be, for example, quartz, gold, nylon-6,6, identity of the substrate. The prior art provides numerous nylon or polystyrene, as well as composites thereof, as examples of suitable chemistries that may adhere PEI to a described above. 65 Solid Support. For example, when the Substrate is nylon-6,6, The PEI coating preferably contains PEI having a molecu the PEI coating may be applied by the methods disclosed in lar weight ranging from 100 to 100,000. The PEI coating Van Ness, J. et al. Nucleic Acids Res. 19:3345-3350, 1991 US 6,613,508 B1 93 94 and PCT International Publication WO 94/00600, both of The biomolecule in the array may be a nucleic acid which are incorporated herein by reference. When the solid polymer or analog thereof, Such as PNA, phosphorothioates Support is glass or Silicon, Suitable methods of applying a and methylphosphonates. Nucleic acid refers to both ribo layer of PEI are found in, e.g., Wasserman, B. P. Biotech nucleic acid and deoxyribonucleic acid. The biomolecule nology and Bioengineering XXII:271-287, 1980; and may comprise unnatural and/or Synthetic bases. The bio D'Souza, S. F. Biotechnology Letters 8:643–648, 1986. molecule may be Single or double Stranded nucleic acid Preferably, the PEI coating is covalently attached to the polymer. solid substrate. When the solid substrate is glass or silicon, the PEI coating may be covalently bound to the substrate A preferred biomolecule is an nucleic acid polymer, using Sillylating chemistry. For example, PEI having reactive which includes oligonucleotides (up to about 100 nucleotide Siloxy endgroups is commercially available from Gelest, 1O bases) and polynucleotides (over about 100 bases). A pre Inc. (Tullytown, Pa.). Such reactive PEI may be contacted ferred nucleic acid polymer is formed from 15 to 50 nucle with a glass Slide or Silicon wafer, and after gentle agitation, otide bases. Another preferred nucleic acid polymer has 50 the PEI will adhere to the Substrate. Alternatively, a bifunc to 1,000 nucleotide bases. The nucleic acid polymer may be tional Sillylating reagent may be employed. According to this a PCR product, PCR primer, or nucleic acid duplex, to list process, the glass or Silicon Substrate is treated with the 15 a few examples. However, essentially any nucleic acid type bifunctional Sillylating reagent to provide the Substrate with can be covalently attached to a PEI-coated Surface when the a reactive Surface. PEI is then contacted with the reactive nucleic acid contains a primary amine, as disclosed below. Surface, and covalently binds to the Surface through the The typical concentration of nucleic acid polymer in the bifunctional reagent. arraying solution is 0.001-10 ug/mL, preferably 0.01-1 The biomolecules being placed into the array format are lug/mL, and more preferably 0.05-0.5 lug/mL. originally present in a So-called "arraying Solution'. In order Preferred nucleic acid polymers are “amine-modified” in to place biomolecule in discrete regions on the PEI-coated that they have been modified to contain a primary amine at Substrate, the arraying Solution preferably contains a thick the 5'-end of the nucleic acid polymer, preferably with one ening agent at a concentration of about 35 vol% to about 80 or more methylene (-CH2-) groups disposed between the vol % based on the total volume of the composition, a 25 primary amine and the nucleic acid portion of the nucleic biomolecule which is preferably an oligonucleotide at a acid polymer. Six is a preferred number of methylene concentration ranging from 0.001 ug/mL to 10 ug/mL, and groups. Amine-modified nucleic acid polymers are preferred Water. because they can be covalently coupled to a Solid Support The concentration of the thickening agent is 35% V/V to through the 5'-amine group. PCR products can be arrayed 80% V/V for liquid thickening agents such as glycerol. The using 5'-hexylamine modified PCR primers. Nucleic acid preferred concentration of thickening agent in the compo duplexes can be arrayed after the introduction of amines by Sition depends, to Some extent, on the temperature at which nick translation using aminoallyl-dUTP (Sigma, St. Louis, the arraying is performed. The lower the arraying Mo.). Amines can be introduced into nucleic acids by temperature, the lower the concentration of thickening agent polymerases Such as terminal transferase with amino allyl that needs to be used. The combination of temperature and 35 dUTP or by ligation of Short amine-containing nucleic acid liquid thickening agent concentration control permits arrayS polymers onto nucleic acids by ligases. to be made on most types of Solid Supports (e.g., glass, Preferably, the nucleic acid polymer is activated prior to wafers, nylon 6/6, nylon membranes, etc.). be contacted with the PEI coating. This can be conveniently The presence of a thickening agent has the additional accomplished by combining amine-functionalized nucleic benefit of allowing the concurrent presence of low concen 40 acid polymer with a multi-functional amine-reactive chemi trations of various other materials to be present in combi cal Such as trichlorotriazine. When the nucleic acid polymer nation with the biomolecule. For example 0.001% V/V to contains a 5'-amine group, that 5'-amine can be reacted with 1% V/V of detergents may be present in the arraying trichlorotriazine, also known as cyanuric chloride (Van NeSS Solution. This is useful because PCR buffer contains a small et al., Nucleic Acids Res. 19(2):3345-3350, 1991) amount of Tween-20 or NP-40, and it is frequently desirable 45 Preferably, an excess of cyanuric chloride is added to the to array sample nucleic acids directly from a PCR vial nucleic acid polymer solution, where a 10- to 1000-fold without prior purification of the amplicons. The use of a molar excess of cyanuric chloride over the number of thickening agent permits the presence of Salts (for example amines in the nucleic acid polymer in the arraying Solution NaCl, KCI, or MgCl), buffers (for example Tris), and/or is preferred. In this way, the majority of amine-terminated chelating reagents (for example EDTA) to also be present in 50 nucleic acid polymers have reacted with one molecule of the arraying Solution. The use of a thickening agent also has trichlorotriazine, So that the nucleic acid polymer becomes the additional benefit of permitting the use of croSS-linking terminated with dichlorotriazine. reagents and/or organic Solvents to be present in the arraying Preferably, the arraying Solution is buffered using a com Solution. AS commercially obtained, croSS-linking reagents mon buffer Such as Sodium phosphate, Sodium borate, are commonly dissolved in organic solvent such as DMSO, 55 Sodium carbonate, or Tris HC1. A preferred pH range for the DMF, NMP, methanol, ethanol and the like. Commonly used arraying solution is 7 to 9, with a preferred buffer being organic Solvents can be used in arraying Solutions of the freshly prepared sodium borate at pH 8.3 to pH 8.5. To invention at levels of 0.05% to 20% (V/V) when thickening prepare a typical arraying Solution, hexylamine-modified agents are used. nucleic acid polymer is placed in 0.2 M Sodium borate, pH In general, the thickening agents impart increased ViscoS 60 8.3, at 0.1 ug/mL, to a total volume of 50 ul. Ten ul of a 15 ity to the arraying Solution. When a proper Viscosity is mg/mL Solution of cyanuric chloride is then added, and the achieved in the arraying Solution, the first drop is the reaction is allowed to proceed for 1 hour at 25 C. with Substantially the same size as, for example, the 100th drop constant agitation. Glycerol (Gibco BrlE), Grand Island, deposited. When an improper Viscosity is used in the array N.Y.) is added to a final concentration of 56%. ing Solution, the first drops deposited are significantly larger 65 The biomolecular arraying Solutions may be applied to the than latter drops which are deposited. The desired Viscosity PEI coating by any of the number of techniques currently is between those of pure water and pure glycerin. used in microfabrication. For example, the Solutions may be US 6,613,508 B1 95 96 placed into an inkjet print head, and ejected from Such a To detect the hybridization event (i.e., the presence of the head onto the coating. biotin), the Solid Support is incubated with Streptavidin/ A preferred approach to delivering biomolecular Solution horseradish peroxidase conjugate. Such enzyme conjugates onto the PEI coating employs a modified Spring probe. are commercially available from, for example, Vector Labo Spring probes are available from Several vendors including Everett Charles (Pomona, Calif.), Interconnect Devices Inc. ratories (Burlingham, Calif.). The streptavidin binds with (Kansas City, Kans.) and Test Connections Inc., (Upland, high affinity to the biotin molecule bringing the horseradish Calif.). In order for the commercially available spring peroxidase into proximity to the hybridized probe. Unbound probes as described above to Satisfactorily function as liquid Streptavidin/horseradish peroxidase conjugate is washed deposition devices according to the present invention, away in a simple washing Step. The presence of horseradish approximately /1000th to 5/1000th of an inch of metal material peroxidase enzyme is then detected using a precipitating must be removed from the tip of the probe. The process must Substrate in the presence of peroxide and the appropriate result in a flat Surface which is perpendicular to the longi buffers. tudinal axis of the Spring probe. The removal of approxi mately /1000th to 5/1000th of an inch of material from the A blue enzyme product deposited on a reflective Surface bottom of the tip is preferred and can be accomplished easily 15 Such as a wafer has a many-fold lower level of detection with a very fine grained Wet Stone. Specific Spring probes (LLD) compared to that expected for a calorimetric Sub which are commercially available and may be modified to strate. Furthermore, the LLD is vastly different for different provide a planar tip as described above include the XP54 colored enzyme products. For example, the LLD for probe manufactured by Ostby Barton (a division of Everett 4-methoxynapthol (which produces a precipitated blue Charles (Pomona, Calif.)); the SPA 25P probe manufactured product) per 50 uM diameter spot is approximately 1000 by Everett Charles (Pomona, Calif.) and 43-P fluted spring molecules, whereas a red precipitated Substrate gives an probe from Test Connections Inc., (Upland, Calif.). LLD about 1000-fold higher at 1,000,000 molecules per 50 The arraying Solutions as described above may be used tiM diameter spot. The LLD is determined by interrogating directly in an arraying proceSS. That is, the activated nucleic the Surface with a microscope (Such as the Axiotech micro acid polymers need not be purified away from unreacted 25 Scope commercially available from Zeiss) equipped with a cyanuric chloride prior to the printing Step. Typically the visible light source and a CCD camera (Princeton reaction which attaches the activated nucleic acid to the Instruments, Princeton, N.J.). An image of approximately solid support is allowed to proceed for 1 to 20 hours at 20 10,000 uMx10,000 uM can be scanned at one time. to 50 C. Preferably, the reaction time is 1 hour at 25 C. In order to use the blue colorimetric detection Scheme, the The arrays as described herein are particularly useful in Surface must be very clean after the enzymatic reaction and conducting hybridization assays, for example, using CMST the wafer or Slide must be Scanned in a dry State. In addition, labeled probes. However, in order to perform Such assays, the enzymatic reaction must be stopped prior to Saturation of the amines on the Solid Support must be capped prior to the reference spots. For horseradish peroxidase this is conducting the hybridization Step. This may be accom approximately 2–5 minutes. plished by reacting the solid support with 0.1-2.0 M suc 35 It is also possible to use chemiluminescent Substrates for cinic anhydride. The preferred reaction conditions are 1.0 M alkaline phosphatase or horesradish peroxidase (HRP), or succinic anhydride in 70% m-pyrol and 0.1 M sodium fluoroescence substrates for HRP or alkaline phosphatase. borate. The reaction typically is allowed to occur for 15 Examples include the dioxetane Substrates for alkaline phos minutes to 4 hours with a preferred reaction time of 30 phatase available from Perkin Elmer or Attophos HRP minutes at 25 C. Residual succinic anhydride is removed 40 with a 3x water wash. substrate from JBL Scientific (San Luis Obispo, Calif.). The solid support is then incubated with a solution The following examples are offered by way of illustration, containing 0.1-5 M glycine in 0.1-10.0 M Sodium borate at and not by way of limitation. pH 7-9. This step “caps” any dichloro-triazine which may Unless otherwise Stated, chemicals as used in the be covalently bound to the PEI surface by conversion into 45 examples may be obtained from Aldrich Chemical monochlorotriazine. The preferred conditions are 0.2 M Company, Milwaukee, Wis. The following abbreviations, glycine in 0.1 M sodium borate at pH 8.3. The solid support with the indicated meanings, are used herein: may then be washed with detergent-containing Solutions to ANP-=3-(Fmoc-amino)-3-(2-nitrophenyl)propionic acid remove unbound materials, for example, trace NMP. NBA=4-(Fmoc-aminomethyl)-3-nitrobenzoic acid Preferably, the solid support is heated to 95 C. in 0.01 M 50 HATU= O-7-azabenzo triazol-1-yl-N,N,N',N'- NaCl, 0.05 M EDTA and 01 M Tris pH 8.0 for 5 minutes. tetramethyluronium hexafluorophosphate This heating Step removes non-covalently attached nucleic DIEA=diisopropylethylamine acid polymers, Such as PCR products. In the case where MCT=monochlorotriazine double Strand nucleic acid are arrayed, this step also has the NMM=4-methylmorpholine effect of converting the double Strand to Single Strand form 55 NMP=N-methylpyrrolidone (denaturation). ACT357=ACT357 peptide synthesizer from Advanced The arrays are may be interrogated by probes (e.g., ChemTech, Inc., Louisville, Ky. oligonucleotides, nucleic acid fragments, PCR products, ACT=Advanced ChemTech, Inc., Louisville, Ky. etc.) which may be tagged with, for example CMST tags as NovaBiochem=CalBiochem-NovaBiochem International, described herein, radioisotopes, fluorophores or biotin. The 60 San Diego, Calif. methods for biotinylating nucleic acids are well known in TFA=Trifluoroacetic acid the art and are adequately described by Pierce (Avidin Tfa=Trifluoroacetyl Biotin Chemistry: A Handbook, Pierce Chemical Company, iNIP=N-Methylisonipecotic acid 1992. Rockford Ill.). Probes are generally used at 0.1 ng/ml, T?p=Tetrafluorophenyl to 10/ug/mL in Standard hybridization Solutions that include 65 DIAEA=2-(Diisopropylamino)ethylamine GuSCN, GuHCl, formamide, etc. (see Van Ness and Chen, MCT=monochlorotriazene Nucleic Acids Res., 19:5143–5151, 1991). 5'-AH-ODN=5'-aminohexyl-tailed oligodeoxynucleotide US 6,613,508 B1 97 98 EXAMPLES FIG. 2 shows the reaction Scheme. Step A. 4-(Hydroxymethyl)phenoxybutyric acid (compound Example 1 I; 1 eq) is combined with DIEA (2.1 eq) and allyl bromide (2.1 eq.) in CHCl and heated to reflux for 2 hr. Preparation of Avid Labile Linkers for Use in The mixture is diluted with EtOAc, washed with 1 NHCl Cleavable-MW-Identifier Sequencing (2x), pH 9.5 carbonate buffer (2x), and brine (1x), dried over NaSO, and evaporated in vacuo to give the allyl A. Synthesis of Pentafluorophenyl Esters of Chemically ester of compound I. Cleavable Mass Spectroscopy Tags, to Liberate Tags with Step B. The allyl ester of compound I from step A (1.75 eq.) Carboxyl Amide Termini is combined in CHCl with an FMOC-protected amino FIG. 1 shows the reaction Scheme. acid containing amine functionality in its Side chain Step A. TentaGel SAC resin (compound II; available from (compound II, e.g., alpha-N-FMOC-3-(3-pyridyl)- ACT; 1 eq) is suspended with DMF in the collection alanine, available from Synthetech, Albany, Oreg.; 1 eq.), vessel of the ACT357 peptide synthesizer (ACT). Com N-methylmorpholine (2.5 eq.), and HATU (1.1 eq.), and pound 1 (3 eq.), HATU (3 eq) and DIEA(7.5 eq.) in DMF 15 stirred at room temperature for 4 hr. The mixture is diluted are added and the collection vessel shaken for 1 hr. The with CHCl, washed with 1 M aq. citric acid (2x), water solvent is removed and the resin washed with NMP (2x), (1x), and 5% aq. NaHCO (2x), dried over NaSO, and MeOH (2x), and DMF (2x). The coupling of I to the resin evaporated in vacuo. Compound III is isolated by flash and the wash Steps are repeated, to give compound III. chromatography (CHCI->EtOAc). Step B. The resin (compound III) is mixed with 25% Step C. Compound III is dissolved in CHCl, Pd(PPh3) piperidine in DMF-and shaken for 5 min. The resin is (0.07 eq) and N-methylaniline (2 eq) are added, and the filtered, then mixed with 25% piperidine in DMF and mixture stirred at room temperature for 4 hr. The mixture shaken for 10 min. The Solvent is removed, the resin is diluted with CHCl, washed with 1 M aq. citric acid washed with NMP (2x), MeOH (2x), and DMF (2x), and (2x) and water (1x), dried over Na2SO, and evaporated used directly in Step C. 25 in vacuo. Compound IV is isolated by flash chromatog Step C. The deprotected resin from step B is suspended in raphy (CHC1->EtOAc+HOAc). DMF and to it is added an FMOC-protected amino acid, Step D. TentaGel S AC resin (compound V; 1 eq) is containing amine functionality in its Side chain suspended with DMF in the collection vessel of the (compound IV, e.g., alpha-N-FMOC-3-(3-pyridyl)- ACT357 peptide synthesizer (Advanced ChemTech Inc. alanine, available from Synthetech, Albany, Oreg., 3 eq.), (ACT), Louisville, Ky.). Compound IV (3 eq.), HATU (3 HATU (3 eq.), and DIEA (7.5 eq.) in DMF. The vessel is eq.) and DIEA (7.5 eq.) in DMF are added and the shaken for 1 hr. The Solvent is removed and the resin collection vessel shaken for 1 hr. The Solvent is removed washed with NMP (2x), MeOH (2x), and DMF (2x). The and the resin washed with NMP (2x), MeOH (2x), and coupling of IV to the resin and the wash Steps are DMF (2x). The coupling of IV to the resin and the wash repeated, to give compound V. 35 Steps are repeated, to give compound VI. Step D. The resin (compound V) is treated with piperidine Step E. The resin (compound VI) is mixed with 25% as described in step B to remove the FMOC group. The piperidine in DMF and shaken for 5 min. The resin is deprotected resin is then divided equally by the ACT357 filtered, then mixed with 25% piperidine in DMF and from the collection vessel into 16 reaction vessels. shaken for 10 min. The Solvent is removed and the resin Step E. The 16 aliquots of deprotected resin from step D are 40 washed with NMP (2x), MeOH (2x), and DMF (2x). The Suspended in DMF. To each reaction vessel is added the deprotected resin is then divided equally by the ACT357 appropriate carboxylic acid VI (R-COH; 3 eq.), from the collection vessel into 16 reaction vessels. HATU (3 eq.), and DIEA (7.5 eq.) in DMF. The vessels Step F. The 16 aliquots of deprotected resin from step E are are shaken for 1 hr. The Solvent is removed and the suspended in DMF. To each reaction vessel is added the aliquots of resin washed with NMP (2x), MeOH (2x), and 45 appropriate carboxylic acid VII (R-CO2H; 3 eq.), DMF (2x). The coupling of VI to the aliquots of resin HATU (3 eq.), and DIEA (7.5 eq.) in DMF. The vessels and the wash Steps are repeated, to give compounds are shaken for 1 hr. The Solvent is removed and the VII. aliquots of resin washed with NMP (2x), MeOH (2x), and Step F. The aliquots of resin (compounds VII) are DMF (2x). The coupling of VII to the aliquots of resin washed with CHCl (3x). To each of the reaction vessels 50 and the wash Steps are repeated, to give compounds is added 1% TFA in CHCl and the vessels shaken for 30 VII. min. The solvent is filtered from the reaction vessels into Step G. The aliquots of resin (compounds VIII) are individual tubes. The aliquots of resin are washed with washed with CHCl (3x). To each of the reaction vessels CHCl (2x) and MeOH (2x) and the filtrates combined is added 1% TFA in CHCl and the vessels shaken for 30 into the individual tubes. The individual tubes are evapo 55 min. The solvent is filtered from the reaction vessels into rated in vacuo, providing compounds VIII. individual tubes. The aliquots of resin are washed with Step G. Each of the free carboxylic acids VIII is dis CHCl (2x) and MeOH (2x) and the filtrates combined solved in DMF. To each solution is added pyridine (1.05 into the individual tubes. The individual tubes are evapo eq.), followed by pentafluorophenyl trifluoroacetate (1.1 rated in vacuo, providing compounds IX. eq.). The mixtures are stirred for 45 min. at room tem 60 Step H. Each of the free carboxylic acids IX is dissolved perature. The solutions are diluted with EtOAc, washed in DMF. To each solution is added pyridine (1.05 eq.), with 1 M aq. citric acid (3x) and 5% aq. NaHCOs (3x), followed by pentafluorophenyl trifluoroacetate (1.1 eq.). dried over Na2SO, filtered, and evaporated in vacuo, The mixtures are Stirred for 45 min. at room temperature. providing compounds IX-6. The Solutions are diluted with EtOAc, washed with 1 M A. Synthesis of Pentafluorophenyl Esters of Chemically 65 aq. citric acid (3x) and 5% aq. NaHCOs (3x), dried over Cleavable Mass Spectroscopy Tags, to Liberate Tars with Na2SO, filtered, and evaporated in vacuo, providing Carboxyl Acid Termini compounds X-6. US 6,613,508 B1 99 100 Example 2 5 mM EDTA). To 100 ul volumes of ODNs 25ul of 0.01 M dithiothreitol (DTT) is added. To an identical set of controls Demonstration of Photolytic Cleavage of T-L-X no DDT is added. The mixture is incubated for 15 minutes A T-L-X compound as prepared in Example 11 was at room temperature. Fluorescence is measured in a black irradiated with near-UV light for 7 min at room temperature. microtiter plate. The solution is removed from the incuba A Rayonett fluorescence UV lamp (Southern New England tion tubes (150 microliters) and placed in a black microtiter Ultraviolet Co., Middletown, Conn.) with an emission peak plate (Dynatek Laboratories, Chantilly, Va.). The plates are at 350 nm is used as a source of UV light. The lamp is placed then read directly using a Fluoroskan II fluorometer (Flow at a 15-cm distance from the Petri dishes with samples. SDS Laboratories, McLean, Va.) using an excitation wavelength gel electrophoresis shows that >85% of the conjugate is of 495 nm and monitoring emission at 520 nm for cleaved under these conditions. fluorescein, using an excitation wavelength of 591 nm and monitoring emission at 612 nm for Texas Red, and using an Example 3 excitation wavelength of 570 nm and monitoring emission at 590 nm for lissamine. Preparation of Fluorescent Labeled Primers and 15 Demonstration of Cleavage of Fluorophore Synthesis and Purification of Oligonucleotides The oligonucleotides (ODNs) are prepared on automated Moles of RFU RFU RFU DNA Synthesizers using the Standard phosphoramidite Fluorochrome non-cleaved cleaved free chemistry Supplied by the vendor, or the H-phosphonate 1.0 x 10 M 6.4 12OO 1345 3.3 x 10 M 2.4 451 456 chemistry (Glenn Research Sterling, Va.). Appropriately 1.1 x 10 M O.9 135 130 blocked dA, dG, dC, and T phosphoramidites are commer 3.7 x 107M O.3 44 48 cially available in these forms, and Synthetic nucleosides 1.2 x 107M O.12 15.3 16.0 may readily be converted to the appropriate form. The 4.1 x 107M O.14 4.9 5.1 oligonucleotides are prepared using the Standard phosphora 25 1.4 x 10 M O.13 2.5 2.8 midite Supplied by the Vendor, or the H-phosphonate chem 4.5 x 10 M O.12 O.8 O.9 istry. Oligonucleotides are purified by adaptations of Stan dard methods. Oligonucleotides with 5'-trityl groups are The data indicate that there is about a 200-fold increase in chromatographed on HPLC using a 12 micrometer, 300 if relative fluorescence when the fluorochrome is cleaved from Rainin (Emeryville, Calif.) Dynamax C-8 4.2x250 mm the ODN. reverse phase column using a gradient of 15% to 55% MeCN in 0.1 N EtNHOAc, pH 7.0, over 20 min. When Example 4 detritylation is performed, the oligonucleotides are further Preparation of Tagged M13 Sequence Primers and purified by gel eXclusion chromatography. Analytical checks Demonstration of Cleavage of Tags for the quality of the oligonucleotides are conducted with a 35 PRP-column (Alitech, Deerfield, Ill.) at alkaline pH and by Preparation of 2,4,6-trichlorotriazine derived oligonucle PAGE. otides: 1000 tug of 5'-terminal amine linked oligonucleotide Preparation of 2,4,6-trichlorotriazine derived oligonucle (5'-hexylamine-TGTAAAACGACGGCCAGT-3") (Seq. ID otides: 10 to 1000 tug of 5'-terminal amine linked oligo No. 1) are reacted with an excess recrystallized cyanuric nucleotide are reacted with an exceSS recrystallized cyanuric 40 chloride in 10% n-methyl-pyrrollidone alkaline (pH 8.3 to chloride in 10% n-methyl-pyrrolidone in alkaline (pH 8.3 to 8.5 preferably) buffer at 19 to 25 C. for 30 to 120 minutes. 8.5 preferably) buffer at 19° C. to 25° C. for 30 to 120 The final reaction conditions consist of 0.15 M Sodium minutes. The final reaction conditions consist of 0.15 M borate at pH 8.3, 2 mg/ml recrystallized cyanuric chloride Sodium borate at pH 8.3, 2 mg/ml recrystallized cyanuric and 500 ug/ml respective oligonucleotide. The unreacted chloride and 500 ug/ml respective oligonucleotide. The 45 cyanuric chloride is removed by Size exclusion chromatog unreacted cyanuric chloride is removed by Size exclusion raphy on a G-50 Sephadex column. chromatography on a G-50 Sephadex (Pharmacia, The activated purified oligonucleotide is then reacted with Piscataway, N.J.) column. a 100-fold molar excess of cystamine in 0.15 M Sodium The activated purified oligonucleotide is then reacted with borate at pH 8.3 for 1 hour at room temperature. The a 100-fold molar excess of cystamine in 0.15 M Sodium 50 unreacted cyStamine is removed by Size exclusion chroma borate at pH 8.3 for 1 hour at room temperature. The tography on a G-50 Sephadex column. The derived ODNs unreacted cystamine is removed by Size exclusion chroma are then reacted with a variety of amides. The derived ODN tography on a G-50 Sephadex column. The derived ODNs preparation is divided into 12 portions and each portion is are then reacted with amine-reactive fluorochromes. The reacted (25 molar excess) with the pentafluorophenyl-esters derived ODN preparation is divided into 3 portions and each 55 of either: (1) 4-methoxybenzoic acid, (2) 4-fluorobenzoic portion is reacted with either (a) 20-fold molar excess of acid, (3) toluic acid, (4) benzoic acid, (5) indole-3-acetic Texas Red Sulfonyl chloride (Molecular Probes, Eugene, acid, (6) 2,6-difluorobenzoic acid, (7) nicotinic acid Oreg.), with (b) 20-fold molar excess of Lissamine Sulfonyl N-oxide, (8) 2-nitrobenzoic acid, (9) 5-acetylsalicylic acid, chloride (Molecular Probes, Eugene, Oreg.), (c) 20-fold (10) 4-ethoxybenzoic acid, (11) cinnamic acid, (12) molar exceSS of fluorescein isothiocyanate. The final reac 60 3-aminonicotinic acid. The reaction is for 2 hours at 37 C. tion conditions consist of 0.15 MSodium borate at pH 8.3 for in 0.2 M NaBorate pH 8.3. The derived ODNs are purified 1 hour at room temperature. The unreacted fluorochromes by gel exclusion chromatography on G-50 SephadeX. are removed by Size exclusion chromatography on a G-50 To cleave the tag from the oligonucleotide, the ODNs are SephadeX column. adjusted to 1x10 molar and then dilutions are made (12, To cleave the fluorochrome from the oligonucleotide, the 65 3-fold dilutions) in TE (TE is 0.01 M Tris, pH 7.0, 5 mM ODNs are adjusted to 1x10 molar and then dilutions are EDTA) with 50% EtOH (V/V). To 100 ul volumes of ODNs made (12,3-fold dilutions) in TE (TE is 0.01 M Tris, pH 7.0, 25ul of 0.01 M dithiothreitol (DTT) is added. To an identical US 6,613,508 B1 101 102 set of controls no DDT is added. Incubation is for 30 minutes Step C. The deprotected resin from step B is suspended in at room temperature. NaCl is then added to 0.1 M and 2 DMF and to it is added an FMOC-protected amino acid, volumes of EtOH is added to precipitate the ODNs. The containing a protected amine functionality in its side ODNs are removed from solution by centrifugation at chain (Fmoc-Lysine(Aloc)-OH, available from PerSep 14,000xG at 4 C. for 15 minutes. The Supernatants are tive Biosystems; 3 eq.), HATU (3 eq.), and NMM (7.5 eq.) reserved, dried to completeneSS. The pellet is then dissolved in DMF. The vessel is shaken for 1 hr. The solvent is in 25 ul MeOH. The pellet is then tested by mass spectrom removed and the resin washed with NMP (2x), MeOH etry for the presence of tags. (2x), and DMF (2x). The coupling of Fmoc-Lys.(Aloc)- The mass spectrometer used in this work is an external ion OH to the resin and the wash Steps are repeated, to give source Fourier-transform mass spectrometer (FTMS). compound IV. Samples prepared for MALDI analysis are deposited on the Step D. The resin (compound IV) is washed with CH-Cl tip of a direct probe and inserted into the ion source. When (2x), and then suspended in a solution of (PPh3)4Pd(0) the Sample is irradiated with a laser pulse, ions are extracted (0.3 eq.) and PhSiH (10 eq.) in CHCl2. The mixture is from the Source and passed into a long quadrupole ion guide shaken for 1 hr. The Solvent is removed and the resin is that focuses and transports them to an FTMS analyzer cell 15 washed with CHCl (2x). The palladium step is repeated. located inside the bore of a Superconducting magnet The Solvent is removed and the resin is washed with The spectra yield the following information. Peaks vary CHCl (2x), N,N-diisopropylethylammonium dieth ing in intensity from 25 to 100 relative intensity units at the yldithiocarbamate in DMF (2x), DMF (2x) to give com following molecular weights: (1) 212.1 amu indicating pound V. 4-methoxybenzoic acid derivative, (2) 200.1 indicating Step E. The deprotected resin from step D is coupled with 4-fluorobenzoic acid derivative, (3) 196.1 amu indicating N-methylisonipecotic acid as described in Step C to give toluic acid derivative, (4) 182.1 amu indicating benzoic acid compound VI. derivative, (5) 235.2 amu indicating indole-3-acetic acid Step F. The Fmoc protected resin VI is divided equally by derivative, (6) 218.1 amu indicating 2,6-difluorobenzoic the ACT357 from the collection vessel into 36 reaction derivative, (7) 199.1 amu indicating nicotinic acid N-oxide 25 vessels to give compounds VI. derivative, (8) 227.1 amu indicating 2-nitrobenzamide, (9) Step G. The resin (compounds VI) is treated with pip 179.18 amu indicating 5-acetylsalicylic acid derivative, (10) eridine as described in step B to remove the FMOC group. 226.1 amu indicating 4-ethoxybenzoic acid derivative, (11) Step H. The 36 aliquots of deprotected resin from step G are suspended in DMF. To each reaction vessel is added the 209.1 amu indicating cinnamic acid derivative, (12) 198.1 appropriate carboxylic acid (RCOH; 3 eq.), HATU (3 amu indicating 3-aminonicotinic acid derivative. eq.), and NMM (7.5 eq.) in DMF. The vessels are shaken The results indicate that the MW-identifiers are cleaved for 1 hr. The solvent is removed and the aliquots of resin from the primers and are detectable by mass Spectrometry. washed with NMP (2x), MeOH (2x), and DMF (2x). The Example 5 coupling of RCOH to the aliquots of resin and the 35 wash steps are repeated, to give compounds VII. Preparation of a Set of Compounds of the Formula Step I. The aliquots of resin (compounds VIII) are Re-Lys(e-iNIP)-ANP-T?p washed with CHCl (3x). To each of the reaction vessels is added 90:5:5 TFA: H20:CHCl and the vessels shaken FIG. 3 illustrates the parallel synthesis of a set of 36 for 120 min. The solvent is filtered from the reaction T-L-X compounds (X=L), where L, is an activated ester 40 vessels into individual tubes. The aliquots of resin are (specifically, tetrafluorophenyl ester), Li is an ortho washed with CHCl (2x) and MeOH (2x) and the filtrates nitrobenzylamine group with Li being a methylene group combined into the individual tubes. The individual tubes that links Li, and Lif, Thas a modular structure wherein the are evaporated in vacuo, providing compounds IX. carboxylic acid group of lysine has been joined to the Step J. Each of the free carboxylic acids IX is dissolved nitrogen atom of the Libenzylamine group to form an amide 45 in DMF. To each solution is added pyridine (1.05 eq.), bond, and a variable weight component R., (where these followed by tetrafluorophenyl trifluoroacetate (1.1 eq.). R groups correspond to T as defined herein, and may be The mixtures are Stirred for 45 min. at room temperature. introduced via any of the Specific carboxylic acids listed The Solutions are diluted with EtOAc, washed with 5% herein) is bonded through the C-amino group of the lysine, aq. NaHCO(3x), dried over NaSO, filtered, and evapo while a mass spec Sensitivity enhancer group (introduced via 50 rated in Vacuo, providing compounds IX-6. N-methylisonipecotic acid) is bonded through the e-amino group of the lysine. Example 6 Referring to FIG. 3: Preparation of a Set of Compounds of the formula Step A. NovaSyn HMP Resin (available from NovaBio Re-Lys(e-iNIP)-NBA-T?p chem; 1 eq) is suspended with DMF in the collection 55 vessel of the ACT357. Compound I (ANP available from FIG. 4 illustrates the parallel synthesis of a set of 36 ACT; 3 eq.), HATU (3 eq) and NMM (7.5 eq.) in DMF T-L-X compounds (X=L), where L, is an activated ester are-added and the collection vessel shaken for 1 hr. The (specifically, tetrafluorophenyl ester), Li is an ortho solvent is removed and the resin washed with NMP (2x), nitrobenzylamine group with Libeing a direct bond between MeOH (2x), and DMF (2x). The coupling of I to the resin 60 Li, and Lif, where L is joined directly to the aromatic ring and the wash Steps are repeated, to give compound II. of the Li group, T has a modular structure wherein the Step B. The resin (compound II) is mixed with 25% piperi carboxylic acid group of lysine has been joined to the dine in DMF and shaken for 5 min. The resin is filtered, nitrogen atom of the Libenzylamine group to form an amide then mixed with 25% piperidine in DMF and shaken for bond, and a variable weight component Rae, (where these 10 min. The Solvent is removed, the resin washed with 65 R groups correspond to T as defined herein, and may be NMP (2x), MeOH (2x), and DMF (2x), and used directly introduced via any of the Specific carboxylic acids listed in Step C. herein) is bonded through the C-amino group of the lysine, US 6,613,508 B1 103 104 while a mass Spec enhancer group (introduced via Step D. The allyl ester on the resin (compound IV) is washed N-methylisonipecotic acid) is bonded through the e-amino with CHCl (2x) and mixed with a solution of (PPh), Pd group of the. lysine. (0) (0.3 eq.) and N-methylaniline (3 eq.) in CHCl2. The Referring to FIG. 4 mixture is shaken for 1 hr. The solvent is removed and the Step A. NovaSyn HMP Resin is coupled with compound I resin is washed with CHCl (2x). The palladium step is (NBA prepared according to the procedure of Brown et repeated. The Solvent is removed and the resin is washed al., Molecular Diversity, 1, 4 (1995)) according to the with CHCl (2x), N,N-diisopropylethylammonium procedure described in Step A of Example 5, to give diethyldithiocarbamate in DMF (2x), DMF (2x) to give compound II. compound V. Steps B-J. The resin (compound II) is treated as described 1O Step E. The deprotected resin from step D is suspended in in StepS B-J of Example 5 to give compounds X. DMF and activated by mixing HATU (3 eq.), and NMM (7.5 eq.). The vessels are shaken for 15 minutes. The Example 7 solvent is removed and the resin washed with NMP (IX). Preparation of a Set of Compounds of the Formula The resin is mixed with 2-(diisopropylamino)ethylamine iNIP-Lys(e-R)-ANP-T?p 15 (3 eq) and NMM (7.5 eq.). The vessels are shaken for 1 FIG. 5 illustrates the parallel synthesis of a set of 36 hour. The coupling of 2-(diisopropylamino)ethylamine to T-L-X compounds (X=L), where L, is an activated ester the resin and the wash Steps are repeated, to give com (specifically, tetrafluorophenyl ester), L' is an ortho pound VI. nitrobenzylamine group with Li being a methylene group Steps F. J. Same as in Example 5. that links Li, and Lif, Thas a modular structure wherein the Example 9 carboxylic acid group of lysine has been joined to the nitrogen atom of the Libenzylamine group to form an amide Preparation of a Set of Compounds of the Formula bond, and a variable weight component R., (where these Re-Lys(e-iNIP)-ANP-Lys(e-NH)-NH R groups correspond to T as defined herein, and may be introduced via any of the Specific carboxylic acids listed 25 FIG. 7 illustrates the parallel synthesis of a set of 36 herein) is bonded through the C-amino group of the lysine, T-L-X compounds (X=L), where L, is an amine while a mass spec Sensitivity enhancer group (introduced via (specifically, the e-amino group of a lysine-derived moiety), N-methylisonipecotic acid) is bonded through the C-amino L is an ortho-nitrobenzylamine group with Li being a group of the lysine. carboxamido-Substituted alkyleneaminoacylalkylene group Referring to FIG. 5: that links Li, and L., T has a modular structure wherein the Steps A-C. Same as in Example 5. carboxylic acid group of lysine has been joined to the Step D. The resin (compound IV) is treated with piperidine nitrogen atom of the Libenzylamine group to form an amide as described in step B of Example 5 to remove the FMOC bond, and a variable weight component R-se, (where these grOup. R groups correspond to T as defined herein, and may be Step E. The deprotected C-amine on the resin in Step D is 35 introduced via any of the Specific carboxylic acids listed coupled with N-methylisonipecotic acid as described in herein) is bonded through the C-amino group of the lysine, step C of Example 5 to give compound V. while a mass spec sensitivity enhancer group (introduced via Step F. Same as in Example 5. N-methylisonipecotic acid) is bonded through the e-amino Step G. The resin (compounds VI) are treated with group of the lysine. palladium as described in step D of Example 5 to remove 40 Referring to FIG. 7: the Aloc group. Step A. Fmoc-Lys(Boc)-SRAM Resin (available from ACT; Steps H-J. The compounds X are prepared in the same compound I) is mixed with 25% piperidine in DMF and manner as in Example 5. shaken for 5 min. The resin is filtered, then mixed with 25% piperidine in DMF and shaken for 10 min. The Example 8 45 solvent is removed, the resin washed with NMP (2x), Preparation of a Set of Compounds of the Formula MeOH (2x), and DMF (2x), and used directly in step B. R-Glucy-DIAEA)-ANP-T?p Step B. The resin (compound II), ANP (available from ACT; FIG. 6 illustrates the parallel synthesis of a set of 36 3 eq.), HATU (3 eq) and NMM (7.5 eq.) in DMF are added and the collection vessel shaken for 1 hr. The T-L-X compounds (X=L), where L, is an activated ester 50 (specifically, tetrafluorophenyl ester), L' is an ortho solvent is removed and the resin washed with NMP (2x), nitrobenzylamine group with Li being a methylene group MeOH (2x), and DMF (2x). The coupling of I to the resin that links Li, and Lif, Thas a modular structure wherein the and the wash Steps are repeated, to give compound III. C-carboxylic acid group of glutamatic acid has been joined Steps C.-J. The resin (compound III) is treated as in steps to the nitrogen atom of the Libenzylamine group to form an 55 B-I in Example 5 to give compounds X. amide bond, and a variable weight component R., (where these R groups correspond to T as defined herein, and may Example 10 be introduced via any of the Specific carboxylic acids listed Preparation of a Set of Compounds of the Formula herein) is bonded through the ao-amino group of the Re-Lys(e-Tfa)-Lys(e-ilNP)-ANP-T?p glutamic acid, while a mass Spec Sensitivity enhancer group 60 (introduced via 2-(diisopropylamino)ethylamine) is bonded FIG. 8 illustrates the parallel synthesis of a set of 36 through the Y-carboxylic acid of the glutamic acid. T-L-X -compounds (X=L), where L is an activated Referring to FIG. 6: ester (specifically, tetrafluorophenyl ester), L' is an ortho Steps A-B. Same as in Example 5. nitrobenzylamine group with Li being a methylene group Step C. The deprotected resin (compound III) is coupled to 65 that links Li, and L., T has a modular structure wherein the Fmoc-Glu-(OAl)-OH using the coupling method carboxylic acid group of a first lysine has been joined to the described in step C of Example 5 to give compound IV. nitrogen atom of the Libenzylamine group to form an amide US 6,613,508 B1 105 106 bond, a mass spec sensitivity enhancer group (introduced via primary amine selected from Re-Lys(e-iNIP)-ANP-Lys N-methylisonipecotic acid) is bonded through the e-amino (e-NH)-NH. (compounds X from Example 11). The group of the first lysine, a Second lysine molecle has been Solution is mixed overnight at ambient temperature. The joined to the first lysine through the C-amino group of the unreacted amine is removed by ultrafiltration through a first lysine, a molecular weight adjuster group (having a 3000 MW cutoff membrane (Amicon, Beverly, Mass.) trifluoroacetyl structure) is bonded through the e-amino using H2O as the wash Solution (3x). The compounds group of the Second lysine, and a variable weight component XIII are isolated by reduction of the volume to 100 mL. Rise, (where these R groups correspond to T as defined herein, and may be introduced via any of the Specific Example 13 carboxylic acids listed herein) is bonded through the Demonstration of the simultaneous Detection of C.-amino group of the second lysine. Referring to FIG. 8: Multiple Tags by Mass Spectrometry Steps A-E. These Steps are identical to Steps A-E in Example 5. This example provides a description of the ability to Step F. The resin (compound VI) is treated with piperidine Simultaneously detect multiple compounds (tags) by mass as described in step B in Example 5 to remove the FMOC 15 Spectrometry. In this particular example, 31 compounds are grOup. mixed with a matrix, deposited and dried on to a Solid Step G. The deprotected resin (compound VII) is coupled to Support and then desorbed with a laser. The resultantions are Fmoc-Lys(Tfa)-OH using the coupling method described then introduced in a mass spectrometer. in step C of Example 5 to give compound VIII. The following compounds (purchased from Aldrich, Steps H-K. The resin (compound VIII) is treated as in steps Milwaukee, Wis.) are mixed together on an equal molar F-J in Example 5 to give compounds X. basis to a final concentration of 0.002 M (on a per compound) basis: benzamide (121.14), nicotinamide Example 11 (122. 13), pyrazinamide (123.12), 3-amino-4- pyrazolecarboxylic acid (127.10), 2-thiophenecarboxamide Preparation of a Set of Compounds of the Formula 25 (127.17), 4-aminobenzamide (135.15), tolumide (135.1.7), Re-Lys(e-iNIP)-ANP-5'-AH-ODN 6-methylnicotinamide (136.15), 3-aminonicotinamide (137. 14), nicot in amide N-oxide (138.12), FIG. 9 illustrates the parallel synthesis of a set of 36 3-hydropicolinamide (138.13), 4-fluorobenzamide (139.13), T-L-X compounds (X=MOI, where MOI is a nucleic acid cinnamamide (147.18), 4-methoxybenzamide (151.17), 2,6- fragment, ODN) derived from the esters of Example 5 (the difluorbenzamide (157.12), 4-amino-5-imidazole same procedure could be used with other T-L-X com carboxyamide (162.58), 3,4-pyridine-dicarboxyamide pounds wherein X is an activated ester). The MOI is (165. 16), 4-ethoxyben Zamide (165. 19), 2,3- conjugated to T L through the 5' end of the MOI, via a pyrazinedicarboxamide (166.14), 2-nitrobenzamide phosphodiester-alkyleneamine group. (166.14), 3-fluoro-4-methoxybenzoic acid (170.4), indole Referring to FIG. 9: 35 3-acetamide (174.2), 5-acetylsalicylamide (179.18), 3.5- Step A. Compounds XII are prepared according to a dimethoxybenzamide (181.19), 1-naphthaleneacetamide modified biotinylation procedure in Van Ness et al., (185.23), 8-chloro-3,5-diamino-2-pyrazinecarboxyamide Nucleic Acids Res., 19,3345 (1991). To a solution of one (187.59), 4-trifluoromethyl-benzamide (189.00), 5-amino-5- of the 5'-aminohexyl oligonucleotides (compounds XI. phenyl-4-pyrazole-carboxamide (202.22), 1-methyl-2- 36, 1 mg) in 200 mM sodium borate (pH 8.3, 250 mL) is 40 benzyl-malonamate (207.33), 4-amino-2,3,5,6- added one of the Tetrafluorophenyl esters (compounds tetrafluorobenzamide (208.11), 2,3-napthlenedicarboxylic X from Example A, 100-fold molar excess in 250 mL acid (212.22). The compounds are placed in DMSO at the of NMP). The reaction is incubated overnight at ambient concentration described above. One ul of the material is then temperature. The unreacted and hydrolyzed tetrafluo mixed with alpha-cyano-4-hydroxy cinnamic acid matrix rophenyl esters are removed from the compounds XII 45 (after a 1:10,000 dilution) and deposited on to a solid by Sephadex G-50 chromatography. Stainless Steel Support. The material is then desorbed by a laser using the Protein Example 12 TOF Mass Spectrometer (Bruker, Manning Park, Mass.) and Preparation of a Set of Compounds of the Formula the resulting ions are measured in both the linear and R-Lys(e-iNIP)-ANP-Lys(e-(MCT-5'-AH-ODN))- 50 reflectron modes of operation. The following m/z values are NH observed (FIG. 11): 121.1->benzamide (121.14) FIG. 10 illustrates the parallel synthesis of a set of 36 122.1->nicotinamide (122.13) T-L-X compounds (X=MOI, where MOI is a nucleic acid 123.1->pyrazinamide (123.12) fragment, ODN) derived from the amines of Example 11 55 124.1 (the same procedure could be used with other T-L-X 125.2 compounds wherein X is an amine). The MOI is conjugated 127.3->3-amino-4-pyrazolecarboxylic acid (127.10) to T-L through the 5' end of the MOI, via a phosphodiester 127.2->2-thiophenecarboxamide (127.17) alkyleneamine group. 135.1->4-aminobenzamide (135.15) Referring to FIG. 10: 60 135.1->tolumide (135.17) Step A. The 5'-6-(4,6-dichloro-1,3,5-triazin-2-ylamino) 136.2->6-methylnicotinamide (136.15) hexyloligonucleotides XII are prepared as described 137.1->3-aminonicotinamide (137.14) in Van Ness et al., Nucleic Acids Res., 19, 3345 (1991). 138.2->nicotinamide N-oxide (138.12) Step B. To a solution of one of the 5'-6-(4,6-dichloro-1,3, 138.2->3-hydropicolinamide (138.13) 5-triazin-2-ylamino)hexyloligonucleotides (compounds 65 139.2->4-fluorobenzamide (139.13) XII) at a concentration of 1 mg/ml in 100 mM Sodium 1402 borate (pH 8.3) was added a 100-fold molar excess of a 147.3->cinnamamide (147.18)