Molecular Tools for Analysis

Deirdre O’ Meara

Doctoral Thesis

Department of Biotechnology Royal Institute of Technology

Stockholm 2001 © Deirdre O’ Meara

Department of Biotechnology Royal Institute of Technology, KTH SCFAB 106 91 Stockholm Sweden

Printed at Universitetsservice US AB Box 700 14 100 44 Stockholm, Sweden

ISBN 91-7283-161-8 Deirdre O’ Meara (2001): Molecular Tools for Nucleic Acid Analysis. Department of Biotechnology, Royal Institute of Technology, KTH, SCFAB, Stockholm, Sweden.

ISBN 91-7283-161-8

Abstract

Nucleic acid technology has assumed an essential role in various areas of in vitro diagnostics ranging from infectious disease diagnosis to human . An important requirement of such molecular methods is that they achieve high sensitivity and specificity with a fast turnaround time in a cost-effective manner. To this end, in this thesis we have focused on the development of sensitive nucleic acid strategies that facilitate automation and high-throughput analysis.

The success of nucleic acid diagnostics in the clinical setting depends heavily on the method used for purification of the nucleic acid target from biological samples. Here we have focused on developing strategies for hybridisation capture of such templates. Using biosensor technology we observed that the hybridisation efficiency could be improved using contiguous oligonucleotide probes which acted co-operatively. By immobilising one of the probes and annealing the second probe in solution, we achieved a marked increase in target capture due to a base stacking effect between nicked oligonucleotides and/or due to the opening up of secondary structure. Such co- operatively interacting modular probes were then combined with bio-magnetic bead technology to develop a capture system for the extraction of hepatitis C RNA from serum. Viral capture with such co-operatively interacting probes extracted 2-fold more target as capture with only a single probe achieving a similar sensitivity to the conventional extraction protocol. An analogous strategy was designed to enrich for sequencing products prior to removing sequencing reagents and template DNA which interfere with the separation and detection of sequencing ladders, especially in the case of capillary gel electrophoresis. This protocol facilitates high throughput clean-up of cycle sequencing reactions resulting in accurate sequence data at a low cost, which is a pre-requisite for large-scale genome sequencing products.

Currently, a large effort is directed towards differential sequencing to identify mutations or polymorphisms both in the clinical laboratory and in medical genetics. Inexpensive, high throughput methods are therefore required to rapidly screen a target nucleic acid for sequence based changes. In the clinical setting, sequence analysis of human immunodeficiency virus (HIV- 1) is used to determine the presence of drug resistance mutations. Here we describe a bioluminometric pyrosequencing approach to rapidly screen for the presence of drug resistance mutations in the protease gene of HIV-1. This sequencing strategy can analyse the protease gene of HIV-1 from eight patients in less than an hour and such non-gel based approaches should be useful in the future in a clinical setting for rapid, robust mutation detection.

Microarray technology facilitates large-scale mutation/polymorphism detection and here we developed a microarray based single nucleotide polymorphism (SNP) genotyping strategy based on apyrase mediated allele specific extension (AMASE). AMASE exploits the fact that mismatched primers exhibit slower reaction kinetics than perfectly matched primers by including a nucleotide degrading enzyme (apyrase) which results in degradation of the nucleotides before the mismatched primer can be extended. We have successfully typed 200 (14% were incorrect without apyrase) by AMASE which cluster into three distinct groups representing the three possible genotypes. In the future, AMASE on DNA microarrays should facilitate association studies where an accuracy >99% is required.

Keywords: nucleic acid capture, modular probes, biosensor, bio-magnetic separation, hepatitis C, sequencing, pyrosequencing, mutation detection, HIV-1, drug resistance, SNP, allele-specific extension, apyrase, genotyping.

© Deirdre O’ Meara, 2001 LIST OF PUBLICATIONS

This thesis is based on the following papers, which in the text will be referred to by their Roman numerals:

I. O’ Meara D., Nilsson P., Nygren P.Å., Uhlén M. and Lundeberg J. 1998. Capture of single-stranded DNA assisted by oligonucleotide modules. Analytical 255, 195-203.

II. Nilsson P., O’ Meara D., Edebratt F., Persson B., Uhlén M., Lundeberg, J. and Nygren P.Å. 1999. Quantitative investigation of the modular primer effect for DNA and peptide nucleic acid hexamers. Analytical Biochemistry 269, 155-161

III. O’ Meara D., Yun Z., Sönnerborg A. and Lundeberg J. 1998. Cooperative oligonucleotides mediating direct capture of hepatitis C virus RNA from serum. Journal of Clinical Microbiology 36, 2454-2459.

IV. Blomstergren A., O’ Meara D., Lukacs M., Uhlén M. and Lundeberg J. 2000. Cooperative oligonucleotides in purification of cycle sequencing projects. Biotechniques 29, 352-363.

V. O’ Meara D., Wilbe K, Letiner T, Hejdeman B., Albert J and Lundeberg J. 2001. Monitoring resistance to human immunodeficiency virus type 1 protease inhibitors by pyrosequencing. Journal of Clinical Microbiology 39, 464-473.

VI. O’ Meara D., Ahmadian A., Odeberg J and Lundeberg J. SNP typing by discriminatory allele extension on DNA microarrays. Submitted. Table of contents

Introduction

1 In vitro amplification of nucleic acids...... 1

1.1 Sample preparation...... 2 1.2 Amplification ...... 3 1.2.1 Target amplification...... 4 1.2.2 Probe amplification...... 5 1.2.3 Signal amplification...... 6 1.3 Analysis and detection...... 7 1.3.1 Heterogeneous formats ...... 8 1.3.2 Homogeneous formats ...... 12

2 Nucleic acid sequencing ...... 13

2.1 Chain terminating techniques ...... 14 2.1.1 Sanger sequencing ...... 14 2.2 Separation and detection...... 16 2.2.1 Automated slab-based sequencers...... 16 2.2.2 Capillary electrophoresis...... 16 2.2.3 Ultrathin gel electrophoresis ...... 16 2.2.4 Microfabricated DNA sequencers ...... 17 2.3 Alternative sequencing strategies...... 17 2.3.1 Sequencing by hybridisation ...... 17 2.3.2 Allele-specific extension...... 18 2.3.3 Mini-sequencing...... 19 2.3.4 ...... 21

3 Chip-based analysis...... 22

3.1 DNA Biosensors...... 23 3.1.1 Ligand immobilisation...... 23 3.1.2 Signal detection ...... 24 3.2 DNA microarrays ...... 25 3.2.1 Array platforms ...... 26 3.2.2 Hybridisation and detection ...... 29 Present Investigation

4Nucleic acid hybridisation using modular primers...... 31

4.1 Investigation of the modular primer effect using biosensor technology (I, II) ...... 32 4.2 Development of capture protocols using modular primers (III, IV) ...... 35

5 Analysis of mutations and polymorphisms ...... 41

5.1 Monitoring drug resistance mutations in HIV-1 by pyrosequencing (V)....42 5.2 SNP typing on DNA microarrays (VI)...... 45

6 Concluding remarks...... 50

Abbreviations...... 51

Acknowledgements ...... 52

References ...... 54

Original papers (I-VI) Molecular Tools for Nucleic Acid Analysis

Introduction

Historical perspective One of the most important advances in the field of biology in the twentieth century has been the understanding that DNA is the carrier of genetic information. However until the 1940’s, most biologists believed that were the chief carriers of hereditary material until the primary experiments by Avery and co-workers in 1944 showed that DNA was the “transforming factor” that conferred virulence upon nonvirulent pneumococci (Avery et al., 1944). The landmark paper in 1953 by Watson and Crick which correctly proposed the double-stranded structure of DNA lead to an understanding of gene function in molecular terms and marked the beginning of the molecular revolution (Watson and Crick, 1953). Since then important milestones in nucleic acid based technology such as nucleic acid hybridisation, DNA sequencing reactions, the polymerase chain reaction and automated fluorescent DNA sequencers have made it possible to streamline and automate most of the processes required for nucleic acid analysis. These technologies have led to the late twentieth/early twenty-first century being called the “Genomic Era” which has been characterised by developments occurring at a tremendous pace allowing whole genomes to be sequenced effectively (Lee and Lee, 2000). Indeed these technologies have led to the release of the human genome in February 2001, four years ahead of schedule. Advances in nucleic acid analysis together with publication of whole genome sequence data for a number of organisms have had widespread applications and in this thesis I will outline three nucleic acid/molecular tools (in vitro amplification, sequencing and DNA chip hybridisation) that have had a major impact in the field of diagnostics and genome sequencing (Figure 1).

1 In vitro amplification of nucleic acids

The ability to specifically amplify very small amounts of nucleic acid has made amplification techniques invaluable tools in areas such as diagnostics, genetic analysis, forensics and environmental testing. The amplification process itself constitutes only part of the complete analysis, as a sample preparation step must usually be performed prior to amplification, which is followed by a detection step.

1 D. O’ Meara

Nucleic acid

Amplification

Detection Chip-based analysis

(Qualitative, (Hybridisation kinetics, Quantitative) Sequencing Expression analysis, Large scale diagnostics) (Whole genomes, Expression profiling, Diagnostics)

Figure 1. Nucleic acid molecular tools and their applications.

1.1 Sample preparation The success of nucleic acid amplification depends greatly on the method used for purification of nucleic acids. The fundamental goals of sample preparation include, release of target, stabilisation from degradation, removal of amplification inhibitors and concentration and re-suspension of target in an amplification compatible buffer. The two main methods of sample preparation involve lysis or target capture or a combination thereof. Simple lysis methods typically use detergents such as sodium dodecyl sulphate or Triton X-100 and chaotropes such as guanidinium isothiocyanate and proteases (Greenfield and White, 1993). If the number of infectious organisms is low, it is frequently necessary to remove amplification inhibitors by additional phenol-chloroform extraction steps and to concentrate target by alcohol precipitation. Target capture involves the binding of nucleic acids in chaotropic solutions to magnetic particles (coated with oligonucleotide dT or a specific probe), silica or glass matrixes (Greenfield and White, 1993). Impurities and amplification inhibitors are removed by magnetic separation, centrifugation or filtration and the nucleic acids are eluted in an aqueous environment compatible with amplification. Target capture offers the advantage of automation, universality for all specimens, concentration of target and removal of inhibitors without the use of hazardous reagents. There are a number of commercially available sample-extraction kits, many of which now exist in

2 Molecular Tools for Nucleic Acid Analysis

an automated version capable of processing up to 96 extractions in a few hours from a variety of samples such as serum, plasma, blood, cells, sputum, tissue or cerebrospinal fluid (Heal, 2000). Whole organisms can also be captured by immunomagnetic separation (IMS) (Olsvik et al., 1994) and following this enrichment step, the cell is lysed and nucleic acid released. Sample preparation can also be integrated with amplification and detection on a microfluidic platform in the so-called “lab-on-a- chip” configuration (Waters et al., 1998; Anderson et al., 2000). Such miniaturisation of the analytical instrument will allow transport of the laboratory to the sample source, enabling point-of-care medical diagnostics and environmental testing.

1.2 Amplification Amplification at the molecular level can be used to increase the concentration of a specific target or probe or it can be used to increase the number of detection signals that will arise from each target (Table 1).

Table 1. Properties of various nucleic acid amplification technologies (modified from Schweitzer and Kingsmore, 2001).

Property PCR LCR NASBA bDNA Invader RCA

DNA target amplification √ xxxxLimited RNA target amplification √ x √ x x Limited Probe amplification x √ x √ x √ Signal amplification x x x √ √ x Multiplexing Little Little Little x x √ Isothermal x x √√√√ Amplification on microarrays xxxx√√ Sensitivity (copies) <10 100 100 500 600 1 Range (logs) 535347 Specificity (allele 50 5000 50 10 3000 1x105 discrimination factor) (PCR;polymerase chain reaction, LCR;ligase chain reaction, NASBA;nucleic acid sequence based amplification, RCA;rolling-circle amplification).

3 D. O’ Meara

1.2.1 Target amplification The most sensitive molecular techniques use an enzymatic step to amplify the target nucleic acid before analysis of the specific sequence. A variety of target amplification techniques have been developed but the polymerase chain reaction (PCR), developed by Kary Mullis in 1983, is the most widely used. The PCR is based on the ability of DNA polymerase to copy a strand of DNA by elongation of complementary strands initiated from a pair of oligonucleotide primers (Saiki et al., 1985; Mullis and Faloona, 1987). During PCR, the amplification goes through an exponential phase allowing target amplification of one billion-fold or more. The PCR has been systemically improved in many ways since its discovery and a number of these modifications are listed in Table 2.

Table 2. Modifications of PCR

Modification References

Hot-start (Chou et al., 1992) Nested-PCR (Haqqi et al., 1988) (Mullis and Faloona, 1987) Touchdown PCR (Don et al., 1991) RT-PCR (Erlich et al., 1991) Multiplex PCR (Chamberlain et al., 1988) Asymmetric PCR (Innis et al., 1988; Shyamala and Ames, 1989) Real-time PCR (Holland et al., 1991; Tyagi and Kramer, 1996; Nazarenko et al., 1997; Wittwer et al., 1997a; Whitcombe et al., 1999) Miniaturisation (Wittwer et al., 1997a; Kopp et al., 1998)

Two important determinants of PCR performance are primer design and annealing temperature and a number of computer applications are available to assist with primer construction. Guidelines for the development of new PCR assays and information on primer design software have recently been reviewed (Johnson, 2000). The extreme sensitivity of PCR also results in a potential risk for false-positive signals due to contamination. A number of suggestions exist to minimise this risk (Kwok and Higuchi, 1989; Fredricks and Relman, 1999) and inclusion of various controls will help identify any potential false-positive results. If PCR is being used for diagnostic purposes, then a number of process controls must be incorporated to ensure that negative results are indeed due to absence of detectable target. Commercial

4 Molecular Tools for Nucleic Acid Analysis

systems for PCR based detection of bacterial (e.g. Chylamidia trachomatis and Mycobacterium tuberculosis) and viral targets (e.g. hepatitis C virus [HCV] and human immunodeficiency virus [HIV-1]) are manufactured by Roche Molecular Systems. Another DNA target amplification assay that has been commercialised is the strand displacement assay (SDA) (Walker et al., 1992). A typical PCR involves 25-35 cycles usually taking two-three hours and is often the rate-limiting step in nucleic acid analysis. The advent of miniaturisation has facilitated improvements in PCR instrumentation allowing a reduction in sample volume, reduced reagent cost, shorter cycling times and higher-throughput, with automation offering the possibility of “hands-free” analysis. Reaction times have been reduced to 90 seconds by adapting PCR to run in a continuous flow system on a chip (Kopp et al., 1998), while real-time PCR has been performed in a rapid air thermal cycler in 10-20 minutes (Wittwer et al., 1997a). Affymetrix have also developed a miniature device that integrates sample extraction, nested PCR and mutation detection of HIV-1 by array hybridisation (Anderson et al., 2000). based amplification systems such as self sustained replication (3SR) (Guatelli et al., 1990) and nucleic acid sequence based analysis (NASBA) (Compton, 1991) are alternative target amplification strategies, which use RNA polymerase, reverse transcriptase (RT) and RNase H to amplify RNA. These are isothermal techniques that are technically less robust than PCR but are useful when selective amplification of RNA is desired. A kit for HIV-1 and cytomegalovirus (CMV) detection based on NASBA is commercially available (Organon Teknika).

1.2.2 Probe amplification

The ligase chain reaction (LCR) (Barany, 1991) which is based on the oligonucleotide-ligation assay (OLA) (Landegren et al., 1988) is a probe amplification technique that uses DNA ligase to join two probe ends that are juxtaposed on the target DNA. Repeating the process results in a exponential accumulation of ligation products, which can be detected using the functional groups attached to the probes. A modified technique called Gapped LCR, which uses a DNA polymerase and a ligase reduces the risk of false-positive results from blunt-end ligation (Ayyadevara et al., 2000). Since only the probe is amplified, LCR is limited to applications where identification of a signature nucleotide sequence in the sample is sufficient for target identification. However, the method is quite sensitive to

5 D. O’ Meara

mismatches and has been useful in detection of point mutations at the ligation site (Landegren et al., 1988). Padlock probes are a variant of the ligation principle, where if there is complete homology to the target, the outer ends of a linear probe are ligated to form a circularised probe intertwined with the target DNA (Nilsson et al., 1994). This technology is compatible with rolling-circle amplification (RCA) systems, allowing selective detection of the ligated circles (Baner et al., 1998) (Figure 2). This new amplification strategy allows multiplex amplification of circular probes and facilitates genotyping through allele-specific ligation of open circular probes (Lizardi et al., 1998; Faruqi et al., 2001). Immobilised linear allele-specific ligated probes can also be used to prime RCA allowing the allelic status of the target to be determined (Lizardi et al., 1998). A number of other probe amplification techniques also exist and include QB replicase (Kramer and Lizardi, 1989) and cycling probe technology (CPT) (Duck et al., 1990).

s/s DNA target s/s target s/s DNA target RCA Ligated primer padlock DNA Open circle probe Ligated padlock polymerase

s/s DNA generated by rolling circle

Figure 2. Probe circularisation by ligation and amplification by RCA. A padlock probe is hybridised to target DNA and if perfectly matched is ligated by DNA ligase. A primer complementary to the padlock probe then initiates replication of the probe via a rolling-circle mechanism using a strand displacing DNA polymerase. This generates a single-strand product containing many copies in tandem of the sequence complementary to the circular probe. To achieve further amplification, highly branched RCA (HRCA) can be carried out by adding a second primer which initiates copying of the RCA product.

1.2.3 Signal amplification

An alternative approach to target and probe amplification is to amplify the signal generated by detection of the target sequence. Branched DNA signal amplification (bDNA) was developed by Chiron Corporation and uses multiple DNA probes, first for capture and subsequently for increasing the number of potential binding sites from which the reporter signal is eventually generated (Urdea et al., 1991). The signal generated is in proportion to the amount of target and thus DNA quantification can be

6 Molecular Tools for Nucleic Acid Analysis determined from a calibration curve. This approach is less sensitive than target amplification but has several advantages; the risk of contamination is minimal and bDNA assays are capable of detecting a broader range of viral genotypes than PCR based assays, which have restricted specificity due to their limited number of target sequences. This is potentially a problem when detecting viral RNA genomes which can show high sequence heterogeneity (Hawkins et al., 1997). Another signal amplification technology is the Invader assay, which involves multiple cleavage cycles of an overlapping probe complex (Figure 3). This cleavage results in the release of a signal probe that is either detected directly (Lyamichev et al., 1999) or triggers cleavage of a labelled probe releasing 107 fluourescent events for each target molecule (Hall et al., 2000). The endonuclease that performs cleavage is sensitive to the presence of a mismatch at the overlapping site which is utilised for single nucleotide polymorphism (SNP) typing and mutation detection (Hessner et al., 2000).

5' Flap Cleavage site Invader probe Primary probe 5' 3' 3' 5' Target

Cleavage site Q F FRET probe Cleaved flap

Flap/FRET bridging probe

Figure 3. Schematic representation of the Invader assay. The upstream invasive probe and the downsteam primary probe hybridise to the target forming an overlapping hybridisation complex that is a substrate for the cleavage enzyme. The primary probe contains two regions; an analyte specific region and a non-complementary arm which is cleaved from the probe when the overlapping probe complex is formed. This cleaved flap is then used to drive a secondary invasive reaction resulting in the separation of the fluorescence resonance energy transfer (FRET) pair which gives rise to a fluourescent signal.

1.3 Analysis and detection Nucleic acid based analysis can be carried out in either a heterogeneous or homogenous manner. Heterogeneous assays require further manipulation once the amplification reaction is complete which may involve aliquoting a sample for a number of analytical procedures or adding some component of a signal detection system. Homogenous assays require no further manipulation once the reaction is set

7 D. O’ Meara

up as signal generation and detection are carried out in a single closed tube when the reaction is complete or continuously during the amplification (real-time detection). The quantity of samples and the number of analyses required from one sample usually dictate the choice between homogeneous and heterogeneous systems. When a large number of samples are to be analysed, heterogeneous chip-based assays become more attractive.

1.3.1 Heterogeneous formats PCR products may be analysed qualitatively or quantitatively depending on the objective of the experiment. Detection is usually linked to gel electrophoresis, hybridisation/capture to a solid support (microtitre plate, magnetic beads or DNA chip) or enzymatic extensions such as single base extension (SBE) or pyrosequencing.

(i) Qualitative analysis Qualitative analysis can be defined as specific product detection (e.g. detection of a pathogen) or determination of sequence variation (e.g. detection of a viral subtype). Many of the methods used to perform product detection and sequence verification overlap and are listed in Table 3.

Table 3. Methods for qualitative analysis of amplification products

Detection of Category Techniques specific sequence Basis for distinction product variation

Electrophoretic Agarose gel electrophoresis √ x/√ Size Sanger sequencing √ √ Nucleotide sequence SSCP* n/a† √ Mobility shift DGGE n/a √ Mobility shift HA n/a √ Mobility shift

Chromatographic DHPLC n/a √ Mobility shift

Hybridisation Southern blotting √ √ Probe & restriction sites Reverse dot blot √ x/√ Probe sites ASO √ √ Probe sites

Enzymatic RFLP n/a √ Restriction sites CFLP n/a √ Cleavage sites Pyrosequencing √ √ Nucleotide sequence Mini-sequencing √ √ Nucleotide sequence

Mass Spec MALDI-TOF √ √ Nucleotide sequence

*See page 51 for abbreviations. †n/a (not applicable)

8 Molecular Tools for Nucleic Acid Analysis

Specific product detection: Gel electrophoresis allows for convenient product analysis allowing a yes/no answer with an estimate of product quality, quantity and size. PCR product detection can also be performed via hybridisation assays which specifically detect the amplified product e.g. hybridisation in a reverse dot-blot format where a capture probe is immobilised followed by hybridisation of the denatured amplicon (Saiki et al., 1989). Detection is performed either directly by detection of labels incorporated in the amplicon e.g. fluorophores, or indirectly in a sandwich format through immunoenzymatic methods or streptavidin conjugated enzymes. This format has largely superseded the original dot-blot format where amplified products labelled with biotin/digoxigenin were immobilised via biotin/digoxigenin antibodies coupled to a solid support, which was followed by detection (directly or indirectly) with an oligonucleotide probe. Amplified products can also be immobilised by performing PCR with a PCR primer covalently coupled to a solid support (Oroskar et al., 1996). Determination of sequence variation: Identification of sequence variation is used in molecular typing of bacterial or viral nucleic acids and for epidemiological pathogenesis and monitoring disease progression. Here a brief outline is given of some of the methods used for the analysis of sequence variation (Table 3). DNA sequencing is recognised as the “golden standard” for establishing the identity of known and unknown sequence variants. However, alternate electrophoretic based mutation detection techniques such as single-strand conformation polymorphism (SSCP), denaturing gradient gel electrophoresis (DGGE) and heteroduplex analysis (HA) can be used to scan a sequence for alterations based on the fact that sequence differences cause mobility shifts. Denaturing high performance liquid chromatography (DHPLC) can be used instead of electrophoresis to determine mobility shifts, facilitating analysis in an automated rapid fashion. Restriction fragment length polymorphism (RFLP) and cleavage fragment polymorphism (CFLP) are based on variations in the size and number of fragments generated after digestion with restriction and cleavage (endonuclease or ribonuclease) enzymes respectively and can be used for comparative purposes in epidemiological and forensic studies and to characterise bacterial and viral subtypes. Enzymatic reactions such as OLA, pyrosequencing and SBE can also be carried out to screen for point mutations and SNPs, with the later now being performed in an array format (Pastinen et al., 1997; Hirschhorn et al., 2000). The recent publication by

9 D. O’ Meara

Kristensen et al. should be consulted for a comprehensive review of these techniques (Kristensen et al., 2001). Hybridisation with allele-specific oligonucleotides (ASO) is a widely applied method for detecting point mutations and SNPs and is based on determining differences in hybridisation between a perfect match and a mismatch. A line probe assay applying this technology for HCV typing (Stuyver et al., 1993; Le Pogam et al., 1998) and for detection of drug resistance mutations in HIV (Stuyver et al., 1997) is commercially available. With the advent of hybridisation based DNA chip technology, ASO hybridisation can be performed in a high-throughput manner and these oligonucleotide arrays have been used to screen for mutations in BRCA1 (Hacia et al., 1996), the cystic fibrosis gene (Cronin et al., 1996) and HIV-1 (Vahey et al., 1999) and for SNP identification and typing (Wang et al., 1998). An emerging new technology is the use of the mass spectrometer for the analysis of small DNA molecules. Its primary advantages are the speed and accuracy of sequence determination and the fact that no labels are needed for detection (Graber et al., 1999).

(ii) Quantitative analysis The polymerase chain reaction can also be used to estimate the quantity of a particular target DNA or RNA and is used to determine the load of an infecting pathogen, to monitor therapeutic response and for RNA expression analysis. However, since PCR is an exponential reaction, this means that small variations in amplification efficiency can yield large changes in the amount of PCR product generated. This characteristic together with the fact that later cycles of PCR exhibit a plateau effect makes it difficult to use PCR to obtain quantitative data. However there are several methods of quantitative PCR/RT-PCR that have been described in the literature and the strategy chosen often depends on whether an absolute or relative quantitation is required. The most widely used approach for determining absolute target amounts is quantitative competitive PCR (QC-PCR). In QC-PCR, standards are co-amplified with target nucleic acid to correct for tube-to-tube variations in amplification efficiency and to determine absolute amounts of initial target. These standards are usually external references that are amplified with the same primers as the native target to ensure that the reference and target sequences are amplified with equal efficiency. Generally, these standards are constructed by modifying a clone of the target sequence so that it

10 Molecular Tools for Nucleic Acid Analysis can be distinguished either by size difference or by insertion/removal of a restriction enzyme site. Quantitative competitive PCR relies on the co-amplification of a dilution series of these standards with a constant amount of native target, facilitating analysis in the exponential or plateau phase (Becker-Andre and Hahlbrock, 1989; Gilliland et al., 1990; Lundeberg et al., 1991) . Since the standard and target compete for primers and enzymes there will be equivalent amplification when both molecules are present in equimolar concentrations at the start of the reaction. By plotting the results (log standard signal/target signal versus log amount of standard added) the equivalence point can be calculated allowing determination of the initial amount of target. Limiting dilution experiments using nested PCR combined with Poisson statistics can also be used for an estimation of the initial number of target molecules present, without quantitation of the actual amount of PCR product formed (Brinchmann et al., 1991; Lundeberg et al., 1991; Sykes et al., 1992). This is based on the theory that one target molecule should give rise to a PCR product and to achieve this, multiple replicates of serial dilutions are assayed. This involves handling many PCR tubes and since nested PCR is used, this approach is susceptible to contamination resulting in false-positive end point results. However an advantage of this method is that target concentration can be roughly estimated without the need for standards or expensive equipment. Some of these drawbacks may be overcome in the future by performing the PCR in a nanolitre scale with real-time detection thereby avoiding nested PCR (Kalinina et al., 1997). This approach may also have an advantage when quantifying viral targets as small mismatches in the primer sequences are unlikely to affect quantitation to the same extent as mismatches in QC-PCR assays (Hawkins et al., 1997). Homogeneous quantitation methods based on amplification kinetics, which allow simultaneous amplification and quantitation in real-time have been developed and are discussed below and further in a recent review (Bustin, 2000). These strategies allow both relative and absolute quantitation, eliminating the tedious preparation of dilution series and have resulted in a robust, reproducible, quantitative PCR assay. In addition a number of commercial quantitation assays based on PCR, NASBA and bDNA are available for microbial and viral pathogens. These kits are widely used in clinical laboratories for HCV and HIV-1 quantitation and even though they use different

11 D. O’ Meara

extraction and amplification protocols, they show similar performance characteristics (Lin et al., 1998).

1.3.2 Homogeneous formats Homogenous assay formats simplify detection of amplification products and minimise problems of cross contamination and are useful for gene quantitation, RNA expression analysis and detection of small numbers of mutations in nucleic acid targets. These homogeneous assays can be characterised as non-specific or specific. Non-specific methods detect the presence or absence of amplicons but provide no information about the nature of the amplified product. Intercalating agents such as ethidium bromide, SBYR green or labelled hairpin primers (Sunrise primers) (Nazarenko et al., 1997) are examples of non-specific methods that are used to determine the amount of PCR product generated. Since these methods are prone to “false-positives”, in that undesirable product such as primer-dimer and non-specific products can increase fluorescence, it is desirable to use specific methods that probe the amplification product. Such systems are based on the phenomenon of FRET and have exploited various probe designs to allow the amplification to be monitored at the endpoint (post-amplification) or in real-time. Fluorescence resonance energy transfer requires a donor and an acceptor fluorophore and involves excitation of the donor, which results in energy transfer to the acceptor. In response, the acceptor can either emit light of a longer wavelength or dissipate the energy in the form of heat (quenching). Examples of such techniques are exonuclease probes (TaqMan) (Holland et al., 1991), hairpin probes (Molecular Beacons) (Tyagi and Kramer, 1996), hairpin primers (Scorpion primers)(Whitcombe et al., 1999) and binary hybridisation probes (Wittwer et al., 1997b). The hairpin primers and probes are more applicable to multiplexing than the TaqMan format as they can used a wider variety of FRET pairs, as the donor and quencher do not need to have overlapping spectra probably because they are brought into close proximity in the hairpin loop conformation (Tyagi et al., 1998). At least four new instruments combining a thermal cycler, laser and a detection/software system to automate detection and quantitation of nucleic acids are available commercially and perform one or all of these assays (ABI Prism 7700 [PE Biosystems], Light Cycler [Roche], SmartCycler [Cepheid] and I-Cycler [BioRad]). These real-time PCR systems facilitate quantitative analysis because the number of

12 Molecular Tools for Nucleic Acid Analysis

cycles needed to reach a threshold amount of PCR product is directly proportional to the target copy number (Heid et al., 1996). They have been used for RNA expression studies, gene copy number determination, detection of pathogens and for the analysis of polymorphisms/mutations (Bustin, 2000). Sequence variation among viral targets such as HCV and HIV can present problems for these probe based real-time detection methods (Takeuchi et al., 1999) and consequently non-specific detection formats may simplify the reaction if a conserved probe sequence is difficult to find. Indeed, DNA specific dyes have been used with these instruments (Light Cycler) for quantitation using re-annealing kinetics and melting curve analysis for discrimination of sequence variants (de Silva and Wittwer, 2000).

2 Nucleic acid sequencing

The past two decades have seen nucleic acid sequencing (DNA and RNA) evolve from complicated laboratory procedures performed by skilled researchers, to automated familiar techniques available to biologists and nonbiologists alike, with the Human Genome Project (HGP) having being the main driving force behind this technological advance. Among the most significant applications of this technology is the sequencing of whole genomes e.g. baker’s yeast (Saccharomyces cervisiae) (15 Mb) (Goffeau et al., 1996); the nematode worm (Caenorhabditis elegans) (97 Mb) (The C. elegans Sequencing Consortium, 1998); Drosophila melanogaster (120 Mb) (Adams et al., 2000); Arabidopsis thaliana (125 Mb) (The Arabidopsis Genome Initative, 2000); and the human genome (3000 Mb) (McPherson et al., 2001; Venter et al., 2001). These whole genome sequences facilitate investigation of gene expression patterns and help understand evolution and the causation of disease. In addition to the sequencing of whole genomes, DNA sequencing is also used for gene expression profiling which involves the sequencing of Expressed Sequencing Tags (EST) or short tags generated by Serial Analysis of Gene Expression (SAGE). DNA sequencing is also used for discovery and typing of mutations (Berg et al., 1995) and polymorphisms [e.g. SNP, insertions and deletions] within a genome (Mullikin et al., 2000). Indeed, The SNP Consortium (TSC) have to date (August 2001) mapped over one million SNPs by conventional DNA sequencing. This information has been

13 D. O’ Meara

complied in a SNP database facilitating the use of resequencing techniques such as mini-sequencing reactions etc. to rapidly screen for particular SNPs. Currently there is a huge focus on SNPs as it is believed that they will provide the genetic markers for the next era in human genetics due to their abundance throughout the genome and the relative ease of developing automated large-scale SNP typing assays. DNA sequencing can obviously also be used in many other areas such as sub-typing of bacteria (Spratt, 1999) and viruses (Arens, 2001) by sequencing variable regions of their genome e.g. sequencing of the 5′ non-translated region (NTR) PCR products from the Amplicor quantitation kit has been used for subtype classification of HCV (Germer et al., 1999). Sequencing can also facilitate identification of previously identified or unknown pathogens and consequently it can be used for molecular epidemiological studies and for investigations into pathogenesis, drug resistance mutations and disease progression.

2.1 Chain terminating techniques Despite the increased demand for high-throughput genome sequencing efforts, the basic technique for DNA sequencing developed more than twenty years ago is still in use today. This method (Sanger sequencing) developed by Sanger in 1977 performs DNA sequencing by polymerase mediated dideoxy chain termination (Sanger et al., 1977). A second sequencing strategy (Maxam-Gilbert sequencing) also described in 1977, uses chemical degradation but is in limited use due to the necessity of using toxic chemicals (Maxam and Gilbert, 1977). These methods are similar in that they generate a set of fragments with a common 5′ origin and a base specific 3′ terminus which are separated by electrophoresis allowing the sequence of the target to be determined.

2.1.1 Sanger sequencing Sanger sequencing for slab gel electrophoresis is performed in either a four lane-one dye format (Ansorge et al., 1986) or a one-lane four dye format (Smith et al., 1986). The former generates raw data that is more easily interpreted as a single dye is used and consequently the four lanes have equal mobility. Such formats are useful in diagnostics and forensics when relative peak heights need to be compared to interpret polymorphic nucleotide positions. However the throughput is slow and therefore the

14 Molecular Tools for Nucleic Acid Analysis

most common sequencing approach has been the four dye-one lane format. Commercial instruments are available in both formats; the automated laser fluorescence sequencer (ALF) from Amersham Pharmacia uses four lanes per sample, while the slab based sequencers from Applied Biosystems uses a one lane detection system involving base calling algorithms of the raw data which can slightly hamper the analysis of polymorphic positions. Fluorescent labels are incorporated into the sequencing fragments via the dideoxynucleotide (dye- sequencing) (Prober et al., 1987) or the sequencing primer (dye-primer sequencing) (Ansorge et al., 1986; Smith et al., 1986). Dye-terminator sequencing is more versatile as any unlabelled primer can be used for sequencing and since each of the four dideoxynucleotide triphosphates (ddNTPs) are labelled with a different dye, four extension reactions can be carried out in one tube reducing reagent cost and labour. Significant improvements have been made to the DNA polymerase used in the Sanger sequencing protocol. Initially, isothermal enzymes (T4 or T7 DNA polymerases) were used that showed even incorporation of all four terminators but were sensitive to temperature and degraded easily (Tabor and Richardson, 1987). With the discovery of a thermostable DNA polymerase, PCR style sequencing reactions (cycle sequencing) became possible and have grown in popularity due to a lower template requirement, effective denaturation of double-stranded DNA (avoiding time consuming single- strand preparation) and ease of automation (Murray, 1989). Genetic engineering of the DNA polymerase solved problems with discrimination of the enzyme for deoxynucleotide triphosphates (dNTPs) over ddNTPs (Tabor and Richardson, 1995). In general, the sequencing strategy selected to generate Sanger sequencing data depends on the template sequenced (e.g. genomic DNA, cDNA or viral template), the purpose of sequencing, the throughput required and the importance of sequence read- length and quality. The sequencing strategy employed is of particular importance when sequencing mutations/polymorphisms in cancer cells and viruses as differential incorporation of terminators, presence of prematurely terminated fragments, mobility shifts and poor signal-to-noise ratios in sequencing ladders can all complicate the interpretation of heterozygous nucleotide positions (Kronick, 1997).

15 D. O’ Meara

2.2 Separation and detection

2.2.1 Automated slab-based sequencers

The most commonly used format for DNA sequencing has been with automated fluorescence DNA sequencers which combine electrophoresis in a vertical slab gel with on-line detection of the fluorescently labelled fragments after excitation by a laser beam. Most commercial instruments (ABI prism and ALFexpress are the most widely used) have a capacity of 10-96 samples per run, with a run taking from 2 hours to 10 hours due to strict limitations in electric field strength.

2.2.2 Capillary electrophoresis These slab gels are gradually being replaced by capillary gel electrophoresis (CGE) which offers advantages such as automation, higher-throughput and reduced reagent and labour costs. The high surface to volume ratio of a capillary can effectively dissipate the heat produced during electrophoresis, allowing higher voltage and thereby facilitating faster sequencing runs. The small sample requirement of CGE, typically five nanolitres, provides an opportunity to reduce the amount of template DNA and reagents leading to a reduction in cost. Due to the smaller sample volumes, the development of dyes with stronger fluorescence signals was a significant advance to CGE (Ju et al., 1995). These energy transfer (ET) dyes contain a common donor dye paired with one of four acceptor dyes. A single excitation wavelength excites the common donor fluorophore, which then excites the acceptor fluorophore. Commercial capillary array instruments are now available and two of the most widely used are from Molecular Dynamics (MegaBACE 1000) and Perkin Elmer Biosystems (ABI 3700). Both instruments can automatically process up to 96 samples per run (~2-4 hours) and are capable of read-lengths of up to 800 bases.

2.2.3 Ultrathin gel electrophoresis Ultrathin sequencing gels can also withstand a higher level of voltage than traditional slab gels thereby allowing faster separation of the DNA fragments. Visible Genetics supply a disposable vertical gel cassette that allows rapid low-throughput dye-primer sequencing (400 bases in 30 minutes). It is used mostly for molecular diagnostic applications such as HIV, HCV and human papilloma virus (HPV) genotyping, p53 mutation detection and fragment analysis. A higher through-put system from MJ

16 Molecular Tools for Nucleic Acid Analysis

Research uses horizontal electrophoresis and can sequence 600 bases in ninety minutes in a 96 well format (Kostichka et al., 1992).

2.2.4 Microfabricated DNA sequencers A totally different platform to perform capillary electrophoresis is to use microchips or microfabricated devices. Recently four-colour separation of sequencing fragments in microfabricated capillary electrophoresis channels was performed with 600 bases sequenced in 20 minutes (Liu et al., 1999). In principle, these platforms offer higher- throughput and should be compatible with on-line systems that automate and integrate thermal cycling, purification, loading and capillary electrophoresis steps (Swerdlow et al., 1997; Tan and Yeung, 1998).

2.3 Alternative sequencing strategies In light of the near completion of the HGP, attention has shifted slightly from de novo sequencing towards resequencing. Resequencing can be defined as the sequencing of a region whose “normal” sequence is already known, to search for mutations and polymorphisms. Resequencing techniques have the potential to offer high-throughput analysis, facilitating SNP typing for use in association studies. This technology can also be applied in the field of diagnostics for mutation detection and viral genotyping. Most of the alternative sequencing strategies described here can be used both for de novo sequencing and resequencing with the exception of allele-specific extension and SBE.

2.3.1 Sequencing by hybridisation

Two alternate strategies using oligonucleotide arrays to perform sequencing by hybridisation (SBH) were described by a number of group in the late 1980’s (Bains and Smith, 1988; Drmanac et al., 1989; Khrapko et al., 1989; Southern et al., 1992). In format I, oligonucleotides are hybridised to immobilised targets and formation of strong duplexes between the perfectly matched probe and target DNA allows the target sequence to be reconstructed from the hybridisation pattern (Drmanac et al., 1989). This technology has been successfully applied to sequence 1.1 Kb of the p53 gene (Drmanac et al., 1998) but since thousands of separate hybridisation reactions were needed for this approach, its utility will depend on the level of automation

17 D. O’ Meara

achieved. An alternate format (format II) is based on the hybridisation of the target DNA to a large array of short overlapping probes of known sequence (Khrapko et al., 1989). This SBH approach (format II) offers the most potential and should be useful in the future for sequence verification and mutation analysis.

2.3.2 Allele-specific extension

Allele-specific extension is based on the technique of amplification refractory mutation system (ARMS), which uses allele-specific primers in the PCR to take advantage of the lowered intrinsic efficiency of DNA polymerases to extend mismatched primers compared to matched primers (Newton et al., 1989). In ARMS, two PCR reactions are performed with a different allele-specific primer, with extension and amplification only occurring if the primer is matched to the template DNA at its 3′ end. While ARMS must detect a specific sequence prior to amplification, allele-specific extension detects the variant after PCR. Allele-specific extension is performed on the PCR product using primers that differ at the 3′ end, defining the nucleotide variant of the polymorphic position. Labelled nucleotide(s) are incorporated into the extension product followed detection of the incorporated fluorophore after a brief washing step (Figure 4).

C G PCR product Denature PCR product and anneal allele-specific primers

Extension 1 Extension 2 T C G G

Extension with flourescent dNTPs

T C ** G G

Measure signal Measure signal

Genotype=ratio of signal 1 signal 2

Figure 4. Schematic representation of allele-specific extension. The single-strand template is split in two and annealed to allele-specific primers (i.e. 3′ end of the primers defines the variant nucleotide). These primers are enzymatically extended in the presence of labelled dNTPs and the signal for each extension reaction (after a washing step) determines which primer was extended, thereby allowing the nature of the nucleotide at the polymorphic site to be ascertained.

18 Molecular Tools for Nucleic Acid Analysis

These extension assays have recently been adapted to microarray formats, which offer the potential for high-throughput genotyping (Pastinen et al., 2000; Erdogan et al., 2001). However despite the potential of allele-specific extension, this technique is not widely used as there have been many reports on mismatches are not refractory to extension (Newton et al., 1989; Kwok et al., 1990; Fry et al., 1992; Huang et al., 1992; Ayyadevara et al., 2000).

2.3.3 Mini-sequencing

(i) Single base extension These mini-sequencing reactions were first described by Syvänen in 1990 and employ template-directed primer extension to screen for SNPs and point mutations (Syvänen et al., 1990). They are based on extension of a primer by DNA polymerase in the presence of a labelled dideoxynucleotide, which allows determination of the nature of the base immediately 3′ of the sequencing primer (Figure 5). To facilitate signal detection, a number of fluorophores are used (Head et al., 1997) or separate detection reactions of one label are performed (Syvänen, 1994).

C PCR product G Denature PCR product and anneal SBE primer

G

Extension 1 with Extension 2 with labelled ddCTP labelled ddTTP

T * C * G G

C * Measure signal Measure signal

Genotype=ratio of signal 1 signal 2

Figure 5. Schematic representation of SBE. Single-strand template is annealed to a SBE primer whose 3′ end is adjacent to the polymorphic site. The template and primer mix is then divided in two and the primers are enzmatically extended using a specified labelled ddNTP. The nature of the polymorphic site is determined from the signals generated from each extension mix.

19 D. O’ Meara

Extension is carried out in solution or on a solid-phase such as magnetic beads, manifold supports or on DNA microarrays (Syvänen, 1999). Extension reactions carried out on oligonucleotide microarrays have discriminatory power that is one order of magnitude higher than that of ASO hybridisation on DNA chips (Pastinen et al., 1997). Single base extension on microarrays is also easier to set up as only one mini-sequencing primer is needed per variable position as opposed to high-density arrays, where a series of overlapping probes for each position to be analysed is required. Modifications of this SBE method are TAG arrays where the array is independent of the specific markers typed (Hirschhorn et al., 2000) which has facilitated minisequencing on high-density oligonucleotide arrays (Fan et al., 2000).

(ii) Pyrosequencing Pyrosequencing, a sequencing by synthesis method was first described in 1993 and relies on the sequential addition and incorporation of nucleotides in a primer-directed polymerase extension (Nyren et al., 1993). During DNA synthesis, pyrophosphate (PPi) is released which is coupled with the enzymes sulfurylase and luciferase to generate detectable light. This light is proportional to the number of nucleotides incorporated allowing the sequence to be determined in real-time (Figure 6). A significant advance for this sequencing by synthesis approach was the introduction of apyrase, a nucleotide degrading enzyme which eliminated the need for washing steps after each nucleotide addition (Ronaghi et al., 1998). A recently described pyrosequencing approach performed on double-stranded DNA, which enzymatically removed amplification primers and nucleotides from the preceding PCR (which would otherwise interfere with the primer extension reaction), has potential for increasing the capacity of pyrosequencing (Nordström et al., 2000). An important factor in pyrosequencing, is the balance between the polymerase and apyrase as nucleotide degradation competes with nucleotide incorporation. Thus to obtain accurate pyrosequencing data, nucleotide degradation by apyrase has to be slower than nucleotide incorporation by the DNA polymerase. However, several factors such as insufficient exposure of polymerase to nucleotides, insufficient nucleotide degradation by apyrase and contamination by kinases, limit the read-length as discussed in a recently published review on pyrosequencing (Ronaghi, 2001). A fully automated instrument capable of sequencing 96 reactions simultaneously is now commercially available (Pyrosequencing AB) allowing the sequencing of short

20 Molecular Tools for Nucleic Acid Analysis stretches of DNA (20-30 bases) in less than one hour. This instrument has been used for a wide range of applications ranging from SNP typing (Ahmadian et al., 2000; Gustafsson et al., 2001) and mutation screening (Garcia et al., 2000) to bacterial (Monstein et al., 2001) and viral (Gharizadeh et al., 2001) typing. Pyrosequencing has also been used for de novo sequencing of cDNA tag libraries facilitating gene identification (Agaton et al., 2001; Nordström et al., 2001).

Polymerase 3 TAGAAGGACCGTTTAAGT 5' 5' ATCTT Primer dTTP PPi Sulfurylase Apyrase ATP dTMP + 2Pi Luciferase Light

Real-time monitoring

Nucleotides added:

DNA sequence: ATCTT

Figure 6. Schematic diagram of pyrosequencing. The reaction mixture consists of a single-strand DNA with a short annealed primer, DNA polymerase, adenosine triphosphate (ATP) sulfurylase, luciferase and apyrase. The four nucleotide bases are added to the mixture in a defined order e.g. A, C, G and T. If the added nucleotide forms a base pair (in this case, two “T”s base-pair to the template), the DNA polymerase incorporates the nucleotide and consequently PPi is released. The released pyrophosphate is converted to ATP by ATP sulfurylase which is used by luciferase to generate detectable light. This light is proportional to the number of nucleotides incorporated and is detected in “real-time”. The pyrosequencing raw data is displayed simultaneously and in this example the sequence generated reads “ATCTT”. If the nucleotide does not form a base pair with the DNA template, it is not incorporated by the polymerase but is degraded by apyrase.

2.3.4 Mass spectrometry Important advances in the ionisation of biopolymers in the gas phase during the last decade has allowed mass spectrometry (MS) to be considered a fast and accurate method for DNA sequence analysis. One mode of MS sequencing that seems to be the

21 D. O’ Meara

most promising is matrix-assisted laser desorption ionisation time of flight spectrometry (MALDI-TOF). In MALDI-TOF MS the nucleic acid is desorbed, ionised and subjected to an intense electric field which accelerates the fragments proportional to their mass-charge ratio with detection based on the time required for each particle to reach the detector. Although MS has been used for DNA sequencing (Fu et al., 1998), it offers most potential for typing single point mutations or SNPs. A number of MALDI-TOF MS genotyping approaches have been described and include the PROBE (Braun et al., 1997) and the PinPoint assay (Haff and Smirnov, 1997; Ross et al., 1998). These protocols subject the target to primer extension with a specific mix of dNTPs and ddNTPs or with a single ddNTP respectively. The DNA template and the extended primer are then separated and the primer product is subjected to MALDI-TOF MS, which allows characterisation of the base(s) downstream of the 3′ end of the primer. A drawback of MALDI-TOF MS is that it relies heavily on stringent purification. However, a recently described protocol called the GOOD assay has overcome some of these problems by chemically modifying the extension product thus eliminating the need for purification prior to MS detection (Sauer et al., 2000). An approach that offers potential for large-scale genotyping is to combine MS with DNA chip technology where the probe is released from the chip allowing interpretation of the extension product (Tang et al., 1999).

3 Chip-based analysis

Recently DNA biosensors and DNA chips have stimulated considerable interest due to their ability to obtain sequence-specific information in a fast, simple and inexpensive manner in comparison to traditional hybridisation/sequencing assays. The fundamental difference between DNA biosensors and DNA chips is that biosensors incorporate a biologically active layer at the surface of a recognition element which translates a biorecognition event (e.g. hybridisation) into a signal whereas the DNA chip surface must be scanned or imaged after hybridisation and washing to obtain the

complete hybridisation pattern (Figure 7).

22 Molecular Tools for Nucleic Acid Analysis

Biosensor Microarray

Target Target analyte Hybridise and wash Biorecognition layer Excite with lasers Transducer Probe/target duplex Microarray

Optical Signal Electrochemical Scan Pizoelectrical

Figure 7. Biosensors couple the biorecognition event and signal generation while with microarray technology the slide is scanned after hybridisation.

3.1 DNA Biosensors As described above, a biosensor is a device that combines the specificity of a biological sensing element with a transducer to produce a signal proportional to the target analyte concentration. Although biosensors have been most widely used to study -protein interactions this technology also offers the possibility to perform real-time binding measurements of DNA-protein interactions (Cheskis and Freedman, 1996) and to monitor DNA hybridisation kinetics and detect specific mutations (Nilsson et al., 1995; Persson et al., 1997). The design of a DNA biosensor generally involves modification of a sensor surface for attachment of DNA followed by immobilisation of the probe and detection of the target via hybridisation at the sensor surface.

3.1.1 Ligand immobilisation Since DNA biosensors are usually in the form of electrodes, crystals or chips, hybridisation is a solid-phase reaction, which is believed to proceed at one-tenth the rate of solution hybridisation (Bunemann, 1982). Therefore the immobilisation step should lead to a probe (ligand) with controlled orientation and packing density which minimises steric hindrance allowing the probe to be accessible for hybridisation,

which is of course applicable to all solid-phase hybridisation formats. Depending on the transducer, DNA can be immobilised by thiolylated or covalent linkage to gold transducers, via biotin-streptavidin chemistry, by covalent coupling to functional

23 D. O’ Meara

groups or by simple adsorption to carbon surfaces. As in solution-based hybridisation, ionic strength and temperature need to be optimised for each probe-target interaction.

3.1.2 Signal detection Transducers form the basis of a biosensor and serve to translate the biological event into a measurable analytical signal. Common transducing elements in DNA biosensors are optical, electrochemical or pizoelectrical devices which measure changes in light absorption, current or mass respectively. Here the focus is on optical DNA biosensors but for more information on electrochemical and pizoelectrical biosensors please see a recent review (Wang, 2000). Optical biosensors make up most of the DNA biosensor market with several types of transducers such as surface plasmon resonance (SPR), wave guiding and fluorescence forming the basis of these biosensors. These sensors measure properties such as the intensity, phase or angle of the reflected light which allows determination in real-time of the amount and rate of ligate binding. There are a number of commercially available optical biosensors but over 90% of the commercial biosensor publications cite the use of BIACORE instruments (Biacore AB). The BIACORE is a flow based system which uses the principle of SPR to detect changes at the sensor surface (Jönsson et al., 1991). Surface plasmon resonance is an optical phenomena in which light is focused through a prism on a thin gold film and due to energy transfer in the form of an evanescent wave from the light beam to the surface electrons on the metal film the intensity of the reflected light is decreased (Garland, 1996). The angle of the reflected light (resonance angle) is dependent on the refractive index near the surface of the thin gold film. A biomolecular interaction taking place on the sensor chip (a thin gold film coated with a dextran matrix) will lead to a change in the refractive index and cause an increase in the resonance angle. This change in resonance angle is proportional to the surface concentration of biomolecules (Stenberg et al., 1991) and results are presented as resonance units (RU) against time in a sensorgram thus providing a complete record of the process of association and dissociation (Figure 8). In a typical binding analysis, the ligand is immobilised on a chip which forms one wall of a flow cell while a sample is injected across the surface at a fixed flow rate. DNA hybridisation is followed in real-time without the need for labelled targets, making biomolecular interaction analysis (BIA) an ideal technique to determine binding specificity and interaction kinetics.

24 Molecular Tools for Nucleic Acid Analysis

Light Detection Reflected source unit Intensity

I Prism Angle II I II Resonance angle Resonance signal II Sensor chip

Time Flowcell I Sensorgram

Figure 8. Configuration of SPR detector, sensor chip and integrated fluidic cartridge in the BIACORE system. The resonance angle is sensitive to the mass concentration of molecules close to the sensor chip surface. As this concentration changes, the resonance angle shifts producing a response which is displayed in a sensorgram.

Biosensors such as the IAsys (Affinity Sensors) and ASI (Artificial Sensing Instruments) also use the resonance angle to probe surface changes but instead of SPR they utilise a wave guiding technique (Lowe et al., 1998; Wiki et al., 1998). An alternate biosensor based on optical fibre technology uses the evanescent wave to stimulate fluorescent labels in DNA attached at the end or along the length of the fibre. Most of these fibre optic biosensors use ethidium bromide but some have recently been described that use fluorescently labelled single-strand DNA to detect multiple DNA targets simultaneously (Ferguson et al., 2000).

3.2 DNA microarrays

DNA microarrays originate from the method of Southern blotting (Southern, 1975) and indeed its inventor, Edwin Southern, was the first to make oligonucleotide microarrays (Southern et al., 1992) for the purpose of high-throughput sequencing i.e. SBH. Although SBH was not sufficiently developed to have any impact on the sequencing of the human genome, it led to the creation of DNA microarrays, a powerful approach for the analysis of complex biological systems.

25 D. O’ Meara

These DNA chip arrays have a wide range of applications and have the advantage of speed of measurement and the ease with which the analysis lends itself to automation. The most powerful application of DNA chips is the comparison of hybridisation patterns to identify physiologically important variations e.g. expression profiling of normal and diseased states to highlight induced and suppressed pathways in the diseased tissue (DeRisi et al., 1996). Similar experiments in the pharmaceutical industry may help in the identification of drug candidates and any secondary changes suggestive of potential side-effects (Gerhold et al., 2001). DNA chips can also be used in the field of DNA diagnostics for genotyping of bacteria and viruses and for mutation/polymorphism screening i.e. resequencing. Affymetrixs have designed a HIV GeneChip which has been used to screen for drug resistance mutations in HIV-1 (Kozal et al., 1996) which shows similar capacity to identify mutations as other HIV- 1 genotyping assays currently available (Wilson et al., 2000). Tailor made oligonucleotide arrays have been used to test for mutations signifying susceptibility to disease in genes as diverse as the breast cancer gene BRCA1 (Hacia et al., 1996), the cystic fibrosis gene (Cronin et al., 1996) and the human mitochondrial genome (Chee et al., 1996). Such high-density oligonucleotide arrays have also been used for high- throughput discovery of SNPs as described by Wang et al. who examined 2.3 Mb of human genomic DNA identifying over 3000 candidate SNPs (Wang et al., 1998). However these approaches have not been so successful for large-scale SNP typing with one report assigning the correct genotype in only 57% of sites analysed (Cho et al., 1999). To enhance the discriminatory power of these chips, enzymatic extension has been performed on the oligonucleotide arrays as described by a number of groups (Pastinen et al., 1997; Fan et al., 2000; Hirschhorn et al., 2000; Pastinen et al., 2000).

3.2.1 Array platforms

Among the many different technical solutions available, methods for making an array generally fall into one of two categories; in situ (on-chip) synthesis of oligonucleotides or peptide nucleic acid (PNA) or arraying/spotting of prefabricated oligonucleotides, PNA or DNA fragments (Figure 9). Design of an array is largely dependant on the research question that has to be answered with smaller arrays with well defined target sequences (e.g. HIV chip) being commonly used for diagnostic purposes, while elucidation of metabolic pathways or identifying novel genes necessitates the use of large arrays often with undefined sequences.

26 Molecular Tools for Nucleic Acid Analysis

Spotted DNA microarray High-density oligonucleotide array Public database Glass wafer

DNA

PCR amplification Sequence selection Purification Oligomer synthesis Robotic printing

PM MM

Gene X Gene X

Figure 9. Array fabrication and gene representation for spotted DNA arrays and high-density oligonucleotide arrays. Spotted DNA arrays are made by printing amplified DNA onto glass slides. Each spot on the slide corresponds to a gene fragment of several hundred bases. Pre-synthesised oligonucleotides can also be spotted (not shown). The high-density arrays are manufactured using light directed chemical synthesis to produce thousands of different oligonucleotide probes in a highly ordered fashion on the glass chip. Each gene is represented by 15-20 different oligomer pairs (pm; perfect match, mm; mismatch) on the array.

(i) In situ synthesis. Three approaches have been used to direct oligonucleotide synthesis to defined areas of a support for in situ fabrication: synthesis via combinatorial chemistry (Southern et al., 1994); ink-jet delivery of nucleotide precursors to the surface (Blanchard, 1998); and photolithography (Pease et al., 1994). Photolithography is pioneered by Affymetrix to generate high-density oligonucleotide arrays. This technique involves exposing light onto the surface of the chip through a photolithographic mask, which selectively removes photo-labile deprotecting groups from the growing oligonucleotide chain in a stepwise fashion to create oligonucleotides. Currently over 400,000 different features (an area containing millions of identical probes) are packed into a region of about 1 cm2. However since the step-wise synthesis yield is only 95%, oligonucleotides no longer than 25 bases can be synthesised which can dramatically reduce specificity and sensitivity. In gene expression analysis, specificity is addressed by comparing perfectly matched sequences with single base mismatched sequences

27 D. O’ Meara

and by synthesising many oligonucleotides for each gene to be analysed (typically twenty pairs of oligonucleotides are arrayed to represent each gene) (Figure 9) (Wodicka et al., 1997). When these high-density chips are used for polymorphism analysis, every nucleotide position that has to be interrogated has a set of four oligonucleotides designed that differ only in the central nucleotide. The relative intensities of hybridisation to each series of probes at a particular location identifies the nucleotide. However, due to the poor step-wise yield, it is estimated that less than 36% of all probes in a feature are likely to be error free (Graves, 1999). Nevertheless, this error prone synthesis may have beneficial side effects as it has been shown that a probe to which a second mismatch has been added is more sensitive in detecting mutations than one containing only the mismatched base (Guo et al., 1997). A further disadvantage is that these high-density arrays are very expensive and the complexity of photolithography means that the chip manufacture must be carried out “in house”. However, in situ synthesis has a number of advantages over deposition of pre- synthesised oligonucleotides in that yields are higher and consistent over the surface from one cell of the array to another and chips can be manufactured directly from sequence databases removing the uncertainty and burden of sample handling and tracking.

(ii) Spotted arrays The technology for printing prefabricated oligonucleotides and cDNAs is more accessible and less expensive than that for in situ fabrication and simply involves the robotic spotting of oligonucleotides or cDNA onto glass slides. This technology was developed by researchers at Stanford University (Schena et al., 1995; Shalon et al., 1996) but today there are numerous commercially available robotic spotters that use direct touch or fine micropipetting (Gerhold et al., 1999). Generally PCR amplified inserts from cDNA clones are spotted onto glass slides (overlaid with a positively charged coating such as amino silane or polylysine) (Figure 9) but DNA (PCR products or oligonucleotides) can also be covalently coupled to slides through various coupling chemistries (Lindroos et al., 2001). An advantage of cDNA arrays is that they can be produced from both known and unknown cDNA while oligonucleotide arrays require gene sequence information for synthesis of the oligomers.

28 Molecular Tools for Nucleic Acid Analysis

3.2.2 Hybridisation and detection Hybridisation conditions depend greatly on the purpose of the experiment. Expression analysis requires overnight hybridisation with low stringency (high salt and low temperature) which enhances annealing of low copy number sequences while identification of mismatches requires greater stringency over a shorter time period (hours). Hybridisation of the labelled target ideally should be linear (i.e. proportional to the amount of probe), sensitive so that low abundance genes are detected and specific so that probes hybridise only to the desired gene in the complex target mixture. Since the hybridisation efficiency is susceptible to sequence specific features such as melting temperature and secondary or tertiary structure effects, the use of PNA in place of DNA oligonucleotides may overcome some of these problems as PNA can form hybrids in low salt conditions where secondary and tertiary structures are prevented from forming. The hybridisation signal can also be improved by insertion of spacers in short oligonucleotide sequences reducing steric hindrance and improving probe accessibility (Guo et al., 1994; Southern et al., 1999). Electric fields can be also used to enhance the hybridisation rate and precisely regulate the stringency of hybridisation (Edman et al., 1997; Cheng et al., 1998). The most common method of signal detection is laser-induced fluorescence which is detected using confocal optical scanners. The method of labelling depends on the template that is used, on whether mutation detection or gene expression analysis will be performed and whether high-density or spotted arrays are used. Various means have been proposed to introduce the fluorescent dyes into the target cDNA or RNA (Wodicka et al., 1997; de Saizieu et al., 1998; Eisen and Brown, 1999) with a single dye required for Affymetrix high-density arrays and two dyes for differential expression analysis on cDNA microarrays. While spotted cDNA arrays intrinsically normalise for noise and background in a pair-wise comparison they are at a disadvantage when different samples in an experimental set are to be compared. With cDNA arrays each test sample must first be compared to the same reference sample and then the relative ratios compared after a normalisation procedure. The high-density oligonucleotide arrays however allow flexibility in sample comparisons and provides an estimate of the levels of gene transcripts in individual samples. Both these microarray formats have a similar detection limit of approximately 3 copies per cell (Lockhart et al., 1996; Schena et al.,

29 D. O’ Meara

1996) which is lower than that of classical northern blots and thus misses rare transcripts but still provides sensitive and accurate information regarding the majority of differentially expressed genes (Taniguchi et al., 2001). DNA microarrrays produce huge quantities of data which pose a major bottleneck especially in gene expression analysis. This problem is now being tackled and software tools such as the cluster system described by Eisen to identify co-regulated genes in expression analysis (Eisen et al., 1998) are being developed to analyse, interpret, visualise and manage this huge volume of data. Finally, the future of microarray technology probably lies in its transfer from the research laboratory into industrial, clinical and other routine applications. This will require new technological developments such as label free detection and flow-through systems which is further discussed in a recent review (Blohm and Guiseppi-Elie, 2001).

30 Molecular Tools for Nucleic Acid Analysis

Present Investigation

This thesis describes the development of nucleic acid methods for sample preparation, diagnostics and sequence analysis. Initially biosensor technology was used to optimise conditions for nucleic acid hybridisation (papers I and II). These results demonstrated that modular probes significantly improved target capture and a protocol based on such co-operative oligonucleotides for the capture of HCV RNA from serum and for the clean-up of sequencing products was developed (papers III and IV). Papers V and VII are concerned with techniques to screen for mutations and polymorphisms. Here we investigated the feasibility of using pyrosequencing to monitor the development of drug resistance mutations in HIV-1 (paper V) as well as describing the development of an apyrase mediated allele-specific extension technique (AMASE) for SNP typing on DNA microarrays (paper VI).

4 Nucleic acid hybridisation using modular primers

The properties of nucleic acid complementarity are utilised in many standard techniques including sample preparation, PCR, DNA sequencing, genotyping, nucleic acid detection and analysis using DNA microarrays. Important factors in such applications are the stability of the duplex formed and the presence of secondary or tertiary structure in the target or probe. In this thesis the use of a modular probe to enhance the capture of single-strand DNA, viral targets and sequencing products was investigated. Modular probes are short oligonucleotide modules that hybridise contiguously on a template resulting in their mutual stabilisation (Kotler et al., 1993). This stabilisation effect has been attributed to base stacking interactions between adjacent bases of the oligonucleotides as nicked DNA complexes are more difficult to dissociate than oligonucleotide arrangements with gaps. Indeed, structural conformation analysis has revealed large similarities between nicked and continuous double-stranded DNA, while a gapped double-stranded DNA was kinked and more flexible (Roll et al., 1998). In addition, the melting temperature of the duplex is increased several degrees in the presence of an adjacent oligonucleotide (Lin et al., 1989; Kandimalla et al., 1995) while thermodynamic analysis shows that a nick is energetically favoured over a gap by at least 1.4 kcal/mol (Lane et al., 1997). Various groups have also shown

31 D. O’ Meara that insertion of ≥ 1 nucleotide gaps results in less stable duplexes or can even prevent duplex formation (Kharpko et al., 1991; Parinov et al., 1996). Although purine-purine interactions have the highest energy of interaction (Norberg and Nilsson, 1995), pyrimidine-pyrimidine and purine-pyrimidine interactions have also been show to participate in base stacking with even one of the weakest combinations, T:A being sufficient for stabilisation of hexamers stacked with immobilised 8 mer probes (Parinov et al., 1996). Such modular probes have been found to prime polymerase directed DNA sequencing reactions uniquely unlike either of the modules alone, facilitating the manufacture of longer more specific primers from libraries of shorter less specific primers (Kieleczawa et al., 1992; Kotler et al., 1994). These modules have also been used in SBH strategies where a set of short oligonucleotide probes with known sequence is used to identify the complementary sequence in an unknown DNA target (Kharpko et al., 1991; Parinov et al., 1996). Although these sequencing strategies have largely been replaced by shotgun sequencing, the principle of stacking hybridisation can be utilised in many other areas such as array technology, nucleic acid capture and antisense applications to stabilise DNA-probe duplexes and to open up secondary structure. Indeed stacked oligonucleotides have been investigated as potential antisense oligonucleotides and have been shown to bind with more sequence specificity than longer oligonucleotides (Kandimalla et al., 1995). Recently it has been shown that the rate of target capture by immobilised hairpin probes with dangling ends (i.e. facilitates base stacking) is twice that of immobilised linear probes forming a thermodynamically more stable duplex, which may offer advantages in solid-phase hybridisation systems (Riccelli et al., 2001). A modification of this approach is to insert a linker between two oligonucleotides which also serves to increases their binding affinity and thermal stability (Taylor et al., 2001). Here this co-operative effect between contiguous oligonucleotides was exploited to enhance the capture of various nucleic acid targets ranging from viral RNA templates to cycle sequencing products.

4.1 Investigation of the modular primer effect using biosensor technology (I, II)

In paper I, various modular primer combinations were investigated using SPR-based biosensor technology to enhance the capture of single-strand PCR products. Initially,

32 Molecular Tools for Nucleic Acid Analysis

immobilised probes of 9-36 nucleotides in length were used for capture but surprisingly they resulted in very low hybridisation signals. These poor hybridisation responses are probably due to restricted reaction kinetics at the semi-solid-phase surface and to secondary structure in the target sequence that prevents efficient hybridisation. Indeed the target DNA used in this study is derived from the NTR of HCV, which is predicted to have a high degree of secondary structure (Brown et al., 1992). To overcome such limitations, the use of a modular probe that anneals adjacent to the immobilised probe was investigated. This pre-hybridisation step takes place in solution taking advantage of the more efficient hybridisation kinetics in solution (Figure 10).

Figure 10. Schematic illustration of the opening up of secondary structure using a modular probe.

Using this strategy, a significant improvement in single-stranded capture was achieved when a single pre-hybridising module of 11 nucleotides or longer was employed. Interestingly, no dramatic difference was observed when the capture probe was reduced in length from 18 nucleotides to 9 nucleotides. Actually the nonamer capture probe together with two adjacently positioned nonamers (pre-hybridised) could capture single-stranded DNA more efficiently than an 18 mer capture probe combined with a nonamer modular that spans the same twenty-seven nucleotides in the target sequence. This suggests that the length of the capturing probe is not the most important parameter but rather it depends on the number and length of the modular probes. However, even-though the nonamer (with two modular probes) could capture DNA as effectively as the 18 mer probe (with a modular ≥11 nucleotides), it displayed apparent faster dissociation kinetics. Insertion of a gap between the capture and modular probes reduced the capture efficiency by 30-90%, with the introduction of a gap being more detrimental in situations where multiple modules were employed.

33 D. O’ Meara

This suggests that the base stacking interactions have been disrupted but that the modules can still partly suppress secondary structure. To further explore the mechanism behind the modular primer effect, biosensor technology was used to quantitatively investigate this stabilisation phenomenon using combinations of shorter hexamer modules (paper II). Two different target oligonucleotides were immobilised on separate flow cells generating nicked, gapped and overlapping modular hybridisation complexes. In the third flow cell, a mixture of two targets was immobilised each separately comprising of either of the two annealing sites. Initially the affinities of each of the individual hexamers was determined by plotting the equilibrium response as a function of the amount of analyte injected. The hexamers showed different affinities for the two targets despite the fact that each contained the identical hexamer annealing site, which shows that the surrounding sequence also influences the hybridisation. To investigate the modular primer effect, pair-wise combinations of four DNA hexamers, at a fixed total concentration but at varying concentration ratios were hybridised to the template oligonucleotides. To observe co-operative effects, the arithmetic sum of the individual responses of the corresponding hexamers was then subtracted from the responses obtained from co-injections. A stabilisation effect was observed for the three adjacent hybridisation formats while two out of three one-base gap formats and a one-base overlap resulted in a small modular stabilisation effect. The apparent combined affinities for DNA-DNA hexamers were also determined by varying the total hexamer concentration. Individual affinities for two hexamers were compared to the apparent affinities when increasing amounts of a stabilising hexamer were added. Up to an 80-fold increase in apparent combined affinity was observed when a low affinity hexamer was supplemented with a high-affinity modules while up to a 20-fold increase was seen for the opposite combination. When the modules were hybridised on separate templates there was no increase in apparent affinity showing that the modules were unable to stabilise each other. Taken together, these results show that stabilisation is dependant on the positioning and inherent affinities of the adjacently annealing hexamer modules. To investigate if the modular primer effect could also be observed for PNA, which has a peptide rather than a sugar backbone, PNA/DNA and PNA/PNA combinations were analysed. The results show that PNA hexamers stabilise each other and that PNA-DNA hexamers can also interact co-operatively. However, in one example there

34 Molecular Tools for Nucleic Acid Analysis

was no apparent co-operative effect between a PNA and a DNA hexamer, which was probably due to the high inherent affinity displayed by this PNA hexamer. In conclusion, this work shows that flexible systems for capture can be designed from libraries of short oligonucleotides that co-operatively interact stabilising duplex formation. Such adjacently hybridising probes could also be used to resolve secondary structure, a common obstacle encountered in many different molecular biology techniques (van Doorn et al., 1994; Brooks et al., 1995). This work also illustrates that the biosensor is a useful analytical tool to optimise different hybridisation formats prior to the development of hybridisation protocols for preparative purposes as described in the following papers (papers III and IV).

4.2 Development of capture protocols using modular primers (III, IV) Based on this data it was investigated whether a sensitive extraction protocol for HCV RNA could be developed using such stabilising probes and bio-magnetic bead technology. Hepatitis C is a positive-sense RNA virus with a genome of approximately 9.5 Kb. Previously routine diagnosis of HCV infection was based on detection of antibodies against viral antigens (Kuo et al., 1989) but this has largely been replaced by amplification of HCV RNA by RT-PCR. Amplification of HCV RNA also serves to monitor viremia levels in patients receiving antiviral therapy (Lee et al., 1998) and to generate template for genotyping/subtyping (Zein 2000). An important first step in HCV RT-PCR is the extraction of RNA from serum, which concentrates the target and removes PCR inhibitors present in the sample. However, standard protocols generally involve phenol chloroform extraction and precipitation steps or silica absorption on spin columns, which are impractical in the development of automated systems for the processing of large sample numbers in a clinical setting. In this protocol magnetic beads were used for target capture, which allows for a rapid change of reaction buffers and reagents simply by applying a magnetic field, thereby circumventing centrifugation or precipitation steps. Such magnetic bead based protocols have been described previously but have displayed less sensitivity than conventional protocols (Heermann et al., 1994; van Doorn et al., 1994; Regan and Margolin, 1997; Miyachi et al., 2000). In one of these reports, a rapid hybridisation capture assay for HCV based on magnetic beads was described which showed a 10- fold lower sensitivity than the conventional extraction procedure with the capture

35 D. O’ Meara

efficiency varying considerably depending on the particular probe used (van Doorn et al., 1994). Similar results which have been attributed to the loss of template DNA during capture have been observed for other viral targets such as hepatitis B virus (HBV) (Heermann et al., 1994) and poliovirus (Regan and Margolin, 1997). Indeed a loss of 60% of the HCV RNA template after capture with magnetic beads has previously been reported (Hsuih et al., 1996). This loss in sensitivity applies not only to magnetic bead assays but commercial silica extraction methods have also been shown to be less sensitive than conventional extraction protocols (Fanson et al., 2000). This loss in sensitivity could be a critical factor when nucleic acid technology (NAT) is used by manufacturers of blood products and blood banks to screen plasma pools (for economical and technical reasons) instead of testing individual donations. Here it was investigated whether nucleic acid capture strategies could be improved by using the co-operatively interacting modular probes described in paper I which hybridise to the 5′ NTR, a conserved region in HCV. In this protocol, a modular probe was prehybridised to the viral RNA (following viral lysis) and the duplex was then captured by a second probe that was covalently coupled to magnetic beads (Figure 11). The captured RNA was then subjected to an in-house RT-PCR amplification. The capture probe used here was covalently coupled to the magnetic beads as non-specific binding of template to streptavidin beads was observed when a biotinylated probe was used (data not shown).

Serum + modular probe

Magnetic bead and capture probe

RT-PCR bDNA assay

Figure 11. Biomagnetic capture of HCV RNA from serum samples. Following capture the RNA is analysed by RT-PCR or quantified by bDNA.

36 Molecular Tools for Nucleic Acid Analysis

Initially the protocol was applied to HCV recombinant DNA and recombinant RNA. The sensitivity of the protocol was investigated by performing capture on a dilution series of recombinant template followed by semi-nested PCR. This allows comparison at the PCR plateau level where all dilutions have reached saturation irrespective of the initial copy number. The results showed a 5-fold improvement in overall detection sensitivity when a modular probe was employed in comparison to extraction with only the capture probe. When the approach was evaluated on HCV positive serum samples, capturing the entire viral target of 9.6 Kb, an increase in assay sensitivity of 5 to 25-fold was observed when a modular probe was employed. However, this end point analysis which facilitated comparison of capture with and without a modular probe allowed only for the determination of the sensitivity of the capture procedure combined with an amplification step. To address the sensitivity of the capture protocol in comparison to a conventional extraction method [guanidinum thiocyanate extraction, phenol chloroform extraction and ethanol precipitation (Chomczynski and Sacchi, 1987)], the amount of viral target extracted by both methods was quantified by the bDNA assay (Table 4). Since the bDNA assay undergoes a linear amplification, the capture efficiency with and without the modular probe can also be quantified. These results show that capture with the modular probe extracts on average twice as much RNA as capture without the modular probe (Table 4) and that it captured RNA of different genotypes with comparable sensitivity to the conventional extraction procedure.

Table 4. Comparison of the conventional extraction method with the oligonucleotide assisted capture assay.

Quantity of HCV RNA extracted (MEq/ml)

Conventional Capture with pre- Capture without pre- extraction hybridising probe hybridising probe

Genotype 1a 1.115 0.934 0.365 1b 5.287 5.984 2.410 2b 5.735 4.147 1.929 3a 0.944 0.913 0.852

37 D. O’ Meara

The capture protocol followed by RT-PCR was also successfully applied to 21 clinical samples (previously quantified by the Amplicor HCV Monitor test) with a sample that yielded a false-negative result in the Amplicor test successfully detected. These results show that the protocol captured all the different HCV subtypes and genotypes tested with viral capture failing in one of seven samples captured without a modular probe. In this study, probes complementary to the NTR of HCV, a conserved region that is predicted to have a high degree of secondary structure (Kato et al., 1990; Choo et al., 1991; Brown et al., 1992) were used. Suppression of such secondary structure by the modular probes might be an explanation for the marked improvement in capture observed. Indeed it has been previously reported that the method selected for RNA extraction can have a profound effect on the sensitivity of the HCV assay (Nolte et al., 1994). Unlike silica based isolation methods, non-specific RNA and DNA is removed by this selective extraction method which should improve the sensitivity of detection especially in samples with low copy number. This magnetic bead capture of whole genomes should also minimise shearing of the RNA template, a critical factor in the success of long range RT-PCR (Thiel et al., 1997). Therefore it can be concluded that this capture protocol is a sensitive extraction method for HCV which is amenable to automation using robotic workstations.

A similar strategy using magnetic beads and stacking hybridisation was then used for the purification of cycle sequencing products prior to DNA sequencing (paper IV). This clean-up step is critical to obtain good quality sequencing data as excess unincorporated dyes (dye-terminators or dye-primers) and other contaminants such as salt, proteins and template DNA can interfere with the separation and detection of sequencing products, especially in capillary gel electrophoresis (CGE) (Tong and Smith, 1993; Salas-Solano et al., 1998). Traditional purification methods use ethanol precipitation or spin columns but such protocols that include extraction, centrifugation or vacuum drying steps are cumbersome and thus difficult to automate. To circumvent such procedures, a magnetic bead based method for the purification of cycle sequencing products suitable for large-scale genome sequencing projects was developed. A similar capture protocol which uses streptavidin magnetic beads to purify biotin labelled sequencing products has also been described (Fangan et al.,

38 Molecular Tools for Nucleic Acid Analysis

1999). While this approach is generic, it requires biotin labelling of the sequencing primer and is compatible only with dye-terminator chemistry. In the approach described here, capture probes complementary to the vector sequence were coupled to magnetic beads and were used together with an adjacently annealing modular probe. These probes were complementary to vector sequences and therefore represent a universal system to capture sequencing products generated from all inserts (Figure 12).

Sequencing products

capture probe and bead * * modular probe Addition of probes sequencing and magnetic beads * fragments to sequencing * products. * * sequencing * primer template DNA

Re-use * dNTPs/ddNTPs beads /salt polymerase Magnetic separation of beads and sequencing fragments

* * * *

* *

Elute and load on gel Discard

Figure 12. Biomagnetic purification of dye-primer cycle sequencing products (also applicable to dye- terminator sequencing chemistry). Magnetic beads facilitate facile separation of unincorporated labelled sequencing primer, nucleotides, salts, proteins and template DNA from the Sanger fragments, all of which can interfere with injection, separation and detection of sequencing products.

In these experiments, an in-house bacterial genome project using pUC18 and pBluescript SKII shotgun libraries, with insert sizes in the range of 800-2000 base pair (bp) was used as a model system. Sanger fragments were generated using either forward or reverse sequencing primers and for each vector and sequencing primer combination, a specific capture and modular probe was used. These sequencing fragments were then captured on the beads in a one-step hybridisation reaction.

39 D. O’ Meara

Initially we investigated whether a stacking module resulted in improved capture of cycle sequencing products. Comparison of capture with and without the modular probe using the random pUC clones shows a higher signal intensity for capture with the modular probe. Generally capture with the modular probe resulted in a higher accuracy especially in the 400-500 bp region (Table 5) with more even peak heights compared to capture without the modular probe.

Table 5. Comparison of the sequencing results of 15 different inserts in terms of accuracy over the first 500 bases.

Purification method Accuracy

Capture without modular 97.7% Ethanol precipitation 98.3% Capture with modular 99.6%

Optimal conditions for capture were determined to be incubation at 54°C for at least 15 minutes with ≥150 µg beads and 30 pmoles of modular probe. The temperature appears to be critical with a sharp decrease in capture efficiency at temperatures above and below the melting temperature of the capture probe. The quality of sequencing data (after capture with the modular probe) also compared favourably to traditional ethanol precipitation with a higher accuracy observed for the capture protocol (Table 5). This is probably due to mis-annealed and extended sequencing primers which co-precipitate with the sequencing product during ethanol precipitation. An advantage of the capture protocol is that only the correctly extended sequencing products will be purified, reducing background and improving accuracy. In addition, double-stranded templates are removed which can otherwise interfere with the separation efficiency and the electrokinetic injection step of CGE (Salas- Solano et al., 1998). To test the compatibility of this capture protocol with CGE, 15 random pBluescript clones were subjected to cycle sequencing, capture and elution and were then directly electroinjected. The capture protocol achieved a higher accuracy over the first 100 bases in comparison to ethanol precipitation while the remaining 500 bases had comparable accuracy. The purification method used prior to CGE is of critical importance, as residual salt and ddNTPs decrease the amount of DNA injected into the capillary column while injected DNA template decreases read- length (Salas-Solano et al., 1998). This protocol also allows relatively facile

40 Molecular Tools for Nucleic Acid Analysis

separation of sequencing reaction products unlike previously described purification methods for CGE which have involved gel filtration steps (Ruiz-Martinez et al., 1998). A crucial factor in adapting this capture protocol to large-scale sample purification is the cost and throughput of the system. To make this procedure more cost-efficient it was investigated whether the magnetic beads could be re-used. Following elution of the captured sequencing product, the beads were briefly washed and re-used up to 12 times. There was no apparent loss in signal intensity and chromatograms showed low background, high accuracy and a high signal-to-noise ratio. We have also shown that this capture protocol is highly specific, as only the complementary sequencing fragments are extracted from a mix of sequencing products. This specificity has facilitated multiplex sequencing reactions (bi-directional sequencing of one or more vectors) where the individual products are iteratively captured by the different probe/bead sets reducing reagent costs considerably (Blomstergren, 2001). This protocol has now been adapted to an automated robotic workstation which allows a throughput of 384 reactions (obtained from 96 quatroplex or 192 duplex reactions) in two hours (Holmberg, 2001). In conclusion we have described a method which allows a high-throughput clean-up of cycle sequencing reactions resulting in highly accurate sequence data at a low cost that is suitable for capillary sequencers.

5 Analysis of mutations and polymorphisms

Conventional Sanger sequencing has been the workhorse of DNA sequencing laboratories for the past twenty years but faces limitations such as cost and throughput in light of the increasing reliance on genotypic data in the diagnostic and medical fields. Consequently numerous groups have focused on the development of alternate sequencing technologies such as SBH, pyrosequencing and enzymatic reactions on DNA microarrays. Here we have investigated the feasibility of using pyrosequencing to monitor the development of drug resistance mutations in HIV-1 (paper V) as well as developing an apyrase mediated allele-specific extension technique for SNP typing on DNA microarrays (paper VI).

41 D. O’ Meara

5.1 Monitoring drug resistance mutations in HIV-1 by pyrosequencing (V) Human immunodeficiency virus has caused approximately 20 million deaths world wide and today an estimated 36 million people are infected with HIV-1 (CDC, 2001). Human immunodeficiency virus infection is characterised by high rates of viral replication and destruction of CD4+ T-lymphocytes which ultimately leads to immune deficiency, acquired immunodeficiency syndrome (AIDS)-defining illnesses and death. Consequently, inhibition of viral replication has assumed primary importance in the control of AIDS. However, resistance to the currently available antiretroviral therapy directed against the reverse transcriptase (RT) and protease gene is a widespread problem affecting 30-50% of individuals receiving highly active antiretroviral therapy (HAART) (Wit et al., 1999). Due to the high replication rate of HIV-1 and the low fidelity of RT, various mutations are introduced into the HIV-1 gene pool leading to the evolution of genetically distinct viral variants termed quasispecies. If antiviral therapy is sub-optimal, variants with reduced sensitivity to the antiviral agents will be selected for (Condra, 1998). The presence of these drug resistance mutations can be tested phenotypically by measuring the degree of sensitivity to a drug in a (Hertogs et al., 1998). However, these tests are costly and time consuming and hence are not suitable for use in the routine management of HIV-1 patients. Due to advances in nucleic acid diagnostics and an increased understanding of the effects of resistance mutations on viral drug susceptibility, genotypic drug resistance is now more frequently used in clinical laboratories. Currently the three most widely used genotypic strategies to identity the mutant virus population are (i) by conventional Sanger sequencing which provides complete sequence information of a defined region, (ii) by hybridisation to specific probes that interrogate certain codons e.g. line probe assay (LiPA) and (iii) by hybridisation analysis to DNA chips. Sanger sequencing is seen as the gold standard for drug resistance testing but is labour-intensive and expensive for use in a routine setting with large inter-laboratory differences in the quality of the results (Schuurman et al., 1999). However, the introduction of dedicated kits (Visible Genetics and ABI) may improve overall sequencing results and increase throughput. A line probe assay for RT has been found to be accurate and reliable and more sensitive than sequencing for detection of mixed viruses (Stuyver et al., 1997). However it interrogates only nine codons and suffers from a high rate of hybridisation failures (Koch et al., 1999;

42 Molecular Tools for Nucleic Acid Analysis

Servais et al., 2001). A prototype assay for protease (interrogates eight codons) has found to be similar in performance to the RT line probe assay but without the high hybridisation failure rate (Servais et al., 2001). The GeneChip from Affymetrix consists of over 16,000 oligonucleotide probes complementary to protease and part of the RT region (Vahey et al., 1999). On the basis of the hybridisation pattern, the HIV- 1 genomic sequence and mutations are simultaneously identified. However the GeneChip is expensive and currently is not designed to identify insertions and deletions implicated in resistance to specific drugs. In paper V, the feasibility of using pyrosequencing as a genotyping tool to determine the presence of drug resistance mutations in the protease gene of HIV-1 was investigated. Twelve pyrosequencing primers were designed to sequence the 33 codons implicated in the 52 drug resistance mutations in the 297 bp of the protease gene (Kuiken et al., 2000). While the main focus was on the seven primary mutations (at codons 30, 46, 48, 50, 82, 84 and 90-represented by the black bars in Figure 13) as well as the eleven secondary mutations (represented by shaded bars in Figure 13), a further fifteen codons implicated in drug resistance were also sequenced.

10 20 468254 71 Indinavir

Ritonavir 48 90 Saquinavir 30 Nelfinavir 50 84 Amprenavir

Figure 13. The resistance patterns of HIV-1 protease inhibitors (PIs). The black bars represent primary mutations while the shaded bars represent secondar y mutations that are common to many PIs. An additional fifteen secondary mutations specific to individual PIs are not represented here.

Initially these twelve primers were evaluated on the HIV-1 Mn strain and on eight proviral DNA samples (previously sequenced by Sanger sequencing) with all codons implicated in drug resistance successfully sequenced and with an average read-length of 26 bases obtained for each primer. Some ambiguities were observed in the pyrograms due to positive or negative frameshifts. These frameshifts generally

43 D. O’ Meara appeared late in the extension reaction probably due to inefficient nucleotide degradation or enzyme contamination (Ronaghi, 2001). Nevertheless, these ambiguities did not affect the interpretation of nucleotide sequence as they appeared consistently which allowed for pattern discrimination between wild type and mutant sequences. Some discordances were observed between the pyrosequencing data and the Sanger sequencing data which can be explained by low viral DNA copy numbers where the input viral DNA for the individual PCRs may not be representative of the entire population. This pyrosequencing approach was then evaluated on clinical samples where four patients were monitored retrospectively for the development of drug resistance mutations over a two and a half-year period. Resistance to indinavir (IDV) was observed in patient 2 after two months of therapy and concurred with the appearance of primary and secondary drug resistance mutations to IDV at codon 46 and 10 respectively. Therapy was then switched to two other PIs but viral load remained high and several additional PI resistance mutations developed including the primary drug resistant mutations to ritonavir (RTV) and saquinavir (SQV). In patient 3, a secondary mutation was present as a minor variant in the PI-naïve sample but during suboptimal PI treatment it had a selective advantage over the other variants and thus became the dominant population. Resistance to IDV, RTV and SQV was observed in patient 4 but only one primary drug resistance mutation (to SQV) was observed. This could indicate problems with treatment adherence or drug absorption. A limitation of almost all resistance testing assays is their relative insensitivity to detect minority variants in the virus population. Here the limit of detection of pyrosequencing was determined to be 25%, which is comparable to conventional sequencing strategies (Leitner et al., 1993; Schuurman et al., 1999). Indeed a quasispecies present at 25% of the total population that was missed by conventional Sanger sequencing was detected by pyrosequencing and confirmed by clonal analysis. In conclusion, pyrosequencing is a rapid robust technique for monitoring the development of drug resistance mutations in the protease gene allowing the sequencing of eight patients in one hour. However, for pyrosequencing to be useful in drug resistance testing in a clinical setting the assay would need to be extended to include the sequencing of the drug resistance mutations in RT. This would require the design of at least another 30 sequencing primers, which is not a trivial task given the polymorphic nature of HIV-1. However, primer mismatches were generally well

44 Molecular Tools for Nucleic Acid Analysis

tolerated in this study, as there were only three cases where mismatches lead to poor sequencing results with these 12 pimers. While base calling and pattern recognition was performed manually here, software to perform automatic base calling has now been developed but interpretation of mixed variants must still be performed manually and would not be practical in a routine clinical laboratory. However, the appearance of an unexpected pattern could indicate that a polymorphism exists at a particular codon and pyrosequencing could then be used for clonal analysis of this sample to determine whether mixed variants are present in the population.

5.2 SNP typing on DNA microarrays (VI) As well as monitoring the development of mutations in the HIV-1 viral nucleic acid we have also typed (by allele-specific extension) a polymorphism in the chemokine receptor CCR5, which is believed to confer resistance to HIV-1 infection or at least delay the development of AIDS (Dean et al., 1996). This 32 bp deletion lies on the short arm of chromosome 3 (3p21) with individuals homozygous for the deletion being largely resistant to HIV-1 infection despite multiple sexual contacts with HIV-1 infected individuals, while heterozygotes are partially protected against disease progression. The deletion results in a severely truncated CCR5 protein, which is not presented at the cell surface and therefore cannot act as a co-receptor for HIV-1 (Liu et al., 1996). It is believed that a number of other SNPs could also confer resistance (Paxton and Kang, 1998) but association studies need to be carried out before they can be associated with delayed disease progression or increased resistance to HIV-1. To facilitate such large-scale association studies, high-throughput, accurate and robust techniques are required to screen candidate SNPs. Here we have developed a microarray based allele-specific extension method for the accurate typing of SNPs which has the potential to analyse up to 12,500 SNPs per microarray. Allele-specific extension methodology was first described by Newton in 1989 who used allele-specific primers (with their 3′ terminus annealing at the variant position) for amplification (Newton et al., 1989). The specificity of this ARMS technique depends on the ability of polymerases to bind and extend perfectly matched primers in comparison to a primer that is mismatched at the 3′ terminus (Newton et al., 1989). However a problem which has greatly hampered the development of these allele based discrimination methods is that some mismatches are not refractory to extension

45 D. O’ Meara

(Newton et al., 1989; Wu et al., 1989; Kwok et al., 1990; Ayyadevara et al., 2000), which can result in incorrect genotyping and almost certainly reduces the discriminatory power of the method. Mismatches such as GT or CA are clearly not refractory to extension (Newton et al., 1989) and in two independent studies it was observed that generally all T mismatches were well extended in addition to CA mismatches (Kwok et al., 1990; Huang et al., 1992). The addition of apyrase, a nucleotide degrading enzyme, to allele-specific extension reactions to minimise the extension of such mismatches has previously been described (Ahmadian et al., 2001). This AMASE assay exploits the fact that DNA polymerases exhibit slower reaction kinetics when extending a mismatched primer compared to a perfectly matched primer allowing incorporation of nucleotides when the reaction kinetics are fast but degrading the nucleotides before extension when the reaction kinetics are slow. The principle has previously been illustrated in a real-time bioluminometric detection format where inclusion of apyrase resulted in the correct typing of eight variants which were incorrectly or ambiguously typed when apyrase was omitted (Ahmadian et al., 2001). Here using microarray technology, the genotyping of twelve polymorphisms in seventeen individuals based on this principle is described. These twelve polymorphisms consist of ten single nucleotide substitutions in genes that are implicated as risk factors for myocardial infarction (Odeberg et al., 2001) and in the p53 tumour suppresser gene (Ahmadian et al., 2000). An insertion/deletion in a candidate gene for coronary heart disease (Odeberg et al., 2001) and in CCR5 were also typed (Dean et al., 1996). A pair of allele-specific extension primers (with differing 3′ termini) were used for each SNP which generated six A(primer):C(template)/G:T mismatches, two C:A/T:G, two C:T/A:G and one of each of the mismatches G:A/T:C and G:G/C:C as well as generating the perfectly matched primer:template duplexes. These extension primers were covalently coupled to the slide surface thereby facilitating hybridisation and in situ extension of target DNA. Initially the conditions for hybridisation of target DNA to the microarray were optimised and factors such as a spacer sequence and stabilising probes were investigated. Synthesis of the extension primers with a 15 T spacer at the 5′ end improved the hybridisation signal presumably because of the reduced steric hindrance between target and probe as suggested by previous studies (Southern et al., 1999). It

46 Molecular Tools for Nucleic Acid Analysis was also investigated whether the modular probe effect could be extended to microarrays to improve the amount of target hybridised to the printed oligonucleotides. A 1.5 to a 5-fold improvement in the absolute amount of methylenetetrahydrofolate reductase (MTHFR) and p53 DNA captured was observed when a modular probe was prehybridised to the target DNA (Figure 14). Although the role of base stacking is unclear here due to the presence of the 15 T spacer, opening up of secondary structure is probably involved. Modular probes were thus employed for all 12 polymorphisms typed and it is envisioned that they will be particularly beneficial when AMASE is scaled up to type thousands of SNPs using multiplex PCR products. In previous allele-specific extension assays performed on microarrays, a reduction in absolute signal was observed when multiplex PCR products were used (Pastinen et al., 2000). Thus the modular probes may help achieve higher overall signals if employed in such scaled up assay formats.

s/s p53 s/s p53

+ - Modular probe

16000

12000

8000

4000

0 + - Modular probe

Figure 14. Improvement of hybridisation efficiency using modular primer on oligonucleotide microarrays. Fluorescently labelled single-strand p53 DNA (with and without a modular primer) was hybridised to the immobilised capture probe. The slides were scanned and the raw data analysed, which shows a 5-fold improvement in the amount of DNA captured when a modular probe was included.

The presence/absence of apyrase and its effect on genotyping using oligonucleotide microarrays was then investigated. When allele-specific extension was performed without apyrase, five of the thirty-six (14%) different variants typed were incorrectly scored. These were all homozygous variants that were scored as heterozygous when

47 D. O’ Meara apyrase was omitted, due to equal extension of the matched and mismatched primer. These five variants comprised of three nucleotide substitutions, a 4G/5G insertion and the 32 bp deletion. Not surprisingly, four of the five mismatches that were extended in the absence of apyrase were mismatches that are difficult to discriminate i.e. G:T, T:G and A:C. AMASE was then applied to the typing of the ten SNPs and two insertion/deletion polymorphisms in seventeen individuals generating approximately 200 genotypes. Cluster analysis was performed on this data by plotting allelic fraction (estimates the relative amount of the major allele in the sample) versus the total fluorescence which showed a clear clustering of the allelic fractions for all twelve polymorphisms corresponding to the three possible genotypes (Figure 15).

6

5

4

3

2

Log total fluorescence Log total 1

0 0 0.2 0.4 0.6 0.8 1 Allelic fraction

Figure 15. Cluster diagram showing the genotype assignment for the 17 individuals at each of the twelve polymorphic positions. The relative allelic fraction versus the log of the total fluorescence signal is plotted for each polymorphism. The dashed lines represents the boundaries for scoring a sample homozygous (≤ 0.2 and ≥ 0.8) or heterozygous (0.29-0.71).

Boundaries were then calculated allowing a deviation of 3-fold the standard deviation from the expected allelic fractions of 0.5 or 0.001/0.999 for heterozygous and homozygous samples respectively. These boundaries determined the range of allelic frequencies a sample had to fall between to be scored as heterozygous or homozygous (for the major or minor allele). This means that for a sample to be called heterozygous it must have an allele fraction between 0.29-0.71. The allelic fraction for the 58 heterozygous samples scored here ranged from 0.35 to 0.65 which corresponds to

48 Molecular Tools for Nucleic Acid Analysis

extension ratios <2 (i.e. the extension signal of one primer is no more than 2-fold greater than the extension signal of the other primer). If a sample is to be scored as homozygous for the minor or wild type allele it must have an allele fraction ≤0.2 or ≥0.8 respectively. In this study, allele fractions ranging from 0.01-0.18 for the minor allele and from 0.86-0.99 for the major allele were observed. These boundaries mean that for a sample to be called homozygous, one primer most be extended at least 4- fold more than the other primer. The ratios obtained here after extension with apyrase ranged from 4.6 to 314 with an average ratio of 23. This microarray based allele-specific extension assay is a rapid, robust and accurate method for large-scale genotyping. The assay can be completed in less than an hour (excluding printing of the array, sample preparation and data analysis) with current technology having the capacity to interpret up to 12,500 SNPs. The key parameters for consideration for large-scale genotyping were recently outlined by Lai and included the cost (<$0.2 per genotype), no-call rate (should be <10%), error rate (should be <1%), throughput (≥ 50,000 genotypes per day) and potential for multiplexing (≥ 10 x) (Lai, 2001). Here we have addressed at least two of these parameters i.e. the no call rate and error rate. Other microarray based allele specific extension formats have a high error/no call rate with samples scored as homozygous even though extension ratios of only 1.4 were obtained (Erdogan et al., 2001). Despite the setting of boundaries for SNP scoring, there were no samples here that failed to be typed when apyrase was included. Indeed all 198 genotypes (which included at least one sample of each variant) were scored correctly. However the degree of multiplexing still needs to be addressed which is a problem for most of the available SNP technologies with techniques such as rolling-circle perhaps offering a solution to this problem (Lizardi et al., 1998). Apyrase could also be included in SBE minisequencing formats where the accuracy of typing depends on how reliably the DNA polymerase discriminates between incorporation of a matched or mismatched dideoxynucleotide (Syvänen et al., 1993). Like allele-specific extension, these SBE assays are also prone to polymerase errors due to mispriming of the minisequencing extension primer (Fan et al., 2000; Prince et al., 2001). However allele-specific extension methods requiring only a single detection reaction of one fluorophore are more compatible with large-scale genotyping than SBE formats where a number of separate extension or detection

49 D. O’ Meara reactions have to be carried out (Fan et al., 2000). Another advantage of this AMASE method is that extension is performed in situ on the microarray rather than in solution as described for many of the SBE techniques (Fan et al., 2000; Hirschhorn et al., 2000).

6 Concluding remarks

In conclusion, a variety of molecular tools such as biosensor technology, magnetic beads, pyrosequencing, apyrase mediated allele-specific extension and oligonucleotide microarrays have been used for nucleic acid analysis. Biosensor technology was used to investigate the kinetic properties of hybridisation, which led to the development of stabilising modular probes to improve the capture of nucleic acids. Such stabilising probes were used to extract viral RNA and to clean-up sequencing products resulting in sensitive extraction protocols that are amenable to automation. A pilot study of pyrosequencing illustrated that it could be employed to rapidly screen HIV-1 patients for drug resistance mutations in the protease gene. Finally, DNA microarray technology was used in the development of a high- throughput genotyping method based on AMASE which has the potential to type up to 12,500 SNPs in one hour.

50 Molecular Tools for Nucleic Acid Analysis

Abbreviations

AMASE apyrase mediated allele-specific extension ARMS amplification refractory mutation system ASO allele-specific oligonucleotide ATP adenosine triphosphate bDNA branched DNA bp basepairs CCR5 cysteine-cystein chemokine recepetor cDNA complementary DNA CFLP cleavage fragment length polymorphism CGE capillary gel electrophoresis DGGE denaturing gradient gel electrophoresis DHPLC denaturing high performance liquid chromatography DNA deoxyribonucleic acid dNTP deoxynucleotide triphosphate ddNTP dideoxynucleotide triphosphate HA hetroduplex analysis HAART highly active antiretroviral therapy HBV hepatitis B virus HCV hepatitis C virus HGP Human Genome Project HIV human immunodeficiency virus HPV human papilloma virus FRET fluorescence resonance energy transfer IDV indinavir Kb kilo base LCR ligase chain reaction MALDI matrix-assisted laser desorption ionisation Mb mega bases mRNA messenger RNA MS mass spectroscopy NASBA nucleic acid based sequence amplification NAT nucleic acid technology NTR non-translated region OLA oligonucleotide ligation assay PCR polymerase chain reaction QC-PCR quantitative competitive PCR PI protease inhibitor PNA peptide nucleic acid PPi pyrophosphate RCA rolling-circle amplification RFLP restriction fragment length polymorphism RNA ribonucleic acid RT reverse transcriptase RTV ritonavir RU resonance units SBE single base extension SBH sequencing by hybridisation SNP single nucleotide polymorphism SPR surface plasmon resonance SSCP single-strand conformation polymorphism SQV saquinavir TOF time of flight TSC The SNP Consortium

51 D. O’ Meara

Acknowledgements

I am grateful to the entire staff at the Department of Biotechnology for creating an excellent working environment and for your friendship and help over the past few years. I am especially grateful to;

My supervisor, Prof. Joakim Lundeberg for all your help when I first arrived in Stockholm, for your support and encouragement, for giving me so many opportunities, for helping me reach this point and for your unending enthusiasm.

Prof. Mathias Uhlén, for accepting me as a PhD. student and for the creative atmosphere you bring to the DNA corner.

Shea Fanning of Cork Institute of Technology, for introducing me to the world of molecular biology and for giving me the inspiration and encouragement to come here to do a PhD.

Per-Åke for your thoughtful reflections on modular probes and hybridisation and Stefan for helping things go smoothly at the beginning.

Jacob Odeberg for “showing me the ropes” in the lab and for good collaborations. Peter Nilsson for answering all my trivial questions about biacore and microarrays and for the work with the Mut S project that never came to be. Afshin Ahmadian for all your help in the last few months and for interesting discussions on AMASE and Anna Blomstergren for successfully using the modular probes with sequencing products.

My collaborators, Jan Albert, Thomas Leitner and Karin Wilbe for teaching me some of your knowledge about HIV, Zhibing Yun and Anders Sönnerberg for good work on the HCV project and to Dynal AS, Oslo for fruitful projects and for supplying me with beads.

Påls group, Baback and Carlos for friendship and help with pyrosequencing and Tommy and Jonas for the use of your enzymes.

Everybody in the department, but especially Eric and Anders for keeping the computers and network running smoothly, Bahram and Co. for all the sequencing runs, Kicki for keeping everything in order in relation to PCR, all in ex “Blå-kulla” (and Anna G, Åsa, Cissi and Malin N) for help with various things, Monica and Pia for taking care of administrative matters, Afshin, Torbjörn and Christin for answering my various questions about printing this thesis, everybody in the ex “storra labbet” for pleasant company and Tanya, Joke, Bertha, Naowarat and Mostafa for friendship and help.

52 Molecular Tools for Nucleic Acid Analysis

My Irish friends here, with a special thanks to Mary and Barbara for all your help and companionship, for the shared experiences of being a PhD. student in Stockholm and not least for the laughs we had. To Susan and Hans (and Ellen) for friendship, dinners and memorable parties. To Alison for her friendship and John Kennedy for introducing me to a large circle of friends in my first few months here.

Anders, Tuan and Christina, Vinay and Clara, Helena and Caj, Chris and Anna for the good times we have had in Stockholm.

My friends in Ireland; Sheila and James, Karen, Caroline and Jim, Dorothy, Louise and Gaye.

A heartfelt thanks to my family, Carmel and Laura, Catriona, Joseph, Paul and my parents for all your support during the years.

To Allen, for your love, support and understanding. I would never have made it without you.

53 D. O’ Meara

References

Adams, M. D., Celniker, S. E., Holt, R. A., Evans, C. A., Gocayne, J. D., Amanatides, P. G., Scherer, S. E., Li, P. W., Hoskins, R. A., and Galle, R. F. 2000. The genome sequence of Drosophila melanogaster. Science 287:2185-95.

Agaton, C., Sievertzon, M., Unneberg, P., Holmberg, A., Larsson, M., Odeberg, J., Uhlén, M., and J, L. 2001. Gene Expression Analysis by Signature Pyrosequencing. Submitted.

Ahmadian, A., Gharizadeh, B., Gustafsson, A. C., Sterky, F., Nyrén, P., Uhlén, M., and Lundeberg, J. 2000. Single-nucleotide polymorphism analysis by pyrosequencing. Anal Biochem 280:103-10.

Ahmadian, A., Gharizadeh, B., O' Meara, D., Odeberg, J., and Lundeberg, J. 2001. Genotyping by apyrase mediated allele-specific extension. Submitted.

Anderson, R. C., Su, X., Bogdan, G. J., and Fenton, J. 2000. A miniature integrated device for automated multistep genetic assays. Nucleic Acids Res 28:E60.

Ansorge, W., Sproat, B. S., Stegemann, J., and Schwager, C. 1986. A non-radioactive automated method for DNA sequence determination. J Biochem Biophys Methods 13:315-23.

Arens, M. 2001. Clinically relevant sequence-based genotyping of HBV, HCV, CMV, and HIV. J Clin Virol 22:11-29.

Avery, O. T., MacLeod, C. M., and MacCarthy, M. 1944. Studies on the chemical nature of the substance inducing transformationof pneumoccal types. J. Exp. Med. 79:137-158.

Ayyadevara, S., Thaden, J. J., and Shmookler Reis, R. J. 2000. Discrimination of primer 3'-nucleotide mismatch by taq DNA polymerase during polymerase chain reaction. Anal Biochem 284:11-8.

Bains, W., and Smith, G. C. 1988. A novel method for nucleic acid sequence determination. J Theor Biol 135:303-7.

Baner, J., Nilsson, M., Mendel-Hartvig, M., and Landegren, U. 1998. Signal amplification of padlock probes by rolling circle replication. Nucleic Acids Res 26:5073-8.

Barany, F. 1991. Genetic disease detection and DNA amplification using cloned thermostable ligase. Proc Natl Acad Sci U S A 88:189-93.

Becker-Andre, M., and Hahlbrock, K. 1989. Absolute mRNA quantification using the polymerase chain reaction (PCR). A novel approach by a PCR aided transcript titration assay (PATTY). Nucleic Acids Res 17:9437-46.

54 Molecular Tools for Nucleic Acid Analysis

Berg, C., Hedrum, A., Holmberg, A., Pontén, F., Uhlén, M., and Lundeberg, J. 1995. Direct solid-phase sequence analysis of the human p53 gene by use if multiplex polymerase chain reaction and α-thiotriphosphate nucleotides. Clin. Chem. 41:1461- 1466.

Blanchard, A. 1998. Synthetic DNA arrays. Genet Eng 20:111-23.

Blohm, D. H., and Guiseppi-Elie, A. 2001. New developments in microarray technology. Curr Opin Biotechnol 12:41-7.

Blomstergren, A. 2001. Automated solid phase capture of multiplex cycle sequencing products. Unpublished.

Braun, A., Little, D. P., and Koster, H. 1997. Detecting CFTR gene mutations by using primer oligo base extension and mass spectrometry. Clin Chem 43:1151-8.

Brinchmann, J. E., Albert, J., and Vartdal, F. 1991. Few infected CD4+ T cells but a high proportion of replication- competent provirus copies in asymptomatic human immunodeficiency virus type 1 infection. J Virol 65:2019-23.

Brooks, E. M., Sheflin, L. G., and Spaulding, S. W. 1995. Secondary structure in the 3' UTR of EGF and the choice of reverse transcriptases affect the detection of message diversity by RT-PCR. Biotechniques 19:806-815.

Brown, E. A., Zhang, H., Ping, L. H., and Lemon, S. M. 1992. Secondary structure of the 5´ nontranslated regions of hepatitis C virus and pestivirus genomic . Nucleic Acids Res. 20:5041-5045.

Bunemann, H. 1982. Immobilization of denatured DNA to macroporous supports: II. Steric and kinetic parameters of heterogeneous hybridization reactions. Nucleic Acids Res 10:7181-96.

Bustin, S. A. 2000. Absolute quantification of mRNA using real-time reverse transcription polymerase chain reaction assays. J Mol Endocrinol 25:169-93.

CDC. The Center for Disease Control and Prevention:The global HIV and AIDS epidemic, 2001. Jama 285:3081-3.

Chamberlain, J., Gibbs, R., Ranier, J., Nguyen, P., and Caskey, C. 1988. Deletion screening of the Duchenne muscular dystrophy locus via multiplex DNA amplification. Nucleic Acid Res 16:11141-11156.

Chee, M., Yang, R., Hubbell, E., Berno, A., Huang, X. C., Stern, D., Winkler, J., Lockhart, D. J., Morris, S. F., and Fodor, S. P. 1996. Accessing genetic information with high-density DNA arrays. Science 274:610-614.

Cheng, J., Sheldon, E. L., Wu, L., Uribe, A., Gerrue, L. O., Carrino, J., Heller, M. J., and O'Connell, J. P. 1998. Preparation and hybridization analysis of DNA/RNA from E. coli on microfabricated bioelectronic chips. Nat Biotechnol 16:541-6.

55 D. O’ Meara

Cheskis, B., and Freedman, L. P. 1996. Modulation of nuclear receptor interactions by ligands: kinetic analysis using surface plasmon resonance. Biochemistry 35:3309-18.

Cho, R. J., Mindrinos, M., Richards, D. R., Sapolsky, R. J., Anderson, M., Drenkard, E., Dewdney, J., Reuber, T. L., Stammers, M., and Federspiel, N. 1999. Genome-wide mapping with biallelic markers in Arabidopsis thaliana. Nat Genet 23:203-7.

Chomczynski, P., and Sacchi, N. 1987. Single-step method of RNA isolation by acid guanidinium thiocyante-phenol-chloroform extraction. Anal. Biochem. 162:156-159.

Choo, Q. L., Richman, K. H., Han, J. H., Berger, K., Lee, C., Dong, C., Gallegos, C., Coit, D., Medina-Selby, A., Barr, P. J., Weiner, A. J., Bradley, D. W., Kuo, G., and Houghton, M. 1991. Genetic organization and diversity of the hepatitis C virus. Proc. Natl. Acad. Sci. USA 88:2451-2455.

Chou, Q., Russell, M., Birch, D. E., Raymond, J., and Bloch, W. 1992. Prevention of pre-PCR mis-priming and primer dimerization improves low-copy-number amplifications. Nucleic Acids Res 20:1717-23.

Compton, J. 1991. Nucleic acid sequence-based amplification. Nature 350:91-2.

Condra, J. H. 1998. Resistance to HIV protease inhibitors. Haemophilia 4:610-5.

Cronin, M. T., Fucini, R. V., Kim, S. M., Masino, R. S., Wespi, R. M., and Miyada, C. G. 1996. Cystic fibrosis mutation detection by hybridization to light-generated DNA probe arrays. Hum Mutat 7:244-55. de Saizieu, A., Certa, U., Warrington, J., Gray, C., Keck, W., and Mous, J. 1998. Bacterial transcript imaging by hybridization of total RNA to oligonucleotide arrays. Nat Biotechnol 16:45-8. de Silva, D., and Wittwer, C. T. 2000. Monitoring hybridization during polymerase chain reaction. J Chromatogr B Biomed Sci Appl 741:3-13.

Dean, M., Carrington, M., Winkler, C., Huttley, G. A., Smith, M. W., Allikmets, R., Goedert, J. J., Buchbinder, S. P., Vittinghoff, E., and Gomperts, E. 1996. Genetic restriction of HIV-1 infection and progression to AIDS by a deletion allele of the CKR5 structural gene. Hemophilia Growth and Development Study, Multicenter AIDS Cohort Study, Multicenter Hemophilia Cohort Study, San Francisco City Cohort, ALIVE Study. Science 273:1856-62.

DeRisi, J., Penland, L., Brown, P. O., Bittner, M. L., Meltzer, P. S., Ray, M., Chen, Y., Su, Y. A., and Trent, J. M. 1996. Use of a cDNA microarray to analyse gene expression patterns in human cancer. Nat Genet 14:457-60.

Don, R. H., Cox, P. T., Wainwright, B. J., Baker, K., and Mattick, J. S. 1991. 'Touchdown' PCR to circumvent spurious priming during gene amplification. Nucleic Acids Res 19:4008.

56 Molecular Tools for Nucleic Acid Analysis

Drmanac, R., Labat, I., Brukner, I., and Crkvenjakov, R. 1989. Sequencing of megabase plus DNA by hybridization: theory of the method. Genomics 4:114-28.

Drmanac, S., Kita, D., Labat, I., Hauser, B., Schmidt, C., Burczak, J. D., and Drmanac, R. 1998. Accurate sequencing by hybridization for DNA diagnostics and individual genomics. Nat Biotechnol 16:54-8.

Duck, P., Alvarado-Urbina, G., Burdick, B., and Collier, B. 1990. Probe amplifier system based on chimeric cycling oligonucleotides. Biotechniques 9:142-8.

Edman, C. F., Raymond, D. E., Wu, D. J., Tu, E., Sosnowski, R. G., Butler, W. F., Nerenberg, M., and Heller, M. J. 1997. Electric field directed nucleic acid hybridization on microchips. Nucleic Acids Res 25:4907-14.

Eisen, M., and Brown, P. 1999. DNA arrays for analysis of gene expression. In: Methods in Enzymology:cDNA preparation and characterisation. (S. Weissman, ed.), Academic Press, pp. 179-205.

Eisen, M. B., Spellman, P. T., Brown, P. O., and Botstein, D. 1998. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A 95:14863- 8.

Erdogan, F., Kirchner, R., Mann, W., Ropers, H. H., and Nuber, U. A. 2001. Detection of mitochondrial single nucleotide polymorphisms using a primer elongation reaction on oligonucleotide microarrays. Nucleic Acids Res 29:E36.

Erlich, H. A., Gelfand, D., and Sninsky, J. J. 1991. Recent advances in the polymerase chain reaction. Science 252:1643-51.

Fan, J. B., Chen, X., Halushka, M. K., Berno, A., Huang, X., Ryder, T., Lipshutz, R. J., Lockhart, D. J., and Chakravarti, A. 2000. Parallel genotyping of human SNPs using generic high-density oligonucleotide tag arrays. Genome Res 10:853-60.

Fangan, B. M., Dahlberg, O. J., Deggerdal, A. H., Bosnes, M., and Larsen, F. 1999. Automated system for purification of dye-terminator sequencing products eliminates up-stream purification of templates. Biotechniques 26:980-3.

Fanson, B. G., Osmack, P., and Di Bisceglie, A. M. 2000. A comparison between the phenol-chloroform method of RNA extraction and the QIAamp viral RNA kit in the extraction of hepatitis C and GB virus- C/hepatitis G viral RNA from serum. J Virol Methods 89:23-7.

Faruqi, A. F., Hosono, S., Driscoll, M. D., Dean, F. B., Alsmadi, O., Bandaru, R., Kumar, G., Grimwade, B., Zong, Q., Sun, Z., Du, Y., Kingsmore, S., Knott, T., and Lasken, R. S. 2001. High-throughput genotyping of single nucleotide polymorphisms with rolling circle amplification. BMC Genomics 2:4.

Ferguson, J. A., Steemers, F. J., and Walt, D. R. 2000. High-density fiber-optic DNA random microsphere array. Anal Chem 72:5618-24.

57 D. O’ Meara

Fredricks, D. N., and Relman, D. A. 1999. Application of polymerase chain reaction to the diagnosis of infectious diseases. Clin Infect Dis 29:475-86; quiz 487-8.

Fry, G., Lachenmeier, E., Mayrand, E., Giusti, B., Fisher, J., Johnston-Dow, L., Cathcart, R., Finne, E., and Kilaas, L. 1992. A new approach to template purification for sequencing applications using paramagnetic particles. Biotechniques 13:124-131.

Fu, D. J., Tang, K., Braun, A., Reuter, D., Darnhofer-Demar, B., Little, D. P., O'Donnell, M. J., Cantor, C. R., and Koster, H. 1998. Sequencing 5 to 8 of the p53 gene by MALDI-TOF mass spectrometry. Nat Biotechnol 16:381-4.

Garcia, C. A., Ahmadian, A., Gharizadeh, B., Lundeberg, J., Ronaghi, M., and Nyrén, P. 2000. Mutation detection by pyrosequencing: sequencing of exons 5-8 of the p53 tumor suppressor gene. Gene 253:249-57.

Garland, P. B. 1996. Optical evanescent wave methods for the study of biomolecular interactions. Q Rev Biophys 29:91-117.

Gerhold, D., Lu, M., Xu, J., Austin, C., Caskey, C. T., and Rushmore, T. 2001. Monitoring expression of genes involved in drug metabolism and toxicology using DNA microarrays. Physiol Genomics 5:161-70.

Gerhold, D., Rushmore, T., and Caskey, C. T. 1999. DNA chips: promising toys have become powerful tools. Trends Biochem Sci 24:168-73.

Germer, J. J., Rys, P. N., Thorvilson, J. N., and Persing, D. H. 1999. Determination of hepatitis C virus genotype by direct sequence analysis of products generated with the Amplicor HCV test. J Clin Microbiol 37:2625-30.

Gharizadeh, B., Kalantari, M., Garcia, C. A., Johansson, B., and Nyrén, P. 2001. Typing of human papillomavirus by pyrosequencing. Lab Invest 81:673-9.

Gilliland, G., Perrin, S., Blanchard, K., and Bunn, H. F. 1990. Analysis of cytokine mRNA and DNA: detection and quantitation by competitive polymerase chain reaction. Proc Natl Acad Sci U S A 87:2725-9.

Goffeau, A., Barrell, B. G., Bussey, H., Davis, R. W., Dujon, B., Feldmann, H., Galibert, F., Hoheisel, J. D., Jacq, C., and Johnston, M. 1996. with 6000 genes. Science 274:546, 563-7.

Graber, J. H., Smith, C. L., and Cantor, C. R. 1999. Differential sequencing with mass spectrometry. Genet Anal 14:215-9.

Graves, D. J. 1999. Powerful tools for genetic analysis come of age. Trends Biotechnol 17:127-34.

Greenfield, L., and White, T. J. 1993. Sample preparation methods. In: Diagnostic Molecular Microbiology:Principles and Applications (D. H. Persing, T. F. Smith, F. C. Tenover, and T. J. White, eds.), ASM Press, Washington, D.C., pp. p122-137.

58 Molecular Tools for Nucleic Acid Analysis

Guatelli, J. C., Whitfield, K. M., Kwoh, D. Y., Barringer, K. J., Richman, D. D., and Gingeras, T. R. 1990. Isothermal, in vitro amplification of nucleic acids by a multienzyme reaction modeled after retroviral replication. Proc Natl Acad Sci U S A 87:7797.

Guo, Z., Guilfoyle, R. A., Thiel, A. J., Wang, R., and Smith, L. M. 1994. Direct fluorescence analysis of genetic polymorphisms by hybridization with oligonucleotide arrays on glass supports. Nucleic Acids Res 22:5456-65.

Guo, Z., Liu, Q., and Smith, L. M. 1997. Enhanced discrimination of single nucleotide polymorphisms by artificial mismatch hybridization. Nat Biotechnol 15:331-5.

Gustafsson, A. C., Guo, Z., Hu, X., Ahmadian, A., Brodin, B., Nilsson, A., Ponten, J., Ponten, F., and Lundeberg, J. 2001. HPV-related cancer susceptibility and p53 codon 72 polymorphism. Acta Derm Venereol 81:125-9.

Hacia, J. G., Brody, L. C., Chee, M. S., Fodor, S. P., and Collins, F. S. 1996. Detection of heterozygous mutations in BRCA1 using high density oligonucleotide arrays and two-colour fluorescence analysis. Nat. Genet. 14:441-447.

Haff, L. A., and Smirnov, I. P. 1997. Single-nucleotide polymorphism identification assays using a thermostable DNA polymerase and delayed extraction MALDI-TOF mass spectrometry. Genome Res 7:378-88.

Hall, J. G., Eis, P. S., Law, S. M., Reynaldo, L. P., Prudent, J. R., Marshall, D. J., Allawi, H. T., Mast, A. L., Dahlberg, J. E., Kwiatkowski, R. W., de Arruda, M., Neri, B. P., and Lyamichev, V. I. 2000. Sensitive detection of DNA polymorphisms by the serial invasive signal amplification reaction. Proc Natl Acad Sci U S A 97:8272-7.

Haqqi, T. M., Sarkar, G., David, C. S., and Sommer, S. S. 1988. Specific amplification with PCR of a refractory segment of genomic DNA. Nucleic Acids Res 16:11844.

Hawkins, A., Davidson, F., and Simmonds, P. 1997. Comparison of plasma virus loads among individuals infected with hepatitis C virus (HCV) genotypes 1, 2, and 3 by quantiplex HCV RNA assay versions 1 and 2, Roche Monitor assay, and an in- house limiting dilution method. J Clin Microbiol 35:187-92.

Head, S. R., Rogers, Y. H., Parikh, K., Lan, G., Anderson, S., Goelet, P., and Boyce- Jacino, M. T. 1997. Nested genetic bit analysis (N-GBA) for mutation detection in the p53 tumor suppressor gene. Nucleic Acids Res 25:5065-71.

Heal, A. 2000. Molecular biology automation in the clinical laboratory. J Clinical Ligand Assay 23:57-65.

Heermann, K. H., Hagos, Y., and Thomssen, R. 1994. Liquid-phase hybridization and capture of hepatitis B virus DNA with magnetic beads and fluorescence detection of PCR product. J Virol Methods 50:43-57.

59 D. O’ Meara

Heid, C. A., Stevens, J., Livak, K. J., and Williams, P. M. 1996. Real time quantitative PCR. Genome Res 6:986-94.

Hertogs, K., de Bethune, M. P., Miller, V., Ivens, T., Schel, P., Van Cauwenberge, A., Van Den Eynde, C., Van Gerwen, V., Azijn, H., and Van Houtte, M. 1998. A rapid method for simultaneous detection of phenotypic resistance to inhibitors of protease and reverse transcriptase in recombinant human immunodeficiency virus type 1 isolates from patients treated with antiretroviral drugs. Antimicrob Agents Chemother 42:269-76.

Hessner, M. J., Budish, M. A., and Friedman, K. D. 2000. Genotyping of factor V G1691A (Leiden) without the use of PCR by invasive cleavage of oligonucleotide probes. Clin Chem 46:1051-6.

Hirschhorn, J. N., Sklar, P., Lindblad-Toh, K., Lim, Y. M., Ruiz-Gutierrez, M., Bolk, S., Langhorst, B., Schaffner, S., Winchester, E., and Lander, E. S. 2000. SBE-TAGS: an array-based method for efficient single-nucleotide polymorphism genotyping. Proc Natl Acad Sci U S A 97:12164-9.

Holland, P. M., Abramson, R. D., Watson, R., and Gelfand, D. H. 1991. Detection of specific polymerase chain reaction product by utilizing the 5'--3' exonuclease activity of Thermus aquaticus DNA polymerase. Proc Natl Acad Sci U S A 88:7276-80.

Holmberg, A. 2001. Novel automated workstations for large-scale sequence reaction cleanup. Unpublished.

Hsuih, T. C. H., Park, Y. N., Zaretsky, C., Wu, F., Tyagi, S., Kramer, F. R., Sperling, R., and Zhang, D. Y. 1996. Novel, ligation-dependent PCR assay for detection of hepatitis C virus in serum. J. Clin. Microbiol. 34:501-507.

Huang, M. M., Arnheim, N., and Goodman, M. F. 1992. Extension of base mispairs by Taq DNA polymerase: implications for single nucleotide discrimination in PCR. Nucleic Acids Res 20:4567-73.

Innis, M. A., Myambo, K. B., Gelfand, D. H., and Brow, M. A. 1988. DNA sequencing with Thermus aquaticus DNA polymerase and direct sequencing of polymerase chain reaction-amplified DNA. Proc Natl Acad Sci U S A 85:9436-40.

Johnson, J. R. 2000. Development of polymerase chain reaction-based assays for bacterial gene detection. J Microbiol Methods 41:201-9.

Jönsson, U., Fägerstam, L., Ivarsson, B., Johnsson, B., Karlsson, R., Lundh, K., Löfås, S., Persson, B., Roos, H., and Rönnberg, I. 1991. Real-time biospecific interaction analysis using surface plasmon resonance and sensor chip technology. Biotechniques 11:620-627.

Ju, J., Ruan, C., Fuller, C. W., Glazer, A. N., and Mathies, R. A. 1995. Fluorescence energy transfer dye-labeled primers for DNA sequencing and analysis. Proc Natl Acad Sci U S A 92:4347-51.

60 Molecular Tools for Nucleic Acid Analysis

Kalinina, O., Lebedeva, I., Brown, J., and Silver, J. 1997. Nanoliter scale PCR with TaqMan detection. Nucleic Acids Res 25:1999-2004.

Kandimalla, E. R., Manning, A., Lathan, C., Byrn, R. A., and Agrawal, S. 1995. Design, biochemical, biophysical and biological properties of cooperative antisense oligonucleotides. Nucleic Acids Research 23:3578-3584.

Kato, N., Hijikata, M., Ootsuyama, Y., Nakagawa, M., Ohkoshi, S., Sugiymura, S., and Shimotohno, K. 1990. Molecular cloning of the human hepatitis C virus genome from Japanese patients with non-A, non-B hepatitis. Proc. Natl. Acad. Sci. USA 87:9524-9528.

Kharpko, K. R., Lysov, A. A., Ivanov, I. B., Yershov, G. M., S.K., V., V.L., F., and Mirzabekov, A. D. 1991. A method for DNA sequencing by hybridization with oligonucelotide matrix. J. DNA Seq. Mapp. 1:375-388.

Khrapko, K. R., Lysov , Y. P., Khorlyn , A. A., Shick, V. V., Florentiev, V. L., and Mirzabekov, A. D. 1989. An oligonucleotide hybridization approach to DNA sequencing. FEBS Lett. 256:118-122.

Kieleczawa, J., Dunn, J. J., and Studier, F. W. 1992. DNA sequencing by primer walking with strings of contiguous hexamers. Science 258:1787-1791.

Koch, N., Yahi, N., Colson, P., Fantini, J., and Tamalet, C. 1999. Genetic polymorphism near HIV-1 reverse transcriptase resistance- associated codons is a major obstacle for the line probe assay as an alternative method to sequence analysis. J Virol Methods 80:25-31.

Kopp, M. U., Mello, A. J., and Manz, A. 1998. Chemical amplification: continuous- flow PCR on a chip. Science 280:1046-8.

Kostichka, A. J., Marchbanks, M. L., Brumley, R. L., Jr., Drossman, H., and Smith, L. M. 1992. High speed automated DNA sequencing in ultrathin slab gels. Biotechnology 10:78-81.

Kotler, L., Sobolev, I., and Ulanovsky, L. 1994. DNA sequencing: modular primers for automated walking. Biotechniques 17:554-558.

Kotler, L. E., Zevin-Sonkin, D., Sobolev, I. A., and Beskin, A. D. 1993. DNA sequencing: modular primers assembled from a library of hexamers or pentamers. Proc. Natl. Acad. Sci. USA 90:4241-4245.

Kozal, M. J., Shah, N., Shen, N., Yang, R., Fucini, R., Merigan, T. C., Richman, D. D., Morris, D., Hubbell, E., Chee, M., and Gingeras, T. R. 1996. Extensive polymorphisms observed in HIV-1 clade B protease gene using high-density oligonucleotide arrays. Nat. Med. 2:753-9.

Kramer, F. R., and Lizardi, P. M. 1989. Replicatable RNA reporters. Nature 339:401- 2.

61 D. O’ Meara

Kristensen, V. N., Kelefiotis, D., Kristensen, T., and Borresen-Dale, A. L. 2001. High-throughput methods for detection of genetic variation. Biotechniques 30:318-22, 324, 326 passim.

Kronick, M. 1997. Heterozygote determination using automated DNA sequencing technology. In: Laboratory methods for the detection of mutations and polymorphisms in DNA (G. R. Taylor, ed.), CRC Press, pp. 175-189.

Kuiken, C., Foley, B., Hahn, B., Marx, P., Mccutchan, F., Mellors, J., Mullins, J., Wolinsky, S., and B., K. 2000. HIV Sequence Compendium 2000, Theoretical Biology and Biophysics Group, Los Alamos National Laboratory.

Kuo, G., Choo, Q. L., Alter, H. J., Gitnick, G. L., Redeker, A. G., Purcell, R. H., Miyamura, T., Dienstag, J. L., Alter, M. J., Stevens, C. E., and et al. 1989. An assay for circulating antibodies to a major etiologic virus of human non-A, non-B hepatitis. Science 244:362-4.

Kwok, S., and Higuchi, R. 1989. Avoiding false positives with PCR. Nature 339:237- 8.

Kwok, S., Kellogg, D. E., McKinney, N., Spasic, D., Goda, L., Levenson, C., and Sninsky, J. J. 1990. Effects of primer-template mismatches on the polymerase chain reaction: human immunodeficiency virus type 1 model studies. Nucleic Acids Res 18:999-1005.

Lai, E. 2001. Application of SNP technologies in : lessons learned and future challenges. Genome Res 11:927-9.

Landegren, U., Kaiser, R., Sanders, J., and Hood, L. 1988. A ligase-mediated gene detection technique. Science 241:1077-80.

Lane, M. J., Paner, T., Kashin, I., Faldasz, B. D., Li, B., Gallo, F. J., and Benight, A. S. 1997. The thermodynamic advantage of DNA oligonucleotide ‘stacking hybridization’ reactions: energetics of a DNA nick. Nucleic Acids Res. 25:611-616.

Le Pogam, S., Dubois, F., Christen, R., Raby, C., Cavicchini, A., and Goudeau, A. 1998. Comparison of DNA enzyme immunoassay and line probe assays (Inno-LiPA HCV I and II) for hepatitis C virus genotyping. J Clin Microbiol 36:1461-3.

Lee, P. S., and Lee, K. H. 2000. Genomic analysis. Curr Opin Biotechnol 11:171-5.

Lee, W. M., Reddy, K. R., Tong, M. J., Black, M., van Leeuwen, D. J., Hollinger, F. B., Mullen, K. D., Pimstone, N., Albert, D., and Gardner, S. 1998. Early hepatitis C virus-RNA responses predict interferon treatment outcomes in chronic hepatitis C. The Consensus Interferon Study Group. Hepatology 28:1411-5.

Leitner, T., Halapi, E., Scarlatti, G., Rossi, P., Albert, J., Fenyo, E. M., and Uhlén, M. 1993. Analysis of heterogeneous viral populations by direct DNA sequencing. Biotechniques 15:120-7.

62 Molecular Tools for Nucleic Acid Analysis

Lin, H. J., Pedneault, L., and Hollinger, F. B. 1998. Intra-assay performance characteristics of five assays for quantification of human immunodeficiency virus type 1 RNA in plasma. J Clin Microbiol 36:835-9.

Lin, S. B., Blake, K. R., Miller, P. S., and Ts'o, P. O. P. 1989. Use of EDTA to characterise interactions between oligodeoxyribonucleoside methylphosphonates and nucleic acids. Biochemistry 26:1054-1061.

Lindroos, K., Liljedahl, U., Raitio, M., and Syvänen, A. C. 2001. Minisequencing on oligonucleotide microarrays: comparison of immobilisation chemistries. Nucleic Acids Res 29:E69-9.

Liu, R., Paxton, W. A., Choe, S., Ceradini, D., Martin, S. R., Horuk, R., MacDonald, M. E., Stuhlmann, H., Koup, R. A., and Landau, N. R. 1996. Homozygous defect in HIV-1 coreceptor accounts for resistance of some multiply-exposed individuals to HIV-1 infection. Cell 86:367-77.

Liu, S., Shi, Y., Ja, W. W., and Mathies, R. A. 1999. Optimization of high-speed DNA sequencing on microfabricated capillary electrophoresis channels. Anal Chem 71:566-73.

Lizardi, P. M., Huang, X., Zhu, Z., Bray-Ward, P., Thomas, D. C., and Ward, D. C. 1998. Mutation detection and single-molecule counting using isothermal rolling-circle amplification. Nat Genet 19:225-32.

Lockhart, D. J., Dong, H., Byrne, M. C., Follettie, M. T., Gallo, M. V., Chee, M. S., Mittmann, M., Wang, C., Kobayashi, M., Horton, H., and Brown, E. L. 1996. Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat. Biotech. 14:1675-1680.

Lowe, P. A., Clark, T. J., Davies, R. J., Edwards, P. R., Kinning, T., and Yeung, D. 1998. New approaches for the analysis of molecular recognition using the IAsys evanescent wave biosensor. J Mol Recognit 11:194-9.

Lundeberg, J., Wahlberg, J., and Uhlén, M. 1991. Rapid colorimetric quantification of PCR amlified DNA. Biotechniques 10:68-75.

Lyamichev, V., Mast, A. L., Hall, J. G., Prudent, J. R., Kaiser, M. W., Takova, T., Kwiatkowski, R. W., Sander, T. J., de Arruda, M., Arco, D. A., Neri, B. P., and Brow, M. A. 1999. Polymorphism identification and quantitative detection of genomic DNA by invasive cleavage of oligonucleotide probes. Nat Biotechnol 17:292-6.

Maxam, A. M., and Gilbert, W. 1977. A new method for sequencing DNA. Proc Natl Acad Sci U S A 74:560-4.

McPherson, J. D., Marra, M., Hillier, L., Waterston, R. H., Chinwalla, A., Wallis, J., Sekhon, M., Wylie, K., Mardis, E. R., and Wilson, R. K. 2001. A physical map of the human genome. Nature 409:934-41.

63 D. O’ Meara

Miyachi, H., Masukawa, A., Ohshima, T., Hirose, T., Impraim, C., and Ando, Y. 2000. Automated specific capture of hepatitis C virus RNA with probes and paramagnetic particle separation. J Clin Microbiol 38:18-21.

Monstein, H., Nikpour-Badr, S., and Jonasson, J. 2001. Rapid molecular identification and subtyping of Helicobacter pylori by pyrosequencing of the 16S rDNA variable V1 and V3 regions. FEMS Microbiol Lett 199:103-7.

Mullikin, J. C., Hunt, S. E., Cole, C. G., Mortimore, B. J., Rice, C. M., Burton, J., Matthews, L. H., Pavitt, R., Plumb, R. W., and Sims, S. K. 2000. An SNP map of human chromosome 22. Nature 407:516-20.

Mullis, K. B., and Faloona, F. A. 1987. Specific synthesis of DNA in vitro via a polymerase-catalyzed chain reaction. Methods Enzymol 155:335-50.

Murray, V. 1989. Improved double-stranded DNA sequencing using the linear polymerase chain reaction. Nucleic Acids Res 17:8889.

Nazarenko, I. A., Bhatnagar, S. K., and Hohman, R. J. 1997. A closed tube format for amplification and detection of DNA based on energy transfer. Nucleic Acids Res 25:2516-21.

Newton, C. R., Graham, A., Heptinstall, L. E., Powell, S. J., Summers, C., Kalsheker, N., Smith, J. C., and Markham, A. F. 1989. Analysis of any point mutation in DNA. The amplification refractory mutation system (ARMS). Nucleic Acids Res 17:2503- 16.

Nilsson, M., Malmgren, H., Samiotaki, M., Kwiatkowski, M., Chowdhary, B. P., and Landegren, U. 1994. Padlock probes: circularizing oligonucleotides for localized DNA detection. Science 265:2085-8.

Nilsson, P., Persson, B., Uhlén, M., and Nygren, P.-Å. 1995. Real-time monitoring of DNA manipulations using biosensor technology. Anal. Biochem. 224:400-408.

Nolte, F. S., Thurmond, C., and Mitchell, P. S. 1994. Isolation of hepatitis C virus RNA from serum for reverse transcription-PCR. J. Clin. Microbiol. 32:519-520.

Norberg, J., and Nilsson, L. 1995. Potential of mean force calculations of the stacking-unstacking process in single-stranded deoxyribodinucleoside monophosphates. Biophys J 69:2277-85.

Nordström, T., Gharizadeh, B., Pourmand, N., Nyrén, P., and Ronaghi, M. 2001. Method enabling fast partial sequencing of cDNA clones. Anal Biochem 292:266-71.

Nordström, T., Ronaghi, M., Forsberg, L., de Faire, U., Morgenstern, R., and Nyrén, P. 2000. Direct analysis of single-nucleotide polymorphism on double-stranded DNA by pyrosequencing. Biotechnol Appl Biochem 31:107-12.

64 Molecular Tools for Nucleic Acid Analysis

Nyren, P., Pettersson, B., and Uhlen, M. 1993. Solid phase DNA minisequencing by an enzymatic luminometric inorganic pyrophosphate detection assay. Anal Biochem 208:171-5.

Odeberg, J., Holmberg, K., Persson, M. L., and Uhlén, M. 2001. Pyrosequencing of genetic risk factors for thrombosis. In preparation.

Olsvik, O., Popovic, T., Skjerve, E., Cudjoe, K. S., Hornes, E., Ugelstad, J., and Uhlén, M. 1994. Magnetic separation techniques in diagnostic microbiology. Clin Microbiol Rev 7:43-54.

Oroskar, A. A., Rasmussen, S. E., Rasmussen, H. N., Rasmussen, S. R., Sullivan, B. M., and Johansson, A. 1996. Detection of immobilized amplicons by ELISA-like techniques. Clin Chem 42:1547-55.

Parinov, S., Barsky, V., Yershov, G., Kirillov, E., Timofeev, E., Belgovskiy, A., and Mirzabekov, A. 1996. DNA sequencing by hybridization to microchip octa-and decanucleotides extended by stacked pentanucleotides. Nucleic Acids Res 24:2998- 3004.

Pastinen, T., Kurg, A., Metspalu, A., Peltonen, L., and Syvänen, A. C. 1997. Minisequencing: a specific tool for DNA analysis and diagnostics on oligonucleotide arrays. Genome Res 7:606-14.

Pastinen, T., Raitio, M., Lindroos, K., Tainola, P., Peltonen, L., and Syvänen, A. C. 2000. A system for specific, high-throughput genotyping by allele-specific primer extension on microarrays. Genome Res 10:1031-42.

Paxton, W. A., and Kang, S. 1998. Chemokine receptor allelic polymorphisms: relationships to HIV resistance and disease progression. Semin Immunol 10:187-94.

Pease, A. C., Solas, D., Sullivan, E. J., Cronin, M. T., Holmes, C. P., and Fodor, S. P. 1994. Light-generated oligonucleotide arrays for rapid DNA sequence analysis. Proc Natl Acad Sci U S A 91:5022-6.

Persson, B., Stenhag, K., Nilsson, P., Larsson, A., Uhlén, M., and Nygren, P. Å. 1997. Analysis of oligonucleotide probe affinities using surface plasmon resonance: a means for mutational scanning. Anal. Biochem. 246:33-44.

Prince, J. A., Feuk, L., Howell, W. M., Jobs, M., Emahazion, T., Blennow, K., and Brookes, A. J. 2001. Robust and accurate single nucleotide polymorphism genotyping by dynamic allele-specific hybridization (DASH): design criteria and assay validation. Genome Res 11:152-62.

Prober, J. M., Trainor, G. L., Dam, R. J., Hobbs, F. W., Robertson, C. W., Zagursky, R. J., Cocuzza, A. J., Jensen, M. A., and Baumeister, K. 1987. A system for rapid DNA sequencing with fluorescent chain-terminating dideoxynucleotides. Science 238:336-41.

65 D. O’ Meara

Regan, P. M., and Margolin, A. B. 1997. Development of a nucleic acid capture probe with reverse transcriptase-polymerase chain reaction to detect poliovirus in groundwater. J. Virol. Methods 64:65-72.

Riccelli, P. V., Merante, F., Leung, K. T., Bortolin, S., Zastawny, R. L., Janeczko, R., and Benight, A. S. 2001. Hybridization of single-stranded DNA targets to immobilized complementary DNA probes: comparison of hairpin versus linear capture probes. Nucleic Acids Res 29:996-1004.

Roll, C., Ketterle, C., Faibis, V., Fazakerley, G. V., and Boulard, Y. 1998. Conformations of nicked and gapped DNA structures by NMR and molecular dynamic simulations in water. Biochemistry 37:4059-70.

Ronaghi, M. 2001. Pyrosequencing sheds light on DNA sequencing. Genome Res 11:3-11.

Ronaghi, M., Uhlén, M., and Nyrén, P. 1998. A sequencing method based on real- time pyrophosphate. Science 281:363, 365.

Ross, P., Hall, L., Smirnov, I., and Haff, L. 1998. High level multiplex genotyping by MALDI-TOF mass spectrometry. Nat Biotechnol 16:1347-51.

Ruiz-Martinez, M. C., Salas-Solano, O., Carrilho, E., Kotler, L., and Karger, B. L. 1998. A sample purification method for rugged and high-performance DNA sequencing by capillary electrophoresis using replaceable polymer solutions. A. Development of the cleanup protocol. Anal Chem 70:1516-27.

Saiki, R. K., Scharf, S., Faloona, F., Mullis, K. B., Horn, G. T., Erlich, H. A., and Arnheim, N. 1985. Enzymatic amplification of beta-globin genomic sequences and restriction site analysis for diagnosis of sickle cell anemia. Science 230:1350-4.

Saiki, R. K., Walsh, P. S., Levenson, C. H., and Erlich, H. A. 1989. Genetic analysis of amplified DNA with immobilized sequence-specific oligonucleotide probes. Proc Natl Acad Sci U S A 86:6230-4.

Salas-Solano, O., Ruiz-Martinez, M. C., Carrilho, E., Kotler, L., and Karger, B. L. 1998. A sample purification method for rugged and high-performance DNA sequencing by capillary electrophoresis using replaceable polymer solutions. B. Quantitative determination of the role of sample matrix components on sequencing analysis. Anal Chem 70:1528-35.

Sanger, F., Nicklen, S., and Coulson, A. R. 1977. DNA sequencing with chain- terminating inhibitors. Proc Natl Acad Sci U S A 74:5463-7.

Sauer, S., Lechner, D., Berlin, K., Lehrach, H., Escary, J. L., Fox, N., and Gut, I. G. 2000. A novel procedure for efficient genotyping of single nucleotide polymorphisms. Nucleic Acids Res 28:E13.

66 Molecular Tools for Nucleic Acid Analysis

Schena, M., Shalon, D., Davis, R. W., and Brown, P. O. 1995. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270:467-70.

Schena, M., Shalon, D., Heller, R., Chai, A., Brown, P. O., and Davis, R. W. 1996. Parallel human genome analysis: microarray-based expression monitoring of 1000 genes. Proc Natl Acad Sci U S A 93:10614-9.

Schuurman, R., Demeter, L., Reichelderfer, P., Tijnagel, J., de Groot, T., and Boucher, C. 1999. Worldwide evaluation of DNA sequencing approaches for identification of drug resistance mutations in the human immunodeficiency virus type 1 reverse transcriptase. J. Clin. Microbiol. 37:2291-6.

Schweitzer, B., and Kingsmore, S. 2001. Combining nucleic acid amplification and detection. Curr Opin Biotechnol 12:21-7.

Servais, J., Lambert, C., Fontaine, E., Plesseria, J. M., Robert, I., Arendt, V., Staub, T., Schneider, F., Hemmer, R., Burtonboy, G., and Schmit, J. C. 2001. Comparison of DNA sequencing and a line probe assay for detection of human immunodeficiency virus type 1 drug resistance mutations in patients failing highly active antiretroviral therapy. J Clin Microbiol 39:454-9.

Shalon, D., Smith, S. J., and Brown, P. O. 1996. A DNA microarray system for analyzing complex DNA samples using two- color fluorescent probe hybridization. Genome Res 6:639-45.

Shyamala, V., and Ames, G. F. 1989. Amplification of bacterial genomic DNA by the polymerase chain reaction and direct sequencing after asymmetric amplification: application to the study of periplasmic permeases. J Bacteriol 171:1602-8.

Smith, L. M., Sanders, J. Z., Kaiser, R. J., Hughes, P., Dodd, C., Connell, C. R., Heiner, C., Kent, S. B., and Hood, L. E. 1986. Fluorescence detection in automated DNA sequence analysis. Nature 321:674-9.

Southern, E., Mir, K., and Shchepinov, M. 1999. Molecular interactions on microarrays. Nat Genet 21:5-9.

Southern, E. M. 1975. Detection of specific sequences among DNA fragments separated by gel electrophoresis. J Mol Biol 98:503-17.

Southern, E. M., Case-Green, S. C., Elder, J. K., Johnson, M., Mir, K. U., Wang, L., and Williams, J. C. 1994. Arrays of complementary oligonucleotides for analysing the hybridisation behaviour of nucleic acids. Nucleic Acids Res 22:1368-73.

Southern, E. M., Maskos, U., and Elder, J. K. 1992. Analyzing and comparing nucleic acid sequences by hybridization to arrays of oligonucleotides: evaluation using experimental models. Genomics 13:1008-17.

67 D. O’ Meara

Spratt, B. G. 1999. Multilocus sequence typing: molecular typing of bacterial pathogens in an era of rapid DNA sequencing and the internet. Curr Opin Microbiol 2:312-6.

Stenberg, E., Persson, B., Roos, H., and Urbaniczky, C. 1991. Quantitative determination of surface concentration of protein with surface plasmon resonance using radiolabeled proteins. J. Colloid Interface Sci. 143:513-526.

Stuyver, L., Rossau, R., Wyseur, A., Duhamel, M., Vanderborght, B., Van Heuverswyn, H., and Maertens, G. 1993. Typing of hepatitis C virus isolates and characterization of new subtypes using a line probe assay. J Gen Virol 74:1093-102.

Stuyver, L., Wyseur, A., Rombout, A., Louwagie, J., Scarcez, T., Verhofstede, C., Rimland, D., Schinazi, R. F., and Rossau, R. 1997. Line probe assay for rapid detection of drug-selected mutations in the human immunodeficiency virus type 1 reverse transcriptase gene. Antimicrob Agents Chemother 41:284-91.

Swerdlow, H., Jones, B. J., and Wittwer, C. T. 1997. Fully automated DNA reaction and analysis in a fluidic capillary instrument. Anal Chem 69:848-55.

Sykes, P. J., Neoh, S. H., Brisco, M. J., Hughes, E., Condon, J., and Morley, A. A. 1992. Quantitation of targets for PCR by use of limiting dilution. Biotechniques 13:444-9.

Syvänen, A. C. 1994. Detection of point mutations in human genes by the solid-phase minisequencing method. Clin Chim Acta 226:225-36.

Syvänen, A. C. 1999. From gels to chips: "minisequencing" primer extension for analysis of point mutations and single nucleotide polymorphisms. Hum Mutat 13:1- 10.

Syvänen, A. C., Aalto-Setala, K., Harju, L., Kontula, K., and Soderlund, H. 1990. A primer-guided nucleotide incorporation assay in the genotyping of apolipoprotein E. Genomics 8:684-92.

Syvänen, A. C., Sajantila, A., and Lukka, M. 1993. Identification of individuals by analysis of biallelic DNA markers, using PCR and solid-phase minisequencing. Am J Hum Genet 52:46-59.

Tabor, S., and Richardson, C. C. 1987. DNA sequence analysis with a modified bacteriophage T7 DNA polymerase. Proc Natl Acad Sci U S A 84:4767-71.

Tabor, S., and Richardson, C. C. 1995. A single residue in DNA polymerases of the Escherichia coli DNA polymerase I family is critical for distinguishing between deoxy- and dideoxyribonucleotides. Proc Natl Acad Sci U S A 92:6339-43.

Takeuchi, T., Katsume, A., Tanaka, T., Abe, A., Inoue, K., Tsukiyama-Kohara, K., Kawaguchi, R., Tanaka, S., and Kohara, M. 1999. Real-time detection system for quantification of hepatitis C virus genome. Gastroenterology 116:636-42.

68 Molecular Tools for Nucleic Acid Analysis

Tan, H., and Yeung, E. S. 1998. Automation and integration of multiplexed on-line sample preparation with capillary electrophoresis for high-throughput DNA sequencing. Anal Chem 70:4044-53.

Tang, K., Fu, D. J., Julien, D., Braun, A., Cantor, C. R., and Koster, H. 1999. Chip- based genotyping by mass spectrometry. Proc Natl Acad Sci U S A 96:10016-20.

Taniguchi, M., Miura, K., Iwao, H., and Yamanaka, S. 2001. Quantitative assessment of DNA microarrays-comparison with analyses. Genomics 71:34-9.

Taylor, R. W., Wardell, T. M., Connolly, B. A., Turnbull, D. M., and Lightowlers, R. N. 2001. Linked oligodeoxynucleotides show binding cooperativity and can selectively impair replication of deleted mitochondrial DNA templates. Nucleic Acids Res 29:3404-12.

The Arabidopsis Genome Initative. 2000. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408:796-815.

The C. elegans Sequencing Consortium. 1998. Genome sequence of the nematode C. elegans: a platform for investigating biology. The C. elegans Sequencing Consortium. Science 282:2012-8.

Thiel, V., Rashtchian, A., Herold, J., Schuster, D. M., Guan, N., and Siddell, S. G. 1997. Effective amplification of 20-kb DNA by reverse transcription PCR. Anal. Biochem. 252:62-70.

Tong, X., and Smith, L. M. 1993. Solid phase purification in automated DNA sequencing. DNA Seq 4:151-62.

Tyagi, S., Bratu, D. P., and Kramer, F. R. 1998. Multicolor molecular beacons for allele discrimination. Nat Biotechnol 16:49-53.

Tyagi, S., and Kramer, F. R. 1996. Molecular beacons: probes that fluoresce upon hybridization. Nat Biotechnol 14:303-8.

Urdea, M. S., Horn, T., Fultz, T. J., Anderson, M., Running, J. A., Hamren, S., Ahle, D., and Chang, C. A. 1991. Branched DNA amplification multimers for the sensitive, direct detection of human hepatitis viruses. Nucleic Acids Symp Ser 24:197-200.

Vahey, M., Nau, M. E., Barrick, S., Cooley, J. D., Sawyer, R., Sleeker, A. A., Vickerman, P., Bloor, S., Larder, B., Michael, N. L., and Wegner, S. A. 1999. Performance of the Affymetrix GeneChip HIV PRT 440 platform for antiretroviral drug resistance genotyping of human immunodeficiency virus type 1 clades and viral isolates with length polymorphisms. J Clin Microbiol 37:2533-7. van Doorn, L. J., Kleter, B., Voermans, J., Maertens, G., Brouwer, H., Heijtink, R., and Quint, W. 1994. Rapid detection of hepatitis C virus RNA by direct capture from blood. J. Med. Virol. 42:22-28.

69 D. O’ Meara

Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., Smith, H. O., Yandell, M., Evans, C. A., and Holt, R. A. 2001. The sequence of the human genome. Science 291:1304-51.

Walker, G. T., Fraiser, M. S., Schram, J. L., Little, M. C., Nadeau, J. G., and Malinowski, D. P. 1992. Strand displacement amplification-an isothermal, in vitro DNA amplification technique. Nucleic Acids Res 20:1691-6.

Wang, D. G., Fan, J. B., Siao, C. J., Berno, A., Young, P., Sapolsky, R., Ghandour, G., Perkins, N., Winchester, E., and Spencer, J. 1998. Large-scale identification, mapping, and genotyping of single- nucleotide polymorphisms in the human genome. Science 280:1077-82.

Wang, J. 2000. From DNA biosensors to gene chips. Nucleic Acids Res 28:3011-6.

Waters, L. C., Jacobson, S. C., Kroutchinina, N., Khandurina, J., Foote, R. S., and Ramsey, J. M. 1998. Microchip device for cell lysis, multiplex PCR amplification, and electrophoretic sizing. Anal Chem 70:158-62.

Watson, J. D., and Crick, F. H. 1953. Molecular structure of nucleic acids. A structure for deoxyribose nucleic acid. Nature 171:737-738.

Whitcombe, D., Theaker, J., Guy, S. P., Brown, T., and Little, S. 1999. Detection of PCR products using self-probing amplicons and fluorescence. Nat Biotechnol 17:804- 7.

Wiki, M., Kunz, R. E., Voirin, G., Tiefenthaler, K., and Bernard, A. 1998. Novel integrated optical sensor based on a grating coupler triplet. Biosens Bioelectron 13:1181-5.

Wilson, J. W., Bean, P., Robins, T., Graziano, F., and Persing, D. H. 2000. Comparative evaluation of three human immunodeficiency virus genotyping systems: the HIV-GenotypR method, the HIV PRT GeneChip assay, and the HIV-1 RT line probe assay. J Clin Microbiol 38:3022-8.

Wit, F. W., van Leeuwen, R., Weverling, G. J., Jurriaans, S., Nauta, K., Steingrover, R., Schuijtemaker, J., Eyssen, X., Fortuin, D., and Weeda, M. 1999. Outcome and predictors of failure of highly active antiretroviral therapy: one-year follow-up of a cohort of human immunodeficiency virus type 1-infected persons. J Infect Dis 179:790-8.

Wittwer, C. T., Herrmann, M. G., Moss, A. A., and Rasmussen, R. P. 1997a. Continuous fluorescence monitoring of rapid cycle DNA amplification. Biotechniques 22:130-1, 134-8.

Wittwer, C. T., Ririe, K. M., Andrew, R. V., David, D. A., Gundry, R. A., and Balis, U. J. 1997b. The LightCycler: a microvolume multisample fluorimeter with rapid temperature control. Biotechniques 22:176-81.

70 Molecular Tools for Nucleic Acid Analysis

Wodicka, L., Dong, H., Mittmann, M., Ho, M. H., and Lockhart, D. J. 1997. Genome- wide expression monitoring in Saccharomyces cerevisiae. Nat Biotechnol 15:1359- 67.

Wu, D. Y., Ugozzoli, L., Pal, B. K., and Wallace, R. B. 1989. Allele-specific enzymatic amplification of ß-globin genomic DNA for diagnosis of sickle cell anemia. Proc Natl Acad Sci U S A 86:2757-60.

71