Padlock and Proximity Probes for in Situ and Array-Based Analyses: Tools for the Post-Genomic Era
Total Page:16
File Type:pdf, Size:1020Kb
Comparative and Functional Genomics Comp Funct Genom 2003; 4: 525–530. Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/cfg.326 Conference Review Padlock and proximity probes for in situ and array-based analyses: tools for the post-genomic era Ulf Landegren*, Fredrik Dahl, Mats Nilsson, Simon Fredriksson, Johan Baner,´ Mats Gullberg, Jonas Jarvius, Sigrun Gustafsdottir, Ola Soderberg,¨ Olle Ericsson, Johan Stenberg and Edith Schallmeiner Department of Genetics and Pathology, Rudbeck Laboratory, Uppsala University, Sweden *Correspondence to: Abstract Ulf Landegren, Department of Genetics and Pathology, Rudbeck Highly specific high-throughput assays will be required to take full advantage of Laboratory, Uppsala University, the accumulating information about the macromolecular composition of cells and Sweden. tissues, in order to characterize biological systems in health and disease. We discuss E-mail: the general problem of detection specificity and present the approach our group [email protected] has taken, involving the reformatting of analogue biological information to digital reporter segments of genetic information via a series of DNA ligation assays. The assays enable extensive, coordinated analyses of the numbers and locations of genes, Received: 6 June 2003 transcripts and protein. Copyright 2003 John Wiley & Sons, Ltd. Revised: 5 August 2003 Accepted: 5 August 2003 A background to molecular analyses core, the problem of analysing macromolecules in biological samples is one of specificity of detection, Complete genome sequences are becoming avail- and the requirements are extreme: the two double- able for an increasing number of organisms, allow- stranded haploid genomes in any human interphase ing research on the corresponding species to tran- cell together comprise approximately 13 billion sit to a post-genomic phase. Studies can now be nucleotides that must be searched to find a par- founded on extensive parts lists, comprising all ticular single-nucleotide variant. An unexpectedly protein-coding genes as well as regulatory and large proportion of the genome, maybe half, can be structural genetic elements, common genetic vari- transcribed to RNA (Scherer et al., 2003), but here ants and predicted repertoires of proteins of these the detection problem is compounded by the need organisms. It remains contentious just how sense to distinguish related sequences that may be present can best be made of variable patterns of gene at vastly different copy numbers. This same prob- expression, of skewed distribution of common lem is still more pressing at the level of protein and, genetic variants among healthy and affected indi- in addition, subtle modifications of the proteins viduals, or of the representation of protein sets in must be distinguished as these can signal entirely different samples. Nonetheless, the opportunity to different activity states. The number of function- specifically demonstrate the presence, concentra- ally distinct protein species that need be resolved tion, distribution and relative location of all these in an organism remains largely unknown. Finally, molecular components clearly will provide a basis in order to understand cellular functions it is also for entirely new insights into normal biological pro- important to obtain information about the organiza- cesses and disease mechanisms. tion of all these molecules on length scales between Despite impressive progress in development of tens of nanometers to micrometers and above. It is tools for macromolecular analysis, such techniques necessary to clarify the architecture of interacting remain crucial limiting factors for capitalizing on molecules and to chart sub-ecologies within cells, genomic information in biological studies. At its as well as relationships between cells in tissues. Copyright 2003 John Wiley & Sons, Ltd. 526 U. Landegren et al. Highly precise detection mechanisms are thus The possibility of analysing nucleic acid mole- required to convert the analogue information cules according to their potential to serve as targets embodied in the states of cells to digital informa- for DNA ligation reactions has a history that stems tion about the concentration and distribution of cel- back to the work by Gobind Khorana’s group lular components. While methods like PCR provide in the early 1970s, in which a DNA sequence the specificity required to search highly complex encoding a tRNA molecule was shown to template genomes, only a limited number of target sequences probe ligation reactions (Besmer et al., 1972). In can be investigated in individual reactions, and spa- the late 1980s we and others established analytic tial information is typically lost. Conversely, DNA DNA ligation in the oligonucleotide ligation assay hybridization reactions provide localized informa- (OLA), one of a small number of mechanisms still tion, and large numbers of targets can be investi- in use today for analysing specific target nucleic gated in parallel in array-based analyses, but due to acid sequences (Landegren et al., 1988; Alves and inherent limitations in the specificity of hybridiza- Carr, 1988). tion it is generally not possible to investigate com- In OLA, pairs of probes are joined, forming plex samples such as the total human genome, and new DNA strands, if and only if the probe pairs detection of rare RNA sequences is difficult. The are juxtaposed by hybridizing next to each other analogous problems exist for analyses of proteins: on the same target sequence, forming a substrate absolute specificity of protein binding is probably for DNA ligation. The requirement for coordi- not attainable and assay performance therefore cru- nated hybridization by pairs of probes renders cially depends on test architectures, most of which assays sufficiently specific to detect and distin- were established several decades ago. Just as pairs guish single-copy gene sequences directly in total of primers are required to elicit PCR amplification, genomic DNA, and the enzyme’s substrate require- assays dependent on pairwise antibody binding pro- ments easily distinguish any DNA sequence dif- vide increased specificity, but problems of cross- ferences leading to mismatched ligation substrates. reactivity mount when large numbers of molecules However, just as with PCR, and with sandwich are investigated in parallel and the mechanism cur- immunoassays for protein detection, the paired- rently is not applicable for localized detection. probe design can give rise to cross-reactions, as In this review, we will discuss reaction mecha- more targets are investigated in the same reac- nisms developed by our group for advanced macro- tion. We therefore developed so-called padlock molecular analyses. The three main interrelated probes — oligonucleotide probes designed so that technologies to be described are padlock probes target-complementary sequences at both ends can and proximity ligation probes for nucleic acid and hybridize in juxtaposition on target strands and be protein analyses, respectively, and rolling-circle joined by ligation in a target-dependent manner amplification as a general means of sensitive local- (Nilsson et al., 1994; Figure 1). The two target- ized and solution-phase detection. complementary segments of the probes are sepa- rated by sequences that can be used for amplifica- tion or identification via DNA tag sequences. Ligation-based techniques for nucleic The padlock probe design offers a number of acid analyses advantages, compared both to standard hybridiza- tion probes and to PCR and OLA. (a) The liga- An ideal set of molecular tools for macromolecu- tion reaction converts linear oligonucleotide probes lar analysis would allow simple, automated design to circular DNA strands, wound around tar- and synthesis of sets of reagents that can then get molecules. This means that probes specifi- be combined in large numbers, and in a seam- cally bound to long target molecules can with- less process result in detection signals that are stand washes under conditions where no base- easily recorded over background. The general strat- pairing persists, i.e. at denaturing pH or tem- egy we have adopted, to be described in greater perature, without detaching. (b) It is also possi- detail below, achieves these ends by reformatting ble to apply exonuclease enzymes that degrade biological information as sets of signature DNA nucleic acids from the 5 and/or 3 ends in order sequences, created in target-dependent probe liga- to remove any unreacted probes, while preserving tion reactions. reacted — endless — probes. (c) Finally, as will Copyright 2003 John Wiley & Sons, Ltd. Comp Funct Genom 2003; 4: 525–530. Padlock and proximity probes for in situ andarray-basedanalyses 527 5′ 3′ 5′ 3′ A C T Matched target T Mismatched target Ligation Ligation 5′ 3′ C Figure 1. Padlock probe ligation for nucleic acid detection. Oligonucleotide probes with end sequences correctly hybridized in juxtaposition on a target strand can be ligated to form circular DNA strands, wound around the target sequence. By contrast, probes mismatched at the 3 position are poor substrates for ligation, allowing sequence variants to be distinguished be discussed below, circularized probe molecules 2000, 2001). As an alternative, we have also used may be copied in a rolling-circle replication reac- parallel padlock analyses to measure the relative tion, reeling off a long, single-stranded molecule representation of allelic transcripts at the level of that represents catenated repeats