Perspectives
Total Page:16
File Type:pdf, Size:1020Kb
PERSPECTIVES ness of solutions is easy to check, and ‘NP- TIMELINE complete’ problems are considered the hardest of the class because they require exponentially increasing amounts of time to solve.) The past, present and future of The problem asks whether, given a set of n cities (‘vertices’) with m paths (‘edges’) con- molecular computing necting them, a ‘hamiltonian’ path exists that starts at a given vertex vin, passes through each vertex exactly once, and ends at vertex vout. Adam J. Ruben and Laura F. Landweber For fairly small values of n, today’s computers easily solve this problem. However, when n becomes very large, the amount of time Ever since scientists discovered that 2 required to generate and check every possible conventional silicon-based computers have solution increases exponentially, making very an upper limit in terms of speed, they have 1 3 large calculations infeasible. been searching for alternative media with The version Adleman solved contained v which to solve computational problems. in only seven vertices (FIG. 1), which is no remark- That search has led them, among other 0 able feat in the world of computer science. He places, to DNA. used a simple, brute-force algorithm: generate vout random paths through the graph, discard any When most people think of a ‘DNA comput- 6 that do not begin at v and end at v , discard 4 in out er’,the first image that springs to mind is a any that do not enter exactly n vertices, and personal computer-like interface with micro- 5 discard any that do not pass through each ver- centrifuge tubes lined up inside the central Figure 1 | An example of a seven-vertex, 13- tex at least once. But the use of DNA to solve processor and a keyboard that plugs directly edge graph. The ‘salesman’ starts his journey at this small computational problem marked the ′ into the molecule’s 5 end. Perhaps someday vertex 0 (vin) and ends at vertex 6 (vout). genesis of a new scientific field. such a project will become reality. At the Adleman began by synthesizing a random moment, however,‘DNA computing’ is the have always hoped would be innately linked. 20-base-pair DNA oligonucleotide (20-mer) slightly misleading title applied to experi- Much of the human body is, in theory, a flood to represent each vertex, followed by another ments in which DNA molecules have compu- of binary operations. We could be staring into series of 20-mers to represent edges. The edge tational roles. Often sequences of about 8–20 the abyss of a science as doomed as phrenolo- DNA had a certain built-in feature: in each base pairs are used to represent bits, and gy or mesmerism. But we may be at the fore- 20-mer, the first ten nucleotides comple- numerous methods have been devised to front of a new and creative technology whose mented the last ten of one vertex, and the last manipulate and evaluate them. implications have not even been fully ten complemented the first ten of another DNA is a convenient choice, as it is both mapped out, let alone realized. Fifty years vertex (FIG. 2). For example, a 20-base-pair self-complementary, allowing single-strand- from now, molecular computing may con- edge connecting vertex one to vertex two ed DNA to select its own Watson–Crick ceivably be important in our lives — who, would consist of the complements to vertex complement, and can easily be copied. Also, fifty years ago, could have predicted the per- one’s last ten base pairs and vertex two’s first molecular biologists have already built a tool- sonal computer? — and it all began with an ten. That way, when the mixture of DNA is box for manipulating DNA, including experiment published by a computer scientist restriction enzyme digestion, ligation, in Science in 1994. Path 1–2 sequencing, amplification and fluorescent labelling, giving DNA a head start over alter- The Travelling Salesman Problem native computational media. In a seminal paper, Leonard Adleman1 solved 12 This unique combination of computer sci- a simplified instance of a famous NP-com- Figure 2 | Edge DNA forming a splint bridging ence and molecular biology has fascinated the plete computer problem called the Travelling two vertices. The last five nucleotides of vertex world for nearly six years, perhaps because it Salesman Problem. (‘NP’ is the name given to 1, and the first five of vertex 2, complement the finally links two popular disciplines that we the class of search problems for which correct- nucleotides on vertex 1–2. NATURE REVIEWS | MOLECULAR CELL BIOLOGY VOLUME 1 | OCTOBER 2000 | 69 PERSPECTIVES denatured and then allowed to anneal and Six years later, DNA computing is still in ligate, the edge will form a splint to ‘connect’ “Here’s nature’s toolbox … its infancy. Researchers have developed sever- both vertices, and T4 DNA ligase can join al algorithms to solve classic problems, and a together all possible such combinations. It is a bunch of little tools that handful have been tested and work. Despite the bulk-annealing step that allows this mole- are dirt cheap; you can buy current progress, we are still a long way away cular approach to test a massive number of from solving complex problems. possibilities in parallel. a DNA strand for 100 This initial denaturing/hybridization/liga- femtocents.” Computing on surfaces tion step generated over 1013 strands of DNA. There is an interesting and potentially useful Among these, it was probable that at least one exists and ‘readout’ of the correct solution approach now available to the field of DNA encoded the hamiltonian path. Next, all required graduated PCR on the final product. computing: computing on surfaces2. This sequences that began at vin and ended at vout This involved a series of six different PCR involves affixing the solution DNA strands, were selectively amplified by the polymerase reactions, each using the vin forward primer correct and incorrect, to a solid medium, and chain reaction (PCR) using primers recogniz- and a primer complementary to each of the — in a subtractive algorithm, for example — ing those sequences. Any path that did not other 20-mer vertices, which were then selectively destroying the incorrect ones. pass through exactly seven points was then analysed in separate lanes on a gel. (So, for Liu et al.2 used this approach to solve a eliminated by gel-purifying only the 140- example, a readout showing bands of 40, x, small instance of an NP-complete satisfiability base-pair product (necessarily containing 60, 80, 100 and 120 base pairs, where x repre- (SAT) problem involving Boolean logic. They seven 20-mers). Finally, to eliminate solutions sents the absence of a band in the second lane, began with a logical statement in four vari- that did not pass through each vertex exactly would represent the path 0→1, 1→3, 3→4, ables divided into four connected sections, or once, the product from the previous step was 4→5, 5→6. This is not a hamiltonian path, as ‘clauses’.As each variable could be either 0 or affinity purified by denaturing the double- it misses the second vertex, and so would have 1, there existed 24 = 16 possible solutions to stranded paths and removing the biotinylated been eliminated in the size-purification step.) the problem, four of which were correct. complementary strand on magnetic beads. Adleman’s path, however, did indeed pass A DNA strand corresponding to each pos- The first affinity target was a single-stranded through each vertex exactly once. Although sibility was synthesized. Each strand had the 20-mer complementary to the sequence of several of these methods have been modified format 5′-FFFFvvvvvvvvFFFF-3′, where the the second vertex, so only paths that con- and improved, graduated PCR remains a sim- eight bases labelled ‘F’ are fixed (for use in the tained the second ‘city’ were retained. This ple and rapid method for readout that PCR amplification step) and ‘vvvvvvvv’ is a process was repeated for every vertex except bypasses DNA sequencing. unique octanucleotide corresponding to a the first and the last (as all paths were already As this was the first experimental demon- different predetermined solution. The strands bounded by vin and vout PCR primers). stration of DNA computing, Adleman reflect- were bound to solid media (a maleimide- The presence of a DNA band at the end ed on some possible implications. This com- functionalized gold surface), and a fixed 21- indicated that, for the given graph, a hamil- putation required about seven days of mer spacer sequence was attached to the 5′ tonian path exists. Confirmation that a path laboratory work, so for large n such a process end in order to separate the solution could prove unwieldy, unless some sort of sequence from the support. All DNA was automation or alternative algorithm could be single stranded. aabc devised. However, a typical desktop computer The algorithm used at this point involved def can execute about 106 operations per second, a cycle of mark–destroy–unmark operations. gh i and the fastest supercomputer can execute First the sequences encoding correct solu- about 1012 — but Adleman’s molecular com- tions were marked (by hybridization to their 01 01 putation, if each ligation counts as an opera- complements), then all unmarked (single- 14 i tion, did over 10 . Scaling up the ligation step stranded) sequences were destroyed using 200 20 h could push the number over 10 operations Escherichia coli exonuclease I, and finally the g per second. Furthermore, it used an extremely remaining solutions were unmarked by small quantity of energy — 2 × 1019 opera- removing their complements.