DNA Computers
Total Page:16
File Type:pdf, Size:1020Kb
Eindhoven Honours Class Foundations of Informatics Algorithmic Adventures DNA computing Hendrik Jan Hoogeboom Computer Science Leiden 13 december 2010 natural computation •genetic algoritms •neural networks DNA computing bio-informatics Len Adleman Molecular Computation of Solutions to Combinatorial Problem, Science, 266: 1021-1024, (Nov. 11) 1994. ―the mad scientist at work‖ http://www.usc.edu/dept/molecular-science/fm-adleman.htm Scientific American ―In other words, one could program a Turing machine to produce Watson-Crick complementary strings, factor numbers, play chess and so on. This realization caused me to sit up in bed and remark to my wife, Lori, ‗Jeez, these things could compute.‘ I did not sleep the rest of the night, trying to figure out a way to get DNA to solve problems.‖ Leonard M. Adleman - Computing with DNA Scientific American August 1998 Physicists plunder life's tool chest If we look inside the cell, we see extraordinary machines that we couldn't make ourselves, says Len Adleman. “It's a great tool chest - and we want to see what can we build with it.” Adleman created the first computer to use DNA to solve a problem. He was struck by the parallels between DNA, with its long ribbon of information, and the theoretical computer known as the Turing Machine. Nature News Service april 2003 Physicists plunder life's tool chest Adleman tackled the famous „travelling salesman‟ problem - finding the shortest route between cities. Such problems rapidly become mind- boggling. The only way is to examine every possible option. With many cities, this number is astronomical. DNA excels at getting an astronomical amount of data into a tiny space. “One gram of DNA can store as much information as a trillion compact discs,” says Adleman. Myriad DNA molecules can examine every possible route at once, rather than one at a time, as in a conventional computer. contents DNA … the tool chest problem complexity … P & NP Hamilton Path Problem Adleman‘s algorithm comments theory … Turing machine recent work + future contents DNA … the tool chest problem complexity … P & NP Hamilton Path Problem Adleman‘s algorithm comments theory … Turing machine recent work + future DNA H H H O H N H H O C N N H O N N O H C O H H G 2 H N N N C H2 O H O H H P 3‘ DNA 5‘ S A T S P P Base pairs S G C S Watson & Crick [ & Rosalind Franklin ] P P A=T S C G S adenine - thymine CG P P guanine - cytosine S A T S P 3‘ 5‘ single – double strand 5‘ 3‘ double strand G T G G A T C C low temp C A C C T A G G denaturing annealing G T G G A T C C high temp C A C C T A G G single strands complementarity G T G G A T C C G G A T C C A C G T G G A T C C C A C C T A G G restriction enzymes G G A T C C C C T A G G BamHI G G A T C C C C T A G G sticky ends subsequence selection G T G G A T C C C C T A magnetic beads separation on length DNA gel electrophoresis multiplication / amplification C G primer C polymerase C G G A T C A C C T A G G C A A G PCR — polymerase chain reaction contents DNA … the tool chest problem complexity … P & NP Hamilton Path Problem Adleman‘s algorithm comments theory … Turing machine recent work + future complexity second n=10 30 50 60 minute day n 10-5s 310-5s 510-5s 610-5s year n2 10-4s 910-4s 210-3s 410-3s century n5 10-1s 24 s 1.7 m 13 m 2n 10-3s 18 m 13 d 366 c 3n 610-2s 6.5 y 3855 c 1013 c polynomial vs. exponential now 100x 1000x n N 100N 1000N n2 N 10N 32N n5 N 2.5N 4N 2n N N+6.6 N+10 3n N N+4.2 N+6.3 The general idea custom made is there a single strands of DNA double strand (many copies) with my desired properties? properties: • length, • subsequence. if we can do this, then we can solve certain problems (efficiently)! HPP: Hamilton Path Problem 4 3 1 ‗travelling salesman‘ vin vout 0 6 2 5 given: directed graph (points & connections) question: is there a path that visits each point exactly once ? HPP: Hamilton Path Problem solution 4 3 1 ‗travelling salesman‘ vin vout 0 6 2 5 given: directed graph (points & connections) question: is there a path that visits each point exactly once ? HPP: Hamilton Path Problem 4 3 1 no solution? exponential time: 0 6 try all possibilities representative class 2 5 ‗NP complete‘ 2 heuristics 1 1 3 2 3 0 1 6 4 4 complexity (theory) - P vs. NP P polynomial algorithm to find a solution NP polynomial algorithm to verify a solution NP-complete ? millenium prize problem P=NP www.claymath.org/Millennium_Prize_Problems/ contents DNA … the tool chest problem complexity … P & NP Hamilton Path Problem Adleman‘s algorithm comments theory … Turing machine recent work + future Adleman‘s algorithm 4 3 1 1. generate ‗all‘ paths 0 6 keep only paths 2 5 2. … from vin to vout 3. … that enter n vertices 4. … that enter all vertices 5. if any path remains OK ‗massive parallellism‘ building blocks A C T C G T G G A T C C G G A C C A C C T A G G A C T C G T G G A T C C G G A C C A C C T A G G Adleman‘s algorithm 4 3 1 0. coding the graph 1. generate ‗all‘ paths 0 6 keep only paths 2 5 2. … from vin to vout 3. … that enter n vertices 4. … that enter all vertices 0 1 ACGG GTGG ATCC5. TAGT if any path remains OK |||| |||| CACC TAGG 0 1 Adleman‘s algorithm 4 3 1 0. coding the graph 1. generate ‗all‘ paths 0 6 keep only paths 2 5 2. … from vin to vout 3. … that enter n vertices 0 4.1 … that enter2 all vertices1 ACGG GTGG ATCC TAGT TTGC AACT ATCC TAGT |||| |||| 5.|||| if ||||any path|||| remains|||| OK CACC TAGG ATCA AACG TTGA TAGG 0 1 1 2 2 1 Adleman‘s algorithm 4 3 1 0. coding the graph 1. generate ‗all‘ paths 0 6 keep only paths 2 5 2. … from vin to vout 3. … that enter n vertices 4. … that enter all vertices 5. if any path remains OK •PCR with vin and vout primers •gel: separate on length, amplify & purify •magnetic beads: select strands •PCR amplification & gel contents DNA … the tool chest problem complexity … P & NP Hamilton Path Problem Adleman‘s algorithm comments theory … Turing machine recent work + future comments … . ―clear that the methods could be scaled up to … larger graphs‖ + bath tub of DNA ? + suitable algorithms . approximately 7 days of lab work + automation + alternative molecular algorithms . possibility of errors + pseudopaths: accidental ligation + PCR, separation procedures + hairpin loops + stability when scaled comments … . ―power of this method of computation‖ - 1014 operations 1020 plausable - exceed supercomputers by thousandfold :) . ―not clear whether … used to solve real computational problems‖ . multiplying 100 digit numbers . potential: massively parallel searches contents DNA … the tool chest problem complexity … P & NP Hamilton Path Problem Adleman‘s algorithm comments theory … Turing machine recent work + future Turing machine # a a a a a b b b b b c c c c c # tape c/c,L b/b,L b/b,L a/a,R a/a,L 1. mark a 6 2. move to b‘s mark b a/a,R b/b,R 3. move to c‘s b/b,R c/c,R mark c 4. if another c 1 2 3 4 5. then back to a‘s a/a,R b/b,R c/c,R goto 1. #/#,L else back to a‘s 6. check marks c/c,L stop ok 6 b/b,L a/a,L ‗universal‘ Turing machine GGATGnnnnnnnnn Rothemund CCTACnnnnnnnnnnnnn FokI circular DNA symbol spacing state = position of cut • cut states with restriction enzyme • mix ‗instructions‘ with ‗tape‘ • ‗activate‘ instructions (cut protected end) • ligate to form circles • cut old symbol • recircularize contents DNA … the tool chest problem complexity … P & NP Hamilton Path Problem Adleman‘s algorithm comments theory … Turing machine recent work + future tic-tac-toe logic gates fluorescence Stojanovic & Stefanovic, Deoxyribozyme- Based Molecular Automaton. Nature Biotechn. 2003. Deoxyribozyme-Based Logic Gates J. Am. Chem. Soc. 2002. Medium Scale Integration of Molecular Logic Gates in an Automaton Nano Letters 2006. digamma.cs.unm.edu/wiki/bin/view/McogPublicWeb/MolecularAutomataMAYAII MAYA-II first “medium-scale integrated molecular circuit”, integrating 128 deoxyribozyme-based logic gates, 32 input DNA molecules, and 8 two- channel fluorescent outputs across 8 wells tic-tac-toe o2 = (i6i7i2) (i7i9i1)(i8i9i1) i1 TCT GCG TCT ATA AAT i2 ATC GTA TGT TGT TCA i3 GTA TAG TCT GTT TGT i4 G TAA GTG CTC AAA TGT C i5 G TCT AAT TCT CAC GGT C tic-tac-toe o2 = (i6i7i2) (i7i9i1)(i8i9i1) i1 TCT GCG TCT ATA AAT i2 ATC GTA TGT TGT TCA i3 GTA TAG TCT GTT TGT i4 G TAA GTG CTC AAA TGT C i5 G TCT AAT TCT CAC GGT C DNA computing after 10 years “There are many practical hurdles.