Exhaustive Search: Restriction Enzymes and the Partial Digest Problem


Exhaustive search

Torgeir R. Hvidsten

T.R. Hvidsten: 1MB304: Discrete structures for bioinformatics II

This lecture

  • Restriction enzymes and the partial digest problem
  • Finding regulatory motifs in DNA sequences
  • Exhaustive search methods

Restriction enzymes and the partial digest problem

Restriction enzymes

  • HindII, the first restriction enzyme, was discovered accidentally in 1970 while studying how the bacterium Haemophilus influenzae takes up DNA from the virus
  • Restriction enzymes are used as a defense mechanism by bacteria to break down the DNA of attacking viruses

Uses of restriction enzymes

  • Recombinant DNA technology (i.e. combining DNA sequences that would not normally occur together)
  • Cloning
  • cDNA/genomic library construction
  • DNA mapping

Restriction maps

  • A map showing the positions of restriction sites in a DNA sequence
  • If the DNA sequence is known then constructing a restriction map is trivial
  • In the early days of molecular biology, DNA sequences were often unknown
  • Biologists had to solve the problem of constructing restriction maps without knowing DNA sequences
Measuring length of restriction fragments

  • Restriction enzymes break DNA into restriction fragments
  • Gel electrophoresis is a process for separating DNA by size and measuring sizes of restriction fragments
  • Can separate DNA fragments that differ in length by only 1 nucleotide for fragments up to 500 nucleotides long

Partial restriction digest

  • The sample of DNA is exposed to the restriction enzyme for only a limited amount of time to prevent it from being cut at all restriction sites
  • This experiment generates the set of all possible restriction fragments between every two (not necessarily consecutive) cuts
  • This set of fragment sizes is used to determine the positions of the restriction sites in the DNA sequence

Multiset of restriction fragments

  • We assume that the multiplicity of a fragment can be detected, i.e., the number of restriction fragments of the same length can be determined
  • Restriction sites: X = {0, 5, 14, 19, 22}
  • Multiset: ΔX = {3, 5, 5, 8, 9, 14, 14, 17, 19, 22}

Partial Digest Problem (PDP)

  X    the set of n integers representing the location of all cuts in the restriction map, including the start and end, i.e. X = {x_1 = 0, x_2, …, x_n}
  n    the total number of cuts
  ΔX   the multiset of integers representing lengths of each of the n(n-1)/2 fragments produced from a partial digest, i.e. ΔX = {x_j - x_i | 1 ≤ i < j ≤ n}

Problem: Given the multiset L, find a set X such that ΔX = L

Partial Digest: Multiple Solutions

It is not always possible to uniquely reconstruct a set X based only on ΔX.
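To make the definition of ΔX concrete, here is a minimal Python sketch that forms the multiset of pairwise distances from a set of points. The function name delta and the choice to return a sorted list are assumptions made for illustration; they are not part of the lecture.

```python
from itertools import combinations

def delta(points):
    """Return ΔX as a sorted list: x_j - x_i for every pair i < j."""
    xs = sorted(points)
    # combinations() yields each unordered pair once, smaller point first,
    # so every difference is non-negative and there are n(n-1)/2 of them.
    return sorted(b - a for a, b in combinations(xs, 2))

restriction_sites = [0, 5, 14, 19, 22]
fragments = delta(restriction_sites)
print(fragments)       # [3, 5, 5, 8, 9, 14, 14, 17, 19, 22]
print(len(fragments))  # 10, i.e. n(n-1)/2 for n = 5
```

Two sets X and Y pose the same instance of the problem exactly when delta(X) == delta(Y).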
For example, the sets X = {0, 2, 5} and X + 10 = {10, 12, 15} cannot be told apart from ΔX alone: both produce ΔX = {2, 3, 5} as their partial digest set (they are homometric).

The sets {0, 1, 2, 5, 7, 9, 12} and {0, 1, 5, 7, 8, 10, 12} present a less trivial example of non-uniqueness. They both digest into:

  {1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 5, 6, 7, 7, 7, 8, 9, 10, 11, 12}

Pairwise differences for {0, 1, 2, 5, 7, 9, 12}:

         0    1    2    5    7    9   12
    0         1    2    5    7    9   12
    1              1    4    6    8   11
    2                   3    5    7   10
    5                        2    4    7
    7                             2    5
    9                                  3
   12

Pairwise differences for {0, 1, 5, 7, 8, 10, 12}:

         0    1    5    7    8   10   12
    0         1    5    7    8   10   12
    1              4    6    7    9   11
    5                   2    3    5    7
    7                        1    3    5
    8                             2    4
   10                                  2
   12

BruteForcePDP

  BruteForcePDP(L, n)
      M ← maximum element in L
      for every set of n - 2 integers 0 < x_2 < … < x_(n-1) < M
              such that x_i ∈ L for 1 < i < n
          X ← {0, x_2, …, x_(n-1), M}
          Form ΔX from X
          if ΔX = L
              return X
      output "No solution"

Efficiency of BruteForcePDP

  • The running time of BruteForcePDP is O(|L|^(n-2)) or, since |L| = n(n-1)/2, O(n^(2n-4))
  • Note that without restricting the elements of X to elements in L, the running time would be O(M^(n-2))
  • If L = {2, 998, 1000} (n = 3, M = 1000), the running time of the two variants would be very different
  • Nonetheless, both algorithms have an exponential running time

Branch and Bound Algorithm for PDP

  1. Initiation: Add the start (0) and end (width) point of the sequence to X (and remove the end point from L)
  2. Find the largest element y in L
  3. See if y fits on the right or left side of the restriction map by checking whether the other lengths (fragments) it creates are in L
  4. If it fits, remove these lengths from L and add y (or width - y) to X (if not, backtrack)
  5. Go back to step 2 until L is empty

Definitions

Before describing PartialDigest, first define Δ(y, X) as the multiset of all distances between point y and all other points in the set X:

  Δ(y, X) = {|y - x_1|, |y - x_2|, …, |y - x_n|}   for X = {x_1, x_2, …, x_n}

PartialDigest

  PartialDigest(L)
      width ← maximum element in L
      Remove width from L
      X ← {0, width}
      Place(L, X)

  Place(L, X)
      if L is empty
          output X
          return
      y ← maximum element in L
      if Δ(y, X) ⊆ L
          Remove lengths Δ(y, X) from L and add y to X
          Place(L, X)
          Remove y from X and add lengths Δ(y, X) to L
      if Δ(width - y, X) ⊆ L
          Remove lengths Δ(width - y, X) from L and add width - y to X
          Place(L, X)
          Remove width - y from X and add lengths Δ(width - y, X) to L
      return

An Example

  L = {2, 2, 3, 3, 4, 5, 6, 7, 8, 10}

Initiation: width = 10, so delete 10 from L and add 0 and 10 to X

  L = {2, 2, 3, 3, 4, 5, 6, 7, 8}
  X = {0, 10}

Step with y = 8:

  L = {2, 2, 3, 3, 4, 5, 6, 7, 8}
  X = {0, 10}
  y = 8
  Δ(y, X) = {8, 2}, which is a subset of L

  L = {2, 3, 3, 4, 5, 6, 7}
  X = {0, 8, 10}

Step with y = 7:

  L = {2, 3, 3, 4, 5, 6, 7}
  X = {0, 8, 10}
  y = 7
  Δ(y, X) = {7, 1, 3}, which is not a subset of L
  Δ(width - y, X) = {3, 5, 7}, which is a subset of L

  L = {2, 3, 4, 6}
  X = {0, 3, 8, 10}

Step with y = 6:

  L = {2, 3, 4, 6}
  X = {0, 3, 8, 10}
  y = 6
  Δ(y, X) = {6, 3, 2, 4}, which is a subset of L

  L = {}
  X = {0, 3, 6, 8, 10}

L is empty!
Output X
And continue searching for more solutions …
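The subset tests in the steps above compare multisets, so multiplicities matter. A minimal Python sketch of that check follows, assuming L is held in a collections.Counter. The helper names distances and fits are hypothetical and chosen only for this illustration.

```python
from collections import Counter

def distances(y, X):
    """Δ(y, X): multiset of distances from point y to every point in X."""
    return Counter(abs(y - x) for x in X)

def fits(y, X, L):
    """True if Δ(y, X) is a sub-multiset of L (multiplicities respected)."""
    need = distances(y, X)
    # A Counter returns 0 for missing keys, so absent lengths fail the test.
    return all(L[d] >= k for d, k in need.items())

# State from the example after placing 8: X = {0, 8, 10}
L = Counter([2, 3, 3, 4, 5, 6, 7])
X = [0, 8, 10]
width = 10

print(fits(7, X, L))          # False: the distance 1 does not occur in L
print(fits(width - 7, X, L))  # True: {3, 5, 7} can all be taken from L
```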
An Example (continued)

Back at the step with y = 6:

  L = {2, 3, 4, 6}
  X = {0, 3, 8, 10}
  y = 6
  Δ(y, X) = {6, 3, 2, 4}, which is a subset of L (already explored)
  Δ(width - y, X) = {4, 1, 4, 6}, which is not a subset of L

Backtrack!

Back at the step with y = 7:

  L = {2, 3, 3, 4, 5, 6, 7}
  X = {0, 8, 10}
  y = 7
  Δ(y, X) = {7, 1, 3}, which is not a subset of L
  Δ(width - y, X) = {3, 5, 7}, which is a subset of L (already explored)

Backtrack!

Back at the step with y = 8:

  L = {2, 2, 3, 3, 4, 5, 6, 7, 8}
  X = {0, 10}
  y = 8
  Δ(y, X) = {8, 2}, which is a subset of L (already explored)
  Δ(width - y, X) = {2, 8}, which is a subset of L

  L = {2, 3, 3, 4, 5, 6, 7}
  X = {0, 2, 10}

Step with y = 7:

  L = {2, 3, 3, 4, 5, 6, 7}
  X = {0, 2, 10}
  y = 7
  Δ(y, X) = {7, 5, 3}, which is a subset of L

  L = {2, 3, 4, 6}
  X = {0, 2, 7, 10}

Step with y = 6:

  L = {2, 3, 4, 6}
  X = {0, 2, 7, 10}
  y = 6
  Δ(y, X) = {6, 4, 1, 4}, which is not a subset of L
  Δ(width - y, X) = {4, 2, 3, 6}, which is a subset of L

  L = {}
  X = {0, 2, 4, 7, 10}

L is empty!
Output X
And continue searching for more solutions …
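The PartialDigest and Place pseudocode can be sketched in Python as follows. This is a minimal sketch rather than the lecture's own implementation: storing L as a collections.Counter, collecting solutions in a list instead of printing them, skipping the second branch when y equals width - y, and the names partial_digest, place and dist are all assumptions made for illustration.

```python
from collections import Counter

def partial_digest(lengths):
    """Return every X (as a sorted list) whose ΔX equals the multiset lengths."""
    L = Counter(lengths)
    width = max(L)
    L[width] -= 1              # remove width from L
    X = [0, width]
    solutions = []

    def dist(y):
        # Δ(y, X): distances from y to all points currently in X
        return Counter(abs(y - x) for x in X)

    def place():
        remaining = +L         # unary plus drops lengths whose count is zero
        if not remaining:
            solutions.append(sorted(X))
            return
        y = max(remaining)
        # Try y on one side, then width - y on the other; dict.fromkeys
        # removes the duplicate candidate when both values coincide.
        for candidate in dict.fromkeys((y, width - y)):
            need = dist(candidate)
            if all(L[d] >= k for d, k in need.items()):
                L.subtract(need)   # remove the created lengths from L
                X.append(candidate)
                place()
                X.pop()            # backtrack: undo the placement
                L.update(need)     # and put the lengths back into L

    place()
    return solutions

print(partial_digest([2, 2, 3, 3, 4, 5, 6, 7, 8, 10]))
# [[0, 3, 6, 8, 10], [0, 2, 4, 7, 10]]
```

The two results are the same two sets reached in the worked example, found in the same order because y is always tried before width - y.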
