Computational Molecular Biology

Unique Probe Mapping using PQ-trees

Leiden, 6 April 2006

Hendrik Jan Hoogeboom
Algorithms / Fundamentele Informatica

www.liacs.nl/home/hoogeboo/

Feb 2001: human genome
genetic / physical map

physical mapping

10^8 bp  C: full DNA
  Physical mapping: cut C and clone into overlapping YAC clones.
10^6 bp
  Physical mapping: cut the DNA in each YAC clone and clone into overlapping cosmid clones. Select a subset of cosmid clones of minimum total length that covers the YAC DNA.
10^4 bp
  Fragment assembling: duplicate the cosmid and then cut the copies randomly. Select and sequence short fragments and then reassemble them into a deduced cosmid string.
10^2 bp

using a map of the genome

[Figure: genome map with markers m, u, j, a, e, i, x, f, z, d, w; markers m, u, z, d highlighted]

hybridization mapping

[Figure: clones tested against probes w, x, y, z]

unique probe mapping

clones 1,2,…,6
probes A,B,…,G

[Figure: clones 1 to 6 drawn over the probes, probe order E B F C A G D]

matrix representation

[Figure: 0/1 matrix, rows = clones 1 to 6, columns = probes A to G; a 1 means the clone contains the probe]

reordering of probes

[Figure: the matrix with its columns reordered to D G C A F B E, next to the clone picture with probe order E B F C A G D]
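Reordering the probes amounts to permuting the matrix columns until the 1s in every row are adjacent. A minimal brute-force sketch, assuming hypothetical clone data: the `clones` dictionary and the helper names `is_consecutive` and `valid_orderings` are invented for illustration and are not the data of the slides. It tries every permutation, so it is only usable for a handful of probes.

```python
from itertools import permutations

# Hypothetical clone/probe data, invented for illustration only.
clones = {
    1: {"A", "B"},
    2: {"B", "C", "D"},
    3: {"D", "E"},
}
probes = sorted(set().union(*clones.values()))

def is_consecutive(order, probe_set):
    """True if the probes in probe_set occupy adjacent positions in order."""
    positions = [i for i, p in enumerate(order) if p in probe_set]
    return positions[-1] - positions[0] + 1 == len(positions)

def valid_orderings(probes, clones):
    """Brute force: yield every probe order in which each clone is an interval."""
    for order in permutations(probes):
        if all(is_consecutive(order, s) for s in clones.values()):
            yield "".join(order)

orders = sorted(valid_orderings(probes, clones))
print(orders)  # ['ABCDE', 'EDCBA']
```

Only an order and its mirror image survive here; with less overlap between clones, more orders remain, which is what the PQ-tree is meant to represent compactly.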

clones contain consecutive probes

[Figure: the clone/probe matrix, columns A to G]

interval graphs

[Figure: clones 1 to 6 as intervals over the probes, and the corresponding interval graph on vertices 1 to 6]
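Once a probe order is fixed, every clone is an interval of probe positions, and two clones are adjacent in the interval graph when their intervals overlap. A small sketch under assumed data: the probe order, the `clones` dictionary and the function names are hypothetical, not taken from the slides.

```python
from itertools import combinations

# Hypothetical data: a probe order and clones as sets of consecutive probes.
order = "ABCDE"
clones = {1: {"A", "B"}, 2: {"B", "C", "D"}, 3: {"D", "E"}, 4: {"E"}}

def interval(order, probe_set):
    """Leftmost and rightmost position of the clone's probes in the order."""
    positions = [order.index(p) for p in probe_set]
    return min(positions), max(positions)

def interval_graph(order, clones):
    """Edges (a, b) with a < b for every pair of clones whose intervals overlap."""
    iv = {c: interval(order, s) for c, s in clones.items()}
    edges = set()
    for a, b in combinations(sorted(iv), 2):
        (lo_a, hi_a), (lo_b, hi_b) = iv[a], iv[b]
        if lo_a <= hi_b and lo_b <= hi_a:  # closed intervals share a position
            edges.add((a, b))
    return edges

edges = interval_graph(order, clones)
print(sorted(edges))  # [(1, 2), (2, 3), (3, 4)]
```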

TREES

[Figure: example trees: expression tree, code tree, binary search tree, syntax tree, 2-3 tree]

PQ-trees
representation for permutations

P-node with children 1 2 3: { 123, 132, 213, 231, 312, 321 }
Q-node with children 1 2 3: { 123, 321 }

PQ-trees
data structure to represent all possibilities

P-node (k≥2) with subtrees T1 … Tk: any permutation of the subtrees
Q-node (k≥3) with subtrees T1 … Tk: linear order of the subtrees (or its mirror image)
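The two node types can be sketched as a small recursive structure. The encoding below (a leaf is a one-character string, an inner node is a pair of kind and children) and the names `P`, `Q` and `frontiers` are assumptions of this sketch, not a prescribed representation. `frontiers` lists every leaf order the tree stands for.

```python
from itertools import permutations, product

# Assumed encoding: a leaf is a one-character string,
# an inner node is a tuple (kind, children) with kind "P" or "Q".
def P(*children):
    return ("P", list(children))

def Q(*children):
    return ("Q", list(children))

def frontiers(tree):
    """Return the set of leaf orders (as strings) represented by the tree."""
    if isinstance(tree, str):
        return {tree}
    kind, children = tree
    if kind == "P":
        # P-node: children may appear in any order.
        arrangements = permutations(children)
    else:
        # Q-node: children keep their order, or the whole order is mirrored.
        arrangements = [tuple(children), tuple(reversed(children))]
    result = set()
    for arrangement in arrangements:
        for parts in product(*(frontiers(c) for c in arrangement)):
            result.add("".join(parts))
    return result

p_orders = frontiers(P("1", "2", "3"))
q_orders = frontiers(Q("1", "2", "3"))
print(sorted(p_orders))  # ['123', '132', '213', '231', '312', '321']
print(sorted(q_orders))  # ['123', '321']
```

Enumerating frontiers is exponential in general; the point of the tree is that it stores all these orders without listing them.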

PQ trees represent possible reorderings (permutations of probes)

example

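clones { A, C, D } { A, B, C, E }

        A B C D E
clone 1 1 0 1 1 0
clone 2 1 1 1 0 1

reordered:

        D A C B E
clone 1 1 1 1 0 0
clone 2 0 1 1 1 1

[Figure: clones 1 and 2 as overlapping intervals over D AC BE]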
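For the clones { A, C, D } and { A, B, C, E }, the admissible orders can be written as a PQ-tree with a Q-node root over the leaf D, a P-node (A, C) and a P-node (B, E). That tree shape is inferred here from the example, and the tuple encoding and the helper names `frontiers` and `together` are assumptions of this sketch. It enumerates the orders of the tree so they can be compared with the clone sets.

```python
from itertools import permutations, product

def frontiers(node):
    """Leaf orders of a PQ-tree given as a leaf string or (kind, children)."""
    if isinstance(node, str):
        return [node]
    kind, kids = node
    # P-node: every permutation of the children; Q-node: the order and its mirror.
    layouts = permutations(kids) if kind == "P" else (kids, kids[::-1])
    out = []
    for layout in layouts:
        out += ["".join(parts) for parts in product(*map(frontiers, layout))]
    return out

def together(order, probe_set):
    """True if the probes of probe_set are adjacent in the order."""
    idx = sorted(order.index(p) for p in probe_set)
    return idx[-1] - idx[0] + 1 == len(idx)

# Tree inferred from the example: Q(D, P(A, C), P(B, E)).
tree = ("Q", ("D", ("P", ("A", "C")), ("P", ("B", "E"))))
clones = [{"A", "C", "D"}, {"A", "B", "C", "E"}]

orders = sorted(set(frontiers(tree)))
print(len(orders))  # 8
print(all(together(o, s) for o in orders for s in clones))  # True
```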

D AC BE    EB CA D
D CA BE    EB AC D
D AC EB    BE CA D
D CA EB    BE AC D

[Figure: PQ-tree with leaves D, A, C, B, E]

PQ-trees
equivalent representations

P-node, children 1 2 3  -> reorder ->  P-node, children 3 1 2
Q-node, children 1 2 3  -> mirror ->  Q-node, children 3 2 1
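Two trees are equivalent when one is obtained from the other by reordering children of P-nodes and mirroring children of Q-nodes. One way to test this, as a sketch only: bring both trees into a canonical form and compare. The tuple encoding and the names `canonical` and `equivalent` are assumptions of this example.

```python
def canonical(tree):
    """Canonical form: identical for trees that differ only by
    reordering P-node children or mirroring Q-node children."""
    if isinstance(tree, str):
        return tree
    kind, children = tree
    forms = [canonical(c) for c in children]
    if kind == "P":
        # Order of P-children is irrelevant, so sort them (repr gives a total order).
        return ("P", tuple(sorted(forms, key=repr)))
    # A Q-node equals its mirror image, so keep the smaller of the two readings.
    return ("Q", min(tuple(forms), tuple(reversed(forms)), key=repr))

def equivalent(t1, t2):
    return canonical(t1) == canonical(t2)

print(equivalent(("P", ["1", "2", "3"]), ("P", ["3", "1", "2"])))  # True
print(equivalent(("Q", ["1", "2", "3"]), ("Q", ["3", "2", "1"])))  # True
print(equivalent(("Q", ["1", "2", "3"]), ("Q", ["2", "1", "3"])))  # False
```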

{ 123, 132, 213, 231, 312, 321 } { 123, 321 }

PQ-algorithm

reduce(T,S)
T: PQ tree ~ set of permutations
S: new clone ~ set of (consecutive) probes
add requirement S to tree T: ‘keep S together’

• colour leaves in S
• apply transformations to get consecutive leaves
• apply replacement rules to add new restriction to tree
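What reduce(T,S) is supposed to achieve can be shown on an explicit set of permutations instead of a tree. The sketch below is a brute-force stand-in, not the tree-based replacement-rule algorithm: `reduce_orders` is a hypothetical helper that simply discards every order in which S is not kept together. The clone sets are those of the two-clone example { A, C, D }, { A, B, C, E }.

```python
from itertools import permutations

def reduce_orders(orders, s):
    """Keep only the orders in which the probes of s are adjacent."""
    def together(order):
        idx = [i for i, p in enumerate(order) if p in s]
        return idx[-1] - idx[0] + 1 == len(idx)
    return {o for o in orders if together(o)}

# Start with no restrictions: every permutation of the probes is allowed
# (this corresponds to a single P-node over all leaves).
probes = "ABCDE"
orders = {"".join(p) for p in permutations(probes)}

# Add the clones one by one, as reduce(T, S) would.
for clone in [{"A", "C", "D"}, {"A", "B", "C", "E"}]:
    orders = reduce_orders(orders, clone)

print(len(orders))  # 8
```

An empty result would mean that no probe order is consistent with all clones. The tree algorithm reaches the same answer without ever listing the permutations.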

P: all leaves in S
Q: segment in S

example

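S = {A,C,E}

[Figure: tree with leaves A B C D E F, transformed so that the leaves read B A C E D F and the leaves in S are consecutive; nodes marked Ω]

‘root’: no coloured nodes outside this subtree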

[Figure: the example tree with leaves B A C E D F, continued; nodes marked Ω]

replacement rules

replacement rules (2,3)
[Figure: rule (2) with Ω at the root, rule (3) with Ω at a non-root node]

replacement rules (4,5)
[Figure: rule (4) with Ω at the root, rule (5) with Ω at a non-root node]

references

K.S. Booth and G.S. Lueker. Testing for the consecutive ones property, interval graphs, and graph planarity using PQ-tree algorithms. JCSS 13:335-379, 1976. Also 7th STOC, 1975.

challenges

• find the right model (simplification)

° noise, errors, too much / not enough data
° heuristics & AI approach
° is it data mining?

• interdisciplinary