Final review: depth-first search vs. breadth-first search

Depth-first search. Put unvisited vertices on a stack.

Breadth-first search. Put unvisited vertices on a queue.

Connected component: maximal set of connected vertices. DFS marks all vertices connected to s in time proportional to the sum of their degrees.


BFS finds path from s to t that uses fewest number of edges.

Intuition. BFS examines vertices in increasing order of distance from s.

‣ directed graphs

Depth-first search in digraphs

Same method as for undirected graphs.
• Every undirected graph is a digraph (with edges in both directions).
• DFS is a digraph algorithm.

DFS (to visit a vertex v)

Mark v as visited. Recursively visit all unmarked vertices w adjacent from v.
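The two-line method above can be sketched directly in Java. This is a minimal illustration, not the course's Digraph/DirectedDFS API: the class name `DigraphDFS` and the bare `int[][]` adjacency-list representation (where `adj[v]` lists the vertices adjacent from `v`) are assumptions made for the example.

```java
// Sketch of DFS reachability in a digraph. adj[v] lists vertices adjacent
// from v. (Hypothetical class and representation, for illustration only.)
public class DigraphDFS {
    // Returns marked[]: marked[w] is true iff w is reachable from s.
    public static boolean[] reachable(int[][] adj, int s) {
        boolean[] marked = new boolean[adj.length];
        dfs(adj, s, marked);
        return marked;
    }

    private static void dfs(int[][] adj, int v, boolean[] marked) {
        marked[v] = true;            // mark v as visited
        for (int w : adj[v])         // recursively visit all unmarked
            if (!marked[w])          // vertices adjacent from v
                dfs(adj, w, marked);
    }

    public static void main(String[] args) {
        // digraph 0->1, 1->2, 3->1: vertices 1, 2 reachable from 0; 3 is not
        int[][] adj = { {1}, {2}, {}, {1} };
        boolean[] m = reachable(adj, 0);
        System.out.println(m[2] + " " + m[3]);  // true false
    }
}
```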

Breadth-first search in digraphs

Same method as for undirected graphs.
• Every undirected graph is a digraph (with edges in both directions).
• BFS is a digraph algorithm.


BFS (from source vertex s)

Put s onto a FIFO queue, and mark s as visited. Repeat until the queue is empty:
- remove the least recently added vertex v
- for each unmarked vertex adjacent from v: add to queue and mark as visited.
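The queue-based loop above can be sketched as follows. This is a minimal illustration under assumptions: the class name `DigraphBFS`, the `int[][]` adjacency representation, and the use of `distTo[w] == -1` as the "unmarked" flag are all choices made for the example, not the course's API.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

// Sketch of BFS from s in a digraph. distTo[v] is the fewest number of
// edges from s to v, or -1 if v is unreachable. (Hypothetical class.)
public class DigraphBFS {
    public static int[] distances(int[][] adj, int s) {
        int[] distTo = new int[adj.length];
        Arrays.fill(distTo, -1);
        Queue<Integer> queue = new ArrayDeque<>();
        distTo[s] = 0;                    // put s on a FIFO queue, mark s
        queue.add(s);
        while (!queue.isEmpty()) {
            int v = queue.remove();       // least recently added vertex
            for (int w : adj[v]) {
                if (distTo[w] == -1) {    // unmarked vertex adjacent from v
                    distTo[w] = distTo[v] + 1;
                    queue.add(w);         // add to queue and mark as visited
                }
            }
        }
        return distTo;
    }

    public static void main(String[] args) {
        int[][] adj = { {1, 2}, {3}, {3}, {} };
        System.out.println(distances(adj, 0)[3]);  // 2 (e.g. path 0->1->3)
    }
}
```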


Is w reachable from v in this digraph?

Proposition. BFS computes shortest paths (fewest number of edges).

Topological sort

DAG. Directed acyclic graph.

Topological sort. Redraw DAG so all edges point up.

DAG edges: 0→5, 0→2, 0→1, 3→6, 3→5, 3→4, 5→4, 6→4, 6→0, 3→2, 1→4.

Solution. Reverse DFS postorder!
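The solution can be sketched in a few lines of Java: run DFS, record each vertex when its recursive call finishes (postorder), and read that order backwards. The class name `TopologicalSort` and the `int[][]` adjacency representation are assumptions for this example; using a `Deque` as a stack makes the reversal free.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: reverse DFS postorder of a DAG is a topological order.
public class TopologicalSort {
    public static int[] order(int[][] adj) {
        boolean[] marked = new boolean[adj.length];
        Deque<Integer> reversePost = new ArrayDeque<>();   // used as a stack
        for (int v = 0; v < adj.length; v++)
            if (!marked[v]) dfs(adj, v, marked, reversePost);
        int[] order = new int[adj.length];
        for (int i = 0; i < order.length; i++)
            order[i] = reversePost.pop();   // popping reverses the postorder
        return order;
    }

    private static void dfs(int[][] adj, int v, boolean[] marked, Deque<Integer> post) {
        marked[v] = true;
        for (int w : adj[v])
            if (!marked[w]) dfs(adj, w, marked, post);
        post.push(v);                       // v is done: record in postorder
    }

    public static void main(String[] args) {
        // DAG from the slide: 0->5, 0->2, 0->1, 3->6, 3->5, 3->4,
        // 5->4, 6->4, 6->0, 3->2, 1->4 (adjacency order chosen to match the trace)
        int[][] adj = { {1, 2, 5}, {4}, {}, {2, 4, 5, 6}, {}, {4}, {0, 4} };
        for (int v : order(adj)) System.out.print(v + " ");  // 3 6 0 5 2 1 4
    }
}
```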

Reverse DFS postorder in a DAG

[Trace of dfs() showing marked[] and reversePost: reversePost fills as 4 1 2 5 0 6 3; reading it in reverse, 3 6 0 5 2 1 4, gives a topological order.]

Strongly-connected components

Def. Vertices v and w are strongly connected if there is a directed path from v to w and a directed path from w to v.

Key property. Strong connectivity is an equivalence relation: • v is strongly connected to v. • If v is strongly connected to w, then w is strongly connected to v. • If v is strongly connected to w and w to x, then v is strongly connected to x.

Def. A strong component is a maximal subset of strongly-connected vertices.

A graph and its connected components. A digraph and its strong components.

Kosaraju's algorithm

Simple (but mysterious) algorithm for computing strong components.
• Run DFS on G^R (the reverse digraph) to compute reverse postorder.
• Run DFS on G, considering vertices in order given by first DFS.
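The two passes can be sketched as below. This is a compact illustration, not the course's KosarajuSharirSCC implementation: the class name, the `int[][]` adjacency representation, and the use of `id[v] == -1` as the unvisited flag are assumptions for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Sketch of Kosaraju's algorithm: DFS on the reverse digraph gives an
// order; DFS on G in that order labels one strong component per outer call.
public class KosarajuSCC {
    // id[v] == id[w] iff v and w are strongly connected.
    public static int[] components(int[][] adj) {
        int n = adj.length;
        int[][] rev = reverse(adj);
        boolean[] marked = new boolean[n];
        Deque<Integer> order = new ArrayDeque<>();
        for (int v = 0; v < n; v++)
            if (!marked[v]) dfs(rev, v, marked, order);  // reverse postorder of G^R
        int[] id = new int[n];
        Arrays.fill(id, -1);
        int count = 0;
        while (!order.isEmpty()) {
            int v = order.pop();
            if (id[v] == -1) label(adj, v, id, count++);
        }
        return id;
    }

    private static void dfs(int[][] adj, int v, boolean[] marked, Deque<Integer> post) {
        marked[v] = true;
        for (int w : adj[v]) if (!marked[w]) dfs(adj, w, marked, post);
        post.push(v);
    }

    private static void label(int[][] adj, int v, int[] id, int c) {
        id[v] = c;
        for (int w : adj[v]) if (id[w] == -1) label(adj, w, id, c);
    }

    private static int[][] reverse(int[][] adj) {
        int n = adj.length;
        int[] deg = new int[n];
        for (int v = 0; v < n; v++) for (int w : adj[v]) deg[w]++;
        int[][] rev = new int[n][];
        for (int v = 0; v < n; v++) rev[v] = new int[deg[v]];
        int[] idx = new int[n];
        for (int v = 0; v < n; v++) for (int w : adj[v]) rev[w][idx[w]++] = v;
        return rev;
    }

    public static void main(String[] args) {
        // 0->1->2->0 is one strong component; 3 (reached via 2->3) is its own
        int[][] adj = { {1}, {2}, {0, 3}, {} };
        int[] id = components(adj);
        System.out.println((id[0] == id[1]) + " " + (id[0] == id[3]));  // true false
    }
}
```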

[Trace: DFS in the reverse digraph computes reverse postorder 1 0 2 4 5 3 11 9 12 10 6 7 8; DFS in the original digraph, checking unmarked vertices in that order, gives the strong components.]

Kosaraju's algorithm for finding strong components in digraphs.

‣ minimum spanning trees

Minimum spanning tree

Given. Undirected graph G with positive edge weights (connected). Def. A spanning tree of G is a subgraph T that is connected and acyclic. Goal. Find a min weight spanning tree.

[Figure: spanning tree T, cost = 50 = 4 + 6 + 8 + 5 + 11 + 9 + 7.]

Brute force. Try all spanning trees?

Kruskal's algorithm

Kruskal's algorithm. [Kruskal 1956] Consider edges in ascending order of weight. Add the next edge to the tree T unless doing so would create a cycle.

Graph edges sorted by weight: 0-7 0.16, 2-3 0.17, 1-7 0.19, 0-2 0.26, 5-7 0.28, 1-3 0.29, 1-5 0.32, 2-7 0.34, 4-5 0.35, 1-2 0.36, 4-7 0.37, 0-4 0.38, 6-2 0.40, 3-6 0.52, 6-0 0.58, 6-4 0.93.

[In the trace: the next MST edge is red, MST edges are black, obsolete edges gray; the gray vertices form a cut defined by the vertices connected to one of the red edge's endpoints.]

Trace of Kruskal's algorithm.

Kruskal's algorithm: implementation challenge

Challenge. Would adding edge v–w to tree T create a cycle? If not, add it.

Efficient solution. Use the union-find data structure.
• Maintain a set for each connected component in T.
• If v and w are in same set, then adding v–w would create a cycle.
• To add v–w to T, merge sets containing v and w.

Case 1: adding v–w creates a cycle. Case 2: add v–w to T and merge sets containing v and w.
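The union-find idea can be sketched as follows. This is a minimal illustration, not the course's UF/Edge API: the class name `Kruskal`, edges as `{v, w, weight}` int triples, and path-halving without union-by-rank are simplifying assumptions for the example.

```java
import java.util.Arrays;
import java.util.Comparator;

// Sketch of Kruskal's algorithm with union-find. Edges are {v, w, weight}
// int triples; returns the total weight of the MST (graph assumed connected).
public class Kruskal {
    public static int mstWeight(int n, int[][] edges) {
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
        int[][] sorted = edges.clone();
        Arrays.sort(sorted, Comparator.comparingInt((int[] e) -> e[2]));  // ascending weight
        int total = 0;
        for (int[] e : sorted) {
            int rv = find(parent, e[0]), rw = find(parent, e[1]);
            if (rv != rw) {           // v and w in different sets: no cycle
                parent[rv] = rw;      // merge sets containing v and w
                total += e[2];
            }                         // else adding v-w would create a cycle
        }
        return total;
    }

    private static int find(int[] parent, int v) {
        while (parent[v] != v) v = parent[v] = parent[parent[v]];  // path halving
        return v;
    }

    public static void main(String[] args) {
        // triangle 0-1 (1), 1-2 (2), 0-2 (3): MST takes the two cheapest edges
        int[][] edges = { {0, 1, 1}, {1, 2, 2}, {0, 2, 3} };
        System.out.println(mstWeight(3, edges));  // 3
    }
}
```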

Prim's algorithm

Prim's algorithm. [Jarník 1930, Dijkstra 1957, Prim 1959] Start with vertex 0 and greedily grow tree T. At each step, add to T the min weight edge with exactly one endpoint in T.

[Trace of Prim's algorithm: at each step, the edges with exactly one endpoint in T, sorted by weight.]

Trace of Prim's algorithm.

Prim's algorithm: lazy implementation

Challenge. Find the min weight edge with exactly one endpoint in T.

Lazy solution. Maintain a PQ of edges with (at least) one endpoint in T.
• Delete min to determine next edge e = v–w to add to T.
• Disregard if both endpoints v and w are in T.
• Otherwise, let v be the vertex not in T:
- add to PQ any edge incident to v (assuming other endpoint not in T)
- add v to T
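The lazy scheme above can be sketched as follows. This is an illustration under assumptions, not the course's LazyPrimMST/Edge API: the class name, `adj[v]` as an array of `{other endpoint, weight}` pairs, and PQ entries as `{v, w, weight}` int triples are choices made for the example.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Sketch of lazy Prim's algorithm. adj[v] lists edges {w, weight} incident
// to v; returns the MST weight (graph assumed connected).
public class LazyPrim {
    public static int mstWeight(int[][][] adj) {
        boolean[] marked = new boolean[adj.length];
        PriorityQueue<int[]> pq =
            new PriorityQueue<>(Comparator.comparingInt((int[] e) -> e[2]));
        int total = 0;
        visit(adj, 0, marked, pq);                 // start with vertex 0
        while (!pq.isEmpty()) {
            int[] e = pq.remove();                 // min weight edge on PQ
            int v = e[0], w = e[1];
            if (marked[v] && marked[w]) continue;  // lazy: both endpoints in T
            total += e[2];
            if (!marked[v]) visit(adj, v, marked, pq);
            if (!marked[w]) visit(adj, w, marked, pq);
        }
        return total;
    }

    private static void visit(int[][][] adj, int v, boolean[] marked,
                              PriorityQueue<int[]> pq) {
        marked[v] = true;                          // add v to T
        for (int[] e : adj[v])
            if (!marked[e[0]])                     // add edges incident to v whose
                pq.add(new int[]{v, e[0], e[1]});  // other endpoint is not in T
    }

    public static void main(String[] args) {
        // triangle: 0-1 weight 1, 1-2 weight 2, 0-2 weight 3
        int[][][] adj = {
            { {1, 1}, {2, 3} },   // edges at 0
            { {0, 1}, {2, 2} },   // edges at 1
            { {0, 3}, {1, 2} },   // edges at 2
        };
        System.out.println(mstWeight(adj));  // 3
    }
}
```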

1-7 is the min weight edge with exactly one endpoint in T. Priority queue of crossing edges: 1-7 0.19, 0-2 0.26, 5-7 0.28, 2-7 0.34, 4-7 0.37, 0-4 0.38, 6-0 0.58.

Prim's algorithm: running time

Proposition. Lazy Prim's algorithm computes the MST in time proportional to E log E in the worst case.

Pf.

operation    frequency    binary heap
delete min   E            log E
insert       E            log E

‣ shortest paths

Edge relaxation

Relax edge e = v→w.
• distTo[v] is length of shortest known path from s to v.
• distTo[w] is length of shortest known path from s to w.
• edgeTo[w] is last edge on shortest known path from s to w.
• If e = v→w gives shorter path to w through v, update distTo[w] and edgeTo[w].

private void relax(DirectedEdge e)
{
   int v = e.from(), w = e.to();
   if (distTo[w] > distTo[v] + e.weight())
   {
      distTo[w] = distTo[v] + e.weight();
      edgeTo[w] = e;
   }
}

Edge relaxation (two cases): either v→w is ineligible (no changes), or v→w successfully relaxes (distTo[w] and edgeTo[w] are updated).
Thursday, March 29, 12

Dijkstra's algorithm
• Consider vertices in increasing order of distance from s (non-tree vertex with the lowest distTo[] value).
• Add vertex to tree and relax all edges incident from that vertex.

Edge-weighted digraph: 4→5 0.35, 5→4 0.35, 4→7 0.37, 5→7 0.28, 7→5 0.28, 5→1 0.32, 0→4 0.38, 0→2 0.26, 7→3 0.39, 1→3 0.29, 2→7 0.34, 6→2 0.40, 3→6 0.52, 6→0 0.58, 6→4 0.93.

Shortest path from 0 to 6: 0→2 0.26, 2→7 0.34, 7→3 0.39, 3→6 0.52.

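The vertex-selection loop plus edge relaxation can be sketched as below. This is a lazy variant (stale PQ entries are skipped), not the course's DijkstraSP with an indexed PQ; the class name and the `double[][][]` adjacency representation (`adj[v]` lists `{w, weight}` pairs) are assumptions for the example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;

// Sketch of Dijkstra's algorithm, lazy version: the PQ holds {v, dist}
// entries and entries made obsolete by a later relaxation are skipped.
public class Dijkstra {
    public static double[] shortestPaths(double[][][] adj, int s) {
        double[] distTo = new double[adj.length];
        Arrays.fill(distTo, Double.POSITIVE_INFINITY);
        distTo[s] = 0.0;
        PriorityQueue<double[]> pq =
            new PriorityQueue<>(Comparator.comparingDouble((double[] a) -> a[1]));
        pq.add(new double[]{s, 0.0});
        while (!pq.isEmpty()) {
            double[] top = pq.remove();            // non-tree vertex with
            int v = (int) top[0];                  // the lowest distTo[] value
            if (top[1] > distTo[v]) continue;      // stale entry: skip
            for (double[] e : adj[v]) {            // relax all edges from v
                int w = (int) e[0];
                if (distTo[w] > distTo[v] + e[1]) {
                    distTo[w] = distTo[v] + e[1];
                    pq.add(new double[]{w, distTo[w]});
                }
            }
        }
        return distTo;
    }

    public static void main(String[] args) {
        // 0->1 (0.5), 0->2 (0.1), 2->1 (0.2): shortest 0-to-1 path is 0->2->1
        double[][][] adj = { { {1, 0.5}, {2, 0.1} }, {}, { {1, 0.2} } };
        System.out.println(shortestPaths(adj, 0)[1]);  // ~0.3 (via 0->2->1)
    }
}
```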

‣ string sorts

Key-indexed counting

Goal. Sort an array a[] of N integers between 0 and R - 1.
• Count frequencies of each letter using key as index.
• Compute frequency cumulates which specify destinations.
• Access cumulates using key as index to move records.
• Copy back into original array.

int N = a.length;
int[] count = new int[R+1];

for (int i = 0; i < N; i++)
   count[a[i]+1]++;

for (int r = 0; r < R; r++)
   count[r+1] += count[r];

for (int i = 0; i < N; i++)
   aux[count[a[i]]++] = a[i];

for (int i = 0; i < N; i++)
   a[i] = aux[i];

Least-significant-digit-first string sort

LSD string sort. • Consider characters from right to left. • Stably sort using dth character as the key (using key-indexed counting).
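The two bullets above can be sketched as one stable key-indexed counting pass per character position. The class name `LSD` and the fixed string length parameter `w` are assumptions for this example (the slide's strings all have length 3); R = 256 assumes extended-ASCII keys.

```java
// Sketch of LSD string sort for fixed-length strings: one stable
// key-indexed counting pass per character, from right to left.
public class LSD {
    public static void sort(String[] a, int w) {   // w = common string length
        int R = 256;                               // extended-ASCII alphabet
        String[] aux = new String[a.length];
        for (int d = w - 1; d >= 0; d--) {
            int[] count = new int[R + 1];
            for (String s : a) count[s.charAt(d) + 1]++;           // frequencies
            for (int r = 0; r < R; r++) count[r + 1] += count[r];  // cumulates
            for (String s : a) aux[count[s.charAt(d)]++] = s;      // move (stably)
            System.arraycopy(aux, 0, a, 0, a.length);              // copy back
        }
    }

    public static void main(String[] args) {
        String[] a = { "dab", "add", "cab", "fad", "fee", "bad", "dad",
                       "bee", "fed", "bed", "ebb", "ace" };
        sort(a, 3);
        System.out.println(String.join(" ", a));
        // ace add bad bed bee cab dab dad ebb fad fed fee
    }
}
```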

[Trace of LSD string sort: dab add cab fad fee bad dad bee fed bed ebb ace → after three stable passes: ace add bad bed bee cab dab dad ebb fad fed fee.]

Sort must be stable (arrows in the trace do not cross).

Most-significant-digit-first string sort

MSD string sort. • Partition file into R pieces according to first character (use key-indexed counting). • Recursively sort all strings that start with each character (key-indexed counts delineate subarrays to sort).

[Trace of MSD string sort: key-indexed counting on the first character partitions the array into subarrays starting with a, b, c, d, e, f; count[] delineates the subarrays, which are then sorted independently (recursively).]

3-way string quicksort (Bentley and Sedgewick, 1997)

Overview. Do 3-way partitioning on the dth character. • Cheaper than R-way partitioning of MSD string sort. • Need not examine again characters equal to the partitioning char.
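The partitioning scheme can be sketched as follows. This is an illustration, not Bentley and Sedgewick's exact code: the class name `Quick3String` and the use of -1 as the end-of-string sentinel are assumptions for the example.

```java
// Sketch of 3-way string quicksort: partition on the d-th character into
// less / equal / greater; recurse, skipping character d in the equal part.
public class Quick3String {
    public static void sort(String[] a) { sort(a, 0, a.length - 1, 0); }

    private static int charAt(String s, int d) {
        return d < s.length() ? s.charAt(d) : -1;   // -1 marks end of string
    }

    private static void sort(String[] a, int lo, int hi, int d) {
        if (hi <= lo) return;
        int lt = lo, gt = hi, i = lo + 1;
        int v = charAt(a[lo], d);                   // partitioning character
        while (i <= gt) {
            int t = charAt(a[i], d);
            if      (t < v) swap(a, lt++, i++);
            else if (t > v) swap(a, i, gt--);
            else            i++;
        }
        sort(a, lo, lt - 1, d);                     // "less" subarray
        if (v >= 0) sort(a, lt, gt, d + 1);         // "equal": next character
        sort(a, gt + 1, hi, d);                     // "greater" subarray
    }

    private static void swap(String[] a, int i, int j) {
        String t = a[i]; a[i] = a[j]; a[j] = t;
    }

    public static void main(String[] args) {
        String[] a = { "she", "sells", "seashells", "by", "the", "sea", "shore" };
        sort(a);
        System.out.println(String.join(" ", a));
        // by sea seashells sells she shore the
    }
}
```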

[Trace: the partitioning element "she" splits the keys by first character into "less", "equal", and "greater" subarrays; the subarrays are sorted recursively, excluding the first character for the "equal" subarray.]

‣ tries

R-way tries

• Store characters and values in nodes (not keys). • Each node has R children, one for each possible character. • For now, we do not draw null links.

Ex. she sells sea shells by the

key value: by 4, sea 2, sells 1, she 0, shells 3, the 5

Anatomy of a trie (label each node with the character associated with its incoming link; the value for a key sits in the node corresponding to its last character).

Ternary search tries

TST. [Bentley-Sedgewick, 1997] • Store characters and values in nodes (not keys). • Each node has three children: smaller (left), equal (middle), larger (right).

[Figure: TST representation of a trie. Each node has three links: left for keys starting with a smaller character, middle for equal, right for larger.]

‣ compression

Variable-length codes

Need prefix-free code to prevent ambiguities: Ensure that no codeword is a prefix of another.

Ex 1. Fixed-length code.
Ex 2. Append special stop char to each codeword.
Ex 3. General prefix-free code.

Codeword table 1: ! 101, A 0, B 1111, C 110, D 100, R 1110.
Compressed bitstring: 011111110011001000111111100101 (30 bits) for A B R A C A D A B R A !

Codeword table 2: ! 101, A 11, B 00, C 010, D 100, R 011.
Compressed bitstring: 11000111101011100110001111101 (29 bits) for A B R A C A D A B R A !

Two prefix-free codes.

Huffman codes

Goal. Find the best prefix-free code.

Huffman algorithm:
• Count frequency freq[i] for each char i in input.
• Start with one node corresponding to each char i (with weight freq[i]).
• Repeat until single trie formed:
- select two tries with min weight freq[i] and freq[j]
- merge into single trie with weight freq[i] + freq[j]
• Implementation: use a MinPQ on trie weights.

Applications. JPEG, MP3, MPEG, PKZIP, GZIP, PDF, …
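The merge loop above can be sketched as follows. This is an illustration of trie construction only (it returns the codeword table rather than an encoded bitstream); the class name `Huffman`, the 256-char alphabet, and reading codes off the trie with left = 0 / right = 1 are assumptions for the example.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Sketch: build a Huffman trie with a min-PQ on trie weights, then read
// the codeword for each char off the root-to-leaf path (left 0, right 1).
public class Huffman {
    static class Node {
        char ch; int freq; Node left, right;
        Node(char ch, int freq, Node l, Node r) {
            this.ch = ch; this.freq = freq; left = l; right = r;
        }
        boolean isLeaf() { return left == null && right == null; }
    }

    public static Map<Character, String> codes(String input) {
        int[] freq = new int[256];
        for (char c : input.toCharArray()) freq[c]++;     // count frequencies
        PriorityQueue<Node> pq =
            new PriorityQueue<>(Comparator.comparingInt((Node n) -> n.freq));
        for (char c = 0; c < 256; c++)
            if (freq[c] > 0) pq.add(new Node(c, freq[c], null, null));
        while (pq.size() > 1) {                           // until single trie
            Node a = pq.remove(), b = pq.remove();        // two min-weight tries
            pq.add(new Node('\0', a.freq + b.freq, a, b)); // merge
        }
        Map<Character, String> table = new HashMap<>();
        build(pq.remove(), "", table);
        return table;
    }

    private static void build(Node x, String code, Map<Character, String> table) {
        if (x.isLeaf()) { table.put(x.ch, code.isEmpty() ? "0" : code); return; }
        build(x.left, code + '0', table);
        build(x.right, code + '1', table);
    }

    public static void main(String[] args) {
        Map<Character, String> t = codes("ABRACADABRA!");
        System.out.println(t);  // 'A' (frequency 6 of 12) gets a 1-bit codeword
    }
}
```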

Prefix-free codes: trie representation

Q. How to represent the prefix-free code?
A. A binary trie!
• Chars in leaves.
• Codeword is path from root to leaf.

Lempel-Ziv-Welch compression

LZW compression. • Create ST associating W-bit codewords with string keys. • Initialize ST with codewords for single-char keys. • Find longest string s in ST that is a prefix of unscanned part of input. • Write the W-bit codeword associated with s. • Add s + c to ST, where c is next char in the input.

input:   A B R A C A D A B R A B R A B R A EOF
matches: A B R A C A D A B R A B R A B R A
output:  41 42 52 41 43 41 44 81 83 82 88 41 80

Codeword table: AB 81, BR 82, RA 83, AC 84, CA 85, AD 86, DA 87, ABR 88, RAB 89, BRA 8A, ABRA 8B.

LZW compression for ABRACADABRABRABRA
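The bullets above can be sketched as follows. This is a simplified illustration: the class name `LZW` is hypothetical, codewords are plain ints (0-255 for single chars, 256 onward for new keys) rather than fixed-width bitstrings, and the EOF codeword is omitted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of LZW compression: repeatedly find the longest prefix of the
// unscanned input that is in the ST, output its codeword, and add that
// prefix plus the next char to the ST.
public class LZW {
    public static List<Integer> compress(String input) {
        Map<String, Integer> st = new HashMap<>();
        for (int c = 0; c < 256; c++)
            st.put("" + (char) c, c);            // codewords for single-char keys
        int next = 256;
        List<Integer> output = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            String s = "" + input.charAt(i);     // grow longest prefix in ST
            while (i + s.length() < input.length()
                   && st.containsKey(s + input.charAt(i + s.length())))
                s = s + input.charAt(i + s.length());
            output.add(st.get(s));               // write codeword for s
            if (i + s.length() < input.length())
                st.put(s + input.charAt(i + s.length()), next++);  // add s + c
            i += s.length();
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(compress("ABABABA"));
        // [65, 66, 256, 258]: A, B, AB, ABA
    }
}
```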

‣ geometric search

1d range search: BST implementation

Range search. Find all keys between k1 and k2. • Recursively find all keys in left subtree (if any could fall in range). • Check key in current node. • Recursively find all keys in right subtree (if any could fall in range).

Ex. Searching in the range [F..T] in a BST on the keys S E X A R C H M L P: red keys are used in compares but are not in the range; black keys are in the range.

Range search in a BST

Proposition. Running time is proportional to R + log N (assuming BST is balanced).
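The three-step recursion above can be sketched as a small BST class. The class name `RangeBST` and integer keys are assumptions for this example (the slide uses letter keys and a balanced BST).

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of 1d range search in a BST: visit a subtree only if it could
// contain a key in [lo, hi]; keys are reported in increasing order.
public class RangeBST {
    static class Node { int key; Node left, right; Node(int k) { key = k; } }

    private Node root;

    public void insert(int key) { root = insert(root, key); }
    private Node insert(Node x, int key) {
        if (x == null) return new Node(key);
        if (key < x.key)      x.left = insert(x.left, key);
        else if (key > x.key) x.right = insert(x.right, key);
        return x;
    }

    public List<Integer> range(int lo, int hi) {
        List<Integer> found = new ArrayList<>();
        range(root, lo, hi, found);
        return found;
    }
    private void range(Node x, int lo, int hi, List<Integer> found) {
        if (x == null) return;
        if (lo < x.key) range(x.left, lo, hi, found);     // left subtree may qualify
        if (lo <= x.key && x.key <= hi) found.add(x.key); // check current node
        if (hi > x.key) range(x.right, lo, hi, found);    // right subtree may qualify
    }

    public static void main(String[] args) {
        RangeBST bst = new RangeBST();
        for (int k : new int[]{8, 3, 10, 1, 6, 14}) bst.insert(k);
        System.out.println(bst.range(4, 11));  // [6, 8, 10]
    }
}
```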

Quadtree

Idea. Recursively divide space into 4 quadrants. Implementation. 4-way tree (actually a trie).

[Figure: quadtree on points a–h; each node splits its region into NW, NE, SW, SE quadrants.]

public class QuadTree
{
   private Quad quad;
   private Value val;
   private QuadTree NW, NE, SW, SE;
}

Benefit. Good performance in the presence of clustering. Drawback. Arbitrary depth!

Quadtree: 2d orthogonal range search

Range search. Find all keys in a given 2d range. • Recursively find all keys in NE quadrant (if any could fall in range). • Recursively find all keys in NW quadrant (if any could fall in range). • Recursively find all keys in SE quadrant (if any could fall in range). • Recursively find all keys in SW quadrant (if any could fall in range).


Typical running time. R + log N.

2d tree

Recursively partition plane into two halfplanes.

[Figure: recursive halfplane partitioning of points 1–10 and the corresponding 2d tree.]

2d tree implementation

Data structure. BST, but alternate using x- and y-coordinates as key. • Search gives rectangle containing point. • Insert further subdivides the plane.

[Figure: at even levels, p splits the plane into points left of p and points right of p; at odd levels, q splits into points below q and points above q.]

2d tree: 2d orthogonal range search

Range search. Find all points in a query axis-aligned rectangle. • Check if point in node lies in given rectangle. • Recursively search left/top subdivision (if any could fall in rectangle). • Recursively search right/bottom subdivision (if any could fall in rectangle).

Typical case. R + log N. Worst case (assuming tree is balanced). R + √N.


2d tree: nearest neighbor search

Nearest neighbor search. Given a query point, find the closest point. • Check distance from point in node to query point. • Recursively search left/top subdivision (if it could contain a closer point). • Recursively search right/bottom subdivision (if it could contain a closer point). • Organize recursive method so that it begins by searching for query point.

Typical case. log N. Worst case (even if tree is balanced). N.

[Figure: nearest neighbor search in a 2d tree, with the query point and the closest point highlighted.]

Search for intersections

Problem. Find all intersecting pairs among N geometric objects. Applications. CAD, games, movies, virtual reality, ....

Simple version. 2d, all objects are horizontal or vertical line segments.

Brute force. Test all Θ(N²) pairs of line segments for intersection.

Orthogonal line segment intersection search: sweep-line algorithm


Sweep vertical line from left to right. • x-coordinates define events. • h-segment (left endpoint): insert y-coordinate into ST. • h-segment (right endpoint): remove y-coordinate from ST. • v-segment: range search for interval of y-endpoints.

[Figure: a v-segment triggers a 1d range search among the y-coordinates currently in the ST.]

General line segment intersection search

Extend sweep-line algorithm.
• Maintain segments that intersect sweep line ordered by y-coordinate.
• Intersections can only occur between adjacent segments.
• Add line segment ⇒ one new pair of adjacent segments.
• Delete line segment ⇒ two segments become adjacent.
• Intersection ⇒ swap adjacent segments.

[Figure: order of segments intersecting the sweep line as it advances: A, AB, ABC, ACB, ACBD, ACD, CAD, CA, A. Inserting a segment adds it to the order, deleting removes it, and an intersection swaps adjacent segments.]

Line segment intersection: implementation

Efficient implementation of sweep-line algorithm.
• Maintain PQ of important x-coordinates: endpoints and intersections.
• Maintain set of segments intersecting sweep line, sorted by y (to support "next largest" and "next smallest" queries).
• Time proportional to R log N + N log N.

Implementation issues. • Degeneracy. • Floating-point precision. • Must use PQ, not presort (intersection events are unknown ahead of time).

Orthogonal rectangle intersection search

Goal. Find all intersections among a set of N orthogonal rectangles. Non-degeneracy assumption. All x- and y-coordinates are distinct.

Application. Design-rule checking in VLSI circuits.

Orthogonal rectangle intersection search

Move a vertical "sweep line" from left to right. • Sweep line: sort rectangles by x-coordinates and process in this order, stopping on left and right endpoints. • Maintain set of y-intervals intersecting sweep line. • Left endpoint: search set for intersecting y-intervals; insert y-interval. • Right endpoint: delete y-interval.

[Figure: sweep line over rectangles 0–3, with the y-intervals of the rectangles it currently intersects.]

Interval search trees

Create BST, where each node stores an interval (lo, hi).
• Use left endpoint as BST key.
• Store max endpoint in subtree rooted at node.

Suffices to implement all ops efficiently!

[Figure: interval search tree on (4, 8), (5, 11), (7, 10), (15, 18), (17, 19), (20, 22); the root (17, 19) has subtree max 22.]

Finding an intersecting interval

To search for any interval that intersects query interval (lo, hi) :

Node x = root;
while (x != null)
{
   if (x.interval.intersects(lo, hi)) return x.interval;
   else if (x.left == null)           x = x.right;
   else if (x.left.max < lo)          x = x.right;
   else                               x = x.left;
}
return null;

Ex. Search for (9, 10).

Interval search tree: analysis

Implementation. Use a red-black BST to guarantee performance.

can maintain auxiliary information using log N extra work per op

operation                                    brute   interval search tree   best in theory
insert interval                              1       log N                  log N
find interval                                N       log N                  log N
delete interval                              N       log N                  log N
find any interval that intersects (lo, hi)   N       log N                  log N
find all intervals that intersect (lo, hi)   N       R log N                R + log N

order of growth of running time for N intervals

Rectangle intersection sweep-line algorithm: review

Move a vertical "sweep line" from left to right. • Sweep line: sort rectangles by x-coordinates and process in this order, stopping on left and right endpoints. • Maintain set of rectangles that intersect the sweep line in an interval search tree (using y-intervals of rectangle). • Left endpoint: interval search for y-interval of rectangle; insert y-interval. • Right endpoint: delete y-interval.

Geometric search: summary of the day

problem                                    solution
1d range search                            BST
kd orthogonal range search                 kd tree
1d interval search                         interval search tree
2d orthogonal line segment intersection    sweep line reduces to 1d range search
2d orthogonal rectangle intersection       sweep line reduces to 1d interval search

‣ dynamic programming

General dynamic programming technique

Applies to a problem that at first seems to require a lot of time (possibly exponential), provided we have:

• Simple subproblems: the subproblems can be defined in terms of a few variables, such as j, k, l, m, and so on.

• Subproblem optimality: the global optimum value can be defined in terms of optimal subproblems.

• Subproblem overlap: the subproblems are not independent, but instead they overlap (hence, should be constructed bottom-up).

Analysis of the LCS algorithm

Define L[i,j] to be the length of the longest common subsequence of two strings X[0..i] and Y[0..j].

Then we can define L[i,j] in the general case as follows:
• If x_i = y_j, then L[i,j] = L[i-1,j-1] + 1 (we can add this match).
• If x_i ≠ y_j, then L[i,j] = max{L[i-1,j], L[i,j-1]} (we have no match here).

Answer is contained in L[n,m] (and the subsequence can be recovered from the L table).
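The recurrence above translates directly into a bottom-up table. The class name `LCS` is an assumption for this example; row 0 and column 0 of the table serve as the empty-prefix base cases.

```java
// Sketch of the LCS dynamic program: L[i][j] is the LCS length of the
// first i chars of x and the first j chars of y; answer is L[n][m].
public class LCS {
    public static int length(String x, String y) {
        int n = x.length(), m = y.length();
        int[][] L = new int[n + 1][m + 1];     // L[0][*] = L[*][0] = 0
        for (int i = 1; i <= n; i++)
            for (int j = 1; j <= m; j++)
                if (x.charAt(i - 1) == y.charAt(j - 1))
                    L[i][j] = L[i - 1][j - 1] + 1;                 // match
                else
                    L[i][j] = Math.max(L[i - 1][j], L[i][j - 1]);  // no match
        return L[n][m];
    }

    public static void main(String[] args) {
        System.out.println(length("ABCBDAB", "BDCABA"));  // 4 (e.g. "BCBA")
    }
}
```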
