CSE 326: Data Structures
B-Trees and B+ Trees

Brian Curless
Spring 2008

Announcements (4/30/08)

• Midterm on Friday
• Special office hour: 4:30-5:30 Thursday in Jaech Gallery (6th floor of CSE building)
  – This is instead of my usual 11am office hour.
• Reading for this lecture: Weiss Sec. 4.7


Traversing very large datasets

Suppose we had very many pieces of data (as in a database), e.g., n = 2^30 ≈ 10^9.
How many (worst case) hops through the tree to find a node?

• BST
• AVL
• Splay

Memory considerations

What is in a node? In an object?

Node:
  Object obj;
  Node left;
  Node right;
  Node parent;

Object:
  Key key;
  …data…

Suppose the data is 1KB. How much space does the tree take?
How much of the data can live in 1GB of RAM?

Cycles to access:

  CPU
  Registers      1
  L1 Cache       2
  L2 Cache       30
  Main memory    250
  Disk           Random: 30,000,000   Streamed: 5000

Minimizing random disk access

In our example, almost all of our data structure is on disk.
Thus, hopping through a tree amounts to random accesses to disk. Ouch!

How can we address this problem?
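To put rough numbers on the questions above, here is a minimal back-of-the-envelope sketch. The 8-byte pointer size is an assumption (not from the slides), and the AVL bound comes from the standard minimum-size recurrence for AVL trees rather than from anything stated in the deck.

```python
n = 2**30                         # number of items (from the slide)
record_bytes = 1024               # 1KB per data object (from the slide)
ram_bytes = 2**30                 # 1GB of RAM (from the slide)
pointer_bytes = 8                 # ASSUMPTION: 64-bit references
disk_random_cycles = 30_000_000   # from the "Cycles to access" table


def avl_max_height(n):
    """Largest height h (counted in edges) of an AVL tree with at most n nodes.

    Uses the minimum-size recurrence S(h) = S(h-1) + S(h-2) + 1 with
    S(0) = 1 and S(1) = 2. Assumes n >= 2.
    """
    small, big, h = 1, 2, 1       # S(h-1), S(h), h
    while big + small + 1 <= n:
        small, big = big, big + small + 1
        h += 1
    return h


bst_worst_hops = n - 1            # a degenerate BST is a linked list
avl_worst_hops = avl_max_height(n)

data_bytes = n * record_bytes             # the data objects alone
node_bytes = n * 4 * pointer_bytes        # obj, left, right, parent per node
records_in_ram = ram_bytes // record_bytes

# If every hop is a random disk access:
avl_worst_cycles = avl_worst_hops * disk_random_cycles

print("AVL worst-case hops:", avl_worst_hops)
print("Data size in bytes:", data_bytes)
print("Records that fit in RAM:", records_in_ram)
print("Cycles for one worst-case AVL find on disk:", avl_worst_cycles)
```

Under these assumptions only about one record in a thousand fits in RAM, so even a well-balanced binary tree pays for dozens of random disk accesses per find.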


M-ary Search Tree

Suppose, somehow, we devised a search tree with maximum branching factor M:

Complete tree has height:

# hops for find:

B-Trees

How do we make an M-ary search tree work?

• Each node has (up to) M-1 keys.
• Order property:
  – subtree between two keys x and y contains leaves with values v such that x < v < y

[Figure: a node with keys 3 7 12 21; its subtrees cover x<3, 3<x<7, 7<x<12, 12<x<21, 21<x]

B-Tree Structure Properties

Root (special case)
  – has between 2 and M children (or root could be a leaf)
Internal nodes
  – store up to M-1 keys
  – have between ⎡M/2⎤ and M children
Leaf nodes
  – store between ⎡(M-1)/2⎤ and M-1 sorted keys
  – all at the same depth

[Figure: B-Tree with M = 4. Root: 10 42. Internal nodes: 6 | 20 27 34 | 50. Leaves: 1 2 | 8 9 | 12 14 16 | 22 24 | 28 32 | 35 38 39 | 44 47 | 52 60]


B+ Trees

In a B+ tree, the internal nodes have no data – only the leaves do!

• Each internal node still has (up to) M-1 keys.
• Order property:
  – subtree between two keys x and y contains leaves with values v such that x ≤ v < y
  – Note the “≤”
• Leaf nodes have up to L sorted keys.

[Figure: an internal node with keys 3 7 12 21; its subtrees cover x<3, 3≤x<7, 7≤x<12, 12≤x<21, 21≤x]

B+ Tree Structure Properties

Root (special case)
  – has between 2 and M children (or root could be a leaf)
Internal nodes
  – store up to M-1 keys
  – have between ⎡M/2⎤ and M children
Leaf nodes
  – where data is stored
  – all at the same depth
  – contain between ⎡L/2⎤ and L data items
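The order property translates directly into a find routine. The following is a minimal sketch, not the course's reference implementation: the class names `Internal` and `Leaf`, the `find` function, and the sample data strings are all hypothetical. It relies on `bisect_right`, which returns the number of keys less than or equal to the search key, so child i receives exactly the values v with keys[i-1] ≤ v < keys[i].

```python
from bisect import bisect_right


class Internal:
    """Internal node: keys only, no data (hypothetical layout)."""

    def __init__(self, keys, children):
        self.keys = keys            # sorted, up to M-1 keys
        self.children = children    # len(keys) + 1 subtrees


class Leaf:
    """Leaf node: this is where the data lives."""

    def __init__(self, items):
        self.items = items          # sorted list of (key, data), up to L items


def find(node, key):
    # Walk down: one hop per level of the tree.
    while isinstance(node, Internal):
        node = node.children[bisect_right(node.keys, key)]
    # Linear scan inside the leaf; a binary search would also work.
    for k, data in node.items:
        if k == key:
            return data
    return None


# The node from the figure: keys 3 7 12 21 over five leaves.
root = Internal(
    [3, 7, 12, 21],
    [
        Leaf([(1, "d1"), (2, "d2")]),      # x < 3
        Leaf([(3, "d3"), (5, "d5")]),      # 3 <= x < 7
        Leaf([(7, "d7"), (9, "d9")]),      # 7 <= x < 12
        Leaf([(12, "d12"), (20, "d20")]),  # 12 <= x < 21
        Leaf([(21, "d21"), (30, "d30")]),  # 21 <= x
    ],
)

print(find(root, 7))
print(find(root, 6))
```

Note that a search key equal to a separator (such as 7) goes to the subtree on its right, which is the “≤” side of the order property.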

B+ Tree: Example

B+ Tree with M = 4 (# pointers in internal node) and L = 5 (# data items in leaf)

[Figure: Root: 12 44. Internal nodes: 6 | 20 27 34 | 50. Leaves: 1 2 4 | 6 8 9 10 | 12 14 16 17 19 | 20 22 24 | 27 28 32 | 34 38 39 41 | 44 47 49 | 50 60 70. Each leaf item carries a data object (1, AB.. 2, GH.. 4, XY..), which I’ll ignore in slides. All leaves at the same depth.]

Definition for later: “neighbor” is the next sibling to the left or right.

Disk Friendliness

What makes B+ trees disk-friendly?

1. Many keys stored in a node
   • All brought to memory/cache in one disk access.
2. Internal nodes contain only keys; only leaf nodes contain keys and actual data
   • Much of tree structure can be loaded into memory irrespective of data object size
   • Data actually resides in disk

B+ trees vs. AVL trees

Suppose again we have n = 2^30 ≈ 10^9 items:

• Depth of AVL Tree

• Depth of B+ Tree with M = 256, L = 256

Great, but how do we actually make a B+ tree and keep it balanced…?

Building a B+ Tree with Insertions

M = 3 L = 3

[Figure: starting from the empty B-Tree, Insert(3), Insert(18), Insert(14) fill a single leaf: 3 14 18]
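One way to estimate the B+ tree depth for the comparison above is to count internal levels in the two extreme cases: every node completely full (shallowest tree) and every node at its minimum occupancy (deepest tree). This is a sketch under those assumptions; the function names are hypothetical, and it assumes n > L so that the root is an internal node.

```python
import math


def bplus_min_internal_levels(n, M, L):
    """Fewest internal levels: every leaf holds L items, every node has M children."""
    leaves = math.ceil(n / L)
    levels = 0
    while M**levels < leaves:
        levels += 1
    return levels


def bplus_max_internal_levels(n, M, L):
    """Most internal levels: leaves hold ceil(L/2) items, the root has 2 children,
    and every other internal node has ceil(M/2) children."""
    min_items = math.ceil(L / 2)
    min_kids = math.ceil(M / 2)
    max_leaves = n // min_items
    levels = 1
    # A tree with levels+1 internal levels needs at least 2 * min_kids**levels leaves.
    while 2 * min_kids**levels <= max_leaves:
        levels += 1
    return levels


n = 2**30
shallowest = bplus_min_internal_levels(n, 256, 256)
deepest = bplus_max_internal_levels(n, 256, 256)
binary_tree_min_depth = n.bit_length() - 1   # floor(log2 n): no binary tree does better

print("B+ tree internal levels:", shallowest, "to", deepest)
print("Best possible binary tree depth:", binary_tree_min_depth)
```

With M = L = 256 the number of internal levels stays in the low single digits, while any binary tree on the same data, AVL included, needs at least 30 levels.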

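M = 3 L = 3

[Figure: Insert(30). The leaf 3 14 18 overflows and splits. Result: root 18; leaves 3 14 | 18 30]

[Figure: Insert(32). Result: root 18; leaves 3 14 | 18 30 32]

[Figure: Insert(36). The leaf 18 30 32 overflows and splits. Result: root 18 32; leaves 3 14 | 18 30 | 32 36]

[Figure: Insert(15). Result: root 18 32; leaves 3 14 15 | 18 30 | 32 36]

[Figure: Insert(16). The leaf 3 14 15 overflows and splits; the root then has four children, so it splits too. Result: root 18; internal nodes 15 | 32; leaves 3 14 | 15 16 | 18 30 | 32 36]

[Figure: Insert(12,40,45,38). Result: root 18; internal nodes 15 | 32 40; leaves 3 12 14 | 15 16 | 18 30 | 32 36 38 | 40 45]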
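The insertion sequence traced above can be reproduced with a short program. This is a minimal sketch rather than the course's implementation: the `Node` class and the `insert`, `min_key`, and `leaves` helpers are hypothetical names, keys are assumed to be distinct, and leaves store bare keys with no attached data. An overfull leaf keeps ⎡(L+1)/2⎤ items and an overfull internal node keeps ⎡(M+1)/2⎤ children; the rest move to a new right sibling.

```python
from bisect import bisect_right, insort

M, L = 3, 3   # as in the example above


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []      # leaf: data keys; internal: separator keys
        self.children = children    # None marks a leaf


def min_key(node):
    """Smallest key stored in the subtree rooted at node."""
    while node.children is not None:
        node = node.children[0]
    return node.keys[0]


def _insert(node, key):
    """Insert key below node. Return a new right sibling if node split, else None."""
    if node.children is None:                       # leaf
        insort(node.keys, key)
        if len(node.keys) <= L:
            return None
        keep = (L + 2) // 2                         # ceil((L+1)/2) items stay
        new = Node(keys=node.keys[keep:])
        node.keys = node.keys[:keep]
        return new
    i = bisect_right(node.keys, key)                # child with keys[i-1] <= key < keys[i]
    new_child = _insert(node.children[i], key)
    if new_child is not None:
        node.children.insert(i + 1, new_child)
    new = None
    if len(node.children) > M:                      # internal overflow
        keep = (M + 2) // 2                         # ceil((M+1)/2) children stay
        new = Node(children=node.children[keep:])
        node.children = node.children[:keep]
        new.keys = [min_key(c) for c in new.children[1:]]
    # "Propagate keys up": each separator is the smallest key to its right.
    node.keys = [min_key(c) for c in node.children[1:]]
    return new


def insert(root, key):
    """Insert key and return the (possibly new) root."""
    new = _insert(root, key)
    if new is None:
        return root
    return Node(keys=[min_key(new)], children=[root, new])   # tree gets deeper


def leaves(node):
    """Leaf contents from left to right, for inspection."""
    if node.children is None:
        return [node.keys]
    return [leaf for child in node.children for leaf in leaves(child)]


root = Node()                                       # the empty tree is a single leaf
for k in [3, 18, 14, 30, 32, 36, 15, 16, 12, 40, 45, 38]:
    root = insert(root, k)

print(root.keys, [c.keys for c in root.children])
print(leaves(root))
```

Recomputing separators with `min_key` on the way back up is a simplification chosen for brevity; a production implementation would pass the new separator up from the split instead of rescanning.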

Insertion Algorithm

1. Insert the key in its leaf in sorted order
2. If the leaf ends up with L+1 items, overflow!
   – Split the leaf into two nodes:
     • original with ⎡(L+1)/2⎤ items
     • new one with ⎣(L+1)/2⎦ items
   – Add the new child to the parent
   – If the parent ends up with M+1 children, overflow!
3. If an internal node ends up with M+1 children, overflow!
   – Split the node into two nodes:
     • original with ⎡(M+1)/2⎤ children
     • new one with ⎣(M+1)/2⎦ children
   – Add the new child to the parent
   – If the parent ends up with M+1 children, overflow!
4. Split an overflowed root in two and hang the new nodes under a new root
   This makes the tree deeper!
5. Propagate keys up tree.

And Now for Deletion…

M = 3 L = 3

[Figure: before Delete(32). Root 18; internal nodes 15 | 32 40; leaves 3 12 14 | 15 16 | 18 30 | 32 36 38 | 40 45]

[Figure: after Delete(32). Root 18; internal nodes 15 | 36 40; leaves 3 12 14 | 15 16 | 18 30 | 36 38 | 40 45]

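[Figure: Delete(15). The leaf 15 16 is left with one item: underflow! It adopts 14 from its left neighbor and the parent key is updated. Result: root 18; internal nodes 14 | 36 40; leaves 3 12 | 14 16 | 18 30 | 36 38 | 40 45]

[Figure: Delete(16). The leaf 14 16 is left with one item and its neighbor 3 12 has none to spare, so the two merge into 3 12 14. Their parent now has only one child: underflow! It adopts a child from its neighbor 36 40. Result: root 36; internal nodes 18 | 40; leaves 3 12 14 | 18 30 | 36 38 | 40 45]

[Figure: Delete(14). Result: root 36; internal nodes 18 | 40; leaves 3 12 | 18 30 | 36 38 | 40 45]

[Figure: Delete(18). The leaf 18 30 is left with one item and merges with its neighbor into 3 12 30. Its parent now has only one child and its neighbor 40 has none to spare, so the two internal nodes merge. The root is left with one child, which becomes the new root. Result: root 36 40; leaves 3 12 30 | 36 38 | 40 45]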

Deletion Algorithm

1. Remove the key from its leaf
2. If the leaf ends up with fewer than ⎡L/2⎤ items, underflow!
   – Adopt data from a neighbor; update the parent
   – If adopting won’t work, delete node and merge with neighbor
   – If the parent ends up with fewer than ⎡M/2⎤ children, underflow!

Deletion Slide Two

3. If an internal node ends up with fewer than ⎡M/2⎤ children, underflow!
   – Adopt from a neighbor; update the parent
   – If adoption won’t work, merge with neighbor
   – If the parent ends up with fewer than ⎡M/2⎤ children, underflow!
4. If the root ends up with only one child, make the child the new root of the tree
   This reduces the height of the tree!
5. Propagate keys up through tree.
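Steps 1 and 2 of the deletion algorithm (the leaf-level part) can be sketched on plain lists. This is an illustration only: the function name `delete_from_leaves` is hypothetical, it handles just the leaves under a single parent, it assumes the key is present, and it leaves internal-node underflow (steps 3 to 5) to the caller.

```python
import math


def delete_from_leaves(leaves, key, L):
    """Delete key from one of the sibling leaves under a single parent.

    leaves: list of sorted key lists, left to right (modified in place).
    Returns (leaves, separator_keys) for the parent.
    """
    min_items = math.ceil(L / 2)
    i = next(j for j, leaf in enumerate(leaves) if key in leaf)
    leaves[i].remove(key)                              # step 1

    if len(leaves[i]) < min_items and len(leaves) > 1:  # step 2: underflow
        left = leaves[i - 1] if i > 0 else None
        right = leaves[i + 1] if i + 1 < len(leaves) else None
        if left is not None and len(left) > min_items:
            leaves[i].insert(0, left.pop())            # adopt largest from left neighbor
        elif right is not None and len(right) > min_items:
            leaves[i].append(right.pop(0))             # adopt smallest from right neighbor
        elif left is not None:
            left.extend(leaves.pop(i))                 # merge into left neighbor
        else:
            leaves[i].extend(leaves.pop(i + 1))        # merge right neighbor into this leaf

    # Update the parent: each separator is the smallest key of the leaf to its right.
    return leaves, [leaf[0] for leaf in leaves[1:]]


siblings = [[3, 12, 14], [15, 16]]                     # the two leftmost leaves, L = 3
siblings, keys = delete_from_leaves(siblings, 15, 3)
print(siblings, keys)                                  # adoption case
siblings, keys = delete_from_leaves(siblings, 16, 3)
print(siblings, keys)                                  # merge case
```

After the second call the parent is left with a single child, which is exactly the internal-node underflow that step 3 has to resolve.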

Thinking about B+ Trees

• B+ Tree insertion can cause (expensive) splitting and propagation
• B+ Tree deletion can cause (cheap) adoption or (expensive) deletion, merging and propagation
• Propagation is rare if M and L are large (Why?)
• Pick branching factor M and data items/leaf L such that each node takes one full page/block of memory/disk.

Tree Names You Might Encounter

FYI:
– B-Trees with M = 3, L = x are called 2-3 trees
  • Nodes can have 2 or 3 children
– B-Trees with M = 4, L = x are called 2-3-4 trees
  • Nodes can have 2, 3, or 4 children
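Picking M and L for a given block size is simple arithmetic. The sketch below assumes an internal node stores M-1 keys plus M child pointers and a leaf stores L whole data records; the function name and the byte sizes in the example (8192-byte block, 32-byte key, 4-byte pointer, 256-byte record) are illustrative assumptions, not values from the slides.

```python
def pick_m_and_l(block_bytes, key_bytes, pointer_bytes, record_bytes):
    """Largest M and L such that one node fits in one disk block.

    Internal node: (M - 1) keys + M pointers <= block_bytes
    Leaf node:     L records                 <= block_bytes
    """
    M = (block_bytes + key_bytes) // (key_bytes + pointer_bytes)
    L = block_bytes // record_bytes
    return M, L


# ASSUMED sizes, for illustration only.
M, L = pick_m_and_l(block_bytes=8192, key_bytes=32, pointer_bytes=4, record_bytes=256)
print("M =", M, "L =", L)

# Sanity check: an internal node with M children fits, one with M + 1 does not.
assert (M - 1) * 32 + M * 4 <= 8192 < M * 32 + (M + 1) * 4
```

Small keys and pointers give a branching factor in the hundreds, which is why a B+ tree over a billion items needs only a handful of disk accesses per find.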
