Trees • Binary Trees • Newer Types of Tree Structures • M-Ary Trees

Trees • Binary Trees • Newer Types of Tree Structures • M-Ary Trees

<p>Data Organization and <br>Processing </p><p>Hierarchical Indexing </p><p><strong>(NDBI007) </strong></p><p>David Hoksza <a href="/goto?url=http://siret.ms.mff.cuni.cz/hoksza" target="_blank">http://siret.ms.mff.cuni.cz/hoksza </a></p><p>1</p><p>Outline </p><p>• prefix tree </p><p>• … </p><p>• Background </p><p>• graphs </p><p>• <strong>search trees </strong></p><p>• binary trees • m-ary trees </p><p>• <strong>Newer types </strong>of tree structures </p><p>• Typical tree type structures </p><p>• <strong>B-tree </strong>• <strong>B+-tree </strong></p><p>• B*-tree </p><p>2</p><p>Motivation </p><p>• Similarly as in hashing, the <strong>motivation </strong>is <strong>to find record(s) </strong>given a query </p><p>key using only <strong>a few operations </strong></p><p>• unlike hashing, trees allow to retrieve set of records with keys from given <strong>range </strong></p><p>• Tree structures use “<strong>clustering</strong>” to efficiently <strong>filter out non relevant records </strong>from the data set </p><p>• Tree-based indexes practically implement the index file data organization </p><p>• for each <strong>search key, </strong>an <strong>indexing structure </strong>can be maintained </p><p>3</p><p>Drawbacks of Index(-Sequential) Organization </p><p>• <strong>Static </strong>nature </p><p>• when <strong>inserting </strong>a record at the beginning of the file, the whole index needs to be </p><p><strong>rebuilt </strong></p><p>• an overflow handling policy can be applied → the performance degrades as the file grows </p><p>• <strong>reorganization </strong>can take <strong>a lot of&nbsp;time</strong>, especially for large tables </p><p>• <strong>not possible </strong>for online transactional processing (<strong>OLTP</strong>) scenarios (as opposed </p><p>to OLAP – online analytical processing) where many insertions/deletions occur </p><p>4</p><p>Tree Indexes </p><p>• <strong>Most common </strong>dynamic indexing structure for external memory </p><p>• When <strong>inserting/deleting into/from the primary file</strong>, the <strong>indexing </strong></p><p><strong>structure(s) </strong>residing in the secondary file is <strong>modified </strong>to accommodate the </p><p>new key </p><p>• The <strong>modification </strong>of a tree is implemented by <strong>splitting/merging </strong>nodes </p><p>• Used not only in databases </p><p>• NTFS directory structure is built over B+ tree </p><p>5</p><p>Trees </p><p>• <strong>root </strong></p><p>• <strong>Undirected graph without cycles </strong></p><p>• is the only node without parent – root of the hierarchy </p><p>• we will be interested in rooted trees <br>→ one node designated as the root → orientation </p><p>• <strong>leaves </strong></p><p>• nodes without children </p><p>• <strong>inner nodes </strong></p><p>• nodes with children </p><p>• Nodes are in <strong>parent-child relation </strong></p><p>• <strong>branch </strong></p><p>• every <strong>parent </strong>has a <strong>finite set of </strong></p><p><strong>descendants (children nodes) </strong></p><p>• from a parent an edge points to child <br>• path from a leaf to the root </p><p>• every <strong>child </strong>has <strong>exactly one parent </strong></p><p>6</p><p></p><ul style="display: flex;"><li style="flex:1">Tree height = 3 </li><li style="flex:1">Root </li></ul><p></p><p>Tree arity = 3 </p><p>Tree Characteristics (1) </p><p>D:0-H:3 </p><p><strong>width = 1 width = 2 </strong></p><p>level 0 level 1 </p><p>• <strong>Tree arity/degree </strong></p><p>• maximum number of children of any node </p><p>D:1-H:1 <br>D:1-H:2 </p><p>• <strong>Node depth </strong></p><p>• the length of the simplest path (number of edges) from the root node </p><p>D:2-H:1 </p><p>• depth(root) = 0 </p><p><strong>width = 5 </strong></p><p>• <strong>Tree depth </strong></p><p>level 2 level 3 </p><p>• maximum of node depths </p><p>D:2-H:0 </p><p>• <strong>Tree level </strong></p><p><strong>width = 2 </strong></p><p>• set of nodes with the same depth (distance from root) </p><p>• <strong>Level width </strong></p><p>D:3-H:0 </p><p>• number of nodes at given level </p><p>• <strong>Node height </strong></p><p>Tree depth = 3 </p><p>• number of edges on the <strong>longest </strong>downward simple path from Y to a leaf </p><p>• <strong>Tree height </strong></p><p>• height of the root node </p><p>7</p><p>Tree Characteristics (2) </p><p>• <strong>Balanced </strong><em>(vyv ážený ) </em><strong>tree </strong></p><p>• a tree whose subtrees differ in height by no more than one and the subtrees are height-balanced as well </p><p>• <strong>Unsorted tree </strong></p><p>• general tree – the descendants of a node are not sorted at all </p><p>• <strong>Sorted tree </strong></p><p>• unlike general tree the order of children of each node matters </p><p>• children of a node are sorted based on a given key </p><p>8</p><p>Binary Tree </p><p>• <strong>Each inner node </strong>contains at </p><p>most <strong>two child nodes </strong>(tree arity </p><p>= 2) </p><p>• Characteristics </p><p>• <strong>number of nodes </strong>of a <strong>perfect </strong></p><p>binary tree of height ℎ </p><p>풉</p><p>#풏풐풅풆풔<sub style="top: 0.41em;">풑풆풓풇&nbsp;</sub>=&nbsp;&nbsp;&nbsp;ퟐ<sup style="top: -0.73em;">풊&nbsp;</sup>=&nbsp;ퟐ<sup style="top: -0.73em;">풉+ퟏ&nbsp;</sup>−&nbsp;ퟏ </p><p>• Types </p><p>• <strong>perfect (perfektní) binary tree </strong>– </p><p>every non-leaf node has two child </p><p>nodes </p><p>• <strong>complete (komletní) binary tree </strong></p><p>– full tree except for the first nonleaf level where nodes are as far left </p><p>as possible </p><p>풊=ퟎ </p><p>• <strong>number of leaf nodes </strong>of a </p><p><strong>perfect </strong>binary tree of height ℎ </p><p>#풍풆풂풇_풏풐풅풆풔<sub style="top: 0.34em;">풑풆풓풇&nbsp;</sub>=&nbsp;ퟐ<sup style="top: -0.61em;">풉 </sup></p><p>9</p><p>Binary Search Tree (BST) </p><p></p><ul style="display: flex;"><li style="flex:1">• Binary sorted tree where </li><li style="flex:1">• Searching a record with a key 푘 </li></ul><p></p><p>• each node is assigned a value <br>1. Enter&nbsp;the tree at the root level </p><p>corresponding to one of the keys <br>2. Compare&nbsp;푘&nbsp;with the value 푣&nbsp;of the </p><p>• all nodes in the <strong>left subtree&nbsp;</strong>of a node </p><p>푋&nbsp;correspond to keys with <strong>lower </strong>values than the value of 푋 </p><p>• all nodes in the <strong>right subtree&nbsp;</strong>of a </p><p>node 푋&nbsp;correspond to keys with <strong>higher </strong>values than the value of 푋 current node </p><p>3. If&nbsp;푘&nbsp;=&nbsp;푣&nbsp;<strong>return the </strong>record </p><p>corresponding to the current node key <br>4. If&nbsp;푘&nbsp;&lt;&nbsp;푣&nbsp;descend to the left subtree 5. If&nbsp;푘&nbsp;&gt;&nbsp;푣&nbsp;descend to the right subtree </p><p>6. Repeat&nbsp;steps 2-5 until a leaf is </p><p>reached. If the condition in step 3 is not met at any level, no record with the key 푘&nbsp;is present <br>• when using BST for storing records, the records can be stored directly in the nodes or addressed from them </p><p>10 </p><p>Redundant vs. Non-Redundant BST </p><p></p><ul style="display: flex;"><li style="flex:1"><strong>Non-Redundant BST </strong></li><li style="flex:1"><strong>Redundant BST </strong></li></ul><p></p><p></p><ul style="display: flex;"><li style="flex:1">• Records stored in (addressed </li><li style="flex:1">• Records stored in (addressed </li></ul><p></p><p></p><ul style="display: flex;"><li style="flex:1">from) both inner and leaf nodes </li><li style="flex:1">from) the leaves </li></ul><p></p><p>• Structure of inner and leaf nodes differ </p><p><strong>8</strong><br><strong>8</strong></p><p></p><ul style="display: flex;"><li style="flex:1"><strong>4</strong></li><li style="flex:1"><strong>9</strong></li></ul><p></p><p><strong>12 </strong><br><strong>4</strong></p><p><strong>12 </strong><br><strong>9</strong><br><strong>2</strong><br><strong>7</strong></p><ul style="display: flex;"><li style="flex:1"><strong>2</strong></li><li style="flex:1"><strong>7</strong></li><li style="flex:1"><strong>9</strong></li></ul><p></p><p></p><ul style="display: flex;"><li style="flex:1"><strong>7</strong></li><li style="flex:1"><strong>8</strong></li></ul><p></p><p>11 </p><p></p><ul style="display: flex;"><li style="flex:1"><strong>2</strong></li><li style="flex:1"><strong>4</strong></li></ul><p></p><p>Applications of Binary Trees (not related to </p><p>storing data records) </p><p>• <strong>Expression trees </strong></p><p>• leaves = variables • inner nodes = operands </p><p>• <strong>Huffman coding </strong></p><p>• leaves = data </p><p>• coding along a branch leading to give leaf = leaf’s binary representation </p><p>• <strong>Query optimizers in DBMS </strong></p><p>• query can be represented by an algebraic expression which can be in turn </p><p>represented by a binary tree </p><p>12 </p><p>M-ary Trees (1) </p><p>• Motivation </p><p>• binary trees are not suitable for magnetic disks because of their height – indexing </p><p>16 items would need an index of height 4 → up to 4 disk accesses (potentially in </p><p>different locations) to retrieve one of the 16 records </p><p>• log<sub style="top: 0.41em;">2&nbsp;</sub>1000&nbsp;=&nbsp;10&nbsp;,&nbsp;log<sub style="top: 0.41em;">2&nbsp;</sub>1,000,000&nbsp;=&nbsp;20 </p><p>• <strong>But increasing arity leads to decreasing the tree height </strong></p><p>13 </p><p>M-ary Trees (2) </p><p>• <strong>Generalization of&nbsp;binary search trees </strong></p><p>• A binary tree is a special case of an m-ary tree for 푚&nbsp;=&nbsp;2 </p><p>• only m-ary trees with 푚&nbsp;&gt;&nbsp;2 are usually considered as true m−ary trees </p><p>• unlike binary tree, <strong>m-ary tree </strong>contains 풎&nbsp;−&nbsp;ퟏ&nbsp;<strong>discriminators (keys) </strong>in every </p><p>node <br>• every node 푁&nbsp;points to up to 풎&nbsp;<strong>child nodes </strong>(subtrees) </p><p>• every <strong>subtree </strong>of 푁&nbsp;contains records with <strong>keys restricted </strong>by the <strong>pair of </strong></p><p><strong>discriminators </strong>of 푁&nbsp;between which the subtree is rooted </p><p>• the <strong>leftmost subtree </strong>contains only records with keys having <strong>lower values </strong>than <strong>all the discriminators </strong>in 푁 </p><p>• the <strong>rightmost subtree </strong>contains only records with <strong>keys having higher values </strong></p><p>than <strong>all the discriminators </strong>in 푁 </p><p>14 </p><p>M-ary Trees (3) </p><p>풎&nbsp;=&nbsp;ퟒ </p><p>Data could be stored directly in the node as well. But it is not usual in real-world database environments. </p><p><strong>disk block </strong></p><p>푘<sub style="top: 0.31em;">1 </sub></p><p></p><ul style="display: flex;"><li style="flex:1">푘<sub style="top: 0.31em;">2 </sub></li><li style="flex:1">푘<sub style="top: 0.31em;">3 </sub></li></ul><p></p><p>data1 data3 data2 </p><p>subtree with all keys </p><p>푘<sub style="top: 0.31em;">1 </sub>&lt;&nbsp;푘<sub style="top: 0.31em;">푖&nbsp;</sub>&lt; <br>푘<sub style="top: 0.31em;">2 </sub></p><p>subtree with all keys </p><p>푘<sub style="top: 0.31em;">2&nbsp;</sub>&lt;&nbsp;푘<sub style="top: 0.31em;">푖&nbsp;</sub>&lt; </p><p>푘<sub style="top: 0.31em;">3 </sub></p><p>subtree with all keys subtree with all keys </p><p>푘<sub style="top: 0.312em;">3&nbsp;</sub>&lt;&nbsp;푘<sub style="top: 0.312em;">푖 </sub></p><p>푘<sub style="top: 0.31em;">푖&nbsp;</sub>&lt;&nbsp;푘<sub style="top: 0.31em;">1 </sub></p><p>15 </p><p>Characteristics of M-Ary Trees (1) </p><p>• <strong>Maximum number of&nbsp;nodes </strong>• <strong>Maximum number of </strong></p><p><strong>records/keys </strong></p><p>• tree height ℎ </p><p>• 푙푒푣푒푙<sub style="top: 0.41em;">0 </sub>=&nbsp;푚<sup style="top: -0.73em;">0 </sup>=&nbsp;1,&nbsp;푙푒푣푒푙<sub style="top: 0.41em;">1 </sub>= </p><p>• every node contains up to 푚&nbsp;−&nbsp;1 </p><p>푚<sup style="top: -0.73em;">1</sup>,&nbsp;푙푒푣푒푙<sub style="top: 0.41em;">2&nbsp;</sub>=&nbsp;푚<sup style="top: -0.73em;">2</sup>,&nbsp;…&nbsp;푙푒푣푒푙<sub style="top: 0.41em;">ℎ </sub>=&nbsp;푚<sup style="top: -0.73em;">ℎ </sup></p><p>keys/records </p><p>푚<sup style="top: -0.73em;">ℎ+1 </sup>−&nbsp;1 </p><p>ℎ</p><p>풎<sup style="top: -0.73em;">풉+ퟏ&nbsp;</sup>−&nbsp;ퟏ <br>×&nbsp;푚&nbsp;−&nbsp;1&nbsp;=&nbsp;풎<sup style="top: -0.73em;">풉+ퟏ&nbsp;</sup>−&nbsp;ퟏ </p><p>푚&nbsp;−&nbsp;1 <br>&nbsp;&nbsp;푚<sup style="top: -0.73em;">푖&nbsp;</sup>= <br>풎&nbsp;−&nbsp;ퟏ </p><p>푖=0 </p><p>• 푚&nbsp;=&nbsp;3,&nbsp;ℎ&nbsp;=&nbsp;3 →&nbsp;#푟푒푐표푟푑푠&nbsp;≤ <br>3<sup style="top: -0.73em;">4&nbsp;</sup>−&nbsp;1&nbsp;=&nbsp;80 <br>• 푚&nbsp;=&nbsp;100,&nbsp;ℎ&nbsp;=&nbsp;3 →&nbsp;#푟푒푐표푟푑푠&nbsp;≤ <br>100<sup style="top: -0.73em;">4&nbsp;</sup>−&nbsp;1&nbsp;=&nbsp;100,000,000 </p><p>16 </p><p>Characteristics of M-Ary Trees (2) </p><p>• <strong>Minimum height </strong></p><p>풉&nbsp;=&nbsp;퐥퐨퐠<sub style="top: 0.44em;">풎&nbsp;</sub>풏 </p><p>• <strong>Maximum height </strong></p><p>풏</p><p>풉&nbsp;≐ </p><p>풎</p><p>• maximum height of the tree </p><p>corresponds to the <strong>maximum number of disk operations </strong>needed </p><p>to <strong>fetch </strong>a record → search complexity </p><p>푶(풏) </p><p>• minimum height of the tree </p><p>corresponds to the <strong>minimum number of disk operations </strong>needed </p><p>to <strong>fetch </strong>a record → search complexity </p><p>푶(퐥퐨퐠<sub style="top: 0.3725em;">퐦&nbsp;</sub>풏) </p><p>• The challenge is to <strong>keep the complexity logarithmic</strong>, that is to </p><p>keep the tree more or less balanced </p><p>17 </p><p>B-tree </p><p>• Bayer &amp; McCreight, 1972 </p><p>• B-tree is a sorted <strong>balanced m-ary </strong>(not binary) <strong>tree </strong>with additional </p><p><strong>constraints restricting the branching </strong>in each node thus causing the </p><p>tree to be reasonably “wide” </p><p>• Inserting or deleting a record in B-tree causes only local changes and not rebuilding of the whole index </p><p>18 </p><p>B-tree definition (1) </p><p>• B-trees <strong>are balanced m-ary trees </strong>fulfilling the following conditions: </p><p>1. The&nbsp;<strong>root </strong>has <strong>at least two children </strong>unless it is a leaf. </p><p>풎</p><p>2. Every&nbsp;<strong>inner node </strong>except of the root <strong>has at least&nbsp;</strong>and <strong>at most </strong>풎 </p><p>ퟐ</p><p><strong>children</strong>. </p><p>풎</p><p>3. Every&nbsp;<strong>node </strong>contains <strong>at least&nbsp;</strong>−&nbsp;ퟏ&nbsp;and <strong>at most </strong>풎&nbsp;−&nbsp;ퟏ&nbsp;<strong>(pointers </strong></p><p>ퟐ</p><p><strong>to) data records</strong>. </p><p>4. Each&nbsp;<strong>branch </strong>has the <strong>same length</strong>. </p><p>19 </p><p>B-tree definition (2) </p><p>5. The&nbsp;<strong>nodes </strong>have following <strong>organization</strong>: </p><p>풑<sub style="top: 0.37em;">ퟎ</sub>,&nbsp;풌<sub style="top: 0.37em;">ퟏ</sub>,&nbsp;풑<sub style="top: 0.37em;">ퟏ</sub>,&nbsp;풅<sub style="top: 0.37em;">ퟏ&nbsp;</sub>,&nbsp;풌<sub style="top: 0.37em;">ퟐ</sub>,&nbsp;풑<sub style="top: 0.37em;">ퟐ</sub>,&nbsp;풅<sub style="top: 0.37em;">ퟐ&nbsp;</sub>,&nbsp;…&nbsp;,&nbsp;,&nbsp;풌<sub style="top: 0.37em;">풏</sub>,&nbsp;풑<sub style="top: 0.37em;">풏</sub>,&nbsp;풅<sub style="top: 0.37em;">풏&nbsp;</sub>,&nbsp;풖 </p><p>• where 풑<sub style="top: 0.37em;">ퟎ</sub>,&nbsp;풑<sub style="top: 0.37em;">ퟏ</sub>,&nbsp;…&nbsp;,&nbsp;풑<sub style="top: 0.37em;">풏&nbsp;</sub>are <strong>pointers to the child nodes</strong>, 풌<sub style="top: 0.37em;">ퟏ</sub>,&nbsp;풌<sub style="top: 0.37em;">ퟐ</sub>,&nbsp;…&nbsp;,&nbsp;풌<sub style="top: 0.37em;">풏&nbsp;</sub>are <strong>keys</strong>, </p><p>풅<sub style="top: 0.37em;">ퟏ</sub>,&nbsp;풅<sub style="top: 0.37em;">ퟐ</sub>,&nbsp;…&nbsp;,&nbsp;풅<sub style="top: 0.37em;">풏&nbsp;</sub>are associated <strong>data, </strong>풖&nbsp;is <strong>unused space </strong>and the <strong>records </strong>풌<sub style="top: 0.37em;">풊</sub>,&nbsp;풑<sub style="top: 0.37em;">풊</sub>,&nbsp;풅<sub style="top: 0.37em;">풊&nbsp;</sub><strong>are </strong></p><p><strong>sorted </strong>in increasing order with respect to the keys and </p><p>풎</p><p>−&nbsp;ퟏ&nbsp;≤&nbsp;풏&nbsp;≤&nbsp;풎&nbsp;−&nbsp;ퟏ </p><p>ퟐ</p><p>6. If&nbsp;a <strong>subtree </strong>푼&nbsp;풑<sub style="top: 0.44em;">풊&nbsp;</sub>corresponds to the pointer 푝<sub style="top: 0.44em;">푖&nbsp;</sub>then: </p><p><strong>i. </strong>∀풌&nbsp;∈&nbsp;푼&nbsp;풑<sub style="top: 0.37em;">풊−ퟏ&nbsp;</sub>:&nbsp;풌&nbsp;&lt;&nbsp;풌<sub style="top: 0.37em;">풊 </sub><strong>ii. </strong>∀풌&nbsp;∈&nbsp;푼&nbsp;풑<sub style="top: 0.37em;">풊&nbsp;</sub>:&nbsp;풌&nbsp;&gt;&nbsp;풌<sub style="top: 0.37em;">풊 </sub></p><p>20 </p><p>Example of a B-Tree </p><p>• Keys: 25, 48, 27, 91, 35, 78, 12, 56, 38, 19 </p><p><strong>48 </strong><br><strong>19 </strong></p><p><strong>25 </strong></p><ul style="display: flex;"><li style="flex:1"><strong>27 </strong></li><li style="flex:1"><strong>78 </strong></li></ul><p></p><ul style="display: flex;"><li style="flex:1"><strong>91 </strong></li><li style="flex:1"><strong>12 </strong></li><li style="flex:1"><strong>35 </strong></li><li style="flex:1"><strong>38 </strong></li><li style="flex:1"><strong>56 </strong></li></ul><p></p><p>21 </p><p>Redundant B-Tree </p><p>• <strong>Redundant </strong>vs. <strong>non-redundan</strong>t B-trees </p><p>• the presented definition introduced the <strong>non-redundan</strong>t B-tree where <strong>each key </strong></p><p><strong>value </strong>occurred <strong>just once </strong>in the whole tree <br>• <strong>redundant B-trees </strong>store <strong>the data values in the leaves </strong>and thus have to allow <strong>repeating of&nbsp;keys in the inner nodes </strong></p><p>• restriction 6(i) is modified so that </p><p>• ∀풌&nbsp;∈&nbsp;푼&nbsp;풑<sub style="top: 0.31em;">풊−ퟏ&nbsp;</sub>:&nbsp;풌&nbsp;≤&nbsp;풌<sub style="top: 0.31em;">풊 </sub></p><p>• moreover, the <strong>inner nodes do not contain pointers </strong>to the data records </p><p>22 </p><p>B-Tree implementations </p><p>• Nodes vs. pages/blocks </p><p>• usually <strong>one page/block contains one node </strong></p><p>• <strong>Existing DBMS implementations </strong></p><p>• one <strong>page </strong>usually takes <strong>8KB </strong></p><p>• <strong>redundant B-trees </strong></p><p>• inner nodes contain only pairs&nbsp;푘<sub style="top: 0.34em;">1</sub>,&nbsp;푝<sub style="top: 0.34em;">1 </sub>→&nbsp;inner nodes and leaf nodes differ in structure and </p><p>size </p><p>• <strong>data </strong>are not stored in the indexing structure itself but <strong>addressed from the leaf&nbsp;nodes </strong></p><p>• if the key is of type 64-bit integer (8B) and the DMBS runs on a 64-bit OS (8B pointers) </p><p>with 8KB blocks →&nbsp;(푘<sub style="top: 0.34em;">1</sub>,&nbsp;푝<sub style="top: 0.34em;">1</sub>)&nbsp;=&nbsp;16퐵&nbsp;→ one page can accommodate 8192/16&nbsp;= </p><p>512 pairs → arity of the B-tree is then about 500 </p><p>23 </p><p>B-Tree - Search </p><p>• Searching a (non-redundant) tree 푇&nbsp;for a record with key 푘 </p><p>• <strong>Enter </strong>the tree in the <strong>root node</strong>. </p><p>• If the node <strong>contains a key </strong>풌<sub style="top: 0.41em;">풊&nbsp;</sub>such that 풌<sub style="top: 0.41em;">풊&nbsp;</sub>=&nbsp;풌&nbsp;return the data associated with </p><p>풅<sub style="top: 0.41em;">풊</sub>. </p><p>• Else if the node is <strong>leaf, return </strong>NULL. </p><p>• Else find <strong>lowest </strong>풊&nbsp;such that 풌&nbsp;&lt;&nbsp;풌<sub style="top: 0.41em;">풊&nbsp;</sub>and set 풋&nbsp;=&nbsp;풊&nbsp;−&nbsp;ퟏ. If there is no such ꢀ&nbsp;set 풋 </p><p><strong>as the rightmost </strong>index with existing key. </p><p>• Fetch the <strong>node </strong>pointed to by 풑<sub style="top: 0.41em;">풋</sub>. • <strong>Repeat </strong>the process from step 2. </p><p>24 </p><p>Updating a B-Tree </p><p>• The logarithmic complexity is ensured by the condition that <strong>every node has </strong></p><p><strong>to be at least half&nbsp;full </strong><br>• <strong>Inserting </strong></p><p>• finding a leaf where the new record should be inserted </p><p>• when inserting <strong>into a not yet full node no splitting occurs </strong></p><p>• when inserting <strong>into a full node</strong>, the <strong>node is split </strong>in such a way that <strong>the two resulting nodes are at least half&nbsp;full </strong></p><p>• splitting a node increments the number of subtrees of the parent node and thus possibly causes </p><p>splitting of the parent node → <strong>split cascade </strong></p><p>• <strong>Deleting </strong></p><p>• when deleting a record from a <strong>node more than half full</strong>, <strong>no reorganization </strong>happens • deleting <strong>in a half&nbsp;full node </strong>induces <strong>merging of the neighboring nodes </strong></p><p>• merging decreases the number of child nodes in the parent node thus possibly causing merging of </p><p>the parent node→ <strong>merge cascade </strong></p><p>25 </p><p>Inserting into a B-tree (1) </p><p>• <strong>Insert </strong>into a (non-redundant) tree 푇&nbsp;a record 푟&nbsp;with key 푘 </p><p>• If the tree is <strong>empty</strong>, allocate a <strong>new node</strong>, insert the key 푘&nbsp;and (pointer to record) 푟 </p><p>and <strong>return</strong>. <br>• Else find the <strong>leaf node </strong>푳&nbsp;where the key 풌&nbsp;<strong>belongs</strong>. </p><p>• If 푳&nbsp;<strong>is not full insert </strong>풓&nbsp;and 푘&nbsp;into 퐿&nbsp;in such a position that the keys are sorted and </p><p><strong>return</strong>. <br>• Else create a <strong>new node </strong>푳′. • Leave lower <strong>half records&nbsp;</strong>(all the items from 퐿&nbsp;plus 푟) in 푳&nbsp;and the higher <strong>half </strong></p><p><strong>records </strong>into 푳′&nbsp;<strong>except </strong>of the item with the middle key 푘′. </p><p><strong>a) If&nbsp;</strong>퐋&nbsp;<strong>is the root</strong>, create <strong>a new root node</strong>, move the record with key 푘′&nbsp;to the new root and <strong>point it to </strong>푳&nbsp;<strong>and </strong>푳′&nbsp;and return. </p><p>b) Else&nbsp;move the record with key 푘′&nbsp;to the <strong>parent node </strong>푷&nbsp;into appropriate position based on </p><p>the value 푘′&nbsp;<strong>and point the “left” pointer to </strong>퐋&nbsp;<strong>and the “right” pointer to </strong>푳′. </p><p>• <strong>If </strong>푷&nbsp;<strong>overflows</strong>, repeat step 5 for 푃&nbsp;else return. </p><p>26 </p><p>Inserting into a B-tree (2) </p><p><strong>27 </strong></p><p><strong>27 </strong></p><p><strong>Insert 91 </strong><br><strong>48 </strong><br><strong>25 </strong></p><p></p><ul style="display: flex;"><li style="flex:1"><strong>48 </strong></li><li style="flex:1"><strong>91 </strong></li><li style="flex:1"><strong>25 </strong></li></ul><p></p><p>35 falls into the node (48,91) </p><p>→ (35, 48, 91) → middle key </p><p>(48) moves to the parent node </p><p><strong>27 35 </strong><br><strong>48 </strong><br><strong>Insert 35 </strong></p><p></p><ul style="display: flex;"><li style="flex:1"><strong>25 </strong></li><li style="flex:1"><strong>91 </strong></li></ul><p></p><p></p><ul style="display: flex;"><li style="flex:1">퐿</li><li style="flex:1">퐿′ </li></ul><p></p><p>27 </p><p>Inserting into a B-tree (3) </p><p></p><ul style="display: flex;"><li style="flex:1"><strong>27 </strong></li><li style="flex:1"><strong>48 </strong></li></ul><p><strong>Insert 78,12 </strong></p>

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    73 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us