Nested Sets Hierarchical Model | Evan Petersen http://www.evanpetersen.com/item/nested-sets.html


Written by Evan Petersen on Thursday, 12 April 2012. Posted in Programming, Random

JUST ANOTHER HIERARCHICAL MODEL

Learn how to create a hierarchical structure optimized for fetching all child nodes in a highly efficient manner. In this tutorial, we’ll look into the pros and cons of using the Nested Set model to store our hierarchical data structure.

Storing Hierarchical Structures with the Nested Set Model

Creating a model that can store hierarchical data is not a simple task. What do I mean by hierarchical data? Imagine that you are trying to organize products into meaningful categories. For instance, you would have a few main categories: food, appliances, and electronics. You then want to sub-divide those categories into more specific groups, and then divide those groups into even more specific ones. This is fairly simple to map out on paper, but not so easy to place into a data table. Such is the case when attempting to represent hierarchical data, like products and categories, in a database. One method for modeling this type of data is called Nested Sets. This tutorial will cover the concepts behind the nested set model and when it should be implemented. I'll begin by explaining the far simpler parent-child model and then progress to the nested set model.

Parent-Child Model

Traditionally, hierarchical structures are modeled with a strict parent-to-child relationship. In the case of a tree, each node will have a unique ID and a parent ID. ID numbers have no meaningful relationship to the data that is contained by the node[1] other than serving as a unique identifier. The root[2] node will have a special value as its parent ID so as to signify that you have reached the root. Consider the following table and diagram:

ID  parentID  Value
0   -1        Root
1   0         A
2   0         B
3   1         C
4   1         D
5   3         E
6   2         F


Each arrow represents a pointer to the parent node as defined by the ID, parentID, Value table. This parent-child relationship model is one of the most widely used and understood models for representing hierarchical structures. Each node indicates what its ID is and who it is a child of (i.e. who its parent is).

The Problem:

Unfortunately, this model becomes cumbersome when querying for ALL children of a given node. Let’s say, for instance, that we want to know the ID of every child under node with ID 1. By looking at our diagram, we can quickly report that nodes 3, 4, and 5 are all children of node 1. Using our table, however, the operation isn’t nearly as easy.

Querying for Children

We begin by querying for all nodes whose parent ID is 1. In this case, we find that nodes 3 and 4 list node 1 as their parent (the value in the red arrows signifies the query number that we are currently performing). We then must query for all nodes that list 3 or 4 as their parent. As you can see, this would get out of hand quickly on larger datasets, resulting in hundreds if not thousands of database calls so that we can retrieve all children of a single node. Note how the number of queries necessary directly corresponds to the depth (height) of the tree. The number of queries will always be 1 + the height of the tree that you are querying.
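To make the query count concrete, here is a minimal sketch of the level-by-level lookup using Python's built-in sqlite3 module. The table layout, the column names `id` and `parent_id`, and the helper `fetch_descendants` are my own assumptions for illustration, not part of the original setup.

```python
import sqlite3

# Parent-child table from the example above (ID, parentID, Value).
ROWS = [(0, -1, "Root"), (1, 0, "A"), (2, 0, "B"), (3, 1, "C"),
        (4, 1, "D"), (5, 3, "E"), (6, 2, "F")]

def fetch_descendants(conn, node_id):
    """Return (descendant IDs, number of queries issued) for node_id."""
    found, frontier, queries = [], [node_id], 0
    while frontier:
        # One query per tree level: who lists the current level as parent?
        marks = ",".join("?" * len(frontier))
        rows = conn.execute(
            f"SELECT id FROM nodes WHERE parent_id IN ({marks}) ORDER BY id",
            frontier,
        ).fetchall()
        queries += 1
        frontier = [row[0] for row in rows]
        found.extend(frontier)
    return found, queries

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (id INTEGER PRIMARY KEY, parent_id INTEGER, value TEXT)"
)
conn.executemany("INSERT INTO nodes VALUES (?, ?, ?)", ROWS)

ids, query_count = fetch_descendants(conn, 1)
print(ids, query_count)  # [3, 4, 5] 3
```

The subtree under node 1 has a height of 2, and the sketch issues 3 queries: one per level, plus a final query that comes back empty and ends the loop.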

Nested Sets: our knight in shining armor!

This is where the concept of nested sets comes in. Bear with me, as it’s not as simple as the parent-child relationship we discussed earlier.

Rather than have a parent ID, we will be implementing our table with left and right values. Take a look at the table and the resulting data structure:

ID  Left  Right  Value
0   0     13     Root
1   1     8      A
2   9     12     B
3   2     5      C
4   6     7      D
5   3     4      E
6   10    11     F

Whoa, what just happened?

Alright, let’s take a step back and look at where the next set of numbers came from.

The left and right values in a node represent a set of child nodes. In turn, each child node has a left and right value that represent another set of children, hence the term Nested Sets. For instance, if I wanted to retrieve all children of node A (ID 1), I would run a query similar to this:

SELECT * FROM `nodes` WHERE `Left` > 1 AND `Right` < 8;

This would return C, D and E in a single query. Notice how no recursive step was applied. In a single query, we were able to retrieve all children of A.
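If you want to try this without a MySQL server, the following is a rough sketch using Python's built-in sqlite3 module. The column names `lft` and `rgt` are an assumption of mine (chosen to sidestep the SQL keywords LEFT and RIGHT), and the self-join that looks up the parent's bounds inside the same statement is a variation on the query above, not the exact query from this article.

```python
import sqlite3

# Nested set table from the example above (ID, Left, Right, Value).
ROWS = [(0, 0, 13, "Root"), (1, 1, 8, "A"), (2, 9, 12, "B"), (3, 2, 5, "C"),
        (4, 6, 7, "D"), (5, 3, 4, "E"), (6, 10, 11, "F")]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER, value TEXT)"
)
conn.executemany("INSERT INTO nodes VALUES (?, ?, ?, ?)", ROWS)

def descendants(conn, node_id):
    """Fetch every node nested inside node_id's (lft, rgt) interval."""
    sql = """
        SELECT child.value
        FROM nodes AS parent
        JOIN nodes AS child
          ON child.lft > parent.lft AND child.rgt < parent.rgt
        WHERE parent.id = ?
        ORDER BY child.lft
    """
    return [row[0] for row in conn.execute(sql, (node_id,))]

print(descendants(conn, 1))  # ['C', 'E', 'D']
```

Ordering by the left value returns the nodes in the order you would meet them while walking the tree, which is why E appears before D.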

Well that’s great, but how do we figure out the left and right values for an existing tree?

Surprisingly enough, this isn’t too tricky to accomplish. Here’s a diagram showing how I computed the left and right values of each node:

Determining Left and Right values

Begin at the top left arrow, starting your count from 0. Each time you pass a node, list the current count next to the node and then increment by one. If you are to the left of the node, the number represents the left value. If you’re on the right, it represents the right value.
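The walk around the tree described above can be sketched as a depth-first traversal over the parent-child rows. This is only an illustration: the function name `number_tree` is hypothetical, and I am assuming that siblings are visited in ascending ID order.

```python
from collections import defaultdict

# Parent-child rows from the earlier table: (ID, parentID, Value).
ROWS = [(0, -1, "Root"), (1, 0, "A"), (2, 0, "B"), (3, 1, "C"),
        (4, 1, "D"), (5, 3, "E"), (6, 2, "F")]

def number_tree(rows, root_parent=-1):
    """Return {id: (left, right)} by walking the tree depth-first."""
    children = defaultdict(list)
    for node_id, parent_id, _value in rows:
        children[parent_id].append(node_id)

    bounds = {}
    counter = 0

    def visit(node_id):
        nonlocal counter
        left = counter                     # passing the node on its left side
        counter += 1
        for child in sorted(children[node_id]):
            visit(child)
        bounds[node_id] = (left, counter)  # passing the node on its right side
        counter += 1

    for root in sorted(children[root_parent]):
        visit(root)
    return bounds

bounds = number_tree(ROWS)
print(bounds[1], bounds[0])  # (1, 8) (0, 13)
```

The sketch is recursive for readability. For very deep trees you would probably want an explicit stack instead, since Python limits recursion depth.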

Notice how any given leaf node’s left and right values only differ by 1 while nodes with children differ by at least 3 and then in increments of two. This can tell us a lot about a given node without performing additional queries.
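Put another way, every descendant occupies two numbers inside its ancestor's interval, so a node with n descendants satisfies right - left = 2n + 1. Here is a small sketch of what that lets you read off a single row; the helper names `is_leaf` and `descendant_count` are mine, and the arithmetic assumes the numbering has no gaps.

```python
# (Left, Right) pairs from the nested set table above, keyed by Value.
BOUNDS = {"Root": (0, 13), "A": (1, 8), "B": (9, 12), "C": (2, 5),
          "D": (6, 7), "E": (3, 4), "F": (10, 11)}

def is_leaf(left, right):
    """A leaf's right value is exactly one more than its left value."""
    return right - left == 1

def descendant_count(left, right):
    """Each descendant uses two of the numbers between left and right."""
    return (right - left - 1) // 2

for name, (left, right) in BOUNDS.items():
    print(name, is_leaf(left, right), descendant_count(left, right))
```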

Alternative Representation

Let’s look at this data structure in a way that’s a little more meaningful with respect to left and right values:

With the nodes represented in this fashion, we can very easily see how the sets are nested and the depth of a given node (distance from the root). The number line on the bottom can be thought of as the one dimensional space in which all nodes are placed. Each node consumes a distance of 1 and has a buffer of at least one slot before another node is encountered. If we were to query again for all children of node A, we can think of the query operating like so:

SELECT * FROM `nodes` WHERE `Left` > 1 AND `Right` < 8;
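The same picture suggests a way to compute depth: a node's distance from the root equals the number of intervals that enclose it. A minimal in-memory sketch follows; the helper name `depth` is hypothetical, and a real table would do the same counting with a self-join.

```python
# (Value, Left, Right) rows from the nested set table above.
NODES = [("Root", 0, 13), ("A", 1, 8), ("B", 9, 12), ("C", 2, 5),
         ("D", 6, 7), ("E", 3, 4), ("F", 10, 11)]

def depth(nodes, name):
    """Count the nodes whose interval strictly encloses the named node."""
    _, left, right = next(n for n in nodes if n[0] == name)
    return sum(1 for _, l, r in nodes if l < left and r > right)

depths = {name: depth(NODES, name) for name, _, _ in NODES}
print(depths)
```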

Inserting Nodes

Let’s try inserting a new node with value G just beneath the Root node. Take a look at the resulting table and diagram below:

ID  Left  Right  Value
0   0     15     Root
1   1     8      A
2   9     12     B
3   2     5      C
4   6     7      D
5   3     4      E
6   10    11     F
7   13    14     G

To insert the node as a direct child of another node, you need to allocate some space in the tree. To do so, retrieve the right value of the node you wish to insert under. In this case, the right value of the root was 13 (see previous diagram). To allocate the space, add 2 to all left and right values that are greater than or equal to 13.

UPDATE `nodes` SET `Left` = `Left` + 2 WHERE `Left` >= 13;
UPDATE `nodes` SET `Right` = `Right` + 2 WHERE `Right` >= 13;

Because we are inserting a new child directly below the root, our update of left values won’t affect any nodes. (Side note, we are inserting G as the rightmost child of the Root so as to minimize the number of updates the database must perform. Try inserting as the leftmost node instead and look at how many nodes must have their Left and Right values incremented). After allocating the necessary space, insert your new node, setting the left value to the root’s previous right value, and your new node’s right value to 1 + the root’s previous right value.
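As a sanity check, here is the same insert applied to an in-memory copy of the table. The dictionary layout and the function name `insert_rightmost_child` are assumptions made for this sketch; in a database, the loop corresponds to the two UPDATE statements and the final assignment to the INSERT.

```python
# Nested set rows before the insert: id -> [left, right, value].
nodes = {
    0: [0, 13, "Root"], 1: [1, 8, "A"], 2: [9, 12, "B"], 3: [2, 5, "C"],
    4: [6, 7, "D"], 5: [3, 4, "E"], 6: [10, 11, "F"],
}

def insert_rightmost_child(nodes, parent_id, new_id, value):
    """Add a leaf as the rightmost child of parent_id, updating nodes in place."""
    parent_right = nodes[parent_id][1]
    # Allocate two slots by shifting every value at or past the parent's right.
    for row in nodes.values():
        if row[0] >= parent_right:
            row[0] += 2
        if row[1] >= parent_right:
            row[1] += 2
    # The new leaf takes the parent's previous right value and the slot after it.
    nodes[new_id] = [parent_right, parent_right + 1, value]

insert_rightmost_child(nodes, 0, 7, "G")
print(nodes[0], nodes[7])  # [0, 15, 'Root'] [13, 14, 'G']
```

Only the Root row changes here, which matches the side note about inserting on the right to keep the number of updates small.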

What about deleting nodes?

Deleting nodes is an incredibly simple operation; just beware of exactly what is happening. Let’s take a look at the different cases for a delete:

Deleting a leaf

You will remove the leaf node and the rest of the tree will remain untouched.

1. Decrement all left values greater than the node to delete’s left value by 2
2. Decrement all right values greater than the node to delete’s right value by 2
3. Remove node
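The three steps for a leaf can be sketched against an in-memory copy of the table that includes G. The dictionary layout and the function name `delete_leaf` are assumptions for illustration only.

```python
# Nested set rows after G was inserted: id -> [left, right, value].
nodes = {
    0: [0, 15, "Root"], 1: [1, 8, "A"], 2: [9, 12, "B"], 3: [2, 5, "C"],
    4: [6, 7, "D"], 5: [3, 4, "E"], 6: [10, 11, "F"], 7: [13, 14, "G"],
}

def delete_leaf(nodes, node_id):
    """Remove a leaf and close the two-slot gap it leaves behind."""
    left, right, _value = nodes[node_id]
    if right - left != 1:
        raise ValueError("node has children, so it is not a leaf")
    for row in nodes.values():
        if row[0] > left:    # step 1: shift left values past the leaf
            row[0] -= 2
        if row[1] > right:   # step 2: shift right values past the leaf
            row[1] -= 2
    del nodes[node_id]       # step 3: remove the node itself

delete_leaf(nodes, 7)
print(nodes[0])  # [0, 13, 'Root']
```

Deleting G simply undoes the earlier insert, so the table returns to its original seven rows.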

Deleting a node with children

You will remove the node and promote all immediate children to be direct descendants of the parent node of the node you are removing.

1. Decrement all left and right values by 1 if left value is greater than node to delete’s left value and right value is less than node to delete’s right value.
2. Decrement all left values by 2 if left value is greater than node to delete’s right value.
3. Decrement all right values by 2 if right value is greater than node to delete’s right value.
4. Remove node
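Here is a minimal sketch of those four steps on an in-memory copy of the table that includes G. The dictionary layout and the function name `delete_node` are hypothetical; in SQL, each branch would be its own UPDATE statement.

```python
# Nested set rows after G was inserted: id -> [left, right, value].
nodes = {
    0: [0, 15, "Root"], 1: [1, 8, "A"], 2: [9, 12, "B"], 3: [2, 5, "C"],
    4: [6, 7, "D"], 5: [3, 4, "E"], 6: [10, 11, "F"], 7: [13, 14, "G"],
}

def delete_node(nodes, node_id):
    """Remove a node and promote its children to the removed node's parent."""
    left, right, _value = nodes[node_id]
    for row in nodes.values():
        if row[0] > left and row[1] < right:
            # Step 1: descendants slide one slot to the left.
            row[0] -= 1
            row[1] -= 1
        else:
            # Steps 2 and 3: everything past the node closes a two-slot gap.
            if row[0] > right:
                row[0] -= 2
            if row[1] > right:
                row[1] -= 2
    del nodes[node_id]  # step 4: remove the node itself

delete_node(nodes, 1)  # delete A
print(nodes[3], nodes[2])  # [1, 4, 'C'] [7, 10, 'B']
```

For a leaf, the first branch never matches, so the same function also covers the leaf case.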

Deleting the Root Node

Same as deleting a Parent Node EXCEPT for the fact that it will now be possible to have multiple trees without a single root node. If the former Root node contained more than one child, multiple nodes will become root nodes. This is typically a bad event; however, it may be desired.

Sample Delete

Let’s look at what deleting a node with children (node A, ID 1) looks like in our model:

ID  Left  Right  Value
0   0     13     Root
2   7     10     B
3   1     4      C
4   5     6      D
5   2     3      E
6   8     9      F
7   11    12     G

Notice how C and D became children of the Root node while E remained a child of C. If we were to continue on and delete the Root node, we would end up with 4 root nodes and 2 children (typically considered a mistake when dealing with trees, but again, it depends on what you are attempting to accomplish.)
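To double-check that count, here is a small sketch that finds root nodes as the nodes whose interval is not enclosed by any other. The `FOREST` values are what the delete steps yield when applied to the Root in the table above (every remaining node is a descendant, so each value drops by 1); the helper name `find_roots` is my own.

```python
# Left and right values after also deleting the Root, keyed by Value.
FOREST = {"B": (6, 9), "C": (0, 3), "D": (4, 5),
          "E": (1, 2), "F": (7, 8), "G": (10, 11)}

def find_roots(bounds):
    """Return the nodes that no other node's interval encloses."""
    roots = []
    for name, (left, right) in bounds.items():
        enclosed = any(l < left and r > right for l, r in bounds.values())
        if not enclosed:
            roots.append(name)
    return sorted(roots, key=lambda n: bounds[n][0])

roots = find_roots(FOREST)
print(roots)                     # ['C', 'D', 'B', 'G']
print(len(FOREST) - len(roots))  # 2
```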

Efficiency:

Nested sets are incredibly efficient for retrieving all children of a given node. They are not efficient when deleting or inserting due to the sheer number of updates you must perform to your dataset. While it may only be 3 SQL commands to insert, the database is updating many, many records resulting in database thrashing.

When used in conjunction with a parent ID or a level number, nested sets can remain efficient for retrieving all immediate children of a given node; however, in true nested set form, multiple queries are required for retrieving only immediate children of a given node. Additionally, deleting a node with children requires an additional SQL query to update the parent pointers if the model is implemented alongside the traditional parent-child relationship.
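To illustrate why immediate children take several lookups in pure nested set form, here is one possible approach, sketched in memory: start just inside the parent's left value and hop from each child's right value to the next sibling. The function name `immediate_children` is hypothetical, and the sketch assumes the numbering has no gaps.

```python
# (Left, Right) pairs from the original nested set table, keyed by Value.
BOUNDS = {"Root": (0, 13), "A": (1, 8), "B": (9, 12), "C": (2, 5),
          "D": (6, 7), "E": (3, 4), "F": (10, 11)}

def immediate_children(bounds, parent):
    """Walk the parent's interval, skipping over each child's subtree."""
    by_left = {left: name for name, (left, _right) in bounds.items()}
    parent_left, parent_right = bounds[parent]
    children = []
    cursor = parent_left + 1           # where the first child would start
    while cursor < parent_right:
        child = by_left[cursor]        # one lookup per immediate child
        children.append(child)
        cursor = bounds[child][1] + 1  # jump past the child's whole subtree
    return children

print(immediate_children(BOUNDS, "Root"))  # ['A', 'B']
print(immediate_children(BOUNDS, "A"))     # ['C', 'D']
```

Against a database, each pass through the loop would be a separate query, which is the cost that an extra parent ID or level column avoids.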

Take away

Think of Nested Sets as another model for storing hierarchical data structures. It is highly efficient for retrieving all children of a given node, and inserting and removing nodes is simple (although somewhat resource intensive). If you happen to come across a situation where nested sets’ benefits shine, try implementing them and see what kind of performance gains you achieve over the standard parent/child relationship.

[1] Node: An object that contains information pertaining to its location and contents, e.g. parent value, identification number and human-readable information.

[2] Root: The topmost node of a tree structure, under which child nodes are placed.

Tags: Hierarchical Model, MYSQL, PHP

About the Author Evan Petersen

My name is Evan Petersen, and I work as a Programmer in Southern Oregon. You can visit the home page of my blog at: www.EvanPetersen.com.

I enjoy researching new methods to solve age-old problems and later sharing my findings with the community at large. Hopefully you'll find something of use!

Comments (11)

Ryan 12 April 2012 at 21:34 | #

Hey, thanks for taking the time to write this out. I've been trying to wrap my head around nested sets for some time but I never fully grasped it. Cheers!

Evan Petersen 17 April 2012 at 19:56 | #

My pleasure! Glad it was useful.

Josh 12 April 2012 at 22:05 | #

Great article. The only thing is that I walked away from it still wondering why would I want Nested Sets? I understand that they are superior to Adjacency Lists but IMHO Nested Sets are good for querying subtrees and not much else. Querying direct children, deleting nodes, inserting nodes, and moving subtrees are all relatively difficult with Nested Sets. Not to mention that Nested Sets do not maintain referential integrity. Even Path Enumeration is easier despite having some of the same problems.

All that said, I love your writing style and approach - I just wish this article was about Closure Tables instead of Nested Sets.

Evan Petersen 17 April 2012 at 19:56 | #

Hey, thanks for the comment! Nested sets are really good at retrieving all children of a given node. With extra information, like depth, you can make nested sets almost as efficient as parent/child relationships for retrieving information, just not inserting.

Parent/Child relationships break down when you have to grab all children of a root node.

Glad you enjoyed the article! I'll have to take a closer look at closure tables and possibly post about them!

Piegus 14 April 2012 at 03:55 | #

Hey. What about updating sets, or moving one from one hierarchy to another?

Evan Petersen 17 April 2012 at 19:53 | #

I like to think of these operations as moving items on a number line. To move one to another location, you first need to annex some space by increasing the left and right values of every node that currently occupies that space. You are really just shifting everything over, moving the set, and then shifting it all back. I'll have to post an addendum to this article that better covers the moving of sets.

In the meantime, does that analogy make any sense or am I talking like a crazy person?

Connie C. Khan 15 April 2012 at 14:49 | #

Interesting content on the other hand I would like to explain to you that I think there is trouble with your RSS feeds when they appear to not be working for me. Might be just me but I was thinking I would suggest it.

Brandon K 18 April 2012 at 11:05 | #

Why not use recursive CTEs against the parent-child model?

Keep the simple (and normalized) parent-child data structure and get the answers you need from recursive CTEs.

Evan Petersen 18 April 2012 at 16:34 | #

The parent-child data structure works wonderfully for small sets. When you start getting into high traffic websites with deep nesting (think on the order of 1000s of levels) you will run into serious performance issues especially when querying for all children of a given node.

For most cases, a parent-child relationship will be fine, but it will not scale. Recursive solutions tend to be slower than their iterative counterparts.
