Math 342 Class Wiki, Fall 2006

Authors
This document was produced collaboratively by: • Keegan Asper • Justin Barcroft • Shawn Bashline • Paul Bernhardt • Kayla Blyman • Amanda Bonanni • Carolyn Bosserman • Jason Brubaker • Jenna Cramer • Daniel Edwards • Kristen Erbelding • Nathaniel Fickett • Rachel French • Brett Hunter • Kevin LaFlamme • Rebeca Maynard • Katherine Patton • Chandler Sheaffer • Kay See Tan • Brittany Williams
Introduction
This is a hard-copy version of a collaborative document produced during the Fall semester 2006 by students in my combinatorics course. The original hypertext version is currently being served at:
http://pc-cstaecker-2.messiah.edu/~cstaecker/classwiki
Since the original document was produced collaboratively by the students with minimal contributions by the professor, some errors may exist in the text. This hard-copy was translated into LaTeX markup language by a computer program, which may have introduced a few further cosmetic errors. Thanks to all the students for their hard work and a great semester.

Dr. P. Christopher Staecker
Messiah College, 2006
Contents
Arrangements with repetition 11
Arrangements with restricted positions 12
Base case 14
Big O notation 14
Binary search tree 14
Binomial theorem 15
Bipartite graph 17
Circle-chord technique 17
Cliques 18
Color critical 32
Combinatorics 33
Comparing binary search trees 33
Complement 40
Complete graph 41
Computing coefficients 42
Computing coefficients/Examples 44
Coq 45
Counting with Venn diagrams 46
Dijkstra 49
Directed graph 49
Distributions 50
Enumeration 52
Euler 55
Euler’s Formula for Spheres, Toruses and Other Complex Solids 56
Euler cycle 62
Exponential generating function 64
Fibonacci Sequence and Pascal’s Triangle Relationship 66
Fibonacci sequence 67
Forest 68
Four color theorem 69
Four color theorem/Example 2: Chromatic Number of Graph 71
Gamma function 72
Generating function 79
Graph coloring 84
Graph theory 85
Hamilton 87
Hamilton circuit 88
Heawood Conjecture 90
Inhomogeneous recurrence relation 100
Integer partitions 101
Isomorphic algorithms 102
Isomorphism 107
Kuratowski 111
Linear recurrence relation 113
Minimal Recursion Circuits 116
Minimal cost 122
More Coloring Fun 124
Multigraph 124
Network 125
Network Flow 130
Pascal’s Tetrahedron 137
Path 144
Permutations and combinations 146
Permutations and combinations/“SYSTEMS” example 149
Planarity 150
Planarity algorithm 151
Ramanujan 156
Recursion 157
Rook Polynomials 160
Solving recurrence relations with generating functions 161
Spanning tree 164
Sperner’s lemma 166
Spherical and Toroidal Graphs 172
Stirling’s Formula 178
Subdivision 181
The Birthday Paradox 181
The Inclusion/Exclusion Principle 183
Towers of Hanoi 185
Traveling salesman problem 186
Tree 188
Using combinations in statistics 192
Varadarajan example 193
Vertex 194
Where chess meets mathematics 195
Arrangements with repetition
Counting Arrangements
A common enumeration problem one may have to solve deals with how many arrangements can be made from a collection of repeated objects of different types.
Our theorem states: if we have $n$ objects with $r_1$ of type 1, $r_2$ of type 2, ..., $r_m$ of type $m$, where $r_1 + r_2 + \cdots + r_m = n$, then the number of arrangements that can be made is:

$$P(n; r_1, r_2, \ldots, r_m) = \binom{n}{r_1}\binom{n-r_1}{r_2}\binom{n-r_1-r_2}{r_3}\cdots\binom{n-r_1-r_2-\cdots-r_{m-1}}{r_m}$$

An example of when this theorem is useful is counting the number of rearrangements in our previous example, SYSTEMS. Although this theorem may be slightly more tedious, we will reach the same answer. Think of rearranging the word SYSTEMS as counting the arrangements with 3 of letter $S$, 1 of letter $Y$, 1 of letter $T$, 1 of letter $E$, and 1 of letter $M$. Our problem would then look like:

$$P(7; 3, 1, 1, 1, 1) = \binom{7}{3}\binom{4}{1}\binom{3}{1}\binom{2}{1}\binom{1}{1} = 840$$

As you can see, we have arrived at the same conclusion and answer as we did in the “SYSTEMS” example. However, this method proves more useful for problems which contain more than one quantity of various types. For another example, also see the Varadarajan example.

See also: Distributions
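As a quick sanity check, here is a short Python sketch (added for illustration, not part of the original wiki) that evaluates $P(n; r_1, \ldots, r_m)$ both as the product of binomial coefficients in the theorem and directly as $n!/(r_1!\cdots r_m!)$, using the SYSTEMS counts:

```python
from math import comb, factorial

def arrangements(counts):
    """Number of arrangements of n objects with the given multiplicities,
    computed as the product of binomial coefficients from the theorem."""
    n = sum(counts)
    total, remaining = 1, n
    for r in counts:
        total *= comb(remaining, r)   # choose positions for this type
        remaining -= r
    return total

def arrangements_factorial(counts):
    """Same count via the equivalent formula n! / (r1! r2! ... rm!)."""
    denom = 1
    for r in counts:
        denom *= factorial(r)
    return factorial(sum(counts)) // denom

# SYSTEMS: 3 S's and one each of Y, T, E, M
print(arrangements([3, 1, 1, 1, 1]))            # 840
print(arrangements_factorial([3, 1, 1, 1, 1]))  # 840
```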
An Arrangements with Repetition Example: Burgers
How many ways are there to choose 10 burgers of 4 different types?
• You can start off by thinking of it as choosing a certain number of each type, such as 4 of the first type, 3 of the second, 1 of the third, and 2 of the fourth, with the total adding up to 10.

• Then, in order to display this, you can write it as xxxx|xxx|x|xx, where the x’s are the burgers and the pipes (|’s) separate the types.

• Now you can treat this problem like an arrangement, and the new question is how many arrangements we can make with the x’s and the pipes. There are 13 characters: 10 x’s and 3 pipes.

• Now we can represent this problem with $P(13; 10, 3)$.
This example shows us a new theorem:
• The number of ways to choose $r$ things of $n$ types is:

$$C(r + n - 1, r)$$

where $r$ is the number of x’s and $n - 1$ is the number of pipes (|). For the example above, $r = 10$ burgers and $n = 4$ types, so the theorem gives:

$$C(r + n - 1, r) = C(10 + 4 - 1, 10) = C(13, 10),$$

which yields the same answer as $P(13; 10, 3)$.
Arrangements with restricted positions
Board where ’bad’ positions are shaded
If one is counting arrangements with several restrictions, it may be necessary to use a grid diagram in which ’bad’ positions are shaded. In this way, one can count the number of ways to arrange mutually non-capturing rooks within the grid. This might be used to solve problems such as: How many arrangements can be made with the letters a b c d e with:

a not in position 1 or 2
b not in position 1
d not in position 3 or 4
e not in position 3, 4, or 5

Let U = all possible arrangements of the 5 letters
A1 = all arrangements with a ’bad’ letter in position 1
A2 = all arrangements with a ’bad’ letter in position 2 ... Ai = all arrangements with a ’bad’ letter in position i
So, we want $N(\bar{A}_1 \bar{A}_2 \cdots \bar{A}_5)$. We know from The Inclusion/Exclusion Principle that we should find $N(U) - S_1 + S_2 - S_3 + S_4 - S_5$.

$S_1$, the sum over the single sets, in this case signifies all the arrangements ’bad’ in 1 position.

Similarly, $S_k$ counts all arrangements bad in $k$ positions, or the number of ways to pick $k$ bad spots on the board (with no 2 in the same row or column). Given some board $B$, the number of placements of $k$ mutually non-capturing rooks is written $r_k(B)$, called the $k$th rook number of $B$.
Example 1
Find all $r_k$’s for $B_1$:
$r_0(B_1) = 1$: there is always one way to place 0 things.

$r_1(B_1) = 4$: this is the number of spots on the board.

$r_2(B_1) = 3$: this can be counted directly.

$r_3(B_1) = 0$: there is no way to place 3 rooks in $B_1$ without them attacking each other. All other, higher $r_k$ are zero.
Rook Polynomials
The rook polynomial $r(x, B)$ of a board $B$ is the Generating function for $r_k$:

$$r(x, B) = r_0(B) + r_1(B)x + r_2(B)x^2 + \cdots + r_i(B)x^i$$

So, from the previous example, $r(x, B_1) = 1 + 4x + 3x^2$.
Example 2
Find all $r_k$’s for $B_2$:
$r_0(B_2) = 1$

$r_1(B_2) = 20$

$r_2(B_2) = 10 \cdot 9$

$r_3(B_2) = 0$, and all rook numbers higher than $r_3(B_2)$ are zero as well.
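To make the rook numbers concrete, here is a small Python sketch (added for illustration, not from the original wiki) that computes every $r_k$ of a board by brute force, with the board given as a set of (row, column) squares. The board used below is a hypothetical 4-square board chosen so that its rook polynomial matches the $1 + 4x + 3x^2$ computed in Example 1; the actual shape of the class's $B_1$ may differ.

```python
from itertools import combinations

def rook_numbers(board):
    """Return [r_0, r_1, ...]: r_k = number of ways to place k
    mutually non-capturing rooks on the given squares."""
    board = list(board)
    counts = [1]  # r_0 = 1: one way to place zero rooks
    k = 1
    while True:
        r_k = 0
        for squares in combinations(board, k):
            rows = {r for r, c in squares}
            cols = {c for r, c in squares}
            # non-capturing: no two rooks share a row or a column
            if len(rows) == k and len(cols) == k:
                r_k += 1
        if r_k == 0:
            break
        counts.append(r_k)
        k += 1
    return counts

# A hypothetical 4-square board (not necessarily the class's B_1):
B1 = {(0, 0), (0, 1), (1, 0), (1, 2)}
print(rook_numbers(B1))   # [1, 4, 3]  ->  r(x, B1) = 1 + 4x + 3x^2
```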
Base case
A base case is the simplest case of a recurrence relation that can be solved without further recursion. For example, r(1) = 1 is a properly-stated base case. A recurrence relation can have more than one base case if the recurrence relies upon more than one previous term.
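As a small illustration (added here; the recurrence itself is a made-up example), a recursive Python function stops at its base case rather than recursing further:

```python
def r(n):
    """Recurrence r(n) = r(n-1) + 2 with the base case r(1) = 1."""
    if n == 1:          # base case: solved without further recursion
        return 1
    return r(n - 1) + 2

print([r(n) for n in range(1, 6)])  # [1, 3, 5, 7, 9]
```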
Big O notation
Big O notation is a term used in both mathematics and computer science. In computer science it categorizes the complexity of an algorithm. It is often seen as an indicator of the efficiency of an algorithm, as it can roughly predict the time required to complete n iterations or recursions of an algorithm. For example, a given algorithm loops through n rows and adds each one to a sum. The Big O notation for this algorithm would be notated O(n). We would say that the algorithm has “order of n” complexity.
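As an illustration of the loop just described, here is a tiny Python sketch (added here, not from the original wiki); summing $n$ values takes time proportional to $n$, so the function is $O(n)$:

```python
def sum_rows(rows):
    """Add up n rows; the loop body runs once per row, so this is O(n)."""
    total = 0
    for value in rows:   # n iterations
        total += value
    return total

print(sum_rows(range(1, 101)))  # 5050
```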
Common “Orders”
• $O(\log n)$ (logarithmic)
• $O(n)$ (linear)
• $O(n^2)$ (quadratic)
• $O(c^n)$ (exponential)
• $O(n!)$ (factorial)
Binary search tree
A binary search tree is a 2-ary tree the nodes of which have the following properties:
1. The node has a value.
2. The sub-tree with the node’s left child as its root contains nodes with values less than the node.
3. The sub-tree with the node’s right child as its root contains nodes with values greater than the node.
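To make the definition concrete, here is a minimal Python sketch (added for illustration, not part of the original wiki) of a node type obeying the three properties, with a simple insert and search:

```python
class Node:
    """A binary search tree node: left subtree holds smaller values,
    right subtree holds greater values."""
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

    def insert(self, value):
        if value < self.value:
            if self.left is None:
                self.left = Node(value)
            else:
                self.left.insert(value)
        elif value > self.value:
            if self.right is None:
                self.right = Node(value)
            else:
                self.right.insert(value)
        # equal values are ignored in this sketch

    def contains(self, value):
        if value == self.value:
            return True
        child = self.left if value < self.value else self.right
        return child is not None and child.contains(value)

root = Node(8)
for v in [3, 10, 1, 6, 14]:
    root.insert(v)
print(root.contains(6), root.contains(7))  # True False
```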
Binomial theorem
The Binomial Theorem states:

$$(1+x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n$$

(This is also known as Observation 0.)

From the Binomial Theorem, we can also find an exponential generating function: $(1+x)^n$ is an exponential generating function for $P(n,r)$:

$$(1+x)^n = \sum_{r=0}^{\infty} \binom{n}{r} x^r = \sum_{r=0}^{\infty} \frac{n!}{(n-r)!\,r!}\, x^r = \sum_{r=0}^{\infty} \frac{n!}{(n-r)!} \cdot \frac{x^r}{r!} = \sum_{r=0}^{\infty} P(n,r)\,\frac{x^r}{r!}$$

Proof:
First we will look at a specific case, when n = 3.
$$(a+x)^3 = (a+x)(a+x)(a+x) = (a+x)(aa+xa+ax+xx) = aaa+aax+axa+axx+xaa+xax+xxa+xxx$$

This is all words of length 3 on $x$ and $a$. Here, the term $x^i$ corresponds to words with $i$ $x$'s. Thus, the coefficient on $x^i$ is the number of words of length $n$ on $x$ and $a$ using $i$ $x$'s. By the arrangements with repetition theorem, the coefficient on $x^i$ is:

$$P(n; i, n-i) = \binom{n}{i}\binom{n-i}{n-i} = \binom{n}{i}$$

Thus,

$$(1+x)^3 = \binom{3}{0} + \binom{3}{1}x + \binom{3}{2}x^2 + \binom{3}{3}x^3 = \sum_{k=0}^{3} \binom{3}{k} x^k$$

By observation, the same argument extends to $(1+x)^n$.
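The word-counting idea in this proof is easy to check by brute force. Here is a small Python sketch (added for illustration, not part of the original wiki) that enumerates all words of length $n$ on the letters a and x and confirms that the number of words using $i$ x's equals $\binom{n}{i}$:

```python
from itertools import product
from math import comb

def coefficients_by_words(n):
    """Count words of length n over {a, x} by how many x's they use.
    The count for i x's is the coefficient of x^i in (1 + x)^n."""
    counts = [0] * (n + 1)
    for word in product("ax", repeat=n):
        counts[word.count("x")] += 1
    return counts

n = 3
print(coefficients_by_words(n))            # [1, 3, 3, 1]
print([comb(n, i) for i in range(n + 1)])  # [1, 3, 3, 1] -- matches
```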
• Example problem using the Binomial Theorem.
More binomial identities
1. $\displaystyle \binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n} = 2^n$

2. $\displaystyle \binom{n}{0} + \binom{n+1}{1} + \binom{n+2}{2} + \cdots + \binom{n+r}{r} = \binom{n+r+1}{r}$

3. $\displaystyle \binom{r}{r} + \binom{r+1}{r} + \binom{r+2}{r} + \cdots + \binom{n}{r} = \binom{n+1}{r+1}$

4. $\displaystyle \binom{n}{0}^2 + \binom{n}{1}^2 + \binom{n}{2}^2 + \cdots + \binom{n}{n}^2 = \binom{2n}{n}$

5. $\displaystyle \sum_{k=0}^{r} \binom{m}{k}\binom{n}{r-k} = \binom{m+n}{r}$

6. $\displaystyle \sum_{k=0}^{m} \binom{m}{k}\binom{n}{r+k} = \binom{m+n}{m+r}$

7. $\displaystyle \sum_{k=0}^{m} \binom{m-k}{r}\binom{n+k}{s} = \binom{m+n+1}{r+s+1}$
Bipartite graph
an example of a bipartite graph from class
This is a property in graph theory that depends on circuits of even length. Here “length” means the number of edges in the path or circuit, which is how the term is used in combinatorics.
Bipartite Defined:
A graph is bipartite if the graph’s vertices can be partitioned into two sets (we’ll call them V1 and V2) such that all edges go from vertices of V1 to vertices of V2 (and vice versa). If all of the vertices on one side are connected to all of the vertices on the other side and vice versa by edges, it is a complete bipartite graph.
Theorem Involving Bipartite Graphs:
A graph G is bipartite iff every circuit in G has even length.
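One practical consequence of the theorem is that bipartiteness can be tested by attempting a 2-coloring with a breadth-first search; an odd circuit is exactly what makes the coloring fail. Here is a Python sketch (added for illustration, not from the original wiki) for a graph given as an adjacency list:

```python
from collections import deque

def is_bipartite(adj):
    """2-color the graph by BFS; succeeds iff every circuit has even length."""
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]   # opposite side of the partition
                    queue.append(v)
                elif color[v] == color[u]:
                    return False              # an odd circuit exists
    return True

square = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}   # 4-circuit: bipartite
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}            # 3-circuit: not
print(is_bipartite(square), is_bipartite(triangle))     # True False
```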
Circle-chord technique
The circle-chord technique is a technique that checks a graph for planarity. To use this method, first find a circuit in the graph that visits every vertex once. Draw this circuit as a circle, and then insert the remaining edges of the graph as chords connecting vertices around the circle. Below you will see the class example of this technique.
Circle-Chord Technique Example
After creating the circle-chord representation of the graph, you can determine whether the graph is planar. If the chords are able to be drawn inside the circle and/or outside of the circle without crossing each other, the graph is planar. However, if the chords do cross each other, you must try other methods before determining whether or not the graph is nonplanar.
Cliques
Everyone encounters cliques on a regular basis because they are the fabric of society. Social cliques, which are smaller groups of people who all know one another, exist within the larger group of society. Social cliques are simply one of the many applications of cliques within graphs. Cliques within graphs can easily be understood through the analogy of social cliques. The graph is society’s equivalent in this analogy. Within the graph, vertices represent the people and the edges represent relationships between the pairs of people (vertices) which they connect. So if Bill (Vertex B) knows Sue (Vertex S) there will be an edge connecting Vertex B to Vertex S, representing their relationship.
Cliques
Technically, a clique is a complete subgraph within a larger graph. More specifically, a clique is a subset of the graph’s vertices in which every vertex is adjacent to every other vertex of the subset, and which is not contained in any larger subset meeting this same requirement. The resulting subset, together with the edges connecting its vertices to one another, is the clique. To show that a graph contains a clique of a given size we use the clique number. The clique number is the number of vertices contained in the largest clique in the graph, and is written ω(G) for the graph G.
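Clique numbers of small graphs can be computed by brute force. Here is a Python sketch (added for illustration, not part of the original wiki) that checks every subset of vertices, from largest to smallest, and returns the size of the first complete one it finds:

```python
from itertools import combinations

def clique_number(vertices, edges):
    """Return omega(G): the size of the largest complete subgraph.
    edges is a set of frozensets {u, v}."""
    for size in range(len(vertices), 0, -1):
        for subset in combinations(vertices, size):
            # complete: every pair inside the subset is an edge
            if all(frozenset(pair) in edges for pair in combinations(subset, 2)):
                return size
    return 0

verts = [1, 2, 3, 4, 5]
edges = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]}
print(clique_number(verts, edges))  # 3 (the triangle 1-2-3)
```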
Example 1
A graph can have just one clique as shown in the graph, G, below.
ω(G) = 4. The clique is highlighted in red.
Example 2
A graph can also have many cliques of different sizes as shown in the graph, H, below.
ω(H) = 5. The largest clique, made from five vertices, is highlighted in red. There is another clique made of four vertices highlighted in blue. There is also a clique made of three vertices highlighted in green.
Perfect Graphs
Clique numbers are used to determine whether a graph is perfect. To understand what a perfect graph is, an understanding of induced subgraphs is necessary. An induced subgraph is a subgraph created by selecting a collection of vertices from the original graph, and creating a graph from them and every edge present in the original graph that connects one vertex in that set to another vertex in the set. A perfect graph was defined in two ways by Berge. According to Berge a graph, G, could be γ-perfect or α-perfect. A γ-perfect graph, G, has ω(H) = χ(H) for all H which are induced subgraphs of G.
Example of γ-Perfect

γ-perfect graphs can be complicated, but they can also be very simple. Here is a simple example of a perfect graph, shown as a bipartite graph.
Notice that for any set of vertices you may happen to select, you will get a chromatic number and a clique number of either 2 or 1; they will always be the same.
α-Perfect Graphs
To understand an α-perfect graph, some new terms need to be explained. The first is the clique number’s complement, the independence number, written α(G). The independence number is the size of the largest subset of vertices in a graph G that has no edges connecting its members. Since there were edges connecting all of the vertices in the clique in the original graph G, there will be no edges connecting those same vertices in the original graph’s complement, $\bar{G}$, where they form an independent set.
Independence Number Example

Here the clique number in G and the independence number in $\bar{G}$ are shown to be equal.

The graph on the left is the original graph with ω(G) = 4. The edges of the clique are highlighted red, while the vertices in the clique are highlighted green. The graph on the right is the complement of G with α($\bar{G}$) = 4. The vertices involved in the independent set are highlighted green. Notice that these vertices are the same vertices that were involved in the clique in the original graph.
Clique Covering Number

Another important concept to grasp is the clique covering number of a graph. This is the smallest number of cliques needed for every vertex of the graph to be included in at least one of the selected cliques. The clique covering number is notated θ(G), and it is χ(G)’s complement. So, for any graph G, χ(G) = θ($\bar{G}$).

Clique Covering Number Example

Here the chromatic number in G and the clique covering number in $\bar{G}$ are shown to be equal.

The graph on the left is the original graph with χ(G) = 3. The vertices are colored with three different colors, red, blue, and green, showing that the clique covering number and chromatic number really are complements. A chromatic number of 2 would not work because the graph has a clique number of 3, and the clique number is a lower bound on the possible chromatic numbers. In class, this was expressed as, “since the graph has a $K_3$, it must have a minimum chromatic number of 3.” The graph on the right is the complement of the original graph, G, with θ($\bar{G}$) = 3. The edges of the three cliques needed to cover the graph are each colored a different color: the clique in the upper left corner has green edges, the lowest clique has red edges, and the rightmost clique has blue edges. Returning to the topic of perfect graphs, which led to the presentation of all of this terminology: an α-perfect graph G has α(H) = θ(H) for all H which are induced subgraphs of G. Using the definitions of the two types of perfection possible in graphs, there are connections between them that can be made. α-perfect can be written as γ-perfect’s complement. So a graph is perfect if χ($\bar{G}$) = ω($\bar{G}$). This also works in reverse: γ-perfect is α-perfect’s complement.
Perfect Graph Theorem
The Perfect Graph Theorem of Lovász states that the complement of every perfect graph is also perfect. It was proved using the fact that if a graph is α-perfect, then it is also γ-perfect.
Example
You can see that this theorem works on the bipartite graph presented as the γ-perfect example. Here is that graph’s complement to be examined.
The complement will have a chromatic number and a clique number of 3, 2, or 1; whatever the values, the two numbers will always be equal.
The Mycielski Construction
Perfect graphs are a very special case when it comes to the relationship between the chromatic and clique numbers of a graph. The more typical case is when the chromatic number is much higher than the clique number. In fact, there is a graph construction which shows that a graph can have a very large chromatic number with a clique number of two. The construction is called the Mycielski Construction. An important requirement to remember when attempting this construction is that the original graph must be free of triangles. The construction asserts that for every triangle-free graph G, with any chromatic number χ(G), it is possible to create a new triangle-free graph G′ with chromatic number χ(G′) = χ(G) + 1. There is no limit to how large the original chromatic number can be, so the chromatic number of such a graph can be made arbitrarily large while its clique number remains two.
To build the construction, take the original set of vertices, $V = \{v_1, \ldots, v_n\}$, and add an identically sized set of vertices U and a single vertex w. First, connect every vertex in U to w. Second, take every vertex $u_i$ and connect it to the vertex $v_{i+1}$; do not connect $u_n$ to any further vertex. Third, take every vertex $v_i$ and connect it to the vertex $u_{i+1}$; do not connect $v_n$ to any further vertex.
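For readers who want to experiment, here is a Python sketch (added for illustration, not from the original wiki) of the Mycielski construction in its usual general form, in which each new vertex $u_i$ is joined to all of the original neighbors of $v_i$ (on a cycle this reduces to the $v_{i-1}$, $v_{i+1}$ connections described above), and every $u_i$ is joined to w:

```python
def mycielski(adj):
    """General Mycielski construction on a graph given as an adjacency
    list over vertices 0..n-1. Returns the new graph's adjacency list:
    vertices 0..n-1 are the originals, n..2n-1 are the copies u_i,
    and 2n is the new vertex w."""
    n = len(adj)
    new_adj = {v: set(adj[v]) for v in range(n)}
    w = 2 * n
    new_adj[w] = set()
    for i in range(n):
        u = n + i
        new_adj[u] = set()
        for nbr in adj[i]:          # join u_i to every original neighbor of v_i
            new_adj[u].add(nbr)
            new_adj[nbr].add(u)
        new_adj[u].add(w)           # join every u_i to w
        new_adj[w].add(u)
    return new_adj

# The 5-cycle: applying the construction yields the Grotzsch graph (11 vertices).
c5 = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [0, 3]}
g = mycielski(c5)
print(len(g), sum(len(s) for s in g.values()) // 2)  # 11 vertices, 20 edges
```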
Example 1
The steps to the Mycielski Construction are outlined visually below.
The original graph is in the upper left. The edges of the original graph are blue, and the vertices of the original set of vertices, V, are colored orange. Note that the chromatic number of the original graph is 2, just like its clique number. Also illustrated in this graph are the set of vertices, U, along with the vertex, w, which will be needed for the construction. The vertices in U are colored purple, and w is colored black. The graph labeled “1” is the graph that follows the first instruction in the preceding paragraph, connecting every vertex in U to w. The edges that show this are colored red. The graph labeled “2” is the graph that follows the second instruction in the preceding paragraph. The edges that show this are colored green. The graph labeled “3” is the graph that follows the third instruction in the preceding paragraph. The edges that show this are colored aqua. The graph labeled “3” is the Mycielskian Construction of the original graph. Notice that its chromatic number is 3, while its clique number is still 2.
Example 2
The Mycielski Construction for five vertices is known as the Grötzsch graph. It is the smallest triangle-free graph to have a chromatic number of 4. It is illustrated below in the same manner as the last example, using the same color key.
Notice that the original graph’s chromatic number is 3, and after the Mycielski edges and vertices have been added, the new graph’s chromatic number is 4. It has now been shown that the Mycielski Construction works in two different examples, but that does not prove that it works all of the time. Here is the proof that it does, in fact, work in all cases of triangle-free graphs.
Proof
A triangle-free graph G is constructed of a set of vertices V. A graph G′ is to be constructed from G using the Mycielski Construction. As explained earlier, a set of vertices U, identical to V, is added, along with a single vertex w. The vertices in V are connected to other vertices in V, but never in a way that creates a triangle. No two vertices in U are ever connected, because the Mycielski Construction does not instruct for this to happen. No vertex in V will ever be connected to w, because the Mycielski Construction does not indicate that this should happen either. Every vertex in U will be connected to w, however, because the Mycielski Construction requires it. Therefore, the only way that a triangle could be formed is if a vertex in U, $u_i$, is connected to two adjacent vertices in V, $v_j$ and $v_{j+1}$, which is not a part of the instructions to create a Mycielskian. The Mycielski Construction only directs that $u_i$ should be connected to $v_{i-1}$ and $v_{i+1}$, which will always be two non-adjacent vertices. This proves that G′ will not have triangles in it as long as G does not have a triangle in it. Since G being triangle-free is one of the requirements to use the Mycielski Construction, it is safe to assume that G has no triangles. By proving that G′ does not have triangles, it has been proven that the clique number does not increase from G to G′. Now, it must be proven that the chromatic number does increase. If G has chromatic number χ(G) = k, then in G′ the subset of vertices created by combining the existing subsets V and U also has a chromatic number equal to k. Since $u_i$ is not adjacent to $v_i$, the color of $u_i$ can be the same as the color of $v_i$. The color of w must be different from the color of any of the other vertices in G′ because it is connected to every vertex in U, and if the chromatic number of U is less than the chromatic number of V, then the graph G can use fewer colors in its vertex coloring than were originally used, giving it a smaller chromatic number to begin with. This can be proved by working through the coloring in reverse order, starting with w in G′ and working towards the outside of the graph. If a chromatic number is found for G′, then a lower chromatic number for G can be found. If the color of w is t, then the coloring of U must be done in (t − 1) colors. $u_i$ and $v_i$ will be adjacent to the same two vertices in V. So if U is colored in (t − 1) colors, and V is allowed to be colored with all t colors again, then a subset, C, can be taken from V consisting of the vertices in V that have the color t. The members of C keep the same subscripts as they had in V. Every member of C can be taken, one at a time, matched with its corresponding vertex in U, and colored the same as that vertex. So $c_i = v_i$, and $c_i$ can therefore be colored the same color as $u_i$. This will result in the set V being colored in (t − 1) colors, which is the same as the original graph, G, being colored in (t − 1) colors. This shows that a smaller number of colors could be used to color G, proving that χ(G′) > χ(G).
Vertex Coloring in the Mycielski Construction
Here are the previous two examples of the Mycielski Construction, illustrating the vertex coloring of the original graphs, G and H, and the graphs created by the Mycielski Construction, G′ and H′.
Notice that w must be a different color in both cases, G′ and H′, than any other vertex in the graph.
Clique Graphs
Forming clique graphs is an interesting thing to do with graphs containing cliques. A clique graph is a graphical representation of the cliques in a graph and their connections with one another. In the clique graph, every clique from the original graph is represented by a vertex. If the clique intersects another clique in the original graph, that is, they share at least one vertex, then the vertices representing those cliques will have an edge between them in the clique graph, representing that intersection.
Example 1
The graph on the left is the original graph, which consists of three cliques, each a different color. The vertices that are a part of an intersection are colored yellow, while all the rest are black. In the graph on the right there is a vertex for every clique, totaling three vertices. The vertices are the same color as the clique they represent from the original graph. The lines in the graph to the right connect the vertices which represent the cliques connected by a yellow vertex, so the yellow vertices have become the edges.
Example 2
An interesting case of this is when the clique number of a graph is 2.
Notice that the clique graph is essentially the same shape as the original graph. The clique graph has the same number of vertices and edges as the original had. The edges have become vertices, and the vertices have become edges. This is illustrated through the use of a different color for every vertex and edge, which remains consistent from the original graph on the left to the clique graph on the right.
Conclusion
In conclusion, cliques appear in more places in our lives than just social ones. One simple example of this is the game of Uno. Following are some pictures of some Uno cards acting as vertices for a graph. Wild cards and other non-numeric cards have been omitted to keep the graph from getting too complex. The edges of the graph connect cards that have relations between them. The relations of interest are those which make play of either card on the other legal. For example, all of the yellow cards have edges between them, making a clique of yellow cards. Also, all of the “1”s have edges between them, making a clique of “1”s.
Exercises
1. Do a Mycielski construction on a graph with six vertices. Comment on the relationship of the chromatic and clique number of the original graph in relation to those numbers of your resulting graph.
2. What is the clique number of the graph below?
3. Create a clique graph for the graph below.
4. Give an example of a perfect graph.
5. What is the clique covering number of the graph below?
Color critical
A graph is color critical if the removal of any vertex of the graph decreases the chromatic number. By removing the vertex, you also remove all the edges that connect to that vertex. 33
Graph G
Examples: The graph G is color critical. (It has a chromatic number χ = 3, but the removal of vertex B or D gives χ = 2.)
Combinatorics
Combinatorics is a crazy word that rhymes with “Dabombinatorics” and “Your- mominatorics”.
Comparing binary search trees
Binary search trees (BSTs) are used extensively in computer science as building blocks for data structures with a higher level of abstraction. Computer scientists use BSTs because they store data compactly and can be searched or sorted efficiently. Exactly how efficiently depends upon the structure of the tree.
Insertion and search efficiency
To insert a new value into a BST, one must travel down a path from the root, comparing the new value to the values of the existing nodes. If the new value is less than the value of any given node, the new value is next compared to the left child of that node. If it is greater, it is compared to the right child. Eventually, the new value is inserted as a leaf. This same basic process occurs when searching. Therefore, the height of the BST will be the main determining factor in the efficiency of the algorithm. A BST of minimal height is a balanced binary tree. James Fill has shown that, given a random set of values, a balanced tree has a high probability of occurring[35]. So the height of the tree is in the range of log n, where n is the number of elements in the tree. Therefore, in the average case, searching or inserting in a BST is O(log n). The worst-case efficiency is less optimal. In the worst case, a BST ends up looking like a list: each node has only a single child. Searches and insertions in such a tree occur in O(n) time. Since all useful data has groupings and patterns, computer scientists have developed methods for keeping a BST balanced and efficient.
Red-black tree
The red-black tree was invented in 1972 by Rudolf Bayer.[57] He called his data structure the symmetric binary B-tree. It was later revised and renamed in 1978 by Leo J. Guibas and Robert Sedgewick. A red-black tree maintains balance by introducing a new property to each node, essentially a boolean value, that identifies the node as either red or black. Red-black trees have the following properties in addition to those already imposed by the definition of a BST:
1. Every node is either red or black.
2. The root node is black.
3. All leaves are black (any node without two child nodes is assigned a black “null” node in each empty position).
4. Both children of every red node are black.
5. All paths from any given node to any descendant leaf nodes contain the same number of black nodes.
Insertion
In order to keep a tree balanced, certain operations must be performed on the tree after every insertion. In a red-black tree, insertion itself occurs as it would in any BST, with the exception that the inserted node is by default colored red. After insertion, a balance operation is performed on the inserted node. This operation does one of five things depending upon the properties of the immediate “family” of the inserted node.
Rotation

Rotations in both of these BSTs follow the general form for rotations: in a right rotation, the node p upon which the rotation is performed becomes the right child of its existing left child c (in a left rotation, p becomes the left child of its existing right child c). The spot under p that c vacates is then filled by the displaced grandchild g (c’s original right subtree in a right rotation, its left subtree in a left rotation).
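Here is a small Python sketch of the two rotations described above (added for illustration; it is not taken from the wiki's own JavaScript implementation). Nodes are plain objects with left and right references, and each function returns the new root of the rotated subtree:

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def rotate_right(p):
    """p becomes the right child of its left child c;
    c's old right subtree g fills the spot that c vacated under p."""
    c = p.left
    p.left = c.right   # g moves under p
    c.right = p        # p moves under c
    return c           # c is the new root of this subtree

def rotate_left(p):
    """Mirror image: p becomes the left child of its right child c."""
    c = p.right
    p.right = c.left
    c.left = p
    return c

# Right rotation on the subtree rooted at 4, whose left child is 2:
root = Node(4, left=Node(2, Node(1), Node(3)))
root = rotate_right(root)
print(root.value, root.left.value, root.right.value, root.right.left.value)  # 2 1 4 3
```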
Case 1 The new node is the root of the tree; it’s the first node we added. In this case we re-color the node black.
Case 2 The parent of the inserted node is black. We don’t need to do anything, as no rules have been broken.
Case 3 The parent of the inserted node and the uncle (the other child of the new node’s grandparent) are both red. This would violate rule 4, as our new node has a red parent. If it weren’t for rule 5, we could just re-color either the new node or the parent of the new node black. However, this would mean that the path to the leaves underneath the new node would have more black nodes in it than the path to any leaves under the uncle. So, the solution is to re-color both the parent and the uncle of the new node black, and make the grandparent red. Since this could conceivably break a rule higher up in the tree, we must now balance the grandparent.
Case 4 The new node’s parent is red but its uncle is black. Additionally, either the parent is a left child and the new node is a right child, or the parent is a right child and the new node is a left child. This also violates rule 4. The solution here is to perform a rotation. We perform a left rotation on the parent if the new node is a right child and a right rotation on the parent if the new node is a left child. The conclusion of this case does not fully fix the tree; therefore, we apply Case 5 to what was the parent node (but is now the child of the new node).
Case 5 The new node’s parent is red and its uncle is black. Also, both the new node and the parent are either right or left children. This violates rule 4. The first step towards reconciliation is to re-color the parent black and the grandparent (if it exists) red. Additionally, we must perform a rotation on the grandparent. We rotate right if the node and its parent are left children and left if the node and its parent are right children.
Randomized search trees (randomized treaps)
The randomized search tree was invented in 1989 by Cecilia R. Aragon and Raimund G. Seidel.[3] This data structure is a modified version of something commonly called a treap because it maintains its balance by behaving not only like a tree, but also like a heap. In practice this means the following:
1. Every node has an assigned “priority”.
2. Every descendant of a node has a priority greater than the node. This puts the treap in min-order. (Note: in his treatment of the subject, Seidel usually arranges his treaps in max-order.)[22]
Insertion
Treap insertion, like insertion in a red-black tree, follows the same algorithm as any BST. The priority of the node is assigned upon insertion. In order to make our treap randomized, we assign a random priority. After inserting the node, the treap is balanced according to heap rules applied to the random priority. This has the same effect as having a tree of totally random data. Since randomized data entry minimizes the height of the tree (see “Insertion and search efficiency,” above), new insertions as well as searches are guaranteed to have worst case efficiency O(log n). This improves efficiency over entering non-random data into a tree without balancing. To balance the treap, we compare the inserted node’s priority to the priority of the parent node. If the new node’s priority is less and it is a left child, we rotate right. If the new node’s priority is less and it is a right child, we rotate left. After any rotation, we must compare the node’s priority to its new parent, as rotation has moved it up one level in the tree. In this way we recursively balance the node until the tree has become a proper heap.
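Here is a compact Python sketch of randomized-treap insertion (added for illustration; the wiki's own implementation is the JavaScript linked later in this article). It does an ordinary BST insert, assigns a random priority, and rotates the new node upward while its priority is smaller than its parent's, keeping the treap in min-order:

```python
import random

class TreapNode:
    def __init__(self, key):
        self.key = key
        self.priority = random.random()   # random priority keeps the tree balanced on average
        self.left = None
        self.right = None

def rotate_right(p):
    c = p.left
    p.left, c.right = c.right, p
    return c

def rotate_left(p):
    c = p.right
    p.right, c.left = c.left, p
    return c

def insert(node, key):
    """Ordinary BST insert; rotations then restore min-heap order on priorities.
    Returns the (possibly new) root of this subtree."""
    if node is None:
        return TreapNode(key)
    if key < node.key:
        node.left = insert(node.left, key)
        if node.left.priority < node.priority:      # child outranks parent
            node = rotate_right(node)
    else:
        node.right = insert(node.right, key)
        if node.right.priority < node.priority:
            node = rotate_left(node)
    return node

root = None
for k in [5, 2, 8, 1, 9, 3]:
    root = insert(root, k)
print(root.key)   # varies from run to run, since priorities are random
```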
Evaluating insertion algorithms
When determining which tree to use, one must find some way to compare the algorithms used in each type of tree. The most obvious comparison is that of performance (usually the amount of time it takes to complete n operations). The first and most obvious indicator of performance is the number of iterations required in an algorithm. For both of these algorithms, the main source of iterations is the path taken from the root to the leaf where the node is initially added to the tree. Therefore, the height of the tree is of primary significance when comparing computational efficiency. 37
Treap height
The function used to categorize the height of a treap is based in probability and is beyond the scope of this paper. However, it can be shown that the probability of the height of a randomized search tree being greater than logarithmic is exceedingly low. Therefore, the height of the tree can be said to be O(log n).[22]
Red-black tree height
Lemma: The height of a red-black tree is at most $2\log_2(n + 1)$.

Proof

We start by claiming that the sub-tree rooted at any node x has at least $2^{bh(x)} - 1$ internal nodes, where $bh(x)$ is the number of black nodes on a path from x down to a leaf, not counting x itself (this is known as the black-height of the node).

Claim: any sub-tree has $i(x) \ge 2^{bh(x)} - 1$ internal nodes.

Base case: $bh(\text{leaf}) = 0$ (there are $2^0 - 1 = 0$ internal nodes in the sub-tree of a leaf).

Inductive case:
• Suppose that every node x with height $h(x) = k$ satisfies $i(x) \ge 2^{bh(x)} - 1$.

• Then, for the claim to hold, we must show that any node x′ with $h(x') = k + 1$ satisfies $i(x') \ge 2^{bh(x')} - 1$.

• If $h(x') > 0$, then x′ is an internal node.

• Each child of x′ has black-height $bh(x')$ (if that child is red) or $bh(x') - 1$ (if that child is black).

• So each child has $i(\text{child}) \ge 2^{bh(x') - 1} - 1$ internal nodes.

• Therefore, $i(x') \ge \left(2^{bh(x') - 1} - 1\right) + \left(2^{bh(x') - 1} - 1\right) + 1$.

• Simplifying, we get $i(x') \ge 2^{bh(x')} - 1$.
Having found the number of internal nodes, we can find the limits on the height of the tree. According to rule 4, both children of every red node are black. Conversely, every red node must have a black parent. Therefore, at least half of the nodes on any path from root to leaf must be black. This means that $bh(root) \ge \frac{h(root)}{2}$.
Let $i(root) = n$. Then:

$$n \ge 2^{bh(root)} - 1 \ge 2^{h(root)/2} - 1$$
$$\log_2(n + 1) \ge \frac{h(root)}{2}$$
$$h(root) \le 2\log_2(n + 1)$$
Thus, the height of a red-black tree is O(log n), as well.
Other considerations
While the height of a treap can be expected to be logarithmic, and the height of a red-black tree is no more than 2 times that, there are other factors to consider. So far all we have done is insert the new node; no balancing activities have taken place. In our red-black tree, we never have more than 2 rotations per insertion (see cases 4 and 5 above). We also have no more than m re-colorings averaged over a sequence of m insertions.[57] However, in a treap, we can have up to O(log n) rotations per insertion. This is because it is possible for the randomly assigned priority of the inserted node to be smaller than any other node in the tree. In this case, the rotations would occur recursively until the node is the root. Still, our complexity is O(log n) (as n goes to infinity, the doubling of the iterations becomes insignificant). Although the complexity of the insert operations on both trees is comparable, one would likely see a difference in practice. Firstly, the cost of recursively rotating in a treap is much higher than the cost of a constant number of rotations and re-colorings in the red-black tree. This is particularly a cause for concern when the tree is part of a larger data structure that would require recalculation after every rotation. Secondly, while it is true that the theoretical height of a randomized treap is less than the red-black tree, small data sets often behave differently. In smaller data sets the red-black tree tends to be more balanced as it is not subject to the whims of random priorities. In general, the red- black tree is seen as an improvement on the randomized treap as it has more predictable height (and hence, more predictable asymptotic running time) and requires fewer expensive rotations.
Improving efficiency in randomized treaps
Suppose a start-up company with a new search engine wished to provide a simple dictionary definition every time a user searched for a term. Additionally, assume that the search for the definition would use a treap. There would be certain terms that would be searched upon many times an hour, and other terms that might be searched upon once or twice a year. Using our completely randomized binary search tree, both the likely term and the unlikely term would have the same chance of being at the top or the bottom of the tree. If by chance the popular search term happened to be at the bottom of the tree, the search would have sub-optimal performance. One way of optimizing this search algorithm would be to analyze the search data from, say, the past six months, and produce a continuous probability distribution for the data. This distribution would enable the company to weight the data entering the tree with regard to its likelihood of being retrieved. Doing so would create a treap where the most-used data is at the top. This could significantly speed up the majority of the searches. Seidel and Aragon have shown that such a treap would have a search time of $O\!\left(1 + \log \frac{W}{w(x)}\right)$, where W is the sum of all of the weights in the treap and w(x) is the weight of the piece of data being searched for.[22]
Conclusions
After having analyzed insertion and searching in two different BSTs, are there any conclusions that can be drawn about these trees? First, a word of caution: insertion and searching aren’t the only operations that can be done on a tree. There are also deletions, splits, and merges to consider. That being said, the following observations can be made:
1. A red-black tree is more rigid than a treap. Probability plays much less of a role in the red-black tree, so it is easier to predict its asymptotic behaviors.
2. A treap is more flexible than a red-black tree. Because the priorities determine the depth of a datum in the treap, the makeup of a treap can be tweaked simply by making subtle adjustments to the priorities. With this knowledge, data sets with less-than-random usage patterns can be handled very efficiently.
3. Red-black trees make a good base for general-use libraries. The red-black tree is a good general-use performer when nothing is known beforehand about the makeup of the data.
4. Treaps are useful for specialized custom data structures. Knowing a little bit about the usage of the data can make a treap very efficient, so it is a good choice when building a custom data structure for a specific purpose.
An implementation
The page http://pc-cstaecker-2.messiah.edu/~cstaecker/jason/Trees.htm contains my own implementation of both a red-black tree and a randomized treap. Values entered in the form at the top are added to both trees simultaneously.

Click the “Balance” button that appears after a value is added to run the balance algorithms on the trees (if you don’t balance before you add another element, the code calls the balance algorithms automatically). The visual representation of the tree relies upon tables, where each cell contains a node. The cell directly above any node contains the node’s parent. Hence, the first row contains one cell, the second two cells, the third four cells, and so on. Source code for the red-black tree can be found at http://pc-cstaecker-2.messiah.edu/~cstaecker/jason/RedBlack.js and source for the treap can be found at http://pc-cstaecker-2.messiah.edu/~cstaecker/jason/Treap.js. Each implementation contains a class definition for the tree as a whole and another class definition for nodes.
Exercises
1. Suppose you are tasked with implementing search functionality on a state government’s website. You analyze a year’s worth of usage statistics and come up with a continuous probability distribution for the search terms entered by users. However, the oversight committee argues that analyzing the search terms once a year is not often enough – your search function must react dynamically if a search term suddenly becomes more popular. Describe a method for dynamically updating your treap so that the time it takes to search for a term decreases as it gains popularity.
2. A certain red-black tree has 32,767 nodes. If it has worst-case height and each binary comparison takes 1 one-thousandth of a second, how long will it take in the worst case to find a value in the tree?
3. Another consideration when comparing trees is the amount of computer memory required to store the tree. Assuming that null children are not represented by null-valued nodes but are actually empty pointers, which of the two trees discussed here is more space-efficient and why? (See Red-black tree definition above for explanation of null nodes).
See the bibliography section.
Complement
A complement of some set A in some universe U is written as $\bar{A}$. $\bar{A}$ is the set of all things in U not in A. This is illustrated in the diagram below. The purple portion of the diagram represents the set A and the blue portion represents the complement of A, $\bar{A}$.
Complete graph
The complete graphs $K_3$, $K_4$, and $K_5$, respectively
K6
A complete graph is a set of vertices and edges in which every vertex is adjacent to every other vertex. The notation for a complete graph on n vertices is $K_n$. A bipartite graph can also be complete. A complete bipartite graph is notated as $K_{n,m}$, where n and m are the number of vertices in each of the two sets.
To determine the number of edges in $K_n$ we would use the formula $\sum d = 2e$, where d is the degree of each vertex and e is the number of edges in the graph. In a complete graph the degree of each vertex is $n - 1$ (each vertex has an edge connecting it to every other vertex). The sum of the degrees is then $n(n - 1)$, so the number of edges in a complete graph on n vertices is $\frac{n(n-1)}{2}$.
Uses of complete graphs
When determining whether or not a graph is planar, check whether the graph in question contains a $K_{3,3}$ configuration or a $K_5$ configuration: a graph is nonplanar exactly when it contains one of these. This is known as Kuratowski’s Theorem.
Computing coefficients
Basically, these are tricks that are based on simple polynomial identities.
Observation 0
$$(1+x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n$$

(This is actually the Binomial Theorem.)
Observation 1
$$1 + x + x^2 + \cdots + x^m = \frac{1 - x^{m+1}}{1 - x}$$
Proof:
$$(1 - x)(1 + x + \cdots + x^m) = (1 + x + x^2 + \cdots + x^m) - (x + x^2 + x^3 + \cdots + x^{m+1}) = 1 - x^{m+1}$$
Observation 1.5
$$1 + x + x^2 + \cdots = \frac{1}{1 - x}$$
Proof:
$$(1 - x)(1 + x + x^2 + \cdots) = (1 + x + x^2 + \cdots) - (x + x^2 + x^3 + \cdots) = 1$$
Observation 2
$$\frac{1}{(1-x)^n} = 1 + \binom{1+n-1}{1}x + \binom{2+n-1}{2}x^2 + \cdots$$
Proof:
By 1.5,

$$\frac{1}{(1-x)^n} = \left(\frac{1}{1-x}\right)^n = (1 + x + x^2 + \cdots)^n$$

The coefficient on $x^r$ in this product is the number of ways to choose r things from n types, so the coefficient on $x^r$ is $\binom{r+n-1}{r}$, which is what we want.

Class Example involving Observations 1 and 2
Observation 3
$$(1 - x^m)^n = 1 - \binom{n}{1}x^m + \binom{n}{2}x^{2m} - \binom{n}{3}x^{3m} + \cdots + (-1)^n\binom{n}{n}x^{nm}$$
Proof:
• The binomial theorem is $(1+x)^n = \sum_{k=0}^{n}\binom{n}{k}x^k$.

• If we substitute $-x^m$ for $x$, then $(1 + (-1)x^m)^n = \sum_{k=0}^{n}\binom{n}{k}(-1)^k x^{mk}$.

• Written out, this is $(-1)^0\binom{n}{0}x^{0 \cdot m} + (-1)^1\binom{n}{1}x^{1 \cdot m} + (-1)^2\binom{n}{2}x^{2m} + \cdots$

• Which, if you squint, will look familiar (it equals the right-hand side of the observation above).
Observation 4
$$(a_0 + a_1x + a_2x^2 + \cdots)(b_0 + b_1x + b_2x^2 + \cdots) = a_0b_0 + (a_1b_0 + a_0b_1)x + (a_2b_0 + a_1b_1 + a_0b_2)x^2 + \cdots + (a_rb_0 + a_{r-1}b_1 + \cdots + a_0b_r)x^r + \cdots$$

Observations Example Combining All Observations
Computing coefficients/Examples
Find the number of ways to collect $15 total from 20 people, where each of the first 19 people gives $0 or $1 and the last person gives $0, $1, or $5.
$$e_1 + e_2 + \cdots + e_{20} = 15 = r$$
$$e_1, \ldots, e_{19} \in \{0, 1\}$$
$$e_{20} \in \{0, 1, 5\}$$

Our generating function is $(1 + x)^{19}(1 + x + x^5)$. We want the coefficient on $x^{15}$, so by the Binomial Theorem we can expand our generating function as shown below:

$$(1 + x)^{19}(1 + x + x^5) = \left(1 + \binom{19}{1}x + \binom{19}{2}x^2 + \cdots + \binom{19}{19}x^{19}\right)(1 + x + x^5)$$

The coefficient on $x^{15}$ is $a_{15}b_0 + a_{14}b_1 + \cdots + a_0b_{15}$. All the $b$'s are 0 except $b_0 = b_1 = b_5 = 1$, so we get

$$a_{15} + a_{14} + a_{10} = \binom{19}{15} + \binom{19}{14} + \binom{19}{10}$$
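A quick way to check an answer like this is to multiply the polynomials directly. Here is a small Python sketch (added for illustration, not part of the original wiki) that builds the coefficient list of $(1+x)^{19}(1+x+x^5)$ and compares the coefficient of $x^{15}$ with the binomial-coefficient answer above:

```python
from math import comb

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = power)."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result

# Build (1 + x)^19 as a coefficient list, then multiply by (1 + x + x^5).
poly = [1]
for _ in range(19):
    poly = poly_mul(poly, [1, 1])
poly = poly_mul(poly, [1, 1, 0, 0, 0, 1])

print(poly[15])                                      # coefficient of x^15
print(comb(19, 15) + comb(19, 14) + comb(19, 10))    # same number
```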
Another Example
Here is another problem that we can solve using Observations 1-4. Find the coefficient of $x^{25}$ in $(x^2 + x^3 + x^4 + x^5 + x^6)^7$. We can begin solving the problem by factoring an $x^2$ out of the sum, which, raised to the seventh power, gives $x^{14}(1 + x + x^2 + x^3 + x^4)^7$.
We can use Observation 1 to put $(1 + x + x^2 + x^3 + x^4)$ into the form $\frac{1 - x^5}{1 - x}$. This step simplifies the expression to $x^{14}\left(\frac{1 - x^5}{1 - x}\right)^7$.

By separating the fraction so that the expression is $x^{14}(1 - x^5)^7 \frac{1}{(1-x)^7}$, we can apply Observations 2 and 3.

Observation 3 shows that $(1 - x^5)^7 = 1 - \binom{7}{1}x^5 + \binom{7}{2}x^{10} - \cdots - \binom{7}{7}x^{35}$.

Observation 2 shows that $\frac{1}{(1-x)^7} = 1 + \binom{1+7-1}{1}x + \binom{2+7-1}{2}x^2 + \cdots$.

When we combine these pieces of the expression, we get

$$x^{14}\left(1 - \binom{7}{1}x^5 + \binom{7}{2}x^{10} - \cdots - \binom{7}{7}x^{35}\right)\left(1 + \binom{1+7-1}{1}x + \binom{2+7-1}{2}x^2 + \cdots\right)$$

To find the coefficient on $x^{25}$, we can determine the coefficient of $x^{11}$ on

$$\left(1 - \binom{7}{1}x^5 + \binom{7}{2}x^{10} - \cdots - \binom{7}{7}x^{35}\right)\left(1 + \binom{1+7-1}{1}x + \binom{2+7-1}{2}x^2 + \cdots\right)$$

since this expression will be multiplied by $x^{14}$ at the end.

From Observation 4, we know that the coefficient on $x^{11}$ equals $a_{11}b_0 + a_{10}b_1 + \cdots + a_1b_{10} + a_0b_{11}$.

In the first piece of the expression, the only nonzero values of $a$ among these are $a_{10}$, $a_5$, and $a_0$, which simplifies the sum for the coefficient to

$$a_{10}b_1 + a_5b_6 + a_0b_{11}$$