<<

UNIT 6C Organizing Data: Trees and Graphs

15110 Principles of Computing, 1 Carnegie Mellon University

Last Lecture

• Hash tables – Using hash function to map keys of different data types to array indices – Constant search time if the load factor is small • Associative arrays in Ruby

15110 Principles of Computing, 2 Carnegie Mellon University

1 This Lecture

• Non‐linear data structures – Trees, specifically, binary trees – Graphs

15110 Principles of Computing, 3 Carnegie Mellon University

Trees

15110 Principles of Computing, 4 Carnegie Mellon University

2 Trees

• A is a hierarchical . – Every tree has a node called the root. – Each node can have 1 or more nodes as children. – A node that has no children is called a leaf. • A common tree in computing is a . – A binary tree consists of nodes that have at most 2 children. – A complete binary tree has the maximum number of nodes on each of its levels. • Applications: data compression, file storage, game trees

15110 Principles of Computing, 5 Carnegie Mellon University

Binary Tree

84

41 96

24 50 98

13 37

Which one is the root? Which ones are the leaves? Is this a complete binary tree? What is the height of this tree?

15110 Principles of Computing, 6 Carnegie Mellon University

3 Binary Tree

84

41 96

24 50 98

13 37

The root has the data value 84. There are 4 leaves in this binary tree: 13, 37, 50, 98. This binary tree is not complete. This binary tree has height 3. 15110 Principles of Computing, 7 Carnegie Mellon University

Binary Trees: Implementation

• One common implementation of binary trees uses nodes like a linked list does. – Instead of having a “next” pointer, each node has a “left” pointer and a “right” pointer.

45 Level 1

31 70 Level 2

19 38 86 Level 3

15110 Principles of Computing, 8 Carnegie Mellon University

4 Using Nested Arrays

• Languages like Ruby do not let programmers manipulate pointers explicitly. • We could use Ruby arrays to implement binary trees. For example: [left,45,right]

45 Level 1 [[left,31,right],45,[left,70,right]] 31 70 Level 2

[[[nil,19,nil],31,[nil,38,nil]],45, 19 38 86 Level 3

[nil,70,[nil,86,nil]] ] nil stands for an empty tree Arrows point to subtrees 15110 Principles of Computing, 9 Carnegie Mellon University

Using One Dimensional Arrays

• We could also use a flat array.

45 Level 1

31 70 Level 2

19 38 86 Level 3 45 31 70 19 38 86

Level 1 Level 2 Level 3

15110 Principles of Computing, 10 Carnegie Mellon University

5 Dynamic Set Operations

• Insert • Delete • Search • Find min/max • ...

15110 Principles of Computing, 11 Carnegie Mellon University

Binary Search Tree (BST)

• A (BST) is a binary tree such that – All nodes to the left of any node have data values less than that node – All nodes to the right of any node have data values greater than that node

15110 Principles of Computing, 12 Carnegie Mellon University

6 Inserting into a BST

• For each data value that you wish to insert into the binary search tree: – Start at the root and compare the new data value with the root. – If it is less, move down left. If it is greater, move down right. – Repeat on the child of the root until you end up in a position that has no node. – Insert a new node at this empty position.

15110 Principles of Computing, 13 Carnegie Mellon University

Example

• Insert: 84, 41, 96, 24, 37, 50, 13, 98

84

41 96

24 50 98

13 37

15110 Principles of Computing, 14 Carnegie Mellon University

7 Using a BST

• How would you search for an element in a BST? 84

41 96

24 50 98

13 37

15110 Principles of Computing, 15 Carnegie Mellon University

Searching a BST

• For the key (data value) that you wish to search – Start at the root and compare the key with the root. If equal, key found. – Otherwise • If it is less, move down left. If it is greater, move down right. Repeat search on the child of the root. • If there is no non‐empty subtree to move to , then key not found.

15110 Principles of Computing, 16 Carnegie Mellon University

8 Exercises

• How you would find the minimum and maximum elements in a BST? • What would be output if we walked the tree in left‐node‐right order?

15110 Principles of Computing, 17 Carnegie Mellon University

Max‐Heaps

• A max‐heap is a binary tree such that – The largest data value is in the root – For every node in the max‐heap, its children contain smaller data. – The max‐heap is an almost‐complete binary tree. • An almost‐complete binary tree is a binary tree such that every level of the tree has the maximum number of nodes possible except possibly the last level, where its nodes are attached as far left as possible.

15110 Principles of Computing, 18 Carnegie Mellon University

9 These are not heaps! Why?

84 68

41 56 53 26

24 10 38 72 30

13 7

15110 Principles of Computing, 19 Carnegie Mellon University

Adding data to a Max‐Heap

Insert new data into next available tree position so the tree remains (almost) complete. 84

Exchange data with its parent(s) (if necessary) 4162 3. 56 until the tree is restored to a heap. 24 106241 13 38

2. 13 7 6210 1. insert here

15110 Principles of Computing, 20 Carnegie Mellon University

10 Building a Max‐Heap

• To build a max‐heap, just insert each element one at a time into the heap using the previous algorithm.

15110 Principles of Computing, 21 Carnegie Mellon University

Removing Data from a Max‐Heap

Remove root (maximum value) and replace it with "last" node of the lowest level. 2. 841062

Exchange data with its larger child (if necessary) 621041 3. 56 until the tree is restored to a heap. 24 4110 13 38

1. move last node to root 13 7 10

15110 Principles of Computing, 22 Carnegie Mellon University

11 BSTs vs. Max‐Heaps

• Which tree is designed for easier searching?

• Which tree is designed for retrieving the maximum value quickly?

• A heap is guaranteed to be “balanced” (complete or almost‐complete). What about a BST?

15110 Principles of Computing, 23 Carnegie Mellon University

BSTs vs Max‐Heaps • BST with n elements – Insert and Search: • worst case O(log n) if tree is “balanced” • worst case O(n) in general since tree could have one node per level • Max‐Heap with n elements – Insert and Remove‐Max • worst case O(log n) since tree is always “balanced” – Find‐Max • worst case O(1) since max is always at the root

15110 Principles of Computing, 24 Carnegie Mellon University

12 Graphs

15110 Principles of Computing, 25 Carnegie Mellon University

15110 Principles of Computing, 26 Carnegie Mellon University

13 Graphs

• A graph is a data structure that consists of a set of vertices and a set of edges connecting pairs of the vertices. – A graph doesn’t have a root, per se. – A can be connected to any number of other vertices using edges. – An edge may be bidirectional or directed (one‐way). – An edge may have a weight on it that indicates a cost for traveling over that edge in the graph. • Applications: computer networks, transportation systems, social relationships

15110 Principles of Computing, 27 Carnegie Mellon University

Undirected and Directed Graphs

0 7 0 7 6 6 2 4 2 4 2 1 1 9 3 3 5 5 3 3

to to from 0123 from 0123 0 0675 0 0675 1 604∞ 1 ∞ 04∞ 2 7403 2 2 ∞ 03 3 5 ∞ 30 3 ∞∞90

15110 Principles of Computing, 28 Carnegie Mellon University

14 Graphs in Ruby

0 7 6 4 2 1 3 5 3 inf = 1.0/0.0 to 0123 from graph = 0 0675 [ [ 0, 6, 7, 5 ], 1 604∞ [ 6, 0, 4, inf ], 2 7403 [ 7, 4, 0, 3], 3 5 ∞ 30 [ 5, inf, 3, 0] ]

15110 Principles of Computing, 29 Carnegie Mellon University

An Undirected Weighted Graph to 12 2 7 from 0123456 1 6 0 010∞ 87∞∞ 5 10 7 3 1 10 0 12 7 ∞∞∞ 4 3 8 5 0 9 6 2 ∞ 12 0 6 ∞ 75 7114 3 876094∞ 4 7 ∞∞90∞ 11 01234565 ∞∞74∞ 03 Pitt. Erie Will. S.C. Harr. Scr. Phil. 6 ∞∞5 ∞ 11 3 0 vertices edges

15110 Principles of Computing, 30 Carnegie Mellon University

15 Original Graph

12 Will. 7 Erie 7 6 Scr. 10 4 8 S.C. 5 3 Pitt 9 Phil. 7 Harr. 11

15110 Principles of Computing, 31 Carnegie Mellon University

A Minimal

Will. Erie 7 Scr. 4 8 S.C. 5 3 Pitt Phil. 7 Harr.

The minimum total cost to connect all vertices using edges from the original graph without using cycles. (minimum total cost = 34)

15110 Principles of Computing, 32 Carnegie Mellon University

16 Original Graph

12 Will. 7 Erie 7 6 Scr. 10 4 8 S.C. 5 3 Pitt 9 Phil. 7 Harr. 11

15110 Principles of Computing, 33 Carnegie Mellon University

Shortest Paths from Pittsburgh

14 10 Will. 12 Erie 6 Scr. 10 4 8 S.C. 8 3 Pitt Phil. 7 7 Harr. 15

The total costs of the shortest from Pittsburgh to every other location using only edges from the original graph.

15110 Principles of Computing, 34 Carnegie Mellon University

17 Graph Algorithms

• There are algorithms to compute the minimal spanning tree of a graph and the shortest paths for a graph. • There are algorithms for other graph operations: – If a graph represents a set of pipes and the number represent the maximum flow through each pipe, then we can determine the maximum amount of water that can flow through the pipes assuming one vertex is a “source” (water coming into the system) and one vertex is a “sink” (water leaving the system) – Many more graph algorithms... very useful to solve real life problems.

15110 Principles of Computing, 35 Carnegie Mellon University

18