
CIT208: Algorithms II

By

Dr. E.K. Olatunji

Computer Science Programme College of and Communication Studies Bowen University, Iwo, Osun State, Nigeria

March 2020

1 Data

• References
• 1. btechsmartclass.com/ds
• 2. www.cs.cmu.edu/
• 3. Computer Science by CS French
• 4. tutorialpoint.com

Tree

• A tree data structure (TDS) is a collection of data items (nodes) organized in a hierarchical structure.
• In a TDS, every element is called a node.
• A node stores the actual data of that element and links to the next elements in the hierarchical structure.
• It is a non-linear data structure, unlike arrays, linked lists and stacks.
• Searching for an element in a well-organized tree (for example, a balanced binary search tree) is typically much faster than searching an unsorted array or linked list.
• A sample TDS is shown in the next slide.

Sample TDS

A

B C D

E K F G

L

• Figure XX1: A sample TDS

Application of TDS

• Storing information that naturally forms a hierarchy, e.g. the file system on a computer. Folders in an OS are organized using a TDS.
• In a compiler, every expression (arithmetic, logical, etc.) is converted into a syntax tree during compilation, e.g. the syntax tree of (a+b)*c shown in the next slide.
• It is also used in auto-correctors and spell checkers.
• Assignment
• Describe clearly 5 more applications of TDS

Application of TDS Contd

• Syntax trees of algebraic expressions

      (i) (a+b)*c          (ii) a+b*c

            *                    +
           / \                  / \
          +   c                a   *
         / \                      / \
        a   b                    b   c

• Fig xx2: Syntax trees for (i) (a+b)*c and (ii) a+b*c

Tree Terminology

• An example of a tree

• Figure xx3: Example of a tree data structure for illustration
• Source: http://btechsmartclass.com/data_structures/tree-terminology.html

Tree Terminology Contd

• Root Node
– This is the first node in a TDS. It is the origin of the tree
– There is only one root node in every tree
– In the sample TDS above (Fig xx3), A is the root node
• Edge
– The connecting link or line between any 2 nodes
• Parent
– Any node that has at least one node directly below it is called a parent node; the root is also a parent whenever it has children. In the above TDS, B is the parent of D, E and F
• Child
– A node directly below a given node, connected to it by a downward edge. E.g. D, E and F are children of B; I and J are children of E
• Sibling
– Nodes with the same parent
• Leaf
– A node without any child node. Also called a terminal node or external node, e.g. D, J, H
• Internal Node
– A node that has at least one child. Also called a non-terminal node

Tree Terminology Contd

• Degree
– This is the total number of children a node has. The highest degree among all nodes in a tree is called the degree of the tree
– E.g. the degrees of A and C are 2, the degree of B is 3 and the degree of D is zero. The degree of the tree is 3
• Level
– The level of a node represents the generation of the node. The root node is at level 0, the children of the root are at level 1, the children of the nodes at level 1 are at level 2, etc.
– E.g. A is at level 0, B and C are at level 1, while I and J are at level 3
• Height
– The height of a node is the total number of edges on the longest path from that node down to a leaf. The height of the root node is the height of the tree.
– The height of any leaf node is zero. In the sample tree, the height of A is 3, which is the height of the tree; the heights of B and C are 2, while the height of K is zero.

Tree Terminology Contd
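The degree, level and height definitions above can be checked with a short Python sketch. The tree is written as a parent-to-children mapping that follows the worked examples in the text; because the figure itself is not reproduced, the placement of H and K under C and the node name G are assumptions made only for illustration.

```python
# Sample tree as a parent -> children mapping.
# Assumption: G (child of C, parent of K) is a hypothetical node name, and
# H and K are assumed to sit under C; the other links follow the text.
TREE = {
    "A": ["B", "C"],
    "B": ["D", "E", "F"],
    "C": ["G", "H"],
    "E": ["I", "J"],
    "G": ["K"],
}
ROOT = "A"


def children(node):
    """Return the children of a node (an empty list for a leaf)."""
    return TREE.get(node, [])


def degree(node):
    """Degree of a node = number of children it has."""
    return len(children(node))


def height(node):
    """Height = number of edges on the longest path from node down to a leaf."""
    kids = children(node)
    if not kids:
        return 0
    return 1 + max(height(kid) for kid in kids)


def level(node, current=ROOT, current_level=0):
    """Level = number of edges from the root to node (the root is at level 0)."""
    if current == node:
        return current_level
    for kid in children(current):
        found = level(node, kid, current_level + 1)
        if found is not None:
            return found
    return None  # node is not in this sub-tree


print(degree("A"), degree("B"), degree("D"))   # 2 3 0
print(level("A"), level("C"), level("J"))      # 0 1 3
print(height("A"), height("B"), height("K"))   # 3 2 0
```

The degree of the whole tree is then the largest value of `degree(n)` over all nodes, which is 3 here.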

• Depth
– In a TDS, the total number of edges from the root node to a particular node is called the depth of that node. The total number of edges from the root node to a leaf node on the longest path is the depth of the tree.
– The depth of the root node is zero. In the diagram, the depth of the tree is 3; the depths of I, J and K are 3, but the depth of C is 1.
• Path
– In a TDS, the sequence of nodes and edges from one node to another is called the path between the 2 nodes. E.g. the path between A and J in the diagram above is A-B-E-J, and the path between B and J is B-E-J.
– The length of a path is taken here as the total number of nodes in that path, so the path A-B-E-J has length 4. (Counting edges instead, as the depth and height definitions do, gives 3.)
• Sub Tree
– This is a node together with all its descendants, the node itself being a child of another node (possibly the root node).
– For example, in the diagram, B and all its descendants form a sub-tree; likewise, E and C form sub-trees with their descendants
• Traversing
– This is passing through the nodes in a specified order

Types of Tree DS

• A General Tree
– This is a tree in which each node may have zero or more children.
– It is used to model applications such as file systems
– Its diagram here
• Binary tree (BT)
– It is a special case of a general tree
– In a BT, each node cannot have more than 2 child nodes
– Its diagram here
• Full Binary Tree
– This is a BT in which each node has exactly zero or 2 children.
– In a full BT, there is no node with exactly one child
– Its diagram here
• Complete Binary Tree
– A complete BT is one in which every level is completely filled, with the possible exception of the bottom level, which is filled from left to right.
– Its diagram here

Binary Search Tree (BST)

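• Binary Search Tree (BST)
– It is a binary tree
– Every value in the left sub-tree of a node must be smaller than the node’s value, and every value in the right sub-tree must be greater. In particular, a left child holds a smaller value than its parent and a right child holds a greater value than its parent. For e.g:
– ** its Diagram here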
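The BST ordering rule can be expressed as a small checker. This is only a sketch: the `Node` class, the function name `is_bst` and the sample values are hypothetical, chosen for illustration. It passes down the range of values each sub-tree is allowed to hold, so that a value is compared against all of its ancestors and not just its parent.

```python
class Node:
    """A binary tree node (hypothetical helper for this sketch)."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def is_bst(node, low=None, high=None):
    """Return True if every value in the tree lies strictly between low and high."""
    if node is None:
        return True
    if low is not None and node.value <= low:
        return False
    if high is not None and node.value >= high:
        return False
    # Left sub-tree must stay below node.value, right sub-tree above it.
    return (is_bst(node.left, low, node.value)
            and is_bst(node.right, node.value, high))


valid = Node(8, Node(3, Node(1), Node(6)), Node(10))
# 9 is greater than its own parent 3, but it sits in the left sub-tree
# of 8, so the tree as a whole is not a BST.
invalid = Node(8, Node(3, Node(1), Node(9)), Node(10))

print(is_bst(valid))    # True
print(is_bst(invalid))  # False
```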

Constructing a BST

• We assume the following data are to be entered into a tree in this order: 27, 14, 35, 10, 31, 19, 42
• i) 27 is the first data item to be placed into the tree, so its node is the root node, depicted below:

        27

• ii) Next we add 14 to the tree, using the rule ‘lower value to the left of the parent, higher value to the right of the parent’. Since 14 is less than its parent, 27, it goes to the left, and the tree now looks like this:

        27
       /
     14

• iii) The next item is 35, which is larger than the root, so we add it to the right of its parent, and the tree becomes:

        27
       /  \
     14    35

• iv) We continue like this, and the final tree is as shown here:

          27
        /    \
      14      35
     /  \    /  \
   10   19  31   42

• Figure yy1: A BST

Constructing a BST Contd
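The construction steps for 27, 14, 35, 10, 31, 19, 42 can be sketched in Python as follows. The names `Node`, `insert`, `search` and `in_order` are illustrative, and sending duplicate values to the right is an assumption, since the slides do not cover duplicates.

```python
class Node:
    """A BST node holding one value and links to two children."""

    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(root, value):
    """Insert value into the BST rooted at root and return the (possibly new) root."""
    if root is None:
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        # Assumption: duplicates (not covered in the text) go to the right.
        root.right = insert(root.right, value)
    return root


def search(root, value):
    """Return True if value is stored in the BST."""
    while root is not None:
        if value == root.value:
            return True
        root = root.left if value < root.value else root.right
    return False


def in_order(root):
    """List the values: left sub-tree, then the node, then right sub-tree."""
    if root is None:
        return []
    return in_order(root.left) + [root.value] + in_order(root.right)


root = None
for item in [27, 14, 35, 10, 31, 19, 42]:
    root = insert(root, item)

print(root.value, root.left.value, root.right.value)  # 27 14 35
print(in_order(root))                      # [10, 14, 19, 27, 31, 35, 42]
print(search(root, 19), search(root, 20))  # True False
```

Listing the values in-order is a convenient check that the tree was built correctly, because the result should come out sorted.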

• Note:
• The leftmost node contains the smallest value, while the rightmost node contains the largest value.
• An important property of a BST is that an in-order traversal always visits the nodes of the tree in sorted (ascending) order

Basic Operation on a BST

• Insert
• Search
• Traversal:
– Pre-order – visits a node before its left and right sub-trees
– In-order – visits the left sub-tree, then the node, then the right sub-tree
– Post-order – visits a node after its left and right sub-trees

Traversals in BST

• This is the process of visiting all the nodes of a tree in order to process them, e.g. printing all the values
• There are 3 standard traversal orders, each defined by when a node’s value is processed relative to when the values of its sub-trees are processed
• The 3 traversals are:
– Pre-order: processes the value in the current node first, then its left and right sub-trees (NLR)
– In-order: processes the left sub-tree first, then the value of the current node, then the right sub-tree (LNR)
– Post-order: processes the left and right sub-trees first, then the value in the current node (LRN)
• Thus the traversal order is determined by when a node’s value is processed compared to when the values in its sub-trees are processed: pre (before), in (in between) and post (after)

Traversals in BST

A

B C

D E F G

• Figure xx4: Sample tree illustrating traversal orders In-order Traversal

• In-order:
– Left sub-tree first, root next, finally right sub-tree (LNR)

        A
      /   \
     B     C
    / \   / \
   D   E F   G

– For example, for the tree in Figure xx4, an in-order traversal produces: DBEAFCG
– Used when we need an increasingly ordered list of the values stored in a BST

Pre-order Traversal

• Pre-order
• NLR: parent/root node first, then left sub-tree and finally right sub-tree

        A
      /   \
     B     C
    / \   / \
   D   E F   G

• E.g. for the tree in Figure xx4, a pre-order traversal produces: ABDECFG
• Can be used to produce the prefix form (Polish notation) of an expression from its expression tree, by traversing the tree in pre-order

Post order Traversal

• Post order
– LRN: left sub-tree first, then right sub-tree and then root/parent node last

        A
      /   \
     B     C
    / \   / \
   D   E F   G

– E.g. for the tree in Figure xx4, a post-order traversal produces: DEBFGCA

Post order Traversal
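The three traversal orders can be sketched as three short recursive functions. The sketch below builds the seven-node sample tree of Figure xx4 by hand; the `Node` class and function names are illustrative only.

```python
class Node:
    """A binary tree node (hypothetical helper for this sketch)."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def pre_order(node):
    """NLR: node first, then left sub-tree, then right sub-tree."""
    if node is None:
        return []
    return [node.value] + pre_order(node.left) + pre_order(node.right)


def in_order(node):
    """LNR: left sub-tree first, then node, then right sub-tree."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


def post_order(node):
    """LRN: left sub-tree first, then right sub-tree, then node."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]


# The sample tree: A has children B and C; B has D and E; C has F and G.
tree = Node("A",
            Node("B", Node("D"), Node("E")),
            Node("C", Node("F"), Node("G")))

print("".join(in_order(tree)))    # DBEAFCG
print("".join(pre_order(tree)))   # ABDECFG
print("".join(post_order(tree)))  # DEBFGCA
```

The only difference between the three functions is where `[node.value]` appears relative to the two recursive calls.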

• Post order
– Can be used to generate the postfix representation of an expression stored in a binary tree
– If a tree represents the structure of an arithmetic expression, with nodes representing operators and sub-trees representing operands, then a post-order traversal is best suited to evaluating it (operands must be evaluated before their associated operator)
– Assignment1
• What are the outputs of traversing the BST constructed in Figure yy1 (slide 15?)?
– Assignment2?

Tree DS
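As an illustration of post-order traversal on an expression tree, the sketch below builds the syntax tree of (a+b)*c, lists it in postfix form and evaluates it. The `Node` class, the `OPS` table and the sample values for a, b and c are assumptions made for this example.

```python
import operator

# Supported binary operators (an illustrative subset).
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


class Node:
    """Expression tree node: an operator with two children, or an operand leaf."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def postfix(node):
    """Post-order (LRN) listing of the tree, i.e. the postfix form."""
    if node is None:
        return []
    return postfix(node.left) + postfix(node.right) + [node.value]


def evaluate(node, env):
    """Evaluate both operand sub-trees first, then apply the operator."""
    if node.value in OPS:
        left = evaluate(node.left, env)
        right = evaluate(node.right, env)
        return OPS[node.value](left, right)
    return env[node.value]  # a leaf: look up the variable's value


# Syntax tree of (a+b)*c: '*' at the root, '+' as its left child.
expr = Node("*", Node("+", Node("a"), Node("b")), Node("c"))

print(" ".join(postfix(expr)))                   # a b + c *
print(evaluate(expr, {"a": 2, "b": 3, "c": 4}))  # 20
```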

• End of Lecture?

GRAPH DATA STRUCTURE

• It is a non-linear DS
• It contains a set of points known as nodes (or vertices) and a set of links or lines known as edges (or arcs), which connect the vertices
• Thus a graph can be defined as a collection of vertices (nodes) and arcs (edges) which connect the vertices
• Generally, a graph G is represented as G = (V, E), where V is a set of vertices and E is a set of edges

Graph DS Contd

• Example • The following is a graph with 5 vertices and 7 edges

A B E

C D

• Figure 1:
• The points A, B, C, D and E are nodes. Each line is an edge
• This graph can be defined as G = (V, E), where V = {A, B, C, D, E} and E = {(A,B), (A,C), (A,D), (B,D), (C,D), (B,E), (D,E)}

Graph DS

• Application of Graph DS
• Finding a route from one location to another;
– Can you give me directions from here to the market?
– What prerequisites must I have to be promoted to the next class?
– What is the shortest route between Iwo and Ogbomosho? Etc.
– ** look for 5 more applications
• Examples of Graph
• A social graph, which represents people as vertices and relationships among them as edges, e.g.

      A --- father of --- B

• A road network, which tracks places/locations/towns as vertices and roads as edges

      Lagos ------ Iwo

• An airline network, which tracks cities as vertices and flights between cities as edges

• A curricular dependency graph, which tracks prerequisites (the edges) between courses (the vertices)

Categories of graph

• Directed Graph
– Each edge points from one node to the other, e.g.
– One-way roads, which allow travel in one direction only
– A direct flight from one city to another does not necessarily imply that there is also a direct return flight
• Undirected Graph
– Edges connect vertices in both directions, without pointing in any particular direction
– All the graphs shown before this slide are examples
• Weighted Graph
– Each edge has a cost or value associated with it
– A road network is weighted, recording the distance between locations, e.g.

      Lagos --- 203 km --- Iwo

– An airline network might record the price of a ticket or the distance
• Unweighted graph
– No value is associated with the edges
– E.g. social graphs, curriculum dependency graphs

Graph terminologies

• Vertex
– This is an individual data item in a graph. It is also known as a node
– In the example above, A, B, C, D and E are vertices
• Edge
– This is a connecting link between 2 vertices. It is also known as an arc
– E.g. the link between vertices A and B is represented as (A,B). In the graph above, there are 7 edges, i.e. (A,B), (A,C), (A,D), (B,D), (B,E), (C,D), (D,E)
• Undirected graph
– The edges do not point in any direction
– A connected undirected graph with n vertices must have at least n-1 edges
– A connected undirected graph with n vertices and exactly n-1 edges cannot contain a cycle
– A connected undirected graph with n vertices and more than n-1 edges must contain at least one cycle
• Directed graph
– This is a graph with only directed edges, i.e. edges that each point in one direction
– An example is

      Lagos ----> Iwo

Graph terminologies

• Origin & Destination of edges
– If an edge is directed, its first endpoint is said to be its origin and the other endpoint its destination.
– For example, Lagos is the origin and Iwo is the destination in the sample directed graph above
• Adjacent vertices
– If there is an edge between 2 vertices A and B, then A and B are said to be adjacent. E.g. A and B are adjacent vertices, and so are A and C, in the graph of Figure 1
• Degree of a vertex
– The total number of edges connected to a vertex is called the degree of the vertex.
– E.g. in Figure 1, the degrees of vertices A and E are 3 and 2 respectively

Graph Terminologies Contd
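Adjacency and degree can be computed directly from the edge set of the Figure 1 graph. The sketch below is a minimal illustration; the function names are hypothetical, and it assumes an undirected graph with no self-loops.

```python
# The graph of Figure 1, written as G = (V, E).
VERTICES = {"A", "B", "C", "D", "E"}
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"),
         ("C", "D"), ("B", "E"), ("D", "E")]


def adjacent(u, v):
    """Two vertices are adjacent if an edge joins them (in either order)."""
    return (u, v) in EDGES or (v, u) in EDGES


def degree(vertex):
    """Degree = number of edges that have vertex as an endpoint.

    Assumption: the graph has no self-loops.
    """
    return sum(1 for a, b in EDGES if vertex in (a, b))


print(adjacent("A", "B"), adjacent("A", "C"), adjacent("A", "E"))  # True True False
print(degree("A"), degree("E"), degree("D"))                       # 3 2 4
```

Adding up the degrees of all vertices gives 14, twice the number of edges, because every edge contributes to the degree of both of its endpoints.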

• Self-loop
– An edge (undirected or directed) is a self-loop if its 2 endpoints coincide
• Path
• Simple graph
– A graph is simple if it has no parallel edges and no self-loops
• Sub-graph
– This is a graph that consists of a subset of a graph’s vertices and a subset of its edges. For e.g,
• Complete graph

Operations on Graph as ADT

• Test whether a graph is empty
• Get the number of vertices in a graph
• Get the number of edges in a graph
• See whether an edge exists between 2 given vertices
• Insert a vertex into a graph whose vertices have distinct values that differ from the new vertex
• Insert an edge between 2 given vertices in a graph

Graph Representation
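The ADT operations listed above can be sketched as a small Python class. This is one possible design, not a prescribed one: the class and method names are hypothetical, the graph is assumed to be undirected and simple, and it is stored as a dictionary mapping each vertex to the set of its neighbours. A helper also derives a 0/1 adjacency matrix from that storage.

```python
class Graph:
    """Undirected simple graph stored as a dict of neighbour sets."""

    def __init__(self):
        self._adj = {}

    def is_empty(self):
        return len(self._adj) == 0

    def num_vertices(self):
        return len(self._adj)

    def num_edges(self):
        # Each undirected edge appears in two neighbour sets.
        return sum(len(nbrs) for nbrs in self._adj.values()) // 2

    def has_edge(self, u, v):
        return u in self._adj and v in self._adj[u]

    def add_vertex(self, v):
        # Vertices must be distinct, so reject a value already present.
        if v in self._adj:
            raise ValueError(f"vertex {v!r} already exists")
        self._adj[v] = set()

    def add_edge(self, u, v):
        if u not in self._adj or v not in self._adj:
            raise KeyError("both endpoints must already be in the graph")
        if u == v:
            raise ValueError("self-loops are not allowed in a simple graph")
        self._adj[u].add(v)
        self._adj[v].add(u)

    def adjacency_matrix(self):
        """Rows and columns follow the sorted vertex order; 1 means an edge."""
        order = sorted(self._adj)
        return [[1 if col in self._adj[row] else 0 for col in order]
                for row in order]


g = Graph()
print(g.is_empty())  # True
for name in "ABCD":
    g.add_vertex(name)
for u, v in [("A", "C"), ("A", "D"), ("B", "D"), ("C", "D")]:
    g.add_edge(u, v)

print(g.num_vertices(), g.num_edges())         # 4 4
print(g.has_edge("A", "C"), g.has_edge("A", "B"))  # True False
for row in g.adjacency_matrix():
    print(row)
```

For an undirected graph the resulting matrix is symmetric, since an edge from row vertex to column vertex is also an edge in the other direction.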

• There are 3 main ways to represent a graph:
– Adjacency matrix (2D)
– Adjacency array or vector
– Adjacency list
• Adjacency matrix
– In this matrix, rows and columns both represent vertices
– It is filled with either 0 or 1.
– 1 indicates there is an edge from the row vertex to the column vertex, and 0 indicates there is no edge from the row vertex to the column vertex

Adjacency Matrix contd

• The adjacency matrix for the graph below

          A  B  C  D
      A   0  0  1  1
      B   0  0  0  1
      C   1  0  0  1
      D   1  1  1  0

      A ----- C
        \     |
         \    |
      B ----- D

• The graph of the above adjacency matrix
• Quiz:
– Represent the given graph G(V,E) in each of the 3 methods,
– Draw the graph whose adjacency matrix is given above

Adjacency List

Adjacency (1-D) Array/vector

Hash Table Data Structure

• Refs:
• TutorialPoint.com
• Basics of hash table by Prateek Garg (www.hackerearth.com), retrieved 19-11-2018
• etc

Hash Table Data Structure

• Hash table
– It is a DS that is used to store data as key/value pairs
– In this DS, data is stored in an array, where each data value has its own index value that tells where in the array the data element is stored
– It uses a hashing technique to generate the index where a data value is to be stored or located
– Insertion and search operations are very fast on average, irrespective of the size of the data stored in the hash table
– Under reasonable assumptions, the average time required to search for an element in a hash table is constant, i.e. O(1).
– Compare: linear search on an unsorted array takes O(n) and binary search on a sorted array takes O(log n)!

Hash Table Data Structure Contd

• Sample Hash Table storing Nigeria State Capitals

      Index   Values
      0       Abia
      1       Sokoto
      2       Kano
      3       Osun
      4       …
      6       …

Hash Table Data Structure Contd

• Hashing Technique
– A technique used to uniquely identify a specific object from a group of similar objects, e.g.
– Student matric number;
– Vehicle plate number
– With respect to the hash table DS, hashing is a technique used to transform a range of data values into a range of indexes of an array, where the data values are to be stored or located
– It uses a formula, called a hash function, to carry out the transformation or mapping
• Hash function
– Any function that can be used to map a data value to an index of the hash table (ideally a unique one)
– The value it returns is variously called the hash code, hash value, etc.

Hash Table Data Structure Contd

• Example of Hash function
– Assume we have a hash table of size 20. We can use the modulo operator as the hash function, i.e.
– Index (or hash) = Data-value modulo Array-size. Thus
– 42 will be stored in location 42 modulo 20 = 2
– 4 will be stored in location 4 modulo 20 = 4
– 17 will be stored in location 17 modulo 20 = 17
– 37 will be stored in location 37 modulo 20 = 17
• Collision
– Occurs when 2 data values hash to the same index
– As can be observed in the example above, 17 and 37 hash to the same index!
• Another Example of hash function
– To store string data values, such as those in the sample hash table shown in the earlier slides
– Index number (or hash code) = Sum of the ASCII codes of the characters modulo Array-size

Hash Table Data Structure Contd
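Both example hash functions can be sketched in a few lines of Python. The function names and the use of the string "Osun" as a test value are illustrative choices; `ord` returns the character code, which coincides with the ASCII code for plain English letters.

```python
TABLE_SIZE = 20  # size of the hash table in the worked example


def hash_int(value, size=TABLE_SIZE):
    """Index = data value modulo array size."""
    return value % size


def hash_str(text, size=TABLE_SIZE):
    """Index = sum of the character codes modulo array size."""
    return sum(ord(ch) for ch in text) % size


for value in (42, 4, 17, 37):
    print(value, "->", hash_int(value))
# 42 -> 2
# 4 -> 4
# 17 -> 17
# 37 -> 17   (collides with 17)

# 'O' = 79, 's' = 115, 'u' = 117, 'n' = 110; the sum is 421 and 421 % 20 = 1.
print(hash_str("Osun"))  # 1
```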

• Properties of a Good hash function • Easy to compute • Uniform distribution • Less collision • A good hash function should be modulo-ed with a (e.g; 17, 29, 5990, etc) instead of being modulo-ed with the size of the hash table – This approach reduces occurrence of collision • Collision Resolution techniques – Separate chaining – – Quadratic probing – • Assignment: • Describe each of the techniques with example; • Due date – In a weeks time Hash Table Data Structure Contd
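As one illustration of collision resolution, the sketch below shows separate chaining: each array slot holds a list (a "chain") of all key/value pairs that hash to that slot. The class name `ChainedHashTable`, its methods, and the restriction to integer keys hashed with the modulo function are assumptions made for this example.

```python
class ChainedHashTable:
    """Hash table for integer keys that resolves collisions by chaining."""

    def __init__(self, size=20):
        # One empty chain (list) per slot of the array.
        self._buckets = [[] for _ in range(size)]

    def _index(self, key):
        # Assumption: keys are integers, hashed with key modulo table size.
        return key % len(self._buckets)

    def put(self, key, value):
        bucket = self._buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # key already present: overwrite
                return
        bucket.append((key, value))       # new key: add to the chain

    def get(self, key):
        for existing_key, value in self._buckets[self._index(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)

    def chain_length(self, index):
        return len(self._buckets[index])


table = ChainedHashTable()
table.put(17, "seventeen")
table.put(37, "thirty-seven")  # 37 % 20 == 17, so this collides with key 17

print(table.get(17), table.get(37))  # seventeen thirty-seven
print(table.chain_length(17))        # 2
```

Both colliding keys remain retrievable because a lookup scans the short chain in the slot instead of assuming the slot holds a single value.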

• Implementation of Hash Functions
– Programmers do not need to write complex hash functions. Most PLs already provide good hash functions, such as:
– Java – hashCode()
– C# – GetHashCode()
– Python – hash()
– C++ – std::hash
• Applications of Hash table
– Implementation of
– Implementation of memory
– Used as disk-based data structures and database indexing
– Implementation of objects in PL like , JavaScript
– Hash functions are used in various algorithms to speed up their execution

Hash Table Data Structure Contd

• Quizzes
• What is hashing?
• Describe the hash table data structure
• What are the attributes of a good hash function?
• Mention different techniques that can be employed by hash functions
• Given the key 384724, which technique will produce 9 as the index in the hash table?
• List and describe collision resolution techniques
• Explain 5 applications of hash tables
• List 4 properties of resolving hash collision
• Assume: Hash code = (sum of the ASCII codes of the characters of a string) modulo 29. Where will the string ‘ABC’ be stored in the hash table? N.B.: the ASCII codes for A, B and C are 65, 66 and 67 respectively