Data Structures and Algorithms s1

DATA STRUCTURES AND ALGORITHMS

UNIT I - Linear Structures

Objective  To understand the concept of data structures  The explain various data structures like linked list, stack and queue with their operations  To discuss about various applications of stacks and queue

Data structures Data structure is defines as the way of organizing the data that considers not only the items stored and also the relationships to each other. Types  Linear data structures  Non linear data structures

1. Abstract Data Type (ADT)

A set of data values and associated operations that are precisely specified independent of any particular implementation. It is also known as ADT.

2. List ADT A list is a sequential data structure, ie. a collection of items accessible one after another beginning at the head and ending at the tail.

 It is a widely used data structure for applications which do not need random access

 Addition and removals can be made at any position in the list

 lists are normally in the form of a1,a2,a3.....an. The size of this list is n.The first element of the list

is a1,and the last element is an.The position of element ai in a list is i.

 List of size 0 is called as null list.

Basic Operations on a List

Data Structures-P.Vasantha Kumari,L/IT 1  Creating a list

 Traversing the list

 Inserting an item in the list

 Deleting an item from the list

 Concatenating two lists into one

Implementation of List: A list can be implemented in two ways

1. Array list 2. Linked list

3. Array Implementation of List

 This implementation stores the list in an array.

 The position of each element is given by an index from 0 to n-1, where n is the number of elements.

 The element with the index can be accessed in constant time (ie) the time to access does not depend on the size of the list.

 The time taken to add an element at the end of the list does not depend on the size of the list. But the time taken to add an element at any other point in the list depends on the size of the list because the subsequent elements must be shifted to next index value.So the additions near the start of the list take longer time than the additions near the middle or end.

 Similarly when an element is removed,subsequent elements must be shifted to the previous index value. So removals near the start of the list take longer time than removals near the middle or end of the list.

Problems with Array implementation of lists

 Insertion and deletion are expensive. For example, inserting at position 0 (a new first element) requires first pushing the entire array down one spot to make room, whereas deleting the first

Data Structures-P.Vasantha Kumari,L/IT 2 element requires shifting all the elements in the list up one, so the worst case of these operations is O(n).

 Even if the array is dynamically allocated, an estimate of the maximum size of the list is required. Usually this requires a high over-estimate, which wastes considerable space. This could be a serious limitation, if there are many lists of unknown size.

 Simple arrays are generally not used to implement lists. Because the running time for insertion and deletion is so slow and the list size must be known in advance

4. Linked list implementation

A Linked list is a chain of structs or records called Nodes. Each node has at least two members, one of which points to the next Node in the list and the other holds the data. These are defined as Single Linked Lists because they can only point to the next Node in the list but not to the previous.

The list can be made just as long as required. It does not waste memory space because successive elements are connected by pointers. The position of each element is given by an index from 0 to n-1, where n is the number of elements. The time taken to access an element with an index depends on the index because each element of the list must be traversed until the required index is found. The time taken to add an element at any point in the list does not depend on the size of the list, as no shifts are required Additions and deletion near the end of the list take longer than additions near the middle or start of the list. because the list must be traversed until the required index is found.

Types of Linked List

 Singly Linked List  Doubly Linked List  Circular Linked List Data Structures-P.Vasantha Kumari,L/IT 3 Operation on Linked List(List ADT)

1.is_last() int is_last( position p, LIST L ) { return( p->next == NULL );

}

This procedure is used to check whether the position p is an end of the linked list L

2. Isempty() int is_empty( LIST L ) { return( L->next == NULL );

}

This is used to check whether the linked list is empty or not. If the header of linked list is NULL then return TRUE otherwise return FALSE.

3.Insert()

void insert( element_type x, LIST L, position p ) { position tmp_cell; tmp_cell = (position) malloc( sizeof (struct node) ); tmp_cell->element = x; tmp_cell->next = p->next; p->next = tmp_cell;

Data Structures-P.Vasantha Kumari,L/IT 4 }

This procedure is used to insert an element x at the position p in the linked list L. This is done by

 Create a new node as tmpcell  Assign the value x in the data filed of tmpcell  Identify the position p and assign the address of next cell of p to the pointer filed of tmpcell  Assign the address of tmpcell to the address filed of p

4. Delete() void delete( element_type x, LIST L ) { position p, tmp_cell; p = find_previous( x, L ); if( p->next != NULL ) { tmp_cell = p->next; p->next = tmp_cell->next; free( tmp_cell ); } }

This is used to delete an element x from the linked list L. This is done by

 Identify the element x in the linked list L using findpreviuos() function Data Structures-P.Vasantha Kumari,L/IT 5  Name the node x as tmpcell  Assign the address of next cell of x to the previous cell.  Remove the node x

5.Find() position find ( element_type x, LIST L ) { position p; p = L->next; while( (p != NULL) && (p->element != x) ) p = p->next; return p; }

This is used to check whether the element x is present in the linked list L or not. This is done by searching an element x from the beginning of the linked list. If it is matched then return TRUE, else move the next node and continue this process until the end of the linked list. If an element is matched with any of the node then return FALSE.

Array versus Linked Lists 1.Arrays are suitable for - Randomly accessing any element. - Searching the list for a particular value - Inserting or deleting an element at the end. 2.Linked lists are suitable for -Inserting/Deleting an element. -Applications where sequential access is required. -In situations where the number of elements cannot be predicted beforehand.

Data Structures-P.Vasantha Kumari,L/IT 6 5. Cursor based Linked List Implementation

Many languages, such as BASIC and FORTRAN, do not support pointers. If linked lists are required and pointers are not available, then an alternate implementation must be used. The alternate method we will describe is known as a cursor implementation. The two important items present in a pointer implementation of linked lists are

1. The data is stored in a collection of structures. Each structure contains the data and a pointer to the next structure. 2. A new structure can be obtained from the system's global memory by a call to malloc and released by a call to free.

Figure : Example of a cursor implementation of linked lists

Operation on Linked List

1.is_last()

Data Structures-P.Vasantha Kumari,L/IT 7 int is_last( position p, LIST L) /* using a header node */ { return( CURSOR_SPACE[p].next == 0

}

This procedure is used to check whether the position p is an end of the linked list L

2. Isempty() int is_empty( LIST L ) /* using a header node */ { return( CURSOR_SPACE[L].next == 0

}

This is used to check whether the linked list is empty or not. If the header of linked list is NULL then return TRUE otherwise return FALSE.

3.Insert() void insert( element_type x, LIST L, position p ) { position tmp_cell; tmp_cell = cursor_alloc( ); if( tmp_cell ==0 ) fatal_error("Out of space!!!"); else { CURSOR_SPACE[tmp_cell].element = x; CURSOR_SPACE[tmp_cell].next = CURSOR_SPACE[p].next; CURSOR_SPACE[p].next = tmp_cell; }

}

This procedure is used to insert an element x at the position p in the linked list L. This is done by

 Create a new node as tmpcell Data Structures-P.Vasantha Kumari,L/IT 8  Assign the value x in the data filed of tmpcell  Identify the position p and assign the address of next cell of p to the pointer filed of tmpcell  Assign the address of tmpcell to the address filed of p

4. Delete() void delete( element_type x, LIST L ) { position p, tmp_cell; p = find_previous( x, L ); if( !is_last( p, L) ) { tmp_cell = CURSOR_SPACE[p].next; CURSOR_SPACE[p].next = CURSOR_SPACE[tmp_cell].next; cursor_free( tmp_cell ); } }

This is used to delete an element x from the linked list L. This is done by

 Identify the element x in the linked list L using findpreviuos() function  Name the node x as tmpcell  Assign the address of next cell of x to the previous cell.  Remove the node x

5.Find() position find( element_type x, LIST L) /* using a header node */ { position p; CURSOR_SPACE[L].next; while( p && CURSOR_SPACE[p].element != x ) p = CURSOR_SPACE[p].next; return p;

This is used to check whether the element x is present in the linked list L or not. This is done by searching an element x from the beginning of the linked list. If it is matched then return TRUE, else move the next node and continue this process until the end of the linked list. If an element is matched with any of the node then return FALSE.

6. Double linked list implementation

A doubly linked list is a linked data structure that consists of a set of sequentially linked records called nodes. Each node contains two fields, called links, that are references to the previous and to the next node in the sequence of nodes.

Operations on double linked list

1. Insert()

void insert( element_type B, LIST L, position p ) { position newnode,temp; newnode = (position) malloc( sizeof (struct node) ); newnode->element = B; p->next=newnode; temp=p->next; temp->prev=newnode; newnode->prev=p; newnode->next=temp->prev;

This procedure is used to insert an element x after the node p in the doubly linked list. This is done by,

 Create a new node and assign the value x into the data field of new node.  Rearrange the pointers by assigning the address of x to the pointer field of p and vice versa  Assigning the address of next cell of p to the address filed of x and vice versa 2.Delete()

void delete( element_type B, LIST L) { position newnode,temp; newnode=findprevious(B,L); temp=newnode->next; newnode->prev=temp->next; temp->next->prev=newnode; } This is used to remove a node B from the doubly linked list. Removing a node from the middle requires that the preceding node skips over the node being removed, and the following node to skip over in the reverse direction.

7. Circular Linked List

Data Structures-P.Vasantha Kumari,L/IT 11 A linked list have a beginning node and an ending node: the beginning node was a node pointed to by a special pointer called head and the end of the list was denoted by a special node which had a NULL pointer in its next pointer field. If a problem requires that operations need to be performed on nodes in a linked list and it is not important that the list have special nodes indicating the front and rear of the list, then this problem should be solved using a circular linked list. A singly linked circular list is a linked list where the last node in the list points to the first node in the list. A circular list does not contain NULL pointers.

8. Applications of List

 The Polynomial ADT  Radix Sort  Multilists

1.The polynomial ADT

The polynomial can be represented by using linked list. Here, each node that consists of 2 data field (coefficient and exponent) and one pointer field that points the address of the next cell.

void add_polynomial( POLYNOMIAL poly1, POLYNOMIAL poly2,POLYNOMIAL poly_sum ) { int i; zero_polynomial( poly_sum ); poly_sum->high_power = max( poly1->high_power,poly2->high_power); for( i=poly_sum->high_power; i>=0; i-- ) poly_sum->coeff_array[i] = poly1->coeff_array[i]+ poly2->coeff_array[i]; } void mult_polynomial( POLYNOMIAL poly1, POLYNOMIAL poly2,POLYNOMIAL poly_prod ) {

Data Structures-P.Vasantha Kumari,L/IT 12 unsigned int i, j; zero_polynomial( poly_prod ); poly_prod->high_power = poly1->high_power+ poly2->high_power; if( poly_prod->high_power > MAX_DEGREE ) error("Exceeded array size"); else for( i=0; i<=poly->high_power; i++ ) for( j=0; j<=poly2->high_power; j++ ) poly_prod->coeff_array[i+j] +=poly1->coeff_array[i] * poly2->coeff_array[j]; } 2.Radix sort

Radix sort is one of the linear sorting algorithms for integers. It functions by sorting the input numbers on each digit, for each of the digits in the numbers. However, the process adopted by this sort method is somewhat counterintuitive, in the sense that the numbers are sorted on the least-significant digit first, followed by the second-least significant digit and so on till the most significant digit.

To appreciate Radix Sort, consider the following analogy: Suppose that we wish to sort a deck of 52 playing cards (the different suits can be given suitable values, for example 1 for Diamonds, 2 for Clubs, 3 for Hearts and 4 for Spades). The 'natural' thing to do would be to first sort the cards according to suits, then sort each of the four seperate piles, and finally combine the four in order. This approach, however, has an inherent disadvantage. When each of the piles is being sorted, the other piles have to be kept aside and kept track of. If, instead, we follow the 'counterintuitive' aproach of first sorting the cards by value, this problem is eliminated. After the first step, the four seperate piles are combined in order and then sorted by suit. If a stable sorting algorithm (i.e. one which resolves a tie by keeping the number obtained first in the input as the first in the output) it can be easily seen that correct final results are obtained.

An example

The following example shows the action of radix sort on 10 numbers. The input is 64, 8, 216, 512, 27, 729, 0, 1, 343, 125 (the first ten cubes arranged randomly). The first step bucket sorts by the least significant digit. In this case the math is in base 10 (to make things simple), but do not assume this in general. The

Data Structures-P.Vasantha Kumari,L/IT 13 buckets are as shown in Figure 3.24, so the list, sorted by least significant digit, is 0, 1, 512, 343, 64, 125, 216, 27, 8, 729. These are now sorted by the next least significant digit (the tens digit here) (see Fig. 3.25). Pass 2 gives output 0, 1, 8, 512, 216, 125, 27, 729, 343, 64. This list is now sorted with respect to the two least significant digits. The final pass, shown in Figure 3.26, bucket-sorts by most significant digit. The final list is 0, 1, 8, 27, 64, 125, 216, 343, 512, 729.

3.Multilists

Our last example shows a more complicated use of linked lists. A university with 40,000 students and 2,500 courses needs to be able to generate two types of reports. The first report lists the class

Data Structures-P.Vasantha Kumari,L/IT 14 registration for each class, and the second report lists, by student, the classes that each student is registered for. The obvious implementation might be to use a two-dimensional array. Such an array would have 100 million entries.

As the figure shows, we have combined two lists into one. All lists use a header and are circular. To list all of the students in class C3, we start at C3 and traverse its list (by going right). The first cell belongs to student S1. Although there is no explicit information to this effect, this can be determined by following the student's linked list until the header is reached. Once this is done, we return to C3's list (we stored the position we were at in the course list before we traversed the student's list) and find another cell, which can be determined to belong to S3. We can continue and find that S4 and S5 are also in this class. In a similar manner, we can determine, for any student, all of the classes in which the student is registered.

Using a circular list saves space but does so at the expense of time. In the worst case, if the first student was registered for every course, then every entry would need to be examined in order to determine all the course names for that student. Because in this application there are relatively few courses per student and few students per course, this is not likely to happen. If it were suspected that this could Data Structures-P.Vasantha Kumari,L/IT 15 cause a problem, then each of the (nonheader) cells could have pointers directly back to the student and class header. This would double the space requirement, but simplify and speed up the implementation.

9. Stack ADT

A stack is a list with the restriction that inserts and deletes can be performed in only one position, namely the end of the list called the top. The fundamental operations on a stack are push, which is equivalent to an insert, and pop, which deletes the most recently inserted element. It is also called Last In First Out (LIFO)

Implementation of Stacks

 Linked List Implementation of Stacks  Array Implementation of Stacks

1. Linked List Implementation of Stacks

Stack data structure can be implemented using linked list. All thye insertion(push) and deletion operation(pop) can be performed in LIFO(Last In First Out) manner. Here, the elements can be inserted up to the size of storage space. The operations can be performed under stack is push(), pop() and top().

(i) push():

This is the function which is for insertion(pushing)of an element into stack. It is similar to the insertion of an element at the beginning of a single linked list. The steps to be followed, o Get the element to be inserted o Create a new cell to store the new element using malloc() o If the linked list is empty, then insert the new element as the first cell of the linked list and assign the address the new cell to NULL Data Structures-P.Vasantha Kumari,L/IT 16 o Otherwise, insert the new cell in the first of linked list by rearranging the pointers void push( element_type x, STACK S ) { node_ptr tmp_cell; tmp_cell = (node_ptr) malloc( sizeof ( struct node ) ); tmp_cell->element = x; tmp_cell->next = S->next; S->next = tmp_cell; }

(ii) pop()

This is the function which is for deletion (popping up) of an element from the stack. It is similar to the deletion of an element at the beginning of a single linked list. Before deleting the element, check whether the stack(linked list) is empty or not. If it is empty then return error otherwise delete the first element from the linked list by rearranging the pointer. void pop( STACK S ) { node_ptr first_cell; if( is_empty( S ) ) error("Empty stack"); else { first_cell = S->next; S->next = S->next->next; free( first_cell ); } }

Data Structures-P.Vasantha Kumari,L/IT 17 (iii) top() This function is used to return he top element in the stack which means that the last inserted element. element_type top( STACK S ) { if( is_empty( S ) ) error("Empty stack"); else return S->next->element; } 2. Array Implementation of Stacks

Stack data structure can be implemented using arrays. All thye insertion(push) and deletion operation(pop) can be performed in LIFO(Last In First Out) manner. Here, the elements can be inserted up to the size of the array. The operations can be performed under stack is push(), pop() and top().

push() This is used to push an element x into the stack using an array. This is done by increment the top pointer by 1 and assign the value x into the top position. void push( element_type x, STACK S )

Data Structures-P.Vasantha Kumari,L/IT 18 { if( is_full( S ) ) error("Full stack"); else S->stack_array[ ++S->top_of_stack ] = x; } pop()

This is used to deleet or pop an element from th stack. Initially check wtheter the stack is empty or not. If it is empty then return FALSE. Otherwise decrement the top pointer by 1. void pop( STACK S ) { if( is_empty( S ) ) error("Empty stack"); else S->top_of_stack--; } top()

This function is used to return he top element in the stack which means that the last inserted element. element_type top( STACK S ) { if( is_empty( S ) ) error("Empty stack"); else return S->stack_array[ S->top_of_stack ]; }

Data Structures-P.Vasantha Kumari,L/IT 19 9. Queue ADT

Queue is an linear data structure which insert an element into the rear(enqueue) and dete an element from the front(dequeue). It is also called First In First Out(FIFO). That is the first inserted element will be deleted first.

Array Implementation of Queue

Queue is implemented using an array. It can insert elements upto the size of an array. All the insertion(enqueue) and deletion(dequeue) operation can be performed in FIFO (First In First Out) manner. Here there are two poitera are used. Rear is ued for inserion and front pointer is use for deletion.

(i) enqueue() void enqueue( element_type x, QUEUE Q )

{ if( is_full( Q ) ) error("Full queue"); else

{

Q->q_size++; Data Structures-P.Vasantha Kumari,L/IT 20 Q->q_rear = succ( Q->q_rear, Q );

Q->q_array[ Q->q_rear ] = x;

}

} Thi s procedure is used to insert an element into the queue using an array. Before inserting any element first check whether the queue is full or not. If it is full, then display an error. Otherwise identify the successor rear position using succ() function and insert an element into the rear position of queue.

Dequeue()

This is used to delete an element from the queue. This is done by identify the successor of front. Then the first inserted element is deleted first.

Data Structures-P.Vasantha Kumari,L/IT 21 10. Applications of stack

 Infix to postfix conversion

 Evaluating postfix expression

 Function calls

(i) Infix to postfix conversion

There are three different ways in which an expression like a+b can be represented. Prefix (Polish) +ab Postfix (Suffix or reverse polish) ab+ Infix Data Structures-P.Vasantha Kumari,L/IT 22 a+b Note than an infix expression can have parathesis, but postfix and prefix expressions are paranthesis free expressions. Conversion from infix to postfix Suppose this is the infix expresssion ((A + (B - C) * D) ^ E + F) To convert it to postfix, we add an extra special value ] at the end of the infix string and push [ onto the stack. ((A + (B - C) * D) ^ E + F)] ------> We move from left to right in the infix expression. We keep on pushing elements onto the stack till we reach a operand. Once we reach an operand, we add it to the output. If we reach a ) symbol, we pop elements off the stack till we reach a corresponding { symbol. If we reach an operator, we pop off any operators on the stack which are of higher precedence than this one and push this operator onto the stack. As an example

Expresssion Stack Output ------((A + (B - C) * D) ^ E + F)] [ ^

((A + (B - C) * D) ^ E + F)] [(( A ^

((A + (B - C) * D) ^ E + F)] [((+ A ^

((A + (B - C) * D) ^ E + F)] [((+(- ABC ^

((A + (B - C) * D) ^ E + F)] [( ABC-D*+ ^ Data Structures-P.Vasantha Kumari,L/IT 23 ((A + (B - C) * D) ^ E + F)] [( ABC-D*+E^F+ ^

((A + (B - C) * D) ^ E + F)] [ ABC-D*+E^F+ ^ Is there a way to find out if the converted postfix expression is valid or not Yes. We need to associate a rank for each symbol of the expression. The rank of an operator is -1 and the rank of an operand is +1. The total rank of an expression can be determined as follows: - If an operand is placed in the post fix expression, increment the rank by 1. - If an operator is placed in the post fix expression, decrement the rank by 1. At any point of time, while converting an infix expression to a postfix expression, the rank of the expression can be greater than or equal to one. If the rank is anytime less than one, the expression is invalid. Once the entire expression is converted, the rank must be equal to 1. Else the expression is invalid. (ii) Evaluating postfix expression

Here is the pseudocode. As we scan from left to right * If we encounter an operand, push it onto the stack. * If we encounter an operator, pop 2 operand from the stack. The first one popped is called operand2 and the second one is called operand1. * Perform Result=operand1 operator operand2. * Push the result onto the stack. * Repeat the above steps till the end of the input.

Example:

For instance, the postfix expression 6 5 2 3 + 8 * + 3 + * is evaluated as follows: The first four symbols are placed on the stack. The resulting stack is

Data Structures-P.Vasantha Kumari,L/IT 24 Next a '+' is read, so 3 and 2 are popped from the stack and their sum, 5, is pushed.

Next 8 is pushed.

Now a '*' is seen, so 8 and 5 are popped as 8 * 5 = 40 is pushed.

Next a '+' is seen, so 40 and 5 are popped and 40 + 5 = 45 is pushed.

Now, 3 is pushed.

Next '+' pops 3 and 45 and pushes 45 + 3 = 48.

Data Structures-P.Vasantha Kumari,L/IT 25 Finally,

Finally, a '*' is seen and 48 and 6 are popped, the result 6 * 48 = 288 is pushed.

(iii) Function Calls

The algorithm to check balanced symbols suggests a way to implement function calls. The problem here is that when a call is made to a new function, all the variables local to the calling routine need to be saved by the system, since otherwise the new function will overwrite the calling routine's variables. Furthermore, the current location in the routine must be saved so that the new function knows where to go after it is done. The variables have generally been assigned by the compiler to machine registers, and there are certain to be conflicts (usually all procedures get some variables assigned to register #1), especially if recursion is involved. Applications of Queue  Printers  Job Scheduling  Ticket Reservation …. Review Questions Two Mark Questions 1. Define data structure 2. Define ADT 3. What is linear and non linear data structure? 4. Define array and linked list 5. What is the difference between arrays and linked list

Data Structures-P.Vasantha Kumari,L/IT 26 6. What is the drawback of array? 7. What are the types of linked list 8. Define doubly linked list 9. Define circular linked list 10.What are the applications of linked list 11.Define cursor based linked list 12.Define stack 13.Define queue 14.What are the applications of stack and queue? 15.How to convert an infix to postfix expression? 16.Problems in  converting infix to postfix  evaluating postfix expression Big Questions 1. Write a program to implement a singly linked list 2. Write an procedure to perform insert and delete operation in doubly linked list 3. Explain in detail about the cursor implementation of linked list 4. Explain stack ADT 5. Explain queue ADT 6. Explain in detail about the applications of stack 7. Write a program to reverse a linked list 8. Write an procedure for polynomial addition using linked list ------END OF FIRST UNIT ------

Data Structures-P.Vasantha Kumari,L/IT 27