
DATA TYPES and DATA STRUCTURES Class: Comp


Ministry of Secondary Education Republic of Cameroon Progressive Comprehensive High School Peace – Work – Fatherland

PCHS Mankon – Bamenda School Year: 2013/2014 Department of Studies

TOPIC: DATA TYPES AND DATA STRUCTURES Class: Comp. Sc. A/L By: DZEUGANG PLACIDE

The primary purpose of most computer programs is to store and retrieve data rather than to perform calculations. There are many different ways to organize data for storage and retrieval, and each type of organization is well suited to solving certain kinds of problems and poorly suited to solving problems of other kinds. Consequently, the way in which a program's data is organized may have a profound effect on the program's running time and memory requirements. A finite set of values together with a set of rules for the operations on them is called a data type. The study of data organization is called data structures and is the primary focus of this topic.

Learning objectives

After studying this chapter, the student should be able to:

- Give examples of data types and state how they are represented in memory.
- Appreciate and use concepts of data structures such as array, pointer, structure, …
- Appreciate the advantages of abstract data types as a development strategy.
- Develop significant facility with abstract data types like Stack, Queue, and binary tree from the client perspective.
- Implement abstract data types using data structures.

Contents

I. INTRODUCTION TO DATA TYPE
II. PRIMITIVE DATA TYPES
III. COMPOSITE DATA TYPE
IV. ABSTRACT DATA TYPES

This topic and others are available on www.dzplacide.overblog.com in PDF format Topic: Computer System Architecture 2 By DZEUGANG Placide

I. INTRODUCTION TO DATA TYPE

A programming language is designed to help process certain kinds of data and to provide useful output. The task of data processing is accomplished by executing a series of commands called a program. A program usually contains data of different types (int, float, etc.) and needs to store the values being used in the program. The C language is rich in data types. A programmer has to choose the proper data type for his or her requirements.

Data types fall into three categories:

- Primitive, or primary, or atomic data type
- Composite, or complex, or secondary data type
- Abstract data type

NB: The size of a data type is machine (and compiler) dependent. One compiler may use 2 bytes to represent an integer while another uses 4 bytes. In C, the size of a data type can be obtained using the sizeof operator, which is built into the language and needs no header file. For instance, with the declaration int p, we have sizeof(p) = sizeof(int), which is 4 on most current compilers.
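As a minimal sketch of the remark above, the following C function prints the size of a few types on the machine where it runs. The function name print_type_sizes is chosen here for illustration only, and the values printed will differ from one compiler to another.

```c
#include <stdio.h>

/* Prints the size, in bytes, of a few primitive types.
   The name print_type_sizes is illustrative, not part of any library. */
void print_type_sizes(void)
{
    int p = 0;

    printf("char   : %zu byte(s)\n", sizeof(char));
    printf("int    : %zu byte(s)\n", sizeof(int));
    printf("int p  : %zu byte(s)\n", sizeof p);    /* same as sizeof(int) */
    printf("float  : %zu byte(s)\n", sizeof(float));
    printf("double : %zu byte(s)\n", sizeof(double));
}
```

Calling print_type_sizes() from main shows, for example, whether int is stored on 2 or 4 bytes on a given machine.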

II. PRIMITIVE DATA TYPES

Primitive data types are standard predefined types that you can use as the basis for variables, record fields, or your own Data Item parts. Though exact names may vary, many of these types (like INT) are common to most programming languages. C and other procedural languages often refer to these types as "elementary items" because they are not based on any other type.

Classic basic primitive types may include:

- Character (character, char);
- Integer (integer, int, short, long) with a variety of precisions;
- Floating-point number (float, double, real, double precision);
- Fixed-point number (fixed) with a variety of precisions and a programmer-selected scale;
- Boolean, with the logical values true and false;
- Reference (also called a pointer or handle), a small value referring to another object's address in memory, possibly a much larger one.

II.1 Boolean type

The Boolean type represents the values: true and false. Although only two values are possible, they are rarely implemented as a single binary digit for efficiency reasons.

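Some programming languages, such as C before the C99 standard, do not have an explicit boolean type, instead interpreting (for instance) 0 as false and other values as true. Since C99, the header stdbool.h provides the type bool and the values true and false.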


II.2 Integer Data Types:

Integers are whole numbers with a range of values that is machine dependent. On a 16-bit compiler an int occupies 2 bytes of memory and its value ranges from -32768 to +32767 (that is, -2^15 to +2^15 - 1); a signed integer uses one bit for storing the sign and the remaining 15 bits for the number. On most current compilers an int occupies 4 bytes.

To control the range of numbers and storage space, C has three classes of integer storage, namely short int, int and long int. All three data types have signed and unsigned forms. A short int never requires more storage than an int and, where an int is 4 bytes, typically requires half as much. Unlike signed integers, unsigned integers are never negative and use all the bits for the magnitude of the number. Therefore, the range of a 2-byte unsigned integer is 0 to 65535. The long int is used for a larger range of values and occupies at least 4 bytes of storage space.

Data Type | Bytes | Range | Format
short int or signed short int | 2 | -32768 to +32767 | %hd
unsigned short int | 2 | 0 to 65535 | %hu
long int or signed long int or long | 4 | -2147483648 to +2147483647 | %ld
unsigned long int or unsigned long | 4 | 0 to 4294967295 | %lu
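The sizes and ranges in the table are typical values, not guarantees. A hedged sketch of how the actual limits of a given compiler can be displayed uses the constants of the standard header limits.h; the function name print_integer_ranges is an illustrative choice.

```c
#include <limits.h>
#include <stdio.h>

/* Prints the real integer ranges of the current compiler.
   print_integer_ranges is a hypothetical helper name. */
void print_integer_ranges(void)
{
    printf("short          : %d to %d\n", SHRT_MIN, SHRT_MAX);
    printf("unsigned short : 0 to %u\n", (unsigned)USHRT_MAX);
    printf("int            : %d to %d\n", INT_MIN, INT_MAX);
    printf("long           : %ld to %ld\n", LONG_MIN, LONG_MAX);
    printf("unsigned long  : 0 to %lu\n", ULONG_MAX);
}
```

On a machine where long is 8 bytes, the last two lines show a much larger range than the one given in the table.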

II.3 Floating Point Data Types:

The floating point data type is used to store fractional numbers (real numbers) with about 6 digits of precision. Floating point numbers are denoted in C by the keyword float. When the accuracy of a float is insufficient, we can use the type double. A double is like a float but has greater precision (about 15 digits) and takes twice the space (8 bytes). To extend the precision further we can use long double, whose size depends on the compiler: on x86 machines it commonly holds a 10-byte value that is stored in 12 or 16 bytes of memory.

Data Type | Bytes | Range | Format
float | 4 | -3.4e38 to +3.4e38 | %f
double | 8 | -1.7e308 to +1.7e308 | %lf
long double | 12 (compiler dependent) | -1.1e4932 to +1.1e4932 | %Lf
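As with the integer types, the precision and range of the floating point types can be checked on a given compiler. The sketch below relies on the constants of the standard header float.h; the function name print_float_limits is only illustrative.

```c
#include <float.h>
#include <stdio.h>

/* Prints the number of reliable decimal digits and the largest value
   of each floating point type on the current compiler. */
void print_float_limits(void)
{
    printf("float       : %d digits, max %e\n", FLT_DIG, FLT_MAX);
    printf("double      : %d digits, max %e\n", DBL_DIG, DBL_MAX);
    printf("long double : %d digits, max %Le\n", LDBL_DIG, LDBL_MAX);
}
```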

II.4 Character Data Type:

The character data type is used to store letters, digits and special characters. It holds ASCII characters and occupies one byte of memory. It can be signed or unsigned, with ranges of -128 to +127 and 0 to 255 respectively; whether a plain char is signed or unsigned depends on the compiler. The following table shows the different character data types.

Data Type | Bytes | Range | Format
signed char (or char on many compilers) | 1 | -128 to 127 | %c
unsigned char | 1 | 0 to 255 | %c


III. COMPOSITE DATA TYPE

In computer science, a composite data type is any data type which can be constructed in a program using its programming language's primitive data types and other composite types. The act of constructing a composite type is known as composition.

III.1 Record

A record also called a structure is a group of related data items stored in fields, each with its own name and datatype. Suppose you have various data about an employee such as name, salary, and hire date. These items are logically related but dissimilar in type. A record containing a field for each item lets you treat the data as a logical unit. Thus, records make it easier to organize and represent information.

Such a declaration can be as follows:

In pseudocode:

    Type employee = record
        Name: string
        Salary: number
        sex: character
    Endrecord

In C:

    struct person
    {
        char name[50];
        float salary;
        char sex;
    };

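The size of a structure depends on the data types of its fields. For instance, with the structure defined above, the fields alone occupy

50 * sizeof(char) + sizeof(float) + sizeof(char) = 50*1 + 4 + 1 = 55 bytes

This is a lower bound: compilers usually insert padding bytes so that each field is properly aligned, so sizeof(struct person) is often larger than 55 (60 on many common platforms).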
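The difference between the sum of the field sizes and the size of the whole structure can be observed with a short sketch. It assumes the struct person declaration given above; the function name print_person_layout is hypothetical, and the numbers printed depend on the compiler.

```c
#include <stddef.h>
#include <stdio.h>

struct person {
    char  name[50];
    float salary;
    char  sex;
};

/* Prints the sum of the field sizes, the real size of the structure,
   and the position (offset) of two fields inside it. */
void print_person_layout(void)
{
    printf("sum of fields : %zu\n",
           sizeof(char[50]) + sizeof(float) + sizeof(char));
    printf("sizeof struct : %zu\n", sizeof(struct person));
    printf("offset salary : %zu\n", offsetof(struct person, salary));
    printf("offset sex    : %zu\n", offsetof(struct person, sex));
}
```

If the offset of salary is larger than 50, the compiler has inserted padding between name and salary.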

III.2 Array

An array is a sequenced collection of elements of the same data type with a single identifier name. Arrays can have multiple axes (more than one axis). Each axis is a dimension. Thus a single dimension array is also known as a list. A two dimension array is commonly known as a table (a spreadsheet like Excel is a two dimension array).

We refer to the individual values as members (or elements) of the array. Programming languages implement the details of arrays differently. Because there is only one identifier name assigned to the array, we have operators that allow us to reference or access the individual members of an array. The operator commonly associated with referencing array members is the index operator. It is important to learn how to define an array and initialize its members.

Declaration

In pseudocode: Age = array[1 to 5] of integer
In C: int age[5];


Age is an array of 5 integers. Notice that in C the index range is not chosen by the user: the first index is always 0, so the elements are age[0] to age[4].

Here, the size of the array age is 5 times the size of int because there are 5 elements. Suppose the starting address of age[0] is 2120 (in decimal) and the size of int is 4 bytes. Then the address of age[1] will be 2124, the address of age[2] will be 2128, and so on.
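The address arithmetic described above can be checked with a small sketch. The function name element_stride is an illustrative choice; the addresses printed will differ from 2120, since the actual addresses are chosen by the system at run time.

```c
#include <stdio.h>

/* Prints the address of each element of a 5-integer array and returns
   the distance, in bytes, between two consecutive elements. */
long element_stride(void)
{
    int age[5] = {10, 20, 30, 40, 50};

    for (int i = 0; i < 5; i++)
        printf("&age[%d] = %p\n", i, (void *)&age[i]);

    /* Converting to char * makes the subtraction count bytes. */
    return (long)((char *)&age[1] - (char *)&age[0]);
}
```

The value returned is equal to sizeof(int), which confirms that consecutive elements are stored next to each other in memory.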

Accessing array elements

In pseudocode:

    Age[2] ← 4
    Read(Age[1])

In C:

    age[2] = 4;
    scanf("%d", &age[1]);

Multidimensional arrays

The C programming language allows creating arrays of arrays, known as multidimensional arrays. For example:

In pseudocode: A = array[1 to 2, 1 to 6] of real
In C: float A[2][6];

Here, A is a two-dimensional array, which is an example of a multidimensional array. This array has 2 rows and 6 columns. In C the first element of the array is A[0][0], then the next ones are A[0][1], A[0][2], …

For a better understanding of multidimensional arrays, the elements of the above example can be thought of as laid out below:

    A[0][0]  A[0][1]  A[0][2]  A[0][3]  A[0][4]  A[0][5]
    A[1][0]  A[1][1]  A[1][2]  A[1][3]  A[1][4]  A[1][5]
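A two-dimensional array is usually processed with two nested loops, one over the rows and one over the columns. The following sketch fills the array A declared above and prints it row by row; the function name fill_and_sum and the values stored are assumptions made for the example.

```c
#include <stdio.h>

#define ROWS 2
#define COLS 6

/* Stores 10*i + j in A[i][j], prints the array row by row,
   and returns the sum of all the elements. */
float fill_and_sum(float A[ROWS][COLS])
{
    float sum = 0.0f;

    for (int i = 0; i < ROWS; i++) {          /* one pass per row */
        for (int j = 0; j < COLS; j++) {      /* one pass per column */
            A[i][j] = 10.0f * i + j;
            sum += A[i][j];
            printf("%5.1f ", A[i][j]);
        }
        printf("\n");
    }
    return sum;
}
```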

III.3 String

A string is any finite sequence of characters (i.e., letters, numerals, symbols and punctuation marks). An important characteristic of each string is its length, which is the number of characters in it. The syntax of most high-level programming languages allows for a string, usually quoted in some way, to represent an instance of a string datatype.

In C, a string is represented as an array of characters terminated by the null character '\0'.

In pseudocode: Name : string
In C: char name[20];

Here the variable name can hold at most 19 characters, because one element is needed for the terminating '\0'. If the variable name contains "placide", then we have name[0]='p', name[1]='l', name[2]='a' and so on, with name[7]='\0'.
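A minimal sketch of how a value can be stored safely in such an array is given below. The function name store_name is hypothetical; strncpy and strlen are standard functions of string.h.

```c
#include <string.h>

/* Copies src into a 20-character array without overflowing it
   and returns the length of the string actually stored. */
size_t store_name(char name[20], const char *src)
{
    strncpy(name, src, 19);   /* copy at most 19 characters */
    name[19] = '\0';          /* make sure the string is terminated */
    return strlen(name);
}
```

With the value "placide" the function returns 7; a longer value is cut to 19 characters so that the terminating '\0' always fits.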

III.4 POINTER

Pointers are widely used in programming; they are used to refer to memory location of another variable without using variable identifier itself. They are mainly used in linked lists and call by reference functions. The figure below illustrates the idea of pointers. As you can see below; Yptr is pointing to 100.

Figure: Pointer and memory relationship

“POINTERS CONTAIN MEMORY ADDRESSES, NOT DATA VALUES!”

When you declare a simple variable, like int i; a memory location with a certain address is set aside for any values that will be placed in i. We thus have the following picture:

• After the statement i=35; the location corresponding to i will be filled with the value 35.

You can find out the memory address of a variable by simply using the address operator &. Here is an example of its use: &v


The above expression should be read as “address of v”, and it returns the memory address of the variable v.

Pointer Declaration

A pointer is a C variable that contains a memory address. Like all other C variables, pointers must be declared before they are used. The syntax for pointer declaration is as follows:

    Datatype *identifier;

Examples:

    int *p;
    double *offset;

Note that the prefix * defines the variable to be a pointer. In the above example, p is of the type "pointer to integer" and offset is of the type "pointer to double".

Once a pointer has been declared, it can be assigned an address. This is usually done with the address operator. For example,

    int *p;
    int count;
    p = &count;

After this assignment, we say that p is “referring to” the variable count or “pointing to” the variable count. The pointer p contains the memory address of the variable count.
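The sketch below puts these declarations together. It also uses the operator *p, which reads or modifies the value stored at the address contained in p; this use of * is not described in the text above and is added here for the example. The function names increment and pointer_demo are illustrative.

```c
#include <stdio.h>

/* Adds 1 to the integer that p points to (call by reference). */
void increment(int *p)
{
    *p = *p + 1;
}

/* Shows the address operator & and the access to the pointed value. */
int pointer_demo(void)
{
    int count = 35;
    int *p = &count;              /* p now contains the address of count */

    printf("value of count   : %d\n", count);
    printf("address in p     : %p\n", (void *)p);
    printf("value pointed to : %d\n", *p);

    increment(p);                 /* modifies count through the pointer */
    return count;
}
```

After the call to increment, the variable count contains 36 although the function never used the name count: it only received its address.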

Linked list

A linked list is a finite sequence of nodes, each of which contains a pointer field pointing to the next node. In many languages, a pointer to the first node must be supplied. In a singly linked list, the pointer in the last node points to nil. In an empty list, the pointer to the first node is itself nil.

Figure: An empty linked list and an integer linked list

The definition

For the purpose of this document, we shall define a list as a pointer to a structure of type cell like this:

    struct cell
    {
        void *element;
        struct cell *next;
    } *list;
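As a sketch of how such a list can be manipulated, the functions below insert a cell at the front of a list, count its cells and release them. They use a cell structure whose next field is a pointer to the following cell. The function names list_push_front, list_length and list_free are assumptions made for this example.

```c
#include <stdlib.h>

struct cell {
    void *element;        /* pointer to the data stored in the cell */
    struct cell *next;    /* pointer to the next cell, NULL at the end */
};

/* Inserts a new cell in front of head and returns the new first cell.
   Returns NULL if memory cannot be allocated (head is left unchanged,
   so the caller should keep its old pointer in that case). */
struct cell *list_push_front(struct cell *head, void *element)
{
    struct cell *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->element = element;
    c->next = head;
    return c;
}

/* Counts the cells by following the next pointers up to NULL. */
size_t list_length(const struct cell *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}

/* Releases every cell (but not the data the cells point to). */
void list_free(struct cell *head)
{
    while (head != NULL) {
        struct cell *next = head->next;
        free(head);
        head = next;
    }
}
```

An empty list is simply a NULL pointer, which corresponds to nil in the description above.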

Doubly linked list

In a doubly linked list, each node contains, besides the next-node link, a second link field pointing to the previous node in the sequence. The two links may be called forward(s) and backwards, or next and prev(ious).

A doubly linked list whose nodes contain three fields: an integer value, the link forward to the next node, and the link backward to the previous node
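A node with these three fields could be declared in C as in the sketch below. The names dnode and dlist_append are hypothetical; the function adds a node at the end of the list and updates both links.

```c
#include <stdlib.h>

struct dnode {
    int value;              /* the integer stored in the node */
    struct dnode *next;     /* link forward to the next node */
    struct dnode *prev;     /* link backward to the previous node */
};

/* Adds a node holding value after the node tail (tail may be NULL for
   an empty list). Returns the new last node, or NULL on failure. */
struct dnode *dlist_append(struct dnode *tail, int value)
{
    struct dnode *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->next = NULL;
    n->prev = tail;
    if (tail != NULL)
        tail->next = n;
    return n;
}
```

Because each node knows its predecessor, the list can be walked in both directions, at the cost of one extra pointer per node.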

IV. ABSTRACT DATA TYPES

IV.1 Definition

The data types you have seen so far are all concrete, in the sense that we have completely specified how they are implemented. An abstract data type, or ADT, specifies a set of operations (or methods) and the semantics of the operations (what they do), but it does not specify the implementation of the operations. That’s what makes it abstract. The inclusion of the term abstract means that an ADT does not mention implementation details; an ADT is conceptual, not concrete.

Data structure: an implementation of an ADT. A data structure is concrete, i.e., a data structure makes an ADT a reality by specifying how to represent instances of the data type and how to perform operations on those instances.

Components of Abstract Data Types

An abstract data type is an encapsulation mechanism. In general it is composed of several components:

- A data type
- A set of operations (called the methods or operations)
- A specification
- A signature: a precise description of the types of the methods
- A set of axioms: a precise set of rules about how it behaves
- An implementation hidden from the programmer who uses the data type

In an ADT, an operation can be a function or a predicate. An example ADT already familiar to you appears below.

integer: a whole number

Operations and signatures:

- addition (+) : integer x integer → integer
- subtraction (-) : integer x integer → integer
- multiplication (*) : integer x integer → integer
- division (/) : integer x integer → real
- modulus (%) : integer x integer → integer

There exist two main types of abstract data types: linear ADTs (vector, queue, stack, …) and non-linear ADTs (binary tree, …).

IV.2 Common examples of ADT

IV.2.1 A vector data type

a) Definition and Representation

A vector is a random-access collection of elements

b) Operations and signatures:

Operation | Signature | Specification
append(element) | vector x element → vector | Add a new element to the end of the collection.
clear() | vector → vector (with no element) | Make the collection empty.
contains(element) | vector x element → Boolean | Does the collection contain the given element?
elementAt(index) | vector x index → element | Access the element at the given index.
indexOf(element) | vector x element → index | What is the index of the given element?
insertAt(index, element) | vector x element x index → vector | Insert a new element at the given index.
isEmpty() | vector → Boolean | Is the collection empty?
removeAt(index) | vector x index → element | Remove the element at the given index.
remove(element) | vector x element → vector | Remove the given element from the collection.
size() | vector → integer | How many elements are in the collection?

c) Implementation

It may have already occurred to you that a vector is very much like an array. A vector is so much like an array that we may think of a vector as an "intelligent" array, for we can do anything with a vector that we can do with an array, yet a vector offers something more in the way of convenience. Clearing an array, for example, requires a loop, whereas clearing a vector requires only an invocation of the clear operation, as the code below illustrates.


Let’s consider the following declaration: Vector vec = new Vector(); and let’s suppose that it has been filled with some elements. The clear operation can be compared for an array and a vector as follows:

Clear the array:

    for (int j = 0; j < size; j++)
        array[j] = null;
    size = 0;

Clear the vector:

    vec.clear(); or clear(vec);
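The declaration above is written in a Java-like notation. C has no built-in vector, so the sketch below shows one possible way of implementing a few of the vector operations in C for integer elements. The type IntVector and all the vec_ function names are assumptions made for this example, not a standard library.

```c
#include <stdlib.h>

typedef struct {
    int   *items;       /* dynamically allocated array of elements */
    size_t size;        /* number of elements currently stored */
    size_t capacity;    /* number of elements the array can hold */
} IntVector;

void vec_init(IntVector *v)
{
    v->items = NULL;
    v->size = 0;
    v->capacity = 0;
}

/* append: adds element at the end, growing the array when it is full.
   Returns 1 on success and 0 if memory cannot be allocated. */
int vec_append(IntVector *v, int element)
{
    if (v->size == v->capacity) {
        size_t new_cap = (v->capacity == 0) ? 4 : v->capacity * 2;
        int *p = realloc(v->items, new_cap * sizeof *p);
        if (p == NULL)
            return 0;
        v->items = p;
        v->capacity = new_cap;
    }
    v->items[v->size++] = element;
    return 1;
}

/* elementAt: the caller must give an index smaller than the size. */
int vec_element_at(const IntVector *v, size_t index) { return v->items[index]; }

size_t vec_size(const IntVector *v)     { return v->size; }
int    vec_is_empty(const IntVector *v) { return v->size == 0; }

/* clear: no loop is needed, the elements are simply forgotten. */
void vec_clear(IntVector *v) { v->size = 0; }

/* Releases the memory when the vector is no longer needed. */
void vec_free(IntVector *v)  { free(v->items); vec_init(v); }
```

The client only calls the vec_ functions and never touches the array directly, which is the idea of hiding the implementation behind the operations.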

IV.2.2 The Queue

a) Definition

A queue is a linear collection of items, where an item to be added to the queue must be placed at the end of the queue and items that are removed from the queue must be removed from the front. The end of the queue is known as the tail (or rear) and the front of the queue is known as the head. The term enqueue means to add an item to the queue, and the term dequeue means to remove an item from the queue. A queue is referred to as a FIFO data structure: First In, First Out.

b) The operations

There are actually very few operations on a queue.

Operation | Signature | Specification
Enqueue() | queue x item → queue | Add an item to the queue
Dequeue() | queue → item x queue | Remove an item from the queue
Front() | queue → item | Access the first item at the front of the queue
Emptyqueue() | queue → Boolean | Test if the queue is empty
Fullqueue() | queue → Boolean | Test if the queue is full
queueSize() | queue → integer | Return the number of elements in the queue

c) Example

Table 1: Example Queue Operations

Queue Operation | Queue Contents | Return Value
q.isEmpty() | [] | True
q.enqueue(4) | [4] |
q.enqueue('dog') | ['dog',4] |
q.enqueue(True) | [True,'dog',4] |
q.size() | [True,'dog',4] | 3
q.isEmpty() | [True,'dog',4] | False
q.enqueue(8.4) | [8.4,True,'dog',4] |
q.dequeue() | [8.4,True,'dog'] | 4
q.dequeue() | [8.4,True] | 'dog'
q.size() | [8.4,True] | 2

d) Queue implementation

 Array implementation

Since a queue usually holds a number of items of the same type, we could implement a queue with an array. Let's simplify our array implementation of a queue by using an array of a fixed size, MAX_QUEUE_SIZE. One of the things we'll need to keep track of is the number of elements in the queue, since not all the elements in the array may be holding elements of the queue at any given time. So far, the pieces of data we need for our array implementation of the queue are: an array and a count.
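A minimal sketch of such an array implementation for integer items is given below. Besides the array and the count, it assumes a third piece of data, the index of the front item, so that removing an item does not require shifting the others. The type ArrayQueue, the queue_ function names and the value 8 for MAX_QUEUE_SIZE are choices made for this example.

```c
#define MAX_QUEUE_SIZE 8

typedef struct {
    int items[MAX_QUEUE_SIZE];
    int front;    /* index of the oldest item (assumption of this sketch) */
    int count;    /* number of items currently in the queue */
} ArrayQueue;

void queue_init(ArrayQueue *q)           { q->front = 0; q->count = 0; }
int  queue_is_empty(const ArrayQueue *q) { return q->count == 0; }
int  queue_is_full(const ArrayQueue *q)  { return q->count == MAX_QUEUE_SIZE; }
int  queue_size(const ArrayQueue *q)     { return q->count; }

/* Adds item at the rear. Returns 0 if the queue is full, 1 otherwise. */
int queue_enqueue(ArrayQueue *q, int item)
{
    if (queue_is_full(q))
        return 0;
    /* The rear position wraps around to the start of the array. */
    q->items[(q->front + q->count) % MAX_QUEUE_SIZE] = item;
    q->count++;
    return 1;
}

/* Removes the front item into *item. Returns 0 if the queue is empty. */
int queue_dequeue(ArrayQueue *q, int *item)
{
    if (queue_is_empty(q))
        return 0;
    *item = q->items[q->front];
    q->front = (q->front + 1) % MAX_QUEUE_SIZE;
    q->count--;
    return 1;
}
```

Using the array in a circular way lets the free positions at the start of the array be reused after some items have been removed.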

 Linked list implementation

The diagrams in Figures below show a simple queue before and after adding a new item and before and after removing an item. At each point, you can add a new item only at the rear of the queue and can remove an item only from the front of the queue. (Note that the front of the queue, where you delete items, is at the left of the diagrams. The rear of the queue, where you add items, appears to the right.)

Figure a: A simple queue just before a fourth item is added

Figure b: The simple queue after the fourth item is added and before an item is removed

Figure c: The simple queue after an item has been removed
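The situation of the figures can be sketched in C with two pointers, one to the front node and one to the rear node. The names qnode, LinkedQueue, lq_init, lq_enqueue and lq_dequeue are assumptions made for this example, which stores integer items.

```c
#include <stdlib.h>

struct qnode {
    int item;
    struct qnode *next;
};

typedef struct {
    struct qnode *front;   /* items are removed here */
    struct qnode *rear;    /* items are added here */
} LinkedQueue;

void lq_init(LinkedQueue *q) { q->front = q->rear = NULL; }

/* Adds item at the rear. Returns 0 if memory cannot be allocated. */
int lq_enqueue(LinkedQueue *q, int item)
{
    struct qnode *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->item = item;
    n->next = NULL;
    if (q->rear == NULL)
        q->front = n;          /* the queue was empty */
    else
        q->rear->next = n;     /* link after the old rear node */
    q->rear = n;
    return 1;
}

/* Removes the front item into *item. Returns 0 if the queue is empty. */
int lq_dequeue(LinkedQueue *q, int *item)
{
    struct qnode *n = q->front;
    if (n == NULL)
        return 0;
    *item = n->item;
    q->front = n->next;
    if (q->front == NULL)
        q->rear = NULL;        /* the queue became empty */
    free(n);
    return 1;
}
```

Unlike the array implementation, this queue is never full as long as memory is available.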

IV.2.3 The Stack

a) The definition

A stack is a linear collection of similar items, where an item to be added to the stack must be placed on top of the stack and items that are removed from the stack must be removed from the top. The end of the stack where items are added and removed is known as the top. The term push means to add an item to the stack, and the term pop means to remove an item from the stack. A stack is referred to as a LIFO data structure: Last In, First Out.

b) The operations

There are actually very few operations on a stack.

Operation | Signature | Specification
Push() | stack x item → stack | Add an item to the stack
Pop() | stack → item x stack | Remove an item from the stack
Top() | stack → item | Access the item at the top of the stack
emptyStack() | stack → Boolean | Test if the stack is empty
fullStack() | stack → Boolean | Test if the stack is full

c) Using a stack to evaluate postfix

In most programming languages, mathematical expressions are written with the operator between the two operands, as in 1 + 2. This format is called infix. An alternative used by some calculators is called postfix. In postfix, the operator follows the operands, as in 1 2 +.

The reason postfix is sometimes useful is that there is a natural way to evaluate a postfix expression using a stack:

1. Starting at the beginning of the expression, get one term (operator or operand) at a time.

   o If the term is an operand, push it on the stack.

   o If the term is an operator, pop two operands off the stack, perform the operation on them, and push the result back on the stack.

2. When you get to the end of the expression, there should be exactly one operand left on the stack. That operand is the result.
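The two steps above can be sketched in C as follows. The sketch assumes that the terms are separated by spaces, that the operands are non-negative whole numbers and that the only operators are +, -, * and /. The function name eval_postfix and the limits MAX_OPERANDS and 256 are choices made for this example.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_OPERANDS 64

/* Evaluates a postfix expression such as "1 2 +".
   Stores the result in *result and returns 1 on success,
   or returns 0 if the expression is not well formed. */
int eval_postfix(const char *expr, long *result)
{
    long stack[MAX_OPERANDS];
    int top = 0;                   /* number of operands on the stack */
    char buffer[256];
    char *term;

    if (strlen(expr) >= sizeof buffer)
        return 0;
    strcpy(buffer, expr);          /* strtok modifies the text it splits */

    for (term = strtok(buffer, " "); term != NULL; term = strtok(NULL, " ")) {
        if (isdigit((unsigned char)term[0])) {
            if (top == MAX_OPERANDS)
                return 0;
            stack[top++] = strtol(term, NULL, 10);   /* operand: push it */
        } else if (strlen(term) == 1 && strchr("+-*/", term[0]) != NULL) {
            long right, left;
            if (top < 2)
                return 0;          /* not enough operands for the operator */
            right = stack[--top];  /* the second operand is popped first */
            left = stack[--top];
            switch (term[0]) {
            case '+': stack[top++] = left + right; break;
            case '-': stack[top++] = left - right; break;
            case '*': stack[top++] = left * right; break;
            case '/':
                if (right == 0)
                    return 0;      /* division by zero */
                stack[top++] = left / right;
                break;
            }
        } else {
            return 0;              /* unknown term */
        }
    }
    if (top != 1)
        return 0;                  /* exactly one operand must remain */
    *result = stack[0];
    return 1;
}
```

For the expression "5 1 2 + 4 * + 3 -" the stack successively holds 5 1 2, then 5 3, then 5 3 4, then 5 12, then 17, then 17 3, and finally 14, which is the result.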

d) Implementation of a stack

Implementation of a stack can be done using an array or a linked list

1) Implementing a stack with an array:

Since a stack usually holds a bunch of items with the same type, we could implement a stack as an array.

Consider how we could have an array of characters, contents, to hold the contents of the stack, and an integer top that holds the index of the element at the top of the stack.
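One possible C version of this idea is sketched below, with the array contents and the integer top grouped in a structure. The type name CharStack, the stack_ function names and the value 100 for MAX_STACK_SIZE are assumptions made for this example; top is set to -1 when the stack is empty.

```c
#define MAX_STACK_SIZE 100

typedef struct {
    char contents[MAX_STACK_SIZE];
    int  top;      /* index of the element at the top, -1 when empty */
} CharStack;

void stack_init(CharStack *s)           { s->top = -1; }
int  stack_is_empty(const CharStack *s) { return s->top < 0; }
int  stack_is_full(const CharStack *s)  { return s->top == MAX_STACK_SIZE - 1; }

/* Adds item on top of the stack. Returns 0 if the stack is full. */
int stack_push(CharStack *s, char item)
{
    if (stack_is_full(s))
        return 0;
    s->contents[++s->top] = item;
    return 1;
}

/* Removes the top item into *item. Returns 0 if the stack is empty. */
int stack_pop(CharStack *s, char *item)
{
    if (stack_is_empty(s))
        return 0;
    *item = s->contents[s->top--];
    return 1;
}
```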

2) Implementing a stack with a linked list:

Using a linked list is one way to implement a stack so that it can handle essentially any number of elements. Here is what a linked list implementing a stack with 3 elements might look like:
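In C, such a stack could be sketched with a pointer to the first node of the list playing the role of the top. The names snode, ls_push and ls_pop are hypothetical, and the items are integers in this example.

```c
#include <stdlib.h>

struct snode {
    int item;
    struct snode *next;    /* the node just below this one */
};

/* Pushes item on the stack whose top pointer is *top.
   Returns 0 if memory cannot be allocated. */
int ls_push(struct snode **top, int item)
{
    struct snode *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->item = item;
    n->next = *top;    /* the old top is now below the new node */
    *top = n;
    return 1;
}

/* Pops the top item into *item. Returns 0 if the stack is empty. */
int ls_pop(struct snode **top, int *item)
{
    struct snode *n = *top;
    if (n == NULL)
        return 0;
    *item = n->item;
    *top = n->next;
    free(n);
    return 1;
}
```

Both operations work at the head of the list only, so neither needs to walk through the other nodes.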


IV.2.4 Binary Tree

a) Abstract idea of a tree:

A tree is another data structure that you can use to store information. Unlike stacks and queues, which are linear data structures, trees are hierarchical data structures. Here is an example of a tree holding letters:

b) Tree Vocabulary

Let's now introduce some vocabulary with our sample tree. The element at the top of the tree is called the root. The elements that are directly under an element are called its children. The element directly above something is called its parent. For example, a is a child of f and f is the parent of a. Finally, elements with no children are called leaves. A tree can be viewed as a recursive data structure, for it is made up of subtrees.

c) Uses of trees

(1) File systems: a file system can be represented as a tree, with the top-most directory as the root.

(2) Arithmetic expressions: an arithmetic expression can be represented by a tree in which the leaf nodes are the variables/values and the internal nodes are the operations.

(3) Organization chart

(4) Family tree

d) Binary Trees

A tree whose elements have at most 2 children is called a binary tree. For the rest of this example, we will enforce this to be the case.

Since each element in a binary tree can have at most 2 children, we typically name them the left and right child.

e) Tree operations:

As mentioned, there are different kinds of trees (e.g., binary search trees, 2-3 trees, AVL trees, just to name a few). What operations we will need for a tree, and how they work, depends on what kind of tree we use. However, there are some common operations we can mention:

- Add: Places an element in the tree (where elements end up depends on the kind of tree).
- Remove: Removes something from the tree (how the tree is reorganized after a removal depends on the kind of tree).
- IsMember: Reports whether some element is in the tree.

Other operations may be necessary, depending on the kind of tree we use.

f) Tree Traversals

A tree traversal is a specific order in which to trace the nodes of a tree. To perform a traversal of a data structure, we use a method of visiting every node in some predetermined order. Traversals can be used

- to test data structures for equality
- to display a data structure
- to construct a data structure of a given size
- to copy a data structure

There are 3 common tree traversals.

1. in-order: left, root, right
2. pre-order: root, left, right
3. post-order: left, right, root

In order to illustrate a few of the binary tree traversals, let us consider the binary tree below:

1) Preorder traversal: To traverse a binary tree in preorder, the following operations are carried out:

(i) Visit the root,
(ii) Traverse the left subtree, and
(iii) Traverse the right subtree.

Therefore, the preorder traversal of the above tree outputs: 15, 5, 3, 12, 10, 6, 7, 13, 16, 20, 18, 23

2) Inorder traversal: To traverse a binary tree in inorder, the following operations are carried out:

(i) Traverse the left subtree,
(ii) Visit the root, and
(iii) Traverse the right subtree.

Therefore, the inorder traversal of the above tree outputs: 3, 5, 6, 7, 10, 12, 13, 15, 16, 18, 20, 23

3) Postorder traversal: To traverse a binary tree in postorder, the following operations are carried out:

(i) Traverse the left subtree,
(ii) Traverse the right subtree, and
(iii) Visit the root.

Therefore, the postorder traversal of the above tree outputs: 3, 7, 6, 10, 13, 12, 5, 18, 23, 20, 16, 15
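The three traversals can be sketched in C with recursive functions. The sketch assumes that the example tree is the binary search tree implied by the three outputs listed above, in which case it can be rebuilt by adding the keys in the preorder sequence 15, 5, 3, 12, 10, 6, 7, 13, 16, 20, 18, 23. The names tnode, tree_insert, preorder, inorder and postorder are choices made for this example; each traversal stores the visited keys in an array instead of printing them.

```c
#include <stdlib.h>

struct tnode {
    int key;
    struct tnode *left, *right;
};

/* Adds key to a binary search tree: smaller keys go to the left,
   the others to the right. Returns the root of the updated subtree.
   If memory cannot be allocated the key is simply not added. */
struct tnode *tree_insert(struct tnode *root, int key)
{
    if (root == NULL) {
        root = malloc(sizeof *root);
        if (root == NULL)
            return NULL;
        root->key = key;
        root->left = root->right = NULL;
    } else if (key < root->key) {
        root->left = tree_insert(root->left, key);
    } else {
        root->right = tree_insert(root->right, key);
    }
    return root;
}

/* Each traversal writes the keys into out[] starting at position n
   and returns the position that follows the last key written. */
int preorder(const struct tnode *t, int out[], int n)
{
    if (t == NULL)
        return n;
    out[n++] = t->key;                  /* root */
    n = preorder(t->left, out, n);      /* left */
    return preorder(t->right, out, n);  /* right */
}

int inorder(const struct tnode *t, int out[], int n)
{
    if (t == NULL)
        return n;
    n = inorder(t->left, out, n);       /* left */
    out[n++] = t->key;                  /* root */
    return inorder(t->right, out, n);   /* right */
}

int postorder(const struct tnode *t, int out[], int n)
{
    if (t == NULL)
        return n;
    n = postorder(t->left, out, n);     /* left */
    n = postorder(t->right, out, n);    /* right */
    out[n++] = t->key;                  /* root */
    return n;
}
```

The only difference between the three functions is the moment at which the key of the root is recorded: before, between or after the two recursive calls.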

Another example

The 3 different types of traversal

Pre-order Traversal In-order Traversal Post-order Traversal FBADCEGIH ABCDEFGHI ACEDBHIGF

Exercise: Binary Tree Traversal

3) For the following binary tree perform the following:

 Pre-order traversal  In-order traversal  Post-order traversal

Answer:

- Pre-order traversal: GEBDFKMR
- In-order traversal: BDEFGKMR
- Post-order traversal: DBFERMKG

4) Using the following binary tree: what would be the outputs for:

 Pre-order traversal  In-order traversal  Post-order traversal

Answer:

- Pre-order traversal: 7,5,4,2,3,8,9,1
- In-order traversal: 4,2,5,3,7,9,8,1
- Post-order traversal: 2,4,3,5,9,1,8,7

5) For each tree shown in Figure show the order in which the nodes are visited during the following tree traversals:

1. preorder traversal, 2. inorder traversal (if defined), 3. postorder traversal, and 4. breadth-first traversal.

EXERCISES ON BINARY TREES

1. Give the inorder and postorder traversal for the tree whose preorder traversal is A B C - - D - - E - F - -. The letters correspond to labeled internal nodes; the minus signs to external nodes.

2. (Sedgewick, Exercise 5.79). Give the preorder, inorder, postorder, and level-order traversals of the following binary trees.

3. (a) Write a function that counts the number of items in a binary tree. (b) Write a function that returns the sum of all the keys in a binary tree.


(c) Write a function that returns the maximum value of all the keys in a binary tree. Assume all values are nonnegative; return -1 if the tree is empty.
