Intro to Data Structures and the Standard Template Library (STL)
COMS W3101: Programming Languages (C++)
Instructor: Austin Reiter
Lecture 6

Outline for Today (Last Lecture)
• Intro to Data Structures
• Standard Template Library (STL)

Last Homework
• HW 4 questions?
• Show example robot output

DISCLAIMER
• Today we are condensing what is usually a semester-long course into two hours!
• Take it with a grain of salt
  – I’m just trying to introduce the tools, show what’s out there, and hope you play with them on your own
• We’ve spent the entire course on the “rules and practices” of C++
  – STL is an entire other area of study of C++
  – I wish we did an entire 6-week course on STL alone!
  – Now that you know templates and the rules of objects, hopefully you can appreciate the power of the library.

Data Structures
• Up to now we’ve studied fixed-size data structures (arrays)
• More useful are dynamically sized data structures, which grow and shrink during execution (their size is unknown at compile time)
• Also, these data structures are arranged (conceptually) differently from arrays
  – Ex: the data doesn’t need to be arranged contiguously in memory. This can help speed up certain operations (sorting, searching, reordering, etc.)

Data Structures
• These data structures are implemented independently of type
  – Templates!
• The concept of how the data is arranged is independent of what is being stored
  – However, as usual, you must consider the operations performed on your data in the storage container.
  – Ex: many containers keep their elements sorted in some way, so your type must have a concept of “less than”

Data Structures
• Vector: just like an array, but can grow and shrink dynamically
• Linked List: collection of data items logically “lined up in a row”
  – We can insert and remove anywhere in the list
• Stack: list of items arranged in a last-in, first-out ordering.
  – Insertions and removals are only made at the top of the stack
  – Very important for compilers and operating systems
    • Think about memory allocations: Stack-vs-Heap
• Queue: the opposite of a stack; arranged in a first-in, first-out ordering.
  – Insertions are made at the back and removals are made at the front
  – Like a “waiting line”

Data Structures
• Binary Tree: useful for high-speed searching and sorting of data.
  – Often useful for representing file directories
• In the data structures we present today, we use classes, class templates, inheritance and many other concepts we’ve already learned to create and package reusable and maintainable data structures!

STL
• This prepares us for using the Standard Template Library (STL), which is a major part of the C++ Standard Library.
• Once we understand the structures and the concepts they represent, we can make more informed decisions about which are best for our applications
• They are all implemented as templates

Self-Referential Classes
• A self-referential class contains a pointer member that points to an object of the same class type:

    class Node
    {
    public:
        Node( int );                 // constructor
        void setData( int );         // set data member
        int getData() const;         // get data member
        void setNextPtr( Node * );   // set pointer to next Node
        Node* getNextPtr() const;    // get pointer to next Node
    private:
        int data;                    // data stored in this Node
        Node* nextPtr;               // pointer to another object of same type
    };

Self-Referential Classes
• The member nextPtr is a link. It can “tie” an object of type Node to another object of the same type.
• These types of objects can be linked together to form useful data structures such as lists, queues, stacks and trees

[Figure: two self-referential class objects, holding the values 15 and 10, linked together to form a list.]
[Figure: two linked nodes holding 15 and 10; an annotation marks a NULL “next” Node pointer, which usually represents the end of a data structure.]

Pointers
• This should start to answer how pointers are useful beyond simple memory allocation and data passing

Memory Allocation
• Dynamic data structures require dynamic memory allocation (both growing and shrinking), which enables programs to hold different amounts of memory at run time.
• The data structure must track how many elements it currently holds and how best to re-allocate so as to reduce calls to new and delete
  – For example, STL implementations often grow a container to a multiple (often 2x) of its current capacity when it needs more memory, thereby reducing (over time) the number of times it needs to re-allocate
• However, this can be wasteful when the container grows to larger and larger sizes!

Linked Lists
• A linear collection of self-referential class objects, called nodes, connected by pointer links (hence the term “linked list”)
• A linked list is accessed via a pointer to the list’s first node
  – Each subsequent node is accessed via the link-pointer member stored in the previous node
  – The link pointer in the last node is set to NULL, indicating the end of the list

Linked Lists
• They are dynamic in the sense that new nodes are created as needed
• A node can contain any type of data
• Linked lists, along with stacks and queues, are linear data structures, whereas trees are nonlinear data structures
  – More on these in a bit

Linked Lists
• Linked lists have an advantage over arrays when the number of data elements to be represented at one time is unpredictable
  – The length of the list can increase/decrease as necessary
  – The length of a built-in C++ array is fixed at compile time, so the array can become “full”
  – Linked lists only become full if the system runs out of memory

Linked Lists
• However, the data in a linked list is not stored contiguously
  – This means accessing arbitrary elements of a list is not as efficient as in a vector or array
  – They are accessed via pointers from the previous element (i.e., no indices)
• The nodes are logically contiguous, even though they are not stored contiguously in memory

[Figure: a linked list with firstPtr and lastPtr; nodes H, D, …, Q.]

Linked Lists
• Usually we provide functions to add elements to the front or to the back, as well as to remove from the front or back
• We provide pointers (referred to as iterators) to the beginning and end of the list and we can go through the nodes one-by-one
• The kind of list described so far is called a singly linked list
  – Each node contains a pointer to the next node “in sequence”
• We can also construct a circular, singly linked list
  – The last node’s pointer is not NULL. It points back to the first node

Linked Lists
• A doubly linked list allows traversal both forwards and backwards
  – Each node has separate pointers to the “next” and “previous” nodes
• And finally, we can construct a circular, doubly linked list
  – Same as a doubly linked list, but the forward pointer of the last node points to the first node and the backward pointer of the first node points to the last node

[Figure: a linked-list diagram with firstPtr and lastPtr; nodes 12, 7, …, 5.]

Stacks
• We previously implemented a fixed-size stack using an array
• We can also do it using a pointer-based linked-list implementation
• A stack allows nodes to be added and removed only at the top. It is referred to as a LIFO data structure, for last-in, first-out.
• It can be thought of as a constrained version of a linked list
  – The link member in the last node of the stack is set to NULL to indicate the bottom of the stack

Stacks
• The push() method inserts a new node at the top
• The pop() method removes a node from the top
• By using a linked list as the implementation:
  – A push inserts data at the front of the list
  – A pop removes an element from the front of the list
  – Nothing else changes
  – Reusability!

Queues
• A queue is similar to a stack, but it works like a checkout line at a supermarket.
  The first person in line is the first person processed
• Queue nodes are removed from the head (front) of the queue and are inserted at the tail (back) of the queue
• It is referred to as a FIFO data structure, for first-in, first-out ordering
• The insert operation is often referred to as enqueue. The remove operation is often referred to as dequeue.

Queues
• We can also use a linked list to implement a queue:
  – The enqueue inserts elements at the back of the list
  – The dequeue removes elements from the front of the list
  – Nothing else changes
  – Reusability!

Linear Data Structures
• Vectors are fairly straightforward, as they are simply resizable arrays
  – We’ll show some concrete examples in STL
• Let’s look at a non-linear data structure…

Trees
• A tree is a nonlinear, two-dimensional data structure; tree nodes contain 2 or more links
• In a binary tree, all nodes contain two links
  – None, one or both of which may be NULL

[Figure: a binary tree. The root node pointer points to node B; the left subtree of the node containing B starts at A and the right subtree starts at D; node C lies lower in the tree.]

Trees
• Node B is the root of the tree
• Each link in the root node refers to a child (nodes A and D)
  – The children of the same node are called siblings

Binary Search Tree
• A binary search tree (BST) has the characteristic that the values in any left subtree are less than the value in that subtree’s parent node.
  – Similarly, all values in any right subtree are greater than the value in that subtree’s parent node
• The shape of a BST can vary depending on the order in which the data is inserted into the tree!

[Figure: a binary search tree containing the values 47, 25, 77, 11, 43, 65, 31, 44, 68.]

Binary Search Trees
• We could spend a few lectures on BSTs.
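
To make the point about insertion order concrete, here is a minimal sketch of BST insertion. It is not the lecture's code: the names TreeNode, insert, height, inOrder and buildTree are hypothetical, the tree stores plain int values, duplicates are simply ignored, and std::unique_ptr is used so that nodes clean up after themselves.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Hypothetical node type for this sketch (not taken from the lecture code).
struct TreeNode {
    int data;
    std::unique_ptr<TreeNode> left;   // subtree holding smaller values
    std::unique_ptr<TreeNode> right;  // subtree holding larger values
    explicit TreeNode(int value) : data(value) {}
};

// Insert value into the subtree rooted at node; duplicates are ignored.
void insert(std::unique_ptr<TreeNode>& node, int value) {
    if (!node) {
        node.reset(new TreeNode(value));  // empty spot found: place value here
    } else if (value < node->data) {
        insert(node->left, value);
    } else if (value > node->data) {
        insert(node->right, value);
    }
}

// Number of levels in the tree (0 for an empty tree).
int height(const std::unique_ptr<TreeNode>& node) {
    if (!node) return 0;
    return 1 + std::max(height(node->left), height(node->right));
}

// In-order traversal appends the stored values to out in ascending order.
void inOrder(const std::unique_ptr<TreeNode>& node, std::vector<int>& out) {
    if (!node) return;
    inOrder(node->left, out);
    out.push_back(node->data);
    inOrder(node->right, out);
}

// Build a tree by inserting the values in exactly the order given.
std::unique_ptr<TreeNode> buildTree(const std::vector<int>& values) {
    std::unique_ptr<TreeNode> root;
    for (int value : values) {
        insert(root, value);
    }
    return root;
}
```

With this sketch, buildTree({47, 25, 77, 11, 43, 65, 31, 44, 68}) gives a tree of height 4, while inserting the same nine values in sorted order, buildTree({11, 25, 31, 43, 44, 47, 65, 68, 77}), gives a degenerate chain of height 9 that behaves like a linked list. Both trees produce the same ascending sequence from inOrder.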