<<

Designing and Implementing a class Behavior of MultiSet: methods?

● Consider a simple class to implement sets (no duplicates) or ● What accessor functions do we need (these are const)? multisets (duplicates allowed) ➤ Size of ? Printing set? in set? ➤ Initially we’ll store strings, but we’d like to store anything, ➤ What does Print generalize to? how is this done in C++? • For Iterator class, see book, requires friend class ➤ How do we design: behavior or implementation first? ➤ How do we test/implement, one method at a time or all at ● What mutator functions do we need (non const)? once? ➤ Insert an element into the set ➤ Make a set empty ● Tapestry book shows string set implemented with singly ➤ Erase an element (all occurrences or one?) linked list and header node ➤ Templated set class implemented after string set done ● ➤ What constructor(s) and other, similar standard methods are Detailed discussion of implementation provided needed ● We’ll look at a doubly-linked list implementation of a ➤ Copy constructor, assignment operator, destructor MultiSet

CPS 100 4.1 CPS 100 4.2

Draft of multiset behavior (multiset.h) From Behavior to Implementation

class MultiSet ● We’ll use a linked list { public: ➤ New nodes added to back, but this will change in other MultiSet(); // construct a multiset versions: add to front, move to front, tree, … ➤ // mutators Use a header node (not in set), last node (in set) void insert(const string & word); ➤ To print, count, etc. we’ll use an internal iterator void clear(); • Pass to the set the operation you want to perform on all set elements // accessor void apply(MSApplicant & app) const; • What’s good/bad about this compared to iterators? int size() const; int count(const string & key) const; ● Use doubly-linked list, where is Node declaration found? private: ➤ Private section means clients cannot access, ok? }; ➤ Notice return type of private Find(..) in multiset.cpp ● What’s missing? What is apply(..)

CPS 100 4.3 CPS 100 4.4 What’s easy, hard, first? Iterators and alternatives

● Searching the list is straightforward, visit all nodes MultiSet ms; ➤ We need to search for insert and for count, how can we ms.insert("bad"); factor out common code? ms.insert("good"); ms.insert("ugly"); ➤ If we implemented remove/erase, we’d need to search too. ms.insert("good");

● Once we implement insert, how do we test? MultiSetIterator it(ms); ➤ We need to print, we should implement print() even if for(it.Init(); it.HasMore(); it.Next()) not general, we’ll toss it later, why is this ok? { cout << it.Current() << endl; } ● What’s after insert? We’ll look to the accessor functions ● What’s printed? What does Current() return? Internals? ➤ count is simple, why? ➤ What does iterator class have access to? ➤ What is apply(..) about? ➤ What happens when MultiSet passed to iterator? Stored?

CPS 100 4.5 CPS 100 4.6

Iterators and alternatives, continued Using a common interface

● Iterators require class tightly coupled to container class ● We’ll use a named apply(…) ➤ Friend class often required ➤ It will be called for each element in a MultiSet ➤ Const can be a problem, but surmountable • An element is a (string, count) pair ➤ Encapsulated in a class with other behavior/state ● Alternative, pass function into MultiSet, function is applied to every element of the set void apply(const string& s, int count) const ➤ { How can we pass a function? cout << count << "\t" << s << endl; ➤ What’s potential difference compared to Iterator class? }

● To pass a function, we’ll put the function in a class and pass // alternative an object void apply(const string& s, int count) const { ➤ Must adhere to naming conventions since written in client myTotal += count; code, called in MultiSet code }

CPS 100 4.7 CPS 100 4.8 Interfaces and Inheritance Multisets, function objects, inheritance

● Programming to a common interface is a good idea ● Interface used by MultiSet objects, apply function to every ➤ I have an iterator named FooIterator, how is it used? object in the set • Convention enforces function names, not the compiler ➤ string and count of a set element are passed to apply(..) ➤ I have an ostream object named out, how is it used? • Inheritance enforces function names, compiler checks class MSApplicant { public: ● Design a good interface and write client programs to use it virtual ~MSApplicant(){} ➤ Change the implementation without changing client code virtual void apply(const string & word, ➤ Example: read a stream, what’s the source of the stream? int count) = 0; • file, cin, string, web, … }; ● Virtual means “inheritance works”, function called ● C++ inheritance syntax is cumbersome, but the idea is simple: determined at run-time, not compile-time ➤ Design an interface and make classes use the interface ➤ The =0 syntax means this that subclasses must implement ➤ Client code knows interface only, not implementation the function --- subclass implements the interface

CPS 100 4.9 CPS 100 4.10

What is a function object? How does interface inheritance help?

● Encapsulate a function in a class, enforce interface using ● MultiSet code uses interface only to process all set elements inheritance or templates ➤ Class has state, functions don’t (really) void MultiSet::apply(MSApplicant & app) const ➤ Sorting using different comparison criteria as in extra- // postcondition: app.apply called for all credit for Anagram assignment // elements in the set { Node * current = myFirst->next; // skip header ● In C++ it’s possible to pass a function, actually use pointer to while (current != 0) function { ➤ Syntax is awkward and ugly app.apply(current->myKey, current->myCount); ➤ Functions can’t have state accessible outside the function current = current->next; (how would we count elements in a set, for example)? } ➤ Limited since return type and parameters fixed, in classes } can add additional member functions ● How do we count # elements in a set? # distinct elements?

CPS 100 4.11 CPS 100 4.12 Why inheritance? Guidelines for using inheritance

● Add new shapes easily without ● Create a base/super/parent class that specifies the behavior shape changing code that will be implemented in subclasses ➤ Shape * sp = new Circle(); ➤ Functions in base class should be virtual ➤ Shape * sp2 = new Square(); • Often pure virtual (= 0 syntax), interface only ● abstract base class: mammal ➤ Subclasses do not need to specify virtual, but good idea ➤ interface or abstraction • May subclass further, show programmer what’s going on MultiSetBack, MultiSetTable ➤ pure virtual function ➤ Subclasses specify inheritance using : public Base MultiSet ● concrete subclass • C++ has other kinds of inheritance, stay away from these ➤ implementation User’s eye view: think and ➤ Must have virtual destructor in base class ➤ provide a version of all program with abstractions , realize pure functions different, but conforming ● Inheritance models “is-a” relationship, a subclass is-a parent- ● “is-a” view of inheritance implementations class, can be used-as-a, is substitutable-for ➤ Substitutable for, usable in ➤ Standard examples include animals and shapes all cases as-a

CPS 100 4.13 CPS 100 4.14

Inheritance guidelines/examples See students.cpp, school.cpp

● Virtual function binding is determined at run-time ● Base class student doesn’t have all functions virtual ➤ Non-virtual function binding (which one is called) ➤ What happens if subclass uses new name() function? determined at compile time • name() bound at compile time, no change observed ➤ Can’t change which function called if compile-time determined ● How do subclass objects call parent class code? ➤ Small overhead for using virtual functions in terms of speed, design flexibility replaces need for speed ➤ Use class::function syntax, must know name of parent class • Contrast Java, all functions “virtual” by default ● In a base class, make all functions virtual ● Why is data protected rather than private? ➤ Allow design flexibility, if you need speed you’re wrong, ➤ Must be accessed directly in subclasses, why? or do it later ➤ Not ideal, try to avoid state in base/parent class: trouble ● In C++, inheritance works only through pointer or reference • What if derived class doesn’t need data? ➤ If a copy is made, all bets are off, need the “real” object

CPS 100 4.15 CPS 100 4.16 Consider a modification to MultiSet How do we print all values in a tree?

● Instead of using prev and next to point to a linear ● When is root printed? arrangement, use them to divide the universe in half ➤ After left subtree, before right subtree. ➤ Similar to binary search, everything less goes left, everything greater goes right void Visit(Node * t) “llama” { if (t != 0) “giraffe” “tiger” { Visit(t->prev); cout << t->info << endl; “jaguar” ➤ How do we search? “elephant” “monkey” Visit(t->next); “llama” ➤ } How do we insert? “giraffe” “hippo” “leopard” “pig” } “tiger”

“elephant” “jaguar” “monkey”

➤ How are lists and trees related? ● Inorder traversal “hippo” “leopard” “pig”

CPS 100 4.17 CPS 100 4.18

Insertion and Find? Complexity? Binary Trees

● How do we search for a value in a tree, starting at root? ● Linked lists have efficient insertion and deletion, but ➤ Can do this both iteratively and recursively, contrast to inefficient search printing which is very difficult to do iteratively ➤ arrays: search is efficient, insertion and deletion are not ➤ How is insertion similar to search? ● Binary trees are structures that can be used to yield efficient insertion/deletion and search ● What is complexity of print? Of insertion? ➤ trees used in many contexts, not just for searching, e.g., ➤ Is there a worst case for trees? expression trees ➤ ➤ Do we use best case? Worst case? Average case? insertion is as efficient as binary search in array, insertion/deletion as efficient as linked list (once node found) ● How do we define worst and average cases ➤ binary trees are inherently recursive, difficult to process ➤ What about add-to-back MultiSet, add-to-front? Move-to- trees non-recursively, but possible (recursion never front? required, but often makes coding/algorithms simpler)

CPS 100 4.19 CPS 100 4.20 Binary trees (continued) Binary trees (continued)

● Binary tree is a structure: ● Trees can have many shapes: short/bushy, long/stringy ➤ empty ➤ if height is h, number of nodes is between h and 2h-1 ➤ root node with left and right subtrees ➤ single node tree: height = 1, if height = 3 ➤ terminology: parent, children, leaf node, internal node, depth, height, path • link from node N to M then N is parent of M –M is child of N A • leaf node has no children B C – internal node has 1 or 2 children ➤ C++ implementation, similar to doubly-linked list • path is sequence of nodes, N1, N2, … Nk D E F struct Tree –Ni is parent of Ni+1 { – sometimes edge instead of node G string info; depth (level) of node: length of root-to-node path • Tree * left; – level of root is 1 Tree * right; • height of node: length of longest node-to-leaf path }; – height of tree is height of root

CPS 100 4.21 CPS 100 4.22

Tree functions Tree traversals

● Compute height of a tree, what is complexity? ● Different traversals useful in different contexts ➤ Inorder prints search tree in order int height(Tree * root) { • Visit left-subtree, process root, visit right-subtree if (root == 0) return ; else ➤ Preorder useful for reading/writing trees { return 1 + max(height(root->left), height(root->right) ); • Process root, visit left-subtree, visit right-subtree } } ➤ Postorder useful for destroying trees ● Modify function to compute number of nodes in a tree, does • Visit left-subtree, visit right-subtree, process root complexity change? “llama” ➤ What about computing number of leaf nodes? “giraffe” “tiger”

“elephant” “jaguar” “monkey”

CPS 100 4.23 CPS 100 4.24 Balanced Trees and Complexity What is complexity?

● A tree is height-balanced if ● Assume trees are “balanced” in analyzing complexity ➤ Left and right subtrees are height-balanced ➤ Roughly half the nodes in each subtree ➤ Left and right heights differ by at most one ➤ Leads to easier analysis

● How to develop ? ➤ What is T(n)? ➤ bool isBalanced(Tree * root) What other work is done? { if (root == 0) return true; else ● How to solve recurrence relation { return ➤ Plug, expand, plug, expand, find pattern isBalanced(root->left) && isBalanced(root->right) && ➤ A real proof requires induction to verify that pattern is abs(height(root->left) – height(root->right)) <= 1; correct } }

CPS 100 4.25 CPS 100 4.26

sidebar: solving recurrence Recognizing Recurrences

T(n) = 2T(n/2) + O(n) T(¨) = 2T(fi/2) + ˝ ● Solve once, re-use in new contexts T(1) = 1 ➤ T must be explicitly identified what about 2n? 3n? ➤ T(n) = 2[2T(n/4) + n/2] + n n must be some measure of size of input/parameter = 4 T(n/4) + n + n • T(n) is the time for quicksort to run on an n-element vector = 4[2T(n/8) + n/4] + 2n = 8T(n/8) + 3n T(n) = T(n-1) + O(n) sequential search O( ) = ... eureka! T(n) = T(n/2) + O(1) binary search O( ) = 2k T(n/2k) + kn T(n) = 2T(n/2) + O(1) tree traversal O( ) let 2k = n k = log n, this yields 2log n T(n/2log n) + n(log n) T(n) = 2T(n/2) + O(n) quicksort O( ) n T(1) + n(log n) O(n log n) ● Remember the algorithm, re-derive complexity

CPS 100 4.27 CPS 100 4.28