Lab 01: String
Williams CS 136, Fall 2015, Prof. McGuire
Homework due Wed. Sept. 23
Project due Mon. Sept. 28 10pm; Limit: 5 hrs.

1 Educational Goals

• Develop independence by locating technical information online and interpreting it
• Polish CS134-level skills: iteration, string manipulation, simple implementation design
• Develop more sophisticated programming skills by implementing a program from a rigorous specification without detailed hints or a lab-specific guide
• Learn manual (new and delete) management of the memory resource
• Discover some of the awkward-but-unavoidable parts of C++ (to avoid being caught by them later!)
• Begin to consider how interfaces are designed around efficient implementation strategies

2 Homework

Recall the directory naming and homework submission specification from the website.

Look up the specification for C++’s std::string class online (it is not in the C textbook because it is a C++ class). It is more than a little confusing; take notes on what you don’t understand to ask about later, but skip over a lot of it to get to the parts you need for this homework.

Paraphrase the specification of the following methods in a way that makes sense to you, and then write out pseudocode for how to implement them:

1. size_t std::string::find(char c, size_t pos = 0) const;

2. std::string std::string::substr(size_t pos = 0, size_t len = npos) const;

Be careful! I chose these methods for the homework because they are the trickiest to get right: it is very easy to be “off by one” when iterating or to forget to handle corner cases. I hope that by thinking them through now you will be able to implement and debug them more quickly later in lab.

Answer the following questions in one or two sentences each:

3. What tasks must a C++ destructor perform?

4. What is the difference between memory allocated on the heap and memory allocated on the program stack?

5. What are some important differences between a “C string” (char*) and “C++ string” class?

6. Why is it sometimes advantageous to make a class immutable by protecting its state and creating only const methods?

7. What is a faster way to find the first instance of a character in a string than iterating through each element in turn? (This calls for significant creativity, and there is no single “right” answer to this.)

3 Base Specification

You should start implementing this specification before coming to your scheduled lab–enter class with questions ready and hit the ground running. Just typing in the starting code will probably take about 20 minutes, so at a minimum, you should have read the whole lab and typed in the code before your lab period or you’ll lose your best opportunity to discuss the lab with me and the TAs. This week you don’t need subdirectories. Specification parts are numbered so that I can refer to them in grading, and to identify all of the elements that you must satisfy. They are not meant to imply steps that must be taken in order.

1. List the beginning and ending time of each of your programming sessions, and the total time spent on this assignment outside of your scheduled lab session (these are worth points!). This is a timed assignment. You may not spend more than five hours on it. Time spent in office hours and before your scheduled lab each week does not count towards this limit.

2. Implement a class named String in a header named String.h that contains no executable code and a source file named String.cpp. These two files cannot depend on any other files in your solution.

3. Provide the following public API for String (i.e., put this in String.h...and use a header guard):

class String {
public:
    static const size_t npos = -1;

    String();
    String(const char* cstr);
    String(const String& other);
    ~String();

    String& operator=(const String& other);

    size_t size() const;
    bool empty() const;
    char operator[](size_t i) const;

    bool operator==(const String& other) const;
    bool operator!=(const String& other) const;
    String operator+(const String& other) const;

    const char* c_str() const;
    size_t find(char c, size_t pos = 0) const;
    String substr(size_t pos, size_t len = npos) const;
};

See the C++ standard library std::string documentation for specifications of these methods. You are not required to implement any other std::string methods. You may implement any private helper methods or public methods to support testing that you wish. You are not required to throw exceptions on bad arguments...you may simply assert that the arguments are well-formed if you wish.

4. Use only C++ (i.e., new and delete; not malloc and free)

5. Do not use any C or C++ standard library routines in String.h and String.cpp (e.g., do not use std::vector, std::string, strncpy, or strcmp)

6. The first line of String.h must be exactly:

// username, Your Full Name, email

where username, Your Full Name, and email are the appropriate fields for you.

I’ll evaluate your design, documentation, style, and correctness. Be sure to document your program with appropriate variable names, helper methods, and comments, including a general description at the top of the file.

4 Advanced Specification

1. Implement size_t String::find(const String& s, size_t pos = 0) const;

2. Implement size_t String::find_first_of(const String& s, size_t pos = 0) const;

5 Advice

Because this is a timed assignment, you should use your time efficiently within each programming session and treat it somewhat like an exam. Don’t waste time socializing, half-working, or checking Facebook. Take a break whenever you wish to stop the clock, but when you’re at a computer you should be 100% focused.

The best way to prepare for each assignment is to work out an implementation plan on paper, complete the required reading, and talk to the professor and TAs ahead of time.

The specification only says what externally visible, mandatory parts are required for a class. You’ll almost always need to add private or protected state and helper methods to implement the specification. I don’t tell you about those in the specification because:

1) that’s not part of the class’s contract
2) there is no single right answer; you have the freedom to design your own efficient implementation.

For example, in this assignment, you’ll need to have a pointer to some characters stored in the heap, which means that you’ll have a member variable declaration that looks something like char* str; and somewhere else say something like str = new char[n]; to allocate it and delete[] str; to free it.

Recall that header guards are macro definitions that prevent a header (“.h”) file from being included twice (it is OK if you don’t know what that means). For String.h, they will look like:

#ifndef String_h
#define String_h

#include <stddef.h> // for size_t

class String {
    ...
};

#endif

Don’t put header guards on source (“.cpp”) files. They are never included, so they don’t need them.

Recall that when implementing a method in a source file you must include the relevant headers and use the class name and scope operator. Your String.cpp should look something like the following (without the comments...I put those in so that you’d know why I added those headers).

#include <cstdlib>   // for NULL
#include <new>       // for new and delete
#include <algorithm> // for std::min
#include <cassert>   // for assert
#include "String.h"

String::String() { ... }

...

size_t String::size() const { ... }

...

You’ll want a separate main.cpp file with some debugging routines and tests that looks something like:

#include <stdio.h> // for printf
#include "String.h"

int main(const int argc, const char* argv[]) {
    String a("Hello");
    String b(" World");

    printf("a= \"%s\"\n", a.c_str());

    // size() returns a size_t, which printf formats with %zu (not %d)
    printf("a.size()=%zu\n", a.size());

    printf("a+b= \"%s\"\n", (a + b).c_str());
    ...
    return 0;
}

Do not put a main function definition in any other file...it will prevent me from testing your program effectively.

We’re implementing a C++ string on top of C strings (that’s why the method to return the value used for printing is called “c_str”: it converts to a C string for the C printf routine). A C string is not actually much of a data structure. It is just a pointer to a character (or a set of characters) in memory. It doesn’t even know how long it is. By convention, C strings end with a “null” character, which is written ’\0’. They therefore contain one more character than the apparent length of the string. When constructing your C++ string from a C string, you have to search for that null character to determine the length. When returning a C string, you have to ensure that the pointer you’re revealing leads to an array of characters that ends in a null character.

I added a private constructor String(size_t len) that creates an empty string of length len. This was very helpful for implementing the other methods that need to return a string, such as operator+. With this constructor, I was able to create the result string (on the stack, not using new!), mutate it as I needed, and then return it.

The assignment operator is a strange thing. It is like a copy constructor, but it runs on an object that has already been created. If your internal fields are m_size and m_letter, then the implementation might look like:

String& String::operator=(const String& other) {
    // check for assignment to self
    if (m_letter != other.m_letter) {
        delete[] m_letter;
        m_size = other.m_size;
        // the +1 leaves room for the terminating '\0'
        m_letter = new char[size() + 1];
        // size_t index avoids a signed/unsigned comparison with size()
        for (size_t i = 0; i < size() + 1; ++i) {
            m_letter[i] = other.m_letter[i];
        }
    }
    return *this;
}

Note that it returns a reference to the object referenced by this. That is how mutating operators +=, -=, etc. operate, so the last line is always return *this;.

Some common mistakes with manual memory management are failing to deallocate memory, aliasing memory between different strings, and deallocating memory that is still in use. These are the reasons for most professional program crashes and computer security holes. To avoid them, good programmers use standard library classes and avoid manual memory management as much as possible.

So, be extra careful about memory management. Use a lot of assert statements in String.cpp. Write lots of tests in your main.cpp, including corner cases such as empty strings or looking at the last character of a string.

Assertions can’t catch all such errors, however. When you return the value from c_str, it can’t possibly be a newly-allocated char* array, because there would be no subsequent code to deallocate it. There’s nothing that you can easily assert to identify this kind of error; you just have to audit your code carefully. I recommend making a diagram of memory and pointers that corresponds to your tests. By the end of the tests, there should be no memory allocated, and two strings should never reference the same block of memory.

There’s a subtle issue in that String should have an overloaded assignment operator so that statements such as String a("hi"); String b; b = a; don’t produce two strings that are referencing the same underlying memory. Technically, relying on the compiler-generated assignment operator in a class that declares its own destructor or copy constructor is deprecated in the latest version of C++ (C++11) and should produce a warning. It would be much better if this were an error. This week is the exception: I’m not requiring you to implement an assignment operator for most other projects in CS136, but be aware that this issue exists when you use C++ in the future.

If manual memory management is so error-prone, why am I teaching you how to do it? Because doing so will make you both more knowledgeable about computer science and better able to use library routines effectively. In general, you’re using the language of C++ now to be very close to the machine (even when this is tedious or dangerous) so that you can deeply understand how computers operate. That will let you write much more efficient high-level code in Python later. Even though memory management and other low-level tasks are handled for you in another language, they still incur costs, and like any good manager (or citizen of a democracy), you bear responsibility for the side effects of work that you delegate.
