
ASSIGNMENT #2: C++ BASICS CS246, FALL 2020

Assignment #2: C++ basics

Due Date 1: Wednesday, October 7, 2020, 5:00 pm
Due Date 2: Wednesday, October 21, 2020, 5:00 pm
Online Quiz: Wednesday, October 28, 2020, 5:00 pm

Topics that must have been completed before starting Due Date 1:
1. Software Testing
2. produceOutputs and runSuite from A1

Topics that must have been completed before starting Due Date 2:
1. C++: Introduction to C++
2. Preprocessing and Compilation

Learning objectives:
• C++ I/O (standard, file streams) and output formatting
• argv and argc
• C++ and stringstreams
• separate compilation and Makefiles

• Questions 1a, 2a, and 3a are due on Due Date 1; questions 1b, 2b, 3b, 4b, 5 and 6 are due on Due Date 2. You must submit the online quiz on Learn by the Quiz date.

• On this and subsequent assignments, you will take responsibility for your own testing. This assignment is designed to get you into the habit of thinking about testing before you start writing your program. If you look at the deliverables and their due dates, you will notice that there is no C++ code due on Due Date 1. Instead, you will be asked to submit test suites for C++ programs that you will later submit by Due Date 2. Test suites will be in a format compatible with that of the latter questions of Assignment 1, so if you did a good job writing your runSuite script, that experience will serve you well here.

• Design your test suites with care; they are your primary tool for verifying the correctness of your code. Note that test suite submission zip files are restricted to contain a maximum of 40 tests. The size of each input (.in) file is also restricted to 300 bytes, and each output file (.out) is restricted to 1,000 bytes. This is to encourage you not to put all of your testing eggs in one basket.

• You must use the standard C++ I/O streaming and memory management (MM) facilities on this assignment; you may not use C-style I/O or MM. Concretely, you may #include the following C++ libraries (and no others!) for the current assignment: iostream, fstream, sstream, iomanip, and string. Marmoset will be set up to reject submissions that use C-style I/O or MM, or libraries other than the ones specified above.

• We will manually check that you follow a reasonable standard of documentation and style, and verify any assignment requirements that are not automatically enforced by Marmoset. Code to a standard that you would expect from someone else if you had to maintain their code. Further comments on coding guidelines can be found here: https://www.student.cs.uwaterloo.ca/~cs246/F20/codingguidelines.shtml

• We have provided some code and sample executables in the subdirectory codeForStudents under the appropriate subdirectories. These executables have been compiled in the CS student environment and will not run anywhere else.


• You may not ask public questions on Piazza about what the programs that make up the assignment are supposed to do. A major part of this assignment involves designing test cases, and questions that ask what the programs should do in one case or another will give away potential test cases to the rest of the class. Questions found in violation of this rule will be marked private or deleted; repeat offences could be subject to discipline.

Coding Assessment

Questions 1 to 4 are part of the coding assessment, and may be publicly discussed on Piazza so long as solutions are neither discussed nor revealed.

Question 1

(40% of DD1; 15% of DD2) In this question, you will write a C++ program called lineWrap, whose command-line syntax looks like this:¹

    lineWrap [-n maxLineLength] [-c censorWord] [inputFile]

This program reads in lines of text and prints them to cout, with at most maxLineLength characters in each printed line; any remaining text is printed on the following lines (each again to a maximum of maxLineLength characters). The default value of maxLineLength is 20, but this can be overridden at the command-line, as we’ll see below. Note that you should preserve all white space in the input, except for the newline characters. For example, if the input file looks like this (here, we’ll use a period instead of actual spaces, to make them more obvious):

    a234567890b234567890c234567890d234567890e234567890
    slings...and.
    .....arrows...of...outrageous

then we expect the output to look like this:

    a234567890b234567890
    c234567890d234567890
    e234567890
    slings...and.
    .....arrows...of...o
    utrageous

Note that the first line of input was 50 characters long, so we had to use three lines of output to print it all; the second input line was 13 characters long, so it fit onto one line of output.

You can choose to override the maxLineLength by using the -n option at the command-line; this value must be a positive integer.

The program can optionally take a censorWord (a string) as an argument; if one is specified on the command-line, then the censorWord must not appear in the output stream. The removal should be done before you try to print the line, so the output will line up as if the censorWord had never been in the input file in the first place. For example, consider this input file:

    a23456789gong0b234567890gonggongc234567890d234567890e234567890
    ..gong..gogongng..

If there is no censorWord given, then the output should be this:

    a23456789gong0b23456
    7890gonggongc2345678
    90d234567890e2345678
    90
    ..gong..gogongng..

¹ Note that, as per convention, items in square brackets are optional; if you call lineWrap with no arguments, it will take input from cin.


But if the censorWord is set to gong, then the output should be this:

    a234567890b234567890
    c234567890d234567890
    e234567890
    ......

The processing of the second line of input shows that (a) whitespace around a censorWord is preserved in the output and (b) if deleting one instance of the censorWord creates another instance of the censorWord, you have to delete that second instance (and third and fourth ...) also. That is, gogongng becomes gong after deleting the gong in the middle, but we’re not done yet; we have to keep deleting the censorWord until no instances remain. If the censorWord is the empty string "", then no censoring should be done; note that a good design may be able to implement this case without any special handling.

If an input file name is specified on the command-line, then the input text should be taken from there; otherwise, assume that the input text will come from cin.

You need to perform error checking on the command-line arguments:

• If the -n option is used, it must be followed by a positive integer.
• If the -c option is used, it must be followed by another string, i.e., lineWrap -c on its own is illegal.
• If an input file name is specified, you must be able to create an istream object successfully using that name.

Use the following global definitions for error messages:²

    const string eBadLineLength = "Error, maxLineLength must be a positive integer";
    const string eNoCensorWord = "Error, missing censor word";
    const string eCantOpenInputFile = "Error, could not open input ";

If one of these errors arises, print the appropriate message to cerr with nothing else on the line and abort via exit(1). Note that your program should be able to process valid command-line calls with options in any order. That is, the following should all be legal and have the same output:

    $ lineWrap -n 15 -c gong test1.txt
    $ lineWrap -c gong test1.txt -n 15
    $ lineWrap test1.txt -n 15 -c gong

You may assume without having to check that each command-line argument is used at most once; so we don’t care about the output to this, for example:

    $ lineWrap -c gong test1.txt -n 15 -c fluble -n 35
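The order-independent option handling described above could be sketched as follows. This is only one possible approach, not the expected solution: the Options struct, parsePositive, and parseArgs are hypothetical names, and the sketch returns the error message in a field instead of printing it to cerr and calling exit(1), so that it can be exercised in isolation.

```cpp
#include <sstream>
#include <string>

// Error messages copied from the handout's global definitions.
const std::string eBadLineLength = "Error, maxLineLength must be a positive integer";
const std::string eNoCensorWord = "Error, missing censor word";

// Hypothetical container for the parsed command line (not part of the handout).
struct Options {
    int maxLineLength = 20;   // default stated in the handout
    std::string censorWord;   // empty means "no censoring"
    std::string inputFile;    // empty means "read from cin"
    std::string error;        // empty means the arguments were valid
};

// Returns true only if text is entirely a positive integer (e.g. "15", not "15x" or "-3").
bool parsePositive(const std::string& text, int& value) {
    std::istringstream iss(text);
    char extra;
    return (iss >> value) && !(iss >> extra) && value > 0;
}

// Walks argv once, so -n, -c and the file name may appear in any order.
Options parseArgs(int argc, const char* const argv[]) {
    Options opts;
    for (int i = 1; i < argc; ++i) {
        const std::string arg = argv[i];
        if (arg == "-n") {
            // -n must be followed by a positive integer.
            if (i + 1 >= argc || !parsePositive(argv[i + 1], opts.maxLineLength)) {
                opts.error = eBadLineLength;
                return opts;
            }
            ++i;  // skip the value we just consumed
        } else if (arg == "-c") {
            // -c must be followed by another string.
            if (i + 1 >= argc) {
                opts.error = eNoCensorWord;
                return opts;
            }
            opts.censorWord = argv[++i];
        } else {
            // Anything else is treated as the input file name.
            opts.inputFile = arg;
        }
    }
    return opts;
}
```

In a real main, a non-empty error field would be printed to cerr before aborting; how the censor word and file name are then used is left to the rest of the program.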

Feel free to use any element of the C++ APIs for string (http://www.cplusplus.com/reference/string/string/) and stringstream (http://www.cplusplus.com/reference/sstream/?kw=sstream), even if it has not been discussed “in lecture”; in particular, string::npos might be useful to you.

Hint: Try creating the basic program without the command-line options first, then progressively add in the command-line features once you are confident you have solved the simpler problem.
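As an illustration of the hint and of how string::npos comes into play, here is a minimal sketch of the core censor-then-wrap step without any command-line handling. The function names censor, wrapLine, and lineWrapSketch are hypothetical. Two behaviours are assumptions rather than requirements taken from the handout: an empty line is echoed as an empty output line, and the empty censorWord is handled with an explicit guard (the handout notes that a good design may not need one).

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Removes every occurrence of censorWord from line, including occurrences
// that only appear after an earlier removal (e.g. "gogongng" -> "gong" -> "").
std::string censor(std::string line, const std::string& censorWord) {
    if (censorWord.empty()) return line;  // guard: find("") always succeeds at 0
    std::string::size_type pos;
    while ((pos = line.find(censorWord)) != std::string::npos) {
        line.erase(pos, censorWord.size());
    }
    return line;
}

// Writes line to out in chunks of at most maxLineLength characters.
// Assumes maxLineLength >= 1; an empty line produces one empty output line (assumption).
void wrapLine(const std::string& line, std::string::size_type maxLineLength,
              std::ostream& out) {
    if (line.empty()) {
        out << '\n';
        return;
    }
    for (std::string::size_type pos = 0; pos < line.size(); pos += maxLineLength) {
        out << line.substr(pos, maxLineLength) << '\n';
    }
}

// Reads lines from in, censors each one first, then wraps the result.
void lineWrapSketch(std::istream& in, std::ostream& out,
                    std::string::size_type maxLineLength = 20,
                    const std::string& censorWord = "") {
    std::string line;
    while (std::getline(in, line)) {
        wrapLine(censor(line, censorWord), maxLineLength, out);
    }
}
```

Because the streams are passed in as parameters, the same function works with cin, an ifstream, or an istringstream, which also makes it easy to compare against the expected outputs in a test suite.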

a) Due on Due Date 1: Design a test suite for this program. Call your suite file suiteq1.txt. Zip your suite file, together with the associated .in, .out, and .args files, into the file a2q1.zip. Since the .in extension is used for file redirection, and .args is for command-line arguments, the input test files you create should not use either of those extensions.

b) Due on Due Date 2: Write the program in C++. Save your solution in a file named a2q1.cc.

² These definitions will also be provided to you in the a2/codeForStudents/q1 subdirectory of the course git repository; please use that file rather than copy/pasting from this PDF to avoid strange formatting errors.


Question 2

(30% of DD1; 10% of DD2) You are going to write a slightly simplified version of the GNU UNIX utility basename in C++; we expect that you will make effective use of the C++ string API in doing so. Basically, basename takes a valid UNIX pathname such as /usr/bin/vim and returns the name of the file with the enclosing directory names removed. For example,

    $ basename /usr/bin/vim
    vim

More precisely,³ given a PATHNAME, basename outputs PATHNAME with any leading directory components removed. Note that basename does not check that these paths or files are valid on the current computer; rather, it just operates on the string values provided to it. Play around with the real version of this tool to get a feeling for how it works. Our version of basename will be slightly simpler than (and mildly inconsistent with) the real version. The typical usage will be:

    basename [-s SUFFIX] PATHNAME1 [PATHNAME2 ...]

If the -s option is not used, it just prints the file names of the given PATHNAMEs, one per line:

    $ basename /Volumes /usr/bin/vim Movies
    Volumes
    vim
    Movies

If the -s option is used, then the SUFFIX is removed from the end of each file name that ends with it:

    $ ./basename -s .txt /Users/fred/story.txt /etc/ greetings.txt
    story
    passwd
    greetings

Note that a SUFFIX can be any character string; it doesn’t have to be a period followed by three letters. Note also that the real version of basename has more options and slightly different usage from what we’re asking you to do; in particular, if you call the real basename with exactly two arguments, it will assume that the second is a suffix. Don’t implement that!

We require that you create a function with this signature:⁴

    std::string basename (const std::string & s, const std::string & suffix = "");

This function should do all of the work processing a single PATHNAME, returning the correct file name as its value, and trimming the suffix if it appears at the end of the file name. You should implement the handling of multiple PATHNAMEs and the processing of the -s option within the main program. You may assume without checking that if the -s option is used, it will always come before the pathnames.
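A rough sketch of the string processing involved is given below, using only find_last_of, find_last_not_of, substr, compare, and erase from the string API. The function is deliberately named basenameSketch (a hypothetical name) so it can be compiled on its own; the required function must use the signature given above. Three behaviours are assumptions that the handout does not spell out and that should be checked against the provided sample executable: trailing slashes are ignored, a pathname made only of slashes yields "/", and a suffix equal to the whole file name is not removed.

```cpp
#include <string>

// Sketch only: returns the last component of s, optionally with suffix trimmed.
std::string basenameSketch(const std::string& s, const std::string& suffix = "") {
    // Assumption: trailing slashes are ignored, so "/usr/bin/" behaves like "/usr/bin".
    std::string::size_type end = s.find_last_not_of('/');
    if (end == std::string::npos) {
        // Either the empty string or nothing but slashes (assumption: "/" for the latter).
        return s.empty() ? "" : "/";
    }

    // The component starts just after the last slash at or before end, if there is one.
    std::string::size_type start = s.find_last_of('/', end);
    start = (start == std::string::npos) ? 0 : start + 1;
    std::string name = s.substr(start, end - start + 1);

    // Trim the suffix only if the name ends with it and is longer than it
    // (assumption: a name identical to the suffix is left untouched).
    if (!suffix.empty() && name.size() > suffix.size() &&
        name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0) {
        name.erase(name.size() - suffix.size());
    }
    return name;
}
```

The main program would then loop over the PATHNAME arguments, call the function once per argument, and print each result on its own line.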

1. Due on Due Date 1: Submit a file called a2q2.zip that contains the test suite you designed to test your program, called suiteq2.txt, and all of the .in, .out, and .args files. Since the .in extension is used for file redirection, and .args is for command-line arguments, the input test files you create should not use either of those extensions.

2. Due on Due Date 2: Write the program in C++. Save your solution in a file named a2q2.cc.

Question 3

(30% of DD1; 10% of DD2) In this question, you are going to write a slightly simplified version of the GNU UNIX utility dirname in C++; this is the companion tool to basename: given a pathname, it returns the full name of the encompassing directory but ignores the file name. For example,

³ This definition is taken from the man page.
⁴ See the file a2/codeForStudents/q2/basename in the course git repository.


    $ dirname /usr/bin/vim
    /usr/bin

More precisely,⁵ given a PATHNAME, dirname outputs each PATHNAME with its last non-slash component and trailing slashes removed; if PATHNAME contains no slashes, then it should output ’.’ (meaning the current directory). Our version of dirname will be almost the same as the real version; it is used in this way:

    dirname PATHNAME1 [PATHNAME2 ...]

It takes one or more PATHNAMEs and returns the output one per line. For example,

    $ dirname /Volumes /usr/bin/vim Movies
    /
    /usr/bin
    .

We will not ask you to implement any of the command-line options of the real version of dirname, i.e., (--version, --, or --zero). We require that you create a function with this signature:⁶

    std::string dirname (const std::string & s);

This function should do all of the work processing a single PATHNAME, returning the correct directory name as its value. You will be glad you did this when we get to Question #4. You should implement the handling of multiple PATHNAMEs within the main program.
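The rule stated above (remove the last non-slash component and the trailing slashes, and output "." when there is no slash) can be sketched in three steps. The name dirnameSketch is hypothetical and is used only so the block stands alone; the required function must use the signature given above. The results for an empty string (".") and for a pathname made only of slashes ("/") are assumptions that should be checked against the sample executable.

```cpp
#include <string>

// Sketch only: returns s with its last non-slash component and trailing slashes removed.
std::string dirnameSketch(const std::string& s) {
    // Step 1: skip any trailing slashes to find the end of the last component.
    std::string::size_type end = s.find_last_not_of('/');
    if (end == std::string::npos) {
        // Assumption: "" gives ".", a pathname of only slashes gives "/".
        return s.empty() ? "." : "/";
    }

    // Step 2: find the slash that precedes the last component.
    std::string::size_type slash = s.find_last_of('/', end);
    if (slash == std::string::npos) {
        return ".";  // no slash at all: the current directory
    }

    // Step 3: drop the slash(es) separating the directory from the component.
    std::string::size_type dirEnd = s.find_last_not_of('/', slash);
    if (dirEnd == std::string::npos) {
        return "/";  // nothing but slashes remain, e.g. "/Volumes"
    }
    return s.substr(0, dirEnd + 1);
}
```

For the three pathnames in the example above, this sketch produces /, /usr/bin, and . respectively.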

1. Due on Due Date 1: Submit a file called a2q3.zip that contains the test suite you designed to test your program, called suiteq3.txt, and all of the .in, .out, and .args files. Since the .in extension is used for file redirection, and .args is for command-line arguments, the input test files you create should not use either of those extensions.

2. Due on Due Date 2: Write the program in C++. Save your solution in a file named a2q3.cc.

Question 4

(8% of DD2) In this question, you will re-structure your solutions from Q2 and Q3 to enable separate compilation. None of the commands, definitions or function signatures will change; instead, you will separate your program into five files: dirname.h, dirname.cc, basename.h, basename.cc, and a2q4.cc. The file dirname.h should contain the function prototype for dirname from Q3 with appropriate preprocessor guards and #includes; dirname.cc should provide the implementation for the dirname function. Construct basename.h and basename.cc similarly. The main function must be placed into the file a2q4.cc.

To get you started, we’ve created a copy of the Makefile used in the video lectures in the a2/codeForStudents/q4 subdirectory of the course git repository; you will have to adapt it to work with your program. The dependencies must be correctly set up to minimize the amount of unnecessary recompilation; the TAs will be looking for this when the question is hand-marked. The executable you create should be called dirname. Add a second command to the main target (after the final linking has been done) to create a symbolic link called basename to the executable you just created named dirname. Do this using the UNIX command with the options -sf (read the man page; you don’t have to understand it very deeply for our purposes).

Basically, we’re going to write only one program, but give it two names; the main program will have to check what name it was called with at run-time (i.e., the value of argv[0]) to know what to do. This seems like extra work, but it means that the core functionality need only be written once; if you can abstract out any common subtasks into internal functions, that’s a win for elegant design! The main programs from Q2 and Q3 will have to be merged creatively together inside the file a2q4.cc.

As stated above, the new main program should check the value of argv[0]; if it is "basename", then process the command-line arguments as appropriate to basename and call the basename function; if argv[0] is "dirname", then process the command-line arguments as appropriate to dirname and call the dirname function. Note that you should probably call basename on argv[0] before trying to process it further so that, for example, ./basename and ./dirname will be recognized as valid commands. If (processed) argv[0] is any string value other than basename or dirname, print the following message and abort via exit(1):⁷

    cerr << "Error, unknown command: " << processedVersionOfArgvZero << endl;

1. Due on Due Date 1: Nothing, since your testing for Q2 and Q3 will suffice.

2. Due on Due Date 2: Write the program in C++. Submit your Makefile, dirname.h, dirname.cc, basename.h, basename.cc, and a2q4.cc files that make up your program in your zip file, a2q4b.zip.

⁵ This definition is also taken from the man page.
⁶ See the file a2/codeForStudents/q3/dirname in the course git repository.
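The idea of choosing behaviour from the name the program was invoked with can be sketched as below. The names commandName, Mode, and selectMode are hypothetical. To keep the block self-contained, commandName is a small stand-in for calling your own basename function on argv[0], which is what the handout suggests; the actual argument processing and error output are omitted.

```cpp
#include <string>

// Stand-in for calling basename on argv[0]: strips leading directory
// components so that "./dirname" and "/some/path/dirname" both give "dirname".
std::string commandName(const std::string& argv0) {
    std::string::size_type slash = argv0.find_last_of('/');
    return (slash == std::string::npos) ? argv0 : argv0.substr(slash + 1);
}

// Hypothetical tag describing which behaviour main should run.
enum class Mode { Basename, Dirname, Unknown };

// Decides the behaviour from the (processed) value of argv[0].
Mode selectMode(const std::string& argv0) {
    const std::string name = commandName(argv0);
    if (name == "basename") return Mode::Basename;
    if (name == "dirname") return Mode::Dirname;
    return Mode::Unknown;  // main would print the error message and exit(1)
}
```

In a2q4.cc, main could call selectMode(argv[0]) once and branch on the result, so that the pathname loop and any shared helpers are written only once for both names.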

Written Assessment

Questions 5 and 6 are part of the written assessment, and may not be publicly discussed on Piazza (or anywhere else).

Question 5

(2% of DD2) Describe what would happen if the preprocessor variable TRICKY is defined at compilation time.

    #include <iostream>

    int main (int argc, char* argv[]) {
    #ifndef TRICKY
        int i = 5;
    #endif
        std::cout << "i = " << i << std::endl;
    }

Question 6

(5% of DD2) What is wrong with this program, and how should it be fixed?

    #include <iostream>

    int main() {
        int *arr;

        for (int i = 1; i < 10; i++) {
            arr = new int[i];
            for (int j = 0; j < i; j++) {
                arr[j] = j;
            }

            int result = 0;
            for (int j = 0; j < i; j++) {
                result += arr[j];
            }

            std::cout << result << std::endl;
        }

        delete arr;
        return 0;
    }

⁷ See the file a2/codeForStudents/q4/errorMessage in the course git repository.
