Chapter 1 Getting Started Installing (v1.8) Installing Dr. Java Hello, World! Java

Hello, Lab! Welcome to CS 211! The early focus in this course is to get introduced to Java, and to begin getting comfortable writing code in Java, so that we can use Java intensively throughout the rest of the course when we learn about various object oriented programming concepts. So today, we will get Java installed on your machine, get an IDE installed (Dr. Java), and then begin writing some Java code.

This text often contains a lot of content – the lab manual chapters give you the chance to work through more problems with more step-by-step guidance. You should finish each chapter and all the associated practice problems to ensure you the content.

Installing Java In order to write and test Java code on your machine, we will need to get certain tools installed, contained in the JDK (not just the JRE). Our goal is to get version 1.8 installed, to ensure you can use all the recent additions to the Java language that we will discuss in class. Dr. Java supports version 1.8, but will require a small bit of extra work to get installed on Mac. We will use the tools directly at the command line first, instead of relying on an IDE to utilize the tools on our behalf. We want you to experience the "edit, compile, run" cycle first-hand. In later classes you take, this style of programming will be necessary/mandatory—you can't always write code on a fancy IDE on your own machine (for example, if you are writing code to run on a cluster, or are ssh'd into a remote machine to access the one licensed copy of some software), and we don't want you to think that "==Java Programming" or "Dr. Java==Java Programming". Those tools are indeed useful, but we need to understand the basics of what is occurring to really utilize them well.

Quick Note "JDK" means "Java Development Kit", and is intended for people who will create Java programs (that's you!). "JRE" means "Java Runtime Environment", and is intended only for running Java programs. If you install the JDK, you will be getting the JRE as well. So we are only trying to install the JDK, as it gets everything for us.

Checking for Java First, check if you have Java installed. Open up a terminal (Mac) or command prompt (Windows: go to Start → "run…", and enter 'cmd'). Type in these two commands:

demo$ javac –version demo$ java –version

If you don't already have it, or have a version older than 1.8 (smaller numbers), follow the instructions below.

1 Downloading the latest version Go to this link. (http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html ) Click "Accept License Agreement" Select the version that correlates to your machine; you probably want "Windows x64", "Mac OSX x64" or some Linux version. Download and open the downloaded file. Follow any installer instructions. Windows Users: Update your PATH variable (instructions here http://docs.oracle.com/javase/8/docs/technotes/guides/install/windows_jdk_install.html#BABGDJFH ).

Next, test that you've got everything installed and working. Again, open up a new terminal window. Type in these two commands again:

demo$ javac –version demo$ java –version

And make sure that they both report the same version, e.g. "1.8.0_25" for version 1.8. We don't care what follows 1.8 (0_25 suggested here), as long as they match. (Java might have an extended build-number after the 1.8.0_25 part, that's okay).

If you get error messages such as "command not found", then you didn't set up your path correctly (Windows), or you forgot to open a new command prompt after setting your PATH variable. Similarly, if you get an error message containing a trace-back and the phrase "unsupported major.minor version"…, it means you have different versions of javac and java installed. You would then need to update enough so they matched, meaning you should just install the latest JDK release.

At this point, you should have Java installed! If Windows users want to get the Linux feel, they could also install Cygwin (https://cygwin.com/). It allows you to re-compile code from source that was intended for Linux, and run it within Cygwin, on a PC.

Still Lost? Here's a link that gives instructions on installing Java; perhaps another perspective will help! First WikiHow. (http://www.wikihow.com/Install-the-Java-Software-Development-Kit ).

Installing an IDE

A good coding environment makes a world of difference. Please avoid Notepad, TextEdit, Wordpad, nano, and pico. You could take the plunge and start learning something like http://www.vim.org) or , (http://www.gnu.org/software/emacs/), but for this course we will install a basic IDE.

We will officially support you in this class with Dr. Java, but won't promise to solve all your installation woes if you have installed other editors and IDEs. Some other popular more lightweight options are JEdit, JGrasp, or Notepad++ but of course you're on your own for installation if you use any of these, and you will be missing out on some features that are directly available in Dr. Java later on.

We will also use JUnit, which also works directly with both Dr. Java and Eclipse (as well as other IDEs, and of course on the command line as well).

Installing Dr. Java Go here and get our specially-patched version of Dr Java: http://cs.gmu.edu/~kauffman/drjava/

2 Windows/Linux Users o An alternate (less cool) version is here: http://drjava.sourceforge.net/ (download the app version; after a short countdown the download begins). It directly gives you the application. You might want to move it to some more permanent location and just have a handy link or pin it to your dock/task bar.

Mac Users o Mac machines need convincing to open a file that doesn't go through their fancy App Store. The first time you open this file (it's double-clickable), you might need to agree to open it despite having downloaded it from the internet. If you have trouble opening it, try right-clicking and selecting Open. This is actually supposed to make a difference. o Missing the Debugger? This should be solved in our patched version above. When you've got Dr. Java open, look for a menu named "Debugger". If it's not present, then you need to tell Dr. Java where to find tools.jar. Copy this file: http://cs.gmu.edu/~marks/211/share/tools.jar to a place that doesn't have any spaces in the full path. Perhaps /Users/youraccountname/tools.jar would be satisfactory. Go to Preferences, Resource Locations, edit the tools.jar location to be your chosen location, e.g. /Users/youraccountname/tools.jar Try re-opening Dr. Java and ensure you see the Debugger menu. We definitely want to use this later in the semester.

You type code into the largest pane, and then can save it where you like. There are two buttons of current interest across the top: compile, and run. Compile will cause Dr. Java to try compiling the code on your behalf. You must successfully compile your code with no errors before you are able to then run the file (with the run button).

Installing Eclipse (unsupported alternative option) This section is here as a courtesy for anyone who wants to install Eclipse, but you should install Dr. Java for this class. Be sure to focus on getting started on writing Java code, and maybe come back here if you want to install Eclipse later on.

First, go to this link http://www.eclipse.org/downloads/ and download "Eclipse for Java Developers". Follow the installation instructions.

Eclipse is written in Java, and has support for writing code in many languages. It has many extra features (common to most IDEs), such as code completion, type checking in the background to immediately bring up errors or warnings, and management of your packages and files.

One reason not to use a big IDE right away is that it also creates extra files and folders, requires setting many properties for a "project", and can make it confusing what files and folder structures are actually required to be writing Java code. It also blurs the distinction between editing, compiling, and executing, because you get a nice, shiny "play" button that will compile and run your code all at once, if it can. Some future programming situations will explicitly require you to write code command-line style, so we wanted to cement comfortability with that mode of programming first.

When you open Eclipse, you'll need to specify your workspace. This is a directory where Eclipse can create "projects". It's important that you manually "close project" whenever you go between projects; the play button sometimes is trying to run the other project. To close a project, in the "package explorer" panel, right-click the project and "close project", so that just the one you want to create is open.

3 We think of each assignment or program as a "project". We can create a new project via File → New → Java Project. We similarly add classes with File → New → Class. Eclipse tries to be helpful with all sorts of checkboxes and things, but you can always just type things in manually later on, in case you forget to check "create method stubs for inherited abstract methods", or something similar.

Eclipse will create a folder for the project in its workspace directory, and inside of that, all your code will go by default into a "src" directory (source). Finding this workspace directory is sometimes a pain. You will also have to be careful when turning in your work, as this doesn't match the expected folder/file structure at all.

Navigating through the DOS Prompt / Terminal This is a skill you will need to get comfortable with at some early point during your computer science career (or any engineering-related career). You might as well get comfortable with it now, or else it will make you very unhappy during some project in this class and later classes, when it will be assumed you know how to navigate in a terminal. However, if you get stuck too much on this, you can rely on Dr. Java's compile and run buttons instead. Just be sure to get comfortable using the tools at the command line at some point.

DOS> In order to navigate through folders, create folders, and so on in the DOS prompt, you will need to get used to some commands. Here are some basics to get you started: Basic DOS Commands (https://www.sophos.com/en-us/support/knowledgebase/13195.aspx). For now, just focus on cd, del, dir, md & rd.

Terminal$ UNIX tutorial (http://people.ischool.berkeley.edu/~kevin/unix-tutorial/toc.html): useful for navigating around in the terminal (for macs & linux). For now, focus mainly on "looking around", "managing files and folders", and "viewing and editing files".

Writing Our First Java Program Now that Java is installed, we can finally get to the simpler world of writing a file, compiling it, and running it. This "edit-compile-run" cycle is at the heart of what we do when we develop a program.

Writing a Java Program To write a Java program, we could use any old text editor to create a document with a .java extension, containing valid Java code. Any old editor will do, but you will quickly discover that writing code in editors designed for writing code is a far, far superior experience. We've already installed Dr. Java so we will use that.

Once you've decided on an editor (or perhaps only used old Notepad or TextEdit just today!), copy the following into a file named HelloLab.java : public class HelloLab {

public static void main (String[] args) { System.out.println("Hello, Lab!"); //more code to go here! } }

4

Compiling a Java Program Now that you have HelloLab.java stored onto your computer, you need to compile it. Open up a prompt/terminal, and use the cd command to navigate to the folder containing the file HelloLab.java, and execute the following instruction to compile the .java file:

javac HelloLab.java

Unless some odd character encodings occurred during copy-pasting, there should be no errors, and the file HelloLab.class has been created. This file is the compiled version of our HelloLab class definition. Every Java program contains a main method that is run to execute the program. Running a Java program means to call its main method. In our case, the only method in HelloLab is main, and we will cause this method to be run with the next terminal command. Our entire program is the contents of main.

In general, when we write code that doesn't represent valid Java, we will get compilation errors here.

Executing a Java Program Now that we have a compiled version of our HelloLab program, we want to run it. In the prompt/terminal, run:

java HelloLab

Notice that we did not have any extension after HelloLab; java looks for HelloLab.class, and looks for the main method in it and then runs it. It prints out "Hello, Lab!".

Edit-Compile-Run! Now that we have a source file that we can edit, a way to compile it, and a way to run it, we are ready to start writing code! Contrast this cycle to an interpreted language with an interactive prompt; if we want to try a new expression, we pretty much have to edit the source file so that it calculates and uses/prints the new value, and then compile and run it again. Later on we'll let an IDE like Dr. Java or Eclipse help speed up this process, but the three phases—edit, compile, run—will still be present, whenever we develop Java code.

Running Code Through Fancy IDEs Okay, okay, of course you want to use that fancy editor that also lets you run your code. With Dr. Java installed, you can just click the compile and run buttons instead of manually invoking javac and java at the command prompt. But if you get comfortable now using the command-line version, you will thank yourself in some later class where all tools are command-line-only.

Practice: Java Basics Now it's time to try out some basic Java language features.

Literals. Literals are small, atomic 'building blocks' of our data in Java, such as 5, true, or ''. Java understands each literal to be of a particular primitive type. The eight primitive types in Java are:

char: 16-bit unsigned integer representation; used to represent Unicode characters. byte: 8-bit signed integer representation. short: 16-bit signed integer representation.

5 int: 32-bit signed integer representation. long: 64-bit signed integer representation. boolean: truth values. true and false are the only two values. They are lower-case.

float: 32-bit floating-point representation (approximates real numbers). Place an f at the end of the number to distinguish it from double values. double: 64-bit floating-point representation (approximates real numbers).

Signed means that we have positive and negative numbers available; unsigned means that only non- negative numbers are available (zero and up, to a maximum of 2n-1 for n-bit numbers).

→ We often just use int and double for numbers, ignoring byte, short, and long.

Variables. Variables are our names for storage in memory. We can declare new variables, and assign values to them; we can later look up these values. The two things we must do in order to use a variable are declare it and then instantiate (initialize) it.

Declaration: give a type, followed by an identifier (name). Multiple variables may be created with a single type followed by a comma-separated listing of identifiers for the variables.

Add these examples to your HelloLab.java file (at the //more code goes here location):

int x; double f,g;

Instantiation: use an assignment statement (varname = expr;) to give a value to the variable.

Add these examples to your HelloLab.java file:

x = 5; f = 2.3; g = 3.14159;

All together: We can declare and instantiate all in one go: give a type, a new identifier, =, and an initial expression.

Add this example to your HelloLab.java file:

int y = 7;

Printing things—just the basics. Later on we'll learn about System and everything we need to know about printing, but for now, we just need to feed an expression as the argument to this method call, as shown. We'll also pretend we know all about Strings (characters in double-quotes):

System.out.println(some_expression_here);

It will be printed to the terminal when we run our program. So let's print some of our variables out:

System.out.println(x); System.out.println("y equals " + y); System.out.println("f="+f); System.out.print("g: "); System.out.println(g);

6

Your Turn! Create a couple variables of each primitive type (including boolean). Print them out. It might help to just print them each individually at the end, and change the code ahead of it; you might also prefer to make sequential changes interleaved with print statements. It's all for your benefit, so however you want to do this is up to you.

Getting input. Programming is much more interesting when our code can request information from the user. We will take a very brief look at the Scanner class. Right now we view it more as boilerplate the lets us make the right incantation to get the job done, and later on we will understand what is happening behind the abstraction that a Scanner provides us.

First, to use the Scanner class, we have to include its code with an import statement. Add the following import statement on a line before public class HelloLab:

import java.util.Scanner;

We could have also stated " import java.util.*; ". This version allows access to any class in the java.util package. We will take a much better look at packages later.

Now that we are allowed to use the Scanner class, we create a Scanner object (also stuff we will explain in better detail later on). Place the following line inside the main method before the first time you want to get input from the user—you might as well put it at the very beginning of the method.

Scanner sc = new Scanner (System.in);

We are now allowed to use the Scanner object, which we named sc, by calling methods on it. Of particular interest to us are the nextInt() method and nextDouble() methods. There are other similarly-named ones.

It's pretty polite to ask for input before we actually demand it, so we tend to see printing on the line before:

System.out.println("please give me a number: "); int userNum = sc.nextInt();

That's all: each time you want a value of a particular type Foo, use the corresponding nextFoo method of our sc Scanner object, and you'll get the next thing they type (after hitting enter).

Note: the nextFoo methods can grab parts of a line; it can be convenient to read many numbers from one line, or it can be annoying to need to manually get to the next line by calling sc.nextLine() and discarding the result.

Your Turn! Create variables of three different types; ask the user for values that will fit in those types, and then use them (print them, change them, whatever you want to try). This will be useful in the rest of the lab, semester, and your major.

Expressions versus Statements. Remember, an expression is just a representation of a calculation that would result in some value (it might even already be a value, like a literal). A statement is a command to the programming language to

7 do something or perform some action, often involving expressions to describe what should be done. Let's add many assignment statements, using more and more sorts of expressions as our righthand-sides.

Basic mathematical operations. We have the usual mathematical operations that operate over all of our numerical types: +, –, *, /, % (modulo, or remainder). As you may have encountered in other languages, division (/) is closed, meaning that if we divide an int by an int, we get back an int (by truncating the result—throwing away the remainder). Similarly, if we encounter different types in a division, the "wider" type is chosen. So an int divided by a float will return a float. The order of operations is the same (PEMDAS) as in mathematics, so no surprises there.

Your Turn! Try to use all the different mathematical operators between your numerical variables.

Converting between numerical types. The action of coercing or converting a number from one type to another is known as 'casting' the value from one type to another. There are default assumptions in Java as to when casting is automatically applied (as in int/float division, where the int is cast to a float), and we can also manually apply a cast.

Here is an idealized notion of the widening of types. Each type on the left is 'smaller', or a 'subset' of the types to its right. (It is idealized because of the imprecision of floating-point representations).

byte → short → int → long → float → double

We perform a cast manually by adding a type in parentheses to the left of the expression:

int z = (int) 2.46f; //convert 2.46 from a float to an int. z = ((int) 3.14) * 10; //use parentheses to cast within expr.

Casting Implications. We say that a cast to a 'wider' type (a type with all the other type's values, and more) preserves the value, but that a cast to a 'narrower' type will lose precision in some sense. Casting from a floating-point number to an integer number will lose the fractional part; casting from a double to a float may lose some precision. Casting from a larger integer type to a smaller integer type may mean that the number wasn't even representable, and an 'overflow' (modular division) will occur to fit the value into the new, smaller, type.

For more information, read the primitives section of the section on casting http://docs.oracle.com/javase/specs/jls/se8/html/jls-5.html in the JLS (especially 5.1.2, 5.5).

Your Turn! Try casting between various primitive numerical types. See what happens (via printing) when you cast between wider and smaller types, or from smaller to wider types. See what happens to fractional parts, negative fractional numbers, and so on. Try to create each of the situations:

A cast completely preserves the old value in the new type. A cast causes a truncation (loss of information after the decimal place). A cast causes an overflow. A cast causes division between two integers to be performed as a float.

Operators. There are a variety of operators available to us in Java that we are currently interested in. Try out each of the following:

8 Relational operators. < <= > >= == != These operators take two numbers, and result in a boolean matching whether the relation holds. E.g., 3<4 results in true; (5%2)<=0 results in false. Keep in mind that == and != are also defined for all primitive types (so, boolean and char, and the various number types). Just have compatible types on both sides. Java will complain if you attempt something like false!=4, because there's no type that contains both booleans and ints. But it's safe to express 3==3.2 (and the result is false). This one works, because Java has implicit casting between the numerical types. The casting always 'upgrades' values to make the types match, rather than 'degrading' values: 3 can easily become 3.0 and also be a double; but for 3.2 to become 3, there's a loss of information. This basic idea should help you figure out which direction the conversion is occurring.

Mathematical operators. + – * / % We mentioned these above (they're listed here for completeness' sake).

Boolean operators. & | && || ! != These operators only accept boolean expressions, and result in a boolean value.

& and && both mean "and"; then return true only when the expressions on both sides are true, false otherwise. | and || both mean "or"; they return false only when the expressions on both side end up being false, and true otherwise. What's the difference here? & and | will always evaluate the expressions on both side, no matter what. && and || are called "short-circuiting operators": when possible, they skip evaluating the righthand expression. false && whatever will always be false, no matter what value whatever has. Similarly, true||whatever will be true, regardless of whatever's value. ! means "not"; it is a unary operator, meaning we only supply one boolean expression after it. It turns a true value into false, and a false value into true. It gives us alternate ways to phrase things: !(x>4) is the same as (x<=4). For instance, we might say (x<4)||(x>4) more simply by stating that !(x==4) (or even shorter, x!=4).

Ternary operator: boolexpr ? expr : expr The 'conditional expression' is a ternary operator: it requires three expressions, sandwiched around the ? and the :. This is essentially an if-statement occurring at the expression-level. Used at the right time, it is extremely useful! Examples:

x = (x<0) ? 0 : x; String status = (x>0) ? "good" : "bad";

Compare this with an if-statement version:

if (x>0){ status = "good"; } else { status = "bad"; }

The first expression must be a boolean expression; this serves as the decision of whether this entire expression results in the middle-expression (when true), or the last expression (when false).

9

Keep Trying Things! You know best what features at this basic level are the most confusing to you; explore things you were able to do in your previous language, and try them out in Java. Lab is a great place to ask questions one- on-one, through the TAs. Next we discuss control structures and methods, but you can try to figure those out now too, if you're already comfortable with basic operators, creating variables, and user input/command-line output.

Keep in mind that the first couple of weeks of this course is heavily focused on getting proficient and comfortable in Java. If you aren't writing code and running it, you aren't getting prepared for the rest of the semester. It's really that simple, so go ahead and get prepared!

10 Chapter 2 Control Structures

Hello! Today we will focus on reviewing the various basic control structures available in Java. We also have links to "The Java Tutorials" (http://docs.oracle.com/javase/tutorial/java/index.html ) throughout, in case you want another perspective.

Through this chapter, create a file named "Lab_02.java" and edit it. (Note that your class's name inside this file will also be Lab_02.java.) Just add code for each "Your Turn!" as you reach them, one after another. Remember you can add comments /*…*/ around all the parts that you've completed but don't want to see running over and over each time, without having to delete all your progress.

A note on braces The curly braces { } let us group multiple statements together into one group. Although they are not required for any of these control structures (except switch), we will show them here because it is very good practice to just always use them.

If-Else Branch Java's branching control uses the if-else structure (http://docs.oracle.com/javase/tutorial/java/nutsandbolts/if.html) for selecting which blocks of code should run. The if-statement by itself lets us choose whether or not to run a block of code:

if ( boolExpr ) { // run these statements only if boolExpr was true; otherwise, skip. }

The if-else structure lets us select which block of code to run:

if (boolExpr) { //run when boolExpr is true } else { // run when boolExpr is false }

There is no "elif" structure – we can just glue if-else statements together. Since if and else both grab "the next statement" as the branch body, the following if-else statement becomes the else's entire branch. Below, exactly one of stmt1, stmt2, stmt3, and stmtDefault will run, based on the first boolexpr that is true (or stmtDefault, when none of the boolexprs are true).

if (boolexpr_1) { stmt1; } else if (boolexpr_2) { stmt2; } else if (boolexpr_3) { stmt3; } else { stmtDefault; }

Your Turn! Try to use if-elses to solve the following problems: Accept three numbers from the user. Print out "the following ones were even: #2, #3" (based on actual values), or "none of them were even." Again with the three numbers, print out whether or not any two of them could be added up together to be a multiple of 5 (you don't need to say which ones). Try this using "else if" blocks of code, and also try to just do this all in if statement (with no else). If-elses will be useful in the later examples too, as part of the blocks that form the body of loops, the code inside a switch case, and really anywhere a statement is allowed.

Switch Statement The switch statement (http://docs.oracle.com/javase/tutorial/java/nutsandbolts/switch.html) allows us some peculiar control flow possibilities. First, look at an example:

int cutoff = 0; Scanner myScanner = new Scanner(System.in); String s = myScanner.nextLine(); char gradeLetter = s.charAt(0); // grabs the first letter.

switch (gradeLetter) { case 'A': cutoff = 90; break; case 'B': cutoff = 80; break; case 'C': cutoff = 70; case 'D': cutoff = 60; break; default: cutoff = 0; }

Notes: The expression at the start can be a byte, short, char, int, enumerated type, and even a String (as of Java 7 – you would need to have version 1.7 or higher installed). Each value of a case must be of the same type as the starting expression. The 'expression' after the case keyword must be an actual literal value (like 5, 'G', and in Java 7 it can also be "strings"). By putting a break statement at the end of each case, they are isolated from each other and actually behave like "else if" blocks. (There's one break missing above – what will this effect be? case 'C' will run both cutoff=70 and then immediately run cutoff=60).

Your Turn! Write a switch statement that takes in an integer representing a month (1 – 12), and then prints out "months remaining in the year: ", followed by all the months that are not yet finished. Consider your break statements. Use a switch statement to print the following outputs for the following inputs. Try to rearrange cases and use breaks sparingly. To make it interesting (more difficult), try to only have one print statement in your code for each letter that shows up in outputs; you'll have to be careful about how to set up your control flow to always pass through the right piece of code! What other control flow can you use to make this possible?

Input: Output: 1 A B C 5 B C D 2 A 0 A D

While Loop A while loop (http://docs.oracle.com/javase/tutorial/java/nutsandbolts/while.html) is like an if- statement that enjoys success so much, it keeps on running again and again until its boolean condition finally fails (at which point the party's over).

while ( thisIsStillTrue ) { runThisCodeBlock; }

Notes: Remember that if the boolean expression is false the first time, the loop's body might not run at all! A while loop provides for zero or more iterations of the loop's body.

Your Turn! Write a 'password checker' using a while loop. Try to make the wording of your first request friendly, and then all later requests should state say "wrong, try again!" . Extend the password example to ask them if the caps lock key is on, starting with the 5th attempt. Take the longcut. (meaning, let's do something in the not-best way, to learn what we can do). Add up the number of seconds in a day by using nested while-loops. The trick here is to not use multiplication, but actually add up each second individually. Silly constraints, yes, but see if you can figure out the logic and the nested nature of this approach. Print your answer when done. It might help to first use one while-loop to calculate the number of seconds in a minute (yes, it just counts up to 60), and then embed that in a while-loop that does this calculation once for each minute in an hour, and so on.

Do-While Loop The do statement, also called the do-while loop (http://docs.oracle.com/javase/tutorial/java/nutsandbolts/while.html), is the same in functionality as a while loop except that its body is guaranteed to run at least once. This is why its loop body comes before the condition (boolean expression).

do { thisCodeBlock; } while ( thisIsStillTrue );

Notes The body of the block (thisCodeBlock above) will always run at least once, because we don't test the condition (thisIsStillTrue) until after the block has run. The semicolon is required syntax after the parenthesized condition.

Your Turn! Try to turn your 'password checker' into a do-while loop. (It's perhaps easier when you aren't handling the 1st, 2nd, and 5th + cases differently…) Use a do-while loop to print some range of numbers, such as 1-100 inclusive. Print this range of numbers, including the {}'s and commas: {100, 200, 300,…2000}. Of course, fill in the ellipses, and use a do-while loop.

For Loop For loops (http://docs.oracle.com/javase/tutorial/java/nutsandbolts/for.html) are excellent for iteration with well-defined changes between values between each iteration, such as the "print this range of numbers" example above, and as we will see, generating the indexes to use when accessing each element in an array.

for ( init; guard; update) { someCodeBlock }

Notes. The init, guard, and update portions are used as follows: init: only run once, at the very beginning of the entire for-loop. guard: the boolean expression used before each loop iteration – must be true to run next iteration. update: statement that is run after each loop iteration. Often it is i++ or something similar (increment i by 1), but it's your chance to do a single assignment to your loop variable in any amount, direction, or what have you. Example:

for ( int i = 0; i<10; i+=1) { System.out.println( i ); }

Your Turn! Try declaring i before the for-loop (int i;), and still assign it to zero in the init area: for (i = 0; …) Print the even numbers from 0 to a million. Print the following, braces/commas and all: {5, 11, 17, 23, … , 65} Use a for-loop to print the number 13, thirteen times. Do you still need a loop variable? (the i in the above code)

For-each Loop The for-each loop lets us access each element of something known as an Iterator. Arrays are the first Iterator we will learn about. If you are already comfortable with arrays, do this section now. If not, go ahead and skip to the arrays section, and come back here.

for ( SomeType someIdentifier : iteratorOfSomeType ) { /* statement(s) that uses someIdentifier */ }

The for-each loop allows us to access each item in a sequence of values in a more regular way than manually managing an array index or carefully-yet-manually constructed conditions for a while loop. It should look similar to Python's for loops. The two following for-loops do the same thing:

int[ ] xs = {2,4,6,1,3,5,7};

for (int i = 0; i < xs.length; i++) { System.out.println( xs[i] ); }

for (int val : xs) { System.out.println( val ); }

Notes. The type ascription in the for-each loop (int, above) must match what is stored in the Iterator. We have an array of ints, so we must ascribe int here. When we learn more about the types we can create (subtyping and polymorphism and so on), it will become more apparent why we might need to write the type 'again'. The identifier, val, is a new identifier chosen by the . The current item from the Iterator uses this name. val is indeed the value inside the array at each location, and not the index of the value; if we wanted to print out the indexes (0 through 6 above), we would have to use the original for-loop (or just manually increment a variable to serve as a counter, at which point we also might as well have used the original formulation).

Your Turn! Use a for-each loop to calculate the sum of an array of doubles. Create your own array and fill it in with any values you choose. Use a for-each loop to find the largest number in an array of ints. Create your own starting array of integers, and test it a few times by moving the largest number around in the array.

Wrap Up

Your Turn! Print out all the elements in your arrays you've created. How will you handle the multiple dimensions? Use a Scanner and ask the user for 10 integers, storing them in an array. Print them back out. Generalize the Scanner example to be however many numbers they first tell you they want to enter. Does this affect where you declare your array? What about where you instantiate it? Try calculating the sum of an array of integers – what other storage might you need? What type should it be? Choose one of your arrays, and print it out its contents as we've done before, such as {2,4,6,1,3,5,7}, but based on whatever values happen to be in there. Report both the index of the maximum value in the array, as well as the value of the maximum value in the array. Chapter 3 Arrays

Arrays represent an ordered sequence of values. An array in Java can only hold values of one particular type, which is indicated at declaration.

Arrays are not quite the same as lists (though Python lists also use [ ] syntax). A list can change in length easily, but might take much longer to perform value lookups, because each element is chained to the next. An array uses a fixed layout and so looking up values can be much faster (just jump to the correct location), but changing the length (adding or removing elements in the middle) can take a lot of time to perform all the copying. An array type does not include the length of the array (or any of the dimensions, when working with multi-dimensional arrays like int[][][]). While the initial length of an array can be decided at run-time (e.g., ask for a number, make an array that long), an array value does not change in length. If you want a longer one, you must create a new longer array value, copy everything over, and then use the extra space.

Declaration To declare an array, we again give a type and an identifier to announce to Java that something of this type should now exist, with this name. To create an array of type T, the array type is created by appending [ ] to the end of the type: T[ ]. Other examples would be int[ ], double[ ], int[ ] [ ], and so on. Here are some declarations:

int[ ] xs; double[ ] samples; String[ ] names;

I tend to make names like xs, ys, zs. They should be pronounced "exes, why's, zees", and are supposed to represent multiple x values (whatever x is), multiple y values, and so on.

Your Turn! Try declaring a few arrays, and see if your code still compiles. We'll give them values next.

Instantiation – two options If you know the specific values to start out with, you can place them in curly braces { }. This is only an option, though, at declaration time:

int[] xs = {5,6,7,8,9}; double[] ds = {2.5, 3.5, 4.5, 5.5, 6.0 }; int[][] yss = { { 1, 2, 3}, {5, 6, 7, 8}, {4} };

If you don't want to or can't supply those values initially, we can instead specify the dimension(s), and get default values (of zero, false, space-character, or null for Classes):

int[] xs; // declaration can now be separate. xs = new int[5]; // xs now holds {0,0,0,0,0}. ys [] [] = new int[2][3]; // ys now holds {{0,0,0},{0,0,0}}. bs[] = new boolean[3]; // bs now holds {false, false, false}.

This style is often used in tandem with a for-loop that then follows up and systematically updates the value at each location. We will cover this in a couple of sections.

Your Turn! Now that you have a couple of arrays declared, try giving them values using both means of instantiation. Again, just make sure it compiles. If you try System.out.println(xs), you will see essentially the array's address and not its contents. Take a peek back at the "for-each loop" section for more syntax on writing for-loops for arrays. We're going to access the elements next!

Accessing, Updating Java uses square brackets [ ] to describe a particular indexed location in an array. In the same way that a variable can show up on the left and right sides of an assignment and mean a location of storage (on the left) and an expression to look up the value (on the right), we can do this with arrays and array syntax. Java arrays use zero-based indexes. If there are 5 elements in the array, the available indexes are 0,1,2,3,4. A value in an array is found by giving the name of the array , an open bracket, an index number (integer that happens to be a valid index), and a close bracket. Any integer value can be used as the index, whether it's a literal, a variable, or some other expression. The length of an array is found by:

arrayName.length

Note that this length is how many spots are in the array, which is one larger than the largest index.

int[] xs = {2,4,6,8,10}; //access elements in the array: int midVal = xs[2]; System.out.println("the first value in xs is : " + xs[0] ) ;

// change the value at one location in the array: xs[3] = 33;

//the array now contains {2,4,6,33,10}. System.out.println ("The xs array is " + xs.length + " spots long.");

Your Turn! Now we finally see how to access elements and print them! Try printing out a couple of different elements in your arrays. Make sure one of your arrays uses at least two dimensions, such as int[][] zs = {{2,4,6},{1,3,5}}. Try accessing individual values in it and also modify them.

Loops and Arrays Now that we know about arrays and how to access them, and we know about loops and how they let us systematically step through a range of numbers, we can use the loop to use each index in separate iterations, allowing us to visit each element in an array. The following example creates an array, specifies how long it is, uses a loop to fill in each spot (updates them) according to some pattern, and then prints them out (accesses the elements).

int[] xs = new int[15];

for(int i=0; i< xs.length; i++) { xs[i] = i*i; }

System.out.println("Some square numbers: "); for (int i=0; i

Your Turn! Print out all the elements in your arrays you've created. How will you handle the multiple dimensions? Use a Scanner and ask the user for 10 integers, storing them in an array. Print them back out. Generalize the Scanner example to be however many numbers they first tell you they want to enter. Does this affect where you declare your array? What about where you instantiate it? Try calculating the sum of an array of integers – what other storage might you need? What type should it be? Choose one of your arrays, and print it out its contents as we've done before, such as {2,4,6,1,3,5,7}, but based on whatever values happen to be in there. Report both the index of the maximum value in the array, as well as the value of the maximum value in the array. Chapter 4 Methods

Hello! Today we will focus on the static keyword, calling conventions of methods, and what scope and lifetime mean in Java.

Now that our chapters will tend to generate multiple files, I strongly suggest you create a folder for each lab, and then just add files to that folder.

Naming Things. Quick note on names: when we create a variable in a class – whether we make it static or not – we can refer to it as a field. We may refer to both fields and methods as members of the class. Although we have strived to use the same names for things, these words are commonly used to refer to the variables and methods in a class.

Methods – Static Methods The way in Java to name a block of code, and then be able to call it elsewhere, is to write a method. A method always goes in a class definition.

We have one major distinction to make, though: should the method be static or not? A static method can be called all by itself (no object of the enclosing class needed), much like a function in Python. A non-static method in Java is written in a class and associated with a particular object and thus can use the object's instance variables. Being non-static is the default behavior in Java, specified by the absence of the static modifier. We focus first in this tutorial on writing static methods.

Non-static Method Example Here is an example of a non-static method, in the Square class: look at the perimeter method below.

public class Square { public int side; public Square (int side) { this.side = side; } public int perimeter () { return (side * 4); } }

In order to call the perimeter method, we must have a Square object, such as sq.perimeter() in the following code:

Square sq = new Square(5); System.out.println("square's perimeter: " + sq.perimeter() );

Static Method Example Contrast this with the following doTheTask method:

public class Test { public static void main (String[] args) { int x = 10; String s = "hello"; int res = doTheTask(x,s); System.out.println("result: " + res); } public static int doTheTask(int a, String t) { int max = 8; // don't print too often int i; // declared here so we can use its value after the loop for (i=0; imax) { break; } } return i; } }

We actually have two static methods here: main, and doTheTask. Whenever we run a program, there's no action that creates a Test object – the main method is run directly, as if we had called Test.main(args) from somewhere else. Indeed, we can do exactly that:

public class OtherTest { public static void main(String[] args){ Test.main(); Test.doTheTask(5,"yo."); } }

Quick understanding question: how many times did doTheTask get called due to running the main method above? → Twice: once because Test.main calls it, and once because OtherTest's main method calls it directly.

Your Turn! Make a class named MyMathStuff, and then add these static methods to it: o sumArray, which accepts an array of integers and returns the sum of the array o maxIndex, which accepts an array of integers and returns the index of the maximum value in the array. (When no max exists for a length-zero array, return -1). Create another class, TestMyMathStuff, which does not create a MyMathStuff object but still uses each method and prints out the results. Add a static method to TestMyMathStuff, named useMMS, which creates an array, calls sumArray and maxIndex, and then prints each. Call it from TestMyMathStuff's main method.

Static Variables Variables can be static, too. Instead of calling them "instance variables" as we did for non-static variables (to indicate that we have one variable per object, or instance), we call these static variables class members to indicate that we have one shared value for the entire class. It will always exist. The most direct examples of this are also usually constants: Math.PI, Integer.MAX_VALUE, and Integer.MIN_VALUE. Because they are declared static, there is only one copy stored on the computer and we access that value via the class name. As a separate choice, making them final means that they cannot be changed – it is just the most meaningful to store the value of pi in the Math class, and the maximum and minimum values that an int can store in the Integer class.

When we create a static variable, since we will not instantiate it per-object at constructor-call time, we do actually initialize static variables at their declaration. (If we don't, a default value of 0/false/' '/null is supplied). // modifiers...... type... identifier...... instantiation. public static int identificationNumber = 1; public static final int FAVORITE_NUM = 49;

Just a quick note: our declarations are still following the pattern of modifiers type identifier. We are also just performing an instantiation at the same time. Keep in mind that public, private, static, and final are all modifiers. The above two lines of code are spaced out a bit oddly to show the grouping of the modifiers, type, and identifier.

To access a class member, we again just type ClassName.memberName (where we use the actual class definition's name, and the specific variable or method's name).

Your Turn! Print out the value of pi using the static definition in the Math class. Add a static variable to your MyMathStuff class to define your own silly version of pi as 3.0. Use it to print out the area of a circle with radius=5. (You don't need to create and use a Circle class for this one usage, just do the calculation directly). How do you access your version of pi? Longer Example: make a small class like Square, Coordinate, or Sphere. o Add a static variable named numCreated, and initialize it to zero. o In the constructor, increment this variable. You can also use the current value of numCreated during a constructor call to give a unique identification number to each object, sort of like a serial number on a musical instrument or the VIN on a vehicle. o Note that in this way, the constructor caller doesn't have to track how many objects have been created – the class takes care of it. This effect is not achievable without static variables (or some piece of state outside of the class).

Quick Notes on static. Assuming it is public, a static variable can be accessed by static and non-static methods within the same class. It can also be accessed anywhere the class containing it is accessible. A class can contain both static and non-static variables, and static and non-static methods. You can actually use an object of a class to access a static thing in that class (obeying the static thing's public/private visibility). You would type objectName.staticMember to access it; this is unusual, and we'd prefer to see Classname.staticMember instead. When we focus on visibility below, consider what it would mean to have a private static variable, or a private static method. Visibility and static-ness are two separate choices that we make for each member in a class.

References In the previous lab, we learned that: an object is the actual value stored in memory a reference is a 'handle' or 'address' that tells us where an object is a variable is something that may be created to store a reference.

If you have trouble describing the difference between any two of {object, reference, variable}, be sure you grasp the difference sooner than later. Drawing memory diagrams of your program at one specific point in time, and then stating where the variables/references/objects are is a great way to test yourself.

All of this 3-way distinction is for complex values, like objects and arrays – we call all of these sorts of types "reference types". Part of the reasoning is that Java always knows how large an address is (how much space it takes to store one), so it greatly simplifies the process of using these complex values.

Primitive values are immutable (can't be changed), so there's no point in distinguishing between a reference to a primitive value and the actual value stored in memory. (Of course a variable storing a primitive value can be changed all you want – as long as you didn't also make it final).

Given this very distinct view between primitive types and reference types, what does Java do when we call methods and use different types of values for our parameters? We'll discuss it in a moment!

Your Turn! Draw out memory – the variables, references, and objects – at each stage as indicated in the code below. Remember, String is a class, so String values are objects.

int x = 5; String s = "hello"; String t = "goodbye"; // draw out the memory here t = x+s; // draw out the memory here s = t; // draw out the memory here

Perform the same exercise here. This one uses our Square class, as written above.

Square s1 = new Square(5); Square s2 = new Square(8); Square s3 = s1; // draw out memory Square[] ss = new Square[3]; ss[0] = s1; ss[1] = s2; ss[2] = s3; //draw out memory ss[1].side = ss[2].side; ss[2].side = 3; //draw out memory

In each of the two above examples, how many were there of: variables? (all types) references? (all types, all over) objects + arrays?

Calling Convention Each language – be it Java, Python, C, Ada, Algol68, Haskell, PHP, Fortran, or any other – must decide on a calling convention for parameters. Do we copy the entire value that is passed in to a method's parameter? Or do we just pass in a reference to the expression? Do we even evaluate a parameter at all, before calling the method? Let's discuss the issues that help language designers make the decision, and then look at what convention Java follows.

Primitive values are all small and immutable. It is simple to just send a copy of this tiny value, because we know exactly how much space it takes up, and it is always a very small amount of space. Reference types embody information that could vary wildly in size; dealing with wide variety in the space parameters take up (regardless of the number of parameters) could be a real headache for compiler writers when trying to make method calls efficient, so having small predictable things as parameters is a good thing. Should we even evaluate the actual parameters? Migrating the values, expressions, and everything that was in scope for that expression (in order to be able to calculate it later on) is pretty hard, and gives some surprising behaviors. Almost all languages choose to always evaluate the actual parameters (arguments) no matter what, and just deal with these values. Copying large values (like a large array) as parameters could be wasteful – imagine writing a method that accepts an array and a key, and the looks for the index at which that key value shows up. Copying an array with thousands or millions of entries in it would be quite wasteful, especially since we don't want to modify it, just inspect it. When we do want to modify one small part of a large structure, it is wasteful to copy the entire thing, create a new copy of all that structure with the small change made, and then return this large structure to then be copied into the location originally occupied by the original value. Java, like many languages, decided to only allow returning one value. If we want to modify multiple things at once, we would have to somehow group all the values together for the return value of a method (like using a tuple, which Java doesn't have built in).

Java passes actual parameters using call-by-value. A parameter's specific value is calculated, and a copy of this value is given to the method being called. Since a copy was received, the original value is not affected. It is worth rephrasing to say that Java uses call-by-value for both primitive types and reference types. But the implications are different:

For primitive types, the value is immutable and of a fixed size, so there is no way this copy of the value can affect the original value. (There is no notion of updating part of a primitive value – 5 is always worth |||||, and we can't change that). For reference types, the copy of the reference can be modified, and the original reference will not be affected. However, both the original reference and the copy of the reference are able to access and modify the information found at that address, so the effects of using the copy to modify the object are still visible after the method returns. o Thus when we pass a reference type into a function, the formal parameter and the actual parameter become aliases for the same object. o Be careful – if we reassign the parameter (like a = expr), the parameter is no longer an alias of what was passed in; thus this change and any further modifications via the parameter are not affecting what was passed in. It's an easy trap to fall into.

Your Turn! Now that we have discussed references and the calling convention of Java, the following code and questions should help you focus on the differences between the objects in memory; the references to those objects; and the variables that hold references to those objects.

Create the following Foo class:

public class Foo { public int x; public String s; public Foo (int x, String s){ this.x = x; this.s = s; } public String toString() { return (" Foo(x="+x+",s=\""+s+"\")"); } }

Consider (and play around with) the following testing class:

public class TestFoo { public static void main(String[] args) { int y = 0; Foo f1 = new Foo(1,"A"); Foo f2 = new Foo(2,"B");

//problem #1 m1(y); System.out.println(y);

//problem #2 m2(f1); System.out.println(f1);

//problem #3 m1(f1.x); System.out.println(f1);

//problem #4 f2 = m3(f1); System.out.println("#1: "+f1+". #2: "+f2);

//problem #5 f2 = f1; System.out.println("#1: "+f1+". #2: "+f2);

} public static void m1 (int a) { a = 5; } public static void m2 (Foo f) { f.x = 3; f.s = "C"; } public static Foo m3 (Foo f) { Foo temp = new Foo(4,"D"); return temp; } }

For each of the problems (listed in comments), answer these questions: Is a primitive value or reference value passed in? What is printed on the next line? Were there any aliases during the method call? (if there was a method call at all) Were there any aliases after the method call? (or after running the statements, if no call)

Scope and Visibility

We need to be aware of when values and variables are accessible or not, and when they are created and destroyed. We say the scope of a variable is where in the source code we can refer to the variable, and we think of the lifetime of a value as the time from creation to destruction.

Scope Scope refers to the region of a program in which a definition is available. Each variable, method, and even class definition has a scope. We first focus on the location of the definition:

Locations and Scope: Where something is defined, and where it is accessible. local variable - defined within a block of statements, such as inside a for-loop, inside a method, or other control structure. Accessible from declaration to end of the statement block. parameter - variable defined in a method signature (the method definition). Accessible through the entire body of its method. (Parameters are truly just local variables). field (instance variable or class variable) - defined in a class. Accessible through the entire class definition, except that static methods cannot access non-static fields. method - defined in a class. accessible through the entire class definition. class - defined in a package. Accessible through the entire package. (Note: we're ignoring inner/nested classes!)

Note: if we have a local variable that is currently referring to an object, and that object has (publicly visible) fields that we can also access, that doesn't mean the interior thing we accessed is in scope here: we used the dot operator to go to a different scope and find something that was in scope over there.

For each of these, we should consider the definition's lifetime: when the definition begins its existence, and when the definition ceases to exist.

Lifetimes local variable - a local definition begins existence at the declaration statement creating it, and dies upon exiting the block of code in which it was created. (The closest-enclosing curly braces { } dictate the lifetime of local definitions). If you declare a variable inside an if-statement, this does in fact mean it is destroyed upon exiting the block. parameter - a parameter, defined at the start of a method, lives during the entire duration of one call of the method (and dies when we return from the method). Specifically, it is created (and instantiated with the corresponding actual parameter) when the method is entered, and is destroyed when we return from the method. instance variable - a particular instance variable begins when the object in which it resides begins its existence, and dies when that object is no longer accessible. If you don't give it a starting value in the constructor, Java will complain and not compile your code. class variable - a class variable begins its existence when the entire program begins running, because the class definition is in scope the entire time the program runs. It doesn't die until the program quits entirely. Note that since no object of the class is required to access it, the definition of its lifetime has nothing to do with the lifetime of any object either. method - a method exists the entire duration of the class containing it (so during the entire program's duration). class - a class exists for the entire program's duration.

A collective example is shown in an image below: → The scopes of things are written on the left side → destruction points (deaths) are indicated on the right.

Code example (text format): public class Bar { //begin scope of a, b, c, doStuff(), and other() public int a; private int b; public static int c;

public int doStuff(int x, int y) { //begin scope of x and y int count = 0; //begin scope of count for (int i = 0; i < x; i++) { //begin scope of i int temp = i; //begin scope of temp temp += y; if(temp % 17 == 0) { return count; } if(temp %2 == 0) { Strinng s = "even"; //begin scope of s System.out.println("looking + s); count++; } //end scope of s (s dies after if) } //end scope i and temp (i dies after last iteration, temp dies each iter.) } //end scope of x, y, and count (x, y, count die at method exit)

public void other() { return; } } //end scope of a, b, c, doStuff(), and other()

Notes a,b live as long as the object containing them exists (has reference to it) c lives the entire duration of the program (static fields are always there) doStuff() and other() are methods, they don't "die" but are only in scope with an object (they are non-static).

What do we do with scope and lifetimes? So what do we do with knowledge of the scope and lifetime of things? Especially for variables, it helps us understand how we can use them.

As long as an object has a living reference to it, Java will keep it around. Once no such references exist, Java marks it as garbage and effectively deletes it (specifically, Java reclaims that memory to use for storing other data).

We can look for and understand bugs, such as declaring a variable in too narrow a block of scope. As an example:

for(int i=0; i

Visibility A large part of the OO philosophy revolves around encapsulation and abstraction. We want to be able to only interact with the exposed interface of a class, to such an extent that the language itself gives the programmer the ability to enforce hiding of the interior representations and other algorithmic decisions.

We have four possible visibilities: public, private, protected, and package-private. public: allow anyone with a reference to access this member. private: disallow access from outside the object. (An object can always access its fields and methods; its methods can always access its own members of any visibility). protected: used during inheritance. (Spoiler alert: protected members are effectively private except that child classes can access them, unlike actual private members.) default : behaves as public inside the entire package, but private outside the package. (Package-wide visibility). This often leads to bugs once you share your package of code!

So far, we have perhaps written all of our classes, methods, and instance variables as public. But now that we are more comfortable with classes and objects and the components that we use to create classes, we should focus more on enforcing the abstractions of OO-style programming by hiding members of our classes.

Example If I have everything defined as public, and I am representing my house (and everything in it), then anyone who has my address can go inside and steal my sandwich. This is not good.

meanie.eat(myHouse . kitchen . fridge . sandwich); //sandwichless sadness.

But what if myHouse is locked? I can do this by making everything in my house private. Now you can't have my sandwich. Java will protect it for me!

//sandwich haven via compilation error. meanie.eat(myHouse . kitchen . fridge . sandwich); //compilation error: we can't access the kitchen when we're locked outside. what if I have a public requestSandwich method? Then you can request my sandwich and I'll give it to you.

politePerson .eat (myHouse . requestSandwich ());

We will talk about protected more once we've learned about inheritance. (But to look ahead, you could imagine letting your college kids have access to the fridge, and not have access to your diary; all this, while they have a key to the house that is always locked to outsiders.)

Reading and Writing Privileges Making everything public is problematic. No matter how carefully we construct a class, and write wonderful methods that can be used to beautifully solve the task at hand, any user of our class is free to go in and directly modify any state, and call any method even when it's not appropriate to do so.

public fields offer unlimited read and write privileges

Consider an example of a BankAccount class:

class BankAccount { public int balance; …}

//elsewhere, outside the BankAccount class: BankAccount ba = new Account(); ba.balance = 2100000000; //bad for business! exsAccount.balance = 0; // bad for the customer!

Rather than allow anyone to access a field member, we can make it private so that both reading and writing privileges are revoked:

class BankAccount { private int balance; …}

ba.balance = 2100000000; // compilation error int secret = exsAccount.balance; // compilation error

How can we selectively restore reading and writing privileges? Through public methods that are in the same class as the private field. A field is always visible inside the class, so we can choose to return a copy of a private field from a public method in that same class.

By creating a public method that happens to return a copy of the field, we can restore reading privileges. These methods are often called "accessors" or "getters". By creating a public method that happens to accept a parameter and update a field, we can restore writing privileges. These methods are often called "mutators" or "setters". For a field named fooBar, it is conventional to name the getter getFooBar, and the setter setFooBar. Notice the consistent use of camel-casing. There is nothing special about a getter or setter method – Java treats them just like any other public method.

public class Square { //private: revoke reading and writing privileges. private int side;

public Square (int side) { this.side = side; }

//restore reading privilege to side. public int getSide ( ) { return side; }

//restore writing privilege to side. Perform extra checks! public void setSide(int side) { if (side >0) { this.side = side; } else { this.side = 1; } } }

Good Practice: The purist's view is that every single field should always be made private, and that we then choose whether reading/writing is allowed selectively by adding getter/setter methods. The only exception to this would be constants – since they can't be written anyways, it is safe to restore just reading privileges by making a public final field member. → if you want to make something final and public, this is a good compromise of usability and good encapsulation.

Pragmatic Practice: when a definition is extremely simple – in a class that only has state, such as a class representing a pair of ints – if we are going to just offer reading and writing without any extra checks such as disallowing non-negative values, we might as well just make everything public and make the syntax that much more succinct.

Your Turn!

Rectangle Example Create a Rectangle class, with fields for length and width. Make them both public initially. Make a public constructor. (Note: even constructors can be private! Can you think of when this might be useful?) Create or reuse a testing class to create a Rectangle object, then modify its fields, and then print them out. Now, realize we want to write good OOP code, and change both fields to be private. Add getter and setter methods for both fields. Disallow non-positive values for each field. Go back to your testing class and edit the code to use the getters and setters.

Going Further: Let's remove the setter for width (revoke writing privileges for width). Add public methods to get and set the perimeter. But don't add a perimeter field! o In the setter, re-work the equation 2*(length+width)=perimeter to figure out how to modify just the width to achieve the given perimeter. o In the perimeter getter, calculate the perimeter based on the length and width. → notice how our internal representation can be substantially different from what the user sees? It appears that we have a perimeter field, but we have chosen a different implementation path. If we later change our minds, we can get rid of the width field and add a perimeter field instead, and not have to change the signatures of any of the methods in order to provide the same functionality. So let's try that: get rid of the width field, add a perimeter field, but don't change any methods' signatures (the visibility/return type/name/parameters portion of the definition).

BankAccount Example Create a BankAccount class. Give it private fields for accountNum, balance (in pennies), password, and customerName. What getters and setters do you want to provide? Add a constructor, with parameters for each field. Add another constructor, with parameters for each field except balance, and set it to zero. This is an example of an overloaded method. Add a toString method that returns a nice String representation of the account. Do you want to include all fields in this printout? (Honest question – think of where you want this method to be used, and realize if it is public you have already decided where it can be used). Add withdraw, deposit, and changePassword methods. What should their visibility be? Should a copy of the password be a formal parameter in any of these? Add sufficient checks so that a user can't deposit a negative amount, withdraw a negative amount, or withdraw more than they have in the account. Are there any other checks you want to add? Test out your code. Note on money: Why did we suggest an int instead of a double for money? Although we like to think in terms of dollars, the chance for imprecision in floating point calculations is unnecessary. If we view the amounts as pennies, we can always store the exact value in an int.

Chapter 5 File Input/Output

We now turn our attention to reading and writing text files. Java has libraries that make this pretty straightforward.

File I/O File I/O (input/output) means reading and writing the contents of files. We can view the contents of a file as if it contains text (in ASCII, Unicode, etc) or also as if it contains binary data such as integers, floats, and other types of non-textual data.

We'll show a couple of versions of reading and writing content, so just keep in mind that the preferred versions would be: read textual data with a Scanner write textual data with a PrintWriter read and write binary files with ObjectInputStream and ObjectOutputStream.

Reading Text with a Scanner. We have already been using Scanner objects to read from the keyboard via System.in. It turns out we can attach a Scanner to many different things! import java.util.Scanner; //before the class import java.io.File; //before the class ... Scanner sc1 = new Scanner (System.in); Scanner sc2 = new Scanner ("really long\nString\n\t\tthat I want to pick apart\n"); Scanner sc3 = new Scanner (new File("local.txt")); // needs exception handling… see below

In each case, we are then able to just use the Scanner methods all in the same fashion. We can call nextInt(), nextDouble(), nextLine(), and so on regardless of where the inputs are coming from. We get to use this one nice interface for everything!

When opening a file, we should close the file when we're done with it: sc3.close();

The only ingredient missing for us is to be able to handle exceptions that might occur when opening a file. For certain kinds of exceptions, Java will not allow us to write code that might allow those types of exceptions to silently escape. If we tried the declaration of Scanner sc3 above, we'd see the following compiler error message:

Error: /path/to/file.ClassName.java:8: unreported exception java.io.FileNotFoundException; must be caught or declared to be thrown

We will have a more detailed look at exceptions later on; for now, we will need to use our exception handling (try-catch blocks) successfully to just read a file: import java.util.*; // Scanner import java.io.*; // File public class FIO {

public static void main(String[] args){ Scanner sc3 = null; try { sc3 = new Scanner (new File("help.txt")); } catch (FileNotFoundException e){ //print actual error messages to System.err. System.err.println("file wasn't found! Choosing to quit now."); //return 0 for 'normal' exit; non-0 for crashy exit //(also resets DrJava's "interactions" pane). System.exit(0); } // assuming we got here, use the Scanner as normal. String s = sc3.next(); System.out.println("s='"+s+"'"); sc3.close(); } }

Notes We are printing error messages to System.err. We are choosing to end the program with our catch-block (via System.exit(0)), but your own program might want to loop back and try another file somehow, or use a default file, or whatever makes the most sense for your situation. We closed the Scanner when we were done with it.

Your Turn! Write code to ask the user for a text file; repeatedly ask for it until you get one that exists. Then print out its contents all in uppercase. Find an old class definition lying around that had you perform significant user interaction. Change the code to attach the Scanner to a particular file, and then try putting your user responses into a file all at once. See if you can run your program successfully in this wholly automated way! o You could write a program that accepts a file containing a single int (how many items are coming), followed by that many ints, and then print the sum of them.

Overall, using the Scanner class looks much more inviting. We'll try echoing lines in all-caps.

public static void main(String[] args) throws IOException { Scanner sc = new Scanner (new FileReader(foo.txt"));

String currentLine; while (sc.hasNextLine()) { currentLine = sc.nextLine(); System.out.println(currentLine.toUpperCase()); } sc.close(); }

The Last Newline Question The only complication we have is if the last line of text doesn't contain a newline character. Consider a file that contained the following in it: "a \n b \n c \n d e f". There are three words on the last line (d e f), but our Scanner's nextLine method will return false at the last line, because there is no newline character. We can fix this by just checking whether we have either any next token (using the hasNext() method, looking for non-whitespace) or a newline (using the hasNextLine() method):

if (sc.hasNext() || sc.hasNextLine() ) { … }

Now, it doesn't matter if we have empty lines at the end (caught by hasNextLine), or an incomplete line at the end (no newline character, caught by the hasNext method).

Note: although creating the Scanner as Scanner sc = new Scanner (new File("foo.txt")); also works, it makes it harder to identify non-existing files. The original way we showed also allows for closing the underlying file stream (because FileReader implements Closeable and File doesn't) , which lets us manage our resources better.

Your Turn! Use a Scanner to open a file that contains an integer first, indicating how many words are in the file; then, create an array of Strings of that length, and read in all the words using the next() method. o Be sure to test this on a file that has stuff on the last line, but no newline character (so, the cursor is next to a word but can't go down to the next blank line). Mix-and-match: use command line arguments to get the file name; does this change your usage? Where we've had "foo.txt", you can actually have a proper directory path, such as "foo/bar.txt". Give it a try. Question: what would you say is the difference between using a Scanner on System.in, using a Scanner on a File object, and using command-line arguments? o How might you reasonably use two or more means of input at once in a project?

Writing to Text Files Writing to a text file is very easy – we have already been using System.out.print, System.out.println, and System.out.printf to write to the terminal, and we will now use a PrintWriter object to print/println/printf directly to a named file. We still must deal with exceptions just like when reading a file, though. Here's a complete example: import java.util.*; // Scanner import java.io.*; // File, PrintWriter public class FIO {

public static void main(String[] args){

PrintWriter pw = null; try { pw = new PrintWriter(new File("help.txt")); } catch (FileNotFoundException e) { System.err.print("couldn't open file for writing!"); System.exit(0); } pw.print("partial line "); pw.println("complete line"); pw.printf("with %s\n","substitutions"); pw.close(); // always! } }

Notes We can use print and println in the same familiar way as we have with PrintStream (System.out is a PrintStream). It even works for our own types, like a Square object, in getting the String representation via the toString implementation. We always close the PrintWriter when we're done! This is when your file actually gets written, even though all the previous code was figuring out what will be written. Nothing about file-writing means we can't still get input via a Scanner anywhere, or even still write to the terminal via System.out.

Your Turn! Try writing a method that accepts an integer n as a parameter and a String filename for a filename, writes n to a file, and then that many integers, into the named file (separated by some whitespace). → this kind of file can be read by a previous Your-Turn task! Just write any numbers you want – all 7's, increasing/decreasing numbers, anything. Write a complete program that (1) asks the user for a filename and a number to use for n; (2) calls your file-writing method; (3) calls your file-reading method and prints out the result to the terminal. In a method that accepts an array of ints as a parameter, create two separate PrintWriter objects, attached to two files, named evens.txt and odds.txt. Using that array as your source of numbers, print the even numbers on separates lines in the file evens.txt and print the odd numbers on separate lines in the file odds.txt.

Historical Diversion: BufferedReader and BufferedWriter.

Prior to the Scanner class's existence, the way to perform textual reading and writing was through the BufferedReader and BufferedWriter classes. You might come across it in legacy code, so it is presentd here. But please use Scanner and PrintWriter when you've got the choice.

Textual Input with BufferedReader.

As a simple approach, let's begin with an example that allows us to read lines of text from a file, and prints them to the terminal in all-caps. import java.io.*; // gets BufferedReader, FileNotFoundException, IOException, FileReader… public class Lab7 {

public static void main(String[] args) throws IOException { String theFileName = "foo.txt"; BufferedReader br = new BufferedReader(new FileReader(theFileName)); String currentLine = br.readLine(); while (currentLine != null) { System.out.println( currentLine.toUpperCase() ); currentLine = br.readLine(); } br.close(); } }

The important parts of this example are: (1) we see how to create a BufferedReader object based on a file name (2) we are seeing the invocation necessary to call the readLine method (3) we can check if the end of the file has been reached by checking if the result of reading a line is null.

Here are the littler details of the example: we create a BufferedReader object, which represents a stream of characters that can be read using the readLine() method. We supplied our buffered reader the stream of characters in the form of a new FileReader, whose constructor needed a String containing the file name.

As a brief look at Java prior to the introduction of the FileReader class, we would create a BufferedReader like this:

BufferedReader oldWay = new BufferedReader( new InputStreamReader( new FileInputStream(new File("foo.txt"))));

Hideous long-lined code! At least the usage afterwards would just be the same as above, calling readLine and checking that the result isn't null.

There aren't many other useful methods in BufferedReader: we can read a single character with the read() method, and some other ways to mark our current place in the stream and jump back to it, but we don't have any of the convenience that we saw with the Scanner class.

Textual Output with BufferedWriter

File output is achieved in similar fashion. This first time around, we will use a BufferedWriter object. We will do even better with a PrintWriter next. But you just might come across code that still does things this first way we show, so let's take a quick look at it.

public static void main(String[] args) throws IOException { BufferedWriter bw = new BufferedWriter ( new FileWriter("out.txt")); bw.write("hello!"); bw.write(" more stuff!\n yeah\n that's write\n"); //spelling intended. bw.write("the end.!no newline."); bw.close(); //always! }

Notes: We feed the contents we want written to the file a String at a time, through the write method. We manually add newlines ("\n") as necessary. We always close the file when we're done! because the write method expects a String, if we pass other types to it, it unfortunately won't automatically use toString definitions to get the String representations – it will record their actual byte representation, which would then look like various seemingly-random characters once those bytes are used as lookups into the ASCII/Unicode tables. Manually call toString (or even wrap up as a Wrapper object to do so on primitive types) for non-String types.

Your Turn! write your name, favorite number, and favorite color to a file using a BufferedWriter.

Reading/Writing Binary Data

We can also read and write files that contain binary data: the actual bits representing, say, an int, double, or Square, rather than human-readable ASCII characters. The difference between a text file and binary file is the same idea as the difference between "5" and 5; they have different bit representations and will only make sense if interpreted the correct way.

We could use DataInputStream and DataOutputStream if we only had primitive values, but we can also use ObjectInputStream and ObjectOutputStream for a mixture of objects and primitives, so we will jump directly to the Object* versions.

Let's start with the following two pieces of code:

example code using an ObjectOutputStream for writing, and ObjectInputStream for reading:

public static void main(String[] args) throws IOException,ClassNotFoundException {

ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream ("test.bin")); oos.writeInt(5); oos.writeDouble(Math.PI); oos.writeObject(new Square(17)); oos.close();

ObjectInputStream ois = new ObjectInputStream (new FileInputStream ("test.bin")); int x = ois.readInt(); double pi = ois.readDouble(); Square sq = (Square) ois.readObject(); //note we had to cast to our object type System.out.println("x="+x+"\npi="+pi+"\nsquare: "+sq+"\n"); ois.close(); //always! }

a modification to our Square class so that it can be written:

import java.io.Serializable; public class Square implements Serializable { …

Notice that we can write primitive types with writeInt, writeDouble, writeBoolean, and so on. For all objects, regardless of the class, we need them to implement java.io.Serializable, and then we use the writeObject method of ObjectOutputStream to write it.

Notes We are writing the actual bit representations to a file; try opening the test.bin file above with your favorite text editor. Why do we not see the written contents in human-readable form? Because "5"≠5, and we are interpreting the bit-pattern for 5 (which takes 4 bytes = 32 bits) as a character encoding, reading a byte at a time and interpreting them as ASCII codes.

We always close the file! Whether we are reading or writing it, we close it when we're done. Yes, if the program ends, closing may be taken care of for you; still, you should get in the habit of closing them on your own.

Only Serializable things can be written/read to binary files. The Serializable interface has zero methods, though, so if there's even the slightest chance that a class might be used in this fashion, go ahead and implement it.

Of course the file-writing and file-reading don't have to be adjacent blocks of code! Now you can write programs where you store objects for 'long-term' storage between program runs, and you can read the information back in. For instance, the high scores table in a game could be stored in a binary file by writing the objects to the file, and they could be read back in the next time the program runs.

One Step Further: If your object contains references to other objects, Java will automatically write those other objects into the stream so that they are also read in automatically. Even better, suppose obj1 and obj2 both have references to obj3. If you write obj1 (causing obj3 to be written too) and then write obj2,(with a reference to obj3 also), and then later on read them both, they will still have references to the same third object! When you write obj2, Java realizes that it's already written the object to which obj2 is referring, and it will store a reference to that already- written object seamlessly. This only works within one output stream, so if you are creating multiple ObjectOutputStreams and writing an object via both, they will be separate copies and are no longer linked together. o Always use just one ObjectOutputStream object to write one file.

Sadly, appending more objects to a file isn't directly allowed (perhaps due to seeking for shared sub-objects?). But you can read in the objects, and then write them as well as your new objects with a single ObjectOutputStream.

Your Turn!

Make a couple classes that are Serializable. It would be even better if one of them contains fields of the other (like a Sphere having a Coordinate). Reusing classes you have lying around is just fine here; the point is to make them Serializable so we can use them next.

Create an ObjectOutputStream, create a few objects, and write them to the file (using writeObject(..)). o Write some primitive types into the file, too (using writeInt(..), writeFloat(..), etc). o close the file.

Create an ObjectInputStream, and read in the objects. You'll have to do so in the exact same order as you wrote them; we cannot escape the ordering. o You'll have to cast each object to the type you're expecting. This could cause an exception if you cast to the wrong type. o close the file.

Use (maybe just print) the objects again, showing to yourself that you've successfully read in objects/primitives from the file.

Problem: Suppose you want to make your class Serializable, but it has a field of a non-serializable class. How could you get around the situation, so that you can still have a serializable version for your class? Direct coding solutions, or reading about the issue online, think of how you can use the Java ideas you've already learned (perhaps quite recently, hint-hint) to solve this issue.

Chapter 6 Classes and Objects

Hello! Today we will focus on creating classes and objects. Now that our practice problems will tend to generate multiple files, I strongly suggest you create a folder for each one, and then just add files to that folder.

Classes and Objects: The Big Idea A type can be thought of as representing a set of values.

byte is the set of values {-128,-127,…,-1,0,1,…,127}. int is the set of values { -2147483648..,-2,-1,0,1,2,… 2147483647}. boolean is the set of values {true, false}.

We say that the false value is an "element of the boolean type", "false is of type boolean", "false has type boolean", or more simply, "false is a boolean." Similarly, "13 is an int." If we require a value whose type is an int, we may say that we "need an int." It is always important to differentiate between the values (like 5, "hi", true) and the types (like int, String, boolean).

A programming language comes with some specific built-in types. Java has the eight primitive types (boolean, byte, char, short, int, long, float, double), as well as array types. Any other types (like Strings) that seem to be provided in Java are actually present in the java.lang package, which is implicitly imported. (Much like the 'built-in' definitions of Python).

But the world around us has a lot more structure than just numbers, booleans, and arrays or Strings. There are cars, people, traffic lights, trees, instruments, all kinds of things that are the objects involved in the system that is our real world. If we want to solve real world problems, we'd like to mimic the structure of the real world with values that exhibit the same structure.

So if we don't want to be limited to just values of those types, and we want to get any more abstraction in our data representations, how can we create new types of values? How can we group values together in a meaningful fashion, and then define how to use them?

A class definition defines a new type of values in Java. It gives us two main capabilities.

First, it allows us to meaningfully group different types of values together to represent some more complex structure, such as a String and an int used to represent a person (for their name and age), or two doubles representing a rectangle (for the length and width). We may call these sub- values by many names: instance variables (because each object is an instance of the class); attributes (because these are the salient parts of any object of the class); field members (a bit more of a traditional name for Java documentation).

Second, we get to define the acceptable behaviors of these values by defining methods ('behaviors'). It gives us the chance to not just represent the world through values, but animate that world-representation through interactions (method calls).

An object represents a specific value of a class. The object has its own value for each attribute listed in the class, and is stored in its own unique place in memory.

When we create a variable to hold an object, we actually store a reference to the object: a 'pointer' into memory that is only allowed to refer to values of a particular type.

We can create multiple objects from one class. Each object resides in its own location in memory. When we ask a particular object to run a method from its class, the object uses only its own instance variables when running the code.

Defining a Class When we create a class, we can create various members of the class. We can declare variables, to indicate that each value of this newly defined type should have its own value for each of these variables. We can declare methods, each defining the way in which these values can behave.

Think of the methods as actions that outsiders can request the object to perform. These actions can be simple, like reporting some quantity or value; they can be complex, like simulating some process, generating new structures, or other tasks like searching/sorting through the instance variables. The statements in a method can be any procedural-style statement – calling other methods, defining variables, using control structures, and performing exception handling.

Once we've got a class definition, we can then use it as a recipe to create individual values of this new type, which we call objects. Each object is guaranteed (required) to have its own copy of all the declared variables, and thus we can call all the methods on that object.

(This is a simplification – once we add the static modifier to methods or variables, and once we discuss what the private modifier does to methods or variables, we will also see how that qualifies the description above).

Class Example Here is a very simple example class definition, with no methods:

public class Person { public int age; public String name; public String email; }

Your Turn! Copy that class definition into a file named Person.java .

Creating Objects We could then create Person objects like this:

Person p1 = new Person(); p1.age = 18 p1.name = "Billy"; p1.email = "[email protected]";

Person p2 = new Person(); p2.age = 19; p2.name = "Sally"; p2.email = "[email protected]";

System.out.println("p1: age: "+p1.age+", name: "+p1.name+", email:"+p1.email); System.out.println("p2: age: "+p2.age+", name: "+p2.name+", email:"+p2.email);

First of all, where should this code go? A class definition is just a definition, not a specific request for Java to perform any action. A method does nothing, but a method call runs the statements in the method. We learned that running a Java program is synonymous with running the main method of a class, so we want to put this code in a class's main method (or somewhere that will be reached by a class's main method).

We can just add a main method to this class, or we can create an entirely separate class, and add the code to that class's main method. We don't want to get in the habit of thinking the code using a class is required to be in just that class's main method, so let's create a separate class (in the file TestLab2.java):

public class TestLab2 { public static void main(String[] args) { Person p1 = new Person() ; … } }

Now, when we want to run our program, we can just compile TestLab2.java, and run TestLab2, and it will automatically deal with compiling Person.java, and integrating the compiled Person.class file with the TestLab2.class code.

javac TestLab2.java java TestLab2

Even though we don't explicitly execute javac Person.java, notice that the Person.class file has been generated. Incidentally, even if the Person class had its own main method, by executing java TestLab2, we are not using Person's main method – we are only executing TestLab2's main method. It is far more common to have many classes involved with only one main method, so that is the usage pattern we are trying to establish.

Your Turn! create a class named Player that has these instance variables in it: name, age, height, weight, position. What types will you choose for each instance variable? create three of your Player objects, and give them values for each instance variable. Print things out so that you are comfortable seeing them Arrays. Remember, arrays can hold anything. Make an array to hold Person values, and make it length three. Can you figure out how to put those three Person objects you just made into the array? Can you use the array to perform the printing? Especially useful may be the for-each loop. Create another class, Bottle. Each bottle should have state to represent how many ounces it stores, how many ounces of fluid are currently in it, whether it is insulated or not, and whether it has a lid or not.

Constructor Methods We find that the first thing we want to do with a brand new object is to give values to all the instance variables; this is so common, that we can write a special method called a constructor to serve this exact purpose.

A constructor is a method for taking a brand new object and setting up all the instance variables, and then returning (a reference to) that object. Because a constructor must return that object, the return type is not specified. It is actually a compilation error to attempt to give a return type to a constructor. The name of a constructor method is identical to the class name, so that Java knows the method is a constructor. Just like any other method in Java, we can then require zero or more parameters.

Add the following example constructor method to the Person class:

public Person(int a, String nm, String e) { age = a; name = nm; email = e; }

Before we add our own constructors, we need to point out a more common pattern of naming conventions in constructors:

public Person(int age, String name, String email) { this.age = age; this.name = name; this.email = email; }

The this keyword functions as a reference to the current object running this method. Since we are creating a new object of this class type, this refers to the object being initialized. Why doesn't Java get confused between the instance variable age, and the parameter age? Usually Java chooses to disallow situations like this where there is any chance for confusion. But having to choose new parameter names for each of the instance variables is another way to introduce confusion or errors. We will discuss parameters and scope in better detail soon, so just look at the above code and know that this.age refers to the current object's age attribute, and age refers to the parameter (because the parameter is the 'closest' definition of age).

Your Turn! Add a constructor to your Player class. What parameters will you have? Will you use the this- notation or not? Do you need a return statement? Add a constructor to your Bottle class. Ask yourself the same questions as for the previous constructor.

What Types are Allowed? Now that we are starting to create more structures, we might wonder if there are any limitations on what types we can use for instance variables. It turns out there are no restrictions: primitive types, arrays, other class types, we can even use the type of the class we are defining!

Your Turn! Add a Bottle attribute to the Player class. Hydration is important for our sport-generic athletes! In your Player constructor, you will now need to create the bottle. You are allowed to write things like bottle = new Bottle(…); inside a constructor. Bonus (more difficult): Matryoshka dolls are those "nesting dolls" that are popular in Russian folk art. Create a Matryoshka class, with these attributes: canOpen, innerDoll, and any others you might want to add. The issue of how to stop making inner dolls is tricky, and requires an understanding of recursion. The issue is solved by using the "not-a-value" value for the innermost doll: the null value can be stored into a reference of any class type, and means that there is no current value. Thus the purpose of the canOpen attribute is to serve as a safety check for whether we can access the inner doll. We could also just check if (innerDoll==null).

Creating Objects (again) We glossed over creating objects a bit. Now that we have constructors and a better understanding of classes, we can discuss a bit more the process of creating objects. Let's assume that our Bottle class has the following constructor:

public Bottle (int ounceMax, int ounces, boolean isInsulated, boolean hasLid) { …}

With this constructor, we can now create Bottle objects. The big distinction we need to make is between objects (the values in memory) and references (the variables/expressions that refer to a particular object value).

Objects versus References When we create a new object, we are reserving a new space in memory. The new keyword is what actually creates the space, while the constructor method merely tells us how to set up that already- reserved chunk of memory.

new Bottle(32, 21, false, true)

We have seen the new keyword before, when we created space in memory for an array:

new int[10] new serves the same purpose, reserving exactly enough space for this complex data structure.

So what is a reference? A reference is simply a "typed address" for a particular object value. It is roughly equivalent to variables for class types, though we can obtain a reference from a method call as well. Bottle hydroHolder = new Bottle(32, 21, false, true) Bottle coffeeCup = new Bottle(12, 5, true, false) hydroHolder is a reference for the first created object (call it 32oz), while coffeeCup is a reference to the second created object (call it 12oz). But there is no restriction on a one-to-one reference-to-object relation:

Bottle bensBottle = hydroHolder; Bottle jensCoffee = coffeeCup;

After these two extra references have been created, we still only have two objects, 32oz and 12oz. But hydroHolder and bensBottle both point to the 32oz object, and coffeeCup and jensCoffee both point to the 12oz object. When we have multiple references to the same object, we call them aliases. Just as you have already learned about mutability and multiple names for the same spot of memory from CS 112, we will see that references in Java allow us to modify complex values 'from afar' by passing a reference.

Your Turn! Create the two objects and four references as above. Learn what happens when you try to print the reference itself (e.g., System.out.println(jensCoffee)). This is much like printing out objects in Python, where you get both a type and some sort of obfuscated memory address that uniquely identifies the location of the object (or array). Printing references, can you see that they are the exact same two objects despite there being four references? Print out all the information you can for the objects from all four references (yes, two of them will be repeats). Modify hydroHolder and jensCoffee in some way (such as hydroHolder.ouncesMax = 42;), and then re-print all of that information and note where your changes took place. You are seeing aliases in action!

Take away experience: Objects are the actual values in memory. References are our handles that help us find those objects. Just as many people can have your mailing address written down but there is only one house at that location, there can be many references to a single object. Keep this in mind, especially when surprising things are happening in your code. Chapter 7

Inheritance Class Inheritance: Introduction extends usage protected visibility abstract classes/methods

Hello!

Be sure to actually work through the examples in order; the "Your Turn" problems build upon each other extensively.

Today we will cover another key OO concept: inheritance. We're comfortable with all the structure and hierarchy of creating separate classes, using fields to get some aggregation going, but our class definitions are still all written separately. Inheritance is going to allow us to write classes by essentially borrowing all of another class's definition, and selectively adding more fields and methods to it, with some nifty relationships holding between the original and newly-defined classes.

Motivation

So far, the only way we could cause two classes to interact in any way was through aggregation: where the object of one class "HAS-A" object of another class as part of its own data. Let's quickly review an example of aggregation, because we need to be sure we contrast the idea of aggregation with the new concept of inheritance.

Aggregation Example: A Sphere HAS-A Coordinate: public class Coordinate { public double x,y,z;

public Coordinate (int a, int b, int c) { x=a; y=b; z=c; } } public class Sphere { private Coordinate location; private in radius;

public Sphere (int , int x, int y, int z){ radius = r; location = new Coordinate(x,y,z); } } A particular sphere has its own location. Each time we create a Sphere object, we will have a Coordinate object as well. Although we see that an instance of one class (a Sphere) has a reference to an instance of another class (a Coordinate), neither class definition is a more specific version of the other. A Sphere isn't a Coordinate, even though a Sphere has-a Coordinate; a Coordinate isn't a Sphere. Similarly, Sphere isn't a double, even though it may have one to represent the radius.

Inheritance

The classes themselves each represent a type that is entirely separate from anything else: A Sphere object has-a radius, and it has-a Coordinate object, but nothing relates our Sphere class to a Shape class or a Cube class.

We want to start identifying different class definitions that should somehow be linked together, defined somehow via the same code.

When Would I Want Inheritance?

Consider the following three class definitions. We are creating a (very simple) program to use on campus, and want to store information about various people on campus. We have students, employees, and other non-specific persons.

public class Person { private String name; private int age;

//methods: (constructors), getName, setName, getAge, setAge. } public class Student { private String name, major; private int age, masonID, yearsOnCampus;

/* methods: (constructors), getName, setName, getMajor, setMajor, getAge, setAge, getMasonID, setMasonID, getYearsOnCampus, setYearsOnCampus. */ }

public class Employee { private String name, jobTitle; private int age, masonID, yearsOnCampus;

/* methods: (constructors), getName, setName, getJobTitle, setJobTitle, getAge, setAge, getMasonID, setMasonID, getYearsOnCampus, setYearsOnCampus. */ }

We see some overlap in these definitions: all of these classes have a name and age. Some of the classes have masonID and yearsOnCampus variables. And a few other variables are unique to different classes (major, jobTitle). Now imagine that we added constructors, getters, and setters for each of these instance variables. We'd see more and more duplicated code, both in data and behaviors (methods).

We can create multiple objects from each of these classes:

Student s1 = new Student( "Bob", "art", 19, 71922, 2 ); Student s2 = new Student( "Helen", "education", 18, 71923, 1 );

Person[] people = { new Person("A",1) , new Person("B",2) , new Person("C",3) };

Employee e1 = new Employee ("Catherine", "cashier", 35, 71985, 7);

But there's a problem: we can't actually use these definitions together very well. Suppose we wanted to represent the people in line at the cash register in the food court:

Person pers = new Person ("Sally", 25); Student stud = new Student ("Buddy", "undeclared", 21, 71234,2); Employee emp = new Employee ("Doug", "HR", 41, 72468, 10);

SomeType[] customers = {pers, stud, emp}; // BAD CODE ????????

What type should we put for SomeType? We can't choose any of Person, Student, Employee, because there's something in the array that doesn't match the type.

The problem is the lack of a relation between these different types: there is no way for us to treat a Student like a Person. Even though a Student has all the instance variables a Person has, and has all the methods a Person has, we can't treat a Student like a Person. Java has no idea that it is safe to treat a Student like a Person. How impersonal! Similarly, we can't treat an Employee like a Person either. Doug in HR is going to hear my complaints about this work environment.

We want to give Java the information so that these different classes are linked together: we want to be able to say a Student IS-A Person, and that an Employee IS-A Person.

So: we want to relate these different class definitions by actually defining one class in terms of another. We want to reuse the Person definition, and extend it to tell Java how a Student is a more-specific version of a Person, and how an Employee is a more-specific version of a Person. public class Person { private String name; private int age; //methods: (constructors), getName, setName, getAge, setAge. } public class Student extends Person { // we inherit name and age. private String major; private int masonID, yearsOnCampus;

/* methods that we write: (constructors), getMajor, setMajor, getMasonID, setMasonID, getYearsOnCampus, setYearsOnCampus. */ } public class Employee extends Person { // we inherit name and age. private String jobTitle; private int masonID, yearsOnCampus;

/* methods: (constructors), getJobTitle, setJobTitle, getMasonID, setMasonID, getYearsOnCampus, setYearsOnCampus. */ }

Let's stop and think a moment about our two uses of extends Person.

Student extends Person. By adding those two words (extends Person), we are telling Java that a Student is a Person. That means that all the data a Person has, a Student has that data, too. All the behaviors that a Person can exhibit, a Student can exhibit too. We don't explicitly list String name and int age in the Student class, but each Student object will have these instance variables, because a Student is a Person and a Person has a name and age. Similarly, all the methods that were defined in the Person class (e.g., getName, setAge) are also available to Student objects, because a Student is a Person.

When we draw the memory usage for objects on the whiteboard with a circle encompassing all the data and methods, we would now draw a Student object by drawing a Person object, and then adding in the major, masonID, and yearsOnCampus instance variables to the data portion (top half), and adding the extra method definitions to the behaviors section (bottom half). Again, note that all we do with subclasses is add definitions, never take away. This crucial only-adding-things property is why we're able to say that every single Student object can always be used as a Person object. It's as if we temporarily ignore the extra variables and methods that Students have, and just rely on the inherited portions.

In the diagram, we see that the Student object has all the Person fields (age, name), followed by the extra fields specific to Students (major, masonID, yearsOnCampus). Similarly, it has all the Person methods, followed by all the specific Student methods (getters/setters for major, masonID, and yearsOnCampus). If we ignored parts of the Student object, it would look just like a Person object. This is the guarantee that Java is given when we extend a class to create a more specific version of it.

Some Terminology

We have a lot of descriptive ways to describe the relationship between Student and Person. • We can say that the Student class is a subclass of the Person class, or that Person is a superclass of Student. • We can also call Student a child-class of Person, and Person the parent-class of Student. • We can also say that Student is a subtype of Person, and Person is a supertype of Student.

Using the is-a Relationship.

So how can we actually use this "Student is-a Person" relationship? Consider the following legal code (once we've actually written the constructors):

Person p = new Person ("A",1); Student s = new Student("B", "CS", 20, 71235, 3);

System.out.println("p name: " + p.getName()); System.out.println("s name: " + s.getName()); p = s; System.out.println("p name: " + p.getName()); The p variable can only hold a reference to a Person object. But a Student is-a Person, so p can also hold a reference to a Student, because that means it also happens to be storing a reference to something that is- a Person.

Now, back to our register example. We now know that a Person is-a Person, a Student is-a Person, and an Employee is-a Person. So it's safe to use an array of Person references to store our various values:

Person[] customers = {pers, stud, emp};

Your Turn! Create files for all three of the above classes. For now, don't write constructors yet; rely on the annoying default constructors. That means we have to create our objects this way:

Person p = new Person(); p.setName("Bob"); p.setAge("20");

Student s = new Student(); s.setName("Jane"); s.setAge(19); s.setMajor("art"); …

• Caution: if you accidentally add name and age fields to Student or Employee, Java will allow it (it is called shadowing, and is generally a bad idea to use this 'feature'). Make sure that you don't re- define anything that is supposed to be inherited. • Add another class that extends Person. Choose some other type of person that would be on campus: Policeman, Athlete, Professor, or whatever. Even if it seems to overlap with either Student or Employee, it's okay, as long as there's extra data you want to keep track of for this new type of person. • Create objects of each of your class types. (Put this code in TestInheritance.java). Explore using the getters/setters, and see that you do indeed get to use methods declared in the Person class when using a Student object, Employee object, and so on. • Create a toString method for just the Person class, and not the others. Retest the "Using the Is-A Relationship" code, but just print out p and s instead of p.getName() and so on. Notice that even methods like this can be inherited. We'll see how we can do better to represent Student objects than being stuck with the Person version of the toString implementation. Visibility

We finally have a chance to experience the last visibility: protected. Try writing a toString method for the Student class. Print all the relevant info. For example, try to get your toString method in the Student class to output:

Student{name="foo", major="art", age=19, masonID=1234, years=2}

What goes wrong? We get a complaint from the Java compiler that name and age are private. (Assuming you used the visibilities shown above! If the knee-jerk reaction to make everything public kicked in, just go back to Person, make those instance variables private, and then see the child fail to access it). Even though we're inheriting from the Person class, private still means "only accessible in code actually written in this class definition". Rather than make name and age public (which violates encapsulation principles), we are offered a compromise: the protected visibility. This will cause name and age to still behave as if it were private in places such as your testing code (in some other class like TestInheritance), but they will now behave as if they were public when accessed from a child class like Student.

Your Turn!

• Change the visibility of name and age to protected. o First, verify that you still can't access them from TestInheritance. This is because it's not a child class of Person. • Go back and write your toString method in the Student class. Now that they are protected, it's as if they were declared with public visibility as far as the child classes are concerned. o Revisit the "Using the Is-A Relationship" example from above (and still just print p and s rather than p.getName() and so on). Now it's even more apparent that we have a Student object being used where we expect a Person.

Overriding Methods

In the last example, we witnessed something that is actually kind of amazing: the Student class got to override the definition of the toString method. Person already provided a toString method, which used to be what our Student class used. But we decided we could do better, and got rid of that definition in favor of a more specific one.

We can override methods inherited from the parent class, with a few requirements: • We have the exact same method signature, and then we get to supply a different body (implementation) for the method. • The parent class has to give us permission to override a method by not making the method final. Yes, methods can be final, too – it means that the method body cannot be modified. So if we don't want re-implementations of a method in child classes, we just say something like public final int importantMethod(The args) and now overriding is not allowed.

Whenever our code calls this overridden method on objects of our child class, it will always use this 'better' version that we provided, thus overriding the original definition.

Your Turn! • Add the wakeUpTime method to the Person class, and override it in the Student class. (Just return a String for whether this person wakes up early or late; I'm assuming that students like to sleep in, and employees don't get to). • Test it out (call it on a Person, and on a Student). • Now make the Person version of the method final, and look for the compiler error.

We can actually tell Java that we intend to override a method: just put @Override right before a method that is supposed to be on overriding definition. If anything goes wrong, such as not being allowed to override the method, or it turns out that your method doesn't actually override any method, the compiler can now alert you.

when would we not be overriding a method when we thought we were? Consider a method with signature @Override public String tostring() . It's trying to override toString, but neglected to capitalize the S.

Your Turn!

• add a method that isn't inherited to your Student class. Add the @Override tag and see that the Java compiler catches the issue.

Inheritance and Constructors

Let's finally discuss how to make constructors for child classes. As a first step, add a nice constructor to the Person class: public Person (String name, int age) { this.name = name; this.age = age; }

You will also need to change your Person instantiations to look like new Person ("Bob", 20). Try to re- compile. Specifically, try to recompile just the Student class:

javac Student.java

What happens? Java can create Person objects using the constructor you added, but now there's no default constructor, and so we can't even compile the Student class anymore, which was actually relying on it. Our Person class declared that it was no longer acceptable to use the default constructor, and it exposed the fact that our Student constructor actually used the Person constructor to complete the task of constructing a Student object. Let's create a constructor for the Student class.

First (Bad) Attempt. Let's try to just write a Student constructor that doesn't rely on the Person constructor at all.

//BAD attempt at a child class constructor public Student (String name, String major, int age, int masonID, int years){ this.name = name; this.major = major; this.age = age; this.masonID = masonID; this.yearsOnCampus = years; }

This fails; somehow, the Java compiler still wants to find the constructor Person(). It turns out we have to learn another keyword: super. super is similar in nature to this. The keyword this refers to the current object, giving us access to values and methods of the object. super refers to the parent class, giving us access to the members of the parent class which we otherwise maybe couldn't see in the child class. If we want to access anything in the parent class, such as a specific constructor method, we have to use super to access it.

// GOOD attempt at a child constructor public Student (String name, String major, int age, int masonID, int years){ //call the parent constructor super(name,age); //finish instantiating things declared in this class. this.major = major; this.masonID = masonID; this.yearsOnCampus = yearsOnCampus; }

The call super(name,age) is actually calling the constructor of the Person class that accepts a String and an int. The call to super needs to be first in our constructor body so that Java knows what's going on.

Required super calls

Java requires us to use the parent class's constructor when writing our own constructor. This ensures that our claim that "Every Student is a Person" won't be violated by some naïve implementation of a constructor. It turns out that an implicit super() call is added to constructor method of child classes unless we add our own super-call. That's why our first attempt at a Student constructor failed: there was no public Person() constructor definition any more.

By creating a new constructor for Person, and then explicitly calling that new constructor with a call to super(some,args), we are fulfilling the "always use your parent class's constructor" requirement while also completing the instantiation efforts that are local to our own class's instance variables.

Your Turn!

• Create constructors for all classes. Be sure to use the non-default constructor from the Person class by using the correct super call to the Person constructor from your child classes' constructors. • When creating constructors for Student and Employee, remember that you have five instance variables to instantiate (two came from the Person class, three are defined here in the class). • Create all the getters and setters for the instance variables introduced in each class. (Remember, the ones defined in the Person class are inherited in the child classes: you won't write getName in the Student class, but Students will inherit the getName getter that you define in the Person class). Will anything need to be made protected instead of private?

Abstract Classes

One last topic for inheritance is the notion of an abstract class. Abstract classes participate in the class hierarchy of parent and child classes, but by making a class abstract, we are stating that no objects of this specific class may ever be created. (Objects of child classes would be permitted; otherwise, there would be no point!).

Let's consider an example of an abstract class. If we look to our original three classes, we saw a slight chance for more overlap: the Student and Employee classes shared instance variables masonID and yearsOnCampus. Let's create a class to help us indicate that relationship by leaving the Person class alone, but introducing a new class between the Person class and the two child classes Student and Employee: public class MasonPerson extends Person { protected int masonID; protected int yearsOnCampus;

public MasonPerson(String name,int age,int masonID,int years) { super(name, age); this.masonID = masonID; this.yearsOnCampus = years; } } public class Student extends MasonPerson { protected String major;

public Student(String name, String major, int age, int masonID, int years){ //calls MasonPerson constructor super(name,age,masonID,years); this.major = major; } //no more variables, only methods below… } public class Employee extends MasonPerson { protected String jobTitle; public Employee (String name, String job, int age, int masonID, int years){ //calls MasonPerson constructor super(name, age, masonID, years); this.jobTitle = job; } //no more variables, only methods below… } One nice thing is the transitivity of inheritance: because a Student is-a MasonPerson and a MasonPerson is-a Person, we can transitively claim that a Student is-a Person. All the data/behavior that MasonPerson inherits from Person will be passed on to Student.

On to abstract classes. Suppose, for whatever reason, that creating a MasonPerson doesn't make sense: we only want there to be objects of specific kinds of people: Student objects and Employee objects are okay, but the vague idea of a MasonPerson only helps us organize our thoughts; any MasonPerson must actually be some more specific type. We then choose to make the MasonPerson class abstract, indicating that it is illegal to instantiate the class: public abstract class MasonPerson extends Person { … }

Your Turn!

• Add the MasonPerson class, which extends the Person class. o Add instance variables masonID and yearsOnCampus. o Give it a constructor, using the Person constructor. • Change Student and Employee to be child classes of MasonPerson. Update their constructors to use the MasonPerson constructor. • In your TestInheritance class, create a MasonPerson object and test it a bit. • Make MasonPerson an abstract class; re-compile TestInheritance and see the error message. you've successfully created an abstract class that contributes to the class hierarchy, yet is safely not instantiable (we can't create objects of it).

(keep going! Big chart + description all on the next page ) Another Abstract Classes Example

An alternate example could be a program representing a zoo, where we use the Kingdom-Phylum-Class- Order-Family-Species hierarchy to organize our classes of animals. We want to have classes for Monkey, Orangutan, Panda, Dolphin, Penguin, Dove, Frog, Alligator and more animals. It makes sense to add classes to generate the following hierarchy:

It makes sense to have Monkey, Panda, and Dolphin objects, but it doesn't make sense to have a Mammal object or a Chordata object. Those classes just helped us organize things, and maybe provide definitions like numVertebra, numChromosomes, eggSize, and so on. If we make all the blue classes abstract, they still participate in the hierarchy of types (and can contribute fields and methods for inheriting), but are guaranteed to not be instantiated themselves.

More Abstract Things We can use the abstract keyword for methods, too. What is an abstract method? It is a method signature with no body: public abstract int reportNumberOfLegs(); public abstract void moveAround();

Any time we want to indicate that every child class of our abstract class must have its own version of a method, yet there's not good starting implementation right now in the abstract class, we can create an abstract method to require that all child classes provide implementations (by overwriting them). If they don't, they become abstract as well, guaranteeing that this method will be implemented for any actual instance of a class that is a child of the abstract class that introduced this abstract method.

• Any class containing an abstract method must be abstract as well. • Any class that inherits an abstract method can implement it by overwriting that definition. • Any class that doesn't implement an inherited abstract method therefore still contains an abstract method, and is thus also abstract.

Your Turn!

• Now that MasonPerson is an abstract class, add an abstract method to it, such as:

public abstract String favoriteFoodSite();

• Try instantiating a Student and see that it's now becoming abstract (well, we see compilation erros complaining to that effect). • Override the abstract method that was inherited, and now we again are able to instantiate our Student class.

Chapter 8

Interfaces

Hello!

The next few topics will be interfaces, enumerations, and exceptions. Interfaces and enumerations both help us introduce new types (similar to how classes did for us previously). Exceptions will be explored in more depth soon, which rely heavily upon types for us to differentiate between different exception types.

Interfaces

Java does not allow multiple inheritance; that is to say, each class has exactly one parent class. If we wanted to have multiple inheritance – say, that a Cow class that is an Animal, is Food, is Sellable at auction – we would not be allowed to extend all three of those classes. In reality, it's not that a Cow really is multiple things at once; it's that a Cow is one thing (an Animal), and yet we can interact with it in other specific ways: eat it, sell it, ride it, and so on.

What we really want, instead of multiple inheritance, is the chance to interact with an object in some extra way. We expect extra behaviors to be guaranteed by the objects of this class. Behaviors mean methods, so the real goal here is to be guaranteed that certain methods are available for all things that could be eaten, or all things that could be sold, or all things that could be ridden (be it a cow, a car, a wave, or a bike).

An interface is a grouping of abstract methods that any class may implement. • A class implements an interface by overriding (implementing) every single method of the interface. • The methods are abstract, because we expect the various classes to provide the definitions. • We can think of an interface as a contract: any class that implements all the methods of the Fooable interface can behave like a Fooable thing. • An interface is a type.

Creating an Interface

We create an interface in similar fashion to creating a new class or enumeration (explored in the next section): we create a separate file with the same name as the interface and place the definition inside.

public interface Zippable { public abstract boolean isClosedZ(); public abstract void openZ(); public abstract void closeZ(); } • A quick note on naming interfaces: because the purpose of an interface is to be able to interact with it in some fashion (by calling certain methods), the names might tend to be Somethingable: Sellable, Serializable, Comparable, and so on. It's just a convention, but it does help emphasize that we can interact with the objects of this class in another way. • Every single method in the interface must be abstract, so you can actually safely omit the abstract keyword here. But the abstract keyword is still required for abstract methods in classes, so you can simplify your life and always write abstract if you'd like.

Your Turn! • Create an interface named Listenable. Give it abstract methods for listen() and ignore(). Choose what parameters or return types you feel are appropriate. • What various classes could be Listenable? Try to come up with at least two examples.

Implementing an Interface

Now that we have an existing interface, we can cause any class to implement it by adding implements Zippable to the declaration, and then by overriding (implementing) every single method that was listed in the interface.

For our Zippable example, this means we must implement isClosedZ, openZ, and closeZ.

public class Mouth implements Zippable {

// the class has its own fields public boolean lipsOpen;

// the class has its own constructors, other methods, etc. public Mouth (…) {…} public void eat(Food f) {…}

// the class implements all methods of the Zippable interface: public boolean isOpenZ() { return lipsOpen; } public void openZ () { lipsOpen = true; } public void closeZ () { lipsOpen = false; } }

public class Purse implements Zippable {

//the usual parts of a class definition: fields, methods, etc. public Zipper z; …

//now, we implement all methods from Zippable: public boolean isOpenZ() { return z.isOpen(); } public void openZ () { z.open(); } public void closeZ () { z.close(); } }

Your Turn! • implement your Listenable interface with both of your example classes. (Perhaps SignificantOther, Record, Ocean, or Phone? Friend, Roman, Countryman?) Using An Interface Implementation

Now that classes Mouth and Purse have implemented Zippable, we can now use the Zippable behavior whenever we have a Mouth object or a Purse object. Remember that we stated an interface is a type. This means we can use the interface wherever a type was required, such as at declaration time for a variable or parameter.

Mouth m = new Mouth(); Purse p = new Purse();

m.closeZ(); p.openZ();

if ( m.isOpenZ() ) { System.out.println("my lips are not sealed! :-O"); m.eat(new Food("potato chip")); // pretend the Food class exists... }

// We can create a Zippable variable. Zippable z = m; z.openZ();

z = p; z.closeZ();

We can also use the interface as a type for parameters to methods:

public void closeIfNeeded(Zippable z) { if (z.isOpenZ()) { z.closeZ(); } }

• Although there is no Zippable class, and thus no instances (objects) exactly of type Zippable and no Zippable constructor, we can create objects of classes that do implement Zippable, and use references to these objects as the Zippable actual parameters. • By choosing the Zippable type for the parameter, all we can do with it is call the methods of the Zippable interface on the object. We have no idea what else might be available other than those methods found in the interface.

Mouth m = new Mouth(); closeIfNeeded(m);

Purse p = new Purse(); closeIfNeeded(p);

Your Turn! • Create objects of the classes that implemented Listenable. Store them in variables of their own class types. • Call the Listenable methods on these objects. • Create a variable of type Listenable; store your various objects that are Listenable into it. • Call the Listenable methods on your Listenable variable. This is all you can do with the Listenable variable. • Create a method named performCustomerSupport that accepts a Listenable thing, and always ignores it. • Create an array of Listenable objects. Use a for-each loop to listen to each thing in your array.

Implementing multiple Interfaces

A class can implement multiple interfaces: we just add the interface names in a comma-separated list after the implements keyword, and then provide all the methods of each interface that is being implemented.

public class Foo implements A,B,C { //Foo stuff // A methods here // B methods here // C methods here }

Example Java Interfaces

Java uses interfaces in a couple of interesting ways. Two interfaces we will consider are Comparable and Iterator.

Comparable is used to order values. Think of it as a way of answering the question "which one is greater?" by encoding the answer as a number. Comparable has one method:

public interface Comparable { public int compareTo (Object other); }

If an invocation (such as a.compareTo(b) ) returns a negative number, it implies a "less than" relationship (ab). And if the result is zero, then it implies "equal" (a=b).

It is up to the class designer to decide what constitutes "greater than", and then implement the compareTo method accordingly. We might decide that our Square class will implement Comparable by comparing the sizes:

public class Square implements Comparable {

public int side;

public Square (int side) { this.side = side; }

// implement all (1) Comparable methods. public int compareTo(Object so) { // parameter needs to be Object. Square s = (Square) so; // we cast it to our desired type. if (sides.side) { return 1; } else return 0; }

Your Turn!

• Implement Comparable in any class. Consider the different ways you might want to define the relation: for Cars, is the mpg all that matters? The top speed? The maximum passengers? It's often obvious what the relation should be, but in practice whatever is the most meaningful for the program (and any future programs using this class) is what should dictate the decision.

The Iterator interface provides three methods: public boolean hasNext(); // does this collection of values have any more values? public Object next(); // assuming there's another value, get the next one. public void remove(); // remove the item that the previous next() call returned.

If we were to create any type of structure where we wanted to allow some internal values to be regularly accessed, we could just implement the Iterator interface, and then the for-each loop syntax would be readily available! We will learn how to make some basic data structures later on, and so we might have a chance to implement Iterator before the course ends on a realistic data structure.

Your Turn! • Create a class named SizeTen that has a field of type int[] which always has ten values in it (enforce this in your constructor). • Make this class implement Iterator. You can actually ignore the remove() method when removal doesn't make sense, so just implement hasNext() and next(). • Test out your SizeTen class by writing a for-each loop:

for (int i : mySizeTen) { …use i… }

Chapter 9 Enumerations

The Need for Something New When writing programs, we occasionally come across a small set of values that we need to represent. Some examples:

the days of the week the model# of car the color of a traffic light the flags (options) that may be enabled or disabled for our program.

Although we could just do something simple, such as choose integers for each value:

public final int RED = 0; public final int YELLOW = 1; public final int GREEN = 2;

There's nothing to stop us from saying RED+1, or get mistakes such as this:

myColor = 5; if (myColor ==RED) { car1.waitForGreen(); } else if (myColor == YELLOW) { car1.clearIntersection(); } else //assuming green, but we have a non-color! { car1 . enterIntersection(); }

We might choose instead to use Strings, like "red", "green", "yellow". We still have no control over the values: what if we end up with "Red"? Or what about String values like "pink", "", "gray" vs "grey", and so on? Although we've chosen a type that has enough values for us to stand in for each of our values, we are forced to have more values than we want to deal with, and we have to allow all the operators that already exist (over ints or String, for instance).

We want a way to explicitly list the values of our new type, rather than re-use another type as a proxy for the values we actually have in mind. We want to enumerate the values.

The Answer: Enumerations

An enumeration defines a new type (much like a class definition), and the enumeration definition explicitly lists all the values of that type:

public enum Day { SUN, MON, TUES, WED, THURS, FRI, SAT }

public enum Simpson { BART, LISA, MAGGIE, HOMER, MARGE }

public enum Digit { THUMB, INDEX, MIDDLE, RING, PINKY }

public enum Ghost { BLINKY, PINKY, INKY, CLYDE }

public enum MyBool { TRUE , FALSE }

Because the values are unchanging representations of one value, we name them with the same convention as final constants: ALL_CAPS_WORDS_SEPARATED_BY_UNDERSCORES.

Just like classes, we place an enumeration in a file by itself, named as .java. For example, we would have Day.java, Simpson.java, Digit.java, Ghost.java, and MyBool.java above.

Just like classes, we can use an enumeration any time it is visible; placing the enumeration in the same folder means that it will be in the same package, and thus available in other classes in the package. We could also state that the enumeration exists in a specific package, and then import the enumeration through a package import statement.

Usage

Since an enumeration is a type, we can create variables that store one value of the type. We refer to the values in a fashion similar to static variables:

EnumerationName . VALUE_NAME

public class TestEnums { public static void main(String[] args){ Day day = Day . SUN; Simpson simp = Simpson . BART; Digit finger = Digit . PINKY; Ghost ghost = Ghost . PINKY;

System.out.println(day+"\n" + simp+"\n" + finger+"\n" + ghost );

We can use switch statements with enumerations (as of Java 1.7). Because the expression we're matching indicates the enumeration type that we are using, and each case must have a value of that type, we do not state the qualified name (Ghost.BLINKY), just the value (BLINKY).

String style= ""; switch (ghost) { case BLINKY: style = "chaser"; break; case PINKY: style = "ambusher"; break; case INKY: style = "fickle"; break; case CLYDE: style = "stupid"; } System.out.println("ghost: " + ghost + "; style = " + style); } }

Your Turn! Create an enumeration from any of the following: o characters in Super Mario Bros. o the five stages of grief (denial, anger, bargaining, depression, acceptance). (Formal name: the Kübler-Ross Model). o types of coins (penny, nickel, dime, quarter…) o make up your own enumeration! In another file, create a variable (or many) that can hold a value of your enumeration type. Store one of the enumeration's values into it, and print it. Use your enumeration in a switch statement. Use your enumeration in an if-statement, checking e.g. if (day==Day.MON). Does this work, or will you need to rely on the .equals() method? (Is .equals() even available? Try it out! That was defined for classes via the Object class being the ancestor of all classes, but this is an enumeration…).

Enriching Our Enumerations

Java's enumerations are actually much more useful than just listing out values. It turns out that enumerations are implemented as a special, restricted usage of classes. Your enumeration implicitly inherits from java.lang.Enum, which is a class.

So, what class-like features can we add to an enumeration definition? fields constructors (private/default-visibility only) methods

Fields and Constructors in Enumerators In the above example, we had to use a String to represent the style of attack for each Pac-Man ghost. It would be nice if each value in the Ghost enumeration had an associated String value to describe its attack style.

We will add a field to the Ghost enumeration definition, so that each ghost has its own movement style. However, we will still only be able to access the values through EnumName.VALUE_NAME as before.

Because the values are supposed to be unchanging and fully listed out, though, we can't call a constructor to create them – there is exactly one of each value, always. Instead, we will add a private constructor to deal with setting up our newly added field, and we will also add constructor calls for each value. By adding all three things at once (field, private constructor, constructor calls for each value), we achieve the desired effect:

public enum Ghost { /* we add in parameters to call the constructor. These look odd as constructor calls without using the Ghost identifier, but that's what they are. */ BLINKY("Blinky","attacker") , PINKY ("Pinky", "ambusher") , INKY ("Inky", "fickle") , CLYDE ("Clyde", "stupid");

// all 4 ghosts have their own movementStyle value. private final String name; private String movementStyle;

//constructor: set up the field(s) for each Ghost value. private Ghost (String name, String movementStyle) { this.name = name; this.movementStyle = movementStyle; }

We can also add methods to our enumeration definition. The methods may be any visibility, and may or may not be static.

//public methods are available.

//getter (but no setter) for private final field. public String getName(){ return name; }

//getters and setters for non-final field: public String getMovementStyle(){ return movementStyle;} public void setMovementStyle(String newval) { movementStyle=newval; }

//because we implicitly inherit from java.lang.Enum, //we have Object methods available to override: public String toString() { return name+"("+movementStyle+")"; } }

We can now add to our testing code (TestEnums.java).

Ghost g1 = Ghost.BLINKY; //no constructor call – just name. System.out.println("g1 is named " + g1.getName()); System.out.println("g1 toString: "+ g1); g1.setMovementStyle("stationary"); System.out.println("g1 toString: "+ g1);

We can also use an enumeration as an Iterator, by getting an array of all the values. The static method values() returns an array containing all the enumeration's values. It's as if the following definition was provided explicitly in our enumeration definition:

// this values definition is automatically made for you. public static Ghost[] values () { Ghost[] ghosts = {BLINKY,PINKY,INKY,CLYDE}; return ghosts; }

Indeed, the Java compiler adds this definition to your enum definitions every time.

Let's test out our code:

System.out.println("Testing for-each Ghost.values():"); for (Ghost gval : Ghost.values() ) { System.out.println(gval); }

Your Turn! Add a private field to your enumeration. Add a constructor method (or many, via overloading) that will set that field. Add a parameters list to each listed value to use a constructor method definition that you just added. o Make sure your code still compiles! Add a getter method for your field. Add more testing code that now uses your getter method to see that the enumeration values do indeed have their own private field values, which were privately declared inside the enumeration definition and instantiated using the private constructor definition.

Other than working with the explicit list of enumeration values only (instead of an arbitrary number of objects of a class type), and needing to make our constructors private (or package-private), we essentially are writing a class with special restraints. We can even add a main method to an enumeration and use that as our entire Java program, just like any class's main method.

Chapter 10 Exceptions

Introduction Sometimes something goes wrong in attempting to perform the calculations that our code represents. Here are some examples:

we try to open a file that doesn't exist for reading we try to access an element of an array beyond the last element (out-of-bounds index) we try to treat null like an actual object, and access members (fields or methods) in it. we try to divide by zero we try to parse a String to an int via Integer.parseInt(strExpr), but there's no int represented in the String.

In each of these situations, there's not really a best action for the program to automatically take. Even if there were, there are legitimate reasons to prefer an alternative response. In these situations, rather than try to handle this situation, the program quits its normal control flow, and instead raises an exception and immediately leaves each block of code that was running until the exception is handled.

Rather than just have a single String represent the error, Java (and many other languages) creates a hierarchy of classes that represent different classes of errors via inheritance. Here are a few of those exception classes:

Exception Class Name //Indication or Notes

java.lang.Throwable //(implicitly inherits from Object). • Exception • RuntimeException //for recoverable events. • NullPointerException //using null like an object • ClassCastException //cast to class-type that wasn't possible • IndexOutOfBoundsException • ArrayIndexOutOfBoundsException • StringIndexOutOfBoundsException • ArithmeticException // bad arithmetic, like "divide by zero" • IOException • FileNotFoundException //attempted to open non-existing file • EOFException //end of file reached (no more content) • Error //unrecoverable events: e.g., out of memory

For example, Exception extends Throwable, ArithmeticException extends RuntimeException, EOFException extends IOException, and so on. Because these are classes, each of them is a different type, and we will use these types in our code to be as general or specific as we'd like: we can deal with any IndexOutOfBoundsException, or just NullPointerExceptions, or all exceptions at once with the Exception type.

When do exceptions occur? Exceptions can occur in almost any place:

1. in expressions that fail: 5/0 (divide by zero: ArithmethicExpression ) xs[i] (when i is invalid index: ArrayOutOfBoundsException ) sq.side (when sq==null: NullPointerException ) (Square) "2x2" (casting, String is not a Square: ClassCastException ) 2. within methods that we call: any method that contains expressions like the above Integer.parseInt("no numbers here!") (NumberFormatException) Scanner sc = new Scanner(new FileReader("foo.txt")); (FileNotFoundException) 3. in expressions that purposefully raise an exception: throw new Exception("something went wrong!"); throw new ArithmeticException("needed an even integer, found " + x);

In all of these cases, the normal notion of what instruction should run next is abandoned, and instead the block of code being called is immediately exited with the exception value instead of whatever return value was expected, and the method that called that method is also exited with the exception value, and so on, until the main method (the whole program) crashes.

Your Turn! Try using expressions that divide by zero, use an invalid index, use a null pointer as if it pointed to an object, and see the exceptions crash the program. Try calling the Integer.parseInt method and creating a Scanner on a non-existent file to see the exceptions crash the program. Notice how we get different information, and different types of exceptions, for each case. Create your own exception values, and throw them: just like the third group of example sources of exceptions, try creating objects of the ArithmeticException class and the NullPointerException class. You might need to look them up to find out what parameters are needed for the constructors (google "java ArithmeticException"), but often there's a single-String-argument constructor.

Catching Exceptions We can stop this program crash from happening, using a try-catch block. Whenever a block of code is capable of causing a certain kind of exception, we can wrap that block up in a try block, accompanied by a catch block that addresses that specific type of exception.

try { int[] xs = {5,3,2,4,6}; int x = xs[23]; } catch (ArrayIndexOutOfBoundsException e) { System.out.println("whoops! bad index."); }

Of course, we're just using printing to make it convenient to see where program flow actually goes, but it would be normal to have no printing inside the catch block, and instead just have code that addresses the situation.

As you read the documentation for various classes, you might find notes that a method may throw certain exceptions. This is your cue to decide if you want your code to handle that exception or not. (We'll learn what happens if you don't in a moment).

Your Turn! Given the examples of exceptions above, and the type of exception listed next to it, try to catch the exception in each case: o division by zero o accessing a null pointer as if it referred to an actual object value o opening a file that doesn't exist

Multiple catch blocks A single try-block may contain code that could cause a variety of different exceptions. We don't have to zoom in and wrap each single statement in a try-catch block of a different variety.

public static int foo (int[] xs, int ix, int factor, Square sq) {

try { int temp = xs[ix]; // ix might be an invalid index. xs might be null. sq.side = temp / factor; // sq might be null. factor might be zero. System.out.println( (Square) (Object) sq.side); // class cast exception. } catch (ArrayIndexOutOfBoundsException e) { System.out.println("uh-oh index: " + e); return -1; } catch (NullPointerException e) { System.out.println("booo, null! " + e); return -2; } catch (ArithmeticException e) { System.out.println("uh-oh, math. " + e); return -3; } catch (Exception e) { System.out.println("I forgot to specially handle this: " + e); return -4; } return sq.side; }

The single try-block of this method can cause ArrayIndexOutOfBoundsException, NullPointerException, ArithmeticException, and ClassCastException exceptions. We can separately try to catch any or all of these. We do so for the first three (array index, null pointer, divide by zero).

However, we also write a general catch block to catch any Exception value. Due to inheritance, this Exception catch-block could actually catch all the other exceptions. If we wanted to perform the same action regardless of which exception occurred, we could have used just this one catch-block.

Being able to include catch-blocks for exception types that are parent classes and child classes of each other allows us to create filters of increasing generality that will catch any exceptions as necessary. It is important to note:

Only one exception can occur in a try-block, and then only the first catch-block that matches the exception's type will be executed.

So, even though a NullPointerException occurs, because we had the NullPointerException catch-block before the Exception block, only the NullPointerException catch-block will run. If we instead had placed the Exception catch-block first, it happens to match all the other more specific exceptions, and when listed first, it would be the only one that would ever run.

Your Turn! Write a method that accepts an Object parameter, casts it to a Square, and returns its side. Use a try-catch block to catch both the NullPointerException and the ClassCastException. o Feed testing inputs to trigger both catch-blocks. The following code will open a file, read a line, and close it. It can cause an exception, though, when the file doesn't exist. Run the code to find the exception type (be sure to use a filename that doesn't exist!), then add a try-catch block to keep the program from crashing.

Scanner sc = new Scanner (new FileReader ("foo.txt")); System.out.println(sc.nextLine()); sc.close(); Modify the previous task's code to ask the user for the filename in the first place; as often as the user gives a non-existent file name, keep looping over and over to ask them again until a valid filename is given without letting the program crash. Use a Scanner to read an int from the keyboard via standard input (System.in). As often as the user types a non-integer in the input, keep gracefully asking again until you successfully read the int. finally Blocks In addition to try and catch blocks, we can also add finally blocks. A finally block is placed after the last catch block. It always runs:

If the try block is successful with no exceptions, it runs. If the try block fails with an exception and the exception is caught, it runs. If the try block fails with an exception and it is not caught, the finally block runs.

Consider the following code:

public static void finallyMethod (Square s) { try { System.out.println(5/s.side); } catch (ArithmeticException e) { System.out.println("caught infinity."); } finally { System.out.println("finally."); } }

Your Turn! Test the above code with a Square that: o has a non-zero side length o has a side length of zero o is null → in all cases, does "finally" print? → the finally block runs before a propagated exception proceeds to propagate. throw another exception in the finally block. Re-test the three examples. What happens now? Which exception propagates, the original one or the new one? o Lesson: make your catch and finally blocks simple enough that they don't also throw exceptions.

Propagating Exceptions If we don't want to handle an exception (or don't know what to do), we can also allow the exception to crash its way out of its current location. Even if the suspicious code is in a try-catch block, the particular exception might not match the types afforded by any of the catch-blocks. Or, we might not even wrap the offending code in a try-catch block at all.

There are two groups of Exceptions: those that can silently be propagated, called "unchecked exceptions", and those that must explicitly be announced as propagated, called "checked exceptions".

The unchecked exceptions are the Error class, the RuntimeException class, and any children of these two classes. The Error class represents things that can't reasonably be recovered from within a program – if the hard disk is failing, or the program has run out of memory, or some other physical issue happens to have arisen, the program's crashing is likely the least of the user's concerns. From the other direction, RuntimeException exceptions represent common events that are likely recoverable – if we find a null pointer, perhaps we know what to return instead; if we find that a class cast conversion didn't work, perhaps we know what to do instead. For these alternate reasons, we don't have to explicitly list when we aren't handling these types of exceptions; indeed, we've been silently propagating these exceptions throughout our code all along.

In essence, unchecked exceptions usually represent bugs in the code, and if the programmer sufficiently tests the code, these should no longer happen. Whenever they do, it indicates that the program wasn't written to handle all the corner cases, but that it could be edited to handle this situation. It would be an immense nuisance to have to explicitly handle all possible exceptions – it would destroy the readability of almost everything.

The checked exceptions are all others: things like FileNotFoundException and other IOExceptions may be the most common checked exceptions you will experience. If a method attempts code that can throw one of these exceptions, we must explicitly confess that the particular method may result in one of these exceptions rather than completing with regular control flow: public static int readNumFile (String filename) throws FileNotFoundException { Scanner sc = new Scanner (new File (filename)); return sc.nextInt(); }

Notice that this example is trying to read a file, which might not exist. So we add "throws FileNotFoundException" to the method signature, after the parameters. This method also could throw a java.util.InputMismatchException, which is a child class of java.util.NoSuchElementException, which is a child class of java.lang.RuntimeException. Thus it is an implicit exception and we didn't have to list it.

In practice, you will perhaps find the Java compiler generate an error/complaint when you use code that is capable of generating a checked exception, but you forgot to explicitly list it in the method signature.

You can explicitly list unchecked exceptions too; the compiler allows it, and it can serve as extra documentation about the likely behavior of a method.

Your Turn! We don't have many checked exception types that we can run across yet (but we'll see one more IOException when we look at file I/O). So there's not much to do here yet. Write your own method that opens a file and assumes it exists, and keeps reading numbers from it until the end of the file is reached. What exceptions will you explicitly propagate? What other exceptions will you find on the way, and then handle with a try-catch block? Writing an optimistic version of your program and letting it crash is a simple way to discover the correct exception type.

Creating Your Own Exception Classes Exception classes are just that: classes. If you want a more specific exception, you can just extend the specific class you want, perhaps add your own fields, add a constructor or two, and then create them.

public class MyEx extends Exception {

public int x;

public MyEx (int x) { this.x = x; }

public String toString () { return "MyEx: "+x; } }

Although only extending some exception class is necessary to create your own exception class, it is good to extend the most specific exception class you can. Also, it is good to always provide the toString method. If you extend a RuntimeException class or an Error class, your class will also be an unchecked exception.

Throwing Exceptions Manually. You can now create objects of your classes that happen to be exception classes, and then throw them:

MyEx me = new MyEx(5); throw me;

This code could then show up inside a try-catch block (catching a MyEx), or since we elected to directly extend the Exception class, we could also explicitly propagate this checked exception in the method signature.

Given this functionality, you can now approach a programming task by thinking, what hard-to-solve situations might arise here? Can I just create my own exception class, throw one as necessary, and regroup/fix the situation from somewhere else in the code? This is particularly useful if the tricky/confusing/bad situation occurs in one place in code, and the needed information to solve the situation occurs somewhere else in the code.

Your Turn! Create your own exception class, named NegativeIntegerException, which is a child class of the ArithmeticException class. Give it meaningful fields, a constructor, and an overriden toString method. Write a small program that gets a number from the user via System.in and a Scanner. If they enter a negative number, create one of your exception objects and throw it.

Continued Propagation Some of the exceptions we've caught were actually generated inside other methods. It is possible for an exception to be uncaught from method to method, but if the exception were a checked exception, each method would have to explicitly list that it throws the exception that it is choosing to allow to propagate beyond its scope.

Your Turn! Write three methods that call each other named methodA, methodB, and methodC. Have the innermost one throw one of your NegativeIntegerExceptions, and let the other two explicitly throw it beyond their own scope. Now, have the outermost method wrap the call to the middle method in a try-catch block that catches NegativeIntegerExceptions. → At each stage, your methods can choose to catch some exceptions, and propagate others. But the checked exceptions must be listed, always. Chapter 11 Command-Line Arguments

Command Line Arguments We now turn our attention to an alternate means of gaining input to our program. Whenever we run a program:

demo$ java MyProgram

We actually have the chance to pass as many String values in as we want:

demo$ java MyProgram str1 str2 str3 …

We receive these strings through the String[] args parameter of the main method:

public class TestCLArgs { public static void main (String[] args) { System.out.println("Received command line args: "); for(String arg : args) { System.out.println(arg); } } }

Copy the above code into a new file, and then try running it at the terminal:

demo $ javac TestCLArgs.java demo $ java TestCLArgs hello world 1 2 3

The strings are separated by spaces. If you want a space character as part of a string, just surround the entire string value in quotes (single or double will work here):

demo $ java TestCLArgs "light blue" red 'hot pink' "1 2 3"

No matter what you type after java TestCLArgs, you will get them as String values. If you want to read in numbers, you must convert from a String to the appropriate numeric type:

public class TestCLArgs { public static void main (String[] args) { // if there aren't exactly three args, print an error message and quit. if (args.length !=3) { System.err.println("usage: java TestCLArgs # # #"); System.exit(1); }

int a = Integer.parseInt(args[0]); short b = Short.parseShort(args[1]); double c =Double.parseDouble(args[2]); boolean result = (double)a/b == c; System.out.println("a="+a+"\nb="+b+"\nc="+d+"\nresult="+result); } }

We see that we can convert from String to any other basic type through the wrapper classes' parseX methods.

Digression #1 We see that we can print to the error-reporting stream (System.err) as easily as we print to standard output (System.out). Although this output is usually just shown on the terminal window intermingled with System.out's output, they are two separate streams of characters that can be dealt with separately. Any message that relates to program malfunction ought to be sent to System.err, while expected behaviors should result in output to System.out. Even "whoops, try again!" should be sent to System.out: we expect users to see this message and then type better input next time; that is a normal behavior of the program.

Digression #2 We also see how to immediately exit a program: System.exit(someIntValue). An exit value of zero represents normal termination (success); any other int value indicates abnormal quitting, and the chosen number should mean something to the programmer to indicate what sort of abnormal reason for quitting occurred.

Your Turn! Write a program that accepts an arbitrary number of integers on the command line, and then prints their sum and average. Write a program that accepts exactly 5 String arguments (or else it quits with an error message), and then checks against a secret password String that is only written in the source file. Either print "You said the password, among other things!", or "you didn't say the password." Questions: o What are some key differences between using command line arguments versus the keyboard? Versus reading from a file? o When is each a better choice? o Does using the command-line redirection operator (<) still work with command-line arguments, or not? What is an example where you would even want to use both? Construct a sample call using both, or describe how you would work around not being able to use both, based on what you find out. (recall that on the command line, we can direct the contents of a file to be used as the "keyboard input" that System.in reads. This is done such as java MyProg < readFromHere.txt ). Larger Example: Create an enumeration with three values: sparse, normal, verbose. Allow a flag to be passed in through a command-line argument that is either –s, -n, or –v to represent each value. Write a program that now gives sparse, normal, or verbose messages each printing time. You could even have a method that is always given three messages, and prints the correct one based on the current verbosity. → with different ways to represent it, many programs allow for "verbose output." Though this is perhaps not a common way to implement it, the same idea of varied levels of printed chatter definitely shows up in many command-line-based programs.

Chapter 12 Testing with JUnit

Hello!

The purpose of this chapter is to explore unit testing with JUnit.

Unit Test Cases Writing test cases is a vital part of software development. Whether we create whole-program tests, manually and interactively test the program, or write small test cases for the smaller pieces of code, we should always be actively seeking whether our code behaves correctly in all expected scenarios.

Unit tests are test cases that are trying to test as small a part of code as manageable. In Java, this tends to mean that one method's behavior is under scrutiny. We can still write different kinds of unit tests, for instance: - testing the expected uses with various inputs and checking the outputs (black box testing) - writing tests that should utilize specific lines of code from the implementation (whit box testing)

We also want to be able to re-run all our tests at will, so that if we make even the smallest change, we can re-check that we didn't break any of the functionality that any passing test cases had witnessed in the past. By writing our tests as actual code that can be executed in a batch ,we are performing regression testing. junit is a Java package that provides the framework for both writing unit test cases as well as a controlled means for automatically running all these tests (also as regression tests), recovering from exceptions as needed, and reporting the results of pass/fail for all tests.

In this lab, we will examine some existing code and JUnit test cases to learn more about how to use this particular package. If you find yourself writing code in other languages, the same opportunities should be available – writing small tests; performing regression testing; and quite likely, some other semi-standardized unit testing facility or libraries exist. So take the ideas of unit testing with you far beyond the end of this semester, and save yourself oodles of time – test as you go!

"Installing" JUnit In order to write test cases using JUnit, we need to have the Java package that provides all of the junit testing pieces available on our classpath when we compile. Grab the file junit-4.11.jar, which should be available next to our lab. (Or if you can download a later version online, that should be equally applicable).

We can always just manually add it to our classpath when we compile and run our tests (shown Mac-style, don't forget to use ;'s on PC): demo$ javac –cp .:/Users/me/path/to/junit-4.11.jar MyTestsClass.java demo$ java –cp .:/Users/me/path/to/junit-4.11.jar MyTestClass

JUnit is a convenient-enough thing to always have around that you might think you want to make it automatically show up on your classpath. There are ways to extend your CLASSPATH environment variable to include some directory where you can place jar files like this one, so that they are always available for any use of Java. But there's a decent argument to be made that jar files should not be made globally accessible, as this could break the functionality of already- installed Java applications and other system uses of Java. (The alternative would then be that packages are copied per-project, or managed through some other heavier-weight approach such as Ant or Maven).

So we will just keep using the –cp approach. You can copy junit-4.11.jar to each project and then have a simpler invocation if you'd like, such as javac -cp .:junit-04.11.jar *.java.

Getting Started: Some Sample Code to Test

In the supplied code, there are two files: SomeCode.java, and TestSomeCode.java. Load them both up and inspect them. SomeCode.java includes various method definitions that are perhaps correct (no. They are all wrong.).

TestSomeCode.java includes @Test methods that will convince us whether the methods of SomeCode.java are written correctly or not.

Testing the isPrime Method In SomeCode.java, the isPrime method checks whether its int parameter is prime or not, and returns a boolean. We see a few unit tests over in TestSomeCode:

@Test public void isPrime_1(){ assertEquals(false,SomeCode.isPrime(21)); }

@Test public void isPrime_2(){ assertEquals(true,SomeCode.isPrime(29)); }

The first two tests show the basics of writing a junit test case: return type should be void, no parameters accepted, and the @Test tag just before all the modifiers. With those features in place, it's just a matter of using various assert* methods to say "require this equality/boolean/condition to be true". We can also indicate failure with fail methods. Below are some of the definitions found in org.junit.Assert:

public static void assertArrayEquals(int[] xs, int[] ys); //similarly for other array types public static void assertEquals(Object a, Object b); public static void assertEquals(int a, int b); // similarly for all other primitive types

public static void assertTrue(boolean expr); public static void assertFalse(boolean expr);

public static void assertNull(Object obj); public static void assertNotNull(Object obj);

public static void fail(); public static void fail(String reason);

Whenever an assertTrue method is called, if its argument evaluates to false the entire @Test method is marked as failure. Similarly, whenever an assertEquals(..) method is called, if its two arguments don't either == each other (for primitive types) or a.equals(b) (for object types), then the entire @Test method fails. If fail() or fail("some reason") are ever evaluated, that @Test would also be marked as failure. We would of course want to hide these fail-calls behind if- statement logic or something.

The exact mechanics of failure is that an AssertionError (an exception) is thrown, which the junit test-bench code then catches, so that it can move on to the next test.

Combining multiple assertions If you want to include multiple assertions in one @Test, you can. But be forewarned, once a single assertion is failed, the entire test is aborted and the later test cases won't even be checked. Similarly, if two different assertions exist along different paths (such as in different branches of an if-else), then by definition they can't both be tested. It would be better to put them in separate @Test methods, and then you'll always get feedback on each individual assertion.

@Test public void isPrime_3(){ // many assertions in one test. not always a good idea, but often convenient! assertEquals(true,SomeCode.isPrime(2)); assertEquals(true,17); assertEquals(true,SomeCode.isPrime(113)); //assertTrue is simpler for these tests here. Here are more examples. assertTrue(SomeCode.isPrime(5)); assertTrue( ! SomeCode.isPrime(4)); assertTrue( ! SomeCode.isPrime(12)); assertTrue( ! SomeCode.isPrime(15)); }

Running the Tests If you are working in DrJava, you can easily run the tests. Just have at least TestSomeCode.java open in DrJava, and after compiling click the Test button. You will see the results in the "Test Output" pane at the bottom.

If you are running the tests from the command line, it would look like the following (except that Windows users should use ;'s, not :'s):

demo$ javac -cp .:junit-4.11.jar *.java demo$ java -cp .:junit-4.11.jar TestSomeCode JUnit version 4.11 ..E.E Time: 0.01 There were 2 failures: 1) isPrime_2(TestSomeCode) java.lang.AssertionError: expected: but was: ….. 2) isPrime_3(TestSomeCode) java.lang.AssertionError: expected: but was: ….. FAILURES!!! Tests run: 3, Failures: 2

Notice that we needed to make sure junit-4.11.jar was part of our class path. It was in the current directory for this example, but we still had to explicitly add the jar file to the path. Consider the jar file like a folder, and this makes sense – javac/java never look in nested folders for us. As always, running the java executable on a class means calling its main method; take a look at TestSomeCode.java's main method:

public static void main(String args[]){ org.junit.runner.JUnitCore.main("TestSomeCode"); }

It invokes some of the JUnit code on the class named "TestSomeCode". If it had been part of a package, we'd have seen org.junit.runner.JUnitCore.main("package.sub.sub2.TestSomeCode"), or something similar. All we really need to know is that the exact contents of the String works as the actual class and its actual entire fully qualified package name, and the rest of this main method is just "boilerplate code" (code that always has a very regular shape, but still must be written each time we want to use some programming idiom or feature).

Your Turn! add a @Test, checking that negative values are all composite (we'll go with that definition). Seeing the error messages, go ahead and fix the definition itself – don't change any test cases (they should all be correct already), just keep fixing SomeCode until all tests pass. → does this mean our code is always correct? No. But it means it's at least as correct as our test cases are smart enough to consider.

Testing the fibonacci_iterative Method The fibonacci sequence looks like this: 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, … It begins with the first two values being 1, and then each later value is the sum of the two previous fibonacci numbers. If we consider the values all stored in some endless array, with the usual zero-based indexing, then we can as for the nth fibonacci number. fibonacci_iterative is a method that calculates the nth fibonacci value using iteration (loops) instead of recursion. (Fibonacci number calculation is a very popular example of when naïve use of recursion is a bad idea).

There is at least one major flaw in this method (can you see it?). Add this @Test to your testing code:

@Test public void fibonacci_iterative(){ assertEquals(34,SomeCode.fibonacci_iterative(8)); }

Now run the tests again. Uh-oh, it's stuck in an endless loop! Either hit "reset" in DrJava, or ctrl-C on the command line to quit. How can we still write test cases when we're worried about code never ending? If we know about how long it should take (or just want to give ample time), we can place a timeout on any individual test:

@Test (timeout = 1000) public void fibonacci_iterative(){ assertEquals(34,SomeCode.fibonacci_iterative(8)); }

The timeout number is measured in milliseconds (thousandths of a second). Often any small test case will easily be done before a second, so 1000 or 2000 milliseconds is usually sufficient. In any case, the testing results will indicate that this specific test case timed out, and it is counted as failure.

Your Turn! fix the endless loop issue by remembering that n must go down by one each iteration. See how the timeout no longer happens. Does the code work yet? Add some extra @Tests for fibonacci_iterative(..). What would be good cases? 0, 1, 2? negatives? Can we test them all? Convince yourself that you've entirely fixed the code. If that means adding more unit tests, that's fine. Testing indexOf and dayOfYear

The next two methods are inter-related. dayOfYear uses indexOf as part of its calculations. Whenever you notice this dependency, it's a good idea to spend effort on the independent one (e.g. indexOf), and later try implementing the dependent one (e.g., dayOfYear). These are also buggy implementations. Let's first try to make tests, and then we can fix the code itself.

Black Box Tests Tests that don't consider the particular implementation choices can be called "black box tests", because they treat the method like an opaque, uninspectable thing where all we can do is feed it inputs and see if we get the expected outputs. Let's write some black box tests for indexOf. indexOf has the behavior we'd expect: given an array of Strings and a key value, it's supposed to find the first occurrence of the key and report back that index-location. If it's not found, it should return -1. We'll start you off with two test cases before you write your own:

@Test public void indexOf_1(){ String[] words = {"a","b","c","d","e"}; assertEquals(3, SomeCode.indexOf(words,"d")); }

@Test public void indexOf_2(){ String[] words = {"a","b","c","d","e"}; assertEquals(-1, SomeCode.indexOf(words, "z")); }

One test passes, the other fails. Why might that be? Before we go inspecting the code more closely, try making your own test cases. What are some examples you'd like to try?

Your Turn! Think about expected uses, namely when the value exists and we search for and find it. Try creating another example of this style of test. Some times there are corner cases; if we have multiple matches, does it find the first one? Make one or more test cases to inspect this. Sometimes the sought value won't be present, and it should return -1. Write one or more tests like this.

→ these are all black-box tests; we're not looking at the specifics of the code, just the outsider's view of what indexOf is supposed to do.

White Box Tests Sometimes we choose to write test cases that are directly attempting to test parts of one method. If we take a closer look at indexOf, we see that it uses a loop. We should always consider whether it's possible for the loop to run zero times, one time, or many times perhaps. Similarly, there's an if-statement inside that loop. What if the if-statement's guard is true on the very first iteration? Some interior iteration? None of the iterations? As these examples are trying to show, white box tests are to help us achieve code coverage. We want to make sure that every part of the code has been inspected by our tests, so we actually look at the shape of the implementation and try to come up with enough test cases to make sure all possible control-flow paths through the method are considered.

Your Turn! Write test cases that focus on the loop of indexOf. These are white box tests. Write test cases that focus on the if-statement of indexOf. These are also white box tests. Lastly, target the return -1 at the end of the method. If you already have such a case, then you can just mentally check this off your list of places to focus upon; but it never hurts to have extra test cases.

Once you're comfortable that indexOf(..) is entirely correct and tested, it's time to tackle dayOfYear(..). The assumption is that we could call it for April Fool's Day 2013 as dayOfYear(2013,4,1). Notice that January is assumed to be 1, not 0.

Your Turn! Come up with test cases for dayOfYear. Try to have some black box examples (expected usage), as well as white box examples (code coverage tests). Do you have test cases that cover all the corner cases? What would be good corner cases here – negative values, different leap-year scenarios, zero-values, what? Run your tests; find the bugs; fix the bugs; ????; profit! If you find yourself wanting to add more test cases as you go, that's a good thing – do it.

A note on duplicated setup You might have noticed that the two black box test cases above had the same String[] defined in them. If the setup turns out to be significant overlap between multiple test cases, we can actually push them into one special method that gets called once before all the test cases are run, using the @Before tag:

String[] words;

@Before public void setup(){ String[] words = {"a","b","c","d","e"}; }

We can add instance variables like words. Then in our @Before method, it gets its value. No matter how complex a setup we want to do, we can just put that method code here. After all @Before methods have been run, JUnit then runs all the @Test methods. As a mirror to @Before, we can also write @After methods, which can perform "teardown" operations like closing PrintWriters and Scanners, writing to log files, or whatever else you want to do after testing.

To be fair, we could also have just written : String[] words = {"a","b","c","d","e"};

And then not needed the @Before method at all; it's just a small example of @Before, so it's a bit contrived. If there are multiple @Before methods, which one is run first? We don't get any guarantee. If you're worried about the ordering, just have one @Before method that calls your other methods to complete your setup. Not every method in TestSomeCode has to have some @ tag, so using a single @Before tag to give you the chance to control the dispatch order is a fine idea.

Case Study: Greatest Common Denominator and Least Common Multiple

Familiarize yourself with the greatest common denominator and least common multiple. Wikipedia is a good resource for this. http://en.wikipedia.org/wiki/Greatest_common_denominator http://en.wikipedia.org/wiki/Least_common_multiple

In brief, from Wikipedia GCD(a,b) is the largest positive integer that divides both a and b without remainder. LCM(a, b), is the smallest positive integer that is divisible by both a and b.

There are two more methods in SomeCode.java which implement GCD and LCM:

public static int gcd (int a, int b){…} public static int lcm (int a, int b){…}

Your Turn!

Use your newly acquired unit testing skillz to: Create an effective set of test cases which would convince you that the code is correct if they all pass. What corner cases are there, and what behavior is expected? For instance, what if a>b, or a

Old Materials: installing Eclipse, JUnit, tests based on regular expressions.

Installing Eclipse

This page is provided in case you want to install Eclipse. We will be using JUnit directly, whether you use it at the command line, with DrJava, or Eclipse (or any other compatible IDE). First, go to this link http://www.eclipse.org/downloads/ and download "Eclipse IDE for Java Developers". Follow the installation instructions.

Eclipse is written in Java, and has support for writing code in many languages. It has many extra features (common to most IDEs), such as code completion, type checking in the background to immediately bring up errors or warnings, and management of your packages and files.

One reason we haven't explicitly focused on using Eclipse at first is that it also creates extra files and folders, and can make it confusing what files and folder structures are actually required to be writing Java code. It also blurs the distinction between editing, compiling, and executing, because you get a nice, shiny "play" button that will compile and run your code all at once, if it can. Some future programming situations will explicitly require you to write code command-line style, so we wanted to cement comfortability with that mode of programming first.

When you open Eclipse, you'll need to specify your workspace. This is a directory where Eclipse can create "projects". It's important that you manually "close project" whenever you go between projects; the play button sometimes is trying to run the other project. When it's time to work on another project and you want to close any other projects, in the "package explorer" panel, right- click the project and "close project", so that just the one you want to create is open.

We think of each assignment or program as a "project". We can create a new project via File → New → Java Project. We similarly add classes with File → New → Class. Eclipse tries to be helpful with all sorts of checkboxes and things, but you can always just type things in manually later on, in case you forget to check "create method stubs for inherited abstract methods", or something similar. Eclipse will create a folder for the project in its workspace directory, and inside of that, all your code will go by default into a "src" directory (source).

Installing JUnit in Eclipse Let's install JUnit. Go here, http://www.junit.org/, to download the jar file named junit-4.11.jar. (If this lab gets outdated and there's a later edition, grab that one instead! )

We need to add this jar file to your project's class path. In Eclipse, go to: File → Preferences (or, Eclipse → Preferences on Mac) select the Java menu on the left open the build path submenu edit the classpath variables submenu "Add", and select your file. It's a good idea to put the file somewhere like a bin directory: ~/bin, ~/local/bin, maybe /users/shared/bin, … anywhere is actually okay, as long as it's added to your classpath and you don't go moving it later on.

An alternate way to add JUnit to a project: go to Project → Properties → Java BuildPath left-menu → Libraries tab-header → Add External JARs → Follow the dialog and add your .jar file.

Testing with JUnit

In order to perform testing with JUnit, we need something to test. Let's create a single class that has multiple static methods that main can call, each one trying to utilize a regular expression.

Let's start out with the following code:

import java.util.*;

public class RE {

public static void main(String[]args){

Scanner sc = new Scanner(System.in);

System.out.print("Enter a phone number: "); String input = sc.nextLine();

boolean wasPhoneNum = checkPhoneNumber(input);

System.out.println("\nThat was" +(wasPhoneNum?"":"n't") +" a phone number."); }

public static boolean checkPhoneNumber(String s) { return s.matches ("(\d{3}) \d{3} - \d{4}"); } }

In Eclipse, select File → New → Java Project, name it (perhaps Lab8), and then hit OK. Next, we want to add a class: File → New → Class. Name it (RE was used above), hit OK. You can now paste or retype the class definition provided above. First off, fix the compilation error. It relates to the issues of actually representing our regular expressions inside a Java String. What does \d mean to Java when we're creating a String? Nothing intelligible; we meant to have a backslash, then a d, inside the String, so we have to represent each individually. The answer is in the footnote 1.

1 represent it as "\\d", so that "\\" means backslash, and "d" means d.

In general, write your regular expression on paper (which might contain backslashes), and then symbol-by-symbol, represent it as a Java String. You'll also have to escape double-quote characters if they are a part of your regular expression.

Initial coding: writing something worth testing.

Okay, back on task. Let's test this regular expression with the following inputs:

(123)123-1234 (123) 456 - 7890

Why aren't these working? What's wrong with our regex? Let's try another:

123 123 – 1234

Ah, parentheses didn't mean what we thought. We forgot they are special symbols. Let's modify the regular expression (escape the parentheses), and test again:

(123)123-1234 (123) 456 - 7890

Okay, we're getting closer. We'd like to allow whitespace anywhere except between adjacent number characters. Try fixing the regular expression to your heart's content, and then let's move on to testing it with JUnit.

Testing our method with JUnit

Let's use Eclipse's handy menus to help us create a JUnit test class: navigate to File → New → JUnit Test Case. We are actually creating a regular Java class, where the testing code will live. Name it something like RETester, and hit OK. If asked if you should add junit to the build path, say yes! Otherwise, we had instructions at the top of this section to add the jar file to a project. let's replace the fail line with the following: assertTrue(RE.checkPhoneNumber("(123) 456 - 7890"));

We now have our first JUnit test. Its purpose is to check that when we call RE's checkPhoneNumber method on particular inputs, it returns true.

Near the Package Explorer tab, there should be a JUnit tab. (If not, go to Window → Show View → Other → Java → JUnit). We want to "run as JUnit Test". Either use the narrow expander next to the green "play" arrow, or use the menus (Run → Run as → JUnit Test). When we run, we should see something like this:

Hooray! All one of our tests passed. Let's create another test case. In order to do so, we just create another @Test method in our class. We can find many alternatives to the assertTrue method in the documentation here: http://junit.org/apidocs/org/junit/Assert.html For now, let's just test more inputs by :

@Test public void test2() { assertTrue(RE.checkPhoneNumber("(123)456-7890")); }

Next, we run our tests again. The play button in Eclipse will remember that we last ran our JUnit tests, and will still do that instead of running our main method (until we change back with another "run as" selection).

We see a lot of information this time: we see a big red bar, indicating failure; we see one test with a checkmark next to it, and our new test (test2) with an x on it. We can also see the "failure trace" portion of the JUnit tab, and if we double-click on it, it will even highlight the line of our RETester class where the assertion failed.

Overall, we're seeing that we can make lots of test cases, and then re-run them all conveniently. Now, let's try to add in whitespace wherever it's convenient: let's add "\\s+" thus: return s.matches ("\\(\\s+\\d{3}\\s+\\)\\s+\\d{3}\\s+-\\s+\\d{4}");

We can now conveniently re-run all our test cases, and see how we fared. We got the same results: test passed, test2 failed. Let's make things less restrictive, and use "\\s*" instead of "\\s+". Yay! Now our test2 case passes.

Your Turn! Add more test cases: specifically, add cases that should not be valid phone numbers. Remember, we should always ask ourselves two questions: o "do I accept enough strings with my regular expression?" o "do I accept too many strings with my regular expression?" Try using more methods from org.junit.Assert. For instance, assertFalse and assertNotNull both might be useful starting points in your testing (in general, not just or specifically for this lab tutorial).

Your Turn! Create another static method in your main class, named validateEmailAddress. It should work like checkPhoneNumber did, using a regular expression. Then, create multiple JUnit tests to see if you think it accepts enough strings, as well as not too many, by relying on the results of your JUnit tests. o Notice at this point we don't even have to call the method from the main method; we can just happily test away on a small, isolated part of our code! Testing at its finest...

Another Case Study

Let's look at another case for testing. We will have a Triangle class that contains a calculateArea method. We want to test that method; it gets the math wrong for Heron's formula.

import static org.junit.Assert.assertTrue; import org.junit.Test;

public class Triangle { public int side1, side2, side3; public Triangle(int side1, int side2, int side3) { this.side1 = side1; this.side2 = side2; this.side3 = side3; }

public double calculateArea () { //Heron's Formula for area of a triangle double s = (side1 + side2 + side3)*0.5; System.out.println("\ts="+s); double result = Math.sqrt(s * (side1-s) * (side2-s) * (side3-s)); System.out.println("\tresult="+result); return result; } } Add the above class definition to a package in Eclipse. Again, let's create a new JUnit Test Case. Fill it in with the following code.

import static org.junit.Assert.*; import org.junit.Test; import org.junit.*;

public class TriangleTest2 {

public Triangle t1, t2, t3;

@Before public void setupStuff() { t1 = new Triangle (3,4,5); t2 = new Triangle (5,4,3); t3 = new Triangle (8,5,5); }

@Test public void test() { assertTrue(t1.calculateArea() == t2.calculateArea()); }

@Test public void test2() { assertTrue(t3.calculateArea()==12); } }

We are using another feature of JUnit: a method annotated with the @Before annotation will be called, once, before then calling each @Test method once. In this way, we were able to just use t1, t2, and t3 in our @Test methods without having to manually declare and instantiate them inside each method. Note that we still had to declare t1, t2, and t3 as instance variables of the entire TriangleTest class. This allows us to share the setup work that prepares us for running the actual assertion checks that our tests want to perform.

We could have actually put both assertions in the same method:

@Test public void test() { assertTrue(t1.calculateArea() == t2.calculateArea()); assertTrue(t3.calculateArea()==12); }

But when the first one fails, we don't get to even attempt the second assertion; it's more disciplined to have separate methods for each test case, though it's perhaps more convenient to have multiple assertions in the same test method.

Your Turn! Use your understanding of JUnit to debug the issues plaguing our implementation of Heron's formula. Feel free to look up the original formula as part of your investigations. Make sure you have both "positive" and "negative" test cases: ones that show the code serves its intended usage, and others that show it can't be misused. o What will happen if the user creates a triangle with new Triangle(3,4,100) ? Is this something you should or can test in JUnit? How might you do so? It's possible to add any normal bit of code in our testing class. Hopefully the testing class doesn't get so complicated that we need to test our testing code.

Chapter 13 JavaDoc

Hello!

JavaDoc is a tool that helps us generate documentation for our code; specially formatted comments allow the javadoc tool to automatically generate good API documentation directly from the source files.

JavaDoc: Automated API Documentation Generation Have you noticed how detailed and organized the API documentation is for the standard Java libraries? If you google any Java class, you see a nice web page with descriptions of the inheritance involved, all known fields/constructors/methods, and nice detailed descriptions of what each method does, down to the purpose of each parameter, links to other classes of interest, and maybe some good examples or discussion text. It probably took quite a bit of time for someone to make all those webpages, huh?

Wait a minute, though – Java was implemented by . Programmers love writing programs to do menial tasks for them. Poke around a bit more, and you'll find that many other large-scale projects written in Java have quite similar-looking documentation, other than perhaps a more visually pleasing cosmetic touch. How did everybody get this cool documentation, and can we get in on it??

They used Javadoc, and we can most definitely get in on it! It turns out that we can write comments in a very specific style, and then a tool (available on the command line as javadoc) can scrape all of our source files and generate the entire batch of html files, with many different flags and options available to direct what we actually want included in our API webpages.

Javadoc comments are still just regular Java multi-line comments, but they will tend to look like this:

/** * Some short summary description goes here. * * @someAttribute identifier more info about this one. * @someAttribute identifier more info about this next one… */

Where can we add these Javadoc comments? Pretty much anywhere. We mostly want to add them at: - the package level - the class level - the member level (both fields and methods)

After we've written some marvelous package in Java, and perhaps also added wonderfully descriptive javadoc comments all over the place, we can then invoke the javadoc tool to generate all our API documentation. Perhaps it would look like this example:

javadoc -d /Users/me/htmlDir -subpackages Foo

It requests all generated files to be placed at /Users/me/htmlDir, and is asking for all packages starting with Foo (and all contained subpackages) to be included in the documentation. There are many more options; this is a rather sparse usage to introduce the barest of basics.

Setup: Find and download javadocExample.zip.(here: http://muddsnyder.com/211/javadocExample.zip ). Open up all the .java files in it, and inspect them. Nice little package structure, doing a little math on the side. It's a bit of a silly example, but it's got all the pieces we need to try making javadoc documentation with it. You might want to draw out the package hierarchy as well as the classes hierarchy to get a feel for what this example contains.

Basic javadoc Generation

Javadoc will be generating documentation based on the structure of our code. We will first learn how to invoke the javadoc command, and then we will also learn how to write javadoc-style comments, so that the generated output has much more textual, descriptive information included in it.

Invoking javadoc: a few scenarios. We will be asking the javadoc command to generate a significant number of files for us to implement all the HTML documents for us, so let's make a folder for them (which the lab will assume is named htmldir). You might just put it adjacent to the package folder for now.

Image description: File directory found in the javadoc example zip file.

Assuming you have that directory structure, open up the terminal and navigate into the example folder. We will start off with almost the simplest invocation of javadoc:

javadoc -d htmldir mynums

This (-d htmldir) instructs javadoc to place all generated files in the directory htmldir; and then we requested that just the mynums package is documented. If we look inside the htmldir folder, we now see many generated files!

Word of warning: cut-pasting from this document might use weird characters, like a – instead of a -. Re- typing them is a good idea. Always double check that there are zero errors or warnings after regenerating the documentation whenever you invoke javadoc!

Image description: Contents of the htmldir directory.

Explore what we generated; variations on a theme Under that htmldir, open index.html; this is the only file in there that we are interested in opening (the rest get used by this). You should see the familiar API layout as the regular Java class documentation, such as what you'd find as the top hits by googling "Java Scanner", "Java InputMismatchException", and so on.

We had a second package, though: mynums.other was also a (sub)package. Let's get that generated, too:

javadoc -d htmldir mynums mynums.other

This would get a bit tedious if we had many extra subpackages: we can also request all nested subpackages be included via the -subpackages option:

javadoc -d htmldir –subpackages mynums

Better! And if we have any extra java files we wanted to include (perhaps ones that are implicitly in the default package), you can just list them at the end:

javadoc -d htmldir –subpackages mynums Other.java

When you used the default package only, you might use this command (useful on some of our earlier projects, most likely):

javadoc -d htmldir *.java

Visibility-Level Inclusions But now, take a closer look at the MyRational class's documentation. Where is our denominator? It was the default package-level visibility, but we don't see it here. It turns out that we can set a threshold of what level of visibility things to document:

Flag What levels of visibility are documented? (everything up to the listed visibility) -public public

-protected public, protected

-package public, protected, package

-private public, protected, package, private

-protected is the default option. Let's use -private to make everything visible (as backwards as that sounds: "-private, so show everything!" Think of it as, "show everything as sensitive as [private]".):

javadoc -d htmldir -private -subpackages mynums

Okay, now we re-visit the HTML document (just reload index.html), and see that MyRational now has its (package private) int denominator documented. Similarly, we can see private int imag in the MyComplex class. We actually have one of each level visibility in these packages (on purpose just for this section of the lab):

visibility: package.Class: member

public: mynums.MyInt → public String garbage protected: mynums.MyInt → protected int val package: mynums.other.MyRational → int denominator private: mynums.MyComplex → private int imag

There are of course many more options we can use to change the generated documentation. But this is enough to get you started with the javadoc tool. We next need to learn how to add comments in a style that javadoc expects, so that it reads more like actual documentation (human-written, with sentences and links and things) instead of just exposing the structure of the code. In short, we want it to say not just what our code is, but also why it is doing things.

Adding Javadoc-style Comments

Even though javadoc can do a lot of useful work without any javadoc comments, the generated documentation is far richer when we embed descriptions throughout our code. These javadoc comments look like regular multi-line comments, and just so happen to have a double-star at the beginning:

/** * This is a javadoc comment. It basically always has a short * summary/description on the first few lines. */

The stars on the interior lines are not necessary any more. Just like your experience writing code so far, even though comment-writing isn't exactly the most mentally stimulating part of the coding process, it still takes a significant amount of time. Your main interaction with Javadoc will be writing javadoc-style comments, and not the command-line documentation generation step that we just learned. But it helps to know why we're writing all of these specially formatted comments.

There are a number of places we will put these comments: members of classes (both fields and methods) class definitions themselves packages (in a special file named package-info.java, just inside the package folder).

Method Comments Place the following comment in the MyInt class, immediately preceding the addIn(MyInt other) method.

/** * addIn accepts another MyInt object, whose value is * absorbed into this one. * * @param other the other number to get some value from. */

Note the empty line between the short description and the @param line; it needs to be there.

Regenerate the documentation, and inspect your Method Summary section for the MyInt class. The description was directly pulled into the documentation. Click through to the detailed method description, and you'll also see the Parameters section listing out our single parameter.

Let's inspect the @return tag next.

/** * returns the value contained in this MyInt. * * @return the current value in the instance variable val. */

Regenerate the documentation, and view this method's output as well. There are plenty of other @tags that we can use; we are only going to try out a few in this tutorial. You can get much more exhaustive coverage from Oracle here: http://docs.oracle.com/javase/7/docs/technotes/tools/windows/javadoc.html#javadoctags

Annotation Identifier Text @param identifierName very short description of its purpose @return [none] short description of what is returned @throws className short description of when/why it would be thrown @exception className actually just a synonym for the @throws tag @see relevantPlace many allowed formats: for classes, members, strings, HTML, etc. @author name @version someText

We can of course have multiple @param lines, multiple @see lines, and multiple @throws (or @exception) lines. But there will be only zero or one line for any of @author, @version, @return (not used with void). It's even possible to create your own tags, a feature we won't explore here.

Your Turn! You've been making a lot of modifications all along anyways, but now let's try some of these (and then regenerate the documentation each time to check our work): add a javadoc comment to the addedTo method of MyInt. add a javadoc comment to the constructor for any of our three classes (MyInt, MyComplex, MyRational). add a method public double slope() to the MyRational class; write a javadoc comment, and try out the @throws tag. This method returns val/denominator, so can you guess what exception might be thrown?

Field Comments Class fields can also have javadoc comments. There aren't as many tags that we tend to use; in fact, we might not use any, opting just for a short description. (See the above link and scroll down a bit to see all the ones that are allowed for fields at all).

Change the plain old comment near the garbage field in MyInt to a javadoc comment:

/** this is just here so we have something public... * Basically a wall for people to write graffiti. */

We tend to just give a description for these. We might choose to include a few @see tags too.

Your Turn! add javadoc tags to the various fields in our three classes; regenerate the documentation and view them.

Class Comments We can provide javadoc comments for an entire class. Try adding this to the MyInt class, directly above the public class MyInt line.

/** * the MyInt class is just a thin wrapper around an int, * plus some unnecessary extra functionality. It's really * just here for our javadoc tutorial. * * @since about two minutes ago, mynums v0.1 * @author Yours Truly */

Regenerate the documentation, and find where this shows up. So far so good, except that our author tag wasn't used! We can request it by adding –author to our javadoc invocation:

javadoc -d htmldir -private -author -subpackages mynums

We might just have the description here, similar to fields. If anything, the @since and @author tags are likely to be used here. (Note: we can't split up –d and its directory, or –subpackages and its package; go ahead and put –author just after javadoc, or some other place where it doesn't disrupt these "pairs" of command-line arguments).

Your Turn! Document the other classes. Document the interface. It works just like the classes.

Package Comments There is one last place we are interested in adding javadoc comments: comments for an entire package. But we have one hurdle to overcome: there's no specific file that represents the entire package! Each class or enumeration or interface has its own file, but our packages are just a manifestation of the folder that contains various things that include package statements matching the name of the package in mind.

To solve this issue, we will add an almost-empty file, directly in the package folder. It must always be named package-info.java, and it needs to have a single package statement in it (identical to the package statements found in all other classes and whatnot that are also in this package).

Your Turn! Add the following text to the file package-info.java, directly in the mynums folder.

/** * The mynums pacakage is just a silly package example * that we are using to learn about javadoc documentation. */ package mynums; //nothing else needs to be in this file.

Regenerate the documentation, and you'll immediately see the documented mynums package.

Your Turn! Document the mynums.other package in the same fashion. Try using javadoc on your most recent project, and also on any packages examples from class.

Going Further There is plenty more that can be done with javadoc. At this point, it's probably best to just learn about features as you need them in your own uses of javadoc, perhaps on the job some day in the near future!

Description of the various @tags http://docs.oracle.com/javase/7/docs/technotes/tools/windows/javadoc.html#javadoctags

Much more info: http://www.oracle.com/technetwork/java/javase/documentation/index-137868.html

Options for the javadoc command: http://www.java2s.com/Tutorial/Java/0020__Language/ThejavadocTool.htm http://docs.oracle.com/javase/7/docs/technotes/tools/windows/javadoc.html#runningjavadoc

Doclets can be written to generate other outputs such as pdfs, personalized/better(?) HTML, etc. http://docs.oracle.com/javase/1.4.2/docs/tooldocs/javadoc/overview.html Chapter 14 Generics

Hello!

The topic for this chapter is generics. Collections rely heavily upon generics, and so they are also discussed at length now. Both generics and collections let us write code that works at many different types.

Collections Java provides the Java Collections Framework (JCF), which gives us many powerful ways to create data structures (such as lists and maps and sets) and perform operations over them (such as add, remove, etc).

We will just briefly discuss the layout and then a couple of specific classes that we want some familiarity with (ArrayList, mainly).

The JCF uses some advanced features of the Java language: There are many interfaces that extend each other (yes, interfaces using the extends keyword!). There are many classes that implement specific interfaces. For instance, the ArrayList and LinkedList classes both implement the List interface. Basically all these interfaces and classes use generics: instead of the ArrayList class storing just Objects (not very useful), we have the ArrayList class that stores objects of type E.

Here are some of the interfaces, shown in their own hierarchy: Collection o Set SortedSet o List o Queue o Deque Map o SortedMap

Each one provides some group of abstract methods. Child interfaces provide all the abstract methods of the parent interface, plus any more that they define. We're particularly interested in the List interface. Please take a moment to browse all the methods available, as shown in its API: http://docs.oracle.com/javase/6/docs/api/java/util/List.html

We're thus also interested in classes that implement these interfaces, and the List interface in particular. On the List API, you can see "All Known Implementing Classes". Open up the API documentation for both ArrayList and LinkedList.

ArrayList ArrayList provides all the Listr definitions, as well as one or two extras such as trimToSize(). An ArrayList is an implementation of a List that has a hidden array of values inside. It keeps track of how much space is currently not used. If we add too many items, the ArrayList will automatically create another array twice as long, copy everything over, and then continue (now that it has plenty of space to add values again). The trimToSize() method lets us say, "I removed a whole bunch of stuff from my ArrayList, so let's regain all that space that's no longer needed. Create an array juuuust big enough to hold the contents." While looking up items in the ArrayList is fast, inserting items is really slow.

LinkedList The LinkedList class also implements the List interface, only it doesn't have an array hiding around. It relies upon lots of individual objects that each store a single value and then a reference to the next and previous items in the list. While inserting items into the LinkedList is pretty fast, looking up items in the list is slower.

Quick Comparison We see that they provide basically the same methods, with actual implementations. So: a List is some interface type that represents a list of E values. We can actually create lists of E values using an ArrayList or a LinkedList. There are a very small number of extra methods, for instance the ArrayList

Not Quite Your Turn… In order to create things of type ArrayList, we need to use generics, or else the compiler gets really mad at us all over the place. These non-generic uses are called "raw types", and we don't want to use them, because Java can't perform the type checking we're used to relying upon. So let's get started on generics, and then we can eventually understand what it takes to make values like an ArrayList or ArrayList. Using an ArrayList will feel quite like a Python list (if you have experience with them). We just have to call methods for all operations, instead of the [ ] syntax here and there.

Generics

Motivation: too much re-writing In Java, we occasionally find ourselves writing the same code, except that type types change in each version. Suppose we wanted to make our own pairs in Java (length-2 tuples). We would normally have to decide ahead of time what the type of the contained things are: public class IntPair { public int fst, snd; public IntPair(int fst, int snd){ this.fst = fst; this.snd = snd; } public String toString(){ return "("+fst+","+snd+")"; } }

// another version, for doubles: public class DoublePair { public double fst, snd; public DoublePair(double fst, double snd){ this.fst = fst; this.snd = snd; } public String toString(){ return "("+fst+","+snd+")"; } }

This can get out of hand. We need a separate copy of the class for every pair of types we'd like to put in a tuple!

// another version, for a String and an int: public class StringIntPair { public String fst; public int snd; public StringIntPair(String fst, int snd){ this.fst = fst; this.snd = snd; } public String toString(){return "("+fst+","+snd+")";} }

// another version for each different usage! public class IntStringPair { public int fst; public String snd; ... } public class DoubleStringPair { public double fst; public String snd; ... } public class StringDoublePair { public String fst; public double snd; ... } public class IntListPair { public Int[] fst, snd; ... } // please, make it stop…

Motivation: too much casting Well, we could try to get around it by just putting in Objects, once and for all: public class ObjectsPair { public Object fst, snd; public ObjectsPair(Object fst, Object snd){ this.fst = fst; this.snd = snd; } public String toString(){return "("+fst+","+snd+")";} }

Our usage would then require us to perform casts on everything that came out of it:

ObjectsPair op = new ObjectsPair(4, "hello"); String s = (String) op.snd; //must convince Java to convert from Object to int via Integer... int i = (int) (Integer) op.fst; char loc = s.charAt(i); System.out.printf("found '%s' at index %d in %s.\n",loc, i, s);

Relying on the programmer for correctly performing all these casts is a bad idea. It would be much nicer if we didn't have to convince Java what values were there, and if it could just remember for us that this Pair is supposed to always hold specific types at each location (fst or snd, in our Pair example).

There was/is the ArrayList type, which provides all the convenience of lists (a la Python), even though the underlying representation uses arrays. Before generics were added to Java, we had the same problem: whatever we put into it with the add method, we forgot anything about it other than being an Object, and had to cast everything as it came out:

ArrayList xs = new ArrayList(); xs.add("hello"); xs.add(new Integer(3)); char c = ((String)xs.get(0)).charAt((int)(Integer)xs.get(1)); System.out.println("Char is '"+c+"'");

Raw Types If you compile that code in modern versions of Java, you'll actually get warnings. Not errors (the code is still compiled for you), but it complains that you've used a "raw" type, which is to say you didn't use generics when you should have.

Hopefully by now, we're seeing the problem that generics solve: we don't want to have to re-write code at multiple specific types, but we also don't want to completely forget what types are being used.

Generics: Parameterizing Code with Types! Generics allow us to write code that is parameterized over types. We're already quite comfortable parameterizing a method over values with our usual formal parameters list; we will now be able to write entire classes that are parameterized over types, so that we can write the once-and-for-all definition of Pair. We will be able to parameterize a method over types as well, as a sort of infinite overloading. (These methods still have their usual formal parameters list for values).

What does a type parameters list look like? Whether used at the class or method level, we place new identifiers in angular brackets, and separate them with commas, such as:

We can choose any unique name for our type parameters. By convention, they ought to start capitalized, as they represent types. Also, since they can be instantiated by any type we can think of, it can be hard to think of meaningful names. Single-letter names are common, especially T (for type), E (for element), and so on.

Let's look at the example for Pair, as it ought to be written:

public class Pair { public R fst; public S snd; public Pair(R fst, S snd){ this.fst = fst; this.snd = snd; } public String toString(){ return "("+fst+","+snd+")"; } }

All of the occurrences of the type parameters are highlighted, but notice that we only have the type parameters list at the class declaration itself; all other uses just use the already-in-existence types named R or S.

Generics with Classes In order to use the Pair class, we need to supply actual types for each type parameter. This occurs when: we call the constructor we create a variable to hold one of our pairs

First, let's look at the constructor calls:

Pair pintstr = new Pair(4,"yo");

System.out.println("pintstr: "+pintstr); pintstr.fst = 12; // Java knew that putting an int here was okay. int left = pintstr.fst; // Java already knows there should be an Integer here. String right = pintstr.snd; // Java also knows that there's a String here. System.out.println("as parts: fst="+left+", snd="+right);

This particular usage decided to use Integer as the first type, and String as the second type.

We can use this one Pair class with any types we like except for primitives:

// they can be the same Pair pints = new Pair(2,4);

// they can be arrays! Pair parrs = new Pair( new int[]{0,1,2},

// they can even be other Pair<..> types! Pair> triplet = new Pair>("a", new Pair(5,true));

// minimal usage... System.out.println(pints+"\n"+parrs+"\n"+triplet);

So far, we at least are able to use type parameters to make a class that operates over some particular type, and yet we get to choose that particular type for each unique usage of the class. Furthermore, Java can perform extra type checking on our code, and can help us avoid many castings!

Your Turn! Make a class definition that is generic; it should be named Box. A Box can hold one item of any type, but that type is decided by the class's type parameter. In a separate class named TestGenerics (we'll use the name later on), create a couple of Box items and store them to variables (also using generics). What are some interesting types of Boxes you can make? What about boxes containing boxes? Boxes of arrays of boxes? access your box's value. Then, update its value (assign a new value to it). Perhaps more importantly, try to mis-use your Box: put the wrong thing in it, mismatch the constructor type parameters with the variable's type's type parameters. What do Java's compilation errors look like?

Using the Type Parameters within the Class Let's continue with the Pair class, and add getters and setters (even though its fields were declared public for convenience's sake above). Put the following methods into your Pair class:

public R getFst(){ return fst; } public S getSnd(){ return snd;}

public void setFst(R r) { fst = r;} public void setSnd(S s) { snd = s;}

Because this code is inside the class, the types R and S are in scope: we can use them like any other type in our code while inside the Pair class.

Notice that our getters have R and S as return types. We don't know exactly what R will be when we write this code, but we do know that whatever eventually gets used is the type that fst will have; we're returning the value of fst, so we'll also definitely have that same type.

Similarly, note how our setters accept parameters of R and S respectively. They know those are the exact types needed for fst and snd.

Your Turn! make the item in your Box class private. (sorry if this breaks your earlier testing code!) add the method replaceItem to your Box class. (it's a setter). add the method unpack to your Box class. (it's a getter). Try using your Box with these nicely-named getters and setters.

Generics and Methods Type parameter lists may also be added to individual methods, so that they are only in scope during that method. This means that we can call these methods at different types, and Java can keep track of those specific types for each usage.

Consider the following method. It accepts a Pair, and creates a new pair by duplicating the first item in both spots.

public static Pair duplicateFirst(Pair p){ return new Pair(p.fst,p.fst); }

It's important to understand that the yellow-highlighted type parameters list is the only parameters list in this method. In the text Pair and Pair, we are merely using R and S as plain old types. The fact that they came from a type parameters list doesn't matter.

Using generic methods We have to write somewhat funky syntax to use these generic methods. We still (usually) have to supply the actual types to be used. Given that our method is static, we can call it with ClassName.methodName, only now we have to wedge in the type parameters:

ClassName.methodName(regular,arguments,here)

Suppose we've been writing our code in the TestGenerics class as suggested above. We'd write:

Pair pis = new Pair(5,"hi"); Pair pii = TestGenerics.duplicateFirst(pis); System.out.println(pii);

If we're lucky, Java can actually figure out what types we mean to use. Given that we have one parameter that exactly uses R and S, it seems reasonable that Java could use those as cues to instantiate the type parameters for us. Indeed, that is allowed:

Pair pis = new Pair(5,"hi"); // look ma, no explicitly instantiated type parameters! Pair pii = TestGenerics.duplicateFirst(pis); System.out.println(pii);

We can't always rely on Java to know the best type, so we will often have to list them explicitly.

Why didn't we just use a class type parameter and use that type in our method? Because (1) we'd have to lock in that type once and for all, and couldn't call the same method at different types; and (2) the method is static, so we don't have an actual object from which to figure out what types were used during its instantiation.

If our method was not static, we could do the exact same with the object in front of the dot instead of the className:

// create an object of the class... TestGenerics tg = new TestGenerics(); // explicit type parameter instantiation Pair pii = tg.duplicateFirst(pis); // implicit type parameter instantiation Pair pii2 = tg.duplicateFirst(pis);

Your Turn! In your testing class, write the generic method repeatFst (which will use your Pair class). It accepts an int count representing how many times to repeat the value of fst, also accepts a pair, and then returns an array that is count spots long, each filled with a copy of the pair's fst value. Write a static generic method, called zip, which accepts two arrays of two different types. It returns an array of Pairs, where the items from the two input arrays have been zipped together: the first Pair contains the first items of each array; the second Pair contains the second value from each array, and so on. Whichever array is shorter, that is how long the resulting array is. (if either array is null, just return null).

Quick Aside: Wildcards We actually don't care what S is, to the point that we don't ever need to use it! We can get rid of S, and use a wildcard (represented with a question mark) to make this point clear:

public static Pair duplicateFirst(Pair p){ return new Pair(p.fst,p.fst); }

This code only has one type parameter in its type parameters list. When we get to the parameter Pair, we're stating that whatever the second type of this Pair is, we don't care and will never be relying upon it anywhere else.

Case Study: Interfaces and Generics

Consider the following interface:

public interface Changer{ public B change(A a); }

The interface has a type parameters list, but the method inside of it just uses that type parameter. If we wanted to implement this interface, we'd have to instantiate the type parameters in the "implements" portion of the class declaration: public class IntBoxToIntPair implements Changer,Pair> { public Pair change(Box b){ return new Pair(b.unpack(),b.unpack()); } }

We didn't have to use a type parameters list for our IntBoxToIntPair class, because we're immediately listing at what type we're implementing the Changer interface.

Here is how we might use our IntBoxToIntPair class:

Box myBox = new Box(5);

IntBoxToIntPair ibtip = new IntBoxToIntPair(); Pair plainChange = ibtip.change(myBox); System.out.println("ibtip got "+plainChange);

The construction of ibtip doesn't need any generics. Calling its particular implementation of change is like any interface, and doesn't need any type parameters (there's only one version of it for this class). The return type of ibtip's change is always Pair, so that is the type of variable we can store the output to.

We can also implement a generic interface without committing to the final types, if our class itself is generic. Consider this more general version of converting a box to a pair:

public class BoxToPair implements Changer,Pair> { public Pair change(Box b){ return new Pair(b.unpack(),b.unpack()); } }

And here is how we use it, with all type parameter instantiations highlighted:

// create the box. Box myBox = new Box(5);

// create instances of the class at a specific type that can change stuff for us. BoxToPair box2pair = new BoxToPair();

// use the change method to perform the conversion. Pair pii = box2pair.change(myBox);

// show the results! System.out.println("got: "+pii);

// bizarre one-liner version... Pair p2 =(new BoxToPair()).change(new Box("present")); System.out.println("horrible one-liner returned: "+p2);

This kind of class does nothing other than implement a generic interface at one particular type(s) instantiation, and then is used as a sort of facilitator. Your Turn! Create another class, named PairToString. It should implement Changer,String>. Its change method can just return a String looking something like "a Pair of 5 and hello". Does your class need a type parameters list? o Continue with the previous usage code: create one of your PairToString objects, and use its change method to convert the Pair (that our BoxToPair gave us) into a String.

Case Study: java.util.Collections' sort method

The java.util.Collections class has the following method:

static void sort(List list, Comparator c).

The entire method has a type parameter list (). Its first parameter is just any List. Note that List is an interface, so we'll actually have some object from a class that implements it, such as ArrayList or LinkedList. That last parameter is something a bit beyond how far we want to go with generics, but here's enough of a description that we can at least use this method.

Comparator is an interface providing one method:

interface Comparator { int compare(T o1, T o2); }

Its job as an object is just to know how two things of type T could be compared, and return a negative number if they are already in ascending order (o1 < o2), return a zero if they are equivalent (o1.equals(o2) in some sense), or a positive number if they are out of ascending order (o1>o2).

We've already seen the wildcard (?). Here, its usage means that we don't need to ever use the specific type of things we're comparing, as long as it's some super-type of T.

The super keyword is being used to indicate that whatever is used for this comparator, it has to be some type that is no more specific than T. Suppose we had a list of poodles: the type of it is List. We could sort them using an implementation of Comparator, but we could also sort them with an implementation of Comparator or Comparator or Comparator. In order to allow us to use any of these, the Collections.sort method doesn't just accept a Comparator. It instead is telling us that the List can be sorted by any comparator for things that T can behave like. What can T behave like? Any of its ancestor classes. Thus, we're putting a lower bound on our generic type! Comparator means that any comparator implementation that could be used on this type of list is allowed to be used. Whew!

So suppose we create our own class that implements Comparator for some T. import java.util.*; public class CompareBoxes implements Comparator> { public int compare(Box b1, Box b2) { int b1val = b1.unpack(); int b2val = b2.unpack();

if (b1val < b2val) { return -1; } if (b1val == b2val) { return 0; } return 1; } }

Now we can use this with the Collections.sort method. An instance of our CompareBoxes class serves as a witness/facilitator to the sort, telling it whether two items are relatively in order yet or not.

CompareBoxes cb = new CompareBoxes(); List> boxes = new ArrayList>(); boxes.add(new Box(5)); boxes.add(new Box(3)); boxes.add(new Box(4)); System.out.println("before: "+boxes); Collections.sort(boxes,cb); System.out.println("after: "+boxes);

Your Turn! (Solutions below) Create a different class, SortIntPairs, that knows how to sort Pair based on their first values. If the first values are equal, then look at the second value to determine the sortedness. Create an ArrayList filled with Pair values. Create one of your SortIntPairs objects. Use both in a call to Collections.sort and print out your arraylist before and after the sort. (Super Hard‼‼) Create another class, SortPairs. It wants to sort pairs of anything, so it needs four type parameters: R and S, for the contents of the pairs; C1 and C2 (arbitrarily chosen names), which are types that are able to compare things of type R and S respectively. So we actually want this class signature:

public class SortPairs , C2 extends Comparator> implements Comparator> { … }

what fields should a SortPairs object have? How will we implement the compare method? Create some ArrayList of Pairs of something else (perhaps pairs of boxes of Integers). Construct all the "witness" objects necessary in order to call Collections.sort on your List of pairs of things you know how to compare.

Solutions to the Last "Your Turn": import java.util.*; public class SortIntPairs implements Comparator> {

public int compare (Pair p1, Pair p2) { // doesn't return exactly -1, 0, 1; but comparisons don't have to! Just needs neg, zero, pos. int firsts = p1.fst - p2.fst; int seconds = p1.snd - p2.snd; if (firsts != 0){return firsts;} else {return seconds;} } }

List> piis = new ArrayList>(); piis.add(new Pair(3,4)); piis.add(new Pair(2,5)); piis.add(new Pair(3,1)); System.out.println("piis before: "+piis); Collections.sort(piis, new SortIntPairs()); System.out.println("piis after: "+piis);

import java.util.*; public class SortPairs , C2 extends Comparator> implements Comparator> {

public C1 c1; public C2 c2; public SortPairs(C1 c1, C2 c2){ this.c1 = c1; this.c2 = c2; }

public int compare (Pair p1, Pair p2) { int firsts = c1.compare(p1.fst, p2.fst); int seconds = c2.compare(p1.snd, p2.snd); if (firsts != 0){return firsts;} else {return seconds;} }

}

CompareBoxes cb = new CompareBoxes(); SortPairs,Box,CompareBoxes,CompareBoxes> sp = new SortPairs,Box,CompareBoxes,CompareBoxes>(cb, cb); List,Box>> boxpairs = new ArrayList,Box>>(); boxpairs.add(new Pair,Box>(new Box(3), new Box(4))); boxpairs.add(new Pair,Box>(new Box(2), new Box(5))); boxpairs.add(new Pair,Box>(new Box(3), new Box(1))); System.out.println("boxpairs before: "+boxpairs); Collections.sort(boxpairs,sp); System.out.println("boxpairs after: "+boxpairs);

Chapter 15 Recursion

Hello!

Recursion implies that something is defined in terms of itself. We will see in detail how code can be recursive (a method that calls itself), and we will also explore how data can be recursive. Let's start with an example of a recursive method: calculating the factorial of a number.

Recursive Methods

We will first take a look at how a method can be recursive, by exploring methods that use recursion to solve a programming task. The general approach is to keep making the problem smaller on each successive call.

Example: Factorials The factorial of a non-negative integer can be defined in many ways. Here are two definitions that are not recursive:

Alternative text for the below formula: “n factorial is the product of ‘i’ for i = 0 through n”. � �! = ∏�=1 � n! = n*(n-1)*(n-2)*…*3*2*1

We could also define it as:

Alternative text for the below formula: “n factorial is equal to 1 when n = 0, and n times ((n minus 1) factorial) when n > 1”. 1 �ℎ�� � = 0 �! = { � ∗ (� − 1)! �ℎ�� � > 1

The definition of factorial involves a factorial expression, namely (n-1)!. This definition is also pretty convenient, as it gives us a specific expression to use based on the specific value of n we are given. This tells us what the value of n! is, instead of showing us how to calculate it. Let's look at the code for recursively calculating the factorial of a number:

public static int factorial (int n) {

//base case: 0! == 1. no recursion. if (n==0) return 1;

//recursive case: n! == n*(n-1)! else return n * factorial(n-1); }

We see the recursive call: factorial(n-1) . This is what makes the method recursive, which just means that the method contains a call to itself in its body. There are a few key points to notice that will show up in any recursive method:

The Base Cases Base cases represent the non-recursive uses of our method. When we calculate factorial for n==0, we already know the answer (1) and can simply return this result by rote. Base cases are our chance to 'escape' from the recursive calls, and thus must be checked first. It is very similar to the guard check of a while loop or for loop – before each iteration, we check if it's time to quit yet.

The Recursive Cases Recursive cases represent the recursive uses of our method. When we calculate factorial for n>0, we are basically realizing that we don't have the answer memorized for this particular n, but we know how to make the problem smaller: think of n! = n*(n-1)!. We already know n, because it was given; (n-1)! is a "smaller" problem, because (n-1) is closer to our base case, when we just know the answer. So we go ahead and recursively call factorial(n-1), and then multiply the result by n to figure out what n! is.

If we only had recursive cases, then we'd always be calling the method again, and we'd never finish calling the method! Just as we can (accidentally?) write an infinite while-loop or infinite for-loop, we can also write an infinitely recursive method, and it's just as bad.

Calling the code We can test our factorial code with the following testing harness, which accepts the value to test as a command-line argument. (You could instead create a java.util.Scanner and read from System.in).

public static void main (String[] args) { int n = Integer.parseInt(args[0]); int nfact = factorial (n); System.out.println("n="+n+"\nn!="+nfact); }

Each Call is Distinct When a method calls itself, the two calls are distinct: they were each invoked by passing particular parameters around; each one may create its own local data. The values of all this local data are stored in different locations in memory for each recursive call. It might help to think of each call as using a different copy of the code that just so happens to do the same thing:

public static int factorial (int n) { if (n==0) return 1; else return n*factorial1(n-1); } public static int factorial1 (int n) { if (n==0) return 1; else return n*factorial2(n-1); } public static int factorial2 (int n) { if (n==0) return 1; else return n*factorial3(n-1); } public static int factorial3 (int n) { if (n==0) return 1; else return n*factorial4(n-1); } public static int factorial4 (int n) { if (n==0) return 1; else return n*factorial5(n-1); } …

It is easy to understand that each call to a different method above has its own local data (including its parameters), and occupies its own space on the stack. If we wrote enough versions of factorialX, this would work and would not be recursion. We have enough extra copies to calculate factorial up to 4 in the above sample, but suppose we wanted to call factorial on just slightly larger numbers? There would never be enough copies to handle all possible inputs. This is the same reason we would write a while loop instead of a lot of nested cut-pasted if statements. It's not just cleaner and easier to write, it's actually more powerful.

Another thing to consider is that any particular method that calls another method basically pauses itself, lets the other method complete its task and return with a value, and then the caller gets to un-pause right where it was, and it continues its task. This same idea happens with a recursive call:

When a method calls itself recursively, the original call pauses, then the recursive call runs completely from start to finish; lastly, the original call finishes.

Okay, so we've got our single version, which calls itself:

public static int factorial (int n) { if (n==0) return 1; else return n*factorial(n-1); }

Note: each call to factorial still has its own local data (like parameters). Whenever we have one method call another, it has to pause; wait for the other one to finish; and then it un-pauses and completes its task right where it left off, using the result of the called method as necessary. If we are paused waiting on a recursive call, and multiple other recursive calls occur as a result, we'll still just have to wait (stay paused) until they all finish and the call we made is able to return to us. So it makes sense that the same thing (pausing until the call completes) should happen when a method calls itself.

Why does recursion work? When we use recursion, it's as if we get the solution for free: in order to solve a problem, we just call the solution we're writing to write the solution. How is this fair? It all comes down to that requirement that the recursive call must be a "smaller" problem, which we can also phrase as saying that we must make progress towards one of our base cases.

There's the notion of mathematical induction: If I want to show that some property P(n) holds for all positive integers n, let's try to show two things: base case: P(1) is true the property is true for n==1 recursive case: P(k) → P(k+1) if I know P(k) is true, I can show why P(k+1) is true.

Now, if I wanted to show why P(4) is true, I can say that I know P(1) is true; then, we show why P(1) → P(2); why P(2) → P(3); and lastly, why P(3) → P(4). The beauty is that these two pieces of information are enough to show that P(n) is true for any positive integer n. We don't have to do an endless number of proofs!

Now, let's apply this notion to method recursion. Using the example property P above, we have to think of the goal as proving P(4), and thinking in reverse:

If I only knew P(3), then I could show how P(3) → P(4). If I only knew that P(2) were true, I could show how P(2)→P(3)→P(4). If I only knew that P(1) were true, I could then show how P(1)→P(2)→P(3)→P(4). Well, hey, we happen to know that P(1) is true! So apply that information (the calculated results) to show that P(1)→P(2)→P(3)→P(4), and now we can see that P(4) is true.

Because each step got us closer to the base case (we always knew P(1)), we would eventually find the base case and be able to stop the recursive calling, completing all our calls and ending up with the answer.

So how can we check that we're getting closer to a base case? It's all about inputs (parameters). For many recursive methods, we have numerical variables as parameters. These same variables tend to be used to phrase our base cases. Whatever value we need to trigger the base case, and whatever value we currently have instead, will dictate the change in value that we'd need to see in order to know that we are getting closer to a base case. In our factorial example, the numerical parameter was named n. The base case is when n==0, and we would start with a larger positive number. Our recursive call used the value (n-1), so it is pretty clear that we're getting closer to our base case. If we were to plug in a negative number (say, -5), our code would still blindly call factorial on the next-lower number (-6). This would not be closer to our base case. This would recursively call itself infinitely, or as many times as possible before our computer ran out of memory to store the local data of each successive call (all those parameter values have to go somewhere while the recursive calls are busy!). In Java, you'd see a StackOverflowError.

Your Turn! Plug in a negative number to our recursive factorial definition. "fix" the factorial definition by treating all negative numbers as a base case: if the input is negative, just return -1 instead of performing any recursive calls. Let's write some simple definitions with recursion. True, there are easier ways to write these methods, but let's use recursion anyways, while the problems are simple. o We are comfortable writing the sum of all numbers from zero to n with a for-loop; now, try to write a method that sums all the values from zero to the single integer parameter that uses the following knowledge: sum(0)=0 sum(n)=n+sum(n-1). o Suppose we don't have the % operator in our language of choice, and we'd like to use recursion to figure out if a number is divisible by 2. Use recursion to write an isEven method, given that: isEven(0) = true isEven(1) = false isEven(n) = isEven(n-2)

Note: Recursive methods are far more common when our data are also recursive. The recursive structure of our values often beautifully lines up with the recursive calling structure of the methods that perform calculations or manipulations over those recursively-defined structures. We will see a few examples of these once we introduce data recursion later in this tutorial.

Mutual Recursion (Indirect Recursion) Sometimes recursion happens in a roundabout way: suppose that method A sometimes calls method B, and that method B sometimes calls method A. If we only looked at method A's code, we might not realize that calling A can result in many more calls to method A! Here is a very small example that can showcase it:

public static boolean isEven(int n) { if (n==0) return true; else return (! isOdd(n-1)); } public static boolean isOdd(int n) { if (n==0) return false; else return (! isEven(n-1)); }

The isEven method calls isOdd as its 'recursive' case. isOdd calls isEven as its 'recursive' case. If we called isEven on 5, we'd work thus:

isEven(5): let's figure out if 4 is odd. isOdd(4): let's figure out if 3 is even. isEven(3): let's figure out if 2 is odd. isOdd(2): let's figure out if 1 is even. isEven(1): let's figure out if 0 is odd. isOdd(0): false! return the false value through all those calls → 5 is not even after all.

Certain care must always be taken with recursive calls to ensure we're approaching a base case. When the recursion is indirect, it can be easy to forget that we still must make progress towards our base case. But it is just as necessary to successfully use recursion, be it direct or indirect.

Data Recursion Recursion can also occur in our data representations, not just in our instructions. Data recursion happens whenever we are defining a new type of value, and we state that one of the fields (some state) is of the type we're defining. In this way, the structure of our values is defined in terms of what the structure of our values looks like.

We will explore one very simplified version of a data structure in order to see data recursion. When you take a data structures course (CS 310, here at George Mason), you will get to learn about far more complex and interesting data structures. It gives you a very powerful set of tools to approach your programming tasks! Data structures let you build upon decades of others' programming experience of the 'correct' way to solve problems, and the better you understand them, the far stronger programmer you will become.

Example - Lists If we didn't have arrays in our language, we might want to create a list of integers by stating that:

a list node is a single number followed by another list. a list is just a list node, which might be null.

Now, if we create a new class for a list node, and another class for a list (which will just be a list node and a bunch of methods), we might come up with the following definition: public class IntList { public IntListNode head; public IntList() { head = null; } } public class IntListNode { public int val; public IntListNode next; public IntListNode(int val, IntListNode next) { this.val = val; this.next = next; } }

Throughout, let's try not to get confused between the IntList class and the IntListNode class. An IntList represents the entire thing, and really just gives us a place to put methods that want to view the whole structure (it also lets us gracefully use null to represent an empty list, so that we don't have to check for null all over the place). An IntListNode represents just one link in the chain.

Your Turn! Let's get our code base started, which we will add to later. Starting with the above code, overload the IntListNode constructor to have one that only asks for the int val, and always sets the next field to null.

If we wanted to represent the three numbers 3,5,7 in an IntList, we could then do the following:

IntList ill = new IntList(); ill.head = new IntListNode(3); ill.head.next = new IntListNode(5); ill.head.next.next = new IntListNode(7);

We created each node (each spot in the list), and linked them together by using enough .next's to chain them together. Clearly we'd like a more convenient way to add items to our list. Let's add a method to our IntList class to do so:

// IntList add: drives the IntListNode add. public void add(int v){ if (head==null) head = new IntListNode(v); else head.addToNode(v); } //IntListNode addToNode. This method is recursive! public void addToNode(int v){ if (next==null) next = new IntListNode(v); else next.addToNode(v); }

Now we can create our 3-element list a bit more simply:

IntList ilist = new IntList(); ilist.add(3); ilist.add(5); ilist.add(7);

Notice how we are writing code that deals with recursive data (the addToNode method), and the definition was recursive. This is a common occurrence. We could have defined a non-recursive version of add:

//IntList add_iterative public void add_iterative(int v){ // If the first node is empty(null), then add it specially. if (head==null){ head = new IntListNode(v); return; } //keep stepping through nodes, as long as the next one also exists. IntListNode curr = head; while (curr.next != null){ curr = curr.next; } // we know curr is the last existing node, so reassign its next field. curr.next = new IntListNode(v); }

… // usage: pretty much like the recursive version. IntList ilist = new IntList(); ilist.add_iterative(3); ilist.add_iterative(5); ilist.add_iterative(7);

There is always a tradeoff between iterative and recursive versions of a method. Iterative versions tend to be faster (whether it's imperceptible or severely apparent), but recursive definitions are often a more natural way to phrase a solution. The successive recursive calls are a natural place to 'store' temporary calculations. Note that there was no current node in the recursive version; it was implicitly tracked by calling add(..) on a different IntList object each time. The this of each successive recursive call effectively was the curr node.

This time, the recursive version should run in almost identical time. The only overhead is in the effort spent calling a method versus the effort spent going to the next while loop iteration, and that difference is not huge.

We can write more recursive definitions on our definition. Let's consider pretty-printing our list, in iterative fashion first. We'd like something such as "[3,5,7]" to be the result.

// IntList toString public String toString(){ String s = "IntList[ "; IntListNode curr = head; while (curr != null){ s += curr.val+","; curr = curr.next; } return s+"]"; }

This task is perhaps slightly better suited to an iterative version (as above), because the very first time would be different from all successive recursive calls (it has to print square brackets). When the first time to call a method has a few extra details (or should do a few more/fewer things), we might want to write a "driver" method. This has nothing to do with recursion per se, though it does lend well to recursion:

// IntList toString, version 2 (with worker in IntListNode class). public String toString(){ // handle empty list case if (head==null){return "[]";} // just this first node must add [ ]'s. String s = "["+head.val; // all further nodes are handled normally s += head.next.toString_worker(); s += "]"; return s; }

// IntListNode: helper method, should only be called by IntList's toString. public String toString_worker(){ // include this node in the String String t = ", "+this.val; // base case: no more nodes if (this.next == null) { return t; } // recursive case: get entire rest of list's String representation else { t += this.next.toString_worker(); return t; } }

We would call IntList's toString the driver method, because it is used to initiate (drive) the toString activities. All the interior calls happen recursively through IntListNode's toString_worker. Another common time to use a driver method is if we have a few parameters that are tweaked on each successive call, but the initial call we'd always know their starting values (perhaps zero, false, or someArray.length). The driver method can offer a shorter parameter list, and then call the interior worker-version with these known starting values directly, without forcing the user to remember what those initial values ought to be. In our case above, the only difference between toString and toString_worker was just the body of the method.

Your Turn! There is a bug in IntList's toString as shown above! It turns out that it does not work for length- one arrays, though it does work for length-zero and length 2+ lists. Figure out why, and fix it.

The following methods can all be defined over lists. For each one, write either an iterative implementation in IntList, or else write a driver in IntList and a recursive worker in IntListNode. o Write the contains method, that accepts an int parameter and returns a boolean value describing whether this node or one it links to has that int value as its val. o Write the insert method, which accepts an index and a value to insert at that location. It should take over that index, and push all following values over one spot. o Write the remove method, which accepts an index, navigates to that spot, and removes that node. o write the addAll method, which accepts an int[] and adds each of its items to your list in the order given. (They all end up at the end of your list). o Write the method size, with no parameters : it returns the length of the list as an int. o Write a static version of size that accepts an IntList as a parameter. o Write the maxValue method, which has no parameters and returns the largest value in the list. It's okay if it crashes on empty lists.

Thoughts If we didn't have the IntList class, and just used the IntListNode class, we could still effectively get all of our operations implemented. But the only way to represent an empty list would be with the whole thing being null, instead of an instance variable being null as in our implementation. Instead of all our if (head==null) checks happening inside methods, any code that wanted to use our IntListNode would be responsible for knowing that null was a possible value, and to first check for null before using it. This is error prone and laborious usage, so we chose to have the IntList class. It serves as a facilitator between the nodes on the inside representation, and the abstracted view that outsiders have of a list. It is also a very convenient place to deal with the "first iteration is different" aspects of any of our operations, as it's truly a different place with different things available.

That's a common theme with Abstract Data Types such as lists: some inner recursive structure utilizes many objects to define a larger entity, and the whole thing is hidden behind a clean API. Java's Collections Framework provides many nice ADT's as interfaces (the API), and then the messy details are hidden inside class implementations (where the data recursion occurs).

Bonus Things Some jokes and takes on recursive structures. Comics http://xkcd.com/878/ http://www.buzzfeed.com/scott/sad-keanu-recursion image searches https://www.google.com/search?q=recursive+image&tbm=isch Chapter 16 Searching and Sorting

Hello!

The main topics for this chapter are searching and sorting data. We'll limit our discussion to arrays for searching and sorting, so that our data are linear (it's much simpler to discuss the ideas of searching and sorting, but we could also search and sort a tree structure of data, for instance).

Getting Hyped for Search and Sort

Once we stop doing silly homework assignments and try to do a truly necessary computation with our programming skillz, we often find that we’ll apply a single concept a whole lot more—maybe a class structure with 30+ classes, maybe a program with menus inside of menus, maybe reading much more complex files to do computations; in short, we do something a lot. When we get into complex problems that involve other sub- problems, such as sorting results to show to a human, or finding results (as in virtually any time you run a search engine, search for items in some online store, etc.), the store of information available which needs to be operated on can be quite large. So large, in fact, that simple, correct, ways of manipulating data simply aren’t feasible. Would you think Google was so great if it took a week to find your search for "Cliff Notes Hemmingway", which you entered in the search box the night before a reading quiz? What if it took minutes to connect a telephone call? Would that be okay? Probably not.

So, although we present some pretty simple algorithms for sorting and searching, keep in mind that we are only showing quite simple sorts; there is quite intensive study into what kind of algorithms perform best in which situations. That said, we still want to be able to manage smaller sets of data, even if our small projects or toy online database of phonebook entries can’t be macro-scaled to millions of entries without betraying a bit of naiveté. So let’s get sorting, and let’s start searching!

Getting Started

Grab the two files, SearchSortComplete.java, and SearchSortIncomplete.java, to your machine and open them (they are also pasted at the end of this document, though cut-pasting from a pdf is dicey). https://cs.gmu.edu/~msnyde14/211/share/SearchSortComplete.java https://cs.gmu.edu/~msnyde14/211/share/SearchSortIncomplete.java

You will be implementing some of the method 'stubs' throughout the chapter in the *Incomplete version, but you will have the *Complete version for testing and comparison's sake. This lab will be slightly more about observing code than writing it: we maybe saw some of these examples in class, and we'll just learn about a couple more algorithms today.

There are a few things to notice about this class.

First of all, scroll down to the bottom and notice there a few 'helper' functions— printArray, getChoice (two versions), printMenu, and displaySearchResult. This just helps us to organize our basic testing and exploring harness. In the main method, using these methods makes the code in the main method much shorter, and more descriptive. Instead of seeing a loop, and some sc.nextFoo() stuff, we see getChoice; instead of a lot of print statements, we see printMenu. Notice the two pairs of functions—the two overloadings of binarySearch, and the two overloadings of quickSort. The versions with more parameters are called by the others: the simpler ones (fewer parameters) are called "driver functions". All they do is take a request, make a start-up call to the 'real' method with any initial values for the additional parameters, and pass any results through. We could just as well have put the extra parameters (usually a zero or an x.length-1 or something) in our call from the main method, but driver methods are one more way we can abstract details away from our code. Notice the class variable debug; We make this variable at the very beginning, and then anywhere we want to print things out in testing mode, we conditionally test if debug==true. Taking a quick look inside a method like bubbleSort will exemplify this. Maybe using the debugger is a more effective way to view the state, but if you want to take print-style debugging to the next level, this is one approach.

Basic Searching: Linear Search

A linear search does just that—it starts at the beginning, and searches straight through until it either finds a match or finds itself with no more possible matches left. This is similar to what we’ve already seen; the method gets (a reference to) an array and a key value to seek, it steps straight through the array until it either finds a match (and returns the index where it was found) or finds out there were no matches (and returns -1 to signify no match found—no valid index is negative).

Your Turn! You should be comfortable writing a linear search algorithm by now, so try implementing the body of linearSearch without looking at your notes or the solution. Test your code: noting the default array values created in main, run the program and plug in a key to search for. Search for values that are: in the array not in the array in the array multiple times → what would be some good unit tests for these? It's a good chance to practice your testing skills if you've got the time. Compare your code against the solution.

Basic Sorting: Bubble Sort

Bubble Sort runs a double loop. The inner loop compares every single adjacent pairing of two elements (walking down the array: indexes 0 and 1, then 1&2, 2&3, 3&4, etc), and swapping them if they are unsorted relative to each other. Because of the regular pattern of access, no matter where the largest element is, it will get compared (and swapped) with every single smaller value to its right, ensuring that the largest value has been placed at the far right.

Each time we perform that swapping-traversal, one more almost-as-large value is guaranteed to have bubbled up to its correct location; if we have an array of length n, and we make (n-1) traversals, then we know n elements are in order. Thus all the elements are in order. So the outer loop must run as many as (n-1) times in the worst case.

Bubble Sort is the simplest sort; we might bash it in class, because it does far more swaps than necessary. Notice that, in sorting a particular spot, we might swap many of the values into the sorting spot, if we keep on finding smaller values to the right.

Your Turn! Try implementing bubble sort on your own, without looking at the solution. Using the completed code, turn our debugging print statements on (set the class variable debug equal to true); run bubble sort and see how it does a lot of swaps. Reset the array, and compare to Insertion Sort (using the complete code's version so that there's an implementation there). Now, try placing different values (and more of them) in the array—specifically, in decreasing order. Now see how inefficient bubbleSort can be! Going Further. We can make many improvements to the provided bubble sort implementation: Make the inner for-loop smarter: if we've completed k traversals, we know the last k elements are sorted; let's end not when i is less than the array length, but when it has entered the "already sorted" zone. Change the outer for-loop to a while loop. When should it quit? When no swaps were necessary during the last walk. (Use a boolean variable to track this).

More Efficient Searching: Binary Search

Binary Search could be called the "phonebook" search or the "dictionary" search. It requires the values to be in sorted order. It continually finds the middle of a range that could contain the key (if it is in the array at all), and then either finds the key at that mid-point or calls the method again on the smaller range (to the left or right of the inspected middle location) which is now known to be the only fit location for the key. Binary search relies on the data being sorted – otherwise, it won't work.

Your Turn! Going Further: Implement and test your own binary search, trying not to look at the solution (or class notes) in the process. This lends well to a recursive definition, but does not have to be.

Your Turn! Try running this algorithm just like linear search, without sorting first. Notice you can get some buggy results—search for a ‘6’ in the array originally listed, and the program crashes! A binary search needs sorted data. Let’s fix the example: Add a class variable, a boolean named isSorted. (This must be static; why?) Go ahead and initialize it to false right there, on the same line. (Reminder: we must always initialize on the same line for class variables—things at the class level don’t need an object, so we can't rely on some constructor call to instantiate it). Now, in the binarySearch method, at the very beginning, add an if-statement that checks if isSorted is false, and if so run a sort algorithm and set isSorted to true (take your pick of the various sorting methods in our program). We don’t have any other searches in our lab that require a sorted array, so we don’t have to fix this anywhere else. However, if we copy over the old array, we need to reset this variable to false. Go to the main method, and look for the option that resets the array to the copy. Immediately following that code, set isSorted to false. Now, our binarySearch will conditionally sort the information, if necessary, and won’t spend all that time if it’s already sorted! This is especially good because many sorting algorithms are pathologically slow when working on mostly- or completely-sorted data. Always sorting first would be an expensive waste of time!

(Slightly) More Efficient Searching: Insertion Sort

Insertion Sort works a bit more like we might actually think about sorting a list. The goal is to first sort a small portion of the list at the front; then, we successively consider one more spot in the list, and keep sliding it leftward (forwards) until it's in sorted place. We now have a slightly longer beginning part of the list that is sorted. We keep performing this action on the next unsorted spot, gradually expanding our sorted portion of the list to consume more and more unsorted elements, until the entire list is sorted.

The first iteration, we make sure the first element by itself is sorted. Well, yes, of course it is. So really, the first step is to add one more unsorted location to our already sorted list of length one, meaning we consider the first two items. If necessary, swap them.

Your Turn! Try implementing insertion sort. Notice that we still will likely have two for-loops: one inner one that handles inserting the next item into our sorted portion of the array, and one outer loop that guides us to adding each unsorted element to the sorted portion, one at a time.

Okay, one way or another, let's now take a look at the solution:

Notice that the inner loop counts down through indexes, starting as high as the outer loop tells us the next unsorted location is. If you had an array drawn on paper, try tracing your finger over indexes that correspond to j's values. After traversing down through our sorted array and swapping our unsorted element as far down as necessary, this additional value is now in its sorted place. Once it no longer needs to be swapped, it won't have to go any further so we can quit the inner loop.

More Efficient Sorting: Merge Sort

Merge Sort is a different sort of beast; It continually breaks up the array into two halves, sorts both half (using Merge Sort itself), and then zips them together—basically, it interleaves the values from the two halves back into one bigger, sorted array.

Merge Sort is a recursive algorithm, as implemented. This means we have to check for base cases first, and then complete the inductive (recursive) cases.

Your Turn! If you're up for the challenge, try implementing merge sort! o First, create two sub-lists by splitting the list in half (make two new arrays). o Then, recursively call merge sort on each. o Then, recreate a list of the original length by always copying in the smallest value from either list that hasn't yet been included. (Relying on their being sorted makes this simple, though you might have a many-else-ifs block of code).

Okay, let's discuss the provided solution. It uses arrays instead of ArrayLists (which Snyder's in-class examples showed), so that you can compare the two implementations.

This sort first creates two halves from the given array. Then, it calls mergeSort on both halves: we now have two smaller sorted arrays, and we just need to combine them into one larger sorted array. Look at the code for that zipping-together portion, and notice it basically checks if the value should come from the next in the first half, the next in the second half, or the smallest of the next in each. The reason this works is that lists with only one element in them get returned—they’re already sorted— and from this base case, we use the zipping portion to merge two sorted sub-arrays together. So, what this algorithm really does is break the array up into individual spots, then merge them all back together, two arrays at a time, until we get the entire set of values back in one array. Note that, especially with the creation of so many arrays all over the place, this is not going to be a very efficient implementation: it uses up a lot of extra space storing all these 'copies' of the array in various levels of separation. An in-place sort (which reuses the current locations in the array instead of creating copies of those values somewhere else) would be much more space-efficient, though it would probably mean more work shuffling values around in that limited space.

Your Turn! Try running this sort, as well as in debug mode. It prints a lot, but you can sort of see the two steps— splitting and merging—occurring, just by the width of the left and right arrays being used.

One Last Sort: Quick Sort

Quick Sort functions in an even ‘smarter’ way. It chooses a pivot value, and then moves anything smaller than it to the left of it, and anything larger than it to the right of it. Since the pivot is now correctly between all values it should be, we then quick-sort the left and right portions with a (recursive) call to Quick Sort again.

If you are brave, then it is your turn. Try to implement quickSort. It might also be easiest to use ArrayLists, because the number of values less than (or greater than) the pivot isn't known before you step through the list. You can choose any value in the list as your pivot value; simple choices include the value at the first, middle, or last index. (Choosing the middle one is actually not too bad an uneducated choice). Use recursion to sort the sub- lists to the left and right of the pivot value, once you've placed them all on the correct side of the pivot value (and have figured out where the pivot value belongs, for that matter).

Looking at the implementation. This sort has a driver function (explained in the beginning of this lab): it builds up the first call to the more- parameters overloaded version of quickSort, which does the real work. Again, this is just to abstract out the extra parameters. Why should we bother to include parameters whose values we can always deduce for the initial call? Also, if we called quickSort with arbitrary alternate values for the left and right, we’d only sort that portion. Partially sorted is close but not particularly useful.

Your Turn! Try turning on debugging to see how calls to quickSort operate. What happens when two values are identical? How does quickSort fare when the list is already sorted? When it's sorted the wrong way? (Try starting with values such as {8,7,6,5,4,3,2,1}).

Closing Thoughts on Search & Sort

That's it for searching and sorting for us! Hopefully you now understand what the purpose of searching and sorting is, at least for linear data structures like arrays. In later classes and projects, you will have larger and more complex data structures, but the ideas of searching and sorting will constantly show up: searching through those structures for specific parts (values); trying to organize unsorted data; or, even trying to maintain data in a sorted fashion while adding elements to it one at a time.

We've looked at just a couple of ways to search through data. A linear search was very simple, and needed nothing more than a way to inspect every location in some sequential order. But if we knew we had ordered data, we could do much better with binary search: each inspection allows us to throw out half the remaining search space each time.

We considered a few more ways to sort data, with increasing complexity and different approaches. Bubble sort is very simple, but has pretty poor performance (e.g., it performs many more swaps on average than the other sorts). Insertion sort keeps growing a sorted list at the front of the list, repeatedly sliding the next unsorted value into our sorted portion until it fits. It performs a bit better than bubble sort. Next we considered merge sort, a divide-and-conquer algorithm, which can easily be defined by recursion (but doesn't have to be). It successively splits the list down into sub-lists until they are of size 1 (sorted by definition), and then merges them back together, relying on the simplicity of combining two already-sorted lists. Lastly, we looked at quick sort, which chooses a pivot value, puts all the smaller-than-pivot values on the left, all the larger-than-pivot values on the right, and the pivot between them; it then recursively quicksorts the left and right, and is done.

These versions of sorting had different behaviors – some used lots of space (merge sort), some worked in place (the other three). Some performed better on nearly-sorted data than others.

There are many ways to improve upon the searching and sorting that we've introduced here. What can you find out about these two ubiquitous programming tasks? SearchSortIncomplete.java import java.util.*; public class SearchSortIncomplete {

public static boolean debug = false; public static Scanner sc = new Scanner (System.in); public static void main(String[] args) {

// a value to search for, hoping for its index in the array. int key = 0;

// a resulting index to be returned from a search. int result = 0;

//does the user want to quit? boolean quit = false;

//change this however you like. int[] x = { 3, 5, 2, 7, 4, 6, 0, 10, 3 }; int[] copyOfX = new int[x.length];

//why can't we just say "copyOfX = x;" ? for (int i = 0; i < x.length; i++) { copyOfX[i] = x[i]; }

while (!quit) { printMenu();

//get a number from 1-9 from the user. int choice = getChoice(9);

//just these options need an extra value, for the key. if (choice == 5 || choice == 6) { key = getChoice(); }

//dispatch the requested operation. switch (choice) { case 1: bubbleSort(x); break; case 2: insertionSort(x); break; case 3: mergeSort(x); break; case 4: quickSort(x); break; case 5: result = linearSearch(key, x); displaySearchResult(result, key); break; case 6: result = binarySearch(key, x); displaySearchResult(result, key); break; case 7: printArray(x); break; case 8: for (int i = 0; i < x.length; i++) { x[i] = copyOfX[i]; } printArray(x); break; case 9: quit = true; break; default: System.out.println("try again!"); } } }

/* Sorting Methods Section. */ public static void bubbleSort(int[] a) { //TODO: implement! } public static void insertionSort(int[] a) { //TODO: implement! } public static void mergeSort(int[] a) { //TODO, for the mighty: implement! } public static void quickSort(int[] a) { quickSort(a, 0, a.length - 1); } public static void quickSort(int[] a, int leftIndex, int rightIndex) { //TODO, for the mighty: implement! }

/** * Searching Methods Section. */ public static int linearSearch(int k, int[] a) { //TODO: implement! return -1; } public static int binarySearch(int k, int[] a) { return binarySearch(k, a, 0, a.length - 1); } private static int binarySearch(int k, int[] a, int left, int right) { //TODO: implement! return -1; }

/** * Helpful Functions Section. These all make the main method more * "streamlined." */ public static void printArray(int[] a) { //a slightly smart printing function. String vals = "values:"; String indices = "index: "; for (int i = 0; i < a.length; i++) { if (a[i] < 10) vals += " "; if (a[i] < 100) vals += " "; if (a[i] >= 0) vals += " "; vals += a[i];

if (i < 10) indices += " "; if (i < 100) indices += " "; indices += " " + i; } System.out.println(vals + "\n" + indices); } public static int getChoice(int num) { int x = -1; while (x < 1 || x > num) { try { x = sc.nextInt(); } catch (Exception e) { System.out .println("Whoops! try again--enter a menu option from 1 to " + num + ":"); sc.nextLine(); } } return x; } public static int getChoice() { System.out.println("please enter a number:"); while (true) { try { return sc.nextInt(); } catch (Exception e) { System.out.println("Whoops! try again--enter a number:"); sc.nextLine(); } } } public static void printMenu() { System.out.println("Please choose:"); System.out.println("[1]Bubble Sort"); System.out.println("[2]Insertion Sort"); System.out.println("[3]Merge Sort"); System.out.println("[4]Quick Sort"); System.out.println("[5]Linear Search"); System.out.println("[6]Binary Search"); System.out.println("[7]Print the Array"); System.out.println("[8]Reset the Array"); System.out.println("[9]Quit!"); }

public static void displaySearchResult(int r, int k) { //display results for if (r != -1) { System.out.println("The index for key=" + k + " is " + r); } else { System.out.println("The key=" + k + " was not found in the array."); }

}

}

SearchSortComplete.java import java.util.*; public class SearchSortComplete { public static boolean debug = false; public static Scanner sc = new Scanner (System.in);

public static void main(String[] args) {

// a value to search for, hoping for its index in the array. int key = 0;

// a resulting index to be returned from a search. int result = 0;

//does the user want to quit? boolean quit = false;

//change this however you like. int[] x = { 3, 5, 2, 7, 4, 6, 0, 10, 3 }; int[] copyOfX = new int[x.length];

//why can't we just say "copyOfX = x;" ? for (int i = 0; i < x.length; i++) { copyOfX[i] = x[i]; }

while (!quit) { printMenu();

//get a number from 1-9 from the user. int choice = getChoice(9);

//just these options need an extra value, for the key. if (choice == 5 || choice == 6) { key = getChoice(); }

//dispatch the requested operation. switch (choice) { case 1: bubbleSort(x); break; case 2: insertionSort(x); break; case 3: mergeSort(x); break; case 4: quickSort(x); break; case 5: result = linearSearch(key, x); displaySearchResult(result, key); break; case 6: result = binarySearch(key, x); displaySearchResult(result, key); break; case 7: printArray(x); break; case 8: for (int i = 0; i < x.length; i++) { x[i] = copyOfX[i]; } printArray(x); break; case 9: quit = true; break; default: System.out.println("try again!"); } } }

/* Sorting Methods Section. */

public static void bubbleSort(int[] a) { int swaps = 0; int temp = -1; // perform one 'bubbling' pass (n-1) times. for (int count = 0; count < a.length-1; count++) { for (int i = 0; i < a.length-1; i++) { //if these two are out of order, swap them. if (a[i] > a[i+1]) { if (debug) { System.out.println("swapping "+a[i]+"@[" + i + "] and " + a[i+1]+" @[" + (i+1) + "]."); swaps++; } //swap these locations. temp = a[i]; a[i] = a[i+1]; a[i+1] = temp; } } } if (debug) System.out.println("total swaps: " + swaps); }

public static void insertionSort(int[] a) { int swaps = 0; for (int i = 1; i < a.length; i++) { for (int j = i; j >0; j--) { // if necessary, swap value at j with value at (j-1). if (a[j-1] > a[j]) { //before swapping, mention it. if (debug) { System.out.println("swapping a[" + j + "-1]=" + a[j-1] + " and a["+ j + "]=" + a[j]); printArray(a); swaps++; } int temp = a[j]; a[j] = a[j-1]; a[j-1] = temp; } } } if (debug) System.out.println("Number of swaps=" + swaps); } public static void mergeSort(int[] a) {

//base cases if (a.length <= 1) { return; }

int halfnum = a.length / 2;

//make two arrays. secondHalf might be one-longer than firstHalf for odd-length arrays. int[] firstHalf = new int[halfnum]; int[] secondHalf = new int[a.length - halfnum];

//copy in the values to firstHalf and secondHalf. //Note the guard statement (i

if (debug) { System.out.println("to merge: " + a.length); printArray(a); printArray(firstHalf); printArray(secondHalf); }

//sort the two halves--Recursively! mergeSort(firstHalf); mergeSort(secondHalf);

if (debug) { System.out.println("having merged: " + a.length); printArray(a); printArray(firstHalf); printArray(secondHalf); }

/** * merge the two halves back together. If you think of the smallest * cases, this part actually does the sorting! */ int ifst = 0; int isnd = 0;

for (int i = 0; i < a.length; i++) { //when there are no more values in the first half. if (ifst >= firstHalf.length) { a[i] = secondHalf[isnd++]; } //when there are no more values in the second half. else if (isnd >= secondHalf.length) { a[i] = firstHalf[ifst++]; } //if both have values left, and the first's next one is smaller. else if (firstHalf[ifst] <= secondHalf[isnd]) { a[i] = firstHalf[ifst++]; } //if both have values left, and the second's next one is smaller. else { a[i] = secondHalf[isnd++]; } }

if (debug) { System.out.println("\tending a length: " + a.length); printArray(a); } } public static void quickSort(int[] a) { quickSort(a, 0, a.length - 1); } public static void quickSort(int[] a, int leftIndex, int rightIndex) { if (debug) System.out.println("quicksorting from indexes " + leftIndex + " to " + rightIndex); if (leftIndex == rightIndex) { return; } // choose a pivot. we are arbitrarily choosing the middle value, // which isn't that better than any other value. But if the data // happen to be mostly sorted, this will tend to do all right. int pivotIndex = (leftIndex + rightIndex) / 2; if (debug) System.out.println("\t with pivot " + pivotIndex);

//swap pivot value to far left of region. int temp = a[leftIndex]; a[leftIndex] = a[pivotIndex]; a[pivotIndex] = temp; //realign pivot index. pivotIndex = leftIndex;

//partition values around the pivot. //i walks right, and pivot swaps smaller values to the left side of it. for (int i = leftIndex + 1; i <= rightIndex; i++) { if (a[pivotIndex] > a[i]) { if (debug) System.out.println("\t[swapping " + i + " " + pivotIndex + " " + (pivotIndex + 1)); //three-way swap. This places the smaller-than-pivot value from a[i] // at the old pivot's location, moves the pivot value to the right one, // and moves this value displaced by the pivot to where the smaller one // happened to be (i.e., anywhere to the right of the pivot). temp = a[pivotIndex]; a[pivotIndex] = a[i]; a[i] = a[pivotIndex + 1]; a[pivotIndex + 1] = temp;

//realign pivot index. pivotIndex++; } }

//print out our partially sorted array. if (debug) { System.out.println("pivot now =" + pivotIndex); printArray(a); }

//quickSort the two partitions. if (pivotIndex > leftIndex) { if (debug) System.out.println("running left...\n"); quickSort(a, leftIndex, pivotIndex - 1); } if (pivotIndex < rightIndex) { if (debug) System.out.println("running right...\n"); quickSort(a, pivotIndex + 1, rightIndex); } }

/* Searching Methods Section. */ public static int linearSearch(int k, int[] a) { for (int i = 0; i < a.length; i++) { if (a[i] == k) { return i; } } return -1; } public static int binarySearch(int k, int[] a) { return binarySearch(k, a, 0, a.length - 1); } private static int binarySearch(int k, int[] a, int left, int right) {

int middle = (left + right) / 2; if (a[middle] == k) { return middle; } if (left == right) { return -1; } else if (a[middle] > k) { return binarySearch(k, a, left, middle - 1); } else { return binarySearch(k, a, middle + 1, right); } }

/** * Helpful Functions Section. These all make the main method more * "streamlined." */ public static void printArray(int[] a) { //a slightly smart printing function. String vals = "values:"; String indices = "index: "; for (int i = 0; i < a.length; i++) { if (a[i] < 10) vals += " "; if (a[i] < 100) vals += " "; if (a[i] >= 0) vals += " "; vals += a[i];

if (i < 10) indices += " "; if (i < 100) indices += " "; indices += " " + i; } System.out.println(vals + "\n" + indices); } public static int getChoice(int num) { int x = -1; while (x < 1 || x > num) { try { x = sc.nextInt(); } catch (Exception e) { System.out .println("Whoops! try again--enter a menu option from 1 to " + num + ":"); sc.nextLine(); } } return x; } public static int getChoice() { System.out.println("please enter a number:"); while (true) { try { return sc.nextInt(); } catch (Exception e) { System.out.println("Whoops! try again--enter a number:"); sc.nextLine(); } } } public static void printMenu() { System.out.println("Please choose:"); System.out.println("[1]Bubble Sort"); System.out.println("[2]Insertion Sort"); System.out.println("[3]Merge Sort"); System.out.println("[4]Quick Sort"); System.out.println("[5]Linear Search"); System.out.println("[6]Binary Search"); System.out.println("[7]Print the Array"); System.out.println("[8]Reset the Array"); System.out.println("[9]Quit!"); } public static void displaySearchResult(int r, int k) { //display results for if (r != -1) { System.out.println("The index for key=" + k + " is " + r); } else { System.out.println("The key=" + k + " was not found in the array."); }

}

} Appendix 1

Regular Expressions

Regular Expressions

We will step through our own tutorial, but if you'd like a more thorough introduction, you can follow through this link:

http://docs.oracle.com/javase/tutorial/essential/regex/

and test out things as they are introduced there instead of this portion of our lab dedicated to regular expressions.

Regular expressions are an extremely powerful mechanism for more nuanced searching through strings than a simple search for specific substrings. A regular expression defines a pattern that we can search for within a given string. We might ask whether a particular string entirely and exactly matches a pattern, or we might instead ask whether a pattern can be found to match some portion (substring) of the string. We might also use a regular expression to extract substrings from a string, or to replace portions of a string with some replacement, whenever the pattern matches in that string. In short, whenever we find ourselves wanting to perform string manipulations, chances are that regular expressions are available as a way to express what we want to occur.

Regular expressions show up all over the place in programming and computer science – not just in Java. One common use of regular expressions is to search for a file on your computer (e.g., using the egrep UNIX command).

A regular expression is simply a way to represent structure within a string of symbols—for instance, identifying what makes a valid phone number, identifying if a particular word is included, et cetera.

A regular expression defines a set of strings. That set of strings is all the strings that comply with the regular expression; all strings which do not comply are not in the set.

We can match things directly by simply having them there. The regular expression Cat defines a set of strings with exactly one element, the string "Cat" (notice the sensitivity to case—it does not match "cat"). If we want to match other characteristics, we start developing a set of symbols that imply things like "at least this many of those", "any one of these", "anything not of this group", and so on. Let’s define some of those now.

This does not cover all of regular expressions in Java – it is just an introduction.

pattern/symbol meaning any non-special char matches itself . matches any single char (except newline in some circumstances) * repetition: matches zero or more of the preceding thing + repetition: matches one or more of the preceding thing ? repetition: matches zero or one of the preceding thing {7} repetition: matches exactly 7 of the preceding thing {3,6} repetition: matches from 3 to 6 (inclusive) of the preceding thing {7,} repetition: matches 7 or more of the preceding thing (pattern) parentheses group a pattern. pattern1 | pattern2 selection: matches either pattern1 or pattern2 [aeiou] character class: matches any single character listed in [ ]'s [a-z] matches any single character in ASCII range a to z. [^stuff] matches any single char not in stuff [main[nested]] matches any single char in the main or nested character class [main&&[nested]] matches any single char that is in both main and nested ^ anchor: 'matches' beginning of input $ anchor: 'matches' end of input \b \B matches a word boundary (\B matches a non-word boundary) \d \w \s matches a single char of a: digit, word, whitespace. \D \W \S matches a single char that is NOT a digit, word, whitespace.

For testing cases, repeatedly modify the value of regex in the following code: public class TestLabRegex { // basic matching example public static void main1 (String[] args) { Scanner sc = new Scanner(System.in); String str = ""; String regex = "k.t+y"; System.out.println("\nPattern: "+regex+"\n"); while (!str.equals("quit")) { System.out.println("next input please: "); str = sc.nextLine();

System.out.println("\n\nstring: \""+str+"\"\nPattern: " +regex+"\nmatches: "+str.matches(regex)+"\n"); } } }

Your Turn! • For each of the following regular expressions, think of some Strings that do match it, and Strings that do not match it. Test it with the code that calls the matches method on a String. o (way-)+back machine o x?y+z* o (a|b)* o (something|) to do o abc+ o (abc)+ o (do|re|mi|fa|sol|la|ti|do)+ When designing your own regular expression, you should always ask yourself two questions: (1) Does it accept enough Strings? (2) Does it accept too many Strings?

Your Turn! • Create regular expressions that exactly match the set of Strings described in each example. o any number of letters, followed by seventy exclamation points o one of the Pac-Man ghosts: inky, binky, pinky, and clyde (try to be concise) o zero to twenty a's, followed by an h. o write a regular expression that matches one of the following smileys: :) :( :?) 8-P o valid Java identifiers o a valid Java int representation (just allow base 10 representations, so no leading 0's) o any valid int[] initializer, such as {1,4,2,5} or {100,101,-4}

Let's review some of those with more descriptive examples. We will underline a regular expression to help mark the boundaries without resorting to the specific String representations, which have a couple of twists related to escape characters.

• Character Class: To say that we want any of a batch of characters, but just one of them, we surround all of them with square brackets [ ] , with no space in between. The regular expression [aeiou]t matches "at", "et", "it", "ot", and "ut". We always choose exactly one of the options in the brackets: not zero, not many. • To say that we want to match any character except what’s listed in a character class, we use the ^ sign at the front of them all. The regular expression [^aeiou]t matches all consonant-followed-by-t strings: "bt", "ct", "dt", "ft"; but it also matches non-characters; "5t", "(t", and more would also match. • Since typing things like [456][456][456] can get tedious, we can specify how many of a particular pattern we want, by following the brackets with curly braces indicating how many. That could be reduced to [456]{3}, indicating we want exactly three digits (all either a 4, 5, or 6). If we want at least n repetitions but no more than m repetitions, we represent this as in [456]{n,m}. A specific example would be [456]{1,3}, which matches either a one, two, or three digit number (each digit being a 4, 5, or 6). Thus, the regular expression [45]{1,2} matches "4", "5", "44", "45", "54", and "55". If we have a minimum bound but no upper bound, we can leave that part out—[45]{1,} matches numbers with at least one digit, of only 4’s and 5’s. If we wanted up to some number of matches, we use a zero for the lower bound: e.g., [456]{0,4}. • We can match ranges of values by a dash; [0-9] matches any single digit; [a-z] matches any single lowercase letter; [A-Za-z] matches any letter regardless of case, and so on. Notice this only works inside brackets: 0-9 matches the string "0-9". Also, [1-10] matches "1" or "0", it does not match "1","2","3",…,"8","9","10". • If we want to allow a pattern to repeat any amount of times, we place an asterisk after it: a*t matches "t", "at", "aat", "aaat", etc. This represents zero or more a’s followed by exactly one t. A close cousin to the star is the plus symbol +, used to represent one or more repetitions of the pattern. a+t matches "at", "aat", etc., but does not match "t". • To choose between one sub-pattern and another, we use the vertical bar | to represent union of the sets created by each of them. So (a|parthe)non matches "anon" and "parthenon". • If we want a range of values minus a few particular values, we can use && to intersect the two expressions: [a-z&&[^abc]] is all lowercase letters except a, b, c. Be sure that you use the inner [ ]'s! If ^ doesn't appear as the very first character of a character class, it is not a special character and just matches itself. • When we want to group parts of a pattern together, we can use parentheses to separate them. For instance, if we want to match as many repetitions of the prefix "sub" at the front of the word "saharan", we could use parentheses as follows: (sub)*saharan matches "saharan", "subsaharan", subsubsaharan", etc. • We can match any character at all with a period (.), except a newline character (\n). So .t matches "at", "bt", …, "At", …, "1t", …, "$t", "_t", … . Oftentimes we mean a more restricted set like [a-zA-Z]. • Spaces are matched by leaving in the space directly; spaces matter in a pattern. a lot does not match "alot"; it only matches "a lot".

Your Turn! • write a character class that matches a single one of the following letters: ceiknop (pick one)… • write a character class that matches a single upper-case non-vowel in the second half of the alphabet. • write a character class that matches any single digit character(this is just \d being defined manually). • write a regular expression using a character class to represent a phone number in the format of 555- 555-5555. o bonus: disallow any String that starts with 911. • write a regular expression for length-3 Strings that don't match your initials in any position. (if you have more or fewer initials, adapt it to yourself of course!)

Escaping Special Characters What if we wanted to match a square bracket, or a parenthesis, or a period? All characters which get ‘used up’ by controlling how we match characters can still be included, by placing a back-slash in front of them to indicate that we want the actual character, and not the function it usually provides in a regular expression. So \[[0-9]\] matches [0], [1], [2], … [9]. This also works for others: \*\+\.\, matches the string "*+.,". This can be done with all the special characters.

Representing Regular Expressions in Java

One last thing to note, which can really throw a wrench in the gears: Java itself needs backslashes to represent a quote sign; so Strings are already using the backslash and the quote sign ". In order to keep the two separate, it gets a little messy: to represent the regular expression "[\d]" as a String (notice the pattern is matching some quote characters), it is written:

String str = "\"[\\d]\"";

What happened?! We had to place a backslash in front of the first quote of the regular expression to indicate to Java that it was part of the String and not the end of the String; we had to put an extra backslash in front of the backslash (which was itself in front of d) to indicate that it was not for Java to figure out what control character was being used, but rather that it should leave the slash in as part of the regular expression; the second quote of the regular expression is like the first. The short version of handling this is that all backslashes and quotes get a backslash added in front of them when we go from a regular expression on paper to a regular expression stored as a String in-between the quotes when we construct it.

How to deal with this issue: first, just write out your regular expression, not worrying about Java. Perhaps in a comment if you want to record it in your code. Then, character-for-character, represent them in a Java String. Given the bizarre regex abc"\**\bshe\B\\++". we can represent it character for character:

• a is just "a"; same for b and c. • Then, the very next character to consider is \. Don't worry that it is logically part of \*, because we just attack one character at a time. \ becomes "\\" when we put it in a String. • Next, ** is just "**" • \ is "\\" • bshe is "bshe" • \ is "\\" • B is "B" • \ is "\\", again the second \ is "\\" • ++ becomes "++" • " becomes "\"" • and lastly . is just "."

Let's put it all together:

abc"\**\bshe\B\\++". "abc\"\\**\\bshe\\B\\\\++\"."

Your Turn! • With the testing code above, explore uses of all the rest of the patterns and symbols shown above. If you work with a Scanner, reading input from the user, though, the notion of "end of input" won't be the same as "end of line", and $ won't behave quite the way we'd like for simple explorations. (This would require "embedded flags", but that is advanced enough that we won't cover them in this introductory lab).

We next want to get practice with the basic pattern matching capabilities that were added to the String class. We now have two new methods, with the following signatures:

public boolean matches(String regex) {…} public String replaceAll(String regex, String replacement){…}

matches We will learn how to use the first method, matches. First, we need to create a String to use with this method. In practice, this is often part of a document or user-entered information – something to be checked for resemblance to some known pattern.

String str = "kitty";

Next, we can check if it matches some regular expression, calling the matches method on it with a single parameter, the regular expression:

String regex = "ki[t]{2}y"; boolean b = str.matches(regex); b should be true after this runs. What if we want to find out if a String simply contains some pattern in it, and we don’t care where in the String that might be? Consider this:

String str = "cat on a hot tin roof"; boolean b = str.matches(".*hot.*");

This will return true as long as there are zero or more characters before h, o, t, and then zero or more characters. Test this with different values than what is in str above. Try it at the start of a word, at the end of a word, and with things in between the h, o, and t.

Consider trying to match phone numbers with this pattern:

[0-9]{3}-[0-9]{3}-[0-9]{4}

Test a few phone numbers. Suppose, though, that you were asking someone to enter in their phone number and you didn’t want to be too strict on the input format; let’s assume we also want to allow phone numbers like (555)-334-2190, (555)334-2190, (555) 334 2190, or any other patterns you can think of. How can we handle this?

We have the parentheses to group patterns, and the vertical bar | to say "choose exactly one side of what’s on either side of me". So we can make the patterns, ‘or’ them all together, and get a very flexible regular expression for phone numbers.

Your Turn! • Try testing your program with two different kinds of phone number patterns first, and then adding others.

replaceAll Remember the signature from above, which takes a regular expression String, a replacement string, and returns the String created by replacing all matches:

public String replaceAll(String regex, String replacement){…}

Suppose we want to not just find out if something matches, but we want to replace the occurrence with something else; perhaps you wrote a form letter, and need to replace all instances of "John" with "Jack". Consider the following:

String str = "Please excuse John from school yesterday. John was ill," +"and so I, John’s father, made John stay home.\n\t\tJohn, Sr.";

We can get this letter for Jack by just getting the String returned by replacing all "John"s with "Jack"s:

String jacksLetter = str.replaceAll("\\bJohn\\b", "Jack");

A few things to notice: str’s value was not changed; jacksLetter operates on a copy of str. Also, "John" is a pretty tame pattern to match—that’s where we could match things like our telephone number example from above, replacing all of them with just the local area phone numbers, for instance. Why do we have the \b word boundaries at the edge? What would happen if we left them out, and tried to replace every occurrence of "a" with "an"? Try this:

String str = "Once upon a time there was a funny guy who ate" +" an apple with an aardvark."; String aanstr = str.replaceAll("a", "an"); System.out.println(aanstr);

And that’s why we have the word boundary delimiter: we care about the surrounding environment of the "a" matches that should be replaced; only the ones surrounded by word boundaries should be replaced.

Your Turn! • replace every occurrence of the word "he" with "she", and also "him" with "her" from some given input. Be sure to not accidentally match portions of other words such as there becoming tshere. • replace every digit with an X. (Pretend you're making a safe version of a bank receipt where you don't want to print the account number).

Capture Groups As we write our regular expressions, we can use parentheses to group things. If we have a match, wouldn't it be nice to extract the portions of our String that matched each specific parenthesized group? We can do this, but we first need to understand how to use the Pattern class (for "compiling" a String that we want to use as a regular expression into some actual internal representation of a regular expression), and how to use the Matcher class (for using a Pattern on specific Strings and seeing if matches are possible).

More Efficient Pattern Matching

Although we are capable of writing virtually all the pattern matching we’ll ever want with this, it can get a bit tedious, creating Strings, calling matches on the regular expression and the string again and again; also, though it is definitely not a focus in this class, it takes significant time to figure out what a regular expression means before it can be applied, and so if we were to apply the same pattern numerous times (say, in a loop), it would be wasteful to re-figure out what that regular expression is every single usage. So we can create something to handle a particular regular expression and keep it around, just asking it if a String matches its pattern.

Enter the classes Pattern and Matcher. We can make an object of the class Pattern to store the understanding of a regular expression:

// need to import java.util.regex.*; Pattern p = Pattern.compile( regex ) ;

(Notice that we didn't call the constructor, we called a static method of the Pattern class). Then, we can make an object of the class Matcher, which has an associated String with it:

Matcher m = p.matcher ( str ) ;

Then, when we want to know if str matches regex, we call the matches method on m (notice there’s no parameter now):

boolean b = m.matches();

In the grand scheme of things (such as in a loop), this would be prudent as follows: instead of using the matches method with a String,

while (…) { String str =…; boolean b = str.matches ( "bar"); … }

We can instead pull the compilation of the regular expression out of the loop:

Pattern p = Pattern.compile (regex); while (…) { String str = …; Matcher m = p.matcher(str); boolean b = m.matches(); }

Even though it looks like more code, it turns out that when we call matches on a String, it’s just going to create a Pattern, create a Matcher, and call the matches method of the Matcher object and then throw the two objects away and just yield the result; even though we see more code, there’s still less work going on.

Finally, we can discuss our capture groups! If we have a Matcher object and we call matches, and we successfully find a match, then we can next call the group method and retrieve any specific capture group by number, starting with 1 for the leftmost open parenthesis, and numbering upwards for each found open parenthesis:

public String group (int groupNum)

Here is an example:

// group numbers: 1 2 3 Pattern p = Pattern.compile("(c(a|u)r)tai(n|l)"); Matcher m = p.matcher("curtail"); if (m.matches()){ String first = m.group(1); String second = m.group(2); String third = m.group(3); System.out.println("saw "+second+" inside of " + first +", all in front of "+third+"."); } else { System.out.println("no match. If called, group(#) would throw " +"an IllegalStateException."); }

If you call group(0), you get the entire match. You can also call group(), an overloaded version that is equivalent to group(0).

Your Turn! • Match a phone number in some format, but do so using the Pattern and Matcher class. Now, remove the first three numbers (the area code), and print back just the rest. • If the nth capture group was part of some repetition, only the last of the matches is remembered (and returned from the corresponding call to group(n)). Write a regular expression that matches any number of "yes" or "no" strings, followed by "final answer." Print back what the final answer was (the last yes or no).

Adventures in One-Liner Land

Though we like to keep lines of code "short" for ease of reading, sometimes fewer lines is also considered easy to read. Whenever we put in a String literal, like "puppy", what actually happens is that Java makes a new String object with that contents and then uses it right there. So we could match a literal String to a regular expression in a really-hardwired way, declaring no variables to hold the String or Pattern:

boolean b = ("puppy").matches("p[aeiou][p]{2}y");

Similarly, and perhaps more likely, if we were getting str from some computation or method call, we could put that in the first set of parentheses:

boolean b = (sc.nextLine()).matches("p[aeiou][p]{2,5}y");

Beyond the Chapter

Now you’re a pattern-matching pro. Sort of. But there is so much more than we've done here! This was just the simplest of introductions. There are more complex patterns, strategies for how greedy or hesitant a pattern is to consume as much or as little as possible when finding a match, and many, many more pre-defined character classes. See here (http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html) for more character classes, and see here (http://docs.oracle.com/javase/tutorial/essential/regex/) for a much more in-depth tutorial on regular expressions. Appendix 2 Number Representations

There are many different ways to represent whole numbers. While we are comfortable counting in decimal (0,1,2,3,4,5,6,7,8,9,10,11,12,…), that is only one set of names for those values. There are other numeral systems (ways of representing numbers), such as Roman numerals: I, II, III, IV, V, , VII, VIII, IX, X, XI, XII,…. We also can count in base-one, which are tally-marks: |, ||, |||, ||||, ||||, |||| |, |||| ||, …

The important idea right now is that we have different ways to label the values, but it doesn't change the quantity itself. If there are 12 beans in a pile on the table, it doesn't matter what I call it: 12, XII, a dozen, 1012, there are still that many beans sitting there.

We will discuss some other systems for naming whole numbers. Specifically, we will look at base 2 (binary) and base 16 (hexadecimal). For historical reasons, computer scientists are also occasionally interested in base 8 (octal).

Counting Up Base 10: Base 2: Base 16: Base 8: We first cover a chart that shows how to count upwards 0 0 0 0 in various bases. Whatever the base is, we cycle through 1 1 1 1 the symbols for that base, and when we run out of symbols we clock over into the next column (like from 9 2 10 2 2 to 10, or from 599 to 600). 3 11 3 3 4 100 4 4 Each row gives the various names from various bases 5 101 5 5 for the same quantity. So the base 10 number 13 is also 6 110 6 6 represented as 1101 in base 2, as D in base 16 (and 15 7 111 7 7 in base 8). 8 1000 8 10

9 1001 9 11 We will use this chart throughout our discussion of numeral systems. 10 1010 A 12 11 1011 B 13 Note: in order to differentiate between numbers in 12 1100 C 14 various bases, we write the base as a subscript. Thus 13 1101 D 15 we can tell that 10110, 1012, and 10116 are all different 14 1110 E 16 values (10110, 510, 25710 respectively). If we leave off a 15 1111 F 17 base, it is either base 10, or it is obvious (such as 16 10000 10 20 F8=F816). 17 10001 11 21 18 10010 12 22 19 10011 13 23

… … … …

Different Bases

Let's think a bit more carefully about how we already represent numbers in decimal; then we can use the same description with a different base than 10. We usually assume base 10 when we write numbers, but we will often add subscripts to indicate the base, such as 24510, 24516, 2A816, 1010, 102, 1016, etc. Decimal The base for our numbers is 10, and so we call decimal base 10. This base number indicates both how many symbols we have, as well as the values of each column in our numbers: base 10: ten symbols. 0 1 2 3 4 5 6 7 8 9 base 10: column values: … 103 102 101 100 = … 1000 100 10 1

We define an ordering to the symbols, e.g. so that 0 < 1 < 2 < … < 8 < 9. To increment our number means to represent a number that is one larger than previous. Our symbols are used, odometer-style, so that after we increment upwards and exhaust all the choices of symbol in one column, we can "roll over", or "overflow": we add one to the next column, and restart our numbering in the current column.

Roll-over examples: 9 +1 10 29 +1 30 419 +1 420 1999 +1 2000

We know of other situations where we have this overflow/roll-over situation. (Consider adding one more unit in each case):

• after 2:59pm, we reach 3:00pm. • after February 29th, 11:59pm, we reach March 1st, 12:00am. o The whole 12-is-the-start-of-the-next-day bit also emphasizes that we can choose any ordering for our symbols, as well. • after 1 quart 3 cups, we reach 2 quarts zero cups (because there are 4 cups in a quart). • after 3 weeks and 6 days, we have 4 weeks and zero days. • after using 2 dimes and 4 pennies for 24 cents' change, we would use 1 quarter to make 25 cents' change. Note that the 'column' values here are not all powers of the same base: 1, 5, 10, 25. Thus some of the useful properties we rely on for other numeral systems are not available in counting out the best change. For instance, we didn't clock over to get 2 dimes and 1 nickel, we combined those to get 1 quarter. Incidentally, if we had 20-cent pieces instead of quarters, this situation wouldn't arise: 20 is a multiple of all the previous pieces, unlike 25, so the best change would not have this same issue.

Example: When we see the number 423 (meaning 42310), we know that it equals 400+20+3. We even say "four hundred twenty-three", because base 10 is so embedded in our numeral system. We are seeing that each column in the number has a weighting that is found via the base. Each column is worth 10 raised to a successively higher power, starting with zero exponent at the rightmost column:

Digits of 42310: …(0) (0) 4 2 3 Column Values: … 103 = 1000 102 = 100 101 = 10 100 = 1 value in each column: (0) 0 4*100 2*10 3*1 sum: 400+20+3 = 42310

Hexadecimal

Hexadecimal is base 16. Since we need 16 symbols, and we only have ten common numeric symbols, we just borrow the first 6 letters of the alphabet to complete our set. base 16: sixteen symbols. 0 1 2 3 4 5 6 7 8 9 A B C D E F base 16 column values: … 163 162 161 160 … 4096 256 16 1

Roll-over is the exact same idea as before: when we run out of symbols in a column, we start over in this column with zero, while adding one to the column to the left.

F 10 3F 40 FFF 1000 7FFF 8000 9 A (not roll-over, just the next symbol of 16)

Example: 1AD516. We first construct our columns' values, and then think of the number in each column representing that many times the column's value. Placing column values beneath our number's symbols:

Hexadigits: 1 A D 5 Column Values: … 163 = 4096 162 = 256 161 = 16 160 = 1 (written in base 10) value in each … 1 *4096 10 *256 13 *16 5 *1 column: 10 10 10 10 10 10 10 10 sum: (1*4096) + (10*256) + (13*16) + (5*1) = 686910

We see that our number is worth (116*409610) + (A16*25610) + (D16*1610) + (516*110). We look up A in our chart and realize it's just 1010, and similarly that D is just 1310. (116=110 and 516=510). So we have, all in base 10, (1*4096) + (10*256) + (13*16) + (5*1) = 686910. We actually prefer to just write the second version, where everything is written in base 10. We are most comfortable doing calculations in decimal, and mixing bases within calculations is simply an invitation for calculations errors.

Binary

Binary is base 2. Thus there are only two symbols, and columns are worth powers of two. But all of our ideas carry over:

base 2: two symbols. 0 1

base 2 column values: … 26 25 24 23 22 21 20 = ... 64 32 16 8 4 2 1 Roll-over examples: (affected bits are highlighted) 1 10 111 1000 1101 1110

With only two symbols, we roll over quite frequently. Every other increment involves roll-over!

Conversions Between Bases

After looking at the initial chart and reading through our descriptions of each base, we can now discuss how to convert between one representation and another.

Other bases to decimal To convert from non-decimal bases to decimal, we follow the process described above to figure out what the value being represented would look like in base ten. Here is an algorithmic description:

1. Using the number's original base and exponents from zero and up, figure out the value of each column that your number utilizes. 2. For each column, multiply the column's value by the quantity represented at that column. (Write all of this in base 10 for simplicity's sake – instead of B16, we prefer to write 1110, or even just 11). 3. Add these results from each column to reach the base 10 result.

Examples • What is E316, in base 10? 1 0 (14*16 ) + (3*16 ) = 14*16 + 3*1 = 224 + 3 = 22710.

• What is 10316, in base 10? 2 1 0 (1*16 ) + (0*16 ) + (3*16 ) = 1*256 + 0*16 + 3*1 = 256+0+3 = 25910.

• What is 11012, in base 10? 3 2 1 0 (1*2 ) + (1*2 ) + (0*2 ) + (1*2 ) = 1*8 + 1*4 + 0*2 + 1*1 = 1310.

• What is 1000102, in base 10? 5 1 (1*2 ) + 0 + 0 + 0 + (1*2 ) + 0 = 32 + 2 =3410. o note that we might as well only focus on the non-zero columns to speed up the calculation.

• What is 478, in base 10? 1 0 (4*8 ) + (7*8 ) = 32 + 7 = 3910.

Your Turn! • Convert the following to decimal: o 101102. 11112. 10102. o 3616. 1A16. B216. E4716 o 318. 102103. 3H20 (just use more letters in order for the symbols)

Decimal to Other Bases

We use a different approach to convert from decimal to other bases. The reason we don't just use the previous version (it could work just fine) is that we are not trained in performing addition and multiplication in other bases, so we prefer an approach that allows us to do our calculations in decimal.

Version 1: Subtracting column-values until we have nothing left.

Think of a number that you want to represent in some base; as an analogy, think of your number as a pile of beans.

o Let's assume we have 253 beans.

You want to put all the beans into smaller groups, which happen to be of sizes 1000, 100, 10, and 1. Starting with the largest groupings possible, we first take out as many of that large group-size as possible.

o we take out two 100-bean groups, leaving us with 53 beans left.

Next, we look at the next-largest grouping, and see how many piles of beans of that size we are able to take from our pile of 53 beans.

o we take out five 10-bean groups, leaving us with 3 left.

We continue through each smaller-sized column, until we run out of beans or reach the right-most column (at which point we'd better have run out of beans too, or else we made a mistake somewhere in our actions).

o we take out three 1-bean groups, leaving us with no beans left over.

This same process works, no matter what base you started in (how you initially label your pile of beans), or what base you're going to (the various group-sizes/column values that are allowed).

Your Turn! Again consider 25310 beans. But now we want to take out piles of sizes 512, 64, 8, and 1. how many 512-sized piles do you get? (how many are left over?) how many 64-sized piles do you get? (how many are left over?) how many 8-sized piles do you get? (how many are left over?) how many 1-sized piles do you get? (how many are left over?)

What base is this representing, and now what would 25310 beans look like in that base?

Algorithm: • Find the target base's column values, up to the last column that isn't bigger than your quantity (your starting number). A column exactly equal to your number should be included. • Start with that leftmost/largest column. For the current column, as many times as you can, subtract the column's value without getting a negative result. Each time you do so, add one to this column as part of your result. (Like trading 25 pennies for 1 quarter). • Now that you can't subtract that column any more, you have less than the column's value; move one column to the right, and repeat. • As soon as you run out of quantity to redistribute (no more beans), place a zero in any remaining columns; your entire result is complete.

Example: Convert 53 to binary. 1. find out the column values. Any columns bigger than our number are filled with zeroes (and thus we don't show them in the final result). Value remaining: 53. 0 64 32 16 8 4 2 1

2. Looking at the 32 column (=25), we see that 32≤53, so we put a 1 in the 32 column and subtract 32 from our number. (32 was the largest column value that was ≤53, so that is where we begin). 0 1

64 32 16 8 4 2 1

Remaining value: 53-32 = 21.

3. proceed to the next column, again seeing if we should transfer part of our remaining value into this column. Repeat through all remaining columns.

16≤21. 1 in the 16's column. Remaining value: 21-16 = 5. 0 1 1 64 32 16 8 4 2 1

8>5. 0 in the 8's column. Remaining value: 5-0 = 5. 0 1 1 0 64 32 16 8 4 2 1

4≤5. 1 in the 4's column. Remaining value: 5-4=1. 0 1 1 0 1 64 32 16 8 4 2 1

2>1. 0 in the 2's column. Remaining value: 1-0=1. 0 1 1 0 1 0 64 32 16 8 4 2 1

1≤1. 1 in 1's column. Remaining value: 1-1=0. 0 1 1 0 1 0 1 64 32 16 8 4 2 1

Final Result: 1101012.

Example: Convert 746 to hexadecimal.

Again, we start with listing out column values in the target base until we've gotten large enough to know that we must have a zero in the largest column – this tells us how many columns our number will need, and tells us where to stop.

Remaining value: 746. 0 4096 256 16 1

256≤746. We also ask ourselves, "how many 256's can I extract from 746?" Since we can remove two 256's without going negative, we place a 2 in that column:

256≤746. 2*256≤746. 3*256>746 (too many). 2 in 256's column. Remaining value: 746-2*256 = 234. 0 2 4096 256 16 1

16≤234. "How many 16's can I subtract from 234?" 14*16≤234, but 15*16>234. 14 in 16's column (represented by E). Remaining value: 234 – (14*16) = 234 – 224 = 10. 0 2 E 4096 256 16 1

1≤10. "How many 1's can I subtract from 10?" Ten 1's will fit 10 in 1's column (represented by A). Remaining value: 10 – (1*10)=0.

0 2 E A 4096 256 16 1

Final Result: 2EA16.

Your Turn! • Convert 4610 to binary. • Convert 24310 to binary. • Convert 6310 to binary. • Convert 6710 to hexadecimal. • Convert 7910 to hexadecimal. • Convert 500010 to hexadecimal. • As a warm up to the next section (after version 2), convert 5910 to both binary and hexadecimal.

Version 2. Division and remainders approach.

We already understand that we find out if a number is even by dividing it by two, and checking if the remainder is a 1 or a 0. It turns out that every even number has a 0 in the 1's column in binary representation, and all odd numbers have a 1 in the 1's column in binary.

Using the number 1110 as an example, let's try to use division to consider the next bits of our result. If we consider the result of division, it tells us how many 2's we got. 11/2 = 5R1. This just tells us that 5*2 + 1 = 11. If we consider the five 2's, how would the 5 look in binary? Well, we know 5/2 = 2R1, so it ends in a 1. How would 2 look in binary? Well, we know it ends in a 0. 2/2= 1R1. How would 1 look in binary? 1/2=0R1, so it ends in a 1. We have nothing left, so we're done.

If we keep on dividing by 2, and record the remainders from right to left, stopping when we get a quotient of zero, we can also find out the binary representation of a number:

What is 2310 in binary?

23/2 = 11R1 1 11/2 = 5R1 11 5/2 = 2R1 111 2/2 = 1R0 0111 1/2 = 0R1 10111

Result: 101112. Again, notice that we found the rightmost bit first! We can remember this because finding the remainder when dividing by two (just once) tells us if the number is even or odd, which is evident in just the last (rightmost) column.

The same approach works for other bases; just divide by that base:

5364 / 16 = 335 R 4 4 335/16 = 20 R 15 F4 (1510 represented by F16). 20 / 16 = 1 R 4 4F4 1 / 16 = 0 R 1 14F4

Final answer: 14F416.

Your Turn! • Using this style of conversion, convert the following numbers to binary: o 1310 810 2310 • Using this style of conversion, convert the following numbers to hexadecimal. o 30010 5910 17810

Converting between base 2 and base 16.

Based on the chart on the right (a smaller portion of the original Base 10: Base 2: Base 16: chart shown above), we can see how four-bit binary patterns relate to single-symbol hexadecimal values. Leading zeroes have 0 0000 0 been added to the lower binary numbers so that we always see 1 0001 1 four bits. In order to convert from binary to hexadecimal, if we 2 0010 2 group a binary number in four bit groups starting from the right, 3 0011 3 we can use the chart to just convert 4 bits at a time. Conversely, 4 0100 4 we can convert from hexadecimal to binary by just converting 5 0101 5 each hexadecimal symbol to the corresponding 4-bit pattern. 6 0110 6

This works because 24 =16. It's no coincidence that one base is 7 0111 7 worth a specific column of the other base (16=24). Hence we are 8 1000 8 using 4 bits to represent 16 values. Reconsider 5910 from before: 9 1001 9 10 1010 A 5910 = 1110112 = 3B16. Splitting the binary up as 4-bit groups 11 1011 B starting from the right, we get 11 1011. We can pad the 12 1100 C leftmost group with 0's to make it 4 bits if it helps us think about 13 1101 D it: 0011 1011. 0011 = 3, 1011 = B. Result: 3B16. 14 1110 E Suppose we had some 32-bit number, where the bits were: 15 1111 F

1010 0000 1111 1000 1111 0001 0110 1010 A 0 F 8 F 1 6 A

= A0F8F16A16. It is common to prefix a hex number with 0x instead of writing the 16 suffix: 0xA0F8F16A. This is because source code is just raw text; there's no superscript button in a plaintext editor J

Your Turn!

• Convert these binary numbers directly to hexadecimal: o 10 1010 1010 1010 1010 o 10 1101 o 1 0101 0101 0101 1000 1101 1011 0110 1001 1011 1110 o 1111 1111 1111 1111 0000 1010 1010 o 1111 1010 1100 1110 o 1011 1110 1110 1111

• Convert these hexadecimal numbers to binary: o ACE 1234 1010 CAFE

Summary: Number Representations

You should be comfortable with the table of numbers that showed us how to count upwards. You should be comfortable converting between any two of these bases, in either direction: binary, decimal, hexdecimal.