Chapter 11
Introduction to Programming in C

There are 10 kinds of people in the world…
…those that know binary, and those that don’t.

Based on slides © McGraw-Hill
Additional material © 2004/2005 Lewis/Martin

CSE 240 2

Aside: What is Unix?
The most influential operating system
First developed in 1969 at AT&T Bell Labs
• By Ken Thompson and Dennis Ritchie
• Designed for “smaller” computers of the day
• Reject some of the complexity of MIT’s Multics
They found writing in assembly tedious
• Result: Dennis Ritchie invented the C language
Introduced to UC-Berkeley (Cal) in 1974
• Bill Joy was an early Unix hacker as a PhD student at Cal
• Much of the early internet consisted of Unix systems (mid-80s)
• Good, solid TCP/IP for BSD in 1984
Linux
• Free implementation of Unix (libre and gratuit)
• Announced by Linus Torvalds in 1991
Much more in CSE380!

Aside: The Unix Command Line
Text-based approach to give commands
• Commonly used before graphical displays
• Many advantages even today
Examples
• mkdir cse240hw8
   make a directory
• cd cse240hw8
   change to the directory
• ls
   list contents of directory
• cp /mnt/eniac/home1/c/cse240/project/hw/hw8/* .
   copy files from one location to current dir (“.”)
• emacs foo.c &
   run the command “emacs” with input “foo.c”
• gcc -o foo foo.c
   compile foo.c (create program called “foo”)
Unix eventually developed graphical UIs (GUIs)
• X-windows (long before )

Programming Levels
High-Level Languages
• Application Languages (Java, C#) and Scripting Languages (Perl, Python, VB): interpreted or compiled
• System Programming Languages (C and C++): translated by compilation
Low-Level Languages
• Assembly Language (x86, PowerPC, SPARC, MIPS): translated by assembly
• Machine Language (x86, PowerPC, SPARC, MIPS)
Hardware
• (Application-Specific Integrated Circuits or ASICs)

The Course Thus Far…
We did digital logic
• Bits are bits
• Ultimately, to understand a simple processor
We did assembly programming
• Programming the “raw metal” of the computer
• Ultimately, to understand C programming
Starting today: we’re doing C programming
• C is still common for systems programming
• You’ll need it for the operating systems class (CSE380)
• Ultimately, for a deeper understanding of any language (Java)

Why High-Level Languages?
Easier than assembly. Why?
• Less primitive constructs
• Variables
• Type checking
Portability
• Write program once, run it on the LC-3 or Intel’s x86
Disadvantages
• Slower and larger programs (in most cases)
• Can’t manipulate low-level hardware
   All operating systems have some assembly in them
Verdict: assembly coding is rare today

Our Challenge
99% of you already know either Java or C
• We’re going to try to cover the basics quickly
• We’ll spend more time on pointers & other C-specific nastiness
Created two decades apart
• C: 1970s - AT&T Bell Labs
• C++: 1980s - AT&T Bell Labs
• Java: 1990s - Sun Microsystems
Java and C/C++
• Syntactically similar (Java uses C syntax)
• C lacks many of Java’s features
• Subtly different semantics


C is Similar To Java Without:
Objects
• No classes, objects, methods, or inheritance
Exceptions
• Check all error codes explicitly
Standard class library
• C has only a small standard library
Garbage collection
• C requires explicit memory allocate and free
Safety
• Java has strong type checking, checks array bounds
• In C, anything goes
Portability
• Source: C code is less portable (but better than assembly)
• Binary: C compiles to specific machine code
More differences as we go along…

More C vs Java differences
C has a “preprocessor”
• A separate pre-pass over the code
• Performs replacements
Include vs Import
• Java has import java.io.*;
• C has: #include <stdio.h>
• #include is part of the preprocessor
Boolean type
• Java has an explicit boolean type
• C just uses an “int” as zero or non-zero
• C’s lack of boolean causes all sorts of trouble

What is C++?
C++ is an extension of C
• Backward compatible (good and bad)
• That is, almost all C programs are legal C++ programs
C++ adds many features to C
• Classes, objects, inheritance
• Templates for polymorphism
• A large, cumbersome class library (using templates)
• Exceptions (not actually implemented for a long time)
• More safety (though still unsafe)
• Operator and function overloading
Thus, many people use it (to some extent)
• However, we’re focusing on only C, not C++

Quotes on C/C++ vs Java
“C is to assembly language as Java is to C”
• Unknown
“With all due respect, saying Java is just a C++ subset is rather like saying that ‘Pride and Prejudice’ is just a subset of the Encyclopedia Britannica. While it is true that one is shorter than the other, and that both have the same syntax, there are rather overwhelming differences.”
• Sam Weber, on the ACM SIGCSE mailing list
“Java is C++ done right.”
• Unknown


More quotes on C/C++
“The C programming language combines the power of assembly language with the ease-of-use of assembly language.”
• Unknown
“It is my impression that it’s possible to write good programs in C++, but nobody does.”
• John Levine, moderator of comp.compilers
“C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do it, it blows your whole leg off.”
• Bjarne Stroustrup, creator of C++

Compilation vs. Interpretation
Different ways of translating high-level languages
Interpretation
• Interpreter: program that executes program statements
   Directly interprets program (portable but slow)
   Limited optimization
• Easy to debug, make changes, view intermediate results
• Languages: BASIC, LISP, Perl, Python, Matlab
Compilation
• Compiler: translates statements into machine language
   Creates executable program (non-portable, but fast)
   Performs optimization over multiple statements
• Harder to debug, change requires recompilation
• Languages: C, C++, Fortran, Pascal
Hybrid
• Java has features of both interpreted and compiled languages

Compilation vs. Interpretation
Consider the following algorithm:
• Get W from the keyboard.
• X = W + W
• Y = X + X
• Z = Y + Y
• Print Z to screen.
If interpreting, how many arithmetic operations occur?
If compiling, we can analyze the entire program and possibly reduce the number of operations.
• Can we simplify the above algorithm to use a single arithmetic operation?

Compiling a C Program
Entire mechanism is usually called the “compiler”
Preprocessor
• Macro substitution
• Conditional compilation
• “Source-level” transformations
   Output is still C
Compiler
• Generates object file
   Machine instructions
Linker
• Combine object files (including libraries) into executable image
Figure: C Source and Header Files → C Preprocessor → Compiler (Source Code Analysis, Symbol Table, Target Code Synthesis) → Linker (with Library Object Files) → Executable Image

Compiler
Source Code Analysis
• “Front end”
• Parses program to identify its pieces
   Variables, expressions, statements, functions, etc.
• Depends on language (not on target machine)
Code Generation
• “Back end”
• Generates machine code from analyzed source
• May optimize machine code to make it run more efficiently
• Very dependent on target machine
Example Compiler: GCC
• The Free Software Foundation’s compiler
• Many front ends: C, C++, Fortran, Java
• Many back ends: Intel x86, PowerPC, SPARC, MIPS, Itanium

A Simple C Program
   #include <stdio.h>
   #define STOP 0

   int main()
   {
     /* variable declarations */
     int counter;     /* an integer to hold count values */
     int startPoint;  /* starting point for countdown */

     /* prompt user for input */
     printf("Enter a positive number: ");
     scanf("%d", &startPoint);  /* read into startPoint */

     /* count down and print count */
     for (counter=startPoint; counter >= STOP; counter--) {
       printf("%d\n", counter);
     }
     return 0;
   }

Preprocessor Directives
#include <stdio.h>
• Before compiling, copy contents of header file (stdio.h) into source code.
• Header files typically contain descriptions of functions and variables needed by the program.
   no restrictions -- could be any C source code
#define STOP 0
• Before compiling, replace every occurrence of the name STOP with 0 (text inside string literals is left alone)
• Called a macro
• Used for values that won't change during execution, but might change if the program is reused. (Must recompile.)

Comments
Begins with /* and ends with */
• Can span multiple lines
• Comments are not recognized within a string
   example: "my/*don't print this*/string"
   would be printed as: my/*don't print this*/string
Begins with // and ends with “end of line”
• Single-line comment
• Much like “;” in LC-3 assembly
• Introduced in C++, later back-ported to C
As before, use comments to help the reader, not to confuse or to restate the obvious


main Function
Every C program must have a function called main()
• Starting point for every program
• Similar to Java’s main method
   public static void main(String[] args)
The code for the function lives within braces:
   int main()
   {
     /* code goes here */
     return 0;
   }

Variable Declarations
Variables are used as names for data items
Each variable has a type, which tells the compiler:
• How the data is to be interpreted
• How much space it needs, etc.
   int counter;
   int startPoint;
C has primitive types similar to Java’s
• int, char, long, float, double
• More later


Input and Output
Variety of I/O functions in the C standard library
• Must include <stdio.h> to use them
   printf("%d\n", counter);
• String contains characters to print and formatting directions for variables
• This call says to print the variable counter as a decimal integer, followed by a linefeed (\n)
   scanf("%d", &startPoint);
• String contains formatting directions for looking at input
• This call says to read a decimal integer and assign it to the variable startPoint (Don't worry about the & yet)

More About Output
Can print arbitrary expressions, not just variables
   printf("%d\n", startPoint - counter);
Print multiple expressions with a single statement
   printf("%d %d\n", counter, startPoint - counter);
Different formatting options:
   %d decimal integer
   %x hexadecimal integer
   %c ASCII character
   %f floating-point number


Examples
This code:
   printf("%d is a prime number.\n", 43);
   printf("43 plus 59 in decimal is %d.\n", 43+59);
   printf("43 plus 59 in hex is %x.\n", 43+59);
   printf("43 plus 59 as a character is %c.\n", 43+59);
produces this output:
   43 is a prime number.
   43 plus 59 in decimal is 102.
   43 plus 59 in hex is 66.
   43 plus 59 as a character is f.

Examples of Input
Many of the same formatting characters are available for user input
   scanf("%c", &nextChar);
• reads a single character and stores it in nextChar
   scanf("%f", &radius);
• reads a floating point number and stores it in radius
   scanf("%d %d", &length, &width);
• reads two decimal integers (separated by whitespace), stores the first one in length and the second in width
Must use ampersand (&) for variables being modified (Explained in Chapter 16.)

Compiling and Linking
Various compilers available
• cc, gcc
• includes preprocessor, compiler, and linker
Lots and lots of options!
• level of optimization, debugging
• preprocessor, linker options
• intermediate files -- object (.o), assembler (.s), preprocessor (.i), etc.

Remaining Chapters
A more detailed look at many C features
• Variables and declarations
• Operators
• Control Structures
• Functions
• Pointers and Data Structures
• I/O
Emphasis on how C is converted to assembly language
Also see “C Reference” in Appendix D

