SIPB's IAP Programming in C

C was developed at AT&T Bell Labs between 1971 and 1973 by Dennis Ritchie. It was derived from an experimental language called B, which itself was a stripped-down version of BCPL. All of these are derivatives of the ALGOL family of languages, dating from the 1950s.

This work was done on a PDP-11, a machine with a 16-bit address bus. While a PDP-11 should be able to address up to 64K of memory, these machines only had 24K. The C compiler was written in C, and was able to compile itself on these machines, running under the nascent Unix operating system.

The classic

    #include <stdio.h>

    main()
    {
        printf("hello, world\n");
    }

    /*
     * %%         - a percent sign
     * %s         - a string
     * %d         - 32-bit integer, base 10
     * %lld       - 64-bit integer, base 10
     * %x         - 32-bit integer, base 16
     * %llx       - 64-bit integer, base 16
     * %f, %e, %g - double precision
     *              floating point number
     */
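The conversion specifiers listed above can be tried out without a terminal by formatting into a buffer. The following is a minimal sketch, not part of the original slides: format_demo() is a hypothetical helper, and it assumes a platform where int is 32 bits and long long is 64 bits, as the table does.

```c
#include <stdio.h>

/* format_demo() is a hypothetical helper for illustration only.
 * It formats one value per conversion specifier into buf, separated
 * by '|', and returns the character count reported by snprintf(). */
int format_demo(char *buf, size_t size)
{
    int small = 255;
    long long big = 4294967296LL;   /* 2^32, does not fit in 32 bits */
    double real = 18.19;

    /* %x and %llx expect unsigned arguments, hence the casts */
    return snprintf(buf, size, "%s|%d|%x|%lld|%llx|%g|%%",
                    "text", small, (unsigned int)small,
                    big, (unsigned long long)big, real);
}
```

Calling format_demo() with a 128-byte buffer should leave the text `text|255|ff|4294967296|100000000|18.19|%` in the buffer. snprintf() is used here rather than printf() only so that the result can be inspected by the caller.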

A word on printf()

    /*
     * the following should produce:
     *
     * printf() test:
     *  string: 'string test'
     *  number: 22
     *  float: 18.19
     */
    printf("printf() test:\n"
           " string: '%s'\n"
           " number: %d\n"
           " float: %g\n",
           "string test", 22, 18.19);

Language Structure

There are five different kinds of integer:
- char
- short
- int
- long
- long long

Each of these can be either:
- signed (the default, except that plain char may be signed or unsigned depending on the platform)
- unsigned

    char i_8;                   /* -128 to 127 */
    unsigned char ui_8;         /* 0 to 255 */
    short i_16;                 /* -32768 to 32767 */
    unsigned short ui_16;       /* 0 to 65535 */
    int i_32;                   /* -2147483648 to 2147483647 */
    unsigned int ui_32;         /* 0 to 4294967295U */
    long i_arch;
    unsigned long ui_arch;      /* architecture dependent */
    long long i64;              /* -9223372036854775808LL to
                                 * 9223372036854775807LL */
    unsigned long long ui64;    /* 0 to 18446744073709551615ULL */

There are at least four different kinds of floating point value:
- float
- double
- long double
- __float128

    float f_32;         /* single precision:
                         * +/- 1.1x10^-38 to 3.4x10^38,
                         * roughly 7 digits precision */
    double f_64;        /* double precision:
                         * +/- 2.2x10^-308 to 1.8x10^308,
                         * roughly 16 digits precision */
    long double f_80;   /* extended precision:
                         * +/- 3.4x10^-4932 to 1.2x10^4932,
                         * roughly 19 digits precision */
    __float128 f_128;   /* quadruple precision:
                         * +/- 3.4x10^-4932 to 1.2x10^4932,
                         * roughly 34 digits precision */

Additionally, C understands strings and characters:

    char zero = '0';
    char *one_as_string = "One";
    char *stuff = "I think I see "
                  "Bob Marley "
                  "in my cornflakes!\n";

There are a number of backslash escapes available, to encode commonly used but unprintable (or reserved) characters in strings:

    /*
     * \a Bell (alert)
     * \b Backspace
     * \f Formfeed
     * \n New line
     * \r Carriage return
     * \t Horizontal tab
     * \v Vertical tab
     * \' Single quotation mark
     * \" Double quotation mark
     * \\ Backslash
     * \? Literal question mark
     */

Backslash escapes can also be composed with octal values, to specify ASCII coded characters in a general fashion: \NNN

    /*
     * 000 NUL '\0'
     * 001 SOH (start of heading)
     * 002 STX (start of text)
     * 003 ETX (end of text)
     * 004 EOT (end of transmission)
     * 005 ENQ (enquiry)
     * 006 ACK (acknowledge)
     * 007 BEL '\a' (bell)
     * 010 BS  '\b' (backspace)
     * 011 HT  '\t' (horizontal tab)
     * 012 LF  '\n' (new line)
     * 013 VT  '\v' (vertical tab)
     * 014 FF  '\f' (form feed)
     * 015 CR  '\r' (carriage ret)
     * 016 SO  (shift out)
     * 017 SI  (shift in)
     * 020 DLE (data link escape)
     * 021 DC1 (device control 1)
     * 022 DC2 (device control 2)
     * 023 DC3 (device control 3)
     * 024 DC4 (device control 4)
     * 025 NAK (negative ack.)
     * 026 SYN (synchronous idle)
     * 027 ETB (end of trans. blk)
     * 030 CAN (cancel)
     * 031 EM  (end of medium)
     * 032 SUB (substitute)
     * 033 ESC (escape)
     * 034 FS  (file separator)
     * 035 GS  (group separator)
     * 036 RS  (record separator)
     * 037 US  (unit separator)
     *
     *      40  50  60  70 100 110 120 130 140 150 160 170
     *     ------------------------------------------------
     * 0:        (   0   8   @   H   P   X   `   h   p   x
     * 1:    !   )   1   9   A   I   Q   Y   a   i   q   y
     * 2:    "   *   2   :   B   J   R   Z   b   j   r   z
     * 3:    #   +   3   ;   C   K   S   [   c   k   s   {
     * 4:    $   ,   4   <   D   L   T   \   d   l   t   |
     * 5:    %   -   5   =   E   M   U   ]   e   m   u   }
     * 6:    &   .   6   >   F   N   V   ^   f   n   v   ~
     * 7:    '   /   7   ?   G   O   W   _   g   o   w DEL
     */

Arrays

    int ten_integers[10];
    double five_doubles[5];
    unsigned long long guess[22];

Structs

    struct NameOfStructure {
        int integer_field;
        double floating_point_field;
        unsigned short an_array[15];
    };

    struct AnotherStructure {
        int a;
        char b;
        struct NameOfStructure nos;
    };

Bitfields

    struct NameOfStructure {
        int one_bit:1;
        int two_bits:2;
        int many_bits:22;
    };

Unions

    union NameOfUnion {
        int integer_field;
        double floating_point_field;
        unsigned short an_array[15];
    };

Enums

    enum NameGoesHere {
        Gives,
        Each,
        An,
        Integer,
        Value
    };

Typedefs

    typedef unsigned long long my_uint64_t;

    struct Example {
        my_uint64_t a;
        my_uint64_t b;
    };

    typedef struct Example Example_t;

These composite data types can be initialized, too, which is handy for working with large sets of constant, complex data:

    struct One {
        int a;
        int b;
        double c;
        char *string;
    };

    struct Two {
        char *name;
        char *desc;
        struct One data;
    };

    struct Two values[] = {
        "slug", "squooshy, gross", 1, 2, 3.0, "four",
        "bat", "blood thirsty", { 7, 14, 21.0, "rump roast" },
        /* ... etc ... */
        { "chicken", "walking gizzard", { 9, 7, 5, "boombox" }},
        NULL, NULL, 0, 0, 0, NULL
    };

The building block of computation in C is the function; all computation must occur inside one.

    /* start by declaring:
     * - the return type
     * - the function name
     * - the types and names of the
     *   function parameters */
    int sum_of_squares(int a, int b)
    {
        /* first come variable
         * declarations */
        int c;

        /* then statements; each
         * statement should end with
         * a semicolon */
        c = (a * a) + (b * b);

        /* return a value, if we said
         * we would */
        return c;
    }

Functions can call other functions.

    int sum_of_squares(int a, int b)
    {
        int c;
        c = (a * a) + (b * b);
        return c;
    }

    int pythagorean_p(int a, int b, int c)
    {
        int v1,v2;

        /* a function call looks
         * like this */
        v1 = sum_of_squares(a,b);
        v2 = c * c;

        if(v1 == v2) return(1);
        else return(0);
    }

The top level function is main(); that gets called when your program runs.
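The ranges quoted for the integer types earlier in this section are typical of common platforms rather than guaranteed; the language only promises minimum ranges. As a minimal sketch (print_integer_ranges() is a hypothetical name, not something from the slides, and a C99 compiler is assumed for %zu and the long long limits), the actual limits on a given machine can be read from <limits.h>:

```c
#include <limits.h>
#include <stdio.h>

/* print_integer_ranges() is a hypothetical helper for illustration.
 * It prints the real limits of each integer type on this platform,
 * as reported by the standard header <limits.h>. */
void print_integer_ranges(void)
{
    printf("char:               %d to %d\n", CHAR_MIN, CHAR_MAX);
    printf("unsigned char:      0 to %u\n", (unsigned int)UCHAR_MAX);
    printf("short:              %d to %d\n", SHRT_MIN, SHRT_MAX);
    printf("unsigned short:     0 to %u\n", (unsigned int)USHRT_MAX);
    printf("int:                %d to %d\n", INT_MIN, INT_MAX);
    printf("unsigned int:       0 to %u\n", UINT_MAX);
    printf("long:               %ld to %ld\n", LONG_MIN, LONG_MAX);
    printf("unsigned long:      0 to %lu\n", ULONG_MAX);
    printf("long long:          %lld to %lld\n", LLONG_MIN, LLONG_MAX);
    printf("unsigned long long: 0 to %llu\n", ULLONG_MAX);

    /* sizeof reports storage size in bytes; long is the type most
     * likely to differ between 32-bit and 64-bit systems */
    printf("sizeof(long) = %zu bytes\n", sizeof(long));
}
```

The output differs from machine to machine, which is the point: code that depends on a particular width is safer checking these constants than assuming the values in a table.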

    int sum_of_squares(int a, int b)
    {
        int c;
        c = (a * a) + (b * b);
        return c;
    }

    int pythagorean_p(int a, int b, int c)
    {
        int v1,v2;
        v1 = sum_of_squares(a,b);
        v2 = c * c;
        if(v1 == v2) return(1);
        else return(0);
    }

    /* notice we don't even use argc and
     * argv; not a big deal! */
    int main(int argc, char *argv[])
    {
        if(pythagorean_p(3,4,5))
            printf("3:4:5 is a "
                   "pythagorean triple\n");
        else
            printf("3:4:5 is NOT "
                   "a pythagorean triple\n");
        return(0);
    }

A C function can return nothing.

    /* func1() will return nothing,
     * which is called 'void' in C. */
    void func1(int a)
    {
        printf("a = %d\n",a);
    }

Or a function can return a single value.

    /* func2() will return a
     * single integer value */
    int func2(int a)
    {
        return(a ^ (a << 2));
    }

Or, a function can wrap multiple values into one return result.

    /* func3() will return multiple
     * integer values, by way of a
     * struct. we'll learn other
     * approaches later! */

    /* this holds our results: */
    struct Func3Results {
        int v1;
        int v2;
    };

    struct Func3Results func3(int a)
    {
        struct Func3Results back;
        back.v1 = a * 2;
        back.v2 = a * 4;
        return(back);
    }

C functions can accept variable numbers of parameters. These are called variadic functions.

    #include <stdarg.h>
    #include <stdio.h>

    /* sum_integers() accepts first a
     * count of integers to sum, then
     * those integers themselves. it
     * returns their integer sum. */
    int sum_integers(int count, ...)
    {
        int i,sum;
        va_list list;

        va_start(list,count);
        for(i=0,sum=0;i<count;i++)
            sum += va_arg(list,int);
        va_end(list);
        return(sum);
    }

    /*
     * >>, <<
     *   : bitshift right, left
     */

And also with:
- Boolean operators
- Assignment operators

    /*
     * boolean operators:
     * >, >=
     *   : greater than, greater than
     *   : or equal to
     * <, <=
     *   : less than, less than or
     *   : equal to
     * ==, !=
     *   : equal, not equal
     * &&, ||, !
     *   : and, or, not
     *   : (C has no logical xor operator)
     *
     * assignment operators:
     * =
     *   : assignment
     * +=, -=, *=, /=
     *   : add, sub, mul, div,
     *   : and then assign
     */

Operator precedence is as commonly used in mathematics, though interactions between e.g. subtract and bitwise xor may not be intuitive.

    /*
     * Do first
     *
     *  1  ()              Grouping
     *  2  ! ~ - ++x --x   Unary ops
     *  3  * / %           Mul, div, mod
     *  4  + -             Add, sub
     *  5  << >>           Bit shifts
     *  6  < <= > >=       Comparisons
     *  7  == !=           Equality tests
     *  8  &               Bitwise and
     *  9  ^               Bitwise xor
     * 10  |               Bitwise or
     * 11  &&              Logical and
     * 12  ||              Logical or
     * 13  = += -= *= /=   Assignments
     *
     * Do last
     */

Flow control is achieved with:
- if constructs
- if/else constructs
- if/else if/else constructs

    if(a > b) {
        /* only case */
    }

    if(a > b) {
        /* first case */
    } else {
        /* alternative case */
    }

    if(a > b) {
        /* first case */
    } else if(b > c) {
        /* second case */
    } else {
        /* default case */
    }

    for(i=0;i

    return(FAIL);

There is also a dark side to goto...

    /* and it can get a lot
     * worse than this! */
    i = 0;
    loop1:
        if(++i > 10) goto loop1_end;
        j = 0;
    loop2:
        if(++j > 10) goto loop2_end;
        printf("(i,j) = (%d,%d)\n",i,j);
        goto loop2;
    loop2_end:
        goto loop1;
    loop1_end:
        ;

The switch statement is kind of like a multi-way goto.

    {
        int a;
        /* set a to some value here */
        switch(a % 3) {
        case 0:
            printf("a is 0 mod 3\n");
            break;
        case 1:
            printf("a is 1 mod 3\n");
            break;
        case 2:
            printf("a is 2 mod 3\n");
            break;
        default:
            printf("a is negative\n");
            break;
        }
    }

The computer, compiling, and execution

Still the classic

    #include <stdio.h>

    main()
    {
        printf("hello, world\n");
    }

An interaction with gcc, the GNU C compiler, and a timeless classic in the history of C compilation.

    athena% gcc hello.c -o hello
    athena% ./hello
    hello, world
    athena%
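For comparison with the goto-based nested loop shown earlier, here is a sketch of the same iteration written both ways. The function names are hypothetical, and the printf() call from the slide is replaced by a counter so the two versions can be compared by their return values.

```c
/* count_pairs_goto() mirrors the goto-based nested loop from the
 * slide, counting each (i,j) pair instead of printing it.
 * Hypothetical name, for illustration only. */
int count_pairs_goto(void)
{
    int i, j, count = 0;

    i = 0;
loop1:
    if (++i > 10) goto loop1_end;
    j = 0;
loop2:
    if (++j > 10) goto loop2_end;
    count++;
    goto loop2;
loop2_end:
    goto loop1;
loop1_end:
    return count;
}

/* count_pairs_for() visits the same pairs, i and j each running
 * from 1 to 10, using structured for loops. */
int count_pairs_for(void)
{
    int i, j, count = 0;

    for (i = 1; i <= 10; i++)
        for (j = 1; j <= 10; j++)
            count++;
    return count;
}
```

Both functions visit the same 100 pairs; the for version states the loop bounds in one place, while the goto version spreads them across labels and jumps.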

    hello.c -> cpp -> hello(2).c -> gcc -> hello.s -> as -> hello.o -> ld -> hello
          (preprocessor)       (compiler)       (assembler)      (linker)

hello.c, after passing through the preprocessor.

    # 1 "hello.c"
    # 1 "<built-in>"
    # 1 "<command-line>"
    # 1 "hello.c"
    # 1 "/usr/include/stdio.h" 1 3 4
    # 28 "/usr/include/stdio.h" 3 4
    # 1 "/usr/include/features.h" 1 3 4
    # 352 "/usr/include/features.h" 3 4

    /* much text removed here */

    extern char *ctermid (char *__s) __attribute__ ((__nothrow__));
    # 888 "/usr/include/stdio.h" 3 4
    extern void flockfile (FILE *__stream) __attribute__ ((__nothrow__));
    extern int ftrylockfile (FILE *__stream) __attribute__ ((__nothrow__)) ;
    extern void funlockfile (FILE *__stream) __attribute__ ((__nothrow__));
    # 918 "/usr/include/stdio.h" 3 4
    # 2 "hello.c" 2

    main()
    {
        printf("hello, world\n");
    }

The preprocessor replaces comments with whitespace, concatenates included files, and conditionally includes text segments.

hello.c, in i386 assembler. This is what the compiler produces. This format can be more easily read and understood by humans, but remains very close to machine language.

            .file   "hello.c"
            .section        .rodata
    .LC0:
            .string "hello, world"
            .text
    .globl main
            .type   main, @function
    main:
            pushl   %ebp
            movl    %esp, %ebp
            andl    $-16, %esp
            subl    $16, %esp
            movl    $.LC0, (%esp)
            call    puts
            leave
            ret
            .size   main, .-main
            .ident  "GCC: (GNU) 4.4.1 20090725 (Red Hat 4.4.1-2)"
            .section        .note.GNU-stack,"",@progbits

The meat of hello, in machine language. This is what is produced by compiling hello.c into assembly language, and then assembling that into i386 machine language.

    55 89 e5 83 e4 f0 83 ec
    10 b8 b4 84 04 08 c7 44
    24 04 22 be 55 fe 89 04
    24 e8 12 ff ff ff c9

Linking is performed by ld, the linker. Multiple .o files will be knitted together during linking, including those from static libraries. Functions provided by dynamic libraries will simply be referenced.

    athena% readelf -a hello

    Symbol table '.dynsym' contains 5 entries:
       Num:    Value  Size Type    Name
         0: 00000000     0 NOTYPE
         1: 00000000     0 NOTYPE  __gmon_start__
         2: 00000000     0 FUNC    __libc_start_main@GLIBC_2.0 (2)
         3: 00000000     0 FUNC    puts@GLIBC_2.0 (2)
         4: 0804848c     4 OBJECT  _IO_stdin_used

    Symbol table '.symtab' contains 65 entries:
       Num:    Value  Size Type    Name
       # snip
        40: 00000000     0 FILE    hello.c
        41: 080495f8     0 OBJECT  _GLOBAL_OFFSET_TABLE_
        42: 08049518     0 NOTYPE  __init_array_end
        43: 08049518     0 NOTYPE  __init_array_start
        44: 0804952c     0 OBJECT  _DYNAMIC
        45: 08049610     0 NOTYPE  data_start
        46: 080483d0     5 FUNC    __libc_csu_fini
        47: 08048300     0 FUNC    _start
        48: 00000000     0 NOTYPE  __gmon_start__
        49: 00000000     0 NOTYPE  _Jv_RegisterClasses
        50: 08048488     4 OBJECT  _fp_hw
        51: 0804846c     0 FUNC    _fini
        52: 00000000     0 FUNC    __libc_start_main@@GLIBC_
        53: 0804848c     4 OBJECT  _IO_stdin_used
        54: 08049610     0 NOTYPE  __data_start
        55: 08048490     0 OBJECT  __dso_handle
        56: 08049524     0 OBJECT  __DTOR_END__
        57: 080483e0    90 FUNC    __libc_csu_init
        58: 08049614     0 NOTYPE  __bss_start
        59: 0804961c     0 NOTYPE  _end
        60: 00000000     0 FUNC    puts@@GLIBC_2.0
        61: 08049614     0 NOTYPE  _edata
        62: 0804843a     0 FUNC    __i686.get_pc_thunk.bx
        63: 080483b4    23 FUNC    main
        64: 08048290     0 FUNC    _init

The C Preprocessor

The preprocessor is responsible for trimming your comments.
- Comments are understood to be between /* and */
- Comments are not between // and the end of the line
- Some compilers will support this latter comment style (it was standardized in C99), but it can adversely affect the portability of your code

    /* nothing in here is
     * going to be seen by the
     * compiler */

    /* nor in here */

#include interprets the requested file
- Files between < and > will be sought amongst the system header files

    #include <stdio.h>

- Files between " and " should be in the include path, which can be passed to the compiler

    #include "my-header.h"

- However, the include path by default will include the current directory

#define defines substitutions
- These can be simply 'defined' or 'not defined'

    #define _STRING_H

- Or they can be scalar values

    #define NULL (void *)0

- Alternatively, they can be functions with parameters

    #define SUM(a,b) ((a) + (b))

- These macro substitutions are recursively evaluated: the result of a substitution is scanned again for further macros to expand

- Code between #if and #endif will be conditionally compiled

- defined(SYMBOL) will evaluate true or false, depending on whether SYMBOL is defined or not
- The !, ||, and && operators work as expected
- Code to be skipped is replaced with blank lines
- Terminated with #endif

    #if defined(MSDOS) || \
        defined(OS2) || \
        defined(WINDOWS)
    # if !defined(__GNUC__) && \
         !defined(__FLAT__)
    /* conditionally compiled code
     * goes here */
    # endif
    #endif

#ifdef and #ifndef are convenient interfaces to common functionality:
- #ifdef SYMBOL is equivalent to #if defined(SYMBOL)
- #ifndef SYMBOL is similarly equivalent to #if !defined(SYMBOL)

    #ifndef SYS16BIT
    # define SYS16BIT
    #endif

#if 0 is a convenient way to comment out large swaths of code, particularly those that include embedded comments. The latter matters because C comments are not recursive. :-(

    #if 0
    # include <sys/types.h>
    # include <unistd.h>
    # ifdef VMS
    #  include <unixio.h>
    # endif
    # define z_off_t off_t
    #endif

#pragma is used to access compiler- and implementation-specific parameters and language extensions in a minimally standard way.

    #pragma warning(disable: 4035)
    #pragma map(deflateInit_,"DEIN")
    #pragma message("LIBPNG reserved macros; \
    use PNG_USER_PRIVATEBUILD instead.")

Compilation can produce either object files (ending in .o) or executables, depending on what flags you pass the compiler:

    # generate an object file
    athena% gcc hello.c -c
    # generate an executable
    # (libc is implicitly linked in)
    athena% gcc hello.c -o hello

Object files can then be linked together, with each other and with libraries, to produce an executable.

    # link object files with libm
    # to generate executable exec-file
    athena% gcc obj-file-1.o \
            obj-file-2.o -o exec-file -lm

A more typical interaction with gcc

    athena% gcc hello.c -o hello
    hello.c: In function 'main':
    hello.c:6: error: expected ';' before '}' token
    [editing]
    athena% gcc hello.c -o hello
    athena% ./hello
    hello, world
    athena%

Sometimes things go really wrong

    athena% gcc hello.c -o hello
    athena% ./hello
    Segmentation fault (core dumped)
    athena% ls core*
    core.1234
    athena% gdb hello core.1234
    [gdb spews a lot of verbiage]
    [gdb interactions]
    [editing]
    athena% gcc hello.c -o hello
    athena% ./hello
    hello, world
    athena%
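Returning to the preprocessor directives described earlier in this section, the sketch below shows why the SUM macro wraps its parameters and its whole body in parentheses, and how #ifndef and #if 0 behave in practice. BAD_SUM, WORD_BITS and the three functions are hypothetical names introduced only for this illustration; SUM and SYS16BIT are taken from the slides.

```c
/* SUM is the macro from the slides: fully parenthesized */
#define SUM(a,b) ((a) + (b))

/* BAD_SUM is a hypothetical counter-example without parentheses */
#define BAD_SUM(a,b) a + b

/* conditional compilation: pick a value depending on whether
 * SYS16BIT has been defined (it is not defined in this sketch) */
#ifndef SYS16BIT
# define WORD_BITS 32
#else
# define WORD_BITS 16
#endif

#if 0
this text is skipped entirely /* even with an embedded comment */
#endif

/* expands to 2 * ((1) + (2)), which is 6 */
int macro_demo_good(void)
{
    return 2 * SUM(1, 2);
}

/* expands to 2 * 1 + 2, which is 4: substitution is purely
 * textual, so operator precedence applies after expansion */
int macro_demo_bad(void)
{
    return 2 * BAD_SUM(1, 2);
}

/* returns whichever WORD_BITS value the #ifndef selected */
int word_bits(void)
{
    return WORD_BITS;
}
```

Compiling the same file with -DSYS16BIT on the gcc command line would select the other branch of the #ifndef, which is how such symbols are commonly set per platform.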