cnotes.txt Tue Mar 28 15:19:23 2017 1 cnotes.txt Tue Mar 28 15:19:23 2017 2 Beginning for Java Programmers ------C Syntax C started in the early 1970s as a systems-programming ------language. Much C syntax was borrowed by Java. Higher level than assembly language. Same control structures as Java. Lower level than most other languages. allows declaring variable in for loop K & R : Kernighan and Ritchie and anywhere in code. (Like Java) Influential 1978 book ANSI C requires that local vars be declared Greatly tamed in 1989: ANSI C "C89" or "C90" before any statements in any {.....} block.

Also revised 1999 and 2011. Array syntax is not as "logical" as Java’s cnotes.txt Tue Mar 28 15:19:23 2017 3 cnotes.txt Tue Mar 28 15:19:23 2017 4

The Preprocessor Macros in the C Preprocessor ------

Unlike Java, the C compiler has a "preprocessor" #define FOO 12*2 that does textual substitutions and "conditional compilation" before the "real compiler" thereafter, wherever the symbol FOO occurs in your runs. code, it is textually replaced by 12*2 . Think find-and-replace. This was borrowed from assembly languages. Macro can have parameters Fancy uses of preprocessor are often deprecated, especially since C99. #define BAR(x) 12*x

Still commonly used use for defining constants If we have BAR(3*z) in your code, it is textually and in doing the C equivalent to Java’s "import". replaced by 12*3*z

Preprocessor "directives" start with #. No ’;’ at end #undef BAR will "undefine" BAR. After this, BAR(3*z) remains as is. cnotes.txt Tue Mar 28 15:19:23 2017 5 cnotes.txt Tue Mar 28 15:19:23 2017 6

Perils of Macros Making Strings and Concatenating Tokens ------

#define SQUARE(x) ((x)*(x)) (optional fancy topic) used much later with In a macro definition, the # operator puts quote marks around a macro argument. int x = 0; int y = SQUARE(x++); #define STRINGIFYME(x) # x

What value for x is expected by a programmer STRINGIFYME(owen) expands to "owen" looking only at these lines? The ## operator joins two tokens together. What value does x actually have? #define DECL_FUNC(name) int name ## _xy (int z){\ return(3 + name);}

then DECL_FUNC(owen) ends up defining a function called owen_xy that returns 3 + owen cnotes.txt Tue Mar 28 15:19:23 2017 7 cnotes.txt Tue Mar 28 15:19:23 2017 8

Predefined Macros Conditional Compilation in C ------

__FILE__ and __LINE__ expand to the current src #if ZZ == 1 #ifdef FOO code file name and line number. Good for ...somecode1...... auto-generated warning messages etc. #else #else ...somecode2...... __func__ name of current function (C99) #endif #endif

also #ifndef

Check a Boolean condition that can be evaluated at compile time, or chk definedness

** Determines whether somecode1 or somecode2 is seen by the real compiler

Only one of the 2 sequences is used ** cnotes.txt Tue Mar 28 15:19:23 2017 9 cnotes.txt Tue Mar 28 15:19:23 2017 10

Conditional Compilation part II #include statement ------

Conditional compilation is often used to tailor #include "foo.h" will cause the contents of your the code for a particular models of computer. file foo.h to be inserted in place of the #include statement.

And to ensure that a given "include file" isn’t "copy paste the file here, please" seen too many times (more later) #include is usually used like "import" in Java. It makes available library things.

Most C programs have a line #include

since you can’t print otherwise.

(note angle brackets for system libraries) cnotes.txt Tue Mar 28 15:19:23 2017 11 cnotes.txt Tue Mar 28 15:19:23 2017 12

Simple C program Common C Programming Conventions ------

#include Whereas camelCase is the standard in Java, in C the #define N 5 tradition is to use underscores_to_separate the int g; various parts of a compound identifier.

// C99-style comment: program needs a main() UPPER CASE is conventionally used for preprocessor int main(void) { macros used as constants and some macros with int i; arguments (but other macros-with-arguments get written in lower case, pretending to be functions) g = N; for (i=0; i < g; ++i) printf("Hello world"); } cnotes.txt Tue Mar 28 15:19:23 2017 13 cnotes.txt Tue Mar 28 15:19:23 2017 14

Enums C has no objects ------As in Java, C89 provides enumerated types Methods are called "functions". enum studentkind {Undergrad, Grad}; Equivalent to Java static methods. enum studentkind sk = Undergrad; A program consists of a bunch of functions, segmented into multiple source code files (sometimes called "modules"). Need main().

C allows you to group related data together (just like a class does).

Called a "struct". More later.

If you want C with objects, you want C++ (or Objective C) cnotes.txt Tue Mar 28 15:19:23 2017 15 cnotes.txt Tue Mar 28 15:19:23 2017 16

Order of declaration matters Example ------/* doubleslashed comments only with C99 */ A function cannot be called before (in source-code order) it has been declared. int foo(int a); /* function prototype */

It can be declared and defined by giving its code double bar(int x, int y) { /* defined */ (like giving code for a static method). return foo( 2*x*y); /* uses foo */ } Or it can just be declared by giving its parameters .... and return type. Called a "function prototype". int foo(int a) { int x; A full definition must then be given afterward if (a==0) return bar(a,a); else return 27; } cnotes.txt Tue Mar 28 15:19:23 2017 17 cnotes.txt Tue Mar 28 15:19:23 2017 18

Library header files Void parameter list ------A function with no parameters uses void in its One of the main things contained in library parameter list. header files (eg, ) are the function prototypes for the routines that int foobar(void) {.....} the library makes available. vs Java’s int foobar() {....} (Often also some constant declarations and data type (struct) definitions)

So contains prototypes for printf scanf various file I/O routines cnotes.txt Tue Mar 28 15:19:23 2017 19 cnotes.txt Tue Mar 28 15:19:23 2017 20

Order of Parameter Evaluation Static Functions and Extern Variables ------For functions with multiple parameters, it is If you use the qualifier "static" when defining undefined in which order the parameters are a function, it will not be exported for linking, evaluated. which is the default.

{ int i=0; Keyword extern is used when a variable is being foo(i++,i++); imported from another module. One of the } modules needs you may have called foo(0,1) or foo(1,0). extern int v; int v; Do not rely on any particular behaviour. to define the value and pass to the linker. Or use extern int v=23; cnotes.txt Tue Mar 28 15:19:23 2017 21 cnotes.txt Tue Mar 28 15:19:23 2017 22

Stack Frames Widening and Narrowing Rules ------Compilers generate stack frames according to the approved calling conventions for that system. Widening: conversion of a datatype to one that that can handle all the original values (and maybe more) Look at the Resources section: ARM Stack (eg, short to int) Frame Layout (Windows CE 5.0). Narrowing: conversion to a more restricted datatype. More later about "alloca()". Risk a value won’t be representable anymore.

Appears consistent with AAPCS. In Java, you need explicit typecasts more often than you do in C. Note the "Frame Pointer". Some calling conventions use frame pointers (aka BP); others don’t. This one int i = 3.14; //narrowing, Java needs typecast requires FP in only some cases. float f = 3; // widening, Java is also ok with this

my_arg_should_be_float(3.14); // narrowing, Java cast cnotes.txt Tue Mar 28 15:19:23 2017 23 cnotes.txt Tue Mar 28 15:19:23 2017 24

Promotion Simple IO in C ------If i is an int and l is a long, i*l is same as (long)i * l (same as Java) printf for screen output (note: Java now has printf!) also putchar What if i is an int and u is an unsigned int. scanf and getchar for keyboard input What is i*u? printf takes first a "format string", then a Ans: same as (unsigned int)i * u variable # of things No such issue in Java, since no unsigned... eg printf("Hello world!"); How about if c is a signed character and u is printf("value of x=%d, and y=%s", an unsigned int? x, my_str); c*u is same as ( (unsigned int) ( (int) c))* u scanf uses a format string too ---> assuming 2’s complement, using sign extension scanf("%d %d %d", &x, &y, &z);

Note use of "address-of" operator & Reason will be explained later. cnotes.txt Tue Mar 28 15:19:23 2017 25 cnotes.txt Tue Mar 28 15:19:23 2017 26

Getchar and the EOF bug I/O Buffering ------

Function getchar() returns a character from the On many systems, input is buffered: nothing happens standard input stream (normally, the keyboard). until a newline is hit. (Until then you can edit your keyboard input). If there are no more characters (standard input has closed), it returns the value EOF. And many systems buffer output: nothing actually gets sent to the screen (or to a file) until a getchar() actually returns an int. EOF is usually newline is sent, or more than some number of kb -1. of data is ready to send.

If getchar’s output has been typecast to char, Buffered I/O is good for system efficiency but hard it will never be equal to EOF. on programer heads.

So getchar() into an int. After checking for EOF, then typecast (or assign) to a char. cnotes.txt Tue Mar 28 15:19:23 2017 27 cnotes.txt Tue Mar 28 15:19:23 2017 28

Data Types Important Difference with Java ------

Basic types are sorta similar to Java: C was originally envisioned as a platform dependent int, double, float, long, short, char language. integer types can be qualified with "unsigned". So an int means different things on different int A[57]; vs Java int [] A = new int[57]; platforms. (Is it 16 bits? 32 bits? 64 bits?)

Before C99, the array size must be known C can be used on systems that don’t use 2’s complement at compile time. short must be able to represent -32767 to 32767. Before C99, no boolean type. Now there is int must be able to represent -32767 to 32767. bool, if you #include . long: -2147483647 to 2147483647

But ranges can be larger. However, range of short cannot exceed range of int, which cannot exceed range of long. cnotes.txt Tue Mar 28 15:19:23 2017 29 cnotes.txt Tue Mar 28 15:19:23 2017 30

Fixed-Width Integer Types Variable scope ------global variables It’s hard to do some low-level programming without are like object variables. Except the whole program knowing bit width. So C99 adds header file is essentially one object... visible to all functions in the source code Defines data types like int32_t, uint64_t etc. where declared. int32_t is guaranteed to give 32 bits range local variable same as Java .... except that you can qualify with "static", There are macros to declare constants of these same scope, but now allocated as with the globals. types. Slightly tamed global variable, sometimes useful.

Eg, UINT64_C(12345) it is okay to use an uninitialized local: your problem if the garbage chokes you. cnotes.txt Tue Mar 28 15:19:23 2017 31 cnotes.txt Tue Mar 28 15:19:23 2017 32

Strings Const ------

In C, strings are just null-terminated char arrays. Types can be qualified with const. Just like in assembly language. Similar to some uses of "final" in Java.

Libraries have many convenient string functions. This indicates things whose values cannot be changed (after initialization). Join strings with strncat. No "+" operator, sadly. eg const int months_per_year = 12; Hint: Linux command man 3 strncat tells you all about the strcat function, Using these effectively can be annoying, as the which is in section 3 (C library) of manuals. compiler may force you to do a lot of typecasting when you insert one or two uses. cnotes.txt Tue Mar 28 15:19:23 2017 33 cnotes.txt Tue Mar 28 15:19:23 2017 34

Volatile The Harder Parts of C ------Types can also be qualified with volatile. (Same keyword with similar meaning exists in Java * Pointers. Lower level access to memory than Java but is not widely used) Chapter from K&R in D2L. Read carefully. * Memory Management. Lower level. Volatile means that the compiler cannot assume Read wikibook section. that the value won’t change, mysteriously. * How nontrivial programs are segmented into modules (eg, value coming from a hardware device or value Read my notes on the website that another thread could be messing with) volatile int v; .... while (v) {...something...} is NOT equivalent to if (v) while (true) {....something...} cnotes.txt Tue Mar 28 15:19:23 2017 35 cnotes.txt Tue Mar 28 15:19:23 2017 36

Pointers Pointers vs Java’s Object References ------A C pointer is a slightly higher level view of a memory address than an assembler programmer would In Java, an object reference is very much have. like a pointer, except you are very limited in what you can do with it. A C pointer has a memory address, but the compiler also knows the type of the thing that is in memory at that address. More importantly, the compiler knows how many memory addresses are taken up by each type.

Aside: you can find this out with sizeof operator. cnotes.txt Tue Mar 28 15:19:23 2017 37 cnotes.txt Tue Mar 28 15:19:23 2017 38

C Pointers The * operator ------You dereference a pointer using the unary * Pointer values can be operator. * stored in variables * passed in parameters * involved in weird pointer expressions * incremented * dereferenced

Dereferencing means getting the value that is in the pointed-to memory address. It is "following the pointer" cnotes.txt Tue Mar 28 15:19:23 2017 39 cnotes.txt Tue Mar 28 15:19:23 2017 40

Declaring a Pointer Variable Const and Pointer Declarations ------int *foo; <-- declares a pointer to an int const int *p; <-- p cannot be used to change the float *bar; <-- bar points to a float pointee int **baz <-- variable baz stores a pointer p can be reassigned to point to to a memory location that itself other constant integers contains a pointer (to an int) int * const q; <-- q points to an integer Picture on board is needed. q can be used to alter the pointee q cannot be reassigned.

const int * const r <-- r points to an integer r cannot be reassigned, nor can r be used to change pointee cnotes.txt Tue Mar 28 15:19:23 2017 41 cnotes.txt Tue Mar 28 15:19:23 2017 42

Const pointers and typecasting Address-of Operator ------void foo( int *p) {...do stuff with p...} Given a variable, you can get a pointer to it using unary & operator, which we sometimes call int bar( const int *q){... the "address of" operator. foo(q);... } int x = 5; int *y = &x; You get an error or warning about implicitly stripping (*y++)++; the const-ness from q when handing it to foo. int *z; z = y; /* pointer assignment */ You _should_ make int foo (const int *p) if foo can guarantee this or (Actually & can be applied to any expression int bar (int *q) if foo cannot. that yields an "lvalue". An lvalue is something that, except for possible "constant-ness" or do explicit cast foo( (int *) q) to shut up restrictions, could be assigned to. compiler but get officially undefined behaviour. More or less.) cnotes.txt Tue Mar 28 15:19:23 2017 43 cnotes.txt Tue Mar 28 15:19:23 2017 44

Operator Precedences Pointers as Parameters ------The * and & operators have fairly high precedence. void swap(int *x, int *y) { So *x+3 means (*x) + 3, not *(x+3). int temp = *x; *x = *y; But watch out, because ++ has equal precedence. *y = temp; } So you must write (*x)++ instead of *x++ because both * and ++ are right associative. int a=5, b=6; swap(&a, &b);

vs the simpler but broken

void swap(int x, int y) { int temp = x; x=y; y=temp; }

swap(a,b); cnotes.txt Tue Mar 28 15:19:23 2017 45 cnotes.txt Tue Mar 28 15:19:23 2017 46

Simulating Output Parameters Generic Pointers ------Some languages let a method communicate back Sometimes you need a pointer for which the to its caller by changing the parameters. type of the pointed-to information is unknown Not Java, and not C. or temporarily irrelevant. But in C, you can simulate this via pointers. C provides the "void pointer" for this. You The swap routine did this. Handy when you want cannot do much with a void pointer - eg, you a method to "return several values". cannot dereference it.

You can typecast a void pointer to any other pointer type, though.

void *foobar; cnotes.txt Tue Mar 28 15:19:23 2017 47 cnotes.txt Tue Mar 28 15:19:23 2017 48

Malloc Pointer Fun with Binky ------Pointers can be to blocks of memory in Let’s look at a classic claymation video on C * the stack pointers. (Good for your inner 11-year-old.) * the global uninitialized area * the global initialized area * the "heap"

The runtime heap corresponds to the pool of memory Java uses when you use the "new" operator.

In C, you can ask for a block of heap memory using the malloc() library function. Argument is the number of bytes you want. Result returned as a (void *) that you can typecast to the desired ptr type.

Any memory taken with malloc() should eventually be returned with free(). C has no GC. cnotes.txt Tue Mar 28 15:19:23 2017 49 cnotes.txt Tue Mar 28 15:19:23 2017 50

More on the Heap Memory Layout ------The heap is a region of memory, where chunks of this So we have the stack, which occupies no set amount memory get handed out on demand. of space. It grows and shrinks dynamically. Demand for C: malloc(). For Java: "new". And we now have the heap, which also grows (and Later, some of these chunks may become reuseable, maybe shrinks) dynamically. because a C programmer called free(). The heap starts out at some size. But eventually The processor has a limited number of memory a malloc() may find there is no free block that addresses. We’d like to be able to support almost is big enough. all addresses for the stack (highly recursive pgm) or almost all addresses for the heap (malloc-intensive). Maybe all memory is in use. Maybe not, but only small blocks are available. Classic approach: Stack starts at a very high address and grows down; heap starts at a small address and When this happens, the heap needs to grow. It grows up. As long as the top of the stack doesn’t expands dynamically as needed. move into the heap area, we are okay. cnotes.txt Tue Mar 28 15:19:23 2017 51 cnotes.txt Tue Mar 28 15:19:23 2017 52

Other Memory Areas Putting Code into RAM ------

Other possible regions of memory are for Although a single-purpose controller can have its * your machine code program in ROM or flash memory, if you want to have * your global initialized variables an OS load various apps or programs on demand, you * your global uninitialized variables will need for all memory areas to be in RAM. * constants Putting code into RAM enables a historical horror [Draw picture on board] called self-modifying code. The program can overwrite some of its own machine code while it is Machine-code and constants can be in read-only running. Commonly write NOP (no-op) instructions or flash memory, whereas global variables, on top of things you want to disable. Or, write stack and heap need to be in RAM. Constants can BL instructions into strategically placed NOPs in be a part of the global initialized region, or the code. (as with our ARM examples) in with the machine code.

Many calling conventions (dunno about AAPCS) dedicate registers to point at the bases of the global variable or constants regions. Facilitate use of the CPU’s pre- or post-indexed addressing mode cnotes.txt Tue Mar 28 15:19:23 2017 53 cnotes.txt Tue Mar 28 15:19:23 2017 54

Evils of Self-Modifying Code Typedef ------Declaring C types can be tricky. + Every software engineering principle known to man is probably violated. You cannot easily tell Nice to know about typedef keyword, which lets you what the code is going to do by inspecting the declare an alias for any type. source code - what happens at runtime is determined by earlier stuff that happened at Put the keyword in front of what would otherwise be runtime. a variable declaration of a given type. And your "variable" is now a new type name. + Hardware designers can boost performance if they can assume that the machine codes are unchanging eg typedef int *my_int_ptr; at the time the program starts running. (In 3813, my_int_ptr pi; you might see that this kind of assumption can my_int_ptr *ppi; <-- pointer to ptr to integer simplify the "instruction cache" design.) Typedef makes an alias, so there is no error if you do int *foo = pi; <-- foo and pi are both (int *) cnotes.txt Tue Mar 28 15:19:23 2017 55 cnotes.txt Tue Mar 28 15:19:23 2017 56

Arrays in C Bounds checking ------int A[20], B[20][30]; <-- declare a 1D and 2D array In Java, all array accesses are checked to see whether there is a bounds error. 1D case is similar to Java int [] A = new int[20]; Not the case in C, where you will read junk and write corruption... However, with C if the declaration is outside of any function (or is in a function, but prefixed with C11 may help with this. "static"), the bytes are allocated in the same memory region as other uninitialized global variables (which will be set to 0, same as Java).

A non-static array as a local variable takes up space in the activation record. Different from Java. If uninitialized, its value is weird junk, unlike Java. cnotes.txt Tue Mar 28 15:19:23 2017 57 cnotes.txt Tue Mar 28 15:19:23 2017 58

Array Initializers Pointers and Arrays ------Similar to Java, C arrays can be initialized Pointers and arrays - very closely related in C. int a[] = {3,1,4,1,5}; int A[10]; <-- similar to Java int [] A = new int[10]; char *names[] = {"bob","penny","yulin"}; int *pa = A; <-- allowed A = pa; <-- not allowed, array name is constant int b[2][2]={ {1,2},{3,4} }; *pa = 5; <-- same as A[0] = 5 If a global, it will be stored in a memory region with other initialized globals (different region than the *(pa+i) is same as A[i] or pa[i] uninitialized globals). cnotes.txt Tue Mar 28 15:19:23 2017 59 cnotes.txt Tue Mar 28 15:19:23 2017 60

Pointer Arithmetic Pointer Comparisons ------To make it so that *(pa+i) means same as pa[i], If p and q point to elements of the same array we need that pa+i points to the address of the (or fields of the same object), then it is allowed ith element of the array. If each array element to compare them for ==, !=, >, >= etc. occupies k bytes, we need the memory address of (pa+i) to be i*k bytes after the address of pa. The integer 0 is used as NULL in C.

So: What does while ( *p++){} do? In earlier versions C, pointers are integers were somewhat interchangeable. Not any more. However, If pa and pb are pointers to different parts of 0 is special. the same array, you are allowed to subtract them. Result is the number of array elements between It is okay to do a pointer comparison between a them. pointer within an array and a pointer that is ONE past the end of an array.

0 (as a pointer or integer) is equivalent to false in C.

Hence if ( p && *p == ’a’) ..... or if ( p != 0 && *p == ’a’).... cnotes.txt Tue Mar 28 15:19:23 2017 61 cnotes.txt Tue Mar 28 15:19:23 2017 62

String Copy Malloc, Calloc, Realloc and Free ------void strcpy( char *s, const char *t) { while (*s++ = *t++) Malloc allocates (uninitialized) memory on the heap. ; } So does calloc(n,s) where n is the number of items and s is the size of each. All bytes zeroed. vs realloc(p, s) attempts to enlarge the malloc’ed area void strcpy(char *s, const char *t) { at p to size s. If this can’t be done, it mallocs int i = 0; a new area of size s and copies the data at p over while (s[i] = t[i]) to it. Returns a pointer, regardless i++; } free(p) releases the previously allocated memory. It can then be reused.

Forget to release: "memory leak" Release twice: heap corruption. cnotes.txt Tue Mar 28 15:19:23 2017 63 cnotes.txt Tue Mar 28 15:19:23 2017 64

Valgrind and GCC options Alloca ------

Manual memory management is the source of much The alloca(s) function is like malloc() except efficiency and many tears. that it generates space in the stack frame.

Software tools exist that check for memory leaks, heap Can be more efficient, but its use is discouraged. corruption and other common errors. One is "valgrind". Space created with alloca is automatically recovered when the function returns. Within the last year or two, the GNU C compiler suite has added options that let you check for DO NOT CALL FREE on space gotten with alloca similar corruption and also use of uninitialized values.

-Wuninitialized -Wmaybe-uninitialized -fsanitize variations such as -fsanitize=address or -fsanitize=undefined cnotes.txt Tue Mar 28 15:19:23 2017 65 cnotes.txt Tue Mar 28 15:19:23 2017 66

Function Pointers Example: Library Quicksort ------Generic comparison-based quicksort. Some of the neat things that object-oriented folks It should be able to take any kind of data, as long as are used to doing - you can accomplish them in C you tell the library. using "function pointers". Library prototype (from stdlib.h): A function pointer is essentially the starting void qsort(void *base, size_t nmemb, size_t size, memory address of a piece of code. The compiler int (*compar)(const void *, const void *)); knows the type of the function (its return type or void qsort(void *base, size_t nmemb, size_t size, and the number and type of its parameters). int compar(const void *, const void *));

Sample declaration: Use: int foo(float bar){...... } #define N 100 int goo(float baz){...... } double my_arr[N];

int (*fptr)(float); int my_compare( const void *p1, const void *p2) { fptr = foo; double d1 = * ((double *) p1); (*fptr)(3.14); double d2 = * ((double *) p2); if (d1 > d2) return +1; if (d2 > d1) return -1; return 0;}

int main() { qsort( my_arr, N, sizeof(double), my_compare);} cnotes.txt Tue Mar 28 15:19:23 2017 67 cnotes.txt Tue Mar 28 15:19:23 2017 68

Alternative with Hairy Typecast ------#include

#define N 100 double my_arr[N];

int my_compare( double *d1, double *d2) { if (*d1 > *d2) return +1; if (*d2 > *d1) return -1; return 0;}

int main() { qsort( my_arr, N, sizeof(double), (int (*)(const void *, const void *) ) my_compare); } cnotes.txt Tue Mar 28 15:19:23 2017 69 cnotes.txt Tue Mar 28 15:19:23 2017 70

Memcpy etc Structs ------A few library routines in allow memory Structs are like Java classes without copying a. inheritance b. methods void *memcpy(void *dest, const void *src, size_t n); copy n bytes; dest and src areas must not overlap And all fields ("members") are public. void *memmove(void *dest, const void *src, size_t n); struct point { <-- "point" is optional, but... less efficient, but overlap is okay float xcoord, ycoord; } point1, point2; char *strncpy(char *dest, const char *src, size_t n); stops copy early if null byte seen. No overlap. point1.xcoord = 23.1;

struct point point3; <-- ..this would not be possible w/o having the point tag

point3.ycoord = 5.4; point2 = point1; <-- copy entire struct cnotes.txt Tue Mar 28 15:19:23 2017 71 cnotes.txt Tue Mar 28 15:19:23 2017 72

Structs and Typedef Structs: Assignments ------

Repeatedly having to type the keyword "struct" when In Java, you have a references to objects. To get an declaring variable/parameters is tedious. actual object, you need "new".

Recall typedef sets up a type alias...so So assignment jRef1 = jRef2 means that there is 1 object, and 2 references to it. typedef struct { <-- didn’t use the "point" tag here jRef1.field = 5 changes jRef2.field. float xcoord, ycoord;} point_t; In C, struct assignment is more like basic type point_t point1, point2, point3; <-- same as before assignment. It is "by value". Declaring the struct variable makes "the object".

point_t point1, point2; <-- 2 different structs point1 = point2; <-- still 2 different structs point1.xcoord - 4.2; <-- no change to point2.xcoord cnotes.txt Tue Mar 28 15:19:23 2017 73 cnotes.txt Tue Mar 28 15:19:23 2017 74

Structs and Functions Pointers to Structs ------point_t foo( point_t pt) { ....} Java object references are actually closer to a C pointer to a struct. point2 = foo(point1); point_t *foo( point_t *pt) {....} Passing point1 as parameter means that a bitwise point_t *pt2; assignment (copy) of point1 is put into the activation record of foo. Subsequent changes made pt2 = foo( &point1); to parameter pt have no effect on point1. Now the activation record of foo has a pointer to the When foo returns, a full assignment is done same struct object that the main program called into point2. "point1" Subsequent changes made to *pt WILL change point1. C structs as parameters and returns: handled more the way that Java handles basic types When foo returns, a pointer is copied. (not how Java handles object types). Not the pointee. cnotes.txt Tue Mar 28 15:19:23 2017 75 cnotes.txt Tue Mar 28 15:19:23 2017 76

Arrow Operator Linked-List-of_Ints Example ------typedef struct nd { int data; struct nd *next } node; Pointers to structs are very common in C. Given point_t *pt = .... node *head = NULL; <-- note capitals a very common operation is to dereference (ie, follow) the pointer to the pointee (a struct), then access for (i=0; i < 10; ++i) { one of the members of the struct. node *temp = (node *) malloc( sizeof node); temp->data = i; ... (*pt).xcoord... temp->next = head; head = temp; Parenthesis needed due to operator precedence. } ... do stuff with list, eg As a shorthand, we have the arrow operator, node *x; pt->xcoord for (x=head; x != NULL; x = x->next) which does the same thing, more elegantly. printf("I see %d\n", x->data); ... when finished, don’t forget to free up memory

while (head != NULL) { node *temp = head->next; free(head); head = temp; } cnotes.txt Tue Mar 28 15:19:23 2017 77 cnotes.txt Tue Mar 28 15:19:23 2017 78

Other Data Structures Incomplete/Undefined Struct types ------

For a couple of years in the 1990s, you would have You are allowed to declare the existence of a taken your intro data structures course in C struct BEFORE the compiler has seen the actual [Electrical Enggs still do], and you’d see stacks, definition of the struct. Useful with parameters queues, BSTs, RedBlackTrees etc in C. if they are pointers to structs. struct BSTNode { struct s z; int key; struct BSTNode *left, *right; int foo(struct s *zz) {} // legal! } int bar(struct s zzz) {} // illegal

int zs() { z.a = 1; } // move down to make legal

struct s { int a, b; }; cnotes.txt Tue Mar 28 15:19:23 2017 79 cnotes.txt Tue Mar 28 15:19:23 2017 80

Unions Type Punning with Unions ------In a struct, the members are laid out in memory, one after another (with possible padding). A union provides a way to refer to a glob of bits as if several different types. A union is basically a struct where all the members Wanna peek at the bits of a float? overlap in their storage space. union {float f; int i;} u; It’s a low-level horror missing from Java. u.f = 3.141592; printf("The exponent field is %d\n", Approved use: you want to store an int or a double, (u.i & 0x7f800000)>>23); but not both at the same time. We are relying on the compiler laying out f and i struct { /* union inside a struct */ both starting at the same address (not padding one int storing_an_int; /* boolean */ and not the other). This may not be technically safe union { int ival; double dval;} u; with C89. But it is apparently safe in C99. } s; (cf Wikipedia page on "type punning") s.storing_an_int = 0; s.u.dval = 3.1415; cnotes.txt Tue Mar 28 15:19:23 2017 81 cnotes.txt Tue Mar 28 15:19:23 2017 82

Type Punning with Pointers Bit fields ------

Can do something similar without a union: struct { unsigned int opcode : 6, Take a pointer to one type. Typecast it to another unsigned int register1: 4, pointer type. Dereference the result. unsigned int register2: 4, unsigned int constant1: 12 Strictly speaking, the result might be "undefined", } my_packed_structure; but most compilers have historically done what you want. This declares opcode to be a 6-bit field that is packed with two 4-bit fields and a 12-bit field; float f = 3.141592; float *pf = &f; my_packed_structure.opcode = 0b110100; int *pi = (int *) pf; my_packed_structure.register1 = 9; int i = *pi; Whether opcode is the most- or least- significant 6 (If ints have more stringent alignment rules on a bits is system dependent, as are most aspects of platform than floats, you could get in trouble here.) bit fields. Good for very low-level un-portable code. cnotes.txt Tue Mar 28 15:19:23 2017 83 cnotes.txt Tue Mar 28 15:19:23 2017 84

File IO More File IO ------fprintf and fscanf allow writing/reading files. getc is like getchar, but for an opened file. Used in conjunction with fopen and fclose fgets returns a line (or a max num of chars) from an opened file. Use a (FILE *) parameter. C has no exceptions. Return codes widely used, with FILE *f = fopen("hello.txt","w"); a global variable called "errno". fprintf(f, "Hello world"); fflush(f); /* optional */ Things that would be void in Java return an int in C. fclose(f); One int value means "it worked". Others tell you how it failed. cnotes.txt Tue Mar 28 15:19:23 2017 85 cnotes.txt Tue Mar 28 15:19:23 2017 86

Interpreting Errno Another EOF check ------For an opened file, use feof() to check whether A function perror looks at errno and prints an error end-of-file has been reached. message to the "standard error" stream. cnotes.txt Tue Mar 28 15:19:23 2017 87 cnotes.txt Tue Mar 28 15:19:23 2017 88

Additional Features for Low-Level Programming Intrinsics ------(cf Wikipedia: Intrinsic_function) While maybe not part of the C language definition, major C compilers (GCC, Intel’s, CLANG, Visual C) An intrinsic function is syntactically a library tend to have similar support for function. However, the compiler will have some additional features for low-level programming. baked-in magic when it sees this "function" used, and it will usually compile the intrinsic to a * Intrinsic functions straight-line code sequence that makes use of some * Inline assembly particular special CPU instruction.

Intrinsics are slightly higher level and I’d Intrinsics make available special CPU capabilities consider them more "modern". while still appearing to be library function calls.

Intrinsics and inline assembly can be contrasted with just writing some subroutines in assembly, and then linking the resulting object code with object code resulting from compiling C source. cnotes.txt Tue Mar 28 15:19:23 2017 89 cnotes.txt Tue Mar 28 15:19:23 2017 90

Portable and Non-Portable Intrinsics Embedded/inline assembly ------For details: You can’t use an Intel x86 intrinsic when compiling http://www.ethernut.de/en/documents/arm-inline-asm.html for ARM, and vice versa, unless both platforms have the same capability. But some builtins are portable. In case you feel the need to break into assembly language in the midst of some C, you can use Here’s one that might be pretty portable: "inline assembly". That way, you can use some int __builtin_parity (unsigned int x); super-duper new machine instruction that does not yet have an intrinsic function. Here’s one that generates a particular fancy AVX2 instruction for AVX2. So only recent AMD/Intel for (i = 0; i < 10; ++i) { platforms: asm( "mov r0, r0\n\t" v32qi __builtin_ia32_pmaddubsw256 (v32qi,v32qi) "mov r0, r0\n\t" "mov r0, r0\n\t" "mov r0, r0" ); }

[Note: a decent will throw away all the code written above...] cnotes.txt Tue Mar 28 15:19:23 2017 91 cnotes.txt Tue Mar 28 15:19:23 2017 92

Compilers and inline assembly Example (from earlier reference) ------’ ------Early compilers supporting inline assembly would throw up their hands when encountering a glob of inline unsigned long ByteSwap(unsigned long val) assembly. They would assume it could do anything. { Any value in a register or memory location could asm volatile ( possibly be trashed. "eor r3, %1, %1, ror #16\n\t" "bic r3, r3, #0x00FF0000\n\t" So they did very conservative and inefficient things "mov %0, %1, ror #8\n\t" to compensate (eg, save all registers before, restore "eor %0, %0, r3, lsr #8" all after). : "=r" (val) : "0"(val) Nowadays, you need to declare information about : "r3" what are inputs to, output from the glob, and what ); gets clobbered by the glob. return val; } And then the compiler’s optimizer will feel free to rearrange/discard code in your glob, just like it declared to clobber r3. Compiler is free to choose feels free to rearrange code that its earlier phases which register will be %1, which will be initialized generated. to val. %0 is declared to be compiler-chosen register that will be the output (which is to be put into val) Details are too hairy for this course. cnotes.txt Tue Mar 28 15:19:23 2017 93 cnotes.txt Tue Mar 28 15:19:23 2017 94

Inline and Embedded Assembly in Keil C Compiler’s Symbol Table ------

Your textbook’s Chapter 18 is about mixing assembler Just like an assembler maintains a symbol table, so and C with the Keil tools. does a C compiler.

In inline assembly, you cannot refer to any specific For an identifier, it will track such things as data register - compiler chooses for you. * name * type (int? (int *)?, function?) In embedded assembly you write a C header for a * scope (local? global-uninitialied? global-init?) function and then write its body in assembly. * qualifiers (const?, volatile?) * location relative to Crossware would have its own way of doing things. + base of its global area + stack pointer or frame pointer

[example] cnotes.txt Tue Mar 28 15:19:23 2017 95 cnotes.txt Tue Mar 28 15:19:23 2017 96

Other C99 Features Other C11 Features ------* restrict qualifier: "while this pointer exists, * alignment control (aligned_alloc() etc) it is the _only_ way to reach the things that it points at". Helps compiler. * support for multithreaded programming

* inline keyword. Functions declared inline as a * options for bounds checking on arrays hint to the compiler that it can insert their bodies in place of their calls. (Compilers can inline other functions anyway and can ignore inlining requests...)

* long long int (at least 64 bits long) %lli

* complex numbers (in various precisions)

* variable sized arrays