Lecture 08

 Compiling a program  Compiler structure

CS 50011: Introduction to Systems II  Static vs. dynamic linking

Lecture 8: Compiling and Linking

Prof. Jeff Turkstra

© 2017 Dr. Jeffrey A. Turkstra 1 © 2017 Dr. Jeffrey A. Turkstra 2

Program

 Some slides by Prof. Gustavo  File in a particular format containing Rodriguez-Rivera necessary information to load an application into memory and execute it  Often time part of this is split off into the “loader” and libraries  Programs include:  Machine instructions  Initialized data  List of dependencies  List of memory sections  © 2017 Dr. Jeffrey A. Turkstra 3 List of values© determined2017 Dr. Jeffrey A. Turkstra at load time 4

Executable file formats ELF

 Number of formats  File header  ELF – Link File  Magic number  Used on most *NIX systems  Version  COFF – Common Format  Target ABI  Windoze  ISA  a.out – Used in BSD (Berkeley Standard  Distribution) and early  Pointers to  Not usually used anymore  Program header  BSD UNIX and AT&T UNIX are  Section header predecessors to modern *NIXes  etc © 2017 Dr. Jeffrey A. Turkstra 5 © 2017 Dr. Jeffrey A. Turkstra 6

Program header Section header

 How to create the process image  Type (data, string, notes, etc  Segments  Flags (writable, executable, etc  Types  Virtual address  Flags  Offset in file image  File offset  Size  Virtual address   Size in file Alignment  Size in memory

© 2017 Dr. Jeffrey A. Turkstra 7 © 2017 Dr. Jeffrey A. Turkstra 8

Building a program

 readelf –headers /bin/ls  Start with source code  objdump -p, -h, -t  hello.  Prepocessor  Compiler  Assembler 

© 2017 Dr. Jeffrey A. Turkstra 9 © 2017 Dr. Jeffrey A. Turkstra 10

Preprocessor

 When a .c file is compiled, it is first scanned and modified by a preprocessor before being handed to the real compiler  Finds lines beginning with #, hides them from the compiler, or takes some action  #include, #define  #ifdef, #else, #endif

© 2017 Dr. Jeffrey A. Turkstra 11 © 2017 Dr. Jeffrey A. Turkstra 12

 Can do math  Parentheses around substitution  #if (FLAG % 4 == 0) || (FLAG == 13) variables  Macros #define ABS(x) ( (x) < 0 ? -(x) : (x) )  #define INC(x) x+1  No semi-colon  Have to be careful  #define ABS(x) x < 0 ? -x : x  ABS(B+C)

© 2017 Dr. Jeffrey A. Turkstra 13 © 2017 Dr. Jeffrey A. Turkstra 14

Why macros? Lots of other tricks

 Run time efficiency printf(“The date is %s\n”, __DATE__);  No function call overhead  Passed arguments can be any type  Most preprocessor features are used  #define MAX(x,y) ( (x) > (y) ? (x) : (y) ) for large/advanced software  Works with ints, floats, doubles, even development practices chars

© 2017 Dr. Jeffrey A. Turkstra 15 © 2017 Dr. Jeffrey A. Turkstra 16

Compiler?

 gcc -E

* http://www.cs.montana.edu/~david.watson5/

© 2017 Dr. Jeffrey A. Turkstra 17 © 2017 Dr. Jeffrey A. Turkstra 18

Compiler?

int main() {  Scanning or lexical analysis int i = getint(), j = getint();  Groups program into tokens  Token = smallest meaningful unit of a while (i != j) { program if (i > j) i = i - j;  Parsing or syntax analysis else j = j - i;  Create a parse tree }  Shows how tokens “fit together” putint(i);  Context-free grammar }

© 2017 Dr. Jeffrey A. Turkstra 19 © 2017 Dr. Jeffrey A. Turkstra 20

 Semantic analysis  Determines/discovers meaning  Builds symbol table  Builds syntax tree

© 2017 Dr. Jeffrey A. Turkstra 21 © 2017 Dr. Jeffrey A. Turkstra 22

 Code generation  gcc -c  Traverse symbol table and syntax tree nm -v  Generate loads, stores, arithmetic ops, tests, branches, etc

© 2017 Dr. Jeffrey A. Turkstra 23 © 2017 Dr. Jeffrey A. Turkstra 24

Assembler Libraries

 Discussed in architecture lecture  Libraries are just collections of object  gcc -S files  Internal symbols are indexed for fast lookup by the linker  Searched for symbols that aren’t defined in the program  Symbol found, pull it into executable (static)  Otherwise include a pointer to the file, loaded by loader

© 2017 Dr. Jeffrey A. Turkstra 25 © 2017 Dr. Jeffrey A. Turkstra 26

Statically linked Dynamically linked

 Faster, to a degree  More complexity  Portable  Easy to upgrade libraries  Larger binaries  Vulnerabilities  Fixed version, no updates  Have to manage versions  Loader re-links every time program is executed

readelf --dynamic /bin/ls ldd /bin/ls

© 2017 Dr. Jeffrey A. Turkstra 27 © 2017 Dr. Jeffrey A. Turkstra 28

Interpreter Lazy binding

readelf --headers /bin/ls  Binding a function call to a library can be expensive  Have to go through code and replace the symbol with its address  Delay until the call actually takes place  Calls stub PLT function  Invokes dynamic linker to load the function into memory and obtain real address  Rewrites address that the sub code references  Only happens once  Procedure Lookup Table (PLT) © 2017 Dr. Jeffrey A. Turkstra 29 © 2017 Dr. Jeffrey A. Turkstra 30

Makefile

 gcc -o  Simple way to help organize code nm compilation gcc -o hello hello.c somefunc.c -I.

© 2017 Dr. Jeffrey A. Turkstra 31 © 2017 Dr. Jeffrey A. Turkstra 32

hello: hello.c hellofunc.c CC=gcc gcc -o hello hello.c hellofunc.c -I. CFLAGS=-I. DEPS = hello.h Or...

CC=gcc %.o: %.c $(DEPS) CFLAGS=-I. $(CC) -c -o $@ $< $(CFLAGS)

hello: hello.o hellofunc.o hello: hello.o hellofunc.o $(CC) -o hello hello.o hellofunc.o -I. gcc -o hello hellomake.o hellofunc.o -I.

© 2017 Dr. Jeffrey A. Turkstra 33 © 2017 Dr. Jeffrey A. Turkstra 34

CC=gcc  IDIR =../include  CC=gcc CFLAGS=-I.  CFLAGS=-I$(IDIR) 

DEPS = hellomake.h  ODIR=obj  LDIR =../lib

OBJ = hellomake.o hellofunc.o 

 LIBS=-lm

 _DEPS = hellomake.h

%.o: %.c $(DEPS)  DEPS = $(patsubst %,$(IDIR)/%,$(_DEPS)) $(CC) -c -o $@ $< $(CFLAGS)   _OBJ = hellomake.o hellofunc.o

 OBJ = $(patsubst %,$(ODIR)/%,$(_OBJ))

 hellomake: $(OBJ)   $(ODIR)/%.o: %.c $(DEPS) gcc -o $@ $^ $(CFLAGS)  $(CC) -c -o $@ $< $(CFLAGS)

 hellomake: $(OBJ) © 2017 Dr. Jeffrey A. Turkstra 35 © 2017 Dr. Jeffrey A. Turkstra 36  gcc -o $@ $^ $(CFLAGS) $(LIBS)

 .PHONY: clean

 clean:

 rm -f $(ODIR)/*.o *~ core $(INCDIR)/*~

Questions?

.PHONY: clean clean: rm -f $(ODIR)/*.o *~ core $(INCDIR)/*~

© 2017 Dr. Jeffrey A. Turkstra 37 © 2017 Dr. Jeffrey A. Turkstra 38