Lecture 08 Program Executable File Formats
Total Page:16
File Type:pdf, Size:1020Kb
Lecture 08 Compiling a program Compiler structure CS 50011: Introduction to Systems II Static vs. dynamic linking Lecture 8: Compiling and Linking Prof. Jeff Turkstra © 2017 Dr. Jeffrey A. Turkstra 1 © 2017 Dr. Jeffrey A. Turkstra 2 Program Some slides by Prof. Gustavo File in a particular format containing Rodriguez-Rivera necessary information to load an application into memory and execute it Often time part of this is split off into the “loader” and libraries Programs include: Machine instructions Initialized data List of library dependencies List of memory sections © 2017 Dr. Jeffrey A. Turkstra 3 List of values© determined2017 Dr. Jeffrey A. Turkstra at load time 4 Executable file formats ELF Number of formats File header ELF – Executable Link File Magic number Used on most *NIX systems Version COFF – Common Object File Format Target ABI Windoze ISA a.out – Used in BSD (Berkeley Standard Entry point Distribution) and early UNIX Pointers to Not usually used anymore Program header BSD UNIX and AT&T UNIX are Section header predecessors to modern *NIXes etc © 2017 Dr. Jeffrey A. Turkstra 5 © 2017 Dr. Jeffrey A. Turkstra 6 Program header Section header How to create the process image Type (data, string, notes, etc Segments Flags (writable, executable, etc Types Virtual address Flags Offset in file image File offset Size Virtual address Size in file Alignment Size in memory © 2017 Dr. Jeffrey A. Turkstra 7 © 2017 Dr. Jeffrey A. Turkstra 8 Building a program readelf –headers /bin/ls Start with source code objdump -p, -h, -t hello.c Prepocessor Compiler Assembler Linker © 2017 Dr. Jeffrey A. Turkstra 9 © 2017 Dr. Jeffrey A. Turkstra 10 Preprocessor When a .c file is compiled, it is first scanned and modified by a preprocessor before being handed to the real compiler Finds lines beginning with #, hides them from the compiler, or takes some action #include, #define #ifdef, #else, #endif © 2017 Dr. Jeffrey A. Turkstra 11 © 2017 Dr. Jeffrey A. Turkstra 12 Can do math Parentheses around substitution #if (FLAG % 4 == 0) || (FLAG == 13) variables Macros #define ABS(x) ( (x) < 0 ? -(x) : (x) ) #define INC(x) x+1 No semi-colon Have to be careful #define ABS(x) x < 0 ? -x : x ABS(B+C) © 2017 Dr. Jeffrey A. Turkstra 13 © 2017 Dr. Jeffrey A. Turkstra 14 Why macros? Lots of other tricks Run time efficiency printf(“The date is %s\n”, __DATE__); No function call overhead Passed arguments can be any type Most preprocessor features are used #define MAX(x,y) ( (x) > (y) ? (x) : (y) ) for large/advanced software Works with ints, floats, doubles, even development practices chars © 2017 Dr. Jeffrey A. Turkstra 15 © 2017 Dr. Jeffrey A. Turkstra 16 Compiler? gcc -E * http://www.cs.montana.edu/~david.watson5/ © 2017 Dr. Jeffrey A. Turkstra 17 © 2017 Dr. Jeffrey A. Turkstra 18 Compiler? int main() { Scanning or lexical analysis int i = getint(), j = getint(); Groups program into tokens Token = smallest meaningful unit of a while (i != j) { program if (i > j) i = i - j; Parsing or syntax analysis else j = j - i; Create a parse tree } Shows how tokens “fit together” putint(i); Context-free grammar } © 2017 Dr. Jeffrey A. Turkstra 19 © 2017 Dr. Jeffrey A. Turkstra 20 Semantic analysis Determines/discovers meaning Builds symbol table Builds syntax tree © 2017 Dr. Jeffrey A. Turkstra 21 © 2017 Dr. Jeffrey A. Turkstra 22 Code generation gcc -c Traverse symbol table and syntax tree nm -v Generate loads, stores, arithmetic ops, tests, branches, etc © 2017 Dr. Jeffrey A. Turkstra 23 © 2017 Dr. Jeffrey A. Turkstra 24 Assembler Libraries Discussed in architecture lecture Libraries are just collections of object gcc -S files Internal symbols are indexed for fast lookup by the linker Searched for symbols that aren’t defined in the program Symbol found, pull it into executable (static) Otherwise include a pointer to the file, loaded by loader © 2017 Dr. Jeffrey A. Turkstra 25 © 2017 Dr. Jeffrey A. Turkstra 26 Statically linked Dynamically linked Faster, to a degree More complexity Portable Easy to upgrade libraries Larger binaries Vulnerabilities Fixed version, no updates Have to manage versions Loader re-links every time program is executed readelf --dynamic /bin/ls ldd /bin/ls © 2017 Dr. Jeffrey A. Turkstra 27 © 2017 Dr. Jeffrey A. Turkstra 28 Interpreter Lazy binding readelf --headers /bin/ls Binding a function call to a library can be expensive Have to go through code and replace the symbol with its address Delay until the call actually takes place Calls stub PLT function Invokes dynamic linker to load the function into memory and obtain real address Rewrites address that the sub code references Only happens once Procedure Lookup Table (PLT) © 2017 Dr. Jeffrey A. Turkstra 29 © 2017 Dr. Jeffrey A. Turkstra 30 Makefile gcc -o Simple way to help organize code nm compilation gcc -o hello hello.c somefunc.c -I. © 2017 Dr. Jeffrey A. Turkstra 31 © 2017 Dr. Jeffrey A. Turkstra 32 hello: hello.c hellofunc.c CC=gcc gcc -o hello hello.c hellofunc.c -I. CFLAGS=-I. DEPS = hello.h Or... CC=gcc %.o: %.c $(DEPS) CFLAGS=-I. $(CC) -c -o $@ $< $(CFLAGS) hello: hello.o hellofunc.o hello: hello.o hellofunc.o gcc -o hello hellomake.o hellofunc.o -I. $(CC) -o hello hello.o hellofunc.o -I. © 2017 Dr. Jeffrey A. Turkstra 33 © 2017 Dr. Jeffrey A. Turkstra 34 CC=gcc IDIR =../include CC=gcc CFLAGS=-I. CFLAGS=-I$(IDIR) DEPS = hellomake.h ODIR=obj LDIR =../lib OBJ = hellomake.o hellofunc.o LIBS=-lm _DEPS = hellomake.h %.o: %.c $(DEPS) DEPS = $(patsubst %,$(IDIR)/%,$(_DEPS)) $(CC) -c -o $@ $< $(CFLAGS) _OBJ = hellomake.o hellofunc.o OBJ = $(patsubst %,$(ODIR)/%,$(_OBJ)) hellomake: $(OBJ) $(ODIR)/%.o: %.c $(DEPS) gcc -o $@ $^ $(CFLAGS) $(CC) -c -o $@ $< $(CFLAGS) hellomake: $(OBJ) © 2017 Dr. Jeffrey A. Turkstra 35 © 2017 Dr. Jeffrey A. Turkstra 36 gcc -o $@ $^ $(CFLAGS) $(LIBS) .PHONY: clean clean: rm -f $(ODIR)/*.o *~ core $(INCDIR)/*~ Questions? .PHONY: clean clean: rm -f $(ODIR)/*.o *~ core $(INCDIR)/*~ © 2017 Dr. Jeffrey A. Turkstra 37 © 2017 Dr. Jeffrey A. Turkstra 38 .