Kitty Reeves MTWR 10:20-12:25Pm
Total Page:16
File Type:pdf, Size:1020Kb
Summer 2013 CSE2421 Systems1 Introduction to Low-Level Programming and Computer Organization Kitty Reeves MTWR 10:20-12:25pm 1 Compiler Drivers = GCC When you invoke GCC, it normally does preprocessing, compilation, assembly and linking, as needed, on behalf of the user accepts options and file names as operands % gcc –O1 -g -o prog main.c swap.c % man gcc Pre-processor (cpp) .c to .i Compiler (cc1) .i file to .s Compiler Assembler (as) .s to “relocatable” object file .o driver Linker (ld) creates executable object file called a.out as default Loader the unix shell invokes a function in the OS to call the loader Copies the code and data in the executable file into memory Transfers control to the beginning of the program Remember the flow chart of this info that we’ve seen multiple times already? 2 Overview example 3 Compiling, linking… What happens when you use the compiler driver? gcc -O2 -g -o test main.c swap.c Step1 – Preprocessor (cpp) The C preprocessor translates the C source file main.c into an ASCII intermediate file main.i cpp [other args] main.c /tmp/main.i Step2 – C compiler (ccl) Driver runs C compiler cc1 translates main.i into an ASCII assembly language file main.s cc1 /tmp/main.i main.c -O2 [other args] -o /tmp/main.s 4 Compiling, Linking… (cont) Step3 – Assembler (as) driver runs assembler as to translate main.s into relocatable object file main.o as [other args] -o /tmp/main.o /tmp/main.s Same 3 steps are executed for swap.c Step4 – Linker (ld) driver runs linker ld to combine main.o and swap.o into the executable file test ld -o test [sys obj files and args] /tmp/main.o /tmp/swap.o 5 So far… test 6 Compiling, Linking… Executing unix> ./test shell invokes function in OS called loader copy code and data of executable test into memory then transfers control to the beginning of the *main* function 7 Classic Unix Linker is inside of gcc command Loader is part of exec system call Executable image contains all object and library modules needed by program Entire image is loaded at once Every image contains copy of common library routines Every loaded program contain duplicate copy of library routines 8 Object Files Putting it all together: Compiler and assembler generate relocatable object files, including shared object files. Linker generates executable object files. Object formats first Unix systems used a.out format (the term is still used) early System V used Common Object File format (COFF) Linux and other modern unix systems use Executable and Linkable Format (ELF)*** 9 Types of object files (file.o) Object files, created by the assembler and link editor, are binary representations of programs intended to be executed directly on a processor A relocatable object file holds code and data suitable for linking with other object files to create an executable or a shared object file. Each .o file is produced from exactly one source (.c) file An executable object file (default is a.out) holds code and data that can be copied directly into memory then executed. A shared object file (.so file) is a special type of relocatable object file that can be loaded into memory and linked dynamically, at either load time or run time. Load time: the link editor processes the shared object file with other relocatable and shared object files to create another object file Run time: the dynamic linker combines it with an executable file and other shared objects to create a process image 10 Linkage Combining a set of programs, including library routines, to create a loadable image Resolving symbols defined within the set Listing symbols needing to be resolved by loader After the individual source files comprising a program are compiled, the object files are linked together with functions from one or more libraries to form the executable program When the same identifier appears in more than one source file, do they refer to the same entity or two different entities? 11 Static linking What does the static linker do? takes as input a collection of relocatable object files and command-line arguments generates fully linked executable object file that can be loaded and run Linker performs two tasks: symbol resolution: associates each symbol reference with exactly one symbol definition relocation: compiler and assembler generate code and data sections that start at address 0. Linker relocates these sections: associates memory location with each symbol definition, then updates all references to point to these locations. 12 Static linking – What do linkers do? Step 1. Symbol resolution Programs define and reference “symbols” (variables and functions): void swap() {…} // define symbol swap swap(); // reference to a symbol swap int *xp = &x; // define symbol xp, reference x Symbol definitions are stored (by compiler) in a “symbol table” A symbol table is an array of structs Each entry includes name, size, and location of symbol Linker associates each symbol reference with exactly one symbol definition %objdump –d –t hello // -t option gives symbol table 13 Static linking – What do linkers do? Step 2. Relocation Merges separate code and data sections into single sections Relocates symbols from their relative locations in the .o files to their final absolute memory locations in the executable. Updates all references to these symbols to reflect their new positions. 14 Linking Summary Linker – key part of OS Combines object files and libraries into a “standard” format that the OS loader can interpret Resolves references and does static relocation of addresses Creates information for loader to complete binding process Supports dynamic shared libraries 15 Loading Copying the loadable image into memory, connecting it with any other programs already loaded, and updating addresses as needed (in unix) interpreting file to initialize the process address space (in all systems) kernel image is special (own format) 16 From source code to a process Process = Systems 2 17 Static Linking and Loading #include <stdio.h> int main() { printf("Hello World"); return 0; } //HelloWorld.c r 18 Dynamic Loading Routine is not loaded until it is called Better memory-space utilization; unused routine is never loaded. Useful when large amounts of code are needed to handle infrequently occurring cases. 19 Shared Libraries Observation – “everyone” links to standard libraries (libc.a, etc.) Consume space in every executable image every process memory at runtime Would it be possible to share the common libraries? Automatically load at runtime? Libraries designated as “shared” .so, .dll, etc. Supported by corresponding “.a” libraries containing symbol information Linker sets up symbols to be resolved at runtime Loader: Is library already in memory? If so, map into new process space If not, load and then map 20 Run-time Linking/Loading 21 Executable files The OS expects executable files to have a specific format Header info Code locations Data locations Code & data Symbol Table List of names of things defined in your program and where they are defined List of names of things defined elsewhere that are used by your program, and where they are used. 22 Example 23 Example (cont) 24 0000000000000000 <main>: Hello World 0: 55 push %rbp 1: 48 89 e5 mov %rsp,%rbp revisited 4: bf 00 00 00 00 mov $0x0,%edi 9: b8 00 00 00 00 mov $0x0,%eax #include <stdio.h> Object file e: e8 00 00 00 00 callq 13 <main+0x13> int main() { has not 13: b8 00 00 00 00 mov $0x0,%eax printf("Hello World"); been 18: c9 leaveq return 0; linked yet 19: c3 retq } //hello.c % gcc -S -m32 hello.c OR %gcc -c -m32 hello.c then % objdump -d hello.o %gcc -m32 -o hello hello.c % objdump -d hello 00000000004004cc <main>: 4004cc: 55 push %rbp Relocatable 4004cd: 48 89 e5 mov %rsp,%rbp 4004d0: bf dc 05 40 00 mov $0x4005dc,%edi vs 4004d5: b8 00 00 00 00 mov $0x0,%eax Executable 4004da: e8 e1 fe ff ff callq 4003c0 <printf@plt> 4004df: b8 00 00 00 00 mov $0x0,%eax object code 4004e4: c9 leaveq 4004e5: c3 retq 25 ELF - Executable and Linkable Format From wiki: The segments contain information that is necessary for runtime execution of the file, LOADING LINKING while sections contain important data for linking and relocation. Any byte in the entire file can be owned by at most one section, and there can be orphan bytes which are not owned by any section. ELF describes two separate ‘views’ of an executable, a linking view and a loading view Linking view is used at static link time to combine relocatable files Loading view is used at run time to load and execute program 26 ELF Link View Not all sections needed at run time Information used for linking Debugging information Difference between link time and run time 27 Program Linkage Table Stored in the .plt section Allows executables to call functions that aren’t present at compile time Shared library functions (e.g printf()) Set of function stubs Relocations point them to real location of the functions Normally relocated ‘lazily’ 28 ELF Loading View w/ segment types Much simpler view, divides executable into ‘Segments’ Describes Parts of file to be loaded into memory at run time Locations of important data at run time Segments have: A simple type Requested memory location Permissions (R/W/X) Size (in file and in memory) 29 ELF Linking to Loading Semantics of section table (Linking View) are irrelevant in Loading View Section information can be removed from executable Operating system routines to load executable and begin execution 30 Object File Format/Organization The object file formats provide parallel views of a file's contents, reflecting the differing needs of those activities.