Linkers and Loaders
CS 167 VI–1 Copyright © 2008 Thomas W. Doeppner. All rights reserved.

Does Location Matter?

    int main(int argc, char *argv[]) {
        return(argc);
    }

    main:
        pushl %ebp            ; push frame pointer
        movl %esp, %ebp       ; set frame pointer to point to new frame
        movl 8(%ebp), %eax    ; put argc into return register (eax)
        movl %ebp, %esp       ; restore stack pointer
        popl %ebp             ; pop stack into frame pointer
        ret                   ; return: pops end of stack into eip

The material in this slide through slide 14 is taken directly from the textbook.

Location Matters …

    #include <stdio.h>

    int X=6;
    int *aX = &X;

    int main( ) {
        void subr(int);
        int y=X;
        subr(y);
        return(0);
    }

    void subr(int i) {
        printf("i = %d\n", i);
    }

We don’t need to look at the assembler code to see what’s different about this program: the machine code produced for it can’t simply be copied to an arbitrary location in our computer’s memory and executed. The location identified by the name aX should contain the address of the location containing X. But since the address of X will not be known until the program is copied into memory, neither the compiler nor the assembler can initialize aX correctly. Similarly, the addresses of subr and printf are not known until the program is copied into memory; again, neither the compiler nor the assembler would know what addresses to use.

A Slight Revision

main.c:

    extern int X;
    int *aX = &X;

    int main( ) {
        void subr(int);
        int y = *aX;
        subr(y);
        return(0);
    }

subr.c:

    #include <stdio.h>

    int X;

    void subr(int i) {
        printf("i = %d\n", i);
    }

Build command:

    gcc -o prog main.c subr.c
main.s

     0: .data                      ; what follows is initialized data
     0: .globl aX                  ; aX is global: it may be used by others
     0: aX:
     0:     .long X
     4:
     0: .text                      ; offset restarts; what follows is text (read-only code)
     0: .globl main
     0: main:
     0:     pushl %ebp             ; save the frame pointer
     1:     movl %esp,%ebp         ; point to current frame
     3:     subl $4,%esp           ; make space for y on stack
     6:     movl aX,%eax           ; put contents of aX (the address of X) into eax
    11:     movl (%eax),%eax       ; put *aX into eax
    13:     movl %eax,-4(%ebp)     ; store *aX into y
    16:     pushl -4(%ebp)         ; push y onto stack
    19:     call subr
    24:     addl $4,%esp           ; remove y from stack
    27:     movl $0,%eax           ; set return value to 0
    31:     movl %ebp, %esp        ; restore stack pointer
    33:     popl %ebp              ; pop frame pointer
    35:     ret

subr.s

     0: .data                      ; what follows is initialized data
     0: printfarg:
     0:     .string "i = %d\n"
     8:
     0: .comm X,4                  ; 4 bytes in BSS are required for global X
     4:
     0: .text                      ; offset restarts; what follows is text (read-only code)
     0: .globl subr
     0: subr:
     0:     pushl %ebp             ; save the frame pointer
     1:     movl %esp, %ebp        ; point to current frame
     3:     pushl 8(%ebp)          ; push i onto stack
     6:     pushl $printfarg       ; push address of string onto stack
    11:     call printf
    16:     addl $8, %esp          ; pop arguments from stack
    19:     movl %ebp, %esp        ; restore stack pointer
    21:     popl %ebp              ; pop frame pointer
    23:     ret

main.o

    Data:
        Size: 4
        Global: aX, offset 0
        Undefined: X
        Relocation: offset 0, size 4, value: address of X
        Contents: 0x00000000
    bss:
        Size: 0
    Text:
        Size: 36
        Global: main, offset 0
        Undefined: subr
        Relocation: offset 7, size 4, value: address of aX
                    offset 20, size 4, value: PC-relative address of subr
        Contents: [machine instructions]
subr.o

    Data:
        Size: 8
        Contents: "i = %d\n"
    bss:
        Size: 4
        Global: X, offset 0
    Text:
        Size: 24
        Global: subr, offset 0
        Undefined: printf
        Relocation: offset 7, size 4, value: address of printfarg
                    offset 12, size 4, value: PC-relative address of printf
        Contents: [machine instructions]

printf.o

    Data:
        Size: 1024
        Global: StandardFiles
        Contents: …
    bss:
        Size: 256
    Text:
        Size: 12000
        Global: printf, offset 100
        …
        Undefined: write
        Relocation: offset 211, value: address of StandardFiles
                    offset 723, value: PC-relative address of printf
        Contents: [machine instructions]

write.o

    Data:
        Size: 0
    bss:
        Size: 4
        Global: errno, offset 0
    Text:
        Size: 16
        Contents: [machine instructions]

prog

    Text
        main             4096
        subr             4132
        printf           4156
        write           16156
        startup         16172
    Data
        aX              16384
        printfarg       16388
        StandardFiles   16396
    bss
        X               17420
        errno           17680

Shared Libraries

[Figure: Process A and Process B each call printf( ), which is part of stdio.]

Consider the situation shown in the slide: we have two processes, each containing a program that calls printf. Up to this point in our discussion, the two processes have no means for sharing a single copy of printf: each must have its own. If you consider that pretty much every C program calls printf, a huge amount of disk space in the world could be wasted because of all the copies of printf. Furthermore, when each program is loaded into primary memory, a large amount of that memory is wasted on multiple copies of printf. What is needed is a means for programs to share a single copy of printf (as well as other routines). However, sharing of code is not trivial to implement. A big problem is relocation.
The code for printf might well contain relocatable addresses, such as references to global data and other procedures. What makes things difficult is that the code for printf might be mapped into the two processes at different virtual locations.

Relocation and Shared Libraries

1) Prerelocation: relocate libraries ahead of time
2) Limited sharing: relocate separately for each process
3) Position-Independent Code: no need for relocation

If all users of printf agree to load it and everything it references into the same locations in their address spaces, we would have no relocation problem. But such agreement is, in general, hard to achieve. It is, however, the approach used in Windows. A possibility might be for the users of printf to share a single on-disk copy, but for this copy to be relocated separately in each process when loaded. This would allow sharing of disk space, but not of primary storage. Another possibility is for printf to be written in such a way that relocation is not necessary. Code written in this fashion is known as position-independent code (PIC).

Position-Independent Code

[Figure: two processes share a library in which printf( ) is at offset 0 and doprint( ) is at offset 1000.
 Each process calls printf with:    ld r2, r1[printf]
                                    call r2
 Inside printf, doprint is called:  ld r2, r1[doprint]
                                    call r2
 Table pointed to by r1 in one process:        printf 10000, doprint 11000
 Table pointed to by r1 in the other process:  printf 20000, doprint 21000]

Here is an example of the use of position-independent code (PIC). Processes A and B are sharing the library containing printf (note that printf contains a call to another shared routine, doprint), though each has it mapped into a different location. Each process maintains a private table, pointed to by register r1. In the table are the addresses of shared routines, as mapped into the process.
Thus, rather than call a routine directly (via an address embedded in the code), a position-independent call is made: the address of the desired routine is stored at some fixed offset within the table. The contents of the table at this offset are loaded into register r2, and then the call is made via r2.

Linking and Loading on Linux with ELF

• Substitution
• Shared libraries
• Versioning
• Dynamic linking
• Interpositioning

ELF stands for “executable and linking format” and is used on most Unix systems, including Linux, Solaris, FreeBSD, NetBSD, and OpenBSD, but not MacOS X.

Creating a Library

    % gcc -c sub1.c sub2.c sub3.c
    % ls
    sub1.c sub2.c sub3.c
    sub1.o sub2.o sub3.o
    % ar cr libpriv1.a sub1.o sub2.o sub3.o
    % ar t libpriv1.a
    sub1.o
    sub2.o
    sub3.o
    %

Using a Library

    % cat prog.c
    void sub1(void), sub2(void), sub3(void);
    int main() {
        sub1();
        sub2();
        sub3();
    }
    % cat sub1.c
    #include <stdio.h>
    void sub1() {
        puts("sub1");
    }
    % gcc -o prog prog.c -L. -lpriv1

Where does puts come from?

    % gcc -o prog prog.c -L. \
          -lpriv1 -L/lib -lc

Substitution

    % cat myputs.c
    #include <string.h>
    #include <unistd.h>
    int puts(const char *s) {
        write(1, "My puts: ", 9);
        write(1, s, strlen(s));
        write(1, "\n", 1);
        return 1;
    }
    % gcc -c myputs.c
    % ar cr libmyputs.a myputs.o
    % gcc -o prog prog.c -L. -lpriv1 -lmyputs
    %

Shared Libraries

1 Compile program
2 Track down linkages with ld
  – archives (containing relocatable objects) in “.a” files are statically linked
  – shared objects in “.so” files are dynamically linked
3 Run program
  – ld.so is invoked to complete the linking and relocation steps, if necessary