
Processes, Execution, and State

3A. What is a Process?
3B. Process Address Space
3Y. Static and shared libraries
3C. Process operations
3D. Implementing processes
3E. Asynchronous Exceptions
3U. User-mode exception handling

What is a Process?

• an executing instance of a program
  – how is this different from a program?
• a virtual private computer
  – what does a virtual computer look like?
  – how is a process different from a virtual machine?
• a process is an object
  – characterized by its properties (state)
  – characterized by its operations


What is "state"?

• the primary dictionary definition of "state" is
  – "a mode or condition of being"
  – an object may have a wide range of possible states
• all persistent objects have "state"
  – distinguishing it from other objects
  – characterizing the object's current condition
• contents of state depend on the object
  – complex operations often mean complex state
  – we can save/restore the aggregate/total state
  – we can talk of a subset of the state

Program vs Process Address Space

(diagram: a program is a load module — an ELF header describing the target ISA and sections, followed by section headers and the compiled code, initialized data, and symbol table sections; a process address space maps shared code, private data, shared libraries, and a private stack across the range 0x00000000–0xFFFFFFFF)

(Address Space: Code Segments)

• load module (output of linkage editor)
  – all external references have been resolved
  – all modules combined into a few segments
  – includes multiple segments (text, data, BSS)
• code must be loaded into memory
  – a virtual code segment must be created
  – code must be read in from the load module
  – map segment into virtual address space
• code segments are read-only and sharable
  – many processes can use the same code segments

(Address Space: Data Segments)

• data too must be initialized in the address space
  – process data segment must be created
  – initial contents must be copied from the load module
  – BSS: segments to be initialized to all zeroes
  – map segment into virtual address space
• data segments
  – are read/write, and process private
  – program can grow or shrink it (with the sbrk syscall)
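To make that last point concrete, here is a minimal sketch using the legacy sbrk interface (most programs grow the data segment indirectly, through malloc); it simply grows the segment by one page and then gives it back:

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        void *old_end = sbrk(0);            /* current end of the data segment */
        if (sbrk(4096) == (void *) -1) {    /* ask the OS to grow it by 4 KB */
            perror("sbrk");
            return 1;
        }
        printf("data segment grew from %p to %p\n", old_end, sbrk(0));
        sbrk(-4096);                        /* shrink it back */
        return 0;
    }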


(Address Space: Stack Segment)

• size of stack depends on program activities
  – grows larger as calls nest more deeply
  – amount of local storage allocated by each procedure
  – after calls return, their stack frames can be recycled
• OS manages the process's stack segment
  – stack segment created at same time as data segment
  – some allocate a fixed-size stack at program load time
  – some dynamically extend the stack as the program needs it
• stack segments are read/write and process private

(Address Space: Shared Libraries)

• static libraries are added to the load module
  – each load module has its own copy of each library
  – program must be re-linked to get a new version
• make each library a sharable code segment
  – one in-memory copy, shared by all processes
  – keep the library separate from the load modules
  – loader loads the library along with the program
• reduced memory use, faster program loads
• easier and better library upgrades


Characteristics of Libraries

• Many advantages
  – Reusable code makes programming easier
  – A single well written/maintained copy
  – Encapsulates complexity … better building blocks
• Multiple bind-time options
  – Static … include in load module at link time
  – Shared … map into address space at exec time
  – Dynamic … choose and load at run-time
• It is only code … it has no special privileges

Shared Libraries

• library modules are usually added to the load module
  – each load module has its own copy of each library
  – this dramatically increases the size of each process
  – program must be re-linked to incorporate a new library
  – existing load modules don't benefit from bug fixes
• make each library a sharable code segment
  – one in-physical-memory copy, shared by all processes
  – keep the library separate from the load modules
  – operating system loads the library along with the program

Advantages of Shared Libraries

• reduced memory consumption
  – one copy shared by multiple processes/programs
• faster program start-ups
  – if it is already in memory, it need not be loaded again
• simplified updates
  – library modules not included in program load modules
  – library can be updated (e.g. new version w/ bug fixes)
  – programs automatically get the new version on restart

Implementing Shared Libraries

• multiple code segments in a single address space
  – one for the main program, one for each shared library
  – each sharable, mapped in at a well-known address
• deferred binding of references to shared libraries
  – applications are linkage edited against a stub library
    • stub module has addresses for each entry point, but no code
    • linkage editor resolves all refs to standard map-in locations
  – loader must find a copy of each referenced library
    • and map it in at the address where it is expected to be


Stub modules vs real shared libraries

• stub module libfoo.a contains only a symbol table, no code:
    0: libfoo.so, shared library (to be mapped in at 0x1020000)
    1: foosub1, global, absolute, 0x1020000
    2: foosub2, global, absolute, 0x1020008
    3: foosub3, global, absolute, 0x1020010
    4: foosub4, global, absolute, 0x1020018
• the program is linkage edited against the stub module, and so believes each of the contained routines to be at a fixed address

Indirect binding to shared libraries

• the real shared object libfoo.so (read-only, at a well-known location) is mapped into the process' address space at that fixed address
• it begins with a redirection (jump) table that effectively gives each entry point a fixed address:
    0x1020000: jmp foosub1
    0x1020008: jmp foosub2
    0x1020010: jmp foosub3
    0x1020018: jmp foosub4

Limitations of Shared Libraries

• not all modules will work in a shared library
  – they cannot define/include static data storage
• they are read into program memory
  – whether they are actually needed or not
• called routines must be known at compile-time
  – only the fetching of the code is delayed 'til run-time
  – symbols known at compile time, bound at link time
• Dynamically Loadable Libraries are more general
  – they eliminate all of these limitations ... at a price

Loading and Binding w/DLLs

(diagram: a call to foo in the read-only code segment goes through a writeable Procedure Linkage Table entry; the first call enters the run-time loader, which loads foo.dll into a new code segment, after which the PLT entry jumps directly to foo)

(run-time binding to DLLs)

• load module includes a Procedure Linkage Table
  – addresses for routines in the DLL resolve to entries in the PLT
  – each PLT entry contains a call to the run-time loader (asking it to load the corresponding routine)
• the first time a routine is called, we call the run-time loader
  – which finds, loads, and initializes the desired routine
  – changes the PLT entry to be a jump to the loaded routine
  – then jumps to the newly loaded routine
• subsequent calls through that PLT entry go directly to the routine

Shared Libraries vs. DLLs

• both allow code sharing and run-time binding
• shared libraries
  – do not require a special linkage editor
  – shared objects obtained at program load time
• Dynamically Loadable Libraries
  – require a smarter linkage editor and run-time loader
  – modules are not loaded until they are needed
    • automatically when needed, or manually by the program
  – complex, per-routine initialization can be performed
    • e.g. allocation of a private data area for persistent local variables
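The "manually by the program" case is what the run-time loader's POSIX interface (dlopen/dlsym/dlclose) exposes. A minimal sketch — the library name libfoo.so and the entry point foo_init are hypothetical:

    #include <dlfcn.h>
    #include <stdio.h>

    int main(void) {
        /* load the module on demand and get back a descriptor (handle) */
        void *handle = dlopen("libfoo.so", RTLD_NOW);
        if (handle == NULL) {
            fprintf(stderr, "dlopen: %s\n", dlerror());
            return 1;
        }

        /* look up an entry point by name and call it */
        int (*foo_init)(void) = (int (*)(void)) dlsym(handle, "foo_init");
        if (foo_init != NULL)
            foo_init();

        dlclose(handle);    /* later: release resources and unload */
        return 0;
    }

(On Linux this is typically linked with -ldl.)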


Dynamic Loading

• DLLs are not merely "better" shared libraries
  – shared libraries are loaded to satisfy static external references
  – DLLs are designed for dynamic binding
• Typical DLL usage scenario
  – identify a needed module (e.g. device driver)
  – call the run-time loader to load the module, get back a descriptor
  – use the descriptor to call the initialization entry-point
  – initialization function registers all other entry points
  – module is used as needed
  – later we can unregister, free resources, and unload

Process Operations: fork

• parent and child are identical:
  – data and stack segments are copied
  – all the same files are open
• code sample:

    int rc = fork();
    if (rc < 0) {
        fprintf(stderr, "Fork failed\n");
    } else if (rc == 0) {
        fprintf(stderr, "Child\n");
    } else {
        fprintf(stderr, "Fork succeeded, child pid = %d\n", rc);
    }


Variations on Process Creation

• tabula rasa – a blank slate
  – a new process with minimal resources
  – a few resources may be passed from the parent
  – most resources opened, from scratch, by the child
• run – fork + exec
  – create a new process to run a specified command
• a cloning fork is a more expensive operation
  – much data and resources to be copied
  – convenient for setting up pipelines
  – allows inheritance of exclusive-use devices

Windows Process Creation

• The CreateProcess() system call
• A very flexible way to create a new process
• Numerous parameters to shape the child
  – name of program to run
  – security attributes (new or inherited)
  – open file handles to pass/inherit
  – environment variables
  – initial working directory
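A minimal sketch of the call, assuming a Windows build environment; the child command line "child.exe arg1" is hypothetical, and the NULL/FALSE arguments simply take the defaults for the options listed above (security attributes, handle inheritance, environment, working directory):

    #include <windows.h>
    #include <stdio.h>

    int main(void) {
        STARTUPINFOA si = { sizeof(si) };
        PROCESS_INFORMATION pi;
        char cmdline[] = "child.exe arg1";   /* hypothetical program to run */

        if (!CreateProcessA(NULL, cmdline, NULL, NULL, FALSE, 0,
                            NULL, NULL, &si, &pi)) {
            fprintf(stderr, "CreateProcess failed (%lu)\n", GetLastError());
            return 1;
        }
        WaitForSingleObject(pi.hProcess, INFINITE);   /* wait for the child */
        CloseHandle(pi.hThread);
        CloseHandle(pi.hProcess);
        return 0;
    }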


Process Forking

• The way Unix/Linux creates processes
  – child is a clone of the parent
  – the classical Computer Science fork operation
• Occasionally a clone is what you wanted
  – likely for some kinds of parallel programming
• Program in the child process can adjust resources (see the sketch below)
  – change input/output file descriptors
  – change working directory
  – change environment variables
  – choose which program to run
  – enable/disable signals

What Happens After a Fork?

• There are now two processes
  – with different process ID numbers
  – but otherwise nearly identical
• How do I profitably use that?
  – two processes w/same code & program counter
  – figure out which is which
    • parent process goes one way
    • child process goes another
  – perhaps adjust process resources
  – perhaps load a new program into the process
    • this code takes the place of (CreateProcess) parameters
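A minimal sketch of that pattern (the output file and program are arbitrary examples): the child redirects its standard output and then loads a new program, doing in code what CreateProcess does with parameters.

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            exit(1);
        } else if (pid == 0) {
            /* child: redirect stdout to a file, then load a new program */
            int fd = open("out.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
            if (fd < 0) { perror("open"); exit(1); }
            dup2(fd, STDOUT_FILENO);          /* stdout now refers to out.txt */
            close(fd);
            execlp("ls", "ls", "-l", (char *) NULL);
            perror("execlp");                 /* only reached if exec fails */
            exit(1);
        }
        waitpid(pid, NULL, 0);                /* parent: wait for the child */
        return 0;
    }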


Process Operations: exec

• load a new program, pass parameters
  – address space is completely recreated
  – open files remain open, disabled signals remain disabled
  – available in many variants (execl, execv, execvp, ...)
• code sample:

    char *myargs[3];
    myargs[0] = "wc";
    myargs[1] = "myfile";
    myargs[2] = NULL;
    int rc = execvp(myargs[0], myargs);

How Processes Terminate

• Perhaps voluntarily
  – by calling the exit(2) system call
• Perhaps involuntarily
  – as a result of an unhandled signal/exception
  – a few signals (e.g. SIGKILL) cannot be caught
• Perhaps at the hand of another
  – a parent sends a termination signal (e.g. TERM)
  – a system management process (e.g. INT, HUP)
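For the "at the hand of another" case, a process with suitable permissions delivers the signal itself with kill(2); a minimal sketch, where pid stands in for a real target (e.g. a child returned by fork):

    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>

    void request_termination(pid_t pid) {
        if (kill(pid, SIGTERM) < 0)     /* polite, catchable request */
            perror("kill");
        /* kill(pid, SIGKILL) would be the uncatchable last resort */
    }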


Process Operations: wait

• await termination of a child process
  – collect exit status
• code sample:

    int status;
    int rc = waitpid(pid, &status, 0);
    if (rc == pid && WIFEXITED(status)) {
        fprintf(stderr, "process %d exited rc=%d\n", pid, WEXITSTATUS(status));
    }

The State of a Process

• Registers
  – Program Counter, Processor Status Word
  – Stack Pointer, general registers
• Address space
  – size and location of text, data, stack segments
  – size and location of supervisor mode stack
• System Resources and Parameters
  – open files, current working directory, ...
  – owning user ID, parent PID, scheduling priority, ...


Representing a Process

• all objects (not just OS objects) have descriptors
  – the identity of the object
  – the current state of the object
  – references to other associated objects
• Process state is in multiple places
  – parameters and object references in a descriptor
  – app execution state is on the stack, in registers
  – each Linux process has a supervisor-mode stack
    • to retain the state of in-progress system calls
    • to save the state of an interrupt-preempted process

Resident and non-Resident State

(diagram: a resident process table, always in memory, holds one entry per process — PID 1 "in mem", PID 2 "on disk", PID 3 "swapout" — each pointing to non-resident process state that may itself be in memory or on disk)


(resident process descriptor)

• state that could be needed at any time
• information needed to schedule the process
  – run-state, priority, statistics
  – data needed to signal or awaken the process
• identification information
  – process ID, user ID, group ID, parent ID
• communication and synchronization resources
  – semaphores, pending signals, mail-boxes
• pointer to non-resident state

(non-resident process state)

• information needed only when the process runs
  – can swap out to free memory for other processes
• execution state
  – supervisor mode stack
  – including: saved register values, PC, PS
• pointers to resources used when running
  – current working directory, open file descriptors
• pointers to text, data and stack segments
  – used to reconstruct the address space


Creating a new process

• allocate/initialize resident process description
• allocate/initialize non-resident description
• duplicate parent resource references (e.g. fds)
• create a virtual address space
  – allocate memory for code, data and stack
  – load/copy program code and data
  – copy/initialize a stack segment
• set up initial registers (PC, PS, SP)
• return from supervisor mode into the new process

Forking and the Data Segments

• Forked child shares the parent's code segment
  – a single read-only segment, referenced by both
• Stack and Data segments are private
  – each process has its own read/write copy
  – child's is initialized as a copy of parent's
  – copies diverge w/subsequent updates
• Common optimization: Copy-on-Write
  – start with a single shared read-only segment
  – make a copy only if parent (or child) changes it


Forking a New Process

(diagram: two process descriptors, PID 1000 and PID 1001, point into physical memory; the code segment is shared by both, while the child has its own copies of the data and stack segments (data', stack'))

Loading (exec) a Program

• We have a load module
  – The output of the linkage editor
  – All external references have been resolved
  – All modules combined into a few segments
  – Includes multiple segments (code, data, etc.)
• A computer cannot "execute" a load module
  – computers execute instructions in memory
  – An entirely new address space must be created
  – Memory must be allocated for each segment
  – Code must be copied from the load module to memory


Loading a new program (exec)

(diagram: the load module — an ELF header plus section headers for compiled code, initialized data, and symbol table — is mapped into the process's virtual address space as shared code, private data, shared libraries, and a private stack)

How to Terminate a Process?

• Reclaim any resources it may be holding
  – memory
  – locks
  – access to hardware devices
• Inform any other process that needs to know
  – those waiting for interprocess communications
  – parent (and maybe child) processes
• Remove the process descriptor from the process table

Asynchronous Events

• some things are worth waiting for
  – when I read(), I want to wait for the data
• sometimes waiting doesn't make sense
  – I want to do something else while waiting
  – I have multiple operations outstanding
  – some events demand very prompt attention
• we need event completion call-backs
  – this is a common programming paradigm
  – computers support interrupts (similar to traps)
  – commonly associated with I/O devices and timers

Asynchronous Exceptions

• some errors are routine
  – end of file, arithmetic overflow, conversion error
  – we should check for these after each operation
• some errors occur unpredictably
  – segmentation fault (e.g. dereferencing NULL)
  – user abort (^C), hang-up, power-failure
• these must raise asynchronous exceptions
  – some languages support try/catch operations
  – computers support traps
  – operating systems also use these for system calls


Hardware: Traps and Interrupts

• Used to get immediate attention from S/W
  – Traps: exceptions recognized by the CPU
  – Interrupts: events generated by external devices
• The basic processes are very similar
  – program execution is preempted immediately
  – each trap/interrupt has a numeric code (0-n)
    • that is used to index into a table of PC/PS vectors
  – new PS is loaded from the selected vector
  – previous PS/PC are pushed onto the (new) stack
  – new PC is loaded from the selected vector

Review (User vs. Supervisor mode)

• the OS executes in supervisor mode
  – able to perform I/O operations
  – able to execute privileged instructions
    • e.g. enable, disable and return from interrupts
  – able to update memory management registers
    • to create and modify process address spaces
  – able to access data structures within the OS
• application programs execute in user mode
  – they can only execute normal instructions
  – they are restricted to the process's address space


Direct Execution

• Most operations have no security implications
  – arithmetic, logic, local flow control & data movement
• Most user-mode programs execute directly
  – CPU fetches, pipelines, and executes each instruction
  – this is very fast, and involves zero overhead
• A few operations do have security implications
  – h/w refuses to perform these in user-mode
  – these must be performed by the operating system
  – the program must request service from the kernel

Limited Direct Execution

• CPU directly executes all application code
  – Punctuated by occasional traps (for system calls)
  – With occasional timer interrupts (for time sharing)
• Maximizing direct execution is always the goal
  – For Linux user mode processes
  – For OS emulation (e.g., Windows on Linux)
  – For virtual machines
• Enter the OS as seldom as possible
  – Get back to the application as quickly as possible

Using Traps for System Calls

• reserve one illegal instruction for system calls
  – most computers specifically define such instructions
• define system call linkage conventions
  – call: r0 = system call number, r1 points to arguments
  – return: r0 = return code, cc indicates success/failure
• prepare arguments for the desired system call
• execute the designated system call instruction
• OS recognizes & performs the requested operation
• returns to the instruction after the system call

System Call Trap Gates

(diagram: the application program runs in user mode — instr; instr; trap; instr — until the trap instruction; the saved PS/PC and the TRAP vector table transfer control, in supervisor mode, to the 1st level trap handler, which uses the system call dispatch table to reach the 2nd level handler (the system service implementation) and then returns to user mode)
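The C library normally hides this convention, but on Linux the generic syscall(2) wrapper exposes it directly: you supply the system call number and its arguments, and the return value (or -1 and errno) comes back in a register. A minimal sketch:

    #include <stdio.h>
    #include <string.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void) {
        long pid = syscall(SYS_getpid);          /* trap with call number, no args */
        const char msg[] = "hello via a trap\n";
        syscall(SYS_write, 1, msg, strlen(msg)); /* call number plus three arguments */
        printf("getpid() returned %ld\n", pid);
        return 0;
    }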


Stacking and unstacking a System Call

(diagram: the user-mode stack holds the application's stack frames and the parameters to the system call; the supervisor-mode stack holds the user-mode PC & PS saved at the trap, the registers saved from the user-mode computation, and the system call handler's stack frame; an arrow marks the direction of stack growth, and when the handler returns the user-mode computation resumes)

(Trap Handling)

• hardware trap handling
  – trap cause is used as an index into the trap vector table for PC/PS
  – load new processor status word, switch to supervisor mode
  – push PC/PS of the program that caused the trap onto the stack
  – load PC (w/ address of the 1st level handler)
• software trap handling
  – 1st level handler pushes all other registers
  – 1st level handler gathers info, selects 2nd level handler
  – 2nd level handler actually deals with the problem
    • handle the event, kill the process, return ...


(Returning to User-Mode)

• return is the opposite of interrupt/trap entry
  – 2nd level handler returns to 1st level handler
  – 1st level handler restores all registers from the stack
  – use privileged return instruction to restore PC/PS
  – resume user-mode execution at the next instruction
• saved registers can be changed before return
  – change stacked user r0 to reflect the return code
  – change stacked user PS to reflect success/failure

User-Mode Signal Handling

• OS defines numerous types of signals
  – exceptions, operator actions, communication
• processes can control their handling
  – ignore this signal (pretend it never happened)
  – designate a handler for this signal
  – default action (typically kill or coredump the process)
• analogous to hardware traps/interrupts
  – but implemented by the operating system
  – delivered to user-mode processes
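A minimal sketch of those three handling choices using the POSIX sigaction interface (the particular signals chosen here are arbitrary):

    #include <signal.h>
    #include <string.h>
    #include <unistd.h>

    static void on_sigint(int sig) {
        (void) sig;
        /* handlers should only do async-signal-safe work */
        write(STDERR_FILENO, "caught SIGINT\n", 14);
    }

    int main(void) {
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sigemptyset(&sa.sa_mask);

        sa.sa_handler = on_sigint;     /* designate a handler for this signal */
        sigaction(SIGINT, &sa, NULL);

        sa.sa_handler = SIG_IGN;       /* ignore this signal */
        sigaction(SIGQUIT, &sa, NULL);

        sa.sa_handler = SIG_DFL;       /* take the default action */
        sigaction(SIGTERM, &sa, NULL);

        pause();                       /* wait for a signal to arrive */
        return 0;
    }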


Signals and Signal Handling

• when an asynchronous exception occurs
  – the system invokes a specified exception handler
• invocation looks like a procedure call
  – save state of the interrupted computation
  – exception handler can do whatever is necessary
  – handler can return, resuming the interrupted computation
• more complex than a procedure call and return
  – must also save/restore condition codes & volatile regs
  – may abort rather than return

Signals: sample code

    int fault_expected, fault_happened;

    void handler(int sig) {
        if (!fault_expected)
            exit(-1);               /* if not expected, die */
        else
            fault_happened = 1;     /* if expected, note it happened */
    }

    signal(SIGHUP, SIG_IGN);        /* ignore hang-up signals */
    signal(SIGSEGV, &handler);      /* handle segmentation faults */
    ...
    fault_happened = 0;
    fault_expected = 1;
    ... /* code that might cause a segmentation fault */
    fault_expected = 0;


Stacking a signal delivery

(diagram: the stack at the time of the exception ends with p1's parameters, p0's return address, and p1's saved registers, local variables, and computation; signal delivery pushes an additional frame containing the PC/PS at the time of the exception, the handler's parameters, the address of the signal unstacker as the return address, and then the handler's saved registers and local variables — so the signal handler sees a completely standard-appearing stack frame)

Assignments

• Projects
  – get started on 1A
    • terminal I/O mode control is a complex API
    • multi-process apps have more modes of failure
• For next Lecture
  – Arpaci C7-8 (scheduling)
  – Real-Time scheduling
  – Quiz #4 (in CCLE)


Supplementary Slides

Where Do Processes Come From?

• Created by the operating system
  – Using some method to initialize their state
  – In particular, to set up a particular program to run
• At the request of other processes
  – Which specify the program to run
  – And other aspects of their initial state
• Parent processes
  – The process that created your process
• Child processes
  – The processes your process created

Starting With a Blank Process

• Basically, create a brand new process
• The system call that creates it obviously needs to provide some information
  – everything needed to set up the process properly
  – at the minimum, what code is to be run
  – generally a lot more than that
• Other than bootstrapping, the new process is created by command of an existing process

Why Did Unix Use Forking?

• Avoids costs of copying a lot of code
  – If it's the same code as the parent's . . .
• Historical reasons
  – Parallel processing literature used a cloning fork
  – Fork allowed parallelism before threads were invented
• Practical reasons
  – Easy to manage shared resources
    • Like stdin, stdout, stderr
  – Easy to set up process pipe-lines (e.g. ls | more) — see the sketch below
  – Eases design of command shells
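A minimal sketch of how a shell might build such a pipeline (ls | wc -l is used here so the example needs no terminal interaction): the parent creates a pipe, each forked child redirects one end onto stdin or stdout and execs its program, and the shared pipe is exactly the kind of inherited resource fork makes easy.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        int fd[2];
        if (pipe(fd) < 0) { perror("pipe"); exit(1); }

        if (fork() == 0) {                   /* first child: "ls" */
            dup2(fd[1], STDOUT_FILENO);      /* stdout -> write end of pipe */
            close(fd[0]); close(fd[1]);
            execlp("ls", "ls", (char *) NULL);
            perror("execlp"); exit(1);
        }
        if (fork() == 0) {                   /* second child: "wc -l" */
            dup2(fd[0], STDIN_FILENO);       /* stdin <- read end of pipe */
            close(fd[0]); close(fd[1]);
            execlp("wc", "wc", "-l", (char *) NULL);
            perror("execlp"); exit(1);
        }
        close(fd[0]); close(fd[1]);          /* parent keeps no pipe ends */
        while (wait(NULL) > 0)               /* reap both children */
            ;
        return 0;
    }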

exec Fork isn’t what I usually want! The Call • two identical processes • A Linux/Unix system call to handle the – running the same program on the same data common case • we usually want new program in new process • Replaces a process’ existing program with a – to do a different thing in a different process different one • fork(2) and exec(2) are orthogonal operations – New code – fork(2) creates a new process – Different set of other resources – exec(2 ) loads a new program into a process – Different PC and stack • Essentially, called after you do a fork


How Does the OS Handle Exec?

• Must get rid of the child's old code
  – And its stack and data areas
  – The latter is easy if you are using copy-on-write
• Must load a brand new set of code for that process
• Must initialize the child's stack, PC, and other relevant control structures
  – To start a fresh program run for the child process

the compilation process

(diagram: source.c and header.h feed the compiler, whose output passes through the optimizer and assembler (asm.s) to produce object.o; the linkage editor combines object modules and library.a into a load module, also producing a load map)

(Compilation/Assembly)

• compiler
  – reads source code and header files
  – parses and understands "meaning" of source code
  – optimizer decides how to produce best possible code
  – code generation typically produces assembler code
• assembler
  – translates assembler directives into machine language
  – produces relocatable object modules
    • code, data, symbol tables, relocation information

Typical Object Module Format

(diagram: an object module contains a header — type, length, flags — for each of its sections: compiled code, initialized data values, a symbol table, and relocation description entries; each code/data section is a block of information that should be kept together, as a unit, in the final program)

(Relocatable Object Modules)

• code segments
  – relocatable machine language instructions
• data segments
  – non-executable initialized data, also relocatable
• symbol table
  – list of symbols defined and referenced by this module
• relocation information
  – pointers to all relocatable code and data items

object modules, symbols, & relocation

(diagram: the assembler code main.s declares "extern foo" and contains "call foo"; the resulting object module main.o has a call instruction at offset 0x108/0x10c, a symbol table entry "53: foo, unresolved", and a relocation table entry "reloc 0x10c sym=53"; the library member foo.o contains the code for foo at offset 0x40 and a symbol table entry "15: foo, 0x40, global")
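To see the same thing from the source side, here is a small illustration (the file and routine names are made up): main.c references a routine it does not define, so its object module carries an unresolved symbol and a relocation entry for the call site until the linkage editor finds the defining module.

    /* main.c — references a routine defined elsewhere */
    extern int foo(int x);          /* declared, but not defined here */

    int main(void) {
        return foo(42);             /* call site recorded in relocation info */
    }

    /* foo.c — the module (or library member) that defines foo */
    int foo(int x) {
        return x + 1;
    }

Compiling main.c by itself (cc -c main.c) produces a main.o whose symbol table lists foo as undefined; linking main.o with foo.o resolves the reference and relocates the call.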


Libraries

• programmers need not write all the code for their programs
  – standard utility functions can be found in libraries
• a library is a collection of object modules
  – a single file that contains many files (like a zip or jar)
  – these modules can be used directly, w/o recompilation
• most systems come with many standard libraries
  – system services, encryption, statistics, etc.
  – additional libraries may come with add-on products
• programmers can build their own libraries
  – functions commonly needed by parts of a product

Linkage Editing

• obtain additional modules from libraries
  – search libraries to satisfy unresolved external references
• combine all specified object modules
  – resolve cross-module references
  – copy all required modules into a single address space
  – relocate all references to point to the chosen locations
• result should be a complete load module
  – no unresolved external addresses
  – all data items assigned to specific virtual addresses
  – all code references relocated to assigned addresses

Linkage editing: resolution & relocation

• Resolution: search all specified libraries to find modules that can satisfy all unresolved external references
• Relocation: update all pointers to externally resolved symbols to correctly refer to the locations where the corresponding modules were actually loaded
(diagram: in the load module, main.o is loaded at 0x4000, so the call instruction at 0x4108/0x410c is patched to refer to 0x00006040; foo.o is loaded at 0x6000, which places the code for foo at 0x6040)

Load Modules (ELF)

(diagram: an ELF load module begins with an ELF header — version info, target ISA, number of load and info sections — followed by section headers (type, load address, length, alignment, flags) for the code segment, initialized data values, and compiled symbol table)

program loading – executable code

• load module (output of linkage editor)
  – all external references have been resolved
  – all modules combined into a few segments
  – includes multiple segments (text, data, BSS)
    • each to be loaded, contiguously, at a specified address
• a computer cannot "execute" a load module
  – computers execute instructions in memory
  – memory must be allocated for each segment
  – code must be copied from load module to memory
    • in ancient times this involved an additional relocation step
• code segments are read-only & fixed size

program loading – data segments

• programs include data as well as code
• data too must be initialized in the address space
  – memory must be allocated for each data segment
  – initial contents must be copied from the load module
  – BSS: segments to be initialized to all zeroes
• data segments are read/write & variable size
  – execution can change contents of data segments
  – program can extend the data segment to get more memory


Processes – stack frames

• modern programming languages are stack-based
  – greatly simplified procedure storage management
• each procedure call allocates a new stack frame
  – storage for procedure local (vs global) variables
  – storage for invocation parameters
  – save and restore registers
  – popped off the stack when the call returns
• most modern computers also have stack support
  – the stack too must be preserved as part of process state

Simple procedure linkage conventions

    calling routine:
        push p1         ; push first parameter
        push p2         ; push second parameter
        call foo        ; save pc, call routine
        ...
        add =8,sp       ; pop parameters
        ...

    called routine:
        foo: push r2-r6 ; save registers
             sub =12,sp ; space for locals
             ...
             mov rslt,r0 ; return value
             add =12,sp ; pop locals
             pop r2-r6  ; restore regs
             return     ; restore pc

Sample stack frames

(diagram: stack frame n-1 ends with p1's parameters and p0's return address; stack frame n, owned by the caller, holds p1's saved registers, local variables, and computation, followed by p2's parameters and p1's return address; stack frame n+1, owned by the callee, holds p2's saved registers, local variables, and computation)

Process Stacks

• size of stack depends on the activity of the program
  – grows larger as calls nest more deeply
  – amount of local storage allocated by each procedure
  – after calls return, their stack frames can be recycled
• OS manages the process's stack segment
  – stack segment created at same time as data segment
  – some allocate a fixed-size stack at program load time
  – some dynamically extend the stack as the program needs it
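A small illustration of frames being pushed and popped (nothing here is specific to any particular calling convention): each recursive call below gets its own frame holding the parameter n, the local variable, and the saved return address, and that frame is recycled as soon as the call returns.

    #include <stdio.h>

    static unsigned long factorial(unsigned int n) {
        unsigned long below;            /* local storage in this frame */
        if (n <= 1)
            return 1;
        below = factorial(n - 1);       /* pushes another stack frame */
        return n * below;               /* this frame is popped on return */
    }

    int main(void) {
        printf("10! = %lu\n", factorial(10));
        return 0;
    }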

UNIX stack space management

(diagram: the code, data, and stack segments laid out in the address space between 0x00000000 and 0xFFFFFFFF)

• Data segment starts at the page boundary after the code segment
• Stack segment starts at the high end of the address space
• Unix extends the stack automatically as the program needs more
• Data segment grows up; stack segment grows down
• Both grow towards the hole in the middle; they are not allowed to meet

Thread state and thread stacks

• each thread has its own registers, PS, PC
• each thread must have its own stack area
  – maximum size specified when the thread is created
  – a process can contain many threads
  – they cannot all grow towards a single hole
  – thread creator must know the maximum required stack size
  – stack space must be reclaimed when the thread exits
• procedure linkage conventions remain the same
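A minimal sketch of that with POSIX threads: the creator fixes each thread's maximum stack size up front via a thread attribute, and pthread_join reclaims the stack when the thread exits.

    #include <pthread.h>
    #include <stdio.h>

    static void *worker(void *arg) {
        printf("thread %ld running on its own stack\n", (long) arg);
        return NULL;
    }

    int main(void) {
        pthread_attr_t attr;
        pthread_t tid[3];

        pthread_attr_init(&attr);
        pthread_attr_setstacksize(&attr, 1 << 20);   /* 1 MB stack per thread */

        for (long i = 0; i < 3; i++)
            pthread_create(&tid[i], &attr, worker, (void *) i);
        for (int i = 0; i < 3; i++)
            pthread_join(tid[i], NULL);              /* reclaims each thread's stack */

        pthread_attr_destroy(&attr);
        return 0;
    }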


Thread Stack Allocation

(diagram: the address space from 0x00000000 to 0xFFFFFFFF holds the code and data segments, followed by fixed-size stacks for threads 1, 2, and 3, with the original process stack at the high end)

Discussion Slides
