
Processes, Execution, and State

3A. What is a Process?
3B. Process Address Space
3Y. Static and shared libraries
3C. Process operations
3D. Implementing processes
3E. Asynchronous Exceptions
3U. User-mode exception handling

What is a Process?

• an executing instance of a program
  – how is this different from a program?
• a virtual private computer
  – what does a virtual computer look like?
  – how is a process different from a virtual machine?
• a process is an object
  – characterized by its properties (state)
  – characterized by its operations


What is "state"?

• the primary dictionary definition of "state" is
  – "a mode or condition of being"
  – an object may have a wide range of possible states
• all persistent objects have "state"
  – distinguishing it from other objects
  – characterizing the object's current condition
• contents of state depend on the object
  – complex operations often mean complex state
  – we can save/restore the aggregate/total state
  – we can talk of a subset of the state

Program vs Process Address Space

(diagram: a program is a load module — an ELF header describing the target ISA and sections, followed by section headers and the compiled code, initialized data, and symbol table sections; a process address space maps shared code, private data, shared libraries, and a private stack across the range 0x00000000–0xFFFFFFFF)

(Address Space: Code Segments)

• load module (output of linkage editor)
  – all external references have been resolved
  – all modules combined into a few segments
  – includes multiple segments (text, data, BSS)
• code must be loaded into memory
  – a virtual code segment must be created
  – code must be read in from the load module
  – map segment into virtual address space
• code segments are read-only and sharable
  – many processes can use the same code segments

(Address Space: Data Segments)

• data too must be initialized in the address space
  – process data segment must be created
  – initial contents must be copied from the load module
  – BSS: segments to be initialized to all zeroes
  – map segment into virtual address space
• data segments
  – are read/write, and process private
  – program can grow or shrink it (with the sbrk syscall)
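To make that last point concrete, here is a minimal sketch using the legacy sbrk interface (most programs grow the data segment indirectly, through malloc); it simply grows the segment by one page and then gives it back:

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        void *old_end = sbrk(0);            /* current end of the data segment */
        if (sbrk(4096) == (void *) -1) {    /* ask the OS to grow it by 4 KB */
            perror("sbrk");
            return 1;
        }
        printf("data segment grew from %p to %p\n", old_end, sbrk(0));
        sbrk(-4096);                        /* shrink it back */
        return 0;
    }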


(Address Space: Stack Segment)

• size of stack depends on program activities
  – grows larger as calls nest more deeply
  – amount of local storage allocated by each procedure
  – after calls return, their stack frames can be recycled
• OS manages the process's stack segment
  – stack segment created at same time as data segment
  – some allocate a fixed-size stack at program load time
  – some dynamically extend the stack as the program needs it
• stack segments are read/write and process private

(Address Space: Shared Libraries)

• static libraries are added to the load module
  – each load module has its own copy of each library
  – program must be re-linked to get a new version
• make each library a sharable code segment
  – one in-memory copy, shared by all processes
  – keep the library separate from the load modules
  – loader loads the library along with the program
• reduced memory use, faster program loads
• easier and better library upgrades


Characteristics of Libraries

• Many advantages
  – Reusable code makes programming easier
  – A single well written/maintained copy
  – Encapsulates complexity … better building blocks
• Multiple bind-time options
  – Static … include in load module at link time
  – Shared … map into address space at exec time
  – Dynamic … choose and load at run-time
• It is only code … it has no special privileges

Shared Libraries

• library modules are usually added to the load module
  – each load module has its own copy of each library
  – this dramatically increases the size of each process
  – program must be re-linked to incorporate a new library
  – existing load modules don't benefit from bug fixes
• make each library a sharable code segment
  – one in-physical-memory copy, shared by all processes
  – keep the library separate from the load modules
  – operating system loads the library along with the program

Advantages of Shared Libraries

• reduced memory consumption
  – one copy shared by multiple processes/programs
• faster program start-ups
  – if it is already in memory, it need not be loaded again
• simplified updates
  – library modules not included in program load modules
  – library can be updated (e.g. new version w/ bug fixes)
  – programs automatically get the new version on restart

Implementing Shared Libraries

• multiple code segments in a single address space
  – one for the main program, one for each shared library
  – each sharable, mapped in at a well-known address
• deferred binding of references to shared libraries
  – applications are linkage edited against a stub library
    • stub module has addresses for each entry point, but no code
    • linkage editor resolves all refs to standard map-in locations
  – loader must find a copy of each referenced library
    • and map it in at the address where it is expected to be


Stub modules vs real shared libraries

• stub module libfoo.a contains only a symbol table, no code:
    0: libfoo.so, shared library (to be mapped in at 0x1020000)
    1: foosub1, global, absolute, 0x1020000
    2: foosub2, global, absolute, 0x1020008
    3: foosub3, global, absolute, 0x1020010
    4: foosub4, global, absolute, 0x1020018
• the program is linkage edited against the stub module, and so believes each of the contained routines to be at a fixed address

Indirect binding to shared libraries

• the real shared object libfoo.so (read-only, at a well-known location) is mapped into the process' address space at that fixed address
• it begins with a redirection (jump) table that effectively gives each entry point a fixed address:
    0x1020000: jmp foosub1
    0x1020008: jmp foosub2
    0x1020010: jmp foosub3
    0x1020018: jmp foosub4

Limitations of Shared Libraries

• not all modules will work in a shared library
  – they cannot define/include static data storage
• they are read into program memory
  – whether they are actually needed or not
• called routines must be known at compile-time
  – only the fetching of the code is delayed 'til run-time
  – symbols known at compile time, bound at link time
• Dynamically Loadable Libraries are more general
  – they eliminate all of these limitations ... at a price

Loading and Binding w/DLLs

(diagram: a call to foo in the read-only code segment goes through a writeable Procedure Linkage Table entry; the first call enters the run-time loader, which loads foo.dll into a new code segment, after which the PLT entry jumps directly to foo)

(run-time binding to DLLs)

• load module includes a Procedure Linkage Table
  – addresses for routines in the DLL resolve to entries in the PLT
  – each PLT entry contains a call to the run-time loader (asking it to load the corresponding routine)
• the first time a routine is called, we call the run-time loader
  – which finds, loads, and initializes the desired routine
  – changes the PLT entry to be a jump to the loaded routine
  – then jumps to the newly loaded routine
• subsequent calls through that PLT entry go directly to the routine

Shared Libraries vs. DLLs

• both allow code sharing and run-time binding
• shared libraries
  – do not require a special linkage editor
  – shared objects obtained at program load time
• Dynamically Loadable Libraries
  – require a smarter linkage editor and run-time loader
  – modules are not loaded until they are needed
    • automatically when needed, or manually by the program
  – complex, per-routine initialization can be performed
    • e.g. allocation of a private data area for persistent local variables
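The "manually by the program" case is what the run-time loader's POSIX interface (dlopen/dlsym/dlclose) exposes. A minimal sketch — the library name libfoo.so and the entry point foo_init are hypothetical:

    #include <dlfcn.h>
    #include <stdio.h>

    int main(void) {
        /* load the module on demand and get back a descriptor (handle) */
        void *handle = dlopen("libfoo.so", RTLD_NOW);
        if (handle == NULL) {
            fprintf(stderr, "dlopen: %s\n", dlerror());
            return 1;
        }

        /* look up an entry point by name and call it */
        int (*foo_init)(void) = (int (*)(void)) dlsym(handle, "foo_init");
        if (foo_init != NULL)
            foo_init();

        dlclose(handle);    /* later: release resources and unload */
        return 0;
    }

(On Linux this is typically linked with -ldl.)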


Dynamic Loading

• DLLs are not merely "better" shared libraries
  – shared libraries are loaded to satisfy static external references
  – DLLs are designed for dynamic binding
• Typical DLL usage scenario
  – identify a needed module (e.g. device driver)
  – call the run-time loader to load the module, get back a descriptor
  – use the descriptor to call the initialization entry-point
  – initialization function registers all other entry points
  – module is used as needed
  – later we can unregister, free resources, and unload

Process Operations: fork

• parent and child are identical:
  – data and stack segments are copied
  – all the same files are open
• code sample:

    int rc = fork();
    if (rc < 0) {
        fprintf(stderr, "Fork failed\n");
    } else if (rc == 0) {
        fprintf(stderr, "Child\n");
    } else {
        fprintf(stderr, "Fork succeeded, child pid = %d\n", rc);
    }


Variations on Process Creation

• tabula rasa – a blank slate
  – a new process with minimal resources
  – a few resources may be passed from the parent
  – most resources opened, from scratch, by the child
• run – fork + exec
  – create a new process to run a specified command
• a cloning fork is a more expensive operation
  – much data and resources to be copied
  – convenient for setting up pipelines
  – allows inheritance of exclusive-use devices

Windows Process Creation

• The CreateProcess() system call
• A very flexible way to create a new process
• Numerous parameters to shape the child
  – name of program to run
  – security attributes (new or inherited)
  – open file handles to pass/inherit
  – environment variables
  – initial working directory
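A minimal sketch of the call, assuming a Windows build environment; the child command line "child.exe arg1" is hypothetical, and the NULL/FALSE arguments simply take the defaults for the options listed above (security attributes, handle inheritance, environment, working directory):

    #include <windows.h>
    #include <stdio.h>

    int main(void) {
        STARTUPINFOA si = { sizeof(si) };
        PROCESS_INFORMATION pi;
        char cmdline[] = "child.exe arg1";   /* hypothetical program to run */

        if (!CreateProcessA(NULL, cmdline, NULL, NULL, FALSE, 0,
                            NULL, NULL, &si, &pi)) {
            fprintf(stderr, "CreateProcess failed (%lu)\n", GetLastError());
            return 1;
        }
        WaitForSingleObject(pi.hProcess, INFINITE);   /* wait for the child */
        CloseHandle(pi.hThread);
        CloseHandle(pi.hProcess);
        return 0;
    }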


Process Forking

• The way Unix/Linux creates processes
  – child is a clone of the parent
  – the classical Computer Science fork operation
• Occasionally a clone is what you wanted
  – likely for some kinds of parallel programming
• Program in the child process can adjust resources (see the sketch below)
  – change input/output file descriptors
  – change working directory
  – change environment variables
  – choose which program to run
  – enable/disable signals

What Happens After a Fork?

• There are now two processes
  – with different process ID numbers
  – but otherwise nearly identical
• How do I profitably use that?
  – two processes w/same code & program counter
  – figure out which is which
    • parent process goes one way
    • child process goes another
  – perhaps adjust process resources
  – perhaps load a new program into the process
    • this code takes the place of (CreateProcess) parameters
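A minimal sketch of that pattern (the output file and program are arbitrary examples): the child redirects its standard output and then loads a new program, doing in code what CreateProcess does with parameters.

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            exit(1);
        } else if (pid == 0) {
            /* child: redirect stdout to a file, then load a new program */
            int fd = open("out.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
            if (fd < 0) { perror("open"); exit(1); }
            dup2(fd, STDOUT_FILENO);          /* stdout now refers to out.txt */
            close(fd);
            execlp("ls", "ls", "-l", (char *) NULL);
            perror("execlp");                 /* only reached if exec fails */
            exit(1);
        }
        waitpid(pid, NULL, 0);                /* parent: wait for the child */
        return 0;
    }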


Process Operations: exec

• load a new program, pass parameters
  – address space is completely recreated
  – open files remain open, disabled signals remain disabled
  – available in many variants (execl, execv, execvp, ...)
• code sample:

    char *myargs[3];
    myargs[0] = "wc";
    myargs[1] = "myfile";
    myargs[2] = NULL;
    int rc = execvp(myargs[0], myargs);

How Processes Terminate

• Perhaps voluntarily
  – by calling the exit(2) system call
• Perhaps involuntarily
  – as a result of an unhandled signal/exception
  – a few signals (e.g. SIGKILL) cannot be caught
• Perhaps at the hand of another
  – a parent sends a termination signal (e.g. TERM)
  – a system management process (e.g. INT, HUP)
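For the "at the hand of another" case, a process with suitable permissions delivers the signal itself with kill(2); a minimal sketch, where pid stands in for a real target (e.g. a child returned by fork):

    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>

    void request_termination(pid_t pid) {
        if (kill(pid, SIGTERM) < 0)     /* polite, catchable request */
            perror("kill");
        /* kill(pid, SIGKILL) would be the uncatchable last resort */
    }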


Process Operations: wait

• await termination of a child process
  – collect exit status
• code sample:

    int status;
    int rc = waitpid(pid, &status, 0);
    if (rc == pid && WIFEXITED(status)) {
        fprintf(stderr, "process %d exited rc=%d\n", pid, WEXITSTATUS(status));
    }

The State of a Process

• Registers
  – Program Counter, Processor Status Word
  – Stack Pointer, general registers
• Address space
  – size and location of text, data, stack segments
  – size and location of supervisor mode stack
• System Resources and Parameters
  – open files, current working directory, ...
  – owning user ID, parent PID, scheduling priority, ...


Representing a Process

• all objects (not just OS objects) have descriptors
  – the identity of the object
  – the current state of the object
  – references to other associated objects
• Process state is in multiple places
  – parameters and object references in a descriptor
  – app execution state is on the stack, in registers
  – each Linux process has a supervisor-mode stack
    • to retain the state of in-progress system calls
    • to save the state of an interrupt-preempted process

Resident and non-Resident State

(diagram: a resident process table, always in memory, holds one entry per process — PID 1 "in mem", PID 2 "on disk", PID 3 "swapout" — each pointing to non-resident process state that may itself be in memory or on disk)


(resident process descriptor)

• state that could be needed at any time
• information needed to schedule the process
  – run-state, priority, statistics
  – data needed to signal or awaken the process
• identification information
  – process ID, user ID, group ID, parent ID
• communication and synchronization resources
  – semaphores, pending signals, mail-boxes
• pointer to non-resident state

(non-resident process state)

• information needed only when the process runs
  – can swap out to free memory for other processes
• execution state
  – supervisor mode stack
  – including: saved register values, PC, PS
• pointers to resources used when running
  – current working directory, open file descriptors
• pointers to text, data and stack segments
  – used to reconstruct the address space


Creating a new process

• allocate/initialize resident process description
• allocate/initialize non-resident description
• duplicate parent resource references (e.g. fds)
• create a virtual address space
  – allocate memory for code, data and stack
  – load/copy program code and data
  – copy/initialize a stack segment
• set up initial registers (PC, PS, SP)
• return from supervisor mode into the new process

Forking and the Data Segments

• Forked child shares the parent's code segment
  – a single read-only segment, referenced by both
• Stack and Data segments are private
  – each process has its own read/write copy
  – child's is initialized as a copy of parent's
  – copies diverge w/subsequent updates
• Common optimization: Copy-on-Write
  – start with a single shared read-only segment
  – make a copy only if parent (or child) changes it


Forking a New Process

(diagram: two process descriptors, PID 1000 and PID 1001, point into physical memory; the code segment is shared by both, while the child has its own copies of the data and stack segments (data', stack'))

Loading (exec) a Program

• We have a load module
  – The output of the linkage editor
  – All external references have been resolved
  – All modules combined into a few segments
  – Includes multiple segments (code, data, etc.)
• A computer cannot "execute" a load module
  – computers execute instructions in memory
  – An entirely new address space must be created
  – Memory must be allocated for each segment
  – Code must be copied from the load module to memory


Loading a new program (exec)

(diagram: the load module — an ELF header plus section headers for compiled code, initialized data, and symbol table — is mapped into the process's virtual address space as shared code, private data, shared libraries, and a private stack)

How to Terminate a Process?

• Reclaim any resources it may be holding
  – memory
  – locks
  – access to hardware devices
• Inform any other process that needs to know
  – those waiting for interprocess communications
  – parent (and maybe child) processes
• Remove the process descriptor from the process table

Asynchronous Events

• some things are worth waiting for
  – when I read(), I want to wait for the data
• sometimes waiting doesn't make sense
  – I want to do something else while waiting
  – I have multiple operations outstanding
  – some events demand very prompt attention
• we need event completion call-backs
  – this is a common programming paradigm
  – computers support interrupts (similar to traps)
  – commonly associated with I/O devices and timers

Asynchronous Exceptions

• some errors are routine
  – end of file, arithmetic overflow, conversion error
  – we should check for these after each operation
• some errors occur unpredictably
  – segmentation fault (e.g. dereferencing NULL)
  – user abort (^C), hang-up, power-failure
• these must raise asynchronous exceptions
  – some languages support try/catch operations
  – computers support traps
  – operating systems also use these for system calls


Hardware: Traps and Interrupts

• Used to get immediate attention from S/W
  – Traps: exceptions recognized by the CPU
  – Interrupts: events generated by external devices
• The basic processes are very similar
  – program execution is preempted immediately
  – each trap/interrupt has a numeric code (0-n)
    • that is used to index into a table of PC/PS vectors
  – new PS is loaded from the selected vector
  – previous PS/PC are pushed onto the (new) stack
  – new PC is loaded from the selected vector

Review (User vs. Supervisor mode)

• the OS executes in supervisor mode
  – able to perform I/O operations
  – able to execute privileged instructions
    • e.g. enable, disable and return from interrupts
  – able to update memory management registers
    • to create and modify process address spaces
  – able to access data structures within the OS
• application programs execute in user mode
  – they can only execute normal instructions
  – they are restricted to the process's address space


Direct Execution

• Most operations have no security implications
  – arithmetic, logic, local flow control & data movement
• Most user-mode programs execute directly
  – CPU fetches, pipelines, and executes each instruction
  – this is very fast, and involves zero overhead
• A few operations do have security implications
  – h/w refuses to perform these in user-mode
  – these must be performed by the operating system
  – the program must request service from the kernel

Limited Direct Execution

• CPU directly executes all application code
  – Punctuated by occasional traps (for system calls)
  – With occasional timer interrupts (for time sharing)
• Maximizing direct execution is always the goal
  – For Linux user mode processes
  – For OS emulation (e.g., Windows on Linux)
  – For virtual machines
• Enter the OS as seldom as possible
  – Get back to the application as quickly as possible

Using Traps for System Calls

• reserve one illegal instruction for system calls
  – most computers specifically define such instructions
• define system call linkage conventions
  – call: r0 = system call number, r1 points to arguments
  – return: r0 = return code, cc indicates success/failure
• prepare arguments for the desired system call
• execute the designated system call instruction
• OS recognizes & performs the requested operation
• returns to the instruction after the system call

System Call Trap Gates

(diagram: the application program runs in user mode — instr; instr; trap; instr — until the trap instruction; the saved PS/PC and the TRAP vector table transfer control, in supervisor mode, to the 1st level trap handler, which uses the system call dispatch table to reach the 2nd level handler (the system service implementation) and then returns to user mode)
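The C library normally hides this convention, but on Linux the generic syscall(2) wrapper exposes it directly: you supply the system call number and its arguments, and the return value (or -1 and errno) comes back in a register. A minimal sketch:

    #include <stdio.h>
    #include <string.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void) {
        long pid = syscall(SYS_getpid);          /* trap with call number, no args */
        const char msg[] = "hello via a trap\n";
        syscall(SYS_write, 1, msg, strlen(msg)); /* call number plus three arguments */
        printf("getpid() returned %ld\n", pid);
        return 0;
    }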


Stacking and unstacking a System Call

(diagram: the user-mode stack holds the application's stack frames and the parameters to the system call; the supervisor-mode stack holds the user-mode PC & PS saved at the trap, the registers saved from the user-mode computation, and the system call handler's stack frame; an arrow marks the direction of stack growth, and when the handler returns the user-mode computation resumes)

(Trap Handling)

• hardware trap handling
  – trap cause is used as an index into the trap vector table for PC/PS
  – load new processor status word, switch to supervisor mode
  – push PC/PS of the program that caused the trap onto the stack
  – load PC (w/ address of the 1st level handler)
• software trap handling
  – 1st level handler pushes all other registers
  – 1st level handler gathers info, selects 2nd level handler
  – 2nd level handler actually deals with the problem
    • handle the event, kill the process, return ...


(Returning to User-Mode)

• return is the opposite of interrupt/trap entry
  – 2nd level handler returns to 1st level handler
  – 1st level handler restores all registers from the stack
  – use privileged return instruction to restore PC/PS
  – resume user-mode execution at the next instruction
• saved registers can be changed before return
  – change stacked user r0 to reflect the return code
  – change stacked user PS to reflect success/failure

User-Mode Signal Handling

• OS defines numerous types of signals
  – exceptions, operator actions, communication
• processes can control their handling
  – ignore this signal (pretend it never happened)
  – designate a handler for this signal
  – default action (typically kill or coredump the process)
• analogous to hardware traps/interrupts
  – but implemented by the operating system
  – delivered to user-mode processes
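A minimal sketch of those three handling choices using the POSIX sigaction interface (the particular signals chosen here are arbitrary):

    #include <signal.h>
    #include <string.h>
    #include <unistd.h>

    static void on_sigint(int sig) {
        (void) sig;
        /* handlers should only do async-signal-safe work */
        write(STDERR_FILENO, "caught SIGINT\n", 14);
    }

    int main(void) {
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sigemptyset(&sa.sa_mask);

        sa.sa_handler = on_sigint;     /* designate a handler for this signal */
        sigaction(SIGINT, &sa, NULL);

        sa.sa_handler = SIG_IGN;       /* ignore this signal */
        sigaction(SIGQUIT, &sa, NULL);

        sa.sa_handler = SIG_DFL;       /* take the default action */
        sigaction(SIGTERM, &sa, NULL);

        pause();                       /* wait for a signal to arrive */
        return 0;
    }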


Signals and Signal Handling

• when an asynchronous exception occurs
  – the system invokes a specified exception handler
• invocation looks like a procedure call
  – save state of the interrupted computation
  – exception handler can do whatever is necessary
  – handler can return, resuming the interrupted computation
• more complex than a procedure call and return
  – must also save/restore condition codes & volatile regs
  – may abort rather than return

Signals: sample code

    int fault_expected, fault_happened;

    void handler(int sig) {
        if (!fault_expected)
            exit(-1);               /* if not expected, die */
        else
            fault_happened = 1;     /* if expected, note it happened */
    }

    signal(SIGHUP, SIG_IGN);        /* ignore hang-up signals */
    signal(SIGSEGV, &handler);      /* handle segmentation faults */
    ...
    fault_happened = 0;
    fault_expected = 1;
    ... /* code that might cause a segmentation fault */
    fault_expected = 0;


Stacking a signal delivery

(diagram: the stack at the time of the exception ends with p1's parameters, p0's return address, and p1's saved registers, local variables, and computation; signal delivery pushes an additional frame containing the PC/PS at the time of the exception, the handler's parameters, the address of the signal unstacker as the return address, and then the handler's saved registers and local variables — so the signal handler sees a completely standard-appearing stack frame)

Assignments

• Projects
  – get started on 1A
    • terminal I/O mode control is a complex API
    • multi-process apps have more modes of failure
• For next Lecture
  – Arpaci C7-8 (scheduling)
  – Real-Time scheduling
  – Quiz #4 (in CCLE)


Supplementary Slides

Where Do Processes Come From?

• Created by the operating system
  – Using some method to initialize their state
  – In particular, to set up a particular program to run
• At the request of other processes
  – Which specify the program to run
  – And other aspects of their initial state
• Parent processes
  – The process that created your process
• Child processes
  – The processes your process created

Starting With a Blank Process

• Basically, create a brand new process
• The system call that creates it obviously needs to provide some information
  – everything needed to set up the process properly
  – at the minimum, what code is to be run
  – generally a lot more than that
• Other than bootstrapping, the new process is created by command of an existing process

Why Did Unix Use Forking?

• Avoids costs of copying a lot of code
  – If it's the same code as the parent's . . .
• Historical reasons
  – Parallel processing literature used a cloning fork
  – Fork allowed parallelism before threads were invented
• Practical reasons
  – Easy to manage shared resources
    • Like stdin, stdout, stderr
  – Easy to set up process pipe-lines (e.g. ls | more) — see the sketch below
  – Eases design of command shells
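A minimal sketch of how a shell might build such a pipeline (ls | wc -l is used here so the example needs no terminal interaction): the parent creates a pipe, each forked child redirects one end onto stdin or stdout and execs its program, and the shared pipe is exactly the kind of inherited resource fork makes easy.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        int fd[2];
        if (pipe(fd) < 0) { perror("pipe"); exit(1); }

        if (fork() == 0) {                   /* first child: "ls" */
            dup2(fd[1], STDOUT_FILENO);      /* stdout -> write end of pipe */
            close(fd[0]); close(fd[1]);
            execlp("ls", "ls", (char *) NULL);
            perror("execlp"); exit(1);
        }
        if (fork() == 0) {                   /* second child: "wc -l" */
            dup2(fd[0], STDIN_FILENO);       /* stdin <- read end of pipe */
            close(fd[0]); close(fd[1]);
            execlp("wc", "wc", "-l", (char *) NULL);
            perror("execlp"); exit(1);
        }
        close(fd[0]); close(fd[1]);          /* parent keeps no pipe ends */
        while (wait(NULL) > 0)               /* reap both children */
            ;
        return 0;
    }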

exec Fork isn’t what I usually want! The Call • two identical processes • A Linux/Unix system call to handle the – running the same program on the same data common case • we usually want new program in new process • Replaces a process’ existing program with a – to do a different thing in a different process different one • fork(2) and exec(2) are orthogonal operations – New code – fork(2) creates a new process – Different set of other resources – exec(2 ) loads a new program into a process – Different PC and stack • Essentially, called after you do a fork


How Does the OS Handle Exec?

• Must get rid of the child's old code
  – And its stack and data areas
  – The latter is easy if you are using copy-on-write
• Must load a brand new set of code for that process
• Must initialize the child's stack, PC, and other relevant control structures
  – To start a fresh program run for the child process

the compilation process

(diagram: source.c and header.h feed the compiler, whose output passes through the optimizer and assembler (asm.s) to produce object.o; the linkage editor combines object modules and library.a into a load module, also producing a load map)

(Compilation/Assembly)

• compiler
  – reads source code and header files
  – parses and understands "meaning" of source code
  – optimizer decides how to produce best possible code
  – code generation typically produces assembler code
• assembler
  – translates assembler directives into machine language
  – produces relocatable object modules
    • code, data, symbol tables, relocation information

Typical Object Module Format

(diagram: an object module contains a header — type, length, flags — for each of its sections: compiled code, initialized data values, a symbol table, and relocation description entries; each code/data section is a block of information that should be kept together, as a unit, in the final program)

(Relocatable Object Modules)

• code segments
  – relocatable machine language instructions
• data segments
  – non-executable initialized data, also relocatable
• symbol table
  – list of symbols defined and referenced by this module
• relocation information
  – pointers to all relocatable code and data items

object modules, symbols, & relocation

(diagram: the assembler code main.s declares "extern foo" and contains "call foo"; the resulting object module main.o has a call instruction at offset 0x108/0x10c, a symbol table entry "53: foo, unresolved", and a relocation table entry "reloc 0x10c sym=53"; the library member foo.o contains the code for foo at offset 0x40 and a symbol table entry "15: foo, 0x40, global")
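To see the same thing from the source side, here is a small illustration (the file and routine names are made up): main.c references a routine it does not define, so its object module carries an unresolved symbol and a relocation entry for the call site until the linkage editor finds the defining module.

    /* main.c — references a routine defined elsewhere */
    extern int foo(int x);          /* declared, but not defined here */

    int main(void) {
        return foo(42);             /* call site recorded in relocation info */
    }

    /* foo.c — the module (or library member) that defines foo */
    int foo(int x) {
        return x + 1;
    }

Compiling main.c by itself (cc -c main.c) produces a main.o whose symbol table lists foo as undefined; linking main.o with foo.o resolves the reference and relocates the call.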


Libraries

• programmers need not write all the code for their programs
  – standard utility functions can be found in libraries
• a library is a collection of object modules
  – a single file that contains many files (like a zip or jar)
  – these modules can be used directly, w/o recompilation
• most systems come with many standard libraries
  – system services, encryption, statistics, etc.
  – additional libraries may come with add-on products
• programmers can build their own libraries
  – functions commonly needed by parts of a product

Linkage Editing

• obtain additional modules from libraries
  – search libraries to satisfy unresolved external references
• combine all specified object modules
  – resolve cross-module references
  – copy all required modules into a single address space
  – relocate all references to point to the chosen locations
• result should be a complete load module
  – no unresolved external addresses
  – all data items assigned to specific virtual addresses
  – all code references relocated to assigned addresses

Linkage editing: resolution & relocation

• Resolution: search all specified libraries to find modules that can satisfy all unresolved external references
• Relocation: update all pointers to externally resolved symbols to correctly refer to the locations where the corresponding modules were actually loaded
(diagram: in the load module, main.o is loaded at 0x4000, so the call instruction at 0x4108/0x410c is patched to refer to 0x00006040; foo.o is loaded at 0x6000, which places the code for foo at 0x6040)

Load Modules (ELF)

(diagram: an ELF load module begins with an ELF header — version info, target ISA, number of load and info sections — followed by section headers (type, load address, length, alignment, flags) for the code segment, initialized data values, and compiled symbol table)

program loading – executable code

• load module (output of linkage editor)
  – all external references have been resolved
  – all modules combined into a few segments
  – includes multiple segments (text, data, BSS)
    • each to be loaded, contiguously, at a specified address
• a computer cannot "execute" a load module
  – computers execute instructions in memory
  – memory must be allocated for each segment
  – code must be copied from load module to memory
    • in ancient times this involved an additional relocation step
• code segments are read-only & fixed size

program loading – data segments

• programs include data as well as code
• data too must be initialized in the address space
  – memory must be allocated for each data segment
  – initial contents must be copied from the load module
  – BSS: segments to be initialized to all zeroes
• data segments are read/write & variable size
  – execution can change contents of data segments
  – program can extend the data segment to get more memory


Processes – stack frames

• modern programming languages are stack-based
  – greatly simplified procedure storage management
• each procedure call allocates a new stack frame
  – storage for procedure local (vs global) variables
  – storage for invocation parameters
  – save and restore registers
  – popped off the stack when the call returns
• most modern computers also have stack support
  – the stack too must be preserved as part of process state

Simple procedure linkage conventions

    calling routine:
        push p1         ; push first parameter
        push p2         ; push second parameter
        call foo        ; save pc, call routine
        ...
        add =8,sp       ; pop parameters
        ...

    called routine:
        foo: push r2-r6 ; save registers
             sub =12,sp ; space for locals
             ...
             mov rslt,r0 ; return value
             add =12,sp ; pop locals
             pop r2-r6  ; restore regs
             return     ; restore pc

Sample stack frames

(diagram: stack frame n-1 ends with p1's parameters and p0's return address; stack frame n, owned by the caller, holds p1's saved registers, local variables, and computation, followed by p2's parameters and p1's return address; stack frame n+1, owned by the callee, holds p2's saved registers, local variables, and computation)

Process Stacks

• size of stack depends on the activity of the program
  – grows larger as calls nest more deeply
  – amount of local storage allocated by each procedure
  – after calls return, their stack frames can be recycled
• OS manages the process's stack segment
  – stack segment created at same time as data segment
  – some allocate a fixed-size stack at program load time
  – some dynamically extend the stack as the program needs it
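A small illustration of frames being pushed and popped (nothing here is specific to any particular calling convention): each recursive call below gets its own frame holding the parameter n, the local variable, and the saved return address, and that frame is recycled as soon as the call returns.

    #include <stdio.h>

    static unsigned long factorial(unsigned int n) {
        unsigned long below;            /* local storage in this frame */
        if (n <= 1)
            return 1;
        below = factorial(n - 1);       /* pushes another stack frame */
        return n * below;               /* this frame is popped on return */
    }

    int main(void) {
        printf("10! = %lu\n", factorial(10));
        return 0;
    }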

UNIX stack space management

(diagram: the code, data, and stack segments laid out in the address space between 0x00000000 and 0xFFFFFFFF)

• Data segment starts at the page boundary after the code segment
• Stack segment starts at the high end of the address space
• Unix extends the stack automatically as the program needs more
• Data segment grows up; stack segment grows down
• Both grow towards the hole in the middle; they are not allowed to meet

Thread state and thread stacks

• each thread has its own registers, PS, PC
• each thread must have its own stack area
  – maximum size specified when the thread is created
  – a process can contain many threads
  – they cannot all grow towards a single hole
  – thread creator must know the maximum required stack size
  – stack space must be reclaimed when the thread exits
• procedure linkage conventions remain the same
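A minimal sketch of that with POSIX threads: the creator fixes each thread's maximum stack size up front via a thread attribute, and pthread_join reclaims the stack when the thread exits.

    #include <pthread.h>
    #include <stdio.h>

    static void *worker(void *arg) {
        printf("thread %ld running on its own stack\n", (long) arg);
        return NULL;
    }

    int main(void) {
        pthread_attr_t attr;
        pthread_t tid[3];

        pthread_attr_init(&attr);
        pthread_attr_setstacksize(&attr, 1 << 20);   /* 1 MB stack per thread */

        for (long i = 0; i < 3; i++)
            pthread_create(&tid[i], &attr, worker, (void *) i);
        for (int i = 0; i < 3; i++)
            pthread_join(tid[i], NULL);              /* reclaims each thread's stack */

        pthread_attr_destroy(&attr);
        return 0;
    }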


Thread Stack Allocation

(diagram: the address space from 0x00000000 to 0xFFFFFFFF holds the code and data segments, followed by fixed-size stacks for threads 1, 2, and 3, with the original process stack at the high end)

Discussion Slides
