CS 4414 — introduction

1 Changelog

Changes made in this version not seen in first lecture: 27 Aug 2019: remove mention of department login server being alternative for xv6, though it may be useful for other assignments

1 course webpage https://www.cs.virginia.edu/~cr4bd/4414/F2019/ linked off Collab

2 homeworks there will be programming assignments …mostly in or C++ possibly one assignment in Python one or two weeks if two weeks “checkpoint” submission after first week two week assignments worth more schedule is aggressive…

3 xv6 some assignments will use xv6, a teaching simplified OS based on an old version built by some people at MIT theoretically actually boots on real 32-bit hardware …and supports multicore! (but we’ll run it only single-core, in an emulator)

4 quizzes there will be online quizzes after each week of lecture …starting this week (due next Tuesday) same interface as CS 3330, but no time limit (haven’t seen it? we’ll talk more on Thursday) quizzes are open notes, open book, open Internet

5 exams midterm and final let us know soon if you can’t the midterm

6 textbook recommended textbook: Operating Systems: Principles and Practice no required textbook alternative: Operating Systems: Three Easy Pieces (free !) some topics we’ll cover where this may be primary textbook alternative: Silberchartz (used in previous semesters) full version: Operating System Concepts, Ninth Edition

7 cheating: homeworks don’t homeworks are individual no code from prior semesters no sharing code, pesudocode, detailed descriptions of code no code from Internet/etc., with limited exceptions tiny things solving problems that aren’t point of assignment …credited where used in your code e.g. code to split string into array for non-text-parsing assignment exception: something explicitly referred to by the assignent writeup in doubt: ask

8 cheating: quizzes don’t quizzes: also individual don’t share answers don’t IM people for answers don’t ask on StackOverflow for answers

9 getting help Piazza office hours (will be posted soon) emailing me

10 what is an operating system? (1) several overalpping definitions abstraction layer over hardware? alternative to hardware interface? several distinct jobs relating to sharing/accessing resources?

11 what is an operating system? (2) layer of software to provide access to HW abstraction of complex hardware protected access to shared resources communication security app 1 app 2 app 3 operating system hardware

12 history: computer operator

via National Library of Medicine; computer operators operating an Honeywell 800 13 what is an operating system? (3) software providing a more convenient/featureful machine interface

14 what is an operating system? (4) software performing certain jobs for computer system: textbook’s roles referee — resource sharing, protection, isolation illusionist — clean, easy abstractions glue — common services storage, window systems, authorization, networking, …

15 the virtual machine interface application virtual machine interface operating system physical machine interface hardware

system virtual machine process virtual machine (VirtualBox, VMWare, Hyper-V, …) (typical operating systems) imitate physical interface chosen for convenience (of some real hardware) (of applications)

16 system virtual machines run entire operating systems for OS development, portability interface ≈ hardware interface (but maybe not the real hardware) aid reusing existing raw hardware-targeted code different “application programmer”

17 (virtually)filesmemory — allocation open/read/write/close infinite functions “threads”∼ ( virtual interface CPus) no matterdetailsworries of number about hard organization drive of CPUs operation of “real” memory or keyboard operation or …

process virtual machine process VM real hardware thread processors memory allocation page tables files devices … …

18 filesmemory — allocation open/read/write/close functions interface no detailsworries of about hard organization drive operation of “real” memory or keyboard operation or …

process virtual machine process VM real hardware thread processors memory allocation page tables files devices … …

(virtually) infinite “threads”∼ ( virtual CPus) no matter number of CPUs

18 (virtually)files — open/read/write/close infinite “threads”∼ ( virtual interface CPus) no matterdetails of number hard drive of CPUs operation or keyboard operation or …

process virtual machine process VM real hardware thread processors memory allocation page tables files devices … …

memory allocation functions no worries about organization of “real” memory

18 (virtually)memory allocation infinite functions “threads”∼ ( virtual CPus) no matterworries number about organization of CPUs of “real” memory

process virtual machine process VM real hardware thread processors memory allocation page tables files devices … …

files — open/read/write/close interface no details of hard drive operation or keyboard operation or …

18 the abstract virtual machine applications

threads, address spaces, OS’s interface processes, files, sockets, …

operating system

interrupts, memory addresses, special registers, hardware interface memory-mapped devices, I/O buses, …

keyboard/mouse monitor disks network … hardware CPU memory

19 abstract VM: application view applications

threads, address spaces, OS’s interface processes, files, sockets, … the application’s “machine”operatingis the system operating system no hardware I/O details visible — future-proof more featureful interfaces than real hardware

20 applications abstract VM: OS view threads, address spaces, OS’s interface processes, files, sockets, …

operating system

interrupts, memory addresses, special registers, hardware interface memory-mapped devices, I/O buses, … operating system’s job: translatekeyboard/mouse one interfacemonitor todisks anothernetwork … hardware CPU memory

21 application 2

process process app 1’s memory

app 1’s registers

program → process → CPU and memory applications application 1

threads, address spaces, OS’s interface processes, files, sockets, …

operating system

interrupts, memory addresses, special registers, hardware interface memory-mapped devices, I/O buses, …

keyboard/mouse monitor disks network … hardware CPU memory

22 application 2

process

program → process → CPU and memory applications application 1

processthreads, address spaces, OS’s interface processes,app files, 1’s memory sockets, …

app 1’s registers operating system

interrupts, memory addresses, special registers, hardware interface memory-mapped devices, I/O buses, …

keyboard/mouse monitor disks network … hardware CPU memory

22 program → process → CPU and memory applications application 1 application 2

processthreads,process address spaces, OS’s interface processes,app files, 1’s memory sockets, …

app 1’s registers operating system

interrupts, memory addresses, special registers, hardware interface memory-mapped devices, I/O buses, …

keyboard/mouse monitor disks network … hardware CPU memory

22 program → process → CPU and memory applications application 1 application 2

processthreads,process address spaces, OS’s interface processes,app files, 1’s memory sockets, …

app 1’s registers operating system

interrupts, memory addresses, special registers, hardware interface memory-mapped devices, I/O buses, …

keyboard/mouse monitor disks network … hardware CPU memory

22 files → input/output applications application 1

threads, addressfiles spaces, OS’s interface processes, files, sockets, …

operating system

interrupts, memory addresses, special registers, hardware interface memory-mapped devices, I/O buses, …

keyboard/mouse monitor disks network … hardware CPU memory

23 security and protection applications application 1

threads, address spaces, OS’s interface processes,segmentation files, fault sockets, …

operating system

interrupts, memory addresses, special registers, hardware interface memory-mapped devices, I/O buses, …

keyboard/mouse monitor disks network … hardware CPU memory

24 The Process process = thread(s) + address space illusion of dedicated machine: thread = illusion of own CPU address space = illusion of own memory

25 goal: protection run multiple applications, and … keep them from crashing the OS keep them from crashing each other (keep parts of OS from crashing other parts?)

26 mechanism 1: dual-mode operation processor has two modes: kernel (privileged) and user some operations require kernel mode

OS controls what runs in kernel mode

27 mechanism 2: address translation real memory Program A mapping Program A code (set by OS) addresses Program B code Program A data Program B mapping Program B data addresses (set by OS) OS data = kernel-mode only …

trigger error

28 aside: alternate mechanisms dual mode operation and address translation are common today …so we’ll talk about them a lot not the only ways to implement operating system features (plausibly not even the most efficient…)

29 hardware support for running OS: exception need hardware support because CPU is running application instructions

problem: OS needs to respond to events keypress happens? program using CPU for too long? …

30 problem: OS needs to respond to events keypress happens? program using CPU for too long? … hardware support for running OS: exception need hardware support because CPU is running application instructions

30 exceptions and dual-mode operation rule: user code always runs in user mode rule: only OS code ever runs in kernel mode on exception: changes from user mode to kernel mode …and is only mechanism for doing so how OS controls what runs in kernel mode

31 exception terminology CS 3330 terms: interrupt: triggered by external event timer, keyboard, network, … fault: triggered by program doing something “bad” invalid memory access, divide-by-zero, … traps: triggered by explicit program action system calls aborts: something in the hardware broke

32 xv6 exception terms everything is a called a trap or sometimes an interrupt no real distinction in name about kinds

33 real world exception terms it’s all over the place… context clues

34 kernel services allocating memory? (change address space) reading/writing to file? (communicate with hard drive) read input? (communicate with keyboard) all need privileged instructions! need to run code in kernel mode

35 hardware mechanism: deliberate exceptions some instructions exist to trigger exceptions still works like normal exception starts executing OS-chosen handler …in kernel mode allows program requests privilieged instructions OS handler decides what program can request OS handler decides format of requests

36 hardware‘priviliged’ knows to operations go here because of pointerallowed set during boot (change memory layout, I/O, exceptions)

‘priviliged’ operations prohibited

system call timeline (x86-64 ) in user mode in kernel mode (the standard library) (the “kernel”) /* set arguments in registers */ mov $SYS_write, %rax mov $FILENO_stdout, %rsi mov $buffer, %rdi mov $BUFFER_LEN, %r8 /* trigger exception */ syscall // special instruction syscall_handler: /* ... save registers and actually do read and set return value ... */ /* go back to "user" code */ iret // special instruction // now use return value testq %rax, %rax 37 ... ‘priviliged’ operations allowed (change memory layout, I/O, exceptions)

‘priviliged’ operations prohibited

system call timeline (x86-64 Linux) in user mode in kernel mode (the standard library) (the “kernel”) /* set arguments in registers */ mov $SYS_write, %rax mov $FILENO_stdout, %rsi hardware knows to go here mov $buffer, %rdi mov $BUFFER_LEN, %r8 because of pointer set during boot /* trigger exception */ syscall // special instruction syscall_handler: /* ... save registers and actually do read and set return value ... */ /* go back to "user" code */ iret // special instruction // now use return value testq %rax, %rax 37 ... hardware‘priviliged’ knows to operations go here because of pointerallowed set during boot (change memory layout, I/O, exceptions)

system call timeline (x86-64 Linux) in user mode in kernel mode (the standard library) (the “kernel”) /* set arguments in registers */ mov $SYS_write, %rax mov $FILENO_stdout, %rsi mov $buffer, %rdi mov $BUFFER_LEN, %r8 /* trigger exception */ syscall // special instruction ‘priviliged’ operations syscall_handler: prohibited /* ... save registers and actually do read and set return value ... */ /* go back to "user" code */ iret // special instruction // now use return value testq %rax, %rax 37 ... hardware knows to go here because of pointer set during boot

‘priviliged’ operations prohibited

system call timeline (x86-64 Linux) in user mode in kernel mode (the standard library) (the “kernel”) /* set arguments in registers */ mov $SYS_write, %rax mov $FILENO_stdout, %rsi ‘priviliged’ operations mov $buffer, %rdi mov $BUFFER_LEN, %r8 allowed /* trigger exception */ (change memory layout, I/O, exceptions) syscall // special instruction syscall_handler: /* ... save registers and actually do read and set return value ... */ /* go back to "user" code */ iret // special instruction // now use return value testq %rax, %rax 37 ... the OS? the OS?

the classic Unix design applications standard library functions / shell commands standard libraries and libc (C standard library) the shell utility programs login login… system call interface CPU scheduler filesystems networking kernel virtual memory device drivers signals pipes swapping … hardware interface

hardware memory management unit device controllers …

38 the OS?

the classic Unix design applications standard library functions / shell commands standard libraries and libc (C standard library) the shell utility programs login login… system call interface the OS? CPU scheduler filesystems networking kernel virtual memory device drivers signals pipes swapping … hardware interface

hardware memory management unit device controllers …

38 the OS?

the classic Unix design applications standard library functions / shell commands standard libraries and libc (C standard library) the shell utility programs login login… system call interface CPU scheduler filesystems networking the OS? kernel virtual memory device drivers signals pipes swapping … hardware interface

hardware memory management unit device controllers …

38 aside: is the OS the kernel? OS = stuff that runs in kernel mode? OS = stuff that runs in kernel mode + libraries to use it? OS = stuff that runs in kernel mode + libraries + utility programs (e.g. shell, finder)? OS = everything that comes with machine? no consensus on where the line is each piece can be replaced separately…

39 xv6 we will be using an teaching OS called “xv6” based on Sixth Edition Unix modified to be multicore and use 32-bit x86 (not PDP-11)

40 xv6 setup/assignment first assignment — adding simple xv6 system call includes xv6 download instructions and link to xv6 book

41 xv6 technical requirements you will need a Linux environment we will supply one (VM on website), or get your own should also have department lab accounts (but not usable for xv6) (it’s probably possible to use OS X, but you need a cross-compiler and we don’t have instructions) …with qemu installed qemu (for us) = emulator for 32-bit x86 system Ubuntu/Debian package qemu-system-i386

42 first assignment get compiled and xv6 working …toolkit uses an emulator could run on real hardware or a standard VM, but a lot of details also, emulator lets you use GDB

43 xv6: what’s included Unix-like kernel very small set of syscalls some less featureful (e.g. exit without exit status) userspace library very limited userspace programs command line, , mkdir, echo, cat, etc. some self-testing programs

44 xv6: echo.c

#include "types.h" #include "stat.h" #include "user.h" int main(int argc, char *argv[]) { int i;

for(i = 1; i < argc; i++) printf(1, "%s%s", argv[i], i+1 < argc ? " " : "\n"); exit(); }

45 xv6: echo.c

#include "types.h" #include "stat.h" #include "user.h" int main(int argc, char *argv[]) { int i;

for(i = 1; i < argc; i++) printf(1, "%s%s", argv[i], i+1 < argc ? " " : "\n"); exit(); }

45 xv6: echo.c

#include "types.h" #include "stat.h" #include "user.h" int main(int argc, char *argv[]) { int i;

for(i = 1; i < argc; i++) printf(1, "%s%s", argv[i], i+1 < argc ? " " : "\n"); exit(); }

45 xv6 demo

46 syscalls in xv6 fork, exec, exit, wait, kill, getpid — process control open, read, write, close, fstat, dup — file operations mknod, unlink, link, chdir — directory operations …

47 intxv6 syscallerrupt —calling trigger convention: an exception similar to a keypress parametereax = syscall (0x40 numberin this case) — type of exception otherwise: same as 32-bit x86 calling convention (arguments on stack)

write syscall in xv6: user mode syscall.h usys.S ... (after macro replacement) #define SYS_write 16 #include "syscall.h" ... // ... main.c .globl write ... write: write(1, /* 16 = SYS_write */ "Hello, World!\n", movl $16, %eax 14); /* 0x40 = T_SYSCALL */ ... int $0x40 ret

49 xv6 syscall calling convention: eax = syscall number otherwise: same as 32-bit x86 calling convention (arguments on stack)

write syscall in xv6: user mode syscall.h usys.S ... (after macro replacement) #define SYS_write 16 #include "syscall.h" ... // ... main.c .globl write ... write: write(1, /* 16 = SYS_write */ "Hello, World!\n", movl $16, %eax 14); /* 0x40 = T_SYSCALL */ ... int $0x40 ret interrupt — trigger an exception similar to a keypress parameter (0x40 in this case) — type of exception

49 interrupt — trigger an exception similar to a keypress parameter (0x40 in this case) — type of exception

write syscall in xv6: user mode syscall.h usys.S ... (after macro replacement) #define SYS_write 16 #include "syscall.h" ... // ... main.c .globl write ... write: write(1, /* 16 = SYS_write */ "Hello, World!\n", movl $16, %eax 14); /* 0x40 = T_SYSCALL */ ... int $0x40 ret xv6 syscall calling convention: eax = syscall number otherwise: same as 32-bit x86 calling convention (arguments on stack)

49 (fromsetlidt1:vectors[T_SYSCALL]trap do theit to not mmu.h):—returnsT_SYSCALL use disable the to kernelalltraps interrupts(= “code0x40— during segment” OS) interrupt function syscalls to for processor to run setalltrapsmeaning:e.g.// Set to keypress pointerup restores runa normal handling in to registers kernel assemblyinterrupt can mode from function interrupt/traptf,gate thenvector64 slowdescriptor returns syscall to. user-mode //befunction- callableistrap (in: from x86.h)1 for usera wrappingtrap modegate via lidt,int0 instructionforinstructionan interrupt gate. (otherwise:(yes,con:// hardwareinterrupt makes code triggers segments jumps writinggate faulthere systemclears specifies like privileged callsFL_IF safely,more instruction)trap than moregate complicatedleavesthatFL_IF — nothingalone we care about) pro:// - slowsel: systemCode segment calls don’tselector stopfor timers,interrupt keypresses,/trap handler etc. from working //setsvectors.S- theoff:interruptOffset intrapasm.S descriptorcode segment table for interrupttrap.c/trap handler vector64:table// - dpl of :handlerDescriptor functionsalltraps:Privilegefor eachvoidLevel interrupt- type //pushl $0the privilege... level requiredtrap(structfor trapframesoftware to*tf)invoke //xv6pushl choice:$64this interruptsinterruptcallaretrap/trapdisabledgate{ duringexplicitly non-syscallusing an exceptionint instruction handling. (e.g.#definejmp don’talltrapsSETGATE(gate, worry about... istrap, keypress sel,... being off, handled d) while timer \ being handled) ... iret

write syscall in xv6: interrupt table setup trap.c (run on boot) ... lidt(idt, sizeof(idt)); ... SETGATE(idt[T_SYSCALL], 1, SEG_KCODE<<3, vectors[T_SYSCALL], DPL_USER); ...

50 (fromset1:vectors[T_SYSCALL]trap do theit to not mmu.h):returnsT_SYSCALL use disable the to kernelalltraps interrupts(= “code0x40— during segment” OS) interrupt function syscalls to for processor to run setalltrapsmeaning:e.g.// Set to keypress pointerup restores runa normal handling in to registers kernel assemblyinterrupt can mode from function interrupt/traptf,gate thenvector64 slowdescriptor returns syscall to. user-mode //be- callableistrap: from1 for usera trap modegate via,int0 forinstructionan interrupt gate. (otherwise:(yes,con:// hardwareinterrupt makes code triggers segments jumps writinggate faulthere systemclears specifies like privileged callsFL_IF safely,more instruction)trap than moregate complicatedleavesthatFL_IF — nothingalone we care about) pro:// - slowsel: systemCode segment calls don’tselector stopfor timers,interrupt keypresses,/trap handler etc. from working // vectors.S- off: Offset intrapasm.Scode segment for interrupttrap.c/trap handler vector64:// - dpl: Descriptoralltraps:Privilege voidLevel - //pushl $0the privilege... level requiredtrap(structfor trapframesoftware to*tf)invoke //xv6pushl choice:$64this interruptsinterruptcallaretrap/trapdisabledgate{ duringexplicitly non-syscallusing an exceptionint instruction handling. (e.g.#definejmp don’talltrapsSETGATE(gate, worry about... istrap, keypress sel,... being off, handled d) while timer \ being handled) ... iret

write syscall in xv6: interrupt table setup trap.c (run on boot) ... lidt(idt, sizeof(idt)); ... SETGATE(idt[T_SYSCALL], 1, SEG_KCODE<<3, vectors[T_SYSCALL], DPL_USER); ...

lidt — function (in x86.h) wrapping lidt instruction

sets the interrupt descriptor table table of handler functions for each interrupt type

50 setlidt1:vectors[T_SYSCALL]trap do theit to not—returnsT_SYSCALL use disable the to kernelalltraps interrupts(= “code0x40— during segment” OS) interrupt function syscalls to for processor to run befunctionsetalltrapsmeaning:e.g. callable to keypress pointer restores (in run from x86.h) handling in to registers kerneluser assembly wrapping mode can mode from function interrupt via lidttfint instruction,instruction thenvector64 slow returns syscall to user-mode (otherwise:(yes,con:hardware makes code triggers segments jumps writing faulthere system specifies like privileged calls safelymore instruction) than more complicatedthat — nothing we care about) setspro:vectors.S the slowinterrupt system callstrapasm.S descriptor don’t tablestop timers, keypresses,trap.c etc. from working vector64:table of handler functionsalltraps:for eachvoid interrupt type pushl $0 ... trap(struct trapframe *tf) xv6pushl choice:$64 interruptscallaretrapdisabled{ during non-syscall exception handling (e.g.jmp don’talltraps worry about... keypress... being handled while timer being handled) ... iret

write syscall in xv6: interrupt table setup trap.c (run on boot) ... lidt(idt, sizeof(idt)); ... SETGATE(idt[T_SYSCALL], 1, SEG_KCODE<<3, vectors[T_SYSCALL], DPL_USER); ...

(from mmu.h): // Set up a normal interrupt/trap gate descriptor. // - istrap: 1 for a trap gate, 0 for an interrupt gate. // interrupt gate clears FL_IF, trap gate leaves FL_IF alone // - sel: Code segment selector for interrupt/trap handler // - off: Offset in code segment for interrupt/trap handler // - dpl: Descriptor Privilege Level - // the privilege level required for software to invoke // this interrupt/trap gate explicitly using an int instruction. #define SETGATE(gate, istrap, sel, off, d) \

50 (fromlidtset1:vectors[T_SYSCALL]trap do it to not mmu.h):—returns use disable the to kernelalltraps interrupts “code— during segment” OS function syscalls for processor to run setalltrapsmeaning:e.g.// Set to keypress pointerup restores runa normal handling in to registers kernel assemblyinterrupt can mode from function interrupt/traptf,gate thenvector64 slowdescriptor returns syscall to. user-mode //function- istrap (in: x86.h)1 for a wrappingtrap gate lidt, 0 instructionfor an interrupt gate. (yes,con:// hardwareinterrupt makes code segments jumps writinggate here systemclears specifies callsFL_IF safely,moretrap than moregate complicatedleavesthatFL_IF — nothingalone we care about) pro:// - slowsel: systemCode segment calls don’tselector stopfor timers,interrupt keypresses,/trap handler etc. from working //setsvectors.S- theoff:interruptOffset intrapasm.S descriptorcode segment table for interrupttrap.c/trap handler vector64:table// - dpl of :handlerDescriptor functionsalltraps:Privilegefor eachvoidLevel interrupt- type //pushl $0the privilege... level requiredtrap(structfor trapframesoftware to*tf)invoke //xv6pushl choice:$64this interruptsinterruptcallaretrap/trapdisabledgate{ duringexplicitly non-syscallusing an exceptionint instruction handling. (e.g.#definejmp don’talltrapsSETGATE(gate, worry about... istrap, keypress sel,... being off, handled d) while timer \ being handled) ... iret

write syscall in xv6: interrupt table setup trap.c (run on boot) ... lidt(idt, sizeof(idt)); ... SETGATE(idt[T_SYSCALL], 1, SEG_KCODE<<3, vectors[T_SYSCALL], DPL_USER); ...

set the T_SYSCALL (= 0x40) interrupt to be callable from user mode via int instruction (otherwise: triggers fault like privileged instruction)

50 (fromsetlidt1:vectors[T_SYSCALL]trap do the not mmu.h):—returnsT_SYSCALL disable to alltraps interrupts(= 0x40— during OS) interrupt function syscalls to for processor to run setalltrapse.g.// Set to keypress pointerup restoresa normal handling to registers assemblyinterrupt can from function interrupt/traptf,gate thenvector64 slowdescriptor returns syscall to. user-mode //befunction- callableistrap (in: from x86.h)1 for usera wrappingtrap modegate via lidt,int0 instructionforinstructionan interrupt gate. (otherwise:con:// hardwareinterrupt makes triggers jumps writinggate faulthere systemclears like privileged callsFL_IF safely, instruction)trap moregate complicatedleaves FL_IF alone pro:// - slowsel: systemCode segment calls don’tselector stopfor timers,interrupt keypresses,/trap handler etc. from working //setsvectors.S- theoff:interruptOffset intrapasm.S descriptorcode segment table for interrupttrap.c/trap handler vector64:table// - dpl of :handlerDescriptor functionsalltraps:Privilegefor eachvoidLevel interrupt- type //pushl $0the privilege... level requiredtrap(structfor trapframesoftware to*tf)invoke //xv6pushl choice:$64this interruptsinterruptcallaretrap/trapdisabledgate{ duringexplicitly non-syscallusing an exceptionint instruction handling. (e.g.#definejmp don’talltrapsSETGATE(gate, worry about... istrap, keypress sel,... being off, handled d) while timer \ being handled) ... iret

write syscall in xv6: interrupt table setup trap.c (run on boot) ... lidt(idt, sizeof(idt)); ... SETGATE(idt[T_SYSCALL], 1, SEG_KCODE<<3, vectors[T_SYSCALL], DPL_USER); ...

set it to use the kernel “code segment” meaning: run in kernel mode (yes, code segments specifies more than that — nothing we care about)

50 (fromsetlidtvectors[T_SYSCALL]trap theit to mmu.h):—returnsT_SYSCALL use the to kernelalltraps(= “code0x40— segment” OS) interrupt function to for processor to run setalltrapsmeaning:// Set to pointerup restores runa normal in to registers kernel assemblyinterrupt mode from function/traptf,gate thenvector64descriptor returns to. user-mode //befunction- callableistrap (in: from x86.h)1 for usera wrappingtrap modegate via lidt,int0 instructionforinstructionan interrupt gate. (otherwise:(yes,// hardwareinterrupt code triggers segments jumpsgate faulthereclears specifies like privilegedFL_IF ,more instruction)trap thangate leavesthatFL_IF — nothingalone we care about) // - sel: Code segment selector for interrupt/trap handler //setsvectors.S- theoff:interruptOffset intrapasm.S descriptorcode segment table for interrupttrap.c/trap handler vector64:table// - dpl of :handlerDescriptor functionsalltraps:Privilegefor eachvoidLevel interrupt- type //pushl $0the privilege... level requiredtrap(structfor trapframesoftware to*tf)invoke //pushl $64this interruptcall trap/trap gate{ explicitly using an int instruction. #definejmp alltrapsSETGATE(gate,... istrap, sel,... off, d) \ ... iret

write syscall in xv6: interrupt table setup trap.c (run on boot) ... lidt(idt, sizeof(idt)); ... SETGATE(idt[T_SYSCALL], 1, SEG_KCODE<<3, vectors[T_SYSCALL], DPL_USER); ...

1: do not disable interrupts during syscalls e.g. keypress handling can interrupt slow syscall con: makes writing system calls safely more complicated pro: slow system calls don’t stop timers, keypresses, etc. from working

xv6 choice: interrupts are disabled during non-syscall exception handling (e.g. don’t worry about keypress being handled while timer being handled)

50 (fromsetlidt1:trap do theit to not mmu.h):—returnsT_SYSCALL use disable the to kernelalltraps interrupts(= “code0x40 during segment”) interrupt syscalls to alltrapsmeaning:e.g.// Set keypressup restores runa normal handling in registers kernelinterrupt can mode from interrupt/traptf,gate then slowdescriptor returns syscall to. user-mode //befunction- callableistrap (in: from x86.h)1 for usera wrappingtrap modegate via lidt,int0 instructionforinstructionan interrupt gate. (otherwise:(yes,con:// hardwareinterrupt makes code triggers segments jumps writinggate faulthere systemclears specifies like privileged callsFL_IF safely,more instruction)trap than moregate complicatedleavesthatFL_IF — nothingalone we care about) pro:// - slowsel: systemCode segment calls don’tselector stopfor timers,interrupt keypresses,/trap handler etc. from working //setsvectors.S- theoff:interruptOffset intrapasm.S descriptorcode segment table for interrupttrap.c/trap handler vector64:table// - dpl of :handlerDescriptor functionsalltraps:Privilegefor eachvoidLevel interrupt- type //pushl $0the privilege... level requiredtrap(structfor trapframesoftware to*tf)invoke //xv6pushl choice:$64this interruptsinterruptcallaretrap/trapdisabledgate{ duringexplicitly non-syscallusing an exceptionint instruction handling. (e.g.#definejmp don’talltrapsSETGATE(gate, worry about... istrap, keypress sel,... being off, handled d) while timer \ being handled) ... iret

write syscall in xv6: interrupt table setup trap.c (run on boot) ... lidt(idt, sizeof(idt)); ... SETGATE(idt[T_SYSCALL], 1, SEG_KCODE<<3, vectors[T_SYSCALL], DPL_USER); ...

vectors[T_SYSCALL] — OS function for processor to run set to pointer to assembly function vector64

50 (fromsetlidt1:trap do theit to not mmu.h):—returnsT_SYSCALL use disable the to kernelalltraps interrupts(= “code0x40 during segment”) interrupt syscalls to alltrapsmeaning:e.g.// Set keypressup restores runa normal handling in registers kernelinterrupt can mode from interrupt/traptf,gate then slowdescriptor returns syscall to. user-mode //befunction- callableistrap (in: from x86.h)1 for usera wrappingtrap modegate via lidt,int0 instructionforinstructionan interrupt gate. (otherwise:(yes,con:// interrupt makes code triggers segments writinggate fault systemclears specifies like privileged callsFL_IF safely,more instruction)trap than moregate complicatedleavesthatFL_IF — nothingalone we care about) // - sel: Code segment selector for interrupt/trap handler //setspro:- the slowoff:interrupt systemOffset in calls descriptorcode don’tsegment tablestop timers,for interrupt keypresses,/trap etc.handler from working table// - dpl of :handlerDescriptor functionsPrivilegefor eachLevel interrupt- type // the privilege level required for software to invoke //xv6 choice:this interruptsinterruptare/trapdisabledgate duringexplicitly non-syscallusing an exceptionint instruction handling. (e.g.#define don’tSETGATE(gate, worry about istrap, keypress sel, being off, handled d) while timer \ being handled)

write syscall in xv6: interrupt table setup trap.c (run on boot) ... lidt(idt, sizeof(idt)); ... SETGATE(idt[T_SYSCALL], 1, SEG_KCODE<<3, vectors[T_SYSCALL], DPL_USER); ...

vectors[T_SYSCALL] — OS function for processor to run set to pointer to assembly function vector64 hardware jumps here vectors.S trapasm.S trap.c vector64: alltraps: void pushl $0 ... trap(struct trapframe *tf) pushl $64 call trap { jmp alltraps ...... iret

50 syscall()structmyproc() trapframe — actual— pseudo-global — implementations set by assembly variable usesrepresentsinterruptmyproc()->tf type, currentlyapplication runningto determine registers process,… whatexample: operationtf−>eax to do= for old program value of eax much more on this later in semester

write syscall in xv6: the trap function trap.c void trap(struct trapframe *tf) { if(tf−>trapno == T_SYSCALL){ if(myproc()−>killed) exit(); myproc()−>tf = tf; syscall(); if(myproc()−>killed) exit(); return; } ... }

51 syscall()myproc() — actual— pseudo-global implementations variable usesrepresentsmyproc()->tf currently runningto determine process what operation to do for program much more on this later in semester

write syscall in xv6: the trap function trap.c void struct trapframe — set by assembly trap(struct trapframe *tf) { interrupt type, application registers,… if(tf−>trapno == T_SYSCALL){ example: tf−>eax = old value of eax if(myproc()−>killed) exit(); myproc()−>tf = tf; syscall(); if(myproc()−>killed) exit(); return; } ... }

51 syscall()struct trapframe — actual — implementations set by assembly usesinterruptmyproc()->tf type, applicationto determine registers,… whatexample: operationtf−>eax to do= for old program value of eax

write syscall in xv6: the trap function trap.c void myproc() — pseudo-global variable trap(struct trapframe *tf) { represents currently running process if(tf−>trapno == T_SYSCALL){ if(myproc()−>killed) exit(); much more on this later in semester myproc()−>tf = tf; syscall(); if(myproc()−>killed) exit(); return; } ... }

51 structmyproc() trapframe— pseudo-global — set by assembly variable representsinterrupt type, currentlyapplication running registers process,… example: tf−>eax = old value of eax much more on this later in semester

write syscall in xv6: the trap function trap.c void syscall() — actual implementations trap(struct trapframe *tf) { uses myproc()->tf to determine if(tf−>trapno == T_SYSCALL){ if(myproc()−>killed) what operation to do for program exit(); myproc()−>tf = tf; syscall(); if(myproc()−>killed) exit(); return; } ... }

51 (ifarrayresult system of assigned functions call number to eax— one in range)for syscall call(assembly sys_…function code this from returns table to store‘copies[number] resulttf−>eax in value user’sintoeax’:%eax syscalls[number]register) = value

write syscall in xv6: the syscall function syscall.c static int (*syscalls[])(void) = { ... [SYS_write] sys_write, ... };

... void syscall(void) { ... num = curproc−>tf−>eax; if(num > 0 && num < NELEM(syscalls) && syscalls[num]) { curproc−>tf−>eax = syscalls[num](); } else { ...

52 (ifresult system assigned call number to eax in range) call(assembly sys_…function code this from returns table to storecopies resulttf−>eax in user’sintoeax%eaxregister)

write syscall in xv6: the syscall function syscall.c static int (*syscalls[])(void) = { ... [SYS_write] sys_write, ... }; array of functions — one for syscall

... ‘[number] value’: syscalls[number] = value void syscall(void) { ... num = curproc−>tf−>eax; if(num > 0 && num < NELEM(syscalls) && syscalls[num]) { curproc−>tf−>eax = syscalls[num](); } else { ...

52 arrayresult of assigned functions to eax— one for syscall (assembly code this returns to ‘copies[number]tf−>eax valueinto’:%eax syscalls[number]) = value

write syscall in xv6: the syscall function syscall.c static int (*syscalls[])(void) = { ... [SYS_write] sys_write, ... }; (if system call number in range) call sys_…function from table ... store result in user’s eax register void syscall(void) { ... num = curproc−>tf−>eax; if(num > 0 && num < NELEM(syscalls) && syscalls[num]) { curproc−>tf−>eax = syscalls[num](); } else { ...

52 (ifarray system of functions call number — one in range)for syscall call sys_…function from table store‘[number] result in value user’s eax’: syscalls[number]register = value

write syscall in xv6: the syscall function syscall.c static int (*syscalls[])(void) = { ... [SYS_write] sys_write, ... }; result assigned to eax (assembly code this returns to ... copies tf−>eax into %eax) void syscall(void) { ... num = curproc−>tf−>eax; if(num > 0 && num < NELEM(syscalls) && syscalls[num]) { curproc−>tf−>eax = syscalls[num](); } else { ...

52 utilityactual functions internal function that read that arguments implements from writing user’s to stack a file returns(the terminal -1 on errorcounts (e.g. as a stack file) pointer invalid) (more on this later) (note: 32-bit x86 calling convention puts all args on stack)

write syscall in xv6: sys_write sysfile.c int sys_write(void) { struct file *f; int n; char *p;

if(argfd(0, 0, &f) < 0 || argint(2, &n) < 0 || argptr(1, &p, n) < 0) return −1; return filewrite(f, p, n); }

53 actual internal function that implements writing to a file (the terminal counts as a file)

write syscall in xv6: sys_write sysfile.c int sys_write(void) { struct file *f; int n; char *p;

if(argfd(0, 0, &f) < 0 || argint(2, &n) < 0 || argptr(1, &p, n) < 0) return −1; return filewrite(f, p, n); } utility functions that read arguments from user’s stack returns -1 on error (e.g. stack pointer invalid) (more on this later) (note: 32-bit x86 calling convention puts all args on stack)

53 utility functions that read arguments from user’s stack returns -1 on error (e.g. stack pointer invalid) (more on this later) (note: 32-bit x86 calling convention puts all args on stack)

write syscall in xv6: sys_write sysfile.c int sys_write(void) { struct file *f; int n; char *p;

if(argfd(0, 0, &f) < 0 || argint(2, &n) < 0 || argptr(1, &p, n) < 0) return −1; return filewrite(f, p, n); } actual internal function that implements writing to a file (the terminal counts as a file)

53 (fromsetlidt1:vectors[T_SYSCALL] do theit to not mmu.h):—T_SYSCALL use disable the kernel interrupts(= “code0x40— during segment” OS) interrupt function syscalls to for processor to run setmeaning:e.g.// Set to keypress pointerup runa normal handling in to kernel assemblyinterrupt can mode function interrupt/trap gatevector64 slowdescriptor syscall . //befunction- callableistrap (in: from x86.h)1 for usera wrappingtrap modegate via lidt,int0 instructionforinstructionan interrupt gate. (otherwise:(yes,con:// interrupt makes code triggers segments writinggate fault systemclears specifies like privileged callsFL_IF safely,more instruction)trap than moregate complicatedleavesthatFL_IF — nothingalone we care about) // - sel: Code segment selector for interrupt/trap handler //setspro:- the slowoff:interrupt systemOffset in calls descriptorcode don’tsegment tablestop timers,for interrupt keypresses,/trap etc.handler from working table// - dpl of :handlerDescriptor functionsPrivilegefor eachLevel interrupt- type // the privilege level required for software to invoke //xv6 choice:this interruptsinterruptare/trapdisabledgate duringexplicitly non-syscallusing an exceptionint instruction handling. (e.g.#define don’tSETGATE(gate, worry about istrap, keypress sel, being off, handled d) while timer \ being handled)

write syscall in xv6: interrupt table setup trap.c (run on boot) ... lidt(idt, sizeof(idt)); ... SETGATE(idt[T_SYSCALL], 1, SEG_KCODE<<3, vectors[T_SYSCALL], DPL_USER); ...

trap returns to alltraps alltraps restores registers from tf, then returns to user-mode hardware jumps here vectors.S trapasm.S trap.c vector64: alltraps: void pushl $0 ... trap(struct trapframe *tf) pushl $64 call trap { jmp alltraps ...... iret

54 user mode + stack

kernel mode + stack

write syscall in xv6 user program return from interrupt HW switches stacks return via trap() function call: write() C function: sys_write() syscall wrapper (int $0x40) read args from user stack

trigger exception using syscall # from eax

interrupt table C function: syscall() HW does lookup C function: trap() HW switches stacks + calls

assembly func: vector64() save regs in trapframe then function call 57 write syscall in xv6 user program user mode return from interrupt HW switches stacks + stack return via trap() function call: write() C function: sys_write() syscall wrapper (int $0x40) read args from user stack

trigger exception using syscall # from eax

interrupt table kernel mode C function: syscall() HW does lookup + stack C function: trap() HW switches stacks + calls

assembly func: vector64() save regs in trapframe then function call 57 user mode + stack

kernel mode + stack

write syscall in xv6 user program return from interrupt HW switches stacks return via trap() function call: write() C function: sys_write() syscall wrapper (int $0x40) read args from user stack

trigger exception using syscall # from eax

interrupt table C function: syscall() HW does lookup C function: trap() HW switches stacks + calls

assembly func: vector64() save regs in trapframe then function call 58 xv6intro homework get familiar with xv6 OS add a new system call: writecount() returns total number of times write call happened

59 homework steps system call implementation: sys_writecount hint in writeup: imitate sys_uptime need a counter for number of writes add writecount to several tables/lists (list of handlers, list of library functions to create, etc.) recommendation: imitate how other system calls are listed create a userspace program that calls writecount recommendation: copy from given programs

60 note on locks some existing code uses acquire/release you do not have to do this only for multiprocessor support

…but, copying what’s done for ticks would be correct

61 = kernel-mode only

recall: address translation real memory Program A mapping Program A code (set by OS) addresses Program B code Program A data Program B mapping Program B data addresses (set by OS) OS data …

trigger error

63 larger addresses are for kernel (accessible in kernel mode only) kernel stack allocated here smaller addresses are for applications processorone kernelswitches stack per stacks process whenchange execption/interrupt/…happens which one exceptions use locationas part of of switching stack stored which processes inis active special on “task a processor state selector”

xv6 memory layout

Virtual 4 Gig

Device memory RW-

0xFE000000

Unused if less than 2 Gig of physical memory

Free memory RW-- end

Kernel data RW-

Kernel text R-- + 0x100000 Physical RW- 4 Gig KERNBASE Memory-mapped 32-bit I/O devices PHYSTOP

Unused if less than 2 Gig of physical memory Program data & heap

At most 2 Gig

Extended memory PAGESIZE User stack RWU

User data RWU 0x100000 I/O space 640K User text RWU Base memory 0 0

64 kernel stack allocated here processorone kernelswitches stack per stacks process whenchange execption/interrupt/…happens which one exceptions use locationas part of of switching stack stored which processes inis active special on “task a processor state selector”

xv6 memory layout

Virtual 4 Gig

Device memory RW- 0xFE000000 larger addresses are for kernel Unused if less than 2 Gig of physical memory (accessible in kernel mode only)

Free memory RW-- end Kernel data RW- smaller addresses are for applications Kernel text R-- + 0x100000 Physical RW- 4 Gig KERNBASE Memory-mapped 32-bit I/O devices PHYSTOP

Unused if less than 2 Gig of physical memory Program data & heap

At most 2 Gig

Extended memory PAGESIZE User stack RWU

User data RWU 0x100000 I/O space 640K User text RWU Base memory 0 0

64 larger addresses are for kernel (accessible in kernel mode only)

smaller addresses are for applications one kernel stack per process change which one exceptions use as part of switching which processes is active on a processor

xv6 memory layout

Virtual 4 Gig

Device memory RW-

0xFE000000

Unused if less than 2 Gig of physical memory

Free memory RW-- kernel stack allocated here end

Kernel data RW- Kernel text R-- processor switches stacks + 0x100000 Physical RW- 4 Gig KERNBASE when execption/interrupt/…happens Memory-mapped 32-bit I/O devices location of stack stored PHYSTOP

Unused if less than 2 Gig of physical memory in special “task state selector” Program data & heap

At most 2 Gig

Extended memory PAGESIZE User stack RWU

User data RWU 0x100000 I/O space 640K User text RWU Base memory 0 0

64 larger addresses are for kernel (accessible in kernel mode only)

smaller addresses are for applications processor switches stacks when execption/interrupt/…happens location of stack stored in special “task state selector”

xv6 memory layout

Virtual 4 Gig

Device memory RW-

0xFE000000

Unused if less than 2 Gig of physical memory

Free memory RW-- kernel stack allocated here end

Kernel data RW- Kernel text R-- one kernel stack per process + 0x100000 Physical RW- 4 Gig KERNBASE change which one exceptions use Memory-mapped 32-bit I/O devices as part of switching which processes PHYSTOP

Unused if less than 2 Gig of physical memory is active on a processor Program data & heap

At most 2 Gig

Extended memory PAGESIZE User stack RWU

User data RWU 0x100000 I/O space 640K User text RWU Base memory 0 0

64 aside: nested exceptions x86 switches to kernel stack on exception… assuming it’s switching to kernel mode system call or timer interrupt in user mode start at top of kernel stack timer interrupt during system call continue using current kernel stack

65 66 backup slides

67 competing applications — failures, malicious applications text editor shouldn’t need to know if browser is running varying hardware — diverse and changing interfaces different keyboard interfaces, disk interfaces, video interfaces, etc. applications shouldn’t change

common goal: hide complexity hiding complexity

68 common goal: hide complexity hiding complexity competing applications — failures, malicious applications text editor shouldn’t need to know if browser is running varying hardware — diverse and changing interfaces different keyboard interfaces, disk interfaces, video interfaces, etc. applications shouldn’t change

68 common goal: for application programmer write once for lots of hardware avoid reimplementing common functionality don’t worry about other programs

69