Operating Systems Engineering
Recitation 2: Boot and first process
Based on MIT 6.828 (2014, lec3-5)

Focus on xv6
• An educational OS based on UNIX V6
• Only a few abstractions / services
  • Processes
  • File system
  • I/O (via file descriptors)

Power on ("boot") the machine
● Initial state
  – nothing in RAM, need to read kernel from disk
● BIOS in charge, starts bootloader (sheets 84-85)
  – copies first "boot" sector to 0x7c00 in memory
  – boot sector = bootasm.S + bootmain.c
  – executing at end of bootasm.S
  – stack below 0x7c00, so we can call into C
  – call bootmain

xv6 – bootloader (1)
● Bootloader in charge:
  – bootmain() – sheet 85
● Two jobs:
  – copy kernel from disk to memory (0x100000 phys addr)
    ● not read as a file: raw disk sectors
    ● linker writes the kernel in ELF format
  – jump to kernel's first instruction
    ● elf->entry
    ● objdump -f kernel or readelf -e kernel
    ● kernel.ld and entry.S

xv6 – bootloader (2)
● Why phys address 0x100000 (1MB)?
  – can't use 0x0 → memory-mapped devices in 0x0->1M
  – can we use 0x200000 (2MB)?
    ● yes! it is possible
● Bootloader can load kernel to a phys address if
  – it is a DRAM address
  – kernel must be able to find itself
● 0x100000 satisfies both conditions

xv6 – bootloader (3)
● Where does bootmain jump?
  – elf->entry from ELF header
  – not 0x100000 but 0x10000c
    ● this is "start", in entry.S, sheet 10
  – linker put 0x10000c in the ELF header
    ● (gdb) b *0x10000c (entry.S)
    ● (gdb) si
    ● continue and then si to jmp *%eax
    ● (gdb) #0 0x801033b2 in main () at main.c:19

xv6 – processes
• Process has user space memory:
  – instructions - actual computation flow
  – data - variables used in computation
  – stack - organizes procedure calls
• Per-process state private to the kernel
  – page table
  – kernel stack
  – file descriptor table

xv6 – Process Isolation
• Prevent process X from spying on Y
• Prevent process X from corrupting Y
  – Separated memory, file descriptors
  – Prevent resource exhaustion (fairness)
• Protect kernel from processes
• Defensive tactic
  – Against buggy programs
  – Against malicious programs

xv6 – Isolation Mechanisms
• User/Kernel mode flag
• System call abstraction
• Address spaces
• Timeslicing

User/Kernel Mode Flag
• Called CPL in x86
• Bottom two bits of the cs register
• CPL=0 – kernel mode – privileged
• CPL=3 – user mode – not privileged

User/Kernel Mode Flag
● CPL is the basis of almost every isolation mechanism
  – CPL in low 2 bits of CS
  – CPL=0 -> can modify cr*, devices, can use any PTE
  – CPL=3 -> can't modify cr*, or use devices, and PTE_U enforced
● Writes to control registers (cr3, for instance)
● Writes to certain flags
● Memory access
● I/O port access
● However, setting CPL=3 is not enough
● Kernel needs to manage policy

System calls
● Call from user to kernel – needs to change CPL
● Can this be done?
  – set CPL=0
  – jump to sys_open()
● How about a combined instruction that forces the user to jump to a kernel address?

System calls - x86 solution
● Kernel sets allowed entry points
● int instruction sets CPL=0 and jumps
  – saves the values of cs and eip on stack
  – system call returns with iret
  – restores old cs and eip
● Should these instructions be privileged?
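
The CPL encoding from the User/Kernel Mode Flag slides can be sketched in C. This is a minimal illustration, not xv6 code: the helper names (selector_cpl, make_selector, in_user_mode) are hypothetical, and the selector values used in the usage note are illustrative assumptions, not values taken from the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Privilege levels named on the slides. */
enum { CPL_KERNEL = 0, CPL_USER = 3 };

/* The low two bits of a code-segment selector hold the privilege level. */
static inline unsigned selector_cpl(uint16_t cs)
{
    return cs & 0x3u;
}

/* Hypothetical helper: build a GDT selector from a descriptor index and a
 * privilege level. Bits 3 and up are the index, bit 2 (table indicator)
 * stays 0 for the GDT, bits 0-1 are the privilege level. */
static inline uint16_t make_selector(unsigned index, unsigned level)
{
    assert(level <= 3);
    return (uint16_t)((index << 3) | level);
}

/* True when the selector describes user mode (CPL=3). */
static inline int in_user_mode(uint16_t cs)
{
    return selector_cpl(cs) == CPL_USER;
}
```

For example, a selector built with make_selector(1, CPL_KERNEL) is 0x08 and decodes to CPL=0, while make_selector(4, CPL_USER) is 0x23 and decodes to CPL=3. The int instruction swaps the first kind of selector into cs; iret restores the second.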
xv6 – First Process (1)
● Each process's state is kept in struct proc (2103)
● Process states
  – UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE
● Each process address space maps
  – program memory (<0x80000000)
  – kernel instructions and data (>0x80100000)

xv6 – First Process
● Each process has two stacks: user and kernel
● Thread of execution (aka thread)
  – p->kstack + code
  – p->state
  – p->pgdir
● When in user mode, kernel stack is empty
● When in kernel (syscall), user stack contains data but is not used
  – thread state stored in kernel stack:
    ● local variables
    ● return address

xv6 – Virtual Memory Layout
● 0x80000000=KERNBASE
● memlayout.h (0207)

xv6 – First Address Space
● main.c → main() sheet 12
● First process (see userinit, 2252)
  – allocproc() sheet 22, set up stack for "returning" to user space
    ● save trapret (3027)
    ● p->context->eip = forkret (2533), ret will run forkret
    ● forkret will return to trapret (3027), then to user space
  – Fill in kernel part of address space (setupkvm)
  – Fill in user part of address space
    ● 1 page containing initcode (see initcode.S)
  – Set up trapframe to exit kernel
    ● User-mode bit
    ● tf->eip = 0 (beginning of initcode.S)
    ● User stack lives at top of the 1 page of initcode
  – Set process to runnable

xv6 – trap frame
● Kernel stack after allocproc()
● Ready to return to user space
● trapasm.S (3027):

    trapret:
      popal
      popl %gs
      popl %fs
      popl %es
      popl %ds
      addl $0x8, %esp  # trapno and errcode
      iret

xv6 – First System Call exec()
● initcode.S (7708) calls exec "/init"
● Instruction int enters kernel again
● System call exec (3207)
  – replace initcode with /init binary
  – run /init which
    ● creates new console
    ● starts shell
    ● handles orphaned zombies
● system is up

Questions?
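
The address-space split from the Virtual Memory Layout slide (user memory below KERNBASE, kernel mapped above it) can be sketched as plain arithmetic. This is a minimal sketch, assuming the kernel's virtual addresses are its physical addresses plus KERNBASE. The lowercase helpers v2p, p2v and is_user_address are hypothetical names for illustration; xv6 defines its own macros in memlayout.h.

```c
#include <assert.h>
#include <stdint.h>

#define KERNBASE 0x80000000u /* first kernel virtual address (from the slides) */
#define LOADADDR 0x100000u   /* 1MB physical load address (from the slides) */

/* Kernel virtual address -> physical address (assumes a fixed offset). */
static inline uint32_t v2p(uint32_t va)
{
    assert(va >= KERNBASE);
    return va - KERNBASE;
}

/* Physical address -> kernel virtual address (assumes a fixed offset). */
static inline uint32_t p2v(uint32_t pa)
{
    assert(pa < KERNBASE); /* otherwise the sum would wrap around 32 bits */
    return pa + KERNBASE;
}

/* User program memory lives strictly below KERNBASE. */
static inline int is_user_address(uint32_t va)
{
    return va < KERNBASE;
}
```

With these definitions, p2v(LOADADDR) gives 0x80100000, the start of kernel instructions and data on the First Process slide, and v2p(0x801033b2), the address gdb reported for main, gives 0x1033b2, just above the 1MB load address.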
Backup Slides

xv6 – A Monolithic Kernel
• Kernel is a big program
• Contains all services, low level hardware mechanisms
• Entire kernel runs with full privileges
• Pros
  – easy kernel subsystem interactions
• Cons
  – complex interactions => bugs => system crash
  – no isolation in the kernel
• Unix, Linux, BSD family, Solaris, xv6

Micro Kernel
• Kernel is a small program
  – A micro kernel tries to run most services as daemons in user space.
• Only kernel runs with full privileges
• Microkernel is essentially a high speed context-switching engine
• Pros
  – complex service interactions => bugs => service crash but system alive!
  – kernel isolated from services, services isolated from user
• Cons
  – complex OS subsystem interactions using IPC
  – a lot of messaging and context switching involved
● MINIX, QNX, L4

Exo Kernel
• Kernel is a very small program
  – concept of an exokernel is orthogonal to that of micro- vs. monolithic kernels.
  – there are no forced abstractions
  – security separated from abstraction
• Kernel and users can run with full privileges
• Pros
  – simplicity and performance
  – freedom: users can implement their own optimal subsystems
• Cons
  – additional effort from users and system maintainers
● JOS, nonkernel, BareMetal OS.