
COS 318: Operating Systems

Protection and Virtual Memory

Jaswinder Pal Singh and a Fabulous Course Staff
Computer Science Department
Princeton University

(http://www.cs.princeton.edu/courses/cos318/)


Outline

u Protection Mechanisms and OS Structures
u Virtual Memory: Protection and Address Translation

Some Protection Goals

u CPU
  l Enable the kernel to take the CPU away, to prevent a user from using the CPU forever
  l Users should not have this ability
u Memory
  l Prevent a user from accessing others' data
  l Prevent users from modifying kernel code and data structures
u I/O
  l Prevent users from performing "illegal" I/Os
u Question
  l What's the difference between protection and security?


Architecture Support for CPU Protection

u Privileged mode
  l User mode: regular instructions; access to user memory
  l Kernel (privileged) mode: regular instructions and privileged instructions; access to user memory and kernel memory
u Enter kernel mode via an interrupt or exception/trap (INT)
u Return to user mode via a special instruction (IRET)

Privileged Instruction Examples

u Memory address mapping
u Flush or invalidate data cache
u Invalidate TLB entries
u Load and read system registers
u Change modes from kernel to user
u Change the voltage and frequency of the processor
u Halt a processor
u Reset a processor
u Perform I/O operations

u Other architectural support for protection in the system?


OS Structures and Protection: Monolithic

u All kernel routines are together, linked in a single large executable
  l Each can call any other
  l Services and utilities
u Provides a system-call API (user programs enter the kernel via syscall)
u Examples: BSD, Windows, ...
u Pros
  l Shared kernel space
  l Good performance
u Cons
  l Instability: a crash in any procedure brings the system down
  l Unwieldy: difficult to maintain and extend

Layered Structure

u Hiding information at each layer
u Layered dependency: hardware at the bottom, then Level 1, Level 2, ..., Level N
u Examples
  l THE (6 layers)
    • Mostly for functionality splitting
  l MS-DOS (4 layers)
u Pros
  l Layered abstraction
    • Separation of concerns, elegance
u Cons
  l Inefficiency
  l Inflexibility


Possible Implementation: Protection Rings

u Privileged instructions can be executed only when the current privilege level (CPL) is 0
u Ring layout
  l Level 0: kernel
  l Levels 1-2: operating system services
  l Level 3: applications

Microkernel Structure

u Services are regular processes
u The microkernel obtains services for users by messaging with the service processes (user program -> syscall -> microkernel -> services)
u Examples: Taos, L4, OS-X, ...
u Pros?
  l Flexibility to modify services
  l Fault isolation
u Cons?
  l Inefficient (boundary crossings)
  l Inconvenient to share data between kernel and services


Virtual Machine

u A virtual machine monitor (VMM)
  l Virtualizes the hardware
  l Runs several OSes (Apps on OS 1 ... OS k, on VM1 ... VMk, on the Virtual Machine Monitor, on the raw hardware)
u Examples
  l IBM VM/370
  l VMWare, Xen
u What would you use a virtual machine for?
  l Just shifts the problem to a level with less protection?
  l Testing?

Memory Protection

u Kernel vs. user mode, plus
u Virtual address spaces and address translation

              Physical memory         Abstraction: virtual memory
  Protection  None                    Every program isolated from all others and from the OS
  Size        Limited                 Illusion of "infinite" memory
  Sharing     Visible to programs     Transparent --- can't tell if physical memory is shared

u Goals
  l Run programs efficiently
  l Make the system safe
u Virtual addresses are translated to physical addresses


The Big Picture

u DRAM is fast, but relatively expensive
u Disk is inexpensive, but slow
  l 100X less expensive
  l 100,000X longer latency
  l 1000X less bandwidth

  CPU -- Memory -- Disk

Issues

u Many processes
  l The more processes a system can support, the better
u Memory size
  l Many processes, whose total size may exceed memory
  l Even one process may exceed physical memory size
u Protection
  l A user process should not crash the system
  l A user process should not do bad things to other processes


Consider A Simple System

u Only physical memory
  l Applications use it directly
u Run three processes
  l Email, browser, gcc
u Memory layout (top to bottom): OS at x9000, email at x7000, browser at x5000, gcc at x2500, free space below, down to x0000
u What if
  l browser writes at x7050?
  l email needs to expand?
  l browser needs more memory than is on the machine?

Need to Handle

u Protection
  l Errors/malice in one process should not affect others
u Finiteness
  l Not having the entire application/data in memory at once
  l Not having the programmer worry about it (too much)


Handling Protection

u For each process, check each load and store instruction to allow only legal memory references

  CPU --address--> Check --> Physical memory --> data
                     (illegal reference --> error)

Handling Finiteness

u A process should be able to run regardless of physical memory size or where its data are physically placed
u Give each process a "fake" address space that is large, contiguous, and entirely its own
u As a process runs, relocate or map each load and store to addresses in actual physical memory

  CPU --address--> Check & relocate --> Physical memory --> data


Virtual Memory

u Flexible
  l Processes (and data) can move in memory as they execute, and can be part in memory and part on disk
u Simple
  l Applications generate loads and stores to addresses in the contiguous, large, "fake" address space
u Efficient
  l 20/80 rule: 20% of memory gets 80% of references
  l Keep the 20% in physical memory (a form of caching)
u Protective
  l Protection check integrated with the translation mechanism

Address Mapping and Granularity

u Must have some "mapping" mechanism
  l Map virtual to physical addresses in RAM or disk
  l Each process has its own memory space [0, high]
u Mapping must have some granularity
  l Finer granularity provides more flexibility
  l Finer granularity requires more mapping information


Generic Address Translation: the MMU

u CPU view
  l Virtual addresses
u Memory or I/O device view
  l Physical addresses
u The Memory Management Unit (MMU) translates the virtual address into a physical address for each load and store

  CPU --virtual address--> MMU --physical address--> Physical memory / I/O device

u A combination of hardware and (privileged) software controls the translation

Where to Keep Translation Information?

Goals of translation
u Implicit translation for each memory reference
u A hit should be very fast
u Trigger an exception on a miss
u Protect from user's errors

  Memory hierarchy (relative access times):
    Registers
    L1               2-4x
    L2-L3            10-20x
    Memory           100-500x
    Paging to disk   20M-30Mx


Address Translation Methods

u Base and Bound
u Segmentation
u Paging
u Multilevel translation
u Inverted tables

Base and Bound (or Limit)

u Protection
  l A process can only access physical memory in [base, base+bound]
u On a context switch
  l Save/restore the base and bound registers
u Pros
  l Simple
  l Inexpensive (hardware cost: 2 registers, adder, comparator)
u Cons
  l Can't fit all processes in memory; have to swap
  l Fragmentation in memory
  l Relocate processes when they grow?
  l Compare and add on every instruction
  l Very coarse grained

Translation: if virtual address > bound, raise an error; otherwise physical address = virtual address + base. The process's code, data, and stack occupy virtual addresses [0, bound), mapped to physical addresses [base, base+bound) --- e.g., base = 6250 in the figure. Each program is loaded into a contiguous region of physical memory. Example: the Cray-1.

Why not have multiple contiguous segments for each process, and keep their base/bound data in hardware?
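The compare-and-add logic can be sketched in a few lines (a toy software model, not hardware; the base value 6250 follows the figure, and the bound is an illustrative choice):

```python
def translate_base_bound(vaddr, base, bound):
    """Base-and-bound translation: check against bound, relocate by base."""
    if vaddr >= bound:              # comparator: reference outside [0, bound)
        raise MemoryError("address %#x outside [0, %#x)" % (vaddr, bound))
    return base + vaddr             # adder: relocate into [base, base+bound)

# Base 6250 as in the figure; a 4 KB bound chosen for illustration.
print(translate_base_bound(0x100, base=6250, bound=0x1000))  # 6250 + 256
```

Note that both the check and the add happen on every single memory reference, which is why the hardware support (two registers, an adder, and a comparator) must be on the critical path.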

Segmentation

u Every process has a table of (seg, size) entries for its segments
u Treats each (seg, size) as a finer-grained (base, bound)
u Protection
  l Every entry contains access rights
u On a context switch
  l Save/restore the table in kernel memory
u Pros
  l Programmer knows the program and so its segments, therefore can be efficient
  l Easy to share data
u Cons
  l Complex management
  l Fragmentation


Segmentation Example (assume 2-bit segment ID, 12-bit segment offset)

  Virtual address = v-segment # | segment offset

  segment       start    size
  code  (00)    0x4000   0x700
  data  (01)    0x0      0x500
  -     (10)    0x0      0x0
  stack (11)    0x2000   0x1000

  If offset > segment size: error; otherwise physical address = segment start + offset.

  Resulting mapping:
    virtual 0x0000-0x06ff (code)  -> physical 0x4000-0x46ff
    virtual 0x1000-0x14ff (data)  -> physical 0x0000-0x04ff
    virtual 0x3000-0x3fff (stack) -> physical 0x2000-0x2fff
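The example's segment table and check can be written out as a small executable sketch (a toy model of the hardware lookup, using the 2-bit segment ID and 12-bit offset from the example):

```python
# Segment table from the example: entry = (start, size); size 0 = unused.
SEG_TABLE = {
    0b00: (0x4000, 0x700),   # code
    0b01: (0x0000, 0x500),   # data
    0b10: (0x0000, 0x000),   # unused
    0b11: (0x2000, 0x1000),  # stack
}

def translate_segmented(vaddr):
    seg = (vaddr >> 12) & 0b11      # top 2 bits select the segment
    offset = vaddr & 0xFFF          # low 12 bits are the segment offset
    start, size = SEG_TABLE[seg]
    if offset >= size:              # bound check against the segment size
        raise MemoryError("offset %#x exceeds segment size %#x" % (offset, size))
    return start + offset           # relocate by the segment start

print(hex(translate_segmented(0x0240)))   # code segment: 0x4000 + 0x240
```

This reproduces the mapping above, e.g. virtual 0x1108 in the data segment translates to physical 0x108.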

Segmentation Example (Cont'd)

  Virtual memory for strlen(x):        Physical memory for strlen(x):

  Main:   240  store 1108, r2          Main:   4240  store 1108, r2
          244  store pc+8, r31                 4244  store pc+8, r31
          248  jump 360                        4248  jump 360
          24c  ...                             424c  ...
  strlen: 360  loadbyte (r2), r3       strlen: 4360  loadbyte (r2), r3
          ...                                  ...
          420  jump (r31)                      4420  jump (r31)
  x:      1108 a b c \0                x:      108  a b c \0


Segmentation

u Pros
  l Provides logical protection: the programmer "knows the program" and therefore how to design and manage segments
  l Therefore efficient
  l Easy to share data
u Cons
  l Complex management, programmer burden
  l Fragmentation
  l Difficult to find the right granularity balance between ease of use and efficiency

Paging

u Use a fixed-size unit called a page instead of a variable-size segment
u Use a page table to translate
u Various bits in each entry
u Context switch
  l Similar to segmentation
u What should the page size be?
u Pros
  l Simple allocation
  l Easy to share
u Cons
  l Big table
  l PTEs even for big holes in virtual memory


Paging Example (page size: 4 bytes)

  Virtual address = VPage # | offset
  If VPage # exceeds the page table size: error; otherwise physical address = PPage # | offset.

  Page table:       Virtual memory:      Physical memory:
  VP#  PP#          0   a b c d          4   i j k l
  0    4            4   e f g h          12  e f g h
  1    3            8   i j k l          16  a b c d
  2    1
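The example's translation can be sketched as a toy model (the page-table contents and the 4-byte page size are the ones shown above):

```python
PAGE_SIZE = 4                      # from the example: 4-byte pages
PAGE_TABLE = {0: 4, 1: 3, 2: 1}    # VPage # -> PPage #

# Physical memory laid out as in the example.
physical = bytearray(20)
physical[16:20] = b"abcd"          # physical page 4
physical[12:16] = b"efgh"          # physical page 3
physical[4:8]   = b"ijkl"          # physical page 1

def translate_paged(vaddr):
    vpn, offset = divmod(vaddr, PAGE_SIZE)   # split into VPage # | offset
    if vpn not in PAGE_TABLE:                # no PTE for this page -> error
        raise MemoryError("invalid virtual page %d" % vpn)
    return PAGE_TABLE[vpn] * PAGE_SIZE + offset

print(chr(physical[translate_paged(0)]))   # virtual byte 0 is 'a'
```

Because pages are fixed-size, allocation is simple: any free physical frame can back any virtual page, with no fragmentation inside physical memory beyond the last partial page.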

How Many PTEs Do We Need?

u Assume a 4KB page
  l Needs the "low order" 12 bits to address a byte within a page
u Worst case for a 32-bit address machine
  l 2^20 PTEs per page table (~4 MBytes), times the number of processes
  l With 10K processes, the page tables won't even fit in memory together
u What about a 64-bit address machine?
  l 2^52 PTEs per page table, times the number of processes
  l A single page table cannot even fit on a disk (2^52 PTEs = 16 PBytes)!


Segmentation with Paging

  Virtual address = Vseg # | VPage # | offset

u Every segment has its own page table
u The Vseg # selects a (page table, size) entry in the segment table; if VPage # exceeds the size: error; otherwise the segment's page table maps VPage # to PPage #, and physical address = PPage # | offset
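The arithmetic above is easy to check (assuming, as the ~4 MB and 16 PB figures imply, 4-byte PTEs):

```python
PAGE_BITS = 12                     # 4 KB pages: 12 offset bits
PTE_BYTES = 4                      # assumed PTE size

for addr_bits in (32, 64):
    n_ptes = 2 ** (addr_bits - PAGE_BITS)   # one PTE per virtual page
    table_bytes = n_ptes * PTE_BYTES
    print("%d-bit: %d PTEs, %d bytes of page table" %
          (addr_bits, n_ptes, table_bytes))
```

For 32 bits this gives 2^20 PTEs and 4 MB per process; for 64 bits, 2^52 PTEs and 2^54 bytes (16 PB), which is why flat page tables are hopeless for large address spaces and multi-level or inverted tables are used instead.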

Multiple-Level Page Tables

  Virtual address = dir | table | offset

  The dir field indexes a directory, whose entry points to a second-level page table; the table field indexes that page table to find the PTE.

u What does this buy us?


80386 Address Translation

u Segmentation with paging, with a two-level paging scheme
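What the two-level scheme buys us is that second-level tables need exist only for regions of the address space that are actually mapped; large holes cost nothing. A sketch with hypothetical mappings, assuming the 80386-style 10 + 10 + 12 bit split:

```python
DIR_BITS, TABLE_BITS, OFFSET_BITS = 10, 10, 12

# Directory maps a dir index to a second-level table (here a dict VPN -> frame).
# Only dir entry 0 exists; the rest of the address space needs no tables at all.
directory = {0: {0: 7, 1: 9}}      # hypothetical mappings for illustration

def translate_2level(vaddr):
    d = vaddr >> (TABLE_BITS + OFFSET_BITS)          # top 10 bits: directory index
    t = (vaddr >> OFFSET_BITS) & ((1 << TABLE_BITS) - 1)  # next 10: table index
    offset = vaddr & ((1 << OFFSET_BITS) - 1)        # low 12: byte in page
    table = directory.get(d)
    if table is None or t not in table:              # missing at either level
        raise MemoryError("unmapped address %#x" % vaddr)
    return (table[t] << OFFSET_BITS) | offset

print(hex(translate_2level(0x1234)))   # dir 0, table 1 -> frame 9
```

Note the trade-off: a miss now requires two memory references (directory, then page table) instead of one, which is part of why TLBs matter.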

Inverted Page Tables

u Main idea
  l One PTE for each physical page frame
  l Hash (Vpage, pid) to an index k; the entry at k holds (pid, vpage), and the physical address is k | offset
u Pros
  l Small page table for a large address space
u Cons
  l Lookup is difficult
  l Overhead of managing hash chains, etc.


Making Translation Lookups Faster: TLBs

u Programs only know virtual addresses
  l Each program or process address space starts from 0 and runs to some high address
u Each virtual address must be translated
  l May involve walking through a hierarchical page table
  l Since the page table is in memory, a program memory access may require several actual memory accesses
u Solution
  l Cache recent virtual-to-physical translations, i.e. the "active" part of the page table, in a very fast memory
  l If a virtual address hits in the TLB, use the cached translation
  l Typically a fully associative cache: match against all entries
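The inverted-table idea can be sketched as a toy model (linear probing stands in for real hash chains, and the hash function and mappings are purely illustrative):

```python
N_FRAMES = 8
inverted = [None] * N_FRAMES       # one entry per physical frame: (pid, vpage)

def slots_for(pid, vpage):
    """Probe sequence starting at hash(pid, vpage) mod table size."""
    k = (pid * 31 + vpage) % N_FRAMES   # toy hash for illustration
    for probe in range(N_FRAMES):
        yield (k + probe) % N_FRAMES

def map_page(pid, vpage):
    for i in slots_for(pid, vpage):
        if inverted[i] is None:
            inverted[i] = (pid, vpage)
            return i                    # the slot index *is* the frame number

def lookup(pid, vpage):
    for i in slots_for(pid, vpage):
        if inverted[i] == (pid, vpage):
            return i                    # physical address = i | offset
    raise MemoryError("page fault")

frame = map_page(42, 0x10)
assert lookup(42, 0x10) == frame
```

The table size is proportional to physical memory (one entry per frame), not to the size of each virtual address space, which is the whole point; the cost is that lookup is a search rather than an index.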

TLB and Page Table Translation

  Processor --virtual address--> TLB
    Hit:  TLB supplies the frame --> physical address --> physical memory --> data
    Miss: walk the page table
      Valid PTE:   load the frame into the TLB, retry
      Invalid PTE: raise an exception (page fault)


Translation Look-aside Buffer (TLB)

  Virtual address = VPage # | offset
  Match VPage # against the (VPage#, PPage#, ...) entries in the TLB:
    Hit:  physical address = PPage # | offset
    Miss: look up the real page table
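The TLB-then-page-table flow above can be sketched as a small simulation (a toy software model, assuming a 2-entry TLB with LRU replacement and a page table borrowed from the earlier paging example):

```python
from collections import OrderedDict

PAGE_TABLE = {0: 4, 1: 3, 2: 1}    # VPN -> frame (from the paging example)
TLB_SIZE = 2
tlb = OrderedDict()                # insertion/access order gives us LRU
stats = {"hit": 0, "miss": 0}

def translate(vpn):
    if vpn in tlb:                 # TLB hit: use the cached translation
        stats["hit"] += 1
        tlb.move_to_end(vpn)       # mark as most recently used
        return tlb[vpn]
    stats["miss"] += 1             # TLB miss: walk the page table
    if vpn not in PAGE_TABLE:
        raise MemoryError("page fault")
    frame = PAGE_TABLE[vpn]
    if len(tlb) >= TLB_SIZE:       # evict the least recently used entry
        tlb.popitem(last=False)
    tlb[vpn] = frame               # cache the translation and retry-free return
    return frame

for vpn in (0, 0, 1, 0, 2, 2):
    translate(vpn)
print(stats)                       # repeated references hit in the TLB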

Bits in a TLB Entry

u Common (necessary) bits
  l Virtual page number
  l Physical page number: the translated address
  l Valid bit
  l Access bits: kernel and user (none, read, write)
u Optional (useful) bits
  l Process tag
  l Reference bit
  l Modify bit
  l Cacheable bit


Hardware-Controlled TLB

u On a TLB miss
  l If the page containing the PTE is valid (in memory), hardware loads the PTE into the TLB
    • Write back and replace an entry if there is no free entry
  l Generate a fault if the page containing the PTE is invalid, or if there is a protection fault
    • VM software performs fault handling
    • Restart the CPU
u On a TLB hit, hardware checks the valid bit
  l If valid, use the pointer to the page frame in memory
  l If invalid, the hardware generates a page fault
    • Perform page fault handling
    • Restart the faulting instruction

Software-Controlled TLB

u On a TLB miss, software is invoked
  l Write back if there is no free entry
  l Check if the page containing the PTE is in memory
  l If not, perform page fault handling
  l Load the PTE into the TLB
  l Restart the faulting instruction
u On a TLB hit, same as in a hardware-controlled TLB


Cache vs. TLB

  Cache: address           -> hit: data;              miss: memory
  TLB:   vpage # | offset  -> hit: ppage # | offset;  miss: memory (page table)

u Similarities
  l Cache a portion of memory
  l Write back on a miss
u Differences
  l Associativity
  l Consistency

TLB Related Issues

u What TLB entry to replace?
  l Random
  l Pseudo-LRU
u What happens on a context switch?
  l Process tag: invalidate the appropriate TLB entries
  l No process tag: invalidate the entire TLB contents
u What happens when changing a page table entry?
  l Change the entry in memory
  l Invalidate the TLB entry


Summary: Virtual Memory

u Virtual memory
  l Makes software development easier and enables better memory resource utilization
  l Separate address spaces provide protection and isolate faults
u Address translation
  l Translate every memory operation using a table (page table, segment table)
  l Speed: cache frequently used translations
u Result
  l Every process has a private address space
  l Programs run independently of the actual physical memory addresses used, and of the actual memory size
  l Protection: processes can only access memory they are allowed to
