ECE 4750 Computer Architecture Topic 16: Address Translation and Protection

Christopher Batten
School of Electrical and Computer Engineering
Cornell University
http://www.csl.cornell.edu/courses/ece4750

Memory Management
• From early absolute addressing schemes to modern virtual memory systems with support for virtual machine monitors
• Can separate into orthogonal functions:
  – Translation (mapping of virtual address to physical address)
  – Protection (permission to access word in memory)
  – Virtual memory (transparent extension of memory space using slower disk storage)
• But most modern systems provide support for all the above functions with a single page-based system

Bare Machine
[Figure: pipeline (PC, Inst. Cache, Decode, Execute, Memory/Data Cache, Writeback). Both caches issue physical addresses to the memory controller, which issues physical addresses to main memory (DRAM).]
• In a bare machine, the only kind of address is a physical address

Dynamic Address Translation
Motivation
• In the early machines, I/O operations were slow and each word transferred involved the CPU
• Higher throughput if CPU and I/O of 2 or more programs were overlapped. How? ⇒ multiprogramming
Location-independent programs
• Programming and storage management ease ⇒ need for a base register
Protection
• Independent programs should not affect each other inadvertently ⇒ need for a bound register
[Figure: prog1 and prog2 resident at different locations in physical memory.]

Simple Base and Bound Translation
[Figure: Load X in the program address space produces an effective address. The effective address is compared (≤) against the Bound Register, which holds the segment length, to detect a bounds violation, and is added to the Base Register to form the physical address within the current segment in physical memory.]
• Base and bounds registers are visible/accessible only when the processor is running in supervisor mode
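The base-and-bound check above can be sketched in a few lines of Python. This is a minimal illustration rather than a description of any particular machine: the names `translate_base_bound` and `BoundsViolation` are hypothetical, and the sketch assumes the bound register holds the segment length in bytes and that the comparison is strict (valid offsets are 0 to bound - 1).

```python
class BoundsViolation(Exception):
    """Raised when an effective address falls outside the current segment."""


def translate_base_bound(effective_addr: int, base: int, bound: int) -> int:
    """Map an effective address to a physical address.

    base  -- physical address where the segment starts (base register)
    bound -- segment length in bytes (bound register); assumed strict check
    """
    # Protection: reject any offset outside the segment.
    if not 0 <= effective_addr < bound:
        raise BoundsViolation(f"address {effective_addr:#x} outside segment")
    # Translation: relocate the offset by the segment base.
    return base + effective_addr


# A 256-byte segment loaded at physical address 0x4000.
physical = translate_base_bound(0x10, base=0x4000, bound=0x100)
```

Because the program only ever sees effective addresses, the operating system can load the same program at a different base without changing the program itself.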
Separate Areas for Program and Data
[Figure: the effective address of Load X is checked (<) against the Data Bound Register for a bounds violation and added to the Data Base Register to locate the data segment. The Program Counter is checked (<) against the Program Bound Register and added to the Program Base Register to locate the program segment. Both segments reside in physical memory.]
What is an advantage of this separation?

Base and Bound Machine
[Figure: pipeline in which the PC (logical address) is checked against the Prog. Bound Register and added to the Program Base Register before the instruction cache, and the data logical address is checked against the Data Bound Register and added to the Data Base Register before the data cache. Both caches issue physical addresses to the memory controller and main memory (DRAM).]
[ Can fold addition of base register into (base+offset) calculation using a carry-save adder (sum three numbers with only a few gate delays more than adding two numbers) ]

Memory Fragmentation
Initial          After users 4 & 5 arrive   After users 2 & 5 leave
OS space         OS space                   OS space
user 1   16K     user 1   16K               user 1   16K
user 2   24K     user 2   24K               free     24K
free     24K     user 4   16K               user 4   16K
                 free      8K               free      8K
user 3   32K     user 3   32K               user 3   32K
free     24K     user 5   24K               free     24K
As users come and go, the storage is "fragmented". Therefore, at some stage programs have to be moved around to compact the storage.

Paged Memory Systems
• Processor-generated address can be interpreted as a pair <page number, offset>
• A page table contains the physical address of the base of each page
[Figure: pages 0–3 of the address space of User-1 map through the page table of User-1 to non-contiguous pages in physical memory.]
Page tables make it possible to store the pages of a program non-contiguously.
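As a rough sketch of the <page number, offset> interpretation, the following Python splits an address and looks up the page base in a per-user table. The 4 KB page size, the function names, and the example table contents are all assumptions chosen for illustration.

```python
OFFSET_BITS = 12                 # assumed 4 KB pages
PAGE_SIZE = 1 << OFFSET_BITS


def split_address(addr: int) -> tuple:
    """Interpret an address as a (page number, offset) pair."""
    return addr >> OFFSET_BITS, addr & (PAGE_SIZE - 1)


def translate_paged(addr: int, page_table: list) -> int:
    """Look up the physical base of the page and add the offset."""
    page_number, offset = split_address(addr)
    return page_table[page_number] + offset


# Hypothetical page table for one user: entry i holds the physical base
# address of page i. The bases are deliberately non-contiguous.
user1_page_table = [0x5000, 0x2000, 0x9000, 0x3000]

# Address 0x1234 is page 1, offset 0x234, so it lands in the page at 0x2000.
physical = translate_paged(0x1234, user1_page_table)
```

Since every page has the same size, any free physical page can hold any virtual page, which is what removes the need to compact memory.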
Private Address Space per User
[Figure: Users 1, 2, and 3 each reference VA1 through their own page table. The page tables map to different pages of physical memory, which also holds OS pages and free pages.]
• Each user has a page table
• Page table contains an entry for each user page

Where Should Page Tables Reside?
• Space required by the page tables (PT) is proportional to the address space and the number of users, and inversely proportional to the size of each page, ...
  ⇒ Space requirement is large
  ⇒ Too expensive to keep in registers
• Idea: Keep PTs in the main memory
  – Needs one reference to retrieve the page base address and another to access the data word
  ⇒ doubles the number of memory references!

Page Tables in Physical Memory
[Figure: physical memory holds PT User 1 and PT User 2 alongside the data pages. VA1 of User 1 and VA1 of User 2 map to different physical pages.]

Linear Page Table
• Page Table Entry (PTE) contains:
  – A bit to indicate if a page exists
  – PPN, the physical page number, i.e., where the virtual page is mapped into physical memory
  – Status bits for protection and usage
• OS sets the Page Table Base Register whenever the active user process changes
[Figure: the VPN field of the virtual address indexes the page table starting at the PT Base Register. The selected PTE supplies a PPN, and the offset field selects the data word within that data page.]

Size of Linear Page Table
With 32-bit addresses, 4-KB pages & 4-byte PTEs:
⇒ Potentially 4 GB of physical memory needed per user
⇒ 4-KB page means VPN is 20 bits and offset is 12 bits
⇒ 2^20 PTEs, i.e., 4 MB page table overhead per user
Larger pages?
• Internal fragmentation (not all memory in a page is used)
What about a 64-bit virtual address space?
• 1-MB pages means VPN is 44 bits and offset is 20 bits
• Would still require 2^44 8-byte PTEs (2^47 bytes = 128 TiB, about 140 TB!)
How can this possibly ever work? Sparsity of virtual address usage
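The page table sizes above follow from simple arithmetic, which the short Python sketch below reproduces. The helper name `linear_page_table_bytes` is hypothetical, and the sketch assumes the page size is a power of two. Note that 2^44 eight-byte entries come to 2^47 bytes, roughly 140 TB, rather than the 35 TB figure sometimes quoted for this example.

```python
def linear_page_table_bytes(va_bits: int, page_bytes: int, pte_bytes: int) -> int:
    """Size in bytes of a linear page table with one PTE per virtual page."""
    offset_bits = page_bytes.bit_length() - 1   # assumes page_bytes is a power of two
    vpn_bits = va_bits - offset_bits            # remaining bits select the page
    return (1 << vpn_bits) * pte_bytes


# 32-bit addresses, 4 KB pages, 4-byte PTEs: 2^20 PTEs, 4 MiB per user.
small = linear_page_table_bytes(32, 4 * 1024, 4)

# 64-bit addresses, 1 MB pages, 8-byte PTEs: 2^44 PTEs, 2^47 bytes.
large = linear_page_table_bytes(64, 1024 * 1024, 8)

print(small // 2**20, "MiB per user")
print(large // 2**40, "TiB per user")
```

The second result is what makes a flat table impractical for 64-bit address spaces and motivates structures that exploit the sparsity of virtual address usage.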
Hierarchical (Two-Level) Page Table
Virtual address fields:
  bits 31–22   p1       10-bit L1 index
  bits 21–12   p2       10-bit L2 index
  bits 11–0    offset
[Figure: a processor register holds the root of the current page table. p1 indexes the Level 1 page table to select a Level 2 page table, and p2 indexes that table to select a data page. Legend: page in memory, PTE of a nonexistent page.]

Two-Level Page Tables in Physical Memory
[Figure: physical memory holds the Level 1 PT of User 1, the Level 1 PT of User 2, and a Level 2 PT of User 2. VA1 in each user's virtual address space maps to a different physical page (User1/VA1 and User2/VA1).]

Address Translation & Protection
[Figure: the virtual address (virtual page number (VPN), offset) passes through address translation to produce the physical address (physical page number (PPN), offset). A protection check uses kernel/user mode and read/write permissions and may raise an exception.]
• Every instruction and data access needs address translation and protection checks
A good translation and protection design needs to be fast (~ one cycle) and space efficient

Translation Lookaside Buffers
Address translation is very expensive! In a two-level page table, each reference becomes several memory accesses
Solution: Cache translations in TLB
  TLB hit ⇒ Single Cycle Translation
  TLB miss ⇒ Page Table Walk to refill TLB
[Figure: the VPN of the virtual address is compared against the tag of each TLB entry (fields: V, R, W, D, tag, PPN). On a hit, the PPN is concatenated with the offset to form the physical address. VPN = virtual page number, PPN = physical page number.]
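The interaction between a TLB and a two-level page table walk can be sketched as follows. This is a simplified model under stated assumptions, not a description of real hardware: the class and function names are hypothetical, page tables are modeled as nested Python lists with `None` marking a nonexistent page, the TLB stores only VPN to PPN mappings (the V, R, W, D bits and protection checks are omitted), and replacement is FIFO.

```python
from typing import Dict, List, Optional

OFFSET_BITS = 12                      # bits 11..0
INDEX_BITS = 10                       # p1 and p2 are 10 bits each
INDEX_MASK = (1 << INDEX_BITS) - 1
OFFSET_MASK = (1 << OFFSET_BITS) - 1


class PageFault(Exception):
    """Raised when the walk reaches the PTE of a nonexistent page."""


def walk_two_level(vaddr: int, level1: List[Optional[list]]) -> int:
    """Return the PPN for vaddr by walking a two-level page table."""
    p1 = (vaddr >> (OFFSET_BITS + INDEX_BITS)) & INDEX_MASK   # bits 31..22
    p2 = (vaddr >> OFFSET_BITS) & INDEX_MASK                  # bits 21..12
    level2 = level1[p1]
    if level2 is None or level2[p2] is None:
        raise PageFault(hex(vaddr))
    return level2[p2]


class TLB:
    """Small fully associative TLB with FIFO replacement (assumed policy)."""

    def __init__(self, entries: int = 4) -> None:
        self.entries = entries
        self.map: Dict[int, int] = {}   # VPN -> PPN; dicts keep insertion order
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr: int, level1: List[Optional[list]]) -> int:
        vpn = vaddr >> OFFSET_BITS
        if vpn in self.map:
            self.hits += 1
        else:
            self.misses += 1
            ppn = walk_two_level(vaddr, level1)      # may raise PageFault
            if len(self.map) >= self.entries:
                del self.map[next(iter(self.map))]   # evict the oldest entry
            self.map[vpn] = ppn
        return (self.map[vpn] << OFFSET_BITS) | (vaddr & OFFSET_MASK)


# Build a sparse table: only one Level 2 table exists, with one valid PTE.
level1: List[Optional[list]] = [None] * 1024
level2: List[Optional[int]] = [None] * 1024
level2[3] = 0x42                      # virtual page (p1=1, p2=3) -> PPN 0x42
level1[1] = level2

tlb = TLB()
vaddr = (1 << 22) | (3 << 12) | 0xABC
first = tlb.translate(vaddr, level1)   # miss: walks both levels, refills TLB
second = tlb.translate(vaddr, level1)  # hit: no page table access
```

Only the Level 2 tables that are actually in use need to exist, so the sparse table above costs two 1024-entry lists instead of one entry for every possible virtual page.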
TLB Designs
• Typically 32-128 entries, usually fully associative
  – Each entry maps a large number of consecutive addresses, so most spatial locality is within a page as opposed to across pages ⇒ more likely that two entries conflict
  – Sometimes larger TLBs (256-512 entries) are 4-8 way set-associative
  – Larger systems sometimes have multi-level (L1 and L2) TLBs
• Random or FIFO replacement policy
• No process information in the TLB
  – Flush TLB on process context switch
• TLB Reach: Size of largest virtual address space that can be simultaneously mapped by TLB
Example: 64 TLB entries, 4 KB pages, one page per entry
TLB Reach = 64 entries × 4 KB = 256 KB (if contiguous)

Address Translation in CPU Pipeline
[Figure: pipeline with an Inst. TLB in front of the Inst. Cache and a Data TLB in front of the Data Cache. Each TLB can signal a TLB miss or a protection violation.]
• Software handlers need restartable exception on TLB fault
• Handling a TLB miss needs a hardware or software mechanism to refill TLB
• Need mechanisms to cope with the additional latency of a TLB:
  – slow down the clock
  – pipeline the TLB and cache access
  – virtual address caches
  – parallel TLB/cache access

Handling a TLB Miss
Software (MIPS, Alpha)
TLB miss causes an exception and the operating system walks the page tables and reloads the TLB. A privileged "untranslated" addressing mode is used for the walk.
Hardware (SPARC v8, x86, PowerPC)
A memory management unit (MMU) walks the page tables and reloads the TLB. If any additional complexity is encountered during the walk, the MMU gives up and signals an exception.

Page-Based Memory Management Machine (Hardware Page Table Walk)
[Figure: pipeline in which the PC and the data address are virtual addresses. The Inst. TLB and Data TLB translate them to physical addresses for the Inst. Cache and Data Cache and can signal a protection violation. On a TLB miss, the Hardware Page Table Walker uses the Page Table Base Register to issue physical addresses to the memory controller and main memory (DRAM).]
• Assumes page tables held in untranslated physical memory

Acknowledgements
• These slides contain material developed and copyright by:
  – Arvind (MIT)
  – Krste Asanovic (MIT/UCB)
  – Joel Emer (Intel/MIT)
  – James Hoe (CMU)
  – John Kubiatowicz (UCB)
  – David Patterson (UCB)
• MIT material derived from course 6.823
• UCB material derived from course CS252 & CS152
