ECE 4750 Computer Architecture
Topic 16: Address Translation and Protection
Christopher Batten
School of Electrical and Computer Engineering, Cornell University
http://www.csl.cornell.edu/courses/ece4750
ECE 4750 T16: Address Translation and Protection
Memory Management
• From early absolute addressing schemes, to modern virtual memory systems with support for virtual machine monitors
• Can separate into orthogonal functions:
– Translation (mapping of virtual address to physical address)
– Protection (permission to access word in memory)
– Virtual memory (transparent extension of memory space using slower disk storage)
• But most modern systems provide support for all the above functions with a single page-based system
Bare Machine
[Figure: pipeline (PC, Inst. Cache, Decode, Execute, Memory, Data Cache, Writeback). The PC and the memory stage send physical addresses to the instruction and data caches, which send physical addresses to the memory controller and on to main memory (DRAM).]
• In a bare machine, the only kind of address is a physical address
Dynamic Address Translation
Motivation
In the early machines, I/O operations were slow and each word transferred involved the CPU.
Higher throughput if CPU and I/O of 2 or more programs were overlapped. How? ⇒ multiprogramming

Location-independent programs
Programming and storage management ease ⇒ need for a base register

Protection
Independent programs should not affect each other inadvertently ⇒ need for a bound register

[Figure: physical memory holding prog1 and prog2.]
Simple Base and Bound Translation
[Figure: the effective address of Load X is compared against the Bound Register (segment length) to detect a bounds violation, and is added to the Base Register to form the physical address of the current segment in physical memory.]

Base and bounds registers are visible/accessible only when the processor is running in supervisor mode.
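The base and bound check above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular machine: the names `translate` and `BoundsViolation` are hypothetical, and the sketch assumes the bound register holds the segment length and that an effective address equal to the bound is already out of range.

```python
class BoundsViolation(Exception):
    """Raised when an effective address falls outside the current segment."""


def translate(effective_addr, base, bound):
    """Map an effective address to a physical address with base and bound.

    Assumption: `bound` is the segment length in bytes, so valid effective
    addresses are 0 <= effective_addr < bound.
    """
    # Bounds check: models the comparator in front of the bound register.
    if not 0 <= effective_addr < bound:
        raise BoundsViolation(f"address {effective_addr:#x} outside segment")
    # Relocation: models the adder that applies the base register.
    return base + effective_addr


# A segment of 0x100 bytes loaded at physical address 0x4000.
print(hex(translate(0x10, base=0x4000, bound=0x100)))
```

Because the program only ever sees effective addresses, the operating system can move the segment by changing `base` without touching the program.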
Separate Areas for Program and Data
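[Figure: separate base and bound register pairs for data and program. The effective address of Load X is checked against the Data Bound Register and added to the Data Base Register to reach the data segment. The Program Counter is checked against the Program Bound Register and added to the Program Base Register to reach the program segment in physical memory.]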
What is an advantage of this separation?
Base and Bound Machine
[Figure: pipeline (PC, Inst. Cache, Decode, Execute, Memory, Data Cache, Writeback) with base and bound registers. The logical addresses for instruction fetch and data access are checked against the Program Bound and Data Bound Registers (bounds violation?) and added to the Program Base and Data Base Registers to form the physical addresses sent to the caches, the memory controller, and main memory (DRAM).]
[ Can fold addition of base register into (base+offset) calculation using a carry-save adder (sum three numbers with only a few gate delays more than adding two numbers) ]
Memory Fragmentation
[Figure: memory layout, top to bottom, as users arrive and leave]

Initial state      Users 4 & 5 arrive   Users 2 & 5 leave
OS space           OS space             OS space
user 1   16K       user 1   16K         user 1   16K
user 2   24K       user 2   24K         free     24K
free     24K       user 4   16K         user 4   16K
                   free      8K         free      8K
user 3   32K       user 3   32K         user 3   32K
free     24K       user 5   24K         free     24K
As users come and go, the storage is “fragmented”. Therefore, at some stage programs have to be moved around to compact the storage.
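To make the fragmentation problem concrete, the sketch below measures the free space in the final layout of the figure and then compacts it. The layout list is my reading of the figure after users 2 and 5 leave (sizes in KB), and the helper names `free_stats` and `compact` are hypothetical.

```python
# Layout after users 2 and 5 leave, top to bottom, sizes in KB.
# An owner of None marks a free hole. (Assumed reading of the figure.)
layout = [("user 1", 16), (None, 24), ("user 4", 16),
          (None, 8), ("user 3", 32), (None, 24)]


def free_stats(layout):
    """Return (total free KB, largest single free hole in KB)."""
    holes = [size for owner, size in layout if owner is None]
    return sum(holes), max(holes, default=0)


def compact(layout):
    """Slide all allocated regions together and merge the holes into one."""
    used = [(owner, size) for owner, size in layout if owner is not None]
    total_free = sum(size for owner, size in layout if owner is None)
    return used + ([(None, total_free)] if total_free else [])


total, largest = free_stats(layout)
print(total, largest)                  # total free space vs. largest hole
print(free_stats(compact(layout)))     # after compaction: one merged hole
```

Before compaction a 32K program cannot be placed even though 56K is free in total, because the largest hole is only 24K. Compaction fixes this at the cost of copying every resident program.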
Paged Memory Systems
• A processor-generated address can be interpreted as a pair: (page number, offset)

[Figure: pages 0-3 of the address space of User-1 are mapped through the page table of User-1 to non-contiguous locations in physical memory.]
Page tables make it possible to store the pages of a program non-contiguously.
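A minimal sketch of this idea, assuming 4 KB pages and a page table that stores the physical base address of each page. The page size, the table contents, and the function name `translate` are illustrative assumptions, not values taken from the slides.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (a power of two)

# Hypothetical page table for one user: index = page number,
# value = physical base address of that page. Note the bases are
# neither contiguous nor in order.
page_table = [0x8000, 0x2000, 0xC000, 0x5000]


def translate(addr, page_table):
    """Interpret addr as (page number, offset) and relocate the page."""
    page, offset = divmod(addr, PAGE_SIZE)
    return page_table[page] + offset


# Address 0x1234 is page 1, offset 0x234.
print(hex(translate(0x1234, page_table)))
```

Since every page is the same size, any free physical page can hold any virtual page, which avoids the compaction that variable-sized segments require.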
Private Address Space per User
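[Figure: Users 1, 2, and 3 each issue address VA1, which is translated through that user's own page table to a different page in physical memory. Physical memory also holds OS pages and free pages.]

• Each user has a page table
• Page table contains an entry for each user page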
Where Should Page Tables Reside?
• Space required by the page tables (PT) is proportional to the address space, number of users, size of each page, ...
⇒ Space requirement is large
⇒ Too expensive to keep in registers

• Idea: Keep PTs in the main memory
– Needs one reference to retrieve the page base address and another to access the data word
⇒ Doubles the number of memory references!
Page Tables in Physical Memory
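[Figure: physical memory holds the page tables of User 1 and User 2 (PT User 1, PT User 2) together with the pages that VA1 maps to in each user's virtual address space.]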
Linear Page Table
• Page Table Entry (PTE) contains:
– A bit to indicate if a page exists
– PPN, the physical page number, i.e., where the virtual page is mapped into physical memory
– Status bits for protection and usage
• OS sets the Page Table Base Register whenever the active user process changes
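A linear page table lookup can be sketched as follows. The PTE bit layout here (valid bit in bit 0, a writable status bit in bit 1, PPN in the upper bits), the 4-byte PTE size, and all names are assumptions made for illustration. Physical memory is modeled as a dictionary from physical address to word.

```python
PAGE_BITS = 12      # assumed 4 KB pages
PTE_SIZE = 4        # assumed 4-byte PTEs
VALID = 0x1         # assumed encoding: bit 0 says the page exists
WRITABLE = 0x2      # assumed encoding: bit 1 is a protection status bit


class PageFault(Exception):
    pass


class ProtectionFault(Exception):
    pass


def make_pte(ppn, writable=False):
    """Pack a PPN and status bits into one PTE word (assumed layout)."""
    return (ppn << PAGE_BITS) | VALID | (WRITABLE if writable else 0)


def translate(vaddr, pt_base, memory, write=False):
    """Translate vaddr using the linear page table that starts at pt_base."""
    vpn = vaddr >> PAGE_BITS
    offset = vaddr & ((1 << PAGE_BITS) - 1)
    # First memory reference: fetch the PTE at PT base + VPN * PTE size.
    pte = memory.get(pt_base + vpn * PTE_SIZE, 0)
    if not pte & VALID:
        raise PageFault(hex(vaddr))
    if write and not pte & WRITABLE:
        raise ProtectionFault(hex(vaddr))
    # The second memory reference would use this physical address.
    return ((pte >> PAGE_BITS) << PAGE_BITS) | offset


PT_BASE = 0x1000  # value the OS would load into the PT base register
memory = {PT_BASE + 2 * PTE_SIZE: make_pte(0x7, writable=True),
          PT_BASE + 3 * PTE_SIZE: make_pte(0x9)}
print(hex(translate(0x2ABC, PT_BASE, memory)))
```

Switching processes only requires changing `PT_BASE`, which is why the OS updates the page table base register on a context switch.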
[Figure: the PT Base Register plus the VPN of the virtual address selects a PTE in the page table; the PPN from that PTE plus the offset selects the data word in a data page.]

Size of Linear Page Table
With 32-bit addresses, 4-KB pages & 4-byte PTEs:
⇒ Potentially 4 GB of physical memory needed per user
⇒ 4-KB page means VPN is 20 bits and offset is 12 bits
⇒ 2^20 PTEs, i.e., 4 MB page table overhead per user

Larger pages?
• Internal fragmentation (not all memory in a page is used)

What about a 64-bit virtual address space?
• 1-MB pages means VPN is 44 bits and offset is 20 bits
• Would still require 2^44 8-byte PTEs (2^47 bytes, i.e., 128 TiB or about 141 TB!)
How can this possibly ever work? Sparsity of virtual address usage.
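The size arithmetic can be checked with a short calculation. The function name is hypothetical and the sketch assumes the page size is a power of two and that the table has one PTE for every possible virtual page.

```python
def linear_page_table_size(va_bits, page_bytes, pte_bytes):
    """Return (VPN bits, page table size in bytes) for a linear page table.

    Assumes page_bytes is a power of two and one PTE per virtual page.
    """
    offset_bits = page_bytes.bit_length() - 1   # log2 of the page size
    vpn_bits = va_bits - offset_bits
    return vpn_bits, (1 << vpn_bits) * pte_bytes


# 32-bit addresses, 4 KB pages, 4-byte PTEs
vpn_bits, size = linear_page_table_size(32, 4 * 1024, 4)
print(vpn_bits, size // 2**20, "MiB")

# 64-bit addresses, 1 MB pages, 8-byte PTEs
vpn_bits, size = linear_page_table_size(64, 2**20, 8)
print(vpn_bits, size // 2**40, "TiB")
```

Even with pages 256 times larger, the 64-bit table is far beyond any physical memory, which is why a practical design must avoid allocating entries for unused regions of the address space.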
Hierarchical (Two-Level) Page Table
[Figure: the 32-bit virtual address is split into p1 (bits 31-22, 10-bit L1 index), p2 (bits 21-12, 10-bit L2 index), and offset (bits 11-0). A processor register holds the root of the current page table. p1 indexes the Level 1 page table to find a Level 2 page table, and p2 indexes that Level 2 page table to find the data page. An entry either points to a page in memory or is the PTE of a nonexistent page.]
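The two-level walk with the 10/10/12 split can be sketched as below. Representing each table as a Python dictionary, so that nonexistent entries are simply absent, is an illustrative choice, and the names `split_va`, `walk`, and `PageFault` are hypothetical.

```python
class PageFault(Exception):
    pass


def split_va(va):
    """Split a 32-bit virtual address into (p1, p2, offset)."""
    p1 = (va >> 22) & 0x3FF      # bits 31-22: Level 1 index
    p2 = (va >> 12) & 0x3FF      # bits 21-12: Level 2 index
    offset = va & 0xFFF          # bits 11-0: offset within the page
    return p1, p2, offset


def walk(root, va):
    """Walk a two-level page table and return the physical address.

    root maps p1 -> Level 2 table, and each Level 2 table maps p2 -> PPN.
    A missing key models the PTE of a nonexistent page.
    """
    p1, p2, offset = split_va(va)
    level2 = root.get(p1)            # first memory reference
    if level2 is None:
        raise PageFault(hex(va))
    ppn = level2.get(p2)             # second memory reference
    if ppn is None:
        raise PageFault(hex(va))
    return (ppn << 12) | offset


# Only one Level 2 table exists, so only it consumes memory.
root = {1: {3: 0x55}}
print(hex(walk(root, 0x00403ABC)))
```

A sparse address space needs Level 2 tables only for the regions it actually uses, at the price of one extra memory reference per translation.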
Two-Level Page Tables in Physical Memory
[Figure: physical memory holds the Level 1 page tables of User 1 and User 2 and a Level 2 page table of User 2. Address VA1 in each user's virtual address space maps to a different physical page (User1/VA1, User2/VA1).]
Address Translation & Protection
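[Figure: the virtual address is split into a virtual page number (VPN) and an offset. Address translation maps the VPN to a physical page number (PPN), which is combined with the offset to form the physical address. A protection check takes the kernel/user mode and the read/write access type and may raise an exception.]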
• Every instruction and data access needs address translation and protection checks

A good translation and protection design needs to be fast (~ one cycle) and space efficient.
Translation Lookaside Buffers
Address translation is very expensive! In a two-level page table, each reference becomes several memory accesses.
Solution: Cache translations in a TLB
TLB hit ⇒ Single-cycle translation
TLB miss ⇒ Page table walk to refill TLB
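[Figure: TLB lookup. The VPN (virtual page number) of the virtual address is compared against the tag of each TLB entry. Each entry holds V, R, W, and D bits, a tag, and a PPN (physical page number). On a hit, the PPN is combined with the offset to form the physical address.]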
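The hit and miss paths can be sketched with a small software model. This is an illustration under several assumptions: the TLB is fully associative, replacement is FIFO, presence in the dictionary stands in for the V bit, and the page table is a plain dictionary from VPN to (PPN, writable) that stands in for a real page table walk. All class and method names are hypothetical.

```python
from collections import OrderedDict
from dataclasses import dataclass


class PageFault(Exception):
    pass


class ProtectionViolation(Exception):
    pass


@dataclass
class TLBEntry:
    ppn: int
    readable: bool = True    # R bit
    writable: bool = False   # W bit
    dirty: bool = False      # D bit


class TLB:
    def __init__(self, capacity=64, page_bits=12):
        self.capacity = capacity
        self.page_bits = page_bits
        self.entries = OrderedDict()   # tag (VPN) -> TLBEntry, oldest first
        self.hits = 0
        self.misses = 0

    def translate(self, va, page_table, write=False):
        vpn = va >> self.page_bits
        offset = va & ((1 << self.page_bits) - 1)
        entry = self.entries.get(vpn)
        if entry is None:              # TLB miss: walk and refill
            self.misses += 1
            entry = self.refill(vpn, page_table)
        else:                          # TLB hit
            self.hits += 1
        if write:
            if not entry.writable:
                raise ProtectionViolation(hex(va))
            entry.dirty = True
        elif not entry.readable:
            raise ProtectionViolation(hex(va))
        return (entry.ppn << self.page_bits) | offset

    def refill(self, vpn, page_table):
        if vpn not in page_table:
            raise PageFault(hex(vpn << self.page_bits))
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict the oldest entry (FIFO)
        ppn, writable = page_table[vpn]
        entry = TLBEntry(ppn=ppn, writable=writable)
        self.entries[vpn] = entry
        return entry


page_table = {0: (5, True), 1: (6, False), 2: (7, False)}
tlb = TLB(capacity=2)
print(hex(tlb.translate(0x0010, page_table)), tlb.hits, tlb.misses)
print(hex(tlb.translate(0x0020, page_table)), tlb.hits, tlb.misses)
```

The first access misses and refills the TLB, and the second access to the same page hits without consulting the page table.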
TLB Designs
• Typically 32-128 entries, usually fully associative
– Each entry maps a large number of consecutive addresses, so most spatial locality is within a page rather than across pages, which makes it more likely that two entries conflict
– Sometimes larger TLBs (256-512 entries) are 4-8 way set-associative
– Larger systems sometimes have multi-level (L1 and L2) TLBs
• Random or FIFO replacement policy
• No process information in the TLB
– Flush TLB on process context switch
• TLB Reach: Size of largest virtual address space that can be simultaneously mapped by TLB
Example: 64 TLB entries, 4 KB pages, one page per entry
TLB Reach = 64 entries * 4 KB = 256 KB (if contiguous)
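The TLB reach example is a single multiplication, shown here as a small helper. The function name is hypothetical, and the sketch assumes every entry maps exactly one page of the same size.

```python
def tlb_reach_bytes(num_entries, page_bytes):
    """Largest amount of virtual memory the TLB can map at once.

    Assumes one page per entry and a single page size.
    """
    return num_entries * page_bytes


reach = tlb_reach_bytes(64, 4 * 1024)
print(reach // 1024, "KB")
```

With the same 64 entries, larger pages increase the reach proportionally, which is one reason some systems support more than one page size.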
Address Translation in CPU Pipeline
[Figure: pipeline with an instruction TLB in front of the instruction cache and a data TLB in front of the data cache. Each TLB access can raise a TLB miss or a protection violation.]
• Software handlers need restartable exception on TLB fault
• Handling a TLB miss needs a hardware or software mechanism to refill TLB
• Need mechanisms to cope with the additional latency of a TLB:
– slow down the clock
– pipeline the TLB and cache access
– virtual address caches
– parallel TLB/cache access
Handling a TLB Miss
Software (MIPS, Alpha)
A TLB miss causes an exception, and the operating system walks the page tables and reloads the TLB. A privileged “untranslated” addressing mode is used for the walk.

Hardware (SPARC v8, x86, PowerPC)
A memory management unit (MMU) walks the page tables and reloads the TLB. If the MMU encounters any additional complexity during the walk, it gives up and signals an exception.
Page-Based Memory Management Machine (Hardware Page Table Walk)
[Figure: pipeline in which the instruction TLB and data TLB translate virtual addresses into physical addresses for the instruction and data caches, each raising a protection violation when the check fails. On a TLB miss, a hardware page table walker uses the Page Table Base Register and sends physical addresses to the memory controller, which accesses main memory (DRAM).]

• Assumes page tables held in untranslated physical memory
Acknowledgements
• These slides contain material developed and copyrighted by:
– Arvind (MIT)
– Krste Asanovic (MIT/UCB)
– Joel Emer (Intel/MIT)
– James Hoe (CMU)
– John Kubiatowicz (UCB)
– David Patterson (UCB)

• MIT material derived from course 6.823
• UCB material derived from course CS252 & CS152