Lecture 07 Typical Memory Specs Motivation Memory Management

Lecture 07
Virtual memory management
Based on slides by Prof. George Adams III

CS 50011: Introduction to Systems II
Lecture 6: Memory Management and Virtual Memory
Prof. Jeff Turkstra
© 2017 Dr. Jeffrey A. Turkstra

“… a system has been devised to make the core and drum* combination appear to the programmer as a single level store, the requisite transfers taking place automatically.”
Kilburn et al., “One-Level Storage System”, 1962

Typical memory specs

Level              Size in bytes        Typ. access time (ns)
Registers (64)     512                  0.25
DRAM (4 GB)        4,294,967,296        60.00
Hard disk (1 TB)   1,000,000,000,000    10,000,000.00

- Use of hard disk should be carefully managed for performance reasons

Motivation
- Efficient and safe (correct) sharing of memory among multiple programs
- Permit caching of hard drive data in main memory
- Allow programs to run even if footprint is larger than available main memory
- Sometimes motivations change as technology changes

Memory management
- Suppose we have:
  - Internet Exploder (100MB)
  - Micro$oft Word (100MB)
  - Yahoo Messenger (30MB)
  - Operating System (200MB)
- Computer has 256MB of RAM
- Have to quit programs before starting others
- Virtual memory allows us to load/unload portions as needed

Programmer burden
- Without virtual memory, it’s the programmer’s job to make programs fit
- How?
  - Divide into mutually exclusive chunks
  - Dynamically load/unload chunks as needed
  - Same for libraries
- Sounds like fun. Not.

Isolation
- Virtual memory limits sharing to explicit cases
- Every program has its own address space
- Virtual memory translates virtual addresses to physical addresses
- Also enforces protection
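The isolation claim can be observed on a Unix-like system. The sketch below is illustrative only: `isolation_demo` is a hypothetical helper, and it relies on `os.fork()`, which gives the child its own address space. The child overwrites a value and reports it back through a pipe (an explicit sharing channel), while the parent's copy stays unchanged.

```python
import os


def isolation_demo():
    """Return (value seen by parent, value reported by child)."""
    data = ["parent"]                  # lives in this process's address space
    read_end, write_end = os.pipe()    # explicit channel between the processes
    pid = os.fork()
    if pid == 0:
        # Child: same virtual addresses, but a separate address space.
        try:
            os.close(read_end)
            data[0] = "child"          # changes only the child's copy
            os.write(write_end, data[0].encode())
        finally:
            os._exit(0)                # never fall through into parent code
    os.close(write_end)
    child_view = os.read(read_end, 64).decode()
    os.close(read_end)
    os.waitpid(pid, 0)                 # reap the child
    return data[0], child_view


if __name__ == "__main__":
    print(isolation_demo())
```

The parent still sees its original value even though the child wrote to the same variable; the only data that crossed between them went through the pipe.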
More efficient memory utilization
- Keep in RAM only the portion of address space currently in use
  - Working set, Peter Denning, former head of Purdue CS department
  - Swap space
- Can do deduplication to some degree
  - Shared libraries
  - Multiple processes of the same program

Can speed OS tasks
- Program loading
  - Demand-based, faulted in instead of loaded all at once
- Fork and copy-on-write
  - Again, no duplication of memory unless needed
  - Spawning new processes is fast
  - Critical for fork() / exec() paradigm

Sharing
- Permits simple, dynamic sharing among processes
- Point the virtual addresses to the same physical addresses

Implementations
- Historic
  - Process swapping: entire memory footprint of process moved in and out (swapped) between memory and disk
  - Segment swapping: entire parts, “segments” (determined by programmer), are swapped
- Drawbacks
  - Too much information at a time
  - Slow, inefficient
  - Fragmentation

Segmentation
[Figure: segmentation. * http://cs.bc.edu/~donaldja/362/addresstranslation.html]

Demand-based paging
- Unit of memory swapped is a fixed-size page
  - Usually 4KiB now, can be 2MiB on x86_64 “long mode”
  - Also supports 1 GiB
- Eliminates external fragmentation
  - Not internal fragmentation

Terminology
- Physical memory divided into frames
- Virtual memory into pages
- Any page can be placed in any frame
- Missing page? Called a page fault
- CPU emits virtual addresses
  - Translated/mapped by a combination of hardware and software
  - Memory mapping or address translation

- Time to load page is huge, 10^7 nanoseconds
- Main memory operates as a fully associative cache
- Try to avoid loading a page multiple times
  - Only compulsory misses and capacity misses
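To see why a 10^7 ns page load dominates, a rough effective-access-time calculation helps. It uses the 60 ns DRAM figure from the typical memory specs and the 10^7 ns load time above. The weighted-average formula is a common simplification (it ignores caches and the TLB) and is an assumption here, as are the function names.

```python
# Figures taken from the slides; the formula is a simplified model that
# ignores caches and the TLB (an assumption made for illustration).
DRAM_NS = 60.0                  # typical DRAM access time
PAGE_LOAD_NS = 10_000_000.0     # about 10^7 ns to load a page from disk


def effective_access_ns(fault_rate, hit_ns=DRAM_NS, fault_ns=PAGE_LOAD_NS):
    """Average access time when a fraction `fault_rate` of accesses fault."""
    if not 0.0 <= fault_rate <= 1.0:
        raise ValueError("fault_rate must be a probability")
    return (1.0 - fault_rate) * hit_ns + fault_rate * fault_ns


def max_fault_rate(slowdown, hit_ns=DRAM_NS, fault_ns=PAGE_LOAD_NS):
    """Largest fault rate keeping the average within `slowdown` x hit_ns."""
    # Solve (1 - p) * hit + p * fault <= slowdown * hit for p.
    return (slowdown - 1.0) * hit_ns / (fault_ns - hit_ns)


if __name__ == "__main__":
    for rate in (0.0, 1e-6, 1e-4, 1e-2):
        print(f"fault rate {rate:g}: {effective_access_ns(rate):,.2f} ns")
    print(f"fault rate for a 10% slowdown: {max_fault_rate(1.10):.2e}")
```

Under this model, keeping the average within 10% of DRAM speed allows roughly one fault per 1.7 million accesses, which is why avoiding repeated loads of the same page matters so much.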
- Pages currently residing in main memory are resident
- Resident set refers to all in-memory pages for a given process
- Ideally resident set ≈ working set

Page tables
- Page tables provide the mapping from a virtual address to a physical address
- Stored in main memory
- Managed by the OS
- Referenced by the MMU

Virtual memory
[Figure: virtual memory]

Hardware/software approach
- Hardware handles the common case
  - Translate virtual address for a resident page to a physical address/frame
- Software invoked for exceptions
  - Page fault: moving pages between disk and memory
  - Context switches
  - Configuring hardware

Translation
[Figure: address translation. * http://cs.bc.edu/~donaldja/362/addresstranslation.html]

Control registers
[Figure: control registers]

Page table

Virtual address range       Virtual page            Mapped to                Page metadata
0x00000000 to 0x00000FFF    0x00000 = (0)10         Page frame 0x024         Read only, not Dirty, Executable (Text)
0x00001000 to 0x00001FFF    0x00001                 Page frame 0xF05         Read, write (Data), not Dirty
0x00002000 to 0x00002FFF    0x00002                 Page frame 0xXXX         Invalid
0x00003000 to 0x00003FFF    0x00003                 Swap space (via inode)   Read, write, Swap, not Dirty
…                           …                       …                        …
0xFFFFE000 to 0xFFFFEFFF    0xFFFFE                 Page frame 0xF43         Read, write, not Dirty
0xFFFFF000 to 0xFFFFFFFF    0xFFFFF = (2^20 - 1)10  Page frame 0x010         Read, write, Dirty

Page tables
- Are stored in main memory
- Memory management unit (MMU)
  - Historically external, newer architectures include it on-chip
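The example page table above can be modeled as a small lookup to show what a translation involves: split the virtual address into page number and offset, check the entry, then combine frame number and offset. This is a software sketch under stated assumptions, not how an MMU is implemented. The dictionary layout, the exception names, and `translate` are all hypothetical; only the page and frame numbers come from the table.

```python
PAGE_SHIFT = 12                       # 4 KiB pages: low 12 bits are the offset
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


class PageFault(Exception):
    """Valid page that is not resident (for example, in swap space)."""


class SegmentationFault(Exception):
    """Invalid page or a permission violation."""


# Hypothetical model of the example page table; missing keys mean "invalid".
PAGE_TABLE = {
    0x00000: {"frame": 0x024, "writable": False, "dirty": False},  # text
    0x00001: {"frame": 0xF05, "writable": True, "dirty": False},   # data
    0x00003: {"frame": None, "writable": True, "dirty": False},    # in swap
    0xFFFFE: {"frame": 0xF43, "writable": True, "dirty": False},
    0xFFFFF: {"frame": 0x010, "writable": True, "dirty": True},
}


def translate(vaddr, write=False, table=PAGE_TABLE):
    """Map a 32-bit virtual address to a physical address or raise a fault."""
    vpn, offset = vaddr >> PAGE_SHIFT, vaddr & OFFSET_MASK
    entry = table.get(vpn)
    if entry is None:
        raise SegmentationFault(f"page {vpn:#07x} is invalid")
    if write and not entry["writable"]:
        raise SegmentationFault(f"page {vpn:#07x} is read-only")
    if entry["frame"] is None:
        raise PageFault(f"page {vpn:#07x} must be loaded from swap")
    if write:
        entry["dirty"] = True         # page now differs from its disk copy
    return (entry["frame"] << PAGE_SHIFT) | offset


if __name__ == "__main__":
    print(hex(translate(0x00001234)))   # page 0x00001, offset 0x234
```

In this model a read of 0x00001234 lands in frame 0xF05 at offset 0x234, a write to the text page is rejected, and touching page 0x00003 raises the fault that would send the OS to swap space.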
[Figure: CPU, L1/L2/… cache, Memory Management Unit (MMU) with Page Table Register and Translation Look-aside Buffer (TLB), DRAM holding page table 0 through page table n, and I/O devices, connected by a virtual-address bus, a physical-address bus, an address bus (one-way), and a data bus (bidirectional)]

Virtually addressed caches
- MMU/TLB below the cache
- Have to distinguish virtual addresses from different processes
  - Invalidate all entries on context switch
  - Augment cache to include ASID (address space identifier) along with tag
- Still have to worry about aliasing
- Some designs have MMU/TLB above the cache

Fast translation is critical
- Translation lookaside buffer, TLB
  - Caches recent translations
  - Invented by IBM
- MMU looks in TLB same time as page table lookup starts
- In TLB, win!
  - Program locality → 90% of the time or more
- Context switches?
  - Tagged TLB or TLB shootdown

Multi-level page tables
- For modern, large address spaces they are necessary
- Used even in 32-bit land
  - Page directory
- Or...

PTE metadata
- Valid bit, dirty bit
- Permission bits
- Page replacement algorithm support
  - LRU, etc.

Processing a page fault
- Program attempts read/write to non-resident page
  - Fetch next instruction
  - Access non-resident data
- MMU attempts translation, finds valid bit = 0
  - Generates interrupt to CPU
- Interrupt handler saves return address and registers
- Jumps to appropriate handler
  - If page is invalid, SIGSEGV
  - If valid, load it from disk and establish the mapping
  - What if there are no free frames?
- Restore registers, resume executing instruction

Restarting instructions
- Not always easy
- Page fault may have been in the middle of an instruction
  - Can it be skipped?
  - Restart from beginning? Where?
  - Side effects (e.g., autoincrementing address modes)
- Hardware support for tracking side effects and rolling back

Page replacement
- Optimal (MIN), Belady’s Algorithm
  - Replace pages that won’t be used for longest time
  - Only works offline, minimal page faults though
- FIFO
- NRU
- FIFO with second chance, “Clock Algorithm”
- etc.

mmap()
- Text segment, for example
- Pages not read by default
  - On-demand as accessed
- Fast startup
- Reduced memory usage
- Physical pages for text segment and shared libraries can be shared
- Protections: PROT_READ|PROT_EXEC
- MAP_PRIVATE

[Figure: text of an executable file on disk mapped with mmap into virtual memory (0x00000000 to 0xFFFFFFFF) at 0x00020000]

Shared code/library
[Figure: the text pages in Process 1 virtual memory and Process 2 virtual memory map to the same text in physical memory]

Data segment
- Multiple instances? Data shared until write
- “Copy on write”

[Figure: Process 1 and Process 2 virtual memory both map data pages A, B, C to the same pages in physical memory; after a write, a separate copy, data page A*, appears in physical memory and in the writing process's virtual memory]

Copy-on-write
- Happens on fork() too
- Critical optimization that allows fork()/exec() approach to work

[Figure: parent’s virtual memory and physical memory with page A, page B, page C, and zero-filled pages (“0’s”); in the follow-up figure page B is marked X and a second copy of page B appears]

Questions?
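The mmap() and copy-on-write slides can be tried out directly on a Unix-like system. The sketch below is an illustration under assumptions, not the loader's actual behavior: `private_mapping_demo` is a hypothetical helper, and it maps a temporary file with `MAP_PRIVATE` and `PROT_READ | PROT_WRITE` (writable, unlike a text segment) so the copy-on-write effect is visible. A write through the mapping changes this process's view only; the file on disk keeps its original bytes.

```python
import mmap
import os
import tempfile


def private_mapping_demo(original=b"hello, paging"):
    """Write through a MAP_PRIVATE mapping; return (mapped view, file bytes)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, original)
        # Private mapping: pages are shared with the file until written,
        # then this process gets its own copy of the touched page.
        view = mmap.mmap(fd, len(original),
                         flags=mmap.MAP_PRIVATE,
                         prot=mmap.PROT_READ | mmap.PROT_WRITE)
        view[0:5] = b"HELLO"           # first write triggers copy-on-write
        seen_in_mapping = view[:]
        view.close()
        with open(path, "rb") as f:
            seen_on_disk = f.read()    # the write never reached the file
        return seen_in_mapping, seen_on_disk
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    mapped, on_disk = private_mapping_demo()
    print("mapping:", mapped)
    print("file:   ", on_disk)
```

The `flags` and `prot` arguments of Python's `mmap.mmap` are only available on Unix; `access=mmap.ACCESS_COPY` is the portable way to request the same copy-on-write behavior.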
