Lecture 07

 Virtual memory management  Based on slides by Prof. George CS 50011: Introduction to Systems II Adams III

Lecture 6: Memory Management and Virtual Memory

Prof. Jeff Turkstra

© 2017 Dr. Jeffrey A. Turkstra 1 © 2017 Dr. Jeffrey A. Turkstra 2

Typical memory specs

 “… a system has been devised to Level Size in Typ. access time (ns) make the core and drum* Registers (64) 512 0.25 combination appear to the DRAM (4 GB) 4,294,967,296 60.00 programmer as a single level store, Hard disk (1 TB) 1,000,000,000,000 10,000,000.00 the requisite transfers taking place automatically.”  Kilburn et al., “One-level storage  Use of hard disk should be carefully systems”, 1962 managed for performance reasons

© 2017 Dr. Jeffrey A. Turkstra 3 © 2017 Dr. Jeffrey A. Turkstra 4

Motivation Memory management

 Efficient and safe (correct) sharing of  Suppose we have: memory among multiple programs  Internet Exploder (100MB)  Permit caching of hard drive data in  Micro$oft Word (100MB) main memory  Yahoo Messenger (30MB)  (200MB)  Allow programs to run even if  Computer has 256MB of RAM footprint is larger than available main  Have to quit programs before starting memory others  Sometimes motivations change as  Virtual memory allows us to load/unload technology changes portions as needed

© 2017 Dr. Jeffrey A. Turkstra 5 © 2017 Dr. Jeffrey A. Turkstra 6

Programmer burden Isolation

 Without virtual memory, it’s the  Virtual memory limits sharing to programmer’s job to make programs explicit cases fit  How? Every program has its own  Divide into mutually exclusive chunks address space  Dynamically load/unload chunks as  Virtual memory translates virtual needed addresses to physical addresses  Same for libraries  Also enforces protection  Sounds like fun. Not.

© 2017 Dr. Jeffrey A. Turkstra 7 © 2017 Dr. Jeffrey A. Turkstra 8

More efficient memory Can speed OS tasks utilization  Keep in RAM only the portion of  Program loading address space currently in use  Demand-based, faulted in instead of  Working set, Peter Denning, former head loaded all at once of Purdue CS department  Fork and copy-on-write  Swap space  Again, no duplication of memory unless  Can do deduplication to some degree needed   Shared libraries Spawning new processes is fast   Multiple processes of the same program Critical for fork() / exec() paradigm

© 2017 Dr. Jeffrey A. Turkstra 9 © 2017 Dr. Jeffrey A. Turkstra 10

Sharing Implementations

 Permits simple, dynamic sharing  Historic among processes  Process swapping – entire memory footprint of process moved in and out (swapped)  Point the virtual addresses to the same between memory and disk physical addresses  Segment swapping – entire parts, “segments” (determined by programmer) are swapped  Drawbacks  Too much information at a time  Slow, inefficient  Fragmentation © 2017 Dr. Jeffrey A. Turkstra 11 © 2017 Dr. Jeffrey A. Turkstra 12

Segmentation Demand-based paging

 Unit of memory swapped is a fixed- size  Usually 4KiB now, can be 2MiB on x86_64 “long mode”  Also supports 1GB  Eliminates external fragmentation  Not internal fragmentation

* http://cs.bc.edu/~donaldja/362/addresstranslation.html

© 2017 Dr. Jeffrey A. Turkstra 13 © 2017 Dr. Jeffrey A. Turkstra 14

Terminology

 Time to load page is huge, 10 7  Physical memory divided into frames nanoseconds  Virtual memory into pages  Main memory operates as a fully  Any page can be placed in any frame associative cache  Missing page? Called a page fault  Try to avoid loading a page multiple  CPU emits virtual addresses times  Translated/mapped by a combination of  Only compulsory misses and capacity hardware and software misses  Memory mapping or address translation

© 2017 Dr. Jeffrey A. Turkstra 15 © 2017 Dr. Jeffrey A. Turkstra 16

 Pages currently residing in main memory are resident  Resident set refers to all in-memory pages for a given process  Ideally resident set ~= working set

© 2017 Dr. Jeffrey A. Turkstra 17 © 2017 Dr. Jeffrey A. Turkstra 18

Page tables Virtual memory

 Page tables provide the mapping from a virtual address to a  Stored in main memory  Managed by the OS  Referenced by the MMU

© 2017 Dr. Jeffrey A. Turkstra 19 © 2017 Dr. Jeffrey A. Turkstra 20

Hardware/software Translation approach  Hardware handles the common case  Translate virtual address for a resident page to a physical address/frame  Software invoked for exceptions  Page fault – moving pages between disk and memory  Context switches  Configuring hardware

* http://cs.bc.edu/~donaldja/362/addresstranslation.html

© 2017 Dr. Jeffrey A. Turkstra 21 © 2017 Dr. Jeffrey A. Turkstra 22

Control registers

© 2017 Dr. Jeffrey A. Turkstra 23 © 2017 Dr. Jeffrey A. Turkstra 24

Page table Page tables

Virtual address range Virtual page Mapped to Page metadata  Are stored in main memory 0x00000000 to 0x00000 = (0)10 Page frame 0x024 Read only, not Dirty, 0x00000FFF Executable (Text) 0x00001000 to 0x00001 Page frame 0xF05 Read, write (Data), 0x00001FFF not Dirty 0x00002000 to 0x00002 Page frame 0xXXX Invalid 0x00002FFF 0x00003000 to 0x00003 Swap space Read, write, Swap, 0x00003FFF (via inode) not Dirty  (MMU) … … … … 0xFFFFE000 to 0xFFFFE Page frame 0xF43 Read, write, not  Historically external, newer 0xFFFFEFFF Dirty architectures include it on-chip 20 0xFFFFF000 to 0xFFFFF = (2 -1)10 Page frame 0x010 Read, write, Dirty 0xFFFFFFFF

© 2017 Dr. Jeffrey A. Turkstra 25 © 2017 Dr. Jeffrey A. Turkstra 26

Virtually addressed caches CPU Data bus (bidirectional)  MMU/TLB below the cache Virtual-address bus Address bus (one-way) L1, L2,  Have to distinguish virtual addresses … cache from different processes Virtual-address bus  Invalidate all entries on context switch Memory Translation Look- Management DRAM  Register aside Buffer (TLB) Augment cache to include ASID (address Unit (MMU) space identifier) along with tag

Page table 0  Still have to worry about aliasing Page table 1  … Some designs have MMU/TLB above Physical-address bus Page table n I/O devices the cache

© 2017 Dr. Jeffrey A. Turkstra 2727 © 2017 Dr. Jeffrey A. Turkstra 28

Fast translation is critical Multi-level page tables

 Translation lookaside buffer, TLB  For modern, large address spaces  Caches recent translations they are necessary  Invented by IBM  Used even in 32-bit land  MMU looks in TLB same time as page  Page directory table lookup starts  Or...  In TLB, win!  Program locality → 90% of the time or more  Context switches? Tagged TLB or

TLB shootdown© 2017 Dr. Jeffrey A. Turkstra 29 © 2017 Dr. Jeffrey A. Turkstra 30

PTE metadata

 Valid bit, dirty bit  Permission bits  Page replacement algorithm support  LRU, etc

© 2017 Dr. Jeffrey A. Turkstra 31 © 2017 Dr. Jeffrey A. Turkstra 32

Processing a page fault

 Program attempts read/write to non-  Interrupt handler saves return resident page address and registers  Fetch next instruction  Jumps to appropriate handler  Access non-resident data  If page is invalid, SIGSEGV  MMU attempts translation, finds  If valid, load it from disk and establish valid bit = 0 the mapping  Generates interrupt to CPU  What if there are no free frames?  Restore registers, resume executing instruction

© 2017 Dr. Jeffrey A. Turkstra 33 © 2017 Dr. Jeffrey A. Turkstra 34

Restarting instructions Page replacement

 Not always easy  Optimal (MIN), Belady’s Algorithm  Page fault may have been in the  Replace pages that won’t be used for middle of an instruction longest time   Can it be skipped? Only works offline, minimal page faults though  Restart from beginning?   Where? FIFO  Side effects (eg, autoincrementing  NRU address modes)  FIFO with second chance, “Clock  Hardware support for tracking side Algorithm” effects and rolling back © 2017 Dr. Jeffrey A. Turkstra 35  etc © 2017 Dr. Jeffrey A. Turkstra 36

mmap()

 Text segment, for example  Pages not read by default

 On-demand as accessed 0xFFFFFFFF  Fast startup text  Reduced memory usage mmap text 0x00020000  Physical pages for text segment and Executable File shared libraries can be shared  Protections: PROT_READ|PROT_EXEC 0x00000000 Disk  MAP_PRIVATE Virtual Memory © 2017 Dr. Jeffrey A. Turkstra 37 © 2017 Dr. Jeffrey A. Turkstra 3838

Shared code/library Data segment

 Multiple instances?  Data shared until write  “Copy on write” text text text

Process 1 Physical Process 2 Virtual Memory Virtual Memory Memory

© 2017 Dr. Jeffrey A. Turkstra 3939 © 2017 Dr. Jeffrey A. Turkstra 40

Data page A Data page A Data page A Data page A* Data page A Data page A Data page B Data page B Data page B Data page B Data page B Data page B Data page C Data page C Data page C Data page C Data page C Data page C

Process 1 Process 2 Data page A* Physical Process 1 Virtual Virtual Physical Process 2 Memory Virtual Memory Memory Memory Virtual 41 Memory Memory

© 2017 Dr. Jeffrey A. Turkstra 41 © 2017 Dr. Jeffrey A. Turkstra 42

Copy-on-write

 Happens on fork() too  Critical optimization that allows fork()/exec() approach to work page A 0’s page B 0’s 0’s page C 0’s

Parent’s Physical Virtual Memory Memory

© 2017 Dr. Jeffrey A. Turkstra 43 © 2017 Dr. Jeffrey A. Turkstra 44

Questions?

page A 0’s page B X 0’s page C 0’s page B X

Parent’s Physical Virtual Memory Memory

© 2017 Dr. Jeffrey A. Turkstra 45 © 2017 Dr. Jeffrey A. Turkstra 46