
Lecture 14: Paging

Fall 2018 Jason Tang

Slides based upon Operating System Concepts slides, http://codex.cs.yale.edu/avi/os-book/OS9/slide-dir/index.html Copyright Silberschatz, Galvin, and Gagne, 2013

1 Topics

• Memory Mappings

• Page Table

• Translation Lookaside Buffers

• Page Protection

2 Memory Mapped

• Logical address translated not to memory but some other location

• Memory-mapped I/O (MMIO): hardware redirects reads/writes of certain addresses to a physical device

• For example, on x86, address 0x3F8 is usually mapped to the first serial port

• Memory-mapped file (mmap): OS redirects read/write of mapped memory region to file on disk

• Every call to read() / write() involves a system call

• Writing through a pointer is faster, and OS can translate it into file writes in the background (also see upcoming lecture)

3 Memory-Mapped File

• Typical pattern is to:

• Open file, using open() function

• Optional: preallocate file size, using ftruncate()

• Create memory mapping, using mmap() function

• Do work, and then release mapping using munmap() function

• Kernel might not write data to disk until munmap()

4 mmap() function

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)

• addr is target address of mapping, or NULL to let kernel decide

• length is number of bytes to map

• prot defines the mapping's protection (for example, read-only or read/write)

• flags sets other options

• fd is file descriptor that was returned by open()

• offset is offset into file specified by fd

5 mmap() example part 1

• See man page for each of these functions to find which header files must be included

• Used here, open() creates the file /tmp/mmap if it does not already exist, and sets that file’s permissions to be readable and writable by user

• Used here, ftruncate() resizes /tmp/mmap to be 1000 bytes

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        /* create /tmp/mmap if needed, readable and writable by user */
        int fd = open("/tmp/mmap", O_RDWR | O_CREAT,
                      S_IRUSR | S_IWUSR);
        /* resize the file to 1000 bytes */
        ftruncate(fd, 1000);
        /* continued in part 2 */

6 mmap() example part 2

• dest is set to starting address of memory mapped region

• 1000 bytes are mapped for reading and writing

• strcpy() will indirectly modify contents of /tmp/mmap

• Changes to /tmp/mmap may be cached until munmap() flushes the data

        /* continued from part 1 */
        void *dest;
        dest = mmap(NULL, 1000, PROT_READ | PROT_WRITE,
                    MAP_SHARED, fd, 0);
        if (dest == MAP_FAILED) {
            fprintf(stderr, "mmap() error\n");
            exit(1);
        }
        /* writing through the pointer modifies the file */
        strcpy(dest, "Hello, world!");
        munmap(dest, 1000);
        close(fd);
        return 0;
    }
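
The claim that writes through the mapping end up in the file can be checked with a small round trip. The following is a minimal sketch, not part of the original slides: the helper name mmap_round_trip() and the test path are hypothetical, and the call to msync() is an added step that explicitly asks the kernel to flush the mapping before it is released.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations under strict -std modes */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: write msg into the file at path through a shared
 * mapping of len bytes, then read the file back with read() into out
 * (out must hold at least len bytes).
 * Returns 0 on success, -1 on any error. */
static int mmap_round_trip(const char *path, const char *msg,
                           char *out, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;
    /* msg plus its terminating NUL must fit; file is zero-filled to len */
    if (strlen(msg) >= len || ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    char *dest = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (dest == MAP_FAILED) {
        close(fd);
        return -1;
    }
    strcpy(dest, msg);                   /* modifies the file indirectly */
    int rc = msync(dest, len, MS_SYNC);  /* added step: request a flush now */
    if (munmap(dest, len) != 0)
        rc = -1;
    /* mmap() does not move the file offset, so read() starts at byte 0 */
    ssize_t n = (rc == 0) ? read(fd, out, len) : -1;
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}
```

Unlike the slide code, this sketch checks every return value, which is usually worth doing because open(), ftruncate(), and mmap() can all fail.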

7 Paging

• A process’s physical address space need not be contiguous, as long as there exists a page table to translate addresses

• Avoids external fragmentation

• Avoids problem of varying sized memory chunks

• Page Frame (or just frame): physical memory divided into fixed-size blocks

• Frame sizes are powers of 2, between 512 B and 16 MiB

• On x86-64, frames default to 4096 bytes (2^12 bytes)

8 Paging

• Logical memory divided into blocks called pages (not to be confused with page frames)

• Size of page equal to size of page frame

• OS keeps track of all free frames within its allocation table

• To run a program of size N pages, need to find N free frames to load program

• Page Table: translates logical to physical addresses (that is, pages to frames)

• Every process has its own page table

9 Address Translation

• Logical address generated by CPU is divided into:

• Page Number (p): used as an index into page table which contains base address of each page in physical memory

• Page Offset (d): combined with base address to calculate physical address sent to memory unit

10 Paging Example

Page Number    Frame Number
0x0000         0x1000
0x1000         0x2000
0x2000         0x4300
0x3000         0xA000

• In this example, both logical and physical addresses range from 0 to 2^32 - 1

• Let the page size (and thus frame size) be 64 KiB (2^16 bytes)

• Then page offsets d range from 0 to 2^16 - 1 (lower 16 bits of logical address)

• Therefore page numbers p range from 0 to 2^16 - 1 (top 16 bits of logical address)
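
The split into p and d can be sketched in C with a shift and a mask. This is an illustrative sketch only: the names translate32(), struct mapping, and example_table are hypothetical, and the linear search stands in for whatever lookup real hardware performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OFFSET_BITS 16u                        /* 64 KiB pages */
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1u) /* 0xFFFF */

/* One row of the example page table (hypothetical layout). */
struct mapping {
    uint32_t page;
    uint32_t frame;
};

/* The four rows listed in the example table. */
static const struct mapping example_table[] = {
    { 0x0000, 0x1000 },
    { 0x1000, 0x2000 },
    { 0x2000, 0x4300 },
    { 0x3000, 0xA000 },
};

/* Translate a 32-bit logical address. Returns 0 and stores the result
 * in *physical, or returns -1 if the page has no row in the table. */
static int translate32(uint32_t logical, uint32_t *physical)
{
    uint32_t p = logical >> OFFSET_BITS;  /* top 16 bits */
    uint32_t d = logical & OFFSET_MASK;   /* lower 16 bits */
    size_t rows = sizeof example_table / sizeof example_table[0];

    for (size_t i = 0; i < rows; i++) {
        if (example_table[i].page == p) {
            *physical = (example_table[i].frame << OFFSET_BITS) | d;
            return 0;
        }
    }
    return -1;
}
```

For instance, logical address 0x2000ABCD has p = 0x2000 and d = 0xABCD, so it maps to frame 0x4300 and physical address 0x4300ABCD.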

11 More Complicated Paging Example

• Let logical addresses be 4 bits (0 to 2^4 - 1)

• Let top 2 bits be page number (and thus offset is remaining 2 bits and frame size is 4 bytes) given paging table:

Page Number    Frame Number
0x0            0x5
0x1            0x6
0x2            0x1
0x3            0x2

• Then logical address 0xd (binary 1 1 0 1) is physical address 0x9

• p = 3 and d = 1, and page 3 maps to frame 2, so address = (2 * 4) + 1 = 9
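
When the page number is used directly as an array index, the whole translation is a few lines. A minimal sketch of the 4-bit example, assuming the table above; the names frame_of and translate4() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* frame_of[p] is the frame number for page p, copied from the table. */
static const uint8_t frame_of[4] = { 0x5, 0x6, 0x1, 0x2 };

/* Translate a 4-bit logical address: top 2 bits select the page,
 * low 2 bits are the offset within a 4-byte frame. */
static unsigned translate4(unsigned logical)
{
    unsigned p = (logical >> 2) & 0x3u;
    unsigned d = logical & 0x3u;
    return frame_of[p] * 4u + d;
}
```

Note that the resulting physical addresses can exceed 4 bits (page 0 starts at physical address 20), which is allowed because the logical and physical ranges need not match.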

12 Page Sizes

• Range of logical addresses need not match range of physical addresses

• Example:

• CPU has 32-bit (logical) addressing

• Page size is 4096 bytes, so offset d is 12 bits

• Within page table, each page number can refer to one of 2^32 frames

• Total physical address space is 2^44 bytes (one of 2^32 frames, each with 2^12 byte offsets)
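
The arithmetic behind this example, written out as a short derivation (the 20-bit page number is implied by the 32-bit logical address and the 12-bit offset rather than stated on the slide):

```latex
\begin{align*}
\text{page number bits} &= 32 - 12 = 20 \\
\text{logical address space} &= 2^{20} \times 2^{12} = 2^{32}\ \text{bytes} = 4\ \text{GiB} \\
\text{physical address space} &= 2^{32}\ \text{frames} \times 2^{12}\ \text{bytes per frame}
  = 2^{44}\ \text{bytes} = 16\ \text{TiB}
\end{align*}
```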

13 Page Sizes

• Calculating internal fragmentation:

• Let page size = 2048 bytes and a process size = 72766 bytes

• Requires 35 pages + 1086 bytes

• Internal fragmentation of 962 bytes (2048 - 1086)

• Smaller frame sizes mean less internal fragmentation, but a larger page table (and OS must maintain more bookkeeping)

• Frame Table: data structure maintained by OS of all frames free or in use
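
The internal fragmentation calculation above can be expressed as two small helpers. This is a sketch with hypothetical function names, assuming the process occupies whole pages except possibly the last one.

```c
#include <assert.h>
#include <stddef.h>

/* Number of pages needed to hold process_size bytes (rounds up). */
static size_t pages_needed(size_t process_size, size_t page_size)
{
    return (process_size + page_size - 1) / page_size;
}

/* Unused bytes in the last page; 0 if the process fills it exactly. */
static size_t internal_fragmentation(size_t process_size, size_t page_size)
{
    size_t used_in_last = process_size % page_size;
    return used_in_last == 0 ? 0 : page_size - used_in_last;
}
```

With page size 2048 and process size 72766, the process needs 36 frames in total (35 full pages plus one partial page), and 962 bytes of the last frame go unused.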

14 Implementing Page Table

• Each process has its own page table; current process’s page table loaded into memory (either RAM, or entirely in registers if table is small enough)

• Page-table base register (PTBR): points to page table within RAM

• Page-table length register (PTLR): size of page table

• Every address requires two memory accesses: one to page table, then one to final physical address

• Can be sped up via a hardware cache, called associative memory or a translation look-aside buffer (TLB)

15 Page Table

[Figure: 48-bit virtual address (bits 47-12: 36-bit Virtual Page Number; bits 11-0: Page Offset) translated via the Page Table Register and a page table entry (Dirty, Valid, 28-bit Physical Page Number) into a 40-bit physical address (bits 39-12: Physical Page Number; bits 11-0: Page Offset)]

• Every page table entry has control flags:

• Valid Bit: If set, then proceed with memory access; otherwise raise exception

• Dirty Bit: Set by hardware when page is modified

• In simplest case, virtual page number is an index number into page table
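
The valid and dirty bits can be modeled in software to show the order of checks. The sketch below is an assumption-laden illustration, not a real hardware layout: struct pte, access_page(), and the entries bound (which plays the role of a page-table length register) are hypothetical names, and the table is kept tiny rather than covering all 2^36 virtual page numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  ((UINT64_C(1) << PAGE_SHIFT) - 1)

/* Simplified page table entry (hypothetical layout). */
struct pte {
    bool     valid;  /* if clear, any access raises an exception */
    bool     dirty;  /* set when the page is written */
    uint32_t ppn;    /* physical page number (28 bits used) */
};

/* Simulate one memory access. The virtual page number indexes the table
 * directly. Returns true and fills *phys on success; returns false where
 * hardware would raise an exception. */
static bool access_page(struct pte *table, size_t entries,
                        uint64_t vaddr, bool is_write, uint64_t *phys)
{
    uint64_t vpn = vaddr >> PAGE_SHIFT;

    if (vpn >= entries || !table[vpn].valid)
        return false;
    if (is_write)
        table[vpn].dirty = true;
    *phys = ((uint64_t)table[vpn].ppn << PAGE_SHIFT) | (vaddr & PAGE_MASK);
    return true;
}
```

A read leaves the dirty bit untouched, while the first write sets it, which is what lets the OS later decide whether a page must be written back.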

16 TLBs

• Some TLBs store address-space identifiers (ASIDs), which uniquely identify each process

• TLBs generally hold few entries

• Example: Intel Skylake has 1536 TLB entries

• When a translation's page number and ASID both match a TLB entry, the TLB supplies the correct physical address (a TLB hit)

• On TLB miss, hardware looks up physical address within page table, and loads address into TLB for faster access next time

• Some TLB entries can be wired down for permanent fast access
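
The hit and miss behavior described above can be sketched as a tiny fully associative TLB whose entries are tagged with an ASID. Everything here is illustrative: the names, the four-entry size, and the round-robin replacement policy are assumptions, and wired-down entries are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TLB_ENTRIES 4  /* deliberately tiny for illustration */

struct tlb_entry {
    bool     valid;
    uint16_t asid;   /* address-space identifier of the owning process */
    uint64_t page;
    uint64_t frame;
};

struct tlb {
    struct tlb_entry entries[TLB_ENTRIES];
    unsigned next;   /* next slot to overwrite (round-robin, an assumption) */
};

/* Returns true on a TLB hit and stores the frame number in *frame.
 * An entry only matches if both the ASID and the page number match. */
static bool tlb_lookup(const struct tlb *t, uint16_t asid,
                       uint64_t page, uint64_t *frame)
{
    for (unsigned i = 0; i < TLB_ENTRIES; i++) {
        const struct tlb_entry *e = &t->entries[i];
        if (e->valid && e->asid == asid && e->page == page) {
            *frame = e->frame;
            return true;
        }
    }
    return false;
}

/* After a miss, the translation found in the page table is cached here. */
static void tlb_insert(struct tlb *t, uint16_t asid,
                       uint64_t page, uint64_t frame)
{
    t->entries[t->next] = (struct tlb_entry){ true, asid, page, frame };
    t->next = (t->next + 1) % TLB_ENTRIES;
}
```

Because entries are tagged with an ASID, a second process looking up the same page number misses instead of receiving the first process's frame.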

17 Paging Hardware with TLB

18 Effective Access Time

• Let ε = time to perform TLB lookup

• Let α = hit ratio (percentage of times that requested page is in TLB)

• If ε = 20 ns, α = 80%, and 100 ns for each memory access, then effective access time = ε + α × 100 + (1 - α)(2 × 100) = 20 ns + 0.8 × 100 ns + 0.2 × 200 ns = 140 ns

• More realistic example: α = 99%

• EAT = 20 ns + 0.99 × 100 ns + 0.01 × 200 ns = 121 ns

• During a TLB miss, a CPU with hyper-threading can execute other opcodes
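
The effective access time formula above is easy to evaluate for other hit ratios. A minimal sketch, with a hypothetical function name and all times in nanoseconds:

```c
#include <assert.h>

/* EAT = epsilon + alpha * mem + (1 - alpha) * (2 * mem)
 * tlb_ns:    time for the TLB lookup (epsilon)
 * hit_ratio: fraction of lookups that hit, between 0.0 and 1.0 (alpha)
 * mem_ns:    time for one memory access */
static double effective_access_time(double tlb_ns, double hit_ratio,
                                    double mem_ns)
{
    return tlb_ns + hit_ratio * mem_ns
         + (1.0 - hit_ratio) * (2.0 * mem_ns);
}
```

Calling it with (20, 0.80, 100) and (20, 0.99, 100) reproduces the two results on this slide, up to floating-point rounding.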

19 Memory Protection

• Each page table entry has protection bit(s) indicating permitted accesses (read, write, execute, and others)

• Page tables also have valid bit for each entry:

• Valid: associated page is within process’s logical address space, and is thus legal to access

• Invalid: page not in logical address space
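
The two checks described above, valid bit first and then protection bits, can be sketched as follows. The flag names, struct, and result codes are hypothetical and do not correspond to any particular architecture's page table format.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical permission flags for a page table entry. */
enum { PERM_READ = 1, PERM_WRITE = 2, PERM_EXEC = 4 };

struct pte_prot {
    bool     valid;  /* page is within the process's logical address space */
    unsigned perms;  /* bitwise OR of PERM_* flags */
};

enum access_result { ACCESS_OK, FAULT_INVALID, FAULT_PROTECTION };

/* Decide whether an access needing the permissions in wanted is allowed. */
static enum access_result check_access(struct pte_prot e, unsigned wanted)
{
    if (!e.valid)
        return FAULT_INVALID;
    if ((e.perms & wanted) != wanted)
        return FAULT_PROTECTION;
    return ACCESS_OK;
}
```

For example, writing to a valid page that is mapped read-only yields FAULT_PROTECTION, while any access to an invalid page yields FAULT_INVALID regardless of its permission bits.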

20 Invalid Pages

• Violation of protection or valid bits results in an exception

• Segmentation fault: dereferencing an invalid page

• NULL is defined as: (void *) 0

• By default, on Linux x86, the lowest legal address for userspace is 65536

• Pages below virtual address 65536 are marked as invalid

• Reading or writing to those invalid pages causes a hardware interrupt

• Kernel then sends signal 11 (SIGSEGV) to process

21 Pages and Frames in Linux

• Multiple pages could point to same frame

• In Linux, if two programs both use the same shared library, then both processes’ page tables will have entry/entries to shared library frame(s)

• Threads of a process share the same page mappings to heap frames and other global values

• A page could refer to a different frame over time

• When process is swapped out and then swapped in, Linux can assign a different frame

• Linux can migrate pages to different frames to defragment memory or for NUMA systems
