Chapter 9. Address Translation and Virtual Memory

SPEDE-2000 Lab Manual, CSU Sacramento

You step in the stream
But the water has moved on.
Page not found.
-- Computer Haiku error message

The Intel386 and later models include an on-chip paging memory management unit (MMU). Paging occurs after the logical address has been resolved to a linear address. If paging is enabled, the linear address is split into a page frame number and an offset, run through the page tables, and then sent to the bus interface unit of the CPU. This process creates a virtual address space.

The first section describes how a logical address (selector and offset) is converted to a physical address. It starts with an overview of the whole process, then explains the two steps in detail. The next section describes the paging system. When a page fault occurs, the offending linear address is stored in CR2. The chapter ends with information about setting up virtual address spaces. Many OS textbooks describe the Intel segmentation-paging system; you might want to consult those texts as well.

The Address Conversion Process

All addresses on a Pentium CPU begin as logical addresses consisting of a selector and an offset. The CPU sends a physical address to memory. When paging is enabled, each address goes through two conversions: segmentation, then paging. A protected-mode, virtual memory OS uses both.

First the selector is used to index into a descriptor table, usually the Global Descriptor Table (GDT). The descriptor provides a base address, which is added to the offset supplied by the original logical address. This sum is the linear address. The offset is compared against the segment’s limit value to ensure the offset is within bounds. See Figure 9-1 for a picture of this.

With paging enabled (bit 31 in CR0 set), the linear address is really two values: a page frame number (PFN) and an offset within that page.
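As a rough illustration of that split, the C sketch below tests the paging bit and separates a linear address into its upper 20 bits and lower 12 bits. The macro and function names (CR0_PG, page_number, page_offset, and so on) are made up for this example and are not part of SPEDE; the CR0 value is passed in as an ordinary integer rather than read from the real register.

```c
#include <stdint.h>

#define CR0_PG      0x80000000u  /* bit 31 of CR0: paging enabled */
#define PAGE_SHIFT  12           /* 4K pages: low 12 bits are the offset */
#define PAGE_MASK   0x00000FFFu

/* Nonzero when the paging bit is set in the given CR0 value. */
static int paging_enabled(uint32_t cr0)
{
    return (cr0 & CR0_PG) != 0;
}

/* Upper 20 bits of the linear address: the page number. */
static uint32_t page_number(uint32_t linear)
{
    return linear >> PAGE_SHIFT;
}

/* Lower 12 bits: the byte offset within the 4K page. */
static uint32_t page_offset(uint32_t linear)
{
    return linear & PAGE_MASK;
}
```

For the linear address 0x00403ABC, page_number() gives 0x00403 and page_offset() gives 0xABC.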
The Intel Pentium uses a two-level scheme for page numbers, so the PFN is actually a directory index and a page table index. This scheme reduces the number of page tables required when there are “holes” (unmapped areas) in the address space. This way a very large address space can be supported with a small amount of physical RAM. Each page entry is 4 bytes.

The descriptor table and the page tables are all located in system memory. To complete one memory access for a program, the CPU must read a descriptor, a page directory entry, and a page table entry from memory. So for every program memory access, the CPU must perform three additional accesses. This would really slow down any program. For that reason, the CPU caches as much of this information as it can on-chip.

In normal operation, only three or so selectors are used. When a segment (selector) register is first loaded, the CPU checks that the descriptor is valid, and if so, loads its contents into that register’s cache storage (these cache registers are hidden from the programmer). When the CPU adds the segment’s base address to the offset in the logical address, both values are already inside the CPU.

Figure 9-1: Overview of Segmentation and Paging. (Segmentation: the selector picks a segment descriptor from the Global Descriptor Table, located by GDTR, and the descriptor’s base is added to the offset to form the linear address. Paging: the linear address is split into Dir, Table, and Offset fields, which select page entries to form the physical address.)

Even though each page table is 4K, the CPU doesn’t need to read the whole thing to translate a linear address. It needs only one entry from the page directory (top level) and one entry from the page table (second level). A translation lookaside buffer (TLB) caches these recently used page entries. Each time a new set of page tables is used (e.g., each address space has its own set), this cache must be flushed (i.e., emptied).
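To make the caching idea concrete, here is a toy software model of a TLB in C. It is only a sketch under stated assumptions: the 8-entry size, the direct-mapped layout, and the names tlb_flush, tlb_insert, and tlb_lookup are all invented for illustration, and it does not describe how the Pentium’s TLB is actually organized.

```c
#include <stdint.h>
#include <string.h>

#define TLB_ENTRIES 8   /* hypothetical size; real TLBs differ */

struct tlb_entry {
    int      valid;   /* 0 after a flush */
    uint32_t page;    /* upper 20 bits of the linear address */
    uint32_t frame;   /* upper 20 bits of the physical address */
};

static struct tlb_entry tlb[TLB_ENTRIES];

/* Empty the cache so that every entry is invalid. */
static void tlb_flush(void)
{
    memset(tlb, 0, sizeof tlb);
}

/* Remember a translation, replacing whatever shared its slot. */
static void tlb_insert(uint32_t page, uint32_t frame)
{
    struct tlb_entry *e = &tlb[page % TLB_ENTRIES];
    e->valid = 1;
    e->page  = page;
    e->frame = frame;
}

/* Return 1 and store the frame on a hit; return 0 on a miss. */
static int tlb_lookup(uint32_t page, uint32_t *frame)
{
    const struct tlb_entry *e = &tlb[page % TLB_ENTRIES];
    if (e->valid && e->page == page) {
        *frame = e->frame;
        return 1;
    }
    return 0;
}
```

A hit skips the two table reads entirely. After tlb_flush() every lookup misses until the entries are reloaded, which is why stale entries from the old page tables must be removed by a flush.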
The CPU performs this flush automatically when CR3 (page directory base register) is loaded. After the TLB is flushed (i.e., “cold”), the next few memory accesses will cost many extra memory cycles while the entries are reloaded.

♦ Segmentation

Figure 9-2 below shows how a segmented protected-mode address becomes a linear address: a logical address goes in and a linear address comes out. Open arrows indicate base addresses.

Each descriptor has four fields of primary interest. The first is the type information defining it as a code or data segment. Second are the access (permission) bits, which state whether the whole segment can be written (if data) or executed (if code). Third is the base linear address of the segment, and last is the size, or limit, of the segment.

For 159, all the segments are set up with a base address of zero. This way every segment covers the same address space, and an offset equals its linear address. The limit is set to 4GB, so that won’t get in your way. All this is done by the boot loader, before FLAMES runs.

The CPU register GDTR (global descriptor table register) supplies a base address and limit for the descriptor table. The selector’s upper 13 bits select a descriptor, and the offset is checked against that descriptor’s limit. If the limit is exceeded, a general protection fault will occur. This stops the memory access and terminates the instruction, but the EIP register will point to the faulting instruction so it can be retried once the OS has recovered from the general protection fault.

Note that the LDTR holds a selector, not a pointer value. Its base and limit come from the descriptor it indicates.

There are a couple of places where an incorrect segment can be referenced. First, the descriptor index must be within the descriptor table. Bit 2 of the selector is the table indicator (TI), and determines whether the GDT (zero) or the LDT (one) is used. The segment might also be accessed in an invalid manner, e.g., writing to a code segment. All these conditions will generate a general protection fault.
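The selector fields and the limit check can be sketched in C as follows. This is a simplified model, not the hardware’s descriptor layout: struct seg_desc and the function names are hypothetical, the limit is assumed to be the highest valid offset of an ordinary (expand-up) segment, and the type and access checks are left out.

```c
#include <stdint.h>

/* Simplified descriptor: only the fields this sketch needs. */
struct seg_desc {
    uint32_t base;    /* linear base address of the segment */
    uint32_t limit;   /* highest valid offset in the segment */
};

/* Upper 13 bits of the selector: index into the descriptor table. */
static unsigned sel_index(uint16_t sel)
{
    return sel >> 3;
}

/* Bit 2 of the selector: table indicator, 0 = GDT, 1 = LDT. */
static unsigned sel_ti(uint16_t sel)
{
    return (sel >> 2) & 1u;
}

/*
 * Form a linear address from a descriptor and an offset.
 * Returns 0 on success, or -1 where the CPU would raise a
 * general protection fault because the offset is out of bounds.
 */
static int seg_to_linear(const struct seg_desc *d, uint32_t offset,
                         uint32_t *linear)
{
    if (offset > d->limit)
        return -1;
    *linear = d->base + offset;
    return 0;
}
```

With a flat descriptor (base 0, limit 0xFFFFFFFF), as used for 159, seg_to_linear() returns the offset unchanged, so the logical offset and the linear address are the same number.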
Figure 9-2. Logical to Linear Address Translation (first part). (A 16-bit selector and a 32-bit offset form the logical address. The selector is really a 13-bit index, with TI=0 meaning the GDT, into the Global or Local Descriptor Table (8,192 entries each), located by GDTR and LDTR. The selected code or data descriptor supplies the segment’s base address and limit. The offset is added to the base to form the linear address and is compared with the limit; an out-of-range offset causes a fault.)

♦ How the Page Tables Work

This section describes how a linear address is translated through the page tables to generate a physical address. The page directory and all the page tables are stored in main memory. If paging is disabled, the linear address is emitted from the CPU as the physical address. If paging is enabled, the two-level page tables are referenced.

As shown in Figure 9-3 below, the linear address is chopped into three fields (described next). Two of those fields index into tables of 1024 page table entries (PTEs). Each PTE contains a physical base address and some status bits. Twenty bits form the base address used in the next level down. The base address from the page table provides the upper 20 address bits of the frame.

The CPU caches portions of the tables in a Translation Look-aside Buffer (TLB). Thus, once it has cached the two entries for a page, it can access that 4K chunk of linear memory without having to read those entries again.

GENERATING A PHYSICAL ADDRESS

The base address of the segment is added to the offset from the memory reference to generate a linear address. If paging is enabled, the CPU’s memory interface unit (MIU) gets a chance to change this address. The linear address is split into three pieces. The top two fields are used as index values into the page tables for the current address space. The page tables form a sparse, two-level, 1024-ary tree, anchored by the CPU’s CR3 register (page directory base register).
The upper 10 bits are combined with CR3, which holds the base address of the page directory, to find the appropriate page directory entry. That is, address bits 31 to 22 index into the directory to get a page table pointer. Address bits 21 to 12 select the page table entry holding the frame’s base address. This base is combined with the lower 12 bits (page offset) to get a physical address inside the page frame.

Figure 9-3. Logical to Physical Address Translation (second part). (The linear address is divided, msb to lsb, into a Page Directory Index (10 bits), a Page Table Index (10 bits), and a Page Frame Offset (12 bits). PDBR (CR3) locates the page directory; the offset is combined with the frame’s base address to form the physical address.)

Each index is 10 bits, so it can index 1024 different page entries. Each page entry is 4 bytes, therefore each page table is 4K bytes in size. This is also the size of a page frame!

When paging is enabled, the two-level page tables are referenced. The upper ten bits index into the page directory. Each page table entry (PTE) contains a physical base address and some status bits. Twenty bits form this physical base address, and they are combined with the lower twelve bits of the linear address (a perfect match) to finally generate the physical address. (The status bits are masked out when forming an address.) If either the page directory entry or the PTE is marked not present, a page fault will occur. Register CR2 will contain the linear (virtual) address that caused the fault.
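The whole walk can be sketched in C against a small block of simulated memory. Everything here is an assumption made for illustration: phys_mem stands in for RAM, cr2 is an ordinary variable standing in for the CR2 register, and translate() and read_entry() are hypothetical names. Only the present bit (bit 0) is modeled; the other status bits are simply masked out.

```c
#include <stdint.h>

#define PTE_PRESENT 0x00000001u   /* status bit 0: entry is present */
#define PTE_BASE    0xFFFFF000u   /* upper 20 bits: physical base */
#define MEM_WORDS   4096u         /* hypothetical 16K of simulated RAM */

static uint32_t phys_mem[MEM_WORDS];  /* simulated physical memory */
static uint32_t cr2;                  /* simulated CR2 register */

/* Read a 32-bit entry; addresses outside simulated RAM read as 0. */
static uint32_t read_entry(uint32_t paddr)
{
    uint32_t word = paddr / 4;
    return (word < MEM_WORDS) ? phys_mem[word] : 0;
}

/*
 * Walk the two-level tables rooted at cr3.
 * Returns 0 and stores the physical address on success;
 * returns -1 and records the linear address in cr2 on a page fault.
 */
static int translate(uint32_t cr3, uint32_t linear, uint32_t *phys)
{
    uint32_t dir_idx = linear >> 22;             /* bits 31..22 */
    uint32_t tbl_idx = (linear >> 12) & 0x3FFu;  /* bits 21..12 */
    uint32_t offset  = linear & 0xFFFu;          /* bits 11..0  */

    /* First level: page directory entry, 4 bytes per entry. */
    uint32_t pde = read_entry((cr3 & PTE_BASE) + dir_idx * 4);
    if (!(pde & PTE_PRESENT)) {
        cr2 = linear;
        return -1;
    }

    /* Second level: page table entry in the table the PDE points to. */
    uint32_t pte = read_entry((pde & PTE_BASE) + tbl_idx * 4);
    if (!(pte & PTE_PRESENT)) {
        cr2 = linear;
        return -1;
    }

    /* Frame base (status bits masked out) combined with the offset. */
    *phys = (pte & PTE_BASE) | offset;
    return 0;
}
```

For example, with the directory at physical address 0, directory entry 1 pointing to a page table at 0x1000, and entry 3 of that table pointing to the frame at 0x2000, the linear address 0x00403ABC translates to the physical address 0x00002ABC. A linear address whose directory entry or page table entry is zero returns -1 and leaves that address in cr2.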
