
Chapter 2 Memory Addressing

Hsung-Pin Chang
Department of Computer Science
National Chung Hsing University

Outline
• Memory address
• Segmentation
  – Segmentation in Hardware
  – Segmentation in Linux
• Paging
  – Paging in Hardware
  – Paging in Linux

Memory Address
• Until now, a memory address has simply been the way to access a memory cell
  – However, with the 80x86, we have to specify precisely which kind of “address” we mean
• Memory address
  – Logical address
  – Linear address (or virtual address)
  – Physical address
• Logical address
  – Used to specify the address of an operand or of an instruction
  – Consists of a segment and an offset value
• Linear address
  – A single 32-bit unsigned integer that can be used to address up to 4 GB
• Physical address
  – Used to address memory cells in memory chips
  – Represented as a 32-bit unsigned integer

[Figure: the Segmentation Unit (HW) translates a logical address into a linear address; the Paging Unit (HW) translates the linear address into a physical address.]

Segmentation in Hardware
• Segmentation Registers
• Segment Descriptors
• Fast Access to Segment Descriptors
• Segmentation Unit

Segmentation Registers
• A logical address consists of two parts
  – A segment identifier
    • A 16-bit field called the Segment Selector
  – An offset that specifies the relative address within the segment
    • A 32-bit field
• To retrieve Segment Selectors quickly, the processor provides segmentation registers
  – cs: code segment register
  – ss: stack segment register
  – ds: data segment register
  – es, fs, and gs: general purpose
• The cs register has a 2-bit field that specifies the Current Privilege Level (CPL) of the CPU
  – 0: the highest privilege level
  – 3: the lowest one
• Linux uses only levels 0 and 3, for Kernel Mode and User Mode respectively

Segment Descriptors

• Each segment is represented by an 8-byte Segment Descriptor
• Segment Descriptors are stored either in the Global Descriptor Table (GDT) or in a Local Descriptor Table (LDT)
  – The gdtr register points to the address of the GDT in memory
  – The ldtr register points to the address of the LDT in memory
• Each Segment Descriptor consists of the following fields
  – 32-bit Base field: contains the linear address of the first byte of the segment
  – G (granularity) flag
    • 0: the segment size is expressed in bytes
    • 1: it is expressed in multiples of 4096 bytes
  – 20-bit Limit field: denotes the segment length
  – S (system) flag
    • 0: a system segment that stores kernel data structures
    • 1: a normal code or data segment
  – 4-bit Type field: characterizes the segment type and its access rights
  – DPL (Descriptor Privilege Level): a 2-bit field that represents the minimal CPU privilege level required for accessing the segment
  – Segment-Present flag
    • 0: not stored in memory
    • 1: in memory
    • Linux always sets it to 1, since it never swaps out whole segments to disk
• Code Segment Descriptor
• Data Segment Descriptor
• Task State Segment Descriptor (TSSD)
  – Refers to a Task State Segment (TSS), that is, a segment used to save the contents of the processor registers
  – Appears only in the GDT
• Local Descriptor Table Descriptor (LDTD)
  – Refers to a segment containing an LDT
  – Appears only in the GDT

Translating a Logical Address
• As stated before
  – A logical address consists of a 16-bit Segment Selector and a 32-bit Offset
  – Segmentation registers store the Segment Selector
• Each Segment Selector contains
  – A 13-bit index that identifies the Segment Descriptor entry contained in the GDT or LDT
  – A TI (Table Indicator) flag
    • 0: the Segment Descriptor is in the GDT
    • 1: the Segment Descriptor is in the LDT
  – A 2-bit RPL (Requestor Privilege Level) field
    • Specifies the Current Privilege Level of the CPU when the selector is loaded into the cs register

[Figure: translating a logical address. The Index and TI fields of the 16-bit Segment Selector pick a Segment Descriptor from the descriptor table (gdt or ldt), which is located through gdtr or ldtr; the descriptor is combined with the 32-bit Offset by an addition to give the linear address.]

Fast Access to Segment Descriptors
• Intel also provides an additional nonprogrammable register for each segmentation register
  – It contains the 8-byte Segment Descriptor specified by the corresponding segmentation register
  – Once a Segment Selector is loaded into a segmentation register
    • The corresponding Segment Descriptor is also loaded into the matching nonprogrammable register
  – Thus, logical addresses can be translated without accessing the GDT or LDT in main memory

Segment Selector and Segment Descriptor

[Figure: each segmentation register holds a 16-bit Segment Selector and is paired with a nonprogrammable register holding the Segment Descriptor, which is copied from the Segment Descriptor Table and describes the segment.]

Segmentation in Linux
• Segmentation and paging are similar, since both can separate the physical address spaces of processes
• Linux prefers paging to segmentation since
  – Memory management is simpler when all processes share the same set of linear addresses
  – Linux is meant to be portable, and RISC architectures have limited support for segmentation
• Linux uses segmentation in a very limited way
  – Only when required by the 80x86 CPU
  – All processes use the same logical addresses
  – Linux tries to store all Segment Descriptors in the GDT
• Linux uses the following segments
  – Kernel code segment
    • Base = 0x00000000
    • Limit = 0xfffff
    • G = 1, that is, expressed in pages
    • S = 1, for a normal code or data segment
    • Type = 0xa, can be read and executed
    • DPL = 0, for Kernel Mode
    • That is, kernel 4 GB code at 0x00000000
  – Kernel data segment
    • Base = 0x00000000
    • Limit = 0xfffff
    • G = 1, that is, expressed in pages
    • S = 1, for a normal code or data segment
    • Type = 2, can be read and written
    • DPL = 0, for Kernel Mode
    • That is, kernel 4 GB data at 0x00000000
  – User code segment, shared by all processes in User Mode
    • Base = 0x00000000
    • Limit = 0xfffff
    • G = 1, that is, expressed in pages
    • S = 1, for a normal code or data segment
    • Type = 0xa, can be read and executed
    • DPL = 3, for User Mode
    • That is, user 4 GB code at 0x00000000
  – User data segment, shared by all processes in User Mode
    • Base = 0x00000000
    • Limit = 0xfffff
    • G = 1, that is, expressed in pages
    • S = 1, for a normal code or data segment
    • Type = 2, can be read and written
    • DPL = 3, for User Mode
    • That is, user 4 GB data at 0x00000000
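The selector fields and the four flat segments above can be checked with a few lines of Python. This is only a sketch: the class and function names are made up for illustration, the selector value 0x23 is a hypothetical example, and the size rule (Limit + 1 units of 1 byte or 4096 bytes) is the usual 80x86 reading of the Limit and G fields rather than something stated on the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentDescriptor:
    """Illustrative subset of the Segment Descriptor fields."""
    base: int   # 32-bit linear address where the segment starts
    limit: int  # 20-bit Limit field
    g: int      # granularity flag: 0 = bytes, 1 = 4096-byte units
    dpl: int    # Descriptor Privilege Level

    def size_in_bytes(self) -> int:
        # Assumption: the segment spans Limit + 1 units.
        unit = 4096 if self.g else 1
        return (self.limit + 1) * unit


def decode_selector(selector: int) -> dict:
    """Split a 16-bit Segment Selector into index, TI, and RPL."""
    return {
        "index": selector >> 3,      # 13-bit index into the GDT or LDT
        "ti": (selector >> 2) & 1,   # 0 = GDT, 1 = LDT
        "rpl": selector & 3,         # 2-bit Requestor Privilege Level
    }


def to_linear(desc: SegmentDescriptor, offset: int) -> int:
    """Segmentation unit sketch: linear address = Base + Offset."""
    if not 0 <= offset < desc.size_in_bytes():
        raise ValueError("offset lies outside the segment")
    return (desc.base + offset) & 0xFFFFFFFF


# The flat segments listed above: Base 0, Limit 0xfffff, G = 1.
KERNEL_CODE = SegmentDescriptor(base=0, limit=0xFFFFF, g=1, dpl=0)
USER_DATA = SegmentDescriptor(base=0, limit=0xFFFFF, g=1, dpl=3)

print(decode_selector(0x23))                  # hypothetical selector value
print(KERNEL_CODE.size_in_bytes() == 2**32)   # the segment spans 4 GB
print(hex(to_linear(USER_DATA, 0x08048000)))  # offset and linear address coincide
```

Because every segment starts at 0 and spans 4 GB, the offset of a logical address and the resulting linear address are the same number, which is why Linux can treat segmentation as almost transparent.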
• A Task State Segment (TSS) for each processor
  – Stored in the init_tss array; each segment is 236 bytes long
• A default Local Descriptor Table (LDT) that is shared by all processes
  – Includes only a single entry, consisting of a null Segment Descriptor
• Four segments related to Advanced Power Management (APM) support
  – APM consists of a set of BIOS routines devoted to managing the power states of the system
  – Two data segments and two code segments for APM-related kernel functions
  – See Figure 2.5
• Thus, when switching from Kernel Mode to User Mode
  – The ds register originally contains the Segment Selector of the kernel data segment
  – It is changed to the Segment Selector of the user data segment
  – The ss register also has to be changed accordingly

Paging in Hardware
• The paging unit translates linear addresses into physical ones
• Linear addresses are grouped in fixed-length intervals called pages
  – The corresponding units in RAM are called page frames
• In the 80x86, paging is enabled by setting the PG flag in the cr0 control register; otherwise, paging is disabled

Paging by 80x86
• Each page is 4 KB
• The 32 bits of a linear address are divided into three fields
  – Directory: the most significant 10 bits
  – Table: the intermediate 10 bits
  – Offset: the least significant 12 bits

[Figure: paging by the 80x86. cr3 points to the Page Directory; the Directory field (bits 31-22) selects a Directory Entry, which points to a Page Table; the Table field (bits 21-12) selects a Page-Table Entry, which points to the page frame; the Offset field (bits 11-0) selects the byte within the frame.]

Entries of Page Directories and Page Tables
• Present flag
  – 1: in memory
  – 0: not in memory; the remaining entry bits may be used by the O.S.
• If an entry whose Present flag is 0 is needed for a translation, the paging unit stores the linear address in the cr2 control register and generates the Page Fault exception

• Field containing the 20 most significant bits of a page frame physical address
  – Since a page frame is 4 KB, its physical address is a multiple of 4096, so the 12 least significant bits of that address are always zero
  – If the field refers to a Page Directory, the page frame contains a Page Table
  – If it refers to a Page Table, the page frame contains a page of data
• Accessed flag
  – Set each time the paging unit addresses the corresponding page frame
  – Used by the O.S. to select pages to be swapped out
  – Never reset by the paging unit, only by the O.S.
• Dirty flag
  – Applies only to Page Table entries
  – Set each time a write operation is performed on the page frame
  – Used by the O.S. to select pages to be swapped out
  – Never reset by the paging unit, only by the O.S.
• Read/Write flag
  – Contains the access right of the page or of the Page Table
• User/Supervisor flag
  – Contains the privilege level required to access the page or Page Table
• PCD (Page Cache Disable) and PWT (Page Write-Through) flags
  – Intel allows a different cache management policy to be associated with each page frame
• Page Size flag
  – Applies only to Page Directory entries
  – If set, the entry refers to a 2 MB or 4 MB page frame
• Global flag
  – Prevents frequently used pages from being flushed from the TLB cache

Extended Paging
• Allows page frames to be 4 MB
• The 32-bit linear address is divided into
  – Directory: the most significant 10 bits
  – Offset: the remaining 22 bits
• No intermediate Page Tables are needed, which saves memory and preserves TLB entries
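The two translation paths described so far (4 KB pages through a Page Table, and 4 MB pages through extended paging) can be modelled in a short Python sketch. It is a simplified model, not the hardware algorithm in full: RAM is replaced by Python lists, `page_tables` is a hypothetical dictionary keyed by the physical address of each Page Table, and the bit positions of the Present flag (bit 0) and Page Size flag (bit 7) come from the 80x86 entry layout rather than from the slides. The mappings at the bottom are invented purely to exercise both paths.

```python
PRESENT = 0x001    # bit 0 of an entry: Present flag
PAGE_SIZE = 0x080  # bit 7 of a Page Directory entry: Page Size flag


class PageFault(Exception):
    """Raised where the real paging unit would save the address in cr2."""


def split_linear(addr):
    """Split a 32-bit linear address into (Directory, Table, Offset)."""
    return (addr >> 22) & 0x3FF, (addr >> 12) & 0x3FF, addr & 0xFFF


def translate(page_directory, page_tables, addr):
    """Translate a linear address into a physical address."""
    directory, table, offset = split_linear(addr)
    pde = page_directory[directory]
    if not pde & PRESENT:
        raise PageFault(hex(addr))
    if pde & PAGE_SIZE:
        # Extended paging: 10-bit frame field, 22-bit offset, no Page Table.
        return (pde & 0xFFC00000) | (addr & 0x003FFFFF)
    pte = page_tables[pde & 0xFFFFF000][table]
    if not pte & PRESENT:
        raise PageFault(hex(addr))
    # The entry supplies the top 20 bits; the low 12 come from the offset.
    return (pte & 0xFFFFF000) | offset


# Hypothetical mappings, chosen only to exercise both paths.
page_table = [0] * 1024
page_table[0x48] = 0x00ABC000 | PRESENT
page_directory = [0] * 1024
page_directory[0x20] = 0x00010000 | PRESENT              # points to page_table
page_directory[0x300] = 0x00400000 | PRESENT | PAGE_SIZE  # one 4 MB page frame
page_tables = {0x00010000: page_table}

print(hex(translate(page_directory, page_tables, 0x08048123)))  # 0xabc123
print(hex(translate(page_directory, page_tables, 0xC0012345)))  # 0x412345
```

Any address whose Directory or Table entry has the Present flag cleared, such as linear address 0 in this toy setup, raises `PageFault`, which stands in for the exception generated by the paging unit.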

[Figure: extended paging. cr3 points to the Page Directory; the Directory field (bits 31-22) selects a Directory Entry, which points directly to a 4 MB page frame; the Offset field (bits 21-0) selects the byte within the frame.]

Hardware Protection Scheme
• Segments
  – Allow four possible privilege levels
  – Three types of access rights (Read, Write, Execute)
• Pages
  – Only two privilege levels
  – Only two types of access rights (Read and Write)

Three-Level Paging
• In 64-bit architectures
  – Three-level paging is used instead of two-level paging
  – Otherwise, the Page Directory and Page Tables would include too many entries

Physical Address Extension (PAE) Paging
• Starting with the Pentium Pro, Intel CPUs can address up to 2^36 = 64 GB of RAM
  – A new paging mechanism must be introduced that translates 32-bit linear addresses into 36-bit physical addresses
  – Physical Address Extension (PAE)
    • Enabled by setting the PAE flag in the cr4 control register
• The 64 GB of RAM are split into 2^24 page frames
  – The physical address field of Page Table entries is expanded from 20 to 24 bits
  – The size of each Page Table entry is doubled from 32 bits to 64 bits
• A new level of Page Table called the Page Directory Pointer Table (PDPT)
  – Consists of four 64-bit entries
• cr3 contains the Page Directory Pointer Table base address field
• When pages are 4 KB, the 32-bit linear address is divided into
  – Bits 31-30: point to one of 4 possible entries in the PDPT
  – Bits 29-21: point to one of 512 possible entries in the Page Directory
  – Bits 20-12: point to one of 512 possible entries in the Page Table
  – Bits 11-0: offset within the 4 KB page
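The PAE field widths for 4 KB pages follow from the numbers above, and a small Python sketch makes the arithmetic explicit. The function name is hypothetical, and the remark that a 4 KB table of 64-bit entries holds 512 entries is an inference from the slide figures, not a statement taken from them.

```python
PAGE_BYTES = 4096
ENTRY_BYTES = 8  # PAE entries are 64 bits wide


def split_pae_4k(addr):
    """Split a 32-bit linear address under PAE with 4 KB pages."""
    pdpt = (addr >> 30) & 0x3         # bits 31-30: one of 4 PDPT entries
    directory = (addr >> 21) & 0x1FF  # bits 29-21: one of 512 entries
    table = (addr >> 12) & 0x1FF      # bits 20-12: one of 512 entries
    offset = addr & 0xFFF             # bits 11-0: byte within the page
    return pdpt, directory, table, offset


# 64 GB of RAM in 4 KB frames gives 2^24 page frames.
print(2**36 // PAGE_BYTES == 2**24)
# A 4 KB table of 64-bit entries holds 512 entries, hence 9-bit indexes.
print(PAGE_BYTES // ENTRY_BYTES)
print(split_pae_4k(0xC0012345))
```

The four widths add up to 2 + 9 + 9 + 12 = 32 bits, so the whole linear address is consumed exactly as in the non-PAE case, only with one extra level.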
• When pages are 2 MB, the 32-bit linear address is divided into
  – Bits 31-30: point to one of 4 possible entries in the PDPT
  – Bits 29-21: point to one of 512 possible entries in the Page Directory
  – Bits 20-0: offset within the 2 MB page

Hardware Cache
• Hardware caches work on the basis of the locality principle
• Cache management schemes
  – Direct mapped
  – Fully associative
  – N-way set associative
• CD flag of the cr0 register
  – Enables or disables the cache circuitry
• NW flag of the cr0 register
  – Selects whether write-through or write-back is used
• Intel also allows the O.S. to use a different cache management policy for each page frame
  – PCD (Page Cache Disable)
  – PWT (Page Write-Through)

Translation Lookaside Buffer (TLB)
• Keeps a set of associations between linear addresses and their physical addresses in a cache
• This cache is called the TLB
  – It speeds up linear address translation

Paging in Linux
• Linux adopted a three-level paging model so that paging is feasible on 64-bit architectures
  – Page Global Directory
  – Page Middle Directory
  – Page Table

[Figure: Linux paging model. The linear address is split into Global Dir, Middle Dir, Table, and Offset fields; cr3 points to the Page Global Directory, whose entry points to a Page Middle Directory, whose entry points to a Page Table, whose entry points to the page frame.]

• Linux’s handling of processes relies heavily on paging
  – It assigns a different physical address space to each process, ensuring efficient protection against addressing errors
  – It distinguishes pages from page frames (physical addresses in main memory)
    • A page can be stored in different page frames from time to time
• Each process has its own Page Global Directory and its own set of Page Tables
• When a process switch occurs
  – Linux saves the cr3 control register in the descriptor of the current process and loads cr3 with the value stored in the new process’s descriptor
• When this three-level paging model is applied to the Pentium, which uses only two types of Page Tables
  – Linux eliminates the Page Middle Directory field
  – However, the kernel still keeps the Page Middle Directory by setting the number of entries in it to 1
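One way to picture how the Page Middle Directory is kept but effectively eliminated is to treat the field widths as parameters and give the middle level zero bits. The sketch below is only an illustration of that idea; the function and the `TWO_LEVEL_X86` table of widths are hypothetical and are not taken from the kernel sources.

```python
def split_three_level(addr, pgd_bits, pmd_bits, pte_bits, offset_bits):
    """Split a linear address into (Global Dir, Middle Dir, Table, Offset)."""
    offset = addr & ((1 << offset_bits) - 1)
    addr >>= offset_bits
    pte = addr & ((1 << pte_bits) - 1)
    addr >>= pte_bits
    pmd = addr & ((1 << pmd_bits) - 1)  # always 0 when pmd_bits == 0
    addr >>= pmd_bits
    pgd = addr & ((1 << pgd_bits) - 1)
    return pgd, pmd, pte, offset


# Two-level 80x86 paging expressed in the three-level model.
TWO_LEVEL_X86 = dict(pgd_bits=10, pmd_bits=0, pte_bits=10, offset_bits=12)

# A zero-bit field indexes a directory with exactly one entry.
print(1 << TWO_LEVEL_X86["pmd_bits"])
print(split_three_level(0x08048123, **TWO_LEVEL_X86))
```

With a zero-bit middle field the Middle Dir index is always 0, so the same three-step walk can be written once and still match the two-level hardware.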

• When Linux uses the Physical Address Extension (PAE) mechanism of the Pentium Pro and later processors
  – Linux’s Page Global Directory = 80x86’s Page Directory Pointer Table
  – Linux’s Page Middle Directory = 80x86’s Page Directory
  – Linux’s Page Table = 80x86’s Page Table

Reserved Page Frames
• The kernel’s code and data structures are stored in a group of reserved page frames
  – A page contained in one of these page frames can never be dynamically assigned or swapped out to disk
• The kernel is installed in RAM starting from physical address 0x00100000 (= 1 MB)
  – The total number of page frames required depends on how the kernel is configured; it is often less than 2 MB of RAM
• Why isn’t the kernel loaded starting at 0x00000000?
  – Page frame 0 is used by the BIOS to store the system hardware configuration detected during the Power-On Self-Test (POST)
  – Physical addresses ranging from 0x000a0000 to 0x000fffff are usually reserved for BIOS routines and for mapping the internal memory of ISA graphics cards
    • This is the well-known hole from 640 KB to 1 MB in all IBM-compatible PCs
  – Additional page frames within the first megabyte may be reserved by specific computer models
• In the boot sequence, the kernel queries the BIOS and learns the size of the physical memory
  – In recent computers, the kernel also invokes a BIOS procedure to build a list of physical address ranges and their corresponding memory types

Example of BIOS-provided physical address map
Start       End         Type
0x00000000  0x0009ffff  Usable
0x000f0000  0x000fffff  Reserved
0x00100000  0x07feffff  Usable
0x07ff0000  0x07ff2fff  ACPI data
0x07ff3000  0x07ffffff  ACPI NVS
0xffff0000  0xffffffff  Reserved
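As a rough illustration of how such a map can be used, the Python sketch below stores the example table as tuples and answers two questions: how much RAM is usable, and what type a given physical address has. The helper names are hypothetical, and this is not how the kernel itself represents the map.

```python
PAGE_BYTES = 4096

# (start, end inclusive, type), copied from the example map above.
BIOS_MAP = [
    (0x00000000, 0x0009FFFF, "Usable"),
    (0x000F0000, 0x000FFFFF, "Reserved"),
    (0x00100000, 0x07FEFFFF, "Usable"),
    (0x07FF0000, 0x07FF2FFF, "ACPI data"),
    (0x07FF3000, 0x07FFFFFF, "ACPI NVS"),
    (0xFFFF0000, 0xFFFFFFFF, "Reserved"),
]


def usable_bytes(bios_map):
    """Total size of all ranges marked Usable."""
    return sum(end - start + 1 for start, end, kind in bios_map if kind == "Usable")


def region_type(bios_map, addr):
    """Type of the range containing addr, or None if no range lists it."""
    for start, end, kind in bios_map:
        if start <= addr <= end:
            return kind
    return None


print(hex(usable_bytes(BIOS_MAP)))             # total usable bytes
print(usable_bytes(BIOS_MAP) // PAGE_BYTES)    # usable 4 KB page frames
print(region_type(BIOS_MAP, 0x00100000))       # where the kernel is installed
print(region_type(BIOS_MAP, 0x000A0000))       # start of the 640 KB hole
```

In this example the address 0x000a0000 is not listed in any range at all, which is consistent with the 640 KB to 1 MB hole not being offered to the kernel as usable RAM.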
• To avoid loading the kernel into groups of noncontiguous page frames
  – Linux prefers to skip the first megabyte of RAM
  – Figure 2-12 shows how the first 2 MB of RAM are filled by Linux
    • Kernel code: _text ~ _etext
    • Kernel data, initialized: _etext ~ _edata
    • Kernel data, uninitialized: _edata ~ _end

Process Page Tables
• The linear address space of a process is divided into two parts
  – Linear addresses from 0x00000000 to 0xbfffffff (= 3 GB - 1) can be addressed when the process is in either User or Kernel Mode
  – Linear addresses from 0xc0000000 (= 3 GB) to 0xffffffff (= 4 GB - 1) can be addressed only when the process is in Kernel Mode
• The content of the first 768 entries of the Page Global Directory (with PAE disabled), which map linear addresses lower than 0xc0000000, depends on the specific process
  – In User Mode, the mapping depends on which process runs
• However, the remaining entries should be the same for all processes and equal to the corresponding entries of the master kernel Page Global Directory
  – In Kernel Mode, the process executes the kernel code

Kernel Page Table
• The kernel maintains a set of Page Tables, rooted at a so-called master kernel Page Global Directory
  – However, it is never directly used by any process or kernel thread
  – It is just a reference model for the corresponding entries of the Page Global Directories of every regular process
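The figure of 768 process-specific entries, and the role of the master kernel Page Global Directory as a reference model, can be reproduced with a short sketch. It assumes the non-PAE layout (1024 entries, each covering 4 MB); the constant names are chosen for illustration, and `new_process_pgd` is a hypothetical helper, not a kernel function.

```python
PGDIR_SHIFT = 22          # each Page Global Directory entry maps 4 MB
ENTRIES = 1024            # entries per directory without PAE
PAGE_OFFSET = 0xC0000000  # first linear address reserved for Kernel Mode

# Number of entries that map addresses below 0xc0000000.
USER_ENTRIES = PAGE_OFFSET >> PGDIR_SHIFT


def new_process_pgd(master_pgd):
    """Build a directory whose kernel part mirrors the master directory."""
    pgd = [0] * ENTRIES                           # user part starts out empty
    pgd[USER_ENTRIES:] = master_pgd[USER_ENTRIES:]  # kernel part is copied
    return pgd


master = list(range(ENTRIES))  # stand-in values, not real entries
pgd = new_process_pgd(master)
print(USER_ENTRIES)                                    # 768
print(pgd[USER_ENTRIES:] == master[USER_ENTRIES:])     # True
```

The lower 768 entries are left for the process to fill in, while the upper 256 entries are identical in every directory because they all come from the same master copy.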
• The kernel initializes its own Page Tables in two phases
  – Phase 1: the kernel creates a limited 8 MB address space, which is enough for it to install itself in RAM
  – Phase 2: the kernel takes advantage of all of the existing RAM and sets up the page tables properly
    • The details are skipped

Handling the Hardware Cache and the TLB
• To optimize the cache hit rate, the kernel makes the following decisions
  – The most frequently used fields of a data structure are placed at low offsets within the data structure so that they can be cached in the same line
  – When allocating a large set of data structures, the kernel tries to store each of them in memory so that all cache lines are used uniformly
  – When performing a process switch, the kernel has a small preference for processes that use the same set of Page Tables as the previously running process
• When a process switch occurs, the set of active Page Tables also changes
  – Local TLB entries relative to the old Page Tables must be flushed
  – This is performed automatically when the kernel writes the address of the new Page Global Directory into the cr3 register
• However, in some cases, the kernel succeeds in avoiding TLB flushes
  – When performing a process switch between two regular processes that use the same set of Page Tables
  – When performing a process switch between a regular process and a kernel thread
    • A kernel thread does not have its own set of Page Tables; rather, it makes use of the set of Page Tables belonging to a regular process
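The flush rule above can be captured in a toy model. This is a deliberately simplified sketch, not kernel code: `Task`, `switch`, and the directory addresses are all hypothetical, a kernel thread is modelled as a task with no Page Global Directory of its own, and "flushing" is reduced to a boolean that says whether cr3 would be rewritten.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    pgd: Optional[int]  # address of its Page Global Directory; None = kernel thread


def switch(active_pgd, nxt):
    """Return (new active directory, whether the TLB is flushed)."""
    if nxt.pgd is None:
        # A kernel thread borrows the Page Tables that are already active.
        return active_pgd, False
    if nxt.pgd == active_pgd:
        # Same set of Page Tables: cr3 does not need to be rewritten.
        return active_pgd, False
    # Writing the new directory address into cr3 flushes the local TLB.
    return nxt.pgd, True


a = Task("a", 0x1000)
a_sibling = Task("a_sibling", 0x1000)  # shares Page Tables with a
kthread = Task("kthread", None)
b = Task("b", 0x2000)

active = a.pgd
for task in (kthread, a_sibling, b):
    active, flushed = switch(active, task)
    print(task.name, hex(active), flushed)
```

Only the last switch, to a process with a different set of Page Tables, reports a flush; the switches to the kernel thread and to the sibling that shares Page Tables leave the TLB contents in place.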