Revisiting Swapping in User-Space with Lightweight Threading

Total Pages: 16

File Type: PDF, Size: 1020 KB

Revisiting Swapping in User-space with Lightweight Threading

Kan Zhong†, Wenlin Cui†, Youyou Lu§, Quanzhang Liu†, Xiaodan Yan†, Qizhao Yuan†, Siwei Luo†, and Keji Huang†
†Huawei Technologies Co., Ltd, Chengdu, China
§Department of Computer Science and Technology, Tsinghua University, Beijing, China

arXiv:2107.13848v1 [cs.OS] 29 Jul 2021

Abstract

Memory-intensive applications, such as in-memory databases, caching systems and key-value stores, are increasingly demanding larger main memory to fit their working sets. Conventional swapping can enlarge the memory capacity by paging out inactive pages to disks. However, the heavy I/O stack makes traditional kernel-based swapping suffer from several critical performance issues.

In this paper, we redesign the swapping system and propose Lightswap, a high-performance user-space swapping scheme that supports paging with both local SSDs and remote memory. First, to avoid involving the kernel, a novel page fault handling mechanism handles page faults in user space and further eliminates the heavy I/O stack with the help of user-space I/O drivers. Second, we co-design Lightswap with lightweight threads (LWTs) to improve system throughput and make it transparent to user applications. Finally, we propose a try-catch framework in Lightswap to deal with paging errors, which are exacerbated by scaling in process technology.

We implement Lightswap in our production-level system and evaluate it with YCSB workloads running on memcached. Results show that Lightswap reduces page fault handling latency by 3–5 times and improves the throughput of memcached by more than 40% compared with state-of-the-art swapping systems.

CCS Concepts: • Software and its engineering → Operating systems.

Keywords: user-space swapping, memory disaggregation, lightweight thread

1 Introduction

Memory-intensive applications [1, 2], such as in-memory databases, caching systems, and in-memory key-value stores, are demanding more and more memory to meet their low-latency and high-throughput requirements, as these applications often experience significant performance loss once their working set cannot fit in memory. Therefore, extending the memory capacity becomes a mission-critical task for both researchers and system administrators.

Based on the virtual memory system, existing OSes provide swapping to enlarge the main memory by writing inactive pages to a backend store, which today is usually backed by SSDs. Since DRAM still has an orders-of-magnitude performance advantage over SSDs, providing memory-like performance while paging with SSDs has been explored for decades [3–13] and still poses great challenges. In particular, we find that the heavy kernel I/O stack introduces a large performance penalty: more than 40% of the time is spent in the I/O stack when swapping in/out a single page, and this number will keep increasing as ultra-low-latency storage media, such as Intel Optane [14] and KIOXIA XL-FLASH [15], are adopted as backend stores.

To avoid the high latency of paging with SSDs, memory disaggregation architectures [16–20] propose to expose a global memory address space to all machines to improve memory utilization and avoid memory over-provisioning. However, existing memory disaggregation proposals require new hardware support [17, 18, 21–23], making these solutions infeasible and expensive in real production environments. Fortunately, recent research, such as Infiniswap [24], FluidMem [25], and AIFM [26], has shown that paging or swapping with remote memory is a promising solution to enable memory disaggregation. However, we still find several critical issues in these existing approaches.

First, kernel-based swapping, which relies on the heavy kernel I/O stack, exhibits large software stack overheads, so it cannot fully exploit the high performance and low latency of emerging storage media (e.g., Intel Optane) or networks (e.g., RDMA).
We measured the remote memory access latency of Infiniswap, which is based on kernel swapping: it can be as high as 40 µs even with one-sided RDMA. Further, modern applications exhibit diverse memory access patterns, and kernel-based swapping fails to provide enough flexibility to customize eviction and prefetching policies. Second, recent research [25] has also explored user-space swapping with userfaultfd, which, however, needs extra memory copy operations and exhibits ultra-high page fault latency under high concurrency (i.e., 107 µs under 64 threads), so systems based on userfaultfd cannot tolerate frequent page faults. Finally, new proposals like AIFM [26] and Semeru [27], which do not rely on the virtual memory abstraction, provide runtime-managed swapping to applications and can largely reduce I/O amplification. However, these schemes break compatibility and require large efforts to rewrite existing applications.

Therefore, we argue that swapping needs to be redesigned to become high-performance while remaining transparent to applications. In this paper, we propose Lightswap, a user-space swapping scheme that supports paging with both SSDs and remote memory. First, Lightswap handles page faults in user space and utilizes the high-performance user-space I/O stack to eliminate the software stack overheads. To this end, we propose an ultra-low-latency page fault handling mechanism that handles page faults in user space (§4.2). Second, when a page fault happens, existing swapping schemes block the faulting thread to wait for the data fetch, which lowers system throughput. In Lightswap, we co-design swapping with lightweight threads (LWTs) to achieve both high performance and application transparency (§4.3): leveraging the low context-switch cost of LWTs, we use a dedicated swap-in LWT to handle each page fault and allow other LWTs to occupy the CPU while waiting for the data fetch.
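To make that co-design concrete, the following minimal sketch (our illustration using POSIX ucontext fibers, not Lightswap's actual code) mimics the flow: the faulting LWT yields, a dedicated swap-in LWT issues the (simulated) fetch, another LWT occupies the CPU in the meantime, and the faulting LWT resumes once the page is ready.

// Illustrative only: LWTs as ucontext fibers; the scheduler in main()
// stands in for Lightswap's runtime.
#include <ucontext.h>
#include <stdio.h>

static ucontext_t sched_ctx, app_ctx, swapin_ctx, other_ctx;
static char app_stack[64 * 1024], swapin_stack[64 * 1024], other_stack[64 * 1024];

static void app_lwt(void) {
    printf("app LWT: touches a cold page -> fault, yield\n");
    swapcontext(&app_ctx, &sched_ctx);   // block until the page is ready
    printf("app LWT: page ready, resume\n");
}

static void swapin_lwt(void) {
    printf("swap-in LWT: fetch page from SSD/remote memory\n");
    swapcontext(&swapin_ctx, &sched_ctx);
}

static void other_lwt(void) {
    printf("other LWT: useful work while the fetch is in flight\n");
    swapcontext(&other_ctx, &sched_ctx);
}

static void make(ucontext_t *c, char *stk, size_t n, void (*f)(void)) {
    getcontext(c);
    c->uc_stack.ss_sp = stk;
    c->uc_stack.ss_size = n;
    c->uc_link = &sched_ctx;             // return to the scheduler on exit
    makecontext(c, f, 0);
}

int main(void) {
    make(&app_ctx, app_stack, sizeof app_stack, app_lwt);
    make(&swapin_ctx, swapin_stack, sizeof swapin_stack, swapin_lwt);
    make(&other_ctx, other_stack, sizeof other_stack, other_lwt);

    swapcontext(&sched_ctx, &app_ctx);    // run the app until it faults
    swapcontext(&sched_ctx, &swapin_ctx); // dispatch the dedicated swap-in LWT
    swapcontext(&sched_ctx, &other_ctx);  // overlap: another LWT uses the CPU
    swapcontext(&sched_ctx, &app_ctx);    // fetch done; resume the faulting LWT
    return 0;
}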
Finally, due to scaling in process technology and ever-increasing capacity, we find that both DRAM and SSDs become more prone to errors. Therefore, we propose a try-catch exception framework in Lightswap to deal with paging errors (§4.4).

We evaluate Lightswap with memcached workloads. Evaluation results show that Lightswap reduces page fault handling latency by 3–5 times and improves throughput by more than 40% when compared to Infiniswap. In summary, we make the following contributions:

• We propose a user-space page fault handling mechanism that achieves ultra-low page fault notification latency (i.e., 2.4 µs under 64 threads).
• We demonstrate that, with the help of LWTs, a user-space swapping system can achieve both high performance and application transparency.
• We propose a try-catch based exception framework to handle paging errors. To the best of our knowledge, we are the first to explore handling both memory errors and storage device errors in a swapping system.
• We show the performance benefits of Lightswap with YCSB workloads on memcached, and compare it to other swapping schemes.

The rest of the paper is organized as follows. Section 2 presents the background and motivation. Sections 3 and 4 discuss our design considerations and the design details of Lightswap, respectively. Section 5 presents the implementation and evaluation of Lightswap. We then cover the related work.

2 Background and Motivation

2.1 Swapping

Existing swapping approaches [3–10] depend on the existing kernel data path, which is optimized for slow block devices; both reading and writing pages from/to the backend store introduce high software stack overheads. Figure 1 compares the I/O latency breakdown for random reads and writes on XL-FLASH [15] when using the kernel data path and the user-space SPDK [28] driver.

[Figure 1. Random read and write latency breakdown (4K/8K random reads and writes; disk vs. kernel stack vs. user stack). The numbers on top of each bar denote the relative fraction of I/O stack time in the total latency.]

The figure shows that over half (56.6%–66.5%) of the time is spent on the kernel I/O stack for both reads and writes when using the kernel data path. This overhead mostly comes from the generic block layer and the device driver. By comparison, the I/O stack overhead is negligible when using the user-space SPDK driver.

Besides using local SSDs as the backend store, the high-bandwidth, low-latency RDMA network offers the opportunity to swap pages to remote memory. The recently proposed memory disaggregation architecture [16–20] takes a first look at remote memory paging: in the disaggregation model, computing nodes can be composed with large amounts of memory borrowed from remote memory servers. Existing works such as Infiniswap [24] and FluidMem [25] have shown that swapping to remote memory is a promising solution for memory disaggregation. Nevertheless, Infiniswap exposes remote memory as a local block device, so paging in/out still needs to go through the whole kernel I/O stack. Our evaluation shows that remote memory access in Infiniswap can be as high as 40 µs, which is about 10x higher than the latency of a 4KB page RDMA read. The difference is entirely caused by the slow kernel data path.

2.2 Linux eBPF

eBPF (extended Berkeley Packet Filter) is a general-purpose virtual machine that runs inside the Linux kernel. It provides an instruction set and an execution environment for running eBPF programs in the kernel. Thus, user-space applications can instrument the kernel with eBPF programs without changing kernel source code or loading kernel modules. eBPF programs are written in a special assembly language. Figure 2 shows how eBPF works: as shown there, eBPF bytecode is loaded into the kernel using the bpf() system call.
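As a minimal illustration of that last step (our sketch, not code from the paper), the program below hand-assembles the smallest valid eBPF program (r0 = 0; exit) and loads it with the raw bpf(2) system call; the program type and buffer sizes are arbitrary choices for the demo, and loading typically requires root.

#include <linux/bpf.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

int main(void) {
    // Two instructions: r0 = 0; exit
    struct bpf_insn prog[] = {
        { .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = BPF_REG_0, .imm = 0 },
        { .code = BPF_JMP | BPF_EXIT },
    };
    char log[4096] = "";
    union bpf_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;   // arbitrary type for the demo
    attr.insns     = (__u64)(unsigned long)prog;
    attr.insn_cnt  = 2;
    attr.license   = (__u64)(unsigned long)"GPL";
    attr.log_buf   = (__u64)(unsigned long)log;     // verifier log on failure
    attr.log_size  = sizeof(log);
    attr.log_level = 1;

    // There is no standard libc wrapper, so the raw syscall is used here
    int fd = syscall(SYS_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
    if (fd < 0) { perror("BPF_PROG_LOAD"); fputs(log, stderr); return 1; }
    printf("eBPF program loaded, fd=%d\n", fd);
    close(fd);
    return 0;
}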
Recommended publications
  • Memory Deduplication: An Effective Approach to Improve the Memory System
    Memory Deduplication: An Effective Approach to Improve the Memory System. Yuhui Deng (1,2), Xinyu Huang (1), Liangshan Song (1), Yongtao Zhou (1), Frank Wang (3). (1) Department of Computer Science, Jinan University, Guangzhou, 510632, P. R. China; (2) Key Laboratory of Computer System and Architecture, Chinese Academy of Sciences, Beijing, 100190, P. R. China; (3) School of Computing, University of Kent, CT2 7NF, UK. Abstract: Programs now have more aggressive demands of memory to hold their data than before. This paper analyzes the characteristics of memory data by using seven real memory traces. It observes that a large volume of memory pages with identical contents is contained in the traces. Furthermore, the number of unique memory contents accessed is much smaller than the number of unique memory addresses accessed. This is incurred by traditional address-based cache replacement algorithms, which replace memory pages by checking the addresses rather than the contents of those pages, thus resulting in many identical memory contents with different addresses stored in the memory. For example, in the same file system, opening two identical files stored in different directories, or opening two similar files that share a certain amount of content in the same directory, will result in identical data blocks stored in the cache due to the traditional address-based cache replacement algorithms. Based on these observations, this paper evaluates memory compression and memory deduplication. As expected, memory deduplication greatly outperforms memory compression; for example, the best deduplication ratio is 4.6 times higher than the best compression ratio.
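    (For illustration only, not the paper's code: the essence of content-based deduplication is to index pages by their contents, e.g., by hashing each page and confirming candidate matches byte-for-byte, as this small sketch shows.)

    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    #define PAGE_SIZE 4096

    // FNV-1a hash over one page; hash collisions are resolved by memcmp below.
    static uint64_t page_hash(const uint8_t *p) {
        uint64_t h = 1469598103934665603ULL;
        for (size_t i = 0; i < PAGE_SIZE; i++) { h ^= p[i]; h *= 1099511628211ULL; }
        return h;
    }

    // Returns 1 if the two pages have identical contents and can be merged.
    static int dedupable(const uint8_t *a, const uint8_t *b) {
        return page_hash(a) == page_hash(b) && memcmp(a, b, PAGE_SIZE) == 0;
    }

    int main(void) {
        static uint8_t p1[PAGE_SIZE], p2[PAGE_SIZE], p3[PAGE_SIZE];
        memset(p1, 0xAB, PAGE_SIZE);
        memset(p2, 0xAB, PAGE_SIZE);   // same contents, different address
        memset(p3, 0xCD, PAGE_SIZE);
        printf("p1/p2 dedupable: %d\n", dedupable(p1, p2));  // prints 1
        printf("p1/p3 dedupable: %d\n", dedupable(p1, p3));  // prints 0
        return 0;
    }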
  • Overcoming Traditional Problems with OS Huge Page Management
    MEGA: Overcoming Traditional Problems with OS Huge Page Management. Theodore Michailidis, Alex Delis, and Mema Roussopoulos, University of Athens, Athens, Greece. ABSTRACT: Modern computer systems now feature memory banks whose aggregate size ranges from tens to hundreds of GBs. In this context, contemporary workloads can and do often consume vast amounts of main memory. This upsurge in memory consumption routinely results in increased virtual-to-physical address translations and, consequently and more importantly, more translation misses. Both of these aspects collectively hamper the performance of workload execution. A solution aimed at dramatically reducing the number of address translation misses has been to provide hardware support for pages with bigger sizes, termed huge pages. In this paper, we empirically demonstrate the benefits and drawbacks of using such huge pages. In particular, we show that it is essential for modern OS to refine their software mechanisms to more effectively manage huge pages. KEYWORDS: Huge pages, Address Translation, Memory Compaction. ACM Reference Format: Theodore Michailidis, Alex Delis, and Mema Roussopoulos. 2019. MEGA: Overcoming Traditional Problems with OS Huge Page Management. In The 12th ACM International Systems and Storage Conference (SYSTOR '19), June 3–5, 2019, Haifa, Israel. ACM, New York, NY, USA, 11 pages. https://doi.org/10.1145/3319647.3325839
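    (Illustrative sketch, not from the MEGA paper: on Linux, an application can ask for huge-page backing of a region with madvise(MADV_HUGEPAGE); whether 2 MiB pages are actually used depends on the kernel's transparent huge page support.)

    #include <sys/mman.h>
    #include <stdio.h>

    int main(void) {
        size_t len = 64UL << 20;                      // 64 MiB region
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }
        if (madvise(p, len, MADV_HUGEPAGE) != 0)      // hint: back with huge pages
            perror("madvise(MADV_HUGEPAGE)");         // non-fatal: falls back to 4 KiB
        for (size_t i = 0; i < len; i += 4096)        // touch so pages are allocated
            ((volatile char *)p)[i] = 1;
        munmap(p, len);
        return 0;
    }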
  • Murciano Soto, Joan; Rexachs Del Rosario, Dolores Isabel, Dir
    This is the published version of the bachelor thesis: Murciano Soto, Joan; Rexachs del Rosario, Dolores Isabel, dir. Anàlisi de prestacions de sistemes d'emmagatzematge per IA. 2021. (958 Enginyeria Informàtica). This version is available at https://ddd.uab.cat/record/248510 under the terms of the license. Bachelor thesis in computer engineering, Escola d'Enginyeria (EE), Universitat Autònoma de Barcelona (UAB); title translated from Catalan: "Performance analysis of storage systems for AI". Abstract (translated): Artificial Intelligence (AI) programs are, by nature, programs that perform many file reads. These reads require many calls to storage devices, and those calls can delay the execution of the program. The bandwidth available to move data from disk to memory, or vice versa, can become a bottleneck that increases execution time, so for this kind of program it is important to detect whether the system's I/O is saturating. This work studies several programs with high volumes of disk reads, using monitoring tools that report metrics related to disk I/O. It also examines the impact of swap, which likewise causes an increase in I/O operations. The document presents the methodology used for the analysis, describing the tools and the results obtained, with the goal of serving as a guide to understanding the behavior and effect of I/O and swap. Keywords: I/O, swap, AI, monitoring.
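    (Illustrative sketch, not from the thesis: Linux exposes per-process I/O counters in /proc/self/io, one of the metric sources such monitoring tools can build on.)

    #include <stdio.h>

    int main(void) {
        char line[128];
        FILE *f = fopen("/proc/self/io", "r");        // Linux-specific file
        if (!f) { perror("fopen"); return 1; }
        while (fgets(line, sizeof line, f))
            fputs(line, stdout);   // rchar, wchar, read_bytes, write_bytes, ...
        fclose(f);
        return 0;
    }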
  • Asynchronous Page Faults: AIX Did It (Gleb Natapov, Red Hat, August 10, 2010)
    Asynchronous page faults: AIX did it. Red Hat. Author: Gleb Natapov. August 10, 2010. Abstract: Host memory overcommit may cause guest memory to be swapped. When a guest vcpu accesses memory swapped out by the host, its execution is suspended until the memory is swapped back. Asynchronous page faults are a way to try to use a guest vcpu more efficiently by allowing it to execute other tasks while the page is brought back into memory. Part I: How KVM handles guest memory and what inefficiency it has with regard to host swapping. Guest memory is mapped into host memory, but on demand: a page fault happens on the first guest access. What happens on a page fault? 1. VMEXIT. 2. kvm_mmu_page_fault(). 3. gfn_to_pfn(). 4. get_user_pages_fast(); if there is no previously mapped page and no swap entry is found, an empty page is allocated. 5. The page is added into the shadow/nested page table.
  • What Is an Operating System III: 2.1 Components II
    What is an Operating System III: 2.1 Components II. An operating system (OS) is software that manages computer hardware and software resources and provides common services for computer programs. The operating system is an essential component of the system software in a computer system. Application programs usually require an operating system to function. Memory management: Among other things, a multiprogramming operating system kernel must be responsible for managing all system memory which is currently in use by programs. This ensures that a program does not interfere with memory already in use by another program. Since programs time-share, each program must have independent access to memory. Cooperative memory management, used by many early operating systems, assumes that all programs make voluntary use of the kernel's memory manager and do not exceed their allocated memory. This system of memory management is almost never seen any more, since programs often contain bugs which can cause them to exceed their allocated memory. If a program fails, it may cause memory used by one or more other programs to be affected or overwritten. Malicious programs or viruses may purposefully alter another program's memory, or may affect the operation of the operating system itself. With cooperative memory management, it takes only one misbehaving program to crash the system. Memory protection enables the kernel to limit a process' access to the computer's memory. Various methods of memory protection exist, including memory segmentation and paging. All methods require some level of hardware support (such as the 80286 MMU), which doesn't exist in all computers.
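    (Illustrative sketch, not from this text: on Linux, page-granularity protection is visible to user programs through mprotect(); writing to a page after revoking write permission raises SIGSEGV.)

    #include <sys/mman.h>
    #include <signal.h>
    #include <unistd.h>
    #include <stdio.h>

    static void on_segv(int sig) {
        (void)sig;
        // write() is async-signal-safe, unlike printf()
        const char msg[] = "SIGSEGV: write to read-only page was blocked\n";
        (void)write(STDOUT_FILENO, msg, sizeof msg - 1);
        _exit(0);
    }

    int main(void) {
        signal(SIGSEGV, on_segv);
        char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }
        p[0] = 'x';                        // allowed: page is writable
        mprotect(p, 4096, PROT_READ);      // revoke write permission
        p[0] = 'y';                        // blocked: handler fires
        return 0;
    }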
  • Strict Memory Protection for Microcontrollers
    Master Thesis, Spring 2019. Strict Memory Protection for Microcontrollers. Erlend Sveen. Supervisor: Jingyue Li. Co-supervisor: Magnus Själander. Sunday 17th February, 2019. Abstract: Modern desktop systems protect processes from each other with memory protection. Microcontroller systems, which lack support for memory virtualization, typically use memory protection only for areas that the programmer deems necessary and do not separate processes completely. As a result, the application still appears monolithic and only a handful of errors may be detected. This thesis presents a set of solutions for complete separation of processes, unleashing the full potential of the memory protection unit embedded in modern ARM-based microcontrollers. The operating system loads multiple programs from disk into independently protected portions of memory. These programs may then be started, stopped, modified, crashed, etc. without affecting other running programs. When loading programs, a new allocation algorithm is used that automatically aligns the memories for use with the protection hardware. A pager is written to satisfy additional run-time demands of applications, and communication primitives for inter-process communication are made available. Since every running process is unable to access other running processes, not only reliability but also security is improved. An application may be split so that unsafe or error-prone code is separated from mission-critical code, allowing it to be independently restarted when an error occurs. With executable and writeable memory access rights being mutually exclusive, code injection is made harder to perform. The solution is fully transparent to the programmer: all that is required is to split an application into sub-programs that operate largely independently.
  • Measuring Software Performance on Linux Technical Report
    Measuring Software Performance on Linux. Technical Report, November 21, 2018. Martin Becker and Samarjit Chakraborty, Chair of Real-Time Computer Systems, Technical University of Munich, Munich, Germany. [Figure 1: Event interaction map for a program running on an Intel Core processor on Linux, covering events such as cache misses and pollution, branch mispredictions, scheduler activity and core migrations, interrupts, TLB misses, flushes and shootdowns, page-cache misses and page walks, page faults, and disk I/O. Each event itself may cost processor cycles, and may inhibit (−), enable (+), or modulate (⊗) others.] Abstract: Measuring and analyzing the performance of software has reached a high complexity, caused by more advanced processor designs and the intricate interaction between user programs, the operating system, and the processor's microarchitecture. In this report, we summarize our experience on how performance characteristics of software should be measured when running on a Linux operating system and a modern processor. Our results suggest that the measurement setup has a large impact on the results. More surprisingly, however, they also suggest that the setup can be negligible for certain analysis methods. Furthermore, we found that our setup maintains significantly better performance under background load conditions, which means it can be used to improve high-performance applications. CCS Concepts: • Software and its engineering → Software performance. arXiv:1811.01412v2 [cs.PF] 20 Nov 2018
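    (Illustrative sketch, not the report's harness: one common practice is to time a code region with CLOCK_MONOTONIC_RAW and repeat the measurement, so that run-to-run variance from caches, scheduling, and background load becomes visible.)

    #include <time.h>
    #include <stdio.h>

    static volatile long sink;             // keeps the work from being optimized away

    static void workload(void) {
        long s = 0;
        for (long i = 0; i < 1000000; i++) s += i;
        sink = s;
    }

    int main(void) {
        for (int run = 0; run < 5; run++) {
            struct timespec t0, t1;
            clock_gettime(CLOCK_MONOTONIC_RAW, &t0);
            workload();
            clock_gettime(CLOCK_MONOTONIC_RAW, &t1);
            double ms = (t1.tv_sec - t0.tv_sec) * 1e3 +
                        (t1.tv_nsec - t0.tv_nsec) / 1e6;
            printf("run %d: %.3f ms\n", run, ms);
        }
        return 0;
    }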
  • Hiding Process Memory Via Anti-Forensic Techniques
    DIGITAL FORENSIC RESEARCH CONFERENCE. Hiding Process Memory via Anti-Forensic Techniques. By Frank Block (Friedrich-Alexander Universität Erlangen-Nürnberg (FAU) and ERNW Research GmbH) and Ralph Palutke (Friedrich-Alexander Universität Erlangen-Nürnberg). From the proceedings of the Digital Forensic Research Conference, DFRWS USA 2020, July 20-24, 2020. DFRWS is dedicated to the sharing of knowledge and ideas about digital forensics research. Ever since it organized the first open workshop devoted to digital forensics in 2001, DFRWS continues to bring academics and practitioners together in an informal environment. As a non-profit, volunteer organization, DFRWS sponsors technical working groups, annual conferences and challenges to help drive the direction of research and development. https://dfrws.org Forensic Science International: Digital Investigation 33 (2020) 301012. DFRWS 2020 USA: Proceedings of the Twentieth Annual DFRWS USA. Hiding Process Memory via Anti-Forensic Techniques. Ralph Palutke, Frank Block, Patrick Reichenberger, and Dominik Stripeika (Friedrich-Alexander Universität Erlangen-Nürnberg (FAU), Germany; ERNW Research GmbH, Heidelberg, Germany). Abstract: Nowadays, security practitioners typically use memory acquisition or live forensics to detect and analyze sophisticated malware samples. Subsequently, malware authors began to incorporate anti-forensic techniques that subvert the analysis process by hiding malicious memory areas. Those techniques typically modify characteristics, such as access permissions, or place malicious data near legitimate data, in order to prevent the memory from being identified by analysis tools while still remaining accessible. Keywords: Memory subversion.
  • Guide to OpenVMS Performance Management
    Guide to OpenVMS Performance Management. Order Number: AA–PV5XA–TE. May 1993. This manual is a conceptual and tutorial guide for experienced users responsible for optimizing performance on OpenVMS systems. Revision/Update Information: This manual supersedes the Guide to VMS Performance Management, Version 5.2. Software Version: OpenVMS VAX Version 6.0. Digital Equipment Corporation, Maynard, Massachusetts, May 1993. Digital Equipment Corporation makes no representations that the use of its products in the manner described in this publication will not infringe on existing or future patent rights, nor do the descriptions contained in this publication imply the granting of licenses to make, use, or sell equipment or software in accordance with the description. Possession, use, or copying of the software described in this publication is authorized only pursuant to a valid written license from Digital or an authorized sublicensor. © Digital Equipment Corporation 1993. All rights reserved. The postpaid Reader's Comments forms at the end of this document request your critical evaluation to assist in preparing future documentation. The following are trademarks of Digital Equipment Corporation: ACMS, ALL–IN–1, Bookreader, CI, DBMS, DECnet, DECwindows, Digital, MicroVAX, OpenVMS, PDP–11, VAX, VAXcluster, VAX DOCUMENT, VMS, and the DIGITAL logo. All other trademarks and registered trademarks are the property of their respective holders. ZK4521. This document was prepared using VAX DOCUMENT Version 2.1. Contents: Preface (ix); 1 Introduction to Performance Management; 1.1 Knowing Your Workload (1–2); 1.1.1 Workload Management (1–3); 1.1.2 Workload Distribution (1–4); 1.1.3 Code Sharing (1–4); 1.2 Evaluating User Complaints (...)
  • Virtual Memory (§5.4): Use Main Memory as a “Cache” for Secondary Storage
    Chapter 5 (Large and Fast: Exploiting Memory Hierarchy), Part II: Virtual Memory. §5.4 Virtual Memory: use main memory as a “cache” for secondary (disk) storage, managed jointly by CPU hardware and the operating system (OS). Programs share main memory: each gets a private virtual address space holding its frequently used code and data, protected from other programs. The CPU and OS translate virtual addresses to physical addresses; a VM “block” is called a page, and a VM translation “miss” is called a page fault. Address Translation: fixed-size pages (e.g., 4K). Page Fault Penalty: on a page fault, the page must be fetched from disk, which takes millions of clock cycles and is handled by OS code; we therefore try to minimize the page fault rate with fully associative placement and smart replacement algorithms. Page Tables: a page table stores placement information as an array of page table entries, indexed by virtual page number; a page table register in the CPU points to the page table in physical memory. If the page is present in memory, the PTE stores the physical page number, plus other status bits (referenced, dirty, ...); if the page is not present, the PTE can refer to a location in swap space on disk. Further slides: Translation Using a Page Table; Mapping Pages to Storage; Replacement and Writes.
  • Virtual Memory - Paging
    Virtual memory - Paging. Johan Montelius, KTH, 2020 (lecture slides, 32 pp.). The process: memory layout of a 32-bit Linux process: code (.text), data, heap, and stack in user space from 0x00000000 up to 0xC0000000, with the kernel mapped above, up to 0xffffffff. Segments could be a solution: processes live in virtual address spaces, with address translation done by the MMU (base and bounds) onto physical memory. One problem: external fragmentation, free areas of space that are hard to utilize; the fix of allocating larger segments brings internal fragmentation instead. Another problem: we reserve physical memory for parts of the virtual space (such as unused code) that are never used. Let's try again: it is easier to handle fixed-size memory blocks, so can we map a process' virtual space to a set of equal-size blocks? An address is then interpreted as a virtual page number (VPN) and an offset; see the sketch below. In the segmented MMU, a virtual address is checked against the segment bounds (raising an exception if it is outside) and indexed through the segment table to form the physical address; in the paging MMU, the VPN is looked up in the page table (raising an exception if the page is unavailable) to form the physical address. A note on the x86 architecture: x86-32 supports both segmentation and paging; a virtual address is translated to a linear address using a segmentation table, and the linear address is then translated to a physical address by paging. Linux and Windows do not use segmentation to separate code, data, nor stack, and the x86-64 (the 64-bit version of the x86 architecture) has dropped many features for segmentation.
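    (Illustrative sketch, not from the slides: a toy version of the paging translation just described, splitting a virtual address into VPN and offset, looking the VPN up in a simulated page table, and rebuilding the physical address; 4 KiB pages assumed, and unlisted entries default to frame 0.)

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SHIFT 12
    #define PAGE_SIZE  (1u << PAGE_SHIFT)      // 4096
    #define NPAGES     16

    // Simulated page table: pt[vpn] holds the physical frame number,
    // or UINT32_MAX if the page is not mapped (page fault).
    static uint32_t pt[NPAGES] = { [0] = 7, [1] = 3, [2] = UINT32_MAX };

    static int translate(uint32_t vaddr, uint32_t *paddr) {
        uint32_t vpn    = vaddr >> PAGE_SHIFT;       // virtual page number
        uint32_t offset = vaddr & (PAGE_SIZE - 1);   // offset within the page
        if (vpn >= NPAGES || pt[vpn] == UINT32_MAX)
            return -1;                               // MMU would raise an exception
        *paddr = (pt[vpn] << PAGE_SHIFT) | offset;
        return 0;
    }

    int main(void) {
        uint32_t pa;
        if (translate(0x1234, &pa) == 0)             // VPN 1, offset 0x234
            printf("0x1234 -> 0x%x\n", pa);          // frame 3 -> 0x3234
        if (translate(0x2abc, &pa) != 0)             // VPN 2 is unmapped
            printf("0x2abc -> page fault\n");
        return 0;
    }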
  • Chapter 4: Memory Management and Virtual Memory (Asst. Prof. Dr. Supakit Nootyaskool)
    Chapter 4: Memory Management and Virtual Memory. Asst. Prof. Dr. Supakit Nootyaskool, IT-KMITL. Objectives: to discuss the principles of memory management; to understand why memory partitions are important for system organization; to describe the concepts of paging and segmentation. 4.1 Differences in main memory in the system: A uniprogramming system (for example, DOS) allows only one program to be present in memory at a time, and all resources are provided to that one program. For example, the memory holds 1) the operating system (in kernel space) and 2) one running program (in user space). A multiprogramming system differs from the above by allowing several programs to be present in memory at a time, so all resources must be organized and shared among the programs, for example: 1) the operating system (in kernel space), 2) running programs (in user space), and 3) programs waiting for signals to execute on the CPU (in user space). The multiprogramming system uses memory management to assign locations to the programs. 4.2 Memory management terms: Frame: a fixed-length block of main memory. Page: a fixed-length block of data that resides in secondary memory; a page of data may temporarily be copied into a frame of main memory. Segment: a variable-length block of data that resides in secondary memory; a segment may temporarily be copied into an available region of main memory, or the segment may be divided into pages which can be individually copied into main memory.