VSWAPPER: a Memory Swapper for Virtualized Environments

Total Page:16

File Type:pdf, Size:1020Kb

VSWAPPER: a Memory Swapper for Virtualized Environments VSWAPPER: A Memory Swapper for Virtualized Environments Nadav Amit Dan Tsafrir Assaf Schuster Technion – Israel Institute of Technology {namit,dan,assaf}@cs.technion.ac.il Abstract 1. Introduction The number of guest virtual machines that can be consoli- The main enabling technology for cloud computing is ma- dated on one physical host is typically limited by the mem- chine virtualization, which abstracts the rigid physical in- ory size, motivating memory overcommitment. Guests are frastructure and turns it into soft componentseasily managed given a choice to either install a “balloon” driver to coordi- and used. Clouds and virtualization are driven by strong eco- nate the overcommitment activity, or to experience degraded nomic incentives, notably the ability to consolidate multiple performance due to uncooperative swapping. Ballooning, guest servers on one physical host. The number of guests however, is not a complete solution, as hosts must still fall that one host can support is typically limited by the physi- back on uncooperative swapping in various circumstances. cal memory size [4, 23, 25, 85]. So hosts overcommit their Additionally, ballooning takes time to accommodate change, memory to increase their capacity. and so guests might experience degraded performance under Memory overcommitment of guest virtual machines is changing conditions. commonly done through a special “balloon” driver installed Our goal is to improve the performance of hosts when in the guest [82]. Balloons allocate pinned memory pages at they fall back on uncooperative swapping and/or operate the host’s request, thereby ensuring guests will not use them under changing load conditions. We find that the associ- so that they can be utilized for some other purpose. When a ated poor performance is caused by various types of su- balloon is “inflated”, it prompts the guest operating system perfluous swap operations, ruined swap file sequentiality, to do memory reclamation on its own, which often results in and uninformed prefetch decisions upon page faults. We ad- the guest swapping out some of its less used pages to disk. dress these problems by implementing VSWAPPER, a guest- Ballooning is a common-case optimization for memory agnostic memory swapper suited for virtual environments reclamation and overcommitment, but, inherently, it is not that allows efficient, uncooperative overcommitment. With a complete solution [26, 66, 82, 86]. Hosts cannot rely on inactive ballooning, VSWAPPER yields up to an orderof mag- guest cooperation, because: (1) clients may have disabled or nitude performance improvement. Combined with balloon- opted not to install the balloon [8, 51, 72]; (2) clients may ing, VSWAPPER can achieve up to double the performance have failed to install the balloon due to technical difficulties under changing load conditions. [75, 76, 77, 78, 79, 80, 81]; (3) balloons could reach their Categories and Subject Descriptors D.4.2 [Operating Sys- upper bound, set by the hypervisor (and optionally adjusted tems]: Storage Management—Virtual memory, Main mem- by clients) to enhance stability and to accommodate various ory, Allocation/deallocation strategies, Storage hierarchies guest limitations [12, 17, 57, 67, 73, 82]; (4) balloons might be unable to reclaim memory fast enough to accommodate General Terms Design, Experimentation, Measurement, the demand that the host must satisfy, notably since guest Performance memory swapping involves slow disk activity [29, 44, 82]; Keywords Virtualization; memory overcommitment;mem- and (5) balloons could be temporarily unavailable due to ory balloon; swapping; paging; page faults; emulation; vir- inner guest activity such as booting [82] or running high tual machines; hypervisor priority processes that starve guest kernel services. In all these cases, the host must resort to uncooperative swapping, Permission to make digital or hard copies of all or part of this work for personal or which is notorious for its poor performance (and which has classroom use is granted without fee provided that copies are not made or distributed motivated ballooning in the first place). for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the While operational, ballooning is a highly effective opti- author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or mization. But estimating the memory working set size of republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. guests is a hard problem, especially under changing condi- ASPLOS ’14, March 1–4, 2014, Salt Lake City, UT, USA. tions [25, 33, 43], and the act of transferring memory pages Copyright is held by the owner/author(s). Publication rights licensed to ACM. ACM 978-1-4503-2305-5/14/03. $15.00. from one guest to another is slow when the memory is over- http://dx.doi.org/10.1145/2541940.2541969 committed [13, 29, 34, 44]. Consequently, upon change, it takes time for the balloon manager to appropriately adjust in the guest’s virtual disk for future reference and discards the balloon sizes and to achieve good results. Hosts might them, thereby eliminating the root cause of silent writes, therefore rely on uncooperative swapping during this period, stale reads, decayed sequentiality, and false page anonymity. and so guests might experience degraded performance until The second component is the False Reads Preventer, which the balloon sizes stabilize. eliminates false reads by emulating faulting write instruc- We note in passing that the use of ballooning constitutes tions directed at swapped out pages. Instead of immediately a tradeoff that embodies both a benefit and a price. The ben- faulting-in the latter, the Preventer saves the written data in efit is the improved performance achieved through curbing a memory buffer for a short while, hoping the entire buffer the uncooperative swapping activity. Conversely, the price would fill up soon, which would then obviate the need to for clients is that they need to modify their guest operating read. We describe VSWAPPER in detail in Section 4. systems by installing host-specific software, which has vari- We evaluate VSWAPPER in Section 5 and find that when ous undesirable consequences such as burdening the clients, memory is tight VSWAPPER is typically significantly bet- being nonportable across different hypervisors, and entail- ter than baseline swapping and is oftentimes competitive ing a small risk of causing undesirable interactions between to ballooning. At its worst, VSWAPPER is respectively 1.1x new and existing software [37]. The price for vendors is that and 2.1x slower than baseline and ballooning. At its best, they need to put in the effort to support different drivers for VSWAPPER is respectively 10x and 2x faster than baseline every guest operating system kernel and version. (We spec- and ballooning, under changing load conditions. In all cases, ulate that, due to this effort, for example, there is no balloon combining VSWAPPER and ballooning yields performance driver available for OS X under KVM and VirtualBox, and comparable to the optimum. the latter supports ballooning for only 64-bit guests [15].) Finally, we discuss the related work in Section 6, outline Therefore, arguably, reducing the overheads of uncoopera- possible future work in Section 7, and conclude in Section 8. tive swapping could sway the decision of whether to employ ballooning or not. 2. Motivation Our goal in this paper is twofold. To provide a superior 2.1 The Benefit of Ballooning alternative to baseline uncooperative host swapping, to be used by hosts as a performant fall back for when balloons Current architectural support for machine virtualization al- cannot be used. And to enhance guests’ performance while lows the host operating system (OS) to manage the mem- ballooning is utilized under changing load conditions. We ory of its guest virtual machines (VMs) as if they were pro- motivate this goal in detail in Section 2. cesses, and it additionally allows the guest OSes to do their We investigate why uncooperative swapping degrades the own memory management for their internal processes, with- performance in practice and find that it is largely because of: out host involvement. (1) “silent swap writes” that copy unchanged blocks of data Hardware provides this capability by supporting a two- from the guest disk image to the host swap area; (2) “stale level address translation mechanism (Figure 1). The upper swap reads” triggered when guests perform explicit disk level is controlled by the guest and is comprised of page reads whose destination buffers are pages swapped out by tables translating “guest virtual addresses” to “guest physical the host; (3) “false swap reads” triggered when guests over- addresses”. The lower level is controlled by the host and write whole pages previously swapped out by the host while is comprised of tables translating guest physical addresses disregarding their old content (e.g., when copying-on-write); to “host physical addresses”. A guest physical address is (4) “decayed swap sequentiality” that causes unchanged of course not real in any sense. The host can (1) map it guest file blocks to gradually lose their contiguity while be- to some real memory page or (2) mark it as non-present, ing kept in the host swap area and thereby hindering swap which ensures the host will get a page fault if/when the guest prefetching; and (5) “false page anonymity” that occurs attempts to access the non-presentpage. Consequently,when when mislabeling guest pages backed by files as anonymous memory is tight, the host can temporarily store page content and thereby confusing the page reclamation algorithm. We on disk and read it back into memoryonly when handlingthe characterize and exemplify these problems in Section 3. corresponding page faults [60]. We denote the latter activity To address the problems, we design VSWAPPER, a guest- as uncooperative swapping, because the host can conduct it agnostic memory swapper to be used by hypervisors.
Recommended publications
  • Demystifying the Real-Time Linux Scheduling Latency
    Demystifying the Real-Time Linux Scheduling Latency Daniel Bristot de Oliveira Red Hat, Italy [email protected] Daniel Casini Scuola Superiore Sant’Anna, Italy [email protected] Rômulo Silva de Oliveira Universidade Federal de Santa Catarina, Brazil [email protected] Tommaso Cucinotta Scuola Superiore Sant’Anna, Italy [email protected] Abstract Linux has become a viable operating system for many real-time workloads. However, the black-box approach adopted by cyclictest, the tool used to evaluate the main real-time metric of the kernel, the scheduling latency, along with the absence of a theoretically-sound description of the in-kernel behavior, sheds some doubts about Linux meriting the real-time adjective. Aiming at clarifying the PREEMPT_RT Linux scheduling latency, this paper leverages the Thread Synchronization Model of Linux to derive a set of properties and rules defining the Linux kernel behavior from a scheduling perspective. These rules are then leveraged to derive a sound bound to the scheduling latency, considering all the sources of delays occurring in all possible sequences of synchronization events in the kernel. This paper also presents a tracing method, efficient in time and memory overheads, to observe the kernel events needed to define the variables used in the analysis. This results in an easy-to-use tool for deriving reliable scheduling latency bounds that can be used in practice. Finally, an experimental analysis compares the cyclictest and the proposed tool, showing that the proposed method can find sound bounds faster with acceptable overheads. 2012 ACM Subject Classification Computer systems organization → Real-time operating systems Keywords and phrases Real-time operating systems, Linux kernel, PREEMPT_RT, Scheduling latency Digital Object Identifier 10.4230/LIPIcs.ECRTS.2020.9 Supplementary Material ECRTS 2020 Artifact Evaluation approved artifact available at https://doi.org/10.4230/DARTS.6.1.3.
    [Show full text]
  • Proceedings 2005
    LAC2005 Proceedings 3rd International Linux Audio Conference April 21 – 24, 2005 ZKM | Zentrum fur¨ Kunst und Medientechnologie Karlsruhe, Germany Published by ZKM | Zentrum fur¨ Kunst und Medientechnologie Karlsruhe, Germany April, 2005 All copyright remains with the authors www.zkm.de/lac/2005 Content Preface ............................................ ............................5 Staff ............................................... ............................6 Thursday, April 21, 2005 – Lecture Hall 11:45 AM Peter Brinkmann MidiKinesis – MIDI controllers for (almost) any purpose . ....................9 01:30 PM Victor Lazzarini Extensions to the Csound Language: from User-Defined to Plugin Opcodes and Beyond ............................. .....................13 02:15 PM Albert Gr¨af Q: A Functional Programming Language for Multimedia Applications .........21 03:00 PM St´ephane Letz, Dominique Fober and Yann Orlarey jackdmp: Jack server for multi-processor machines . ......................29 03:45 PM John ffitch On The Design of Csound5 ............................... .....................37 04:30 PM Pau Arum´ıand Xavier Amatriain CLAM, an Object Oriented Framework for Audio and Music . .............43 Friday, April 22, 2005 – Lecture Hall 11:00 AM Ivica Ico Bukvic “Made in Linux” – The Next Step .......................... ..................51 11:45 AM Christoph Eckert Linux Audio Usability Issues .......................... ........................57 01:30 PM Marije Baalman Updates of the WONDER software interface for using Wave Field Synthesis . 69 02:15 PM Georg B¨onn Development of a Composer’s Sketchbook ................. ....................73 Saturday, April 23, 2005 – Lecture Hall 11:00 AM J¨urgen Reuter SoundPaint – Painting Music ........................... ......................79 11:45 AM Michael Sch¨uepp, Rene Widtmann, Rolf “Day” Koch and Klaus Buchheim System design for audio record and playback with a computer using FireWire . 87 01:30 PM John ffitch and Tom Natt Recording all Output from a Student Radio Station .
    [Show full text]
  • Virtual Memory
    Chapter 4 Virtual Memory Linux processes execute in a virtual environment that makes it appear as if each process had the entire address space of the CPU available to itself. This virtual address space extends from address 0 all the way to the maximum address. On a 32-bit platform, such as IA-32, the maximum address is 232 − 1or0xffffffff. On a 64-bit platform, such as IA-64, this is 264 − 1or0xffffffffffffffff. While it is obviously convenient for a process to be able to access such a huge ad- dress space, there are really three distinct, but equally important, reasons for using virtual memory. 1. Resource virtualization. On a system with virtual memory, a process does not have to concern itself with the details of how much physical memory is available or which physical memory locations are already in use by some other process. In other words, virtual memory takes a limited physical resource (physical memory) and turns it into an infinite, or at least an abundant, resource (virtual memory). 2. Information isolation. Because each process runs in its own address space, it is not possible for one process to read data that belongs to another process. This improves security because it reduces the risk of one process being able to spy on another pro- cess and, e.g., steal a password. 3. Fault isolation. Processes with their own virtual address spaces cannot overwrite each other’s memory. This greatly reduces the risk of a failure in one process trig- gering a failure in another process. That is, when a process crashes, the problem is generally limited to that process alone and does not cause the entire machine to go down.
    [Show full text]
  • Context Switch in Linux – OS Course Memory Layout – General Picture
    ContextContext switchswitch inin LinuxLinux ©Gabriel Kliot, Technion 1 Context switch in Linux – OS course Memory layout – general picture Stack Stack Stack Process X user memory Process Y user memory Process Z user memory Stack Stack Stack tss->esp0 TSS of CPU i task_struct task_struct task_struct Process X kernel Process Y kernel Process Z kernel stack stack and task_struct stack and task_struct and task_struct Kernel memory ©Gabriel Kliot, Technion 2 Context switch in Linux – OS course #1 – kernel stack after any system call, before context switch prev ss User Stack esp eflags cs … User Code eip TSS … orig_eax … tss->esp0 es Schedule() function frame esp ds eax Saved on the kernel stack during ebp a transition to task_struct kernel mode by a edi jump to interrupt and by SAVE_ALL esi macro edx thread.esp0 ecx ebx ©Gabriel Kliot, Technion 3 Context switch in Linux – OS course #2 – stack of prev before switch_to macro in schedule() func prev … Schedule() saved EAX, ECX, EDX Arguments to contex_switch() Return address to schedule() TSS Old (schedule’s()) EBP … tss->esp0 esp task_struct thread.eip thread.esp thread.esp0 ©Gabriel Kliot, Technion 4 Context switch in Linux – OS course #3 – switch_to: save esi, edi, ebp on the stack of prev prev … Schedule() saved EAX, ECX, EDX Arguments to contex_switch() Return address to schedule() TSS Old (schedule’s()) EBP … tss->esp0 ESI EDI EBP esp task_struct thread.eip thread.esp thread.esp0 ©Gabriel Kliot, Technion 5 Context switch in Linux – OS course #4 – switch_to: save esp in prev->thread.esp
    [Show full text]
  • Overcoming Traditional Problems with OS Huge Page Management
    MEGA: Overcoming Traditional Problems with OS Huge Page Management Theodore Michailidis Alex Delis Mema Roussopoulos University of Athens University of Athens University of Athens Athens, Greece Athens, Greece Athens, Greece [email protected] [email protected] [email protected] ABSTRACT KEYWORDS Modern computer systems now feature memory banks whose Huge pages, Address Translation, Memory Compaction aggregate size ranges from tens to hundreds of GBs. In this context, contemporary workloads can and do often consume ACM Reference Format: Theodore Michailidis, Alex Delis, and Mema Roussopoulos. 2019. vast amounts of main memory. This upsurge in memory con- MEGA: Overcoming Traditional Problems with OS Huge Page Man- sumption routinely results in increased virtual-to-physical agement. In The 12th ACM International Systems and Storage Con- address translations, and consequently and more importantly, ference (SYSTOR ’19), June 3–5, 2019, Haifa, Israel. ACM, New York, more translation misses. Both of these aspects collectively NY, USA, 11 pages. https://doi.org/10.1145/3319647.3325839 do hamper the performance of workload execution. A solu- tion aimed at dramatically reducing the number of address translation misses has been to provide hardware support for 1 INTRODUCTION pages with bigger sizes, termed huge pages. In this paper, we Computer memory capacities have been increasing sig- empirically demonstrate the benefits and drawbacks of using nificantly over the past decades, leading to the development such huge pages. In particular, we show that it is essential of memory hungry workloads that consume vast amounts for modern OS to refine their software mechanisms to more of main memory. This upsurge in memory consumption rou- effectively manage huge pages.
    [Show full text]
  • The Case for Compressed Caching in Virtual Memory Systems
    THE ADVANCED COMPUTING SYSTEMS ASSOCIATION The following paper was originally published in the Proceedings of the USENIX Annual Technical Conference Monterey, California, USA, June 6-11, 1999 The Case for Compressed Caching in Virtual Memory Systems _ Paul R. Wilson, Scott F. Kaplan, and Yannis Smaragdakis aUniversity of Texas at Austin © 1999 by The USENIX Association All Rights Reserved Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein. For more information about the USENIX Association: Phone: 1 510 528 8649 FAX: 1 510 548 5738 Email: [email protected] WWW: http://www.usenix.org The Case for Compressed Caching in Virtual Memory Systems Paul R. Wilson, Scott F. Kaplan, and Yannis Smaragdakis Dept. of Computer Sciences University of Texas at Austin Austin, Texas 78751-1182 g fwilson|sfkaplan|smaragd @cs.utexas.edu http://www.cs.utexas.edu/users/oops/ Abstract In [Wil90, Wil91b] we proposed compressed caching for virtual memory—storing pages in compressed form Compressed caching uses part of the available RAM to in a main memory compression cache to reduce disk pag- hold pages in compressed form, effectively adding a new ing. Appel also promoted this idea [AL91], and it was level to the virtual memory hierarchy. This level attempts evaluated empirically by Douglis [Dou93] and by Russi- to bridge the huge performance gap between normal (un- novich and Cogswell [RC96]. Unfortunately Douglis’s compressed) RAM and disk.
    [Show full text]
  • Memory Protection at Option
    Memory Protection at Option Application-Tailored Memory Safety in Safety-Critical Embedded Systems – Speicherschutz nach Wahl Auf die Anwendung zugeschnittene Speichersicherheit in sicherheitskritischen eingebetteten Systemen Der Technischen Fakultät der Universität Erlangen-Nürnberg zur Erlangung des Grades Doktor-Ingenieur vorgelegt von Michael Stilkerich Erlangen — 2012 Als Dissertation genehmigt von der Technischen Fakultät Universität Erlangen-Nürnberg Tag der Einreichung: 09.07.2012 Tag der Promotion: 30.11.2012 Dekan: Prof. Dr.-Ing. Marion Merklein Berichterstatter: Prof. Dr.-Ing. Wolfgang Schröder-Preikschat Prof. Dr. Michael Philippsen Abstract With the increasing capabilities and resources available on microcontrollers, there is a trend in the embedded industry to integrate multiple software functions on a single system to save cost, size, weight, and power. The integration raises new requirements, thereunder the need for spatial isolation, which is commonly established by using a memory protection unit (MPU) that can constrain access to the physical address space to a fixed set of address regions. MPU-based protection is limited in terms of available hardware, flexibility, granularity and ease of use. Software-based memory protection can provide an alternative or complement MPU-based protection, but has found little attention in the embedded domain. In this thesis, I evaluate qualitative and quantitative advantages and limitations of MPU-based memory protection and software-based protection based on a multi-JVM. I developed a framework composed of the AUTOSAR OS-like operating system CiAO and KESO, a Java implementation for deeply embedded systems. The framework allows choosing from no memory protection, MPU-based protection, software-based protection, and a combination of the two.
    [Show full text]
  • System Programming Guide
    Special Edition for CSEDU12 Students Students TOUCH -N-PASS EXAM CRAM GUIDE SERIES SYSTEM PROGRAMMING Prepared By Sharafat Ibn Mollah Mosharraf CSE, DU 12 th Batch (2005-2006) TABLE OF CONTENTS APIS AT A GLANCE ....................................................................................................................................................................... 1 CHAPTER 1: ASSEMBLER, LINKER & LOADER ................................................................................................................................ 4 CHAPTER 2: KERNEL ..................................................................................................................................................................... 9 CHAPTER 3: UNIX ELF FILE FORMAT ............................................................................................................................................ 11 CHAPTER 4: DEVICE DRIVERS ...................................................................................................................................................... 16 CHAPTER 5: INTERRUPT .............................................................................................................................................................. 20 CHAPTER 6: SYSTEM CALL ........................................................................................................................................................... 24 CHAPTER 7: FILE APIS ................................................................................................................................................................
    [Show full text]
  • Virtual Memory
    Virtual Memory Copyright ©: University of Illinois CS 241 Staff 1 Address Space Abstraction n Program address space ¡ All memory data ¡ i.e., program code, stack, data segment n Hardware interface (physical reality) ¡ Computer has one small, shared memory How can n Application interface (illusion) we close this gap? ¡ Each process wants private, large memory Copyright ©: University of Illinois CS 241 Staff 2 Address Space Illusions n Address independence ¡ Same address can be used in different address spaces yet remain logically distinct n Protection ¡ One address space cannot access data in another address space n Virtual memory ¡ Address space can be larger than the amount of physical memory on the machine Copyright ©: University of Illinois CS 241 Staff 3 Address Space Illusions: Virtual Memory Illusion Reality Giant (virtual) address space Many processes sharing Protected from others one (physical) address space (Unless you want to share it) Limited memory More whenever you want it Today: An introduction to Virtual Memory: The memory space seen by your programs! Copyright ©: University of Illinois CS 241 Staff 4 Multi-Programming n Multiple processes in memory at the same time n Goals 1. Layout processes in memory as needed 2. Protect each process’s memory from accesses by other processes 3. Minimize performance overheads 4. Maximize memory utilization Copyright ©: University of Illinois CS 241 Staff 5 OS & memory management n Main memory is normally divided into two parts: kernel memory and user memory n In a multiprogramming system, the “user” part of main memory needs to be divided to accommodate multiple processes. n The user memory is dynamically managed by the OS (known as memory management) to satisfy following requirements: ¡ Protection ¡ Relocation ¡ Sharing ¡ Memory organization (main mem.
    [Show full text]
  • Asynchronous Page Faults Aix Did It Red Hat Author Gleb Natapov August 10, 2010
    Asynchronous page faults Aix did it Red Hat Author Gleb Natapov August 10, 2010 Abstract Host memory overcommit may cause guest memory to be swapped. When guest vcpu access memory swapped out by a host its execution is suspended until memory is swapped back. Asynchronous page fault is a way to try and use guest vcpu more efficiently by allowing it to execute other tasks while page is brought back into memory. Part I How KVM Handles Guest Memory and What Inefficiency it Has With Regards to Host Swapping Mapping guest memory into host memory But we do it on demand Page fault happens on first guest access What happens on a page fault? 1 VMEXIT 2 kvm mmu page fault() 3 gfn to pfn() 4 get user pages fast() no previously mapped page and no swap entry found empty page is allocated 5 page is added into shadow/nested page table What happens on a page fault? 1 VMEXIT 2 kvm mmu page fault() 3 gfn to pfn() 4 get user pages fast() no previously mapped page and no swap entry found empty page is allocated 5 page is added into shadow/nested page table What happens on a page fault? 1 VMEXIT 2 kvm mmu page fault() 3 gfn to pfn() 4 get user pages fast() no previously mapped page and no swap entry found empty page is allocated 5 page is added into shadow/nested page table What happens on a page fault? 1 VMEXIT 2 kvm mmu page fault() 3 gfn to pfn() 4 get user pages fast() no previously mapped page and no swap entry found empty page is allocated 5 page is added into shadow/nested page table What happens on a page fault? 1 VMEXIT 2 kvm mmu page fault() 3 gfn to pfn()
    [Show full text]
  • What Is an Operating System III 2.1 Compnents II an Operating System
    Page 1 of 6 What is an Operating System III 2.1 Compnents II An operating system (OS) is software that manages computer hardware and software resources and provides common services for computer programs. The operating system is an essential component of the system software in a computer system. Application programs usually require an operating system to function. Memory management Among other things, a multiprogramming operating system kernel must be responsible for managing all system memory which is currently in use by programs. This ensures that a program does not interfere with memory already in use by another program. Since programs time share, each program must have independent access to memory. Cooperative memory management, used by many early operating systems, assumes that all programs make voluntary use of the kernel's memory manager, and do not exceed their allocated memory. This system of memory management is almost never seen any more, since programs often contain bugs which can cause them to exceed their allocated memory. If a program fails, it may cause memory used by one or more other programs to be affected or overwritten. Malicious programs or viruses may purposefully alter another program's memory, or may affect the operation of the operating system itself. With cooperative memory management, it takes only one misbehaved program to crash the system. Memory protection enables the kernel to limit a process' access to the computer's memory. Various methods of memory protection exist, including memory segmentation and paging. All methods require some level of hardware support (such as the 80286 MMU), which doesn't exist in all computers.
    [Show full text]
  • Measuring Software Performance on Linux Technical Report
    Measuring Software Performance on Linux Technical Report November 21, 2018 Martin Becker Samarjit Chakraborty Chair of Real-Time Computer Systems Chair of Real-Time Computer Systems Technical University of Munich Technical University of Munich Munich, Germany Munich, Germany [email protected] [email protected] OS program program CPU .text .bss + + .data +/- + instructions cache branch + coherency scheduler misprediction core + pollution + migrations data + + interrupt L1i$ miss access + + + + + + context mode + + (TLB flush) TLB + switch data switch miss L1d$ +/- + (KPTI TLB flush) miss prefetch +/- + + + higher-level readahead + page cache miss walk + + multicore + + (TLB shootdown) TLB coherency page DRAM + page fault + cache miss + + + disk + major minor I/O Figure 1. Event interaction map for a program running on an Intel Core processor on Linux. Each event itself may cause processor cycles, and inhibit (−), enable (+), or modulate (⊗) others. Abstract that our measurement setup has a large impact on the results. Measuring and analyzing the performance of software has More surprisingly, however, they also suggest that the setup reached a high complexity, caused by more advanced pro- can be negligible for certain analysis methods. Furthermore, cessor designs and the intricate interaction between user we found that our setup maintains significantly better per- formance under background load conditions, which means arXiv:1811.01412v2 [cs.PF] 20 Nov 2018 programs, the operating system, and the processor’s microar- chitecture. In this report, we summarize our experience on it can be used to improve high-performance applications. how performance characteristics of software should be mea- CCS Concepts • Software and its engineering → Soft- sured when running on a Linux operating system and a ware performance; modern processor.
    [Show full text]