Dynamic and Elastic Memory Management in Virtualized Clouds
Total Pages: 16
File Type: PDF, Size: 1020 KB
Recommended publications
-
Demystifying the Real-Time Linux Scheduling Latency
Daniel Bristot de Oliveira (Red Hat, Italy), Daniel Casini (Scuola Superiore Sant'Anna, Italy), Rômulo Silva de Oliveira (Universidade Federal de Santa Catarina, Brazil), Tommaso Cucinotta (Scuola Superiore Sant'Anna, Italy)

Abstract: Linux has become a viable operating system for many real-time workloads. However, the black-box approach adopted by cyclictest, the tool used to evaluate the main real-time metric of the kernel, the scheduling latency, along with the absence of a theoretically-sound description of the in-kernel behavior, raises some doubts about Linux meriting the real-time adjective. Aiming at clarifying the PREEMPT_RT Linux scheduling latency, this paper leverages the Thread Synchronization Model of Linux to derive a set of properties and rules defining the Linux kernel behavior from a scheduling perspective. These rules are then leveraged to derive a sound bound on the scheduling latency, considering all the sources of delay occurring in all possible sequences of synchronization events in the kernel. This paper also presents a tracing method, efficient in time and memory overheads, to observe the kernel events needed to define the variables used in the analysis. This results in an easy-to-use tool for deriving reliable scheduling latency bounds that can be used in practice. Finally, an experimental analysis compares cyclictest and the proposed tool, showing that the proposed method can find sound bounds faster with acceptable overheads.

2012 ACM Subject Classification: Computer systems organization → Real-time operating systems. Keywords and phrases: real-time operating systems, Linux kernel, PREEMPT_RT, scheduling latency. DOI: 10.4230/LIPIcs.ECRTS.2020.9. Supplementary material: ECRTS 2020 Artifact Evaluation approved artifact, available at https://doi.org/10.4230/DARTS.6.1.3.
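The abstract's measurement baseline, cyclictest, works by sleeping until an absolute deadline and recording how late the wakeup actually was. The sketch below is not from the paper; the period and iteration count are illustrative, and it shows only the black-box idea in miniature:

```c
/* Minimal cyclictest-style latency probe: a sketch, not the actual tool.
 * Build: gcc -O2 -o latprobe latprobe.c */
#include <stdio.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL
#define PERIOD_NS    1000000LL   /* 1 ms wakeup period, an illustrative choice */

static int64_t ts_to_ns(struct timespec t)
{
    return (int64_t)t.tv_sec * NSEC_PER_SEC + t.tv_nsec;
}

int main(void)
{
    struct timespec next, now;
    int64_t max_lat = 0;

    clock_gettime(CLOCK_MONOTONIC, &next);
    for (int i = 0; i < 10000; i++) {
        next.tv_nsec += PERIOD_NS;
        while (next.tv_nsec >= NSEC_PER_SEC) {
            next.tv_nsec -= NSEC_PER_SEC;
            next.tv_sec++;
        }
        /* Sleep until an absolute deadline... */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        /* ...then record how late the wakeup was: the observed scheduling
         * latency (plus timer overhead), the metric the paper bounds. */
        clock_gettime(CLOCK_MONOTONIC, &now);
        int64_t lat = ts_to_ns(now) - ts_to_ns(next);
        if (lat > max_lat)
            max_lat = lat;
    }
    printf("max observed wakeup latency: %lld ns\n", (long long)max_lat);
    return 0;
}
```

A real run would additionally pin the thread and raise it to SCHED_FIFO priority, as cyclictest does; the paper's point is precisely that such observed maxima are measurements, not a sound bound.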
-
Memory Deduplication: An Effective Approach to Improve the Memory System
Yuhui Deng (1,2), Xinyu Huang (1), Liangshan Song (1), Yongtao Zhou (1), Frank Wang (3). (1) Department of Computer Science, Jinan University, Guangzhou, 510632, P. R. China; (2) Key Laboratory of Computer System and Architecture, Chinese Academy of Sciences, Beijing, 100190, P. R. China; (3) School of Computing, University of Kent, CT2 7NF, UK

Abstract: Programs now have more aggressive demands of memory to hold their data than before. This paper analyzes the characteristics of memory data by using seven real memory traces. It observes that the traces contain a large volume of memory pages with identical contents. Furthermore, the number of unique memory contents accessed is much smaller than the number of unique memory addresses accessed. This is caused by traditional address-based cache replacement algorithms, which replace memory pages by checking the addresses rather than the contents of those pages, thus leaving many identical memory contents with different addresses stored in memory. For example, in the same file system, opening two identical files stored in different directories, or opening two similar files that share a certain amount of content in the same directory, will result in identical data blocks stored in the cache due to the traditional address-based cache replacement algorithms. Based on these observations, this paper evaluates memory compression and memory deduplication. As expected, memory deduplication greatly outperforms memory compression; for example, the best deduplication ratio is 4.6 times higher than the best compression ratio.
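The address-based versus content-based distinction above is easy to make concrete. The following is a minimal sketch, not the paper's tooling: it fingerprints fixed-size pages by content (FNV-1a is an illustrative hash choice) and counts how many repeat an earlier page, which is exactly the redundancy a deduplication pass would exploit:

```c
/* Sketch: count duplicate 4 KiB pages by hashing contents, not addresses.
 * Real deduplicators use stronger fingerprints plus a byte-wise compare
 * to rule out hash collisions; this sketch skips that for brevity. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>

#define PAGE_SIZE 4096

static uint64_t fnv1a(const unsigned char *p, size_t n)
{
    uint64_t h = 1469598103934665603ULL;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Returns how many pages in buf repeat an earlier page's content. */
static size_t count_duplicate_pages(const unsigned char *buf, size_t npages)
{
    uint64_t *seen = malloc(npages * sizeof *seen);
    size_t nseen = 0, dups = 0;

    if (!seen)
        return 0;
    for (size_t i = 0; i < npages; i++) {
        uint64_t h = fnv1a(buf + i * PAGE_SIZE, PAGE_SIZE);
        int found = 0;
        for (size_t j = 0; j < nseen; j++)  /* linear scan keeps the sketch short */
            if (seen[j] == h) { found = 1; break; }
        if (found)
            dups++;                          /* candidate for deduplication */
        else
            seen[nseen++] = h;
    }
    free(seen);
    return dups;
}

int main(void)
{
    /* Three pages, two with identical contents: expect one duplicate. */
    unsigned char *buf = calloc(3, PAGE_SIZE);
    memset(buf + 0 * PAGE_SIZE, 'A', PAGE_SIZE);
    memset(buf + 1 * PAGE_SIZE, 'B', PAGE_SIZE);
    memset(buf + 2 * PAGE_SIZE, 'A', PAGE_SIZE);
    printf("duplicate pages: %zu\n", count_duplicate_pages(buf, 3));
    free(buf);
    return 0;
}
```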
-
Virtual Memory
Chapter 4: Virtual Memory. Linux processes execute in a virtual environment that makes it appear as if each process had the entire address space of the CPU available to itself. This virtual address space extends from address 0 all the way to the maximum address. On a 32-bit platform, such as IA-32, the maximum address is 2^32 − 1, or 0xffffffff. On a 64-bit platform, such as IA-64, it is 2^64 − 1, or 0xffffffffffffffff. While it is obviously convenient for a process to be able to access such a huge address space, there are really three distinct, but equally important, reasons for using virtual memory (a small sketch of the isolation properties follows this list).
1. Resource virtualization. On a system with virtual memory, a process does not have to concern itself with the details of how much physical memory is available or which physical memory locations are already in use by some other process. In other words, virtual memory takes a limited physical resource (physical memory) and turns it into an infinite, or at least an abundant, resource (virtual memory).
2. Information isolation. Because each process runs in its own address space, it is not possible for one process to read data that belongs to another process. This improves security because it reduces the risk of one process being able to spy on another process and, e.g., steal a password.
3. Fault isolation. Processes with their own virtual address spaces cannot overwrite each other's memory. This greatly reduces the risk of a failure in one process triggering a failure in another process. That is, when a process crashes, the problem is generally limited to that process alone and does not cause the entire machine to go down.
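A minimal illustration of points 2 and 3, assuming a POSIX system (this example is not from the chapter): after fork(), parent and child hold the very same virtual address, yet a write in one is invisible in the other, because each process's page tables map that address to a different physical frame:

```c
/* Parent and child share a *virtual* address for `value`, but not the
 * physical frame behind it, so their writes cannot interfere. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int value = 1;
    printf("before fork: value=%d at virtual address %p\n", value, (void *)&value);

    pid_t pid = fork();
    if (pid == 0) {               /* child: same virtual address... */
        value = 42;
        printf("child:  value=%d at %p\n", value, (void *)&value);
        exit(0);
    }
    wait(NULL);                   /* ...but the parent's copy is untouched */
    printf("parent: value=%d at %p\n", value, (void *)&value);
    return 0;
}
```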
-
Context Switch in Linux – OS Course
Slides by Gabriel Kliot, Technion. The deck steps through the kernel memory layout and the switch_to path with stack diagrams; only the slide titles survive extraction: "Memory layout – general picture"; "#1 – kernel stack after any system call, before context switch"; "#2 – stack of prev before switch_to macro in schedule() func"; "#3 – switch_to: save esi, edi, ebp on the stack of prev"; "#4 – switch_to: save esp in prev->thread.esp". The diagrams show each process's user stack and kernel stack (with its task_struct), the per-CPU TSS (tss->esp0), and the registers saved on the kernel stack by the SAVE_ALL macro on entry to the kernel (ss, esp, eflags, cs, eip, orig_eax, es, ds, eax, ebp, edi, esi, edx, ecx, ebx).
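For reference, the code these slides walk through is the classic i386 switch_to() macro. The reconstruction below is from memory of that era's kernel (2.4-style) and is illustrative only; the slide numbers in the comments map each instruction to the step it diagrams:

```c
/* Reconstruction (from memory, illustrative only) of the 2.4-era i386
 * switch_to() macro whose effect the slides diagram. prev and next are
 * task_struct pointers; __switch_to() finishes the switch in C. */
#define switch_to(prev, next) do {                                          \
    asm volatile(                                                           \
        "pushl %%esi\n\t"        /* slide #3: save esi on prev's stack */   \
        "pushl %%edi\n\t"        /* slide #3: save edi                 */   \
        "pushl %%ebp\n\t"        /* slide #3: save ebp                 */   \
        "movl  %%esp, %0\n\t"    /* slide #4: prev->thread.esp = esp   */   \
        "movl  %2, %%esp\n\t"    /* switch to next's kernel stack      */   \
        "movl  $1f, %1\n\t"      /* prev will resume at label 1        */   \
        "pushl %3\n\t"           /* push next->thread.eip, so that...  */   \
        "jmp   __switch_to\n"    /* ...__switch_to "returns" into next */   \
        "1:\t"                                                              \
        "popl  %%ebp\n\t"        /* when prev is scheduled again, it   */   \
        "popl  %%edi\n\t"        /* restores its callee-saved          */   \
        "popl  %%esi\n\t"        /* registers here                     */   \
        : "=m" (prev->thread.esp), "=m" (prev->thread.eip)                  \
        : "m" (next->thread.esp), "m" (next->thread.eip),                   \
          "a" (prev), "d" (next));                                          \
} while (0)
```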
-
The Case for Compressed Caching in Virtual Memory Systems
Paul R. Wilson, Scott F. Kaplan, and Yannis Smaragdakis, Dept. of Computer Sciences, University of Texas at Austin, Austin, Texas 78751-1182. {wilson|sfkaplan|smaragd}@cs.utexas.edu, http://www.cs.utexas.edu/users/oops/. Originally published in the Proceedings of the USENIX Annual Technical Conference, Monterey, California, USA, June 6-11, 1999.

Abstract: Compressed caching uses part of the available RAM to hold pages in compressed form, effectively adding a new level to the virtual memory hierarchy. This level attempts to bridge the huge performance gap between normal (uncompressed) RAM and disk. In [Wil90, Wil91b] we proposed compressed caching for virtual memory—storing pages in compressed form in a main memory compression cache to reduce disk paging. Appel also promoted this idea [AL91], and it was evaluated empirically by Douglis [Dou93] and by Russinovich and Cogswell [RC96]. Unfortunately Douglis's …
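The mechanism is simple to sketch: on eviction, try to compress the page into a RAM pool, and fall back to disk only if compression does not pay. The code below is an illustration, not the paper's system; it uses zlib (build with -lz) and a plausible half-page threshold, whereas the paper evaluates compressors designed specifically for in-memory data:

```c
/* Sketch of the compressed-cache decision: park a page in compressed RAM
 * if it shrinks enough, otherwise send it to disk. */
#include <stdio.h>
#include <string.h>
#include <zlib.h>

#define PAGE_SIZE 4096

/* Returns the compressed size if the page is worth caching compressed,
 * or 0 if it should go to disk instead. */
static unsigned long try_compress_page(const unsigned char *page,
                                       unsigned char *out, unsigned long outsz)
{
    uLongf clen = outsz;
    if (compress2(out, &clen, page, PAGE_SIZE, Z_BEST_SPEED) != Z_OK)
        return 0;
    /* Only cache if we save at least half a page: an illustrative threshold. */
    return (clen <= PAGE_SIZE / 2) ? (unsigned long)clen : 0;
}

int main(void)
{
    unsigned char page[PAGE_SIZE], out[PAGE_SIZE * 2];
    memset(page, 0, sizeof page);          /* zero pages compress extremely well */
    unsigned long clen = try_compress_page(page, out, sizeof out);
    if (clen)
        printf("page cached compressed: %d -> %lu bytes\n", PAGE_SIZE, clen);
    else
        printf("page not compressible enough, evict to disk\n");
    return 0;
}
```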
-
Linux Physical Memory Analysis
Paul Movall, Ward Nelson, Shaun Wetzstein (International Business Machines Corporation, 3605 Highway 52N, Rochester, MN)

Abstract: We present a tool suite for analysis of physical memory usage within the Linux kernel environment. This tool suite can be used to collect and analyze how the physical memory within a Linux environment is being used.

Embedded subsystems are common in today's computer systems. These embedded subsystems range from the very simple to the very complex. In such embedded systems, memory is scarce and swap is non-existent. When adapting Linux for use in this environment, we needed to keep a close eye on physical memory usage. A significant amount of this information can also be found in the /proc filesystem provided by the kernel. These /proc entries can be used to obtain many statistics, including several related to memory usage. For example, /proc/<pid>/maps can be used to display a process' virtual memory map. Likewise, the contents of /proc/<pid>/status can be used to retrieve statistics about virtual memory usage as well as the Resident Set Size (RSS). In most cases, this process-level detail is sufficient to analyze memory usage. In systems that do not have backing store, more details are often needed; for example, it is useful to know which pages in a process' address map are resident, not just how many. This information can be used to get some clues about the usage of a shared library.
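The /proc technique described above takes only a few lines to reproduce. This sketch, using the standard field names from proc(5), prints the current process's virtual size and resident set size:

```c
/* Pull VmSize and VmRSS for the current process out of /proc/self/status. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];

    if (!f) { perror("fopen"); return 1; }
    while (fgets(line, sizeof line, f)) {
        /* VmSize = total virtual memory; VmRSS = Resident Set Size */
        if (strncmp(line, "VmSize:", 7) == 0 || strncmp(line, "VmRSS:", 6) == 0)
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}
```

Per-page residency of the kind the paper's tool suite reports goes beyond this; on modern kernels /proc/<pid>/pagemap exposes it.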
-
Memory Protection at Option
Application-Tailored Memory Safety in Safety-Critical Embedded Systems (German title: Speicherschutz nach Wahl – Auf die Anwendung zugeschnittene Speichersicherheit in sicherheitskritischen eingebetteten Systemen). Doctoral dissertation submitted to the Faculty of Engineering of the University of Erlangen-Nürnberg for the degree of Doktor-Ingenieur by Michael Stilkerich, Erlangen, 2012. Date of submission: 09.07.2012; date of defense: 30.11.2012. Dean: Prof. Dr.-Ing. Marion Merklein. Reviewers: Prof. Dr.-Ing. Wolfgang Schröder-Preikschat and Prof. Dr. Michael Philippsen.

Abstract: With the increasing capabilities and resources available on microcontrollers, there is a trend in the embedded industry to integrate multiple software functions on a single system to save cost, size, weight, and power. The integration raises new requirements, among them the need for spatial isolation, which is commonly established by using a memory protection unit (MPU) that can constrain access to the physical address space to a fixed set of address regions. MPU-based protection is limited in terms of available hardware, flexibility, granularity, and ease of use. Software-based memory protection can provide an alternative to or complement MPU-based protection, but has found little attention in the embedded domain. In this thesis, I evaluate qualitative and quantitative advantages and limitations of MPU-based memory protection and of software-based protection based on a multi-JVM. I developed a framework composed of the AUTOSAR-OS-like operating system CiAO and KESO, a Java implementation for deeply embedded systems. The framework allows choosing from no memory protection, MPU-based protection, software-based protection, and a combination of the two.
-
Extending the UHC LLVM Backend: Adding Support for Accurate Garbage Collection
Paul van der Ende, MSc Thesis, October 18, 2010, INF/SCR-09-59. Daily supervisors: dr. A. Dijkstra and drs. J.D. Fokker, Center for Software Technology, Dept. of Information and Computing Sciences, Utrecht University, Utrecht, the Netherlands. Second supervisor: prof. dr. S.D. Swierstra.

Abstract: The Utrecht Haskell Compiler has an LLVM backend for whole-program analysis mode. This backend previously used the Boehm-Weiser library for heap allocation and conservative garbage collection. Recently an accurate garbage collector for the UHC compiler has been developed, and we wanted this new garbage collection library to work with the LLVM backend. Since this new collector is accurate, it needs cooperation with the LLVM backend: functionality must be added to find the set of root references and to traverse all live references. To find the root set, the bytecode interpreter backend currently uses static stack maps. This is where the problem arises: the LLVM compiler is known to perform various aggressive transformations, and these optimizations might change the stack layout described by the static stack map. To avoid this problem we wanted to use the shadow stack provided by the LLVM framework. This is a dynamic structure maintained on the stack to safely find the garbage collection roots. We expect the shadow-stack approach to come with a runtime overhead for maintaining this information dynamically on the stack, so we measured the impact of this method. We also measured the performance improvement of the new garbage collection method and compared it with the other UHC backends.
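The shadow stack mentioned here is a documented LLVM mechanism: each compiled function pushes a small frame descriptor onto a linked list in its prologue, so the collector can find roots without trusting static stack maps. The walker below is adapted from the example in LLVM's garbage-collection documentation; treat the details as illustrative of that interface:

```c
/* Shadow-stack walker, adapted from LLVM's GC documentation example. */
#include <stdint.h>
#include <stddef.h>

struct FrameMap {
    int32_t NumRoots;       /* number of roots in the function's stack frame */
    int32_t NumMeta;        /* number of metadata entries (may be < NumRoots) */
    const void *Meta[];     /* metadata for each root */
};

struct StackEntry {
    struct StackEntry *Next;      /* caller's entry: frames form a linked list */
    const struct FrameMap *Map;   /* constant per-function frame description */
    void *Roots[];                /* the roots themselves, stored in-frame */
};

/* Head of the chain; compiled code pushes/pops entries in prologue/epilogue. */
struct StackEntry *llvm_gc_root_chain;

/* Call Visitor on every GC root currently live on the shadow stack. */
void visitGCRoots(void (*Visitor)(void **Root, const void *Meta))
{
    for (struct StackEntry *R = llvm_gc_root_chain; R; R = R->Next) {
        unsigned i = 0;
        for (unsigned e = R->Map->NumMeta; i != e; ++i)
            Visitor(&R->Roots[i], R->Map->Meta[i]);   /* roots with metadata */
        for (unsigned e = R->Map->NumRoots; i != e; ++i)
            Visitor(&R->Roots[i], NULL);              /* remaining roots */
    }
}
```

Because the list is maintained dynamically at run time, it stays correct under aggressive optimization, which is exactly the property the thesis needs; the measured cost is the push/pop work in every function.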
-
Anàlisi de prestacions de sistemes d'emmagatzematge per IA (Performance Analysis of Storage Systems for AI). Murciano Soto, Joan; Rexachs del Rosario, Dolores Isabel, dir. 2021. (958 Enginyeria Informàtica)
This is the published version of the bachelor thesis: Murciano Soto, Joan; Rexachs del Rosario, Dolores Isabel, dir. Anàlisi de prestacions de sistemes d'emmagatzematge per IA. 2021. (958 Enginyeria Informàtica). This version is available at https://ddd.uab.cat/record/248510 under the terms of the license. TFG in Computer Engineering (Enginyeria Informàtica), Escola d'Enginyeria (EE), Universitat Autònoma de Barcelona (UAB). Joan Murciano Soto.

Abstract (translated from the Catalan resum): Artificial Intelligence (AI) programs by their nature perform many file reads. These reads require many calls to storage devices, and those calls can delay the execution of the program. The bandwidth available for moving data between disk and memory can become a bottleneck that increases execution time, so for this kind of program it is important to be able to detect whether the system's inputs/outputs (I/O) become saturated. This work studies several programs with large volumes of disk reads, using monitoring tools that report metrics related to disk I/O. We also examine the impact of swap, which likewise causes an increase in I/O operations. This document presents the methodology used for the analysis, describing the tools and the results obtained, with the aim of serving as a guide to understanding the behavior and effect of I/O and swap. Keywords: I/O, swap, AI, monitoring.
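One concrete way to observe the swap effect the thesis measures, assuming Linux (pswpin and pswpout are the standard /proc/vmstat counters), is to sample the counters around the workload:

```c
/* Sample swap-in/swap-out page counters from /proc/vmstat before and
 * after a read-heavy workload; the delta is the swap traffic it caused. */
#include <stdio.h>
#include <string.h>

static long read_vmstat(const char *key)
{
    FILE *f = fopen("/proc/vmstat", "r");
    char name[64];
    long value, found = -1;

    if (!f) return -1;
    while (fscanf(f, "%63s %ld", name, &value) == 2)
        if (strcmp(name, key) == 0) { found = value; break; }
    fclose(f);
    return found;
}

int main(void)
{
    long in0 = read_vmstat("pswpin"), out0 = read_vmstat("pswpout");
    /* ... run the I/O-heavy workload under study here ... */
    long in1 = read_vmstat("pswpin"), out1 = read_vmstat("pswpout");
    printf("pages swapped in: %ld, out: %ld\n", in1 - in0, out1 - out0);
    return 0;
}
```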
-
Improving Memory Management Security for C and C++
Yves Younan, Wouter Joosen, Frank Piessens, Hans Van den Eynden (DistriNet, Katholieke Universiteit Leuven, Belgium)

Abstract: Memory managers are an important part of any modern language: they are used to dynamically allocate memory for use in the program. Many managers exist, varying with the operating system and language, but two major types can be identified: manual memory allocators and garbage collectors. In the case of manual memory allocators, the programmer must manually release memory back to the system when it is no longer needed. Problems can occur when a programmer forgets to release memory (memory leaks), releases it twice, or keeps using freed memory. These problems are solved by garbage collectors. However, both manual memory allocators and garbage collectors store management information for the memory they manage. Often, this management information is stored where a buffer overflow could allow an attacker to overwrite it, providing a reliable way to achieve code execution when exploiting these vulnerabilities. In this paper we describe several vulnerabilities for C and C++ and how they could be exploited by modifying the management information of a representative manual memory allocator and a garbage collector. Afterwards, we present an approach that, when applied to memory managers, protects against these attack vectors. We implemented our approach by modifying an existing widely used memory allocator. Benchmarks show that this implementation has a negligible, sometimes even beneficial, impact on performance.

1 Introduction. Security has become an important concern for all computer users. Worms and hackers are a part of everyday internet life.
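The countermeasure the paper motivates can be sketched in a few lines: keep the allocator's bookkeeping out-of-band, so an overflow in user data cannot reach chunk headers or free-list pointers. This is an illustrative toy (bump allocation, linear lookup), not the paper's modified allocator:

```c
/* Toy allocator with out-of-band metadata: no inline chunk headers for a
 * buffer overflow to corrupt, and no free-list pointers inside user data. */
#include <stddef.h>

#define MAX_CHUNKS 1024

struct chunk_info {            /* management info, stored away from user data */
    void  *addr;
    size_t size;
    int    in_use;
};

static struct chunk_info table[MAX_CHUNKS];  /* could also be write-protected
                                                or guarded, as the paper's
                                                approach does more carefully */
static unsigned char heap[1 << 20];
static size_t heap_top;

void *sep_malloc(size_t size)
{
    for (int i = 0; i < MAX_CHUNKS; i++) {
        if (!table[i].in_use) {
            if (heap_top + size > sizeof heap)
                return NULL;
            table[i] = (struct chunk_info){ heap + heap_top, size, 1 };
            heap_top += size;    /* bump allocation: adjacent chunks hold only
                                    user data, never allocator metadata */
            return table[i].addr;
        }
    }
    return NULL;
}

void sep_free(void *p)
{
    for (int i = 0; i < MAX_CHUNKS; i++)
        if (table[i].in_use && table[i].addr == p)
            table[i].in_use = 0;
}
```

An overflow past a sep_malloc'd buffer can still trash neighboring user data, but it can no longer rewrite the metadata that classic unlink-style exploits abuse for code execution.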
-
Memory Safety Without Garbage Collection for Embedded Applications
DINAKAR DHURJATI, SUMANT KOWSHIK, VIKRAM ADVE, and CHRIS LATTNER, University of Illinois at Urbana-Champaign

Traditional approaches to enforcing memory safety of programs rely heavily on run-time checks of memory accesses and on garbage collection, both of which are unattractive for embedded applications. The goal of our work is to develop advanced compiler techniques for enforcing memory safety with minimal run-time overheads. In this paper, we describe a set of compiler techniques that, together with minor semantic restrictions on C programs and no new syntax, ensure memory safety and provide most of the error-detection capabilities of type-safe languages, without using garbage collection and with no run-time software checks (on systems with standard hardware support for memory management). The language permits arbitrary pointer-based data structures, explicit deallocation of dynamically allocated memory, and restricted array operations. One of the key results of this paper is a compiler technique that ensures that dereferencing dangling pointers to freed memory does not violate memory safety, without annotations, run-time checks, or garbage collection, and works for arbitrary type-safe C programs. Furthermore, we present a new interprocedural analysis for static array bounds checking under certain assumptions. For a diverse set of embedded C programs, we show that we are able to ensure memory safety of pointer and dynamic memory usage in all these programs with no run-time software checks (on systems with standard hardware memory protection), requiring only minor restructuring to conform to simple type restrictions. Static array bounds checking fails for roughly half the programs we study due to complex array references, and these are the only cases where explicit run-time software checks would be needed under our language and system assumptions.
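The trade-off between static and run-time checking reads naturally as code. In this sketch (illustrative, not the paper's compiler output), get_checked() carries the explicit run-time check that the paper's interprocedural analysis tries to prove away:

```c
/* The check in get_checked() is the run-time fallback; when static
 * analysis proves 0 <= i < n, the access compiles to a plain load. */
#include <stdio.h>
#include <stdlib.h>

static int get_checked(const int *a, size_t n, size_t i)
{
    if (i >= n) {                          /* explicit run-time software check */
        fprintf(stderr, "bounds violation: index %zu, length %zu\n", i, n);
        abort();
    }
    return a[i];
}

int main(void)
{
    int a[8];
    /* Bounds here are statically evident, so an interprocedural analysis
     * like the paper's could elide any checks for this loop entirely. */
    for (size_t i = 0; i < 8; i++)
        a[i] = (int)(i * i);
    printf("a[5] = %d\n", get_checked(a, 8, 5));
    return 0;
}
```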
-
Extracting Compressed Pages from the Windows 10 Virtual Store
White paper.

Abstract: Windows 8.1 introduced memory compression in August 2013. By the end of 2013, Linux 3.11 and OS X Mavericks leveraged compressed memory as well. Disk I/O continues to be orders of magnitude slower than RAM, whereas reading and decompressing data in RAM is fast and highly parallelizable across the system's CPU cores, yielding a significant performance increase. However, this came at the cost of increased complexity of process memory reconstruction and thus reduced the power of popular tools such as Volatility, Rekall, and Redline. In this document we introduce a method to retrieve compressed pages from the Windows 10 Memory Manager Virtual Store, thus providing forensics and auditing tools with a way to retrieve, examine, and reconstruct memory artifacts regardless of their storage location.

Introduction: Windows 10 moves pages between physical memory and the hard disk or the Store Manager's virtual store when memory is constrained. Universal Windows Platform (UWP) applications leverage the Virtual Store any time they are suspended (as is the case when minimized). When a given page is no longer in the process's working set, the corresponding Page Table Entry (PTE) is used by the OS to specify the storage location as well as additional data that allows it to start the retrieval process. In the case of a page file, the retrieval is straightforward because both the page file index and the location of the page within the page file can be directly retrieved.