Efficient Memory Mapped File I/O for In-Memory File Systems
Jungsik Choi†, Jiwon Kim, Hwansoo Han

Storage Latency Close to DRAM

[Figure: storage latency spectrum (operations per second vs. latency, from milliseconds down to nanoseconds): SATA HDD at milliseconds, SATA/SAS and PCIe Flash SSDs at roughly 60-100 μs, 3D-XPoint & Optane at ~10 μs, NVDIMM-N at ~10 ns]

Large Overhead in Software

• Existing OSes were designed for fast CPUs and slow block devices
• With low-latency storage, SW overhead becomes the largest burden
• Software overhead includes
  – Complicated I/O stacks
  – Redundant memory copies
  – Frequent user/kernel mode switches

[Figure: normalized latency decomposition into software, storage, and NIC portions; the software share grows from 7% (HDD, 10Gb NIC) and 50% (NVMe SSD, 10Gb NIC) to 77-82% (PCM, 10Gb/100Gb NIC). Source: Ahn et al., MICRO-48]

• SW overhead must be addressed to fully exploit low-latency NVM storage

Eliminate SW Overhead with mmap

• In recent studies, memory mapped I/O has been commonly proposed
  – Memory mapped I/O can expose the NVM storage’s raw performance
• Mapping files onto user memory space
  – Provides memory-like access
  – Avoids complicated I/O stacks
  – Minimizes user/kernel mode switches
  – No data copies

[Figure: traditional I/O passes r/w syscalls through the file system and device driver to disk/flash; memory mapped I/O accesses persistent memory directly with load/store instructions]

• The mmap syscall will be a critical interface

Memory Mapped I/O is Not as Fast as Expected

• Microbenchmark: sequential read of a 4GB file (Ext4-DAX on NVDIMM-N)
  – read (read system calls): 1.06 sec
  – default-mmap (mmap without any special flags): 1.42 sec
  – populate-mmap (mmap with a MAP_POPULATE flag): 1.02 sec

[Figure: elapsed time per variant, decomposed into page fault overhead, page table entry construction, and file access]

Overhead in Memory Mapped File I/O

• Memory mapped I/O can avoid the SW overhead of traditional file I/O
• However, memory mapped I/O causes another SW overhead
  – Page fault overhead
  – TLB miss overhead
  – PTE construction overhead
• This overhead diminishes the advantages of memory mapped file I/O
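To make the microbenchmark above concrete, here is a minimal sketch of the two mmap variants, assuming a file on a DAX-mounted file system. The path, the one-byte-per-4KB-page scan, and the omitted error handling are illustrative assumptions, not details from the slides.

```c
#define _GNU_SOURCE           /* for MAP_POPULATE */
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sequentially touch one byte per 4KB page of a mapped file. */
static uint64_t scan_mapping(const char *path, int flags)
{
    int fd = open(path, O_RDONLY);
    struct stat st;
    fstat(fd, &st);

    /* With MAP_POPULATE, the kernel constructs all PTEs up front, so
     * the loop below runs without page faults; without it, the first
     * touch of every page traps into the page fault handler. */
    uint8_t *p = mmap(NULL, st.st_size, PROT_READ, flags, fd, 0);

    uint64_t sum = 0;
    for (off_t i = 0; i < st.st_size; i += 4096)
        sum += p[i];

    munmap(p, st.st_size);
    close(fd);
    return sum;
}

int main(void)
{
    scan_mapping("/mnt/pmem/file", MAP_SHARED);                /* default-mmap */
    scan_mapping("/mnt/pmem/file", MAP_SHARED | MAP_POPULATE); /* populate-mmap */
    return 0;
}
```

As the measurements show, neither variant is free: default-mmap pays for page faults at access time, while populate-mmap pays the full PTE construction cost inside the mmap call itself.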
Techniques to Alleviate Storage Latency

• Readahead ⇒ Map-ahead
  – Preload pages that are expected to be accessed
• Page cache ⇒ Mapping cache
  – Cache frequently accessed pages
• fadvise/madvise interfaces ⇒ Extended madvise
  – Utilize user hints (madvise/fadvise) to manage pages
• However, these existing techniques can’t be used in memory based FSs
  ⇒ New optimization is needed in memory based FSs

Map-ahead Constructs PTEs in Advance

• When a page fault occurs, the page fault handler handles
  – The page that caused the page fault (existing demand paging)
  – Pages that are expected to be accessed (map-ahead)
• The kernel analyzes page fault patterns to predict the pages to be accessed
  – Sequential fault: map-ahead window size ↑
  – Random fault: map-ahead window size ↓
• Map-ahead can reduce the number of page faults

[Figure: fault-pattern state machine with PURE_SEQ (window size ↑), NEARLY_SEQ (window size kept), and NON_SEQ (window size ↓) states; sequential faults move the state toward PURE_SEQ, random faults toward NON_SEQ]

Comparison of Demand Paging & Map-ahead

• Existing demand paging: every first access to a page (a, b, c, ...) traps into the page fault handler, paying user/kernel mode switch, page fault, and TLB miss overhead each time
• Map-ahead: a single fault maps the whole map-ahead window (e.g., window size 5), so the following accesses proceed without faults, yielding a performance gain

Async Map-ahead Hides Mapping Overhead

• Sync map-ahead: the page fault handler constructs PTEs for the entire window before returning to the application
• Async map-ahead: the handler maps only the first pages, and kernel threads map the rest in the background, hiding most of the mapping overhead for a further performance gain

Extended madvise Makes User Hints Available

• Taking advantage of user hints can maximize the app’s performance
  – However, existing interfaces are designed to optimize only paging
• Extended madvise makes user hints available in memory based FSs
  – MADV_SEQUENTIAL / MADV_WILLNEED: run readahead aggressively ⇒ run map-ahead aggressively
  – MADV_RANDOM: stop readahead ⇒ stop map-ahead
  – MADV_DONTNEED: release pages in the page cache ⇒ release mappings in the mapping cache
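Because extended madvise reuses the existing madvise(2) flags rather than introducing new ones, an application hints exactly as it does today; only the kernel-side action changes. The sketch below shows that standard call pattern (the file path and access phases are hypothetical), with comments noting the memory-FS behavior the slides propose.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/mnt/pmem/file", O_RDONLY);   /* hypothetical path */
    struct stat st;
    fstat(fd, &st);
    char *addr = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);

    /* Sequential phase: with extended madvise, the kernel runs
     * map-ahead aggressively instead of readahead. */
    madvise(addr, st.st_size, MADV_SEQUENTIAL);
    /* ... sequential scan of addr[0 .. st.st_size) ... */

    /* Random phase: stop map-ahead (behaves like default-mmap). */
    madvise(addr, st.st_size, MADV_RANDOM);
    /* ... random lookups ... */

    /* Done with the range: under the proposal this releases the
     * mapping from the mapping cache, not pages from the page cache. */
    madvise(addr, st.st_size, MADV_DONTNEED);

    munmap(addr, st.st_size);
    close(fd);
    return 0;
}
```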
Mapping Cache Reuses Mapping Data

• When munmap is called, the existing kernel releases the VMA and PTEs
  – In memory based FSs, this mapping overhead is very large
• With a mapping cache, the mapping overhead can be reduced
  – VMAs and PTEs are cached in the mapping cache
  – When mmap is called, they are reused if possible (a cache hit)

[Figure: process address space with the mm_struct red-black tree of VMAs, page table entries (VA | PA), and physical memory; on a hit, a cached VMA and its PTEs are relinked to the new mmaped region]

Experimental Environment

• Experimental machine
  – Intel Xeon E5-2620
  – 32GB DRAM, 32GB NVDIMM-N (Netlist DDR4 NVDIMM-N, 16GB × 2)
  – Linux kernel 4.4
  – Ext4-DAX filesystem
• Benchmarks
  – fio: sequential & random read
  – YCSB on MongoDB: load workload
  – Apache HTTP server with httperf

fio: Sequential Read

[Figure: bandwidth (GB/s) vs. record size (4-64 KB) for default-mmap, populate-mmap, map-ahead, and extended madvise; annotated gains of 23%, 38%, and 46%, with page fault counts dropping from 16 million (default-mmap) to 14 thousand and 13 thousand for the map-ahead variants]

fio: Random Read

[Figure: bandwidth (GB/s) vs. record size (4-64 KB); with madvise(MADV_RANDOM), extended madvise performs the same as default-mmap]

Web Server Experimental Setting

• Apache HTTP server
  – Memory mapped file I/O
  – 10 thousand 1MB-size HTML files (from Wikipedia); total size is about 10GB
• httperf clients
  – 8 additional machines
  – Zipf-like request distribution
  – 1Gbps 24-port switch (iperf: 8,800 Mbps aggregate bandwidth)

Apache HTTP Server

[Figure: CPU cycles (trillions, 10^12) spent in kernel, libphp, libc, and others at request rates of 600-1000 reqs/s, comparing no mapping cache, a 2GB mapping cache, and an unlimited mapping cache]

Conclusion

• SW latency is becoming larger than the storage latency
  – Memory mapped file I/O can avoid this SW overhead
• However, memory mapped file I/O still incurs expensive additional overhead
  – Page fault, TLB miss, and PTE construction overhead
• To exploit the benefits of memory mapped I/O, we propose
  – Map-ahead, extended madvise, and mapping cache
• Our techniques mitigate the mapping overhead and demonstrate good performance
  – Map-ahead: 38% ↑
  – Map-ahead + extended madvise: 23% ↑
  – Mapping cache: 18% ↑

QnA: [email protected]
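To tie the Apache results back to the mapping cache, the sketch below shows the map/unmap-per-request pattern that the web server experiment stresses (the paths and request loop are hypothetical). On a vanilla kernel, each munmap tears down the VMA and PTEs that the next mmap of the same file must rebuild; with the proposed mapping cache, a hit lets them be reused instead.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* One simulated request: map a file, "serve" it, unmap it. */
static void serve_once(const char *path)
{
    int fd = open(path, O_RDONLY);
    struct stat st;
    fstat(fd, &st);

    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    /* ... e.g., write(client_fd, p, st.st_size) ... */

    munmap(p, st.st_size);  /* vanilla kernel: VMA and PTEs are released here */
    close(fd);
}

int main(void)
{
    /* A Zipf-like request stream hits a few hot files repeatedly,
     * which is what makes caching VMAs and PTEs pay off. */
    for (int i = 0; i < 1000; i++)
        serve_once("/mnt/pmem/html/hot.html");  /* hypothetical path */
    return 0;
}
```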