De-Indirection for Flash-Based SSDs with Nameless Write

ERIK JONSSON SCHOOL OF ENGINEERING AND COMPUTER SCIENCE
UNIVERSITY OF TEXAS AT DALLAS

De-indirection for Flash-based SSDs with Nameless Write
Yiying Zhang, Leo Prasath Arulraj, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau
University of Wisconsin - Madison

“All problems in computer science can be solved by another level of indirection.” – Butler Lampson
o Indirection
  • The ability to reference something using a name, reference, or container instead of the value itself.
  • Example: value (A) <- pointer (*B) <- pointer to pointer (**C)
o Indirection in computer systems
  • Virtual memory: virtual to physical memory address
  • SSDs: logical to SSD physical address

“…… But that usually will create another problem.” – David Wheeler
Challenges:
1. Extra translation/search time
   Example: virtual-to-physical memory address translation (extra translation lookaside buffer)
2. Extra space overhead
   Example: logical-to-physical address mapping table (1 TB of data needs a 1 GB table; for instance, assuming 4 KB pages and 4-byte entries, 2^28 entries x 4 B = 1 GB)

Indirection in Flash-Based SSDs
  • Mapping from logical to physical address (L->P)
o Advantages:
  • Hides erase-before-write
  • Enables wear leveling
o Disadvantages:
  • Occupies extra RAM to maintain the indirection table
  • Performance cost of garbage collection
  • Performance impact on random writes

De-indirection with Nameless Write
One solution is de-indirection:
  • Remove the logical-to-physical mapping in the SSD
  • Instead, store physical addresses in the file system
New interface: Nameless Write
  • Write without a name (logical address)
  • Device allocates and returns the physical address
  • File system stores the physical address
Advantages: reduces space and performance cost

Details of Nameless Write Interfaces
o Nameless Write
  • Write only data and no name
o Physical Read
  • Read using the physical address
o Free/Trim
  • Invalidates the block at a physical address

More conditions to take into account
o P1: Overwriting a data block in a file

P1: Segmented Address Space Example

Reconsider Overwrite Condition

More conditions to take into account
o P2: Data Migration During Wear Leveling
  • SSD moves data to different blocks to balance program/erase (P/E) cycles
o Traditional Wear Leveling
  • Data migration needs no operation in the file system
  • In the device, updating the L->P mapping table is enough
o Wear Leveling with Nameless Writes
  • File system needs to be informed
  • Only the address in the physical space changes

o P2: Data Migration During Wear Leveling
o Solution: add a new interface, Migration Callbacks
  • Step 1: add a temporary remapping table
  • Step 2: read data from the old address and remap it to the new address
  • Step 3: inform the file system by passing the old physical address, the new address, and metadata
  • Step 4: file system processes callbacks in the background

More conditions to take into account
o P3: Locating Metadata Structures
  • In callbacks and recovery, metadata is necessary
o Solution: Associated Metadata
  • Small amount of metadata used to locate metadata structures
  • e.g., inode number, inode generation number, block offset
  • Sent with nameless writes and migration callbacks
  • Stored adjacent to data pages on the device, e.g., in the OOB area

More conditions to take into account
o P4: Garbage Collection
  • Reclaim invalid data pages and migrate live ones to a new location
o Traditional Garbage Collection
  • Data migration needs no operation in the file system
  • In the device, updating the L->P mapping table is enough
o In-Place Garbage Collection
  • Avoids changing the physical address of live data pages
  • Needs an extra battery-backed cache to hold data
  • May waste space in a block after garbage collection

Evaluation
o FTLs studied
  • Page mapping: log-structured allocation; ideal in performance, unrealistic in space
  • Hybrid mapping: small page-mapped area + block-mapped area
  • Nameless-writing

Indirection Table Space Cost
o Mapping table sizes for typical file system images

Micro-Benchmark Performance
o Sequential and sustained 4 KB random write

Macro-Benchmark Performance
o Varmail, FileServer, and WebServer

Some critical issues
o In-place garbage collection trades off wasted space against wasted time.
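
Example sketch: nameless write, physical read, free, and migration callbacks

The interfaces listed in the slides (nameless write, physical read, free/trim, migration callbacks, associated metadata) can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the authors' implementation: the class names NamelessDevice, TinyFS, and AssocMeta, the method names, and the round-robin allocator are all hypothetical and chosen for illustration. The device picks the physical address and returns it; the file system stores it; on migration the device keeps a temporary remapping entry until the file system has processed the callback.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AssocMeta:
    """Associated metadata sent with each nameless write (hypothetical layout)."""
    inode: int
    generation: int
    offset: int


class NamelessDevice:
    """Toy device model: no logical-to-physical table, only a temporary remap."""

    def __init__(self, num_pages: int) -> None:
        self.num_pages = num_pages
        self.pages: Dict[int, Tuple[bytes, AssocMeta]] = {}  # physical addr -> page
        self.remap: Dict[int, int] = {}                      # old addr -> new addr
        self.pending: List[Tuple[int, int, AssocMeta]] = []  # queued callbacks
        self.next_free = 0

    def _allocate(self) -> int:
        # Round-robin scan; skip live pages and addresses still being remapped.
        for _ in range(self.num_pages):
            addr = self.next_free
            self.next_free = (self.next_free + 1) % self.num_pages
            if addr not in self.pages and addr not in self.remap:
                return addr
        raise RuntimeError("device full")

    def nameless_write(self, data: bytes, meta: AssocMeta) -> int:
        """Write data without a name; the device returns the physical address."""
        addr = self._allocate()
        self.pages[addr] = (data, meta)
        return addr

    def physical_read(self, addr: int) -> bytes:
        """Read by physical address, following a temporary remap if present."""
        return self.pages[self.remap.get(addr, addr)][0]

    def free(self, addr: int) -> None:
        """Invalidate the page at a physical address."""
        self.pages.pop(addr, None)

    def migrate(self, old_addr: int) -> int:
        """Move a live page (e.g. for wear leveling) and queue a callback."""
        data, meta = self.pages[old_addr]
        new_addr = self._allocate()
        self.pages[new_addr] = (data, meta)
        del self.pages[old_addr]
        self.remap[old_addr] = new_addr
        self.pending.append((old_addr, new_addr, meta))
        return new_addr

    def drain_callbacks(self) -> List[Tuple[int, int, AssocMeta]]:
        """Hand queued callbacks to the file system and drop the remap entries."""
        callbacks, self.pending = self.pending, []
        self.remap.clear()
        return callbacks


class TinyFS:
    """Toy file system that stores physical addresses directly."""

    def __init__(self, dev: NamelessDevice) -> None:
        self.dev = dev
        self.block_map: Dict[Tuple[int, int], int] = {}  # (inode, offset) -> addr

    def write(self, inode: int, offset: int, data: bytes) -> None:
        old = self.block_map.get((inode, offset))
        meta = AssocMeta(inode, 0, offset)
        self.block_map[(inode, offset)] = self.dev.nameless_write(data, meta)
        if old is not None:
            self.dev.free(old)  # an overwrite lands at a new physical address

    def read(self, inode: int, offset: int) -> bytes:
        return self.dev.physical_read(self.block_map[(inode, offset)])

    def process_callbacks(self) -> None:
        # The associated metadata tells the file system which pointer to update.
        for old, new, meta in self.dev.drain_callbacks():
            key = (meta.inode, meta.offset)
            if self.block_map.get(key) == old:
                self.block_map[key] = new
```

In this model a read issued between migrate() and process_callbacks() still succeeds because the device resolves the old address through its temporary remapping entry, which corresponds to Steps 1 to 4 of the migration callback slide. The model ignores erase blocks, the OOB area, and crash recovery.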