Software Persistent Memory
Total Page:16
File Type:pdf, Size:1020Kb
Software Persistent Memory Jorge Guerra, Leonardo M´armol, Daniel Campello, Carlos Crespo, Raju Rangaswami, Jinpeng Wei Florida International University {jguerra, lmarm001, dcamp020, ccres008, raju, weijp}@cs.fiu.edu Abstract translators increase overhead along the data path. Most Persistence of in-memory data is necessary for many importantly, all of these solutions require substantial ap- classes of application and systems software. We pro- plication involvement for making data persistent which pose Software Persistent Memory (SoftPM), a new mem- ultimately increases code complexity affecting reliabil- ory abstraction which allows malloc style allocations ity, portability, and maintainability. to be selectively made persistent with relative ease. In this paper, we present Software Persistent Mem- Particularly, SoftPM’s persistent containers implement ory (SoftPM), a lightweight persistent memory abstrac- automatic, orthogonal persistence for all in-memory tion for C. SoftPM provides a novel form of orthogo- data reachable from a developer-defined root structure. nal persistence [8], whereby the persistence of data (the Writing new applications or adapting existing applica- how) is seamless to the developer, while allowing effort- tions to use SoftPM only requires identifying such root less control over when and what data persists. To use structures within the code. We evaluated the correct- SoftPM, developers create one or more persistent con- ness, ease of use, and performance of SoftPM using tainers to house a subset of in-memory data that they a suite of microbenchmarks and real world applica- wish to make persistent. They only need to ensure that tions including a distributed MPI application, SQLite (an a container’s root structure houses pointers to the data in-memory database), and memcachedb (a distributed structures they wish to make persistent (e.g. the head of memory cache). In all cases, SoftPM was incorporated a list or the root of a tree). SoftPM automatically dis- with minimal developer effort, was able to store and re- covers data reachable from a container’s root structure cover data successfully, and provide significant perfor- (by recursively following pointers) and makes all new mance speedup (e.g., up to 10X for memcachedb and and modified data persistent. Restoring a container re- 83% for SQLite). turns the container root structure from which all origi- nally reachable data can be accessed. SoftPM thus ob- 1 Introduction viates the need for explicitly managing persistent data Persistence of in-memory data is necessary for many and places no restrictions on persistent data locations in classes of software including metadata persistence in the process’ address space. Finally, SoftPM improves systems software [21, 24, 32, 33, 35, 38, 40, 47, 48, 52, I/O performance by eliminating the need to serialize data 59], application data persistence in in-memory databases and by using a novel chunk-remapping technique which and key-value stores [3, 5], and computational state per- utilizes the property that all container data is memory sistence in high-performance computing (HPC) applica- resident and trades writing additional data for reducing tions [19, 44]. Currently such software relies on the overall I/O latency. persistence primitives provided by the operating system We evaluated a Linux prototypeof SoftPM for correct- (file or block I/O) or a database system. When using ness, ease of use, and performance using microbench- OS primitives, developers need to carefully track persis- marks and three real world applications including a re- tent data structures in their code and ensure the atomic- coverable distributed MPI application, SQLite [5] (a ity of persistent modifications. Additionally they are re- serverless database), and memcachedb (a distributed quired to implement serialization/deserialization for their memory cache). In all cases, we could integrate SoftPM structures, potentially creating and managing additional with little developer effort and store and recover appli- metadata whose modifications must also be made con- cation data successfully. In comparison to explicitly sistent with the data they represent. On the other hand, managing persistence within code, development com- using databases for persistent metadata is generally not plexity substantially reduced with SoftPM. Performance an option within systems software, and when their use is improvements were up to 10X for memcachedb and possible, developers must deal with data impedance mis- 83% for SQLite, when compared to their native, opti- match [34]. While some development complexity is al- mized, persistence implementations. Finally, for a HPC- leviated by object-relation mapping libraries [10], these class matrix multiplication application, SoftPM’s asyn- 1 Function Description int pCAlloc(int magic, int cSSize, void ∗∗ cStruct) create a new container; returns a container identifier int pCSetAttr(int cID, struct cattr ∗ attr) set container attributes; reports success or failure struct cattr ∗ pCGetAttr(int magic) get attributes of an existing container; returns container attributes void pPoint(int cID) create a persistence point asynchronously int pSync(int cID) sync-commit outstanding persistence point I/Os; reports success or failure int pCRestore(int magic, void ∗∗ cStruct) restore a container; populates container struct, returns a container identifier void pCFree(int cID) free all in-memory container data void pCDelete(int magic) delete on-disk and in-memory container data void pExclude(int cID, void ∗ ptr) do not follow pointer during container discovery Table 1: The SoftPM application programmer interface. Network LIMA SID PFS module App1 Container Discovery & Chunk SSD module A Manager Allocator Remapper App2 P Cache module memcachedb I Write Transaction Flusher Appn Handler Handler HDD module Figure 1: The SoftPM architecture. Container Root Usage and the Storage-optimized I/O Driver (SID) as depicted Structure in Figure 1. LIMA’s container manager handles con- struct c root id = pCAlloc(m,sizeof(*cr),&cr); { tainer creation. LIMA manages the container’s persis- list t *l; cr->l = list head; tent data as a collection of memory pages marked for } *cr; pPoint(id); persistence. When creating a persistence point, the dis- covery and allocator module moves any data newly made Figure 2: Implementing a persistent list. pCAlloc reachable from the container root structure and located in allocates a container and pPoint makes it persistent. volatile memory to these pages. Updates to these pages are tracked by the write handler at the granularity of chronous persistence feature provided performance at multi-page chunks. When requested to do so, the flusher close to memory speeds without compromising data con- creates persistence points and sends the dirty chunks to sistency and recoverability. the SID layer in an asynchronous manner. Restore re- quests are translated into chunks requests for SID. 2 SoftPM Overview The SID layer atomically commits container data to SoftPM implements a persistent memory abstraction persistent storage and tunes I/O operations to the under- called container. To use this abstraction, applications lying storage mechanism. LIMA’s flusher first notifies create one or more containers and associate a root struc- the transaction handler of a new persistence point and ture with each. When the application requests a persis- submits dirty chunks to SID. The chunk remapper im- tence point, SoftPM calculates a memory closure that plements a novel I/O technique which uses the property contains all data reachable (recursively via pointers) that all container data is memory resident and trades writ- from the container root, and writes it to storage atomi- ing additional data for reducing overall I/O latency. We cally and (optionally) asynchronously. designed and evaluated SID implementations for hard The container root structure serves two purposes: (i) drive, SSD, and memcached back-ends. it frees developers from the burden of explicitly track- ing persistent memory areas, and (ii) it provides a simple 3 LIMA Design mechanism for accessing all persistent memory data af- ter a restore operation. Table 1 summarizes the SoftPM Persistent containers build a foundation to provide seam- API. In the simplest case, an application would create less memory persistence. Container data is managed one container and create persistence points as necessary within a contiguous container virtual address space, a (Figure 2). Upon recovery, a pointer to a valid container self-describing unit capable of being migrated across root structure is returned. systems and applications running on the same hardware architecture. The container virtual address space is com- 2.1 System Architecture posed solely of pages marked for persistence including The SoftPM API is implemented by two components: those containing application data and others used to store the Location Independent Memory Allocator (LIMA), LIMA metadata. This virtual address space is mapped to 2 Process Virtual Address Space LIMA Virtual Volume SID Physical Volume automatically detecting pointers in process memory, re- Container Chunk Page Table Ind. Map cursively moving data reachable from the container root <page> <chunk> to the container data pages, and fixing any back refer- . ences (other pointers) to the data that was moved. In . our implementation, this process is triggered each time <container> . a persistence point is requested by the application and is executed atomically by blocking all