Non-volatile DIMMs: Datapath and Management models

Paul von Behren / Intel Corporation Introduction to NVDIMM hardware Hardware supporting persistent memory (pmem)

Yesterday: battery backed RAM Today: NVDIMMs with DRAM and flash

. On power down, RAM copied to flash; on power up, copy back to RAM Emerging NVDIMMs: Phase Change Memory, Memristor, many others

. Offer ~ 1000x speed-up over NAND, closer to DRAM Characteristics as seen by software

. Load/Store (memory instructions) accessible

. Would reasonably stall CPU for a load instruction

. No need for paging (at least not by the OS)

DMTF APTS 2015 3 Software access to pmem

Could treat pmem like disks/SSDs . Existing software works, faster than with flash . But we still have block stack latency (Intel SSD study)

*

DMTFWith APTS 2015 Next Generation NVM, hardware is no longer the bottleneck 4 New programming model for byte- addressable persistent memory (pmem) SNIA goals for datapath programming

Define a programming model for direct access to pmem . No kernel code in data path, use existing load/store instructions Use a general approach for different types of pmem hardware . Built into OSes . Use existing OS solutions where appropriate – E.g., use existing file permissions rather than invent something new Specify behavior, not a specific API . Allow OS developers to implement APIs appropriate to the OS Support application developer goals for power-fail safe atomicity

DMTF APTS 2015 6 Legacy memory mapped files

Background: memory Application

mapped files backed by block Standard User devices File API Space . POSIX mmap(), Windows MapViewOfFile() Legacy . Disk file mapped to File System Kernel . Paged to memory when referenced (by load/store Space instructions . msync() flushes dirty pages to disk

DMTF APTS 2015 7 Persistent memory programming model

With pmem, no paging between Application persistence and volatile memory Standard Load/ User File API Store Space Memory map command causes

pmem file to be mapped to app's pmem-Aware MMU File System Mappings virtual memory Kernel Space Sync command flushes CPU

Load/Store commands directly Persistent Memory access pmem Standard read/write APIs still work − Perhaps less performant

DMTF APTS 2015 8 How this programming model meets goals

• pmem-aware file system allows pmem to be used by multiple applications, permissions controlled by admin • Applications have direct access to pmem using load/store instructions • Applications have control over CPU cache • As with block storage, applications (esp. databases) want to control cache – flush frequently for high durability, less frequently for low latency • Works with existing (C/C++) compilers • Potential for friendlier programming model with language extensions, but we don’t want to require language and compiler extensions • Works with existing hardware • Tracking emerging pmem HW features, but don’t require them

DMTF APTS 2015 9 Status of programming model

Included in SNIA NVM Programming Model specification • http://www.snia.org/tech_activities/standards/curr_standards/npm SNIA NVM Programming Technical Workgroup • Updating and extending the model • Researching impact on transaction managers and distributed access to pmem Supported by Linux kernel 4.2 • New –dax mount option for pmem-aware behavior in ext4 filesystem User-space libraries building on the pmem programming model • NVML (pmem.io/nvml) libraries supporting transactions and other usage models • NVM Direct (https://github.com/oracle/NVM-Direct)

DMTF APTS 2015 10 NVDIMM Management model Impact of NVDIMMs on management tasks

Today's volatile memory . Discovered by OS and treated by SW as single pool NVDIMMs introduce "flavors" of memory . NVDIMMs support persistent byte-addressable memory or block access (i.e., emulate disks/SSDs) . Memory controllers may allow – Configuration of a subset as byte-addressable persistent memory, block, or as volatile – QoS-driven provisioning (such as DIMM mirroring)

DMTF APTS 2015 NVDIMMs (and other advances) add management complexity

. NVDIMM Pool topology (configured as volatile, persistent byte addressable, persistent block addressable, or unused) – Allow admin to allocate pmem capacity to pmem-aware applications . Ability to configure pool topology (allocate from unused to …) – QoS provisioning . Allocation of memory volumes from pools . Health reporting . Asynchronous notifications . Firmware updates . Physical Topology (NUMA affinity to CPU sockets)

DMTF APTS 2015 13 Emerging NVDIMM management standards

CIM profiles from DMTF and SNIA • DMTF System Memory (DSP1026) existing profile limited to volatile memory • Concern in DMTF that extending this profile could break existing clients • DMTF Multi-type System Memory (DSP1071) (in progress) • reports memory regions and associated QoS attributes, • plus affinity and visibility of regions to processors • SNIA SMI-S 1.7 (in progress) includes two memory configuration profiles • Memory Configuration Profile • Assign NVDIMM capacity to volatile pool or to persistent pools • Persistent Memory Configuration Profile • Carve persistent pools into namespaces (which SW treats as volumes) • Final draft available “soon” (http://www.snia.org/tech_activities/publicreview)

DMTF APTS 2015 14 DMTF Multi-Type Memory profile model

See DSP1071 for details

ComputerSystem 1 1 RegisteredProfile SystemDevice ElementConformsToProfile 1..* (See Referencing 1 (See Profile Registration Profile) Profile)

Processor 1 1

(CPU Profile) SystemDevice 1..* MemoryAllocationSettings (SNIA:Memory Configuration 1..* ConcreteDependency Profile)

1 1 VisibleMemory 1..* ElementSettingData ElementConformsToProfile 1..* MemoryController 1..* 1..*

BasedOn ElementAllocatedFromPool 1..* 1 1..* ResourcePool 1..* RawMemory AssociatedMemory ConcreteComponent 1..* 1 (SNIA:Memory Configuration Profile) 1

Realizes

1

Physical Memory

(Physical Asset Profile)

DMTF APTS 2015 Configure Memory See SMI-S 1.7 Memory Configuration profile

Computer System MemoryConfigurationService 1 ElementCapabilities (see referencing profile) 1 1 1 1 MemoryConfigurationCapabilities HostedService ServiceAffectsElement MemoryAllocation SystemDevice ElementAllocated SettingData FromPool 1 1 ResourcePool MemoryCapabilities 1 * Element 1 Element 1..* Capabilities SettingData VisibleMemory 1 1 (see DMTF Multi-type Memory Profile)

1..* Element BasedOn Concrete SettingData 1..* Component RawMemory MemoryAllocationSettingData (see DMTF Multi-type Memory Profile) 1..*

Primary Use Case: allocate visible memory from a pool (of raw memory) • MemoryConfigurationService iAllocateFromPool() s used to request an allocation. • The end result of the allocation is the new CIM_VisibleMemory and CIM_MemoryAllocationSettingData instances. • The CIM_VisibleMemory instance is associated to the CIM_ResourcePool by an ElementAllocatedFromPool association.

DMTF APTS 2015 Configure Persistent Memory See SMI-S 1.7 Persistent Memory Configuration profile

Computer System PersistentMemoryService 1 ElementCapabilities (see referencing profile) 1 PersistentConfigurationCapabilities 1 1 PersistentMemory HostedService ServiceAffectsElement NamespaceSettingData

1 SystemDevice ResourcePool * 1 Element PersistentMemoryCapabilities ElementAllocated 1 Element FromPool Capabilities SettingData * ConcreteComponent 1 PersistentMemoryNamespace 1 BasedOn VisibleMemory

1..* (see DMTF Multi-type Memory Profile)

Primary Use Case: Create a persistent namespace • Enumerate instances of MemoryResource, find a pool that has sufficient remaining capacity and who’s associated CIM_PersistentMemoryCapabilities indicates support for the desired quality of service. • Follow the CIM_ServiceAffectsElement association to the PersistentMemoryService instance that handles the selected pool. • Utilize the AllocateFromPool() extrinsic, specify pool, desired size, quality of service parameters.

DMTF APTS 2015 Takeaways

• NVDIMM are available now • and will be more and more common in the future • will offer more configuration options • support for NVDIMMs is emerging • Applications are being rewritten to utilize NVDIMMs where available • Platform vendors are planning NVDIMM configuration tools, health and fault monitoring, topology reporting • Some NVDIMM technologies expected to support dynamic configuration • E.g., automation/orchestration that can change balance of volatile to persistent memory in response to workload changes [email protected]

DMTF APTS 2015 18