Non-Volatile Memory in the Storage Hierarchy: Opportunities and Challenges

Dhruva Chakrabarti, Hewlett-Packard Laboratories
2012 Storage Developer Conference. © Hewlett-Packard Company. All Rights Reserved.

From Disks to Flash and Beyond
- The historical situation with non-volatile storage (hard disks): a slow device behind slow hardware interfaces, with further overheads in the software stack.
- Flash memory is a huge leap: orders of magnitude faster than hard disks, but still much slower than main memory. Its capacity, price, and performance fall somewhere in the middle, and software is being optimized for it.
- NVRAM presents even bigger opportunities: it performs like DRAM and exposes an access interface like DRAM.

Approximate Device Characteristics [1, 2]

                        HDD         SSD (NAND flash)   DRAM        NVRAM
  Density (bit/cm^2)    10^11       10^10              10^9        > 10^10
  Retention             > 10 yrs    10 yrs             64 ms       > 10 yrs
  Endurance (cycles)    Unlimited   10^4 - 10^5        Unlimited   10^9
  Read latency          3-5 ms      0.1 ms             < 10 ns     20-40 ns
  Write latency         3-5 ms      100 us             < 10 ns     50-100 ns
  Cost ($/GB)           0.1         2                  10          < 10

[1] International Technology Roadmap for Semiconductors (ITRS): Emerging Research Devices, 2011.
[2] Qureshi et al., Scalable High Performance Main Memory System Using Phase-Change Memory Technology, ISCA 2009.

Storage Class Memory (SCM) [1]
- Combines the benefits of DRAM with the archival capabilities of HDD.
- Two levels of SCM are possible:
  - S-SCM, accessed through the I/O subsystem. Key factors: retention and cost per bit.
  - M-SCM, accessed through the memory subsystem. Key factors: access latency and endurance.
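The write-latency gap in the table above can be made concrete with a quick back-of-the-envelope calculation. The constants below are approximate midpoints of the table's ranges (assumed for illustration, not measured values):

```cpp
// Approximate write-latency midpoints from the device characteristics table.
constexpr double kHddWrite   = 4e-3;    // HDD:   3-5 ms
constexpr double kFlashWrite = 100e-6;  // Flash: ~100 us
constexpr double kNvramWrite = 75e-9;   // NVRAM: 50-100 ns
constexpr double kDramWrite  = 10e-9;   // DRAM:  < 10 ns

// NVRAM vs. HDD: roughly four orders of magnitude faster (~53,000x).
constexpr double nvram_vs_hdd   = kHddWrite / kNvramWrite;

// NVRAM vs. flash: roughly three orders of magnitude faster (~1,300x).
constexpr double nvram_vs_flash = kFlashWrite / kNvramWrite;

// NVRAM vs. DRAM: less than an order of magnitude slower (~7.5x).
constexpr double nvram_vs_dram  = kNvramWrite / kDramWrite;
```

This is the sense in which the deck says NVRAM "performs like DRAM": the penalty for making a write durable shrinks from milliseconds to tens of nanoseconds.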
Access Interface Choices for SCM
- Block interface:
  - Operating system overheads [3]
  - Legacy optimizations for disks
  - SSD-specific optimizations
- Byte-addressable CPU load/store model:
  - Fast (occurs at CPU speed)
  - Fine-granularity persistence, potentially immediate
  - But more exposed to failure-related consistency issues

[3] Caulfield et al., Moneta: A High-Performance Storage Array Architecture for Next-Generation, Non-Volatile Memories, MICRO 2010.

Architectural Model for NVRAM
(Diagram: the CPU's store buffer feeds the caches, which front both DRAM and NVRAM.)
- Both DRAM and NVRAM may co-exist.
- Volatile buffers and caches are still present.
- Updates may linger within volatile structures.

Failure Models
- Fail-stop (processes need to be crash-tolerant); a large percentage of failures are indeed fail-stop [4].
- Byzantine (BFT techniques have been studied).
- Arbitrary state corruption (hardening techniques exist [5]).
- Memory is vulnerable during system/application crashes, but memory protection can achieve a high degree of reliability [6].
- Requirement: a store to memory must be failure-atomic.
- Invariant: data in buffers and caches do not survive a failure.

[4] Chandra et al., How Fail-Stop Are Faulty Programs?, FTCS 1998.
[5] Correia et al., Practical Hardening of Crash-Tolerant Systems, USENIX ATC 2012.
[6] Chen et al., The Rio File Cache: Surviving Operating System Crashes, ASPLOS 1996.

NVRAM Opportunities and Challenges
Candidate uses (general programming): persistent data structures, logging, HPC-style checkpointing, SQL databases, the OS (e.g., filesystems).
- Opportunities:
  - Achieve durability practically for free: no interface translation, and low write latencies.
  - Reuse and share durable data.
- Challenges:
  - How do we keep persistent data consistent?
  - What is the programming complexity?
Candidate uses also extend to models requiring more flexibility.

Visibility Ordering Requirements
(Diagram: updates drain from the volatile cache to NVRAM in the order Allocate, Initialize, Publish; a crash strikes after Publish reaches NVRAM but before Initialize does.)
- A crash may leave a pointer to uninitialized memory.

Potential Inconsistencies
- Wild pointers
- Pointers to uninitialized memory
- Incomplete updates
- Violation of data structure invariants
- Persistent memory leaks

A Quick Detour
Do we have an analog in multithreading? Initially: volatile int x = 10, *y = NULL; int r = 0.

  Thread 1:        Thread 2:
  x = 15;          if (y)
  y = &x;              r = *y;

Can r == 10? Yes, if the stores are reordered. Either add fences between the stores or use C++11 atomics.

Back to NVRAM and Visibility Issues
- How do we ensure that a store x is visible in NVRAM before a store y?
- Insert a cache-line flush to ensure visibility in NVRAM: similar to a memory fence, and reminiscent of a disk cache flush.
- The unsafe sequence Allocate; Initialize; Publish becomes Allocate; Initialize; flush; fence; Publish; flush; fence.

Flavors of Cache Line Flushes
- x86: clflush (flushes and invalidates a cache line)
- Power: dcbst (flushes a data cache block), dcbf (flushes and invalidates a data cache block)
- IA-64: fc (flushes and invalidates a cache line)
- ARM: separate operations for flush and for flush-and-invalidate are provided
- UltraSPARC: block store (flushes and invalidates)

Issues with Cache Flushes
- Must honor the intended semantics: volatile buffers in the memory hierarchy must be flushed, and invalidation should be separated from flush.
- Processor instruction or a different API?
- Cost
- Granularity
- How to track what to flush
- Can we use existing binaries on NVRAM? Is a recompilation sufficient?
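The flush-and-fence discipline above can be sketched in C++. Here `flush_cache_line` is a hypothetical stand-in for the machine-specific instruction (clflush on x86, dcbf on Power, etc.); it is modeled with a release fence so the sketch compiles and runs anywhere, with a comment marking where the real flush would go.

```cpp
#include <atomic>

struct Node {
    int value;
};

// Hypothetical flush primitive (assumption, not a real API). On x86 this
// would issue _mm_clflush(addr) followed by a store fence; here only the
// ordering side is modeled so the sketch stays portable.
inline void flush_cache_line(const void* addr) {
    (void)addr;  // real code: _mm_clflush(addr);
    std::atomic_thread_fence(std::memory_order_release);
}

// Imagine this root pointer lives in NVRAM.
Node* g_persistent_root = nullptr;

// Allocate -> Initialize -> flush/fence -> Publish -> flush/fence.
// Without the first flush, a crash could persist the published pointer
// while the initialized contents still sit in a volatile cache.
void publish(int v) {
    Node* n = new Node;                    // Allocate
    n->value = v;                          // Initialize
    flush_cache_line(n);                   // Push the contents toward NVRAM first
    g_persistent_root = n;                 // Publish
    flush_cache_line(&g_persistent_root);  // Then push the pointer itself
}
```

If a crash lands between the two flushes, recovery sees either no root or a fully initialized node, never a pointer to uninitialized memory.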
Alternatives
- Use other caching modes, such as uncacheable and write-through. Concerns: cost, and the x86 memory model is not well understood in these modes.
- Weaker store ordering.

Other Requirements
- Higher-level abstractions.
- Interactions with other threads and processes.
- Memory management: store ordering does not prevent memory leaks; garbage collection is a necessity, but is it always possible?
- Failure-atomicity enforcement: what is the right granularity? Should we wrap all NVRAM stores within transactions? That carries a significant performance cost, but is still significantly faster than persisting on block devices.
- Creation, identification, and listing of persistent data.

Conclusion
- Non-volatility is moving closer to the CPU.
- Byte-addressability offers significant benefits, but failures complicate matters.
- What is the right API? How much additional programmer effort does it require? What are the costs of implementing it?
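To make the failure-atomicity question above concrete, here is a minimal undo-logging sketch. It is an illustration only, not any particular product's API; `flush` is again a hypothetical stand-in for a cache-line flush plus fence, and both the log and the target are imagined to live in NVRAM. The old value is made durable before the in-place update, so recovery can roll back a half-finished store.

```cpp
#include <atomic>

// Hypothetical flush: stands in for clflush + fence on real hardware.
inline void flush(const void* addr) {
    (void)addr;
    std::atomic_thread_fence(std::memory_order_release);
}

// A one-entry undo log; imagine it living in NVRAM alongside the data.
struct UndoLog {
    int* target    = nullptr;
    int  old_value = 0;
    bool valid     = false;  // set only while an update is in flight
};

UndoLog g_log;

// Failure-atomic store: log the old value, flush, update, flush, retire.
void failure_atomic_store(int* target, int new_value) {
    g_log.target    = target;
    g_log.old_value = *target;
    g_log.valid     = true;
    flush(&g_log);        // undo record must be durable before the update
    *target = new_value;  // in-place update
    flush(target);
    g_log.valid = false;  // update is durable; retire the undo record
    flush(&g_log);
}

// Recovery after a crash: a valid log entry means the update may be
// half-done, so roll the target back to the logged old value.
void recover() {
    if (g_log.valid) {
        *g_log.target = g_log.old_value;
        flush(g_log.target);
        g_log.valid = false;
        flush(&g_log);
    }
}
```

Wrapping every NVRAM store this way multiplies the writes and flushes, which is the significant performance cost noted above; transactional interfaces amortize that cost over groups of stores.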
