Persistent Memory for Artificial Intelligence

Bill Gervasi, Principal Systems Architect, [email protected]

Demand Outpacing Capacity

In-memory computing, artificial intelligence, machine learning, and deep learning are driving memory demand faster than DRAM capacity can grow.

[Chart: Memory Demand outpacing DRAM Capacity]

Driving New Capacity Models

Non-volatile memories: the industry is successfully snuggling large memories up to the processors…

[Chart: Memory Demand versus DRAM Capacity]

…but we can do oh! so much more.

My Three Talks at FMS

NVDIMM Analysis
Memory Class Storage
Artificial Intelligence

History of Architectures

Let’s go back in time…

Historical Trends in Computing

[Timeline: historical trends in computing, including edge computing, co-processing, and power failure loss]

Some Moments in History

Central Processing: shared processor, dumb terminals
Distributed Processing: processor per user, peer-to-peer networks

Some Moments in History

Central Processing: “Native Signal Processing” with main CPU drivers and cheap analog I/O
Distributed Processing: tightly-coupled coprocessing with Hercules graphics, Sound Blaster audio, Rockwell modem, and Ethernet DSP

Santa Clara, CA August 2018 8 The Lone Survivor…

Integrated graphics and graphics add-in cards …survived the NSP war.

Some Moments in History

Central Processing: phone providers controlled all data processing
Distributed Processing: phone apps provide local services

Edge computing reduces latency.

When the Playing Field Changes

The speed of networking directly impacts the pendulum swing from centralized to distributed.

A faster network favors centralized processing; slower networks push work toward the edge.

Winners and Losers

Often, the maturity of the software development environment determined who won and who lost

Maintaining an Edge

[Chart: coprocessor versus CPU performance over time. Mediocre software gives the coprocessor an early edge ("See how great it is???") that the main CPU erases over time ("Oops!!!")]

The Tail Wagging the Dog

I won’t say “It’s the Software, Stupid” because I know you’re not stupid

however

To succeed, AI needs GREAT software infrastructure

Driving some companies to design hardware to the software instead of software to the hardware

Wild Array of Programmer Options

AI on Traditional Server

No magic: AI applications are like any other.
Data processing is done on the main CPU.
Downside: the main CPU is overkill in floating point and weak in parallelism.

[Diagram: Disk attached to CPU]

AI Evolution

Addition of an AI accelerator offloads the main CPU for AI tasks.

[Diagram: Disk, CPU, and AI Accelerator]

AI Evolution

AI Accelerator Characteristics:
Wide array of simple processing elements
Reduced floating point precision
Tuned for matrix operations

[Diagram: "Goesinta" data is distributed across an array of processing elements ("Proc") and recombined into "Goesouta"]
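To make these characteristics concrete, here is a minimal sketch (my own illustration, not from the slides) of the distribute-and-recombine pattern: a matrix multiply is split across an array of simple processing elements working in reduced precision, with NumPy's float16 standing in for the accelerator's narrow arithmetic.

```python
import numpy as np

def accelerator_matmul(a, b, num_elements=8):
    """Sketch of an accelerator-style matrix multiply: distribute rows
    across simple processing elements, compute in reduced precision,
    then recombine the partial results."""
    a16 = a.astype(np.float16)          # reduced floating point precision
    b16 = b.astype(np.float16)

    # Distribute: each processing element gets a block of rows
    blocks = np.array_split(a16, num_elements, axis=0)

    # Each simple element does nothing but a small matrix operation
    partials = [block @ b16 for block in blocks]

    # Recombine the partial results into the full product
    return np.vstack(partials)

if __name__ == "__main__":
    a = np.random.rand(64, 32)
    b = np.random.rand(32, 16)
    c = accelerator_matmul(a, b)
    print(c.shape)                       # (64, 16)
    print(np.max(np.abs(c - a @ b)))     # the precision cost of float16
```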

Santa Clara, CA August 2018 18 In-Memory Computing

In-memory computing lets the AI accelerator control the memory Disk directly

AI Accelerator CPU

Also great for encryption

Data Processing Paradigms

Traditional database: data access, data reliability
New paradigms: data mining, inferencing, fuzzy logic, recognition, etc.

The Actualization Gap

[Diagram: the gap between research projects and deployments]

The “Research” Projects

Many interconnects between storage elements and processing elements.
Weighted calculations produce parallel possible results.
A focus for a number of startup companies.

[Diagram: array of cells (R) with many interconnects]

What Most People Are Mostly Building

Dense matrix memory for highest storage capacity
Pipes for networking
Shared among many execution units

Practical I/O Connection Limits

Any-to-any would be awesome; a toroid is a more practicable solution.
This limits how quickly data can flow in and out.
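As a back-of-the-envelope illustration of why the toroid wins (my own numbers, not from the slides), the sketch below counts the links an any-to-any mesh would need versus a 2-D toroid of the same size:

```python
def any_to_any_links(n):
    """Any-to-any: every processing element wired to every other one."""
    return n * (n - 1) // 2

def toroid_2d_links(rows, cols):
    """2-D toroid: each element wired to 4 neighbors with wrap-around,
    so each node 'owns' one eastward and one southward link."""
    return 2 * rows * cols

if __name__ == "__main__":
    rows, cols = 16, 16
    n = rows * cols
    print(f"{n} elements, any-to-any: {any_to_any_links(n)} links")       # 32640
    print(f"{n} elements, 2-D toroid: {toroid_2d_links(rows, cols)} links")  # 512
```

The toroid's fixed per-node fan-out is exactly the connection limit that caps how quickly data can flow in and out of the array.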

Network Utilization

[Flow: RESET → fill processors with code → fill caches with model seed data → send new input data → process input data through model → retrieve results → time to checkpoint the model? If yes, retrieve the updated model data (consumes I/O); if no, loop back for more input. A power fail breaks the loop.]
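A minimal sketch of that loop (hypothetical names and an assumed checkpoint interval, not from the slides): the periodic checkpoint consumes I/O, but it is what bounds the work lost if power fails mid-run.

```python
import pickle
import time

CHECKPOINT_PATH = "model.ckpt"   # assumed location for checkpoints
CHECKPOINT_INTERVAL = 60.0       # seconds between checkpoints (assumed)

def run(model_state, process, input_stream):
    """Process input data through the model, checkpointing periodically.
    `process` takes (state, batch) and returns (new_state, results)."""
    last_checkpoint = time.monotonic()
    for batch in input_stream:
        model_state, results = process(model_state, batch)   # run the model
        yield results                                         # retrieve results

        # Time to checkpoint the model?  Checkpointing consumes I/O, but
        # anything computed since the last checkpoint is lost on power fail.
        if time.monotonic() - last_checkpoint >= CHECKPOINT_INTERVAL:
            with open(CHECKPOINT_PATH, "wb") as f:
                pickle.dump(model_state, f)
            last_checkpoint = time.monotonic()
```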

Lossless Versus Lossy

Persistent data: reload needed

Transient data: reload, restart calculations

Accumulated data: modified models are expensive to rebuild

Time to reload is always an issue.

Recovering From Power Fail

Data is pulled from main memory …or worse… from the backing store.

[Diagram: Disk, CPU, backing store]

Data requires multiple hops through the interconnects.
Not uncommon for data reload to take 3 minutes or more before recalculation can begin!
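For a rough sense of scale (illustrative numbers of my own, not from the slides), a sketch of how a multi-hop reload stretches into minutes:

```python
def reload_seconds(model_bytes, link_gbps, hops, efficiency=0.7):
    """Crude estimate of reload time after power fail: the working set must
    cross several interconnect hops, each eating into effective bandwidth."""
    effective_gbps = link_gbps * efficiency / hops   # simplistic hop penalty
    return model_bytes * 8 / (effective_gbps * 1e9)

if __name__ == "__main__":
    # Assumed: 128 GB working set, 25 Gb/s links, 3 hops (disk -> CPU -> accelerator)
    secs = reload_seconds(128e9, 25, 3)
    print(f"~{secs / 60:.1f} minutes before recalculation can begin")  # ~2.9 minutes
```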

Distributed Memory Complications

[Diagram: array of distributed memory cells (R)]

Distributed cells complicate download time into the arrays.
This may help explain the gap between research projects and actual deployments.

Santa Clara, CA August 2018 28 Persistent Main Memory

NVDIMMs are moving data persistence to the main memory bus, and in some cases increasing memory capacity.

[Diagram: Disk and CPU, with NVDIMMs on the memory bus]

See my other talk later this week.
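To show what persistence on the memory bus looks like to software, here is a minimal sketch (the /mnt/pmem0 path and size are my assumptions; production code would typically use a DAX-mounted filesystem or a PMDK-style library): state updated through a memory map to NVDIMM-backed media is simply still there after a power cycle.

```python
import mmap
import os

PMEM_PATH = "/mnt/pmem0/model_state.bin"   # hypothetical NVDIMM-backed file
SIZE = 64 * 1024 * 1024                    # 64 MB working area (assumed)

def open_pmem(path=PMEM_PATH, size=SIZE):
    """Map a persistent region into the address space; stores into the
    returned buffer land in memory that survives power loss."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    os.ftruncate(fd, size)
    return fd, mmap.mmap(fd, size)

if __name__ == "__main__":
    fd, buf = open_pmem()
    buf[0:5] = b"hello"   # update state in place; no separate save step
    buf.flush()           # push the stores out to the persistent media
    buf.close()
    os.close(fd)
```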

Cost of Power Failure

Statistics vary, but all agree… downtime costs a LOT.

Persistent Memory

DRAM: loses data, must be refreshed, can't lose power
Persistent Memory: holds data forever, even on power fail

Nantero NRAM™

Nantero NRAM is a persistent memory using carbon nanotubes to build resistive arrays, which can be arranged in a DRAM-compatible device (DDR4, DDR5, HBM).

See my other talks later this week.

Classes of Persistent Memory

                 DRAM      NRAM      MRAM     ReRAM    FeRAM    Flash     PCM / 3DXpoint
Non-volatility   No        Yes       Yes      Yes      Yes      Yes       Yes
Endurance        No limit  No limit  Limited  Limited  Limited  No limit  10^3
Read Time        10 ns     10 ns     X        X        X        X         50M ns
Write Time       10 ns     10 ns     X        X        X        X         25M ns

Class: Memory & Storage Class Memory / Storage / Memory Class Storage

See my other talks later this week.

Applying Persistent Memory

Replacing DRAM with persistent memory completely eliminates the need to reload on power fail.
Next generation persistent memory will target SRAM, too.
Persistent shadow registers aren't such a bad idea, either.
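A sketch of what "eliminates the need to reload" means for the recovery path (my own illustration, with a hypothetical persistent-memory location and a CRC check): if the state in persistent memory is intact, the system resumes immediately instead of spending minutes reloading.

```python
import os
import pickle
import zlib

STATE_PATH = "/mnt/pmem0/ai_state.bin"   # hypothetical persistent-memory region

def save(state):
    """Write state with a CRC so recovery can verify it survived intact."""
    blob = pickle.dumps(state)
    with open(STATE_PATH, "wb") as f:
        f.write(zlib.crc32(blob).to_bytes(4, "little"))
        f.write(blob)

def recover_or_rebuild(rebuild_fn):
    """After power fail: resume from persistent memory if the state checks out,
    otherwise fall back to the slow reload/rebuild path."""
    if os.path.exists(STATE_PATH):
        with open(STATE_PATH, "rb") as f:
            stored_crc = int.from_bytes(f.read(4), "little")
            blob = f.read()
        if blob and zlib.crc32(blob) == stored_crc:
            return pickle.loads(blob)     # fast path: no reload needed
    return rebuild_fn()                   # slow path: pull everything back in
```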

NRAM for Main Memory

NRAM replaces DDR4 and DDR5 for main memory.

[Diagram: CPU with NRAM as main memory]

Enables the New Architectures

NRAM cells in the array

Permanent storage through power fail

Programmed once during manufacturing, no reload

NRAM Everywhere

Soon we will look back and say

“Remember when data was lost when power went out?”

and laugh

Full Disclosure

My first home computer had an 8” floppy disk.

I earned my gray hair

Summary

Centralized versus distributed computing is a long-term cycle.

Quality of software infrastructure typically determines the winner

Artificial intelligence accelerators are a recent co-processing addition

Data loss on power failure is worsened by AI architectures

Persistent memory in AI devices solves major problems.

Nantero NRAM addresses many usages of PM in AI systems

If you remember 8” floppies, you probably can’t read this screen

Questions?

Bill Gervasi, Principal Systems Architect, [email protected]
