Persistent Memory for Artificial Intelligence
Bill Gervasi, Principal Systems Architect ([email protected])
Santa Clara, CA, August 2018

Demand Outpacing Capacity
In-memory computing, artificial intelligence, machine learning, and deep learning are all driving memory demand upward, and that demand is outpacing DRAM capacity.

Driving New Capacity Models
Non-volatile memories.
The industry is successfully snuggling large memories up to the processors… but we can do oh! so much more.

My Three Talks at FMS
NVDIMM Analysis
Memory Class Storage
Artificial Intelligence

History of Architectures
Let’s go back in time…

Historical Trends in Computing
The pendulum has swung between edge computing and co-processing… and in every era, power failure has meant data loss.

Some Moments in History
Central processing: a shared processor with dumb terminals.
Distributed processing: a processor per user; peer-to-peer networks.

Some Moments in History
Central processing: “Native Signal Processing”, meaning main CPU drivers with cheap analog I/O.
Distributed processing: tightly-coupled coprocessing, such as Hercules graphics, Sound Blaster audio, Rockwell modems, and Ethernet DSPs.

The Lone Survivor…
Graphics, both integrated and add-in cards, survived the NSP war.

Some Moments in History
Central processing: phone providers controlled all data processing.
Distributed processing: phone apps provide local services; edge computing reduces latency.

When the Playing Field Changes
The speed of networking directly impacts the pendulum swing from centralized to distributed
A faster network favors distributed computing
Winners and Losers
Often, the maturity of the software development environment determined who won and who lost
Maintaining an Edge
A coprocessor launches with a performance edge (“See how great it is???”), but mediocre software lets the main CPU close the gap over time (“Oops!!!”).

The Tail Wagging the Dog
I won’t say “It’s the Software, Stupid” because I know you’re not stupid… however:
To succeed, AI needs GREAT software infrastructure.
This is driving some companies to design hardware to the software instead of software to the hardware.

Wild Array of Programmer Options

AI on Traditional Server
No magic: AI applications are like any other.
Data processing is done on the main CPU.
The downside: the main CPU is overkill in floating point, yet weak in parallelism.
AI Evolution
Addition of an AI accelerator, which offloads the main CPU for AI tasks.

AI Evolution
AI accelerator characteristics:
A wide array of simple processing elements
Reduced floating-point precision
Tuned for matrix operations
Data “goesinta” the array, is distributed across the processing elements, recombined, and “goesouta” the other side.
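These characteristics can be illustrated with a minimal sketch (plain Python, with a made-up quantization step; a real accelerator would run the per-element multiplies across its array of processing elements in hardware): operands are quantized to reduced precision before a matrix multiply.

```python
# Sketch: reduced-precision matrix multiply, in the style of an AI
# accelerator. Pure Python for illustration; the 1/256 step is a
# hypothetical stand-in for reduced floating-point precision.

def quantize(x, step=1.0 / 256):
    """Round a value to a coarse grid, mimicking reduced precision."""
    return round(x / step) * step

def matmul_reduced(a, b):
    """Multiply matrices a (m x k) and b (k x n) with quantized operands."""
    m, k, n = len(a), len(b), len(b[0])
    aq = [[quantize(v) for v in row] for row in a]
    bq = [[quantize(v) for v in row] for row in b]
    return [[sum(aq[i][p] * bq[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]

a = [[0.5, 0.25], [0.125, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
print(matmul_reduced(a, identity))  # values on the 1/256 grid survive exactly
```

Each of the m × n output sums is independent, which is exactly what lets an accelerator spread them across many simple processing elements.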
In-Memory Computing
In-memory computing lets the AI accelerator control the memory directly.
Also great for encryption.
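As a loose software analogy (illustrative only, not the hardware mechanism): in-memory computing transforms data where it lives rather than copying it through another agent. The in-place XOR below also hints at why the model suits encryption.

```python
# Loose analogy for in-memory computing: operate on a buffer in place,
# with no intermediate copy through another agent. The XOR "cipher" is
# a toy stand-in for real encryption.

buf = bytearray(b"secret data to encrypt")
view = memoryview(buf)   # a zero-copy window onto buf

KEY = 0x5A
for i in range(len(view)):   # "encrypt" in place
    view[i] ^= KEY

print(buf != b"secret data to encrypt")  # True: buffer changed, no copy made

for i in range(len(view)):   # XOR again to "decrypt"
    view[i] ^= KEY
print(bytes(buf))  # b'secret data to encrypt'
```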
Data Processing Paradigms
Traditional database: data access, data reliability.
New paradigms: data mining, inferencing, fuzzy logic, recognition, etc.
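To make the contrast concrete, here is a toy sketch (hypothetical thresholds, not from the talk): a traditional query answers with a hard yes/no, while fuzzy logic returns a degree of membership.

```python
# Toy contrast between the paradigms: a traditional database answers an
# exact predicate; fuzzy logic assigns a degree of membership.
# Thresholds below are made-up illustrative values.

def is_hot_exact(temp_c):
    """Traditional predicate: a hard threshold, yes or no."""
    return temp_c >= 30

def is_hot_fuzzy(temp_c, low=20.0, high=35.0):
    """Fuzzy membership: 0.0 below `low`, 1.0 above `high`, linear between."""
    if temp_c <= low:
        return 0.0
    if temp_c >= high:
        return 1.0
    return (temp_c - low) / (high - low)

print(is_hot_exact(29))  # False: just under the hard threshold
print(is_hot_fuzzy(29))  # 0.6: "somewhat hot"
```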
The Actualization Gap
There is a gap between research projects and actual deployments.

The “Research” Projects
Many interconnects between storage elements and processing elements.
Weighted calculations produce an array of parallel possible results (R).
This is a focus for a number of startup companies.
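A minimal sketch of what such an array computes (plain Python, made-up weights and inputs): each result cell R accumulates a weighted combination of all inputs, and every cell could evaluate in parallel.

```python
# Sketch of the research-style array: each result cell R is a weighted
# combination of all inputs, so all cells can compute simultaneously.
# Weights and inputs are hypothetical illustrative values.

def crossbar(inputs, weights):
    """One result per weight row: R_j = sum_i weights[j][i] * inputs[i]."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

inputs = [1.0, 0.5, -1.0]
weights = [            # 4 result cells x 3 inputs (made-up values)
    [0.2, 0.4, 0.1],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.5],
]
print(crossbar(inputs, weights))
```

The many-interconnect requirement is visible in the math: every result cell needs every input, which is why wiring these arrays is the hard part.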
What Most People Are Mostly Building
Pipes for networking.
Dense matrix memory for highest storage capacity.
A shared memory controller for many execution units.
Practical I/O Connection Limits
An any-to-any interconnect would be awesome; a toroid is a more practicable solution.
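Why the toroid is the practicable choice: any-to-any wiring grows quadratically with node count, while a 2D torus gives every node exactly four links no matter how large the array. A quick sketch (plain Python, arbitrary example dimensions):

```python
# Link-count comparison: any-to-any interconnect vs. a 2D torus.
# The 8x8 array size is an arbitrary illustrative choice.

def any_to_any_links(n):
    """Every node wired to every other node: n*(n-1)/2 links."""
    return n * (n - 1) // 2

def torus_neighbors(x, y, w, h):
    """The four neighbors of node (x, y) in a w x h torus; edges wrap."""
    return [((x - 1) % w, y), ((x + 1) % w, y),
            (x, (y - 1) % h), (x, (y + 1) % h)]

def torus_links(w, h):
    """Four links per node, each link shared by two nodes."""
    return w * h * 4 // 2

n = 64                       # e.g. an 8 x 8 array of elements
print(any_to_any_links(n))   # 2016 links
print(torus_links(8, 8))     # 128 links
```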
This limits how quickly data can flow in and out.

Network Utilization
RESET → fill processors with code → fill caches with model seed data.
Send input data → process the input through the model → retrieve results → send new input data.
On a power fail: is there time to checkpoint the model? If yes, retrieve the updated model data (which consumes I/O); if no, the model state is lost.

Lossless Versus Lossy
Persistent data: reload needed
Transient data: reload, restart calculations
Accumulated data: modified models are expensive to rebuild
Time to reload is always an issue.

Recovering From Power Fail
Data is pulled from main memory… or worse, from the backing store.
Data requires multiple hops through the interconnects.
It is not uncommon for a data reload to take 3 minutes or more before recalculation can even begin!

Distributed Memory Complications
Distributed cells (R) complicate download time into the arrays.
This may help explain the gap between research projects and actual deployments.
Persistent Main Memory
NVDIMMs are moving data persistence to the main memory bus, and in some cases increasing memory capacity.
See my other talk later this week.

Cost of Power Failure
Statistics vary, but all agree… downtime costs a LOT.

Persistent Memory
DRAM: loses data; must be refreshed; can’t lose power.
Persistent memory: holds data forever, even on power fail.

Nantero NRAM™
Nantero NRAM is a persistent memory using carbon nanotubes to build resistive arrays, which can be arranged in a DRAM-compatible device (DDR4, DDR5, HBM).
See my other talks later this week.

Classes of Persistent Memory
                 DRAM      NRAM      MRAM     ReRAM    FeRAM    PCM/3DXpoint  Flash
Non-volatility   No        Yes       Yes      Yes      Yes      Yes           Yes
Endurance        No limit  No limit  Limited  Limited  Limited  No limit      10³
Read time        10 ns     10 ns     X        X        X        X             50M ns
Write time       10 ns     10 ns     X        X        X        X             25M ns

These devices span the range from memory, through storage class memory and memory class storage, to storage.
See my other talks later this week.

Applying Persistent Memory
Replace DRAM with persistent memory: this completely eliminates the need to reload on power fail.
Next-generation persistent memory will target SRAM, too.
Persistent shadow registers aren’t such a bad idea, either.

NRAM for Main Memory
NRAM replaces DDR4 and DDR5 for main memory.

Enables the New Architectures
NRAM cells in the array
Permanent storage through power fail
Programmed once during manufacturing, no reload
NRAM Everywhere
Soon we will look back and say
“Remember when data was lost when power went out?”
and laugh
Full Disclosure
My first home computer had an 8” floppy disk
I earned my gray hair
Summary
Centralized versus distributed computing is a long term cycle
Quality of software infrastructure typically determines the winner
Artificial intelligence accelerators are a recent co-processing addition
Data loss on power failure is worsened by AI architectures
Persistent memory in AI devices solves major problems
Nantero NRAM addresses many usages of PM in AI systems
If you remember 8” floppies, you probably can’t read this screen
Questions?
Bill Gervasi, Principal Systems Architect ([email protected])