2020-11-18 NY Creates

Intelligent Silicon for AI at Scale

Samir Mittal
Corporate Vice-President, 3DXP Systems Solutions Engineering
Micron Technology

November 18, 2020

Speaker Biography

Key experience in Deep Learning:
• Cognitive Brain for investment fund marketing (Capital Group / American Funds)
• Machine learning infrastructure development (with leading hyperscale customers)

Most recently:
• CEO and Founder, SCUTI AI, a startup in Scaled Deep Learning
• SanDisk, VP of Enterprise Engineering (all-flash data center)
• Led Pliant, Smart Storage, Fusion-io, Flashsoft and Schooner teams

Background:
• PhD in Signal Processing and Control Systems from The Ohio State University
• Subsequently at Seagate Enterprise (storage), Qumu (enterprise video)

Complex decisions are increasingly being powered by AI

Data → Insights → Decisions → Value

helping us create new transformative value for society-scale impact

AI for complex systems is the new frontier

Tomorrow Today

“Systems” “Classic” World AI AI

Tomorrow (“Systems” AI): Autonomous driving; Program synthesis; Mortgage decisions
Today (“Classic” AI): Speech recognition; Video analytics; Language translation

Tomorrow (“Systems” AI): Mission critical; Correct & explainable; New possibilities
Today (“Classic” AI): <100% accurate; Inform & assist; Incremental value

Emerging trends in “Systems AI”

• Mission critical deployments in ever-changing environments
  - Constant adaptation
  - Reinforcement learning
• Make AI accessible: operate with more intelligence at higher levels of abstraction
  - From generative models to generative agents
  - Supervised training to unsupervised learning
  - Incorporate domain semantics
  - Learn rich representations of the environment
  - Generalize well

Case Study: Systems AI in Semiconductor (SSD) Manufacturing

Today (traditional approach): Quality via test
• Process audit
• Production audit
• In-situ measurements

Tomorrow: Quality through design
• Design knowledge
• Deep & Machine Learning

Challenges
• Data gravity & compliance
• Robust decision making
• Reliable real-time performance

Deep economic potential in traditional industries at the intersection of rich data, prediction & control

• Manufacturing: Prediction & optimization
• Surveillance & security: Detection & action
• Infrastructure: Structure safety & traffic mgmt.

However, enterprise adoption is challenged

• High complexity: requires 100’s of engineers
• Slow turn-around: poor solution scalability
• Diminishing value with increasing scale

“Systems AI” results get better with more data, bigger models & real-time learning

Compute used in the largest AI training runs doubles about every 3.4 months… (source: https://openai.com/blog/ai-and-compute/)
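To make the doubling claim above concrete, here is a minimal Python sketch of the compounding arithmetic. The function name `growth_factor` and the 12-month horizon are illustrative assumptions and do not come from the talk; the sketch only evaluates 2^(horizon / doubling period).

```python
def growth_factor(doubling_months: float, horizon_months: float) -> float:
    """Multiplicative growth over `horizon_months` for a quantity that
    doubles once every `doubling_months`."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 2.0 ** (horizon_months / doubling_months)


if __name__ == "__main__":
    # Doubling periods close to the one quoted on the slide (assumed inputs).
    for period in (3.4, 3.5):
        factor = growth_factor(period, 12)
        print(f"doubling every {period} months -> x{factor:.1f} per year")
```

A doubling period of about three and a half months compounds to roughly an order of magnitude per year, which is why memory capacity and bandwidth come under pressure so quickly.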

Micron Confidential

Insufficient memory and memory bandwidth are limiting code performance

New directions with Intelligent Silicon

1. Predictive orchestration for parallelism

2. Performance to power efficiency

3. Domain specific optimizations

4. CPU and GPU offload

5. Storage Class Memory with significant cost-to-performance improvement

Micron’s 3DXP Storage Class Memory technology

Technologies compared: DRAM, 3DXP, Flash

• Non-volatile
• Fast read, fast write
• High density
• Low voltage
• Logic integrate-able
• Byte addressable
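Byte addressability, listed above, means software can update a single byte in place instead of reading and rewriting a whole block. The Python sketch below contrasts the two access styles. It uses an ordinary memory-mapped temporary file as a stand-in, not 3DXP hardware, and the 4 KiB block size and all function names are assumptions made for illustration.

```python
import mmap
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size for the block-style path


def block_style_update(path, offset, value):
    """Read-modify-write the whole block containing `offset`.
    Returns the number of bytes the application moved."""
    block_start = (offset // BLOCK_SIZE) * BLOCK_SIZE
    with open(path, "r+b") as f:
        f.seek(block_start)
        block = bytearray(f.read(BLOCK_SIZE))
        block[offset - block_start] = value
        f.seek(block_start)
        f.write(block)
    return 2 * len(block)  # one block read plus one block written


def byte_style_update(path, offset, value):
    """Store one byte through a memory mapping (load/store semantics).
    The OS still pages underneath; only the application view is per-byte."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:
            m[offset] = value
            m.flush()
    return 1


def demo():
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(bytes(2 * BLOCK_SIZE))  # two zero-filled blocks
        moved_block = block_style_update(path, 5000, 0xAB)
        moved_byte = byte_style_update(path, 5001, 0xCD)
        with open(path, "rb") as f:
            data = f.read()
        return moved_block, moved_byte, data[5000], data[5001]
    finally:
        os.remove(path)
```

Calling `demo()` returns the bytes moved by each style along with the two stored values, so the difference in data movement at the application level is directly visible.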

3DXP is a revolutionary technology with full-stack implications

Today’s Computing Paradigm
Stack: Application, CPU, Memory, SSD / HDD (object stream, byte stream and file stream, with translation overhead)
1. Memory constrained
2. Difficult to scale
3. Over-provisioned & expensive infrastructure

Vision with 3DXP
Stack: Application, CPU, 3DXP Memory, SSD / HDD (zero translation, unified semantics)
1. Memory un-bound
2. Memory & storage convergence
3. Memory hierarchies for infrastructure virtualization

Micron X100: Primary focus has been to improve I/O performance

• ULTRA-FAST LOCAL STORAGE: 4X-7X FASTER THAN CONVENTIONAL SSD
• ULTRA-LOW LATENCY: 6X-10X IMPROVEMENT
• HIGH PERFORMANCE IN SMALL PARTITIONS

[Diagram: Micron 3DXP enabled server with CPU complex, accelerators, GPU complex, server network and GPU network, comparing commodity SSD with Micron 3DXP]

User Experience improvement in key applications

• DATABASE
• DATA WAREHOUSING
• BIG DATA
• SPARK ANALYTICS
• DEEP LEARNING
• VECTOR SEARCH

[Diagram: Micron 3DXP enabled server with CPU complex, accelerators, GPU complex, server network and GPU network, comparing commodity SSD with Micron 3DXP]

Azure Ignite Announcement in 2019

Nov 7th, 2019: Azure CTO Mark Russinovich endorsed Micron X100 performance advantages for Azure

Mark showcased X100 in 2 live demos:
• 9.5 GB/s of throughput on the X100-based VM was >4X better than NAND
• TPC-H on Microsoft SQL showed >10X better average per-transfer latency and >3X better overall run completion times for X100 vs NAND

3DXP Solutions Transform the Compute Server

[Roadmap diagram with three columns: 2020-2021, 2022 and 2023. Components shown: CPU & GPU, HBM, DRAM, cache lines, network, NAND SSD, 3DXP SSD and 3DXP memory. Access granularity labels: 4K storage access, 128B - 4K, byte memory access. Stage labels: ultra-fast storage, fast storage, storage memory, tiered memory, memory zones.]

• Storage performance at near-memory speed
• Eliminates data striping on NAND SSD
• CPU offload of storage controllers
• Max virtualization performance
• Maximizes CPU performance

• 3DXP memory unifies all data domains
• Minimizes storage-to-memory data motion
• Offloads CPU page migrations
• Improves economics with memory virtualization
• Enables fungible infrastructure for SKU reduction
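The tiered-memory idea in this roadmap can be illustrated with a small placement sketch: the most frequently accessed pages go to the fastest tier and colder pages spill to slower ones. This is only a toy model written in Python. The tier capacities, the hotness-by-access-count policy and the function name `place_pages` are assumptions for illustration, not a description of how Micron hardware or any hypervisor migrates pages.

```python
from collections import Counter

# Tier names follow the roadmap; capacities (in pages) are hypothetical.
# None marks the last tier as unbounded.
TIERS = [("DRAM", 2), ("3DXP", 4), ("NAND", None)]


def place_pages(access_log, tiers=TIERS):
    """Assign each page to a tier, hottest pages to the fastest tier first.

    `access_log` is an iterable of page ids, one entry per access.
    Returns a dict mapping page id -> tier name.
    """
    counts = Counter(access_log)
    # Rank by descending access count; break ties by page id for determinism.
    ranked = sorted(counts, key=lambda page: (-counts[page], page))
    placement = {}
    start = 0
    for name, capacity in tiers:
        end = len(ranked) if capacity is None else start + capacity
        for page in ranked[start:end]:
            placement[page] = name
        start = end
    return placement


if __name__ == "__main__":
    log = [1] * 5 + [2] * 4 + [3] * 3 + [4] * 2 + [5] * 2 + [6, 7]
    for page, tier in sorted(place_pages(log).items()):
        print(f"page {page}: {tier}")
```

A real system would also account for migration cost and recency, but the sketch shows why a large middle tier reduces how often hot data falls through to NAND.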

Data Center Evolution with Software Defined Servers (Russinovich, 2020)

Software Defined Server for “Systems AI” at the Edge

• Fungible multi-tenant platform
• Hardware offloaded virtualization of compute, memory, storage and network
• Unseen symmetric performance
• Best-fit for high ingest, transient local data
• Transparent migration of existing workloads
• Optimized for real-time workloads

Micron Enabled Edge

Real-time Data Pipelines at the Edge

• Ingest: Log & Sensor Data (VM1)
• Analyze: HiveQL Queries (VM2)
• Store & Query: HDFS (VM3)

[Diagram: Managed Edge Appliance. VM1, VM2 and VM3 run on a hypervisor, each with its own storage memory, backed by 3DXP]

A new model for “Systems AI” at the Edge

Scalable use cases, deployments, & customizations

• Economical: Highest performance with commodity components
• Simplicity: One-click config & operation
• Composable: for complex data pipelines

Micron 3DXP enables us to infuse Domain Knowledge to resolve canonical issues in the AI stack

Multi-domain network: Image DL, Machine learning, NLP

Human engineering intensive mgmt.
• Problem decomposition complexity
• Distributed data management
• Model convergence & tune-up

Robust AI
• Model characterization: computation characteristics, data dependencies, convergence properties
• Intelligent model partitioning
• Model policies

Stack layers: User ML Apps, Machine Learning Framework, Parallelization framework(s), Hardware abstraction layer (workload specifications, run-time optimizer, hardware attributes), Micron 3DXP enabled server

Pushing infrastructure to its limits
• Data mobility & locality issues
• Poor use of domain & HW knowledge
• Weak parallelization

Workflow acceleration
• Optimized data movement
• Predictive scale
• Intelligent partitioning

Scale-out

Thank you

[email protected]