Are Ethernet Attached Ssds Happening?
Total Page:16
File Type:pdf, Size:1020Kb
Are Ethernet Attached SSDs Happening? NVMF-302B-1 Organizer/Chair: Rob Davis, Mellanox Presenters: Ilker Cebeli, Samsung John Kloeppner, NetApp Balaji Venkateshwaran, Toshiba Khurram Milak, Netronome Flash Memory Summit 2019 Santa Clara, CA Woo Suk Chung, SK Hynix1 Session Agenda ▪ Ilker, Samsung – 15 minutes ▪ John - NetApp, Balaji -Toshiba, Khurram - Marvell – 40 minutes ▪ Woo, SK-Hynix – 15 minutes ▪ Q&A – 10 minutes Flash Memory Summit 2019 Santa Clara, CA 2 Are Ethernet Attached SSDs Happening? Disaggregated NVMe-oF Storage Ilker Cebeli Sr. Director of Planning Samsung Flash Memory Summit 2019 Santa Clara, CA 3 Disclaimer This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”) is intended to provide information concerning the SSD and memory industry and Samsung Electronics Co., Ltd. and certain affiliates (collectively, “Samsung”). While Samsung strives to provide information that is accurate and up-to- date, this Presentation may nonetheless contain inaccuracies or omissions. As a consequence, Samsung does not in any way guarantee the accuracy or completeness of the information provided in this Presentation. This Presentation may include forward-looking statements, including, but not limited to, statements about any matter that is not a historical fact; statements regarding Samsung’s intentions, beliefs or current expectations concerning, among other things, market prospects, technological developments, growth, strategies, and the industry in which Samsung operates; and statements regarding products or features that are still in development. By their nature, forward-looking statements involve risks and uncertainties, because they relate to events and depend on circumstances that may or may not occur in the future. Samsung cautions you that forward looking statements are not guarantees of future performance and that the actual developments of Samsung, the market, or industry in which Samsung operates may differ materially from those made or suggested by the forward-looking statements in this Presentation. In addition, even if such forward-looking statements are shown to be accurate, those developments may not be indicative of developments in future periods. 4 Data Center Evolution Traditional Hyper-converged Software-Defined Data Center Virtualized Composable Manageability Manageability Manageability Console Console Console Console Console Console Compute Storage Networking Console Resource Pools Stand Alone Component Converged Management Rack Scale Software Defined Suited for Enterprise Applications Virtualized Computing/ Networking Disaggregated Compute and Storage 1GbE Networking 10GbE Networking Composable 25-100GbE Networking Why Disaggregation? Converged Disaggregated Compute & Storage Compute Storage Compute + Storage NETWORK NETWORK ❑ Pros: ❑ Pros: ✓ Scale Compute and Storage linearly ✓ Compute and Storage scale independently ✓ Managed resources and storage services ✓ Shared resources ❑ Cons ✓ Improved utilization ➢ Resources under-utilization ✓ Grow as you go model based on workload demand ➢ Storage and Compute on the same network ✓ Centralized storage services ❑ Cons ➢ Requires efficient storage protocols and latency 8/8/2019 ➢ Low latency and high bandwidth networking 6 Some of the Use Cases for NVMe Over Fabrics Server Server Server CPU CPU CPU CPU CPU CPU NVMf Capable PCie Network PCie PCie e.g. RDMA NVMf Bridge RNIC NVMf Capable Network NVMf Bridge NVMf Bridge e.g. RDMA RNIC RNIC NVMf Target Pcie Switch CPU CPU CPU CPU NVMf Target Pcie Switch Pcie Switch Pcie Switch Hyper-Converged Direct Attached JBOF Most Common Disaggregated SAS DAS Replacement JBOF Storage NVMe-oF JBOF 2015 NVMe-oF JBOF 100GbE 100GbE NVMf Capable RNIC RNIC 2x100GbE ➔ ~24GB/s Network CPU CPU NVMf Bridge NVMf Bridge RNIC RNIC CPU CPU CPU CPU Pcie Switch Pcie Switch NVMe SSDs 24 NVMe SSDs ➔ ~24GB/s Hyper-Converged Most Common Typical NVMe-oF JBOF Flash Memory Summit 2019 2015 Platform Balanced Bandwidth between IO and NVMe Santa Clara, CA 8 Future NVMe Bandwidth Throughput (GB/s) Seq. Read 128KB Future 24x Gen4x4 NVMe SSDs FUTURE 150+ *2019 24x x4 Gen3 NVMe SSDs 84 4x 100GbE 44 4x 40GbE 18 30x 4x 10GbE 4.5 * 24x Samsung PM1725b NVMe SSDs (3.5GB/s throughput each) Network links could throttle the storage throughput performance Flash Memory Summit 2019 Santa Clara, CA 9 Evolution of Networking Speeds and 25Gb/s and Above Source: Crehan Long-range Forecast - Ethernet Adapter forecast, January 2019 via Mellanox Q2’2019 Flash Memory Summit 2019 Santa Clara, CA 10 IO Bottleneck Future 2019 3 X 200GbE ➔ ~75 GB/s 3 X 100GbE ➔ ~36 GB/s 200GbE 200GbE 200GbE 100GbE 100GbE 100GbE ` ` ` ` CPU CPU CPU CPU ` ` SSD SSD SSD SSD NVMe NVMe SSD SSD 24 NVMe SSDs ➔ ~150+ GB/s 24 NVMe SSDs ➔ ~48 GB/s CPU and IO bottleneck for storage throughput performance Flash Memory Summit 2019 Santa Clara, CA 11 NVMe-oF SSD based EBOF Conventional NVMe JBOF NVMe-oF EBOF Storage Head Nodes Storage App. Server App. Server App. Server Or Application Servers Controllers Web Database Analytics Hypervisor, Container Ethernet Switch ❑ Pros ❑ Pros ✓ Enables disaggregation of NVMe ✓ High Bandwidth SSDs ✓ Scaled Linearly (Ethernet) Ethernet Switch ✓ Management & Storage Services ✓ Sharable via NVMe-oF ✓ Utilizing existing storage & server ✓ Less power architectures ✓ Lower latency PCIe Ethernet Ethernet ❑ Cons ❑ Cons Switch Switch Switch ➢ Non-scalable Storage Controller - ➢ New platform architecture PCIe single root constraint ➢ Management of Storage Ethernet Services & Network Devices PCIe ➢ Bandwidth Limitation • CPU, PCIe, Networking NVMe NVMe NVMe NVMe-oF NVMe-oF NVMe-oF NVMe-oF NVMe Constraints SSD SSD SSD SSD SSD SSD SSD SSD ➢ Power and Thermals NVMe JBOF NVMe-oF EBOF NVMe-oF EBOF can address bandwidth, scalability, and flexibility Example Datacenter Storage Disaggregation Server DAS Disaggregated Ethernet Switch Ethernet Switch Ethernet Switch Ethernet Switch 100GbE Server Node Ethernet Server Node Storage Services Server Node Server Node Storage Storage Services Services Server Node 25GbE DRAM Ethernet Server Node Storage NIC CPU NIC CPU DRAM 25GbE Services Ethernet Server Node Server Node Storage PCie Services SSD Server Node NVMe SSD Server Node Storage Storage Services Services Server Node ? Server Node Storage Services Scale-Out Storage Services NVMe-oF EBOF Switch NVMe-oF NVMe-oF eBOF SSD Where Storage Services and Network Devices managed [email protected] Thank You Flash Memory Summit 2019 Santa Clara, CA 14 Session Agenda ▪ Ilker, Samsung – 15 minutes ▪ John - NetApp, Balaji -Toshiba, Khurram - Marvell – 40/3 minutes ▪ Woo, SK-Hynix – 15 minutes ▪ Q&A – 10 minutes Flash Memory Summit 2019 Santa Clara, CA 15 NVMe -> NVMe over Fabrics Servers without Storage Servers with embedded NVMe Storage NVMe NVMe NVMe-oF NVMe AFF A320 NVMe Data Management ▪ Local high performance / low latency access ▪ Isolated Storage NVMe-oF (direct connected) ▪ Under-utilized SSD Performance and Capacity JBOF ▪ Shared Storage, better utilization of storage ▪ Similar NVMe Performance 16 © 2019 NetApp, Inc. All rights reserved. Disaggregated Compute/ Data Management / Storage Compute ▪ Scaling Compute, Data Management and Storage Independently ▪ Full Shared Storage NVMe-oF Storage Storage Storage Data Management Controller Controller Controller RNIC NVMe-oF NVMe-oF JBOF PCIe CPU Memory Switch NVMe SSDs JBOF JBOF JBOF Data Storage 17 © 2019 NetApp, Inc. All rights reserved. NVMe-oF JBOF Limitations ▪ Performance • Throughput - PCIe Gen3 -> PCIe Gen4 -> PCIe Gen5, SCM, limit by existing infrastructure • Latency - Store and Forward architecture ▪ Cost – CPU, SOC/RNICs, Switches, Mem don’t scale well to match increasing SSD performance SOC RNIC Memory RNIC CPU based SOC RNIC based NVMe-oF JBOF NVMe-oF JBOF PCIe Switch CPU Memory PCIe Switch 24 - x4 PCIe Gen4 NVMe SSDs 24 - x4 PCIe Gen4 NVMe SSDs 18 © 2019 NetApp, Inc. All rights reserved. Native Ethernet / NVMe-oF SSDs ▪ Optimize NVMe-oF performance at SSD ▪ Options for NVMe-oF SSDs Dual 25G Dual 25G Dual 25G NVM-oF Bridge NVMe-oF Bridge Native NVMe-oF PCIe SSD Ctrl SSD SSD SSD Ctrl SSD Ctrl SSD Flash Flash Flash 19 © 2019 NetApp, Inc. All rights reserved. Solution with Native NVMe-oF SSDs ▪ Lower Latency, Higher Throughput ▪ Lower Cost and overall TCO NVMe-oF JBOF Management Ethernet & Switch Discovery 24 Native NVMe-oF SSDs * Supports one 2x200G RNIC connected with x16 PCIe Gen4 ** Supports one 2x200G SOC RNIC connected with x16 PCIe Gen4 *** Supports three 200G Host connected Ethernet ports 20 © 2019 NetApp, Inc. All rights reserved. Additional Benefits ▪ Additional Benefits • Performance/cost scales with SSDs • Lower Power, reduced TCO • Including Ethernet switching within JBOF … potential to reduce networking cost, footprint, cabling 21 © 2019 NetApp, Inc. All rights reserved. Other Activities ▪ Industry Standardization / Enablement • Standardization – Work underway in SNIA to define Form Factor, Pinout, Management – Toshiba will cover • Enablement – Fabrico Interposer – Marvell will cover 22 © 2019 NetApp, Inc. All rights reserved. Thanks! Flash Memory Summit 2019 Santa Clara, CA 23 Session Agenda ▪ Ilker, Samsung – 15 minutes ▪ John - NetApp, Balaji -Toshiba, Khurram - Marvell – 40/3 minutes ▪ Woo, SK-Hynix – 15 minutes ▪ Q&A – 10 minutes Flash Memory Summit 2019 Santa Clara, CA 24 Enabling Native NVMe-oF™ SSDs (Ethernet SSDs) August 2019 Ethernet SSD-based Storage Platforms AFA software App Server Scale for higher performance Block storage Ethernet Fabric CPU DRAM over