SPDK Performance in a Nutshell
Karol Latecki, John Kariuki
SPDK, PMDK, Intel® Performance Analyzers Virtual Forum

Agenda
1. I/O performance: performance and efficiency; workloads
2. Local storage performance: test cases and objectives; test tools, environment and optimizations
3. Storage over Ethernet performance: test cases and objectives; test tools, environment and optimizations
4. Virtualized storage performance: test cases and objectives; test tools, environment and optimizations

SPDK I/O Performance
• Efficiency & scalability: I/O per second from one thread; I/O core scalability
• Latency: average and tail (P90, P99, P99.99)

Workloads
• Block sizes: 4 KiB and 128 KiB
• Access patterns: 100% random read, 100% random write, 100% sequential read, 100% sequential write, 70%/30% random read/write, 70%/30% sequential read/write
• Applied across local storage, storage over Ethernet, and virtualized storage performance testing

The Performance Reports
https://spdk.io/doc/performance_reports.html

Local Block Storage
Objectives:
• Measure SPDK NVMe BDEV performance
• Compare SPDK vs. Linux kernel (libaio, io_uring) block layers
Test cases (SPDK perf / fio):
1. I/O per second from one thread
2. I/O core scalability
3. SPDK vs. kernel latency
4. IOPS vs. latency
Software stack: SPDK perf / fio on the SPDK NVMe BDEV and SPDK NVMe driver, backed by Intel® TLC 3D NAND SSDs.
Test case execution is automated with test/nvme/perf/run_perf.sh.

SPDK NVMe BDEV I/O Efficiency
[Chart: IOPS (thousands) on one CPU core vs. number of SSDs (1 to 10), 4 KiB random read @ QD=128.]
• Single-core IOPS scale linearly as the number of SSDs increases, up to 8 SSDs.
• Maximum IOPS per core: 5.2 million at 10 SSDs.
Test configuration is listed at the end of the deck.

SPDK NVMe BDEV I/O Core Scalability
• Lockless I/O path: IOPS scale linearly with the addition of I/O cores.

Example fio job file

    [global]
    direct=1
    thread=1
    bs=4096
    # minimize the number of I/O threads
    numjobs=1
    runtime=300
    ramp_time=60

    [filename0]
    iodepth=192
    # NUMA alignment: pin to a CPU core local to the SSDs
    cpus_allowed=0
    filename=Nvme0n1
    filename=Nvme1n1
    filename=Nvme2n1
    filename=Nvme3n1
    filename=Nvme4n1
    filename=Nvme5n1

Tools: fio, bdevperf, nvmeperf.

Why SPDK perf tools?
• fio: industry standard, high flexibility, rich I/O metrics.
• SPDK perf tools: less flexible, but optimized for I/O submission and completion, delivering up to 2x more IOPS per core.
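For orientation, a job file like the one above can be pointed either at kernel block devices or at SPDK bdevs through the SPDK fio bdev plugin. The commands below are a minimal sketch, not the exact invocations behind the report numbers: the job file names, the plugin path, the bdev JSON file and the perf binary location are illustrative and vary by SPDK version and build.

    # Kernel baseline: the same job file, but with ioengine=libaio or
    # io_uring and filename=/dev/nvme0n1 ... /dev/nvme5n1 in the job section.
    fio kernel_job.fio

    # SPDK NVMe bdev: the job file sets ioengine=spdk_bdev and
    # spdk_json_conf=./bdev.json (which defines Nvme0n1 ... Nvme5n1),
    # and fio is launched with the SPDK fio bdev plugin preloaded.
    LD_PRELOAD=./build/fio/spdk_bdev fio spdk_job.fio

    # SPDK's standalone NVMe perf example, which talks to the NVMe driver
    # directly: -q queue depth, -o I/O size in bytes, -w workload,
    # -t run time in seconds, -c core mask.
    ./build/examples/perf -q 128 -o 4096 -w randread -t 300 -c 0x1

Keeping the workload parameters identical across the kernel and SPDK runs is what lets the comparison isolate block-layer and driver overhead rather than device behavior.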
IOPS vs. Latency, 4 I/O Cores
[Chart: IOPS (millions) and average latency (usec) vs. queue depth (1 to 128), 4 KiB random read on 4 I/O cores; SPDK fio bdev vs. kernel libaio vs. kernel io_uring.]
• SPDK BDEV delivers up to 2.9x and 5.8x more IOPS per core than io_uring and libaio, respectively.

Test System
• Intel Server System R2224WFTZS
• 2x Intel® Xeon® Gold 6230N processor (2.30 GHz, 20 cores per socket)
• 384 GB 2933 MHz DDR4 RAM
• 24x Intel® SSD DC P4610 1.6TB NVMe

Why benchmark each release?
• Measure performance on new hardware (SSDs, CPUs, NICs).
• Measure performance after software optimizations.
• Validate the performance impact of new features.

Block Storage over Ethernet
SPDK and Linux NVMe-oF performance, covering the RDMA and TCP transports, including interoperability testing between SPDK and kernel components.
Test cases:
1. SPDK NVMe-oF target I/O core scalability
2. SPDK NVMe-oF initiator I/O core scalability
3. Latency and interoperability of SPDK and kernel NVMe-oF components
4. Performance with an increasing number of connections
Automation script: spdk/scripts/perf/nvmf/run_nvmf.py

Target: Storage and Network
• Target: NVMe-oF target (SPDK or Linux kernel); 8x Intel P4610 SSDs behind a PCIe switch on each CPU socket (16 SSDs total); one 100GbE NIC per CPU socket.
• Initiator hosts (many CPU cores): two NVMe-oF initiators (SPDK or Linux kernel), each directly connected to one target NIC with QSFP28 cables.
• Benchmark tool: fio.

Over 100 Gbps
• Eight target CPU cores saturate 100 Gbps with 4 KiB random reads.
Data from the SPDK NVMe-oF TCP 21.01 Performance Report; configuration details at the end of the deck.

• SPDK's relative efficiency is up to 2x better as the number of connections increases.
Data from the SPDK NVMe-oF TCP 21.01 Performance Report.

SPDK NVMe-oF TCP Target Core Scaling, 128 KiB Read Workload
[Chart: bandwidth (GBps) vs. number of target CPU cores (1, 4, 8), SPDK NVMe-oF TCP 20.01 vs. 20.04.]
• MSG_ZEROCOPY doubled the performance of a single-CPU SPDK target process.
Data from the SPDK NVMe-oF TCP 20.01 and 20.04 performance reports.

Optimizations
• Hardware NUMA alignment
• BIOS & OS performance settings
• NIC IRQ affinity settings
• TCP/IPv4 Linux sysctl settings
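To make the moving parts of this section concrete, the sketch below shows one way an SPDK NVMe-oF TCP target can be brought up by hand with scripts/rpc.py and then reached from a kernel initiator with nvme-cli. It is illustrative only: the core mask, PCIe address, NQN, serial number and IP address are placeholders, RPC method names can differ between SPDK versions, and the reports drive all of this through scripts/perf/nvmf/run_nvmf.py rather than manual commands.

    # Start the SPDK NVMe-oF target application on a set of cores (core mask).
    ./build/bin/nvmf_tgt -m 0xFF &

    # Create the TCP transport and export one local NVMe SSD as a namespace.
    ./scripts/rpc.py nvmf_create_transport -t TCP
    ./scripts/rpc.py bdev_nvme_attach_controller -b Nvme0 -t PCIe -a 0000:1a:00.0
    ./scripts/rpc.py nvmf_create_subsystem nqn.2016-06.io.spdk:cnode1 -a -s SPDK001
    ./scripts/rpc.py nvmf_subsystem_add_ns nqn.2016-06.io.spdk:cnode1 Nvme0n1
    ./scripts/rpc.py nvmf_subsystem_add_listener nqn.2016-06.io.spdk:cnode1 \
        -t tcp -a 192.168.0.1 -s 4420

    # On an initiator host: connect with the kernel initiator (nvme-cli)
    # and run fio against the resulting /dev/nvmeXnY block device.
    nvme connect -t tcp -a 192.168.0.1 -s 4420 -n nqn.2016-06.io.spdk:cnode1

A run with the SPDK initiator would instead attach the remote subsystem as an NVMe bdev over TCP and drive it with the fio bdev plugin, as in the local-storage section.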
Virtualized Storage
SPDK and kernel vhost performance. Test cases:
1. SPDK vhost single-core VM saturation
2. SPDK vhost I/O core scalability
3. VM density: SPDK vs. kernel vhost
4. Latency vs. IOPS with increasing queue depth
5. Performance tuning: link-time optimization (LTO), QEMU packed rings
More on SPDK vhost: https://spdk.io/doc/vhost.html

Test Environment
• Local NVMe SSDs (Nvme0n1 ... Nvme23n1) exposed through the SPDK NVMe bdev and logical volumes (lvol0 ... lvolN).
• Each logical volume is exported to a VM as a vhost-scsi or virtio-blk device by the SPDK vhost target.
• QEMU/KVM guests, up to 36 VMs.

Results
• Up to 1.8 million IOPS on 1 CPU core; linear scaling with the addition of I/O cores.
• SPDK vhost is able to serve the required I/O with a high number of VMs.
[Chart: relative performance gains of +1.8%, +5.4%, +7.6%, +6.2% and +5.4%.]
Data from the SPDK Vhost 21.01 Performance Report; configuration details at the end of the deck.

Benchmarking Tools and Optimizations
• Benchmark tool: fio in client/server mode.
• Automation script: test/vhost/perf_bench/vhost_perf.sh
• Test optimizations: NUMA alignment, fio measurement options, resource limiting (cgroups).

Continuous Performance
• Runs in SPDK Continuous Integration.
• Uses the same scripts as the quarterly benchmark reports.
• Currently covers vhost, NVMe-oF TCP and NVMe-oF RDMA.

Future Work
• Performance & power: using the dynamic scheduler to measure IOPS/Watt.
• NVMe over vfio-user performance.
• Container storage performance.
• Data services performance: compress bdev, crypto bdev.

Q&A

Configuration Details
Local storage and virtualized storage results: Test by Intel as of 2/10/2021. 1 node, 2x Intel® Xeon® Gold 6230N processor (20 cores per socket, HT on, Turbo on), 384 GB total memory (12 slots / 32 GB / 2933 MHz), BIOS SE5C620.86B.02.01.0013.121520200651 (ucode 0x4003003), Fedora 33, Linux kernel 5.10.19-200, gcc 9.3.1, fio 3.19, SPDK 21.01. Storage: 24x Intel® SSD DC P4610 1.6TB.
Storage-over-Ethernet results: Test by Intel as of 2/10/2021. Target node: 1 node, 2x Intel® Xeon® Gold 6230 processor (20 cores per socket, HT on, Turbo on), 384 GB total memory (12 slots / 32 GB / 2933 MHz), BIOS 3.4 (ucode 0x5003003), Fedora 33, Linux kernel 5.8.15-300, gcc 9.3.1, fio 3.19, SPDK 21.01. Storage: 16x Intel® SSD DC P4610 1.6TB. Network: 2x 100 GbE Mellanox ConnectX-5. Host nodes: 2 nodes, 2x Intel® Xeon® Gold 6252 processor (24 cores per socket, HT on, Turbo on), 192 GB total memory (6 slots / 32 GB / 2933 MHz), BIOS 3.4 (ucode 0x5003003), Fedora 33, Linux kernel 5.8.15-300, gcc 9.3.1, fio 3.19, SPDK 21.01. Network: 1x 100 GbE Mellanox ConnectX-5.

Measurement Tools
Automated metric collection with scripts/perf/nvmf/run_nvmf.py:
• SAR: CPU utilization measurement on the target side.
• bwm-ng: bandwidth utilization on the network interfaces.
• PCM: CPU, memory and power measurements on the target side.
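For reference, a hand-rolled equivalent of that metric collection might look like the following. The intervals, sample counts and flags are illustrative, depend on the installed tool versions, and the automation script handles start, stop and parsing itself.

    # Target-side CPU utilization: 1-second samples for a 300 s run.
    sar -u -P ALL 1 300 > sar_cpu.log &

    # Per-interface network throughput, CSV output, 300 one-second samples.
    bwm-ng -o csv -t 1000 -c 300 > bwm_net.csv &

    # Intel PCM (pcm, with pcm-memory / pcm-power as companions) for CPU,
    # memory and power counters, sampled every second, written as CSV.
    pcm 1 -csv=pcm.csv &
    PCM_PID=$!

    # ... run the fio workload from the initiators here ...

    wait %1 %2          # sar and bwm-ng exit after their sample counts
    kill "$PCM_PID"     # pcm runs until stopped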