Cisco HyperFlex Architecture Deep Dive, System Design and Performance Analysis
Brian Everitt, Technical Marketing Engineer, @CiscoTMEguy
BRKINI-2016

Cisco Webex Teams
Questions? Use Cisco Webex Teams to chat with the speaker after the session.
How:
1. Find this session in the Cisco Events Mobile App
2. Click "Join the Discussion"
3. Install Webex Teams or go directly to the team space
4. Enter messages/questions in the team space

Agenda
• Introduction
• What's New
• HyperFlex All-NVMe
• HyperFlex Acceleration Engine
• Benchmarking Tools
• Performance Testing
• Best Practices
• Conclusion

About Me
Brian Everitt
• 22 years of experience as an infrastructure and storage engineer
• 4 years as a Technical Marketing Engineer with the CSPG BU
  o Focus on HyperFlex ESXi CVDs, performance, benchmarking and quality
• 5 years as a Cisco Advanced Services Solutions Architect
  o Focus on SAP HANA appliances and TDI, UCS and storage

What's New?

What's New With Cisco HyperFlex 4.0?
HyperFlex Edge
• 2 node (and 4 node) deployments
• Cisco Intersight cloud based witness service
• Redundant 10GbE option
Next Gen Hardware
• All-NVMe nodes
• HyperFlex Acceleration Engine hardware offload card
• 4th Generation Fabric Interconnects and VIC
• Cascade Lake CPUs and drive options (4.0.2a)
Management
• One click, full stack software and firmware upgrades
• Storage capacity forecasts and trends in Cisco Intersight
• New GUI license management in HX Connect (4.0.2a)

Technical and Performance Improvements in Cisco HyperFlex 4.0(2a)
• HyperFlex Boost Mode (more details later in this session)
• Dynamic cleaner improvements smooth out performance and reduce garbage
• Native replication scale grows from 200 to 1000 VMs per cluster
• Support for replication across low bandwidth connections, down to 10 Mbps
• New vCenter HTML5 plugin for cross launch of HX Connect
• Cisco Intersight based upgrades of ESXi for HX Edge clusters
• Health monitoring and alerting for controller VM health (memory use and capacity) and network services such as DNS, NTP and vCenter connectivity
• Pre-upgrade health and compatibility checks that can be run manually to find issues ahead of time

Scaling Options in HXDP 4.0.2a
• VMware, SFF (excluding All-NVMe): 3-32 converged nodes, 0-32 compute-only nodes, expansion supported, max 2:1 compute-to-converged ratio*, max cluster size 64
• VMware, LFF: 3-16 converged nodes, 0-32 compute-only nodes, expansion supported, max 2:1 compute-to-converged ratio*, max cluster size 48 (hybrid only)
• VMware, All-NVMe: 3-16 converged nodes, 0-16 compute-only nodes, expansion supported, max 1:1 compute-to-converged ratio*, max cluster size 32 (requires Enterprise license)
• Hyper-V, SFF: 3-16 converged nodes, 0-16 compute-only nodes, expansion supported, max 1:1 compute-to-converged ratio*, max cluster size 32
• Hyper-V, LFF: 3-16 converged nodes, 0-16 compute-only nodes, expansion supported, max 1:1 compute-to-converged ratio*, max cluster size 32 (hybrid only)
* 2:1 – Enterprise license (HXDP-P) if # of compute nodes > # of converged nodes. 1:1 – Standard license (HXDP-S) if # of compute nodes <= # of converged nodes.
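The license footnote above boils down to a simple ratio check (the same rule also appears under the stretched-cluster table that follows). As a hypothetical illustration only - this is not a Cisco tool or API, and the function name and messages are invented for the example - the short Python sketch below encodes that rule:

```python
# Hypothetical helper (not a Cisco tool): illustrates the HXDP licensing rule from
# the scaling table above. Standard (HXDP-S) is sufficient while compute-only nodes
# do not outnumber converged nodes; Enterprise (HXDP-P) is needed beyond 1:1,
# up to the 2:1 maximum ratio.

def required_hxdp_license(converged: int, compute_only: int) -> str:
    """Return the minimum HXDP license tier for a given node mix."""
    if converged < 3:
        raise ValueError("standard clusters need at least 3 converged nodes")
    if compute_only > 2 * converged:
        raise ValueError("compute-only nodes may not exceed a 2:1 ratio to converged nodes")
    # 1:1 or less -> Standard license; beyond 1:1 -> Enterprise
    return "HXDP-S (Standard)" if compute_only <= converged else "HXDP-P (Enterprise)"

if __name__ == "__main__":
    print(required_hxdp_license(8, 8))    # HXDP-S (Standard)
    print(required_hxdp_license(8, 16))   # HXDP-P (Enterprise)
```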
Scaling Options in HXDP 4.0.2a (cont.)
• Edge (HX*220 nodes only): 2-4 converged nodes, expansion not currently supported, 1GbE and 10GbE network options available, single and dual switch support, deploy and manage via Cisco Intersight
• Stretched Cluster**, SFF (excluding All-NVMe): 2-16 converged nodes per site (4-32 per cluster), 0-21 compute-only nodes per site (0-42 per cluster, limited by max cluster size), expansion supported***, max 2:1 compute-to-converged ratio*, max cluster size 32 per site / 64 per cluster
• Stretched Cluster**, LFF: 2-8 converged nodes per site (4-16 per cluster), 0-16 compute-only nodes per site (0-32 per cluster), expansion supported***, max 2:1 compute-to-converged ratio*, max cluster size 24 per site / 48 per cluster (hybrid only)
* 2:1 – Enterprise license (HXDP-P) if # of compute nodes > # of converged nodes. 1:1 – Standard license (HXDP-S) if # of compute nodes <= # of converged nodes.
** Stretched cluster requires Enterprise license (HXDP-P).
*** Requires uniform expansion of converged nodes across both sites.

Invisible Cloud Witness, Why?
1. Consistency of the file system is essential.
2. To guarantee consistency of a file system, a quorum is needed.
3. In a HyperFlex cluster a quorum requires a majority, or 2 of 3 nodes available and in agreement in a cluster (see the quorum sketch after the Cisco HyperFlex All-NVMe overview below).
4. Traditional 2-node ROBO solutions require a 3rd site for witness VMs.
5. Cisco's innovation, the Intersight Invisible Cloud Witness, virtually eliminates the CapEx and OpEx costs of deploying a 3rd site with witness VMs for 2-node HCI clusters.
(Diagram: a traditional hypervisor 2-node ROBO deployment at Site #1, connected over the WAN to a 3rd site hosting the witness VMs.)

Benefits of Invisible Cloud Witness Architecture
• No additional license cost; included in an HX Edge subscription
• No requirement for a 3rd site or existing infrastructure
• Cloud-like operations - nothing to manage
  o No user interface to monitor
  o No user driven software updates
  o No setup, configuration or backup required
  o No scale limitations
• Security
  o Realtime updates with the latest security patches
  o All communications encrypted via TLS 1.2
  o Uses standard HTTPS port 443; no firewall configuration required
• Built on an efficient protocol stack
  o No periodic heartbeats
  o No cluster metadata or user data transferred to Intersight
  o Tolerates high latency and lossy WAN connections
(Diagram: 2-node ROBO Site #1 through Site #2000, each connecting to the Intersight witness over the WAN.)

Cisco HyperFlex Edge with Full Lifecycle Management
From 1 node to 2000+ nodes:
• Multi-cluster install
• Invisible Cloud Witness service
• On-going management including full stack upgrades
• Connected TAC experience

HyperFlex All-NVMe

Cisco HyperFlex All-NVMe
• All-NVMe UCS C220 M5 servers, optimized for All-NVMe
• Integrated fabric networking
• Ongoing I/O optimizations
• Co-engineered with Intel VMD for hot-plug and surprise removal cases
• Reliability, Availability, Serviceability (RAS) assurance
• NVMe Optane cache, NVMe capacity drives
• Up to 32 TB/node, 40 Gb networking
• Fully UCSM managed for FW, LED, etc.
• Higher IOPS
• Lower latency
• Interoperates with HW acceleration to deliver the highest performance
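As promised in the Invisible Cloud Witness discussion above, here is a minimal sketch of the majority-quorum rule it describes: a 2-node Edge cluster plus the Intersight witness gives three voters, and the cluster keeps serving I/O only while a strict majority is reachable and in agreement. The function is illustrative only and is not HXDP code.

```python
# Minimal sketch (not HXDP code): the majority-quorum rule described on the
# Invisible Cloud Witness slides. Two converged nodes plus the Intersight
# witness form three voters; the cluster stays online only with a strict majority.

def has_quorum(votes_available: int, total_voters: int = 3) -> bool:
    """True when a strict majority of voters is reachable and in agreement."""
    return votes_available > total_voters // 2

# Both nodes up, witness reachable: 3 of 3 -> quorum
print(has_quorum(3))   # True
# One node down, witness reachable: 2 of 3 -> quorum, cluster keeps serving I/O
print(has_quorum(2))   # True
# Site isolated from the WAN and one node down: 1 of 3 -> no quorum
print(has_quorum(1))   # False
```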
What's Special About NVMe Drives
Unlocking the drive bottleneck
• The promise: higher IOPS, lower latency and fewer CPU cycles per IO
• SAS/SATA system: CPU → PCIe → storage controller → SATA bus (6 Gb/s) or SAS bus (12 Gb/s) → SAS/SATA SSD ("we speak ATA or SCSI")
• NVMe system: CPU → PCIe (8/16/32.. Gb/s) → NVMe SSD ("we speak all new NVMe")
• NVMe is a new protocol and command set for PCIe connected storage
• Drives are still NAND or new 3D XPoint
• 65,535 queues vs. 1 (see the queue-count sketch at the end of this section)

What delayed All-NVMe HCI from happening?
Many technical challenges needed to be overcome first…
• All-NVMe Hardware (Cisco HXAF220c-M5N): a specific all-NVMe design with a PCIe M-switch to overcome the limited PCIe lanes for drives and peripherals; Intel Optane high endurance caching disks
• RAS for NVMe Drives (Reliability, Availability, Serviceability): with no storage controller in the PCIe path, where is the RAS delivery point? Hot plug, surprise drive removal, firmware, LED, etc. still need to be handled
• Capable Network (to match drive capabilities): the network is central to HCI, all writes traverse the network; network bandwidth and latency both matter; Cisco UCS 10/25/40 GbE is the solution
• HCI SW Optimization (to leverage the benefits): is the HCI software architected to leverage the benefits? Software optimization in HX applies to both the network IO path and the local IO path

RAS in SATA/SAS vs NVMe Systems
Hot plug, surprise removal, firmware management, LED, etc.
• SAS system: CPU → PCIe → storage controller → SAS SSDs; the storage controller is the RAS delivery point
• NVMe system: CPU → PCIe → NVMe SSDs with no controller in between; surprise removal of an NVMe drive may crash the system, and a hot plug specification for NVMe is not there yet

Absence of RAS Means…
BSOD? PSOD?
• Hyper-V - blue screen due to surprise PCIe SSD removal
• ESXi - purple screen due to surprise PCIe SSD removal

Intel Volume Management Device (VMD) to the Rescue
Provides a means to address hot plug, surprise removal and LED management
• Intel® VMD is a new technology to enhance solutions with PCIe storage
• Supported for Windows, Linux, and ESXi
• Multi-SSD vendor support
• Intel® VMD enables:
  o Isolating fault domains for device surprise hot-plug and error handling
  o A consistent framework for managing LEDs
(Diagram: before Intel® VMD, the OS reaches PCIe SSDs through the PCIe root complex of an Intel® Xeon® E5 v4 processor; after Intel® VMD, the Intel® VMD driver in the OS reaches the SSDs through the Intel® VMD domain inside the PCIe root complex of an Intel® Xeon® Scalable processor.)
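To make the "65,535 queues vs. 1" point from the NVMe drives slide concrete, the sketch below counts the blk-mq hardware dispatch queues the Linux kernel created for each block device. It assumes a Linux host that exposes queue directories under /sys/block/<device>/mq and is illustrative only, not part of HXDP: an NVMe namespace typically shows one queue per CPU core, while a SATA/SAS device usually shows a single queue.

```python
# Minimal sketch, assuming a Linux host: count the blk-mq hardware dispatch
# queues for each NVMe and SATA/SAS block device. NVMe devices typically get
# one queue pair per CPU (the protocol allows up to 65,535 I/O queues), while
# a drive behind AHCI or a SAS HBA usually shows a single queue.
import os

def hw_queue_count(device: str) -> int:
    """Number of hardware queue directories under /sys/block/<device>/mq."""
    mq_path = f"/sys/block/{device}/mq"
    return sum(1 for entry in os.listdir(mq_path)
               if os.path.isdir(os.path.join(mq_path, entry)))

if __name__ == "__main__":
    for dev in sorted(os.listdir("/sys/block")):
        if dev.startswith(("nvme", "sd")):
            try:
                print(f"{dev}: {hw_queue_count(dev)} hardware queue(s)")
            except FileNotFoundError:
                print(f"{dev}: no blk-mq queues exposed")
```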