Cisco HyperFlex Architecture Deep Dive, System Design and Performance Analysis
Brian Everitt, Technical Marketing Engineer, @CiscoTMEguy
BRKINI-2016

Agenda
• Introduction
• What's New
• System Design and Data Paths
• Benchmarking Tools
• Performance Testing
• Best Practices
• Conclusion

About Me: Brian Everitt
• Technical Marketing Engineer with the CSPG BU
• Focus on HyperFlex performance, benchmarking, quality and SAP solutions
• 5 years as a Cisco Advanced Services Solutions Architect
• Focus on SAP HANA appliances and TDI, UCS and storage

Cisco Webex Teams
Questions? Use Cisco Webex Teams to chat with the speaker after the session.
How:
1. Find this session in the Cisco Live Mobile App
2. Click "Join the Discussion"
3. Install Webex Teams or go directly to the team space
4. Enter messages/questions in the team space
Webex Teams will be moderated by the speaker until June 16, 2019.
cs.co/ciscolivebot#BRKINI-2016

Let's Celebrate 10 Years of Unified Computing!
We made it simple. You made it happen.
• Visit the DC lounge outside WOS: grab a shirt, get a digital portrait for your social profiles, share your stories!
• Visit the Data Center area inside WOS to see what's new
• Follow us! #CiscoDC https://bit.ly/2CXR33q

What's New?

What's New With Cisco HyperFlex 4.0?
HyperFlex Edge
• 2 node (and 4 node) deployments
• Cisco Intersight cloud based witness service
• Redundant 10GbE and VIC option
Next Gen Hardware
• All-NVMe nodes
• HyperFlex Acceleration Engine hardware offload card
• 4th Generation Fabric Interconnects
• Cascade Lake CPUs and Intel Optane DC persistent memory
Management
• One click, full stack software and firmware upgrades
• Stretched cluster compute-only nodes and expansions
• 2:1 compute-only node ratio (HX 4.0(2a))

Scaling Options in HXDP 4.0.1a
VMware, SFF (excluding All-NVMe)
• Converged nodes: 3-32
• Compute-only nodes: 0-32
• ✓ Expansion supported
• Max ratio of compute to converged nodes: 2:1*
• Max cluster size: 64
VMware, LFF (Hybrid only)
• Converged nodes: 3-16
• Compute-only nodes: 0-32
• ✓ Expansion supported
• Max ratio of compute to converged nodes: 2:1*
• Max cluster size: 48
VMware, All-NVMe (requires Enterprise license)
• Converged nodes: 3-16
• Compute-only nodes: 0-16
• ✓ Expansion supported
• Max ratio of compute to converged nodes: 1:1*
• Max cluster size: 32
Hyper-V, SFF
• Converged nodes: 3-16
• Compute-only nodes: 0-16
• ✓ Expansion supported
• Max ratio of compute to converged nodes: 1:1*
• Max cluster size: 32
Hyper-V, LFF (Hybrid only)
• Converged nodes: 3-16
• Compute-only nodes: 0-16
• ✓ Expansion supported
• Max ratio of compute to converged nodes: 1:1*
• Max cluster size: 32
* 2:1 - Enterprise license (HXDP-P) if # of compute > # of converged nodes.
* 1:1 - Standard license (HXDP-S) if # of compute <= # of converged nodes.
(A minimal sketch of this licensing rule follows below.)
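The two footnotes above amount to a simple rule of thumb. The sketch below is my own illustration of that rule as code: the function name is invented, and the hard 2:1 cap check is an assumption that only applies to the columns of the table that allow a 2:1 ratio.

```python
def hx_license_needed(compute_nodes: int, converged_nodes: int) -> str:
    """Map node counts to the HXDP license tier implied by the slide footnotes."""
    if compute_nodes > 2 * converged_nodes:
        # The 2:1 ceiling only applies where the table allows it (VMware SFF/LFF);
        # All-NVMe and Hyper-V clusters top out at 1:1.
        raise ValueError("exceeds the 2:1 maximum compute-to-converged ratio")
    if compute_nodes > converged_nodes:
        return "HXDP-P (Enterprise)"   # ratio above 1:1, up to 2:1
    return "HXDP-S (Standard)"         # ratio at or below 1:1

# Example: 16 converged + 24 compute-only nodes -> 1.5:1 -> Enterprise license.
print(hx_license_needed(compute_nodes=24, converged_nodes=16))
```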
Scaling Options in HXDP 4.0.1a (cont.)
Edge, SFF (HX*220 nodes only)
• Converged nodes: 2-4
• Expansion not currently supported
• 1GbE and 10GbE network options available
• Single and dual switch support
• Deploy and manage via Cisco Intersight
Stretched Cluster**, SFF (excluding All-NVMe)
• Converged nodes: 2-16 / site (4-32 / cluster)
• Compute-only nodes: 0-21 / site (0-42 / cluster) [limited by max cluster size]
• ✓ Expansion supported***
• Max ratio of compute to converged nodes: 2:1*
• Max cluster size: 32 / site, 64 / cluster
Stretched Cluster**, LFF (Hybrid only)
• Converged nodes: 2-8 / site (4-16 / cluster)
• Compute-only nodes: 0-16 / site (0-32 / cluster)
• ✓ Expansion supported***
• Max ratio of compute to converged nodes: 2:1*
• Max cluster size: 24 / site, 48 / cluster
* 2:1 - Enterprise license (HXDP-P) if # of compute > # of converged nodes; 1:1 - Standard license (HXDP-S) if # of compute <= # of converged nodes.
** Stretched cluster requires Enterprise license (HXDP-P).
*** Requires uniform expansion of converged nodes across both sites.

Invisible Cloud Witness, Why?
(Slide diagram: a traditional 2-node ROBO site connected over the WAN to a 3rd site hosting witness VMs: $$ CapEx & OpEx $$.)
1. Consistency of the file system is essential.
2. To guarantee consistency of a file system, a quorum is needed.
3. In a HyperFlex cluster a quorum requires a majority, i.e. 2 of 3 nodes available and in agreement in the cluster.
4. Traditional 2-node ROBO solutions require a 3rd site for witness VMs.
5. Cisco's Intersight Invisible Cloud Witness innovation virtually eliminates the CapEx and OpEx of that 3rd witness site when deploying 2-node HCI clusters.
(A minimal sketch of the majority rule follows after the next slide.)

Benefits of Invisible Cloud Witness Architecture
(Slide diagram: 2-node ROBO Site #1 through Site #2000, each connected over the WAN to the Intersight witness service.)
• No additional license cost; included in an HX Edge subscription
• No requirement for a 3rd site or existing infrastructure
• Cloud-like operations - nothing to manage
o No user interface to monitor
o No user driven software updates
o No setup, configuration or backup required
o No scale limitations
• Security
o Realtime updates with the latest security patches
o All communications encrypted via TLS 1.2
o Uses standard HTTPS port 443; no firewall configuration required
• Built on an efficient protocol stack
o No periodic heartbeats
o No cluster metadata or user data transferred to Intersight
o Tolerates high latency and lossy WAN connections
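The quorum argument on the "Invisible Cloud Witness, Why?" slide is the plain majority rule. The sketch below is my own minimal illustration (the names are invented and the real HyperFlex arbitration protocol is more involved) of why two nodes alone cannot tolerate a failure, while two nodes plus a cloud witness can:

```python
def has_quorum(voters_reachable: int, total_voters: int) -> bool:
    """Majority rule: strictly more than half of all voters must be reachable and agree."""
    return voters_reachable > total_voters // 2

# Two converged nodes with no witness: lose either one and only 1 of 2 votes remains.
print(has_quorum(voters_reachable=1, total_voters=2))   # False -> cluster must halt

# Two nodes plus the Intersight Invisible Cloud Witness as a third voter:
# one node can fail and the survivor plus the witness still hold 2 of 3 votes.
print(has_quorum(voters_reachable=2, total_voters=3))   # True -> cluster stays online
```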
The All New HyperFlex Edge with Full Lifecycle Management
• Multi-cluster install
• Invisible Cloud Witness service
• On-going management, including full stack upgrades
• Connected TAC experience
• Scales from 1 node to 2000+ nodes

HyperFlex All-NVMe

Introducing All-NVMe HyperFlex
• All-NVMe UCS C220 M5 servers, optimized for All-NVMe
• Integrated fabric networking
• Ongoing I/O optimizations
• Co-engineered with Intel VMD for hot-plug and surprise removal cases
• Reliability, Availability, Serviceability (RAS) assurance
• NVMe Optane cache, NVMe capacity
• Up to 32TB/node, 40Gb networking
• Fully UCSM managed for FW, LED, etc.
• Higher IOPS
• Lower latency
• Interop with HW acceleration to deliver the highest performance

What's Special About NVMe Drives: Unlocking the Drive Bottleneck
The promise: higher IOPS, lower latency, and fewer CPU cycles per I/O.
(Slide diagram: in a SAS/SATA system the CPU's PCIe lanes lead to a storage controller, which reaches SAS/SATA SSDs over a 6 Gb/s SATA bus or 12 Gb/s SAS bus: "we speak ATA or SCSI." In an NVMe system the NVMe SSD sits directly on PCIe at 8/16/32+ Gb/s: "we speak all new NVMe.")
• NVMe is a new protocol and command set for PCIe connected storage
• Drives are still NAND or new 3D XPoint
• 65,535 queues vs. 1 (a rough concurrency illustration appears at the end of this section)

All-NVMe: Opportunity and Challenges
Performance
• NVMe drives have higher perf than SAS/SATA SSDs
• Need PCIe lanes in the platform to leverage drive perf
• Need a low latency, high bandwidth N/W
RAS (Reliability, Availability, Serviceability)
• Performance without compromising RAS
• Ability to handle hotplug, surprise removal, etc.
• Higher perf requires a very high endurance write cache
Optimized
• No compromise on manageability
• Software stack that can leverage the higher perf
• Software optimization required in both the N/W and I/O datapath

Why Has All-NVMe HCI Not Happened Already?
Many technical challenges needed to be overcome first…
All-NVMe Hardware (Cisco HXAF220c-M5N)
• Specific all-NVMe design with PCIe M-switch
• Overcome limited PCIe lanes for drives and peripherals
• Intel Optane high endurance caching disks
RAS for NVMe Drives (Reliability, Availability, Serviceability)
• With no storage controller, where is the RAS delivery point?
• Hot plug, surprise removal, firmware, LED, etc.
Capable Network (to match drive capabilities)
• The network is central to HCI: all writes traverse the network
• Network bandwidth and latency both matter
• Cisco UCS 10/25/40 GbE is the solution
HCI SW Optimization (to leverage the benefits)
• Is the software architected to leverage the benefits?
• Software optimization in HX to both the network I/O path and the local I/O path

RAS in a SATA/SAS vs. NVMe System
Hot plug, surprise removal, firmware management, LED, etc.
• SAS system: CPU, PCIe, storage controller, then SAS SSDs; the storage controller is the RAS delivery point.
• NVMe system: CPU, PCIe, then NVMe SSDs directly; with no controller, surprise removal of an NVMe drive may crash the system.
• A hot-plug specification for NVMe is not there yet.

Absence of RAS Means… BSOD? PSOD?
• Hyper-V: blue screen due to surprise PCIe SSD removal
• ESXi: purple screen due to surprise PCIe SSD removal

Intel Volume Management Device (VMD) to the Rescue
Provides the means to address hot-plug, surprise removal, and LED management
• Intel® VMD is a new technology to enhance solutions with PCIe storage
• Supported for Windows, Linux, and ESXi
• Multi-SSD vendor support
• Intel® VMD enables:
o Isolating fault domains for device surprise hot-plug and error handling
o A consistent framework for managing LEDs
(Slide diagram: before Intel® VMD, the operating system talks to PCIe SSDs through the Intel® Xeon® E5 v4 processor's PCIe root complex; after Intel® VMD, an Intel® VMD driver in the operating system reaches the SSDs through the Intel® VMD domain inside the Intel® Xeon® Scalable processor's PCIe root complex.)
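The key VMD bullet is "isolating fault domains." The toy sketch below is purely conceptual, not Intel VMD driver code, and every name in it is invented: without an isolation boundary, a surprise-removal error escapes into the rest of the system, which is the BSOD/PSOD case from the previous slide; with a VMD-style boundary, the error is contained at the drive's fault domain and everything else keeps running.

```python
from typing import Optional

class SurpriseRemoval(Exception):
    """Raised when a PCIe SSD disappears while I/O is in flight."""

def read_block(drive_present: bool) -> bytes:
    if not drive_present:
        raise SurpriseRemoval("device no longer answers on the PCIe bus")
    return b"data"

def io_without_isolation(drive_present: bool) -> bytes:
    # No fault domain: the error escapes toward the rest of the "kernel",
    # the analogue of the BSOD/PSOD screenshots.
    return read_block(drive_present)

def io_with_isolation(drive_present: bool) -> Optional[bytes]:
    # VMD-style fault domain: the error is caught at the domain boundary,
    # the drive is marked offline, and the system keeps running.
    try:
        return read_block(drive_present)
    except SurpriseRemoval:
        print("drive marked offline; hot-plug event handed to management software")
        return None

io_with_isolation(drive_present=False)        # contained: prints a message, returns None
# io_without_isolation(drive_present=False)   # uncaught SurpriseRemoval -> the "crash"
```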
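Finally, back to the "65,535 queues vs. 1" point from the NVMe drives slide: the practical payoff of many deep queues is how much I/O can be kept in flight at once. The estimate below is my own back-of-the-envelope illustration using Little's Law (it is not from the deck, the numbers are made up, it assumes the same per-I/O latency in both cases, and real devices saturate long before the arithmetic does):

```python
def iops_estimate(outstanding_ios: int, latency_seconds: float) -> float:
    """Little's Law, rearranged: throughput ~= concurrency / per-I/O latency."""
    return outstanding_ios / latency_seconds

# Hypothetical numbers purely for illustration (0.2 ms per I/O in both cases):
print(iops_estimate(outstanding_ios=32, latency_seconds=0.0002))     # ~160,000 IOPS
print(iops_estimate(outstanding_ios=1024, latency_seconds=0.0002))   # ~5,120,000 IOPS
```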