Network Function Virtualization Seminar
Network Function Virtualization Seminar
Matthias Falkner, Distinguished Engineer, Technical Marketing
Nikolai Pitaev, Technical Marketing Engineer
TECSPG-2300

Cisco Webex Teams
Questions? Use Cisco Webex Teams to chat with the speaker after the session.
How:
1. Find this session in the Cisco Events Mobile App
2. Click “Join the Discussion”
3. Install Webex Teams or go directly to the team space
4. Enter messages/questions in the team space

Agenda: TECSPG-2300 Network Function Virtualization – A Use-Case-Based Technology Deep-Dive
• Introduction – 08:45–09:05 – Matt
• NFV Primer – 09:05–10:00 – Matt
• Virtualizing Branch Infrastructure – 10:00–10:45 – Nikolai
• Break
• SP/Cloud Virtualization – 11:00–11:30 – Nikolai
• Connecting to Multiple Clouds – 11:30–12:15 – Nikolai
• Multi-Tenanted SMB Services – 12:15–12:50 – Matt
• Conclusion – 12:50–13:00 – Matt

Introduction
Virtualization of Network Functions (NFV) – Current State
• The idea of de-coupling software from hardware is not new!
• Closely linked to automation / orchestration
• Increased focus on simplifying Enterprise architectures
– Particularly for L4–L7 services
• SPs drive adoption, but Enterprises are following suit
– Both consumption models (MSP, self-managed) are considered

Why Virtualize? Motivations for the Enterprise
OPEX:
• Deployment flexibility
• Reduction of the number of network elements
• Reduction of on-site visits
• Deployment of standard on-premise hardware
• Simplification of the physical network architecture
• Increased potential for automated network operations
• Re-alignment of organizational boundaries
CAPEX:
• Deploy on standard x86 servers
• Economies of scale
• Service elasticity – deploy as needed
• Simpler architectural paradigm
• HA still needed?
• Best-of-breed
• Leveraging virtualization benefits
– Hardware oversubscription, vMotion, ...

The 4 Layers of a Virtualized System Architecture
4. Automation / Orchestration (Cisco DNA Center, NSO)
3. VNFs: Virtual Router (ISRv, CSR), Virtual Firewall (ASAv, NGFWv), Virtual Wireless LAN Controller (vWLC), Virtual WAN Optimization (vWAAS), 3rd-party VNFs
2. Network Functions Virtualization Infrastructure Software (NFVIS)
1. Compute: ISR 4000 + UCS E-Series, UCS C-Series, CSP-5444 / Enterprise Network Compute System

NFV Primer
This section covers basic VNF technologies. It is all about virtual network functions; we are not talking about generic virtualization techniques.
Topics covered next:
• I/O: SR-IOV, virtual switches, service chaining (a quick SR-IOV sanity check follows this list)
• CPU: hyperthreading, vCPU pinning, NUMA socket allocation
• Putting it all together: NFV performance insights
• VNF virtualization vs. containerization
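Throughout the primer it helps to have a concrete way of checking whether a host actually exposes the features being discussed. The following minimal Python sketch (our illustration, not part of the seminar material) lists the SR-IOV state of each NIC on a Linux host using the standard sriov_numvfs / sriov_totalvfs sysfs attributes; interface names are whatever the host provides.

```python
#!/usr/bin/env python3
"""Minimal sketch: list SR-IOV capability per NIC on a Linux host.

Illustrative only -- assumes a Linux host where PCI network devices
expose the standard sriov_totalvfs / sriov_numvfs sysfs attributes.
"""
from pathlib import Path

def sriov_report() -> None:
    for dev in sorted(Path("/sys/class/net").iterdir()):
        total = dev / "device" / "sriov_totalvfs"   # max VFs the PF supports
        numvfs = dev / "device" / "sriov_numvfs"    # VFs currently enabled
        if total.exists():
            print(f"{dev.name}: {numvfs.read_text().strip()}"
                  f"/{total.read_text().strip()} VFs enabled")
        else:
            print(f"{dev.name}: no SR-IOV support exposed")

if __name__ == "__main__":
    sriov_report()
```

On a host with VFs enabled you would expect a line such as "ens1f0: 8/63 VFs enabled" (values depend on the NIC); it is a quick sanity check before handing VFs to VNFs.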
A System Architecture View with 3 VNFs
[Figure: three CSR 1000v VMs (4-vCPU, 1-vCPU, and 2-vCPU footprints) on a 2-socket x86 server. Each VM contains the CSR processes (IOS, CMan, Fman/PPE, HQF, packet scheduler), IRQs, vNICs, and a guest OS scheduler, connected through a vSwitch; the hypervisor (HV) scheduler maps these processes onto pCPU0–pCPU7 of Socket 0 and Socket 1, each socket with its own memory and I/O.]
• Example: 3 CSR VMs scheduled on a 2-socket, 8-core x86 server
– Different CSR footprints shown
• Type 1 hypervisor
– No additional host OS represented
• The HV scheduler algorithm governs how vCPU / IRQ / vNIC / VM-kernel processes are allocated to pCPUs
• Note the various schedulers (guest OS scheduler, packet scheduler, HV scheduler)
– Running ships-in-the-night

Packet Path from Physical Interface into VNF (by example of FD.io VPP)
[Figure: packet flow from a traffic generator up through the x86 host into a CSR 1000v VNF: pNIC and pNIC driver in the host kernel, FD.io VPP and vHost-user shared packet memory in host user space, QEMU, and DPDK-VirtIO in the guest.]
1. Packet arrives and is copied into memory by the pNIC driver.
2. VPP is kicked and switches the packet.
3. The vNIC / vHost-user is notified; the packet lands in shared packet memory.
4. The packet is moved to the VNF via DPDK-VirtIO.
5. The VNF is interrupted, and the packet pointer is passed to its buffer.
6. Packet feature processing runs in the guest.
Why does this matter?
• It illustrates contention for shared resources
• Each packet move consumes resources
• Packet pointer buffers have limited depth
– This can cause drops (a toy model follows the bottleneck overview below)

Potential Bottlenecks in a Virtualized System
[Figure: x86 host (with OVS-DPDK or FD.io/VPP) running two VMs, with potential bottlenecks marked at each layer.]
• Intra-VM processing (e.g., features) and the guest IO drivers
• Hypervisor / virtualization layer (QEMU) and vNICs
• Virtual switch / host IO path
• Physical interfaces (pNIC drivers, pNICs)
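To make the "limited depth can cause drops" point concrete, here is a toy Python model (an illustration we added, not VPP or vhost-user code): a producer enqueues packet pointers into a fixed-depth ring faster than the consumer drains them, and anything arriving at a full ring is tail-dropped. The depths and rates are made-up numbers.

```python
#!/usr/bin/env python3
"""Toy model of a fixed-depth packet-pointer ring (not real vhost-user code).

A producer (standing in for the virtual switch) enqueues packet pointers
each tick; the consumer (the VNF) drains the ring more slowly. Once the
ring is full, new arrivals are tail-dropped -- the behavior called out
on the slide.
"""
from collections import deque

RING_DEPTH = 256          # descriptor ring depth (e.g. a virtio queue size)
ARRIVALS_PER_TICK = 10    # offered load from the vSwitch (assumed)
DRAINS_PER_TICK = 8       # how fast the VNF empties the ring (assumed)

def simulate(ticks: int) -> None:
    ring: deque = deque()  # depth enforced manually so drops can be counted
    delivered = dropped = 0
    for _ in range(ticks):
        for _ in range(ARRIVALS_PER_TICK):
            if len(ring) < RING_DEPTH:
                ring.append(object())     # "packet pointer"
            else:
                dropped += 1              # ring full -> tail drop
        for _ in range(min(DRAINS_PER_TICK, len(ring))):
            ring.popleft()
            delivered += 1
    total = delivered + dropped + len(ring)
    print(f"delivered={delivered} dropped={dropped} "
          f"({100 * dropped / total:.1f}% loss)")

simulate(ticks=1000)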
VNF Architecture Matters!
[Figure: the same three-CSR system architecture view as before, annotated with the VNF-internal processes and their vCPU placement.]
• A VNF can be associated with multiple vCPUs ...
• ... and consumes memory
• The VNF software architecture can impact performance
• Example: CSR 1000v vCPU allocations via resource templates (*available in 3.16.02 and later)

CSR Resource Templates – vCPU distribution by footprint (in the 1-vCPU footprint, all planes share the single vCPU):

Default (Data Plane Heavy)
vCPUs:             1  2  4  8
Control + Service: 1  1  1  1
Data:              –  1  3  7

Control Plane Heavy
vCPUs:             1  2  4  8
Control + Service: 1  1  2  2
Data:              –  1  2  6

Service Plane Medium
vCPUs:             1  2  4  8
Control + Service: 1  1  2  2
Data:              –  1  2  6

Service Plane Heavy
vCPUs:             1  2  4  8
Control + Service: 1  1  2  4
Data:              –  1  2  4

NUMA (Non-Uniform Memory Access)
[Figure: a 2-socket system as two NUMA nodes with cores 0–7, per-core L1 data caches, L1 instruction and L2 caches shared per core pair, a shared L3 (last-level) cache and memory controller per node, and an interconnect between the nodes; VPP and VM working sets placed in node-local memory.]
• NUMA is a memory-sharing technology that allows a processor core to use memory associated with other cores
• Accessing remote memory happens over the NUMA interconnect, which is typically slower than local memory access
• Benefit: each core can access its own memory, which allows for simultaneous memory access
• Performance implications:
– Higher latency for remote memory access
– Variable performance
– Application memory may not be local to the core

CSR1000V Performance – Polaris 16.10.01b, ESXi / SR-IOV, Single Feature, IMIX
Throughput (Mbps):

vCPUs   CEF     ACL     IPSec (Single AES)   NAT    L4 FW   Basic QoS
1        6546    4656    781                 3448   3741     5342
2        7093    5093    843                 3516   3794     6250
4        8606    7075   1218                 3844   3787     7590
8       15624   14494   2312                 4546   4547    15396

Traffic profile: IMIX {64 bytes (58.33%), 594 bytes (33.33%), 1518 bytes (8.33%)}
PDR (Packet Drop Rate): 0.01%
*The maximum throughput license offered today is 10 Gbps; please contact us if you have a use case that requires more than 10G.
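Picking up the NUMA point above: on Linux, the node-to-CPU mapping is exposed under /sys/devices/system/node, so a placement decision can be checked programmatically. A minimal sketch follows (assuming a Linux host and Python 3.9+; the example pinning set is hypothetical):

```python
#!/usr/bin/env python3
"""Minimal sketch: read the NUMA node -> CPU mapping from sysfs (Linux)."""
from pathlib import Path

def parse_cpulist(text: str) -> set[int]:
    """Expand a kernel cpulist such as '0-3,8,10-11' into a set of CPU ids."""
    cpus: set[int] = set()
    for chunk in text.strip().split(","):
        if "-" in chunk:
            lo, hi = chunk.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        elif chunk:
            cpus.add(int(chunk))
    return cpus

def numa_topology() -> dict[int, set[int]]:
    nodes: dict[int, set[int]] = {}
    for node in sorted(Path("/sys/devices/system/node").glob("node[0-9]*")):
        node_id = int(node.name.removeprefix("node"))
        nodes[node_id] = parse_cpulist((node / "cpulist").read_text())
    return nodes

topology = numa_topology()
for node_id, cpus in topology.items():
    print(f"NUMA node {node_id}: CPUs {sorted(cpus)}")

# Check that a proposed pinning set stays on one node (avoids remote memory).
pinning = {2, 3, 4, 5}  # hypothetical pCPUs for a 4-vCPU VNF
local = any(pinning <= cpus for cpus in topology.values())
print("pinning is NUMA-local" if local else "pinning straddles NUMA nodes!")
```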
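For KVM-based deployments, such a pinning is usually expressed in libvirt domain XML via <cputune>/<vcpupin>. The small generator below is a sketch: the element names are standard libvirt, but the 1:1 vCPU-to-pCPU mapping and the chosen cores are assumptions for illustration, not a validated domain definition.

```python
#!/usr/bin/env python3
"""Sketch: emit a libvirt <cputune> fragment pinning vCPUs 1:1 onto pCPUs.

The <vcpupin> element is standard libvirt domain XML; the pCPU choices
below (one NUMA node, picked by hand) are hypothetical.
"""
def cputune_fragment(pcpus: list[int]) -> str:
    pins = "\n".join(
        f"  <vcpupin vcpu='{vcpu}' cpuset='{pcpu}'/>"
        for vcpu, pcpu in enumerate(pcpus)
    )
    return f"<cputune>\n{pins}\n</cputune>"

# 4-vCPU VNF pinned to physical cores 2-5 (assumed to sit on socket 0).
print(cputune_fragment([2, 3, 4, 5]))
```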
CSR1000V Performance – Polaris 16.10.01b, KVM (RHEL) / SR-IOV, Single Feature, IMIX
Throughput (Mbps):

vCPUs   CEF     ACL     IPSec (Single AES)   NAT    L4 FW   Basic QoS
1        3812    4312    750                 3302   3575     4753
2        7148    3624    843                 3536   3813     6304
4        8643    6911   1218                 3781   3673     7786
8       16718   15023   2405                 7276   7120    14740

Traffic profile: IMIX {64 bytes (58.33%), 594 bytes (33.33%), 1518 bytes (8.33%)}
PDR (Packet Drop Rate): 0.01%
*The maximum throughput license offered today is 10 Gbps; please contact us if you have a use case that requires more than 10G.
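To relate these Mbps figures to packets per second, note that the IMIX profile above averages out to roughly 362 bytes per frame. The sketch below (our arithmetic, deliberately ignoring L1 overhead such as preamble and inter-frame gap) converts two of the KVM CEF data points:

```python
#!/usr/bin/env python3
"""Convert the IMIX throughput figures above from Mbps to packets/s.

Simple model: average IMIX frame size weighted by the profile on the
slide; L1 overhead (preamble, inter-frame gap) is ignored.
"""
IMIX = [(64, 0.5833), (594, 0.3333), (1518, 0.0833)]  # (bytes, share)

avg_bytes = sum(size * share for size, share in IMIX)  # ~361.8 bytes

def mbps_to_mpps(mbps: float) -> float:
    return mbps * 1e6 / (avg_bytes * 8) / 1e6

print(f"average IMIX frame: {avg_bytes:.1f} bytes")
for label, mbps in [("1 vCPU CEF", 3812), ("8 vCPU CEF", 16718)]:
    print(f"{label}: {mbps} Mbps ~ {mbps_to_mpps(mbps):.2f} Mpps")
```

For the 8-vCPU CEF figure this works out to roughly 5.8 Mpps, which is a more hardware-oriented way to compare the footprints than raw Mbps.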