Anatomy of Core Network Elements

Anatomy of Core Network Elements
from 1Gbps to 10Tbps
Josef Ungerman, CSE, CCIE#6167
Anatomy © 2009 Cisco Systems, Inc. All rights reserved.

Agenda
1. Basic Terms
2. Router Architectures
3. Switch Architectures
4. Hybrid Architectures
5. Network Processors
6. Switch Fabrics

Chapter 1: Basic Terms

Cisco in 80’s: Router Architecture
[Diagram: CPU, DRAM, Flash, NVRAM, CON, AUX, packet, interfaces, interconnect]
Store & Forward Switching – using packet buffers and QoS, handles WAN interfaces (very variable interface speeds)

Real-Time Packet Processing: Process Switching
[Diagram: CPU, DRAM, Flash, NVRAM, CON, AUX; process level and interrupt level; packet, interfaces, interconnect]
Process Switching – IOS Process handles the forwarding decision and other operations with the packet

Real-Time Packet Processing: Data Plane vs. Control Plane
[Diagram: CPU, DRAM, Flash, NVRAM, CON, AUX; control packet at process level (process region); data packet at interrupt level (I/O region); interfaces, interconnect]
Data Plane – transit packets (a.k.a. fast path)
Control Plane – packets for the router (routing, management, exceptions)
• routing/control plane = routing and vital functions (OSPF, BGP, LDP, NTP, keepalives, ...)
• management plane = access to the router (telnet, ssh, SNMP, ...)

Real-Time Packet Processing: NP (Network Processor) – S/W vs. H/W router
[Diagram: CPU, DRAM, Flash, NVRAM, CON, AUX; route and control packets]
[Diagram, continued: IOS; NP (Network Processor) running u-code, with packet DRAM; data packets; interfaces, interconnect]
NP (Network Processor) – NP handles the data plane, not IOS (platform-dependent)
CPU – runs IOS – handles only the control plane (platform-independent)
Slow Path – IOS on the CPU can still forward some packets that NP cannot handle (e.g. exceptions, non-IP protocols routing, unsupported features)

Real-Time Packet Processing: NP (Network Processor) – S/W vs. H/W router
[Diagram: Routing & Forwarding Engine; CPU running IOS with DRAM, Flash, NVRAM, CON, AUX (route and control packets); NP (Network Processor) split into NPU (header) and BQS (packet), with packet DRAM; data packets; interfaces, interconnect]
BQS (Buffering, Queuing, Scheduling) or TM (Traffic Manager) ASIC – handles the memory access and QoS (packet body)
NPU (Network Processing Unit) – handles only packet forwarding and operations (packet header)

Summary – what is inside the router?
BBB – basic building blocks
• Processor
  • control-plane – OS processor
  • data-plane – network processor
• Memory
  • DRAM for OS memory and packet buffers
  • SRAM for caches
  • TCAM for fast lookups
• Interconnects
  • bus
  • serial link
  • switch fabric
We do not care about what is visible on the router
• chassis, fans, power supplies
• control ports – CON, AUX, BITS, Alarms, Disks
• data ports – LAN and WAN interfaces

“It is always something (corollary). Good, Fast, Cheap: Pick any two (you can’t have all three).”
RFC 1925 “The Twelve Networking Truths”

Packet Processing Technology Primer: Performance vs.
Flexibility

CPU (Central Processing Unit)
• multi-purpose processors (CISC, RISC)
• high s/w flexibility [weeks]
• low performance stability [ca. 1Mpps today]
• usage example: access routers (ISR’s)

ASIC (Application Specific Integrated Circuit)
• mono-purpose hard-wired functionality
• low engineering flexibility [2 years]
• high performance stability [over 200 Mpps today]
• usage example: switches (Catalysts), core routers

NP (Network Processor) = “something in between”
• performance + programmability
• moderate s/w flexibility [months]
• moderate and stable performance [4Mpps – 40 Mpps+]
• can be expensive, power-hungry, can have low code memory
• usage: fast feature-rich edge and aggregation
[Diagram: NP internals with input demux, a 4x4 grid of IM elements (0 to 15), one memory column per row, feedback path, output mux]

Memory Technology Primer: Capacity vs. Access Speed
Two basic memory technologies are in use today:
• Static RAM (SRAM, SSRAM)
• Dynamic RAM (DRAM, EDO DRAM, SDRAM, DDR)

          SRAM                      DRAM
Power     High                      Low
Speed     High [10-20ns]            Low [40-60ns]
Density   Low [e.g. 16M per chip]   High [e.g. 1G per chip]

Interconnects Technology Primer: Capacity vs. Complexity
• Bus
  • half-duplex, shared medium
  • for example PCI [800Mbps to 25Gbps+ today]
  • simple and cheap
• Serial Lane (Point-to-Point Link Set)
  • dedicated, unidirectional or full-duplex line
  • for example SPI4.2 [11.2Gbps+ today]
• Switching Fabric (cross-bar, exchange)
  • non-blocking, full-duplex, any-to-any
  • for example GSR, CRS [40Gbps to 9.6Tbps+ today]

Example: Lookup Problem – memory vs.
processing

TCAM (Ternary Content Addressable Memory)
• SRAM with a comparator at each cell
• 1 step – very fast, but very expensive
• parallel, order independent lookups (ACL, QoS, Netflow, even FIB)

Content and Mask    Address
192.168.100.xxx     801
192.168.200.xxx     802
192.168.300.xxx     803

Query: 192.168.200.111    Result: 802

Tree or Serial Lookup
• 8-8-8-8 used by generic IOS
• 16-8-8 used by the C12000, 11-8-5-8 used by C10K
• memory vs. speed tradeoff!
  - could be 8-1-1-1-1-1-1-1-1-1-1-1... (low SRAM)
  - could be used also for ACL, uRPF, accounting
[Diagram: lookup tree. ROOT branches to 10.0.0.0, 192.0.0.0 and 54.0.0.0; lower levels hold 10.1.0.0, 10.10.0.0, 192.5.0.0, 192.8.0.0, 54.10.0.0, then 10.1.1.0, 10.10.5.0, 192.8.2.0, 54.10.1.0, 54.10.4.0, then 10.1.1.1, 192.8.2.0, 192.8.2.128. Leaf results: NEXTHOP, load share, punt, host-route, cache, drop, glean, incomplete]

Chapter 2: Router Anatomy

Fundamental Building Blocks
• Simplex serial link set
• Duplex serial link set
• Active and backup backplane connection (serial duplex link set)
• Bus
• Module, card
• I/O module (hardware module with I/O interfaces)
• Switch Fabric (any-to-any full-duplex switching element)
• Mux/demux, fabric interface (typically including a tiny buffer)
• F – Forwarding ASIC (a complex of hardware elements and SRAM’s handling data plane)
• Q – Queuing ASIC (BQS – Buffering/Queuing/Scheduling, TM – Traffic Manager, etc.)
• NP – Network Processor (programmable hardware element handling data plane)
• buff. – Packet buffering, packet memory, QoS point
• Control Plane element: CPU + DRAM + Flash + NVRAM and IOS control interfaces. CPU (Central Processing Unit) is a general purpose micro-processor running the OS (Operating System)

Cisco 7200 (1990’s) architecture: Software Router
• data plane = IOS interrupt level
• control plane = IOS processes
[Diagram: NPE-200 with buff.]
[Diagram, continued: IOS; I/O controller with bridge, CON/AUX, Flash, FE; PAs on 600 Mbps PCI buses (Bus L, Bus R); 1, 4 or 6 PA slots]

Cisco 7200 – NPE-G1/G2 upgrade architecture: Software Router
• no change, just faster CPU/memory
[Diagram: NPE-G2 with buff., IOS, crypto, bridge and on-board I/O controller (4x GE, CON/AUX/Flash); PAs on 600 Mbps PCI buses (Bus L, Bus R); 1, 4 or 6 PA slots]

Cisco ESR10000 architecture: Hardware Router
• data plane = PXF chip (u-code)
• control plane = CPU with IOS
• DMA chip for packet memory
[Diagram: PRE (active) and PRE (standby), each with IOS, NP, Q and buff.; H/H linecards at 1.6G; SIP-600 (2-slot) with SPAs at 11G; full-height linecard; 8 full-height slots (ESR 10008)]

Cisco ASR1000 architecture: Split data and control plane
• RP = control-plane only
• ESP = data-plane (QFP chip) 20Gbps, 16Mpps, C-programmable
[Diagram: ASR1006. SIPs with SPAs linked at 11.2G; ESP (active) and ESP (standby), each with NP, buff. and encryption coprocessor; RP (active) and RP (standby), each with IOS; 1-3 SIP slots]

“It is more complicated than you think.”
RFC 1925 “The Twelve Networking Truths”

Cisco 7200: centralized single processor architecture (Single-Processor)
• one CPU for everything
[Diagram: NPE-G2 with buff., IOS, bridge and on-board I/O controller (4x GE, CON/AUX/Flash); VSA; PAs on 600 Mbps PCI buses (Bus L, Bus R); 1, 4 or 6 PA slots]

Cisco 7500: distributed multi-processor architecture (Multi-Processor)
• distributed, parallel CPU’s
[Diagram: RSP (active) and RSP (standby), each with buff., IOS and memd; VIPs, each with PAs, buff. and IOS on a 600 Mbps PCI bus; two CyBuses at 1Gbps each]
[Diagram, continued: Bus L, Bus R; 3, 5 or 7 VIP slots]

Cisco 12000 – switch fabric architectural evolution: Distributed Forwarding Architecture
• up to 600Gbps today
[Diagram: RP (active) and RP (standby), each with IOS; Switch Fabric Cards (redundant CSCs with arbiters, SFCs); linecards Engine 0 (622M), Engine 2 (3G), Engine 5 (10G, with NP, Q, buff. and SPAs) and Engine 6 (40G, with F, Q and buff.)]
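
The "Tree or Serial Lookup" idea from the Lookup Problem slide in Chapter 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how IOS or any Cisco platform implements it: it assumes the 8-8-8-8 stride split, expands shorter prefixes over every slot they cover, and uses hypothetical names (Node, insert, lookup) and a made-up route table. Only the 192.168.200.x → 802 entry mirrors the example on the slide.

```python
import ipaddress

STRIDES = (8, 8, 8, 8)  # the "8-8-8-8" split; an assumption for this sketch


class Node:
    """One tree level: a table indexed by the next `stride` address bits."""

    def __init__(self, stride):
        size = 1 << stride
        self.hop = [None] * size    # result stored in this slot, if any
        self.plen = [-1] * size     # length of the prefix that wrote the slot
        self.child = [None] * size  # pointer to the next-level table


def insert(root, prefix, next_hop):
    net = ipaddress.ip_network(prefix)
    addr, plen = int(net.network_address), net.prefixlen
    node, consumed = root, 0
    for level, stride in enumerate(STRIDES):
        shift = 32 - consumed - stride
        index = (addr >> shift) & ((1 << stride) - 1)
        if plen <= consumed + stride:
            # The prefix ends inside this level: fill every slot it covers,
            # but never overwrite a slot owned by a longer prefix.
            free_bits = consumed + stride - plen
            base = index & ~((1 << free_bits) - 1)
            for slot in range(base, base + (1 << free_bits)):
                if plen >= node.plen[slot]:
                    node.hop[slot] = next_hop
                    node.plen[slot] = plen
            return
        if node.child[index] is None:
            node.child[index] = Node(STRIDES[level + 1])
        node = node.child[index]
        consumed += stride


def lookup(root, address):
    addr = int(ipaddress.ip_address(address))
    node, consumed, best = root, 0, None
    for stride in STRIDES:
        shift = 32 - consumed - stride
        index = (addr >> shift) & ((1 << stride) - 1)
        if node.hop[index] is not None:
            best = node.hop[index]  # deeper levels hold longer prefixes
        node = node.child[index]
        if node is None:
            break
        consumed += stride
    return best


# Hypothetical route table for the demonstration.
root = Node(STRIDES[0])
for prefix, hop in [
    ("0.0.0.0/0", "default"),
    ("10.0.0.0/8", "A"),
    ("10.1.0.0/16", "B"),
    ("10.1.1.0/24", "C"),
    ("10.1.128.0/17", "D"),
    ("192.168.200.0/24", 802),
]:
    insert(root, prefix, hop)

print(lookup(root, "192.168.200.111"))  # 802, as in the slide's query
```

With four 8-bit strides a lookup reads at most four tables. A wider first stride such as 16-8-8 cuts that to three reads, at the price of a 65,536-entry first table, which is the memory vs. speed tradeoff the slide points at.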

