Intel® QuickAssist Technology Components
Neal Oliver, PhD
Principal Engineer
Intel Corporation
QATS002

Agenda
• Intel® QuickAssist Architecture Overview
• Selected Intel® QuickAssist Architecture Components
• Intel® Embedded Processor for 2008 (Tolapai)
  - Hardware Architecture
  - Software Architecture
  - Use Cases
• Intel® QuickAssist FSB-FPGA Accelerator Platform (FAP)
  - System Architecture
  - Accelerator Hardware Module (AHM)
  - Design Flow
• Intel® QuickAssist Technology Accelerator Abstraction Layer (AAL)
  - AAL services and features
  - Software Architecture

Intel® QuickAssist Technology – Comprehensive Approach to Acceleration
• Multiple accelerator and attach options with software and ecosystem support
• Performance and scalability based on customer needs and priorities
Intel's response to customer need to deploy accelerators on Intel® architecture platforms.

Intel® Embedded Processor for 2008 (Tolapai)
Single die integrates:
  - IA-32 based core @ 600, 1066 and 1200 MHz
  - DDR2 memory controller (MCH)
  - PCI Express* Technology
  - Standard IA PC peripherals (ICH)
  - 3x Gigabit Ethernet MACs
  - 3x TDM high-speed serial interfaces for 12 T1/E1 or SLIC/CODEC connections
  - Intel® QuickAssist Integrated Accelerator
Vital statistics:
  - 148 million transistors
  - 1,088-ball FCBGA w/ 1.092 mm pitch
  - 37.5 mm x 37.5 mm package
System-on-Chip (SoC) enabling performant, effective system solutions

Intel® Embedded Processor for 2008 (Tolapai) System Architecture
[Figure: block diagrams comparing the current equivalent solution (Intel® Pentium® M processor, MCH, ICH, security co-processor, ATA, Ethernet PHY, WAN/LAN) with Tolapai. Annotations on the current solution mention additional area, cost and core utilization, and a lookaside-only architecture.]
• IA SoC optimized for power/performance
• In-line network/security acceleration
• Integrated I/O devices
• IA SW compatibility
• Highest compute cycles available/$
SoC reduces area, cost, power
Tolapai = Intel® Embedded Processor for 2008 (Tolapai)

HW Architecture
• IA CPU core w/ 256 KB L2 cache
  - Intel® Pentium® M processor derivative
• Integrated memory controller
  - 1 channel 64-bit DDR2
  - 4 channel DMA engine
  - PCI Express* (1x8, 2x4, or 2x1)
• Intel® QuickAssist Acceleration
  - Multi-core, multi-threaded engines
  - 256 KB internal SRAM
  - Security hardware acceleration for:
    Bulk: AES, 3DES, (A)RC4
    Hash: MD5, SHA-x
    Public key: RSA, DSA, DH
    True Random Number Generator (TRNG)
• Integrated I/O interfaces
  - 3x TDM (12 T1/E1)
  - 3x GbE MAC (RGMII or RMII)
  - 1x Local Expansion Bus (16b)
  - 2x Controller Area Network (CAN)
  - 1x Sync Serial Port (SSP)
  - 2x UART, 37x GPIO, 2x SMBus/I2C, LPC
  - 2x USB, 2x SATA
  - WDT, RTC
[Figure: Tolapai block diagram. Acceleration and I/O Complex: Acceleration Services Unit‡, Security Services Unit‡ (3DES, AES, (A)RC4, MD5, SHA-x, PKE, TRNG), 256 KB ASU SRAM, MDIO (x1), CAN (x2)‡, SSP (x1)‡, Local Expansion Bus (16b @ 80 MHz), TDM Interface (12 E1/T1)‡, GigE MAC #0, #1, #2, IEEE-1588. IA Complex: IA32 core with 256K L2 cache, FSB, IMCH (Memory Controller Hub, Transparent PCI-to-PCI Bridge, EDMA), IICH (APIC, DMA, Timers, Watch Dog Timer, RTC, HPET (x3)), PCI Express* Interface (Gen1; 1x8, 2x4 or 2x1 root complex), DDR2 Memory Controller (400/533/667/800, 64b with ECC), UART (x2), SATA 2.0 (x2), USB 2.0 (x2), GPIO (x37), SMBus (x2), LPC 1.1.]
‡ Enabling software required.
SoC integrates processor, chipset, accelerators

Packet Processing Flows
• Classic IA (blue)
  - GigE Rx DMA packets to DRAM (includes IA snoop)
  - IA interrupt
  - IA CPU runs protocol
  - IA CPU controls GigE Tx
• Fastpath (red)
  - GigE Rx DMA packets to DRAM
  - Interrupt routed to accelerator
  - Accelerator operates on packet
  - Forwarding/filtering and security functions can be handled w/o IA CPU intervention
  - Accelerator controls GigE Tx
• Exception packets (green)
  - Move packet to coherent DRAM (includes IA snoop)
  - Accelerator signals IA CPU
[Figure: the Tolapai block diagram from the HW Architecture slide with the three flows overlaid in blue, red and green.]
Processing flows preserve legacy, provide next-gen performance

Intel® Embedded Processor for 2008 (Tolapai) Architecture Overview: High Level SW Model
[Figure: software stack. Customer App; IA Core; OS/Stack; Driver/Shim; API; IO/Acceleration/Security Access Library; Acceleration/Security Services (or Unit).]
Software framework goal: enable scalable software solutions

Intel® Embedded Processor for 2008 (Tolapai) Software Architecture
• Drivers for Linux* (Red Hat) and FreeBSD
• Intel® QuickAssist Technology Security API
  - Low level crypto API: PKCS #11 compliant
  - High level protocol support
• Integrated with middleware frameworks
  - OpenSSL
  - OCF
  - FreeS/WAN, etc.
[Figure: layered software diagram. Application; FreeS/WAN, OpenSSL and OCF shim layers; Intel® QuickAssist Technology Security API; IA Acceleration Drivers; Protocol Accelerators (IPSec, SSL/TLS, IKE); Acceleration (bulk processing; public key: RSA, DH, DSA; combined encrypt and hash operation); Low Level Crypto Acceleration (bulk, packet classify, rand, mod exp, authentication).]
Convenient, efficient access to security accelerators

Intel® QuickAssist Acceleration Technology Security Applications – Single Chip VPN/FW
[Figure: protocol stack (User Data, App, TCP, IP, MAC & PL; L5-7, L4, L3, L2 & 1) with security functions (Auth. State, Bulk Keys, Encrypt & Hash, Authenticate), mapped onto two implementations. Traditional solutions: high performance RISC or CISC processor, 1 GbE PCI-X NIC, security co-processor. Tolapai with acceleration: IA-32 core, Intel® QuickAssist Acceleration Technology, IMCH, IICH, 72-bit DDR2, GbE.]
Single Chip Solutions - VPN/Firewall

Intel® Embedded Processor for 2008 (Tolapai) - Summary
• The Tolapai System-on-a-Chip enables single chip, small form factor developments, bringing x86 performance and cost effectiveness to new applications!
• The Intel® QuickAssist Integrated Accelerator within Tolapai draws its identity from the QuickAssist Software Services Modules, enabling customers to develop complete communications or security solutions in a single chip design.
• The Tolapai Integrated Accelerators make possible single chip solutions such as SMB IP PBX and VPN/Firewall.
SoC brings x86 performance and cost-effectiveness to new apps

FSB-FPGA Accelerator Platform (FAP)
• Field Programmable Gate Arrays (FPGAs) allow the development of domain-specific, hardware-based parallel algorithms that execute significantly faster than the equivalent algorithms in software
• The Front Side Bus (FSB) provides a high performance, low latency interconnect between AHM and CPU
FSB provides high-bandwidth access to third party vendor (TPV) accelerators

FSB-FPGA Accelerator Hardware Modules (AHMs) - What are FSB-FPGA Accelerators?
• Modules that plug into Intel® Xeon® processor sockets
• Attach directly to the Front Side Bus (FSB)
• FSB provides the highest performance, lowest latency interconnect
[Figure: FSB-FPGA modules shown in processor sockets next to banks of AMB memory modules on three platforms: Intel® 5000 Series (Bensley) Two Socket Platform for 2007; (Stoakley) Entry Two Socket Workstation Platform for 2008; Intel® 7000 Series (Caneland) Expandable Server Platform for 2007. Caption: Server Platforms – DP & MP.]

FSB-FPGA Accelerator Hardware Modules (AHMs) - Multiple AHMs
• Flexible hardware configuration
• Partition complex algorithms across multiple accelerator modules
• Higher degrees of parallelization for even higher performance
• Connect to external I/O sources
• Multiple FPGA modules can be connected in a ring topology
[Figure: several FSB-FPGA modules on the (Caneland) Expandable Server Platform for 2007, linked by multi-gigabit serial links.]

Shared Memory: SMP
• In a typical SMP system, all CPUs can see system memory, but they can't see each other because they don't occupy address space
• CPUs usually communicate through shared memory and signal each other using Inter-Processor Interrupts (IPIs)
[Figure: two Xeon CPUs exchanging Inter-Processor Interrupts (IPI), both attached to the Memory Controller Hub and to shared system memory.]
How SMP architectures work

Shared Memory: CPU+AHM
• A similar technique is used to communicate with FSB-attached AHMs
• The AHM device driver allocates (and pins) system memory for AHM device registers, command/response queues and shared workspaces
• The physical address of the device register memory area is sent to the AHM
• The CPU communicates with the AHM through this memory-based interface
[Figure: CPU and AHM, each attached by its own FSB.]
AHM Core participates in FSB protocol

FAP System Architecture
• Accelerator Hardware Modules (AHMs) contain FPGAs, SRAM, flash memory and control logic.
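The memory-based interface from the Shared Memory: CPU+AHM slide can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the actual AHM driver or its register map: the region layout, the ring size, the `OP_DOUBLE` opcode and every function name below are hypothetical, and a `bytearray` stands in for the pinned system memory whose physical address a real driver would hand to the AHM.

```python
import struct

# Hypothetical layout of the shared region (names and sizes are assumptions):
#   [0:16)  registers: cmd_head, cmd_tail, rsp_head, rsp_tail (4 x uint32)
#   then    command ring:  SLOTS entries of (opcode, operand), uint32 each
#   then    response ring: SLOTS entries of (status, result),  uint32 each
SLOTS = 4
REG_FMT = "<4I"
ENTRY_FMT = "<2I"
REG_SIZE = struct.calcsize(REG_FMT)
ENTRY_SIZE = struct.calcsize(ENTRY_FMT)
CMD_BASE = REG_SIZE
RSP_BASE = CMD_BASE + SLOTS * ENTRY_SIZE
REGION_SIZE = RSP_BASE + SLOTS * ENTRY_SIZE

OP_DOUBLE = 1  # made-up opcode: the simulated accelerator doubles the operand


def alloc_region():
    """Stand-in for the driver allocating (and pinning) system memory."""
    return bytearray(REGION_SIZE)


def _regs(mem):
    return list(struct.unpack_from(REG_FMT, mem, 0))


def _set_regs(mem, regs):
    struct.pack_into(REG_FMT, mem, 0, *regs)


def cpu_submit(mem, opcode, operand):
    """CPU side: write a command entry, then publish it by advancing cmd_tail."""
    cmd_head, cmd_tail, rsp_head, rsp_tail = _regs(mem)
    if cmd_tail - cmd_head == SLOTS:
        return False  # command ring is full
    offset = CMD_BASE + (cmd_tail % SLOTS) * ENTRY_SIZE
    struct.pack_into(ENTRY_FMT, mem, offset, opcode, operand)
    _set_regs(mem, [cmd_head, cmd_tail + 1, rsp_head, rsp_tail])
    return True


def ahm_service(mem):
    """Simulated accelerator: consume pending commands and write responses."""
    cmd_head, cmd_tail, rsp_head, rsp_tail = _regs(mem)
    while cmd_head != cmd_tail and rsp_tail - rsp_head < SLOTS:
        offset = CMD_BASE + (cmd_head % SLOTS) * ENTRY_SIZE
        opcode, operand = struct.unpack_from(ENTRY_FMT, mem, offset)
        if opcode == OP_DOUBLE:
            status, result = 0, (operand * 2) & 0xFFFFFFFF
        else:
            status, result = 1, 0  # unknown opcode: report an error status
        offset = RSP_BASE + (rsp_tail % SLOTS) * ENTRY_SIZE
        struct.pack_into(ENTRY_FMT, mem, offset, status, result)
        cmd_head += 1
        rsp_tail += 1
    _set_regs(mem, [cmd_head, cmd_tail, rsp_head, rsp_tail])


def cpu_poll(mem):
    """CPU side: pop one (status, result) response, or return None if empty."""
    cmd_head, cmd_tail, rsp_head, rsp_tail = _regs(mem)
    if rsp_head == rsp_tail:
        return None
    offset = RSP_BASE + (rsp_head % SLOTS) * ENTRY_SIZE
    status, result = struct.unpack_from(ENTRY_FMT, mem, offset)
    _set_regs(mem, [cmd_head, cmd_tail, rsp_head + 1, rsp_tail])
    return status, result
```

A caller would run `mem = alloc_region()`, submit work with `cpu_submit(mem, OP_DOUBLE, 21)`, let the simulated device run with `ahm_service(mem)`, and collect the answer with `cpu_poll(mem)`. The sketch leaves out everything that makes the real mechanism hard: memory barriers, cache coherency over the FSB, index wrap-around at 32 bits, and interrupt or doorbell signalling.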