Intel® QuickAssist Technology Components

Neal Oliver, PhD, Principal Engineer, Intel Corporation

QATS002

Agenda

• Intel® QuickAssist Architecture Overview
• Selected Intel® QuickAssist Architecture Components
  - Intel® Embedded Processor for 2008 (Tolapai)
    - Hardware Architecture
    - Software Architecture
    - Use Cases
  - Intel® QuickAssist FSB-FPGA Accelerator Platform (FAP)
    - System Architecture
    - Accelerator Hardware Module (AHM)
    - Design Flow
  - Intel® QuickAssist Technology Accelerator Abstraction Layer (AAL)
    - AAL services and features
    - Software Architecture

Intel’s response to customer need for acceleration on Intel platforms

Intel® QuickAssist Technology – Comprehensive Approach to Acceleration

• Multiple accelerator and attach options with software and ecosystem support
• Performance and scalability based on customer needs and priorities

Intel’s response to customer need to deploy accelerators on Intel® architecture platforms.

Intel® Embedded Processor for 2008 (Tolapai)

Single die integrates:
- IA-32 based core @ 600, 1066 and 1200 MHz
- DDR2 memory controller (MCH)
- PCI Express* Technology
- Standard IA PC peripherals (ICH)
- 3x Gigabit Ethernet MACs
- 3x TDM high-speed serial interfaces for 12 T1/E1 or SLIC/CODEC connections
- Intel® QuickAssist Integrated Accelerator

Vital statistics:
- 148 million transistors
- 1,088-ball FCBGA w/ 1.092 mm pitch
- 37.5 mm x 37.5 mm package

System-on-Chip (SoC) enabling performant, effective system solutions

Intel® Embedded Processor for 2008 (Tolapai) System Architecture

[Figure: a current equivalent solution (Intel® Pentium® M processor, MCH, ICH, security co-processor, Ethernet PHYs for WAN/LAN, ATA) shown next to the single-chip Tolapai* solution]

Current equivalent solution:
- Additional area, cost
- Additional core utilization
- Architecture: lookaside only

Intel® Embedded Processor for 2008 (Tolapai):
• IA SoC optimized for power/performance
• In-line network/security acceleration
• Integrated I/O devices
• IA SW compatibility
• Highest compute cycles available/$

SoC reduces area, cost, power

Tolapai = Intel® Embedded Processor for 2008 (Tolapai)

HW Architecture

• IA CPU core w/ 256KB L2 cache
  - Intel® Pentium® M processor derivative
• Integrated memory controller
  - 1 channel 64-bit DDR2
  - 4 channel DMA engine
  - PCI Express* (1x8, 2x4, or 2x1)
• Intel® QuickAssist Acceleration
  - Multi-core, multi-threaded engines
  - 256KB internal SRAM
  - Security hardware acceleration for
    - Bulk: AES, 3DES, (A)RC4
    - Hash: MD5, SHA-x
    - Public key: RSA, DSA, DH
    - Internal True Random Number Generator (TRNG)
• Integrated I/O interfaces
  - 3x TDM (12 T1/E1)
  - 3x GbE MAC (RGMII or RMII)
  - 1x Local Expansion Bus (16b)
  - 2x Controller Area Network (CAN)
  - 1x Sync Serial Port (SSP)
  - 2x UART, 37x GPIO, 2x SMBus/I2C, LPC
  - 2x USB, 2x SATA
  - WDT, RTC

[Figure: block diagram. Acceleration and I/O Complex: Acceleration Services Unit‡, Security Services Unit‡ (3DES, AES, (A)RC4, MD5, SHA-x, PKE, TRNG), 256 KB ASU SRAM, Local Expansion Bus (16b @ 80 MHz), TDM Interface (12 E1/T1), GigE MAC #0 to #2, MDIO (x1), CAN (x2)‡, SSP (x1)‡, IEEE-1588. IA Complex: IA32 core with 256K L2 cache, FSB, Memory Controller Hub (IMCH) with transparent PCI-to-PCI bridge, EDMA, PCI Express* interface (Gen1; 1x8, 2x4 or 2x1 root complex) and memory controller (DDR2 400/533/667/800, 64b with ECC); IICH with APIC, DMA, timers, watchdog timer, RTC, HPET (x3), UART (x2), SATA 2.0 (x2), USB 2.0 (x2), GPIO (x37), SMBus (x2), LPC 1.1.]

‡ Enabling software required.

SoC integrates processor, chipset, accelerators

Packet Processing Flows

• Classic IA (blue)
  - GigE Rx DMA packets to DRAM (includes IA snoop)
  - IA interrupt
  - IA CPU runs protocol
  - IA CPU controls GigE Tx
• Fastpath (red)
  - GigE Rx DMA packets to DRAM
  - Interrupt routed to accelerator
  - Accelerator operates on packet
  - Forwarding/filtering and security functions can be handled w/o IA CPU intervention
  - Accelerator controls GigE Tx
• Exception Packets (green)
  - Move packet to coherent DRAM (includes IA snoop)
  - Accelerator signals IA CPU

[Figure: the HW Architecture block diagram (Acceleration and I/O Complex, IA Complex, IMCH, IICH) with the three flows overlaid in blue, red and green]

‡ Enabling software required.
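The three packet flows amount to a dispatch decision made per received packet. The following is a minimal Python sketch of that decision only. The `Packet` fields and the `select_flow` rule are hypothetical, chosen to illustrate the slide; they are not part of any Intel API, and the real routing is done in hardware and enabling software.

```python
from dataclasses import dataclass
from enum import Enum


class Flow(Enum):
    CLASSIC_IA = "classic IA"
    FASTPATH = "fastpath"
    EXCEPTION = "exception"


@dataclass
class Packet:
    # Hypothetical fields, chosen only to drive this example.
    rx_irq_to_accelerator: bool   # Rx interrupt is routed to the accelerator
    accelerator_can_finish: bool  # no IA CPU intervention is needed


# Steps per flow, paraphrased from the slide bullets.
STEPS = {
    Flow.CLASSIC_IA: [
        "GigE Rx DMAs packet to DRAM (includes IA snoop)",
        "IA CPU is interrupted",
        "IA CPU runs protocol",
        "IA CPU controls GigE Tx",
    ],
    Flow.FASTPATH: [
        "GigE Rx DMAs packet to DRAM",
        "Interrupt routed to accelerator",
        "Accelerator operates on packet",
        "Accelerator controls GigE Tx",
    ],
    Flow.EXCEPTION: [
        "Move packet to coherent DRAM (includes IA snoop)",
        "Accelerator signals IA CPU",
    ],
}


def select_flow(pkt):
    """Pick the flow a packet would follow under the assumed rule."""
    if not pkt.rx_irq_to_accelerator:
        return Flow.CLASSIC_IA
    if pkt.accelerator_can_finish:
        return Flow.FASTPATH
    return Flow.EXCEPTION
```

Under this assumed rule, an exception packet is one that entered the fastpath but turned out to need the IA CPU, which matches the slide's description of the accelerator moving the packet to coherent DRAM and then signalling the CPU.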

Processing flows preserve legacy, provide next-gen performance

Intel® Embedded Processor for 2008 (Tolapai) Architecture Overview: High Level SW Model

Customer App

IA Core OS/Stack

Driver/Shim

…API IO/Acceleration/Security Access Library

Acceleration/Security Services (or Unit)

Software framework goal: enable scalable software solutions

Intel® Embedded Processor for 2008 (Tolapai) Software Architecture

• Drivers for Linux* (Red Hat) and FreeBSD
• Intel® QuickAssist Technology Security API
  - Low level crypto API – PKCS #11 compliant
  - High level protocol support
• Integrated with middleware frameworks
  - OpenSSL
  - OCF
  - FreeS/WAN, etc.

[Figure: software stack. Application; FreeS/WAN, OpenSSL and OCF shim layers; Intel® QuickAssist Technology Security API; IA Acceleration Drivers; Protocol Acceleration (IPSec, SSL/TLS, IKE); Accelerators (bulk processing with combined encrypt and hash operation; public key: RSA, DH, DSA); Low Level Acceleration (bulk crypto, packet classify, rand, mod exp, authentication)]

Convenient, efficient access to security accelerators

Intel® QuickAssist Acceleration Technology Security Applications – Single Chip VPN/FW

[Figure: packet layers (User Data and App at L5-7, TCP at L4, IP at L3, MAC&PL at L2&1; labels include Auth., State, Bulk, Keys) with the Authenticate and Encrypt & Hash steps, comparing Traditional Solutions (high performance RISC or CISC processor, 1 GbE PCIX NIC, security co-processor) with Tolapai with Acceleration (IA 32 core, Intel® QuickAssist Acceleration Technology, IMCH, IICH, 72 bit DDR2, GbE)]

Single Chip Solutions - VPN/Firewall

Intel® Embedded Processor for 2008 (Tolapai) - Summary

• The Tolapai System-on-a-Chip enables single chip, small form factor developments, bringing performance and cost effectiveness to new applications!
• The Intel® QuickAssist Integrated Accelerator within Tolapai draws its identity from the QuickAssist Software Services Modules, enabling customers to develop complete communications or security solutions in a single chip design.
• The Tolapai Integrated Accelerators make possible single chip solutions such as SMB IP PBX and VPN/Firewall.

Tolapai = Intel® Embedded Processor for 2008 (Tolapai)

SoC brings x86 performance and cost-effectiveness to new apps

FSB-FPGA Accelerator Platform (FAP)

Field Programmable Gate Arrays (FPGAs) allow the development of domain-specific, hardware-based parallel algorithms that execute significantly faster than the equivalent algorithms in software.

The Front Side Bus (FSB) provides a high-performance, low-latency interconnect between the Accelerator Hardware Module (AHM) and the CPU.

FSB provides high-bandwidth access to third party vendor (TPV) accelerators

What are FSB-FPGA Accelerators?

- FPGA Accelerator Hardware Modules (AHMs) that plug into Intel® Xeon® processor sockets
  - Attach directly to the Front Side Bus (FSB)
  - FSB provides the highest performance, lowest latency interconnect

[Figure: FSB-FPGA modules installed in processor sockets of Intel® Xeon® server platforms. 5000 Series: Intel® Entry Two Socket Platform for 2007 (Bensley) / Intel Workstation Two Socket Platform for 2008 (Stoakley). 7000 Series: Intel® Expandable Server Platform for 2007 (Caneland).]

Intel® Xeon® Server Platforms – DP & MP

Multiple AHMs

- Multiple FPGA modules can be connected in a ring topology
  - Partition complex algorithms across multiple accelerator modules
  - Higher degrees of parallelization for even higher performance
  - Connect to external I/O sources

[Figure: FSB-FPGA modules on the Intel® Expandable Server Platform for 2007 (Caneland), interconnected by multi-gigabit serial links]

Flexible hardware configuration

Shared Memory: SMP

• In a typical SMP system, all CPUs can see system memory, but they can’t see each other because they don’t occupy address space
• CPUs usually communicate through shared memory and signal each other using Inter-Processor Interrupts (IPIs)

[Figure: two Xeon CPUs exchanging Inter-Processor Interrupts (IPI) and sharing system memory through the Memory Controller Hub]

How SMP architectures work

Shared Memory: CPU+AHM

• A similar technique is used to communicate with FSB-attached AHMs
• The AHM device driver allocates (and pins) system memory for AHM device registers, command/response queues and shared workspaces
• The physical address of the device register memory area is sent to the AHM
• CPU communicates with AHM through this memory-based interface

[Figure: CPU and AHM attached to the FSB, sharing system memory]
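A memory-based command queue of the kind described above can be modelled as a ring of fixed-size slots in one shared buffer, with both parties holding a view of the same bytes. The sketch below is only a Python illustration of that idea. The header layout, the slot format (`opcode`, `operand`) and the `SharedQueue` class are assumptions made for the example; the slides do not describe the actual register or queue layout, and a real driver would use pinned physical memory with appropriate ordering guarantees.

```python
import struct

# Assumed layout: a header with two 32-bit indices, then NUM_SLOTS commands.
HDR = struct.Struct("<II")    # head (consumer index), tail (producer index)
SLOT = struct.Struct("<IQ")   # hypothetical command: opcode (u32), operand (u64)
NUM_SLOTS = 4


class SharedQueue:
    """Ring of fixed-size commands stored in one shared memory region."""

    def __init__(self, mem):
        self.mem = mem  # writable view of the shared region

    def _offset(self, index):
        return HDR.size + (index % NUM_SLOTS) * SLOT.size

    def push(self, opcode, operand):
        """Producer side (CPU): returns False when the ring is full."""
        head, tail = HDR.unpack_from(self.mem, 0)
        if tail - head == NUM_SLOTS:
            return False
        SLOT.pack_into(self.mem, self._offset(tail), opcode, operand)
        HDR.pack_into(self.mem, 0, head, tail + 1)
        return True

    def pop(self):
        """Consumer side (AHM model): returns None when the ring is empty."""
        head, tail = HDR.unpack_from(self.mem, 0)
        if head == tail:
            return None
        cmd = SLOT.unpack_from(self.mem, self._offset(head))
        HDR.pack_into(self.mem, 0, head + 1, tail)
        return cmd


# One region, two views: neither side copies the other's data.
region = bytearray(HDR.size + NUM_SLOTS * SLOT.size)
cpu_side = SharedQueue(memoryview(region))
ahm_side = SharedQueue(memoryview(region))
cpu_side.push(1, 0xABC)
print(ahm_side.pop())
```

The point of the design is that the only thing exchanged out of band is the location of the region; every command after that travels through memory both sides can already see.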

AHM Core participates in FSB protocol

FAP System Architecture

• Accelerator Hardware Modules (AHMs) contain FPGAs, SRAM, flash memory and control logic. AAL will support multiple AHMs from different vendors. AHMs may also be interconnected with serial links.
• Accelerator Function Units (AFUs) implement accelerated algorithms on compute FPGAs.
• The Management AFU supports FPGA reconfiguration and other platform services on the bridge FPGA.
• AHM Core logic provides FSB interface logic, a low-level AHM device interface and multiple AFU engine interfaces.
• Workspace memory is a reserved area of system memory managed by the AHM driver
  - Memory is allocated, pinned (to prevent swapping) and mapped into the application’s virtual address space. Both the application and the AFU can access this memory directly.
• The Accelerator Abstraction Layer (AAL) provides generic accelerator discovery, configuration and messaging services.
• AFU proxies implement AFU-specific message formatting and API.
• Domain Specific Libraries may be used to provide a more abstract interface on top of AAL.
• Applications access accelerator functionality through the domain specific library or directly through AAL.

FAP architecture provides critical functions of FSB protocol

AHM Architecture

AHM Core implements critical functions of FSB agent

AFU Engine Interface

• One pair of Command and Data FIFOs in each direction
• Separate CSR and bulk data transfer interfaces
• Command FIFO holds Cmd_Hdr indicating the type of transfer
• Data FIFO holds data corresponding to each Cmd_Hdr
• Benefits
  - High performance
  - Use FPGA vendor FIFO macro implementation
  - Simple FIFO interface can also be used to cross clock boundaries
  - Variable burst size, 64B to 4MB
  - Supports bursting with/without wait states
  - Receiver controls dataflow

[Figure: SPL2AFU and AFU2SPL Cmd and Data FIFOs, with separate CSR Write and CSR Update paths]
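The pairing of a command FIFO with a data FIFO can be sketched as two queues that are always advanced together. The Python model below is illustrative only: `CmdHdr`, its `kind` and `length` fields and the `FifoPair` class are hypothetical names, since the slide does not give the Cmd_Hdr format. Only the 64 B to 4 MB burst range is taken from the slide.

```python
from collections import deque
from dataclasses import dataclass

MIN_BURST = 64                # bytes, lower bound quoted on the slide
MAX_BURST = 4 * 1024 * 1024   # bytes, upper bound quoted on the slide


@dataclass
class CmdHdr:
    kind: str    # hypothetical transfer type label, e.g. "write"
    length: int  # number of payload bytes queued in the data FIFO


class FifoPair:
    """One direction of the interface: a command FIFO plus a data FIFO."""

    def __init__(self):
        self.cmd = deque()
        self.data = deque()

    def send(self, kind, payload):
        """Queue one burst: a header in the command FIFO, bytes in the data FIFO."""
        if not MIN_BURST <= len(payload) <= MAX_BURST:
            raise ValueError("burst size must be between 64 B and 4 MB")
        self.cmd.append(CmdHdr(kind, len(payload)))
        self.data.append(bytes(payload))

    def receive(self):
        """The receiver pulls when it is ready, so it controls the data flow."""
        if not self.cmd:
            return None
        hdr = self.cmd.popleft()
        return hdr, self.data.popleft()


# Two FifoPair objects, one per direction, model the full interface.
to_afu, from_afu = FifoPair(), FifoPair()
to_afu.send("write", b"\x00" * 64)
hdr, payload = to_afu.receive()
from_afu.send("response", payload)
```

Keeping headers and payloads in separate queues means the receiver can inspect the next header before committing to drain the payload, which is one plausible reading of "receiver controls dataflow".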

Simple Accelerator Interface

AFU Design Flow

• Multiple languages will be available for developing FPGA accelerator algorithms
  - VHDL / Verilog
  - High Level Languages
    - C, C++, Java
  - Matlab*
• Intel is working with a variety of 3rd-party tool vendors to provide FPGA software development kits

[Figure: design flow. High Level Language (C, C++, Java), DSP Flow (e.g. Matlab) and VHDL / Verilog sources pass through compilers to VHDL, then Simulation, Synthesis, Gates / EDIF, and finally the FPGA]

Flexible Design Choices

FPGA Acceleration Platform (FAP) - Summary

• FAP enables FPGA, tool and hardware vendors to provide customers with a complete solution for integrating FSB-FPGA acceleration into their applications
  - Comprehensive Approach To Acceleration
  - Flexible Hardware Configuration
  - Zero-Copy Programming Model
  - Simple AFU Interface, flexible design choices

FAP brings FPGA acceleration to Intel platforms

Accelerator Abstraction Layer (AAL)

• AAL allows FAP accelerators in an Intel platform to be managed as a uniform set of resources
  - Independent of attach technology (e.g. FSB, PCI Express* Technology)
  - Independent of the acceleration workload
  - Defines a common host programming model
  - Provides a common presentation of the services using existing interconnect interfaces
    - Tightly coupled accelerators: FSB
    - Loosely coupled accelerators: PCI Express Technology

AAL presents accelerators as uniform resources to application programmer

AAL Services

• Installation and configuration of accelerators on the platform
  - Uniform interface for installers to register the packages they are installing
  - Uniform interface for applications / libraries to query, enumerate, find and load installed packages
• Communication between applications and accelerators
  - Asynchronous programming model with event dispatcher and delivery: optimized for concurrent task processing
  - Protection of shared resources allowing multiple applications to use one accelerator: support for multiple threads
  - Dynamic binding to acceleration packages: provides maximum flexibility without changing framework
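The installation and configuration services (register, enumerate, find, load) can be pictured as a small package registry. The sketch below is a hypothetical in-memory model written in Python for illustration. The `Registrar` class, its method names and the attribute keys are assumptions for this example and are not the AAL API.

```python
class Registrar:
    """Hypothetical in-memory registry of accelerator packages."""

    def __init__(self):
        self._packages = {}

    def register(self, name, attributes, loader):
        """Installer side: record a package, its attributes and how to load it."""
        self._packages[name] = (dict(attributes), loader)

    def enumerate(self):
        """Application side: list every installed package name."""
        return sorted(self._packages)

    def find(self, **wanted):
        """Application side: names of packages whose attributes all match."""
        return [
            name
            for name, (attrs, _) in sorted(self._packages.items())
            if all(attrs.get(key) == value for key, value in wanted.items())
        ]

    def load(self, name):
        """Bind to a package at run time by calling its loader."""
        _, loader = self._packages[name]
        return loader()


registrar = Registrar()
registrar.register("crypto-afu", {"attach": "FSB", "domain": "security"}, dict)
registrar.register("fft-afu", {"attach": "PCIe", "domain": "dsp"}, list)
print(registrar.find(attach="FSB"))
```

Because applications look packages up by attribute rather than by a hard-coded name, a new package can be installed without changing the application, which is the dynamic binding the slide refers to.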

AAL services satisfy requirements of enterprise deployments

AAL Software Architecture

[Figure: layered software stack. User mode: Application; AAS and AFU(n) Proxy; AIA(n). Interface labels shown between the layers: ISystem, IFactory, IAFU, Callback, IProprietary*, IRegistrar, IEDS, IAFUFactory, IPIP, Proprietary*. Kernel mode: FSB AHM Driver(n), reached through Open, Close, Read, Write, IOCTL, MMap, … and "Read, park thread". Legend: 3rd Party / Standard / Intel]

AAL callback model provides distributed programming features

AAL Interfaces

• Installation and Configuration
  - AFU Registration Interface (IRegistrar)
    - Used by installer to register AFU packages
    - Administrative privilege required
  - AAL System Interface (ISystem)
    - Called once per process to initialize AAL
  - AAL Factory Interface (IFactory)
    - Called to create AFU proxy object instances
• Communication
  - AAL Event Delivery Service (IEDS)
    - Used to deliver events and support different threading models
  - AIA Interface (IAFU)
    - Implemented by AFU proxy to support data exchange with AHM
    - User level privilege since device driver validates all workspace memory references
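The asynchronous model implied by these interfaces (create a proxy through a factory, submit work, receive the result later as an event) can be sketched as follows. This is a Python illustration with made-up names: `EventDeliveryService`, `Factory`, `AfuProxy` and the event dictionary are assumptions that loosely mirror the roles of IEDS, IFactory and IAFU, and the "accelerator" here simply reverses the input bytes.

```python
import queue
import threading


class EventDeliveryService:
    """Hypothetical event dispatcher: one worker thread delivers callbacks."""

    def __init__(self):
        self._events = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def post(self, callback, event):
        self._events.put((callback, event))

    def _run(self):
        while True:
            item = self._events.get()
            if item is None:  # sentinel queued by shutdown()
                break
            callback, event = item
            callback(event)

    def shutdown(self):
        """Deliver everything already posted, then stop the worker."""
        self._events.put(None)
        self._worker.join()


class AfuProxy:
    """Hypothetical proxy: submit() returns at once, the result arrives as an event."""

    def __init__(self, eds):
        self._eds = eds

    def submit(self, data, on_done):
        # A real proxy would message the AHM; this stand-in reverses the bytes.
        self._eds.post(on_done, {"status": "done", "result": data[::-1]})


class Factory:
    """Hypothetical factory that creates proxies bound to one event service."""

    def __init__(self, eds):
        self._eds = eds

    def create(self):
        return AfuProxy(self._eds)


eds = EventDeliveryService()          # initialise once per process
proxy = Factory(eds).create()         # create an AFU proxy instance
proxy.submit(b"abc", print)           # callback runs on the dispatcher thread
eds.shutdown()
```

Delivering callbacks from a dedicated dispatcher keeps the submitting thread free to queue more work, which is what makes the model suitable for concurrent task processing.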

Component model integrates with enterprise software frameworks

Accelerator Abstraction Layer (AAL) - Summary

• Intel’s AAL provides a uniform programming interface for accelerators
  - Integration with enterprise software frameworks
  - Management of accelerators as resources
  - Concurrent programming model across FPGA and attachment technologies

Using AAL protects your software investment

Additional sources of information on this topic:

• Other Sessions / Chalk Talks / Labs
  - QATS001 – Overview of Intel® QuickAssist Technology
• For an overview of Intel® Embedded Processor for 2008 (Tolapai) with Intel® QuickAssist Technology go to: www.intel.com/go/soc
• Accelerators in Action: Visit the I/O & Application Acceleration Community in the showcase to see technology demonstrations from Intel and other industry-leading companies
• More web based info: http://www.intel.com/go/quickassist

Session Presentations - PDFs

The PDF of this Session presentation is available from our IDF Content Catalog: https://intel.wingateweb.com/SHchina/catalog/controller/catalog

These can also be found from links on www.intel.com/idf

Please Fill out the Session Evaluation Form

Put in your lucky draw coupon to win the prize at the end of the track!

You must be present to win!

Thank You for your input, we use it to improve future events

Legal Disclaimer

• INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS.
• Intel may make changes to specifications and product descriptions at any time, without notice.
• All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
• Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.
• Tolapai, FAP, AAL and other code names featured are used internally within Intel to identify products that are in development and not yet publicly announced for release. Customers, licensees and other third parties are not authorized by Intel to use code names in advertising, promotion or marketing of any product or services and any such use of Intel's internal code names is at the sole risk of the user.
• Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance.
• Intel, Intel Inside, Intel® QuickAssist Technology, Intel® Xeon®, and the Intel logo are trademarks of Intel Corporation in the United States and other countries.
• *Other names and brands may be claimed as the property of others.
• Copyright © 2008 Intel Corporation.

Risk Factors

This presentation contains forward-looking statements that involve a number of risks and uncertainties. These statements do not reflect the potential impact of any mergers, acquisitions, divestitures, investments or other similar transactions that may be completed in the future, with the exception of the Numonyx transaction. Our forward-looking statements for 2008 reflect the expectation that the Numonyx transaction will close during the first quarter. The information presented is accurate only as of today’s date and will not be updated. In addition to any factors discussed in the presentation, the important factors that could cause actual results to differ materially include the following: Factors that could cause demand to be different from Intel's expectations include changes in business and economic conditions, including conditions in the credit market that could affect consumer confidence; customer acceptance of Intel’s and competitors’ products; changes in customer order patterns, including order cancellations; and changes in the level of inventory at customers. Intel’s results could be affected by the timing of closing of acquisitions and divestitures. Intel operates in intensely competitive industries that are characterized by a high percentage of costs that are fixed or difficult to reduce in the short term and product demand that is highly variable and difficult to forecast. Additionally, Intel is in the process of transitioning to its next generation of products on 45 nm process technology, and there could be execution issues associated with these changes, including product defects and errata along with lower than anticipated manufacturing yields.
Revenue and the gross margin percentage are affected by the timing of new Intel product introductions and the demand for and market acceptance of Intel's products; actions taken by Intel's competitors, including product offerings and introductions, marketing programs and pricing pressures and Intel’s response to such actions; Intel’s ability to respond quickly to technological developments and to incorporate new features into its products; and the availability of sufficient components from suppliers to meet demand. The gross margin percentage could vary significantly from expectations based on changes in revenue levels; product mix and pricing; capacity utilization; variations in inventory valuation, including variations related to the timing of qualifying products for sale; excess or obsolete inventory; manufacturing yields; changes in unit costs; impairments of long-lived assets, including manufacturing, assembly/test and intangible assets; and the timing and execution of the manufacturing ramp and associated costs, including start-up costs. Expenses, particularly certain marketing and compensation expenses, vary depending on the level of demand for Intel's products, the level of revenue and profits, and impairments of long-lived assets. Intel is in the midst of a structure and efficiency program that is resulting in several actions that could have an impact on expected expense levels and gross margin. We expect to complete the divestiture of our NOR flash memory assets to Numonyx. A delay or failure of the transaction to close, or a change in the financial performance of the contributed businesses could have a negative impact on our financial statements. Intel’s equity proportion of the new company’s results will be reflected on its financial statements below operating income and with a one quarter lag.
Intel’s results could be affected by the amount, type, and valuation of share-based awards granted as well as the amount of awards cancelled due to employee turnover and the timing of award exercises by employees. Intel's results could be impacted by adverse economic, social, political and physical/infrastructure conditions in the countries in which Intel, its customers or its suppliers operate, including military conflict and other security risks, natural disasters, infrastructure disruptions, health concerns and fluctuations in currency exchange rates. Intel's results could be affected by adverse effects associated with product defects and errata (deviations from published specifications), and by litigation or regulatory matters involving intellectual property, stockholder, consumer, antitrust and other issues, such as the litigation and regulatory matters described in Intel's SEC reports. A detailed discussion of these and other factors that could affect Intel’s results is included in Intel’s SEC filings, including the report on Form 10-K for the fiscal year ended December 29, 2007.
