HyperTransport Extending Technology Leadership

International HyperTransport Symposium 2009

February 11, 2009

Mario Cavalli General Manager HyperTransport Technology Consortium

Copyright HyperTransport Consortium 2009

HyperTransport Extending Technology Leadership

HyperTransport and Consortium Snapshot

Industry Status and Trends

HyperTransport Leadership Role

HyperTransport Snapshot

Low Latency, High Bandwidth, High Efficiency Point-to-Point Interconnect Leadership

CPU-to-I/O CPU-to-CPU CPU-to-

Adopted by Industry Leaders in a Wider Range of Applications than Any Other Interconnect Technology

Consortium Snapshot

Formed 2001

Controls, Licenses, Promotes HyperTransport as Royalty-Free Open Standard

World Technology Leaders among Commercial and Academic Members

Newly Elected President: Mike Uhler, VP Accelerated Computing

Industry Status and Trends

Global Economic Downturn

Tough State of Affairs for All Industries

Consumer Markets Crippled, with Long-Term Recovery

Commercial Markets Strongly Impacted

Consequent Business Focus

Cost Effectiveness

No Redundancy

Frugality

Downturn Breeds Opportunities

Reinforced Need for More Optimized, Cost-Effective Computing Infrastructure

Good for HPC Sector

Creating Demand for New Technology

Delivering:
• More Value for Same Power and Cost
• Same Value for Less Power and Cost
• Best Investment Preservation

Minimized Total Cost of Ownership Through Better:
• Performance and Power Efficiency
• Resource Flexibility and Adaptability
• System Virtualization → Consolidation

Producing New Computing Trends

Cloud Computing → Hosted Software, Software as a Service (SaaS)

Replace Costly In-House Infrastructure and Management Resources

Infrastructure Centralization Demands Efficient Data Centers, Server Farms

Producing New Computing Trends (cont.)

Netbook over Notebook / Desktop

New? No
Innovative? No
Same for Less? No
Less for Much Less? Yes!

Good Enough if Budget Tight? Yes!
Right-Time, Right-Place Products? Right!

HyperTransport Leadership Role

Answers Market Trend Expectations

With Core Values

• Leading Performance
• Full Scalability
• Power Efficiency
• Low Design Cost
• Market-Proven Solidity
• Vast Product Ecosystem

Continued Technology Progression

With Expanding Market Presence

[Figure: timeline of HyperTransport specification releases, 2001 to 2009: HT 1.0, HT 1.1, HT 2.0, HTX, HT 3.0, HT 3.1, HNC 1.0 (Note 3), HTX3]

17.7M HT-Based Systems Shipped (Note 1)

62.7M HT-Based Systems Shipped (Note 2)

Note 1: by end of 2003 (Source: InStat)
Note 2: by end of 2008 (Source: InStat)
Note 3: High Node Count HT Specification 1.0 - Accessible/Usable by HTC Promoter and Contributor Members Only

HT 3.1 Specification

Keeps HT Ahead of Industry Requirements

Feature        Current Use    HT 3.1 Max    Max Headroom
Clock Rate     2.0 GHz        3.2 GHz       60%
Bandwidth      16 GB/s        51.2 GB/s     220%
Link Width     16-bit         32-bit        100%

Solidifies HT Leadership

Reinforces HT ROI

The Only 32-Bit-Capable Interconnect in Industry

[Chart: maximum bandwidth vs. clock (2.6, 2.8, 3.0, 3.2 GHz). HT 3.0: 20.8 GB/s (16-Bit), 41.6 GB/s (32-Bit). HT 3.1: 25.6 GB/s (16-Bit), 51.2 GB/s (32-Bit)]
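The bandwidth and headroom figures on this slide can be reproduced with simple arithmetic. The sketch below is illustrative only: it assumes that the quoted bandwidth is the aggregate of both link directions and that the link transfers data on both clock edges (DDR), as the deck states elsewhere. The function names are hypothetical and not part of any HyperTransport specification.

```python
def ht_aggregate_bandwidth_gbytes(clock_ghz: float, link_width_bits: int) -> float:
    """Aggregate (both directions) bandwidth of one HT link, in GB/s.

    Assumes DDR signaling and counts both directions of the link.
    """
    transfers_g_per_s = clock_ghz * 2             # DDR: two transfers per clock cycle
    bytes_per_transfer = link_width_bits / 8      # link width expressed in bytes
    per_direction = transfers_g_per_s * bytes_per_transfer
    return per_direction * 2                      # sum of both directions


def headroom_percent(current: float, maximum: float) -> float:
    """Growth from `current` to `maximum`, as a percentage of `current`."""
    return (maximum - current) / current * 100


current_bw = ht_aggregate_bandwidth_gbytes(2.0, 16)   # "Current Use" column
max_bw = ht_aggregate_bandwidth_gbytes(3.2, 32)       # "HT 3.1 Max" column

print(f"Current use: {current_bw:.1f} GB/s")
print(f"HT 3.1 max:  {max_bw:.1f} GB/s")
print(f"Clock headroom:     {headroom_percent(2.0, 3.2):.0f}%")
print(f"Bandwidth headroom: {headroom_percent(current_bw, max_bw):.0f}%")
```

Under these assumptions the same function also yields the HT 3.0 chart values when called with a 2.6 GHz clock.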

HTX3™ Specification

3x Bandwidth of HTX™ Connector Standard

• HT3.0 Performance
• HT3.0 Link Splitting Support
• More Power Mgmt. Features
• 100% Backward Compatibility

For Highest Performance Subsystems

High Node Count HT Specification 1.0

Enables Scalable HPC Systems and Clusters with Low Latency Non-Coherent Architecture

[Figure: Server 1, Server 2, ... Server n, each containing nodes (M1 to M8, Mx), linked by Direct Network / Switched Network connections; additional Mx nodes shown as +1, +2, +3]

High Node Count HT Specification 1.0 (cont.)

Answers Ever Compounding On-Chip + In-System Addressing Challenge

[Figure: exponential growth in number of CPU cores and in number of clusters/subclusters, with a "You Are Here" marker]

High Node Count Specification 1.0 (cont.)

Supports Global Sharing of Localized Data Storage Subsystems, Especially High-Density DRAM and Low-Power Flash-Based Memory Subsystems

[Figure: Server X, Server Y, and Server Z connected through a network, sharing High-Density DRAM and Flash Memory]

High Node Count Specification 1.0 (cont.)

Best System and Performance Scalability

Minimized Power Consumption

Optimized Total Cost of Ownership

Mature Stability, Mission-Critical Reliability

Field-Proven Dependability for Demanding Markets

63 Million HT-Powered Products by end of 2008

2007 Market Capture    Segment                  2007 Yr/Yr Growth
8%                     Defense Applications     17%
32%                    Top500 Supercomputers    28%
11%                    Core Routers             1.2%
22%                    Edge Routers             34%
15%                    SAN                      11%
23%                    Servers                  38%

Source: InStat

Ever Expanding Product Ecosystem

• From HT IP to HT Software
• 12 HT-Based Processor Brands
• Fosters Technology Strength
• Widespread Market Utilization

X86 Computing

Graphics

Security

Packet

Media

Comm

Acceleration

System Virtualization

Expanding Product Ecosystem (cont.)

New Godson Multi-Core Server-Class CPU

• Petascale Performance Target by 2010
• Backed by China’s Government
• MIPS-Based with 200+ More Instructions for Translation and Acceleration
• 16 GFLOPS at 1 GHz and 10 W of Power
• Earlier versions (non-HT), produced by ST Microelectronics and sold to 40 companies in set-top boxes, laptops, etc.
• About 200 developers working on Godson HW, about 100 on SW and Compilers

Institute of Computing Technology, Chinese Academy of Sciences

HyperTransport Book

Covers All HT Link and HTX Specifications

700 Pages of Must-Have Tutorial

Co-Authored by HTC’s Brian Holden

Available Online from MindShare (www.mindshare.com) in Paper and eBook Formats

Thank You!

Mario Cavalli General Manager HyperTransport Technology Consortium

Corollary Information Not Part of Live Presentation

HyperTransport Everywhere!

Also in PowerPC-Based and -Based Products

Godson Server-Class CPU

Institute of Computing Technology - Chinese Academy of Sciences

4-Core Reconfigurable Architecture

• 65-nm Technology
• Directory-Based Coherence Protocol Safeguards Cache Data
• 8 Config. Address Windows of Each Master Port Allow Pages Migration Across L2 and Memory
• Nodes Organized in Mesh
• 2 Links for Each Node’s 4 Connection Points
• Shared L2 Configurable As Internal RAM, DMA To Internal RAM Directly
• DMA Engine Supports Pre-Fetch and Matrix (Stream Processor)

[Block diagram labels: ncHT1.0 (x2), 8x8 AXI Switch, PCIe (x2)]

Godson Server-Class CPU (cont.)

Institute of Computing Technology - Chinese Academy of Sciences

Godson Versions

8-Core Multi-Chip 20W Version Possible in 2009

Godson Server-Class CPU (cont.)

Institute of Computing Technology - Chinese Academy of Sciences

Godson Cores Profile

HTX™ Spotlight

How and Why HyperTransport HTX Proves Best Choice for Compute-Intensive Applications

HTX™ Values Snapshot

Enables
• HPC Products Demanding Performance Beyond the Reach of PCI-Class Interconnects
• Integration of System Functionality Too New/Complex/Costly for MB Integration

Empowers
• HPC Solution Providers with a Competitive Edge
  – No Risks of Premature MB Integration
  – Shortest Time-to-Market
  – One MB Fits Multiple Markets/Applications
  – Up-Sell Factor

HTX™ Applications

Compute Intensive
• High Bandwidth + Low Latency
• Multi-Processing, Co-Processing

Target Markets
• Database Analytics
• High Traffic Web Services
• Stock Trading Acceleration
• Server Clustering and SMP
• Streaming Media Servers
• Financial Modeling

Expanding HTX™ Product Ecosystem

Server / MB

Data Analysis Coprocessor

Content-Aware Routing Processor

High-Perf Server Clustering Controller

Content/Security Processor

Content/Security Processor

10GE NIC Ref Design

Universal HTX/HTX3 Board Ref Design

FPGA Ref Design Board

More Innovative HTX™ Systems and Subsystems in the Pipeline

New HTX™ Systems

[Figure: expansion slot layout; Blank positions precede slots 9 and 8]

Slot     9      8      7      6      5      4      3      2      1
Type     HTX    HTX    PCIe   PCIe   PCIe   PCIe   PCIe   PCIe   PCIe
Width    x16    x16    x4     x4     x16    x4     x4     x4     x8

ProLiant DL165-G5

ProLiant DL785-G5

New HTX™ Subsystems

NumaChip Technology

Cache-Coherent Shared Memory Processor for Scalable Server Clustering

New HTX™ Subsystems

Vulcan Content-Aware Routing Processor for Multi-Core Systems

Delivers Unprecedented Multi-Core Processing and Power Optimization

Applications
• High-Traffic Web
• Telecom
• Automated Trading

High Throughput, Fast Network Access

New HTX™ Reference Designs

HTX3™ Universal Reference Design Board

HT3 Core IP

Why HTX3™?

Empowers Future HPC Innovation

• FPGAs Playing Key Role in Compute-Intensive Designs
• HTX3 Paves Way for New Generation FPGA Technology
  – FPGAs from Bandwidth Bottlenecks to Performance Drivers
• Power Optimization Ranks High in HPC Agenda
• HT 3.0 Has Reached Maturity and Stability
• HT 3.0 Capability Now Safely and Stably “Connectorized”

Reinforces HTX Performance Edge over PCI Express

HTX3™ Features Summary

Feature                          HTX         HTX3         Notes
Max Clock Rate                   800 MHz     2.6 GHz      12” Trace length
Max Bandwidth x Lane             1.6 GT/s    5.2 GT/s     Bi-directional
Max Bandwidth Aggregate          6.4 GB/s    20.8 GB/s    Bi-directional 16-Bit HT link
HT3 Link Splitting Support       NO          YES          HT link can be 1x 16-Bit or 2x 8-Bit for multi-CPU support
HT3 Extended Power Management    NO          YES          LDTREQ# Signal Added to participate in x86 power states
Extended FPGA Guidelines         NO          YES          Incorporated field-proven recommendations
Full Backward Compatibility      --          YES          Level shifters and signal allocation

For more details, see HTX3 specifications on HTC’s web site

HTX™ a Substitute for PCI Express?

No – HTX Complements and Coexists with PCIe by Providing the Capability that PCIe Cannot Deliver

[Figure labels: DDR Memory (x2); Chipset; Peripheral Interconnects; HTX™ / HTX3™ (16-Bit or 2x 8-Bit); Direct Connect to Compute-Intensive Subsystems]

Unique HTX™ Capabilities

Aggregate Latency Advantage

• 20% Better Physical Layer Latency and Bandwidth due to Absence of 8B/10B Clock Recovery Overhead
  – No SerDes
• 55% Lower Latency Per Transaction due to Absence of Intermediate Control Logic Overhead
  – 95 ns of Estimated Round Trip Penalty for PCIe Gen2 out of 170 ns Total on Short, Open Page DRAM Reads
• Vastly Leaner Protocol (Packet Payload)
  – 12 Fewer Bytes of Overhead per Packet Compared to PCIe
• 20 ns Better Per-Transaction Latency in Heavy Traffic Environments due to HT’s Priority Request Interleaving™
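The two percentages quoted above follow from the slide's own numbers. The sketch below is only a worked check, not a measurement: the 95 ns and 170 ns values come from the slide, the 8-of-10-bit ratio is the standard property of 8B/10B encoding, and the helper names are hypothetical.

```python
# Values quoted on the slide for short, open-page DRAM reads over PCIe Gen2.
PCIE_GEN2_ROUND_TRIP_NS = 170.0
CONTROL_LOGIC_PENALTY_NS = 95.0


def penalty_share(penalty_ns: float, total_ns: float) -> float:
    """Fraction of the total round trip attributed to the penalty."""
    return penalty_ns / total_ns


def encoding_overhead(payload_bits: int, line_bits: int) -> float:
    """Fraction of line bits that carry no payload (8B/10B: 8 of every 10)."""
    return 1 - payload_bits / line_bits


share = penalty_share(CONTROL_LOGIC_PENALTY_NS, PCIE_GEN2_ROUND_TRIP_NS)
print(f"Control logic share of round trip: {share:.1%}")
print(f"8B/10B encoding overhead: {encoding_overhead(8, 10):.0%}")
```

The first ratio comes out slightly above the 55% quoted on the slide, which appears to be rounded.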

Unique HTX™ Capabilities (cont.)

Up to Twice Packet/Latency Efficiency in Intra-Processor Traffic

[Chart: HTX™ Packet Overhead Efficiency Margins over PCIe. Axes: Efficiency vs. Data Bytes per Packet. Series: Min Overhead, Max Overhead. Annotation: Usual Intra-Processor Traffic]

Unique HTX™ Capabilities (cont.)

Considerable Per-Packet Latency Advantage

[Chart: HTX3™ Per Packet Latency Advantage over PCIe Gen2. Axes: Latency Advantage (ns) vs. Data Bytes Per Packet. Series: Min Packet Overhead, Max Packet Overhead. Annotation: Usual Intra-Processor Traffic. HTX3: 2.6 GHz, x16 Links. PCIe: 5.0 GHz, x16 Links]

The results take into account PCIe’s 20% clock recovery, packet payload and 55% chipset overhead penalties. HTX’s Priority Request Interleaving, if applicable, will add to HTX’s total latency advantage.

Unique HTX™ Capabilities (cont.)

Superior Bandwidth

Feature                             PCIe Gen1     PCIe Gen2      HTX             HTX3
Max Clock Rate                      2.5 GHz       5.0 GHz        800 MHz         2.6 GHz
[row label missing]                 NO            NO             YES             YES
Max Bandwidth x Lane                2.5 Gbps      5.0 Gbps       1.6 GT/s (*)    5.2 GT/s (*)
8B/10B Penalty                      -20%          -20%           No Penalty      No Penalty
Net Bandwidth x Lane                2.0 Gbps      4.0 Gbps       1.6 GT/s (*)    5.2 GT/s (*)
Net Bandwidth 16-Bit - Aggregate    8 GBytes/s    16 GBytes/s    6.4 GBytes/s    20.8 GBytes/s

(*) HyperTransport supports Double Data Rate (DDR), transferring data on both the leading and trailing edge of the clock. Therefore HyperTransport’s bandwidth is more appropriately represented by the term “Transfers/second” than the term “Bits/second.”
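The "Net Bandwidth 16-Bit - Aggregate" row can be derived from the per-lane rates in the same table. The following is a minimal sketch under stated assumptions: the PCIe figures are for a x16 link, "aggregate" means both directions combined, and the only penalty applied is the 8B/10B encoding loss shown in the table. The function name and the dictionary layout are illustrative.

```python
def net_aggregate_gbytes(raw_rate_per_lane_g: float,
                         lanes: int,
                         encoding_efficiency: float) -> float:
    """Net bandwidth of a link in GBytes/s, summed over both directions.

    raw_rate_per_lane_g: raw rate per lane in Gbps (PCIe) or GT/s (HT).
    encoding_efficiency: 0.8 for 8B/10B encoding, 1.0 when there is none.
    """
    net_per_lane = raw_rate_per_lane_g * encoding_efficiency
    per_direction_gbytes = net_per_lane * lanes / 8   # bits to bytes
    return per_direction_gbytes * 2                   # both directions


# (raw rate per lane, encoding efficiency), taken from the table above.
LINKS = {
    "PCIe Gen1": (2.5, 0.8),
    "PCIe Gen2": (5.0, 0.8),
    "HTX": (1.6, 1.0),
    "HTX3": (5.2, 1.0),
}

for name, (rate, efficiency) in LINKS.items():
    bandwidth = net_aggregate_gbytes(rate, 16, efficiency)
    print(f"{name:10s} {bandwidth:5.1f} GBytes/s")
```

Treating each HT bit line as a "lane" is a simplification made here so that one formula covers both technologies.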

Unique HTX™ Capabilities (cont.)

Tangible Time-to-Result Savings!

Compute-Intensive Tasks Require 100Ks to Billions of Packet Transactions

HTX3™ Time-to-Result Savings vs. PCIe Gen2

Bytes per Packet    Number of Packets Transferred Per Task
Transferred         100,000      1 Million    1 Billion
4                   0.78 ms      7.8 ms       7.8 s
16                  4 ms         40 ms        40 s
256                 0.32 s       3.20 s       53 min
512                 1.16 s       11.62 s      3.23 hr

The results take into account PCIe’s 20% clock recovery, packet payload and 55% chipset overhead penalties. HTX’s Priority Request Interleaving™, if applicable, will add to HTX’s total time-to-result latency advantage.
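The time-to-result table scales a fixed per-packet latency saving by the number of packets in a task. The sketch below illustrates that scaling. The per-packet values are not stated on the slide: they are an assumption, obtained here by dividing the table's 1 Million column by one million, and the names are hypothetical.

```python
# Per-packet latency saving in nanoseconds, keyed by bytes per packet.
# Assumption: back-derived from the 1 Million column of the table.
PER_PACKET_SAVING_NS = {
    4: 7.8,
    16: 40.0,
    256: 3_200.0,
    512: 11_620.0,
}


def time_saved_seconds(bytes_per_packet: int, packets: int) -> float:
    """Total time saved for a task, assuming a constant saving per packet."""
    return PER_PACKET_SAVING_NS[bytes_per_packet] * packets * 1e-9


for size in PER_PACKET_SAVING_NS:
    saved = [time_saved_seconds(size, n) for n in (100_000, 1_000_000, 1_000_000_000)]
    print(f"{size:4d} bytes: " + ", ".join(f"{s:.4g} s" for s in saved))
```

For one billion 256-byte packets this gives 3,200 s, which is the 53 minutes shown in the table.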

Unique HTX™ Capabilities (cont.)

Example: Celoxica’s Accelerator Company’s Benchmark Results

Latency Access to Network Data Regardless of Packet Size

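[Chart: latency by interface. Values shown: 1.4 µs and <10 µs. One bar is labeled HTX™ Interface; the label of the other interface is not recoverable]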

HPC - Industry’s Bright Star

Strong Business Growth Opportunities
