Simulation Strategies for Massively Parallel Design

Authored by: Ansoft Corporation

Special Thanks to:

Ansoft 2003 / Global Seminars: Delivering Performance
Presentation #2

Introduction

• Cray: Red Storm Supercomputer
– Sandia National Laboratories awarded Cray Inc. a multiyear contract to develop and deliver a new massively parallel processing (MPP) supercomputer called Red Storm. The computer will use 10,000 AMD Opteron™ processors connected via a high-bandwidth, three-dimensional mesh interconnect network.

• About Cray
– Approximately 850 employees worldwide
– Corporate headquarters: Seattle, WA
– 3 major engineering centers:
• Chippewa Falls, WI
• Mendota Heights, MN
• Seattle, WA
– NASDAQ: CRAY

• Red Storm: System Overview
– Theoretical peak performance: 40 trillion calculations per second
– 10,368 Compute Nodes: AMD 64-bit Opteron™ processors
• Connected via a low-latency, high-bandwidth, three-dimensional mesh interconnect network based on HyperTransport™ technology
– Approximately 3000 ft² including disk systems

• Red Storm: High Speed Network (HSN)
– 3D Mesh that interconnects all of the compute nodes
• 27 x 16 x 24 (x, y, z) mesh
• High-Speed Serial Link
• Nominal Data Rate: 3.2 Gbps

[Figure: Compute Node and its High Speed Network (HSN) connections. Labels: +X, -X, +Y, -Y, +Z, -Z, PCI-X]
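As a quick consistency check, the 27 x 16 x 24 mesh has exactly as many positions as the 10,368 compute nodes quoted in the system overview. The sketch below verifies that and lists the nearest neighbors of a node along the ±X, ±Y, ±Z links. It assumes an open mesh with no wraparound links, which the slides do not state either way, and the function names are illustrative only.

```python
MESH_DIMS = (27, 16, 24)  # (x, y, z) mesh size from the HSN slide


def node_count(dims=MESH_DIMS):
    """Total number of positions in the 3D mesh."""
    x, y, z = dims
    return x * y * z


def neighbors(node, dims=MESH_DIMS):
    """Nearest neighbors of `node` along +/-X, +/-Y, +/-Z.

    Assumption: open mesh (no wraparound), so edge nodes have fewer links.
    """
    result = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(node)
            coord[axis] += step
            if 0 <= coord[axis] < dims[axis]:
                result.append(tuple(coord))
    return result


def max_hops(dims=MESH_DIMS):
    """Corner-to-corner hop count in an open mesh."""
    return sum(d - 1 for d in dims)


if __name__ == "__main__":
    print("nodes:", node_count())
    print("links at a corner node:", len(neighbors((0, 0, 0))))
    print("links at an interior node:", len(neighbors((13, 8, 12))))
    print("worst-case hops:", max_hops())
```

Under the open-mesh assumption, an interior node uses all six HSN links shown in the compute node figure, while corner and edge nodes use fewer.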

• NEC Earth Simulator
– Performance: 40 Tflops
– Processor: NEC 0.15 µm vector CPU
– Date: 1997-2002
– Cost: $450M
– Development Schedule: >54 months

• Cray Red Storm
– Performance: 40 Tflops
– Processor: AMD Opteron™
– Date: 2002-2004
– Cost: $90M
– Development Schedule: 26 months

[Figure labels: Custom Hardware; System Integration]

• Relative “Cost” of Finding Hardware Design Problems
– “Cost” = “Pain” = $$$, Time to Market, Your Job, etc.

[Figure: relative cost (axis ticks 1, 2, 5, 10, 20, 50, 100) versus development stage: Preliminary Design, Detailed Design, Integration, Validation, Operation. Annotations: Software; Test and Measurement]

• Designing for High-Speed
– Difficult Aspects
• As Speed increases, Luck decreases and “Cost” increases
– Large number of codependent terms
» They are not always controllable/understood – random variation
• New effects
– Large Systems composed of many sub-systems
» Variables that could be ignored in the past must be known to a very high precision
» Signal Channel Management – How do we account for and manage information?
• New techniques
– At high speeds: Signal Integrity Engineering = Microwave Engineering
» New Design Flows
» New Techniques and Terms: Frequency Domain vs. Time-Domain
» New Tools: Harmonic Balance, Quasi-Static, Full-Wave, etc.
» New Models: 2D and 3D Physical Device Models
» Model Abstraction

[Figure: LUCK versus SPEED. Source: A. Fraser, S. Argyrakis, “Does Signal Integrity Engineering have a Future”, DesignCon 2003.]

• Designing for High-Speed
– Reverse the trend
• Decrease “Cost”: Move more Integration and Validation into early design stages. Virtual Prototypes!
• Stop relying on Luck: Better models, techniques, and tools increase the probability of first-pass success.
– Microwave Engineers have been using these techniques for over a decade

• Virtual Prototypes – Full System & Sub-Systems

[Figure: Full System and its Sub-Systems: Board/Stackup, Connectors, Daughter Card, Transitions, Packaging, Routing]

• Channel Management
– Challenge: Move Integration and Validation into Virtual Prototype System

[Figure: Channel Management diagram. Labels: SPICE Models, 3D Models, 2D Models, Model, Connectors, Boards, Vias, Packaging, Transitions, Layout, Cross-Talk, Bandwidth, Frequency Dependence, Impedance, Eye Diagram, Modes, Isolation, Skin Effect, TDR, Power Delivery, ISI, BER, Loss, Load, Delay, Source]

• Channel Management
– Common Design Environment/Integrated Database
• Solver on Demand
– Circuit: Transient/Linear/Non-Linear Harmonic Balance
– System: Mixed Mode Analysis - Baseband-through-RF
– Planar EM: 2.5D Full-Wave Method of Moments
– 3D Full-Wave: HFSS v9 Finite Elements (Solver on Demand, Now in Ansoft Designer 1.1)
– 3D Quasi-Static: Spicelink Boundary Elements (Solver on Demand, Version 6.0 coming soon)
• Solver on Demand - Information Hiding
– Prevents higher levels of design from becoming dependent on low-level details such as 3D Physical Device Modeling.

[Figure: Ansoft Design Environment with Channel Manager. Labels: Mechanical CAD, 3D, AnsoftLinks, Layout, Planar EM, DXF/GDSII, Matlab, System, C code, Circuit, SPICE]

• Why are better models, techniques, and tools needed?
– Speed = Problems
• Evolution of a short circuit
– Once interconnects stop behaving as transmission lines, SPICE models and SPICE-like tools cannot predict performance

[Figure: interconnect behavior versus SPEED. Source: A. Fraser, S. Argyrakis, “Does Signal Integrity Engineering have a Future”, DesignCon 2003.]

• Why are better models, techniques, and tools needed?
– Co-dependent terms
• Example: As speed increases, the connector performance begins to depend on the board integration.
– Adopting new models, techniques, and tools that can identify these co-dependent performance factors reduces the probability of discovering hardware problems late in the product development cycle
» Remember: Uncontrollable or unforeseen variables can still appear

• What are these uncontrollable or unforeseen variables?
– Virtual Prototypes are abstractions
• They only contain the essential details of a complex system
– Essential Details = Those that are critical to the electrical performance
– Model Abstraction efficiently uses limited computer resources and product development time
– Example: Cavity filter designers routinely use screws to tune the filter and account for manufacturing variations. When they simulate their filter designs, they do not include the threads on the screw. The threads are essential mechanical details, not electrical details
– Manufacturing Process Variations
• Example: If the virtual prototype does not account for the substrate thickness shrinking because of thermal effects in the manufacturing process, it will not predict the performance correctly.

• Ansoft and Cray
– Ansoft: Provide End-to-End Simulations of HSN Channel
• Five different classes of simulations / analysis
1. PCB/Interconnects – Mezzanine, Module, Backplane, and Red/Black Switch
2. Connectors – NexLev, GbX, and VHDM
3. Cabling – Self-Equalizing Twin-Ax (1.1 m - 8 m)
4. Packaging – HyperBGA – High Performance Organic Flip-Chip BGA
5. System – Frequency and Time Based Performance Extraction
– Cray: Provide
• Electrical Specifications
• Electrical Models
• Mechanical Models
• Board Layouts

• Red Storm: HSN Physical Configuration

[Figure: HSN physical configuration. Labels: VHDM Connector, SerDes ASIC, Backplane, AMD 64-bit Opteron™, GbX Connector, Compute Board]

• Red Storm: HSN Electrical Configuration

[Figure: HSN electrical configuration. Channel elements: SerDes, HyperBGA, Mezzanine Board, NexLev Connector, Module Board, GbX Connector, Backplane Board, VHDM Connector, Twin-ax Cable, Red/Black Switch. Vendor labels: Teradyne, Molex. Annotation: Still in Model Development]

PCB/Interconnects

• Module Board

[Figure: Module Board routing between the NexLev and GbX connectors. Net lengths: YP3_FMBP{18,19} 32 mm; YM0_TOBP{18,19} 28 mm]

• Module Board
– System (Frequency or Time Based Analysis)

– Planar EM – Coupled Bend

[Figure: Module Board schematic. Labels: _EM_SLCBENDS27, _EM_SLCBENDS28, _EM_SLCBENDS29, PlanarEM7, W18, W20, W21, U3, P1, Port1 to Port4. Segment parameters: W = 5 mil, S = 9.11 mil; P = 293, 645, 716, 3095, 401, and 376 mil]

– Solver On Demand: Circuit (Speed) or Planar EM (Accuracy)
• Choose the level of speed and accuracy
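The Solver on Demand idea (swap a faster or more detailed model behind the same block without changing the channel-level analysis) can be illustrated with a small sketch. This is not Ansoft Designer's API: the `ChannelBlock` interface, the two line models, and the 50 Ω reference are hypothetical, and the trace lengths and dielectric values are simply borrowed from other slides in this deck. Each block returns an ABCD matrix, and the cascade code never looks inside the model.

```python
import cmath
import math
from typing import Protocol, Sequence, Tuple

C0 = 299_792_458.0  # speed of light in vacuum, m/s
ABCD = Tuple[complex, complex, complex, complex]


class ChannelBlock(Protocol):
    """Hypothetical interface: any block that returns an ABCD matrix at one frequency."""

    def abcd(self, freq_hz: float) -> ABCD: ...


class IdealLine:
    """Fast model: lossless transmission line."""

    def __init__(self, z0: float, length_m: float, eps_r: float):
        self.z0, self.length_m, self.eps_r = z0, length_m, eps_r

    def gamma(self, freq_hz: float) -> complex:
        beta = 2 * math.pi * freq_hz * math.sqrt(self.eps_r) / C0
        return complex(0.0, beta)

    def abcd(self, freq_hz: float) -> ABCD:
        gl = self.gamma(freq_hz) * self.length_m
        return (cmath.cosh(gl), self.z0 * cmath.sinh(gl),
                cmath.sinh(gl) / self.z0, cmath.cosh(gl))


class LossyLine(IdealLine):
    """More detailed model: adds dielectric loss (low-loss approximation, real Z0)."""

    def __init__(self, z0: float, length_m: float, eps_r: float, tan_d: float):
        super().__init__(z0, length_m, eps_r)
        self.tan_d = tan_d

    def gamma(self, freq_hz: float) -> complex:
        beta = super().gamma(freq_hz).imag
        return complex(0.5 * beta * self.tan_d, beta)


def cascade(blocks: Sequence[ChannelBlock], freq_hz: float) -> ABCD:
    """Multiply ABCD matrices in order; the caller never sees model internals."""
    a, b, c, d = 1 + 0j, 0j, 0j, 1 + 0j
    for blk in blocks:
        a2, b2, c2, d2 = blk.abcd(freq_hz)
        a, b, c, d = a * a2 + b * c2, a * b2 + b * d2, c * a2 + d * c2, c * b2 + d * d2
    return a, b, c, d


def s21(m: ABCD, z_ref: float = 50.0) -> complex:
    """Transmission coefficient for the same reference impedance at both ports."""
    a, b, c, d = m
    return 2 / (a + b / z_ref + c * z_ref + d)


F0 = 1.6e9  # Hz; assumed fundamental of a 3.2 Gbps NRZ stream
fast = [IdealLine(50.0, 0.032, 3.4), IdealLine(50.0, 0.217, 3.4)]
detailed = [IdealLine(50.0, 0.032, 3.4), LossyLine(50.0, 0.217, 3.4, 0.006)]

print("fast model     |S21| =", abs(s21(cascade(fast, F0))))
print("detailed model |S21| =", abs(s21(cascade(detailed, F0))))
```

Only the list of blocks changes between the two runs; `cascade` and `s21` are untouched, which is the information-hiding property the slide describes.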

• Backplane

[Figure: Backplane routing. Net lengths: P1BYP3_FMBP{18,19} 217 mm; P1BYM0_TOBP{18,19} 291 mm]
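To put the backplane route lengths in context at the 3.2 Gbps nominal data rate, the sketch below estimates propagation delay and electrical length. It assumes a homogeneous dielectric with εr = 3.4 (the value listed on the Spicelink 2D backplane slide), TEM propagation at c/√εr, and NRZ signaling so that the fundamental is half the bit rate. None of these assumptions are stated as the authors' method.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s
EPS_R = 3.4         # backplane dielectric constant (from the Spicelink 2D slide)
BIT_RATE = 3.2e9    # nominal HSN data rate, bits/s


def delay_s(length_m, eps_r=EPS_R):
    """One-way propagation delay, assuming velocity c / sqrt(eps_r)."""
    return length_m * math.sqrt(eps_r) / C0


def unit_interval_s(bit_rate=BIT_RATE):
    """Duration of one bit."""
    return 1.0 / bit_rate


def wavelengths(length_m, freq_hz, eps_r=EPS_R):
    """Route length expressed in wavelengths at freq_hz."""
    lam = C0 / (freq_hz * math.sqrt(eps_r))
    return length_m / lam


f0 = BIT_RATE / 2  # assumed NRZ fundamental
for net, length_mm in (("P1BYP3_FMBP", 217.0), ("P1BYM0_TOBP", 291.0)):
    length_m = length_mm * 1e-3
    print(net,
          f"delay = {delay_s(length_m) * 1e9:.2f} ns,",
          f"{delay_s(length_m) / unit_interval_s():.1f} UI,",
          f"{wavelengths(length_m, f0):.2f} wavelengths at {f0 / 1e9:.1f} GHz")
```

Both routes come out several unit intervals and multiple wavelengths long, which is why they need distributed rather than lumped models.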

• Backplane – Spicelink 2D
– Layer Height (B): 0.27178 mm (10.7 mil)
– Trace Width (W): 0.125 mm
– Trace Separation (S): 0.25 mm
– Trace Thickness: 0.5 oz copper (0.7 mil)
– Dielectric: εr = 3.4, tan δ = 0.006

[Figure: differential stripline cross-section showing S, W, and B]

Layer     B (mm)   W (mm)   S (mm)   Zse (Ω)   Zd (Ω)   Zcom (Ω)
S1/S10    0.272    0.125    0.250    49.15     96.05    25.13

All dimensions are in mm
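The differential and common-mode impedances in the table can be converted to odd- and even-mode impedances using the standard relations Zd = 2·Zodd and Zcom = Zeven/2. The sketch below does that, then derives a coupling coefficient and the reflection coefficient against a 100 Ω differential reference. The 100 Ω reference is an assumption made for illustration, and the function names are not from any tool.

```python
import math


def mode_impedances(z_diff, z_comm):
    """Odd/even-mode impedances from differential and common-mode values."""
    z_odd = z_diff / 2.0
    z_even = 2.0 * z_comm
    return z_odd, z_even


def coupling_coefficient(z_odd, z_even):
    """Coupling between the two traces of the pair."""
    return (z_even - z_odd) / (z_even + z_odd)


def reflection_coefficient(z_load, z_ref):
    """Reflection seen when a z_ref system meets z_load."""
    return (z_load - z_ref) / (z_load + z_ref)


Z_DIFF, Z_COMM = 96.05, 25.13  # ohms, from the S1/S10 row of the table

z_odd, z_even = mode_impedances(Z_DIFF, Z_COMM)
k = coupling_coefficient(z_odd, z_even)
gamma_diff = reflection_coefficient(Z_DIFF, 100.0)  # 100 ohm reference is assumed

print(f"Zodd = {z_odd:.3f} ohm, Zeven = {z_even:.3f} ohm")
print(f"coupling coefficient k = {k:.4f}")
print(f"sqrt(Zodd * Zeven) = {math.sqrt(z_odd * z_even):.2f} ohm")
print(f"reflection vs 100 ohm differential = {gamma_diff:.4f}")
```

The small coupling coefficient is consistent with the trace separation being twice the trace width, and the geometric mean of the two mode impedances lands close to the listed single-ended value.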

• Backplane Routing – Via Stub
– In the link, the GbX and VHDM connectors will contain a best and worst case via stub

[Figure: backplane via stubs at the VHDM and GbX connectors. Route Layer: s1 (Via Stub: 123.95 mil); Route Layer: s10 (Via Stub: 10.75 mil). Labels: Best Case, Worst Case]

• Backplane Routing – Via Stub

These results do not include loss.
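A first-order way to see why stub length matters is the quarter-wave resonance estimate f ≈ c / (4·L·√εr), where an open stub of length L looks like a short circuit to the signal. The sketch below applies it to the two stub lengths quoted for route layers s1 and s10. It is a back-of-the-envelope approximation, not the full-wave analysis used in the presentation: it assumes εr = 3.4 from the backplane slide and ignores pad and barrel capacitance, which would lower the real resonance.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s
MIL_TO_M = 25.4e-6  # one mil in meters


def stub_resonance_hz(stub_len_mil, eps_r=3.4):
    """Quarter-wave resonance of an open via stub (first-order estimate)."""
    length_m = stub_len_mil * MIL_TO_M
    return C0 / (4.0 * length_m * math.sqrt(eps_r))


f_s1 = stub_resonance_hz(123.95)   # route layer s1 stub
f_s10 = stub_resonance_hz(10.75)   # route layer s10 stub

print(f"s1  stub (123.95 mil): ~{f_s1 / 1e9:.1f} GHz")
print(f"s10 stub (10.75 mil):  ~{f_s10 / 1e9:.1f} GHz")
```

Under these assumptions the long stub resonates within a decade of the 1.6 GHz fundamental of a 3.2 Gbps stream, while the short stub's resonance is far higher.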

• Backplane Routing – Anti-pad

[Figure: two via models. Antipad Radius: 0.5 mm; Antipad Radius: 0.7 mm (Layout)]

• Backplane Routing – Anti-pad (Layer: S1)

These results do not include loss.

• Backplane Routing – Anti-pad (Layer: S10)

These results do not include loss.

GbX Connector

[Figure: HFSS model of the GbX connector, side view and bottom view]

• Teradyne’s GbX advanced performance interconnect provides the highest density optimized differential connector available today.
– Delivering data rates greater than 5 Gb/s.
– High Density: GbX provides up to 55 pairs per linear inch (4-pair configuration).
– Reliability: Two points of contact at a separable interface.
– Flexibility: Choice of density configurations (3, 4 and 5-pair) for higher application flexibility.
– Vertical and Horizontal Routing make GbX the ideal solution for star or mesh backplane design.

Connectors

• GbX
– All links contain 2 backplane sections
• One channel outbound from SerDes ASIC.
• One channel inbound to SerDes ASIC.

– GbX models encapsulate connectors and escape vias/routing
• Connector performance is very dependent on board interface.
• Interface is critically dependent on board metrics:
– route layer
– via stub length
– antipad dimensions
– board materials
• Escape routing is different on the outbound and inbound channels.

[Figure: inbound (“To ASIC from backplane”) and outbound (“From ASIC to backplane”) channels through Module, GbX, Backplane, and VHDM]

• Models are generated separately for the GbX components. Each channel includes models for:
1. Backplane board escape routing, with adjacent pins.
2. GbX connector with single wafer.
3. Module board escape routing, with adjacent pins.
• Different levels of complexity were retained initially for the escape routing.
– “From Backplane” routing will be used to determine what level of complexity is necessary.

[Figure: GbX models for the “From Backplane” and “To Backplane” channels at higher (+) and lower (-) complexity. Labels: GbX connector, module escape routing, backplane escape routing]

• VHDM – Very High Density Matrix Connectors

• VHDM - Backplane

[Figure: VHDM backplane model. Labels: twin-ax cable feed, VHDM connector, backplane escape routing]

• Red/Black switch allows supercomputer to be physically divided for secure (classified) processing
– Red/Black switch is two VHDM-HSD connectors in a back-to-back configuration
– A center-plane circuit board provides support for the back-to-back configuration

[Figure: HFSS model of the Red/Black switch]

Cable

• Gore Twin-Ax
– 100 Ω differential
– “Self Equalization”

• Self Equalization
– Attenuation increases with sqrt(f) due to conductor skin effects
• Attenuation of higher-frequency components >> attenuation at the fundamental frequency
– Increased jitter and inter-symbol interference
– Limits length of cable
• Dielectric loss varies directly with frequency
– Low loss dielectric
– Cable Equalization
• Produces a near linear attenuation response vs. frequency
• Uses different skin depth properties of conducting materials
– Base material has low conductivity and/or high permeability
» Coat with a good conductor
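The sqrt(f) behavior comes from skin depth, δ = sqrt(ρ / (π·f·μ0·μr)): as frequency rises, current crowds into a thinner surface layer and the surface resistance ρ/δ grows as sqrt(f). The sketch below evaluates both quantities for copper and for a base metal with lower conductivity and high permeability. The base-metal values are purely illustrative placeholders, not the materials used in the Gore cable, and the helper names are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, H/m


def skin_depth_m(freq_hz, resistivity_ohm_m, mu_r=1.0):
    """Skin depth: delta = sqrt(rho / (pi * f * mu0 * mu_r))."""
    return math.sqrt(resistivity_ohm_m / (math.pi * freq_hz * MU0 * mu_r))


def surface_resistance(freq_hz, resistivity_ohm_m, mu_r=1.0):
    """Surface resistance rho / delta, in ohms per square; grows as sqrt(f)."""
    return resistivity_ohm_m / skin_depth_m(freq_hz, resistivity_ohm_m, mu_r)


COPPER = {"resistivity_ohm_m": 1.68e-8, "mu_r": 1.0}
# Hypothetical lossy, magnetic base metal (illustrative values only).
BASE = {"resistivity_ohm_m": 1.0e-7, "mu_r": 100.0}

for f in (0.1e9, 0.4e9, 1.6e9):
    print(f"{f / 1e9:4.1f} GHz:",
          f"copper delta = {skin_depth_m(f, **COPPER) * 1e6:.2f} um,",
          f"base delta = {skin_depth_m(f, **BASE) * 1e6:.2f} um,",
          f"copper Rs = {surface_resistance(f, **COPPER) * 1e3:.2f} mohm/sq")
```

Because the two materials have very different skin depths, the share of current carried by the coating versus the base changes with frequency, which is the mechanism the slide names for flattening the attenuation response.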

[Figure: comparison of Equalized and Standard cable]

Package

• HyperBGA
– High Performance Organic Package
– Flip Chip Packaging

Conclusions

• Cray and Ansoft Corporation are collaborating to verify the 3.2 Gb/s serial data channel of the Cray Red Storm supercomputer's high-bandwidth, three-dimensional mesh interconnect network.
– Cray recognized the value of electromagnetic-based simulation to ensure reliable supercomputer performance.

• This presentation showed how electromagnetic field simulation, coupled with circuit and system simulation, was used to predict the interconnect performance.
– The successful/accurate characterization of the system was made possible by utilizing:
• Electromagnetics-based analysis software
– Circuit/System Level
» Ansoft Designer
– Passive Physical Device Modeling
» Ansoft HFSS
» Ansoft Designer
» Ansoft SpiceLink
» Ansoft Optimetrics

• Modern high-speed designs require engineers to reach new levels of technological advancement.
– The methodologies introduced here show how to systematically reduce a complex system to a solvable problem.
– This structured procedure breaks the design-build-redesign loop of the old methodology, in which problems are addressed only after signal integrity errors are encountered.