IntraChip Optical Networks for a Future Supercomputer-on-a-Chip

Jeffrey Kash, IBM Research

29 February 2008

Acknowledgements

• IBM Research: Yurii Vlasov, Clint Schow, Will Green, Fengnian Xia, Jose Moreira, Eugen Schenfeld, Jose Tierno, Alexander Rylyakov
• Columbia University: Keren Bergman, Luca Carloni, Rick Osgood
• Cornell University: David Albonesi, Alyssa Apsel, Michal Lipson, Jose Martinez
• UC Santa Barbara: Daniel Blumenthal, John Bowers

Outline

• Optics in today's HPCs
• Trends in microprocessor design
  – Multi-core designs for power efficiency
• Vision for future IntraChip Optical Networks (ICON)
  – 3D stack of logic, memory, and global optical interconnects
• Required devices and processes
  – Low power and small footprint

Today's High-Performance Server Clusters: racks are mainly electrically connected, but going optical

• Real systems are 10-100s of server racks
  – All copper today
  – Next generation: optics, and several racks of switches
  – Rack-to-rack interconnects (≤100m) now moving to optics
  – Interconnects within racks (≤5m) now primarily copper
• Over time, optics will increasingly replace copper at shorter and shorter distances
  – Backplane and card interconnects (≤1m) will follow rack-to-rack
  – The trend will accelerate as bitrates (in the media) increase and costs come down
    • 2.5 Gb/s → 5 Gb/s → 10 Gb/s → 20 Gb/s(?)
    • Target: ~$1/Gb/s

(Photos: NEC Earth Simulator during installation, "all electrical" vs. "all optical"; a Snap 12 module, 12 Tx or Rx at 2.5 Gb/s, placed at the back of a rack)
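A quick sense of what the bitrate progression and cost target imply, using only the numbers on this slide (illustrative arithmetic):

```python
# Aggregate bandwidth of a 12-lane Snap 12-style module at each step of
# the bitrate progression, and the cost implied by the ~$1/Gb/s target.
lanes = 12
for gbps in [2.5, 5, 10, 20]:
    bw = lanes * gbps
    print(f"{gbps:>4} Gb/s/lane -> {bw:>3.0f} Gb/s -> ~${bw:.0f}/module at $1/Gb/s")
# e.g., today's 12 x 2.5 Gb/s module is 30 Gb/s, so the target is ~$30
```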

(Photo: IBM Federation Switch for ASCI Purple at LLNL, backside of a switch rack)
– Copper: bulk, bend radius, weight, impedes air cooling
– Optical: very organized, but more expensive

Beyond bitrate, density is a major driver of optics: connectors and cables

• Connectors: HM-Zd 10 Gb/s connector, 40 differential pairs (25 mm wide), vs. an MT fiber ferrule, 48 fibers, extendable to 72 or 96 (7 mm wide)

• Cables: high-speed copper cabling vs. fiber ribbon

• Electrical transmission lines (mils-scale dimensions, ~400 μm) vs. optical waveguides (35 × 35 μm cores on a 62.5 μm pitch)

But optics must be packaged deep within the system to achieve these density improvements.

Packaging of Optical Interconnects is Critical

• Better to put optics close to logic rather than at the card edge
  ✓ Avoids distortion, power, and cost of an electrical link on each end of the optical link
  ✓ Breaks through the pin-count limitation of multi-chip modules (MCMs)

(Figure: two packaging options)
– Optics on-MCM: opto module (laser + driver IC) on the ceramic carrier, ~2 cm traces plus 1 cm flex, fiber to an optical bulkhead connector. Operation to >15 Gb/s; no equalization required.
– Good: optics on-card (with or without via stubs): opto module on the organic card, >12.5 cm traces, bandwidth limited by the number of pins. Operation at 10 Gb/s; equalization required.

Colgan et al., "Direct integration of dense parallel optical interconnects on a first level package for high-end servers," Proc. 55th ECTC, vol. 1, pp. 228-233, 31 May-3 June 2005.

Current architecture: Electronic Packet Switching

• The current architecture (electronic switch chips, interconnected by electrical or optical links, in multi-stage networks) works well now:
  – Scalable BW & application-optimized cost
    • Multiple switches in parallel
  – Modular building blocks (many identical switch chips & links)
• ...but it becomes challenging in the future:
  – Switch-chip throughput stresses the hardest aspects of chip design: I/O & packaging
  – Multi-stage networks will require multiple E-O-E conversions (see the sketch after this list)
    • An N-stage Exabyte/s network = N × Exabytes/s of cost and N × Exabytes/s of power

(Photos: central switch racks; MareNostrum, Barcelona Supercomputing Center)
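A back-of-the-envelope illustration of the N-stage scaling argument (a hedged sketch; the stage count and per-bit energy below are hypothetical placeholders, not numbers from the talk):

```python
# Illustrative sketch: in an N-stage electronic network, every stage
# re-transmits the full traffic, so cost and power scale as N.
N_STAGES = 3                     # assumed number of stages
TRAFFIC_BITS_PER_S = 8e18        # 1 Exabyte/s = 8e18 bits/s
PJ_PER_BIT_PER_STAGE = 10.0      # hypothetical E-O-E + switching energy

power_w = N_STAGES * TRAFFIC_BITS_PER_S * PJ_PER_BIT_PER_STAGE * 1e-12
print(f"{N_STAGES}-stage Exabyte/s network: {power_w/1e6:.0f} MW")  # -> 240 MW
```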

Possible new architecture: Optical Circuit Switching
(Optics is not electronics; maybe a different architecture can use it better)

• All-optical packet switches are hard
  – e.g., the IBM/Corning OSMOSIS project: expensive, and required a complex electrical control network
  – No optical memory or optical logic
  – Probably not cost-competitive against electronic packet switches, even in 2015-2020
• But scalable Optical Circuit Switches (OCS, ~10 millisecond switching time) are available today
  – Several technologies (MEMS, piezo-, thermo-, ...)
  – Low power
    • OCS power is essentially zero compared to an electronic switch
    • No extra O-E-O conversion
  – But they require single-mode optics
  – In ~2015, with silicon photonics: ~1 ns switching time
    • Does a speedup of six to seven orders of magnitude make the approach more suitable to general-purpose computing?

(Figure: OCS concept; an input fiber is steered to one of several output fibers, one channel shown)

• MEMS-based OCS hardware is commercially available (Calient, Glimmerglass, ...)
  – 20 ms switching time
  – <100 Watts

(Figure: 2-axis MEMS mirror, one channel shown)

Outline

• Optics in today's HPCs
• Trends in microprocessor design
  – Multi-core designs for power efficiency
• Vision for future IntraChip Optical Networks (ICON)
  – 3D stack of logic, memory, and global optical interconnects
• Required devices and processes
  – Low power and small footprint

Chip MultiProcessors (CMPs): IBM Cell, Sun Niagara, Intel Montecito, …
(note that the processors on the chip are not identical)

IBM Cell:

Parameter                              | Value
---------------------------------------|-------------------------------------------
Technology process                     | 90nm SOI with low-κ dielectrics and 8 metal layers of copper interconnect
Chip area                              | 235 mm²
Number of transistors                  | ~234M
Operating clock frequency              | 4 GHz
Power dissipation                      | ~100 W
Power due to global interconnect       | 30-50%
Intra-chip, inter-core communication   | 1.024 Tb/s shared bandwidth (four buses, 128 data bits + 64 address bits each, 2 Gb/s per lane)
I/O communication bandwidth            | 0.819 Tb/s (includes external memory)
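A quick sanity check of the inter-core bandwidth figure in the table (only the data bits carry payload; the 64 address bits per bus do not):

```python
# Cell inter-core bandwidth: 4 shared buses x 128 data bits x 2 Gb/s/lane
buses = 4
data_bits_per_bus = 128
gbps_per_lane = 2
total_tbps = buses * data_bits_per_bus * gbps_per_lane / 1000
print(total_tbps)  # -> 1.024 Tb/s, matching the quoted figure
```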

…but perhaps a hierarchical design of several cores grouped into a supercore will emerge

• ~2017: multiple "supercores" on a chip
  – Electrical communication within a supercore
  – Optical communication between supercores

(After Moray McLaren, HP Labs)

Theme: how to continue to get an exponential performance increase over time (a Moore's Law extension) from silicon ICs, even though CMOS scaling by itself is no longer enough

(Figure: performance (log scale) vs. time (linear). Uniprocessor performance, where the original Moore's Law of transistor counts applies, is flattening at Tera-scale (today); an increased number of processors carries the curve to Peta-scale (~2012); communications and architecture must carry it to Exa-scale (~2017). Can Si photonics provide this performance increase?)

• IBM Cell processor: 9 processors, ~200 GFLOPS; on- and off-chip BW ~100 GB/s (0.5 Byte/FLOP)
• BW requirements must scale with system performance, at ~1 Byte/FLOP (scaled out below)
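Scaling the ~1 Byte/FLOP rule of thumb forward to the milestones on this slide (illustrative arithmetic only):

```python
# Bandwidth demanded by the ~1 Byte/FLOP target at each performance scale
targets = [("Cell today", 200e9), ("Peta-scale ~2012", 1e15), ("Exa-scale ~2017", 1e18)]
for name, flops in targets:
    print(f"{name}: ~{flops:.0e} FLOPS -> ~{1.0 * flops:.0e} bytes/s of BW")
# Note: Cell today is already below target (~100 GB/s is 0.5 B/FLOP, not 1 B/FLOP)
```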

Outline

• Optics in today's HPCs
• Trends in microprocessor design
  – Multi-core designs for power efficiency
• Vision for future IntraChip Optical Networks (ICON)
  – 3D stack of logic, memory, and global optical interconnects
• Required devices and processes
  – Low power and small footprint

Inter-core communication trends: network on chip

• Intel Polaris 2007 research chip: 100 million transistors, 80 cores (tiles), 275 mm²

• i.e., 3D integration (why not go to an optical plane, too?)

• Higher BW and lower power with optics?

Photonics in Multi-Core Processors: the Intra-Chip Communications Network

Photonics changes the rules

OPTICS:
• Modulate/receive an ultra-high-bandwidth data stream once per communication event
• Broadband switch fabric uses very little power
  – Highly scalable
• Off-chip and on-chip can use essentially the same technology
  – Much more off-chip BW available

ELECTRONICS:
• Buffer, receive, and re-transmit at every switch
• Off-chip is pin-limited and really power-hungry


Integration Concept: Processor System Stack

• 3D layer stacking will be prevalent in the 22nm timeframe
• Intra-chip optics can take advantage of this technology
• A photonics layer (with supporting electrical circuits) is more easily integrated with high-performance logic and memory layers
• Layers can be separately optimized for performance and yield

(Figure: stack of a processor plane with local memory cache, three memory planes, and a photonic network interconnect plane, which includes the optical devices, electronic drivers & amplifiers, and the electronic control network; the planes are joined by BEOL vertical electrical interconnects, with optical off-chip interconnects leaving the stack)

Vision for Silicon Photonics: Intra-Chip Optical Networks

• Pack ~36 IBM Cell processor "supercores" on a single ~600 mm² die in 22nm CMOS

  – In each Cell supercore there are 9 cores (1 PPE + 8 SPEs), i.e., 324 processors in one chip
  – Power and area dramatically lower than today at comparable clock speeds
  – Each supercore is electrically interconnected internally
  – Communication between supercores, and off-chip, is optical
  – BW between supercores is similar to today's off-Cell BW (i.e., 1-2 Tb/s per Cell)

• An Intra-Chip Optical Network fundamentally alters the roadmap for scaling high-performance multi-core processors
  – A communications subsystem and architecture that leap-frogs equivalent electronic systems
    • Use photonics for communications, not logic
    • May require a new network architecture, not just a point-to-point replacement of the electrical network
  – Silicon photonics: enormous capacity and fundamentally low power consumption
    • Estimate: the optical network requires 25 Watts vs. 640 Watts for an equivalent electrical network (see the energy-per-bit sketch below)
    • The off-chip power advantage is even more compelling, by more than an order of magnitude
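A rough energy-per-bit reading of those two power numbers (an illustrative sketch; it assumes each of the 36 supercores drives ~1.5 Tb/s, the midpoint of the 1-2 Tb/s range quoted above):

```python
# Implied energy per bit for the on-chip global network
supercores = 36
tbps_each = 1.5
total_bps = supercores * tbps_each * 1e12      # ~54 Tb/s of global traffic
for label, watts in [("optical", 25), ("electrical", 640)]:
    print(f"{label}: ~{watts / total_bps * 1e12:.1f} pJ/bit")
# -> optical ~0.5 pJ/bit vs. electrical ~11.9 pJ/bit
```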

Possible On-Chip Optical Network Architecture: Bufferless, Deflection-Switch Based (OCS on a chip)

(Figure: an array of Cell cores "P" on the processor plane, each with a gateway "G" to the ICON spanning the processor and photonic planes; a thin electrical control network (~1% of the BW, which also sends small messages) sets the deflection switches of the photonic network; a toy model of deflection routing follows)
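The bufferless deflection idea can be made concrete with a toy model (purely illustrative; this is a generic deflection-routing sketch, not IBM's actual switch or control logic): when two packets contend for the same output, one wins and the other is misrouted to the free port instead of being buffered.

```python
import random

def deflection_switch(pkt_a, pkt_b):
    """Route two packets through a 2x2 bufferless switch.

    Each packet is (name, desired_port) with ports 0 and 1. If both want
    the same port, one wins and the other is deflected to the free port:
    no buffer, no drop, just a longer path for the loser.
    """
    if pkt_a is None or pkt_b is None or pkt_a[1] != pkt_b[1]:
        return {pkt[1]: pkt for pkt in (pkt_a, pkt_b) if pkt}
    winner, loser = random.sample([pkt_a, pkt_b], 2)
    return {winner[1]: winner, 1 - winner[1]: loser}  # loser deflected

print(deflection_switch(("A", 0), ("B", 0)))  # one of A/B is deflected to port 1
```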

On-chip Network Implementation

• Bufferless optical switch network: an architecture with improved application performance
  ➢ Over-provisioned to optimize throughput and latency
  ➢ Simple electric control plane with block transfer
  ➢ Integration with the μProcessor via 3D layer stacking

• Supercore gateway to the optical network (subsystems)
  ➢ ~2 Tb/s including over-provisioning and coding
  ➢ Combines WDM, TDM, and SDM; for example:
    – 480 Gb/s optical channels: 6 λ's at 80 Gb/s or 12 λ's at 40 Gb/s
    – 4 parallel optical channels
• Optical devices (e.g., a ring resonator array)
  ➢ Metrics determined by system needs
  ➢ Ultra-dense: 30x area improvement compared to the EPIC program
  ➢ CMOS compatible (22nm node)
  ➢ Low power
  ➢ Functions include: transmitter, receiver, switch, transport (waveguides, gain)
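Checking the gateway arithmetic quoted above (both WDM options give the same per-channel rate):

```python
# Supercore gateway aggregate bandwidth: WDM x TDM x SDM
channels = 4                                   # parallel optical channels (SDM)
for wavelengths, gbps in [(6, 80), (12, 40)]:
    per_channel = wavelengths * gbps           # 480 Gb/s per channel
    total = channels * per_channel / 1000      # aggregate, Tb/s
    print(f"{wavelengths} λ x {gbps} Gb/s x {channels} channels = {total} Tb/s")
# -> 1.92 Tb/s, i.e. the "~2 Tb/s including over-provisioning and coding"
```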

Outline

• Optics in today's HPCs
• Trends in microprocessor design
  – Multi-core designs for power efficiency
• Vision for future IntraChip Optical Networks (ICON)
  – 3D stack of logic, memory, and global optical interconnects
• Required devices and processes
  – Low power and small footprint

Devices for Implementation

• Ultradense Si waveguides and optical components: 1 μm bend radius
  – 90nm-and-beyond CMOS generations enable:
    • ~20x smaller bend radius than today's EPIC designs
    • ~30x smaller area
• On-chip modulators: ring-resonator or MZI based
• 2x2 optical deflection switches:
  – Broadband (all λ's switched simultaneously)
  – Tolerant of temperature variation (~20°C)
  – MZI, TIR, or MMI devices
• Integrated InP layers: provide optical gain to overcome network losses
• Optical or electronic TDM: mux up to a high bitrate from logic with fast modulators/detectors, or by on-chip OTDM
• Optical WDM mux/demux: rings, MZI, MMI, and AWG are the choices
• Detectors: e.g., integrated waveguide Ge photodetectors
• Supporting electronics: high-performance, low-power CMOS-based drivers, amplifiers, and control network/logic
• Optical source: off-chip lasers
• Device designs and estimated future performance based on work at IBM, Cornell, UC Santa Barbara, and Columbia
  – Aggressive (but not implausible) performance extrapolation

(Figures: ultradense waveguides, ring resonator, total-internal-reflection switch, optical gain block, WDM lattice filter (L) and ring resonator (R))

Off-chip optical coupling

Coupling from fiber to silicon photonic wire
S. McNab et al., Optics Express, 2003 (IBM)

(Figure: a fiber launches into a polymer waveguide (~3 μm × 2 μm), which transitions to a ~500 nm × 220 nm Si photonic wire)

• Coupling loss <1 dB with a lensed fiber (Cornell, IBM, NTT)

WDM passives

(i) Multimode Interferometer (MMI) based WDM devices

• An imaging device: an input field is reproduced in single or multiple images at periodic intervals along the propagation direction

(Figure: MMI device, ~6 μm wide, input port marked "In")

F. Xia et al., OFC 2007 (IBM). Concept: Soldano et al., J. Lightwave Technol. 13, 615-627, 1995.
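For reference, the periodic self-imaging is governed by the beat length of the two lowest-order MMI modes. A standard expression from the Soldano reference (quoted here as a sketch; n_r is the core effective index and W_e the effective MMI width):

$$L_\pi = \frac{\pi}{\beta_0 - \beta_1} \approx \frac{4\, n_r W_e^{2}}{3\, \lambda_0}$$

Single images of the input recur at z = p(3L_π), p = 0, 1, 2, …; because L_π varies with λ_0, different wavelengths image onto different output positions, which is what lets an MMI act as a WDM element.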

(i, continued) MMI-MZI based λ demultiplexer
F. Xia et al., OFC 2007 (IBM)

• Footprint: ~40 μm × 130 μm (~0.005 mm²)
  – 10 times smaller than an AWG on the same SOI platform
  – 100 times smaller than a III-V AWG
• As-grown, no active tuning
• Loss: 3 dB
• Pass band: 0.3 nm (limited by crosstalk)
• Crosstalk: -12 dB
• Channel spacing: 3.2 ± 0.1 nm (designed: 3.2 nm)

(Figure: device image with λ1-λ4 output ports, 20 μm and 2 μm scale bars; measured response (dB) vs. wavelength, 1540-1554 nm)

(ii) WDM based on detuned ring resonators
F. Xia et al., OFC 2007 (IBM)

(Figure: four-ring demultiplexer with λ1-λ4 drop ports, 20 μm scale bar; relative response (dB) of channels #1-#4 vs. wavelength, 1542-1556 nm)

• The difference in resonance wavelength comes from the different perimeters of the resonators: Δλ = 3.2 nm ↔ ΔL = 180 nm
• Crosstalk < -20 dB
• Limited transmission bandwidth
• Channel spacing: designed 3.2 nm; experimental values 2.2 nm to 3.1 nm
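The Δλ ↔ ΔL correspondence follows from the ring resonance condition m·λ = n_eff·L: at fixed order m, and ignoring dispersion (group-index corrections), δλ/λ ≈ δL/L. A quick consistency check (a sketch; the ~1550 nm wavelength is read off the spectra above, and the implied perimeter is inferred, not quoted in the talk):

```python
# Resonance condition m*lam = n_eff*L  =>  dlam/lam ≈ dL/L (fixed m, n_eff)
lam_nm = 1550.0        # operating wavelength, assumed from the spectra
dlam_nm = 3.2          # designed channel spacing
dL_nm = 180.0          # quoted perimeter increment between adjacent rings
L_um = dL_nm * lam_nm / dlam_nm / 1000   # implied ring perimeter
print(f"implied perimeter ≈ {L_um:.0f} µm")   # ≈ 87 µm
```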

Fast modulators

Silicon Ring Modulator at 12.5 Gb/s
Xu, Schmidt, and Lipson, Nature, 2005 (Cornell)

(Figure: waveguide-coupled ring modulator, 0.2 μm gap; measured eye diagram)

• >9 dB modulation depth
• PRBS 2^10 - 1
• Appears extendable to 40 Gb/s

Broadband Optical Deflection Switches (all wavelengths simultaneously deflected)

• Broadband ring-resonator switch
• ON state: carrier injection → coupling into the ring → signal switched
• OFF state: passive waveguide crossover → negligible power

(Figure: OFF and ON states of the switch, each ring driven by CMOS drivers)

Broadband, thermally stable deflection switch from multiple resonators
(Multiple rings broaden the passband for thermal stability)
Xia et al., CLEO 2007; Green et al., OFC 2008 (IBM)

• Apodization flattens the passband
• Switch performance (multiple λ's): ~2.5 nm flat passband for the 5-ring design (with apodization) vs. the 4-ring design (without apodization)

(Figure: IN/THRU/DROP ports of the multi-ring switch; transmission (dB) vs. wavelength detuning over ±6 nm)
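How a ~2.5 nm flat passband relates to the ~20°C tolerance claimed in the device-requirements slide (a rough sketch; the ~0.08 nm/K red-shift is a typical silicon-microring literature value, not a number from this talk):

```python
# Temperature swing a broadened passband can absorb without active tuning
passband_nm = 2.5
dlam_dT = 0.08          # nm per kelvin, assumed thermo-optic shift
print(f"tolerable temperature swing ≈ {passband_nm / dlam_dT:.0f} K")  # ~31 K
```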

Photodetectors

Ge-on-SOI Detector Design
Dehlinger et al., PTL, 2004, and Schow et al., PTL, 2006 (IBM)

• Lateral PIN design, direct Ge growth on thin SOI

(Figure: cross-section of interdigitated Ti/Al contacts on Ge over SOI; Wi = 300 nm, Wm = 200 nm, S = 0.4-1.0 μm, tGe = 350 nm)

• Design features:
  – Epitaxy using UHV-CVD
  – Buried oxide isolates carriers generated in the substrate
    • Eliminates the low-frequency "tail"
  – 20 GHz bandwidth
  – Lateral p-i-n design for low capacitance and dark current
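A crude transit-time estimate for the 300 nm i-region (a sketch using a textbook Ge saturation velocity, not a number from the talk) suggests carrier transit is not what sets the 20 GHz figure:

```python
# Transit-time-limited bandwidth: f_3dB ≈ 0.45 * v_sat / d
v_sat_m_s = 6e4          # ~6e6 cm/s, a typical Ge saturation velocity
gap_m = 300e-9           # Wi from the cross-section above
f3db_ghz = 0.45 * v_sat_m_s / gap_m / 1e9
print(f"transit-limited f3dB ≈ {f3db_ghz:.0f} GHz")   # ~90 GHz
# The measured 20 GHz is therefore likely set by RC and slow-carrier
# effects rather than by transit across the 300 nm gap.
```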

On-chip optics requires a waveguide detector geometry (e.g., Luxtera, announced)

(Figure: light in a Si strip waveguide on an oxide layer couples into a Ge adiabatic taper of length L_taper over a diode of length L_diode)

• Adiabatically coupled, high-bandwidth waveguide Ge photodiodes on an SOI substrate

On-chip gain: Evanescent Optical Amplifiers
Park, Bowers, et al., PTL 19, p. 210 (2007) (UCSB)

• Intra-chip networks have many nodes
• Network size is limited by loss in waveguides, switches, and waveguide crossings
• Solution: silicon evanescent optical amplifiers (III-V gain medium)
• Initial results (see the gain arithmetic below):
  – Device dimensions: H = 0.7 μm, W = 2 μm, L = 1.36 mm
  – Amplifier gain: 13 dB
    • The evanescent design allows higher saturation output powers than a conventional III-V amplifier
    • Heating effects can be minimized with package design
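Simple unit conversions on the quoted gain numbers:

```python
# What 13 dB over 1.36 mm means
gain_db, length_mm = 13.0, 1.36
print(f"linear gain: {10**(gain_db/10):.0f}x")           # ~20x in power
print(f"gain per length: {gain_db/length_mm:.1f} dB/mm")  # ~9.6 dB/mm
# Roughly, one such amplifier could offset ~13 dB of accumulated
# waveguide, switch, and crossing loss along a network path.
```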


3D Integration

Development of a 3DI Process for Si Photonics

• Currently developing 3DI processes based upon both Cu-Cu compression and oxide fusion bonding

(Figure: oxide bonding on SOI vs. Cu-Cu bonding on bulk, both face-to-back)

• Additional process technology development will be necessary to adapt each approach for photonics integration. Challenges include:
  – Photonic devices and off-chip optical coupling compatible with 3D integration
  – Thermal management

Major Challenges

• Achieving the required device performance, in particular for:
  – Optical bandwidth of modulators and switches
  – High per-channel bitrates from direct modulation
  – Device density
  – WDM stability against temperature variations
• Low-power operation
• Integration and a path to manufacturability:
  – InP with Si
  – Electronic support circuits with Si nanophotonic devices
  – Compatibility with 3D layer-stacking technologies and CMOS processing
• Network performance: demonstrate system/application advantages
  – Low latency
  – High throughput (avoid congestion)

Summary

(Figure: processor chip with optical I/O)

• Multi-core μProcessor architectures are emerging as a key concept to provide power-efficient, high-performance computing capability
  – On-chip op