“Processor Graphics”

Tom Piazza, Sr. Fellow, Director of Graphics, Folsom

Property of Intel

My Brief History

• I started at General Electric in 1978
  – Flight Simulators $10M each
    • Sold by the pound
  – Short list of 1st’s:
    • Procedural textures (1978)
    • Bitmap Textures – Tri-linear Filtered (1983)
    • Tile Based Deferred Rendering (1986)
    • Z Buffered Real-time Rendering (1986)
    • 16X MSAA Antialiasing (Color and Z) (1986)
    • Geometry Tessellation (1986)
  – Then there was ® Graphics Accelerator

Short History of Graphics at Intel

• Intel 740 - Initially started as a joint venture with (Derived from General Electric’s Flight Simulator business)
• Original intention was to create a discrete graphics business
  – Got redirected to Chipset Integrated
  – White Space
• Intended for Business users, Mom and Pop
• Not Gamers
  – OK, maybe casual gamers (WOW)

What Changed: Visual User Experiences - Exploding

Vastly Richer Displays
• High dpi, touch, Stereo3D
• Draw more, richer pixels

Traditional Media
• Transcode, LP Media, HQ Video Conferencing, 4k Media, HEVC

Comp & Environment
• Competition investing in big Gfx
• Apple: Big Gfx iMac, MBP, MBA
• Android: GPU > important vs CPU
• Win8 Metro: new min Gfx bar
• Next-gen Consoles est in ‘13

New Gfx Usages
• HTML5 makes Web 3D
• Advanced User Interface
• Hetero CPU+Gfx Programming
• Realistic 3D gaming

GFLOPs-based Media
• Video Processing & Analytics
• Computational photography

pGfx Arch Wins in Thin Form Factor
Scales up in Perf - $6.5B dGfx SAM

Visual Computing Central to New User Experience

PC expectations are now the expectations of a Tablet (Ease of Use) with the UI of a PC (Keyboard, Mouse)

What we said … and then …

… what we delivered:

Processor Graphics – Sandybridge

• Sandybridge is 1st incarnation (2010)
  – Cache sharing (sets a stage ☺)
  – Power Sharing – Voltage and MHz modulated
    • CPU Hi, Gfx Lo workload → Power to CPU
    • CPU Lo, Gfx Hi workload → Power to GPU
  – DX10.1
  – OpenGL 3.0 at initial release, OpenGL 3.1 on present release
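The power-sharing bullets can be pictured with a toy model. The Python sketch below is only an illustration: the function name, the proportional-split rule, the 35 W budget, and the demand values are all assumptions made for the example and do not describe the actual hardware policy.

```python
def split_power_budget(package_watts, cpu_demand, gfx_demand):
    """Split a shared package budget in proportion to each side's demand.

    Demands are unitless activity levels in [0, 1]. The proportional rule is
    an illustrative assumption, not the real power-management algorithm.
    """
    total = cpu_demand + gfx_demand
    if total == 0:
        # Both sides idle: split evenly (arbitrary choice for the sketch).
        return package_watts / 2, package_watts / 2
    cpu_watts = package_watts * cpu_demand / total
    return cpu_watts, package_watts - cpu_watts


BUDGET_W = 35.0  # hypothetical package budget, not from the slides

# CPU Hi, Gfx Lo workload: most of the budget goes to the CPU.
cpu_heavy = split_power_budget(BUDGET_W, cpu_demand=0.9, gfx_demand=0.1)
# CPU Lo, Gfx Hi workload: most of the budget goes to graphics.
gfx_heavy = split_power_budget(BUDGET_W, cpu_demand=0.1, gfx_demand=0.9)

print("CPU-heavy (cpu W, gfx W):", cpu_heavy)
print("Gfx-heavy (cpu W, gfx W):", gfx_heavy)
```

In both cases the two shares add up to the same package budget; only the division between CPU and graphics moves with the workload.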

Sandybridge Exceeds 10X by ’10 Goal

SANDY BRIDGE – 1300MHz 2xDDR3-1600

ARRANDALE – 766MHz 2xDDR3-1067

GM45 – 533MHz 2xDDR3-1067

Assumes PRQ dates. SNB config: 2.8GHz 4+2 8MB LLC shared 1300MHz Core and 2xDDR3-1600. 3DMark05 is used since 3DMark2006 has a SM3.0 test that does not run on 2006 parts. 3DMark2006 SM2.0 sub-test is shown since Gen driver optimizations for all recent devices are not focused on 3DMark2005.

Processor Graphics – Ivybridge

• Ivybridge is 2nd incarnation (2011)
  – Same power and cache sharing
  – DX11
  – OCL 1.2
  – OpenGL 3.3 at initial release
• Performance wise, Ivybridge is basically an XBox360 on every laptop

Ivy Bridge Graphics and Media Microarchitecture Overview

Ivy Bridge Microarchitecture

• Next generation ™ microprocessor on the latest 22-nm process
• Improved Game Playability
  – More 3D performance
  – Microsoft* DirectX* 11 Support
• Significant Media Performance
  – Higher performing
• eDP output
• Three Native Display Support

[Block diagram: x16 PCIe, DMI, PCI Express, System Agent, IMC, Display, 2ch DDR3/DDR3L, four Core + LLC pairs, Graphics, PECI Interface To Embedded Controller, DMI link to 2012 PCH]

Intel® Next Generation Microarchitecture Codename Ivy Bridge

Ivy Bridge HD Graphics: µArchitecture

µArchitecture Changes

Scalable Architecture partitioned into 5 domains:
1. Global Assets: Includes Geometry Front-end up to Setup
2. Slice Common: Includes Rasterizer, L3$ and Pixel Back-end
3. Slice: (EUs), IC$, Samplers, Addrs Gen
4. CODEC and media
5. Displays

Sets the stage for further scale-up opportunities

Ivy Bridge HD Graphics: Architecture

Adds Significant 3D Enhancements

Microsoft* DirectX* 11 Hardware Tessellation
• Adds two programmable stages (HS and DS) and one fixed function Tessellator

New Compressed Texture Format Support (BC6H/7)

Ivy Bridge HD Graphics: Architecture

More Key Changes

Compute Support
• Data Parallelism
• UAVs, Atomics, Barriers, etc. for compute shader ops
• Shared Local Memory aka Thread Group Local Memory for Direct Compute* Shaders
• Scatter gather

Shader Array adds support for Shader Model 5.0 (New DX11 Instructions)

Ivy Bridge HD Graphics: µArchitecture

µArchitecture Changes

Improved Geometry Performance
• Faster GS and H/w Stream-out
• Faster Clip/Setup

Fast Clear of Render Target

Increase in Hi-Z Performance

Sampler throughput
• Improved Anisotropic Quality

Increased compute throughput (peak GFLOPs)
• Increased # of threads/registers to cover latency and support complex shaders
• “Enhanced” coissue

L3$ lowers BW need from Ring Architecture

Media Applications benefit from infrastructure changes in EU/L3$/etc

Ivy Bridge HD Graphics: µArchitecture

Significant Media Performance
• Higher performing Intel® Quick Sync Video

µArchitecture Changes
• Enhanced Performance for Multi-Format CODEC
• Increased Media Sampler Throughput and performance for scaling and other filters
• Pixel Back end has Image Color and Contrast Enhancement capabilities

Processor Graphics Future Goals

• Maximize performance attainable within specific “socket” limits
  – 25 Watt Laptops
  – 15 Watt Thin and Light “Ultra-Books”
  – 3-5 Watt Tablets and Phones

How do we do this?

• Most efficient operating point of anything in silicon is at the knee of the process:
  – Power = C * V^2 * F
  – Max Frequency allowed at Vmin since Voltage has a squared power function (cubic when leakage is factored in)
• Future products will have sustained power limits at Vmin
  – Higher power in “bursts” if Silicon is “cold”
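A small calculation shows why the voltage term dominates in Power = C * V^2 * F. The Python sketch below uses made-up capacitance, voltage, and frequency values (none come from the slides) and models only dynamic switching power; leakage, which the slide says pushes the dependence toward cubic, is not modeled.

```python
def dynamic_power(c_eff, voltage, freq_hz):
    """Dynamic switching power in watts: P = C * V^2 * F."""
    return c_eff * voltage ** 2 * freq_hz


# Hypothetical values chosen only for illustration (not from the slides).
C_EFF = 1.0e-9   # effective switched capacitance, farads
F_HZ = 1.0e9     # clock frequency, hertz
V_HIGH = 1.0     # volts
V_LOW = 0.8      # volts, standing in for a lower operating voltage

p_high = dynamic_power(C_EFF, V_HIGH, F_HZ)
p_low = dynamic_power(C_EFF, V_LOW, F_HZ)
ratio = p_low / p_high

print(f"P at {V_HIGH} V: {p_high:.2f} W")
print(f"P at {V_LOW} V: {p_low:.2f} W")
print(f"ratio: {ratio:.2f}")
```

With these numbers, a 20% voltage reduction at the same frequency cuts dynamic power to about 64% of its original value, which is the squared-voltage effect behind running at the highest frequency Vmin allows.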

Summary

• Intel® Next Generation Microarchitecture, Codename Ivy Bridge, is the 1st product on 22 nm process technology

• Another big leap in Performance/Power efficiency in both IA core and Graphics/Media

• Features for improved Security, better Battery life, new Memory technology (DDR3L), better Overclocking support

• Next generation Graphics microarchitecture is a Significant Graphics and Media (“tick+”) evolution for Intel® HD Graphics

It’s Just The Beginning…

Some Metrics

                             Sandybridge   Ivybridge   Next
Max GHz                      1.3           1.2         1.X
Shaders                      12            16          Much More
MAD / Clk                    4             8           8
Plane / Clk                  4             4           4
Max Gflops (Plane+MAD)       250           461         Much More
Max Gflops (MAD+MAD)         125           307         Much More
Samplers @ 4 Texels / Clk    1             2           Much More
Scatter/Gather / Clk         1             32          Much More
DX Rev                       10.1          11
OGL Rev                      3.0           3.3
OCL Rev                      NA            1.2
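The MAD+MAD row can be cross-checked from the Max GHz, Shaders, and MAD / Clk rows. The Python sketch below assumes each MAD counts as two floating-point operations (one multiply, one add); that counting convention is common but is not stated on the slide, and the function name is made up for the example. The Plane+MAD row is left out because the slide does not say how a plane operation is counted.

```python
def peak_mad_gflops(shaders, mads_per_clk, ghz, flops_per_mad=2):
    """Peak GFLOPS = shaders * MADs per clock * FLOPs per MAD * clock in GHz.

    flops_per_mad=2 is an assumption (multiply + add), not from the slide.
    """
    return shaders * mads_per_clk * flops_per_mad * ghz


# Inputs taken from the metrics table.
snb = peak_mad_gflops(shaders=12, mads_per_clk=4, ghz=1.3)
ivb = peak_mad_gflops(shaders=16, mads_per_clk=8, ghz=1.2)

print(f"Sandybridge: {snb:.1f} GFLOPS")  # prints 124.8
print(f"Ivybridge:   {ivb:.1f} GFLOPS")  # prints 307.2
```

Rounded to whole numbers, these give 125 and 307, matching the Max Gflops (MAD+MAD) entries for Sandybridge and Ivybridge.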

Scalability – GenX

• “Slice” Based
• At 1GHz, provides 2 TFLOP of shader performance (GT4)
• Chop:
  • Slice Half – GT1
  • Right Half – GT2
  • Bottom Half – GT3

[Block diagram: CPU and LLC blocks; Media, Geom (VS, V$, TE, GS, DS, CL, HS, SO, WFE) and GTI; Slice Common (Setup, Rasterizer, Plane-Z/HiZ/IZ, Pix. Backend, RCC, STC, RCZ); VEBox; slices of EU arrays with Samplers, L3$, and DAP*, GWY, IC / TDL, PSD, BC blocks]

Conclusion

• Intel is committed to a comprehensive graphics roadmap:
  – Maximum performance on “Battery” driven platforms:
    • Laptops → Tablets → Phones
• Always looking for talent
  – Many Geographies
  – The sun never sets on Intel ☺

Q & A
