Tom Piazza Sr
“Processor Graphics”
Tom Piazza
Sr. Fellow, Director of Graphics
Intel, Folsom
Property of Intel

My Brief History
• I started at General Electric in 1978
  – Flight Simulators, $10M each
    • Sold by the pound
  – Short list of 1st’s:
    • Procedural Textures (1978)
    • Bitmap Textures – Tri-linear Filtered (1983)
    • Tile Based Deferred Rendering (1986)
    • Z Buffered Real-time Rendering (1986)
    • 16X MSAA Antialiasing (Color and Z) (1986)
    • Geometry Tessellation (1986)
  – Then there was the Intel740® Graphics Accelerator

Short History of Graphics at Intel
• Intel 740: initially started as a joint venture with Real3D (derived from General Electric’s Flight Simulator business)
• Original intention was to create a discrete graphics business
  – Got redirected to Chipset Integrated
  – White Space
    • Intended for Business users, Mom and Pop
    • Not Gamers
      – OK, maybe casual gamers (WOW)

What Changed: Visual User Experiences - Exploding
Vastly Richer Displays
• High dpi, touch, Stereo3D, widi
• Draw more, richer pixels
Traditional Media
• Transcode, LP Media, HQ Video
• Conferencing, 4k Media, HEVC
Comp & Environment
• Competition investing in big Gfx
• Apple: Big Gfx iMac, MBP, MBA
• Android: GPU > important vs CPU
• Win8 Metro: new min Gfx bar
• Next-gen Consoles est in ’13
• Hetero CPU+Gfx Programming
New Gfx Usages
• HTML5 makes Web 3D
• Advanced User Interface
• Realistic 3D gaming
GFLOPs-based Media
• Video Processing & Analytics
• Computational photography
pGfx Arch Wins in Thin Form Factor
Scales up in Perf - $6.5B dGfx SAM
Visual Computing Central to New User Experience
PC expectations are now the expectations of a Tablet (Ease of Use) with the UI of a PC (Keyboard, Mouse)

What we said … and then …

… what we delivered: Processor Graphics – Sandybridge
• Sandybridge is 1st incarnation (2010)
  – Cache sharing (sets a stage :)
  – Power Sharing – Voltage and MHz modulated
    • CPU Hi, Gfx Lo workload → Power to CPU
    • CPU Lo, Gfx Hi workload → Power to GPU
  – DX10.1
  – OpenGL 3.0 at initial release, OpenGL 3.1 on present release

Sandybridge Exceeds 10X by ’10 Goal
[Chart: 3DMark scores for the three configurations below]
• SANDY BRIDGE – 1300MHz, 2xDDR3-1600
• ARRANDALE – 766MHz, 2xDDR3-1067
• GM45 – 533MHz, 2xDDR3-1067
Assumes PRQ dates. SNB config: 2.8GHz 4+2, 8MB LLC shared, 1300MHz Core and 2xDDR3-1600. 3DMark05 is used since 3DMark2006 has a SM3.0 test that does not run on 2006 parts. 3DMark2006 SM2.0 sub-test is shown since Gen driver optimizations for all recent devices are not focused on 3DMark2005.

Processor Graphics – Ivybridge
• Ivybridge is 2nd incarnation (2011)
  – Same power and cache sharing
  – DX11
  – OCL 1.2
  – OpenGL 3.3 at initial release
• Performance wise, Ivybridge is basically an XBox360 on every laptop

Ivy Bridge Graphics and Media Microarchitecture Overview
Ivy Bridge Microarchitecture
• Next generation Intel® Core™ microprocessor on the latest 22-nm process
• Improved Game Playability
  – More 3D performance
  – Microsoft* DirectX* 11 Support
• Significant Media Performance
  – Higher performing Intel® Quick Sync Video
• Three Native Display Support
[Block diagram: four Core + LLC pairs, Graphics, System Agent with IMC (2ch DDR3/DDR3L) and Display, x16 PCIe (PCI Express), DMI to the 2012 PCH, PECI Interface to Embedded Controller, eDP output]
Intel® Next Generation Microarchitecture Codename Ivy Bridge

Ivy Bridge HD Graphics: µArchitecture Changes
Scalable Architecture partitioned into 5 domains:
1. Global Assets: Includes Geometry Front-end up to Setup
2. Slice Common: Includes Rasterizer, L3$ and Pixel Back-end
3. Slice: Shaders (EUs), IC$, Samplers, Addrs Gen
4. CODEC and media
5. Displays
Sets the stage for further scale-up opportunities

Ivy Bridge HD Graphics: Architecture Adds Significant 3D Enhancements
Microsoft* DirectX* 11 Hardware Tessellation
• Adds two programmable stages (HS and DS) and one fixed function Tessellator
New Compressed Texture Format Support (BC6H/7)

Ivy Bridge HD Graphics: Architecture More Key Changes
Compute Shader Support
• Data Parallelism
• UAVs, Atomics, Barriers, etc. for compute shader ops
• Shared Local Memory, aka Thread Group Local Memory, for Direct Compute* Shaders
• Scatter gather
Shader Array adds support for Shader Model 5.0 (New DX11 Instructions)

Ivy Bridge HD Graphics: µArchitecture Changes
Improved Geometry Performance
• Faster GS and H/w Stream-out
• Faster Clip/Setup
Fast Clear of Render Target
Increase in Hi-Z Performance
Sampler throughput
• Improved Anisotropic Quality
Increased compute throughput (peak GFLOPs)
• Increased # of threads/registers to cover latency and support complex shaders
• “Enhanced” coissue
L3$ lowers BW need from Ring Architecture
Media Applications benefit from infrastructure changes in EU/L3$/etc.

Ivy Bridge HD Graphics: µArchitecture
Significant Media Performance
• Higher performing Intel® Quick Sync Video
µArchitecture Changes
• Enhanced Performance for Multi-Format CODEC
• Increased Media Sampler Throughput and performance for scaling and other filters
• Pixel Back end has Image Color and Contrast Enhancement capabilities

Processor Graphics Future Goals
• Maximize performance attainable within specific “socket” limits
  – 25 Watt Laptops
  – 15 Watt Thin and Light “Ultra-Books”
  – 3-5 Watt Tablets and Phones

How do we do this?
• Most efficient operating point of anything in silicon is at the knee of the process:
  – Power = C * V^2 * F
  – Max Frequency allowed at Vmin, since Voltage has a squared power function (cubic when leakage is factored in)
• Future products will have sustained power limits at Vmin
  – Higher power in “bursts” if Silicon is “cold”

Summary
• Intel® Next Generation Microarchitecture, Codename Ivy Bridge, is the 1st product on 22 nm process technology
• Another big leap in Performance/Power efficiency in both IA core and Graphics/Media
• Features for improved Security, better Battery life, new Memory technology (DDR3L), better Overclocking support
• Next generation Graphics microarchitecture is a Significant Graphics and Media (“tick+”) evolution for Intel® HD Graphics
It’s Just The Beginning…

Some Metrics
                             Sandybridge   Ivybridge   Next
Max GHz                      1.3           1.2         1.X
Shaders                      12            16          Much More
MAD / Clk                    4             8           8
Plane / Clk                  4             4           4
Max GFLOPs (Plane+MAD)       250           461         Much More
Max GFLOPs (MAD+MAD)         125           307         Much More
Samplers @ 4 Texels / Clk    1             2           Much More
Scatter/Gather / Clk         1             32          Much More
DX Rev                       10.1          11
OGL Rev                      3.0           3.3
OCL Rev                      NA            1.2

Scalability – GenX
• “Slice” Based
• At 1 GHz, provides 2 TFLOP of shader performance (GT4)
• Chop:
  – Slice Half: GT1
  – Right Half: GT2
  – Bottom Half: GT3
[Block diagram: CPU and LLC next to the graphics block. Media and Geometry front ends (VS, V$, HS, TE, DS, GS, SO, CL, WFE) connect through GTI to Slice Common (Setup, Rasterizer, Plane-Z/HiZ/IZ, Pixel Backend with RCC/STC/RCZ, VEBox, L3$) and to slices made of EU arrays with Samplers and per-group DAP, GWY, IC, TDL, PSD and BC units. The GT1, GT2 and GT3 chops are marked on the layout.]

Conclusion
• Intel is committed to a comprehensive graphics roadmap:
  – Maximum performance on “Battery” driven platforms:
    • Laptops → Tablets → Phones
• Always looking for talent
  – Many Geographies
  – The sun never sets on Intel :)

Q & A
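
The relation Power = C * V^2 * F from the “How do we do this?” slide can be explored numerically. The following is a minimal sketch, not Intel's power model: the function name and the normalized operating points are hypothetical, leakage is ignored, and the case where frequency is lowered together with voltage is an assumption made only to show how the two factors compound.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Switching power from the slide's relation: P = C * V^2 * F.

    All arguments are in normalized units (1.0 = baseline); leakage is ignored.
    """
    return capacitance * voltage ** 2 * frequency


# Hypothetical operating points, relative to a baseline of V = 1.0, F = 1.0.
baseline = dynamic_power(1.0, 1.0, 1.0)

# Same clock, half the voltage: power drops with the square of the voltage.
half_voltage = dynamic_power(1.0, 0.5, 1.0)

# Assumption: frequency is lowered in step with voltage, so the drop is cubic.
half_voltage_and_clock = dynamic_power(1.0, 0.5, 0.5)

print(half_voltage / baseline)            # fraction of baseline power
print(half_voltage_and_clock / baseline)  # fraction of baseline power
```

Under these assumptions, halving the voltage alone leaves a quarter of the baseline power, and halving voltage and frequency together leaves an eighth, which is why running at Vmin is attractive for a fixed power budget.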
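
The peak GFLOPs rows of the “Some Metrics” table are consistent with a simple count of floating-point operations per clock. The sketch below is one accounting that happens to reproduce the table, not a formula stated in the slides. It assumes that each MAD and each Plane operation counts as 2 floating-point operations, that the MAD+MAD row uses only the MAD / Clk column, and that the Plane+MAD row adds the Plane / Clk column. The function and variable names are hypothetical.

```python
FLOPS_PER_OP = 2  # assumption: one multiply plus one add per counted operation


def peak_gflops(shaders, ops_per_clk, ghz):
    """Peak GFLOPs = shaders * operations per clock * flops per operation * GHz."""
    return shaders * ops_per_clk * FLOPS_PER_OP * ghz


# Values copied from the "Some Metrics" table.
parts = {
    "Sandybridge": {"ghz": 1.3, "shaders": 12, "mad": 4, "plane": 4},
    "Ivybridge": {"ghz": 1.2, "shaders": 16, "mad": 8, "plane": 4},
}


def table_rows(part):
    """Return (Plane+MAD, MAD+MAD) peak GFLOPs, rounded as in the table."""
    mad_mad = peak_gflops(part["shaders"], part["mad"], part["ghz"])
    plane_mad = peak_gflops(part["shaders"], part["mad"] + part["plane"], part["ghz"])
    return round(plane_mad), round(mad_mad)


for name, part in parts.items():
    print(name, table_rows(part))
```

With the table's inputs this prints `Sandybridge (250, 125)` and `Ivybridge (461, 307)`, matching the two GFLOPs rows. The “Next” column cannot be checked this way because the slides give no shader count or clock for it.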