Mali-G57: Premium GPU Performance for Mainstream Devices
Tech Symposia 2019

Daniele Di Donato, Product Manager
October 2019

Arm Mali GPUs: The World’s #1 Shipping Graphics Processor

Over 1Bn Mali GPUs shipped in 2018

Gaming, AI, AR, VR

Mali GPUs are in:
• ~80% of Smart TVs
• ~50% of Smartphones
• ~50% of Mobile VR

183 Total Mali licenses

© 2019 Arm Limited

Arm Mali Graphics Processor Roadmap
• Ultra Efficient: Mali-450, Mali-470, Mali-G31
• Mainstream: Mali-T830, Mali-G51, Mali-G52, Mali-G57
• High Performance: Mali-G71, Mali-G72, Mali-G76, Mali-G77

Complex and Challenging GPU-Powered Use Cases
• High-fidelity mobile gaming
• Augmented and virtual reality
• On-device machine learning

Arm Mali Graphics Processor Generations
• MIDGARD: Unified shader cores, SIMD ISA, OpenGL ES 3.x, OpenCL, Vulkan. Mali-T600 GPU series, Mali-T700 GPU series, Mali-T800 GPU series.
• BIFROST: Unified shader cores, scalar ISA, clause execution, full coherency, Vulkan, OpenCL. Mali-G71, Mali-G51, Mali-G72, Mali-G52, Mali-G31, Mali-G76.
• VALHALL: Superscalar engine, simplified scalar ISA, dynamic instruction scheduling. Mali-G77, Mali-G57.

First Valhall GPU for Mainstream Market Delivers Outstanding Device Performance

1.3x better performance

Compared with Mali-G52 3EE running complex content with same process node under similar conditions

Leap in Gaming Performance and Efficiency
Efficiently supporting growing graphics and ML complexity

• 30% better energy efficiency
• 30% more performance density
• 60% improvement for machine learning

Compared to Mali-G52 3EE on same process node under similar conditions

Improved High-Fidelity and Casual Gaming Performance
• Mali-G57 delivers more performance-per-square-millimetre
• Up to 2x more compute capabilities when compared with G52 2EE
• Quad texture mapper has large impact on some texture heavy games

Chart: Relative Game Performance (fps/mm2), Mali-G57 relative to Mali-G52 3EE, ISO process and frequency
• Complex Game1: 1.4x
• Complex Game2: 1.2x
• Complex Game3: 1.25x
• Casual Content: 1.25x

Delivers Even Longer Game Play
• Mali-G57 boosts energy efficiency across all workloads
• Delivers longer battery life for mainstream products
• Average 1.3x improvement in energy efficiency across wide range of content

Chart: Relative Energy Efficiency, Mali-G57 relative to Mali-G52 3EE, ISO process and frequency
• Complex content1: 1.24x
• Complex content2: 1.29x
• Complex content3: 1.20x
• Complex content4: 1.39x
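The quoted average can be checked against the four per-content figures on the chart. The sketch below assumes the "average" is a plain arithmetic mean of those four values; the slide does not say how it was computed, and the names used are illustrative only.

```python
# Relative energy-efficiency figures read from the chart
# (Mali-G57 relative to Mali-G52 3EE, ISO process and frequency).
gains = {
    "Complex content1": 1.24,
    "Complex content2": 1.29,
    "Complex content3": 1.20,
    "Complex content4": 1.39,
}


def arithmetic_mean(values):
    """Plain arithmetic mean; an assumption about how the slide averages."""
    values = list(values)
    return sum(values) / len(values)


average_gain = arithmetic_mean(gains.values())
print(f"average gain: {average_gain:.2f}x")
```

Under that assumption the mean comes to about 1.28x, which is consistent with the rounded 1.3x figure in the bullet.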

Enhanced On-Device Intelligence
• Mali-G57 significantly improves Machine Learning inference performance
• Average improvement for multiple NN networks

Chart: Relative performance improvement in ML networks, ISO process and frequency. Mali-G57: 1.6x relative to Mali-G52 3EE.

Excellent improvements, configurability and flexibility
• Introduction of Valhall architecture for Mainstream markets
• Single execution engine per shader core with increased FMAs
• Quad texture mapper
• Optimized Load and Store cache for ML networks
• Configurable 1 to 6 shader cores

Block diagram: Mali-G57 with inter-core task management, Advanced Tiling Unit, memory management unit, L2 cache and AMBA 4 ACE interface, plus shader cores 1 to 6. Each shader core shows Control/Scheduler, Register File, Datapath and Messaging blocks.

Valhall Architecture Goals
• The new Mali architecture following Bifrost
• 2nd generation of Arm GPU scalar architecture for high-performance, high-efficiency GPUs
• 16-wide warp-based execution model
• New simplified and compiler-friendly instruction set
• Aligned to new APIs

Execution engine diagrams (each engine shows warp control, scheduler and icache; control; register file; datapath; messaging):
• Mali-G51 Execution Engine: 4 wide per engine (one 4-lane datapath), 3 engines per core
• Mali-G52 Execution Engine: 8 wide per engine (two 4-lane datapaths), 2/3 engines per core
• Mali-G57 Execution Engine: 16 wide warp per cluster (16 lanes each), 2 clusters per engine, 1 engine per core

Valhall Fundamentals
• Warp-based execution model
  • 16 threads executed in lockstep in a warp
• New instruction set
  • Operational equivalence to Bifrost
  • Regular, unconstrained instruction encoding
  • No more clauses, tuples and fixed issuing
• Dynamic scheduling of instructions
  • Done by HW
  • Dependency system
• New features
  • AFBC 1.3
  • Support for FP16 render targets
  • HW allocated vertex shader outputs

Diagram: warp threads T0 to T15 plotted against time. All 16 threads process the .x component in Cycle 1, the .y component in Cycle 2 and the .z component in Cycle 3.
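The warp diagram can be illustrated with a small simulation. This is a sketch of one reading of the slide, not a model of the hardware: it assumes a scalar operation on a three-component value is issued one component per cycle, and that all 16 threads of the warp execute that component together. `run_warp` and its arguments are hypothetical names.

```python
WARP_WIDTH = 16  # threads per warp, as stated on the slide


def run_warp(thread_inputs, op):
    """Apply a scalar op to an (x, y, z) value per thread.

    Assumption: one component is processed per cycle, and every thread
    in the warp processes the same component in the same cycle.
    """
    assert len(thread_inputs) == WARP_WIDTH
    results = [[None, None, None] for _ in range(WARP_WIDTH)]
    schedule = []
    for cycle, component in enumerate("xyz", start=1):
        index = cycle - 1
        # Lockstep: the same instruction runs on all 16 threads this cycle.
        for thread in range(WARP_WIDTH):
            results[thread][index] = op(thread_inputs[thread][index])
        schedule.append((cycle, [f"T{t}.{component}" for t in range(WARP_WIDTH)]))
    return results, schedule


inputs = [(t, t + 0.5, t + 1.0) for t in range(WARP_WIDTH)]
results, schedule = run_warp(inputs, lambda value: value * 2.0)
for cycle, lanes in schedule:
    print(f"Cycle {cycle}: {lanes[0]} ... {lanes[-1]}")
```

Each printed row corresponds to one column of the diagram: the same component across T0 to T15 in a single cycle.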

Efficient Shader Core with increased compute capabilities
New gaming content becoming more complex

• 32 FMAs per core
• 1.3x FMA compared to G52 3EE
• 2x FMA compared to G52 2EE
• 2.6x FMA compared to G51

Chart: FMAs per core for Mali-G51, Mali-G52 2EE, Mali-G52 3EE and Mali-G57
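The FMA ratios follow from the engine widths and counts shown on the execution engine slides. The sketch below assumes each lane performs one FMA per cycle, which the slides imply but do not state. Treating a Bifrost engine as a single "cluster" is only a modelling convenience, and `CoreConfig` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreConfig:
    name: str
    engines_per_core: int
    clusters_per_engine: int
    lanes_per_cluster: int

    @property
    def fmas_per_core(self) -> int:
        # Assumption: one FMA per lane per cycle.
        return (self.engines_per_core
                * self.clusters_per_engine
                * self.lanes_per_cluster)


CONFIGS = [
    CoreConfig("Mali-G51", 3, 1, 4),      # 4 wide per engine, 3 engines
    CoreConfig("Mali-G52 2EE", 2, 1, 8),  # 8 wide per engine, 2 engines
    CoreConfig("Mali-G52 3EE", 3, 1, 8),  # 8 wide per engine, 3 engines
    CoreConfig("Mali-G57", 1, 2, 16),     # 2 clusters of 16 lanes, 1 engine
]

fmas = {config.name: config.fmas_per_core for config in CONFIGS}
g57 = fmas["Mali-G57"]
ratios = {name: g57 / count for name, count in fmas.items() if name != "Mali-G57"}

for name, count in fmas.items():
    print(f"{name}: {count} FMAs per core")
for name, ratio in ratios.items():
    print(f"Mali-G57 vs {name}: {ratio:.2f}x")
```

With these inputs the per-core counts are 12, 16, 24 and 32, and the computed ratios are close to the quoted 2.6x, 2x and 1.3x.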

Quad Texture Mapper Doubles Throughput
New gaming content becoming more complex

• 4 texels/cycle
• 2x Mali-G52 throughput

Chart: Bilinear Texels/clock for Mali-G51, Mali-G52 and Mali-G57
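To put the texel figure in context, a peak bilinear texel rate can be estimated as texels per cycle times shader cores times clock. This is a rough sketch under several assumptions: the 4 texels/cycle is taken to be per shader core, the Mali-G52 value of 2 is inferred from the "2x" claim, and the 800 MHz clock is a hypothetical figure that does not come from the slides.

```python
# Bilinear texels per cycle, assumed to be per shader core.
# The Mali-G52 value is inferred from "2x Mali-G52 throughput".
TEXELS_PER_CYCLE = {"Mali-G52": 2, "Mali-G57": 4}

HYPOTHETICAL_CLOCK_HZ = 800_000_000  # illustrative only, not from the slides


def peak_texel_rate(gpu, shader_cores, clock_hz=HYPOTHETICAL_CLOCK_HZ):
    """Theoretical peak bilinear texels per second (no memory or cache limits)."""
    return TEXELS_PER_CYCLE[gpu] * shader_cores * clock_hz


for cores in (1, 6):  # Mali-G57 is configurable from 1 to 6 shader cores
    rate = peak_texel_rate("Mali-G57", cores)
    print(f"Mali-G57, {cores} core(s): {rate / 1e9:.1f} Gtexels/s peak")

speedup = peak_texel_rate("Mali-G57", 1) / peak_texel_rate("Mali-G52", 1)
print(f"Per-core speedup over Mali-G52: {speedup:.1f}x")
```

Real texture throughput also depends on filtering mode, formats and memory bandwidth, so these numbers are upper bounds at best.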

Improved Load and Store cache
• Latency improvements: number of pipeline stages is reduced by half
• Throughput improvements: internal datapath is cacheline wide

Chart: Relative NN performance, Mali-G57 and Mali-G52 3EE

Bringing Premium Device Experiences Mainstream
• High-end graphics performance at increased efficiency
• First mainstream GPU with new Valhall architecture
• Outstanding Mali GPU performance improvement to enable premium use cases on mainstream devices

• 30% more energy efficient*
• 30% more performance density*
• 60% machine learning improvement*

* Compared to Mali-G52 3EE on same process node under similar conditions

Thank You Danke Merci 谢谢 ありがとう Gracias Kiitos 감사합니다 धन्यवाद شكرًا תודה

© 2019 Arm Limited The Arm trademarks featured in this presentation are registered trademarks or trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their respective owners.

www.arm.com/company/policies/trademarks