ARM Mali Architecture

ARM Mali Architecture

ARM Mali GPU Architecture Sam Martin Graphics Architect, ARM ARM Game Developer Day - London 03/12/2015 Agenda . Mali architecture and tiling introduction . Behind the scenes – power limits . Vulkan 2 © ARM 2015 Mali GPU Taxonomy In a Nutshell . Mali 4xx series OpenGL ES 2.0 . 1-8 shaders cores, separate fragment and vertex processors . Mali 6xx – 8xx OpenGL ES 3.x . Unified “tri-pipe” shader core . Larger core configurations, max 16 cores from Mali 760 + . AFBC, ASTC, Transaction Elimination, ... All tile-based GPUs 3 © ARM 2015 Command stream Command phase from CPU Input assembly Geometry phase Vertex shader Rasterizer Pixel shader Pixel phase Output merger 4 © ARM 2015 Tile-based GPUs Command stream Input assembly from CPU . Fragments >> Geometry Vertex shader Rasterizer Pixel shader . Phased structure 1. Buffer all operations into “render passes” Outer merger 2. Transform + bin all geometry into screen space tiles 3. Fully shade each tile into local memory, then write back 5 © ARM 2015 Mali Architecture . Hardware tiling . Forward Pixel Kill . Reduce overdraw . Framebuffer memory on-chip . 4x MSAA for “free” . Advanced on-chip shading . Bandwidth efficiencies . ARM Framebuffer Compression . Transaction elimination . ASTC 6 © ARM 2015 Mobile Power Limits . Lifetime constrained by battery Phones 1-3 Watts . High-end performance constrained by heat Tablets 3-5 Watts . Thermal Design Power/Point (TDP) Small laptop-like 10-25 Watts . Capacity constrained by ability to dissipate heat Regular laptop 25-50 Watts . Memory bandwidth particularly expensive Integrated desktop 40-100 Watts . Rule of thumb: 100mW / GB/s, assume 1 W total . Low-mid end GPUs are constrained by die area . Savings prolong battery life but may not increase performance 7 © ARM 2015 3 mm² 5 mm² 10 mm² 30 mm² 561 mm² Similarly capable mobile GPUs NVIDIA GeForce Die areas shown to scale GTX Titan 8 © ARM 2015 3 mm² 5 mm² 10 mm² 30 mm² 561 mm² Low-end 9 © ARM 2015 3 mm² 5 mm² 10 mm² 30 mm² 561 mm² Mid-range 10 © ARM 2015 3 mm² 5 mm² 10 mm² 30 mm² 561 mm² High-end 11 © ARM 2015 3 mm² 5 mm² 10 mm² 30 mm² 561 mm² . 1-10x range, just within mobile phones . Servicing such a wide range demands scalable GPU designs . GPU feature set cannot indicate performance capability 12 © ARM 2015 Thermal Throttling . CPU - big . CPU - LITTLE . GPU GL Benchmark 2.7 (T-Rex HD) [3 Runs] Max OPP big Max OPP LITTLE Frequency Frequency Max OPP GPU Median filtered chart for clarity Time (s) 13 © ARM 2015 Thermal Throttling . CPU - big . CPU - LITTLE . GPU GL Benchmark 2.7 (T-Rex HD) [3 Runs] Max OPP big Max OPP LITTLE Frequency Frequency Max OPP GPU Median filtered chart for clarity Time (s) 14 © ARM 2015 Thermal Throttling . CPU - big . CPU - LITTLE . GPU GL Benchmark 2.7 (T-Rex HD) [3 Runs] Max OPP big Max OPP LITTLE Frequency Frequency Max OPP GPU Median filtered chart for clarity Time (s) 15 © ARM 2015 Thermal Throttling . CPU - big . CPU - LITTLE . GPU GL Benchmark 2.7 (T-Rex HD) [3 Runs] Max OPP big Max OPP LITTLE Frequency Frequency Max OPP GPU Median filtered chart for clarity Time (s) 16 © ARM 2015 Thermal Throttling . CPU - big . CPU - LITTLE . GPU GL Benchmark 2.7 (T-Rex HD) [3 Runs] Max OPP big Max OPP LITTLE Frequency Frequency Max OPP GPU Median filtered chart for clarity Time (s) 17 © ARM 2015 Vulkan . Good match for mobile and tiling architectures . Explicit multi-pass render passes . No hidden costs (copies, allocs, shader recompiles, etc) . Multi-threaded . Low overhead . Gloves-off API . Needs care – look out for future info post-release 18 © ARM 2015 Thanks! Questions? [email protected] @palgorithm . Coming up: . Increase texturing efficiency and quality . Daniele Di Donato, “Get the most out of ASTC” – up next! . Advanced use of tiled framebuffers . Marius Bjørge, “Fast Approximate Indirect Lighting on Mobile”, 11am . Compute shaders & tessellation . Hans-Kristian Arntzen, “Real-time GPU-driven Ocean Rendering on Mobile”, 11.30am 19 © ARM 2015 For more information visit the Mali Developer Centre: http://malideveloper.arm.com • Revisit this talk in PDF and audio format post event • Download tools and resources 20 © ARM 2015 The trademarks featured in this presentation are registered and/or unregistered trademarks of ARM Limited (or its subsidiaries) in the EU and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their respective owners. Copyright © 2015 ARM Limited .

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    21 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us