Reference Guide

Reference Guide Graphics Core Next Architecture, Generation 3 August 2016 Revision 1.1 Copyright © 2016 Advanced Micro Devices, Inc. All rights reserved. DISCLAIMER The information contained herein is for informational purposes only, and is subject to change without notice. While every precaution has been taken in the preparation of this document, it may contain technical inaccuracies, omissions and typographical errors, and AMD is under no obligation to update or otherwise correct this information. Advanced Mi- cro Devices, Inc. makes no representations or warranties with respect to the accuracy or completeness of the contents of this document, and assumes no liability of any kind, including the implied warranties of noninfringement, merchantability or fitness for particular purposes, with respect to the operation or use of AMD hardware, software or other products described herein. No license, including implied or arising by estoppel, to any intellec- tual property rights is granted by this document. Terms and limitations applicable to the purchase or use of AMD’s products are as set forth in a signed agreement between the parties or in AMD's Standard Terms and Conditions of Sale. © 2016 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, AMD Accelerated Parallel Processing, the AMD Accelerated Parallel Processing logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos. Other names are for informational purposes only and may be trademarks of their respective owners. AMD’s products are not designed, intended, authorized or warranted for use as compo- nents in systems intended for surgical implant into the body, or in other applications intended to support or sustain life, or in any other application in which the failure of AMD’s product could create a situation where personal injury, death, or severe property or envi- ronmental damage may occur. AMD reserves the right to discontinue or make changes to its products at any time without notice. Advanced Micro Devices, Inc. One AMD Place P.O. Box 3453 Sunnyvale, CA 94088-3453 www.amd.com For AMD Accelerated Parallel Processing: URL: developer.amd.com/appsdk Developing: developer.amd.com/ Forum: developer.amd.com/openclforum ii AMD GRAPHICS CORE NEXT TECHNOLOGY Contents Contents Preface Chapter 1 Introduction Chapter 2 Program Organization 2.1 The Compute Shader Program Type ............................................................................................. 2-1 2.2 Instruction Terminology .................................................................................................................. 2-2 2.3 Data Sharing..................................................................................................................................... 2-3 2.3.1 Local Data Share (LDS)...................................................................................................2-4 2.3.2 Global Data Share (GDS).................................................................................................2-5 2.4 Device Memory................................................................................................................................. 2-5 Chapter 3Kernel State 3.1 State Overview ................................................................................................................................. 3-1 3.2 Program Counter (PC) .................................................................................................................... 3-2 3.3 EXECute Mask.................................................................................................................................. 3-2 3.4 Status Registers............................................................................................................................... 3-2 3.5 Mode Register .................................................................................................................................. 3-4 3.6 GPRs and LDS ................................................................................................................................. 3-5 3.6.1 Out-of-Range Behavior....................................................................................................3-5 3.6.2 SGPR Allocation and Storage ........................................................................................3-6 3.6.3 SGPR Alignment...............................................................................................................3-6 3.6.4 VGPR Allocation and Alignment ....................................................................................3-6 3.6.5 LDS Allocation and Clamping ........................................................................................3-6 3.7 M# Memory Descriptor .................................................................................................................... 3-7 3.8 SCC: Scalar Condition Code........................................................................................................... 3-7 3.9 Vector Compares: VCC and VCCZ.................................................................................................. 3-8 3.10 Trap and Exception Registers........................................................................................................ 3-8 3.11 Memory Violations ......................................................................................................................... 3-10 Chapter 4 Program Flow Control 4.1 Program Control............................................................................................................................... 4-1 4.2 Branching.......................................................................................................................................... 4-1 4.3 Work-Groups..................................................................................................................................... 4-2 iii Copyright © 2016 Advanced Micro Devices, Inc. All rights reserved. AMD GRAPHICS CORE NEXT TECHNOLOGY 4.4 Data Dependency Resolution ......................................................................................................... 4-2 4.5 Manually Inserted Wait States (NOPs) .......................................................................................... 4-4 4.6 Arbitrary Divergent Control Flow................................................................................................... 4-5 Chapter 5 Scalar ALU Operations 5.1 SALU Instruction Formats .............................................................................................................. 5-1 5.2 Scalar ALU Operands...................................................................................................................... 5-2 5.3 Scalar Condition Code (SCC)......................................................................................................... 5-2 5.4 Integer Arithmetic Instructions....................................................................................................... 5-5 5.5 Conditional Instructions.................................................................................................................. 5-5 5.6 Comparison Instructions................................................................................................................. 5-6 5.7 Bit-Wise Instructions ....................................................................................................................... 5-6 5.8 Special Instructions......................................................................................................................... 5-8 Chapter 6 Vector ALU Operations 6.1 Microcode Encodings...................................................................................................................... 6-1 6.2 Operands........................................................................................................................................... 6-2 6.2.1 Instruction Inputs.............................................................................................................6-2 6.2.2 Instruction Outputs..........................................................................................................6-3 6.2.3 Out-of-Range GPRs..........................................................................................................6-4 6.2.4 GPR Indexing ..................................................................................................................6-5 6.3 Instructions....................................................................................................................................... 6-5 6.4 Denormalized and Rounding Modes ............................................................................................. 6-7 6.5 ALU CLAMP Bit Usage.................................................................................................................... 6-7 6.6 VGPR Indexing ................................................................................................................................. 6-7 6.6.1 Indexing Instructions.......................................................................................................6-8 6.6.2 Special Cases ...................................................................................................................6-8 Chapter 7 Scalar Memory Operations 7.1 Microcode Encoding.......................................................................................................................

Reference Guide

Small Form Factor 3D Graphics for Your Pc

AMD Firepro™Professional Graphics for CAD & Engineering and Media & Entertainment

Minimum System Requirements for Illustrator

Monte Carlo Evaluation of Financial Options Using a GPU a Thesis

PC-Grade Parallel Processing and Hardware Acceleration for Large-Scale Data Analysis

Deep Dive: Asynchronous Compute

Contributions of Hybrid Architectures to Depth Imaging: a CPU, APU and GPU Comparative Study

AMD APP SDK V2.8.1 Developer Release Notes

AMD Releases ATI Catalyst 72

Mellanox Corporate Deck

AMD Codexl 1.0 GA Release Notes (Version 1.0.)

AMD Codexl 1.1 GA Release Notes