Programming of Graphics

Peter Mileff PhD
GPU overview, Graphics and Game Engines
University of Miskolc, Department of Information Technology

Overview of the GPU

GPU Overview
⦿ The Graphics Processing Unit (GPU) is the central unit of the graphics card
⦿ Its objectives:
  ● Performing complex graphical operations
  ● Directly accelerating visualization
  ● Offloading the CPU:
    ○ taking over high-level visualization tasks from the CPU
    ○ so the CPU is free to do other work
⦿ Why GPUs spread:
  ● Hardware manufacturers quickly recognized the business opportunities in:
    ○ Multimedia applications (e.g. Photoshop)
    ○ Engineering systems (e.g. CAD systems)
    ○ Games

First Achievements
⦿ In 1996, 3dfx released the Voodoo I
⦿ Voodoo I characteristics:
  ● One of the first consumer 3D accelerator cards (4 MB RAM, 50 MHz)
  ● Huge success
  ● Supported only 3D visualization
    ○ It required an additional 2D video card
⦿ The idea:
  ● 2D operations are performed by a fast 2D video card
    ○ E.g. the popular Matrox video cards
  ● 3D operations are performed by the Voodoo card
    ○ its hardware could perform the calculations faster than software rendering

Other important events
⦿ Around the same time:
  ● NVIDIA and ATI started their own GPU series
  ● NVIDIA: NV1, RIVA 128, GeForce 256
  ● ATI: 3D Rage, Rage Pro, Rage 128
⦿ These video cards immediately became very popular
⦿ The reasons:
  ● Reasonable price
  ● The cards could be bought in every computer shop
  ● The cards were supported by games and operating systems (mainly Windows)

[Figure: Today's main GPU brands]

Architecture of the GPU

CPU vs GPU
⦿ The GPU architecture has been very different from the CPU architecture from the very beginning
⦿ Reason 1:
  ● GPUs are designed for a specific purpose: typically to speed up graphical calculations
  ● Graphical calculations have different requirements from the general workloads a CPU must handle
  ● The CPU is a general-purpose processor
⦿ Reason 2:
  ● Graphical calculations and the rasterization process can be heavily parallelized
⦿ GPU development has followed this direction

CPU vs GPU
⦿ CPU: implements a single-threaded computing architecture
  ● multiple processes can run on a single-threaded pipeline
  ● application data is reached through one memory interface
⦿ GPU: the architecture follows the stream processing model
  ● This is a much more efficient approach to processing large amounts of data
  ● A GPU can contain thousands of stream processors
  ● There are no conflicts and waits as on the CPU
    ○ Stream processors form a pipeline

[Figures: CPU vs GPU; GeForce 8800; GeForce GTX 280]

CPU vs GPU
⦿ CPU: spends a lot of resources on
  ● program control flow,
  ● switching between instructions and tasks
⦿ GPU: is unsuitable for this
  ● A GPU contains a large number of arithmetic logic units (ALUs),
    ○ so it can compute an order of magnitude faster
  ● Limitation:
    ○ every processing unit should run the same instruction: data parallelism!
⦿ The CPU also supports data parallelism:
  ● with extended instruction sets (e.g. SSE, SSE2, SSE3, SSE4, AVX, AltiVec, etc.),
  ● with multicore CPUs

The problem of data transfer
⦿ There is a distance between the GPU and the CPU
  ● they are connected through the system bus
⦿ The data transfer problem appeared soon!
  ● Transferring data from main memory to GPU memory is time-consuming

The problem of data transfer
⦿ For this reason, numerous bus types were developed
  ● Former standards: ISA, MCA, VLB, PCI
  ● In 1997, the AGP (Accelerated Graphics Port) standard was developed
⦿ AGP provided very fast data transfer between CPU and GPU
⦿ AGP is still present today in older systems
⦿ Today, the dominant solution is the PCI Express standard
  ● a high-speed serial computer expansion bus standard

  Standard    Bandwidth per lane
  PCIe 1.0    250 MB/s
  PCIe 2.0    500 MB/s
  PCIe 3.0    984.6 MB/s
  PCIe 4.0    1969.2 MB/s

Tendency of evolution
⦿ GPU evolution far outpaces the development of CPUs
⦿ Moore's law (1965):
  ● the observation that the number of transistors in a dense integrated circuit doubles approximately every two years
⦿ Today:
  ● CPU: the doubling period is about 18 months
  ● GPU: the doubling period has shrunk to about 6 months

Tendency of evolution
⦿ Example:
  ● ATI Radeon HD 3800 GPU family:
    ○ 320 stream processors
    ○ 666 million transistors
    ○ Performance > 1 teraFLOPS
  ● Intel Core 2 Quad CPU:
    ○ 582 million transistors
    ○ Performance ~ 9.8 gigaFLOPS

[Figures: Tendency]

Programming the GPU

Programming APIs
⦿ In parallel with the development of video cards, numerous low-level programming interfaces (APIs) were developed
  ● Under strong influence from hardware vendors
⦿ First well-known API: Glide API
  ● Developed by 3dfx for its own Voodoo cards
  ● OpenGL-like interface
  ● Targeted games in terms of performance and functionality
  ● It was dominant in the game industry in the late 1990s
  ● In 2000, NVIDIA acquired 3dfx

Direct3D vs OpenGL
⦿ Direct3D
  ● Part of Microsoft's DirectX graphics API
  ● Available only on Microsoft platforms
    ○ Desktops, Xbox, Windows Phone
  ● One of the most popular graphics APIs among game developers
⦿ The reasons for its popularity:
  ● its development closely follows the evolution of graphics hardware
  ● It also provides built-in higher-level solutions:
    ○ Optimized mathematical solutions, e.g.
      matrices, vectors, collision detection, etc.
    ○ Its own 3D model format with bone animation support, called X
  ● Other additional higher-level APIs: DirectDraw, DirectInput, DirectSound, etc.

Direct3D vs OpenGL
⦿ OpenGL (Open Graphics Library):
  ● Specification standard for platform-independent 2D and 3D visualization
  ● Introduced in 1992 by Silicon Graphics Inc.
  ● The ARB (Architecture Review Board) consortium was responsible for its development
    ○ Its members were the major software and hardware manufacturers (ATI, NVIDIA, Intel, Microsoft, etc.)
  ● In 2006, the Khronos Group consortium took over its development
    ○ https://www.khronos.org/
  ● Slower development: evolving the specification is a slow process, which significantly hinders developers of graphics-intensive applications

Direct3D vs OpenGL
⦿ Real competitors in the field of game development
⦿ Both APIs have their own advantages and drawbacks
  ● The differences are mainly structural; the two APIs are almost identical in functionality
⦿ Advantages of OpenGL (the future):
  ● Platform independence: OpenGL can run on almost all devices
  ● OpenGL can also be used on embedded systems and mobile devices
    ○ This version is called OpenGL ES
  ● Popular operating systems use OpenGL
    ○ iOS - OpenGL ES
    ○ Linux, Unix, BSD - OpenGL
    ○ PlayStation - OpenGL
    ○ AmigaOS, MorphOS, Haiku OS
    ○ etc.

Game and Graphics Engines

Game engines
⦿ Objective 1: to provide a toolkit for the development team (developers, designers, testers)
  ● E.g. editors, runtime environment, network, audio
⦿ Efficient, convenient and fast game development becomes possible
⦿ It is a layer between the operating system and the game logic
⦿ It simplifies routine programming tasks:
  ● Otherwise these would have to be implemented for every game
  ● E.g. creating a window, audio, video playback, loading assets, collision detection, etc.
⦿ Objective 2: delivering appropriate technical quality
  ● in terms of graphics quality and performance

Structure of a Game Engine
⦿ Game development requires complex IT knowledge!
  ● The game engine supports this process, so its functionality must also be complex
Its functions are organized into well-defined subsystems:
⦿ Core subsystem: core functions; controls the modules and other subsystems. Provides platform independence and forwards events to other engine parts.
⦿ Graphics subsystem: responsible for visualization. It is typically built upon an API (OpenGL, DirectX)
  ● Displays models, lights, effects, post-processing, particle systems, etc.

Structure of a Game Engine
⦿ Audio and Music subsystem: plays audio effects and music
⦿ Artificial intelligence subsystem
⦿ Network subsystem: support for network connections and data transfer
⦿ Input and Event subsystem: handles input devices and event management
⦿ Scripting subsystem: supports script-based development
⦿ Resource subsystem: functions for accessing resources
⦿ Physics subsystem: makes physics-based simulations possible (e.g. in racing games)
⦿ Other subsystems: for math calculations, video playback, etc.

Structure of a Game Engine
⦿ Each subsystem should be a replaceable unit
⦿ Sometimes a subsystem is not developed in-house
  ● a company may decide to buy an existing, well-functioning technology
  ● this makes sense if developing a new subsystem would cost more than licensing an existing one
    ○ A typical example is integrating a physics subsystem
⦿ Modularity can be seen in today's major game engines
  ● Main components are written in a low-level language (e.g.
    C/C++)
    ○ Because of performance
  ● Game logic is written in a higher-level language
    ○ fewer errors
    ○ lower developer cost

Today's major Engines
⦿ Thanks to advances in technology, graphics and game engines can offer sumptuous visuals
⦿ Games are becoming increasingly complex
  ● They contain ever more cinematic parts and functionality
⦿ A modern game engine can be very expensive
  ● In return, developers receive many years of experience in the form of implemented algorithms

Today's major Engines
⦿ Unreal Engine 4 - Epic Games
  ● The engine is free, but a 5% royalty must be paid after the first $3,000 of revenue per product per quarter
⦿ id Tech 5 - id Software
⦿ Frostbite 3 - EA Digital Illusions CE
⦿ CryEngine 3 - Crytek
⦿ Source Engine - Valve
⦿ Unity Engine - Unity
⦿ ShiVa 3D - Stonetrip
⦿ C4 Engine - Terathon
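
The "Structure of a Game Engine" slides state that each subsystem should be a replaceable unit and that a physics subsystem is a typical candidate for licensing instead of in-house development. The following is a minimal Python sketch of that idea, not the design of any engine listed above. All names in it (Subsystem, Core, InHousePhysics, LicensedPhysics, Audio, register, tick) are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class Subsystem(ABC):
    """Hypothetical common interface that every engine subsystem implements."""

    @abstractmethod
    def update(self, dt: float) -> str:
        """Advance the subsystem by dt seconds and report what was done."""


class InHousePhysics(Subsystem):
    def update(self, dt: float) -> str:
        return f"in-house physics step ({dt:.3f} s)"


class LicensedPhysics(Subsystem):
    """Stands in for a licensed third-party physics library (assumption)."""

    def update(self, dt: float) -> str:
        return f"licensed physics step ({dt:.3f} s)"


class Audio(Subsystem):
    def update(self, dt: float) -> str:
        return f"audio mix ({dt:.3f} s)"


class Core:
    """Core subsystem: owns the other subsystems and drives them each frame."""

    def __init__(self) -> None:
        self._subsystems = {}

    def register(self, name: str, subsystem: Subsystem) -> None:
        # Registering under an existing name replaces that subsystem in place.
        self._subsystems[name] = subsystem

    def tick(self, dt: float) -> list[str]:
        # Call every subsystem once per frame, in registration order.
        return [s.update(dt) for s in self._subsystems.values()]


core = Core()
core.register("physics", InHousePhysics())
core.register("audio", Audio())
print(core.tick(0.016))

# Swap the physics implementation without touching Core or Audio.
core.register("physics", LicensedPhysics())
print(core.tick(0.016))
```

Because Core depends only on the Subsystem interface, replacing the physics implementation requires no change to the rest of the engine, which is the property the slides ask for.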
