Department of Computer Science
National Sun Yat-sen University
Master Thesis

A Simulator for a Novel GPU to Support the Verifying and Profiling in Real World Applications

Student: Hsu-Kang Dow
Advisor: Dr. Steve W. Haga

October 2013


Dedicated to

My parents


摘要 (Chinese Abstract)

This thesis presents a verification and profiling simulator for a modern GPU, built on the open-source Attila simulator and supporting OpenGL ES 2.0 and GLSL ES compilation and execution. The completed simulator can verify the shading-language compiler and collect the statistics needed for profile-assisted compiler optimization. For research use, it reports performance statistics and the values processed at each pipeline stage, providing reference values for hardware verification and debugging. To let the Attila simulator accept the GLSL shading language, this thesis provides a converter with two parts: an API-level converter that handles the program data linking used during GLSL compilation, and a translator between the Attila ISA and the NSYSU ISA that maps register I/O. Compared with the SystemC simulator, the new Attila-based simulator is 300 to 2000 times faster and avoids the precomputation otherwise required for system-level simulation, a capability that is indispensable for simulating complex applications.

Keywords: GPU, Attila simulator, OpenGL ES, GLSL ES


Abstract

This thesis presents a simulator based on Attila, a modern GPU architecture and open-source project capable of running games and benchmarks. The simulator has been modified to support OpenGL ES 2.0 and GLSL ES compilation and execution, and is therefore an important extension of the Attila simulator. In addition, because our compiler targets the NSYSU GPU architecture, the simulator allows the code produced by our GLSL ES compiler to be verified. The simulator also enables future research by recording statistics from running real-world applications; these data can then be used for profile-assisted compiler optimizations.

Along with the simulator, we provide an NSYSU GPU to Attila converter. The converter consists of two parts: an API converter and an assembly converter. It solves the data-linking problems for attribute, uniform, and varying data that occur when adapting the Attila simulator to run assembly produced by the NSYSU GLSL compiler. Compared to the NSYSU GPU's current SystemC simulator, the new Attila-based simulator is 300 to 2000 times faster. It also avoids the need to precompute the simulator's inputs for system-wide simulation. These benefits are essential for simulating non-trivial applications.

Keywords: GPU, Attila, Simulator, OpenGL ES, GLSL ES.

Contents

1. Introduction
1.1 Profiling
1.2 Verifying
1.3 Real World Applications
1.4 Attila Simulator
1.5 Converter
2 Related Works
2.1 NSYSU GPU
2.2 NSYSU SystemC Simulator
2.3 Attila GPU
2.4 Attila Tracing
3 Methodology
3.1 An Overview of the Converter
3.2 Data Flow (Attribute, Uniform, Varying)
3.3 Attila OpenGL API Driver Modification
3.4 Converter for NSYSU to ATTILA Assembly
3.5 Load / Store Instructions & Memory Design
3.6 Miscellaneous
4 Performance Comparison and Result
4.1 GLBenchmark
5 References

A Simulator for a Novel GPU to Support the Verifying and Profiling in Real World Applications
Author: Hsu-Kang Dow
Advisor: Dr. Steve W. Haga
National Sun Yat-sen University

1. Introduction

This thesis presents a simulator for a novel GPU to support the verifying and profiling of real-world applications. The novel GPU [1] was developed by the Department of Computer Science at National Sun Yat-sen University (NSYSU). The design target of this GPU is the embedded market, so reducing power consumption is a vital requirement, and optimization therefore plays an important role in its development. Applications and games on handheld devices are increasingly popular, and many of them include 3D graphics. To meet the needs of this market, an embedded GPU with power awareness was introduced.

Programmable shaders are also important to modern game design, so the NSYSU GPU supports OpenGL ES 2.0 [2] and GLSL ES [3]. Along with the GPU, a GPU simulator is required for verifying the implementation of both the hardware and the software. The current simulator is written in SystemC; it directly models the hardware behavior and is cycle accurate, which makes it slow. Another disadvantage of the current SystemC simulator is that it lacks full-system simulation: it simulates only the hardware behavior, not the communication between the OpenGL API calls (on the CPU) and the shader programs (on the GPU). A simple benchmark such as red cube has simple communication that a programmer can set up by hand, but this does not scale to larger programs. The current SystemC simulator is therefore too limited for real-world applications.

1.1 Profiling

Because of the nature of embedded systems, it is hard to increase performance by simply adding more hardware, as NVidia and AMD do for desktop GPUs. This is where optimization comes in. Before we optimize the code, we need information about where the bottleneck is and which resources can be reused. Profiling provides this information by tracing tagged control flow, collecting memory-related information, and identifying calculations that can be reused, such as replacing a distant object with a texture, for better performance and power savings.

The environment of an embedded system is different from a PC, so our optimizations focus on different aspects; for example, most handheld devices have a lower resolution than a desktop PC. What we built is a tool that helps programmers automate optimizations that are usually done by hand. For instance, a distant tree might cover only three pixels, yet rasterizing the full scene to obtain those three pixels can take hours; a programmer can instead paste a texture in place of the distant object. Profiling the scene while rendering objects makes this kind of decision possible. To obtain the data we need, we first need a simulator that can run applications and gather information at runtime, so that this information can drive profile-assisted compiler optimization. This was a requested feature, and the new simulator enables it.

1.2 Verifying

The NSYSU GPU comes with a simulator coded in SystemC, whose functions map directly onto the hardware. It is designed for hardware verification and models the logic at a near-transistor level of detail, so simulating the entire process takes a long time. Because it is cycle accurate, this SystemC simulator is slow when used to verify the code generated by the compiler. Another problem is that the current simulator lacks a connection to the API: the simple benchmarks do not contain complex API call sequences, which leads to insufficient API-to-GPU communication for further analysis and verification. We therefore need a fast simulator that can quickly verify our optimized results and that is capable of processing complex scenes, so that our compiler-generated code can be fully tested.

1.3 Real World Applications

Another goal of this work is to expand our infrastructure to handle more complicated and more-state-of-the-art shader programs, such as might be found in modern real-world applications. Prior to this thesis, the NSYSU GPU project was unable to support such applications. As for the pre-existing SystemC simulator, it is too buggy and too slow to simulate complex shader codes. And as for the OpenGL ES API function calls, real-world applications are so intricate that it becomes infeasible to compute the API-to-GPU communications by hand. Consequently, the current test programs for the SystemC simulator are toy benchmarks.

Such simple test programs do not need smart compilers, and demonstrating compiler benefits on them is unpersuasive. Modern games, by contrast, provide complex scenes and room for the compiler to optimize. With the ability to run games and benchmarks, we can put our compiler to a real challenge. There is, however, a limitation of my implementation: we need the GLSL source code of the original program in order to compile it. Unfortunately, for many real-world applications the shader code has been precompiled in the factory, thus losing information we need to do the simulation.

Alternatively, we acquired a complex OpenGL benchmark, GLBenchmark [4], from ITRI (Industrial Technology Research Institute). GLBenchmark is a state-of-the-art benchmark program for handheld devices. It includes several complex scenes and shader stress tests. To simulate GLBenchmark, we need to add new API support to the Attila driver and fix the GLSL compiler for full-system simulation, which is work in progress.

1.4 Attila Simulator

There are already open-source GPU simulators, but we cannot simply use one of them, because they simulate commercial GPUs rather than our GPU, which uses a novel architecture.

To meet all the requirements listed above, we introduce a new simulator based on the Attila simulator [5]. Attila is a cycle-level, execution-driven simulator for modern GPU architectures. It is an actively developed open-source project written in C++; the latest version was released in 2011, with a bug-fix update in 2013. It simulates a modern GPU, provides both a VS/FS and a unified-shader version of the architecture, and has a powerful trace tool for replaying real games.

As for verification, Attila offers both a simulator (cycle accurate) and an emulator (not cycle accurate). The Attila emulator provides fast emulation, which is suitable for verifying the shader code generated by our shader compiler. A custom SystemC simulation usually runs at about 10k instructions per second, whereas a simulator written in C++ reaches roughly 1 million instructions per second. In my final results, programs run at least 100 times faster, and some up to 1000 times faster, on the Attila emulator.

By using the Attila trace tool, we can acquire various real-world applications with complex scenes and interesting shader programs. The tool can also be used to detect and record control flow at run time for further compiler optimization. There is, however, a limitation when replaying a trace on the simulator. The trace tool records the shader programs used by the 3D game or application, but most applications ship with shaders precompiled in the factory; in that case only low-level assembly can be recorded during tracing, and the control-flow and data-linking information we need is lost. We therefore need the GLSL source code to run a full simulation of a real-world application.

Finally, Attila also provides statistics tools and a signal-traffic dump for analyzing the bottlenecks of the traced program. In addition to the statistics that Attila already collects, we can insert our own statistics because Attila is open source.

1.5 Converter

Both the NSYSU GPU and Attila support a programmable pipeline (a shading language), but they use different instruction set architectures and memory allocation. We therefore need a converter that translates the code produced by our compiler to fit the Attila architecture.

The Attila official website claims to support OpenGL 2.0, but it actually only supports OpenGL 2.0 games that do not need any of the key OpenGL 2.0 features. To understand what this means, recall that OpenGL 2.0 is backwards compatible: legacy programs written for OpenGL 1.5 or earlier still work in OpenGL 2.0. Attila therefore only uses the ARB shading language [9][10], which has no control flow. The earlier standard only allowed a predefined set of input parameters for each vertex/fragment, as opposed to the OpenGL 2.0 method of user-defined attributes for vertices and user-defined varyings for fragments. Because it does not truly meet the OpenGL 2.0 standard, an Attila trace only contains information for the parameters that are defined in OpenGL 1.5 and loses information for OpenGL 2.0 API calls such as glUniform().

Of course, our purpose in developing a simulator is to verify our novel GPU's compiler for GLSL programs, so Attila needed to be made truly OpenGL 2.0 compliant. To do so, we had to fix several problems.

First, Attila does not have a GLSL compiler. This is probably the main reason its developers have not put in the effort to truly support OpenGL 2.0. But we have an OpenGL 2.0 compiler, so this problem is solved.

Second, Attila does not support control flow in ARB programs. The Attila ISA [12] does have a conditional jump, but it cannot be used for our purpose as-is, because the ARB format cannot express control flow. The solution is to modify the simulator so that the existing Attila control-flow support works with our GPU's instructions.

Third, Attila does not support load and store instructions. Memory-access instructions are not necessary for compiling most shader programs, but our compiler is not very efficient, so it often needs memory accesses. I therefore augmented the Attila simulator with new load and store instructions (LDV/STV) and added a virtual local memory to each shader core.

The fourth problem is the lack of OpenGL function calls for GLSL-related data linking. Without GLSL support, the Attila team did not implement the OpenGL 2.0 functions related to linking uniform data. This can be solved by a combination of OpenGL 1.5 function calls and an analysis of the shader compiler table generated by our LLVM-based compiler [6].

The last problem is that Attila only supports fixed pipeline names (I/O parameter names) in ARB programming; this is inherent to the ARB programming standard. It is solved by analyzing the shader compiler table and generating a purpose-built predefined ARB program to achieve the I/O mapping.

2 Related Works

To further explain the tools and stages involved in making the converter, we first examine the architectures of the NSYSU GPU and Attila. The Attila GPU can use either a VS/FS or a unified-shader configuration, but the unified shader has the same structure as VS/FS plus a scheduler and a distributor to control the shaders' behavior. I therefore introduce both the NSYSU GPU and the Attila GPU in the VS/FS configuration, which makes them easier to compare.

The Attila package contains both a simulator and an emulator. The simulator is cycle accurate and execution driven, designed to mimic the hardware behavior. The emulator runs only the functions of the instructions, not the command-decoding process or the pipeline behavior. I use the emulator as the base of this thesis because the purpose of our simulator is to verify the code generated by our compiler and to trace tagged control flow for optimization; using the faster emulator while giving up cycle accuracy is therefore a reasonable choice.

2.1 NSYSU GPU

Figure 2.1. Overview of the NSYSU GPU architecture. This figure is from the NSYSU GPU project's second-phase proposal. The four blocks on the top are software source code and data from the programmer. The blue bucket is the FPGA board containing an ARM processor, RAM, and the NSYSU 3D graphics engine.

Figure 2.1 illustrates a simple flow graph describing how the NSYSU GPU works, from game source code to the DRAM on the board, and eventually to generating a frame of the scene in the frame buffer. The arrows and numbers indicate the instruction flow.

Inside the 3D graphics engine there are two shader cores, a vertex shader and a fragment shader, both with programmable pipelines. Between the VS and FS is a rasterizer performing various functions such as culling, viewport trimming, clipping, and fragment generation. There is also an SRAM residing in each core to store the shader executable.

Outside the 3D graphics engine are the frame buffer and DRAM. The DRAM stores the game executable for the CPU and the shader compiler executable that compiles the game's shader code. The result of each frame is placed in the frame buffer and then passed through a DAC to convert the signal for display.

The flow of a generic game running on the device is as follows. First, the game code and the shader compiler written by the game/application programmer are compiled in the factory into executables (binaries) for the target machine, in this case an ARM processor. The shader program is not compiled at this point. After the game executable is loaded into DRAM and the program starts running, the shader compiler compiles the shader programs for the target machine, in this case the NSYSU vertex shader and fragment shader. The compiled shader code then acts like a program in the VS/FS pipeline, manipulating vertices and fragments to create the frame.

The NSYSU SystemC simulator simulates the 3D graphics engine in Figure 2.1 and collaborates with an LLVM-based compiler to compile the shader programs within the source code. Because this simulator is intended to describe the hardware behavior and verify the hardware implementation, it is very slow, which makes it a poor tool for verifying the complex shader programs needed by the compiler team.

The API and device driver are implemented using the MESA infrastructure [7]. This is a collaboration between NSYSU and ITRI [8].

2.2 NSYSU SystemC Simulator

The NSYSU simulator is built from the hardware design of the vertex shader and fragment shader. Both directly model the hardware behavior: an instruction such as add is simulated through instruction fetch, instruction decode, and execution. The instruction is fetched from the simulated SRAM, the decoded command drives the pipeline state, and the ALU then calculates the value. This is cycle accurate but also slow: a red cube with 24 vertices takes 60 seconds to render. That is too slow for compiler verification, let alone for profiling an entire game.

Another problem is that the SystemC simulator only simulates the GPU behavior, not the full system. When a game runs on a computer, the program invokes OpenGL API calls that go through the GPU vendor's driver to set up the OpenGL state table (which holds information such as the modelview matrix, the projection matrix, parameters, and the vertex data). Later, another API call causes this table to be sent to the GPU via the motherboard bus. To perform the entire process we would also need a driver for the NSYSU GPU, but the driver has no connection to the SystemC simulator. Some data and tables are hand coded and sent to the simulator: since the mechanism for passing this data to the SystemC simulator does not exist, the simulator user must manually identify these values from the source code and then initialize the simulator with them. For a simple application such as red cube or morphing ball this is possible, but consider a game running at 30 frames per second and changing the state table every frame: this is beyond what can be hand coded. We need a simulator with an automated process for real-world applications.

Finally, the current SystemC simulator is buggy and performs poorly. If we can provide a reference result, together with a tool that dumps each instruction's values, it can help us debug and fix the current simulator.

2.3 Attila GPU

Attila was developed in 2006. Its goal is to research and develop high-performance GPU architectures for modern GPUs. The Attila GPU adopts recent-generation algorithms and the hardware architecture of rasterization-based GPUs (ATi R580, NVidia G80 and G90). Attila provides not only the hardware model but also software support for popular graphics APIs: originally it came with an OpenGL driver and an Attila driver as a low-level interface layer to the Attila hardware, and support has since been extended with a DirectX 9 driver, which allows a wide variety of games to be traced.

Figure 2.2 shows the specialized-shader version of the Attila architecture, with a Vertex Shader (VS) and a Fragment Shader (FS). It is similar to the NSYSU GPU: both pass data from the VS through the rasterization pipeline and then into the FS. The VS and FS are attached to the memory controller so that shader programs can be loaded onto the shaders. Every block in Figure 2.2 is implemented as a C++ class and comes with a configuration file to control the number of shaders, the memory timing, the memory size, the clocks, and so on. This grants Attila the flexibility to simulate different kinds of GPU; with proper configuration settings, we can use Attila to mimic the NSYSU GPU in the specialized-shader version now, or in the unified-shader version in the future.

Figure 2.2. Overview of the Attila GPU pipeline. Figure modified from the Attila Project, V. M. Del Barrio. The figure shows the Attila architecture with specialized shaders (vertex shader and fragment shader). Data flows from top to bottom, and colors indicate the different pipeline stages. ROPs (Render Output Units) perform the final blending of the pixel image and handle transactions between local memory and the buffers.

2.4 Attila Tracing

While developing a GPU and its drivers, it is hard to find complex benchmarks that fit our needs. The Attila designers encountered this problem as well, and created a tracing tool to fill the gap. Figure 2.3 shows the workflow of the trace tool. It has four stages: collect the tracefile, verify, simulate, and analyze.

Figure 2.3. Attila tracing stages and communication workflow. Picture taken from the Attila Project. Red blocks (GLInterceptor, GLPlayer, ATTILA OpenGL Driver, Attila Simulator) are tools provided by Attila. Green cylinders: the Trace is generated by the trace tool; Statistics and Signal Traffic are files generated by the simulation tool. Blue blocks are external components. Cited from the Attila official website: attila.au.upc.edu

In the collect stage, GLInterceptor acts as an opengl32.dll wrapper that records all incoming OpenGL API calls into a tracefile. While the application is running, the calls are written to the file along with the shader program strings. The trace is a replay file of the graphics actions. Here is an example of a tracefile (Figure 2.4):

[GLSL program binding]
glCreateShader(GL_VERTEX_SHADER)=1
glCreateShader(GL_FRAGMENT_SHADER)=2
glShaderSource(1,1,U0x7076652,U0x7076656)
glShaderSource(2,1,U0x7076652,U0x7076656)
glCompileShader(1)
glCompileShader(2)
glCreateProgram()=3
glAttachShader(3,1)
glAttachShader(3,2)
glLinkProgram(3)
glUseProgram(3)

[Uniform binding]
glGetUniformLocation(3,ambient_material)=2
glUniform4fv(2,1,{1,1,1,1})
glGetUniformLocation(3,diffuse_light)=3
glUniform4fv(3,1,{0.8,0.8,0.8,1})

[Attribute binding]
glBindAttribLocation(3,0,rm_Vertex)
glBindAttribLocation(3,1,rm_Normal)
glBindAttribLocation(3,2,cube_texs)
glVertexAttribPointer(0,3,GL_FLOAT,0,0,*3)
glEnableVertexAttribArray(0)
glVertexAttribPointer(1,3,GL_FLOAT,0,0,*4)
glEnableVertexAttribArray(1)
glDrawArrays(GL_TRIANGLES,0,2880)

Figure 2.4. An example of OpenGL API calls from the red_cube benchmark tracefile. The bracketed groups (GLSL program binding, uniform binding, attribute binding) will be discussed further in Section 3.2. For the moment, the key observation of this figure is simply that the Attila OpenGL driver supported none of these API calls.

As the figure shows, the file contains all of the OpenGL function calls invoked by the application. GLInterceptor is an OpenGL library wrapper that keeps recording the OpenGL function calls and the parameters they use; after the parameters are properly saved, the wrapper calls the real OpenGL library that the GPU vendor provides in the system folder.
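To make the wrapper idea concrete, the following is a minimal sketch of how such an interceptor could record one call and forward it to the vendor library. It is an assumption-laden illustration, not the actual GLInterceptor source; the Windows library path and the single wrapped function are examples only.

// Hypothetical wrapper for one OpenGL entry point: log the call and its parameters
// to the tracefile, then forward to the real opengl32.dll provided by the GPU vendor.
#include <cstdio>
#include <windows.h>

typedef void (APIENTRY *PFN_GLDRAWARRAYS)(unsigned int mode, int first, int count);

static std::FILE* traceFile = std::fopen("tracefile.txt", "a");
static HMODULE    realGL    = LoadLibraryA("C:\\Windows\\System32\\opengl32.dll");
static PFN_GLDRAWARRAYS realDrawArrays =
    (PFN_GLDRAWARRAYS)GetProcAddress(realGL, "glDrawArrays");

extern "C" __declspec(dllexport) void APIENTRY
glDrawArrays(unsigned int mode, int first, int count)
{
    std::fprintf(traceFile, "glDrawArrays(%u,%d,%d)\n", mode, first, count); // record the call
    realDrawArrays(mode, first, count);                                      // forward to the real driver
}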

Once we have the tracefile, we enter the verify stage and replay it with GLPlayer to check that the trace works. GLPlayer still uses the computer's graphics chip to process the tracefile. This stage provides fast verification that the trace was recorded properly and also produces a reference for the later stages.

After the tracefile is verified, we can run it on the Attila simulator. It first passes through the driver layers, which contain the OpenGL driver and the shader compiler. The generated output becomes AGP transactions for the Attila simulator, carrying the operation and instruction control. The Attila simulator then produces the frame of the target scene and dumps the related statistics files for further performance examination.

The Attila OpenGL library transforms fixed-function state into shader code, and about 200 OpenGL API calls are supported. The shader code format is ARB vertex and fragment programs, an OpenGL 1.4 standard. The Attila OpenGL driver does not support the OpenGL ES 2.0 Shading Language (GLSL ES), and OpenGL 1.4 ARB programs [11] have no flow control, which is essential to our compiler development. So we need a way to get past this barrier.

Although Attila does not have a GLSL compiler, we do. With proper modification and conversion, we can adapt the NSYSU assembly to Attila assembly for simulation.

3 Methodology

The goal of this thesis is to use the Attila simulator's ability to trace real-world applications and replay them on the simulator, combined with our shader compiler that supports GLSL, to achieve full-system simulation of benchmarks and games containing GLSL shader programs. We begin with an overview of the entire flow for making Attila support GLSL and OpenGL ES 2.0 data binding.

Figure 3.1. A simplified version of Figure 2.3. The diagram illustrates the flow when running and tracing an OpenGL 1.5 application. The trace cylinder is enlarged to show that both the API function calls and the ARB programs are recorded inside the trace. There are also Buffer Descriptor and Memory Region files that store the pointers and record the vertex data and texture data used by the OpenGL 1.5 application.

Figure 3.1 is essentially Figure 2.3 with the Verify and Analysis stages removed, because they are not related to the simulator conversion. When we trace an OpenGL 1.5 application with an ARB shader program, the Attila GLInterceptor creates three files: the OpenGL API Calls (tracefile.txt), the Buffer Descriptor (bufferdescriptor.dat), and the Memory Region (memoryregion.dat), which together hold all the information needed to replay the scene. The OpenGL API Calls file contains two kinds of records: the OpenGL function calls invoked by the application, and the ARB shader program strings used by the programmable shaders. The Buffer Descriptor keeps the pointers used in the OpenGL API calls as references for accessing the data stored in the Memory Region, which contains the vertex data (attributes) and the textures used by the application.

In the simulation stage, the OpenGL API calls are sent to the Attila driver layer, which sets up the GLState table and the shader registers using the information in the API calls and the Memory Region. The ARB program strings are compiled by the ARB compiler and sent to the simulator's code space, where they wait to be executed.

This is how an OpenGL application works when it uses ARB shader programs. When using the tracing tool on an OpenGL 2.0 application containing GLSL, however, things change: the trace tool provided by Attila can neither record the shader program strings correctly nor compile them. We therefore have to create the proper tools for Attila to support them.

3.1 An Overview of the Converter

Figure 3.2. Overview of the flow that enables Attila to support OpenGL 2.0 and GLSL. The figure shows the trace of an OpenGL 2.0 program containing GLSL; the Attila trace tool has only limited ability to record GLSL source code. The diagram illustrates the additional tools (orange) provided to make Attila support OpenGL 2.0 and GLSL.

Figure 3.2 shows an OpenGL 2.0 application traced by Attila. GLInterceptor records the OpenGL API calls just as it does for an OpenGL 1.5 application, but the ARB part no longer exists, because the application does not use ARB shader programs. In this case we need the GLSL source code, which is fed to our GLSL compiler. Our GLSL compiler creates the assembly for the NSYSU GPU. We want to execute this assembly on the Attila simulator, but the two use different instruction formats, so we provide a GLSL-assembly-to-AttilaASM converter to perform the translation.

Shader programs usually use uniforms as parameters that control behavior inside the shader. To support this parameter passing, we have to make a connection between the OpenGL API calls and the shader program. We therefore need an OpenGL API converter that modifies the original OpenGL API calls so that the desired parameters are passed into the shader program. By combining the information from the OpenGL API and the GLSL compiler, we create a shader compiler table that keeps track of every uniform's name, index, and register, so the converter can decide the correct location of the data. To pass the data into the simulator, instead of writing a new driver, I use the original Attila driver and generate a dummy ARB shader program that loads the constant registers. By doing so, we keep simulating the system communication between the CPU and the GPU. Once we have the complete shader compiler table, we can convert the NSYSU assembly into Attila assembly.

3.2 Data Flow (Attribute, Uniform, Varying)

[Input (attribute) mapping]
mov r32, i0
mov r33, i1
mov r34, i2
mov r35, i3
mov r36, i4
mov r37, i5
mov r38, i6
mov r39, i7

[Code for storing constant registers (uniforms)]
stv r0, c5, 5
stv r0, c4, 6
stv r0, c2, 7
stv r0, c3, 8
stv r0, c0, 9
stv r0, c1, 10
stv r0, c7, 11
stv r0, c9, 12
...

[GLSL assembly]
add r30.x, r0, 0
add r2.x, r0, 31
add r15.x, r0, 1.000
stv r0, r15, 22
add r15.x, r0, -2.5258
...

[Output (varying) mapping]
mov r32, r16
mov o0, r32
mov o7, r33
mov o8, r34
mov o9, r35
mov o10, r36

Figure 3.3. An example of a vertex shader in AttilaASM, with its four sections. Input (attribute) mapping: set up the inputs of the shader program. Code for storing constant registers: store the uniforms held in constant registers into the virtual memory. GLSL assembly: the code generated by the GLSL compiler. Output (varying) mapping: write the position and varyings to the Attila output registers.

An AttilaASM file has four sections: attribute loading code, code for storing constant registers, the GLSL assembly, and the output mapping, as shown in Figure 3.3. There are slight differences between the vertex shader and the fragment shader, but the four sections are the same.

Vertex shader      NSYSU                  ATTILA
Input              r32 – attribute[0]     i0 – attribute[0]
                   ...                    ...
                   r39 – attribute[7]     i7 – attribute[7]
Output             r32 – gl_position      o0 – result.position
                   r33 – varying[0]       o7 – result.texcoord[1]
                   ...                    ...
                   r36 – varying[3]       o10 – result.texcoord[4]

Fragment shader    NSYSU                  ATTILA
Input              r32 – color            i0 – fragment.color
                   r33 – varying[0]       i7 – fragment.texcoord[1]
Output             r0 – color             o0 – result.color
                   r1 – xy
                   r2 – z

Table 3.1. NSYSU/Attila input and output register assignments for the vertex and fragment shaders. For vertex shader input, NSYSU uses register r32 and Attila uses register i0 for attribute[0]. NSYSU uses register r32 for the default gl_position output, while Attila uses register o0. Registers r33 to r36 are mapped to o7 to o10 for the four varyings. The same scheme applies to the fragment shader.

The attribute loading code is where we set up the inputs of the shader program; there is also code for output at the end of the shader program. Table 3.1 shows how we connect Attila registers to NSYSU registers. For example, the default registers for storing attributes 0-7 are r32-r39 in NSYSU and i0-i7 in Attila.

In Figure 3.3, the first eight lines load the corresponding Attila input registers (i0-i7) into NSYSU registers. The Attila input registers are filled with data when the programmer invokes the OpenGL API call glVertexAttribPointer(), as in Figure 2.4. The Attila OpenGL driver layer (AOGL) processes glVertexAttribPointer() and stores the pointer to the vertex data in the GLState table. The GLState table is then sent to the GPU simulator through AGPTransactions, which are sequences of CPU/GPU bus transmissions that push the data into GPU registers. When the GPU simulates the scene, the vertex data streams into the input registers on the shader cores.

The second section of AttilaASM is the code that stores the constant registers into virtual memory. In OpenGL 2.0, the programmer invokes glGetUniformLocation() and glUniform() to load data into a target uniform. For example, in Figure 2.4 the programmer calls:

glGetUniformLocation(3,ambient_material)=2
glUniform4fv(2,1,{1,1,1,1})

The first call, "glGetUniformLocation(3, ambient_material)=2", asks the driver to search for the uniform "ambient_material" in shader program number 3; the function returns 2, which is then used as the index for the ARB shader program. The next line, "glUniform4fv(2,1,{1,1,1,1})", assigns a vector of size 1 with value {1, 1, 1, 1} to uniform location 2. The problem is that the Attila OpenGL driver does not actually support these GLSL data-linking calls.

Instead, we use the older OpenGL API calls intended for ARB programs, together with the shader compiler table generated by our GLSL compiler, to achieve the same functionality. Earlier versions of OpenGL provide the function glProgramLocalParameter4fARB(target, index, x, y, z, w) to set the value of a parameter inside an ARB shader program. The uniform call above therefore becomes glProgramLocalParameter4fARB(GL_VERTEX_PROGRAM_ARB, 2, 1, 1, 1, 1). But this alone is not enough to link the data into the GLSL program, because the uniform name and memory address are still missing.

While the GLSL compiler compiles the shader program, it also generates a shader compiler table alongside the program. Figure 3.4 is an example of the shader compiler table from the morphing_ball benchmark. The original table does not contain the register information shown in the rightmost field of Figure 3.4; the OpenGL API converter, a tool I provide in Figure 3.2, fills in the register field later. The OpenGL API converter also creates a dummy ARB shader program for I/O purposes: by default, the order in which parameters are declared in an ARB shader program is the order in which they are placed in Attila's constant registers. The OpenGL API converter therefore searches the tracefile for each uniform, writes it into the dummy ARB shader program, and updates that uniform's register in the shader compiler table.

Shader Compiler Table
Address   uniform name        size            register
=======================================================
0         shaderr             1      (NULL)   c6
1         timeflag            1      (NULL)   c11
2         NormalMatrix        3      (NULL)   c20
5         light_Pos           1      (NULL)   c5
6         eye_Pos             1      (NULL)   c4
7         diffuse_light       1      (NULL)   c2
8         diffuse_material    1      (NULL)   c3
9         ambient_light       1      (NULL)   c0
10        ambient_material    1      (NULL)   c1
11        specularExp         1      (NULL)   c7
12        specular_material   1      (NULL)   c9
13        specular_light      1      (NULL)   c8
14        ModelViewMatrix     4      (NULL)   c12
18        ProjectionMatrix    4      (NULL)   c16

Figure 3.4. An example of the shader compiler table from the morphing_ball benchmark. The table contains the memory address used by the GLSL compiler, the uniform name, the size of the uniform (in units of vector4), and the corresponding Attila constant register. For example, the uniform timeflag is one vector4, stored at NSYSU memory address 1 and bound to Attila constant register c11.

To make this clearer, recall the example from Figure 2.4 in which the programmer calls "glGetUniformLocation(3,ambient_material)=2". The OpenGL API converter creates a line in the dummy ARB shader program:

PARAM ambient_material = program.local[2]

When the ARB compiler in the Attila driver compiles this code, the data in program.local[2] (which is {1,1,1,1}, set by glProgramLocalParameter4fARB(GL_VERTEX_PROGRAM_ARB, 2, 1, 1, 1, 1)) is sent to Attila constant register c1, because ambient_material is the second parameter declared in the dummy ARB shader program (constant registers start from c0).

Finally, the second section of the code in Figure 3.3 contains the instruction:

stv r0, c1, 10

This instruction moves the data in Attila constant register c1 into memory address 10, the address used by the GLSL compiler for this uniform. If a program wants to use the uniform ambient_material, the GLSL compiler generates code that loads memory address 10. Thus, if we set up the virtual memory I created for Attila with exactly the same addresses as the NSYSU GPU, we do not have to remap the registers used by Attila and NSYSU.
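As a concrete illustration of this chain, the sketch below walks the ambient_material example through the conversion: it emits the PARAM line for the dummy ARB program and the rewritten glProgramLocalParameter4fARB call, using the constant register recorded in the shader compiler table. The table contents and the print-based output are assumptions for illustration; the real OpenGL API converter works on the tracefile and table files directly.

// Sketch of the per-uniform rewrite performed by the OpenGL API converter
// (illustrative data; not the converter's actual source).
#include <cstdio>
#include <map>
#include <string>

struct UniformEntry { int arbIndex; std::string constReg; };  // ARB local index and Attila register

int main()
{
    // Built while scanning the tracefile and the shader compiler table:
    std::map<std::string, UniformEntry> table = { {"ambient_material", {2, "c1"}} };

    // Trace: glGetUniformLocation(3,ambient_material)=2 ; glUniform4fv(2,1,{1,1,1,1})
    const std::string name = "ambient_material";
    const float v[4] = {1, 1, 1, 1};
    const UniformEntry& u = table[name];

    // 1) Declaration added to the dummy ARB vertex program; declaration order decides
    //    which constant register (here c1) the ARB compiler will bind it to.
    std::printf("PARAM %s = program.local[%d]\n", name.c_str(), u.arbIndex);

    // 2) Replacement API call written into the converted tracefile.
    std::printf("glProgramLocalParameter4fARB(GL_VERTEX_PROGRAM_ARB,%d,%g,%g,%g,%g)\n",
                u.arbIndex, v[0], v[1], v[2], v[3]);
    return 0;
}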

The next section is the GLSL assembly. This is the main body of the GLSL shader program produced by the GLSL compiler, but it is not exactly the code that our compiler emits: several problems have to be fixed or modified in order to render the correct scene. Some NSYSU instructions must be implemented as combinations of multiple Attila instructions, and the Attila simulator itself also needs to be augmented to support memory loads and stores.

The last section is the assembly for the shader outputs. The output of the fragment shader is simple: according to Table 3.1 we only need to move r0 (NSYSU's default register for the color output of the fragment shader) into o0 (Attila's default register for result.color). For the vertex shader, I use the otherwise redundant texture-coordinate registers to pass the varyings. For example, texcoord[1] of the ARB programming language uses input register i7 of the fragment program, and it now stands for the first varying of the GLSL fragment program. This concludes the data-linking path between the Attila trace, the simulator, and the NSYSU assembly.

Besides the data linking, several categories of problems need to be addressed in order to achieve full-system simulation.

The first category is OpenGL-driver-related problems, involving the OpenGL built-in gl_ModelViewMatrix and gl_ProjectionMatrix and an Attila texture bug.

The second category is shader-instruction-related problems, involving NSYSU GPU instruction conversion, load/store instruction support, memory setup, and control flow.

The third category is miscellaneous problems such as mask/swizzle/select translation, float and integer format translation, viewport code trimming, and some Attila configuration changes.

The following sections proceed top-down, from the modification of the OpenGL API to the low-level assembly conversion. A short list of the problems is given in Figure 3.5.

Here is a short list of the problems.

1. Attila OpenGL API Driver modification

i. ModelViewMatrix ProjectionMatrix transpose

ii. OpenGL texture bug in Attila

2. Converter for NSYSU to Attila

i. Create shader compiler table

ii. Transcode

iii. Create adjustPC table

iv. Loading shader compiler table

v. Attribute / Varying register setup

vi. Memory setup

vii. Instruction conversion

viii. Output registers setup

ix. Viewport code trim

3. LDV/STV (load & store vector instructions) & Memory design

4. Misc.

i. Mask/Swizzle/Select

ii. Float & integer format

iii. Expand code size, disable optimizations

Figure 3.5. A list of problems encountered while implementing the converter. The first part lists the problems encountered in the API conversion and a bug in the Attila simulator. The second part lists the problems encountered in the assembly conversion and the modules that solve them, in order. The third is the implementation of the load and store instructions. The fourth covers miscellaneous problems.

3.3 Attila OpenGL API Driver Modification

The first conversion problem we encountered is that Attila's built-in matrices, such as the ModelviewMatrix and ProjectionMatrix, are stored transposed compared to the NSYSU GPU. A programmer who uses the built-in matrix setup calls will therefore get an incorrect matrix product when running the assembly code generated by our GLSL compiler.

The cause of the error is that, by default, Attila mathematically places the matrix in front of the vector when computing the product, whereas the code generated by our GLSL compiler does the reverse: matrices are popped from the stack and placed behind the vector. Figure 3.6 illustrates a simple GLSL statement that calculates gl_position as the product of the vertex coordinate and the modelview matrix.

\[
\begin{bmatrix} x & y & z & w \end{bmatrix}
\cdot
\begin{bmatrix}
1 & 2 & 3 & 4 \\
5 & 6 & 7 & 8 \\
9 & 10 & 11 & 12 \\
13 & 14 & 15 & 16
\end{bmatrix}
=
\begin{bmatrix}
1 & 2 & 3 & 4 \\
5 & 6 & 7 & 8 \\
9 & 10 & 11 & 12 \\
13 & 14 & 15 & 16
\end{bmatrix}^{T}
\cdot
\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}
\]

gl_position = ModelViewMatrix * rm_Vertex

Figure 3.6. Comparison of matrix placement between Attila and NSYSU. The left-hand side of the equation shows how NSYSU computes the product; the right-hand side gives the same result but places the vector behind the matrix, as Attila does. To obtain the same correct result, we have to transpose the matrix stored in Attila.

The solution is to change the way Attila stores the OpenGL built-in matrices. To do so, we tap into the Attila OpenGL driver layer and transpose the built-in matrices in the OpenGL state table right when they are sent to the simulator. If we modified the matrix in the GLState (OpenGL state table) itself, a program that reuses the previous state would become wrong; and if we modified the matrix-resolving function, a program that uses generic matrices (set through the OpenGL matrix setup calls rather than through uniforms) would become wrong.

My solution is to transpose both the Modelview and Projection matrices in the GLState right before the final constant binding. I transpose only the first ModelViewMatrix, so programs that use extra ModelView matrices are not affected; this can easily be extended by modifying ACDX.cpp line 841.
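The change itself amounts to transposing a 4x4 matrix in place before the constant binding. The snippet below is only a sketch of that operation, assuming a row-major float[16] layout; the actual modification lives inside Attila's ACDX driver code.

// In-place transpose of a 4x4 matrix stored as 16 consecutive floats,
// applied to the ModelView/Projection matrices before constant binding.
#include <utility>

void transpose4x4(float m[16])
{
    for (int r = 0; r < 4; ++r)
        for (int c = r + 1; c < 4; ++c)
            std::swap(m[r * 4 + c], m[c * 4 + r]);   // mirror across the main diagonal
}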

Another bug was found while implementing the texture API in the trace. If a texture unit is present in the shader program and a draw command is triggered before the API binds the texture to the shader, the simulator crashes, even when the specific section of code that uses the texture unit can never be executed (it sits in an if-statement that never fires in that draw call). This happens because ARB programs have no control flow, so this situation could never arise with an ARB program.

My solution is to move one dummy texture bind in front of the first frame, to prevent the uninitialized memory access that leads to the simulator crash.

3.4 Converter for NSYSU to ATTILA Assembly

The primary objective of the converter is to translate the assembly that our GLSL compiler generates for the NSYSU GPU into instructions that Attila understands. The assembly converter performs the following nine steps:

1. Create shader compiler table

2. Transcode

3. Create adjustPC table

4. Loading shader compiler table

5. Attribute / Varying register setup

6. Memory setup

7. Instruction conversion

8. Output registers setup

9. Viewport code trim

Step 1 creates the shader compiler table provided by our GLSL compiler, together with the constant-register information given by the API converter. The table is saved into an array for the later instruction conversion.

Step 2 unifies the form of the NSYSU assembly. Two jobs are done in this step. The first is to change all NSYSU instructions from upper case to lower case and to trim off the PC, binary code, immediate bit-format, and vector-identifier fields. The second is to change all immediates into float format. The NSYSU disassembly uses different representations for integer and floating-point values, for example:

PC: 12 (96) 51E06010d3EDA000: LDI R15.1000, -1071536165 (-2.525870)

PC: 40 (320) 51E0202000001280: LDIF R15.0100, R0, 3.000000 (1077936128)

For the LDI instruction, the value in decimal (float) format is inside the parentheses, but LDIF puts the float value in the third field of the three-address code. The transcode step unifies these instructions to:

ldi r15.1000, -2.525870
ldif r15.0100, r0, 3.000000

The transcode step also trims off the irrelevant fields mentioned above.

Step 3 creates the adjustPC table. The purpose of this table is to supply program counter (PC) information when converting control-flow-related instructions such as BEQ and JMP. We cannot reuse the jump addresses of the NSYSU assembly, for two reasons: first, NSYSU JMP instructions use absolute addresses while the Attila JMP instruction uses relative addresses; second, the instruction conversion expands certain NSYSU instructions from one instruction into several, which changes the relative addresses. The adjustPC table therefore records, for each original program counter in the NSYSU disassembly, its position after shifting for every instruction that was expanded into multiple lines. We look up this table when converting control-flow instructions such as BEQ and JMP and calculate the correct new address to branch or jump to.

Table 3.2. NSYSU-to-Attila instruction map used in the morphing ball benchmark. Several NSYSU instructions are implemented by the same Attila instruction; for example, both mulf and mul are implemented by Attila's mul instruction. Some NSYSU instructions are implemented by a combination of Attila instructions; for example, div is implemented by the combination of Attila's rcp and mul instructions.

Step 4 loads the previously created shader compiler table into an array.

Steps 5 and 6 create the first two sections of code shown in Figure 3.3: the input mapping and the storing of constant registers into memory.

Step 7 converts the NSYSU instructions into Attila instructions and creates the main body of the shader program (the GLSL assembly in Figure 3.3). Table 3.2 lists the instructions used in the morphing ball benchmark. When converting from NSYSU to Attila, not every instruction maps one-to-one: some NSYSU instructions are not supported by the Attila instruction set and must be converted into a combination of instructions that accomplishes the same functionality. For example, the NSYSU DIV instruction has to be converted into two Attila instructions: an RCP instruction first takes the reciprocal of the divisor, and a MUL instruction then multiplies the dividend by that reciprocal. For control flow, BEQS (branch if equal) is implemented by the combination of STPEQI (set predicate register when equal) and JMP: STPEQI sets the predicate register p0 if the two operands are equal, and JMP then checks p0 and, if it is true, jumps to PC + [relative address]. The relative address is not necessarily the same as the immediate of the NSYSU instruction, because instructions within the jump distance may have been expanded into multi-line sequences, which changes the relative distance between instructions. We therefore look up the adjustPC table from Step 3 to obtain the correct relative address; a sketch of this expansion is shown below.
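The following sketch illustrates both expansions and the adjustPC lookup. Register names, the predicate syntax, and the table contents are illustrative assumptions; the real converter emits Attila mnemonics from the parsed NSYSU disassembly.

// Sketch of two one-to-many instruction expansions and the PC-relative fix-up.
#include <cstdio>
#include <vector>

int main()
{
    // div r3, r1, r2  ->  reciprocal of the divisor, then multiply.
    std::puts("rcp r31, r2");
    std::puts("mul r3, r1, r31");

    // beqs r4, r5, <absolute target>  ->  set predicate p0, then a predicated jump.
    // Attila jumps are PC-relative, and earlier expansions shift the code, so the
    // offset is recomputed through the adjustPC table (original PC -> expanded PC).
    std::vector<int> adjustPC = {0, 1, 3, 4, 6};  // e.g. original PCs 1 and 3 each expanded into two lines
    int sourcePC = 1, targetPC = 4;
    int relative = adjustPC[targetPC] - adjustPC[sourcePC];
    std::printf("stpeqi p0, r4, r5\n");
    std::printf("jmp p0, %d\n", relative);
    return 0;
}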

Step 8 creates the last section of code, the output (varying) mapping, in Figure 3.3.

Step 9 trims off the viewport code used by the NSYSU vertex shader. This code is unnecessary for Attila, so we have to detect it and delete it from the source assembly. The detection is a peephole scan through the source assembly using a shell sed command:

sed -n 'N;N;N;N;N;:L;N;/mov r16.x, r15.wyzw\nnop\nmov r17, r15.xyzw\nnop\nrcp r16, r16\nnop\nmul r15, r15.xyzw, r16.xyzw/=;s/^[^\n]*\n//;bL'

This sed command finds the specific pattern 'mov, nop, mov, nop, rcp, nop, mul'; once it matches, the following commands trim off the next 30 lines and the previous 19 lines, completing the viewport code trim.

3.5 Load / Store instructions & Memory Design

The realization of the load and store instructions (LDV/STV) follows the guide on the Attila official website. Here is the list of steps I followed to add load and store support to the Attila emulator:

1) Add the new opcodes 0x38 (LDV) and 0x39 (STV) to the ShOpcode enum type in ShaderInstruction.h

2) Add the new opcodes 38h and 39h to the translateShOpcodeTable table in ShaderInstruction.cpp

3) Add the new disassembled names ldv and stv to the shOpcode2Str table in ShaderInstruction.cpp

4) Specify that LDV and STV are integer instructions in the ShaderInstruction(u8bit *code) function in ShaderInstruction.cpp

5) Add the LDV and STV opcodes to the setNumOperands function, specifying two input operands, in ShaderInstruction.cpp

6) Declare that LDV and STV are scalar instructions in the setIsScalar function in ShaderInstruction.cpp

7) In the setIsSOACompatible function in ShaderInstruction.cpp, treat LDV and STV as scalar instructions when a scalar write mask is used

8) Add LDV and STV to the setHasResult function in ShaderInstruction.cpp if required

12) Declare the emulation functions shLDV and shSTV for the new instructions in ShaderEmulator.h

13) Implement the emulation functions shLDV and shSTV for the new instructions in ShaderEmulator.cpp

14) Add shLDV and shSTV to the shInstrEmulationTable table in ShaderEmulator.cpp

15) Add the throughput and latency of the new instructions to the *RepeatRateTable and *ExecLatencyTable tables defined in ShaderArchitectureParameters.cpp. In this step I followed the format of the ADD instruction, which does not accurately emulate the timing behavior of memory-access instructions.

(Steps 9-11 of the guide are omitted here.)

Besides the implementation of the load and store instructions, we also need to build a virtual memory that emulates the SRAM of the NSYSU GPU for our GLSL assembly. At first the memory was accessed globally, which led to image degradation when the unified shaders ran in parallel: a result stored at a given memory location by a previous shader invocation was overwritten by the current one. The solution is to assign an independent virtual memory to each core to protect data integrity. A sketch of the emulation functions and the per-core memory appears below.
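A minimal sketch of what the two emulation functions and the per-core memory could look like is given below, under assumptions about the surrounding interfaces: the real functions live in ShaderEmulator.cpp and operate on Attila's own register and QuadFloat types, so every name here is illustrative.

// Sketch of LDV/STV emulation with a private virtual memory per shader core,
// so parallel unified shaders cannot overwrite each other's intermediate values.
#include <array>
#include <cstddef>

struct Vec4 { float x, y, z, w; };

class ShaderCoreMemory {
    std::array<Vec4, 512> mem{};                     // SRAM-like storage local to one core
public:
    void store(std::size_t addr, const Vec4& v) { mem[addr] = v; }
    Vec4 load (std::size_t addr) const          { return mem[addr]; }
};

// ldv rDst, addr : load a vector from the core-local memory into a register.
void shLDV(ShaderCoreMemory& m, Vec4& dst, std::size_t addr) { dst = m.load(addr); }

// stv rSrc, cN, addr : store the given register into the core-local memory.
void shSTV(ShaderCoreMemory& m, const Vec4& src, std::size_t addr) { m.store(addr, src); }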

3.6 Miscellaneous

Miscellaneous problems such as mask/swizzle/select translation must be handled in order to change the NSYSU vector-operation format into the Attila format; as a brief example, NSYSU writes r15.0101 as a mask while Attila writes r15.yz (a sketch of this kind of translation follows below). We also need to expand the code-space size of the Attila simulator, because the code space is easily exceeded by GLSL assembly; I currently set the code size to 2 Kbytes, and the size can be changed at line 48 of Attila's assembler.cpp. All instruction optimizations of Attila are disabled to avoid incorrect control flow and memory accesses.
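The sketch below shows the kind of string translation involved; it assumes that the i-th mask bit selects the i-th of x, y, z, w (the actual NSYSU bit ordering may differ), so it illustrates the mechanism rather than the converter's exact rule.

// Translate an NSYSU bitmask such as "1000" into an Attila-style component suffix
// such as ".x", assuming bit i corresponds to the i-th of x, y, z, w.
#include <cstddef>
#include <iostream>
#include <string>

std::string maskToSwizzle(const std::string& bits)
{
    static const char comp[4] = {'x', 'y', 'z', 'w'};
    std::string out;
    for (std::size_t i = 0; i < 4 && i < bits.size(); ++i)
        if (bits[i] == '1') out += comp[i];          // keep only the enabled components
    return out;
}

int main()
{
    std::cout << "r15." << maskToSwizzle("1000") << "\n";   // r15.x  (cf. ldi r15.1000 ...)
    std::cout << "r15." << maskToSwizzle("0100") << "\n";   // r15.y  (cf. ldif r15.0100 ...)
}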

4 Performance Comparison and Result


Figure 4.1. Comparison of red cube on different platforms. (a) The result from a vendor GPU. (b) The NSYSU GPU. (c) The Attila simulator.

Red cube is a very simple benchmark containing 24 vertices and no texture; it uses a user-defined ModelviewMatrix in GLSL. We compared the performance of the Attila simulator against the NSYSU SystemC simulator using the same GLSL assembly for red cube. Running on an Intel Core 2 Quad 2.4 GHz CPU, the NSYSU SystemC simulator takes 62 seconds to render each frame, while the Attila simulator with the same configuration takes 0.13 seconds per frame. The Attila simulator is therefore 477 times faster than the NSYSU SystemC simulator.


Figure 4.2. Comparison of morphing ball on different platforms. (d) The result from the SystemC simulator. (e) The Attila simulator. (f) The difference map; the PSNR is 35.062 dB.

A more complicated benchmark, morphing ball, contains a 24-vertex surface with a wooden texture and a colored ball of 2880 vertices that bounces over the floor under the control of GLSL. Figure 4.2 shows the morphing ball benchmark run on the NSYSU SystemC simulator and on the Attila simulator. The colors produced by the Attila simulator deviate slightly from the SystemC result; comparing the two images with the PSNR metric gives a signal-to-noise ratio of 35.062 dB. The majority of the difference is caused by the floating-point accuracy in the tracefile. Speed-wise, the SystemC simulator takes more than 30 minutes to render each frame, while the Attila simulator needs only 5.05 seconds per frame using the assembly produced by the converter provided in this thesis. The performance can be improved further by removing the NOP instructions (originally added to prevent hazards in the NSYSU GPU pipeline) and by optimizing the matrix transpose; the best result for morphing ball on GLSL Attila reaches 1 frame per second. Table 4.1 compares the SystemC simulator and the Attila simulator for GLSL.

FPS (frames per second)   SystemC    GLSL Attila   Speed up
Red cube                  0.0167     4.811         288.1
Morphing ball             0.00056    0.198         353.57

Table 4.1. Comparison between the SystemC simulator and the GLSL Attila simulator. For red cube, the SystemC simulator renders 0.0167 frames per second and GLSL Attila renders 4.811 frames per second, a speedup factor of 288.1. For morphing ball, SystemC renders 0.00056 frames per second while GLSL Attila renders 0.198 frames per second, a speedup factor of 353.57.

From Table 4.1 we can observe that the more complex the benchmark, the better the speedup. GLSL Attila is therefore an ideal tool for verifying and profiling complex benchmark programs.

4.1 GLBenchmark

Figure 4.3. Comparison between a vendor GPU and the GLSL Attila simulator. The top image is the result from the GLSL Attila simulator; the bottom image is the result from an NVidia GTX 480 graphics card. The palm tree, actor, and statue in this frame differ slightly from the correct scene.

GLBenchmark is a popular 3D benchmark for embedded systems that uses OpenGL ES 2.0. It is a real-world application that many online media outlets and reviewers use as a reference for evaluating the performance of products. We are currently working on running GLBenchmark on the Attila simulator with the assembly converted from our GLSL compiler. The results generated by the Attila simulator can later be used as golden values when testing the hardware, so the correctness and functionality of the simulator in recreating GLBenchmark are vital to this research project.

Running GLBenchmark on Attila with our GLSL compiler already produces results that are similar to the reference, though not yet perfect; Figure 4.3 shows the current output of the new GLSL Attila simulator. To fully simulate GLBenchmark, more work is still needed on the Attila driver to support several API features that the Attila OpenGL driver does not yet handle: Framebuffer Objects (FBO), texture cube maps, and matrix arrays in GLSL.

5 References

[1] Liang-Bi Chen, Ruei-Ting Gu, Wei-Sheng Huang, Chien-Chou Wang, Wen-Chi Shiue, Tsung-Yu Ho, Yun-Nan Chang, Shen-Fu Hsiao, Chung-Nan Lee, and Ing-Jer Huang. An 8.69 Mvertices/s 278 Mpixels/s Tile-based 3D Graphics SoC HW/SW Development for Consumer Electronics. Proc. of the 2009 IEEE/ACM Asia and South Pacific Design Automation Conference (ASP-DAC'09), Yokohama, Japan, pp. 131-132, Jan. 2009.

[2] Aaftab Munshi and Jon Leech. OpenGL ES Common Profile Specification Version 2.0.25 (Full Specification), 2010. http://www.khronos.org/registry/gles/specs/2.0/es_full_spec_2.0.25.pdf

[3] The OpenGL ES Shading Language. http://www.khronos.org/registry/gles/specs/2.0/GLSL_ES_Specification_1.0.17.pdf

[4] GLBenchmark. http://gfxbench.com/result.jsp

[5] V. del Barrio, C. Gonzalez, J. Roca, A. Fernandez, and E. E. ATTILA: A Cycle-Level Execution-Driven Simulator for Modern GPU Architectures. March 2006.

[6] The LLVM Compiler Infrastructure. http://llvm.org

[7] Mesa 3D. http://www.mesa3d.org

[8] ITRI EGL1.4 & OGL ES 2.0 API Function List Documentation; ITRI NSYSU GPU Device Driver Documentation; ITRI MDK Platform System Test Environment Documentation.

[9] ARB Vertex Program specification. http://oss.sgi.com/projects/ogl-sample/registry/ARB/vertex_program.txt

[10] ARB Fragment Program specification. http://oss.sgi.com/projects/ogl-sample/registry/ARB/fragment_program.txt

[11] Shader Assembly Language (ARB/NV) Quick Reference Guide for OpenGL. http://www.renderguild.com/gpuguide.pdf

[12] Attila Shader ISA table. http://attila.ac.upc.edu/wiki/index.php/ATTILA_Shader_ISA_Public
