Eindhoven University of Technology

MASTER

A flexibility metric for processors

Huang, S.

Award date: 2019


Department of Electrical Engineering
Electronic System Group

A Flexibility Metric for Processors

Master Thesis Report

Shihua Huang 1028224

Supervisors: Luc Waeijen Henk Corporaal Kees van Berkel

Version 1

Eindhoven, February 2019

Abstract

In recent years, the substantial growth in computing power has encountered a bottleneck, as the miniaturization of CMOS technology reaches its limits. The exponential scaling described by Moore's law no longer occurs. In the coming years of computing, new advancements have to be made on the architectural side. To advance the state of the art, we need a good understanding of the various trade-offs in architectural design. One property people refer to is flexibility. The claim that tradeoffs exist between flexibility and performance/energy efficiency has been used frequently and in diverse contexts. Processor flexibility, however, has not been properly defined, and as such these claims can neither be validated nor exploited. It is therefore impossible to conduct a quantitative flexibility comparison among modern processors such as ASICs, FPGAs, DSPs, GPUs, and CPUs. The lack of such a measure prevents designers from obtaining a deeper understanding of this property and from characterizing its relation to other features of a processor, such as performance and energy efficiency.

In this work, a quantifiable flexibility metric is proposed at the application level, making it applicable to processors with diverse architectures and allowing a valid quantitative comparison between processors. Using a novel approach to normalize applications based on their intrinsic workload, the proposed flexibility metric is applied to 24 platforms across CPUs, GPUs, DSPs, and FPGAs. More importantly, the obtained results confirm the tradeoffs between flexibility and performance/energy/area efficiency, and hint at underlying relations between other properties such as area and energy efficiency. This work aims to provide a starting point for understanding processor flexibility, and to raise awareness and discussion for future designs.

Preface

Hi, this is a flexible preface. The topic is flexibility, but not completely related to processors, because my master thesis is not only full of flexible or inflexible processors, but also persons. So here I will introduce the flexible and inflexible persons in my thesis story.

The preface starts with my motivations for selecting this project. I liked the course Embedded Computer Architecture (ECA) from the Electronic System group, which provided me with a starting point in the field of computer architecture. Another reason is certainly the project itself. I still remember the first meeting with Luc, my thesis mentor. He told me that the word "flexible" can also be an adjective for processors, not in the sense of bending them. Then he started the "story" about how diverse processors may differ in flexibility. He definitely did a good job at selling his "story" to me.

Luc is really amazing; he really knows everything. In this way, he is really flexible. Maybe his hand is able to reach his toe? He told me a lot of "stories" during the whole thesis, and sometimes it was kind of hard to stop him from immersing himself. Of course I also enjoyed the stories. I always wonder whether he has a different brain setup, designed to be so general-purpose. However, general-purpose processors are claimed by most to have high energy consumption. So here comes the question: how can this brain setup be powered sufficiently by only eating bread for lunch? Who knows! Anyway, Luc is the representation of a flexible person I met during my thesis. To conclude, he is like a CPU or FPGA, which is extremely flexible.

Compared to Luc, my friend UDID, a sleepy Magikarp, is extremely inflexible. Six months of observations revealed that the reason she is always sleepy is only eating bread! Indeed, I am insulting her unhealthy diet. Besides, she also has an incredible brain setup, which 90% of the time is busy with processing but produces no output. That is truly a time-consuming process with significant energy overhead. Half a year ago, it was really challenging for me to communicate with her. Mostly when you are saying something, she might keep quiet or will give a dot "." to the conversation. Such an awkward, inflexible physics girl! Anyhow, I did enjoy her mostly silent company during the whole thesis. With my efforts, she eventually talked a lot more than before. Impressive! So the conclusion is: UDID is the least flexible ASIC? However, this special ASIC does not process anything useful and mostly gets stuck in deadlocks.

And me? I consider myself a domain-specific accelerator, a GPU: a bit more flexible than the UDID ASIC and of course less flexible than the Luc CPU. This GPU could have high performance and efficiency but low flexibility. Its goal is to become as flexible as the Luc CPU!

During the whole thesis, every day was a fruitful day for me. Due to decent scheduling and mapping, I could efficiently master new knowledge and handle different challenges. The "incredible" efficiency resulted in an interesting misunderstanding. One day Luc tried to convince me to spend less time on study and more time enjoying my life, since he thought I worked so hard every day and even during weekends. To be honest, I was completely confused at that moment, since the thesis period was the most relaxing time of my master phase. However, this misunderstanding is also warm-hearted. To conclude, I am definitely a lovely pink GPU with high energy efficiency and performance.
In the future, I will work hard at enriching myself and practicing my story skills in different domains, eventually becoming as flexible as the Luc CPU. Dear Henk and Kees, I appreciate your help for my thesis. Are you reading this line? If so, you must be a flexible person as well, congratulations! I hope you like my flexible preface. If not, never mind, have a nice day! :p

Contents

1 Introduction
  1.1 Background
  1.2 Motivation

2 Literature Research
  2.1 Flexibility in Other Fields
    2.1.1 Manufacturing Systems
    2.1.2 Networks
    2.1.3 Power Systems
  2.2 Flexibility of Processors
    2.2.1 Single-ISA Heterogeneous Processors
    2.2.2 Computer Architecture
    2.2.3 Processor Versatility

3 Flexibility as a New Measure
  3.1 Flexibility Definition
  3.2 Flexibility Measure
    3.2.1 Additional Properties of GM and GSD
  3.3 Flexibility Framework
  3.4 Data Normalization
    3.4.1 Normalize to Dataset
    3.4.2 Normalize to Baseline
    3.4.3 Summary of Normalization

4 Experiment Setup
  4.1 Benchmarks
  4.2 Platforms
    4.2.1 CPU
    4.2.2 GPU
    4.2.3 FPGA
    4.2.4 DSP
  4.3 Compiler Directives
    4.3.1 CPU
    4.3.2 FPGA
    4.3.3 DSP

5 Implementation
  5.1 CPU & GPU
  5.2 FPGA
  5.3 DSP
  5.4 Intrinsic Workload Estimator
    5.4.1 IR Interpreter
    5.4.2 Intrinsic Transistors

6 Methodologies
  6.1 Flexibility
  6.2 Secondary Metrics
    6.2.1 Performance
    6.2.2 Energy Efficiency
    6.2.3 Area Efficiency
  6.3 Parallelism
  6.4 Approximation

7 Results
  7.1 Flexibility Analysis
    7.1.1 Native Flexibility Results
    7.1.2 Compiler Directives
  7.2 Other Results
    7.2.1 Performance vs Parallelism
    7.2.2 Area Efficiency vs Energy Efficiency

8 Conclusions

9 Future Recommendations

Bibliography

Appendix

A LLVM Pass: libDynCountPass.so

B Operations in Verilog For Synthesis

C Measurement Results

Chapter 1

Introduction

In this chapter, the constraints faced by processor designers today are discussed, and the importance of introducing a quantifiable flexibility metric for processors is emphasized.

1.1 Background

Over the past years, the explosive growth of semiconductor integration has revolutionized the modern world. As predicted by Moore's law in the 1960s, the computing power of processors has grown exponentially, approximately doubling every 18-24 months [1]. This incredible growth benefits from the miniaturization of CMOS technology, enabling billions of transistors on a single chip. As this technology has reached its limits, the complexity of computer architecture has increased dramatically in order to keep up with Moore's law and achieve superior performance, for example through parallelism at all levels of the hardware/software stack.

In the case of a single processor with a Single Instruction Single Data (SISD) architecture, extending it with processing units (PUs), floating point (FP) hardware, and a single instruction multiple data (SIMD) instruction set can turn it into a SIMD machine. In general, vector and SIMD machines exploit fine-grained and instruction-level parallelism, but pay a significant price in architectural and programming complexity. The development of such instruction-level or thread-level parallelism inevitably introduces overhead, as the increase in speed does not scale linearly with the number of PUs or processors. Certainly, employing more PUs or dedicated hardware increases performance when executing certain types of applications where instruction-level parallelism can be fully exploited. However, the processor loses the capability to equally support other types of applications, which barely benefit from SIMD extensions, and thereby becomes less flexible. In addition, for those applications this additional hardware performs no computations on data and causes an energy overhead, reducing energy efficiency.

When considering energy efficiency in processor design, another architectural feature, flexibility, is frequently invoked. As examined by Hameed et al., there exists a tradeoff between flexibility and energy efficiency in state-of-the-art architectures [2]. Designing ASICs for special-purpose applications delivers far higher performance under tight area and energy consumption budgets. An ASIC can achieve 2-3 orders of magnitude higher energy efficiency and performance than General-Purpose Processors (GPPs), such as CPUs [1]. However, due to such a high level of specialization to its applications, an ASIC can hardly execute other types of applications, exposing its inflexibility. In contrast, GPPs are designed to perform moderately well regardless of application type. Thus, GPPs qualitatively outperform ASICs in terms of flexibility. This seems to confirm the tradeoff between flexibility and performance/energy efficiency.


1.2 Motivation

Properties like performance, power, and energy efficiency have been defined in a quantitative manner. For instance, energy efficiency is normally measured in performance per watt, where performance is inversely proportional to execution time. With these metrics, quantitative comparisons can be conducted among different architectures. In terms of flexibility, there exist various definitions and synonyms in the field of computer architecture, such as adaptability, programmability, generality, and versatility. When taking flexibility as the adaptability of processors to new computations, a programmable processor that can be reused across applications is the most flexible. On the other hand, a processor with fixed logic, such as an ASIC, cannot be adapted, exposing its inflexibility [3]. Among programmable processors, field programmable devices are claimed by some to be less flexible than software programmable processors due to their limited programmability [4]. However, from another perspective, if flexibility is evaluated only by how well a processor can support different applications, FPGAs would be the most flexible, as any hardware, including DSPs, GPUs, and CPUs, can be instantiated on an FPGA.

The fact that the qualitative arguments about processor flexibility vary this much can be explained by the lack of a common understanding of what processor flexibility is, along with the absence of a proper way to quantify flexibility. Researchers can only compare flexibility in a qualitative manner. Karuri et al. compare processor alternatives for current and future Systems on Chip (SoCs) by qualitatively positioning these processors in terms of performance, power dissipation, and flexibility [4]. Fasthuber et al. state that flexibility and energy efficiency are conflicting design goals [5]. Besides, a number of approaches and designs have already been attempted to balance energy efficiency and flexibility in specialized computing, such as the transition to homogeneous and heterogeneous multi-core systems [2]. A heterogeneous processor with a diverse set of cores is considered to have greater flexibility and better energy efficiency [6]. Although researchers make claims about flexibility, a valid common comparison cannot be performed without a quantifiable flexibility metric. In this work several candidate definitions are discussed, and then a quantifiable flexibility metric is proposed which is applicable to diverse architectures.

The underlying motivation for a quantifiable flexibility metric is to give a common notion and measure to processor flexibility. With this metric, hypothetical relations between flexibility and other properties can be further investigated. In product research and development, computing platforms are required to sufficiently support new or updated algorithms, as algorithms change at a striking speed. A flexibility metric can help system architects to design and select computing platforms, whilst improving the understanding of which technical concepts lead to better flexibility. The main contributions of this work are:

• Proposal of a quantifiable flexibility metric.
• Flexibility measurements of 24 platforms on 14 benchmarks.
• A novel approach to normalize applications based on their intrinsic workload.
• An open source tool to automate the extraction of intrinsic workload for arbitrary programs.
• An examination of the tradeoffs between flexibility and other properties.

The remainder of the report is organized as follows. Chapter 2 discusses the existing flexibility definitions in literature. Chapter 3 introduces a new metric for processor flexibility and a normalization method based on the intrinsic workload of applications. Chapter 4 explains the experiment setup. The implementations on the diverse platforms and the workload estimator are described in Chapter 5. The applied methodologies and the flexibility results are analyzed in Chapters 6 and 7. Finally, concluding remarks and future recommendations are provided in Chapters 8 and 9.

Chapter 2

Literature Research

In the field of computer architecture, few studies have striven to define and quantify processor flexibility. Therefore, this chapter starts by discussing the existing flexibility definitions in other fields. Next, several candidate definitions for processor flexibility are discussed.

2.1 Flexibility in Other Fields

2.1.1 Manufacturing Systems

For manufacturing systems, flexibility is defined as the sensitivity of the system to changes [7]: the lower the sensitivity, the higher the flexibility. A flexible manufacturing system is claimed to be less sensitive to changes. Two distinct and generic viewpoints on flexibility are reviewed by Chryssolouris [7]. The first viewpoint considers flexibility as an intrinsic attribute of systems, represented as a function of the available characteristics of the manufacturing system. For instance, the entropic flexibility index (FI) measures system flexibility based on the available choices that the system provides. The second viewpoint considers flexibility as a relative attribute which depends not only on the manufacturing system itself, but also on the external demands placed upon it. For example, the flexibility of a system or machine can be measured relative to a reference task set [8]. In conclusion, the first viewpoint obviates the need for external demands, but thereby also ignores the actual variations introduced by those demands, so the obtained flexibility value may be meaningless. Hence, the second viewpoint, which considers external demands, is preferable. Applying the second viewpoint to processors decouples system flexibility from the internal complexity of the system, which makes it relatively easy to measure.

2.1.2 Networks

For networks, flexibility refers to the ability to support new requests, e.g. changes in requirements or new traffic distributions [9]. Kellerer et al. challenge the network with "new requests" to examine network flexibility. The proposed flexibility metric is based on an architecture and a network topology, not specific to certain types of systems. This approach quantifies system changes above the basic architecture level. The metric defined for network flexibility is formalized in Eq. 2.1 [9].

\text{Network flexibility} = \frac{|\text{supported new requests within time threshold } T|}{|\text{total given new requests}|} \quad (2.1)

The network flexibility defined by this equation lies within the interval [0, 1], since the number of supported new requests is always smaller than or equal to the total number of given requests. For networks, the potential changes are new requests, while for processors we consider diverse types of applications as changes. Another essential point is that the evaluation criterion for a network is to check whether new requests are supported or not, whereas for executing applications on processors we also consider performance and energy efficiency when assessing application support.


2.1.3 Power Systems

For power systems, flexibility refers to the ability of a power system to deploy its resources in response to changes in the net load [10]. Net load is defined as the residual demand that must be supplied after using all Renewable Energy (RE). With the increased share of RE in power systems, the unpredictability and uncertainty associated with the net load increase, due to the high variation in the output of intermittent renewable energy sources. It becomes crucial for power system planners to consider operational characteristics that might influence system flexibility. Vishwamitra and Sayed propose a composite metric for assessing the flexibility available from generating units [11]. A comprehensive set of flexibility parameters of a power generator is considered, together with parameter importance to weight these parameters. In the final step, the robustness of the Composite Flexibility Metric (CFM) is assessed, and the selected normalization scheme is examined and adjusted in a sensitivity analysis step. Figure 2.1 illustrates the steps to construct a CFM [11].

Figure 2.1: Steps to construct a CFM

This framework ensures that the methodological decisions in each construction step are made scientifically. Some worthwhile elements can be extracted and applied to quantifying processor flexibility, such as the normalization step, which enables the comparison among individual indicators in different measurement units. In general, the examples above consider flexibility as a system property and quantify flexibility as the insensitivity of the system to external changes, instead of formulating flexibility as a function of diverse system parameters. This approach is suitable for systems with complex designs, such as processors.

2.2 Flexibility of Processors

2.2.1 Single-ISA Heterogeneous Processors

A single-Instruction Set Architecture (ISA) heterogeneous multi-core architecture is a chip multiprocessor composed of a diverse set of cores with varying size, performance, and complexity [12]. The underlying motivation for single-ISA heterogeneity is that a diverse set of cores can enable runtime power flexibility and deliver better performance.

Figure 2.2: Two possibilities for selecting four cores

To measure flexibility in single-ISA heterogeneous processors, a clumpiness metric for measuring entropy-based diversity is proposed, which states how well the selected heterogeneous cores cover the design space [13]. A small clumpiness indicates that a set of cores maximizes runtime flexibility. The measure of clumpiness is based on the relation between normalized time and power.


Two examples are presented in Figure 2.2, where points C1, C2, C3 and C4 represent four cores with different configurations. By benchmarking one application on each core separately, the four cores are placed in terms of normalized time and power. Clumpiness is formulated as Eq. 2.2 [13].

\text{clumpiness} = \frac{(x_1 - r_1) + (r_2 - x_n) + \sum_{i=2}^{n-1} d_i}{r_2 - r_1}, \qquad d_i = \left|x_i - \frac{x_{i-1} + x_{i+1}}{2}\right| \quad (2.2)

Clumpiness is calculated over the range [r1, r2] for n ordered points x. r1 and r2 indicate the lower and upper bound of the given range, which is [0, 2] in both Selection 1 and Selection 2. x1 and xn represent the first and the last point, namely the cores with the minimum and maximum normalized power. The distance between two points is measured as the Euclidean distance. In Fig. 2.2, x1 and xn are C1 and C4 respectively. The summation term measures the distance from each intermediate point to the halfway mark between its two neighboring points. Thus, clumpiness quantifies diversity based on the distances between all local points. With this metric, Selection 2 is more clustered than Selection 1; thus Selection 1, with a smaller clumpiness, is evaluated to be more flexible [13].

The drawback of this approach is that the metric is only applicable to heterogeneous processors, and the flexibility value is application-specific, thus varying with different applications. Remarkably, quantifying flexibility as the statistical dispersion of the system behavior is promising, since in this manner flexibility is defined relatively while still reflecting the system's capability. By replacing the different cores with applications benchmarked on one processor, flexibility could also be quantified as a clustering degree based on the distances between the normalized performance or energy efficiency of different types of applications. Thus, the clumpiness measure would be a good candidate for quantifying processor flexibility.
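To make the measure concrete, the sketch below evaluates Eq. 2.2 for two hypothetical selections of four points, treating each point as a scalar position along a single normalized axis; the full metric in [13] places cores in the normalized time/power plane and uses Euclidean distances. All values are invented for illustration.

```c
#include <math.h>
#include <stdio.h>

/* Clumpiness per Eq. 2.2 for points x[0..n-1], sorted ascending within [r1, r2].
   A smaller value means a more evenly spread, hence more flexible, selection. */
double clumpiness(const double *x, int n, double r1, double r2) {
    double sum = (x[0] - r1) + (r2 - x[n - 1]);           /* boundary terms      */
    for (int i = 1; i < n - 1; i++)                       /* intermediate points */
        sum += fabs(x[i] - 0.5 * (x[i - 1] + x[i + 1]));
    return sum / (r2 - r1);
}

int main(void) {
    double spread[]    = {0.3, 0.9, 1.4, 1.9};  /* evenly spread cores */
    double clustered[] = {0.3, 0.4, 0.5, 1.9};  /* clustered cores     */
    printf("spread:    %.3f\n", clumpiness(spread, 4, 0.0, 2.0));
    printf("clustered: %.3f\n", clumpiness(clustered, 4, 0.0, 2.0));
    return 0;
}
```

The clustered selection yields the larger clumpiness value and would therefore be judged less flexible.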

2.2.2 Computer Architecture

A model to define architecture flexibility is proposed by Fasthuber et al. [5]. The motivation for proposing this model is identical to ours: to provide a quantifiable measure of flexibility for different architectures and designs. The model aims to evaluate, in a true/false manner, whether an architecture provides sufficient flexibility to support the minimum requirements for all benchmark parameters.

Figure 2.3: Design flow of the flexibility evaluation model

As presented in Figure 2.3, the proposed model is composed of two parts: an architecture-independent part and an architecture-dependent part. In the architecture-independent part, system/algorithm-related requirements are extracted by profiling a set of codes. Due to this independence, the extracted system requirements can be applied to architectures with different characteristics, which constitutes the architecture-dependent part.

Although the authors claim the model is established in a quantitative manner, a partly qualitative evaluation of the hardware scalability is still involved when examining the architecture capability. Besides, it is hard to apply this model, as currently there is no tool that can comprehensively analyze codes and then extract the corresponding hardware requirements. Furthermore, if processor capability is only evaluated in a true/false manner, then there is no distinction between two architectures that fulfill all requirements. However, the idea of using different applications as external changes to computer architectures is worth considering when determining the definition of processor flexibility.

2.2.3 Processor Versatility

A more generic metric that defines processor versatility is proposed by Van Berkel [14]. In this work, a versatile processor is claimed to be easy to program and to fairly support a variety of applications or uses with different algorithms, data types, etc. [14]. The proposed versatility is presented in Eq. 2.3, which considers the amount of information used to specify a unit of useful work as versatility.

\text{versatility} = \frac{\text{amount of information to specify a unit of work}}{\text{amount of useful work specified for that unit of work}} \quad (2.3)

For processors, a unit of work is defined as an instruction, with which the processor hardware is configured. Thus, the amount of information to specify an instruction can be represented by the instruction width of the Instruction Set Architecture (ISA). In this manner, versatility is defined as a property of the ISA, independent from specific ISA implementations and executed applications. The amount of useful work of instructions is extrapolated as the number of useful operations per instruction. Van Berkel further claims that extending the ISA for the same work or instruction, namely providing more instruction options, signifies increasing versatility. For same-sized instructions, overhead is introduced when more work is required, indicating that versatility decreases [14]. Processor versatility is further formulated as Eq. 2.4.

\text{versatility} = \frac{\text{average instruction size}}{\text{number of useful operations per instruction}} \quad (2.4)

As an example, a Fast Fourier Transform (FFT) based on radix-2 butterflies (Figure 2.4) is used to measure processor versatility. The reason for selecting the FFT is that FFT data is available for most architectures and the impact of diverse compilers can be eliminated, as FFT code is typically hand-optimized [14]. One complex radix-2 butterfly is assumed to take 8 operations: 4 multiplications and 4 additions/subtractions. Assuming the size of each FFT block is N and CPI indicates the typical Cycles Per Instruction, operations per instruction can be formulated as Eq. 2.5.

\text{ops per instruction} = \frac{8 \cdot \frac{N}{2}\log_2 N}{\#\text{cycles per FFT}} \times CPI \quad (2.5)

Figure 2.4: A complex radix-2 butterfly [14]

The definition of processor versatility is essentially the same as processor flexibility. Although this work is the most rigorous attempt to formally define flexibility, some shortcomings hinder its application. The metric seems impractical for processors that do not execute clocked instructions, such as FPGAs. Regarding the approach to obtain the number of useful operations, conducting the complexity analysis of applications is challenging without manual work. Furthermore, using operations as basic units implies that different operations, such as multiplications and additions, are weighted equally, which seems arbitrary.

Overall, flexibility has frequently been assessed based on external changes. Regarding modern processors, the three examples discussed above show that there are tremendous diversities in understanding and quantifying flexibility. The inherent properties of those metrics limit their scope of application, making them impractical for wide use, which is tackled by our proposed flexibility metric.

Chapter 3

Flexibility as a New Measure

After analyzing the existing flexibility definitions, in this chapter we give a new measure for processor flexibility at the application level. A flexibility framework is then provided to give an overview of the steps to assess processor flexibility. The decisions made in each step are explained in detail.

3.1 Flexibility Definition

In literature, the definitions of flexibility in diverse fields are always associated with changes. For power systems, changes are variations in the net load. For networks, changes refer to varying requirements or new traffic distributions in the network. One common point is that all these changes are generated by external demands.

Regarding processors, we consider diverse types of applications as changes. Therefore, the degree of dispersion in how processors support different types of applications is used to quantify flexibility. A flexible processor provides fair support to applications regardless of application type, while outperforming on only a few certain types of applications is considered inflexible. Hence, we define that processor flexibility refers to the invariance of an architecture's energy efficiency, normalized performance, or other secondary metrics, to changes of applications. Less variation indicates better flexibility.

In the proposed definition, flexibility is defined as a relative attribute, determined with respect to other metrics, e.g., performance-flexibility or energy-flexibility. A reference application set is used to assess the support capability of processors. Measuring in the application domain abstracts flexibility from the internal complexity of the architecture and from architectural designs. Still, this flexibility measure is able to reflect architectural designs. For instance, DSPs perform superbly on accumulations and floating-point computations due to their specialized instructions, large accumulator registers, and floating-point hardware units. However, when running an application that can be highly parallelized, such as image processing, a DSP is without doubt inferior to a GPU with massive parallel processing power.

For performance-flexibility in processor design, best-effort techniques including branch prediction and caches can incur considerable performance variation, such as the penalties caused by cache misses. Besides, parallelism techniques like hardware multithreading and vector processing, which benefit certain types of applications, fail to equally support other applications. The performance variation that is thereby introduced degrades performance-flexibility.

Note that processor flexibility in this context is only associated with the dispersion among the normalized results of different applications, irrespective of their absolute values, meaning that it is unnecessary for a flexible processor to have high performance or energy efficiency. This property inevitably creates possibilities to artificially inflate flexibility values, since unproductive operations can be added to narrow the gap between different applications, such as inserting nop operations into applications on a CGRA. However, this behavior lowers the performance and energy efficiency of the processor, which is normally undesirable. Hence, when using the proposed metric to quantify processor flexibility, the benchmark set that the flexibility is measured on should be provided, along with the absolute values of the secondary metrics, such as performance and energy efficiency.

3.2 Flexibility Measure

Several measures exist for quantifying the statistical dispersion among applications. In this section the chosen flexibility measure is justified by comparing some common measures. Common measures of statistical dispersion can be classified into two categories: robust measures and conventional measures. A typical example of a robust measure is the median absolute deviation (MAD), which is resilient to extreme values in a quantitative dataset. MAD is defined as the median of the absolute deviations from the median of the original data, as shown in Eq. 3.1. Thus, this measure ignores a small number of extreme values and focuses only on the median of the dataset. This is in contrast to conventional measures of scale, such as the arithmetic standard deviation (ASD) and the geometric standard deviation (GSD), which consider the entire dataset including extreme values. This reveals that conventional measures are non-robust and are greatly influenced by extreme values [15]. With regard to ASD and GSD, both of these measures describe how spread out a set of numbers is, while the core difference is the preferred average: the arithmetic mean (AM) or the geometric mean (GM).

\mathrm{MAD}(x) = \mathrm{median}\left(\left|x_i - \mathrm{median}(x)\right|\right) \quad (3.1)

\mathrm{ASD}(x) = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \mathrm{AM}(x)\right)^2}, \quad \text{where } \mathrm{AM}(x) = \frac{1}{n}\sum_{i=1}^{n} x_i \quad (3.2)

\mathrm{GSD}(x) = \exp\left(\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\ln\frac{x_i}{\mathrm{GM}(x)}\right)^2}\right), \quad \text{where } \mathrm{GM}(x) = \left(\prod_{i=1}^{n} x_i\right)^{\frac{1}{n}} \quad (3.3)

As shown in Eq. 3.2, AM characterizes the average value of a dataset by summing up all numbers and then dividing by the length of the dataset. ASD is preferred to measure data dispersion when AM is used. The value of ASD indicates the average distance of the datapoints to the mean of the dataset; a high standard deviation indicates high data dispersion. In contrast to ASD, the GSD shown in Eq. 3.3 is dimensionless and is considered a multiplicative factor. GM is used as the average for GSD; it takes the product of all numbers and then raises it to the inverse of the length of the dataset. To conclude, AM and ASD are sum-based values. As such, they are appropriate for additive processes. However, when dealing with multiplicative relationships, such as ratios, growth rates, and speedups, AM over-estimates the data average, and GM as the product-based value is suitable for such multiplicative processes [16]. An advantage of GSD is its invariance to multiplicative scaling, while ASD is not invariant. The proof of this multiplicative property is shown below.
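Before turning to the proofs below, a minimal C implementation of Eq. 3.2 and Eq. 3.3 may help make the definitions concrete; the input values are hypothetical normalized results used purely for illustration.

```c
#include <math.h>
#include <stdio.h>

/* Arithmetic mean and ASD (Eq. 3.2), geometric mean and GSD (Eq. 3.3)
   for a positive dataset x[0..n-1]. */
double am(const double *x, int n) {
    double s = 0.0;
    for (int i = 0; i < n; i++) s += x[i];
    return s / n;
}
double asd(const double *x, int n) {
    double mu = am(x, n), s = 0.0;
    for (int i = 0; i < n; i++) s += (x[i] - mu) * (x[i] - mu);
    return sqrt(s / n);
}
double gm(const double *x, int n) {
    double s = 0.0;
    for (int i = 0; i < n; i++) s += log(x[i]);   /* product via a log-sum */
    return exp(s / n);
}
double gsd(const double *x, int n) {              /* dimensionless, >= 1 */
    double mu = gm(x, n), s = 0.0;
    for (int i = 0; i < n; i++) {
        double d = log(x[i] / mu);
        s += d * d;
    }
    return exp(sqrt(s / n));
}

int main(void) {
    /* hypothetical normalized results of one processor on four benchmarks */
    double x[] = {2.0, 8.0, 4.0, 16.0};
    printf("AM = %.2f  ASD = %.2f  GM = %.2f  GSD = %.2f\n",
           am(x, 4), asd(x, 4), gm(x, 4), gsd(x, 4));
    return 0;
}
```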

Lemma 3.2.1. GSD is invariant to multiplicative scaling, i.e., GSD(sx) = GSD(x) where x is a positive dataset and s is a positive constant.

Proof. Assume \mathrm{GM}(x) = \left(\prod_{i=1}^{n} x_i\right)^{\frac{1}{n}}; we scale all x_i with a positive factor s, thus

\mathrm{GM}(sx) = \left(\prod_{i=1}^{n} s x_i\right)^{\frac{1}{n}} = s\left(\prod_{i=1}^{n} x_i\right)^{\frac{1}{n}} = s\,\mathrm{GM}(x)

\mathrm{GSD}(sx) = \exp\left(\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\ln\frac{s x_i}{\mathrm{GM}(sx)}\right)^2}\right) = \exp\left(\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\ln\frac{x_i}{\mathrm{GM}(x)}\right)^2}\right) = \mathrm{GSD}(x)


Lemma 3.2.2. ASD scales with multiplicative scaling, i.e., \mathrm{ASD}(sx) = s\,\mathrm{ASD}(x), where x is a positive dataset and s is a positive constant.

Proof. Assume \mathrm{AM}(x) = \frac{1}{n}\sum_{i=1}^{n} x_i; we scale all x_i with a factor s, thus

\mathrm{AM}(sx) = \frac{1}{n}\sum_{i=1}^{n} s x_i = s\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = s\,\mathrm{AM}(x)

\mathrm{ASD}(sx) = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(s x_i - \mathrm{AM}(sx)\right)^2} = \sqrt{\frac{s^2}{n}\sum_{i=1}^{n}\left(x_i - \mathrm{AM}(x)\right)^2} = s\,\mathrm{ASD}(x)

Taking cycles and execution time as an example, the cycle count is the execution time multiplied by the frequency, as shown in Eq. 3.4. With ASD, quantifying data dispersion in cycles versus execution time leads to a frequency-fold difference in flexibility, while GSD remains unchanged. In this manner, with GSD, performance-flexibility values are defined irrespective of frequency, eliminating the inconsistency in flexibility values when processor performance is boosted by increasing the clock frequency. Similarly, consider a processor A that executes each application 5x slower than a processor B, so that all of A's measured values are 5x smaller in performance terms. With ASD, processor A is evaluated to have 5x less deviation than processor B, which would mean processor A is 5x more flexible than processor B. In contrast, the flexibility results of these two processors remain the same when applying GSD. Thus, it can be concluded that the deviation measured by ASD is strongly influenced by absolute values. That is to say, if flexibility measurements were conducted across diverse types of processors, including desktop and embedded processors, the embedded processors running at low frequencies would most likely be measured as more flexible than any desktop processor when applying ASD.

\#(\text{Cycles}) = \text{Execution time} \times \text{Frequency} \quad (3.4)

Overall, the flexibility measure is required to be sensitive enough to capture exceptional system behavior on applications, which may reflect the lack of software/hardware support or of specialized instructions. For instance, a processor with hardware floating-point units can speed up programs with a large number of floating-point computations, leading to an extremely low execution time compared to a processor that can only provide software routine support. Thus, the proposed metric is expected to be sensitive to extreme values, resulting in the rejection of robust measures. Among the conventional measures, GSD is preferred: as explicitly stated in the flexibility definition, flexibility is measured on normalized values. After data normalization, the original dataset is inevitably transformed and redefined as ratios to a baseline, introducing multiplicative relations. ASD overestimates variation when multiplicative relations are involved. In addition, GM as a product-based value is claimed to be the only correct mean for such multiplicative processes [17]. Hence, in this work, GSD is chosen to quantify processor flexibility. Note that the application of GSD and GM is constrained to positive datasets due to the logarithm and root in the calculation.
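A small worked example with invented numbers makes the scaling argument concrete. Take the dataset x = (2, 8) and the scaling factor s = 3:

\mathrm{AM}(x) = 5, \quad \mathrm{ASD}(x) = 3, \quad \mathrm{AM}(3x) = 15, \quad \mathrm{ASD}(3x) = 9 = 3\,\mathrm{ASD}(x)

\mathrm{GM}(x) = 4, \quad \mathrm{GSD}(x) = e^{\ln 2} = 2, \quad \mathrm{GM}(3x) = 12, \quad \mathrm{GSD}(3x) = 2 = \mathrm{GSD}(x)

Scaling all measurements by the same factor, e.g. converting execution times into cycle counts at a fixed frequency, thus changes ASD but leaves GSD untouched.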

3.2.1 Additional Properties of GM and GSD

Several properties of GM and GSD are further investigated below. The motivations for using GM and GSD are substantial and can be summarized as follows:

• Lemma 3.2.4 suggests that the choice of baseline is irrelevant when using ratios to compare the GMs of the normalized metrics, thus maintaining the original ratio.
• Lemma 3.2.5 suggests that the GSD values are identical when applied to two metrics that are the reciprocal of each other, e.g., energy consumption and energy efficiency.
• Lemma 3.2.6 suggests that the equivalence for GM indicated by Lemma 3.2.3 does not hold for GSD, as \mathrm{GSD}(x/b) \ge \mathrm{GSD}(x)/\mathrm{GSD}(b). Therefore, in general \mathrm{GSD}(x/b)/\mathrm{GSD}(y/b) \ne \mathrm{GSD}(x)/\mathrm{GSD}(y), meaning that the choice of baseline is relevant when using ratios to compare the GSDs of the normalized metrics.


Lemma 3.2.3. The GM of the normalized dataset is equal to the ratio of the GMs of the original dataset and its baseline, i.e., \mathrm{GM}(x/b) = \mathrm{GM}(x)/\mathrm{GM}(b), where x and b are positive datasets.

Proof. Assume \mathrm{GM}(x) = \left(\prod_{i=1}^{n} x_i\right)^{\frac{1}{n}} and \mathrm{GM}(b) = \left(\prod_{i=1}^{n} b_i\right)^{\frac{1}{n}}, thus

\mathrm{GM}\!\left(\frac{x}{b}\right) = \left(\prod_{i=1}^{n} \frac{x_i}{b_i}\right)^{\frac{1}{n}} = \frac{\left(\prod_{i=1}^{n} x_i\right)^{\frac{1}{n}}}{\left(\prod_{i=1}^{n} b_i\right)^{\frac{1}{n}}} = \frac{\mathrm{GM}(x)}{\mathrm{GM}(b)}

Lemma 3.2.4. The ratio of the GMs of different normalized datasets is the same as the ratio of the GMs of the original datasets, i.e., \mathrm{GM}(x/b)/\mathrm{GM}(y/b) = \mathrm{GM}(x)/\mathrm{GM}(y), where x, y, b are positive datasets.

Proof. By Lemma 3.2.3,

\frac{\mathrm{GM}(x/b)}{\mathrm{GM}(y/b)} = \frac{\mathrm{GM}(x)/\mathrm{GM}(b)}{\mathrm{GM}(y)/\mathrm{GM}(b)} = \frac{\mathrm{GM}(x)}{\mathrm{GM}(y)}

Lemma 3.2.5. The GSD of the normalized data is equal to the GSD of the reciprocal of the normalized data, i.e., \mathrm{GSD}(x/b) = \mathrm{GSD}(b/x), where x, b are positive datasets.

Proof. Assume \mu_x = \mathrm{GM}(x) and \mu_b = \mathrm{GM}(b). Due to Lemma 3.2.3, \mu_1 = \mathrm{GM}(x/b) = \mu_x/\mu_b and \mu_2 = \mathrm{GM}(b/x) = \mu_b/\mu_x, yielding \mu_2 = 1/\mu_1.

\mathrm{GSD}\!\left(\frac{x}{b}\right) = \exp\left(\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\ln\frac{x_i/b_i}{\mu_1}\right)^2}\right) = \exp\left(\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\ln\frac{x_i}{b_i} - \ln\mu_1\right)^2}\right) \quad (3.5)

\mathrm{GSD}\!\left(\frac{b}{x}\right) = \exp\left(\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\ln\frac{b_i/x_i}{\mu_2}\right)^2}\right) \overset{\mu_2 = 1/\mu_1}{=} \exp\left(\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\ln\frac{\mu_1}{x_i/b_i}\right)^2}\right)
= \exp\left(\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\ln\mu_1 - \ln\frac{x_i}{b_i}\right)^2}\right) = \exp\left(\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\ln\frac{x_i}{b_i} - \ln\mu_1\right)^2}\right) = \mathrm{GSD}\!\left(\frac{x}{b}\right) \quad (3.6)

Lemma 3.2.6. The GSD of the normalized dataset is always greater than or equal to the ratio of the GSDs of the original dataset and its baseline, i.e., \mathrm{GSD}(x/b) \ge \mathrm{GSD}(x)/\mathrm{GSD}(b), where x, b are positive datasets.

Proof. Assume \mu_x = \mathrm{GM}(x) and \mu_b = \mathrm{GM}(b). Due to Lemma 3.2.3, we know \mathrm{GM}(x/b) = \mu_x/\mu_b.

\mathrm{GSD}\!\left(\frac{x}{b}\right) = \exp\left(\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\ln\frac{x_i/b_i}{\mu_x/\mu_b}\right)^2}\right) = \exp\left(\sqrt{\frac{1}{n}} \cdot \sqrt{\sum_{i=1}^{n}\left(\ln\frac{x_i}{\mu_x} - \ln\frac{b_i}{\mu_b}\right)^2}\right) \quad (3.7)

\frac{\mathrm{GSD}(x)}{\mathrm{GSD}(b)} = \frac{\exp\left(\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\ln\frac{x_i}{\mu_x}\right)^2}\right)}{\exp\left(\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\ln\frac{b_i}{\mu_b}\right)^2}\right)} = \exp\left(\sqrt{\frac{1}{n}} \cdot \left(\sqrt{\sum_{i=1}^{n}\left(\ln\frac{x_i}{\mu_x}\right)^2} - \sqrt{\sum_{i=1}^{n}\left(\ln\frac{b_i}{\mu_b}\right)^2}\right)\right) \quad (3.8)


As exp is an increasing function, to further compare \mathrm{GSD}(x/b) and \mathrm{GSD}(x)/\mathrm{GSD}(b), we need to compare

(1) = \sqrt{\sum_{i=1}^{n}\left(\ln\frac{x_i}{\mu_x} - \ln\frac{b_i}{\mu_b}\right)^2} \quad \text{and} \quad (2) = \sqrt{\sum_{i=1}^{n}\left(\ln\frac{x_i}{\mu_x}\right)^2} - \sqrt{\sum_{i=1}^{n}\left(\ln\frac{b_i}{\mu_b}\right)^2}.

Obviously (1) ≥ 0 always holds, while (2) can be greater than, less than, or equal to 0. Thus, we need to consider all cases. For (2) < 0, (1) > (2) trivially holds, while in the case (2) ≥ 0 more analysis is required to compare (1) and (2). Before the next step, note that when a > b > 0, a² > b² must hold, as a² − b² = (a + b)(a − b) > 0. That is to say, when both numbers are greater than 0, squaring them does not change the inequality relation between them. Thus it is valid to compare (1)² and (2)².

(1)^2 = \sum_{i=1}^{n}\left(\ln\frac{x_i}{\mu_x} - \ln\frac{b_i}{\mu_b}\right)^2 = \sum_{i=1}^{n}\left(\ln\frac{x_i}{\mu_x}\right)^2 - 2\sum_{i=1}^{n}\ln\frac{x_i}{\mu_x}\cdot\ln\frac{b_i}{\mu_b} + \sum_{i=1}^{n}\left(\ln\frac{b_i}{\mu_b}\right)^2 \quad (3.9)

(2)^2 = \sum_{i=1}^{n}\left(\ln\frac{x_i}{\mu_x}\right)^2 - 2\sqrt{\sum_{i=1}^{n}\left(\ln\frac{x_i}{\mu_x}\right)^2}\cdot\sqrt{\sum_{i=1}^{n}\left(\ln\frac{b_i}{\mu_b}\right)^2} + \sum_{i=1}^{n}\left(\ln\frac{b_i}{\mu_b}\right)^2 \quad (3.10)

From Eq. 3.9 and Eq. 3.10, it can be inferred:

(1)^2 - (2)^2 = 2\left(\sqrt{\sum_{i=1}^{n}\left(\ln\frac{x_i}{\mu_x}\right)^2}\cdot\sqrt{\sum_{i=1}^{n}\left(\ln\frac{b_i}{\mu_b}\right)^2} - \sum_{i=1}^{n}\ln\frac{x_i}{\mu_x}\cdot\ln\frac{b_i}{\mu_b}\right) \quad (3.11)

Then we need to compare (3) = \sqrt{\sum_{i=1}^{n}\left(\ln\frac{x_i}{\mu_x}\right)^2}\cdot\sqrt{\sum_{i=1}^{n}\left(\ln\frac{b_i}{\mu_b}\right)^2} and (4) = \sum_{i=1}^{n}\ln\frac{x_i}{\mu_x}\cdot\ln\frac{b_i}{\mu_b}. If (3) ≥ (4) holds, then we know (1)² ≥ (2)², namely (1) ≥ (2). As before, (3) ≥ 0 always holds, while the sign of (4) is not fixed. In this manner, if (4) < 0, then certainly (3) > (4). In the case (4) ≥ 0, we can further compare (3)² and (4)².

(3)^2 = \left(\sum_{i=1}^{n}\left(\ln\frac{x_i}{\mu_x}\right)^2\right)\cdot\left(\sum_{i=1}^{n}\left(\ln\frac{b_i}{\mu_b}\right)^2\right) \quad (3.12)

(4)^2 = \left(\sum_{i=1}^{n}\ln\frac{x_i}{\mu_x}\cdot\ln\frac{b_i}{\mu_b}\right)^2 \quad (3.13)

Now we use X_i and B_i to represent \ln\frac{x_i}{\mu_x} and \ln\frac{b_i}{\mu_b} to give a clearer view. Thus, we need to compare \sum_{i=1}^{n} X_i^2 \cdot \sum_{i=1}^{n} B_i^2 and \left(\sum_{i=1}^{n} X_i B_i\right)^2, where X_i, B_i \in \mathbb{R}. The Cauchy-Schwarz inequality states that in the Euclidean space \mathbb{R}^n with the standard inner product, the inequality shown in Eq. 3.14 always holds:

\left(\sum_{i=1}^{n} u_i v_i\right)^2 \le \left(\sum_{i=1}^{n} u_i^2\right)\left(\sum_{i=1}^{n} v_i^2\right) \quad (3.14)

Equality holds when \frac{u_1}{v_1} = \frac{u_2}{v_2} = \dots = \frac{u_n}{v_n}. Hence, as indicated by the Cauchy-Schwarz inequality, it can be inferred:

\left(\sum_{i=1}^{n} X_i B_i\right)^2 \le \left(\sum_{i=1}^{n} X_i^2\right)\left(\sum_{i=1}^{n} B_i^2\right) \quad (3.15)


That is to say, (3) ≥ (4) holds for any (4) \in \mathbb{R}; thus it can be further deduced that (1) ≥ (2) is always true for any (2) \in \mathbb{R}. Finally, the following inequality always holds:

\mathrm{GSD}\!\left(\frac{x}{b}\right) \ge \frac{\mathrm{GSD}(x)}{\mathrm{GSD}(b)}

Equality holds only when \frac{x_1}{b_1} = \frac{x_2}{b_2} = \dots = \frac{x_n}{b_n}.

3.3 Flexibility Framework

The flexibility framework illustrated in Figure 3.1 provides a clear overview of how to measure flexibility step by step. Note that in this work compilers are considered to be a part of the platforms of which the flexibility is measured.

Figure 3.1: Framework for flexibility measurements

To quantify flexibility, the first step is to compile and run the same application set on the processors, where a cross-platform language such as OpenCL or a benchmark suite with multi-language support is preferred for executing applications across diverse platforms. Next, with the obtained raw results, data normalization is a prerequisite to ensure that the benchmark results of diverse applications are comparable. Eventually, the proposed flexibility metric is applied to the normalized data to measure flexibility.
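As a minimal sketch of these three steps, assume the raw measurements have already been collected into an m x n matrix of execution times together with one intrinsic-workload value per application (the baseline normalization of Section 3.4.2). The code below normalizes each measurement and reports one performance-flexibility value (GSD) per platform; all names and numbers are invented for illustration.

```c
#include <math.h>
#include <stdio.h>

#define M 2   /* platforms    */
#define N 3   /* applications */

/* Performance-flexibility of one platform: GSD over its normalized performance,
   where normalized performance = intrinsic work / execution time. */
double flexibility(const double t[N], const double work[N]) {
    double lognorm[N], mean = 0.0, var = 0.0;
    for (int j = 0; j < N; j++) {
        lognorm[j] = log(work[j] / t[j]);
        mean += lognorm[j] / N;
    }
    for (int j = 0; j < N; j++)
        var += (lognorm[j] - mean) * (lognorm[j] - mean) / N;
    return exp(sqrt(var));   /* 1 = perfectly flexible, larger = less flexible */
}

int main(void) {
    /* hypothetical execution times (s) of N applications on M platforms */
    double t[M][N] = {{0.10, 0.40, 0.20},    /* platform A */
                      {0.05, 2.00, 0.01}};   /* platform B */
    /* hypothetical intrinsic workload (intrinsic transistors) per application */
    double work[N] = {1.0e9, 4.0e9, 2.0e9};
    for (int i = 0; i < M; i++)
        printf("platform %d: performance-flexibility (GSD) = %.2f\n",
               i, flexibility(t[i], work));
    return 0;
}
```

With these invented numbers, platform A delivers the same normalized performance on every application and obtains a GSD of 1, whereas platform B, whose normalized performance varies by two orders of magnitude, obtains a much larger GSD and is therefore evaluated as less flexible.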

3.4 Data Normalization

Before quantifying flexibility, secondary metrics such as execution time, measured across diverse applications, need to be normalized. The reason is that the applications used as indicators are expressed in different units and on disproportionate scales, so the benchmarked applications represent inequivalent amounts of computational work. For instance, applying a Gaussian filter to images of different sizes results in inequivalent workloads. Therefore, data normalization is required prior to any data analysis and comparison [11]. To clearly explain the normalization methods, the benchmark results (execution time) of n applications measured on m processors are represented as an m x n matrix; the benchmark results of each processor on the n applications can be found in its corresponding row. Two types of normalization methods are introduced below: normalize to dataset and normalize to baseline.

Figure 3.2: Sort the execution times of n applications on m processors into an m x n matrix.

3.4.1 Normalize to Dataset

This normalization technique requires a set of processors to be tested. The dataset here is determined as the benchmark results (execution time) of one application on m processors, namely each column of the m x n matrix. The X_min, µ, and X_max values used in this section represent the minimum, mean, and maximum value of the corresponding dataset (column).

Min-max Normalization

Min-max normalization (Eq. 3.16) is a linear normalization technique. It re-scales indicators into an identical range [0, 1] by subtracting the minimum value and dividing by the range of the indicator values [18]. This technique is based on the extreme values (X_max and X_min), reflecting the ratio of d_point-to-min to d_max-to-min, where d is an abbreviation of distance.

X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}} \quad (3.16)

where X_norm is the normalized data and X is the original data. The advantages are its identical interval and its high sensitivity to extreme values. Min-max normalization retains the original distribution of a dataset except for the transformation to a common range [0, 1] [19]. However, when outliers are present, the obtained results can be skewed significantly.

Mean Normalization

Similar to min-max normalization, mean normalization (Eq. 3.17) re-scales data based on the extreme values, but into the interval [-1, 1]. This technique reflects the ratio of d_point-to-mean to d_max-to-min, where d is an abbreviation of distance.

X_{norm} = \frac{X - \mu}{X_{max} - X_{min}} \quad (3.17)

where X_norm is the normalized data and X is the original data. Mean normalization has the same pros and cons as min-max normalization. The difference is that it extends the range into the negative interval, down to -1.

Z-score Normalization

Z-score normalization (Eq. 3.18) re-scales indicators to a common scale with a mean of zero and a standard deviation of one, by subtracting the mean of each indicator from a raw value and then dividing by its standard deviation [18]. Thus, the z-score technique indicates how much each element of an indicator deviates from the mean of the distribution. Observed values above the mean have positive z-scores, while values below the mean have negative z-scores.

X_{norm} = \frac{X - \mu}{\sigma}, \quad \text{where } \sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_i - \mu\right)^2} \quad (3.18)

where X_norm is the normalized data and X is the original data. As discussed in Section 3.2, the standard deviation is sensitive to extreme values; therefore, this normalization is not robust. Besides, the z-score replaces the measurement units with the number of standard deviations away from the mean. Thus, it is advisable when comparing indicators with different units.
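The three dataset normalizations can be summarized in a few lines of code. The sketch below applies them to one hypothetical column of the m x n matrix, i.e., the execution times of a single application on four processors.

```c
#include <math.h>
#include <stdio.h>

#define M 4   /* processors: one column of the m x n matrix */

int main(void) {
    double x[M] = {0.2, 0.5, 1.0, 4.3};   /* hypothetical execution times (s) */
    double min = x[0], max = x[0], mean = 0.0, sd = 0.0;
    for (int i = 0; i < M; i++) {
        if (x[i] < min) min = x[i];
        if (x[i] > max) max = x[i];
        mean += x[i] / M;
    }
    for (int i = 0; i < M; i++)
        sd += (x[i] - mean) * (x[i] - mean) / M;
    sd = sqrt(sd);

    for (int i = 0; i < M; i++)
        printf("%.2f  min-max: %.3f  mean: %+.3f  z-score: %+.3f\n",
               x[i],
               (x[i] - min) / (max - min),    /* Eq. 3.16, range [0, 1]  */
               (x[i] - mean) / (max - min),   /* Eq. 3.17, range [-1, 1] */
               (x[i] - mean) / sd);           /* Eq. 3.18, mean 0, sd 1  */
    return 0;
}
```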


3.4.2 Normalize to Baseline

Baseline normalization simply divides all elements of an indicator by a baseline. For instance, the normalized execution time is determined as the execution time divided by a baseline.

X_{norm} = \frac{X}{X_{baseline}} \quad (3.19)

Normalize to Reference

A fundamental approach is normalizing data to a reference set. However, determining a proper reference poses a challenge. As the established flexibility metric suggests, the geometric standard deviation is applied to quantify processor flexibility. Simply taking the benchmark results of one processor as the reference makes the reference processor "the most flexible" by definition. For instance, if a RISC processor is used as the baseline, normalizing a RISC processor to the baseline (RISC) transforms each value in the dataset to 1, meaning that there is no deviation and thus no architecture can be more flexible than a RISC processor. Intuitively this is not true, and most would argue that, for example, an FPGA is more flexible than a RISC processor, since any circuit can be instantiated on an FPGA, including any RISC processor.

Normalize to Intrinsic Work

Normalizing to the inherent amount of work is an alternative baseline normalization. The intuition behind this method is that every application has a certain amount of intrinsic work, and normalizing a measured result to the intrinsic work expresses it per unit of intrinsic work. The essence of attaining the intrinsic work is to construct an ideal, arbitrarily large, stateless combinational circuit in which only basic 2-input-1-output and 1-input-1-output logic gates are employed. Given the inputs, this combinational circuit is capable of directly outputting the results. Any potential speedup achieved by utilizing more gates to reduce the logic depth causes overhead. In this circuit, data transmission is performed by hard-wiring gates, and loops in applications are implemented as being fully unrolled. Thus, operations like data loads, stores, and branches are redundant. Only binary operations that perform computational work are considered to be intrinsic work. Considering the fact that the complexity of gates varies in the actual CMOS implementation, the basic unit gate is further broken down into transistors, e.g., a CMOS XOR gate consists of a minimum of 8 transistors while a basic OR gate only requires 4 transistors. In this manner, the intrinsic work, represented by the minimum number of intrinsic transistors, is a property of the application, independent of the platform it executes on.

As it is intractable to construct the optimal combinational circuit, the circuit complexity is instead approximated by establishing an equivalent circuit out of basic functional circuit blocks such as addition, multiplication, division, and so on. Selecting these basic functional blocks is non-trivial. The only requirement is that the set of blocks has to be Turing complete, which leaves many options. A natural choice is to utilize an existing reduced ISA, since it divides the overall functionality into an elementary set of operations. In particular, the intermediate representation (IR) used by the LLVM compiler is a suitable choice, as it essentially is an abstract high-level model of the most common ISAs. Extracting IR instructions can be done for arbitrary applications that are compilable with the LLVM compiler. Eventually, by weighting every IR instruction based on its circuit complexity, the intrinsic workload of an application is approximated.

Some may argue that in practice memory operations are necessary and more expensive in time and energy consumption than some binary operations, and that it is somewhat unfair to exclude memory operations from the intrinsic work. In this work, the decision to weight memory operations as zero is made for two reasons. One is that in the constructed ideal combinational circuit, memory operations are redundant. The second reason is that the execution counts of memory operations extracted from LLVM IR may heavily depend on the code quality. Hence, we leave the memory issue as future work, to investigate whether there exists a universal weighting method for memory accesses that could improve the current normalization.
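To illustrate the weighting idea, the sketch below sums dynamic LLVM IR opcode counts multiplied by assumed transistor costs. The opcode list and the weights are illustrative placeholders rather than the transistor table actually used in Chapter 5, and in practice the counts would come from an instrumentation pass such as the one in Appendix A.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical per-opcode transistor weights for 32-bit operations. */
struct op_weight { const char *ir_op; long transistors; };

static const struct op_weight weights[] = {
    {"add",   850},  /* e.g. a 32-bit adder             */
    {"mul",  9000},  /* e.g. a 32-bit multiplier        */
    {"xor",   256},  /* 32 x 8-transistor XOR gates     */
    {"load",    0},  /* memory operations weighted zero */
    {"store",   0},
    {"br",      0},  /* control flow weighted zero      */
};

/* ops/counts: dynamic IR opcode names and their execution counts. */
long intrinsic_transistors(const char *ops[], const long counts[], int n) {
    long total = 0;
    for (int i = 0; i < n; i++)
        for (size_t j = 0; j < sizeof weights / sizeof weights[0]; j++)
            if (strcmp(ops[i], weights[j].ir_op) == 0)
                total += counts[i] * weights[j].transistors;
    return total;
}

int main(void) {
    const char *ops[]   = {"add", "mul", "load", "br"};
    const long counts[] = {1000000, 250000, 800000, 120000};
    printf("intrinsic transistors: %ld\n", intrinsic_transistors(ops, counts, 4));
    return 0;
}
```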


3.4.3 Summary of Normalization

In this section, two normalization approaches have been presented: normalize to dataset and normalize to baseline. Dataset normalization is relatively convenient to apply but requires a set of processors to create the dataset. The obtained flexibility value is then associated not only with the benchmarked application set but also with the other benchmarked processors. In contrast, baseline normalization can be conducted individually, and the quantified flexibility value is related only to the applied benchmark set. Between the two baseline normalization methods, it is essential that a common agreement is made on the baseline to which the dataset is normalized. The approach of normalizing to the intrinsic work of applications is promising, as with this approach the normalized data remains meaningful, i.e., normalized performance can be expressed as the amount of intrinsic work per second. The completion of more intrinsic work within a time unit indicates a better performance. Table 3.1 summarizes the different normalization methods that have been discussed in this chapter.

Table 3.1: Summary of normalization methods

Normalization method | Description | Pros | Cons
Min-max normalization | Reflects the ratio of (X_data − X_min) to (X_max − X_min). | Identical range [0, 1]. | May be distorted by extreme values.
Mean normalization | Reflects the ratio of (X_data − X_mean) to (X_max − X_min). | Identical range [-1, 1]. | May be distorted by extreme values.
Z-score normalization | Reflects how data deviates from the mean relative to the average deviation. | Identical mean 0 and standard deviation 1; less influence from extreme values. | Unbounded range.
Baseline normalization | Reflects the speedup of the data relative to the baseline. | Easy to apply; positive dataset. | Hard to determine the baseline.

Once the normalized dataset is obtained, flexibility can be quantified from it. As discussed, GSD with its multiplicative property is advisable when the dataset is expressed as ratios, speedups, or growth factors. Both normalization approaches, normalizing to dataset and to baseline, to some extent implicitly transform the data into percentages or ratios. An example for dataset normalization is min-max normalization, which expresses data as a fraction of X_max − X_min. By dividing the data by one reference, baseline normalization explicitly yields a speedup of the data relative to the baseline. Hence, theoretically speaking, GSD should be applied for both normalization approaches. Note, however, that GSD is only applicable to positive datasets. In the case of normalizing to dataset, the three approaches min-max, mean, and z-score normalization may transform the data into a non-positive interval, preventing the application of GSD. A solution that has been attempted to remove this limit is to shift all numbers in the dataset by |X_min| + δ, where X_min is the minimum value in the dataset and δ is an arbitrary positive number. The additional shift of δ ensures that this approach also works for datasets where X_min equals zero. Note that, to reduce the impact of δ on the original dataset, δ is chosen to be relatively small. In this manner, all normalized numbers in the dataset are guaranteed to be positive. However, shifting the normalized results does not maintain the original monotonicity when comparing data disparity with GSD. For instance, with GSD, processor A may be evaluated to be more flexible than processor B, while after shifting, processor B may be evaluated as more flexible than processor A. The other drawback is that this shifting approach renders the normalized data meaningless. Hence, GSD combined with baseline normalization is the preferred option.

Chapter 4

Experiment Setup

In this chapter, we describe the experiment setup used in this work, including the benchmark set, the platforms, and the simulators. By tuning compiler directives, techniques that may influence flexibility values are further investigated.

With the proposed flexibility metric, the flexibility of processors can be measured quantitatively. To compare the flexibility of diverse computer architectures and examine the common assumptions about flexibility, architectures including CPUs, GPUs, FPGAs, and DSPs are evaluated in this work. A desired property of a flexibility metric is that it is able to distinguish different architecture classes in terms of flexibility.

4.1 Benchmarks

To quantify flexibility for numerous processors, PolyBench/ACC is selected as a generic benchmark set, since it provides multi-language versions of the benchmarks, including C and CUDA [20]. In this work, 14 applications supported by multiple languages are benchmarked with the standard dataset. For domain-specific analysis, a more representative benchmark set can be selected. The involved benchmarks are described in Table 4.1 [20].

Table 4.1: Description of each benchmark in PolyBench

Benchmark | Description
2mm | 2 Matrix Multiplications (D=A.B; E=C.D)
3mm | 3 Matrix Multiplications (E=A.B; F=C.D; G=E.F)
adi | Alternating Direction Implicit solver
correlation | Correlation Computation
covariance | Covariance Computation
doitgen | Multiresolution analysis kernel (MADNESS)
fdtd-2d | 2-D Finite Difference Time Domain Kernel
gemm | Matrix-multiply C=alpha.A.B+beta.C
gramschmidt | Gram-Schmidt decomposition
jacobi-1D | 1-D Jacobi stencil computation
jacobi-2D | 2-D Jacobi stencil computation
lu | LU decomposition
syr2k | Symmetric rank-2k operations
syrk | Symmetric rank-k operations

4.2 Platforms

In this work, flexibility measurements are conducted on 24 platforms. Applications are directly executed on GPUs and CPUs. For FPGAs and DSPs, high-level synthesis (HLS) and simulators

are used to provide greater visibility into application behavior. Compilers, as a part of the target platforms, are tuned for maximum optimization to fully exploit the capability of the processors.

4.2.1 CPU

In total 10 CPUs are included: 6 Intel CPUs and 4 ARM CPUs. The goal is to distinguish and compare embedded and desktop CPUs in terms of flexibility. The benchmarks are compiled with gcc. Table 4.2 provides detailed information on the examined CPUs.

Table 4.2: Overview of CPUs in this work

Processor | ISA | Micro-architecture | #Threads | Compiler
i7-6700 | x86_64 | Skylake | 8 | gcc 4.8
i7-4770 | x86_64 | Haswell | 8 | gcc 4.8
i7-960 | x86_64 | Bloomfield | 8 | gcc 4.8
i7-950 | x86_64 | Bloomfield | 8 | gcc 4.8
i7-920 | x86_64 | Bloomfield | 8 | gcc 4.8
Pentium 4 | x86_64 | Northwood | 2 | gcc 4.8

Processor | ISA | System | #Threads | Compiler
Cortex A15 | ARMv7 | Nvidia JTK1 | 4 | gcc 4.8
Cortex A9 | ARMv7 | Odroid U3 | 4 | gcc 4.8
Cortex A53 | ARMv7 | RPi3 Model B | 4 | gcc 6.3
ARM1176 | ARMv6 | RPi1 Model B | 1 | gcc 6.3

4.2.2 GPU

Aiming at understanding the difference in flexibility between desktop and embedded GPUs, one embedded GPU (Tegra K1) and three desktop GPUs are evaluated in this work by compiling the PolyBench/ACC CUDA version of the applications. Note that the default standard dataset of the C and CUDA versions differs; the dataset size of the CUDA version is therefore modified to be equal to that of the C version, ensuring that the application workload is consistent over all platforms.

Table 4.3: Overview of GPUs

Processor | Chip | Architecture | #Cores | Compiler
Tegra K1 | GK20A | Kepler | 192 | nvcc 6.5
GTX 570 | GF110 | Fermi | 480 | nvcc 7.5
GTX TITAN | GK110 | Kepler | 2688 | nvcc 7.0
GTX 750 TI | GM107 | Maxwell | 640 | nvcc 7.0

4.2.3 FPGA

As PolyBench/ACC lacks support for hardware description languages, Vivado High-Level Synthesis (HLS) is applied to transform the C applications into register transfer level (RTL) implementations, which can be directly targeted to Xilinx programmable devices. Aiming at exploring the impact of the amount and the type of resources on flexibility, Xilinx FPGAs from different families and with different amounts of resources are included, e.g., Virtexuplus is an UltraScale+ version of the Virtex FPGA with more resources compared to the normal Virtex7. All simulations were performed with Vivado HLS v2018.2.

4.2.4 DSP

To examine the hypothesis that parallelism techniques like multithreading incur inflexibility, two multi-threaded Hexagon DSPs from Qualcomm and one single-threaded DSP from Texas Instruments (TI) are included. All the measurements are based on simulations.


Table 4.4: Overview of FPGAs

Family | Device | LUT | FF | DSP | BRAM
Artix7 | xc7a200t | 129000 | 269200 | 740 | 730
Kintex7 | xc7k480t | 597200 | 597200 | 1920 | 1910
Virtex7 | xc7v2000t | 1221600 | 2443200 | 2160 | 2584
Zynq | xc7z100 | 277400 | 554800 | 2020 | 1510
Virtexuplus | xcvu13p | 1728000 | 3456000 | 12288 | 5376
Kintexu | xcku115 | 663360 | 1326720 | 5520 | 4320
Zynquplus | xczu19eg | 522720 | 1045440 | 1968 | 1968

Table 4.5: Overview of DSPs

Processor | #Cores | L1I | L1D | L2 | Simulator
Hexagon V60 | 4 | 16K | 32K | 512K | Hexagon SDK
Hexagon V5 | 3 | 16K | 32K | 256K | Hexagon SDK
TI C6747 | 1 | 32K | 32K | 256K | CCSv4

The Hexagon V60 and V5 DSPs are simulated in the cycle-approximate mode provided by the Hexagon SDK, and Code Composer Studio v4 (CCSv4) provides cycle-accurate simulations for the TI C6747. Table 4.5 provides more details of these three DSPs. With the timing option enabled, hexagon-sim models caches, the optimal multi-threading mode, and processor stalls [21]. With timing disabled, caches are assumed to be perfectly accessed, stalls are excluded, and the Hexagon DSPs are simulated with a simplified multi-threading model.

4.3 Compiler Directives

It can be argued that not optimizing code gives a distorted image of reality, since programmers typically will spend some effort to manually optimize code for accelerators such as GPUs and FPGAs. As it is unfeasible to hand-optimize all benchmarks for each platform, and the code quality would depend heavily on the programmer, a compromise is found by inserting compiler directives. Without directives, compilers, as a part of the platforms whose flexibility is measured, can hardly exploit the maximum potential of the processors. Therefore, to further investigate the impact of applying compiler directives on flexibility, compiler directives are inserted for the CPUs, FPGAs, and DSPs. Regarding GPUs, some manual effort has already been made when implementing the CUDA versions of the benchmarks.

4.3.1 CPU

Multi-threading, which parallelizes tasks among multiple threads, is enabled by OpenMP directives, where threads run concurrently. With the runtime environment, tasks are allocated to different threads. In each kernel, the outermost loop is vectorized. In case dependencies between loops exist, the inner loop is parallelized instead.

4.3.2 FPGA

For FPGAs, directives are inserted to introduce more parallelism and pipelining, aiming at optimizing performance and increasing resource utilization.

4.3.3 DSP

To examine the hypothesis that best-effort and parallelism techniques, such as caches and multithreading, negatively impact flexibility, the Hexagon DSPs are simulated in two modes: timing-accurate and timing-inaccurate mode. In the accurate timing mode, Hexagon models caches, the optimal multithreading mode, and processor stalls. In the inaccurate mode, caches are assumed to be perfectly accessed, stalls are excluded, and a simplified multithreading model is simulated [21].

Chapter 5

Implementation

The implementations involved in benchmarking the different platforms are described. In addition, a practical approach based on LLVM IR to extract the intrinsic workload from arbitrary applications is introduced in this section.

5.1 CPU & GPU

A model implemented in Python is applied to automate benchmarking on GPUs and CPUs. Note that before operating via the Python interface, all benchmarks and Makefiles were already installed in a personal directory which can be accessed from multiple servers. The implementation of the Makefiles guarantees that the executable files generated during the compilation process are named differently based on the accessing server, aiming at avoiding collisions caused by multiple servers operating on the same file.

Figure 5.1: Design flow of Python implementation for CPUs and GPUs

In the first step, two task lists are constructed, one for the CPUs and one for the GPUs. Each task in the list is represented by its server address. Based on the task type and the server address, the threads created in step 2 first establish a connection to the server and then execute the corresponding commands on it. Because race conditions may be caused by multiple threads working concurrently, a barrier is created to block the operation that closes the connections to the servers: only after all benchmarks are completed are the connections closed. Step 3 targets data processing and analysis, involving extracting data from the .txt files, transforming the data into Excel files, and applying and comparing normalization methods. Eventually, the obtained results are plotted.
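A minimal Python sketch of this flow is given below. The server names, directory layout, and Makefile targets are placeholders for illustration only; the actual scripts used in this work are not reproduced here.

import subprocess
import threading

# Hypothetical server addresses and benchmark commands (placeholders).
cpu_servers = ["cpu-server-1", "cpu-server-2"]
gpu_servers = ["gpu-server-1"]
tasks = [("cpu", s) for s in cpu_servers] + [("gpu", s) for s in gpu_servers]

# Step 2: all worker threads wait here before connections are torn down.
barrier = threading.Barrier(len(tasks))

def run_benchmarks(task_type, server):
    # One remote invocation per platform; results are written to a file named
    # after the server to avoid collisions between concurrently running servers.
    cmd = f"cd ~/polybench && make run-{task_type} > result_{server}.txt"
    subprocess.run(["ssh", server, cmd], check=False)
    barrier.wait()   # close connections only after every benchmark has finished

threads = [threading.Thread(target=run_benchmarks, args=t) for t in tasks]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Step 3 (not shown): parse the result_*.txt files, normalize, and plot.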

5.2 FPGA

In Vivado HLS, after synthesizing a C function, a report is generated, providing performance metrics such as resource utilization, estimated clock period, loop latency, function latency in clock cycles, and so on.

The function latency indicates the number of cycles required to complete the function; similarly, the loop latency is given for each loop in the function. In case the design involves a loop with a variable bound, the number of iterations cannot be determined, and Vivado HLS reports the loop latency as unknown, as is the function latency. A solution is to utilize C/RTL co-simulation in Vivado HLS, which simulates functions at RTL level. However, it is extremely time-consuming and memory intensive. For instance, co-simulating an application that runs for 1 second on a single core of an i7-6700 processor may take more than one week to complete and may generate files larger than 100 GB. Therefore, in this work static loop analysis is applied to efficiently compute accurate cycle counts.

for (k = 0; k < m; k++)              // L1
    for (i = k+1; i < m; i++)        // L2
        for (j = k+1; j < m; j++)    // L3
            A[i][j] = A[i][k] * A[k][j];
Listing 5.1: An example loop with variable indexes

An example loop nest with variable indexes is shown in Listing 5.1, where the integer m is a constant greater than 0. To determine the latency of loop L1, the iteration counts of L2 and L3 have to be known. As can be observed, the iteration counts of L2 and L3 depend on the variable k that only varies in L1. Thus, it can be inferred:

\[ \#L2 = \sum_{i=1}^{m} (m - i) = \frac{1}{2} m(m-1) \tag{5.1} \]

\[ \#L3 = \sum_{i=1}^{m} (m - i)^2 = \frac{1}{6} m(m-1)(2m-1) \tag{5.2} \]

With the known loop iteration counts of L2 and L3, the latency of the outermost loop L1 can be determined.
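The closed forms in Eq. 5.1 and Eq. 5.2 can be cross-checked by simply executing the loop nest and counting iterations; the short Python sketch below (not part of the thesis tooling) does exactly that:

def brute_force_counts(m):
    """Count the iterations of L2 and L3 in Listing 5.1 by running the loop nest."""
    l2 = l3 = 0
    for k in range(m):                     # L1
        for i in range(k + 1, m):          # L2
            l2 += 1
            for j in range(k + 1, m):      # L3
                l3 += 1
    return l2, l3

def closed_form_counts(m):
    """Closed forms from Eq. 5.1 and Eq. 5.2."""
    return m * (m - 1) // 2, m * (m - 1) * (2 * m - 1) // 6

for m in (1, 4, 16, 64):
    assert brute_force_counts(m) == closed_form_counts(m)
print("Eq. 5.1 and Eq. 5.2 match the executed iteration counts.")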

Figure 5.2: Design flow of Vivado HLS simulation

By default, Vivado HLS simply translates C functions into Verilog designs; few optimizations are applied without directives. For instance, Vivado HLS keeps the loops in a C function rolled, meaning that one iteration of the loop is synthesized into a block of logic which the RTL design executes sequentially [22]. In this manner, FPGAs cannot fully exploit their massive parallelism, and merely a diminutive part of the resources is utilized. Hence, to further investigate the FPGA flexibility of optimized implementations, optimization directives are inserted to refine the implementations. Since each array is implemented as a block RAM by Vivado HLS, partitioning arrays is a prerequisite to separate one large memory into multiple small memories, which increases the number of simultaneous read and write ports. After array partitioning, PIPELINE directives are inserted above the innermost loops, which provides the best performance under area constraints, as Vivado HLS attempts to unroll all loops nested below a PIPELINE directive. Note that loops with variable bounds cannot be unrolled. Thus, for the example in Listing 5.1, L3 as the innermost loop will not be unrolled, and because of this the approach to determine the loop latency remains feasible. Only pipelining the operations inside L3 still results in a significant improvement in performance. To speed up the simulation process, tasks are parallelized over several remote servers via the university network. Servers are assigned to execute simulations simultaneously with different applications and array partition factors (APFs). With a higher APF, more concurrent memory accesses can be achieved, and thus more parallel computations are possible for some applications when data dependencies allow. Certainly, more hardware resources are required. With other predefined settings included in the tcl files, such as the clock period and the device type, Vivado HLS synthesizes the designs and generates synthesis reports, from which informative data is extracted.

5.3 DSP

Simulations of the Hexagon DSPs were again automated on remote servers. Based on the unique host name of each server, the Makefile assigns different simulation tasks to the servers by passing the application names and the DSP model to be simulated. To attain the final simulation results, the C applications are first compiled with the model flag by the compiler hexagon-clang. Next, the actual simulations start based on the generated files.

5.4 Intrinsic Workload Estimator

As discussed in Section 3.4.2, the intrinsic workload of applications is used as the baseline to normalize the secondary metrics of processors. In this section a practical approach is proposed, based on LLVM IR, to extract the intrinsic workload from arbitrary applications. In the compilation process, applications written in diverse languages are translated by front-ends into a common intermediate representation (IR). On the IR, optimization techniques can be applied generically, instead of compiling directly to a target architecture. After being optimized by the LLVM optimizer, back-ends translate the optimized IR into machine code based on the target ISA. Hence, it can be observed that LLVM IR is platform-independent.

Figure 5.3: Steps of extracting intrinsic workload as baseline

Figure 5.3 illustrates the steps to extract the intrinsic workload of applications. Applications are first compiled into LLVM IR, into which instrumentation code is inserted by the custom pass to call the runtime library. Execution counts of IR instructions are recorded by the runtime library. As loops are fully unrolled in the ideal combinational circuit, where only binary operations are considered, the overhead of loop counters introduced by conditional statements is excluded. Any potential parallelism in IR instructions, such as vectorization, is eliminated explicitly. Listing 5.2 shows an example of an IR instruction with a vector of elements, where two pairs of primitive double-precision operands are added. Instead of passing a vector as an argument, the runtime library is called two times, meaning that the fadd operation on double-precision data is counted as executed twice.

%1 = fadd <2 x double> %load5, %load9
Listing 5.2: An example IR instruction with a vector
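The sketch below is a toy Python illustration (not the actual LLVM pass) of how such a vector instruction is decomposed into per-lane scalar operations before being reported to the runtime library:

import re

def scalar_op_counts(ir_line):
    """Decompose a vectorized IR instruction into per-lane scalar operation counts."""
    counts = {}
    m = re.search(r"=\s*(\w+)\s*<(\d+)\s*x\s*(\w+)>", ir_line)
    if m:                                       # vectorized instruction
        op, lanes, elem_ty = m.group(1), int(m.group(2)), m.group(3)
        counts[(op, elem_ty)] = lanes           # one scalar count per vector lane
    return counts

print(scalar_op_counts("%1 = fadd <2 x double> %load5, %load9"))
# {('fadd', 'double'): 2}  -> the runtime library is invoked twice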


5.4.1 IR Interpreter

The IR execution counts of running a radix-2 FFT are captured in Figure 5.4. Under the Dynamic Instruction Count section, all the involved IR instructions, e.g., memory and binary operations, are listed with their corresponding execution counts. The second section, Binary Operation Count, specifically extracts all binary operations along with the bitwidth of the operated data. For example, the integer multiplication mul processes both 32-bit and 64-bit integers in this application. This feature provides more accurate information on all the binary operations in an application, improving the accuracy of estimating the application's workload. Three additional functionalities are implemented to provide more alternatives for use.

• Change the format of the output results to a more machine-readable form by setting the compilation flag -w2e=True.
• Eliminate the overhead of loop counters introduced by conditional statements by setting the compilation flag -eli=True.
• Only apply the IR interpreter to a certain function by passing the function name to the compilation flag -fNameNotR or -fNameInMain. The difference between these two options is described next.

Figure 5.4: Output results of a radix-2 FFT when N=2048

Design Flow

Figure 5.5a illustrates the design flow of the custom LLVM pass called libDynCountPass.so. The detailed C++ implementation of this pass is included in Appendix A. After translating an application to LLVM IR, an LLVM module is generated, which basically is a collection of global variables and functions. Functions contain basic blocks, and each basic block comprises one or more IR instructions.

Figure 5.5: (a) Design flow of the libDynCountPass.so; (b) Compilation steps

In the design flow, both actions, Insert BasicInstCall and Insert BenchmarkCall, operate on basic blocks. The action Insert BasicInstCall inserts a function call before each IR instruction with two arguments: the operation type and the data bitwidth. The action Insert BenchmarkCall is responsible for inserting the special function calls that signal the runtime library to start or stop functioning. The mechanism for eliminating the overhead introduced by loop counters is simple. Typically, a for loop represented in IR is composed of several basic blocks with dedicated names. When iterating through the basic blocks in a function, the pass checks whether the names of the basic blocks contain for.cond or for.inc. If so, the basic block is skipped. In this manner, the IR instructions in these basic blocks are not instrumented and thus not recorded. As mentioned before, the two compilation flags -fNameNotR and -fNameInMain can be used to specify the function name. The essential difference between these two approaches is the place where the special function calls that signal the runtime library are inserted, i.e., inside or outside of the MeasuredFunction.

MeasuredFunction() {
    call startBench();
    .....
    call stopBench();
    ret;
}
Listing 5.3: For non-recursive functions (-fNameNotR)

main() {
    call startBench();
    MeasuredFunction();
    call stopBench();
    .....
}
Listing 5.4: For functions in main (-fNameInMain)

Listing 5.3 shows the first approach, which is applicable to non-recursive functions. At runtime, when entering the MeasuredFunction, the runtime library is informed to start recording execution counts due to the activation of startBench. Similarly, stopBench is inserted before the return (ret) of the MeasuredFunction, aiming at signalling the runtime library to stop recording. This approach can be applied to a non-recursive function no matter where the function appears. However, because the signalling function calls are placed inside the MeasuredFunction, this approach fails to function correctly for recursive functions such as the FFT. To solve this issue, another approach, presented in Listing 5.4, is developed. Contrary to the first approach, it can only be applied to functions that are called directly from main. The way it works is simple: the special function calls startBench and stopBench are inserted above and below the call to the MeasuredFunction, respectively. In this manner, the runtime library is only signaled when the MeasuredFunction is called or returns in main. Note that the MeasuredFunction should be predefined with the noinline attribute, since compilers would otherwise inline this function when applying optimizations during compilation. Inlining replaces the function call site with the body of the called function. For example, when the MeasuredFunction in Listing 5.4 is inlined, the computations in this function are executed in main, and the implemented IR interpreter is incapable of recognizing the function when only the function name is passed as an argument to the compiler. Certainly, a possible solution to the inlining issue is to utilize special pragmas to manually define the scope in which the IR instructions should be recorded. In this work, this solution is not taken, as we prefer automating the extraction of the intrinsic workload by only passing the function name to the compiler, which is more convenient.

Usage

The compilation steps to apply the custom pass libDynCountPass.so are illustrated in Figure 5.5b. The first step is to compile the target application into a bitcode file with the LLVM compiler clang. Note that LLVM bitcode is a binary representation of the textual LLVM IR. Next, the LLVM optimizer and analyzer opt loads the pass libDynCountPass.so and inserts the instrumentation code into the bitcode file. The instrumented bitcode file is then transformed into an object file by the LLVM static compiler llc. Eventually, the linker cc links the object file with the runtime library and produces the final executable.
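The following Python sketch drives these steps with subprocess calls. The clang/opt/llc/cc invocations are standard tools, but the opt option that enables the custom pass ("-dyncount" below) and the runtime library object file are assumed names used only for illustration:

import subprocess

def instrument(src="app.c", runtime="runtime.o", pass_lib="./libDynCountPass.so"):
    run = lambda cmd: subprocess.run(cmd, check=True)
    run(["clang", "-O1", "-emit-llvm", "-c", src, "-o", "app.bc"])            # C -> bitcode
    run(["opt", "-load", pass_lib, "-dyncount", "app.bc", "-o", "app_i.bc"])  # insert calls (pass name assumed)
    run(["llc", "-filetype=obj", "app_i.bc", "-o", "app_i.o"])                # bitcode -> object file
    run(["cc", "app_i.o", runtime, "-o", "app_instrumented"])                 # link with the runtime library

if __name__ == "__main__":
    instrument()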


Validation

The validity of the IR interpreter is checked with both a non-recursive and a recursive function. The 2D matrix multiplication is an example of a non-recursive function, where the multiplications are conducted between two N x N matrices. Hence, to examine the validity for non-recursive functions, the extracted execution counts for the operations mul and add should meet the relation #(mul) = #(add) = N². For a length-N radix-2 FFT, the execution counts of operations such as fmul, fsub, and fadd should be a constant factor of its complexity N log2 N, where N is a power of 2. Figure 5.6 illustrates the execution counts of the operations mul and add in the 2D matrix multiplication for varying N. Regardless of the value of N, the relation #(mul) = #(add) always holds, and a trend line N² seems to be perfectly followed by #(mul) and #(add), i.e., #(mul) = #(add) = N² holds. The relations between the execution counts of the operations and the FFT block size N are presented in Figure 5.7. Similarly, with varying N the execution counts of the operations fmul, fsub, and fadd follow the same trend as the complexity of the radix-2 FFT, i.e., #(fmul) = 4N log2 N and #(fadd/fsub) = 4N log2 N. That is to say, one complex radix-2 butterfly requires 4 multiplications and 4 additions/subtractions, which accords with the description in Section 2.2.3. To conclude, the IR interpreter is shown to function correctly for both non-recursive and recursive applications.

Figure 5.6: The relation between operation execution counts and N in the 2D matrix multiplication

Figure 5.7: The relation between operation execution counts and N in the Radix-2 FFT
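A check of this kind boils down to testing whether the measured counts are a roughly constant multiple of the expected complexity. The Python sketch below shows one way to express that test; the listed counts are placeholders, not the measurements behind Figures 5.6 and 5.7:

import math

def follows_trend(sizes, counts, complexity, rel_tol=0.05):
    """True if counts[i] / complexity(sizes[i]) is roughly constant over all sizes."""
    ratios = [c / complexity(n) for n, c in zip(sizes, counts)]
    mean = sum(ratios) / len(ratios)
    return all(abs(r - mean) / mean < rel_tol for r in ratios)

# Placeholder counts as the IR interpreter might report them for a radix-2 FFT:
sizes = [256, 512, 1024, 2048]
fmul_counts = [4 * n * math.log2(n) for n in sizes]

print(follows_trend(sizes, fmul_counts, lambda n: n * math.log2(n)))   # True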


5.4.2 Intrinsic Transistors

With the IR interpreter, the execution counts of IR instructions can be recorded dynamically when running the executable compiled from the instrumented IR. To weight IR instructions based on their most intrinsic work, they are first broken down into 2-input-1-output and 1-input-1-output gates. In theory, the workload of all these basic gates is equivalent. However, in a CMOS implementation some gates are more complex than others. Therefore, the minimum number of transistors needed to construct a gate in CMOS is used as a practical measure.

%mul = mul i8 %1, %2 Listing 5.5: An example IR instruction of 8-bit integer multiplication

The first step to obtain the most intrinsic transistors is breaking IR instructions into gates. Two approaches are considered in this project. Taking the IR instruction shown in Listing 5.5 as an example, a combinational multiplier can be constructed, based on which a formula to estimate the gate count can be derived. Another approach is to apply an RTL compiler to synthesize the 8-bit integer multiplication written in Verilog. In this project, we first derive the formulas to estimate the gate count for each involved binary operation, and then compare the estimated gate counts to the results of RTL synthesis, aiming at examining the reliability of both the derived and the synthesized results.

Figure 5.8: Interconnections for an 8x8 combinational multiplier [23]

To explain the steps of the gate count estimation, here we still use the 8-bit integer multiplication as an example. Based on the shift-add multiplication algorithm, Figure 5.8 visualizes the interconnections of an 8x8 combinational multiplier for two unsigned integers.


To be more precise, the multiplicand X and multiplier Y are defined as $X = x_7x_6x_5x_4x_3x_2x_1x_0$ and $Y = y_7y_6y_5y_4y_3y_2y_1y_0$, respectively. Each rectangular box labeled $y_ix_i$ indicates the logic AND of $y_i$ and $x_i$, and each "+" box represents a full adder. Connecting the carries of the full adders in each row forms an 8-bit ripple adder [23]. In this manner, a formula can be derived to estimate the gate count G for this 8-bit multiplication as follows:

\[
\begin{aligned}
G(8\text{-bit unsigned MUL}) &= G(\mathrm{AND}) \cdot \#(\mathrm{AND}) + G(\mathrm{RA}_{8\text{-bit}}) \cdot \#(\mathrm{RA}_{8\text{-bit}}) \\
&= \#(\mathrm{AND}) + 8 \cdot G(\mathrm{FA}) \cdot \#(\mathrm{RA}_{8\text{-bit}}) \\
&= \#(\mathrm{AND}) + 40 \cdot \#(\mathrm{RA}_{8\text{-bit}}) \\
&= 8^2 + 40 \cdot 7 = 344
\end{aligned} \tag{5.3}
\]

In the case of an n-bit unsigned multiplication, it can be further inferred that:

\[
\begin{aligned}
G(n\text{-bit unsigned MUL}) &= G(\mathrm{AND}) \cdot \#(\mathrm{AND}) + G(\mathrm{RA}_{n\text{-bit}}) \cdot \#(\mathrm{RA}_{n\text{-bit}}) \\
&= n^2 + 5n \cdot (n-1) \\
&= 6n^2 - 5n
\end{aligned} \tag{5.4}
\]

Note that RA and FA above denote the ripple adder and the full adder, respectively. Equation 5.4 is a generic function that takes a bitwidth n as input and outputs the estimated gate count for an n-bit unsigned multiplication. Using the same estimation approach, formulas for other operations, including floating-point operations, are listed in Table 5.1.

Table 5.1: Derived formulas to estimate gate count for binary operations

Operation | Formula | Applied algorithm
Unsigned ADD/SUB¹ | 5n | n-bit ripple-carry adder
Unsigned MUL¹ | 6n² − 5n | shift-add multiplier (Fig. 5.8)
Unsigned DIV¹ | 6n² | non-restoring array divider
Floating-point MUL² | 6s² − 9s + 15e + 4 | including an (s−1)-bit unsigned multiplication
Floating-point DIV² | 24s² − 40s + 15e + 17 | including a 2(s−1)-bit unsigned division

¹ n is the bitwidth of the operated data.
² s and e represent the number of significant bits and exponent bits defined by IEEE 754 for floating-point arithmetic.
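The formulas in Table 5.1 are straightforward to encode; the Python sketch below implements them directly (the final print statement is only an illustrative invocation, not a reference result):

# Direct encoding of the gate-count estimation formulas from Table 5.1.
def gates_unsigned_add(n):      # n-bit ripple-carry adder
    return 5 * n

def gates_unsigned_mul(n):      # shift-add multiplier, Eq. 5.4
    return 6 * n * n - 5 * n

def gates_unsigned_div(n):      # non-restoring array divider
    return 6 * n * n

def gates_fp_mul(s, e):         # s significand bits, e exponent bits
    return 6 * s * s - 9 * s + 15 * e + 4

def gates_fp_div(s, e):
    return 24 * s * s - 40 * s + 15 * e + 17

assert gates_unsigned_mul(8) == 344        # agrees with the worked example in Eq. 5.3
print(gates_unsigned_add(32), gates_fp_mul(24, 8))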

After proposing the formulas to estimate the gate counts for the diverse binary operations, the Cadence RTL compiler synthesizes the operations written in Verilog. A Verilog example of an 8-bit unsigned multiplication is shown in Listing 5.6 [24]. Note that only basic gates with fewer than 3 inputs are employed in synthesis, i.e., attributes are set to allow only the 2-input gates AND, NAND, OR, NOR, XOR, and XNOR, and the 1-input INV.

module unsigned_multiply (y, a, b);
    parameter wA = 8, wB = 8;
    input  [wA-1:0] a;
    input  [wB-1:0] b;
    output [wA+wB-1:0] y;
    assign y = a * b;
endmodule
Listing 5.6: An 8-bit unsigned integer multiplication in Verilog

Figure 5.9 shows the comparison between the estimated and the synthesized gate counts of the binary operations.

The gate estimations of the unsigned addition, subtraction, floating-point multiplication and division give quite similar results as the synthesis, with less than 10% deviation. However, for both the unsigned multiplication and division, the established formulas overestimate the number of gates actually required in synthesis. With increasing operand bitwidth, the overestimation of the unsigned multiplication tends to increase, from -6% to 12.9%, eventually reaching 19.7% for 64 bits. In contrast, the overestimation of the unsigned division decreases from 40% to 34%. The overestimation can be explained by the fact that the actual Cadence implementations use more efficient designs than a series of subtractors or adders to perform divisions and multiplications. Overall, the formulas derived from the combinational designs sufficiently validate the synthesis results. Appendix B lists the Verilog code used for synthesis. With the synthesis results, the diverse gates are further weighted by the minimum number of transistors required for each basic CMOS logic gate. Table 5.2 lists the weights of 7 basic gates. More details, including the weights of IR instructions in units of gates, transistors, and logic depth, are summarized in Table 5.3.

Table 5.2: Basic gates weighted in number of transistors

Gate   | AND | NAND | OR | NOR | XOR | XNOR | INV
Trans. |  4  |  4   |  4 |  4  |  8  |  8   |  2

Table 5.3: IR instructions weighted in gate, transistor and logic depth

IR | Gate (32-bit) | Trans. (32-bit) | Depth (32-bit) | Gate (64-bit) | Trans. (64-bit) | Depth (64-bit)
add/sub¹ | 188 | 880 | 63 | 380 | 1776 | 127
fadd/fsub | 1905 | 8086 | 103 | 3494 | 14882 | 233
mul | 5164 | 25458 | 130 | 20100 | 98476 | 313
fmul | 4327 | 20490 | 97 | 16124 | 77246 | 215
udiv | 4486 | 10763 | 1128 | 18217 | 44752 | 4350
sdiv | 4777 | 21300 | 1189 | 18840 | 84616 | 4458
fdiv | 12190 | 54146 | 992 | 66565 | 303284 | 2295
urem | 4616 | 20426 | 1168 | 18452 | 82524 | 4440
srem | 4882 | 21842 | 1228 | 19053 | 85682 | 4552
and | 32 | 128 | 1 | 64 | 256 | 1
or | 32 | 128 | 1 | 64 | 256 | 1
xor | 32 | 256 | 1 | 64 | 512 | 1

¹ 16-bit add/sub operation: #(gate)=92, #(trans.)=432, #(depth)=31; 8-bit add/sub operation: #(gate)=44, #(trans.)=208, #(depth)=15; 1-bit add/sub operation: #(gate)=2, #(trans.)=12, #(depth)=1.
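To show how these weights are used, the Python sketch below combines a few of the 32-bit transistor weights from Table 5.3 with dynamic execution counts, as reported by the IR interpreter, to obtain the intrinsic-transistor total #(IT); the example counts are hypothetical:

# A few per-instruction transistor weights copied from the 32-bit columns of Table 5.3.
TRANSISTOR_WEIGHT_32 = {
    "add": 880, "sub": 880, "fadd": 8086, "fsub": 8086,
    "mul": 25458, "fmul": 20490, "fdiv": 54146,
}

def intrinsic_transistors(execution_counts, weights=TRANSISTOR_WEIGHT_32):
    """Weight dynamic IR execution counts by their per-operation transistor cost."""
    return sum(weights[op] * n for op, n in execution_counts.items())

# Hypothetical counts as the IR interpreter might report them:
counts = {"fmul": 1_000_000, "fadd": 1_000_000, "add": 3_000_000}
print(intrinsic_transistors(counts))    # total intrinsic transistors, #(IT)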


Figure 5.9: The gate count comparison between estimated and simulated results

Chapter 6

Methodologies

In this section, the involved metrics are clarified: how each metric is quantified and what each metric means are described in detail. Note that a new metric, parallelism, is introduced in this section. In addition, the approaches applied to approximate energy consumption and area usage are explained.

6.1 Flexibility

As defined in Section 3, processor flexibility is determined relative to other metrics, such as performance, energy and area efficiency; i.e., applying the flexibility measure to normalized performance based on execution time (ET) gives a performance-flexibility value. Benefiting from the inherent property of the flexibility measure of being invariant to multiplicative relations, measuring flexibility in terms of energy and area efficiency gives the same flexibility value. Since TDP and transistor count are per-processor properties, independent of the executed applications, it can be inferred:

\[
\begin{aligned}
\text{Flexibility}_{\mathrm{perf}} &= GSD\!\left(\frac{\#(IT)}{ET}\right) \\
&\overset{\text{Lemma 3.2.1}}{=} GSD\!\left(\frac{\#(IT)}{ET \cdot Power}\right) = \text{Flexibility}_{\mathrm{energy\ efficiency}} \\
&\overset{\text{Lemma 3.2.1}}{=} GSD\!\left(\frac{\#(IT)}{ET \cdot Area}\right) = \text{Flexibility}_{\mathrm{area\ efficiency}}
\end{aligned} \tag{6.1}
\]
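The multiplicative invariance behind Eq. 6.1 is easy to verify numerically. In the minimal Python sketch below (with hypothetical per-benchmark values and per-processor constants), dividing every benchmark score by the same TDP or transistor count leaves the GSD unchanged:

import math

def gsd(values):
    """Geometric standard deviation: exp of the std. dev. of ln(x)."""
    logs = [math.log(v) for v in values]
    mu = sum(logs) / len(logs)
    return math.exp(math.sqrt(sum((l - mu) ** 2 for l in logs) / len(logs)))

# Hypothetical normalized performance #(IT)/ET of one processor over 4 benchmarks:
perf = [2.0e9, 8.0e9, 1.5e9, 6.0e9]
power, area = 65.0, 1.75e9      # per-processor constants (TDP in W, transistor count)

energy_eff = [p / power for p in perf]
area_eff = [p / area for p in perf]

print(gsd(perf), gsd(energy_eff), gsd(area_eff))   # three identical values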

6.2 Secondary Metrics

The metrics normalized performance, energy efficiency, and area efficiency are the secondary metrics relative to which flexibility is measured. The GM is used to average these metrics, as multiplicative relations are introduced implicitly in their definitions.

6.2.1 Performance

Normalizing the execution time to the estimated intrinsic workload gives the normalized performance. After normalization, performance is expressed as the amount of intrinsic transistors per second. A processor that can complete more intrinsic transistors per second is considered to have better performance.

\[ \text{Performance} = \frac{\#(IT)}{ET} \tag{6.2} \]


6.2.2 Energy Efficiency

Expressed as the amount of intrinsic transistors per joule, energy efficiency is defined as the quotient of performance and power. The more intrinsic transistors that can be completed per joule of energy, the better the energy efficiency.

\[ \text{Energy Efficiency} = \frac{\#(IT)}{ET \cdot Power} \tag{6.3} \]

6.2.3 Area Efficiency

Area is quantified as the number of transistors implemented on a processor, which is independent of the process technology. In this manner, area efficiency is formulated as performance divided by the total number of actual transistors (AT), yielding a transistor utilization rate per second. If more intrinsic transistors can be completed with a given number of transistors, or fewer actual transistors are required to maintain the same performance, the area efficiency is better.

\[ \text{Area Efficiency} = \frac{\#(IT)}{ET \cdot \#(AT)} \tag{6.4} \]

6.3 Parallelism

Parallelism in this context is defined as the intrinsic workload divided by the total number of cycles, i.e., the amount of intrinsic transistors per cycle. The more intrinsic transistors that can be accomplished within one cycle, the more parallelism at the transistor level is achieved.

\[ \text{Parallelism} = \frac{\#(IT)}{Cycles} \tag{6.5} \]
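For reference, the Python sketch below computes the four secondary metrics of Eqs. 6.2-6.5 for one platform and one benchmark; all input numbers are hypothetical placeholders:

def secondary_metrics(it, exec_time_s, power_w, actual_transistors, cycles):
    """Secondary metrics of Eqs. 6.2-6.5 from intrinsic transistors (IT) and raw measurements."""
    performance = it / exec_time_s                        # intrinsic transistors per second  (6.2)
    energy_eff = it / (exec_time_s * power_w)             # intrinsic transistors per joule   (6.3)
    area_eff = it / (exec_time_s * actual_transistors)    # transistor utilization per second (6.4)
    parallelism = it / cycles                             # intrinsic transistors per cycle   (6.5)
    return performance, energy_eff, area_eff, parallelism

print(secondary_metrics(it=3.2e13, exec_time_s=0.8, power_w=64.0,
                        actual_transistors=1.75e9, cycles=2.7e9))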

6.4 Approximation

Measuring processors in terms of performance, energy and area efficiency remains challenging, as it is difficult to unify experimental measurements on processors across different technologies. Besides, processors like the FPGAs and DSPs in this work can only be assessed via simulations. Hence, the estimation of power consumption and area usage is conducted by utilizing existing information and tools, as provided in Table 6.1 and Table 6.2. Note that Table 6.1 lists transistor counts and the thermal design power (TDP), which represents the average power dissipated across all cores when the processor operates at its base frequency [25]. When calculating energy and area efficiency, the same TDP value and transistor count are used for both single-threaded and multi-threaded mode. As the whole multicore processor is considered as one system, utilizing only a single thread in this system leaves the other threads idle, resulting in energy and area overhead. Similarly, the transistor count of an FPGA is used for both the native implementation and the optimized version.


Table 6.1: Related information of processors in this work

No. | Processor | Freq. (MHz) | #Trans. (million) | TDP (watt) | Node (nm) | Ref.
1 | GTX TITAN | 837 | 7100 | 250 | 28 | [26][27]
2 | GTX 570 | 732 | 3000 | 219 | 40 | [26]
3 | GTX 750 TI | 1020 | 1870 | 60 | 28 | [26]
4 | Tegra K1 | 756 | -¹ | 14 | 28 | [26][28]
5 | Pentium 4 | 3400 | 169 | 84 | 90 | [25]
6 | i7-920 | 2670 | 731 | 130 | 45 | [25]
7 | i7-950 | 3070 | 731 | 130 | 45 | [25]
8 | i7-960 | 3200 | 731 | 130 | 45 | [25]
9 | i7-4770 | 3400 | 1400 | 84 | 22 | [25][29]
10 | i7-6700 | 3400 | 1750 | 64 | 14 | [25][30]
11 | ARM1176 | 700 | - | 2.9 | 40 | [31]
12 | Cortex A9 | 1700 | - | 4 | 32 | [32]
13 | Cortex A15 | 2300 | - | 5 | 23 | [33][34]
14 | Cortex A53 | 1200 | - | 4.4 | 40 | [31]
15 | TI C6747 | 300 | 22² | 0.45³ | 65 | [35][36]
16,19 | Hexagon V5 | 650 | - | - | 28 | [37]
17,18 | Hexagon V60 | 2000 | - | - | 14 | [38]

¹ "-" means no information available.
² Speculated based on TI C66x DSPs [35].
³ Estimated by TI Power Estimation Spreadsheet 2013.3.

Table 6.2: Related information of FPGAs in this work

Processor | Freq. (MHz)¹ no opt. | Freq. (MHz)¹ opt. | #Trans.² (million) | Power (watt)³ no opt. | Power (watt)³ opt. | Node (nm)
Artix7 | 116 | 102 | 1025 | 1.4 | 2 | 28
Kintex7 | 120 | 103 | 2370 | 1.7 | 2.5 | 28
Virtex7 | 118 | 105 | 9700 | 2.3 | 3 | 28
Zynq | 118 | 96 | 2200 | 1.7 | 2.6 | 28
Virtexuplus | 118 | 104 | 14000 | 4.1 | 5.5 | 16
Kintexu | 120 | 101 | 5300 | 2.1 | 3.4 | 16
Zynquplus | 117 | 103 | 4100 | 2.1 | 2.8 | 16

¹ Determined by the average estimated clock period computed by Vivado HLS when the target clock period is 10 ns.
² Speculated based on a published value of the Virtex UltraScale XCVU440 [39].
³ Estimated by Xilinx Power Estimator (XPE) 2018.2.2 based on the average design utilization of the benchmark set.

Chapter 7

Results

In this section the flexibility results are analyzed, along with the relation between flexibility and the secondary metrics among GPUs, CPUs, FPGAs, and DSPs. Next, several promising graphs are presented to further investigate implicit relations between other metrics, such as energy efficiency and area efficiency. All detailed results are included in Appendix C.

Figure 7.1: Scatter matrix of metrics to explore the implicit relations. All results are normalized to 14 nm process technology.

A scatter matrix is provided to review the relations among the diverse metrics by observing the pairwise distributions, as shown in Figure 7.1. The diagonal, where each metric is plotted against itself, represents the kernel density estimation of each architecture class, indicating the underlying distribution.

Take the top-left flexibility-flexibility graph as an example: as the red curve suggests, a multi-threaded CPU is most likely to be seen near 0.35 when assessing its flexibility. Note that except for flexibility, the absolute values of the other metrics are presented on a log scale, e.g., 14 actually denotes 10^14. The relations between flexibility and the other secondary metrics are captured in the first column of Figure 7.1. In general, intuitive inverse relationships can be observed between flexibility on the one hand and performance, energy efficiency, and parallelism on the other, where GPUs have superior performance, energy efficiency, and the highest parallelism, but also the least flexibility. In contrast, FPGAs are the most flexible while being the least energy and area efficient. These results confirm the intuitive trade-offs between performance, energy efficiency, and flexibility. However, regarding flexibility and area efficiency, only a slight declining trend can be observed among GPUs, CPUs, and DSPs, whereas a large gap exists between FPGAs and the other processors when ordering by area efficiency. Hence, the tradeoff between flexibility and area efficiency is less explicit than for performance and energy efficiency. Next, more detailed discussions revolve around flexibility. The goal is to analyze how the architecture classes perform in terms of flexibility, and to find the underlying reasons for the observed flexibility differences.

7.1 Flexibility Analysis

As can be observed from Figure 7.2, with flexibility on the horizontal axis, the processors form clusters based on their architectures, which validates that the proposed metric sufficiently distinguishes different architecture classes in terms of flexibility.

(a) Performance and flexibility

(b) Energy efficiency and flexibility (c) Area efficiency and flexibility

Figure 7.2: Visualizing the implicit relations between flexibility and other secondary metrics


7.1.1 Native Flexibility Results

Without manual code optimizations, the architecture classes sorted from the least to the most flexible are: GPUs, CPUs, DSPs, and FPGAs.

GPU

GPUs have superior performance and energy efficiency. As might be expected, this enormous performance advantage comes at the expense of flexibility, which implicitly reveals that GPUs with their SIMD architecture can deliver superior performance by being specialized for massively parallel processing, while being incapable of supporting sequential applications equally well. Only 192 CUDA cores are employed in the embedded GPU Tegra K1, 14x fewer than in the GTX TITAN. Combined with a lower clock speed and 3x smaller on-chip memories, this results in nearly 38x higher performance of the GTX TITAN over the Tegra K1, but only a 2x difference in energy efficiency. However, when comparing flexibility, the embedded GPU is measured to be more flexible than all desktop GPUs, among which the GTX TITAN, having the most cores, is the least flexible. Therefore it can be extrapolated that employing more CUDA cores to achieve more parallelism incurs an increasing dispersion among applications. That is to say, it enlarges the difference between the applications that can fully benefit from more parallelism and the applications that can hardly benefit, and therefore flexibility decreases.

CPU

The CPU, as a typical representative of general-purpose processors, is considered by some to be the processor with the greatest flexibility [4]. However, in this work, both DSPs and FPGAs with native implementations are evaluated to be more flexible than CPUs based on the proposed metric. Contrary to popular belief, these results are logical if we consider the fact that an increasing number of advanced techniques have been employed in modern CPUs to improve the average computation speed, such as advanced branch predictors and deep pipelines. Inevitably, system behavior may vary significantly with the increase in architectural complexity, making the processor less flexible. When comparing flexibility among CPUs, embedded CPUs (ARM) and desktop CPUs (Intel) form two distinct green clusters. Compared to the embedded CPUs, the desktop CPUs with higher performance are measured to be more flexible on average. Amongst all benchmarked Intel CPUs, the i7-4770 with Haswell (4th generation) and the i7-6700 with Skylake (6th generation) seem to be more flexible than the outdated Pentium 4 with NetBurst and most processors with the Nehalem micro-architecture (1st generation). To some extent, this indicates that with upgraded architectures, Intel CPUs tend to improve over time in terms of flexibility.

DSP

VLIW DSPs, as domain-specific processors, are evaluated to be more flexible than general-purpose CPUs on average, which is somewhat expected. Besides the fact that inflexibility emerges in modern CPUs, DSPs have also become more than just "dedicated processors with multiply-accumulate". Traditional DSPs are designed with specialized functionalities, dedicated memory paths, and few registers. Contrary to traditional DSPs, well-connected functional units, orthogonal datapaths, and a large number of registers are employed in VLIW DSPs [40]. Therefore, DSPs with the VLIW architecture are gradually mutating toward GPPs. Another interesting observation is that the TI C6747 DSP is less flexible than the Hexagon DSPs. This result contradicts the hypothesis that the introduction of multi-threading degrades flexibility because it fails to benefit all applications equally. The speedup calculation of the TI DSP relative to other processors shows that the TI DSP performs extremely poorly on the application that implements the alternating direction implicit solver, where extensive floating-point divisions (FDIV) are performed. If this application is excluded, the TI DSP gains 40% in flexibility, resulting in higher flexibility than the two Hexagon DSPs and only slightly lower flexibility than the FPGAs. The results obtained by micro-benchmarking the TI DSP with CCSv4 show that nearly 9 · 10³ cycles are taken to simulate a single double-precision FDIV.

The reason behind this is that the default CCS compiler does not automatically invoke the fast FDIV function from the runtime library when compiling the FDIV operation represented by "/" in C. Only 367 cycles per FDIV are required when the runtime library is used [41]. However, since compilers are a part of the processors to be evaluated, lacking sufficient support for FDIV within the involved application domain leads to inflexibility. Furthermore, in case all operations can be supported equally by all DSPs, single-threading delivers higher flexibility compared to multi-threading.

FPGA

FPGAs with native implementations perform just as poorly as DSPs, while being measured as the most flexible. The reason is intuitive: FPGAs offer reconfigurable logic components which can be reloaded with new configurations for arbitrary applications. Besides, without manual work, the current HLS compiler, as a part of the processor, barely optimizes the native designs, and therefore cannot fully exploit the available resources to achieve massive parallelism; e.g., the average resource utilization rates of the UltraScale+ FPGA boards are only about 1%. In this manner, the unoptimized implementations can be supported by all seven FPGAs without resource constraints, resulting in a compact cluster of FPGAs regarding flexibility, as shown in the performance-flexibility graph of Figure 7.2.

7.1.2 Compiler Directives

As processors cannot be fully exploited with default compilation, some compiler directives are inserted to investigate the resulting changes in flexibility.

CPU

Shifts in flexibility are introduced when activating multi-threaded processing, which in general contributes to modest improvements in performance, energy, and area efficiency, regardless of embedded or desktop CPUs. How much can be gained from multi-threading depends on the number of available hardware threads and the dependencies in the applications. For instance, on an Intel i7-960 with 8 threads, a nearly 6x performance improvement is achieved for matrix multiplications. On the contrary, multi-threaded processing causes resource contention and cannot utilize the hardware efficiently for applications without sufficient task-level parallelism. Additionally, as observed from the performance-flexibility graph, the horizontal shifts of the 4-core processors i7-6700 and i7-4770 are approximately 3x-4x larger than the shifts of the processors that have the same number of cores but the Nehalem micro-architecture. On the other hand, comparing the Pentium 4 to the other processors with 8 threads, the outdated Pentium 4 suffers a smaller flexibility penalty in multi-threading, along with the smallest performance improvement; the Pentium 4, with a single core, can run two threads due to its hyper-threading technique. Overall, this confirms that executing in multi-threading mode degrades flexibility because not all applications can benefit equally from multi-threading. The decline in flexibility is influenced by both the architectural design and the number of available threads: processors with fewer hardware threads tend to suffer less in terms of flexibility when processing in multi-threading mode.

DSP

To verify the hypothesis that best-effort designs such as a cache hierarchy affect system flexibility negatively, the two Hexagon DSPs simulated with perfect caches, no system stalls, and a simplified multi-threading model are included, represented by points 18 and 19 in Figure 7.2a. The constructed ideal DSPs are more flexible than the FPGAs without optimizations, verifying the hypothesis.

FPGA

For the FPGAs, directives are applied to explore the impact of the number and the type of resources on flexibility. Compared to the native FPGA implementations, applying optimization directives contributes only 3x-10x improvements in performance, energy, and area efficiency, while the flexibility sharply declines.


Exploiting more parallelism typically requires more resources. A large variation among applications is introduced when some applications can fully utilize the resources to parallelize computations while other applications cannot. Failing to utilize resources may result from requiring more resources than are available, or from being limited by dependencies in the applications. Besides, the majority of resource limits most likely occur in the DSP slices. Take the symmetric rank-k update as an example: with directives, more than 3000 DSP slices are required, contributing to a more than 130x performance improvement. The heavy demand for DSP slices can only be satisfied by the Virtexuplus and the Kintexu; note that relatively many DSP slices are present in the Kintexu, as it is designed specially for signal processing. The large number of DSP slices benefiting only a few applications exacerbates the disparity, reducing flexibility. Overall, if the data dependencies in the applications allow, significant performance improvements can be achieved by utilizing more resources. In this manner, FPGAs with relatively more resources have higher performance and area/energy efficiency than FPGAs with fewer resources. However, inflexibility is incurred as not all applications can be highly parallelized and thus cannot benefit equally.

7.2 Other results

Figure 7.1 includes several interesting graphs that reveal the underlying relations among different metrics. Amongst them, two graphs deserve some discussion and will be further analyzed.

7.2.1 Performance vs Parallelism

Figure 7.3: Visualizing the implicit relation between performance and parallelism

Figure 7.3 illustrates the relation between performance and parallelism, where an intuitive linear relation can be observed when log-scaling these two metrics. More parallelism leads to higher performance. GPUs with SIMD architectures achieve the most parallelism. Compared to single-threaded CPUs, more parallelism and higher performance are observed for multi-threaded CPUs, which is also expected. The clusters of FPGAs seem to follow a linear trend indicated by Line 2, whilst Line 1 is constructed to represent the general trend of the Intel CPUs. As observed in Figure 7.3, Line 2 appears to be Line 1 shifted downwards. To explain the reason behind these two parallel trend lines, a short mathematical justification is given.

\[
\text{Parallelism} = GM\!\left(\frac{\#(IT)}{ET \cdot f}\right) = \left(\prod_{i=1}^{n} \frac{\#(IT)_i}{ET_i \cdot f}\right)^{\!1/n} = \frac{1}{f} \cdot \left(\prod_{i=1}^{n} \frac{\#(IT)_i}{ET_i}\right)^{\!1/n} = \frac{1}{f} \cdot \text{Performance} \tag{7.1}
\]

where n is the total number of applications and f represents the processor frequency. As implied by Equation 7.1, for each processor its average performance can be expressed as its frequency multiplied by its parallelism.

The next step is to log-scale both parallelism and performance; thus it can be inferred:

\[ \lg(\text{Performance}) = \lg(f) + \lg(\text{Parallelism}) \tag{7.2} \]

where lg(Performance) and lg(Parallelism) represent the Y axis (Performance) and the X axis (Parallelism), respectively, in Figure 7.3. After log scaling, the performance Y and parallelism X of each processor can be formulated as in Equation 7.2. Therefore, it can be extrapolated that, for each processor, after log scaling its performance is proportional to its parallelism and satisfies the linear relation Y = c + X, where the constant c is determined by its frequency. In this manner, frequency is the main factor that influences the spread of each architecture class in terms of parallelism. GPUs seem to sacrifice some clock speed in order to achieve massive parallelism. FPGAs, designed as reconfigurable devices, can hardly execute at high frequency. Hence, to achieve the same magnitude of performance as other processors, more parallelism is required for FPGAs executing at relatively low frequencies.

7.2.2 Area Efficiency vs Energy Efficiency

Figure 7.4: Visualizing the implicit relation between area and energy efficiency

Intuitively, a proportional relation between area and energy efficiency can be observed in Figure 7.4, which indicates that an area-efficient processor is most likely to also have high energy efficiency. Among the processors, GPUs are the most area and energy efficient. Next, the multi-threaded CPUs, the DSPs, and the optimized FPGAs achieve the same magnitude of energy efficiency, while the multi-threaded CPUs are nearly 100x more area efficient than the optimized FPGAs. More remarkably, different tendencies seem to exist between FPGAs and the other processors. FPGAs tend to follow a linear relation indicated by Line 1, whilst another line, Line 2, can be constructed to roughly represent the relation between energy and area efficiency for GPUs, CPUs, and DSPs. As the slopes of these two trend lines suggest, for the same increment in area efficiency, FPGAs can hardly gain an equivalent increase in energy efficiency compared to the other processors. The energy inefficiency may be caused by a high static power consumption on FPGA boards regardless of the specific design. Convincing evidence can be found in Table 6.2: with optimizations, the power consumption increases by less than 50% on average. The largest FPGA, the Virtexuplus, seems to suffer the highest overhead caused by static power consumption; without optimizations its power consumption is already 4.1 W, and only a 1.4 W increment is introduced when applying optimizations. Therefore, compared to other instruction-based processors, FPGAs attain less improvement in energy efficiency, which might be due to their high static power leakage.

Chapter 8

Conclusions

As the exponential growth of computing power has halted due to the end of technology scaling, future advancements have to be made on the architectural side. Therefore, it is essential for researchers to understand and then exploit a variety of tradeoffs in architectural design. In particular, a tradeoff between flexibility and performance/energy efficiency has been frequently claimed and is used in diverse contexts. However, a generically applicable, quantifiable flexibility metric is still missing. In this work a quantifiable flexibility metric that can be applied generically across diverse platforms and benchmarks is proposed, which allows a valid quantitative comparison between processors. Along with a novel normalization method based on the intrinsic workload of applications, this metric evaluates 24 platforms with 14 benchmarks. Being capable of clearly distinguishing diverse architecture classes, this flexibility metric suggests that the major architecture classes ordered from the least to the most flexible are: GPUs, CPUs, DSPs, and FPGAs. Through the flexibility analysis, several techniques that tend to degrade processor flexibility are observed.

• Operations that lack sufficient support, such as floating-point division.
• Parallelism techniques that fail to benefit all applications equally, such as multithreading. Increasing the degree of parallelism typically decreases flexibility.
• Best-effort designs that introduce significant performance variation, such as caches.

With the proposed flexibility metric, the relations between flexibility and performance/energy/area efficiency have been investigated. The tradeoffs between flexibility and performance/energy efficiency are confirmed, while the tradeoff between flexibility and area efficiency is less explicit. Moreover, an interesting parallelism metric is proposed, based on which a straightforward concept is verified: more parallelism results in higher performance. To achieve the same performance as the other architecture classes, more parallelism is required for FPGAs. In addition, the relation between area and energy efficiency is also explored, which indicates that an area-efficient processor is most likely to also have high energy efficiency. Among the diverse architecture classes, reconfigurable devices can hardly attain the same improvement in energy efficiency for the same area efficiency. Finally, to facilitate the wide application of the proposed metric, an open source tool1 is released to automatically extract the intrinsic work of arbitrary applications. Overall, this work provides a starting point in assessing processor flexibility, and we aim to raise awareness and discussion in future computer architecture design, eventually contributing to a generation of new flexible processors.

1https://surfdrive.surf.nl/files/index.php/s/gQiDoItf3cHqPkk

Chapter 9

Future Recommendations

The quantifiable flexibility metric proposed in this work is a starting point for investigating and exploiting processor flexibility. The limitations of this work suggest some possible improvements.

• Utilizing actual power measurements instead of the power values from literature or TDPs, as the average power may vary depending on the executed applications.
• Exploring another approach to quantify the area of processors, as most transistor counts applied in this work are based on speculation. The available information on the transistor counts of modern processors is insufficient, in particular for all ARM CPUs, DSPs, and FPGAs. Missing the transistor counts of the ARM processors makes it impossible to distinguish embedded and desktop CPUs in area efficiency.

Recommendations for future research in further investigating flexibility are summarized:

• Applying the intrinsic work extracted by the workload estimator to the processor versatility presented in Eq. 2.3 [14]. In this manner, a unit of work is defined as an application, and the amount of useful work specified for the application is the number of intrinsic transistors extracted by the workload estimator. However, determining the amount of information to specify each application remains challenging.

Bibliography

[1] M. Dubois, M. Annavaram, and P. Stenström, Parallel Computer Organization and Design. New York, NY, USA: Cambridge University Press, 2012.

[2] R. Hameed, Balancing Efficiency and Flexibility In Specialized Computing. PhD thesis, Stanford University, 2013.

[3] L. Null and J. Lobur, The Essentials of Computer Organization and Architecture. USA: Jones and Bartlett Publishers, Inc., 4th ed., 2014.

[4] K. Karuri and R. Leupers, Application Analysis Tools for ASIP Design: Application Profiling and Instruction-set Customization. Springer Publishing Company, Incorporated, 2014.

[5] R. Fasthuber, F. Catthoor, P. Raghavan, and F. Naessens, Energy-Efficient Communication Processors: Design and Implementation for Emerging Wireless Systems. Springer Publishing Company, Incorporated, 2013.

[6] A. Lukefahr, S. Padmanabha, R. Das, R. Dreslinski, Jr., T. F. Wenisch, and S. Mahlke, "Heterogeneous microarchitectures trump voltage scaling for low-power cores," in Proceedings of the 23rd International Conference on Parallel Architectures and Compilation, PACT '14, (New York, NY, USA), pp. 237–250, ACM, 2014.

[7] G. Chryssolouris, "Flexibility and its measurement," CIRP Annals, vol. 45, no. 2, pp. 581–587, 1996.

[8] P. H. Brill and M. Mandelbaum, "On measures of flexibility in manufacturing systems," International Journal of Production Research, vol. 27, no. 5, pp. 747–756, 1989.

[9] W. Kellerer, A. Basta, P. Babarczi, A. Blenk, M. He, M. Klugel, and A. M. Alba, "How to measure network flexibility? A proposal for evaluating softwarized networks," IEEE Communications Magazine, pp. 2–8, 2018.

[10] E. Lannoye, D. Flynn, and M. O'Malley, "Evaluation of power system flexibility," IEEE Transactions on Power Systems, vol. 27, pp. 922–931, May 2012.

[11] V. Oree and S. Z. S. Hassen, "A composite metric for assessing flexibility available in conventional generators of power systems," Applied Energy, vol. 177, pp. 683–691, 2016.

[12] R. Kumar, D. M. Tullsen, P. Ranganathan, N. P. Jouppi, and K. I. Farkas, "Single-ISA heterogeneous multi-core architectures for multithreaded workload performance," in Proceedings of the 31st Annual International Symposium on Computer Architecture, pp. 64–75, June 2004.

[13] E. Tomusk, C. Dubach, and M. O'Boyle, "Measuring flexibility in single-ISA heterogeneous processors," in 2014 23rd International Conference on Parallel Architecture and Compilation Techniques (PACT), pp. 495–496, Aug 2014.


[14] K. van Berkel, "Processor versatility (flexibility) - an attempt at definition and quantification," 2014.

[15] R. R. Wilcox and H. J. Keselman, "Modern robust data analysis methods: measures of central tendency," Psychological Methods, vol. 8, no. 3, pp. 254–274, 2003.

[16] M. MN and B. MJ, "What does it mean? A review of interpreting and calculating different types of means and standard deviations," Pharmaceutics, April 2017.

[17] P. J. Fleming and J. J. Wallace, "How not to lie with statistics: The correct way to summarize benchmark results," Commun. ACM, vol. 29, pp. 218–221, Mar. 1986.

[18] European Union and Joint Research Centre - European Commission, Handbook on Constructing Composite Indicators: Methodology and User Guide, 2008.

[19] A. Jain, K. Nandakumar, and A. Ross, "Score normalization in multimodal biometric systems," Pattern Recognition, vol. 38, no. 12, pp. 2270–2285, 2005.

[20] S. Grauer-Gray, L. Xu, R. Searles, S. Ayalasomayajula, and J. Cavazos, "Auto-tuning a high-level language targeted to GPU codes," in 2012 Innovative Parallel Computing (InPar), pp. 1–10, May 2012.

[21] Qualcomm, "Hexagon simulator user guide," December 2016.

[22] Xilinx, "Vivado design suite user guide: High-level synthesis," April 2017.

[23] J. F. Wakerly, Digital Design: Principles and Practices. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 3rd ed., 2000.

[24] Cadence, "HDL modeling in Encounter RTL Compiler," June 2016.

[25] Intel, https://ark.intel.com/#@Processors, accessed 2019-1-8.

[26] Nvidia, "GeForce," https://www.nvidia.com/en-us/geforce/, accessed 2019-1-8.

[27] C. Angelini, "GeForce GTX Titan X review: Can one GPU handle 4K?," 2015. https://www.tomshardware.com/reviews/nvidia-geforce-gtx-titan-x-gm200-maxwell,4091.html, accessed 2019-1-8.

[28] P. M. M. Pereira, P. Domingues, N. Rodrigues, G. Falcao, and S. De Faria, "Assessing the performance and energy usage of multi-CPUs, multi-core and many-core systems: The MMP image encoder case study," International Journal of Distributed and Parallel Systems, vol. 7, pp. 01–20, 09 2016.

[29] AnandTech, "The Haswell review: Intel Core i7-4770K & i5-4670K tested," 2013. https://www.anandtech.com/show/7003/the-haswell-review-intel-core-i74770k-i54560k-tested, accessed 2019-1-8.

[30] Intel, "Inside 6th gen Intel Core: New code named Skylake," 2016. https://www.hotchips.org/wp-content/uploads/hc_archives/hc28/HC28.23-Tuesday-Epub/HC28.23.90-High-Perform-Epub/HC28.23.911-Skylake-Doweck-Intel_SK3-r13b.pdf, accessed 2019-1-8.

[31] M. Cloutier, C. Paradis, and V. Weaver, "A Raspberry Pi cluster instrumented for fine-grained power measurement," Electronics, vol. 5, p. 61, 09 2016.

[32] D. Abdurachmanov, P. Elmer, G. Eulisse, and S. Muzaffar, "Initial explorations of ARM processors for scientific computing," Journal of Physics: Conference Series, vol. 523, no. 1, p. 012009, 2014.


[33] Nvidia, "Whitepaper NVIDIA Tegra K1: A new era in mobile computing," January 2014.

[34] F. Liu, Y. Liang, and L. Wang, "A survey of the heterogeneous computing platform and related technologies," DEStech Transactions on Engineering and Technology Research, 05 2017.

[35] R. Damodaran, T. Anderson, S. Agarwala, R. Venkatasubramanian, M. Gill, D. Gopalakrishnan, A. Hill, A. Chachad, D. Balasubramanian, N. Bhoria, J. Tran, D. Bui, M. Rahman, S. Moharil, M. Pierson, S. Mullinnix, H. Ong, D. Thompson, K. Gurram, O. Olorode, N. Mahmood, J. Flores, A. Rajagopal, S. Narnur, D. Wu, A. Hales, K. Peavy, and R. Sussman, "A 1.25GHz 0.8W C66x DSP core in 40nm CMOS," in 2012 25th International Conference on VLSI Design, pp. 286–291, Jan 2012.

[36] Texas Instruments, "TMS320C6745, TMS320C6747 fixed- and floating-point digital signal processor," June 2014.

[37] L. Codrescu, W. Anderson, S. Venkumanhanti, M. Zeng, E. Plondke, C. Koob, A. Ingle, C. Tabony, and R. Maule, "Hexagon DSP: An architecture optimized for mobile multimedia and communications," IEEE Micro, vol. 34, pp. 34–43, Mar 2014.

[38] Qualcomm, "Qualcomm Hexagon DSP," December 2017.

[39] M. Santarini, "Xilinx ships industry's first 20-nm All Programmable devices," Xcell Journal, pp. 9–15, 2014.

[40] J. A. Fisher, P. Faraboschi, and C. Young, Embedded Computing: A VLIW Approach to Architecture, Compilers and Tools, 2005.

[41] Texas Instruments, "TMS320C67xx divide and square root floating-point functions," February 1999.

[42] J. P. Dawson, "IEEE 754 floating point arithmetic." https://github.com/dawsonjon/fpu, 2012.

Appendix A

LLVM Pass: libDynCountPass.so

This LLVM pass, written in C++, records the dynamic execution counts of IR instructions.

#include "llvm/Pass.h"
#include "llvm/IR/Function.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/IR/InstrTypes.h"
#include "llvm/Transforms/IPO/PassManagerBuilder.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/CommandLine.h"
#include "llvm-c/Core.h"
// The names of the two standard headers below were lost in the PDF extraction;
// <string> and <vector> are assumed here.
#include <string>
#include <vector>

using namespace llvm;
using namespace cl;
using namespace std;

#define MAIN "main"

// Command-line options of the pass (eliminateOverhead defaults to false).
static cl::opt<bool> eliminateOverhead("eli",
    cl::desc("Eliminate the overhead caused by for.inc and for.cond"));
static cl::opt<bool> writeExcel("w2e",
    cl::desc("Write results to excel file"));
static cl::opt<std::string> functionInMain("fNameInMain",
    cl::desc("Benchmark function called in Main"));
static cl::opt<std::string> functionNotRecursive("fNameNotR",
    cl::desc("Benchmark function not recursive, e.g. FFT"));

namespace {
struct DynCount : public ModulePass {
  static char ID;
  DynCount() : ModulePass(ID) {}
  bool enterFunction = false;
  bool runOnModule(Module &M) override;
  bool runOnFunction(Function &F);
  bool InsertBasicInstCall(BasicBlock &B, Function &F, Constant *logFunc,
                           Constant *printFunc, Constant *write2Excel);
  bool InsertBenchmarkCall(Function::iterator blk, Function &F,
                           Constant *startBench, Constant *stopBench);
  int getTypeString(Type *type);
};
} // namespace

bool DynCount::runOnModule(Module &M) {
  for (Module::iterator fct = M.begin(), fct_end = M.end(); fct != fct_end; ++fct) {
    Function &F = *fct;
    runOnFunction(F);
  }
  return true; // the inserted calls modify the module
}

bool DynCount::runOnFunction(Function &F) {
  LLVMContext &Ctx = F.getContext();
  Module &M = *F.getParent();
  std::vector<Type *> paramTypes = {Type::getInt32Ty(Ctx), Type::getInt32Ty(Ctx)};
  Type *retType = Type::getVoidTy(Ctx);
  FunctionType *logFuncType = FunctionType::get(retType, paramTypes, false);
  FunctionType *printFuncType = FunctionType::get(retType, false);

  // Runtime hooks that the instrumented program links against.
  Constant *logFunc = M.getOrInsertFunction("logop", logFuncType);
  Constant *printFunc = M.getOrInsertFunction("printop", printFuncType);
  Constant *startBench = M.getOrInsertFunction("startBench", printFuncType);
  Constant *stopBench = M.getOrInsertFunction("stopBench", printFuncType);
  Constant *write2Excel = M.getOrInsertFunction("write2Excel", printFuncType);

  for (Function::iterator blk = F.begin(), blk_end = F.end(); blk != blk_end; ++blk) {
    BasicBlock &B = *blk;
    StringRef Fname = F.getName();
    StringRef BBname = B.getName();

    // Eliminate two types of blocks that cause loop overhead.
    // string.find("xx") returns 0 when the name starts with "xx".
    if (!BBname.find("for.cond") || !BBname.find("for.inc")) {
      if (!eliminateOverhead)
        InsertBasicInstCall(B, F, logFunc, printFunc, write2Excel);
    } else {
      InsertBasicInstCall(B, F, logFunc, printFunc, write2Excel);
    }
    InsertBenchmarkCall(blk, F, startBench, stopBench);
  }

  return false;
}

bool DynCount::InsertBenchmarkCall(Function::iterator blk, Function &F,
                                   Constant *startBench, Constant *stopBench) {
  StringRef Fname = F.getName();
  BasicBlock &B = *blk;
  IRBuilder<> builder(&B);

  for (BasicBlock::iterator ist = B.begin(), end = B.end(); ist != end; ++ist) {
    Instruction &I = *ist;

    if (!functionInMain.empty()) {
      /* main() {
           startBench();
           call MeasuredFunctionName;
           stopBench();
         } */
      if (Fname == MAIN) {
        if (isa<CallInst>(I)) {
          StringRef Callname = cast<CallInst>(I).getCalledFunction()->getName();
          if (Callname == functionInMain) {
            builder.SetInsertPoint(&I);
            builder.CreateCall(startBench);
            BasicBlock::iterator it = ist;
            ++it;
            builder.SetInsertPoint(&*it);
            builder.CreateCall(stopBench);
          }
        }
      }
    } else {
      // Measure any function that is not recursive.
      /* MeasuredFunctionName() {
           startBench();
           .....
           stopBench();
           ret
         } */
      if (Fname == functionNotRecursive) {
        if (blk == F.begin() && ist == B.begin()) {
          builder.SetInsertPoint(&I);
          builder.CreateCall(startBench);
        } else if (I.getOpcode() == 1) { // opcode 1 is ret
          builder.SetInsertPoint(&I);
          builder.CreateCall(stopBench);
        }
      }
    }
  }
  return true;
}

bool DynCount::InsertBasicInstCall(BasicBlock &B, Function &F, Constant *logFunc,
                                   Constant *printFunc, Constant *write2Excel) {
  int phiCount = 0;

  for (BasicBlock::iterator ist = B.begin(), end = B.end(); ist != end; ++ist) {
    Instruction &I = *ist;

    StringRef Fname = F.getName();
    int opcode = I.getOpcode();
    Type *Itype = I.getType();

    // I.dump();
    Value *Vopcode, *VdataType, *Vopcodee, *VdataTypee;
    IRBuilder<> builder(&B);

    // phi nodes must be the first instructions of a basic block
    if (isa<PHINode>(&I)) {
      phiCount++;
    } else {
      builder.SetInsertPoint(&I);
    }

    VdataType = ConstantInt::get(Type::getInt32Ty(F.getContext()), 0);
    Vopcode = ConstantInt::get(Type::getInt32Ty(F.getContext()), opcode);

    // The template argument of this isa<> check was lost in the PDF extraction;
    // BinaryOperator is assumed from the surrounding comment and example.
    if (isa<BinaryOperator>(&I)) {
      /* If the detected data type is a vector, construct the function call
         #(numbElements) times with the element type, i.e. split vector
         operations. For instance:

           call void @logop(i32 12, i32 3)
           call void @logop(i32 12, i32 3)
           call void @logop(i32 12, i32 14)
           %121 = fadd <2 x double> %wide.load435, %wide.load439
       */
      if (Itype->isVectorTy()) {
        // I.dump();
        VectorType *vecType = static_cast<VectorType *>(Itype);
        int numbElements = vecType->getNumElements();
        VdataType = ConstantInt::get(Type::getInt32Ty(F.getContext()),
                                     getTypeString(vecType->getElementType()));
        Value *argss[] = {Vopcode, VdataType};
        for (int i = 0; i < numbElements; i++)
          builder.CreateCall(logFunc, argss);
      }
      VdataType = ConstantInt::get(Type::getInt32Ty(F.getContext()),
                                   getTypeString(Itype));
    }

    Value *args[] = {Vopcode, VdataType};

    // in case of several phi nodes
    if (ist == B.getFirstNonPHI()->getIterator() && phiCount) {
      // opcode of phi nodes
      Vopcodee = ConstantInt::get(Type::getInt32Ty(F.getContext()), 53);
      VdataTypee = ConstantInt::get(Type::getInt32Ty(F.getContext()), 0);
      Value *argss[] = {Vopcodee, VdataTypee};
      for (int i = 0; i < phiCount; i++)
        builder.CreateCall(logFunc, argss);
    }

    // Including the ret after the call stopping the benchmark. Phi nodes are
    // skipped here because they were already logged in bulk above (the
    // template argument was lost in extraction; PHINode is assumed).
    if (!isa<PHINode>(&I))
      builder.CreateCall(logFunc, args);
    // Check if it is a return instruction
    if (opcode == 1) {
      IRBuilder<> builder(&B);
      builder.SetInsertPoint(&I);
      if (Fname == MAIN) {
        if (writeExcel) {
          builder.CreateCall(write2Excel);
        } else {
          builder.CreateCall(printFunc);
        }
      }
    }
  }
  return true;
}

int DynCount::getTypeString(Type *type) {
  /* DataType:
     1: half;  2: float;  3: double;  4: 80-bit floating point
     5: 128-bit floating point type (112-bit mantissa)
     6: 128-bit floating point type (two 64-bits, PowerPC)
     7: int1;  8: int8;  9: int16;  10: int32
     11: int64;  12: int128;
     13: pointers;  14: vector;  15: array */
  switch (type->getTypeID()) {
  case Type::IntegerTyID: {
    IntegerType *intType = static_cast<IntegerType *>(type);
    int bitWidth = intType->getBitWidth();
    switch (bitWidth) {
    case 1:   return 7;  // int1
    case 8:   return 8;  // int8
    case 16:  return 9;  // int16
    case 32:  return 10; // int32
    case 64:  return 11; // int64
    case 128: return 12; // int128
    }
  }
  case Type::HalfTyID:      return 1;  // half
  case Type::FloatTyID:     return 2;  // float
  case Type::DoubleTyID:    return 3;  // double
  case Type::X86_FP80TyID:  return 4;  // 80-bit FP
  case Type::FP128TyID:     return 5;  // 128-bit FP type (112-bit mantissa)
  case Type::PPC_FP128TyID: return 6;  // 128-bit FP type (two 64-bits, PowerPC)
  case Type::PointerTyID:   return 13; // pointer
  case Type::VectorTyID:    return 14; // vector
  case Type::ArrayTyID:     return 15; // array
  default:                  return 0;
  }
}

char DynCount::ID = 0;

// Automatically register and enable the pass.
static RegisterPass<DynCount> X("dyncount", "Dynamic instruction count",
                                false /* Only looks at CFG */,
                                false /* Analysis Pass */);
static void registerDynCount(const PassManagerBuilder &,
                             legacy::PassManagerBase &PM) {
  PM.add(new DynCount());
}
static RegisterStandardPasses
    RegisterMyPass(PassManagerBuilder::EP_EarlyAsPossible, registerDynCount);

Listing A.1: IR interpreter written in C++
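The instrumented program must be linked against a small runtime that implements the hooks declared by the pass: logop(opcode, dataType), printop(), startBench(), stopBench(), and write2Excel(). That runtime is not reproduced in this appendix; the following is a minimal sketch of what it could look like, written purely as an illustration. The table dimensions, the CSV output format, and the file name dyncount.csv are assumptions, not the implementation used for the measurements in this thesis.

#include <cstdio>

// Counter table indexed by [LLVM opcode][data-type id from getTypeString()].
// 128 opcodes and 16 type ids are assumed to be sufficient.
static unsigned long long counts[128][16];
static bool counting = false;

extern "C" void startBench() { counting = true; }
extern "C" void stopBench()  { counting = false; }

// Called by the instrumentation for every executed IR instruction.
extern "C" void logop(int opcode, int dataType) {
  if (counting && opcode >= 0 && opcode < 128 && dataType >= 0 && dataType < 16)
    counts[opcode][dataType]++;
}

// Dump all non-zero counters to stdout.
extern "C" void printop() {
  for (int op = 0; op < 128; op++)
    for (int ty = 0; ty < 16; ty++)
      if (counts[op][ty])
        std::printf("opcode %d, type %d: %llu\n", op, ty, counts[op][ty]);
}

// Same data, written as a CSV file that a spreadsheet can open.
extern "C" void write2Excel() {
  std::FILE *f = std::fopen("dyncount.csv", "w"); // file name is an assumption
  if (!f) return;
  std::fprintf(f, "opcode,type,count\n");
  for (int op = 0; op < 128; op++)
    for (int ty = 0; ty < 16; ty++)
      if (counts[op][ty])
        std::fprintf(f, "%d,%d,%llu\n", op, ty, counts[op][ty]);
  std::fclose(f);
}

Since the pass registers itself under the name dyncount (see the RegisterPass call above), it can be loaded into opt with -load libDynCountPass.so and enabled with -dyncount; the eli, w2e, fNameInMain, and fNameNotR options listed at the top of the pass then select the measured region and the output format.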

Appendix B

Operations in Verilog For Synthesis

Listing B.1 shows the Verilog code of the integer binary operations [24]. An IEEE 754 floating-point library in Verilog is available online [42].

module sub_int (y, a, b);
  parameter wA = 32, wB = 32;
  input  [wA-1:0] a;
  input  [wB-1:0] b;
  output [wA:0]   y;
  assign y = a - b;
endmodule

module add_int (y, a, b);
  parameter wA = 64, wB = 64;
  input  [wA-1:0] a;
  input  [wB-1:0] b;
  output [wA+1:0] y;
  assign y = a + b;
endmodule

module unsigned_multiply (y, a, b);
  parameter wA = 64, wB = 64;
  input  [wA-1:0]    a;
  input  [wB-1:0]    b;
  output [wA+wB-1:0] y;
  assign y = a * b;
endmodule

module signed_multiply (y, a, b);
  parameter wA = 128, wB = 128;
  input  signed [wA-1:0]    a;
  input  signed [wB-1:0]    b;
  output signed [wA+wB-1:0] y;
  assign y = a * b;
endmodule

module unsigned_divide (y, a, b);
  parameter wA = 32, wB = 32;
  input  [wA-1:0] a;
  input  [wB-1:0] b;
  output [wA-1:0] y;
  assign y = a / b;
endmodule

module signed_divide (y, a, b);
  parameter wA = 32, wB = 32;
  input  signed [wA-1:0] a;
  input  signed [wB-1:0] b;
  output signed [wA-1:0] y;
  assign y = a / b;
endmodule

module unsigned_modulus (y, a, b);
  parameter wA = 16, wB = 16;
  input  [wA-1:0] a;
  input  [wB-1:0] b;
  output [wB-1:0] y;
  assign y = a % b;
endmodule

module signed_modulus (y, a, b);
  parameter wA = 64, wB = 64;
  input  signed [wA-1:0] a;
  input  signed [wB-1:0] b;
  output signed [wB-1:0] y;
  assign y = a % b;
endmodule

Listing B.1: Operations in Verilog used for synthesis

Appendix C

Measurement Results

Table C.1 and Table C.2 list all results in terms of flexibility, performance, parallelism, and area and energy efficiency. The leftmost column corresponds to the No. column in Table 6.1. A "-" indicates that the corresponding value could not be obtained because no transistor count or power value is available.

Table C.1: Measurement results of all involved processors (without optimizations)

No.  Processor    Flexibility  Performance  Area Efficiency  Energy Efficiency  Parallelism
1    GTX TITAN    0.17         1.14E+15     1.60E+05         4.55E+12           1.36E+06
2    GTX 570      0.18         1.01E+15     3.37E+05         4.62E+12           1.38E+06
3    GTX 750 Ti   0.24         1.08E+15     5.78E+05         1.80E+13           1.06E+06
4    Tegra K1     0.25         2.98E+13     -                2.13E+12           3.14E+04
5    Pentium 4    0.37         9.08E+13     5.37E+05         1.08E+12           2.67E+04
6    i7 920       0.40         8.24E+13     1.13E+05         6.34E+11           3.08E+04
7    i7 950       0.37         1.02E+14     1.40E+05         7.85E+11           3.32E+04
8    i7 960       0.51         1.42E+14     1.94E+05         1.09E+12           4.43E+04
9    i7 4770      0.47         1.05E+14     7.48E+04         1.25E+12           3.08E+04
10   i7 6700      0.53         7.54E+13     4.31E+04         1.18E+12           2.22E+04
11   ARM1176      0.33         8.57E+11     -                2.95E+11           1.22E+03
12   Cortex A9    0.36         9.34E+12     -                2.33E+12           5.49E+03
13   Cortex A15   0.30         1.50E+13     -                3.00E+12           6.53E+03
14   Cortex A53   0.38         4.82E+12     -                1.10E+12           3.44E+03
15   TI C6747     0.44         9.75E+11     4.43E+04         2.87E+12           3.25E+03
16   Hexagon v5   0.60         8.62E+11     -                -                  1.44E+03
17   Hexagon v6   0.56         1.90E+12     -                -                  9.51E+02
-    Artix7       0.73         1.42E+12     1.39E+03         1.01E+12           1.22E+04
-    Kintex7      0.70         1.69E+12     7.14E+02         9.95E+11           1.41E+04
-    Virtex7      0.70         1.67E+12     1.73E+02         7.28E+11           1.41E+04
-    Zynq         0.74         1.53E+12     6.97E+02         9.03E+11           1.30E+04
-    Virtexuplus  0.71         9.77E+11     6.98E+01         2.38E+11           8.28E+03
-    Kintexu      0.70         9.66E+11     1.82E+02         4.60E+11           8.05E+03
-    Zynquplus    0.72         9.63E+11     2.35E+02         4.59E+11           8.28E+03


Table C.2: Measurement results of all involved processors (with optimizations)

No.  Processor    Flexibility  Performance  Area Efficiency  Energy Efficiency  Parallelism
5    Pentium 4    0.36         9.57E+13     5.66E+05         1.14E+12           2.81E+04
6    i7 920       0.37         2.66E+14     3.64E+05         2.05E+12           9.96E+04
7    i7 950       0.34         2.45E+14     3.36E+05         1.89E+12           8.00E+04
8    i7 960       0.46         4.39E+14     6.01E+05         3.38E+12           1.37E+05
9    i7 4770      0.30         1.80E+14     1.29E+05         2.15E+12           5.30E+04
10   i7 6700      0.36         2.10E+14     1.20E+05         3.27E+12           6.16E+04
12   Cortex A9    0.33         2.20E+13     -                5.49E+12           1.29E+04
13   Cortex A15   0.30         3.38E+13     -                6.77E+12           1.47E+04
14   Cortex A53   0.36         1.31E+13     -                2.97E+12           9.32E+03
18   Hexagon v6   0.76         1.85E+12     -                -                  9.23E+02
19   Hexagon v5   0.76         1.48E+12     -                -                  2.47E+03
-    Artix7       0.20         4.58E+12     4.46E+03         2.29E+12           4.50E+04
-    Kintex7      0.19         8.18E+12     3.45E+03         3.27E+12           8.11E+04
-    Virtex7      0.18         7.66E+12     7.89E+02         2.55E+12           7.49E+04
-    Zynq         0.19         7.43E+12     3.38E+03         2.86E+12           7.78E+04
-    Virtexuplus  0.13         9.86E+12     7.04E+02         1.79E+12           9.85E+04
-    Kintexu      0.13         9.57E+12     1.81E+03         2.81E+12           9.77E+04
-    Zynquplus    0.19         4.60E+12     1.12E+03         1.64E+12           4.68E+04
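The positive relation between area efficiency and energy efficiency reported in the conclusions can be spot-checked directly from Table C.1. The sketch below was written for this appendix as an illustration and is not part of the thesis tooling; it hard-codes the (area efficiency, energy efficiency) pairs of the processors in Table C.1 for which both values are available and computes the Pearson correlation of their base-10 logarithms.

#include <cmath>
#include <cstdio>
#include <utility>
#include <vector>

int main() {
  // (area efficiency, energy efficiency) pairs from Table C.1 for the
  // processors where both values are available.
  std::vector<std::pair<double, double>> rows = {
      {1.60e5, 4.55e12}, {3.37e5, 4.62e12}, {5.78e5, 1.80e13},  // GPUs
      {5.37e5, 1.08e12}, {1.13e5, 6.34e11}, {1.40e5, 7.85e11},  // CPUs
      {1.94e5, 1.09e12}, {7.48e4, 1.25e12}, {4.31e4, 1.18e12},
      {4.43e4, 2.87e12},                                        // DSP
      {1.39e3, 1.01e12}, {7.14e2, 9.95e11}, {1.73e2, 7.28e11},  // FPGAs
      {6.97e2, 9.03e11}, {6.98e1, 2.38e11}, {1.82e2, 4.60e11},
      {2.35e2, 4.59e11}};

  // Pearson correlation of log10(area efficiency) vs log10(energy efficiency).
  double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
  const double n = rows.size();
  for (auto &r : rows) {
    double x = std::log10(r.first), y = std::log10(r.second);
    sx += x; sy += y; sxx += x * x; syy += y * y; sxy += x * y;
  }
  double cov = sxy / n - (sx / n) * (sy / n);
  double vx = sxx / n - (sx / n) * (sx / n);
  double vy = syy / n - (sy / n) * (sy / n);
  std::printf("correlation = %.2f\n", cov / std::sqrt(vx * vy));
  return 0;
}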
