Institute of Computer Architecture and Computer Engineering, University of Stuttgart, Pfaffenwaldring 47, D-70569 Stuttgart

Diplomarbeit Nr. 2980

Retargeting a C compiler to the HAPRA/FAPRA architecture

Tilmann Scheller

Course of Study: Software Engineering

Examiner: Prof. Dr. Hans-Joachim Wunderlich

Supervisors: Dipl.-Ing. Christian Zöllin, Dipl.-Inform. Melanie Elm

Commenced: October 19, 2009

Completed: April 20, 2010

CR-Classification: C.0, C.4, D.3.4

Abstract

The HAPRA and FAPRA architectures are simple 32-bit RISC architectures which are used for educational purposes. Without a compiler for a high-level language, software for the HAPRA/FAPRA architecture needs to be written in assembly language. This is unfortunate since writing software in assembly language is time-consuming, error-prone and results in unportable software. The goal of this thesis is to develop a complete C-based toolchain for the HAPRA/FAPRA architecture, including an assembler, a C compiler and a port of a C standard library. The resulting toolchain is used to compare the different subsets of the HAPRA/FAPRA instruction set in terms of runtime performance and space efficiency. With the availability of a C compiler for the target architecture it is significantly easier to measure the impact of extensions or modifications of the existing instruction set. A wide spectrum of portable open source software exists today. The availability of a C-based toolchain for the HAPRA/FAPRA architecture enables, at one go, access to this large software stack. It is expected that this toolchain will be used to port a full operating system to the HAPRA/FAPRA architecture. The ability to run an entire operating system is also likely to be a great motivator for students designing their own custom implementations of the HAPRA/FAPRA architecture as part of their lab courses.

Contents

1 Introduction

2 Architecture & Compiler Fundamentals
  2.1 Instruction set architecture
    2.1.1 FAPRA architecture
    2.1.2 HAPRA architecture
    2.1.3 HASE/Angora
  2.2 Compiler pipeline
    2.2.1 Frontend
    2.2.2 Middle-end
    2.2.3 Backend
  2.3 LLVM Compiler Infrastructure
    2.3.1 Overview
    2.3.2 C frontend
    2.3.3 Intermediate representations
    2.3.4 Target-independent code generator

3 Implementation
  3.1 Application Binary Interface
  3.2 HAPRA/FAPRA backend
    3.2.1 Overview
    3.2.2 Machine description
  3.3 C frontend
  3.4 C standard library
  3.5 Assembler
  3.6 Simulator
  3.7 Linker

4 Results
  4.1 Example
  4.2 Measurements
    4.2.1 Instruction count
    4.2.2 Code size
  4.3 Simulator
  4.4 Experience
  4.5 Remarks

5 Conclusion & Future Work

A Appendix

List of Figures

1 The HASE GUI
2 The compiler pipeline
3 The LLVM architecture
4 The target-independent code generator architecture
5 The HAPRA/FAPRA ABI stack frame layout
6 The newlib architecture
7 The libcpu architecture
8 Comparing the number of executed instructions between HAPRA and FAPRA
9 Comparing the binary size of benchmarks compiled for HAPRA and FAPRA

List of Tables

1 The FAPRA instruction set architecture
2 The HAPRA instruction set architecture
3 Registers in the HAPRA/FAPRA ABI
4 Size and alignment of scalar data types on the FAPRA architecture with byte-addressing
5 Size and alignment of scalar data types on the FAPRA architecture with word-addressing and on the HAPRA architecture
6 The Stanford benchmark suite
7 The Computer Language Shootout benchmark suite
8 Instruction distribution when executing hello
9 Executing the Stanford benchmark suite on simulators
10 Executing the Computer Language Shootout benchmark suite on simulators

Listings

1 Instruction format description in TableGen syntax
2 Instruction description in TableGen syntax
3 C source code of Mandelbrot example
4 LLVM IR of Mandelbrot example
5 FAPRA assembly code of Mandelbrot example

1 Introduction

The RISC processor architectures introduced in the undergraduate and graduate lab courses of the institute use a rather simple, nevertheless complete, instruction set (see Tables 1 and 2). While the programming environment features an assembly simulation and debugging environment, larger software projects would benefit from the availability of a compiler for a high-level language. The goal of this thesis is to develop a complete C-based toolchain for the HAPRA/FAPRA architecture, including an assembler, a C compiler and a port of a C standard library.

A wide spectrum of portable open source software exists today, with the Linux kernel being undisputedly among the most popular open source projects. The availability of a C-based toolchain for the HAPRA/FAPRA architecture enables, at one go, access to a large software stack, which, once the initial work of porting the Linux kernel is done, can be run with minimal effort. Without a compiler for a high-level language, software for the HAPRA/FAPRA architecture needs to be written in assembly language. This is unfortunate since writing software in assembly language is time-consuming, error-prone and results in unportable software.

It is expected that this toolchain will be used to port the µClinux [uCl] kernel and userspace to the HAPRA/FAPRA architecture, making it possible to run a full operating system on the hardware. This has the nice side effect that students who design custom implementations of the HAPRA/FAPRA architecture as part of their lab courses get the ability to run a whole operating system on their hardware, which is expected to be a great motivator for them. With the availability of a C compiler for the target architecture it is significantly easier to compare the different subsets of the HAPRA/FAPRA instruction set in terms of runtime performance and space efficiency and to measure the effect of extensions or modifications of the existing instruction set.

This thesis is organized as follows: Chapter 2 describes the HAPRA/FAPRA architecture, the compiler pipeline and the compiler framework used in this thesis. Chapter 3 discusses the implementation of the toolchain for the HAPRA/FAPRA architecture. Chapter 4 presents the results of measurements, the experience gained during the implementation of the toolchain and suggestions on how to improve the HAPRA/FAPRA architecture. Chapter 5 draws a conclusion and gives an outlook on future work.

2 Architecture & Compiler Fundamentals

This chapter describes the HAPRA and FAPRA instruction set architectures (ISA), presents a brief overview of the compilation process and introduces the Low Level Virtual Machine (LLVM) Compiler Infrastructure.

2.1 Instruction set architecture

This section describes the HAPRA and FAPRA architectures and their respective assembly language development environments.

2.1.1 FAPRA architecture

The FAPRA architecture is a simple 32-bit RISC architecture which was designed for educational purposes. The instruction set encompasses 29 different instructions which can be divided into 5 instruction classes: load/store instructions, load immediate instructions, control flow instructions, arithmetic/logical instructions and comparison instructions. Table 1 shows the FAPRA ISA.

SIMD variants of the architecture also exist, which allow operations on vectors of two or four 32-bit integers, increasing the data parallelism accordingly. The memory transfer size is not widened in the SIMD variants and is always 32-bit; thus, loading an arbitrary 128-bit value from memory requires four load instructions and additional instructions to place the individual vector elements at the desired positions within the destination vector register.

As the native word size is 32-bit, there is a register file consisting of 32 registers with a width of 32-bit, except for the SIMD variants where the registers are 64-bit or 128-bit wide, forming a unified register file which contains both scalar and vector values. The program counter is stored in a dedicated register which can only be read/written implicitly, e.g. through certain control flow instructions. A special status register with flags for results of arithmetic/logical instructions does not exist; instead, comparison instructions write their results directly into their destination register. There are no native floating-point instructions; floating-point operations need to be implemented in software.

As usual for RISC architectures, instructions have a fixed width of 32-bit in order to simplify the decoding and prefetching hardware. There are two major instruction formats, the first one consisting of a 6-bit opcode field and three 5-bit fields, specifying two source registers and one destination register. The second format includes a 6-bit opcode field and two 5-bit fields which specify one source register and the destination register.

The difference is that there is a 16-bit immediate field instead of a field for a second source register. There is only one addressing mode, which calculates the effective memory address by adding a signed 16-bit immediate value to the value of a base register. Regarding memory addressing, there are two variations of the FAPRA architecture:

• Word-addressing: the smallest addressable unit is 32-bit.

• Byte-addressing: the smallest addressable unit is 8-bit.

Relative branch instructions have a signed 16-bit offset which is added to the current value of the program counter. This allows forward/backward branches with a distance of up to 8192 instructions when using byte-addressing and up to 32768 instructions when using word-addressing: since instructions are 4 bytes wide, the signed 16-bit byte displacement (±2^15 bytes) covers 2^15/4 = 8192 instructions, whereas a word displacement directly counts 32-bit instructions. Unusually, there are no native logical shift right and xor instructions.

Instruction  Opcode (31 downto 16)  15 downto 0         Meaning                                          Example
LD.W    01 0000 ddddd aaaaa   iiiii iiiii iiiiii   Rd[3:0] ← M(Ra[0] + SignExt(i))                  ld.w $d, i($a)
ST.W    01 0001 ddddd aaaaa   iiiii iiiii iiiiii   M(Ra[0] + SignExt(i)) ← Rd[0]                    st.w $d, i($a)
LD.B    01 1100 ddddd aaaaa   iiiii iiiii iiiiii   Rd[3:0] ← "000..." & M(Ra[0] + SignExt(i))       ld.b $d, i($a)
ST.B    01 1101 ddddd aaaaa   iiiii iiiii iiiiii   M(Ra[0] + SignExt(i)) ← Rd[0](7:0)               st.b $d, i($a)
LDIH    10 0000 ddddd -mmmm   iiiii iiiii iiiiii   ∀s ∉ m: Rd[s](31:16) ← i                         ldih $d[0,2], 0x0815
LDIL    10 0001 ddddd -mmmm   iiiii iiiii iiiiii   ∀s ∉ m: Rd[s](15:0) ← i                          ldil $d[1,3], 12
JMP     11 0000 ----- aaaaa   ------               pc ← Ra[0]                                       jmp $a
BRA     11 0100 ----- -----   iiiii iiiii iiiiii   pc ← pc + SignExt(i)                             bra label
BZ      11 0101 -mmmm aaaaa   iiiii iiiii iiiiii   pc ← pc + SignExt(i) if ∀s ∉ m: Ra[s] == 0       bz label, $a[0,2]
BNZ     11 0110 -mmmm aaaaa   iiiii iiiii iiiiii   pc ← pc + SignExt(i) if ∀s ∉ m: Ra[s] != 0       bnz label, $a[0,2]
NOP     11 0010 ----- -----   ------               no operation                                     nop
CALL    11 0011 ddddd aaaaa   ------               pc ← Ra[0], Rd ← pc + 4                          call $d, $a
BL      11 0111 ddddd -----   iiiii iiiii iiiiii   pc ← pc + SignExt(i), Rd ← pc + 4                bl $d, label
RFE     11 1111 ----- -----   ------               PC ← PCbackup                                    rfe
ADDI    00 1111 ddddd aaaaa   iiiii iiiii iiiiii   Rd ← Ra + SignExt(i)                             addi $d,$a,i
ADD     00 0000 ddddd aaaaa   bbbbb ------         Rd ← Ra + Rb                                     add $d,$a,$b
SUB     00 0001 ddddd aaaaa   bbbbb ------         Rd ← Ra - Rb                                     sub $d,$a,$b
AND     00 0010 ddddd aaaaa   bbbbb ------         Rd ← Ra & Rb                                     and $d,$a,$b
OR      00 0011 ddddd aaaaa   bbbbb ------         Rd ← Ra | Rb                                     or $d,$a,$b
NOT     00 0101 ddddd aaaaa   ------               Rd ← ~Ra                                         not $d,$a
SARI    00 1011 ddddd aaaaa   iiiii iiiii iiiiii   Rd ← Ra >> i, Rd(31:32-i) ← Ra(31); i < 0: sali  sari $d,$a,i
SAL     00 0110 ddddd aaaaa   bbbbb ------         Rd ← Ra << Rb                                    sal $d,$a,$b
SAR     00 0111 ddddd aaaaa   bbbbb ------         Rd ← Ra >> Rb, Rd(31:32-Rb) ← Ra(31)             sar $d,$a,$b
MUL     00 1000 ddddd aaaaa   bbbbb ------         ∀s ∈ {0,1,2,3}: Rd[s] ← Ra[s] * Rb[s]            mul $d,$a,$b
SWAP    00 1001 ddddd aaaaa   ------               Rd[0] ← Ra[1], Rd[1] ← Ra[0]                     swap $d,$a
PERM    00 1001 ddddd aaaaa   ------ llkkjjii      Rd ← (Ra[i], Ra[j], Ra[k], Ra[l])                perm $d,$a[i,j,k,l]
RDC8    00 1010 ddddd aaaaa   ------               Rd[0] ← Ra[3](7:0) & ... & Ra[0](7:0)            rdc8 $d,$a
TGE     00 1100 ddddd aaaaa   bbbbb ------         ∀s ∈ {0,1,2,3}: Rd[s] ← (Ra[s] ≥ Rb[s]) ? 0 : 1  tge $d,$a,$b

Table 1: The FAPRA instruction set architecture.

2.1.2 HAPRA architecture

The HAPRA architecture is a subset (17 instructions) of the FAPRA architecture; the key difference is that the HAPRA architecture uses word-addressing instead of byte-addressing. Table 2 shows the HAPRA ISA. The HAPRA architecture is scalar only; there are no SIMD variants.

There are no indexed addressing modes for load/store instructions and no immediate addressing modes for arithmetic/logical instructions. No relative branch instructions are available; branch target addresses need to be loaded explicitly into a register and used with an indirect branch instruction. Multiplication needs to be implemented in software as there is no native multiplication instruction. In contrast to the FAPRA architecture there are no native comparison instructions.

Since the HAPRA/FAPRA architecture is used mainly for educational purposes, namely teaching microprocessor design, and due to the high costs of ASIC design and manufacturing, implementations of the HAPRA/FAPRA architecture are generally realized with FPGAs.

Instruction  Opcode (31 downto 16)  15 downto 0         Meaning                       Example
LD      01 0000 ddddd aaaaa   ------               Rd ← M(Ra)                    ld $d, ($a)
ST      01 0001 ----- aaaaa   bbbbb ------         M(Ra) ← Rb                    st $b, ($a)
LDIH    10 0000 ddddd -----   iiiii iiiii iiiiii   Rd(31:16) ← IR(15:0)          ldih $d, 0x0815
LDIL    10 0001 ddddd -----   iiiii iiiii iiiiii   Rd(15:0) ← IR(15:0)           ldil $d, 12
JMP     11 0000 ----- aaaaa   ------               pc ← Ra                       jmp $a
JZ      11 0001 ----- aaaaa   bbbbb ------         pc ← Ra if Rb == 0            jz $a, $b
NOP     11 0010 ----- -----   ------               no operation                  nop
CALL    11 0011 ddddd aaaaa   ------               pc ← Ra, Rd ← pc + 1          call $d, $a
ADD     00 0000 ddddd aaaaa   bbbbb ------         Rd ← Ra + Rb                  add $d,$a,$b
SUB     00 0001 ddddd aaaaa   bbbbb ------         Rd ← Ra - Rb                  sub $d,$a,$b
AND     00 0010 ddddd aaaaa   bbbbb ------         Rd ← Ra & Rb                  and $d,$a,$b
OR      00 0011 ddddd aaaaa   bbbbb ------         Rd ← Ra | Rb                  or $d,$a,$b
CP      00 0100 ddddd aaaaa   ------               Rd ← Ra                       cp $d,$a
NOT     00 0101 ddddd aaaaa   ------               Rd ← ~Ra                      not $d,$a
SAL     00 0110 ddddd aaaaa   ------               Rd ← Ra << 1                  sal $d,$a
SAR     00 0111 ddddd aaaaa   ------               Rd ← Ra >> 1, Rd(31) ← Ra(31) sar $d,$a
RFE     11 0100 ----- -----   ------               PC ← PCbackup                 rfe

Table 2: The HAPRA instruction set architecture.
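As noted above, HAPRA has no native multiply, so the compiler must expand multiplications into calls to a runtime routine. The following is a minimal C sketch of such a shift-and-add helper; the name __mulsi3 follows the common libgcc/compiler-rt convention and is an assumption here, not taken from the thesis:

    #include <stdint.h>

    /* Hypothetical 32-bit software multiply: classic shift-and-add.
     * Conceptually this needs only operations HAPRA has natively:
     * add, and, 1-bit shifts and a zero test. */
    uint32_t __mulsi3(uint32_t a, uint32_t b) {
        uint32_t result = 0;
        while (b != 0) {
            if (b & 1)          /* lowest bit of b set?            */
                result += a;    /* accumulate shifted multiplicand */
            a <<= 1;            /* a = a * 2                       */
            b >>= 1;            /* move to the next bit of b       */
        }
        return result;
    }

Each loop iteration consumes one bit of b, so the routine runs in at most 32 iterations regardless of the operand values.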

2.1.3 HASE/Angora

The Hapra Assembler and Simulation Environment (HASE) and Angora are assembly language development environments for the HAPRA and FAPRA architectures. HASE is used for the HAPRA architecture, while Angora is used for the FAPRA architecture. Figure 1 shows the graphical user interface (GUI) of HASE.

Figure 1: The HASE GUI.

They offer an editor for HAPRA/FAPRA assembly language with syntax highlighting and an integrated assembler. There is also an integrated debugger which allows single stepping through the instructions of the debugged program, setting breakpoints at arbitrary locations in the program and inspection and modification of registers and memory. Programs can be debugged on an integrated software simulator with support for peripheral hardware like a VGA display, serial console and timers, or through an in-circuit debugger.

2.2 Compiler pipeline

A compiler is a program which translates from a source language to a target language. Typically a compiler translates from a high-level language to assembly language, relieving the software developer of all the low-level details of the target machine. If the compiler is written in the same language as its source language, the compiler can compile itself and becomes a self-hosting compiler. In the development of a compiler, the point where the compiler becomes self-hosting is an important milestone, because it is a significant test of the capabilities of the compiler. A compiler is typically composed of different components which are organized in a pipeline fashion. Figure 2 shows the compiler pipeline.

[Figure: C/C++, Ada and FORTRAN source code is processed by language-specific frontends into a common intermediate representation; the middle-end produces an optimized intermediate representation; x86, PowerPC and ARM backends emit the corresponding machine code.]

Figure 2: The compiler pipeline.

The input of the compiler is a source program in the source language. The output of the compiler is the source program translated into the target language, e.g. assembly language. In principle the whole process could be done by one single component. However, from an engineering perspective this is inefficient, as a new compiler would need to be developed for every combination of source language and target language. To reduce this effort, the whole process is often split into three separate components which are organized in a pipeline fashion: the frontend, the middle-end and the backend. During compilation a source program is passed through the whole pipeline. Inside the pipeline various analysis and synthesis steps are performed, with analyses being done mostly in the frontend and middle-end parts and synthesis being done in the backend. Well-defined intermediate representations (IR) which are capable of carrying all the semantics of the source program are used to pass the transformed source program between the different pipeline stages. Depending on the actual stage involved, those intermediate representations vary in terms of underlying data structure and abstraction level. A compiler typically uses several different intermediate representations.

2.2.1 Frontend

The first part in the compiler pipeline is the frontend, which reads the source program and eventually produces a translation of the source program into an intermediate representation for consumption by the middle-end. The frontend is specific to the source language, and compilers usually have multiple frontends, one for each source language they support. The frontend is split into different phases:

Lexical analysis (tokenization)

In a first step, the individual characters of the source file are read and organized into a stream of tokens.

Syntactical analysis (parsing)

Once the source file is represented as a stream of tokens, the parser tries to construct a syntax tree. If the source file is not valid according to the grammar of the programming language, the parser will not be able to construct a valid syntax tree for the given source file and will report a syntax error.

AST construction

Since the initial syntax tree contains many nodes which are important for the parser but are not needed for later stages in the compilation process, the initial syntax tree is transformed into an abstract syntax tree (AST), which gets rid of the superfluous nodes, resulting in a much more compact representation.

Semantic analysis (type checking)

During semantic analysis, which is performed on the AST, the compiler checks whether the source program conforms to the rules of the programming language, e.g. whether a variable is only used after it has been defined. Typically, rules which are difficult to model with a context-free grammar are checked in this phase [Iro83].
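For example, a C frontend accepts the following hypothetical function syntactically but rejects it during semantic analysis, because no grammar rule is violated while a semantic rule is:

    int f(void) {
        return x + 1;   /* semantic error: 'x' has not been declared */
    }
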

Translation to intermediate representation

In the last phase of the frontend, the AST is translated to the intermediate representation of the middle-end.

2.2.2 Middle-end

The middle-end takes a program translated to its intermediate representation and produces a semantically equivalent version which is superior to the previous version with regard to a certain optimization criterion, e.g. a program which is faster, smaller or consumes less power. In a first step, the program is analyzed with certain control flow and data flow analyses. The goal of these analyses is to obtain safe approximations about the executions of the program at compile time and to use these results to perform optimizing transformations. The middle-end does not take into account the target architecture for which code is generated; all optimizations in the middle-end are target-independent.

2.2.3 Backend

The backend is responsible for translating a program in the intermediate representation of the middle-end into semantically equivalent machine code for the target architecture. Backends are by definition target-dependent; however, compilers usually try to share as much common code between different backends as possible. The key tasks of a backend are:

Instruction selection

Instruction selection is the task of mapping an instruction of the intermediate representation to one or more native instructions. It is much more important for CISC architectures than for RISC architectures, because CISC architectures tend to have complex addressing modes, while RISC architectures usually only have a few simple addressing modes. Often an approach is used where the intermediate representation is tree-like and a pattern matcher is auto-generated from a machine description of the target architecture. This pattern matcher tries to cover the tree of IR instructions with as few native instructions as possible [AGT89]. Sometimes compilers also use directed acyclic graphs (DAGs) instead of trees; however, pattern matching on DAGs is much more difficult than pattern matching on trees. For architectures with very simple addressing modes, hand-written instruction selectors are also common.

Instruction scheduling

Instruction scheduling is the task of assigning an order of execution to the native instructions with some optimization goal, e.g. scheduling for latency minimization or throughput maximization. Scheduling is critical for the efficient use of hardware resources [BR91]. The scheduler is allowed to assign an arbitrary execution order to the instructions being scheduled, as long as all control and data dependencies are respected. The problem of optimal basic block scheduling is NP-hard [GJ+79].

Register allocation

During register allocation the decision is made whether a variable is kept in memory or in a register. In general, the register allocator tries to keep as many variables as possible in registers, because the access latency of registers is much smaller than the access latency of memory. However, since there are many more variables than registers in a typical program, keeping some variables in memory is usually unavoidable. In the context of register allocation, the act of storing a variable in memory instead of a register is called spilling.

A common approach is to keep all variables in virtual registers until register allocation. Register allocation maps the virtual registers to physical registers and inserts spill code whenever there is no physical register to map a virtual register to. For a variable which has been spilled to memory, every use of the variable needs a preceding load instruction, which loads the current value of the variable from the spill location in memory. Conversely, every definition of the variable needs a subsequent store instruction which stores the value of the variable to the spill location in memory. The problem of optimal register allocation is NP-complete; it was shown that register allocation is equivalent to the graph coloring problem [CAC+81].

More information about compiler construction can be found in [CT04], [LSUA07] and [Muc97].

2.3 LLVM Compiler Infrastructure

This section gives an introduction to the LLVM Compiler Infrastructure.

2.3.1 Overview

The LLVM compiler infrastructure [LA04][Lat02] is an open source effort to build a set of reusable components which can be used to build compilers for arbitrary programming languages. LLVM offers an aggressive optimizer which can perform both powerful intraprocedural and interprocedural analyses and transformations. It supports static and just-in-time compilation for a wide range of architectures (x86, PowerPC, ARM, Alpha, SPARC, MIPS, Cell SPU, System z and more). LLVM is a mature, production-quality compiler infrastructure used commercially by Apple, Cray, Intel, NVIDIA, AMD and others.

Figure 3 shows the LLVM architecture. LLVM follows the traditional separation of logical compilation phases into separate components (frontend, middle-end and backend). The source code of the application being compiled is passed to a language-specific frontend and translated into LLVM IR. Figure 3 shows the clang frontend, but frontends for other programming languages exist as well. The LLVM optimizer forms the middle-end and translates unoptimized LLVM IR as produced by a frontend into optimized LLVM IR. The optimized LLVM IR is passed to the last stage in the compiler pipeline, the LLVM code generator, which transforms it into native machine code for one of the target architectures supported by LLVM.

[Figure: C/C++ source code → Clang C/C++ frontend → LLVM IR → LLVM optimizer → optimized LLVM IR → LLVM code generator → machine code.]

Figure 3: The LLVM architecture.

2.3.2 C frontend

The clang subproject [cla] of LLVM aims to provide a frontend for the C, C++ and Objective-C programming languages. The combination of clang and LLVM offers a full C compiler.

2.3.3 Intermediate representations

LLVM IR

The LLVM intermediate representation is the core intermediate representation used within LLVM; it can be seen as the programming model of the Low Level Virtual Machine. It is the intermediate representation of the middle-end, thus all target-independent optimizations are performed on LLVM IR. LLVM IR is a RISC-like three-address code in static single assignment (SSA) form [CFR+91] with an infinite number of virtual registers. In contrast to machine code, LLVM IR carries explicit type information, which allows analyses and transformations that are not possible or much more difficult to do on machine code.

LLVM IR comes in one of the following three formats:

• Bitcode: a compact on-disk representation of LLVM IR, which is usually smaller than the equivalent machine code of the target architecture.

• In-memory representation: a set of data structures which represent LLVM IR in terms of C++ classes; this is the preferred representation for programmatic analysis and transformation of LLVM IR.

• Assembly language: a textual representation of LLVM IR. LLVM assembly language is easier for a human to read and modify than bitcode, so it is the primary representation used by LLVM developers.

LLVM IR is usually produced by the respective source language frontend. More information about LLVM IR can be found in [LA].

SelectionDAG IR

The SelectionDAG intermediate representation is one of the intermediate representations used in the backend of LLVM. In a first step, the LLVM IR received from the middle-end is expanded to SelectionDAG IR; the translation is done at the basic block level. The initial SelectionDAG usually contains both illegal types and operations. Illegal means the type or operation is not supported natively by the target architecture. For example, an architecture without a native multiplication instruction can declare the multiplication operation to be illegal. The backend writer decides how illegal operations are to be handled. An illegal operation can be promoted, expanded or custom lowered. Promotion means that a native instruction for a larger type will be used; expansion means that the operation will be split into native operations for a smaller type or will be turned into a library call. Custom lowering gives the backend writer full control over the handling of illegal operations, e.g. it allows inserting a custom sequence of operations. Between the different phases the DAG combiner is run, which tries to optimize the DAG.

Instruction selection is done through pattern matching on the DAG. The instruction selector tries to cover the DAG with as few native instructions

as possible. The machine description of the target architecture is used to generate the pattern matcher automatically. After instruction selection the DAG is legal and contains only native instructions. Now the scheduler assigns an order to the operations and the DAG is converted to machine instructions.

2.3.4 Target-independent code generator

[Figure: the code generation pipeline from LLVM IR to machine code: instruction selection, scheduling, SSA-based machine code optimizations, register allocation, prologue/epilogue code insertion, late machine code optimizations and code emission; each stage is marked as target-specific or target-independent.]

Figure 4: The target-independent code generator architecture.

The target-independent code generator is used by the different target-specific backends. It contains the parts of the backend which are generic enough to be shared among the different target-specific backends. A domain-specific language called TableGen is used to create a machine description of the target machine. The machine description encompasses the instruction set including the encoding of instructions, the register classes, the calling conventions, etc. C++ code is automatically generated from this machine description, e.g. the pattern matcher for instruction selection. The parts of the backend which are difficult to express with TableGen are implemented

with custom C++ code. Backends usually generate assembly code, but direct machine code emission is also possible, which is important for just-in-time compilation.

Figure 4 shows the structure of the target-independent code generator. Target-specific parts are indicated by a light gray background, while target-independent parts are shown with a white background. Instruction selection is performed by a pattern matcher which is automatically generated from the target-specific instruction descriptions in the machine description of the target architecture. Scheduling of target instructions is done by target-independent schedulers; LLVM offers various instruction schedulers and also allows the backend writer to provide a custom target-specific instruction scheduler if necessary. After instruction scheduling, SSA-based machine code optimizations are performed. In the next stage, register allocation, virtual registers are replaced with physical registers; different target-independent register allocators are available for selection. The prologue/epilogue code insertion stage, which is target-specific, inserts the code which deals with the call stack on function entry and exit. Optimizations which need to be performed on machine code that is very close to the final machine code being generated, e.g. spill code scheduling, are carried out in the late machine code optimization stage. The last stage is target-dependent and emits the generated code, either by creating machine code directly or by producing assembly code which is processed by an assembler afterwards. More information about the target-independent code generator can be found in [LWPL].

3 Implementation

This chapter describes the implementation of the HAPRA/FAPRA toolchain in detail. The following components had to be implemented, modified or ported:

• ABI
• HAPRA/FAPRA backend
• C frontend
• C standard library
• Assembler
• Simulator
• Linker

The HAPRA/FAPRA backend is presented in great detail because it is the central part of the HAPRA/FAPRA toolchain and required the biggest implementation effort of all toolchain components.

3.1 Application Binary Interface

An application binary interface (ABI) defines certain aspects of the generated machine code, e.g. the sizes and alignment of primitive types, register usage, calling conventions, the layout of stack frames on the call stack and more. This thesis defines an ABI for the HAPRA/FAPRA architecture for the C programming language. Since the binaries created by the compiler run directly on the hardware, without an operating system (OS) in between, the compiler generates statically linked absolute code which uses a flat 32-bit address space.

Table 3 shows the registers as defined by the ABI. Volatile registers are caller-saved, while non-volatile registers are callee-saved. The current stack pointer (SP) is always stored in R1. R0 serves as the link register (LR); it contains the return address in the calling function, to which the called function branches on exit. R3-R15 are used for parameter passing and return values. R16-R29 are used for local variables and are preserved across function calls. R30 contains the constant value zero, since zero is a frequently used value and keeping it in a reserved register reduces both execution time and code size. R31 is used for holding the target address of indirect branches which replace relative branches exceeding the maximum branch distance.

R0         LR, dedicated, volatile across function calls
R1         SP, dedicated
R2         Scratch register, volatile
R3 - R15   Registers for parameter passing/return values, volatile
R16 - R29  Non-volatile registers
R30        Contains zero constant, dedicated
R31        Reserved register for long jumps, dedicated

Table 3: Registers in the HAPRA/FAPRA ABI.

[Figure: stack frame layout, from high to low addresses: back chain, register argument save area, general register save area, local variable space, parameter list area, LR save word, back chain (the SP points to this word).]

Figure 5: The HAPRA/FAPRA ABI stack frame layout.

Figure 5 shows the layout of a stack frame according to the HAPRA/FAPRA ABI. The call stack grows downwards and the SP always points to the back chain word of the stack frame of the currently active function. The back chain word always points to the starting address of the previous stack frame and is set by the function prologue when the stack frame for the function is established. Since the LR is volatile across function calls, it needs to be saved in the LR save word of the stack frame before other functions are called in the currently active function. The parameter list area is used to pass arguments to functions in memory in case the argument registers are exhausted because there are more arguments than argument registers. The local variable space can be used for data which is created on function entry and removed on function exit, e.g. local variables. The general register save area is used for saving and restoring non-volatile registers which are used by a function; non-volatile registers are saved in the function prologue and restored in the function epilogue. The register argument save area is used by functions with a variable argument list: on entry to such a function, all argument registers are stored in the register argument save area, because subsequent function calls can destroy the contents of the argument registers.

Tables 4 and 5 show the size and alignment of scalar data types in the C programming language for byte- and word-addressing respectively. The sizeof column shows the result of applying the sizeof operator to the individual data types. The HAPRA/FAPRA architecture uses little-endian byte ordering.

On startup, initialization code is run which sets up the initial call stack in memory, initializes the SP register accordingly and branches to the entry point of the program, which is the main() function for programs written in the C programming language.
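A C-level sketch of this startup logic follows; the symbols _start and __stack_top are illustrative assumptions, and on real hardware the SP setup happens in a few assembly instructions before any C code can run:

    extern int main(void);

    /* Hypothetical entry point: the real startup code initializes
     * R1 (SP) from a linker-provided symbol such as __stack_top
     * (e.g. via ldih/ldil) and then branches to main(). */
    void _start(void) {
        /* SP setup happens in assembly before this point. */
        main();
        for (;;)
            ;   /* no operating system to return to */
    }
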

Type            ANSI C                       sizeof   Size (bits)   Alignment (bits)
Character       char                         1        8             8
                unsigned char, signed char   1        8             8
Integer         short, signed short          2        16            16
                unsigned short               2        16            16
                int, signed int, long int,   4        32            32
                signed long, enum
                unsigned int, unsigned long  4        32            32
Pointer         any-type *                   4        32            32
                any-type (*) ()
Floating Point  float                        4        32            32
                double                       8        64            64

Table 4: Size and alignment of scalar data types on the FAPRA architecture with byte-addressing.
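To make the two layouts concrete, consider the following hypothetical struct; the comments contrast Table 4 (byte-addressing) with Table 5 (word-addressing, shown next):

    /* Hypothetical struct contrasting the two data layouts. */
    struct pixel {
        char tag;    /* Table 4: 8 bits;  Table 5: one 32-bit word */
        int  value;  /* 32 bits in both variants                   */
    };
    /* Byte-addressing (Table 4): value sits at byte offset 4
     * (3 bytes of padding after tag), so sizeof(struct pixel) == 8.
     * Word-addressing (Table 5): every scalar occupies one word,
     * so value sits at offset 1 and sizeof(struct pixel) == 2. */
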

Type            ANSI C                       sizeof   Size (bits)   Alignment (bits)
Character       char                         1        32            32
                unsigned char, signed char   1        32            32
Integer         short, signed short          1        32            32
                unsigned short               1        32            32
                int, signed int, long int,   1        32            32
                signed long, enum
                unsigned int, unsigned long  1        32            32
Pointer         any-type *                   1        32            32
                any-type (*) ()
Floating Point  float                        1        32            32
                double                       2        64            64

Table 5: Size and alignment of scalar data types on the FAPRA architecture with word-addressing and on the HAPRA architecture.

3.2 HAPRA/FAPRA backend

This section discusses the implementation of the HAPRA/FAPRA backend.

3.2.1 Overview

The backend is the component of the compiler which translates the LLVM IR generated by the frontend to assembly language. LLVM offers a target-independent code generator, a framework which can be parameterized for a specific target architecture. A major part of the backend is a machine description of the target architecture, written in a domain-specific language. More information about the development of LLVM backends can be found in [WB]. The following parts need to be provided in order to develop a backend for a new target architecture:

Target machine

The target machine forms the central interface between the target-independent code generator and the target-specific parts of the backend being developed. It offers accessor methods for the target-specific components of the backend and allows the set of code generation passes to be modified, e.g. by adding a target-specific pass. The HAPRA/FAPRA backend adds a custom branch selection pass.

Data layout

The data layout of the target machine encodes important ABI information, namely the required and preferred alignment of LLVM IR data types and additional information concerning the data organization in memory, e.g. the endianness, the size of pointers and the alignment of stack objects on the target architecture. In the HAPRA/FAPRA backend the data layout contains the respective information from the HAPRA/FAPRA ABI.
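In an LLVM backend this information is expressed as a compact layout string. A plausible string for the byte-addressed FAPRA variant, matching Table 4, could look as follows (an illustrative assumption, not quoted from the backend source):

    e-p:32:32:32-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64

Here e declares little-endian byte ordering, p:32:32:32 declares 32-bit pointers with 32-bit alignment, and the remaining entries give size:ABI-alignment:preferred-alignment for each integer and floating-point type.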

Frame information

The frame information provides information about the layout of stack frames on the target machine. It stores the direction of stack growth, the alignment of the stack pointer, the offset of the locals area (the area where function data, e.g. local variables, can be stored) and, if required, the mapping of spill slots for callee-saved registers to their respective offsets on the call stack. In the HAPRA/FAPRA backend the frame information encodes the HAPRA/FAPRA ABI stack frame layout.

Register information

The register information allows the target-independent code generator to determine the properties of the individual physical registers, e.g. whether a specific physical register is reserved or whether it belongs to the callee-saved registers. Additionally, the target-specific function prologue/epilogue emission logic, which is responsible for setting up and tearing down stack frames on the call stack, is contained in the register information.

Register set

The register set describes the physical registers available on the target machine. Registers are grouped into register classes, and registers within a register class share the same properties. Multiple register files are usually modeled with distinct register classes, e.g. if a target architecture has a register file for integer values and a separate register file for floating-point values, then two different register classes are used. Since the HAPRA/FAPRA architecture only has a general purpose 32-bit register file, the register set is rather easy to model.

Instruction set

The instruction set of the target machine contains descriptions of all native instructions of the target machine and their operands. Operands cover the different addressing modes supported by the target machine, e.g. there are different operands for immediate addressing and memory addressing. Operands in registers refer directly to the physical registers defined in the register set. An instruction description carries the following attributes:

• Input and output operands, specifying the number and types of operands.

• A DAG which matches the native instruction. The pattern matcher will replace the given DAG with the native instruction during instruction selection.

• A list of physical registers implicitly defined and used by the instruction.

• An optional list of predicates which control whether the DAG pattern matches. Those predicates can be used to enforce additional constraints.

• Flags which describe the high-level semantics of the instruction, e.g. whether the instruction reads or writes memory or affects control flow in a certain way.

• An assembly language string for the instruction.

Instruction selection

Operations or types are illegal if they are not supported natively by the target architecture. For example, an architecture without a native multiplication instruction can declare the multiplication operation to be illegal. The backend specifies which types and operations are legal for the target and how illegal types and operations are handled. An illegal operation can be promoted, expanded or custom lowered. Promotion means that a native instruction for a larger type will be used; expansion means that the operation will be split into native operations for a smaller type or will be turned into a library call. Custom lowering gives the backend writer full control over the handling of illegal operations, e.g. it allows inserting a custom sequence of operations. The HAPRA/FAPRA backend handles 16-bit stores by decomposing them into two natively supported 8-bit stores. Similarly, 16-bit loads are handled by

performing two natively supported 8-bit loads and concatenating the loaded values accordingly. Select and conditional branch operations are handled by emitting a semantically equivalent sequence of native instructions. The same is done for logical shift right operations and the xor operation, as neither is supported natively. Instruction selection is performed through pattern matching on the DAG. The instruction selector tries to cover the DAG with as few native instructions as possible. The machine description is used to generate the pattern matcher automatically from the instruction descriptions.
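What these expansions compute can be sketched in C; this is a simplified model of the emitted instruction sequences, not the backend code itself (little-endian byte order per the ABI):

    #include <stdint.h>

    /* 16-bit store decomposed into two native 8-bit stores. */
    void store16(uint8_t *p, uint16_t v) {
        p[0] = (uint8_t)v;          /* low byte  */
        p[1] = (uint8_t)(v >> 8);   /* high byte */
    }

    /* 16-bit load: two native 8-bit loads plus concatenation. */
    uint16_t load16(const uint8_t *p) {
        return (uint16_t)(p[0] | (p[1] << 8));
    }

    /* xor synthesized from the native and/or/not instructions,
     * using the identity a ^ b == (a | b) & ~(a & b). */
    uint32_t xor32(uint32_t a, uint32_t b) {
        return (a | b) & ~(a & b);
    }
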

Register allocation

Multiple different register allocation algorithms are implemented within the LLVM compiler infrastructure. The implementations are target-independent and use the physical registers as defined by the register set. Spilling of registers to memory is handled by target-specific code which emits an appropriate instruction sequence for storing/loading a register to/from the respective spill location in memory.

Assembly printer

The assembly printer is a pass which emits assembly code for consumption by an external assembler. The assembly printer knows about the directives supported by the assembler for the target architecture. Parts of the assembly printer are auto-generated and use the assembly language string of the instruction descriptions.

Branch selector

The branch selector is a custom pass operating on machine instructions which identifies relative branches that exceed the maximum branch distance. It replaces those branch instructions with indirect branches, which do not have this limitation but are more expensive; the branch target address is loaded into the reserved register R31. With byte-addressing, the displacement of relative branches is limited to a distance of 8192 instructions forwards or backwards. In the pass, the basic blocks of the function are visited in topological order. Since a relative branch instruction is replaced with a sequence of instructions, the size of the function increases after every patched branch, which might push another relative branch beyond the maximum branch

distance. Thus, relative branches are checked and patched if necessary in an iterative process until a fixed point is reached.
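A C sketch of this fixed-point iteration follows; it is a simplified model with illustrative constants, not the actual LLVM pass, which operates on machine basic blocks:

    #include <stddef.h>

    /* Replacing an out-of-range relative branch with an indirect
     * branch (load target address into R31, then jmp) grows the
     * function, which can push other branches out of range, so
     * the pass iterates until a fixed point is reached. */
    enum { MAX_DIST = 8192, GROWTH = 2 /* assumed extra instructions */ };

    struct branch { long src, dst; int indirect; };

    void select_branches(struct branch *br, size_t n) {
        int changed;
        do {
            changed = 0;
            for (size_t i = 0; i < n; i++) {
                long dist = br[i].dst - br[i].src;
                if (!br[i].indirect && (dist > MAX_DIST || dist < -MAX_DIST)) {
                    br[i].indirect = 1;   /* patch to indirect branch */
                    changed = 1;
                    /* the replacement sequence shifts all later code */
                    for (size_t j = 0; j < n; j++) {
                        if (br[j].src > br[i].src) br[j].src += GROWTH;
                        if (br[j].dst > br[i].src) br[j].dst += GROWTH;
                    }
                }
            }
        } while (changed);
    }
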

3.2.2 Machine description

An important part of implementing a backend for a new target architecture is the creation of a machine description of the respective target. In order to simplify the creation of new backends, the goal is to describe as much as possible of the target machine with a custom domain-specific language. The domain-specific language offers language constructs which are better suited for this purpose than a regular general-purpose programming language and simplifies the often repetitive task of describing a target architecture.

Part of LLVM is a tool called TableGen which allows the developer to specify records of domain-specific information. The TableGen language has a strict syntax and a simple type system, but does not define the semantics of the description. The semantics of a TableGen record are defined by the different TableGen backends. The LLVM code generator is a major user of TableGen and supplies various different TableGen backends, e.g. for instruction descriptions or register descriptions. Those TableGen backends of the code generator create C++ code automatically from the TableGen records.

TableGen records come in two shapes: definitions and classes. Definitions are the concrete versions of records and are denoted by the keyword def. Classes are abstract records which are used to create other records (keyword class). To illustrate the syntax of TableGen, Listings 1 and 2 show two excerpts from the machine description of the HAPRA/FAPRA backend. Listing 1 shows how a specific instruction format is modeled with TableGen.

    class RRForm<bits<6> opcode, dag outs, dag ins, string asmstr,
                 list<dag> pattern> : FAPRAInst {
      bits<5> RD;
      bits<5> RA;
      bits<5> RB;

      let Pattern = pattern;

      let Inst{31-26} = opcode;
      let Inst{25-21} = RD;
      let Inst{20-16} = RA;
      let Inst{15-11} = RB;
    }

Listing 1: Instruction format description in TableGen syntax.

For every instruction format a respective class is created which inherits from the FAPRAInst base class. The class RRForm is shown, which defines the encoding of instructions with two source operands in registers (RA, RB) and one destination register (RD). In addition, the opcode of the instruction is specified. Since there are 32 general purpose registers, a register can be encoded in a 5-bit field. The opcode is specified in a 6-bit field.

Listing 2 shows the definition of the native add instruction. Concrete definitions of native instructions instantiate the class which defines their instruction format. The native add instruction uses RRForm and its definition is shown. The add instruction has an opcode of zero, takes the RA and RB registers as source operands and puts the result into the destination operand, which is register RD. Additionally, a pattern is provided for the SelectionDAG add operation, on which the native add instruction matches.

    def ADD : RRForm<0b000000,
                     (outs GPRC:$rD), (ins GPRC:$rA, GPRC:$rB),
                     "add $rD, $rA, $rB",
                     [(set GPRC:$rD, (add GPRC:$rA, GPRC:$rB))]>;

Listing 2: Instruction description in TableGen syntax.

More information about TableGen can be found in [Lat].

3.3 C frontend

Since the C programming language is highly target-dependent, the LLVM IR emitted by clang needs to be target-dependent as well. This was achieved by implementing support for the HAPRA/FAPRA ABI in clang, making sure proper LLVM IR is emitted. Other modifications to the frontend were not necessary.

3.4 C standard library

Newlib [new] was chosen as the C standard library of the HAPRA/FAPRA toolchain. It is a lightweight C standard library designed for embedded systems, particularly for embedded systems without an operating system. Newlib requires a small target-specific part, e.g. a function which sets up the runtime environment, an implementation of the setjmp/longjmp functions and implementations of a few other target-specific functions.

Figure 6 shows the newlib architecture. Applications written in the C programming language use the interface of the C standard library, which is implemented by newlib. Newlib itself is written in portable C code

and encapsulates the target-specific parts in a library called libgloss, which is part of newlib. Libgloss exposes a system call interface to newlib. It needs to be adapted to the target architecture, where it usually interfaces directly with the hardware. E.g. libgloss for the HAPRA/FAPRA architecture implements the write system call by writing to the serial console, which is accessible via memory-mapped I/O.
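A minimal sketch of such a write implementation follows; the MMIO address and register name are illustrative assumptions, and the actual libgloss port may differ:

    #include <stddef.h>

    /* Hypothetical memory-mapped transmit register of the serial
     * console; the address is assumed for illustration. */
    #define UART_TX (*(volatile unsigned char *)0xFF000000u)

    /* Sketch of a libgloss-style write(): each byte is pushed to
     * the serial console via memory-mapped I/O. */
    int write(int fd, const void *buf, size_t len) {
        const unsigned char *p = buf;
        (void)fd;   /* the console is the only output device */
        for (size_t i = 0; i < len; i++)
            UART_TX = p[i];
        return (int)len;
    }
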

[Figure: layering, from top to bottom: C Application, newlib, libgloss, Hardware.]

Figure 6: The newlib architecture.

3.5 Assembler

Since the HAPRA/FAPRA backend generates assembly code, an assembler is needed in order to obtain executable machine code. An assembler was developed with the LLVM machine code toolkit, using the machine description to generate large parts of the assembler automatically. This is a key advantage when adding new instructions to the HAPRA/FAPRA architecture, because they only need to be added in one place: the machine description.

3.6 Simulator

During the development of the compiler no real hardware was used; instead, the software simulator of Angora was used to run and debug binaries created by the compiler. While Angora is very useful for debugging miscompiled binaries since it offers an excellent debugging environment, it is less suited

for running automated regression tests due to the low performance of the built-in simulator. Significantly speeding up the Java-based interpreting simulator of Angora was not possible due to the lack of control over the machine code generated by the just-in-time compiler of the Java VM; e.g. it was not possible to force the Java VM to inline small methods in the critical loop of the interpreter.

A new interpreter for HAPRA/FAPRA machine code was therefore developed in C. The new C-based simulator offers superior performance over the simulator of Angora but is still limited by the fact that it is an interpreter, which needs to perform the whole fetch, decode and execute cycle for every machine instruction. The next logical step is to use binary translation from FAPRA/HAPRA machine code to x86-64 machine code instead of an interpreter. Another simulator was therefore developed which performs a combination of dynamic and static binary translation. This simulator is built using the libcpu framework [lib]. The libcpu framework is a toolkit for building arbitrary processor simulators which employ binary translation. Libcpu is based on the LLVM compiler infrastructure and supports a wide range of architectures (ARM, MIPS, PowerPC and more).
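The core of the interpreting simulator described above is a fetch/decode/execute loop. A minimal sketch using the instruction encoding from Listing 1 follows; the register file and memory handling are simplified assumptions:

    #include <stdint.h>

    #define OP(insn) ((insn) >> 26)           /* 6-bit opcode (bits 31-26) */
    #define RD(insn) (((insn) >> 21) & 31u)   /* destination (bits 25-21)  */
    #define RA(insn) (((insn) >> 16) & 31u)   /* source A (bits 20-16)     */
    #define RB(insn) (((insn) >> 11) & 31u)   /* source B (bits 15-11)     */

    enum { OP_ADD = 0x00, OP_SUB = 0x01 };    /* opcodes from Table 1 */

    uint32_t regs[32];

    /* One step of the byte-addressed FAPRA variant:
     * fetch, decode, execute. */
    uint32_t step(const uint32_t *mem, uint32_t pc) {
        uint32_t insn = mem[pc / 4];          /* fetch  */
        switch (OP(insn)) {                   /* decode */
        case OP_ADD:                          /* execute */
            regs[RD(insn)] = regs[RA(insn)] + regs[RB(insn)];
            break;
        case OP_SUB:
            regs[RD(insn)] = regs[RA(insn)] - regs[RB(insn)];
            break;
        /* ... the remaining instructions of Table 1 ... */
        }
        return pc + 4;                        /* next instruction */
    }

Paying this dispatch cost on every instruction is exactly what makes interpretation slow and what binary translation avoids.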

[Figure: FAPRA, ARM and MIPS machine code enters architecture-specific frontends which emit LLVM IR; the LLVM optimizer and code generator then produce x86-64 or PowerPC machine code.]

Figure 7: The libcpu architecture.

Figure 7 shows the architecture of libcpu. Frontends specific to the source architecture of the simulator translate machine code to LLVM IR. Then the LLVM IR is optimized by the LLVM optimizer. In a last step, the LLVM code generator is used to generate machine code for the target architecture,

based on the optimized LLVM IR. To add support for a new architecture to the libcpu framework, a frontend needs to be developed which translates from the source architecture to LLVM IR. libcpu then uses the LLVM just-in-time compiler to generate machine code for the target architecture from the LLVM IR produced by the frontend. A FAPRA frontend for libcpu was developed during this thesis. Since the code produced by the HAPRA/FAPRA C compiler is statically linked absolute code, it is rather easy to determine a large part of the source machine code before actual execution of the binary, making static binary translation feasible.

3.7 Linker

Since LLVM can store LLVM IR on disk in a compact file format (bitcode), it is possible to link object files containing LLVM IR with the LLVM linker and create assembly code from the resulting (linked) bitcode file. This approach is useful in the first stages of development, because it is possible to link different compilation units without having a native linker for the target architecture. However, this approach has a major drawback: the benefits of separate compilation are lost. Even for minor changes in one of the bitcode files, e.g. changing a single line of code in the C source file, the whole program including all library dependencies needs to be translated from LLVM IR to assembly language again. Another drawback is that the build system of the application needs to be modified to use the LLVM linker instead of the native linker, which can be very time-consuming if the application has a complex build system. And since the LLVM linker can only link bitcode files, it is not possible to link a binary against native code. On the other hand, linking bitcode has the big advantage of making link-time optimizations possible. Code generation from optimized bitcode is very efficient, e.g. compiling the whole of newlib takes about ten seconds on a modern machine, so the lack of separate compilation is negligible from a performance standpoint.

4 Results

This chapter presents results obtained by using the C compiler which was developed during this thesis. First, an example which illustrates the translation process from C source code to FAPRA assembly language is given. Then the results of the measurements on the HAPRA and FAPRA architectures are presented. In the last section of this chapter, the experience gained during the development of the C compiler is summarized briefly.

4.1 Example

This section presents an example program which is compiled for the FAPRA architecture. The intermediate steps of the translation are shown to illustrate the compilation process. First, the translation from C source code to LLVM IR is shown; in a second step, the translation from LLVM IR to FAPRA assembly code is shown.

Listing 3 shows the C source code of the example program. It is the inner loop of a program which computes the Mandelbrot set. The inner loop is nested into two other loops which traverse all the pixels of the generated image; those loops have been omitted from the listing. The variables fp_x and fp_y are the loop counters of the outer loops and contain the X and Y coordinate of the current pixel respectively. Complex numbers are represented by coordinates in the generated image. The loop starts at point (0, 0) and iterates by squaring the current value and adding the coordinate of the current pixel to the result. The iteration stops either when a certain threshold has been reached or when the maximum number of iterations has been exceeded. The number of iterations performed determines the color of the current pixel. As fixed-point numbers with an 8-bit fractional part are used, the results of multiplications are truncated accordingly.

    int sx = 0, sy = 0, x = 0, y = 0;
    for (iter = 0; (sx + sy) < fp_thresh && iter < 31; iter++) {
        sx = (x * x) >> 8;
        sy = (y * y) >> 8;
        y = ((y * x) >> 7) + fp_y;
        x = sx - sy + fp_x;
    }

Listing 3: C source code of Mandelbrot example.

Listing 4 shows the optimized LLVM IR generated from the C source code. The LLVM IR instructions are annotated with the respective C statements they are generated from. Except for the loop condition, only a few LLVM IR

43 instructions are generated for every C statement, resulting in a listing which is about two to three times longer than the original C source code listing. for.body15: ; preds = %for.body15, %bb.nph %y.049 = phi i32 [ 0, %bb.nph ], [ %add27, %for.body15 ] %x.048 = phi i32 [ 0, %bb.nph ], [ %add31, %for.body15 ] %0= phi i32 [ 0, %bb.nph ], [ %1, %for.body15 ]

// sx = ( x x ) >> 8 ; %mul = mul∗ i32 %x.048, %x.048 %shr = ashr i32 %mul, 8

// sy = ( y y ) >> 8 ; %mul20 = mul∗ i32 %y.049, %y.049 %shr21 = ashr i32 %mul20, 8

// i t e r++ %1= add nsw i32 %0, 1

// y = ( ( y x ) >> 7) %mul24 = mul∗ i32 %y.049, %x.048 %shr25 = ashr i32 %mul24, 7

// ( sx+sy ) < f p thresh && iter < 3 1 ; %add = add nsw i32 %shr21, %shr %cmp12 = icmp slt i32 %add, 1024 %cmp14 = icmp slt i32 %1, 31 %or.cond = and i1 %cmp12, %cmp14

// x = sx sy + f p x (actually (sx + fp x ) sy ) %tmp65 = add− i32 %shr, %tmp − %add31 = sub i32 %tmp65, %shr21

// y = y + f p y ; %add27 = add i32 %shr25 , %tmp70 br i1 %or.cond, label %for.body15, label %for.end Listing 4: LLVM IR of Mandelbrot example.

Listing 5 shows the result of translating the optimized LLVM IR into FAPRA assembly code. Like in the previous listing, the machine instructions are annotated with their respective LLVM IR instructions. Almost all LLVM IR instructions are directly mapped to a single machine code instruction, yielding very efficient code. A key difference to the previous listing is that virtual register references are now replaced with references to physical registers. The generated machine code is difficult to optimize further and comparable to optimized hand-written assembly code.

// %mul20 = mul i32 %y.049, %y.049
mul $6, $4, $4

// %mul = mul i32 %x.048, %x.048
mul $7, $5, $5

// %shr21 = ashr i32 %mul20, 8
sari $6, $6, 8

// %shr = ashr i32 %mul, 8
sari $7, $7, 8

// %1 = add nsw i32 %0, 1
addi $9, $3, 1

// %add = add nsw i32 %shr21, %shr
add $8, $6, $7

// %cmp12 = icmp slt i32 %add, 1024
tge $8, $28, $8
bz LBB1_6, $8

// %cmp14 = icmp slt i32 %1, 31
tge $3, $27, $3
bz LBB1_6, $3

// %mul24 = mul i32 %y.049, %x.048
mul $3, $4, $5

// %shr25 = ashr i32 %mul24, 7
sari $3, $3, 7

// %tmp65 = add i32 %shr, %tmp
add $4, $7, $21

// %add31 = sub i32 %tmp65, %shr21
sub $5, $4, $6

// %add27 = add i32 %shr25, %tmp70
add $4, $3, $22
and $3, $9, $9
bra LBB1_3

Listing 5: FAPRA assembly code of Mandelbrot example.

4.2 Measurements

The HAPRA/FAPRA C compiler was used to perform various measurements in order to compare the HAPRA and FAPRA architectures in terms of their effects on execution time and code size. Two synthetic benchmark suites were chosen for the measurements: the Stanford benchmarks [HN89] and the micro-benchmarks from the Computer Language Shootout [Sho] (see tables 6 and 7). Note that these measurements compare programs for different architectures which were all produced by a single compiler, not binaries produced by different compilers.

To verify that the compiler creates correct binaries for the source programs, the programs are additionally compiled with GCC for the x86-64 architecture and executed natively, and the results of these native runs are compared with the results of the simulated runs.

Two metrics are measured: the number of executed instructions and the code size of the generated binary. All programs are compiled with link-time optimizations turned on and are executed on a simulator which gathers profiling information such as the number of executed instructions. The measurements were made on an x86-64 Linux system (Intel Core 2 Quad Q9450 CPU, 8 GB of RAM) running Fedora 12; the simulator was compiled with GCC 4.4.2 with optimizations turned on (-O2 option).

Name        LOC  Description
Bubblesort  171  An array sorted using the bubblesort algorithm.
IntMM       159  Two 2-D integer matrices multiplied together.
Oscar       323  A floating-point Fast Fourier Transform program.
Perm        169  A tightly recursive permutation program.
Puzzle      225  A compute-bound program.
Queens      188  The eight queens chess problem solved 50 times.
Quicksort   174  An array sorted using the quicksort algorithm.
RealMM      161  Two 2-D floating-point matrices multiplied together.
Towers      218  The canonical Towers of Hanoi problem.
Treesort    187  An array sorted using the treesort algorithm.

Table 6: The Stanford benchmark suite.

Name        LOC  Description
ackermann    23  Ackermann's Function
ary3         45  Array Access
fib2         27  Fibonacci Numbers
hash        219  Hash (Associative Array) Access
heapsort     81  Heapsort Algorithm
hello        12  Hello World
lists       231  List Operations
matrix       71  Matrix Multiplication
methcall     94  Method Calls
nestedloop   31  Nested Loops
objinst     105  Object Instantiation
random       36  Random Number Generator
sieve        39  Sieve of Eratosthenes
strcat       36  String Concatenation

Table 7: The Computer Language Shootout benchmark suite.

4.2.1 Instruction count

Figure 8 shows the speedup of a FAPRA program versus a HAPRA program, assuming an instruction takes the same amount of time to execute on a FAPRA and a HAPRA processor. The speedup of FAPRA without the TGE instruction vs. HAPRA varies between a factor of 7.63 (Towers) and 24.16 (nestedloop). The speedup of FAPRA with the TGE instruction vs. HAPRA is in the range of 8.11 (Perm) to 24.29 (Bubblesort). The general observation is that the TGE instruction helps to improve performance, particularly for Bubblesort, Quicksort and Treesort, since sorting algorithms perform many comparisons. Puzzle also makes heavy use of comparison instructions and benefits greatly from the native comparison instruction. Benchmarks which show no significant speedup with the addition of the native comparison instruction either perform few comparison operations or perform unsigned comparisons, which do not benefit from the TGE instruction since it performs a signed comparison.

Figure 8: Comparing the number of executed instructions between HAPRA and FAPRA (speedup vs. HAPRA per benchmark, for FAPRA with and without the TGE instruction).

Table 8 shows the instruction distribution on HAPRA and on the word-addressing variant of FAPRA when executing a hello world program. Examining the instruction distribution gives insight into the sources of the execution overhead on the HAPRA architecture. The results clearly show that significantly more arithmetic shift right instructions are executed on HAPRA. This is because the HAPRA architecture only has a 1-bit arithmetic shift right instruction; on HAPRA, arithmetic right shifts by more than one bit are performed by a library function. The major part of the increase in the number of executed call instructions on HAPRA comes from calls to this library function. The increased number of executed subtraction instructions on HAPRA is due to the testing of the loop condition in the arithmetic shift right library function: as there are no native comparison instructions on HAPRA, a semantically equivalent sequence of native instructions is used, in this case a subtraction instruction which tests whether two values are equal. The loop of the library function itself drastically increases the number of executed jump instructions on HAPRA. Load immediate instructions also show an increase on HAPRA due to the lack of instructions with immediate addressing modes. In summary, the major execution overhead for the hello world program on HAPRA comes from the lack of a native arithmetic shift right instruction which can shift by a variable number of bits.
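To make the source of this overhead concrete, the following minimal C sketch shows what such a variable-shift library routine might look like; the routine is an illustrative assumption, not the actual code from the runtime library, and it assumes that >> on int is arithmetic, as it is on the target:

/* Variable arithmetic shift right built from HAPRA's 1-bit shift.
 * Each iteration costs one shift (SAR), one subtraction (SUB, whose
 * result also serves as the loop test, since HAPRA has no native
 * comparison instructions) and the loop's jump instructions (JZ/JMP),
 * matching the high SAR, SUB, JZ and JMP counts in table 8. */
int ashr_var(int value, int amount)
{
    while (amount != 0) {
        value >>= 1;   /* the 1-bit SAR instruction */
        amount -= 1;   /* SUB, result tested against zero */
    }
    return value;
}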

(a) HAPRA          (b) FAPRA
LD      206        LDW    131
ST      234        STW    168
LDIH   1564        LDIH   107
LDIL   1551        LDIL   107
JMP    5083        JMP      9
JZ     5178        BRA     40
CALL    170        BZ      24
ADD     529        BNZ    122
SUB    5115        CALL     9
AND     358        BL       0
OR      241        ADDI   167
NOT     272        ADD     76
SAL      19        SUB    240
SAR    4852        AND    266
Total 25371        OR     211
                   NOT    121
                   SARI   145
                   SAL      1
                   SAR      0
                   MUL      0
                   TGE     14
                   TSE      0
                   Total 1957

Table 8: Instruction distribution when executing hello.

4.2.2 Code size

Figure 9 shows the code size of various benchmarks compiled for HAPRA and FAPRA. Sizes vary between 76 KB (hello) and 148 KB (sieve) on FAPRA and between 180 KB (hello) and 304 KB (sieve) on HAPRA; the smallest and largest binaries thus belong to the same benchmarks on both architectures. The measured HAPRA binaries are between a factor of two and three larger than their FAPRA counterparts. The addition of a native comparison instruction on FAPRA has only a minor impact on code size. The variation in binary size across the benchmarks is rather low, as the binary size is dominated by code which is linked in from the C standard library and which is similar for most of the benchmarks.

Figure 9: Comparing the binary size of benchmarks compiled for HAPRA and FAPRA (size in KB per benchmark: HAPRA, FAPRA, and FAPRA with TGE).

4.3 Simulator

Tables 9 and 10 show the results of executing the benchmarks of the respective benchmark suites both on the interpreting simulator written in the C programming language and on the libcpu-based simulator, which employs static and dynamic binary translation. The speedup in tables 9 and 10 refers to the execution time only; it does not include the static translation time which the libcpu-based simulator requires on the first run. The binaries were compiled for the FAPRA target with byte-addressing and with link-time optimizations turned on.

The results show that a significant execution speedup can be obtained through binary translation, with a speedup of at least an order of magnitude on almost all benchmarks. The speedup obtained by the libcpu-based simulator is in the range of 3.45 (methcall) to 41.33 (ary3) across the executed benchmarks.

The libcpu-based simulator tries to statically translate as much code as possible. Since the benchmarks are relatively simple programs, most of the static translation time is spent on translating the library functions which are linked in to perform I/O. Since the static translation time is dominated by the library functions, the time taken for static translation is rather similar for all the benchmarks, except for methcall and objinst, which perform only very limited I/O and are thus linked against fewer library functions, leading to a lower static translation time.
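The magnitude of the speedup is plausible given the per-instruction overhead of interpretation. The following sketch (with a hypothetical instruction encoding and opcode set, not the actual simulator source) shows the fetch/decode/dispatch work an interpreter repeats for every executed instruction; a binary translator pays this cost only once per translated block and afterwards runs native host code:

#include <stdint.h>

enum { OP_ADD, OP_SUB, OP_BRA /* ... */ };

struct cpu {
    uint32_t pc;         /* word-addressed program counter */
    uint32_t regs[32];
    uint32_t *mem;
};

void interp_step(struct cpu *c)
{
    uint32_t insn = c->mem[c->pc];     /* fetch */
    uint32_t op = insn >> 26;          /* decode (hypothetical encoding) */
    uint32_t rd = (insn >> 21) & 31;
    uint32_t ra = (insn >> 16) & 31;
    uint32_t rb = (insn >> 11) & 31;

    switch (op) {                      /* dispatch */
    case OP_ADD: c->regs[rd] = c->regs[ra] + c->regs[rb]; c->pc++; break;
    case OP_SUB: c->regs[rd] = c->regs[ra] - c->regs[rb]; c->pc++; break;
    case OP_BRA: c->pc = insn & 0x3FFFFFF; break;
    default:     c->pc++; break;       /* remaining opcodes omitted */
    }
}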

Name        Static translation  Execution time  Execution time       Speedup
            time (s), libcpu    (ms), libcpu    (ms), C interpreter
Bubblesort  21                  112             1853                 16.54
IntMM       22                  12              169                  14.08
Oscar       33                  1834            9062                 4.94
Perm        25                  161             2134                 13.25
Puzzle      23                  537             11398                21.22
Queens      22                  97              1484                 15.29
Quicksort   24                  142             2072                 14.59
RealMM      22                  760             12140                15.97
Towers      26                  145             2143                 14.77

Table 9: Executing the Stanford benchmark suite on simulators.

Name        Static translation  Execution time  Execution time       Speedup
            time (s), libcpu    (ms), libcpu    (ms), C interpreter
ackermann   22                  30              445                  14.83
ary3        21                  545             22528                41.33
fib2        21                  579             10029                17.32
lists       23                  1799            37487                20.83
matrix      23                  1536            25162                16.38
methcall    9                   8217            28376                3.45
nestedloop  22                  65              925                  14.23
objinst     3                   1455            42160                28.97
random      29                  8536            286611               33.57
sieve       25                  2672            49838                18.65

Table 10: Executing the Computer Language Shootout benchmark suite on simulators.

4.4 Experience

During the development of the compiler many bugs were found and fixed in Angora. Since only rather small assembly language programs had been assembled and executed with Angora in the past, this was not surprising.

When link-time optimizations are used, it happens rather often that the branch displacement of relative branches is too small. This is because the optimizer performs a lot of inlining, which increases the code size and thus the distance between branches and their targets. Even for a simple hello world binary, several relative branches need to be replaced with indirect branches because their branch targets are too far away.

Since there is no native logical shift right instruction in the HAPRA/FAPRA instruction set, the logical shift right operation is illegal and is therefore custom lowered to an equivalent instruction sequence (see the sketch at the end of this section). However, the fact that the logical shift right operation is illegal triggered several bugs in the target-independent code generator, since the DAG combiner was (incorrectly) introducing logical shift right operations even though they are not legal.

Implementing support for word-addressing required significant effort because both clang and LLVM assumed that a character unit is 8 bits wide, which does not hold with word-addressing, where a character unit is 32 bits wide. Unfortunately, many different places in the source code were affected, touching basically the whole codebase (frontend, optimizer, backend). In total, more than a hundred different places needed to be adjusted accordingly.
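A minimal sketch of such a lowering in C terms follows; it is an illustration under the assumption that arithmetic shift right, shift left and the bitwise operations are available natively and that >> on int32_t is arithmetic, and the helper name is hypothetical:

#include <stdint.h>

/* Logical shift right synthesized from natively available operations;
 * assumes 0 < n < 32. */
uint32_t lower_lshr(uint32_t x, unsigned n)
{
    /* The arithmetic shift copies the sign bit into the upper n bits... */
    uint32_t shifted = (uint32_t)((int32_t)x >> n);
    /* ...so clear those bits again with a mask built from a left shift. */
    uint32_t mask = ~(0xFFFFFFFFu << (32 - n));
    return shifted & mask;
}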

4.5 Remarks

Interestingly, adding an ANDI and an ORI instruction had only a minor impact on execution time and code size, indicating that those instructions can be omitted.

It turned out that the TSE instruction is actually redundant and can be removed, since it is equal to TGE with switched operands (see the sketch at the end of this section). Still, the measurements clearly show the positive effect of a native comparison instruction.

Adding an unsigned comparison instruction would be useful: the measurements clearly show the positive impact of the (signed) TGE instruction, and a similar effect can be expected for an unsigned counterpart.

It was observed that keeping the constant value 0 in a fixed register greatly reduces the number of LDIH/LDIL instructions; the HAPRA/FAPRA ABI reserves a register for this purpose.

It would make sense to increase the branch displacement of relative branches, since there is still unused space left in the instruction word. This change is especially recommended as, in practice, the size of the branch displacement is rather often insufficient.

Given the low implementation effort and area requirements of a native logical shift right instruction and an XOR instruction, it would make sense to add those two instructions. The large overhead on HAPRA is mostly due to single-bit shifts and the lack of a native multiplication instruction.
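The TSE/TGE redundancy can be stated in two lines of C, assuming (as the names suggest) that TGE computes "greater than or equal" and TSE "smaller than or equal":

/* a <= b holds exactly when b >= a, so TSE is TGE with swapped operands. */
int tge(int a, int b) { return a >= b; }
int tse(int a, int b) { return tge(b, a); }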

5 Conclusion & Future Work

In this thesis, a C toolchain for the HAPRA/FAPRA architecture was developed, comprising an assembler and a C compiler; in addition, a C standard library was ported. The developed C compiler supports the HAPRA target, the word-oriented FAPRA target and the byte-oriented FAPRA target. As shown, the C compiler generates machine code of excellent quality. Additionally, an extremely fast simulator employing binary translation was developed.

With this toolchain it is now easy to compare and improve architectures: new instructions can be added with little effort, and measurements can be performed with a very fast simulator, resulting in short turnaround times.

Since current implementations of the HAPRA/FAPRA architecture do not have an MMU, it is not possible to port the Linux kernel, which requires an MMU in order to implement memory protection and virtual memory. However, a modified version of Linux called µClinux exists which can run on processors without an MMU.

Thus the next logical step would be to port µClinux to the HAPRA/FAPRA architecture. Even though it lacks features like memory protection and virtual memory, advanced features like dynamic linking are still possible. In general, applications which run on the Linux kernel can be ported to the µClinux kernel with minor effort. The µClinux distribution already contains a large set of ported software, about 280 different applications in total, including a lightweight C standard library called µClibc. Porting µClinux consists of two major parts. First, getting µClinux to compile with the C compiler; it is expected that this will expose some remaining bugs in the compiler, and since the kernel uses several GCC extensions, some of them may not be implemented in clang, requiring workarounds. Second, implementing the target-specific parts of µClinux for the HAPRA/FAPRA architecture; this includes aspects like memory layout, interrupt handling, system calls, DMA and others. In addition, drivers for the peripherals need to be developed.

In order to implement support for shared libraries in µClinux, a native linker would be beneficial.

A Appendix

List of abbreviations

ABI    Application Binary Interface
ASIC   Application-Specific Integrated Circuit
AST    Abstract Syntax Tree
CISC   Complex Instruction Set Computer
DAG    Directed Acyclic Graph
DMA    Direct Memory Access
FPGA   Field-Programmable Gate Array
GCC    GNU Compiler Collection
GUI    Graphical User Interface
HASE   Hapra Assembler and Simulation Environment
IR     Intermediate Representation
ISA    Instruction Set Architecture
JIT    Just-In-Time Compilation
LLVM   Low Level Virtual Machine
LR     Link Register
MMU    Memory Management Unit
OS     Operating System
PC     Program Counter
RISC   Reduced Instruction Set Computer
SIMD   Single Instruction, Multiple Data
SP     Stack Pointer
SSA    Static Single Assignment Form
VM     Virtual Machine

References

[AGT89] A.V. Aho, M. Ganapathi, and S.W.K. Tjiang. Code generation using tree matching and dynamic programming. ACM Transactions on Programming Languages and Systems (TOPLAS), 11(4):491–516, 1989.

[BR91] D. Bernstein and M. Rodeh. Global instruction scheduling for superscalar machines. ACM SIGPLAN Notices, 26(6):255, 1991.

[CAC+81] G.J. Chaitin, M.A. Auslander, A.K. Chandra, J. Cocke, M.E. Hopkins, and P.W. Markstein. Register allocation via coloring. Computer Languages, 6(1):47–57, 1981.

[CFR+91] R. Cytron, J. Ferrante, B.K. Rosen, M.N. Wegman, and F.K. Zadeck. Efficiently Computing Static Single Assignment Form and the Control Dependence Graph. ACM Transactions on Programming Languages and Systems (TOPLAS), 13(4):451–490, 1991.

[cla] clang. Accessed April 2010. http://clang.llvm.org.

[CT04] K.D. Cooper and L. Torczon. Engineering a Compiler. Elsevier, 2004.

[GJ+79] M.R. Garey, D.S. Johnson, et al. Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman, San Francisco, 1979.

[HN89] J. Hennessy and P. Nye. Stanford small benchmark suite, 1989.

[Iro83] E.T. Irons. A syntax directed compiler for ALGOL 60. Communications of the ACM, 26(1):14–16, 1983.

[LA] C. Lattner and V. Adve. LLVM Language Reference Manual. Accessed April 2010. http://llvm.org/docs/LangRef.html.

[LA04] C. Lattner and V. Adve. LLVM: A compilation framework for lifelong program analysis & transformation. In Proceedings of the International Symposium on Code Generation and Optimization: Feedback-directed and Runtime Optimization, page 75. IEEE Computer Society, 2004.

[Lat] C. Lattner. TableGen Fundamentals. Accessed April 2010. http://llvm.org/docs/TableGenFundamentals.html.

[Lat02] C.A. Lattner. LLVM: An infrastructure for multi-stage optimization. Master's thesis, Computer Science Dept., University of Illinois at Urbana-Champaign, Urbana, IL, December 2002.

[lib] libcpu. Accessed April 2010. http://www.libcpu.org/.

[LSUA07] M.S. Lam, R. Sethi, J.D. Ullman, and A. Aho. Compilers: Principles, Techniques and Tools. Addison-Wesley, 2007.

[LWPL] C. Lattner, B. Wendling, F.M.Q. Pereira, and J. Laskey. The LLVM Target-Independent Code Generator. Accessed April 2010. http://llvm.org/docs/CodeGenerator.html.

[Muc97] S.S. Muchnick. Advanced Compiler Design and Implementation. Morgan Kaufmann, 1997.

[new] newlib. Accessed April 2010. http://www.sourceware.org/newlib.

[Sho] Computer Language Shootout. Accessed April 2010. http://dada.perl.it/shootout/.

[uCl] uClinux. Accessed April 2010. http://www.uclinux.org.

[WB] M. Woo and M. Brukman. Writing an LLVM Compiler Backend. Accessed April 2010. http://llvm.org/docs/WritingAnLLVMBackend.html.

Declaration

All the work contained within this thesis, except where otherwise acknowledged, was solely the effort of the author. At no stage was any collaboration entered into with any other party.

(Tilmann Scheller)