Retargeting Open64 to a RISC Processor
Retargeting Open64 to a RISC Processor -- A Student's Perspective

Huimin Cui, Xiaobing Feng
Key Laboratory of Computer System and Architecture, Institute of Computing Technology, CAS, 100190 Beijing, China
{cuihm, fxb}@ict.ac.cn

Abstract

This paper presents a student's experience in developing an Open64-MIPS prototype and summarizes three retargeting observations. Open64 is easy to retarget, and the procedure takes only a short period. Once the retarget is done, the compiler achieves good and stable performance. Open64 also provides substantial support for debugging, with which a beginner can debug the compiler without difficulty. We also share some experiences from our retarget, including a methodology for verifying the compiler framework, attention to switches in Open64, and the importance of debugging and of reading the generated code.

1 Introduction

Open64 receives contributions from a number of compiler groups around the world, from industry as well as from academia. It is derived from the SGI MIPSPro64 compiler [1]. Its good performance and industrial-strength origin make the Open64 compiler a popular choice for research projects. The Open64 compiler supports C/C++ and Fortran95, and many researchers from industry and academia have contributed to the retargeting of the Open64 compiler [6].

Open64 has been retargeted to a number of architectures. PathScale modified Open64 to create EkoPath, a compiler for the AMD64/x86-64 architecture. The University of Delaware's Computer Architecture and Parallel Systems Laboratory (CAPSL) modified Open64 to create the Kylin Compiler, a compiler for Intel's XScale architecture [1]. Besides the targets mentioned above, there are several other supported targets, including PowerPC [6], NVISA [9], Simplight [10] and Qualcomm [11].

In this paper, we discuss three issues that arise when retargeting Open64: ease of retarget, performance, and debuggability. The discussion is based on our work in retargeting Open64 to the MIPS platform, using the Simplight branch as our starting point; the Simplight target is a RISC-style DSP with a mixed 32/16-bit instruction set. During our discussion, we use GCC (MIPS target) for comparison. This retarget work is just a student project, so we have some non-goals, which are discussed in section 5.

We also share four of our retargeting experiences: 1) steps to verify the compiler framework; 2) turning switches on or off for special hardware features; 3) the importance of debugging, not only for the retarget itself but also for later extension work; 4) paying attention to the generated code when it does not work as expected.

The paper is organized as follows. Section 2 introduces the retarget procedure. Section 3 presents our major retarget results and analysis. Section 4 summarizes our experiences and expectations. Section 5 shows our non-goals and section 6 concludes the paper.

2 Suggested Retarget Procedure

This section presents our retargeting procedure. It is applicable if one can find a similar processor among the targets Open64 already supports. Here, letters (A, B, ...) represent the file creation order, and numbers (1, 2, ...) represent the building order.

(A). Add make rules for your target (<targ>) in the top level Makefile.gsetup.

(B). Create dir targia32_targ<targ>.
    This is the build directory for your new target. Host is specified as ia32.

(C). Create and set up a Makefile in each subdir under targia32_targ<targ>.
    Make sure BUILD_TARGET is set correctly so that the build goes to the right target directory for each sub component.
    Set your BUILD_VENDOR correctly.

(1). Build include

(D). In common/com, create <targ>/config_targ.h.
    Add macros for Is_Target_xxx, where xxx represents a supported processor of the new target, e.g., Is_Target_R10K() for MIPS.
    Add ABI macros for the new target.
    Set GP area size.
    Set target debugging information.
    In common/util, create <targ>/c_qwmultu.c by copying from ia64, or as appropriate.

(2). Build libcmplrs, libiberty, libcomutil
    Sometimes it helps to show the compile line while building the compiler; to do so, simply change gcommondefs and gcommonrules in linux/make/.

(E). In common/targ_info, create the <targ> dirs and their files. Use this order:
    isa/<targ>/isa.cxx
    isa/<targ>/isa_properties.cxx
    isa/<targ>/isa_subset.cxx
    isa/<targ>/isa_registers.cxx
    isa/<targ>/isa_enums.cxx
    isa/<targ>/isa_lits.cxx
    isa/<targ>/isa_operands.cxx
    isa/<targ>/isa_hazards.cxx
    isa/<targ>/isa_print.cxx
    isa/<targ>/isa_pack.cxx
    isa/<targ>/isa_bundle.cxx
    isa/<targ>/isa_decode.cxx
    isa/<targ>/isa_pseudo.cxx
    abi/<targ>/abi_properties.cxx
    proc/<targ>/proc.cxx
    proc/<targ>/proc_properties.cxx
    proc/<targ>/<targ>_si.cxx
    This directory contains most of the basic settings for retarget purposes. Most retarget problems will be due to wrong settings in this directory, so all these files should be written carefully.

(3). Build targ_info

(F). Deal with the frontend in the kgccfe and kg++fe directories.
    In kgccfe/gnu, kgccfe/gnu/config, kg++fe/gnu and kg++fe/gnu/config, create a <targ> directory by copying from ia64 or x8664 and fixing the values, the include file names and the define statements.
    Fix up gnu_config.h to include the corresponding config.h file for your target.
    Fix up <targ>/config.h, <targ>/hconfig.h, <targ>/tconfig.h and <targ>/tm_p.h to include the corresponding files for your target.
    Make sure Makefile and Makefile.gbase also know to go to the gnu/<targ> directory. This requires checking whether gnu/<targ> has been added to SRC_DIRS.
    In include/elf.h, the #define for Elf32_Byte and Elf64_Byte can always be enabled.
    If intrinsics need to be added, the following files need changing:
        intrinsic.def: defines INTRINSIC_ID and the properties of each intrinsic (order is important, since entries are indexed by INTRINSIC_ID)
        FE definition files in kgccfe/gnu:
            Builtins.def: id, name, prototype, attribute
            Builtin-types.def: needed if a new type is needed
            Wfe_expr.cxx: translates GNU builtins to WHIRL

(G). Create the following files:
    common/com/<targ>/config_platform.h
    common/com/<targ>/config_asm.h
    common/com/<targ>/config_cache_targ.cxx
    common/com/<targ>/config_elf_targ.cxx
    common/com/<targ>/config_host.c
    common/com/<targ>/config_platform.c
    common/com/<targ>/config_targ.cxx
    common/com/<targ>/config_targ.h
    common/com/<targ>/config_targ_opt.cxx
    common/com/<targ>/config_targ_opt.h
    common/com/<targ>/targ_const.cxx
    common/com/<targ>/targ_const_private.h
    common/com/<targ>/targ_ctrl.h
    common/com/<targ>/targ_em_const.cxx
    common/com/<targ>/targ_em_dwarf.cxx
    common/com/<targ>/targ_em_dwarf.h
    common/com/<targ>/targ_em_elf.cxx
    common/com/<targ>/targ_em_elf.h
    common/com/<targ>/targ_sim.cxx
    common/com/<targ>/targ_sim.h
    All these files can be copied from the directory of a similar target and modified as required.
    Make sure the endianness is set correctly. It is set in config_host.c, config_targ.cxx and common/com/config.cxx.
    Make sure the assembly syntax for .s file output is set correctly. It is set in config_asm.h.
    Make sure the calling convention and parameter passing are set correctly. They are set in targ_sim.h and targ_sim.cxx. If you need your own ABI definition, change targ_sim.h, and do not forget to modify targ_sim.cxx for consistency.
    config_cache_targ.cxx sets cache models.
    The targ_em_* files are for assembly output control, section types, relocations, dwarf etc. One only needs to change these files when focusing on the BE portion.
    Basically, one needs to tailor the files mentioned above and then continue building components as follows.

(4). Build gccfe
(5). Build g++fe
(6). Build ir_tools

(H). Create the be/be/<targ> and be/com/<targ> directories.
    Create driver_targ.cxx and fill_align_targ.cxx in the be/be/<targ> dir.
    Create betarget.cxx and sections.cxx in the be/com/<targ> dir.
    There is no need to be optimal at this point: just make it work, and then make it perform. For example, one can set Can_Do_Fast_Multiply to return FALSE and return to it later.

(7). Build be
(8). Build libelf, libelfutil, libdwarf, libunwindP

(I). Create be/cg/<targ>/register_targ.h, be/cg/<targ>/tn_targ.h and be/cg/<targ>/op_targ.h.
    Create one's own target specific portion in lib/elf.h (or the appropriate elf.h), and define the right relocations, sections, etc.

(9). Build wopt

(J). Create files in be/cg/<targ>.
    Fix cgtarget_arch.h, such as CGTARG_Copy_Op, ...
    There might be a need to add more things in be/cg/op.h.
    Fix cgdwarf_targ.cxx, Find_Spill_TN, ..., and the unwind table in dwarf.
    Copy other files from a similar processor's directory. Go over expand.cxx and whirl2ops.cxx carefully. In addition, exp_branch.cxx, exp_divrem.cxx, exp_loadstore.cxx, expand.cxx and entry_exit_targ.cxx need to be changed or tailored substantially.

(10). Build cg
(11). Build driver
(12). Build ipl
(13). Build lno
(14). Build inline
(15). Build whirl2c
(16). Build ipa

3 Retarget Results & Discussion

3.1 Machine Assumption & benchmarks

Machine Assumption:
    Processor: Loongson 2F, out of order
    ISA: MIPS3
    Frequency: 666 MHz
    Issue width: 4
    ALU: 2
    FALU: 2
    MEM Unit: 1

It is customary to use SPEC CPU2000 or 2006 to evaluate compiler performance, but that is a non-goal for us (discussed in section 5). As a student project, we only consider the following

3.2 Main Results

Observation 1 (See also section 3.3.1). A student can complete the retarget procedure in about two months, if an appropriate supported target processor can be used to start the retargeting. In our case, the Simplight SL1 processor is a RISC-based processor with 16-bit instructions and DSP extensions. We chose it as our starting point because its ISA is RISC-based.

Observation 2 (See also section 3.3.2). The performance of the retargeted Open64 compiler is satisfying, in two respects.

1. We expected the retargeted Open64 compiler to generate very good code thanks to its excellent middle-end components such as Wopt and Lno. Indeed, we achieved competitive or better performance compared to a mature GCC compiler for all the benchmarks we tested.

2. Open64 provides an obvious performance improvement when the optimization option is switched from -O2 to -O3, especially for C++ programs with higher abstraction levels.
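Comparisons such as those in Observation 2 are commonly quantified as a per-benchmark speedup ratio, summarized across benchmarks by a geometric mean. The following is a minimal sketch of that arithmetic, not the measurement method used in this paper: the function names, benchmark names and run times are hypothetical placeholders chosen for illustration.

```python
import math


def speedup(baseline_seconds, candidate_seconds):
    """Return how many times faster the candidate run is than the baseline."""
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("run times must be positive")
    return baseline_seconds / candidate_seconds


def geometric_mean(ratios):
    """Summarize a list of speedup ratios with their geometric mean."""
    if not ratios:
        raise ValueError("need at least one ratio")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical run times in seconds (placeholders, NOT results from the paper).
o2_times = {"bench_a": 10.0, "bench_b": 8.0}
o3_times = {"bench_a": 8.0, "bench_b": 8.0}

# One ratio per benchmark: time at -O2 divided by time at -O3.
ratios = [speedup(o2_times[name], o3_times[name]) for name in o2_times]
summary = geometric_mean(ratios)
print(f"geometric-mean speedup of -O3 over -O2: {summary:.3f}")
```

A ratio above 1 means the candidate configuration is faster. The geometric mean is used rather than the arithmetic mean because it treats a 2x speedup and a 2x slowdown symmetrically. The same two functions apply unchanged to an Open64-versus-GCC comparison.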