
Porting GCC Compiler Part I: How GCC Works

Choi, Hyung-Kyu [email protected] July 17, 2003
Microprocessor Architecture and System Software Laboratory

About this presentation material
- Basically this document is based on GCC 2.95.2.
- Some examples are from the Calm32 (Samsung) port of GCC.
- This document has been updated to follow GCC 3.4.

Main Goal of GCC
- Make a good, fast compiler for machines on which the GNU system aims to run, including GNU/Linux variants
  - int is at least a 32-bit type
  - machines have several general registers
  - with a flat (non-segmented) byte addressed data address space (the code can be separate)
  - Target ABIs (application binary interfaces) may have 8, 16, 32 or 64-bit int type
  - char can be wider than 8 bits
- Elegance, theoretical power and simplicity are only secondary

GCC Compilation System
- Compilation system includes the phases
  - Preprocessor
  - Compiler
  - Optimizer
  - Assembler
  - Linker
- Compiler Driver coordinates these phases.

The Structure of GCC Compiler
[Figure: GCC execution. Boxes shown: front ends (C, C++, ObjC), Parsing, TREE, RTL, Global Optimizations (Jump Optimization, Loop Optimization, Common Subexpr. Elimination, Data Flow Analysis), Instruction Combining, Instruction Scheduling, Register Class Preferencing, Register Allocation, Peephole Optimizations, Assembly. Inputs: Machine Description, Macro Definition.]

GCC Compiler flow
For each function:
  Parser
    IR : Tree Representation
  Tree Optimization
  RTL Generation
    IR : Register Transfer Language (RTL)
  RTL Optimization
  Assembler Code Generation
    Assembler Code, with or w/o debugging information

Intermediate Language (1/2)
- RTL (Register Transfer Language)
  - Used to describe insns (instructions)
  - Written in LISP-like form
- e.g.
    (set (mem:SI (reg:SI 54)) (reg:SI 53))
      destination : (mem:SI (reg:SI 54))
      source      : (reg:SI 53)
  i.e. st [r54], r53
- Above RTL statement means "set memory pointed by register 54 with value in register 53"

Intermediate Language (2/2)
- Example of RTL
    (plus:SI (reg:SI 55) (const_int -1))
- Adds two 4-byte integer (SImode) operands.
- First operand is register.
  - Register is also 4-byte integer.
  - Register number is 55.
- Second operand is constant integer.
  - Value is "-1".
  - Mode is VOIDmode (not given).

Intermediate Language : machine mode list
- BImode : a single bit, for predicate registers
- [QI/HI/SI/DI/TI/OI]mode
  - Quarter-Integer (1 byte)
  - Half-Integer (2 bytes)
  - Single Integer (4 bytes)
  - Double Integer (8 bytes)
  - Tetra Integer (16 bytes)
  - Octa Integer (32 bytes)
- PSImode : Partial single integer mode which occupies four bytes but doesn't really use all four, e.g. pointers on some machines
- And many other machine modes, such as floating-point mode, complex mode, etc.
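The LISP-like notation above can be made concrete with a small sketch. The code below is only a toy model written for illustration: the names toy_rtx, toy_mode, toy_print and toy_mode_size are hypothetical and are not GCC's real rtx data structures or functions. It builds an expression tree and prints it in the same form as the slide examples.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-ins for a few integer machine modes (not GCC's real enum). */
enum toy_mode { TOY_VOID, TOY_QI, TOY_HI, TOY_SI, TOY_DI, TOY_TI, TOY_OI };
static const char *const toy_mode_name[] = { "", "QI", "HI", "SI", "DI", "TI", "OI" };
static const int toy_mode_bytes[] = { 0, 1, 2, 4, 8, 16, 32 };

enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS, TOY_MEM, TOY_SET };
static const char *const toy_code_name[] = { "reg", "const_int", "plus", "mem", "set" };

struct toy_rtx {
    enum toy_code code;
    enum toy_mode mode;           /* TOY_VOID means "no mode given" */
    long value;                   /* register number or constant value */
    const struct toy_rtx *op[2];  /* sub-expressions, NULL when unused */
};

/* Size in bytes of an integer mode, following the mode list above. */
int toy_mode_size(enum toy_mode m) { return toy_mode_bytes[m]; }

/* Append S to the NUL-terminated string in BUF without exceeding CAP. */
static void toy_append(char *buf, size_t cap, const char *s)
{
    size_t n = strlen(buf);
    snprintf(buf + n, cap - n, "%s", s);
}

/* Append the LISP-like form of X to BUF (which must already hold a
   NUL-terminated string, e.g. "") and return BUF. */
char *toy_print(const struct toy_rtx *x, char *buf, size_t cap)
{
    char num[32];
    assert(x != NULL && cap > 0);
    toy_append(buf, cap, "(");
    toy_append(buf, cap, toy_code_name[x->code]);
    if (x->mode != TOY_VOID) {
        toy_append(buf, cap, ":");
        toy_append(buf, cap, toy_mode_name[x->mode]);
    }
    if (x->code == TOY_REG || x->code == TOY_CONST_INT) {
        snprintf(num, sizeof num, " %ld", x->value);
        toy_append(buf, cap, num);
    } else {
        for (int i = 0; i < 2 && x->op[i] != NULL; i++) {
            toy_append(buf, cap, " ");
            toy_print(x->op[i], buf, cap);
        }
    }
    toy_append(buf, cap, ")");
    return buf;
}
```

Building a TOY_PLUS node in SImode over register 55 and the constant -1, then calling toy_print on an empty buffer, reproduces the text of the (2/2) example. A constant carries TOY_VOID, which is why no mode suffix is printed for const_int.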

Three Main Conversions in the compiler
- The front end reads the source code and builds a parse tree.
- The parse tree is used to generate an RTL insn list based on named instruction patterns.
- The insn list is matched against the RTL templates to produce assembler code.
[Figure: Parse Tree, then RTL (Register Transfer Language), then Assembler code]

Name, RTL pattern and Assembler code
- Parse tree has pre-defined standard names.
[Figure: Parse Tree (Name 1, Name 2, Name 3); RTL (RTL Pattern 1, RTL Pattern 2, RTL Pattern 3, RTL Pattern 4); after Optimizations (RTL Pattern 1, RTL Pattern 2', RTL Pattern 5); Assembler code (Assembler code 1, Assembler code 2, Assembler code 3)]

Optimizations
- Tree optimization
  - tree based inlining
  - constant folding
  - some arithmetic simplification
- RTL optimization
  - performs many well known optimizations
  - e.g. jump optimization, common subexpression elimination (CSE), loop optimization, if-conversion, instruction scheduling, register allocation, etc.
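As an illustration of what "constant folding" and "some arithmetic simplification" mean at the tree level, here is a minimal sketch on a toy expression tree. The types and the fold function are hypothetical and unrelated to GCC's real tree representation; the sketch only folds additions and multiplications of constants and simplifies e + 0 to e.

```c
#include <assert.h>
#include <stddef.h>

enum fold_kind { FOLD_CONST, FOLD_VAR, FOLD_ADD, FOLD_MUL };

struct fold_node {
    enum fold_kind kind;
    long value;                    /* used when kind == FOLD_CONST */
    struct fold_node *lhs, *rhs;   /* used by FOLD_ADD and FOLD_MUL */
};

/* Fold X in place, bottom-up, and return X. */
struct fold_node *fold(struct fold_node *x)
{
    assert(x != NULL);
    if (x->kind != FOLD_ADD && x->kind != FOLD_MUL)
        return x;                  /* leaves are already as simple as possible */

    fold(x->lhs);
    fold(x->rhs);

    if (x->lhs->kind == FOLD_CONST && x->rhs->kind == FOLD_CONST) {
        /* constant folding: both operands are known at compile time */
        long a = x->lhs->value, b = x->rhs->value;
        x->value = (x->kind == FOLD_ADD) ? a + b : a * b;
        x->kind = FOLD_CONST;
        x->lhs = x->rhs = NULL;
    } else if (x->kind == FOLD_ADD && x->rhs->kind == FOLD_CONST
               && x->rhs->value == 0) {
        /* arithmetic simplification: e + 0 becomes e */
        *x = *x->lhs;
    }
    return x;
}
```

For example, folding the tree for (2 * 3) + i leaves an addition whose left operand is the constant 6, and folding i + 0 leaves just the variable node.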

How to port?
[Figure: the GCC Compiler flow for each function: Parser, Tree Optimization, RTL Generation, RTL Optimization, Assembler Code Generation]
- Should we modify these all? NO!

Then how? (1/2)
- Just write below 3 files in /gcc/config/machine/
  - machine.md : machine description
  - machine.h : target description macros
  - machine.c : user-defined functions used in machine.md and machine.h
- e.g. SPARC
  - sparc.md, sparc.h, sparc.c in /gcc/config/sparc/

Then how? (2/2)
- Then let the Makefile do the two jobs below
  - Generate some .c and .h files from the machine description (machine.md) file
  - Then actually compile the .c and .h files, including the generated files.

Build process (1/2)
[Figure: generator programs read machine.md and produce source files, which are compiled together with machine.h and machine.c]
  genattr    : insn-attr.h
  genattrtab : insn-attrtab.c
  genflags   : insn-flags.h
  genemit    : insn-emit.c
  genrecog   : insn-recog.c
  genextract : insn-extract.c
  gencodes   : insn-codes.h
  genoutput  : insn-output.c
  genconfig  : insn-config.h

Build process (2/2)
[Figure: machine.md, machine.h, machine.c and the generated files feed each stage]
  RTL generation                   : insn-flags.h, insn-emit.c, insn-codes.h
  RTL Optimization                 : insn-recog.c, insn-extract.c, insn-attr.h, insn-attrtab.c
  Assembler Code Generation        : insn-output.c
  Debugging information Generation : DBX, SDB, DWARF, DWARF2, VMS are supported

Machine Description
- machine.md contains machine description
- CPU description
  - Functional Units, Latency, etc.
- RTL Patterns
  - All kinds of RTL patterns which can be generated
  - Used when converting Tree into RTL
- Assembler mnemonic
- etc.

Target Description Macro
- machine.h contains target description macros
- Storage layout
  - alignment, endian, structure padding, etc.
- ABI (Application Binary Interface)
  - calling convention, stack layout, etc.
- Register usage
  - allocation strategy, how values fit in registers, etc.
- Defining output assembler language
- Controlling debugging information format support
- etc.

Porting GCC Compiler Part II: In Detail

GCC Compiler flow
[Figure: the compiler flow diagram from Part I, repeated]

RTL Generation
- Convert parse tree into RTL
- How?
  - Tree has pre-defined standard pattern names
  - For each standard pattern name, generate RTL insn list based on named instruction patterns defined in machine.md
- Example of standard pattern names
  - e.g. "addsi3", "movsi", etc.
  - "addsi3" : which means "add op2 and op1, and store the result in op0, where all operands have SImode"

RTL Generation Example 1
- Machine description : define_insn
  Parts: Name, RTL Template, Condition (optional), Output Template, Attributes

    (define_insn "addc"                                     ; Name
      [(set (match_operand:SI 0 "arith_reg_operand" "=r")   ; RTL Template
            (plus:SI (plus:SI (match_operand:SI 1 "arith_reg_operand" "0")
                              (match_operand:SI 2 "arith_reg_operand" "r"))
                     (reg:SI 18)))
       (set (reg:SI 18)
            (ltu:SI (plus:SI (match_dup 1) (match_dup 2)) (match_dup 1)))]
      ""                                                    ; Condition (optional)
      "ADC\\t%0,%2"                                         ; Output Template
      [(set_attr "type" "arith")])                          ; Attributes

RTL Generation Example 1a
- Standard name in Tree

  In addsi3.c
    int i;
    int main()
    {
      i = i + 1;
    }

  Standard names
    movsi  ; r54 <- #i
    movsi  ; r55 <- #i
    movsi  ; r56 <- mem(r55)
    addsi3 ; r57 <- r56 + 1
    movsi  ; mem(r54) <- r57

RTL Generation Example 1b
- Find RTL pattern by name defined in machine.md

  In machine.md
    (define_insn "addsi3"
      [(set (match_operand:SI 0 "arith_reg_operand" "=r,r")
            (plus:SI (match_operand:SI 1 "arith_operand" "%0,0")
                     (match_operand:SI 2 "arith_operand" "r,I")))
       (clobber (reg:SI 18))]
      ""
      . . . )

  In insn-emit.c
    rtx
    gen_addsi3 (operand0, operand1, operand2)
         rtx operand0;
         rtx operand1;
         rtx operand2;
    {
      return gen_rtx_PARALLEL (VOIDmode,
        gen_rtvec (2,
          gen_rtx_SET (VOIDmode, operand0,
            gen_rtx_PLUS (SImode, operand1, operand2)),
          gen_rtx_CLOBBER (VOIDmode,
            gen_rtx_REG (SImode, 18))));
    }

RTL Generation Example 1c
- Generate RTL from Tree

    movsi  ; r54 <- #i
    movsi  ; r55 <- #i
    movsi  ; r56 <- mem(r55)
    addsi3 ; r57 <- r56 + 1
    movsi  ; mem(r54) <- r57

  In addsi3.c.rtl
    (insn 14 13 16 (parallel[
                (set (reg:SI 57)
                    (plus:SI (reg:SI 56)
                        (const_int 1 [0x1])))
                (clobber (reg:SI 18 t))
            ] ) -1 (nil)
        (nil))

RTL Generation Example 2
- Machine description : define_expand
  Parts: Name, RTL Template, Condition (optional), Preparation statement

    (define_expand "zero_extendhisi2"
      [(set (match_operand:SI 0 "register_operand" "")
            (and:SI (subreg:SI (match_operand:HI 1 "register_operand" "") 0)
                    (match_dup 2)))]
      ""
      "operands[2] = force_reg (SImode, GEN_INT (65535));")

RTL Generation Example 2a
- We can generate RTL sequences from a standard name while generating RTL patterns, by using "define_expand" instead of "define_insn".

  Before RTL generation
    . . .
    zero_extendhisi2 ; r54 <- r52
    . . .

  After RTL generation
    (insn 23 22 24 0x0 (set (reg:SI 55)
            (const_int 65535 [0xffff])) -1 (nil)
        (nil))
    (insn 24 23 25 0x0 (set (reg:SI 54)
            (and:SI (subreg:SI (reg:SI 52) 0)
                (reg:SI 55))) -1 (nil)
        (nil))

RTL Generation Example 3
- Machine description : define_split
  Parts: insn pattern, Condition, new insn patterns, preparation statements

    (define_split
      [(set (match_operand:DI 0 "arith_reg_operand" "=r")
            (plus:DI (match_operand:DI 1 "arith_reg_operand" "%0")
                     (match_operand:DI 2 "arith_reg_operand" "r")))
       (clobber (reg:SI 18))]
      "reload_completed"
      [. . .]
      "{ . . . DONE;}")

RTL Generation Example 3a
- We can also split a generated insn pattern into new insn patterns, by using "define_split" instead of "define_insn".

  In adddi3.c
    long i;
    int main()
    {
      i = i+1;
    }

  While RTL generation
    movdi  ; r54 <- #i
    movdi  ; r55 <- #i
    movdi  ; r56 <- mem(r55)
    movdi  ; r57 <- const 1
    adddi3 ; r58 <- r56+r57      (let's split)
    movdi  ; mem(r54) <- r58

  After splitting standard name
    movdi  ; r54 <- #i
    movdi  ; r55 <- #i
    movdi  ; r56 <- mem(r55)
    movdi  ; r57 <- const 1
    addsi3 ;
    addc   ;
    movdi  ; mem(r54) <- r58

Name, RTL pattern and Assembler code
- There can be only one unique name (standard or not) for one RTL pattern.
- But one name can have multiple RTL patterns.
[Figure: Parse Tree (Name 1, Name 2, Name 3); RTL (RTL Pattern 1, RTL Pattern 2, RTL Pattern 3, RTL Pattern 4); after Optimizations (RTL Pattern 1, RTL Pattern 2', RTL Pattern 5); Assembler code (Assembler code 1, Assembler code 2, Assembler code 3)]

GCC Compiler flow
[Figure: the compiler flow diagram from Part I, repeated]

Assembly code Generation Example 1a
- Let's think about the previous example

  In adddi3.c
    long i;
    int main()
    {
      i = i+1;
    }

  After splitting standard name (and just before Assembly code generation)
    movdi  ;
    movdi  ;
    movdi  ;
    movdi  ;
    addsi3 ; r2 <- r2 + r4
    addc   ; r1 <- r1 + r3
    movdi  ;

Assembly code Generation Example 1b
- Find asm output by RTL pattern matching

  In machine.md
    (define_insn "addc"
      [(set (match_operand:SI 0 "arith_reg_operand" "=r")
            (plus:SI (plus:SI (match_operand:SI 1 "arith_reg_operand" "0")
                              (match_operand:SI 2 "arith_reg_operand" "r"))
                     (reg:SI 18)))
       (set (reg:SI 18)
            (ltu:SI (plus:SI (match_dup 1) (match_dup 2)) (match_dup 1)))]
      ""
      "ADC\\t%0,%2"
      [(set_attr "type" "arith")])

  In insn-output.c
    const char * insn_template[] = {
      . . .
      "ADC\t%0,%2",
      . . .
    };

Assembly code Generation Example 1c
- Find asm output by RTL pattern matching

  After splitting standard name (and just before Assembler code generation)
    . . .
    addsi3 ; r2 <- r2 + r4
    addc   ; r1 <- r1 + r3
    . . .

  In adddi3.s
    . . .
    ADD r2,r4
    ADC r1,r3
    . . .
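To make the last step of the example concrete, the sketch below shows how an output template such as "ADC\t%0,%2" could be turned into an assembler line once operand strings are known. This is a simplified illustration only: toy_expand_template is a hypothetical helper, not the routine GCC uses, and it handles only %0 to %9 (real templates also use other escapes, such as the %# seen in the "return" pattern).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Expand %0..%9 in TMPL using OPERANDS and store the result in OUT
   (capacity CAP, always NUL-terminated).  Returns the length written.
   Output that does not fit is truncated. */
size_t toy_expand_template(const char *tmpl, const char *const operands[],
                           size_t n_operands, char *out, size_t cap)
{
    size_t len = 0;
    assert(tmpl != NULL && out != NULL && cap > 0);
    for (const char *p = tmpl; *p != '\0'; p++) {
        if (p[0] == '%' && isdigit((unsigned char)p[1])
            && (size_t)(p[1] - '0') < n_operands) {
            const char *s = operands[p[1] - '0'];
            size_t slen = strlen(s);
            if (len + slen >= cap)
                break;              /* no room left: stop here */
            memcpy(out + len, s, slen);
            len += slen;
            p++;                    /* skip the operand digit */
        } else {
            if (len + 1 >= cap)
                break;
            out[len++] = *p;        /* ordinary character: copy as-is */
        }
    }
    out[len] = '\0';
    return len;
}
```

With operands {"r1", "r1", "r3"}, the template "ADC\t%0,%2" expands to ADC, a tab, then r1,r3, which corresponds to the ADC line shown in adddi3.s. Operand 1 does not appear in the template because its constraint "0" ties it to operand 0.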

GCC Compiler flow
[Figure: the compiler flow diagram from Part I, repeated]

Optimization and RTL Example 1a
- Let's think about two different RTL patterns for one standard name

    (define_insn "addsi3"
      [(set (match_operand:SI 0 "arith_reg_operand" "=r,r")
            (plus:SI (match_operand:SI 1 "arith_operand" "%0,0")
                     (match_operand:SI 2 "arith_operand" "r,I")))
       (clobber (reg:SI 18))]
      ""
      . . . )

Optimization and RTL Example 1b
- For same example as before, with (clobber 18 t)

  In RTL form
    addsi3 (defines register 18):
      [(set (match_operand:SI 0 "arith_reg_operand" "=r,r")
            (plus:SI (match_operand:SI 1 "arith_operand" "%0,0")
                     (match_operand:SI 2 "arith_operand" "r,I")))
       (clobber (reg:SI 18))]
    addc (uses register 18):
      [(set (match_operand:SI 0 "arith_reg_operand" "=r")
            (plus:SI (plus:SI (match_operand:SI 1 "arith_reg_operand" "0")
                              (match_operand:SI 2 "arith_reg_operand" "r"))
                     (reg:SI 18)))
       (set (reg:SI 18)
            (ltu:SI (plus:SI (match_dup 1) (match_dup 2)) (match_dup 1)))]

  Before optimization        After instruction scheduling
    . . .                      . . .
    addsi3                     addsi3
    addc                       addc
    movsi                      movsi
    movsi                      movsi
    . . .                      . . .

Optimization and RTL Example 1b
- For same example as before, without (clobber 18 t)

  After instruction scheduling
    . . .
    addc
    movsi
    addsi3
    movsi
    . . .
  Sequence changed! Wrong!!
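Why the clobber matters can be sketched with a simple register dependence test. The code below is an assumption-laden toy, not GCC's scheduler: each instruction is reduced to two bit sets (registers it uses and registers it defines or clobbers), and toy_must_keep_order is a hypothetical helper that reports whether two instructions may be reordered. Register numbers follow the slides (r1 to r4 and register 18 for the carry).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Registers read and registers written (set or clobbered) by one insn. */
struct toy_insn {
    uint64_t uses;
    uint64_t defs;
};

/* Bit mask for register number R (toy limit: registers 0..63). */
uint64_t toy_reg(unsigned r)
{
    assert(r < 64);
    return (uint64_t)1 << r;
}

/* True if SECOND must stay after FIRST because of a register dependence. */
bool toy_must_keep_order(struct toy_insn first, struct toy_insn second)
{
    return (first.defs & (second.uses | second.defs)) != 0  /* read/write after write */
        || (first.uses & second.defs) != 0;                 /* write after read */
}
```

With addsi3 described as using r2 and r4 and defining r2 plus register 18, and addc as using r1, r3 and register 18, the test reports a dependence, so the order is kept. If register 18 is left out of the addsi3 description, no dependence is visible and a scheduler is free to move addc ahead of addsi3, which is the wrong sequence shown above.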

Optimization and attributes 2a
- You can introduce attributes by using "define_attr"
- You can assign multiple attributes for each RTL pattern

  In machine.md
    (define_attr "needs_delay_slot" "yes,no" (const_string "no"))
    (define_attr "in_delay_slot" "yes,no"
      (cond [(eq_attr "type" "arith") (const_string "no")]
            (const_string . . .)))

    (define_insn "addc"
      [ . ]
      ""
      "ADC\\t%0,%2"
      [(set_attr "type" "arith")])

    (define_insn "return"
      [ . ]
      " . "
      "JMPD\\tR14%#"
      [(set_attr "type" "return")
       (set_attr "needs_delay_slot" "yes")])

Optimization and attributes 2b
- Set "in_delay_slot" attribute of "addc" to be "no"
  (same machine.md listing as in 2a: "addc" has type "arith", so the cond in "in_delay_slot" yields "no")

Optimization and attributes 2c
- Finally you can specify delay slot scheduling policy by "define_delay"
- e.g. "addc" can not be in delay slot of "return"

  In machine.md
    (define_delay (eq_attr "needs_delay_slot" "yes")
      [(eq_attr "in_delay_slot" "yes") (nil) (nil)])

    (define_insn "addc"
      . . .
      [(set_attr "type" "arith")])

    (define_insn "return"
      . . .
      [(set_attr "type" "return")
       (set_attr "needs_delay_slot" "yes")])

Porting GCC Compiler Part III: Other Things

Not explained here 1a
- When generating RTL from Tree
  - Find RTL template by name?
  - No, also check machine mode and predicate for each operand

    (define_insn "addsi3"
      [(set (match_operand:SI 0 "arith_reg_operand" "=r,r")
            (plus:SI (match_operand:SI 1 "arith_operand" "%0,0")
                     (match_operand:SI 2 "arith_operand" "r,I")))]
      . . . )

- How and where should we define a predicate such as "arith_operand"?

Not explained here 1b
- Predicate is a C function with 2 arguments, defined in machine.c

  In machine.c
    /* Returns 1 if OP is a valid source operand for an arithmetic insn.  */
    int
    arith_operand (rtx op, enum machine_mode mode)
    {
      if (arith_reg_operand (op, mode))
        return 1;
      if (GET_CODE (op) == CONST_INT
          && CONST_OK_FOR_I (INTVAL (op)))
        return 1;
      return 0;
    }

Not explained here 1c
- What happens if there is no matching RTL template?
  - e.g. If "addsi3" accepts only register operands
  - GCC automatically converts the operand by generating a "movm" pattern (m is the operand's machine mode) to generate RTL
  - If that fails, just abort!

  In addsi3.c
    int i;
    int main()
    {
      i = i + 1;
    }

    movsi  ; r54 <- #i
    movsi  ; r55 <- #i
    movsi  ; r56 <- mem(r55)
    addsi3 ; r57 <- r56 + 1      (immediate operand; a movsi into a register is generated automatically by GCC)
    movsi  ; mem(r54) <- r57
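The predicate check and the automatic fallback described above can be sketched in a self-contained form. Everything here is a toy under stated assumptions: the toy_operand type, the function names, and in particular the immediate range -128..127 (standing in for a target's CONST_OK_FOR_I) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_kind { TOY_OP_REG, TOY_OP_CONST };

struct toy_operand {
    enum toy_kind kind;
    long value;          /* register number or constant value */
};

/* Hypothetical immediate range, standing in for CONST_OK_FOR_I. */
static bool toy_const_ok_for_i(long v)
{
    return v >= -128 && v <= 127;
}

static bool toy_arith_reg_operand(const struct toy_operand *op)
{
    return op->kind == TOY_OP_REG;
}

/* Toy version of the arith_operand predicate: register or small constant. */
bool toy_arith_operand(const struct toy_operand *op)
{
    if (toy_arith_reg_operand(op))
        return true;
    if (op->kind == TOY_OP_CONST && toy_const_ok_for_i(op->value))
        return true;
    return false;
}

/* If OP fails the predicate, copy it into a fresh register instead.
   *MOVES_EMITTED counts the extra "mov" insns this would generate. */
struct toy_operand toy_legitimize(struct toy_operand op, int *next_reg,
                                  int *moves_emitted)
{
    if (toy_arith_operand(&op))
        return op;
    struct toy_operand reg = { TOY_OP_REG, (*next_reg)++ };
    (*moves_emitted)++;
    return reg;
}
```

In this sketch the constant 1 passes the predicate and is used directly, while a constant outside the assumed range is replaced by a new register and one extra move is counted, mirroring the "generate a mov pattern first" behaviour on the slide.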

Not explained here 2
- In this presentation, only the flow of the GCC Compiler is explained.
- No implementation detail
  - RTL syntax
  - Machine descriptions in machine.md
  - Target Macros in machine.h
  - etc.

Limitation of GCC
- You don't have to consider optimization every time. But sometimes you should consider!
  - e.g. "delay slot"
- When a new architecture feature is introduced, we can't port by the method explained before.
  - You should modify the core part of the GCC Compiler.
  - e.g. There was no support for "register window" in old versions.
  - e.g. There was no support for 1-bit registers before, such as the predicate registers in Itanium.

Summary
[Figure: machine.md, machine.h, machine.c and the generated files feed each stage]
  RTL generation                   : insn-flags.h, insn-emit.c, insn-codes.h
  RTL Optimization                 : insn-recog.c, insn-extract.c, insn-attr.h, insn-attrtab.c
  Assembler Code Generation        : insn-output.c
  Debugging information Generation : DBX, SDB, DWARF, DWARF2, VMS are supported

References
- GCC home page : http://gcc.gnu.org
- GCC Internals Manual : http://gcc.gnu.org/onlinedocs/
  - Especially Ch.7 ~ Ch.11
- crossgcc (mailing list) archives : http://sources.redhat.com/ml/crossgcc/