Lecture 3 Overview of the LLVM Compiler

Lecture 3 Overview of the LLVM Compiler

LLVM Compiler System Lecture 3 • The LLVM Compiler Infrastructure - Provides reusable components for building compilers Overview of the LLVM Compiler - Reduce the time/cost to build a new compiler - Build static compilers, JITs, trace-based optimizers, ... • The LLVM Compiler Framework - End-to-end compilers using the LLVM infrastructure - C and C++ are robust and aggressive: • Java, Scheme and others are in development - Emit C code or native code for X86, Sparc, PowerPC Substantial portions courtesy of Gennady Pekhimenko, Olatunji Ruwase, Chris Lattner, Vikram Adve, and David Koes Carnegie Mellon Carnegie Mellon 2 Three primary LLVM components Tutorial Overview • The LLVM Virtual Instruction Set • Introduction to the running example - The common language- and target-independent IR • LLVM C/C++ Compiler Overview - Internal (IR) and external (persistent) representation - High-level view of an example LLVM compiler • The LLVM Virtual Instruction Set • A collection of well-integrated libraries - IR overview and type-system - Analyses, optimizations, code generators, JIT compiler, garbage • The Pass Manager collection support, profiling, … • Important LLVM Tools - opt, code generator, JIT, test suite, bugpoint • A collection of tools built from the libraries - Assemblers, automatic debugger, linker, code generator, compiler driver, modular optimizer, … Carnegie Mellon Carnegie Mellon 3 4 1 Running Example: Argument Promotion Why is this hard? Consider use of by-reference parameters: • Requires interprocedural analysis: - Must change the prototype of the callee int callee(const int &X) int callee(const int *X) { { return *X+1; // memory load - Must update all call sites à we must know all callers return X+1; } - What about callers outside the translation unit? } int caller() { • Requires alias analysis: int caller() { compiles to int tmp; // stack object return callee(4); tmp = 4; // memory store - Reference could alias other pointers in callee } return callee(&tmp); - Must know that loaded value doesn’t change from function entry to } the load We want: - Must know the pointer is not being stored through int callee(int X) { return X+1; • Reference might not be to a stack object! } int caller() { return • call procedure with constant argument callee(4); } Carnegie Mellon Carnegie Mellon 5 6 Tutorial Overview The LLVM C/C++ Compiler • Introduction to the running example • From a high-level perspective, it is a standard compiler: • LLVM C/C++ Compiler Overview - Compatible with standard makefiles - High-level view of an example LLVM compiler - Uses Clang (or possibly GCC 4.2) C and C++ parser • The LLVM Virtual Instruction Set C file llvmgcc -emit-llvm .o file - IR overview and type-system llvm linker executable • The Pass Manager C++ file llvmg++ -emit-llvm .o file • Important LLVM Tools Compile Time Link Time - opt, code generator, JIT, test suite, bugpoint • Distinguishing features: - Uses LLVM optimizers (not GCC optimizers) - .o files contain LLVM IR/bytecode, not machine code - Executable can be bytecode (JIT’d) or machine code Carnegie Mellon Carnegie Mellon 7 8 2 Looking into events at compile-time Looking into events at link-time .o file llvm linker executable C file llvmgcc .o file .o file .o file C to LLVM Compile-time LLVM Link-time .bc file for LLVM JIT Frontend Optimizer .o file Linker Optimizer “cc1” “gccas” Native Code Native Backend LLVM IR LLVM 40 LLVM Analysis & LLVM .bc 20 LLVM Analysis & executable “ ” Parser Verifier Optimization File Optimization Passes llc Passes Writer C Code Native Optionally “internalizes”: C Compiler Clang (or modified GCC) Backend executable marks most functions as “gcc” Emits LLVM IR as text file internal, to improve IPO “llc –march=c” Lowers C AST to LLVM Dead Global Elimination, IP Constant Propagation, Dead Argument Elimination, Inlining, NOTE: Produces very ugly C. Officially deprecated, but still works fairly well. Reassociation, LICM, Loop Opts, Memory Perfect place for argument Promotion, Dead Store Elimination, ADCE, … promotion optimization! Link in native .o files and libraries here Carnegie Mellon Carnegie Mellon 9 10 Goals of the compiler design Tutorial Overview • Analyze and optimize as early as possible: • Introduction to the running example - Compile-time opts reduce modify-rebuild-execute cycle • LLVM C/C++ Compiler Overview - Compile-time optimizations reduce work at link-time (by shrinking the - High-level view of an example LLVM compiler program) • The LLVM Virtual Instruction Set • All IPA/IPO make an open-world assumption - IR overview and type-system - Thus, they all work on libraries and at compile-time • The Pass Manager - “Internalize” pass enables “whole program” optzn • Important LLVM Tools • One IR (without lowering) for analysis & optzn - opt, code generator, JIT, test suite, bugpoint - Compile-time optzns can be run at link-time too! - The same IR is used as input to the JIT IR design is the key to these goals! Carnegie Mellon Carnegie Mellon 11 12 3 Goals of LLVM Intermediate Representation (IR) LLVM Instruction Set Overview • Easy to produce, understand, and define! • Low-level and target-independent semantics • Language- and Target-Independent - RISC-like three address code • One IR for analysis and optimization - Infinite virtual register set in SSA form - IR must be able to support aggressive IPO, loop opts, scalar opts, … - Simple, low-level control flow constructs high- and low-level optimization! - Load/store instructions with typed-pointers • Optimize as much as early as possible • IR has text, binary, and in-memory forms - Can’t postpone everything until link or runtime - No lowering in the IR! loop: ; preds = %bb0, %loop %i.1 = phi i32 [ 0, %bb0 ], [ %i.2, %loop ] %AiAddr = getelementptr float* %A, i32 %i.1 for (i = 0; i < N; i++) call void @Sum(float %AiAddr, %pair* %P) Sum(&A[i], &P); %i.2 = add i32 %i.1, 1 %exitcond = icmp eq i32 %i.1, %N br i1 %exitcond, label %outloop, label %loop Carnegie Mellon Carnegie Mellon 13 14 LLVM Instruction Set Overview (Continued) LLVM Type System Details • High-level information exposed in the code • The entire type system consists of: - Explicit dataflow through SSA form - Primitives: label, void, float, integer, … • (more on SSA later in the course) • Arbitrary bitwidth integers (i1, i32, i64) - Explicit control-flow graph (even for exceptions) - Derived: pointer, array, structure, function - Explicit language-independent type-information - No high-level types: type-system is language neutral! - Explicit typed pointer arithmetic • Preserve array subscript and structure indexing • Type system allows arbitrary casts: loop: ; preds = %bb0, %loop - Allows expressing weakly-typed languages, like C %i.1 = phi i32 [ 0, %bb0 ], [ %i.2, %loop ] - Front-ends can implement safe languages %AiAddr = getelementptr float* %A, i32 %i.1 for (i = 0; i < N; i++) - Also easy to define a type-safe subset of LLVM call void @Sum(float %AiAddr, %pair* %P) Sum(&A[i], &P); %i.2 = add i32 %i.1, 1 %exitcond = icmp eq i32 %i.1, %N br i1 %exitcond, label %outloop, label %loop See also: docs/LangRef.html Carnegie Mellon Carnegie Mellon 15 16 4 Lowering source-level types to LLVM LLVM Program Structure • Source language types are lowered: • Module contains Functions/GlobalVariables - Rich type systems expanded to simple type system - Module is unit of compilation/analysis/optimization - Implicit & abstract types are made explicit & concrete • Function contains BasicBlocks/Arguments • Examples of lowering: - Functions roughly correspond to functions in C - References turn into pointers: T& à T* • BasicBlock contains list of instructions - Complex numbers: complex float à { float, float } - Each block ends in a control flow instruction - Bitfields: struct X { int Y:4; int Z:2; } à { i32 } • Instruction is opcode + vector of operands - Inheritance: class T : S { int X; } à { S, i32 } - All operands have types - Methods: class T { void foo(); } à void foo(T*) - Instruction result is typed • Same idea as lowering to machine code Carnegie Mellon Carnegie Mellon 17 18 Our Example, Compiled to LLVM Our Example, Compiled to LLVM int callee(const int *X) { internal int %callee(int* %X) { int callee(const int *X) { internal int %callee(int* %X) { return *X+1; // load %tmp.1 = load int* %X return *X+1; // load %tmp.1 = load int* %X } %tmp.2 = add int %tmp.1, 1 } %tmp.2 = add int %tmp.1, 1 int caller() { ret int %tmp.2 int caller() { ret int %tmp.2 int T; // on stack } int T; // on stack } T = 4; // store int %caller() { T = 4; // store int %caller() { return callee(&T); %T = alloca int return callee(&T); %T = alloca int } store int 4, int* %T } store int 4, int* %T %tmp.3 = call int %callee(int* %T) %tmp.3 = call int %callee(int* %T) ret int %tmp.3 ret int %tmp.3 } } • Stack allocation is explicit in LLVM Carnegie Mellon Carnegie Mellon 19 20 5 Our Example, Compiled to LLVM Our Example, Compiled to LLVM int callee(const int *X) { internal int %callee(int* %X) { int callee(const int *X) { internal int %callee(int* %X) { return *X+1; // load %tmp.1 = load int* %X return *X+1; // load %tmp.1 = load int* %X } %tmp.2 = add int %tmp.1, 1 } %tmp.2 = add int %tmp.1, 1 int caller() { ret int %tmp.2 int caller() { ret int %tmp.2 int T; // on stack } int T; // on stack } T = 4; // store int %caller() { T = 4; // store int %caller() { return callee(&T); %T = alloca int return callee(&T); %T = alloca int } store int 4, int* %T } store int 4, int* %T %tmp.3 = call int %callee(int* %T) %tmp.3 = call int %callee(int* %T) ret int %tmp.3 ret int %tmp.3 } } • All

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    9 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us