Lecture 3 Overview of the LLVM Compiler

Substantial portions courtesy of Gennady Pekhimenko, Olatunji Ruwase, Chris Lattner, Vikram Adve, and David Koes
Carnegie Mellon

LLVM Compiler System
• The LLVM Compiler Infrastructure
  - Provides reusable components for building compilers
  - Reduce the time/cost to build a new compiler
  - Build static compilers, JITs, trace-based optimizers, ...
• The LLVM Compiler Framework
  - End-to-end compilers using the LLVM infrastructure
  - C and C++ are robust and aggressive:
    • Java, Scheme and others are in development
  - Emit C code or native code for X86, Sparc, PowerPC

Three primary LLVM components
• The LLVM Virtual Instruction Set
  - The common language- and target-independent IR
  - Internal (IR) and external (persistent) representation
• A collection of well-integrated libraries
  - Analyses, optimizations, code generators, JIT compiler, garbage collection support, profiling, …
• A collection of tools built from the libraries
  - Assemblers, automatic debugger, linker, code generator, compiler driver, modular optimizer, …

Tutorial Overview
• Introduction to the running example
• LLVM C/C++ Compiler Overview
  - High-level view of an example LLVM compiler
• The LLVM Virtual Instruction Set
  - IR overview and type-system
• The Pass Manager
• Important LLVM Tools
  - opt, code generator, JIT, test suite, bugpoint

Running Example: Argument Promotion
Consider use of by-reference parameters:

    int callee(const int &X)
    {
      return X+1;
    }
    int caller() {
      return callee(4);
    }

compiles to

    int callee(const int *X) {
      return *X+1;          // memory load
    }
    int caller() {
      int tmp;              // stack object
      tmp = 4;              // memory store
      return callee(&tmp);
    }

We want:

    int callee(int X) {
      return X+1;
    }
    int caller() {
      return callee(4);
    }

• call procedure with constant argument

Why is this hard?
• Requires interprocedural analysis:
  - Must change the prototype of the callee
  - Must update all call sites → we must know all callers
  - What about callers outside the translation unit?
• Requires alias analysis:
  - Reference could alias other pointers in callee
  - Must know that loaded value doesn’t change from function entry to the load
  - Must know the pointer is not being stored through
• Reference might not be to a stack object!

The LLVM C/C++ Compiler
• From a high-level perspective, it is a standard compiler:
  - Compatible with standard makefiles
  - Uses Clang (or possibly GCC 4.2) C and C++ parser

    Compile Time                              Link Time
    C file   → llvmgcc -emit-llvm → .o file ┐
                                            ├→ llvm linker → executable
    C++ file → llvmg++ -emit-llvm → .o file ┘

• Distinguishing features:
  - Uses LLVM optimizers (not GCC optimizers)
  - .o files contain LLVM IR/bytecode, not machine code
  - Executable can be bytecode (JIT’d) or machine code

Looking into events at compile-time

    C file → llvmgcc → .o file

    llvmgcc: C to LLVM Frontend (“cc1”) → Compile-time Optimizer (“gccas”)

• C to LLVM Frontend (“cc1”)
  - Clang (or modified GCC)
  - Emits LLVM IR as text file
  - Lowers C AST to LLVM
• Compile-time Optimizer (“gccas”)
  - LLVM IR Parser → LLVM Verifier → 40 LLVM Analysis & Optimization Passes → LLVM .bc File Writer
  - Passes: Dead Global Elimination, IP Constant Propagation, Dead Argument Elimination, Inlining, Reassociation, LICM, Loop Opts, Memory Promotion, Dead Store Elimination, ADCE, …

Looking into events at link-time

    .o file, .o file, ... → llvm linker → executable

    LLVM Linker → Link-time Optimizer → .bc file for LLVM JIT
                                      → Native Code Backend (“llc”) → Native executable
                                      → C Code Backend (“llc -march=c”) → C Compiler (“gcc”) → Native executable

• LLVM Linker
  - Link in native .o files and libraries here
• Link-time Optimizer
  - 20 LLVM Analysis & Optimization Passes
  - Optionally “internalizes”: marks most functions as internal, to improve IPO
  - Perfect place for argument promotion optimization!
• C Code Backend (“llc -march=c”)
  - NOTE: Produces very ugly C. Officially deprecated, but still works fairly well.

Goals of the compiler design
• Analyze and optimize as early as possible:
  - Compile-time opts reduce modify-rebuild-execute cycle
  - Compile-time optimizations reduce work at link-time (by shrinking the program)
• All IPA/IPO make an open-world assumption
  - Thus, they all work on libraries and at compile-time
  - “Internalize” pass enables “whole program” optzn
• One IR (without lowering) for analysis & optzn
  - Compile-time optzns can be run at link-time too!
  - The same IR is used as input to the JIT
IR design is the key to these goals!

Goals of LLVM Intermediate Representation (IR)
• Easy to produce, understand, and define!
• Language- and Target-Independent
• One IR for analysis and optimization
  - IR must be able to support aggressive IPO, loop opts, scalar opts, … high- and low-level optimization!
• Optimize as much as early as possible
  - Can’t postpone everything until link or runtime
  - No lowering in the IR!

LLVM Instruction Set Overview
• Low-level and target-independent semantics
  - RISC-like three address code
  - Infinite virtual register set in SSA form
  - Simple, low-level control flow constructs
  - Load/store instructions with typed-pointers
• IR has text, binary, and in-memory forms
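
As a rough illustration of "three address code" and "virtual registers in SSA form" from the instruction set overview, a compound expression can be rewritten by hand so that every step has one operator and two operands, and every temporary is assigned exactly once. The following is only a C++ sketch of that idea, not LLVM IR; the function names and the expression are made up for illustration.

```cpp
#include <cassert>

// Original form: one compound expression.
int compound(int a, int b) {
    return (a + b) * (a - b) + 1;
}

// Hand-written three-address form. Each line applies a single operator to
// two operands, and each temporary (t1..t4) is defined exactly once, which
// mimics values held in an unbounded set of virtual registers.
int three_address(int a, int b) {
    const int t1 = a + b;
    const int t2 = a - b;
    const int t3 = t1 * t2;
    const int t4 = t3 + 1;
    return t4;
}

// Small self-check: both forms must agree for the given inputs.
void self_check() {
    assert(compound(5, 3) == three_address(5, 3));
    assert(compound(-2, 7) == three_address(-2, 7));
}
```

Marking the temporaries `const` is the closest plain C++ gets to single assignment; a loop would additionally need something like the phi instruction to merge values from different predecessors.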
Example: a C loop and the LLVM IR for its body

    for (i = 0; i < N; i++)
      Sum(&A[i], &P);

    loop:                                   ; preds = %bb0, %loop
      %i.1 = phi i32 [ 0, %bb0 ], [ %i.2, %loop ]
      %AiAddr = getelementptr float* %A, i32 %i.1
      call void @Sum(float* %AiAddr, %pair* %P)
      %i.2 = add i32 %i.1, 1
      %exitcond = icmp eq i32 %i.2, %N
      br i1 %exitcond, label %outloop, label %loop

LLVM Instruction Set Overview (Continued)
• High-level information exposed in the code
  - Explicit dataflow through SSA form
    • (more on SSA later in the course)
  - Explicit control-flow graph (even for exceptions)
  - Explicit language-independent type-information
  - Explicit typed pointer arithmetic
    • Preserve array subscript and structure indexing

LLVM Type System Details
• The entire type system consists of:
  - Primitives: label, void, float, integer, …
    • Arbitrary bitwidth integers (i1, i32, i64)
  - Derived: pointer, array, structure, function
  - No high-level types: type-system is language neutral!
• Type system allows arbitrary casts:
  - Allows expressing weakly-typed languages, like C
  - Front-ends can implement safe languages
  - Also easy to define a type-safe subset of LLVM
See also: docs/LangRef.html

Lowering source-level types to LLVM
• Source language types are lowered:
  - Rich type systems expanded to simple type system
  - Implicit & abstract types are made explicit & concrete
• Examples of lowering:
  - References turn into pointers: T& → T*
  - Complex numbers: complex float → { float, float }
  - Bitfields: struct X { int Y:4; int Z:2; } → { i32 }
  - Inheritance: class T : S { int X; } → { S, i32 }
  - Methods: class T { void foo(); } → void foo(T*)
• Same idea as lowering to machine code

LLVM Program Structure
• Module contains Functions/GlobalVariables
  - Module is unit of compilation/analysis/optimization
• Function contains BasicBlocks/Arguments
  - Functions roughly correspond to functions in C
• BasicBlock contains list of instructions
  - Each block ends in a control flow instruction
• Instruction is opcode + vector of operands
  - All operands have types
  - Instruction result is typed

Our Example, Compiled to LLVM

    int callee(const int *X) {
      return *X+1;          // load
    }
    int caller() {
      int T;                // on stack
      T = 4;                // store
      return callee(&T);
    }

    internal int %callee(int* %X) {
      %tmp.1 = load int* %X
      %tmp.2 = add int %tmp.1, 1
      ret int %tmp.2
    }
    int %caller() {
      %T = alloca int
      store int 4, int* %T
      %tmp.3 = call int %callee(int* %T)
      ret int %tmp.3
    }

• Stack allocation is explicit in LLVM
• All
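
The running example and the alias-analysis concern from "Why is this hard?" can be tried out directly in C++. The sketch below applies argument promotion by hand and then shows a callee for which the same rewrite would change the result. All function names are hypothetical, and `static` is used here only as a stand-in for LLVM's `internal` linkage (all callers are visible in this translation unit).

```cpp
#include <cassert>

// Callee as lowered by the front end: the argument arrives by address.
static int callee_by_pointer(const int *X) {
    return *X + 1;                    // memory load
}

// Callee after hand-applied argument promotion: the value is passed directly.
static int callee_promoted(int X) {
    return X + 1;
}

static int caller_before() {
    int tmp = 4;                      // stack object and store
    return callee_by_pointer(&tmp);
}

static int caller_after() {
    return callee_promoted(4);        // constant argument, no memory traffic
}

// Counter-example: this callee stores through a second pointer before the
// load. If Out aliases X, the loaded value is no longer the value that *X
// had at function entry, so passing X by value is not a safe rewrite.
static int store_then_load(const int *X, int *Out) {
    *Out = 10;                        // store that may alias *X
    return *X + 1;                    // load after the store
}

// The (incorrect in the aliasing case) promoted version, for comparison.
static int store_then_load_promoted(int X, int *Out) {
    *Out = 10;
    return X + 1;                     // uses the value from function entry
}
```

For the simple callee both callers return the same value, while for `store_then_load` the two versions disagree as soon as both pointers refer to the same object. That is the situation alias analysis has to rule out before the transformation is applied.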

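Several of the rules under "Lowering source-level types to LLVM" can be mimicked by hand in C++ to see what "implicit types are made explicit" means. This is a minimal sketch, not what any front end actually emits: the `_lowered` names are invented, `foo` returns `int` (instead of `void` as on the slide) so its result can be compared, and bitfields are left out because their layout is implementation-defined.

```cpp
#include <cassert>

// Source-level forms.
struct S { int base; };
struct T : S {                                  // inheritance
    int X;
    int foo() const { return base + X; }        // method with implicit `this`
};
int inc(const int &r) { return r + 1; }         // by-reference parameter

// Hand-lowered forms.
struct T_lowered { S s; int X; };               // class T : S { int X; } -> { S, i32 }
int T_foo_lowered(const T_lowered *self) {      // method -> function taking T*
    return self->s.base + self->X;
}
int inc_lowered(const int *r) { return *r + 1; }  // T& -> T*
struct ComplexFloat { float re; float im; };      // complex float -> { float, float }

// Complex addition written against the lowered struct.
ComplexFloat add_complex(ComplexFloat a, ComplexFloat b) {
    ComplexFloat out = { a.re + b.re, a.im + b.im };
    return out;
}
```

In each pair the lowered version spells out something the source language keeps implicit: the base-class subobject becomes an ordinary field, `this` becomes an ordinary pointer parameter, and the reference becomes an address.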