Lecture 3: Overview of the LLVM Compiler

LLVM System

• The LLVM Compiler Infrastructure
  - Provides reusable components for building compilers
  - Reduces the time/cost to build a new compiler
  - Builds static compilers, JITs, trace-based optimizers, ...

• The LLVM Compiler Framework
  - End-to-end compilers using the LLVM infrastructure
  - C and C++ front ends are robust and aggressive; Java, Scheme, and others are in development
  - Emits C code or native code for X86, Sparc, PowerPC

Substantial portions courtesy of Gennady Pekhimenko, Olatunji Ruwase, Chris Lattner, Vikram Adve, and David Koes


Three primary LLVM components

• The LLVM Virtual Instruction Set
  - The common language- and target-independent IR
  - Internal (IR) and external (persistent) representation
• A collection of well-integrated libraries
  - Analyses, optimizations, code generators, JIT compiler, garbage collection support, profiling, ...
• A collection of tools built from the libraries
  - Assemblers, automatic debugger, linker, code generator, compiler driver, modular optimizer, ...

Tutorial Overview

• Introduction to the running example
• LLVM C/C++ Compiler Overview
  - High-level view of an example LLVM compiler
• The LLVM Virtual Instruction Set
  - IR overview and type-system
• The Pass Manager
• Important LLVM Tools
  - opt, code generator, JIT, test suite, bugpoint


Running Example: Argument Promotion

Consider use of by-reference parameters:

  int callee(const int &X) {
    return X+1;
  }
  int caller() {
    return callee(4);
  }

compiles to:

  int callee(const int *X) {
    return *X+1;          // memory load
  }
  int caller() {
    int tmp;              // stack object
    tmp = 4;              // memory store
    return callee(&tmp);
  }

We want:

  int callee(int X) {
    return X+1;
  }
  int caller() {
    return callee(4);
  }

• Call the procedure with a constant argument

Why is this hard?

• Requires interprocedural analysis:
  - Must change the prototype of the callee
  - Must update all call sites → we must know all callers
  - What about callers outside the translation unit?
• Requires alias analysis:
  - The reference could alias other pointers in the callee
  - Must know that the loaded value doesn't change from function entry to the load
  - Must know the pointer is not being stored through
• The reference might not be to a stack object!
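As a rough illustration of the last two alias-analysis requirements, a pass might conservatively check that the pointer argument is only ever read. This is a minimal sketch against the LLVM C++ API; the helper name onlyLoadedFrom is our own, and a real implementation must also reason about calls and aliasing stores that could modify the pointee.

  #include "llvm/IR/Argument.h"
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  // Hypothetical helper: returns true if every use of the pointer argument is a
  // direct load. Stores through the pointer, or passing it on to another call,
  // make promotion unsafe without further analysis.
  static bool onlyLoadedFrom(Argument &A) {
    for (User *U : A.users())
      if (!isa<LoadInst>(U))
        return false;
    return true;
  }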


The LLVM C/C++ Compiler

• From a high-level perspective, it is a standard compiler:
  - Compatible with standard makefiles
  - Uses the Clang (or possibly GCC 4.2) C and C++ parsers

  C file   → llvmgcc -emit-llvm → .o file
  C++ file → llvmg++ -emit-llvm → .o file
  .o files → llvm linker (at link time)

• Distinguishing features:
  - Uses LLVM optimizers (not GCC optimizers)
  - .o files contain LLVM IR/bytecode, not machine code
  - Executable can be bytecode (JIT'd) or machine code


Looking into events at compile-time

  C file → llvmgcc → .o file

• C-to-LLVM Frontend ("cc1")
  - Clang (or a modified GCC, invoked as "gcc")
  - Lowers the C AST to LLVM and emits LLVM IR as a text file
• Compile-time Optimizer ("gccas")
  - LLVM IR parser and LLVM verifier
  - 40 LLVM analysis & optimization passes
  - LLVM .bc file writer

Looking into events at link-time

  .o files → llvm linker → executable

• LLVM Linker
  - Optionally "internalizes": marks most functions as internal, to improve IPO
  - Native .o files and libraries are linked in here
• Link-time Optimizer: 20 LLVM analysis & optimization passes
  - Dead Global Elimination, IP Constant Propagation, Dead Argument Elimination, Inlining,
    Reassociation, LICM, Loop Opts, Memory Promotion, Dead Store Elimination, ADCE, ...
  - Perfect place for the argument promotion optimization!
• Back ends
  - Native Backend ("llc") → native executable
  - C Backend ("llc -march=c") → C code → C compiler ("gcc") → native executable
    (NOTE: produces very ugly C; officially deprecated, but still works fairly well)
  - Or emit a .bc file for the LLVM JIT

Goals of the compiler design

• Analyze and optimize as early as possible:
  - Compile-time opts reduce the modify-rebuild-execute cycle
  - Compile-time optimizations reduce work at link-time (by shrinking the program)
• All IPA/IPO make an open-world assumption
  - Thus, they all work on libraries and at compile-time
  - The "Internalize" pass enables "whole program" optimization
• One IR (without lowering) for analysis & optimization
  - Compile-time optimizations can be run at link-time too!
  - The same IR is used as input to the JIT

IR design is the key to these goals!


Goals of LLVM Intermediate Representation (IR)

• Easy to produce, understand, and define!
• Language- and target-independent
• One IR for analysis and optimization
  - IR must be able to support aggressive IPO, loop opts, scalar opts, ... both high- and low-level optimization!
• Optimize as much as possible, as early as possible
  - Can't postpone everything until link or runtime
  - No lowering in the IR!

LLVM Instruction Set Overview

• Low-level and target-independent semantics
  - RISC-like three-address code
  - Infinite virtual register set in SSA form
  - Simple, low-level control flow constructs
  - Load/store instructions with typed pointers
• IR has text, binary, and in-memory forms

  for (i = 0; i < N; i++)
    Sum(&A[i], &P);

  loop:                                  ; preds = %bb0, %loop
    %i.1 = phi i32 [ 0, %bb0 ], [ %i.2, %loop ]
    %AiAddr = getelementptr float* %A, i32 %i.1
    call void @Sum(float* %AiAddr, %pair* %P)
    %i.2 = add i32 %i.1, 1
    %exitcond = icmp eq i32 %i.1, %N
    br i1 %exitcond, label %outloop, label %loop
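The loop above is the textual form of the IR; the same instructions can also be created in memory through the C++ API. Below is a minimal hedged sketch (not from the slides) that builds a tiny function with IRBuilder and prints its textual form; header paths and helpers such as getArg() vary across LLVM versions, and the names "demo"/"add1" are just illustrative.

  #include "llvm/IR/IRBuilder.h"
  #include "llvm/IR/Module.h"
  #include "llvm/IR/Verifier.h"
  #include "llvm/Support/raw_ostream.h"
  using namespace llvm;

  int main() {
    LLVMContext Ctx;
    Module M("demo", Ctx);

    // int add1(int x) { return x + 1; }
    FunctionType *FT =
        FunctionType::get(Type::getInt32Ty(Ctx), {Type::getInt32Ty(Ctx)}, false);
    Function *F = Function::Create(FT, Function::ExternalLinkage, "add1", &M);

    BasicBlock *Entry = BasicBlock::Create(Ctx, "entry", F);
    IRBuilder<> B(Entry);
    Value *Sum = B.CreateAdd(F->getArg(0), B.getInt32(1), "sum");
    B.CreateRet(Sum);

    verifyFunction(*F, &errs());   // sanity-check the generated IR
    M.print(outs(), nullptr);      // dump the textual form
    return 0;
  }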


LLVM Instruction Set Overview (Continued)

• High-level information exposed in the code (see the loop example above)
  - Explicit dataflow through SSA form (more on SSA later in the course)
  - Explicit control-flow graph (even for exceptions)
  - Explicit language-independent type information
  - Explicit typed pointer arithmetic
    • Preserves array subscript and structure indexing

LLVM Details

• The entire type system consists of:
  - Primitives: label, void, float, integer, ...
    • Arbitrary-bitwidth integers (i1, i32, i64)
  - Derived: pointer, array, structure, function
  - No high-level types: the type system is language neutral!
• The type system allows arbitrary casts:
  - Allows expressing weakly-typed languages, like C
  - Front-ends can implement safe languages
  - Also easy to define a type-safe subset of LLVM

See also: docs/LangRef.html
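To make the primitive/derived split concrete, here is a small hedged sketch (not from the slides) that builds a few derived types through the C++ API. The struct and variable names are illustrative only, and very recent LLVM releases use opaque pointers, so the typed-pointer call shown here differs there.

  #include "llvm/IR/DerivedTypes.h"
  #include "llvm/IR/LLVMContext.h"
  using namespace llvm;

  void buildSomeTypes() {
    LLVMContext Ctx;

    // Primitives: arbitrary-bitwidth integers and floats.
    Type *I1  = Type::getInt1Ty(Ctx);
    Type *I32 = Type::getInt32Ty(Ctx);
    Type *F32 = Type::getFloatTy(Ctx);

    // Derived types: pointer, array, structure, function.
    PointerType  *I32Ptr = PointerType::getUnqual(I32);
    ArrayType    *Arr    = ArrayType::get(F32, 16);            // [16 x float]
    StructType   *Pair   = StructType::get(Ctx, {F32, F32});   // { float, float }
    FunctionType *FT     = FunctionType::get(I32, {I32Ptr}, /*isVarArg=*/false);
    (void)I1; (void)Arr; (void)Pair; (void)FT;
  }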


Lowering source-level types to LLVM

• Source language types are lowered:
  - Rich type systems expanded to a simple type system
  - Implicit & abstract types are made explicit & concrete
• Examples of lowering:
  - References turn into pointers: T& → T*
  - Complex numbers: complex float → { float, float }
  - Bitfields: struct X { int Y:4; int Z:2; } → { i32 }
  - Inheritance: class T : S { int X; } → { S, i32 }
  - Methods: class T { void foo(); } → void foo(T*)
• Same idea as lowering to machine code

LLVM Program Structure

• Module contains Functions/GlobalVariables
  - Module is the unit of compilation/analysis/optimization
• Function contains BasicBlocks/Arguments
  - Functions roughly correspond to functions in C
• BasicBlock contains a list of Instructions
  - Each block ends in a control-flow instruction
• Instruction is an opcode + vector of operands
  - All operands have types
  - Instruction result is typed
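This containment hierarchy maps directly onto nested loops in the C++ API. A minimal hedged sketch (not from the slides); modern range-based iteration is shown, while older releases spell the same loops with explicit iterators as on a later slide.

  #include "llvm/IR/Module.h"
  #include "llvm/IR/Function.h"
  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/Instruction.h"
  #include "llvm/Support/raw_ostream.h"
  using namespace llvm;

  // Walk Module -> Function -> BasicBlock -> Instruction and print a summary.
  void dumpStructure(Module &M) {
    for (Function &F : M) {                     // Module contains Functions
      errs() << "function " << F.getName() << "\n";
      for (BasicBlock &BB : F) {                // Function contains BasicBlocks
        errs() << "  block " << BB.getName()
               << " (" << BB.size() << " instructions)\n";
        for (Instruction &I : BB)               // BasicBlock contains Instructions
          errs() << "    " << I.getOpcodeName() << "\n";
      }
    }
  }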


Our Example, Compiled to LLVM

  int callee(const int *X) {
    return *X+1;          // load
  }
  int caller() {
    int T;                // on stack
    T = 4;                // store
    return callee(&T);
  }

  internal int %callee(int* %X) {
    %tmp.1 = load int* %X
    %tmp.2 = add int %tmp.1, 1
    ret int %tmp.2
  }
  int %caller() {
    %T = alloca int
    store int 4, int* %T
    %tmp.3 = call int %callee(int* %T)
    ret int %tmp.3
  }

• Stack allocation is explicit in LLVM



• All loads and stores are explicit in the LLVM representation
• The linker "internalizes" most functions in most cases


Our Example: Desired Transformation

Before:

  internal int %callee(int* %X) {
    %tmp.1 = load int* %X
    %tmp.2 = add int %tmp.1, 1
    ret int %tmp.2
  }
  int %caller() {
    %T = alloca int
    store int 4, int* %T
    %tmp.3 = call int %callee(int* %T)
    ret int %tmp.3
  }

After:

  internal int %callee(int %X.val) {
    %tmp.2 = add int %X.val, 1
    ret int %tmp.2
  }
  int %caller() {
    %tmp.3 = call int %callee(int 4)
    ret int %tmp.3
  }

• Change the prototype of the function
• Update all call sites of "callee"
• Another transformation (-mem2reg) cleans up the rest
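A hedged sketch of the call-site half of this rewrite using the C++ API. The helper name rewriteCallSite is ours, and it assumes the promoted pointer is the only argument and points to an i32; LLVM's real ArgumentPromotion pass handles far more cases.

  #include "llvm/IR/IRBuilder.h"
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  // Hypothetical helper: given a call that passes a pointer, load the value at
  // the call site and call the new "by value" version of the function instead.
  static void rewriteCallSite(CallInst *OldCall, Function *NewCallee) {
    IRBuilder<> B(OldCall);                       // insert right before the old call
    Value *Ptr = OldCall->getArgOperand(0);       // assumes the pointer is argument 0
    Value *Val = B.CreateLoad(B.getInt32Ty(), Ptr, "promoted");
    CallInst *NewCall = B.CreateCall(NewCallee, {Val});
    OldCall->replaceAllUsesWith(NewCall);
    OldCall->eraseFromParent();
  }

As the slide notes, -mem2reg can then turn the now-redundant alloca/store in the caller into SSA values.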


LLVM Coding Basics

• Written in modern C++, uses the STL:
  - Particularly the vector, set, and map classes
• LLVM IR is almost all doubly-linked lists:
  - Module contains lists of Functions & GlobalVariables
  - Function contains lists of BasicBlocks & Arguments
  - BasicBlock contains a list of Instructions
• Linked lists are traversed with iterators:

  Function *M = ...
  for (Function::iterator I = M->begin(); I != M->end(); ++I) {
    BasicBlock &BB = *I;
    ...
  }

See also: docs/ProgrammersManual.html

LLVM Pass Manager

• The compiler is organized as a series of "passes":
  - Each pass is one analysis or transformation
• Four types of passes:
  - ModulePass: general interprocedural pass
  - CallGraphSCCPass: bottom-up on the call graph
  - FunctionPass: process a function at a time
  - BasicBlockPass: process a basic block at a time
• Constraints imposed (e.g. FunctionPass):
  - A FunctionPass can only look at the "current function"
  - Cannot maintain state across functions

See also: docs/WritingAnLLVMPass.html
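For concreteness, here is a minimal FunctionPass skeleton under the legacy pass manager, in the style of the "Hello" pass from docs/WritingAnLLVMPass.html. The pass name CountBlocks and its behavior are just illustrative; newer LLVM releases favor the new pass manager, whose registration differs.

  #include "llvm/Pass.h"
  #include "llvm/IR/Function.h"
  #include "llvm/Support/raw_ostream.h"
  using namespace llvm;

  namespace {
  // A FunctionPass sees one function at a time and may not keep state across
  // functions (one of the constraints listed above).
  struct CountBlocks : public FunctionPass {
    static char ID;
    CountBlocks() : FunctionPass(ID) {}

    bool runOnFunction(Function &F) override {
      errs() << F.getName() << " has " << F.size() << " basic blocks\n";
      return false;   // we did not modify the IR
    }
  };
  } // end anonymous namespace

  char CountBlocks::ID = 0;
  static RegisterPass<CountBlocks> X("countblocks",
                                     "Count basic blocks in each function");

Built as a plugin, such a pass can be loaded and run through opt, as described on the opt slide below.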


Services provided by PassManager

• Optimization of pass execution:
  - Process a function at a time instead of a pass at a time
  - Example: with three functions F, G, H in the input program and two passes X & Y, run
    "X(F) Y(F) X(G) Y(G) X(H) Y(H)", not "X(F) X(G) X(H) Y(F) Y(G) Y(H)"
  - Process functions in parallel on an SMP (future work)
• Declarative dependency management:
  - Automatically fulfill and manage analysis pass lifetimes
  - Share analyses between passes when safe:
    • e.g. "DominatorSet live unless pass modifies CFG"
• Avoid boilerplate for traversal of the program

See also: docs/WritingAnLLVMPass.html
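The "declarative dependency management" bullet corresponds to a pass's getAnalysisUsage hook. Below is a hedged sketch under the legacy pass manager; DominatorTreeWrapperPass is the present-day analysis corresponding to the DominatorSet mentioned above, and the class name UsesDominators is ours.

  #include "llvm/Pass.h"
  #include "llvm/IR/Dominators.h"      // DominatorTreeWrapperPass
  #include "llvm/IR/Function.h"
  #include "llvm/Support/raw_ostream.h"
  using namespace llvm;

  // Declare what we need and what we preserve, and let the PassManager
  // schedule, cache, and invalidate the analyses for us.
  struct UsesDominators : public FunctionPass {
    static char ID;
    UsesDominators() : FunctionPass(ID) {}

    void getAnalysisUsage(AnalysisUsage &AU) const override {
      AU.addRequired<DominatorTreeWrapperPass>();  // "please compute dominators for me"
      AU.setPreservesCFG();                        // "I promise not to change the CFG"
    }

    bool runOnFunction(Function &F) override {
      DominatorTree &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
      errs() << "root block: " << DT.getRoot()->getName() << "\n";
      return false;  // analysis only; cached results stay valid for later passes
    }
  };
  char UsesDominators::ID = 0;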

LLVM tools: two flavors

• "Primitive" tools: do a single job
  - llvm-as: Convert from .ll (text) to .bc (binary)
  - llvm-dis: Convert from .bc (binary) to .ll (text)
  - llvm-link: Link multiple .bc files together
  - llvm-prof: Print profile output to human readers
  - llvmc: Configurable compiler driver
• Aggregate tools: pull in multiple features
  - gccas/gccld: Compile/link-time optimizers for the C/C++ front end
  - bugpoint: Automatic compiler debugger
  - llvm-gcc/llvm-g++: C/C++ compilers

See also: docs/CommandGuide/

opt tool: LLVM modular optimizer

• Invoke an arbitrary sequence of passes:
  - Completely control the PassManager from the command line
  - Supports loading passes as plugins from .so files

  opt -load foo.so -pass1 -pass2 -pass3 x.bc -o y.bc

• Passes "register" themselves:

  RegisterPass<SimpleArgPromotion> X("simpleargpromotion",
      "Promote 'by reference' arguments to 'by value'");

• Standard mechanism for obtaining command-line parameters:

  cl::opt<std::string> StringVar("sv",
      cl::desc("Long description of param"),
      cl::value_desc("long_flag"));

• From this, they are exposed through opt:

  > opt -load libsimpleargpromote.so -help
    ...
    -sccp                - Sparse Conditional Constant Propagation
    -simpleargpromotion  - Promote 'by reference' arguments to 'by value'
    -simplifycfg         - Simplify the CFG
    ...
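As a complete, self-contained illustration of the CommandLine library in a standalone tool (the flag names here are made up, not from the slides):

  #include "llvm/Support/CommandLine.h"
  #include "llvm/Support/raw_ostream.h"
  using namespace llvm;

  // Illustrative flags: a required positional input and an optional boolean.
  static cl::opt<std::string> InputFile(cl::Positional, cl::desc("<input .bc>"),
                                        cl::Required);
  static cl::opt<bool> Verbose("verbose", cl::desc("Print extra diagnostics"),
                               cl::init(false));

  int main(int argc, char **argv) {
    cl::ParseCommandLineOptions(argc, argv, "demo tool\n");
    if (Verbose)
      outs() << "processing " << InputFile << "\n";
    return 0;
  }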


Assignment 1 - Practice

• Introduction to LLVM
  - Install and play with it
• Learn interesting program properties
  - Functions: name, arguments, return types, local or global
  - Compute live values using iterative dataflow analysis

Assignment 1 - Questions

• Building a Control Flow Graph
• Data Flow Analysis
  - Available Expressions
• Apply an existing analysis
  - New Dataflow Analysis
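The live-values question lends itself to the classic backward worklist formulation. Below is a rough hedged sketch over LLVM IR, not the assignment's required solution: in SSA form the USE/DEF sets are per-value, and phi nodes need more careful treatment than shown here.

  #include "llvm/IR/CFG.h"          // successors()
  #include "llvm/IR/Function.h"
  #include "llvm/IR/Instructions.h"
  #include <map>
  #include <set>
  using namespace llvm;

  // Backward iterative liveness: LiveIn[B] = Use[B] U (LiveOut[B] \ Def[B]).
  std::map<BasicBlock *, std::set<Value *>> computeLiveIn(Function &F) {
    std::map<BasicBlock *, std::set<Value *>> Use, Def, LiveIn, LiveOut;
    for (BasicBlock &BB : F)
      for (Instruction &I : BB) {
        for (Value *Op : I.operands())
          if ((isa<Instruction>(Op) || isa<Argument>(Op)) && !Def[&BB].count(Op))
            Use[&BB].insert(Op);      // used before (re)defined in this block
        Def[&BB].insert(&I);          // in SSA, each instruction defines its result
      }

    bool Changed = true;
    while (Changed) {                 // iterate to a fixed point
      Changed = false;
      for (BasicBlock &BB : F) {
        std::set<Value *> Out;
        for (BasicBlock *Succ : successors(&BB))
          Out.insert(LiveIn[Succ].begin(), LiveIn[Succ].end());
        std::set<Value *> In = Use[&BB];
        for (Value *V : Out)
          if (!Def[&BB].count(V))
            In.insert(V);
        if (In != LiveIn[&BB] || Out != LiveOut[&BB]) {
          LiveIn[&BB] = std::move(In);
          LiveOut[&BB] = std::move(Out);
          Changed = true;
        }
      }
    }
    return LiveIn;
  }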


Questions?

• Thank you

