Attacking Client-Side JIT Compilers.Key

Total Page:16

File Type:pdf, Size:1020Kb

Attacking Client-Side JIT Compilers.Key Attacking Client-Side JIT Compilers Samuel Groß (@5aelo) !1 A JavaScript Engine Parser JIT Compiler Interpreter Runtime Garbage Collector !2 A JavaScript Engine • Parser: entrypoint for script execution, usually emits custom Parser bytecode JIT Compiler • Bytecode then consumed by interpreter or JIT compiler • Executing code interacts with the Interpreter runtime which defines the Runtime representation of various data structures, provides builtin functions and objects, etc. Garbage • Garbage collector required to Collector deallocate memory !3 A JavaScript Engine • Parser: entrypoint for script execution, usually emits custom Parser bytecode JIT Compiler • Bytecode then consumed by interpreter or JIT compiler • Executing code interacts with the Interpreter runtime which defines the Runtime representation of various data structures, provides builtin functions and objects, etc. Garbage • Garbage collector required to Collector deallocate memory !4 A JavaScript Engine • Parser: entrypoint for script execution, usually emits custom Parser bytecode JIT Compiler • Bytecode then consumed by interpreter or JIT compiler • Executing code interacts with the Interpreter runtime which defines the Runtime representation of various data structures, provides builtin functions and objects, etc. Garbage • Garbage collector required to Collector deallocate memory !5 A JavaScript Engine • Parser: entrypoint for script execution, usually emits custom Parser bytecode JIT Compiler • Bytecode then consumed by interpreter or JIT compiler • Executing code interacts with the Interpreter runtime which defines the Runtime representation of various data structures, provides builtin functions and objects, etc. Garbage • Garbage collector required to Collector deallocate memory !6 Agenda 1. Background: Runtime Parser • Object representation and Builtins JIT Compiler 2. JIT Compiler Internals • Problem: missing type information • Solution: "speculative" JIT Interpreter 3. JIT Compiler Attack Surface Runtime • Different vulnerability categories 4. CVE-2018-4233 (Pwn2Own) Garbage Collector • Typical JIT Bug in JavaScriptCore !7 The Runtime !8 Builtins A "builtin": a function exposed to script which is implemented by the engine itself* var a = [ 1, 2, 3 ]; a.slice(1, 2); // [ 2 ] * definition for this talk !9 Builtins A "builtin": a function exposed to script which is implemented by the engine itself* var a = [ 1, 2, 3 ]; a.slice(1, 2); // [ 2 ] Engine can implement builtins in various ways: in C++, in JavaScript, in assembly, in its JIT compiler IL (v8 turbofan builtins), ... * definition for this talk !10 Builtins var a = [ 1, 2, 3 ]; a.slice(1, 2); // [ 2 ] EncodedJSValue JSC_HOST_CALL arrayProtoFuncSlice(ExecState* exec) { // https://tc39.github.io/ecma262/#sec-array.prototype.slice VM& vm = exec->vm(); auto scope = DECLARE_THROW_SCOPE(vm); ...; !11 Builtins Builtins historically the source of many bugs • Unexpected callbacks var a = [ 1, 2, 3 ]; • Integer related issues a.slice(1, 2); • Use-after-frees (missing GC rooting) // [ 2 ] • ... EncodedJSValue JSC_HOST_CALL arrayProtoFuncSlice(ExecState* exec) { // https://tc39.github.io/ecma262/#sec-array.prototype.slice VM& vm = exec->vm(); auto scope = DECLARE_THROW_SCOPE(vm); ...; !12 var a = 42; JSValues a = "foo"; var o = {}; a = {}; o.a = 42; o.a = "foo"; o.a = {}; • JavaScript is dynamically typed => Type information stored in runtime values, not compile time variables • Challenge: efficiently store type information and value information together • Solution: clever hacks to fit both into 8 bytes (a single CPU register) !13 JSValues • Common approaches: NaN-boxing and pointer tagging • For this talk we'll use the pointer tagging scheme from v8: • 1-bit cleared: it's a "SMI", a SMall Integer (32 bits) • 1-bit set: it's a pointer to some object, can be dereferenced 0x0000004200000000 0x00000e0359b8e611 1-bit cleared => a SMI 1-bit set => a pointer to an object located Payload in the upper 32 bits (0x42) at address 0x00000e0359b8e610 !14 JSObjects var p1 = { x: 0x41, y: 0x42 }; !15 JSObjects var p1 = { x: 0x41, y: 0x42 }; Object 1 map<String, JSValue> or similar - properties: "x" -> 0x41 "y" -> 0x42 ??? !16 JSObjects var p1 = { x: 0x41, y: 0x42 }; Object 1 map<String, JSValue> or similar - properties: "x" -> 0x41 "y" -> 0x42 ??? !17 JSObjects Idea: separate property names from property values Shape* object stores property names and their location in the object var o = { x: 0x41, y: 0x42 }; * Abstract name used for this talk, does not refer to a specific implementation !18 JSObjects Idea: separate property names from property values Shape* object stores property names and their location in the object var o = { x: 0x41, Object 1 y: 0x42 }; - properties: "x" -> 0x41 "y" -> 0x42 * Abstract name used for this talk, does not refer to a specific implementation !19 JSObjects Idea: separate property names from property values Shape* object stores property names and their location in the object var o = { Shape 1 x: 0x41, Object 1 y: 0x42 - properties: }; "x" -> slot 0 "y" -> slot 1 - properties: Object 1 "x" -> 0x41 "y" -> 0x42 - shape - slots: 0: 0x41 * Abstract name used for this talk, does 1: 0x42 not refer to a specific implementation !20 Benefit: Shape Sharing Shape 1 - properties: var o1 = { "x" -> slot 0 o1 x: 0x41, "y" -> slot 1 y: 0x42 - shape }; - slots: 0: 0x41 1: 0x42 !21 Benefit: Shape Sharing Shape 1 - properties: var o1 = { "x" -> slot 0 o1 x: 0x41, "y" -> slot 1 y: 0x42 - shape }; - slots: var o2 = { 0: 0x41 x: 0x1337, 1: 0x42 o2 y: 0x1338 }; - shape - slots: 0: 0x1337 1: 0x1338 !22 Benefit: Shape Sharing Shape 1 Shape is shared between similar objects! - properties: var o1 = { "x" -> slot 0 o1 x: 0x41, "y" -> slot 1 y: 0x42 - shape }; - slots: var o2 = { 0: 0x41 x: 0x1337, 1: 0x42 o2 y: 0x1338 }; - shape - slots: 0: 0x1337 1: 0x1338 !23 Benefit: Shape Sharing var o1 = { o1 ??? x: 0x41, y: 0x42 - shape }; - slots: var o2 = { 0: 0x41 x: 0x1337, 1: 0x42 o2 y: 0x1338 2: 0x43 }; - shape o1.z = 0x43; - slots: 0: 0x1337 1: 0x1338 !24 Benefit: Shape Sharing Shapes are immutable so Shape 2 a new Shape is created! - properties: "x" -> slot 0 var o1 = { "y" -> slot 1 o1 x: 0x41, "z" -> slot 2 Shape 1 y: 0x42 - shape }; - slots: - properties: var o2 = { 0: 0x41 "x" -> slot 0 "y" -> slot 1 x: 0x1337, 1: 0x42 o2 y: 0x1338 2: 0x43 }; - shape o1.z = 0x43; - slots: 0: 0x1337 1: 0x1338 !25 Benefit: Shape Sharing Shape 2 - properties: "x" -> slot 0 var o1 = { "y" -> slot 1 o1 x: 0x41, "z" -> slot 2 y: 0x42 - shape }; - slots: var o2 = { 0: 0x41 o2 x: 0x1337, 1: 0x42 y: 0x1338 2: 0x43 - shape }; - slots: o1.z = 0x43; 0: 0x1337 o2.z = 0x1339; 1: 0x1338 2: 0x1339 !26 Object Example: v8 var o = { x: 0x41, y: 0x42 }; o.z = 0x43; o[0] = 0x1337; o[1] = 0x1338; Underlined: v8::Map pointer Green: Inline properties Red: Out-of-line Properties Blue: Elements !27 Object Example: v8 Shape (called "Map" in v8) (lldb) x/5gx 0xe0359b8e610 var o = { 0xe0359b8e610: 0x00000e034a80d309 0x00000e0359b90601 x: 0x41, 0xe0359b8e620: 0x00000e0359b90699 0x0000004100000000 y: 0x42 0xe0359b8e630: 0x0000004200000000 }; o.z = 0x43; o[0] = 0x1337; o[1] = 0x1338; Underlined: v8::Map pointer Green: Inline properties Red: Out-of-line Properties Blue: Elements !28 Object Example: v8 Shape (called "Map" in v8) (lldb) x/5gx 0xe0359b8e610 var o = { 0xe0359b8e610: 0x00000e034a80d309 0x00000e0359b90601 x: 0x41, 0xe0359b8e620: 0x00000e0359b90699 0x0000004100000000 y: 0x42 0xe0359b8e630: 0x0000004200000000 }; o.z = 0x43; (lldb) x/3gx 0x00000e0359b90600 o[0] = 0x1337; 0xe0359b90600: 0x00000e034ee836f9 0x0000000300000000 o[1] = 0x1338; 0xe0359b90610: 0x0000004300000000 Underlined: v8::Map pointer Green: Inline properties Red: Out-of-line Properties Blue: Elements !29 Object Example: v8 Shape (called "Map" in v8) (lldb) x/5gx 0xe0359b8e610 var o = { 0xe0359b8e610: 0x00000e034a80d309 0x00000e0359b90601 x: 0x41, 0xe0359b8e620: 0x00000e0359b90699 0x0000004100000000 y: 0x42 0xe0359b8e630: 0x0000004200000000 }; o.z = 0x43; (lldb) x/3gx 0x00000e0359b90600 o[0] = 0x1337; 0xe0359b90600: 0x00000e034ee836f9 0x0000000300000000 o[1] = 0x1338; 0xe0359b90610: 0x0000004300000000 (lldb) x/4gx 0x00000e0359b90698 Underlined: v8::Map pointer 0xe0359b90698: 0x00000e034ee82361 0x0000001100000000 Green: Inline properties 0xe0359b906a8: 0x0000133700000000 0x0000133800000000 Red: Out-of-line Properties Blue: Elements !30 Summary Objects In all major engines, a JavaScript object roughly consists of: • A reference to a Shape and Group/Map/Structure/Type instance • Immutable and shared between similar objects • Stores name and location of properties, element kind, prototype, ... => "describes" the object • Inline property slots • Out-of-line property slots • Out-of-line buffer for elements • Possibly additional, type-specific fields (e.g. data pointer in TypedArrays) !31 (Speculative) JIT Compilers !32 Interpreter vs. JIT Compiler • Usually execution starts in Interpreter JIT Compiler the interpreter • After a certain number of Code Speed - + invocations a function becomes "hot" and is compiled to machine code Startup Time + - • Afterwards execution switches to the machine Memory code instead of the Footprint + - interpreter !33 Introduction How to compile this code? int add(int a, int b) { return a + b; } !34 Introduction How to compile this code? ; add(int, int): lea eax, [rdi+rsi] ret int add(int a, int b) { return a + b; Easy: } • Know parameter types • Know ABI Try this at home: https://godbolt.org/ !35 Introduction How to compile this code? function add(a, b) { return a + b; } !36 Introduction How to
Recommended publications
  • A Methodology for Assessing Javascript Software Protections
    A methodology for Assessing JavaScript Software Protections Pedro Fortuna A methodology for Assessing JavaScript Software Protections Pedro Fortuna About me Pedro Fortuna Co-Founder & CTO @ JSCRAMBLER OWASP Member SECURITY, JAVASCRIPT @pedrofortuna 2 A methodology for Assessing JavaScript Software Protections Pedro Fortuna Agenda 1 4 7 What is Code Protection? Testing Resilience 2 5 Code Obfuscation Metrics Conclusions 3 6 JS Software Protections Q & A Checklist 3 What is Code Protection Part 1 A methodology for Assessing JavaScript Software Protections Pedro Fortuna Intellectual Property Protection Legal or Technical Protection? Alice Bob Software Developer Reverse Engineer Sells her software over the Internet Wants algorithms and data structures Does not need to revert back to original source code 5 A methodology for Assessing JavaScript Software Protections Pedro Fortuna Intellectual Property IP Protection Protection Legal Technical Encryption ? Trusted Computing Server-Side Execution Obfuscation 6 A methodology for Assessing JavaScript Software Protections Pedro Fortuna Code Obfuscation Obfuscation “transforms a program into a form that is more difficult for an adversary to understand or change than the original code” [1] More Difficult “requires more human time, more money, or more computing power to analyze than the original program.” [1] in Collberg, C., and Nagra, J., “Surreptitious software: obfuscation, watermarking, and tamperproofing for software protection.”, Addison- Wesley Professional, 2010. 7 A methodology for Assessing
    [Show full text]
  • CS153: Compilers Lecture 19: Optimization
    CS153: Compilers Lecture 19: Optimization Stephen Chong https://www.seas.harvard.edu/courses/cs153 Contains content from lecture notes by Steve Zdancewic and Greg Morrisett Announcements •HW5: Oat v.2 out •Due in 2 weeks •HW6 will be released next week •Implementing optimizations! (and more) Stephen Chong, Harvard University 2 Today •Optimizations •Safety •Constant folding •Algebraic simplification • Strength reduction •Constant propagation •Copy propagation •Dead code elimination •Inlining and specialization • Recursive function inlining •Tail call elimination •Common subexpression elimination Stephen Chong, Harvard University 3 Optimizations •The code generated by our OAT compiler so far is pretty inefficient. •Lots of redundant moves. •Lots of unnecessary arithmetic instructions. •Consider this OAT program: int foo(int w) { var x = 3 + 5; var y = x * w; var z = y - 0; return z * 4; } Stephen Chong, Harvard University 4 Unoptimized vs. Optimized Output .globl _foo _foo: •Hand optimized code: pushl %ebp movl %esp, %ebp _foo: subl $64, %esp shlq $5, %rdi __fresh2: movq %rdi, %rax leal -64(%ebp), %eax ret movl %eax, -48(%ebp) movl 8(%ebp), %eax •Function foo may be movl %eax, %ecx movl -48(%ebp), %eax inlined by the compiler, movl %ecx, (%eax) movl $3, %eax so it can be implemented movl %eax, -44(%ebp) movl $5, %eax by just one instruction! movl %eax, %ecx addl %ecx, -44(%ebp) leal -60(%ebp), %eax movl %eax, -40(%ebp) movl -44(%ebp), %eax Stephen Chong,movl Harvard %eax,University %ecx 5 Why do we need optimizations? •To help programmers… •They write modular, clean, high-level programs •Compiler generates efficient, high-performance assembly •Programmers don’t write optimal code •High-level languages make avoiding redundant computation inconvenient or impossible •e.g.
    [Show full text]
  • Quaxe, Infinity and Beyond
    Quaxe, infinity and beyond Daniel Glazman — WWX 2015 /usr/bin/whoami Primary architect and developer of the leading Web and Ebook editors Nvu and BlueGriffon Former member of the Netscape CSS and Editor engineering teams Involved in Internet and Web Standards since 1990 Currently co-chair of CSS Working Group at W3C New-comer in the Haxe ecosystem Desktop Frameworks Visual Studio (Windows only) Xcode (OS X only) Qt wxWidgets XUL Adobe Air Mobile Frameworks Adobe PhoneGap/Air Xcode (iOS only) Qt Mobile AppCelerator Visual Studio Two solutions but many issues Fragmentation desktop/mobile Heavy runtimes Can’t easily reuse existing c++ libraries Complex to have native-like UI Qt/QtMobile still require c++ Qt’s QML is a weak and convoluted UI language Haxe 9 years success of Multiplatform OSS language Strong affinity to gaming Wide and vibrant community Some press recognition Dead code elimination Compiles to native on all But no native GUI… platforms through c++ and java Best of all worlds Haxe + Qt/QtMobile Multiplatform Native apps, native performance through c++/Java C++/Java lib reusability Introducing Quaxe Native apps w/o c++ complexity Highly dynamic applications on desktop and mobile Native-like UI through Qt HTML5-based UI, CSS-based styling Benefits from Haxe and Qt communities Going from HTML5 to native GUI completeness DOM dynamism in native UI var b: Element = document.getElementById("thirdButton"); var t: Element = document.createElement("input"); t.setAttribute("type", "text"); t.setAttribute("value", "a text field"); b.parentNode.insertBefore(t,
    [Show full text]
  • 1 More Register Allocation Interference Graph Allocators
    More Register Allocation Last time – Register allocation – Global allocation via graph coloring Today – More register allocation – Clarifications from last time – Finish improvements on basic graph coloring concept – Procedure calls – Interprocedural CS553 Lecture Register Allocation II 2 Interference Graph Allocators Chaitin Briggs CS553 Lecture Register Allocation II 3 1 Coalescing Move instructions – Code generation can produce unnecessary move instructions mov t1, t2 – If we can assign t1 and t2 to the same register, we can eliminate the move Idea – If t1 and t2 are not connected in the interference graph, coalesce them into a single variable Problem – Coalescing can increase the number of edges and make a graph uncolorable – Limit coalescing coalesce to avoid uncolorable t1 t2 t12 graphs CS553 Lecture Register Allocation II 4 Coalescing Logistics Rule – If the virtual registers s1 and s2 do not interfere and there is a copy statement s1 = s2 then s1 and s2 can be coalesced – Steps – SSA – Find webs – Virtual registers – Interference graph – Coalesce CS553 Lecture Register Allocation II 5 2 Example (Apply Chaitin algorithm) Attempt to 3-color this graph ( , , ) Stack: Weighted order: a1 b d e c a1 b a e c 2 a2 b a1 c e d a2 d The example indicates that nodes are visited in increasing weight order. Chaitin and Briggs visit nodes in an arbitrary order. CS553 Lecture Register Allocation II 6 Example (Apply Briggs Algorithm) Attempt to 2-color this graph ( , ) Stack: Weighted order: a1 b d e c a1 b* a e c 2 a2* b a1* c e* d a2 d * blocked node CS553 Lecture Register Allocation II 7 3 Spilling (Original CFG and Interference Graph) a1 := ..
    [Show full text]
  • Feasibility of Optimizations Requiring Bounded Treewidth in a Data Flow Centric Intermediate Representation
    Feasibility of Optimizations Requiring Bounded Treewidth in a Data Flow Centric Intermediate Representation Sigve Sjømæling Nordgaard and Jan Christian Meyer Department of Computer Science, NTNU Trondheim, Norway Abstract Data flow analyses are instrumental to effective compiler optimizations, and are typically implemented by extracting implicit data flow information from traversals of a control flow graph intermediate representation. The Regionalized Value State Dependence Graph is an alternative intermediate representation, which represents a program in terms of its data flow dependencies, leaving control flow implicit. Several analyses that enable compiler optimizations reduce to NP-Complete graph problems in general, but admit linear time solutions if the graph’s treewidth is limited. In this paper, we investigate the treewidth of application benchmarks and synthetic programs, in order to identify program features which cause the treewidth of its data flow graph to increase, and assess how they may appear in practical software. We find that increasing numbers of live variables cause unbounded growth in data flow graph treewidth, but this can ordinarily be remedied by modular program design, and monolithic programs that exceed a given bound can be efficiently detected using an approximate treewidth heuristic. 1 Introduction Effective program optimizations depend on an intermediate representation that permits a compiler to analyze the data and control flow semantics of the source program, so as to enable transformations without altering the result. The information from analyses such as live variables, available expressions, and reaching definitions pertains to data flow. The classical method to obtain it is to iteratively traverse a control flow graph, where program execution paths are explicitly encoded and data movement is implicit.
    [Show full text]
  • Iterative-Free Program Analysis
    Iterative-Free Program Analysis Mizuhito Ogawa†∗ Zhenjiang Hu‡∗ Isao Sasano† [email protected] [email protected] [email protected] †Japan Advanced Institute of Science and Technology, ‡The University of Tokyo, and ∗Japan Science and Technology Corporation, PRESTO Abstract 1 Introduction Program analysis is the heart of modern compilers. Most control Program analysis is the heart of modern compilers. Most control flow analyses are reduced to the problem of finding a fixed point flow analyses are reduced to the problem of finding a fixed point in a in a certain transition system, and such fixed point is commonly certain transition system. Ordinary method to compute a fixed point computed through an iterative procedure that repeats tracing until is an iterative procedure that repeats tracing until convergence. convergence. Our starting observation is that most programs (without spaghetti This paper proposes a new method to analyze programs through re- GOTO) have quite well-structured control flow graphs. This fact cursive graph traversals instead of iterative procedures, based on is formally characterized in terms of tree width of a graph [25]. the fact that most programs (without spaghetti GOTO) have well- Thorup showed that control flow graphs of GOTO-free C programs structured control flow graphs, graphs with bounded tree width. have tree width at most 6 [33], and recent empirical study shows Our main techniques are; an algebraic construction of a control that control flow graphs of most Java programs have tree width at flow graph, called SP Term, which enables control flow analysis most 3 (though in general it can be arbitrary large) [16].
    [Show full text]
  • Dead Code Elimination for Web Applications Written in Dynamic Languages
    Dead code elimination for web applications written in dynamic languages Master’s Thesis Hidde Boomsma Dead code elimination for web applications written in dynamic languages THESIS submitted in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE in COMPUTER ENGINEERING by Hidde Boomsma born in Naarden 1984, the Netherlands Software Engineering Research Group Department of Software Technology Department of Software Engineering Faculty EEMCS, Delft University of Technology Hostnet B.V. Delft, the Netherlands Amsterdam, the Netherlands www.ewi.tudelft.nl www.hostnet.nl © 2012 Hidde Boomsma. All rights reserved Cover picture: Hoster the Hostnet mascot Dead code elimination for web applications written in dynamic languages Author: Hidde Boomsma Student id: 1174371 Email: [email protected] Abstract Dead code is source code that is not necessary for the correct execution of an application. Dead code is a result of software ageing. It is a threat for maintainability and should therefore be removed. Many organizations in the web domain have the problem that their software grows and de- mands increasingly more effort to maintain, test, check out and deploy. Old features often remain in the software, because their dependencies are not obvious from the software documentation. Dead code can be found by collecting the set of code that is used and subtract this set from the set of all code. Collecting the set can be done statically or dynamically. Web applications are often written in dynamic languages. For dynamic languages dynamic analysis suits best. From the maintainability perspective a dynamic analysis is preferred over static analysis because it is able to detect reachable but unused code.
    [Show full text]
  • Lecture Notes on Peephole Optimizations and Common Subexpression Elimination
    Lecture Notes on Peephole Optimizations and Common Subexpression Elimination 15-411: Compiler Design Frank Pfenning and Jan Hoffmann Lecture 18 October 31, 2017 1 Introduction In this lecture, we discuss common subexpression elimination and a class of optimiza- tions that is called peephole optimizations. The idea of common subexpression elimination is to avoid to perform the same operation twice by replacing (binary) operations with variables. To ensure that these substitutions are sound we intro- duce dominance, which ensures that substituted variables are always defined. Peephole optimizations are optimizations that are performed locally on a small number of instructions. The name is inspired from the picture that we look at the code through a peephole and make optimization that only involve the small amount code we can see and that are indented of the rest of the program. There is a large number of possible peephole optimizations. The LLVM com- piler implements for example more than 1000 peephole optimizations [LMNR15]. In this lecture, we discuss three important and representative peephole optimiza- tions: constant folding, strength reduction, and null sequences. 2 Constant Folding Optimizations have two components: (1) a condition under which they can be ap- plied and the (2) code transformation itself. The optimization of constant folding is a straightforward example of this. The code transformation itself replaces a binary operation with a single constant, and applies whenever c1 c2 is defined. l : x c1 c2 −! l : x c (where c =
    [Show full text]
  • Register Allocation by Puzzle Solving
    University of California Los Angeles Register Allocation by Puzzle Solving A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Computer Science by Fernando Magno Quint˜aoPereira 2008 c Copyright by Fernando Magno Quint˜aoPereira 2008 The dissertation of Fernando Magno Quint˜ao Pereira is approved. Vivek Sarkar Todd Millstein Sheila Greibach Bruce Rothschild Jens Palsberg, Committee Chair University of California, Los Angeles 2008 ii to the Brazilian People iii Table of Contents 1 Introduction ................................ 1 2 Background ................................ 5 2.1 Introduction . 5 2.1.1 Irregular Architectures . 8 2.1.2 Pre-coloring . 8 2.1.3 Aliasing . 9 2.1.4 Some Register Allocation Jargon . 10 2.2 Different Register Allocation Approaches . 12 2.2.1 Register Allocation via Graph Coloring . 12 2.2.2 Linear Scan Register Allocation . 18 2.2.3 Register allocation via Integer Linear Programming . 21 2.2.4 Register allocation via Partitioned Quadratic Programming 22 2.2.5 Register allocation via Multi-Flow of Commodities . 24 2.3 SSA based register allocation . 25 2.3.1 The Advantages of SSA-Based Register Allocation . 29 2.4 NP-Completeness Results . 33 3 Puzzle Solving .............................. 37 3.1 Introduction . 37 3.2 Puzzles . 39 3.2.1 From Register File to Puzzle Board . 40 iv 3.2.2 From Program Variables to Puzzle Pieces . 42 3.2.3 Register Allocation and Puzzle Solving are Equivalent . 45 3.3 Solving Type-1 Puzzles . 46 3.3.1 A Visual Language of Puzzle Solving Programs . 46 3.3.2 Our Puzzle Solving Program .
    [Show full text]
  • Code Optimizations Recap Peephole Optimization
    7/23/2016 Program Analysis Recap https://www.cse.iitb.ac.in/~karkare/cs618/ • Optimizations Code Optimizations – To improve efficiency of generated executable (time, space, resources …) Amey Karkare – Maintain semantic equivalence Dept of Computer Science and Engg • Two levels IIT Kanpur – Machine Independent Visiting IIT Bombay [email protected] – Machine Dependent [email protected] 2 Peephole Optimization Peephole optimization examples… • target code often contains redundant Redundant loads and stores instructions and suboptimal constructs • Consider the code sequence • examine a short sequence of target instruction (peephole) and replace by a Move R , a shorter or faster sequence 0 Move a, R0 • peephole is a small moving window on the target systems • Instruction 2 can always be removed if it does not have a label. 3 4 1 7/23/2016 Peephole optimization examples… Unreachable code example … Unreachable code constant propagation • Consider following code sequence if 0 <> 1 goto L2 int debug = 0 print debugging information if (debug) { L2: print debugging info } Evaluate boolean expression. Since if condition is always true the code becomes this may be translated as goto L2 if debug == 1 goto L1 goto L2 print debugging information L1: print debugging info L2: L2: The print statement is now unreachable. Therefore, the code Eliminate jumps becomes if debug != 1 goto L2 print debugging information L2: L2: 5 6 Peephole optimization examples… Peephole optimization examples… • flow of control: replace jump over jumps • Strength reduction
    [Show full text]
  • Validating Register Allocation and Spilling
    Validating Register Allocation and Spilling Silvain Rideau1 and Xavier Leroy2 1 Ecole´ Normale Sup´erieure,45 rue d'Ulm, 75005 Paris, France [email protected] 2 INRIA Paris-Rocquencourt, BP 105, 78153 Le Chesnay, France [email protected] Abstract. Following the translation validation approach to high- assurance compilation, we describe a new algorithm for validating a posteriori the results of a run of register allocation. The algorithm is based on backward dataflow inference of equations between variables, registers and stack locations, and can cope with sophisticated forms of spilling and live range splitting, as well as many architectural irregular- ities such as overlapping registers. The soundness of the algorithm was mechanically proved using the Coq proof assistant. 1 Introduction To generate fast and compact machine code, it is crucial to make effective use of the limited number of registers provided by hardware architectures. Register allocation and its accompanying code transformations (spilling, reloading, coa- lescing, live range splitting, rematerialization, etc) therefore play a prominent role in optimizing compilers. As in the case of any advanced compiler pass, mistakes sometimes happen in the design or implementation of register allocators, possibly causing incorrect machine code to be generated from a correct source program. Such compiler- introduced bugs are uncommon but especially difficult to exhibit and track down. In the context of safety-critical software, they can also invalidate all the safety guarantees obtained by formal verification of the source code, which is a growing concern in the formal methods world. There exist two major approaches to rule out incorrect compilations. Com- piler verification proves, once and for all, the correctness of a compiler or compi- lation pass, preferably using mechanical assistance (proof assistants) to conduct the proof.
    [Show full text]
  • A Little on V8 and Webassembly
    A Little on V8 and WebAssembly An V8 Engine Perspective Ben L. Titzer WebAssembly Runtime TLM Background ● A bit about me ● A bit about V8 and JavaScript ● A bit about virtual machines Some history ● JavaScript ⇒ asm.js (2013) ● asm.js ⇒ wasm prototypes (2014-2015) ● prototypes ⇒ production (2015-2017) This talk mostly ● production ⇒ maturity (2017- ) ● maturity ⇒ future (2019- ) WebAssembly in a nutshell ● Low-level bytecode designed to be fast to verify and compile ○ Explicit non-goal: fast to interpret ● Static types, argument counts, direct/indirect calls, no overloaded operations ● Unit of code is a module ○ Globals, data initialization, functions ○ Imports, exports WebAssembly module example header: 8 magic bytes types: TypeDecl[] ● Binary format imports: ImportDecl[] ● Type declarations funcdecl: FuncDecl[] ● Imports: tables: TableDecl[] ○ Types memories: MemoryDecl[] ○ Functions globals: GlobalVar[] ○ Globals exports: ExportDecl[] ○ Memory ○ Tables code: FunctionBody[] ● Tables, memories data: Data[] ● Global variables ● Exports ● Function bodies (bytecode) WebAssembly bytecode example func: (i32, i32)->i32 get_local[0] ● Typed if[i32] ● Stack machine get_local[0] ● Structured control flow i32.load_mem[8] ● One large flat memory else ● Low-level memory operations get_local[1] ● Low-level arithmetic i32.load_mem[12] end i32.const[42] i32.add end Anatomy of a Wasm engine ● Load and validate wasm bytecode ● Allocate internal data structures ● Execute: compile or interpret wasm bytecode ● JavaScript API integration ● Memory management
    [Show full text]