The New Framework for Loop Nest Optimization in GCC: from Prototyping to Evaluation


Sebastian Pop¹, Albert Cohen², Pierre Jouvelot¹, and Georges-André Silber¹

¹ Centre de recherche en informatique, Mines Paris, Fontainebleau, France
² ALCHEMY group, INRIA Futurs, Orsay, France

Abstract. This paper presents a practical prototyping tool for SSA transformations based on PROLOG, and a case study of its applicability using the New Framework for Loop Nest Optimization of the GNU Compiler Collection (GCC). Using this approach, we implemented an induction variable analysis in GCC and performed several experiments assessing different issues: performance of the generated code, effectiveness of the analyzers and optimizers, and compilation overhead, together with possible improvements. This evaluation, based on the SPEC CPU2000, MiBench and JavaGrande benchmarks on different architectures, suggests that our framework could be of valuable use in production compiler developments such as GCC.

1 Introduction

The GNU Compiler Collection (GCC) is the leading compiler suite for free and open source software. Its large set of target architectures and standards-compliant front ends makes GCC the de facto compiler for portable development and system design. However, until recently, GCC was not an option for high-performance computing: it had infrastructure neither for data access restructuring nor for automatic parallelization. Furthermore, until recently, the compilation research community had no robust, universal, free, and industry-supported platform where new ideas in compiler technology could be implemented and tested on large-scale programs written in standard programming languages. For more than a decade, the field of advanced compilation has been plagued by redundant development efforts on short-lived projects, with little impact on production environments.
In this paper, we propose an overview and a preliminary evaluation of the new infrastructure of GCC for analysis and loop nest optimization. Based on this evaluation, we intend to show the potential of GCC to address the needs of a free compiler in the compiler research and high-performance computing communities.

Modern compilers implement some of the sophisticated optimizations introduced for supercomputing applications [23, 3]. They provide performance models and transformations to improve fine-grain parallelism and to take into account the memory hierarchy.

Fig. 1. The infrastructure of GCC 4: the C, C++, Java, F95 and Ada front ends produce GENERIC, which is lowered to GIMPLE; the middle end performs induction variable analysis, data dependence analysis, and scalar and loop nest optimizations on GIMPLE-SSA; the back end translates GIMPLE to RTL and, using machine descriptions, emits assembly.

Most of these optimizations are loop-oriented and assume a high-level code representation with rich control and data structures: do loops with regular control, constant bounds and strides, typed arrays with linear subscripts. Yet these compilers are architecture-specific and designed by processor vendors, e.g., IBM, SGI, HP and Intel. In addition, the most advanced optimizations are limited to Fortran and C, and performance depends on the recognition of specific patterns in the source code. Some source-to-source compilers implement advanced loop transformations driven by architecture models and profiling [12]. However, good optimization requires manual effort in the syntactic presentation of loops and array subscripts (avoiding, e.g., while loops, imperfect nests, linearized subscripts, pointers). In addition, the source-to-source approach is not suitable for low-level optimizations, including vectorization for SIMD instructions and software pipelining. It is also associated with pattern-based strategies that do not easily adapt to syntactic variations in the programs.
Finally, these compilers require a huge implementation effort that cannot be matched by small development teams working on general-purpose and/or free software compilers.

Several projects have demonstrated the interest of type-enriched low-level representations [14]. They build on the normalization and simplicity of three-address code, adding data types, Static Single-Assignment (SSA) form [8, 17] to ease data-flow analysis and scalar optimizations, and control and data annotations (loop nesting, heap structure, etc.). Recently, GCC 4 adopted such a representation called GIMPLE [19, 16], a three-address code derived from SIMPLE, the representation of the McCAT compiler [11]. The GIMPLE representation was proposed by Sebastian Pop and Diego Novillo, from Red Hat, to minimize the effort of developing and maintaining new analyses and transformations. GIMPLE is used for building the SSA representation that is then used by the analyzers and by the scalar and loop nest optimizers. After these optimizations, the intermediate representation is translated into RTL, where machine-specific information is added. Figure 1 presents the overall structure of GCC: the C, C++, Java, Fortran 95 and Ada front ends translate their intermediate representations into a common representation called GENERIC, then into GIMPLE, which is eventually translated into RTL. Before presenting the loop optimizations that we evaluate, we give an overview of the GIMPLE intermediate representation on which the loop optimizers act.

A three-address representation like GIMPLE may not seem suitable for implementing program transformations for high performance: control flow is represented by only two primitives, a conditional expression if and a goto expression goto. Loops have to be discovered from the control-flow graph [2].
Although the number of iterations is not captured by the GIMPLE representation, it can often be discovered by analyzing the scalar variables involved in the loop exit conditions. This recovers precise information lost in the translation to a low-level representation, and it is also useful when the source level does not expose enough syntactic information, e.g., in while loops, or in loops with irregular control constructs such as break or exceptions. In GIMPLE, subscript expressions, loop bounds and strides are spread across a number of elementary instructions and basic blocks, possibly far away from their original location in the source code. This is due to the translation to the low-level intermediate representation and to optimization phases such as dead-code elimination, partial redundancy elimination, control-flow optimization, invariant code motion, etc. However, the SSA form provides fast, practical ways to extract information about the values of scalar variables.

We recall some SSA terminology [8, 17] to facilitate further discussion: the SSA graph is the graph of def-use chains in the SSA form; each assignment targets a unique variable; φ nodes occur at merge points of the control flow and restore the flow of values from the renamed variables. Assuming the natural-loop [2] information has been recovered, φ nodes whose arguments are defined at different loop depths are called loop-φ nodes, and they have a specific semantics related to a potentially recursive definition.
With the following notation:

– S[[e]] denotes the semantics of an expression e;
– loop_x-φ(b, c) denotes an SSA φ node defined in loop x, where b is defined in an enclosing loop y and c is defined in loop x, as illustrated in Figure 3;
– ℓx denotes the integer-valued index of loop x, 0 ≤ ℓx < Nx, with Nx the number of iterations of loop x;
– a(ℓx) denotes the value of variable a at iteration ℓx;

we give the denotational semantics for a subset of the SSA expressions contained in an innermost loop x:

  S[[a = loop_x-φ(b, c)]]ℓy = [ ∀ℓx, 0 ≤ ℓx < Nx : a(ℓx) = b(ℓy) if ℓx = 0; a(ℓx) = c(ℓx − 1) otherwise ]
  S[[d = e]]ℓx = [ d(ℓx) = e(ℓx) ]
  S[[f = g + h]]ℓx = [ f(ℓx) = g(ℓx) + h(ℓx) ]

We illustrate the semantics of the SSA representation with two examples. Figure 2 illustrates the role of a φ node used for merging values assigned in different branches of a conditional expression. Figure 3 presents the syntax and semantics of a loop-φ node; variable a is self-defined through the argument coming from the loop body.

Fig. 2. Syntax and semantics of a condition-φ node. The source code

  v = 5;
  if (E)
    v = 0;
  ... = v;

becomes, in SSA form,

  a = 5
  if (E) then b = 0 endif
  c = φ(a, b)
  ... = c

whose semantics is c = b if E is true, c = a otherwise.

Fig. 3. Syntax and semantics of a loop-φ node. The source code

  i = 0;
  s = 7;
  do {
    i = i + s;
  } while (E);

becomes, in SSA form,

  b = 0
  f = 7
  begin1:
    a = loop1-φ(b, c)
    c = a + f
    if (E) then goto begin1 else goto end1 endif
  end1:

whose semantics is:

  a(0) = b
  a(x) = c(x − 1)                  for x > 0
  a(x) = a(x − 1) + f(x − 1)       for x > 0
  a(x) = b + Σ_{j=0}^{x−1} f(j)    for x ≥ 0
  a(x) = b + x · f                 for x ≥ 0
  a(x) = 7x                        for x ≥ 0

2 Contributions

We developed an analyzer [21] that matches common induction patterns on the SSA graph, based on an algorithm close to linear unification [20]. We develop this idea in Section 3 and give a practical framework for prototyping SSA analyses and transformations in PROLOG [1], thus paving the way for more analyses and transformations in GCC.
The analyzer that we presented in [21] builds a description of the scalar variables using a practical closed-form representation: the chains of recurrences [4, 13, 22]. In [21] we presented some extensions to the chains of recurrences for solving practical problems that we encountered while integrating our algorithm in GCC. Among these extensions, we can cite the symbolic form of chains of recurrences that we called trees of recurrences (TREC), which enables the expression of self-defined chains of recurrences, such as the Fibonacci sequence or, more generally, exponential evolutions expressed in the program only by additions. We have also proposed peeled chains of recurrences, which list the first values of a sequence. Periodic sequences can naturally be represented by combining these two extensions, as a self-defined TREC listing the elements of the period.