Survey on Instruction Selection: An Extensive and Modern Literature Review

Gabriel S. Hjort Blindell

October 4, 2013

School of Information and Communication Technology
KTH Royal Institute of Technology
Stockholm, Sweden
Author: Hjort Blindell, Gabriel S.
Title: Survey on Instruction Selection
Subtitle: An Extensive and Modern Literature Review
Identification data: TRITA-ICT/ECS R 13:17
ISRN KTH/ICT/ECS/R-13/17-SE
ISSN 1653-6363
ISBN 978-91-7501-898-0
Doc. build version: 2013-10-04 19:32

Research Unit: Software and Computer Systems
School of Information and Communication Technology
KTH Royal Institute of Technology
Forum 105
164 40 Kista, Sweden

Copyright 2013 – All rights reserved.

This work is licensed under the Creative Commons Attribution (CC BY 3.0) license. To access the full license, please visit http://creativecommons.org/licenses/by/3.0/legalcode.

This document was typeset in LaTeX, using Garamond, Libertine, and Bera Mono as fonts. All figures were produced, either partly or entirely, with the TikZ package.

Abstract*

Instruction selection is one of three optimization problems – the other two are instruction scheduling and register allocation – involved in code generation. The task of the instruction selector is to transform an input program from its target-independent representation into a target-specific form by making best use of the available machine instructions. Hence instruction selection is a crucial component of generating code that is both correct and runs efficiently on a specific target machine. Despite on-going research since the late 1960s, the last comprehensive survey on this field was written more than 30 years ago. As many new approaches and techniques have appeared since its publication, there is a need for an up-to-date review of the current body of literature; this report addresses that need by presenting an extensive survey and categorization of both dated methods and the state of the art of instruction selection. The report thereby supersedes and extends the previous surveys, and attempts to identify where future research could be directed.

*This research was funded by the Swedish Research Council (VR) (grant 621-2011-6229). I also wish to thank Roberto Castañeda Lozano, Mats Carlsson, and Karl Johansson for their helpful feedback, and SICS (The Swedish Institute of Computer Science) for letting me use their computer equipment to compile this report.

Contents

Abstract iii

Contents iv

List of Figures vii

List of Abbreviations ix

1 Introduction 1
1.1 What is instruction selection? 2
1.1.1 Inside the compiler 2
1.1.2 The interface of the target machine 3
1.2 Problems of comparing different methods 4
1.2.1 Machine instruction characteristics 4
1.2.2 Optimal instruction selection 6
1.3 The prehistory of instruction selection 7

2 Macro Expansion 8
2.1 The principle 8
2.2 Naïve macro expansion 9
2.2.1 Early approaches 9
2.2.2 Generating the macros automatically from a machine description 11
2.2.3 More efficient compilation with tables 13
2.2.4 Falling out of fashion 14
2.3 Improving code quality with peephole optimization 15
2.3.1 Peephole optimization 15
2.3.2 Combining naïve macro expansion with peephole optimization 17
2.3.3 Running peephole optimization before instruction selection 20
2.4 Summary 21

3 Tree Covering 22
3.1 The principle 22
3.2 First techniques to use tree pattern matching 23
3.3 Covering trees bottom-up with LR parsing 25
3.3.1 The Glanville-Graham approach 26
3.3.2 Extending grammars with semantic handling 32
3.3.3 Maintaining multiple parse trees for better code quality 34
3.4 Covering trees top-down with recursion 34
3.4.1 First approaches 35
3.5 A note on tree rewriting and tree covering 37
3.6 Separating pattern matching and pattern selection 37
3.6.1 Algorithms for linear-time tree pattern matching 38
3.6.2 Optimal pattern selection with dynamic programming 43
3.6.3 Faster pattern selection with offline cost analysis 47
3.6.4 Lazy state table generation 53
3.7 Other techniques for covering trees 53
3.7.1 Approaches based on formal frameworks 54
3.7.2 Tree automata and series transducers 55
3.7.3 Rewriting strategies 55
3.7.4 Genetic algorithms 55
3.7.5 Trellis diagrams 56
3.8 Summary 57
3.8.1 Restrictions that come with trees 58

4 DAG Covering 59
4.1 The principle 59
4.2 Optimal pattern selection on DAGs is NP-complete 59
4.2.1 The proof 60
4.3 Covering DAGs with tree patterns 62
4.3.1 Undagging the expression DAG into a tree 62
4.3.2 Extending dynamic programming to DAGs 64
4.3.3 Matching tree patterns directly on DAGs 65
4.3.4 Selecting patterns with integer programming 66
4.3.5 Modeling entire processors as DAGs 67
4.4 Covering DAGs with DAG patterns 68
4.4.1 Splitting DAG patterns into trees 68
4.5 Reducing pattern selection to a MIS problem 75
4.5.1 Approaches using MIS and MWIS 76
4.6 Other DAG-based techniques 76
4.6.1 Extending means-end analysis to DAGs 76
4.6.2 DAG-like trellis diagrams 77
4.7 Summary 77

5 Graph-Based Approaches 79
5.1 The principle 79
5.2 Pattern matching using subgraph isomorphism 80
5.2.1 Ullmann's algorithm 80
5.2.2 The VF2 algorithm 81
5.3 Graph-based pattern selection techniques 81
5.3.1 Unate and binate covering 81
5.3.2 CP-based techniques 83
5.3.3 PBQP-based techniques 84
5.4 Approaches based on processor modeling 90
5.4.1 Chess 90
5.4.2 Simulated annealing 91
5.5 Summary 91

6 Conclusions 92
6.1 Is instruction selection a closed problem? 92
6.2 Where to go from here? 93

7 Bibliography 94

A List of Approaches 116

B Publication Diagram 119

C Graph Definitions 120

D Taxonomy 122
D.1 Common terms 122
D.2 Machine instruction characteristics 123
D.3 Scope 124
D.4 Principles 125

E Document Revisions 126

Index 127

List of Figures

1.1 Overview of a common compiler infrastructure 2

2.1 Language transfer example using Simcmp 9
2.2 IR expression represented as a tree 10
2.3 Binary addition macro in ICL 10
2.4 MIML example 11
2.5 Partial specification of IBM-360 in OMML 12
2.6 Overview of the Davidson-Fraser approach 17
2.7 An extension of the Davidson-Fraser approach 19
2.8 Machine instruction expressed in λ-RTL 19

3.1 Machine description sample for PCC 24
3.2 Straight-forward pattern matching with O(nm) time complexity 25
3.3 Example of an instruction set grammar 29
3.4 Generated state table 29
3.5 Execution walk-through of the Glanville-Graham approach 30
3.6 An instruction set expressed as an affix grammar 33
3.7 String matching state machine 39
3.8 Hoffmann-O'Donnell's bottom-up algorithm for labeling expression trees 39
3.9 Tree pattern matching using Hoffmann-O'Donnell 41
3.10 Preprocessing algorithms of Hoffmann-O'Donnell 41
3.11 Compression of lookup tables 43
3.12 Rule samples from Twig 44
3.13 Computing the costs with dynamic programming 45
3.14 DP-driven instruction selection algorithm with optimal pattern selection 45
3.15 Table-driven algorithm for performing optimal pattern selection and code emission 48
3.16 BURS example 49
3.17 Incorporating costs into states 50
3.18 Trellis diagram example 57

4.1 Reducing SAT to DAG covering 61
4.2 Undagging an expression DAG with a common subexpression 63
4.3 Sharing nonterminals of tree covering on a DAG 64
4.4 CO graph of a simple processor 67
4.5 Splitting DAG pattern at output nodes into multiple tree patterns 74
4.6 Conflict graph example 75

5.1 Time complexities of pattern matching and optimal pattern selection 80
5.2 Unate covering example 82
5.3 Example code which would benefit from global instruction selection 86
5.4 Same as Figure 5.3 but in SSA form 86

B.1 Publication timeline diagram 119

C.1 Example of two simple, directed graphs 120

D.1 Basic block example 124

List of Abbreviations

AI artificial intelligence
APT abstract parse tree (same as AST)
ASIP application-specific instruction-set processor
AST abstract syntax tree
BURS bottom-up rewriting system
CDFG control and data flow graph
CISC complex instruction set computing
CNF conjunctive normal form
CP constraint programming
DAG directed acyclic graph
DP dynamic programming
DSP digital signal processor
GA genetic algorithm
IP integer programming
IR intermediate representation
ISA instruction set architecture
ISE instruction set extension
JIT just-in-time
LL parsing left-to-right, left-most derivation parsing
LR parsing left-to-right, right-most derivation parsing
LALR look-ahead LR parsing
MIS maximum independent set
MWIS maximum weighted independent set
PBQP partitioned Boolean quadratic problem
PO peephole optimisation
QAP quadratic assignment problem
RTL register transfer list
SAT satisfiability problem
SIMD single-instruction, multiple-data
SLR parsing simple LR parsing
SSA static single assignment
VLIW very long instruction word
VLSI very-large-scale integration

1

Introduction

Every undertaking of implementing any piece of software – be it a GUI-based Windows application written in C, a high-performance program for mathematical computations, or a smartphone app written in Java – necessitates a compiler in one form or another. Its core purpose is to transform the source code of an input program – written in some relatively high-level programming language – into an equivalent representation of low-level program code which is specific to a particular target machine. This program code is converted into machine code – a relatively easy task compared to compilation – which can then be executed by the designated target machine. Consequently, compilers have been, are, and most likely always will be, essential enablers for making use of modern technology; even refrigerators nowadays are controlled by tiny computers. Thus, having been under active research since the first computers started to appear in the 1940s, compilation is one of the oldest and most studied areas of computer science. To achieve this task the compiler needs to tackle a vast range of intermediate problems. A few of these problems include syntax analysis, transformation into an intermediate representation, target-independent optimization, followed by code generation. Code generation is the process where the target-specific program code is produced and consists in turn of three subproblems – instruction selection, instruction scheduling, and register allocation – of which the first is the focus of this report. Previous surveys that discuss instruction selection to one degree or another have been conducted by Cattell [45], Ganapathi et al. [116], Leupers [169], and Boulytchev and Lomov [33]. However, the last extensive survey – that of Ganapathi et al. – was published more than 30 years ago. Moreover, the classic compiler textbooks only briefly discuss instruction selection and thus provide little insight; in the combined body of over 4,600 pages that constitute the compiler textbooks [8, 15, 59, 96, 173, 190, 262], less than 160 pages – which overlap to a great extent and basically only discuss tree covering – are devoted to instruction selection. Hence there is a need for a new and up-to-date study of this field. This report addresses that need by describing and reviewing the existing methods – both dated as well as the state-of-the-art – of instruction selection, and thereby supersedes and extends the previous surveys. The report is structured as follows. The rest of this chapter introduces and describes the task of instruction selection, explains common problems of comparing existing methods, and

briefly covers the first papers on code generation. Chapters 2 through 5 then each discuss a fundamental principle of instruction selection: Chapter 2 introduces macro expansion; Chapter 3 discusses tree covering, which is extended into DAG and graph covering in Chapters 4 and 5. Lastly, Chapter 6 ends the report with conclusions where I attempt to identify where future research could be directed. The report also contains several appendices. Appendix A provides a table which summarizes all approaches covered in the report, and in Appendix B there is a diagram illustrating the publication timeline. For necessary background information Appendix C provides the formal definitions regarding graphs. Appendix D contains a taxonomy of terms and concepts which are used throughout the report to facilitate the discussion and avoid confusion. Lastly, a revision log for the document is available in Appendix E.

1.1 What is instruction selection?

As already stated in the beginning of this chapter, the core purpose of a compiler is to transform a program written in high-level source code into low-level machine code that is tailored for the specific hardware platform on which the program will be executed. The source code is expressed in some programming language (e.g. C, Java, Haskell, etc.), and the hardware is typically a processor of a particular model. The program under compilation will henceforth be referred to as the input program, and the hardware against which it is compiled will be called the target machine (these definitions are also available in Appendix D).

1.1.1 Inside the compiler To put instruction selection into context, let us briefly examine what a compiler does. Figure 1.1 illustrates the infrastructure of most compilers. First the frontend parses, validates, and translates the source code of the input program into an equivalent form known as the intermediate representation, or IR for short. The IR code is a format used internally by the compiler which also insulates the rest of the compiler from the characteristics of a particular programming language. Hence multiple programming languages can be supported by the same compiler by simply providing a frontend for each language.

source code → frontend → IR code → optimizer → IR code → backend → assembly code
(backend: instruction selector → instruction scheduler → register allocator)

Figure 1.1: Overview of a common compiler infrastructure.

After undergoing a range of optimizations that can be performed regardless of the target machine – e.g. dead-code elimination and evaluating expressions with constant values (known as constant folding) – the IR code is converted by the code generator backend into a set of machine-specific instructions called assembly code; this is the component that allows the input program to be executed on a specific target machine. To do this we need to decide which machine instructions of the target machine to use for implementing the IR code; this is the responsibility of the instruction selector. For the selected machine instructions we also need to decide in which order they shall appear in the assembly code, which is taken care of by the instruction scheduler. Lastly, we must decide how to use the limited set of registers available on the target machine, which is managed by the register allocator. The last two tasks are out of scope for this report and will not be discussed in any greater detail. The instruction selector must primarily select the machine instructions which implement the same expected behavior as that of the input program, and as a secondary objective it should result in efficient assembly code. We reformulate this as the following subproblems:

1. pattern matching – detecting when and where it is possible to use a certain machine instruction; and
2. pattern selection – deciding which instruction to choose in situations when multiple options exist.

The first subproblem is concerned with finding machine instruction candidates, whereas the second subproblem is concerned with selecting a subset from these candidates. Some approaches combine both subproblems into a single step, but most keep them separate and predominately differ in how they solve the pattern selection problem. In general the latter is formulated as an optimization problem where each machine instruction is assigned a cost and the goal is to minimize the total cost of the selected instructions. Depending on the optimization criteria, the cost typically reflects the number of cycles required by the target machine to execute a given machine instruction.

1.1.2 The interface of the target machine The set of available machine instructions that can be used during instruction selection is dictated by the target machine's instruction set architecture (ISA). Most ISAs provide a rich and diverse instruction set whose complexity ranges from simple instructions that perform only a single operation – e.g. arithmetic addition – to highly complex instructions – e.g. copy the content of one memory location to another, increment the address pointers, and then repeat until a certain stop value is encountered. As the instruction set is seldom orthogonal there often exists more than one way of implementing a specific set of operations. For example, the expression a + (b × c) could be computed by first executing a mul instruction – which implements d ← b × c – followed by an add instruction implementing e ← a + d, or it could be computed by executing a single mulacc instruction that implements a ← a + (b × c). Depending on the characteristics of the target machine, some sequences of instructions are more efficient than others in performing a specific task. This is especially true for digital signal processors (DSPs) which exhibit many customized machine instructions in order to improve the performance of certain algorithms. According to a 1994 study [254], the clock-cycle

overhead of compiler-generated assembly code from C programs targeting DSPs could be as much as 1,000% compared to hand-written assembly code due to failing to exploit the full capabilities of the target machine. Consequently, the quality of the generated machine code is highly dependent on the compiler's ability to exploit hardware-specific features and resources, wherein the instruction selector is a crucial (but not the only) component in achieving this goal.
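To make the distinction between pattern matching and pattern selection concrete, the following is a minimal C sketch of cost-driven selection for the a + (b × c) example above. The instruction names come from the discussion; the costs are invented for illustration and do not describe any real target machine.

    #include <stdio.h>

    /* A cover found by the pattern matcher: the machine instructions
       it uses and their total cost (e.g. estimated cycles).           */
    struct cover { const char *instrs; int cost; };

    int main(void) {
        /* Two ways of covering a + (b * c), with hypothetical costs. */
        struct cover covers[] = {
            { "mul, add", 2 },   /* d <- b * c, then a <- a + d         */
            { "mulacc",   1 },   /* a <- a + (b * c) in one instruction */
        };
        /* Pattern selection: pick the cover with the lowest total cost. */
        struct cover best = covers[0];
        for (size_t i = 1; i < sizeof covers / sizeof covers[0]; i++)
            if (covers[i].cost < best.cost)
                best = covers[i];
        printf("selected: %s (cost %d)\n", best.instrs, best.cost);
        return 0;
    }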

1.2 Problems of comparing different methods

As we saw in the previous section, the ISA can consist of many different kinds of machine instructions. Common to all contemporary instruction selection approaches, however, is that none is capable of handling every machine instruction available in the ISA. Consequently, most papers proposing a new instruction selection technique differ on the instruction set under discussion. For example, in the early literature a "complex machine instruction" refers to a memory load or store instruction that computes the memory address using schemes of varying complexity. These schemes are called addressing modes and using the appropriate addressing mode can reduce code size as well as increase performance. As an example, let us assume that we need to load the value at a particular position in an array of 1-byte values that reside in memory. The memory address of our wanted value is thus @A + offset, where @A refers to the memory address of the first element and is usually called the base address. A reasonable approach to fetch this value is to first execute an add instruction to compute @A + offset into some register rx, and then execute a load instruction that reads the memory address from rx. Such load instructions are said to have absolute or direct addressing modes. However, if the ISA provides a load with indexed addressing mode – where the memory address to read from is the sum of a base value and an offset – then we can get away with using only a single machine instruction. By contrast, in modern instruction selection approaches efficient handling of such addressing modes is for the most part trivial, and there a complex machine instruction typically denotes an instruction which produces more than one value or can only be used (or not used) in particular situations. To mitigate these problems, and enable us to compare the different methods of instruction selection, we will introduce and define a set of characteristics that each refer to a certain class of machine instructions.

1.2.1 Machine instruction characteristics Single-output instructions The simplest kind of machine instructions is the set of single-output instructions. These produce only a single observable output, in the sense that "observable" means a value that can be read and accessed by the program. This includes all machine instructions that implement a single operation (e.g. addition, multiplication, and bit operations like OR and NOT), but it also includes more complicated instructions that implement several operations like the memory operations just discussed with complicated addressing modes (e.g. load into register rd the value at memory location specified in base register rx plus offset specified in register ry plus an

immediate value); as long as the observable output constitutes a single value, a single-output instruction can be arbitrarily complex.

Multi-output instructions As expected, multiple-output instructions produce more than one observable output from the same input. Examples include divrem instructions that compute both the quotient as well as the remainder of two input values, as well as arithmetic instructions that, in addition to computing the result, also set status flags. A status flag (also known as a condition flag or condition code) is a bit that signifies additional information about the result, e.g. if there was a carry overflow or the result was 0. For this reason such instructions are often said to have side effects, but in reality these bits are no more than additional output values produced by the instruction and will thus be referred to as multiple-output instructions. Load and store instructions that access a value in memory and then increment the address pointer are also considered multiple-output instructions.

Disjoint-output instructions Machine instructions that produce many observable output values from many different input values are referred to as disjoint-output instructions. These are similar to multiple-output instructions with the exception that all output values in the latter originate from the same input values. Another way to put it is that if one formed the expression graphs that correspond to each output, all these graphs would be disjoint from one another; we will explain how such graphs are formed in the next chapter. Disjoint-output instructions typically include SIMD (Single-Instruction, Multiple-Data) and vector instructions that execute the same operations simultaneously on many distinct input values.

Internal-loop instructions Internal-loop instructions denote machine instructions whose behavior exhibits some form of internal looping. For example, the TI TMS320C55x [247] processor provides an RPT k instruction that repeats the immediately following instruction k times, where k is an immediate value given as part of the machine instruction. These machine instructions are among the hardest to exploit and are today handled either via customized optimization routines or through special support from the compiler.

Interdependent instructions The last class is the set of interdependent machine instructions. This includes instructions that carry additional constraints that appear when the instructions are combined in certain ways. An example is an ADD instruction, again from the TMS320C55x instruction set, which cannot be combined with an RPT k instruction if a particular addressing mode is used for the ADD. As we will see in this report, this is another class of machine instructions that most instruction selectors struggle with as they often fall outside the set of assumptions made by the underlying techniques. In addition, these constraints appear only in certain situations.

1.2.2 Optimal instruction selection When using the term optimal instruction selection, most of the literature assumes – often implicitly – the following definition:

Definition. An instruction selector, capable of modeling a set of machine instructions I with costs c_i, is optimal if for any given input program P it selects a set S ⊆ I such that S implements P, and there exists no other set S′ ⊆ I that also implements P and for which ∑_{s′ ∈ S′} c_{s′} < ∑_{s ∈ S} c_s.

In other words, if no assembly code with lower cost can be achieved using the same machine instructions that can be handled and exploited by the instruction selector, then the instruction selector is said to be optimal. This definition has two shortcomings. First, the clause "machine instructions that can be modeled" restricts the discussion of optimality to whether the instruction selector can make best use of the instructions that it can handle; other, non-supported machine instructions that could potentially lead to more efficient assembly code are simply ignored. Although it enables comparison of instruction selectors with equal machine instruction support, the definition also allows two instruction selectors that handle different instruction sets to both be considered optimal even if one instruction selector clearly produces more efficient code than the other. Omitting the offending clause removes this problem but also renders all existing instruction selection techniques suboptimal; excluding the simplest of ISAs, there always seems to exist some machine instruction that would improve the code quality for a particular input program but cannot be handled by the instruction selector. Another, more pressing, problem is that two comparable instruction selectors may select different instructions, both solutions considered optimal from an instruction-selection point of view, that ultimately yield assembly code of disproportionate quality. For example, one set of instructions may enable the subsequent instruction scheduler and register allocator to do a better job than if other instructions had been selected. This highlights a well-known property of code generation: instruction selection, instruction scheduling, and register allocation are all interconnected with one another, forming a complex system that affects the quality of the final assembly code in complicated and often counter-intuitive ways. To produce truly optimal code, therefore, all three tasks must be solved in unison, and much research has gone into devising such systems (some of which are covered in this report). Hence the idea of optimal instruction selection as an isolated concept is a futile sentiment, and one would be wise to pay close attention to the set of supported machine instructions if a method of instruction selection is claimed to be "optimal". Unfortunately, the notion has already been firmly established in the compiler community and permeates the literature. Due to this – and lack of better options – I will adhere to the same definition in this report but keep its use to a minimum.

1.3 The prehistory of instruction selection

The first papers on code generation started to appear in the early 1960s [13, 99, 193, 214] and were predominately concerned with how to compute arithmetic expressions on target machines based on accumulator registers. An accumulator register is a register which acts both as an input value and destination for an instruction (e.g. a ← a + b), and such registers were prevalent in the early machines as the processor could be built using only a few registers. Although this simplified the hardware manufacturing process, it was not straight-forward to automatically generate assembly code that minimizes the number of transfers between the accumulator registers and main memory when evaluating the expressions. In 1970 Sethi and Ullman [228] extended these ideas to target machines with n general-purpose registers and presented an algorithm that evaluates arithmetic statements with common subexpressions and generates assembly code with as few instructions as possible (a simplified sketch of the register-labeling idea behind this line of work is given at the end of this section). This work was later extended in 1976 by Aho and Johnson [3] who applied dynamic programming to develop a code generation algorithm that could handle target machines with more complex addressing modes such as indirection. We will revisit this method later in the report as many subsequent approaches to instruction selection are influenced by this technique. Common among these early ideas is that the problem of instruction selection was effectively ignored or circumvented. For instance, both Sethi and Ullman's and Aho and Johnson's approaches assume that the target machines exhibit neat mathematical properties, devoid of any exotic machine instructions and multiple register classes. Since no, or very few, machines have such characteristics, these algorithms were not applicable directly in real-life settings. Lacking formal approaches the first instruction selectors were typically made by hand and based on ad-hoc algorithms. This meant a trade-off between efficiency and retargetability: if the instruction selector was too general, the generated assembly code might perform poorly; if it was tailored too tightly to a particular target machine, it could constrain the compiler's support for others. Retargeting such instruction selectors therefore involved manual modifications and rewrites of the underlying algorithms. For irregular architectures, with multiple register classes and different instructions to access each class, the original instruction selector might not even be usable at all. But even if the instruction selector was built to facilitate compiler retargeting, it was not immediately clear how to achieve this goal. In the next chapter we will examine the first methods that attacked this problem.
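The following C sketch shows the register-labeling idea mentioned above: each node of an expression tree is labeled with the minimum number of registers needed to evaluate its subtree without spilling intermediate results to memory. It is a simplified reconstruction for intuition only, not the algorithm as published by Sethi and Ullman, which also emits code and handles further cases.

    #include <stdio.h>

    /* A node in a binary expression tree; leaves have no children and
       operator nodes are assumed to have exactly two children.         */
    struct node {
        const char *name;
        struct node *left, *right;
    };

    /* Minimum number of registers needed to evaluate a subtree. */
    static int label(const struct node *n) {
        if (!n->left && !n->right)
            return 1;                      /* a leaf needs one register */
        int l = label(n->left);
        int r = label(n->right);
        /* If both subtrees need the same amount, one extra register is
           needed to hold the first result while the second is computed. */
        return l == r ? l + 1 : (l > r ? l : r);
    }

    int main(void) {
        /* Tree for a + (b * c). */
        struct node a = {"a", 0, 0}, b = {"b", 0, 0}, c = {"c", 0, 0};
        struct node bc = {"*", &b, &c};
        struct node root = {"+", &a, &bc};
        printf("registers needed: %d\n", label(&root));   /* prints 2 */
        return 0;
    }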

2

Macro Expansion*

2.1 The principle

The first papers that introduced methods which dealt with instruction selection as a separate problem started to appear in the late 1960s. In these designs the instruction selection is driven by matching templates over the code that constitutes the input program; a small example of this kind of translation is shown below. Upon a hit the corresponding macro is executed using the matched program string as argument.

    Input program      Assembly code
    int a = 1;         mv r1, 1
    int b = a + 4;     add r2, r1, 4
    p[4] = b;          mv r3, @p
                       add r4, r3, 16
                       st r4, 0(r2)

Each language construct, which collectively constitute the programming language in which the input program is expressed, typically has its own macro definition that emits appropriate assembly code for the target machine. Multiple language constructs can also be combined into a single template to make better use of the instruction set. For a given input program, the instruction selector will operate in a procedural fashion of traversing the input program, pairing the code against the templates, and executing the macros that match. If some portion of the text cannot be matched against any template then the instruction selector will fail and report an error, indicating that it is unable to produce valid assembly code for that particular input program against the specific target machine (which preferably should never happen). The process of matching templates therefore corresponds to the pattern matching problem described in the previous chapter, and the process of selecting between multiple matching macros corresponds to the pattern selection problem. To the best of my knowledge, however, all macro-expanding instruction selectors immediately select whichever macro that matches first, thereby trivializing the latter problem. By keeping the implementation of these macros separate from the implementation of the core – the part of the instruction selector which takes care of template matching and macro execution – the effort of retargeting the compiler to a different target machine is lessened, whereas the earlier, monolithic approaches often necessitated rewrites of the entire code generator backend.
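To make the principle concrete, here is a deliberately naïve C sketch of a macro expander over a list of IR operations: each opcode is looked up in a macro table and the first macro that matches is executed, emitting assembly code in the style of the example above. The opcodes, macro bodies, and assembly syntax are all invented for illustration and are not taken from any of the systems discussed in this chapter.

    #include <stdio.h>
    #include <string.h>

    /* One IR operation: an opcode and up to three operands
       (destination first), all given as strings for simplicity.     */
    struct ir_op { const char *opcode; const char *arg[3]; };

    /* A macro definition: the IR opcode it matches (the "template")
       and a printf-style format emitting the assembly code.         */
    struct macro { const char *opcode; const char *format; };

    /* Hypothetical macro table for a made-up three-address target. */
    static const struct macro macros[] = {
        { "LOADI", "mv   %s, %s\n"     },
        { "ADD",   "add  %s, %s, %s\n" },
        { "MUL",   "mul  %s, %s, %s\n" },
        { "STORE", "st   %s, 0(%s)\n"  },
    };

    static void expand(const struct ir_op *op) {
        for (size_t i = 0; i < sizeof macros / sizeof macros[0]; i++) {
            if (strcmp(op->opcode, macros[i].opcode) == 0) {
                /* Execute the first matching macro and stop. */
                printf(macros[i].format, op->arg[0], op->arg[1], op->arg[2]);
                return;
            }
        }
        fprintf(stderr, "no macro for %s: cannot generate code\n", op->opcode);
    }

    int main(void) {
        /* IR for: a = 1; b = a + 4 (results kept in r1 and r2). */
        struct ir_op prog[] = {
            { "LOADI", { "r1", "1",  0   } },
            { "ADD",   { "r2", "r1", "4" } },
        };
        for (size_t i = 0; i < sizeof prog / sizeof prog[0]; i++)
            expand(&prog[i]);
        return 0;
    }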

*This chapter is primarily based on earlier surveys by Cattell [45] and Ganapathi et al. [116]. In the latter, this principle is called interpretative code generation.

2.2 Naïve macro expansion

2.2.1 Early approaches We will refer to instruction selectors that directly apply the principle just described as naïve macro expanders – for reasons that will soon become apparent. In the first such implementations the macros were either written by hand – as in the Pascal compiler developed by Ammann et al. [11, 12] – or generated automatically from a machine-specific description file. This file was typically written in some dedicated language, and many such languages and tools have appeared (and disappeared) over the years. One example is Simcmp, a macro processor developed by Orgass and Waite [200] which was designed to facilitate bootstrapping.1 Simcmp read its input line by line, compared the line against the templates of the available macros, and then executed the first macro that matched. An example of such a macro is given in Figure 2.1.

(a) A macro definition:
    * = CAR.*.
    I = CDR('21)
    CDR('11) = CAR(I).
    .X
(b) A string that matches the template:
    A = CAR B.
(c) Produced output:
    I = CDR(38)
    CDR(36) = CAR(I)
Figure 2.1: Language transfer example using SIMCMP (from [200]).

Another example is GCL (Generate Coding Language), a language developed by Elson and Rake [78] which was used in a PL/1 compiler for generating machine code from abstract syntax trees (ASTs). An AST is a graph-based representation of the source code and is typically shaped like a tree (Appendix C provides an exact definition of graphs, trees, nodes, edges, and other related terms which we will encounter throughout this report). The most important feature of these trees is that only syntactically valid input programs can be transformed into an AST, which simplifies the task for the instruction selector. However, the basic principle of macro expansion remains the same.

Using an intermediate representation instead of abstract syntax trees Performing instruction selection directly on the source code – either in its textual form or on the AST – carries the disadvantage of tightly coupling the compiler to a particular programming language. Most compiler infrastructures therefore rely on some machine-independent intermediate representation (IR) which insulates the subsequent optimization routines and the backend from the details of the programming language.2 Like the AST the IR code is often represented as a graph called the expression graph (or expression tree, depending on the shape), where each node represents a distinct operation and each directed edge represents a data dependency between two operations. Figure 2.2 illustrates an example of such an expression tree; in this report we will consistently draw trees with the root node at the top,

1Bootstrapping is the process of writing a compiler in the programming language it is intended to compile. 2This need was in fact recognized already in the late 1950s [57, 236].

but note that the convention differs from one paper to another. It is also common to omit any intermediate variables and only keep those signifying the input and output values of the expression.

(a) IR code:
    t1 = a + b
    c = t1 * 2
(b) Corresponding expression tree.
Figure 2.2: Example of a piece of IR code represented as an expression tree.

One of the first IR-based schemes was developed by Wilcox [261]. Implemented in a PL/C compiler, the AST is first transformed into a form of machine-independent assembly code consisting of SLM (Source Language Machine) instructions. The instruction selector then maps each SLM instruction into one or more target-specific machine instructions using macros defined in a language called ICL (Interpretive Coding Language) (see Figure 2.3). In practice, these macros turned out to be tedious and difficult to write: many details, such as addressing modes and data locations, had to be dealt with manually from within the macros, and in the case of ICL the macro writer also had to keep track of which variables were part of the final assembly code and which variables were auxiliary and only used to aid the code generation process. In an attempt to simplify this task Young [269] proposed – but never implemented – a higher-level language called TEL (TEmplate Language) that would abstract away some of the implementation-oriented details; the idea was to first express the macros using TEL, and then automatically generate the lower-level ICL macros from the specification.

ADDB   BR A,ADDB1    If A is in a register, jump to ADDB1
       BR B,ADDB2    If B is in a register, jump to ADDB2
       LGPR A        Generate code to load A into register

ADDB1  BR B,ADDB3    If B is in a register, jump to ADDB3
       GRX A,A,B     Generate A+B
       B ADDB4       Merge

ADDB3  GRR AR,A,B    Generate A+B
ADDB4  FREE B        Release resources assigned to B
ADDB5  POP 1         Remove B descriptor from stack
       EXIT

ADDB2  GRI A,B,A     Generate A+B
       FREE A        Release resources assigned to A
       SET A,B       A now designates result location
       B ADDB5       Merge

Figure 2.3: Binary addition macro in ICL (from [261]).

2.2.2 Generating the macros automatically from a machine description As with Wilcox's approach, many of the early macro-expanding instruction selectors depended on macros that were intricate and difficult to write, and the fact that the compiler developers often incorporated register allocation into these macros only exacerbated this problem. For example, if the target machine exhibits multiple register classes – with special machine instructions to move data from one register class to another – a record must be kept of which program values reside in which registers. Then, depending on the register assignment the instruction selector will need to emit the appropriate data transfer instructions in addition to the rest of the assembly code that constitutes the input program. Due to the exponential number of possible situations, the complexity that the macro designer had to manage could be immense.

Handling data transfers via state machines The first attempt to address this problem was made in 1971 by Miller [189] who developed a code generation system called Dmacs (Descriptive MACro System) that automatically infers the necessary data transfers between memory and different register classes. By providing this information in a separate file, Dmacs was also the first system to allow the details of the target machine to be declared separately instead of implicitly embedding them into the macros. Dmacs relies on two proprietary languages: MIML (Machine-Independent Macro Language), which declares a set of procedural two-argument commands that serves as the IR format (see Figure 2.4 for an example); and a declarative language called OMML (Object Machine Macro Language) for implementing the macros that will transform each MIML command into assembly code. So far the scheme is similar to the one applied by Wilcox. First, when adding support for a new target machine, the macro designer specifies the set of available register classes as well as the permissible transfer paths between these classes (including memory). The designer then defines the OMML macros by providing, for every macro, a list of machine instructions that each can implement the corresponding MIML command on the target machine. If necessary a sequence of machine instructions can be given to emulate the effect of the MIML command. For each machine instruction constraints are added that enforce the input and output data to reside in locations expected of that machine instruction. Figure 2.5 shows an abbreviated OMML specification for the IBM-360 machine. Dmacs uses this information to generate a collection of state machines that each determines how a given set of input values can be transferred into locations that are permissible for a given OMML macro. A state machine consists of a directed graph where each node represents a

1: SS C,J
2: IMUL 1,D
3: IADD 2,B
4: SS A,I
5: ASSG 4,3

Figure 2.4: An example of how an arithmetic expression A[I] = B + C[J] ∗ D can be represented as MIML commands (from [189]). The SS command is used for data referencing and the ASSG command assigns a value to a variable. The arguments to the MIML commands are referred to either by a variable symbol or by line number.

rclass REG:r2,r3,r4,r5,r6
rclass FREG:fr0,fr2,fr4,fr6
...
rpath WORD->REG: L REG,WORD
rpath REG->WORD: ST REG,WORD
rpath FREG->WORD: LE FREG,WORD
rpath WORD->FREG: STE FREG,WORD
...
ISUB s1,s2
  from REG(s1),REG(s2) emit SR s1,s2 result REG(s1)
  from REG(s1),WORD(s2) emit S s1,s2 result REG(s2)

FMUL m1,m2 (commutative)
  from FREG(m1),FREG(m2) emit MER m1,m2 result FREG(m1)
  from FREG(m1),WORD(m2) emit ME m1,m2 result FREG(m1)

Figure 2.5: Partial specification of IBM-360 in OMML (from [189]). rclass declares a register class and rpath declares a permissible transfer between a register class and memory (or vice versa), along with the machine instruction which implements that transfer.

specific configuration of register classes and memory, some of which are marked as permissible. The edges indicate how to transition from one state to another, and are labeled with the machine instruction that will enable the transition when executed on a particular input value. During compilation the instruction selector consults the appropriate state machine as it traverses from one MIML command to the next by using the input values of the former to initialize the state machine, and then emitting the machine instructions appearing on the edges that are taken until the state machine reaches a permissible state. The work by Miller was pioneering but limited: Dmacs only handled arithmetic expressions consisting of integer and floating-point values, had limited addressing mode support, and could not model other target machine classes such as stack-based architectures. Donegan [72] later extended these ideas by proposing a new compiler generator language called CGPL (Code Generator Preprocessor Language), but the design was never implemented and still only remains on paper. Similar approaches have also been developed by Tirrell [242] and Simoneaux [232], and Ganapathi et al. [116] describe in their survey article another state machine-based compiler called Ugen, which was derived from U-code [204].
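The idea behind these state machines can be illustrated with a small C sketch: given the storage class a value currently occupies and the class that a macro expects, search the declared transfer paths and emit the transfer instructions along the way. The classes and instruction names loosely mimic Figure 2.5; the rest is invented for illustration and is not how Dmacs was actually implemented.

    #include <stdio.h>

    /* Storage classes, loosely following Figure 2.5. */
    enum loc { WORD, REG, FREG, NLOC };

    /* A permissible transfer path and the instruction implementing it. */
    struct rpath { enum loc from, to; const char *instr; };

    static const struct rpath rpaths[] = {
        { WORD, REG,  "L"   },   /* load memory word into general register  */
        { REG,  WORD, "ST"  },   /* store general register to memory        */
        { WORD, FREG, "LE"  },   /* load memory word into floating register */
        { FREG, WORD, "STE" },   /* store floating register to memory       */
    };

    /* Breadth-first search over the (tiny) state space: emit the transfer
       instructions needed to move a value from 'from' to class 'wanted'.  */
    static int emit_transfers(enum loc from, enum loc wanted) {
        int prev[NLOC], via[NLOC], seen[NLOC] = {0};
        int queue[NLOC], head = 0, tail = 0;
        seen[from] = 1; queue[tail++] = from;
        while (head < tail) {
            int cur = queue[head++];
            if (cur == (int)wanted) {
                /* Walk back through predecessors and print in order. */
                int path[NLOC], n = 0;
                for (int s = cur; s != (int)from; s = prev[s]) path[n++] = s;
                while (n-- > 0) printf("%s\n", rpaths[via[path[n]]].instr);
                return 1;
            }
            for (int i = 0; i < (int)(sizeof rpaths / sizeof rpaths[0]); i++)
                if ((int)rpaths[i].from == cur && !seen[rpaths[i].to]) {
                    seen[rpaths[i].to] = 1;
                    prev[rpaths[i].to] = cur;
                    via[rpaths[i].to] = i;
                    queue[tail++] = rpaths[i].to;
                }
        }
        return 0;   /* no permissible transfer sequence exists */
    }

    int main(void) {
        /* A value currently in a floating register must reach a general
           register before, say, an ISUB macro applies: prints STE, then L. */
        if (!emit_transfers(FREG, REG))
            printf("no path\n");
        return 0;
    }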

Further improvements In 1975 Snyder [233] implemented one of the first fully operational and portable C compilers where the target machine-dependent parts could be automatically generated from a machine description. The approach is similar to that of Miller in that the frontend first transforms the input program into an equivalent representation for an abstract machine, consisting of AMOPs (Abstract Machine OPerations), which are then expanded into machine instructions via macros. The abstract machine and macros are specified in a machine description language which is similar to that of Miller but handles more complex data types, addressing modes, alignment, as well as branching and function calls. If needed, more complicated macros can also be defined as customized C functions. We mention Snyder's work primarily because it was later adapted by Johnson [143] in his implementation of PCC, which we will discuss in the next chapter. Fraser [106, 107] also recognized the need for human knowledge to guide the code generation process and implemented a system with the aim to facilitate the addition of hand-written rules when these are required. First the input program is transformed into a representation based on XL which is a form of high-level assembly code; for example, XL provides primitives for array accesses and for loops. Like Miller and Snyder, the machine instructions are provided via a separate description in which each machine instruction maps directly to a distinct XL primitive. If some portion of the input program cannot be implemented by any available instruction, the instruction selector will invoke a set of rules which rewrite the XL code until a solution is found. For example, array accesses are broken down into simpler primitives. In addition, the same rule base can be used to improve the code quality of the generated assembly code. Since these rules are provided as a separate file, they can be customized and augmented as needed to fit a particular target machine. As we will see, this idea of "massaging" the input program until a solution can be found has been applied, in one form or another, by many instruction selection approaches that both predate and succeed Fraser's approach. In most, however, the common drawback is that if this set of rules is incomplete for a particular target machine – and it is far from trivial to determine whether such is the case – the instruction selector may get stuck in an infinite loop. Moreover, such rules are often hard to reuse for other target machines.
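The following C sketch illustrates this "massaging" idea in its simplest form: whenever an operation in a made-up high-level code stream has no matching machine instruction, a rewrite rule replaces it with simpler operations and matching is retried. The operations, the single rule, and the target's instruction set are all invented; Fraser's XL rules were considerably richer. The sketch also terminates only because the rule rewrites into operations that are known to be implementable; with an incomplete rule set, exactly the infinite-loop risk noted above would arise.

    #include <stdio.h>
    #include <string.h>

    /* A made-up high-level operation, identified by a short opcode. */
    struct op { char code[6]; };

    static int has_instruction(const char *code) {
        /* Hypothetical target: only ADD, LOAD, and MUL are implementable. */
        return !strcmp(code, "ADD") || !strcmp(code, "LOAD") || !strcmp(code, "MUL");
    }

    int main(void) {
        struct op prog[8] = { {"INDEX"}, {"MUL"} };
        int n = 2;
        for (int i = 0; i < n; i++) {
            if (has_instruction(prog[i].code)) {
                printf("emit %s\n", prog[i].code);
            } else if (!strcmp(prog[i].code, "INDEX")) {
                /* Rewrite rule: INDEX (array access) => ADD ; LOAD. */
                for (int j = n; j > i + 1; j--) prog[j] = prog[j - 1];
                strcpy(prog[i].code, "ADD");
                strcpy(prog[i + 1].code, "LOAD");
                n++;
                i--;                   /* retry the rewritten position */
            } else {
                printf("stuck: no rule for %s\n", prog[i].code);
                return 1;
            }
        }
        return 0;
    }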

2.2.3 More efficient compilation with tables Despite their already simplistic nature, macro-expanding instruction selectors can be made even more so by representing the 1-to-1 or 1-to-n mappings as a set of tables. This further emphasizes the separation of the machine-independent core of the instruction selector from the machine-dependent mappings, as well as allowing for denser implementations that require less memory and potentially reduce compilation time.

Representing machine instructions as coding skeletons In 1969 Lowry and Medlock [180] presented one of the first table-driven methods for code generation. In their implementation of the Fortran H Compiler, Lowry and Medlock used a coding skeleton for each machine instruction, which consists of a bit string. The bits represent the restrictions of the machine instructions such as the permitted modes – e.g. load from memory, load from register, do not store, use this or that base register, etc. – for the operands and result, as in the following examples:

    L  B2,D(0,BD)    XXXXXXXX00000000
    LH B2,D(0,B2)    0000111100000000
    LR R1,R2         0000110100001101

These coding skeletons are then matched against the bit strings corresponding to the input program, and an 'X' appearing in the coding skeleton means that it can match any bit. However, the tables in Lowry and Medlock's compiler could only be used for the most basic of machine instructions and had to be written by hand. More extensive approaches were later developed by Tirrell [242] and Donegan [72], but these also suffered from similar disadvantages of making too many assumptions about the target machine which hindered compiler retargetability.
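A small C sketch of the matching step, reusing the coding skeletons shown above. How the operand and result modes were actually encoded into the bit strings is not reconstructed here; the "situation" bit string below is invented purely for illustration.

    #include <stdio.h>
    #include <string.h>

    /* A coding skeleton: an instruction template plus a bit string in
       which 'X' matches any bit and '0'/'1' must match exactly.        */
    struct skeleton { const char *instr; const char *bits; };

    static const struct skeleton skeletons[] = {
        { "L  B2,D(0,BD)", "XXXXXXXX00000000" },
        { "LH B2,D(0,B2)", "0000111100000000" },
        { "LR R1,R2",      "0000110100001101" },
    };

    /* Does a skeleton match the bit string describing the situation
       at hand (operand modes, result location, and so on)?            */
    static int matches(const char *skel, const char *situation) {
        for (size_t i = 0; skel[i] && situation[i]; i++)
            if (skel[i] != 'X' && skel[i] != situation[i])
                return 0;
        return strlen(skel) == strlen(situation);
    }

    int main(void) {
        const char *situation = "0000110100001101";   /* hypothetical */
        for (size_t i = 0; i < sizeof skeletons / sizeof skeletons[0]; i++)
            if (matches(skeletons[i].bits, situation))
                printf("candidate: %s\n", skeletons[i].instr);
        return 0;   /* prints the LR skeleton for this bit string */
    }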

Top-down macro expansion Later Krumme and Ackley [164] proposed a table-driven design which, unlike the earlier approaches, exhaustively enumerates all valid combinations of selectable machine instructions, schedules, and register allocations for a given expression tree. Implemented in a C compiler targeting DEC-10 machines, the method also allows code size to be factored in as an optimization goal, which was an uncommon feature at the time. Krumme and Ackley's backend applies a recursive algorithm that begins with selecting machine instructions for the root node in the expression tree, and then working its way down. In comparison, the bottom-up techniques we have examined so far all start at the leaves and then traverse upwards. We settle with this distinction for now as we will resume and deepen the discussion of bottom-up vs. top-down instruction selection in the next chapter. Needless to say, enumerating all valid combinations in code generation leads to a combinatorial explosion and it is thus impossible to actually produce and check each and every one of them. To curb this immense search space Krumme and Ackley applied a search strategy known as branch-and-bound. The idea behind branch-and-bound is simple: during search, always remember the best solution found so far and then prune away all parts of the search tree which can be proven to yield a worse solution.3 The remaining problem, however, is how to prove that a given branch in the search tree will definitely lead to solutions that are worse than the one we already have (and can thus be skipped). Krumme and Ackley only partially tackled this problem by pruning away branches that for sure will eventually lead to failure (i.e. yield no solution whatsoever). Without going into any details, this is done by using not just one machine instruction table but several – one for each mode. In this context modes are oriented around the result of an expression, for example whether it is to be stored in a register or in memory. By having a table for each mode – which are constructed in a hierarchical manner – the instruction selector can look ahead and detect whether the current set of already-selected machine instructions will lead to a dead end. However, with this as the only method of branch pruning the instruction selector will make many needless revisits in the search space, which does not scale to larger expression trees.
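The following C sketch shows the generic branch-and-bound idea in isolation: enumerate the instruction alternatives for each node, remember the cheapest complete solution found so far, and abandon any branch whose partial cost can no longer beat it. Note that Krumme and Ackley's actual pruning was based on the per-mode tables described above rather than on such a cost bound; the node count, alternatives, and costs below are invented.

    #include <stdio.h>

    /* For each of N expression-tree nodes there are two hypothetical
       instruction alternatives, each with a cost.                      */
    #define N 3
    static const int cost[N][2] = { {1, 3}, {2, 2}, {1, 4} };

    static int best = 1 << 30;        /* best total cost found so far */

    /* Enumerate all combinations of choices, but abandon (prune) any
       branch whose partial cost already matches or exceeds the best
       complete solution found so far.                                  */
    static void search(int node, int partial) {
        if (partial >= best)          /* bound: this branch cannot win */
            return;
        if (node == N) {              /* all nodes covered: new best   */
            best = partial;
            return;
        }
        for (int alt = 0; alt < 2; alt++)
            search(node + 1, partial + cost[node][alt]);
    }

    int main(void) {
        search(0, 0);
        printf("best total cost: %d\n", best);   /* prints 4 */
        return 0;
    }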

2.2.4 Falling out of fashion The said improvements notwithstanding, the main disadvantage of macro-expanding instruction selectors is that they can only handle macros that expand a single AST or IR node at a time and are thus limited to supporting only single-output machine instructions.4 As this has a detrimental effect on code quality for target machines exhibiting more complicated features, such as multi-output instructions, instruction selectors based solely on naïve macro expansion were quickly replaced by newer, more powerful techniques – one of which we will discuss later in this chapter – when these started to appear in the late 1970s.

3In their paper Krumme and Ackley actually called this α-β pruning – which is an entirely different search strategy – but their description of it fits more the branch-and-bound approach (both are well explained in [219]). 4This is a truth with modification: multi-output instructions can be emitted by these instruction selectors, but only one of their output values will be retained in the assembly code.

Rekindled application in the first dynamic code generation systems Having fallen out of fashion, naïve macro-expanding instruction selectors later made a brief reappearance in the first dynamic code generation systems that were developed in the 1980s and 1990s. In such systems the input program is first compiled into byte code which is a kind of target-independent machine code that can be interpreted by an underlying runtime environment. By providing an identical environment on every target machine, the same byte code can be executed on multiple systems without having to be recompiled. Running a program in interpretive mode, however, is typically much slower than executing native machine code. By incorporating a compiler into the runtime environment, the performance loss can be mitigated. Parallel to being executed, the byte code is profiled and frequently executed segments, such as inner loops, are then compiled – at runtime – into native machine code. This notion is called just-in-time compilation – or JITing for short – which allows performance to be increased while retaining the portability of the byte code. If the performance gap between running byte code and running native machine code is great, then the compiler can afford to produce assembly code of low quality in order to decrease the overhead in the runtime environment. This was of great importance for the earliest dynamic runtime systems where hardware resources were typically scarce, which made macro-expanding instruction selection a reasonable option. A few examples include Smalltalk-80 [68], Omniware [1] (a predecessor to Java), Vcode [82], and Gburg [103] which was used within a small virtual machine. As technology progressed, however, dynamic code generation systems also began to transition to more powerful techniques of instruction selection such as tree covering, which we will describe in the next chapter.

2.3 Improving code quality with peephole optimization

An early – and still applied – method of improving the code quality of generated assembly code is to perform a subsequent optimization step that attempts to combine and replace several machine instructions with shorter, more efficient alternatives. These routines are known as peephole optimizers for reasons which will soon become apparent.

2.3.1 Peephole optimization In 1965 McKeeman [188] advocated the use of a simple but often neglected optimization procedure which, as a post-step to code generation, inspects a small sequence of instructions in the assembly code and attempts to combine two or more adjacent instructions into a single instruction. Similar ideas were also suggested by Lowry and Medlock [180] around the same time. By doing this the code size shrinks while also improving performance as using complex machine instructions is often more efficient than implementing the same functionality using several simpler instructions. Because of its narrow window of observation, this technique became known as peephole optimization.

Modeling machine instructions with register transfer lists Since this kind of optimization is tailored for a particular target machine, the earliest implementations were (and still often are) done ad-hoc and by hand. In 1979, however, Fraser [102] presented a first approach that allowed such programs to be generated automatically. It is also described in a longer article by Davidson and Fraser [65]. Like Miller, Fraser developed a technique that allows the semantics of the machine instructions to be described separately in a symbolic machine specification instead of implicitly embedding this knowledge into hand-written routines. This specification describes the observable effects that each machine instruction has on the target machine's registers, which Fraser called register transfers. Each instruction therefore has a corresponding register transfer pattern or register transfer list (RTL) – we will henceforth use the latter term. For example, assume that we have a three-address machine instruction add that executes an arithmetic addition using as input register rs and an immediate value imm, stores the result in register rd, and sets a zero flag Z. Then the corresponding RTL would be expressed as

    RTL(add) = { rd ← rs + imm,  Z ← (rs + imm ⇔ 0) }

This information is then fed to a program called PO (Peephole Optimizer) that produces an optimization routine which makes two passes over the generated assembly code. The first pass runs backwards across the assembly code to determine the observable effects (i.e. the RTL) of each machine instruction in the assembly code. It runs backwards to be able to remove effects that have no impact on the program's observable behavior: for example, the value of a condition flag is unobservable if no subsequent instruction reads it and can thus be ignored. The second pass then checks whether the combined RTLs of two adjacent machine instructions are equal to that of some other machine instruction; in PO this check is done via a series of string comparisons. If such an instruction is found the pair is replaced and the routine backs up one instruction in order to check combinations of the new instruction with the following instruction in the assembly code. This way replacements can be cascaded and thus many machine instructions can be reduced into a single equivalent (provided there exists an appropriate machine instruction for each intermediate step). Pioneering as it was, PO also had several limitations. The main drawbacks were that it only handled combinations of two instructions, and that these had to be lexicographically adjacent in the assembly code and could not cross label boundaries (i.e. belong to separate basic blocks; a description of these is given in Appendix D on page 124). Davidson and Fraser [63] later removed the limitation of lexicographical adjacency by building data flow graphs – which are more or less the same as expression graphs – and operating on those instead of directly on the assembly code, and also extended the size of the instruction window from pairs to triples.
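To give a flavor of the combining pass, the following C sketch assumes that each instruction's effect has already been reduced to a canonical RTL string: the second RTL's use of the register defined by the first is substituted textually, and the result is looked up in the instruction table by string comparison, much as PO does it. The instruction names and RTL syntax are invented, and the sketch ignores the backward pass that must first establish that the first instruction's own result and condition flags are not otherwise observable.

    #include <stdio.h>
    #include <string.h>

    /* The target's instructions, each described by a canonical RTL string. */
    struct instr { const char *name; const char *rtl; };

    static const struct instr isa[] = {
        { "addi r2,r1,4",  "r2<-r1+4"    },
        { "load r5,0(r2)", "r5<-M[r2]"   },
        { "load r5,4(r1)", "r5<-M[r1+4]" },   /* indexed addressing */
    };

    /* Combined effect of two adjacent RTLs, done by the crudest possible
       substitution: if the second RTL mentions the register the first one
       defines, splice in the first right-hand side.                        */
    static void combine(const char *a, const char *b, char *out, size_t n) {
        char def[8] = {0}, rhs[32] = {0};
        sscanf(a, "%7[^<]<-%31s", def, rhs);       /* e.g. def=r2, rhs=r1+4 */
        const char *use = strstr(b, def);
        if (use) {
            size_t pre = (size_t)(use - b);
            snprintf(out, n, "%.*s%s%s", (int)pre, b, rhs, use + strlen(def));
        } else {
            snprintf(out, n, "%s", b);             /* independent effects   */
        }
    }

    int main(void) {
        char merged[64];
        /* RTLs of two adjacent instructions in the generated code. */
        combine("r2<-r1+4", "r5<-M[r2]", merged, sizeof merged);
        /* Is there a single instruction with exactly this combined RTL? */
        for (size_t i = 0; i < sizeof isa / sizeof isa[0]; i++)
            if (strcmp(isa[i].rtl, merged) == 0)
                printf("replace pair with: %s\n", isa[i].name);
        return 0;
    }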

Further developments Much research has been dedicated to improving automated approaches to peephole optimization. In 1983 Giegerich [119] presented a formal approach that eliminates the need for a fixed-size instruction window. Shortly after Kessler [155] proposed a method where RTL

combination and comparison can be precomputed as the compiler is built, thus decreasing compilation time. Kessler [154] later extended his approach to incorporate an n-sized instruction window, similar to that of Giegerich, although at an exponential cost. Another approach was developed by Massalin [187] who implemented a system called the Superoptimizer which accepts small input programs, written in assembly code, and then exhaustively combines sequences of machine instructions to find shorter implementations that exhibit the same behavior as the input program. Granlund and Kenner [128] later adapted Massalin's ideas into a method that minimizes the number of branches used by an input program. Neither approach, however, makes any guarantees on correctness and therefore demands manual inspection of the final result. More recently Joshi et al. [145, 146] have proposed a method based on automatic theorem proving that uses SAT (Boolean satisfiability) solving to find performance-wise optimal implementations of a given input program. However, this technique exhibits exponential worst-case execution time and is therefore best suited for short, performance-crucial code segments such as inner loops.

2.3.2 Combining naïve macro expansion with peephole optimization

In 1984 Davidson and Fraser [63] presented a technique for efficient instruction selection that combines the simplicity of naïve macro expansion with the power of peephole optimization. Similar yet unsuccessful strategies had already been proposed earlier by Auslander and Hopkins [24] and Harrison [133], but Davidson and Fraser struck the right balance between compiler retargetability and code quality which made their approach a reasonable option for production-quality compilers. This scheme has hence become known as the Davidson-Fraser approach and variants of it have been used in several compilers, e.g. YC [64] (Davidson and Fraser's own implementation, which also incorporates common subexpression elimination on the RTL level), the Zephyr/VPO system [14], the Amsterdam Compiler Kit (ACK) [239], and – most famously – the GNU Compiler Collection (GCC) [157, 235].

The Davidson-Fraser approach

In the Davidson-Fraser approach the instruction selector consists of two parts: an expander and a combiner (see Figure 2.6). The task of the expander is to transform the IR code of the input program into register transfers. The transformation is done by executing simple macros that expand every IR node (assuming it is represented as such) into a corresponding RTL that describes the effects of that node. Unlike the previous macro expanders we have studied, these macros do not incorporate register allocation; instead the expander assigns each result to a virtual storage location, called a temporary, of which there is an unlimited supply. A subsequent register allocator then assigns each temporary to a register, potentially inserting

[Figure 2.6 diagram: intermediate representation → expander → register transfers → combiner → assembly code.]

Figure 2.6: Overview of the Davidson-Fraser approach (from [63]).

additional code that saves some values to memory for later retrieval when the number of available registers is insufficient (this is called register spilling). But before this is done the combiner tries to merge multiple RTLs into larger RTLs in an attempt to improve code quality, using the same technique as in PO. After running the combiner and register allocator, a final procedure emits the assembly code by simply replacing each RTL with a corresponding machine instruction. For this to work, however, both the expander and the combiner must at every step of the way adhere to the machine invariant, which dictates that every present RTL must be implementable by at least one machine instruction provided by the target machine.

By using a subsequent peephole optimizer to combine the effects of multiple RTLs, Davidson and Fraser's instruction selector can effectively extend over multiple nodes in the AST or IR tree – potentially across basic block boundaries – and its machine instruction support is in theory only restricted by the number of instructions that the peephole optimizer can compare at a time. For example, opportunities to replace three machine instructions by a single instruction will be missed if the peephole optimizer only checks combinations of pairs. Increasing the window size, however, typically incurs an exponential cost in terms of added complexity, thereby making it difficult to handle complicated machine instructions that require large instruction windows.

An improvement to Davidson and Fraser's work was later made by Fraser and Wendt [101]. In a 1988 paper Fraser and Wendt describe a method where the expander and combiner are effectively fused together into a single entity. The basic idea is to generate the instruction selector in two steps: the first step produces a naïve macro expander that is capable of expanding a single IR node at a time. Unlike Davidson and Fraser – who implemented the expander by hand – Fraser and Wendt applied a clever scheme consisting of a series of switch and goto statements, allowing their expander to be generated automatically from a machine description. Once produced, the macro expander is executed on a carefully designed training set. Using function calls embedded into the instruction selector, a retargetable peephole optimizer is executed in tandem which discovers and gathers statistics on target-specific optimizations that can be done on the generated assembly code. Based on these results the code of the macro expander is then augmented to incorporate beneficial optimization decisions, thus allowing the instruction selector to effectively expand multiple IR nodes at a time, which also removes the need for a separate peephole optimizer in the final compiler. Since the instruction selector only implements the optimization decisions that are deemed to be "useful", Fraser and Wendt argued, code quality is improved with minimal overhead. Wendt [258] later improved the technique by providing a more powerful machine description format – also based on RTLs – which subsequently evolved into a compact stand-alone language used for implementing code generators (see Fraser [100]).
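As a rough illustration of the expander's side of this scheme, the sketch below expands a small invented IR into RTL-like strings, assigning every intermediate result to a fresh temporary and leaving register allocation and combination to later passes. The IR encoding, the RTL syntax, and the function names are all hypothetical.

    _counter = 0
    def new_temp():
        """Virtual storage locations are unlimited; real registers come later."""
        global _counter
        _counter += 1
        return f"t{_counter}"

    def expand(node):
        """Expand an IR expression, given as a nested tuple, into RTLs.
        Returns (temporary holding the result, list of RTLs)."""
        kind = node[0]
        if kind == "const":
            t = new_temp()
            return t, [f"{t} <- #{node[1]}"]
        if kind == "var":
            t = new_temp()
            return t, [f"{t} <- mem[{node[1]}]"]
        if kind in ("add", "mul"):
            a, code_a = expand(node[1])
            b, code_b = expand(node[2])
            t = new_temp()
            op = "+" if kind == "add" else "*"
            return t, code_a + code_b + [f"{t} <- {a} {op} {b}"]
        raise ValueError(f"unknown IR node: {kind}")

    result, rtls = expand(("add", ("var", "x"), ("mul", ("var", "y"), ("const", 4))))
    for rtl in rtls:
        print(rtl)
    # t1 <- mem[x]
    # t2 <- mem[y]
    # t3 <- #4
    # t4 <- t2 * t3
    # t5 <- t1 + t4

The combiner would then try to merge adjacent RTLs, for instance folding the multiplication by a constant into an addressing computation, as long as every merged RTL still corresponds to some machine instruction.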

Enforcing the machine invariant with a recognizer

The Davidson-Fraser approach was recently extended by Dias and Ramsey [70]. Instead of requiring each separate RTL-oriented optimization routine to abide by the machine invariant, Dias and Ramsey's design employs a recognizer to determine whether an optimization decision violates the restriction (see Figure 2.7). This simplifies the optimization routines as they no

[Figure 2.7 diagram: as in Figure 2.6, an expander turns the intermediate representation into register transfers and optimization routines turn these into assembly code, but every proposed change is first checked by a recognizer.]

Figure 2.7: Overview of Dias and Ramsey’s extension of the Davidson-Fraser approach (from [70]).

default attribute add(rd, rs1, rs2) is $r[rd] := $r[rs1] + $r[rs2]

Figure 2.8: A PowerPC add instruction specified using λ-RTL (from [69]).

longer need to internalize the invariant, potentially allowing some of them to be generated automatically. In a 2006 paper Dias and Ramsey demonstrate how the recognizer can be produced from a declarative machine description written in λ-RTL. Originally developed by Ramsey and Davidson [212], λ-RTL is a high-level functional language based on ML that raises the level of abstraction for writing RTLs (see Figure 2.8 for an example). According to Dias and Ramsey, λ-RTL-based machine descriptions are more concise and simpler to write compared to those of many other designs, including GCC. In particular, λ-RTL is precise and unambiguous, which makes it suitable for automated tool generation and verification, the latter of which has been explored by Fernández and Ramsey [95] and Bailey and Davidson [25]. First the λ-RTL description is read and its RTLs are transformed into tree patterns, which are the same as expression trees but model machine instructions instead of the input program. Using these tree patterns, the task of checking whether an RTL fulfills the machine invariant is then reduced to a tree covering problem. Since the entire next chapter is devoted to discussing this problem, we let it be sufficient for now to say that each RTL is transformed into an expression tree, and then the tree patterns are matched on top of it. If the expression tree cannot be completely covered by any subset of patterns, then the RTL does not fulfill the machine invariant. Although this technique limits the recognizer to only reason about RTLs on a syntactic level, Dias and Ramsey argued that the need for efficiency rules out a more expensive semantic analysis (remember that the recognizer needs to be consulted for every optimization decision taken).
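The sketch below captures the essence of such a recognizer under invented data structures: the RTL proposed by an optimization routine is viewed as an expression tree, and it satisfies the machine invariant only if the instruction tree patterns (here hard-coded rather than derived from a λ-RTL description) can completely cover it.

    # Trees and patterns are nested tuples; the pattern leaf "reg" matches any
    # subtree, which must then itself be coverable (e.g. by an earlier instruction).
    PATTERNS = [
        ("add", "reg", "reg"),       # add rd,rs1,rs2
        ("add", "reg", ("const",)),  # add rd,rs,imm
        ("load", ("const",)),        # load rd,addr
        ("const",),                  # move-immediate
    ]

    def match(tree, pattern):
        """Return the subtrees left to cover, or None if the pattern does not match."""
        if pattern == "reg":
            return [tree]
        if isinstance(tree, str) or tree[0] != pattern[0] or len(tree) != len(pattern):
            return None
        pending = []
        for sub, pat in zip(tree[1:], pattern[1:]):
            m = match(sub, pat)
            if m is None:
                return None
            pending += m
        return pending

    def satisfies_invariant(tree):
        if isinstance(tree, str):   # operand already residing in a register
            return True
        for pattern in PATTERNS:
            pending = match(tree, pattern)
            if pending is not None and all(satisfies_invariant(sub) for sub in pending):
                return True
        return False

    # Proposed RTL: rd <- (x + const) + load(const)
    print(satisfies_invariant(("add", ("add", "x", ("const",)), ("load", ("const",)))))  # True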

"One program to expand them all"

Shortly afterwards Dias and Ramsey [69, 213] proposed a scheme where the macro expander only needs to be implemented once for every distinct architecture family – e.g. register-based machines, stack-based machines, etc. – instead of once for every distinct instruction set – e.g. PowerPC, Sparc, etc. In other words, if two target machines belong to the same architecture family then the same expander can be reused despite the differing details of the ISAs. This is useful because its correctness only needs to be proven once, which is a difficult and time-consuming process if the expander has to be written by hand.

The idea is to have a predefined set of tiles that are specific to a particular architecture family. A tile represents a simple operation which is required for any target machine belonging to that architecture family. For example, stack-based machines require tiles for push and pop operations, which are not necessary on register-based machines. Then, instead of expanding each IR node in the input program into a sequence of RTLs, the expander expands it into a sequence of tiles. Since the set of tiles is identical for all target machines within the same architecture family, the expander only needs to be implemented once. After the tiles have been emitted, they are replaced by the machine instructions implementing each tile, and the resulting assembly code can then be improved by the combiner.

A remaining problem is how to find machine instructions that implement the tiles for a particular target machine. In the same papers Dias and Ramsey provide a method for doing this automatically. By describing both the tiles and the machine instructions in λ-RTL, Dias and Ramsey developed a technique where the RTLs of the machine instructions are combined such that the effects equal those of a tile. In broad outline the algorithm maintains a pool of RTLs which is initialized with those found in the machine description, and then iteratively grows it using sequencing (i.e. combining several RTLs into a new RTL) and algebraic laws. This process stops when either all tiles have been implemented or some termination criterion is reached; the latter is necessary as Dias and Ramsey proved that the general problem of finding implementations for arbitrary tiles is undecidable (i.e. there can be no algorithm which is guaranteed to terminate and also find an implementation for all possible inputs).

Although its primary aim is to facilitate compiler retargetability, Dias and Ramsey demonstrated that a prototype of their design yielded better code quality compared to the default Davidson-Fraser instruction selector of GCC. However, this required that the target-independent optimizations were disabled; when they were reactivated GCC produced better results.
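The search for tile implementations can be pictured with the following toy sketch, in which instruction effects and tiles are modeled as functions on a register state and compared on a few probe states; the instruction set, the tiles, and the bound on the sequence length are all invented, and the real algorithm works on λ-RTL terms and also applies algebraic laws.

    import itertools

    INSTRUCTIONS = {                       # machine instructions and their effects
        "xor r0,r0,r0": lambda s: {**s, "r0": 0},
        "sub r0,r1,r2": lambda s: {**s, "r0": s["r1"] - s["r2"]},
        "add r0,r0,#1": lambda s: {**s, "r0": s["r0"] + 1},
    }

    TILES = {                              # simple operations the expander emits
        "load-zero": lambda s: {**s, "r0": 0},
        "load-one":  lambda s: {**s, "r0": 1},
    }

    PROBES = [{"r0": 7, "r1": 3, "r2": 5}, {"r0": -2, "r1": 0, "r2": 9}]

    def run(seq, state):
        for name in seq:
            state = INSTRUCTIONS[name](state)
        return state

    def implement_tiles(max_len=2):
        """Grow a pool of instruction sequences until every tile has an
        implementation or the (necessary) termination bound is reached."""
        found = {}
        for length in range(1, max_len + 1):
            for seq in itertools.product(INSTRUCTIONS, repeat=length):
                for tile, wanted in TILES.items():
                    if tile not in found and all(
                            run(seq, dict(p)) == wanted(dict(p)) for p in PROBES):
                        found[tile] = seq
            if len(found) == len(TILES):
                break
        return found

    print(implement_tiles())
    # {'load-zero': ('xor r0,r0,r0',), 'load-one': ('xor r0,r0,r0', 'add r0,r0,#1')}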

2.3.3 Running peephole optimization before instruction selection

In the approaches just discussed, optimization routines for improving the quality of the generated assembly code are executed after code generation. In an approach by Genin et al. [118], however, a variant of this runs before code generation. Targeting digital signal processors, their compiler first transforms the input program into an internal signal flow graph (ISFG), and then executes a pattern matcher prior to instruction selection which attempts to merge several low-level operations in the ISFG into single nodes conveying higher-level semantic content (e.g. a multiplication followed by an addition within a loop can be combined into a product-accumulation construct).5 Code generation is then done node by node: for each node the instruction selector invokes a rule along with information about the current context. The invoked rule produces the assembly code appropriate for the given context, and can also insert new nodes to offload decisions that are deemed to be better handled by the rules corresponding to the inserted nodes. According to the authors, experiments showed that their compiler generated code that was 5 to 50 times faster than that of conventional compilers at the time, and comparable with

5The paper is not clear on how this is done, but presumably the authors implemented a hand-written peephole optimizer as the intermediate format is fixed and does not change from one target machine to another.

hand-written assembly code. However, the design is limited to input programs where prior knowledge about the application – in this case digital signal processing – can be encoded into specific optimization routines, which most likely has to be done manually.

2.4 Summary

In this chapter we have discussed the first techniques to address instruction selection. These were based, in one way or another, on a principle known as macro expansion, where AST or IR nodes are expanded into one or more target-specific machine instructions. Using template matching and macro invocation, these types of instruction selectors are resource-efficient and straight-forward to implement. However, naïve macro expansion assumes a 1-to-1 or 1-to-n mapping between nodes in the IR (or abstract syntax) tree and the machine instructions provided by the target machine. If there exist machine instructions that can be characterized as n-to-1 mappings, such as multi-output instructions, then the code quality yielded by these techniques will typically be low. As machine instructions are often emitted in isolation from one another, it also becomes difficult to use machine instructions that can have unintended effects on other machine instructions (like the interdependent instructions described in the previous chapter).

A remedy is to add a peephole optimizer to the code generator backend. This allows multiple machine instructions in the assembly code to be combined and replaced by more efficient alternatives, thereby amending some of the poor decisions made by the instruction selector. When incorporated into the instruction selector – an approach known as the Davidson-Fraser approach – this also enables more complex machine instructions like multi-output instructions to be expressed as part of the machine description. Because of this versatility the Davidson-Fraser approach remains one of the most powerful approaches to instruction selection to date, and a variant is still applied in GCC.

In the next chapter we will explore another principle of instruction selection that more directly addresses the problem of implementing several AST or IR nodes using a single machine instruction.

3 Tree Covering

As we saw in the previous chapter, the main limitation of most instruction selectors based on macro expansion is that the scope of expansion is limited to a single AST or IR node in the expression tree. This effectively excludes exploitation of many machine instructions and results in low code quality. Another problem is that macro-expanding instruction selectors typically combine pattern matching and pattern selection into a single step, thus making it impossible to consider combinations of machine instructions and then pick the combination that yields the most "efficient" assembly code. These problems can be solved by employing another principle of instruction selection called tree covering. This is also the principle behind most instruction selection techniques found in the literature, making this the longest chapter in the report.

3.1 The principle

Let us assume that the input program is represented as a set of expression trees, with which we are already familiar from the previous chapter (see page 10). Let us further assume that each machine instruction can be modeled similarly as another expression tree, and to distinguish between the expression trees formed from the input program and those formed from the machine instructions we will refer to the latter as tree patterns, or simply patterns. The task of instruction selection can then be reduced to a problem of finding a set of tree patterns such that every node in the expression tree is covered by at least one pattern. Here we see clearly how pattern matching and pattern selection constitute two separate problems in instruction selection: in the former we need to find which tree patterns are applicable for a given expression tree; and in the latter we then select from these patterns a subset that results in a valid cover of the expression tree. For most target machines there will be a tremendous amount of overlap – i.e. one tree pattern may match, either partially or fully, the nodes matched by another pattern in the expression tree – and we typically want to get away with using as few tree patterns as possible. This is for two reasons:

◾ Striving for the smallest number of patterns means opting for larger tree patterns over smaller ones. This in turn leads to the use of more complex machine instructions, which typically results in improved code quality.
◾ The amount of overlap between the selected patterns is limited, meaning that the same operations in the input program will be re-implemented in multiple machine instructions only when necessary. Keeping redundant work to a minimum is another crucial factor for performance as well as for reducing code size.

As a generalization, the optimal solution is typically not defined as the one that minimizes the number of selected patterns, but as the one that minimizes the total cost of the selected patterns. This allows the pattern costs to be chosen to fit the desired optimization criteria, but there is usually a strong correlation between the number of patterns and the total cost. However, as stressed in Chapter 1, what is considered an optimal solution for instruction selection need not necessarily be an optimal solution for the final assembly code. Moreover, we can only take into account the machine instructions for which there exists a suitable tree pattern. Finding the optimal solution to a tree covering problem is not a trivial task, and it becomes even less so if only certain combinations of patterns are allowed. To be sure, most would be hard-pressed just to come up with an efficient method that finds all valid tree pattern matchings. We therefore begin by exploring the first approaches that address the pattern matching problem, but not necessarily the pattern selection problem, and then gradually transition to those that do.
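To make the distinction between the two subproblems concrete, the following small sketch assumes that pattern matching has already produced, for an invented expression tree, a list of matches that each record the covered nodes, a cost, and an instruction; pattern selection then amounts to picking a cheapest subset of matches that covers every node, here by brute force.

    from itertools import combinations

    # Expression tree nodes (by id) for  a + (b * 4):   0:+  1:a  2:*  3:b  4:4
    NODES = {0, 1, 2, 3, 4}

    # Matches found by the pattern matcher: (covered nodes, cost, instruction).
    MATCHES = [
        ({1}, 1, "load a"), ({3}, 1, "load b"), ({4}, 1, "move #4"),
        ({2}, 1, "mul"), ({0}, 1, "add"),
        ({2, 4}, 1, "shift left by 2"),        # b * 4 implemented as a shift
        ({0, 2, 4}, 1, "multiply-accumulate"), # a + b * 4 in one instruction
    ]

    def cheapest_cover(nodes, matches):
        best = None
        for k in range(1, len(matches) + 1):
            for subset in combinations(matches, k):
                covered = set().union(*(m[0] for m in subset))
                cost = sum(m[1] for m in subset)
                if covered >= nodes and (best is None or cost < best[0]):
                    best = (cost, [m[2] for m in subset])
        return best

    print(cheapest_cover(NODES, MATCHES))
    # (3, ['load a', 'load b', 'multiply-accumulate'])

Selecting the large multiply-accumulate pattern both lowers the total cost and avoids overlapping work, which is exactly the effect described above; real instruction selectors of course replace the exponential enumeration with far more efficient selection schemes.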

3.2 First techniques to use tree pattern matching

In 1972 and 1973 Wasilew [255] and Weingart [256] published the first papers known to describe code generation techniques using tree pattern matching (due to lack of first-hand information we will only discuss the latter of the two in this report). According to [45, 116], Weingart's design uses a single pattern tree – Weingart called this a discrimination net – which is automatically derived from a declarative machine description. Using a single tree, Weingart argued, allows for a compact and efficient means of representation. The process of building the AST is then extended to simultaneously push each new AST node onto a stack. As a new node appears, the pattern tree is progressively traversed by comparing the nodes on the stack against the children of the current node in the pattern tree. A pattern match is found when the process reaches a leaf node in the pattern tree, whereupon the machine instruction associated with the match is emitted. Compared to the early macro-expanding instruction selectors (i.e. those prior to Davidson-Fraser), Weingart's approach extended the machine instruction support to include instructions that extend over multiple AST nodes. However, the technique suffered several problems when applied in practice. First, structuring the pattern tree to allow for efficient pattern matching proved difficult for certain target machines; it is known that Weingart struggled in particular with the PDP-11. Second, it assumes that there exists at least one machine instruction on the target machine that corresponds to a particular node type of the AST, and for some programming language and target machine combinations this turned out not to be the case. Weingart partly

addressed this problem by introducing additional patterns, called conversion patterns, that enable mismatched parts of the AST to be transformed into a form that hopefully will be matched by some pattern, but this had to be done manually and could cause the compiler to get stuck in an infinite loop. Third, like its macro-expanding predecessors the process immediately selects a pattern as soon as a match is found.

Another early pattern matching technique was developed by Johnson [143], who implemented it in PCC (the Portable C Compiler), a renowned system that was the first standard C compiler to be shipped with Unix. Johnson based PCC on the earlier work by Snyder [233] – which we discussed in Section 2.2.2 in the previous chapter – but replaced the naïve macro-expanding instruction selector with a technique based on tree rewriting. For each machine instruction an expression tree is formed together with a rewrite rule, subgoals, resource requirements, and an assembly string which is emitted verbatim. This information is given in a machine description format that, as seen in Figure 3.1, allows multiple, similar patterns to be condensed into a single declaration. The pattern matching process is then relatively straight-forward: for a given node in the expression tree, the node is compared against the root node of each pattern. If these match, a similar check is done for each corresponding subtree in the pattern. Once all leaf nodes in the pattern are reached, a match has been found. As this algorithm – whose pseudo-code is given in Figure 3.2 – exhibits quadratic execution time, it is desirable to minimize the number of redundant checks. This is done by maintaining a set of code generation goals which are encoded into the instruction selector as an integer. For historical reasons this integer is called a cookie, and each pattern has a corresponding cookie indicating in which situations the pattern may be useful. If both the cookies and the pattern match, an attempt is made to allocate whatever resources are demanded by the pattern (e.g. a pattern may require a

ASG PLUS, INAREG,
    SAREG, TINT,
    SNAME, TINT,
    0, RLEFT,
    "    add AL,AR\n",
...
ASG OPSIM, INAREG|FORCC,
    SAREG, TINT|TUNSIGNED|TPOINT,
    SAREG|SNAME|SOREG|SCON, TINT|TUNSIGNED|TPOINT,
    0, RLEFT|RESCC,
    "    OI AL,AR\n",

Figure 3.1: A sample of a machine description for PCC consisting of two patterns (from [144]). The first line specifies the node type of the root (+=, for the first pattern) together with the cookie ("result must appear in an A-type register"). The second and third lines specify the left and right descendants, respectively, of the root node. For the first pattern the left subtree must be an int allocated in an A-type register, and the right subtree must be a NAME node, also of type int. The fourth line indicates that no registers or temporaries are required and that the matched part in the expression tree is to be replaced by the left descendant of the pattern root node. The fifth and last line declares the assembly string, where lowercase letters are output verbatim and uppercase words indicate a macro invocation – AL stands for "Address form of Left operand"; likewise for AR – whose result is then put into the assembly string. In the second pattern we see that multiple restrictions can be OR'ed together, thus allowing multiple patterns to be expressed in a more concise manner.

FindMatchset(expression tree rooted at node n, set of tree patterns P):
1: initialize matchset as empty
2: for each p in P do
3:   if Matches(n, p) then
4:     add p to matchset
5:   end if
6: end for
7: return matchset

Matches(expression tree rooted at node n, tree pattern rooted at node p):
1: if n does not match p or the numbers of children of n and p differ then
2:   return false
3: end if
4: for each child n′ of n and corresponding child p′ of p do
5:   if not Matches(n′, p′) then
6:     return false
7:   end if
8: end for
9: return true

Figure 3.2: Straight-forward pattern matching with O(nm) time complexity, where n is the number of nodes in the expression tree and m is the total number of nodes in the tree patterns.

certain number of registers). If successful, the corresponding assembly string is emitted, and the matched subtree in the expression tree is replaced by a single node as specified by the rewrite rule. This process of matching and rewriting repeats until the expression tree consists of only a single node, meaning that the entire expression tree has been successfully converted into assembly code. If no pattern matches, however, the instruction selector enters a heuristic mode where the expression tree is partially rewritten until a match is found. For example, to match an a = reg + b pattern, an a += b expression could first be rewritten into a = a + b and then another rule could try to force operand a into a register.

Although successful for its time, PCC had several disadvantages. Like Weingart's, Johnson's technique uses heuristic rewrite rules to handle mismatching situations. Lacking formal methods of verification, there was hence always the risk that the current set of rules would be inadequate and potentially cause the compiler to never terminate for certain input programs. Reiser [215] also noted that the investigated version of PCC only supported unary and binary patterns with a maximum height of 1, thus excluding many machine instructions such as those with complex addressing modes. Lastly, PCC – and all other approaches discussed so far – still adhered to the first-matched-first-served principle when selecting patterns.
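For reference, here is a direct Python rendering of the matcher from Figure 3.2, with the expression tree and the tree patterns encoded as nested tuples of my own invention and a pattern leaf reg standing for any subtree; trying every pattern at every node is what gives the quadratic O(nm) behavior.

    def matches(node, pat):
        if pat == "reg":
            return True
        if node[0] != pat[0] or len(node) != len(pat):
            return False
        return all(matches(n, p) for n, p in zip(node[1:], pat[1:]))

    def find_matchsets(tree, patterns):
        """Collect, for every node of the expression tree, the patterns that match there."""
        result = [(tree, [name for name, pat in patterns if matches(tree, pat)])]
        for child in tree[1:]:
            result += find_matchsets(child, patterns)
        return result

    patterns = [("add-regs", ("+", "reg", "reg")),
                ("mul-regs", ("*", "reg", "reg")),
                ("load-int", ("Int",))]

    expr = ("+", ("Int",), ("*", ("Int",), ("Int",)))
    for node, names in find_matchsets(expr, patterns):
        print(node[0], "->", names)
    # +   -> ['add-regs']
    # Int -> ['load-int']
    # *   -> ['mul-regs']   (and so on for the remaining Int leaves)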

3.3 Covering trees bottom-up with LR parsing

As already noted, a common flaw among the first approaches is that they (i) apply the greediest form of pattern selection, and (ii) typically lack a formal methodology. In contrast, syntax analysis – which is the task of parsing the source code of the input program – is arguably the best understood area of compilation, and its methods also produce completely table-driven parsers that are very fast and resource-efficient.

3.3.1 The Glanville-Graham approach

In 1978 Glanville and Graham [121] presented a seminal paper that describes how techniques of syntax analysis can be adapted to instruction selection.1 Subsequent experiments and evaluations showed that this design – which we will refer to as the Glanville-Graham approach – proved simpler and more general than contemporary approaches [9, 116, 126, 127]. Moreover, due to its table-driven nature, assembly code could be generated very rapidly (although the first implementations performed on par with other instruction selectors used at the time). Consequently the Glanville-Graham approach has been acknowledged as one of the most significant breakthroughs in this field, and these ideas have influenced many later techniques in one way or another.

Expressing machine instructions as a grammar

To begin with, a well-known method of removing the need for parentheses in arithmetic expressions without making them ambiguous is to use Polish prefix notation. For example, 1 + (2 + 3) can be written as + 1 + 2 3 and still mean the same thing. Glanville and Graham recognized that by using this form the machine instructions can be expressed as a context-free grammar based on BNF (Backus-Naur Form). This concept is already well-described in most compiler textbooks – I recommend [8] – so we will only proceed with a brief introduction.

A context-free grammar consists of a set of terminals and nonterminals, which we collectively refer to as symbols. We will distinguish between the two by always writing terminals with an initial capital letter (e.g. Term), and nonterminals entirely in lower case (e.g. nt). For each nonterminal there exists one or more productions of the following form:

lhs → Right hand Side ...

A production basically specifies how a particular combination of symbols (the right-hand side) can be replaced by another nonterminal (the left-hand side). Since nonterminals can appear on both sides in a production most grammars allow for infinite chains of replacements, which is one of the powerful features of context-free grammars. In terms of recursion, one can also think of nonterminals as inducing the recursive case whereas the terminals provide the base case that stops the recursion. Productions are also called production rules or just rules, and although they can typically be interchanged without causing confusion we will be consistent in this report and only use the first term as rules will hold a slightly different meaning. To model a set of machine instructions as a context-free grammar, one would add one or more rules for each instruction. We call this collection of rules the instruction set grammar. Each rule contains a production, cost, and an action. The right-hand side of the production represents the tree pattern to be matched over an expression tree, and the left-hand side contains the nonterminal indicating the characteristics of the result of executing the machine instruction (e.g. a specific register class). The cost should be self-explanatory at this point, and the action would typically be to emit a string of assembly code. We illustrate the anatomy of a rule more succinctly with an annotated example:

1This had also been vaguely hinted at ten years earlier in an article by Feldman and Gries [93].

reg → + reg1 reg2        4        EMIT “add r1,r2”

In this annotated example, reg is the result and + reg1 reg2 is the tree pattern, which together form the production; 4 is the cost and EMIT “add r1,r2” is the action; and the whole tuple constitutes the rule. In most literature, rules and patterns usually hold the same connotations. In this report, however, in the context of grammars a rule refers to a tuple of production, cost, and action, and a pattern refers to the right-hand side of the production appearing in a rule.

Tree parsing

The instruction set grammar provides us with a formal methodology for modeling machine instructions, but it does not address the problems of pattern matching and pattern selection. For that, Glanville and Graham applied an already-known technique called LR parsing (Left-to-right, Rightmost derivation). Because this technique is mostly associated with syntax parsing, the same application on trees is commonly referred to as tree parsing. We proceed with an example. Let us assume that we have the following instruction set grammar

PRODUCTION COST ACTION

1  reg → + reg1 reg2    1    EMIT “add r1,r2”
2  reg → ∗ reg1 reg2    1    EMIT “mul r1,r2”
3  reg → Int            1    EMIT “mv r,I”

and that we want to generate assembly code for the following expression tree

      +
     / \
    c   ∗
       / \
      a   b

such that the result of the expression ends up in a register. If a, b, and c are all integers, then we can assume that each node in the expression tree is of type Int, ∗, or +; these will be our terminals. After transforming the expression trees into sequences of terminals, we traverse each from left to right. In doing so we either shift the just-traversed symbol onto a stack, or replace symbols currently on the stack via a rule reduction. A reduce operation consists of two steps: first the symbols are popped according to those that appear on the right-hand side of the rule. The number and order must match exactly for the rule reduction to be valid. Once popped, the nonterminal appearing on the left-hand side is then pushed onto the stack, and the assembly code associated with the rule, if any, is also emitted. For a given input the performed rule reductions can also be represented as a parse tree, illustrating the terminals and nonterminals which were used to parse the tree. Now, turning back to our example, if we denote a shift by s and a reduce by rx, where x is the number of the reduced rule, then a valid tree parsing of the expression tree “+ Int(c) * Int(a) Int(b)” could be:

sssr3 sr3 r2 sr3 r1

Verifying this is left as an exercise to the reader. For this particular tree parsing, the corresponding parse tree and produced assembly code are shown below:

[Parse tree: the root reg is derived by rule 1 over two reg subtrees, one covering Int(c) by rule 3 and the other derived by rule 2 over the reg subtrees covering Int(a) and Int(b).] The produced assembly code is:

1  mv ra,a
2  mv rb,b
3  mul ra,rb
4  mv rc,c
5  add ra,rc

The problem that remains is how to know when to shift and when to reduce. This can be decided by consulting a state table which has been generated for a specific grammar. How this table is produced is out of scope for this report, but an example generated from the instruction set grammar appearing in Figure 3.3 is given in Figure 3.4. Figure 3.5 then provides a walk-through of executing an instruction selector with this state table. The subscripts that appear in some of the productions in Figure 3.3 are semantic qualifiers, which are used to express semantic restrictions that may appear in the machine instructions. For example, all two-address arithmetic machine instructions store the result in one of the registers provided as input, and using semantic qualifiers this could be expressed as r1 → + r1 r2, indicating that the destination register must be the same as that of the first operand. To make this information available during parsing, the parser pushes it onto the stack along with its corresponding terminal or nonterminal symbol. Glanville and Graham also incorporated a register allocator into their parser, thus embodying an entire code generation backend.
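The following Python sketch mimics the shift/reduce mechanics for the three-rule grammar above. It is not Glanville and Graham's table-driven parser: instead of consulting a generated state table it simply reduces greedily whenever the top of the symbol stack matches the right-hand side of some rule, so the produced trace is merely one valid tree parsing.

    RULES = [  # (number, left-hand side, right-hand side)
        (1, "reg", ("+", "reg", "reg")),   # EMIT "add r1,r2"
        (2, "reg", ("*", "reg", "reg")),   # EMIT "mul r1,r2"
        (3, "reg", ("Int",)),              # EMIT "mv r,I"
    ]

    def parse(tokens):
        """Shift-reduce cover of an expression tree given in Polish prefix form."""
        stack, trace = [], []
        tokens = list(tokens)
        while tokens or len(stack) > 1:
            for number, lhs, rhs in RULES:
                if tuple(stack[-len(rhs):]) == rhs:
                    del stack[-len(rhs):]            # reduce: pop the pattern ...
                    stack.append(lhs)                # ... and push the nonterminal
                    trace.append(f"r{number}")
                    break
            else:
                if not tokens:
                    raise ValueError("expression tree cannot be covered")
                stack.append(tokens.pop(0))          # shift
                trace.append("s")
        return trace

    print(parse(["+", "Int", "*", "Int", "Int"]))
    # ['s', 's', 'r3', 's', 's', 'r3', 's', 'r3', 'r2', 'r1']

A real Glanville-Graham parser resolves the shift-or-reduce choice from the precomputed state table, which lets it postpone reductions and thereby favor larger patterns.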

Resolving conflicts and avoiding blocking

An instruction set grammar is typically ambiguous, meaning that multiple valid tree parsings may exist for the same expression tree, which may lead to situations where the parser has the option of either performing a shift or a reduce. This is known as a shift-reduce conflict and appears as a consequence of having non-orthogonal ISAs. To resolve this kind of conflict, Glanville and Graham's state table generator always selects the shift. The intuition is that this will favor larger patterns over smaller ones, as a shift postpones a pattern selection decision whilst allowing more information about the expression tree to accumulate onto the stack.2 Likewise there is also the possibility of a reduce-reduce conflict, where the parser has an option of choosing between two or more rules in a reduction. Glanville and Graham resolved these by selecting the rule with the longest right-hand side. If the grammar contains rules that differ only in their semantic qualifiers then there may still exist more than one rule to reduce (in Figure 3.3 rules 5 and 6 are two such rules). These are resolved by checking the semantic restrictions at parse time in the order the rules appear in the grammar (see for example

2The approach of always selecting the largest possible pattern is a scheme commonly known as maximum munching, or just maximum munch, which was coined by Cattell in his PhD dissertation [46].

    PRODUCTION                  ACTION
 1  r2 → + Ld + C r1 r2         add r2,C,r1
 2  r1 → + r1 Ld + C r2         add r1,C,r2
 3  r  → + Ld C r               add r,C
 4  r  → + r Ld C               add r,C
 5  r1 → + r1 r2                add r1,r2
 6  r2 → + r1 r2                add r2,r1
 7     → = Ld + C r1 r2         store r2,*C,r1
 8     → = + C r1 r2            store r2,C,r1
 9     → = Ld C r               store r,*C
10     → = C r                  store r,C
11     → = r1 r2                store r2,r1
12  r2 → Ld + C r1              load r2,C,r1
13  r2 → + C r1                 load r2,=C,r1
14  r2 → + r1 C                 load r2,=C,r1
15  r2 → Ld r1                  load r2,*r1
16  r  → Ld C                   load r,C
17  r  → C                      mv r,C

Figure 3.3: An example of an instruction set grammar (from [121]). All rules have the same unit cost. C, Ld, +, and = are all terminals (C stands for “const” and Ld for “load”), r is a nonterminal indicating that the result will be stored in a register, and subscripts denote the semantic qualifiers.

[Figure 3.4 consists of the generated state table, with one row per state (42 states) and one column per lookahead symbol $, r, C, +, Ld, and =.]

Figure 3.4: State table generated from the instruction set grammar given in Figure 3.3 (from [121]). sX indicates a shift to the next state X, rX indicates the reduction of rule X, and a blank entry indicates an error state.

[Figure 3.5 table, condensed; each step of the original lists the state stack, the symbol stack, the remaining input, and the action taken:]

Steps 1-10: the symbols of the input “= + Ca r7 + Ld + Cb Ld r7 Ld Cc $” are shifted one by one, moving through states 1, 4, 12, 23, 8, 13, 25, 34, 9, and 14.
Step 11: reduce rule 15 (r2 → Ld r1), assign the result to r8, emit “load r8,*r7”, and shift to state 38.
Steps 12-13: shift Ld and Cc, moving through states 9 and 18.
Step 14: reduce rule 16 (r → Ld C), assign the result to r9, emit “load r9,C”, and shift to state 41.
Step 15: reduce rule 1 (r2 → + Ld + C r1 r2), emit “add r9,B,r8”, and shift to state 32.
Step 16: reduce rule 8 (→ = + C r1 r2), emit “store r9,A,r7”.
Step 17: accept.

Figure 3.5: A walk-through of executing Glanville and Graham's parser on an expression a = b + c, where a, b and c are variables residing in memory, which yielded the IR code “= + Ca r7 + Ld + Cb Ld r7 Ld Cc”, where r7 is the base address register (from [121]). The execution is based on the precomputed table from Figure 3.4. The reduction steps may need some explanation. A rule reduction may involve two operations: the mandatory reduce operation, followed by an optional operation which may be a shift or another rule reduction. Let us examine step 11. First a reduce is executed using rule 15, which pops Ld r7 from the symbol stack. This is followed by pushing the result of the rule, r8, on top. At the same time, 9 and 14 are popped from the state stack, which leaves state 34 on top. By now consulting the table using the top elements of both stacks, the next, additional action (if any) can be inferred. In this case, input symbol r8 at state 34 leads to a shift to state 38.

state 20 in Figure 3.4). However, if all rules in this set are semantically constrained then there may be situations where the parser is unable to choose any rule. This is called semantic blocking and must be resolved by always providing a default rule which can be invoked when all others fail. This fallback rule typically uses multiple, shorter machine instructions to simulate the effect of the more complex rule, and Glanville and Graham devised a clever trick to infer them automatically: for every semantically constrained rule, a tree parsing is performed over the expression tree which represents the right-hand side of that rule, and the machine instructions selected to implement this tree then constitute the fallback rule.

Advantages

By relying on a state table, Glanville-Graham style instruction selectors are completely table-driven and implemented by a core that basically consists of a series of table lookups.3 As a consequence, the time it takes for the instruction selector to produce the assembly code is linearly proportional to the size of the expression tree. Although the idea of table-driven code generation was not novel in itself – we have seen a number of these in the previous chapter – earlier attempts had all failed to provide an automated procedure for producing the tables. In addition, many decisions regarding pattern selection are precomputed by resolving shift-reduce and reduce-reduce conflicts at the time that the state table is generated, which leads to faster compilation.

Another advantage of the Glanville-Graham approach is its formal foundation, which enables means of automatic verification. For instance, Emmelmann [80] presented one of the first approaches to prove the completeness of an instruction set grammar. The intuition of Emmelmann's automatic prover is to find all expression trees that can appear in an input program but cannot be handled by the instruction selector. Let us denote the instruction set grammar as G and the grammar which describes the expression trees as T. If we further use L(x) to represent the set of all trees accepted by a grammar x, we can then determine whether the instruction set grammar is incomplete by checking if L(T) ∖ L(G) yields a non-empty set. Emmelmann recognized that this intersection can be computed by creating a product automaton which essentially implements the language that accepts only the trees in this set of counterexamples. Brandner [34] recently extended this method to handle productions that contain dynamic checks called predicates – which we will discuss shortly when exploring attribute grammars – by splitting terminals to expose these otherwise-hidden characteristics.

Disadvantages

Although it addressed several of the problems with contemporary instruction selection techniques, the Glanville-Graham approach had disadvantages of its own. First, since the parser can only reason about syntax, any restriction regarding specific values or ranges must be given its own nonterminal. In conjunction with the limitation that each production can match only a single tree pattern, this typically meant that rules for versatile machine instructions with several addressing or operand modes had to be duplicated for each such combination. For most target

3Pennello [203] developed a technique to express the state table directly as assembly code, thus eliminating even the need to perform table lookups. This was reported to improve the rate of LR parsing by six to ten times.

machines, however, this turned out to be impracticable; in the case of the VAX machine – a CISC-based architecture from the 1980s, where each machine instruction accepted a multitude of operand modes [48] – such an instruction set grammar would contain over eight million rules [127]. Refactoring – the task of combining similar rules by expressing shared features via nonterminals – brought this number down to about a thousand rules, but this had to be done carefully so as not to have a negative impact on code quality. Second, since the parser traverses from left to right without backtracking, assembly code regarding one operand has to be emitted before any other operand can be observed. This can potentially lead to poor decisions which later have to be undone by emitting additional code. Hence to design a compact instruction set grammar – which also yields good code quality – the grammar writer had to possess extensive knowledge about the inner workings of the instruction selector.

3.3.2 Extending grammars with semantic handling

In purely context-free grammars there is simply no way of handling semantic information. For example, the exact register represented by a reg nonterminal is not available. Glanville and Graham worked around this limitation by pushing the information onto the stack, but even then their modified LR parser could only reason upon it using simple equality comparisons. Ganapathi and Fischer [112, 113, 114, 115] addressed this problem by replacing the use of traditional, context-free grammars with a more powerful set of grammars known as attribute grammars. As with the Glanville-Graham approach we will only discuss how it works at a high level; readers interested in the details are advised to consult the referenced papers.

Attribute grammars

Attribute grammars – or affix grammars as they are also called – were introduced by Knuth in 1968, who extended context-free grammars with attributes. Attributes are used to store, manipulate, and propagate additional information about individual terminals and nonterminals during parsing, and an attribute is either synthesized or inherited. Using parse trees as a point of reference, a node with a synthesized attribute forms its value from the attributes of its children, and a node with an inherited attribute copies the value from the parent. Consequently, information of synthesized attributes flows upwards along the tree while information of inherited attributes flows downwards. We therefore distinguish between synthesized and inherited attributes by a ↑ or ↓, respectively, which will be prefixed to the attribute of a particular symbol (e.g. the synthesized x attribute of a reg nonterminal is written as reg↑x). The attributes are then used within predicates and actions. Predicates are used for checking the applicability of a rule, and actions are used to produce new synthesized attributes. Hence, when modeling machine instructions we can use predicates to express the constraints, and actions to indicate effects such as code emission and in which register the result will be stored.

Let us look at an example. In Figure 3.6 we see a set of rules for modeling three byte-adding machine instructions: an increment version incb (increments a register by 1, modeled by rules 1 and 2); a two-address version add2b (adds two registers and stores the result in one of the operands, modeled by rules 3 and 4); and a three-address version add3b (the result can be stored elsewhere, modeled

   PRODUCTION                      PREDICATES                ACTIONS
1  byte↑r → + byte↑a byte↑r        IsOne(↓a), NotBusy(↓r)    EMIT “incb ↓r”
2  byte↑r → + byte↑r byte↑a        IsOne(↓a), NotBusy(↓r)    EMIT “incb ↓r”
3  byte↑r → + byte↑a byte↑r        TwoOp(↓a, ↓r)             EMIT “addb2 ↓a,↓r”
4  byte↑r → + byte↑r byte↑a        TwoOp(↓a, ↓r)             EMIT “addb2 ↓a,↓r”
5  byte↑r → + byte↑a byte↑b                                  GETREG(↑r), EMIT “addb3 ↓r,↓a,↓b”

Figure 3.6: An instruction set expressed as an affix grammar (from [114]). Terminals and nonterminals are typeset in normal font with an initial capital, attributes are prefixed by either a ↑ or ↓, predicates are typeset in italics, and actions are written entirely using capitals.

by rule 5). Naturally, the incb instruction can only be used when one of the operands is a constant of value 1, which is enforced by the predicate IsOne. In addition, since this instruction destroys the previous value of the register, it can only be used when no subsequent operation uses the old value; this is enforced by the predicate NotBusy. The EMIT action then emits the corresponding assembly code. Since addition is commutative, we require two rules to make the machine instruction applicable in both cases. Similarly we have two rules for the add2b instruction, but the predicates have been replaced by TwoOp, which checks whether one of the operands is the target of the assignment or whether its value is not needed afterwards (i.e. is not busy). Since the last rule does not have any predicates it also acts as a default rule, thus preventing situations of semantic blocking, which we discussed when covering the Glanville-Graham approach.

Advantages and disadvantages

The use of predicates removes the need to introduce new nonterminals for expressing specific values and ranges, which leads to a more concise instruction set grammar. In the case of the VAX machine, the use of attributes led to a grammar half the size of that required for the Glanville-Graham approach, without applying any extensive refactoring [113]. Attribute grammars also facilitate incremental development of the machine descriptions: one can start with implementing the most general rules to achieve an instruction set grammar that produces correct but inefficient code. Rules for handling more complex machine instructions can then be added incrementally, thus making it possible to balance implementation effort against code quality. Another neat feature is that other optimization routines, such as constant folding,4 can be expressed as part of the grammar instead of as a separate component.

To permit attributes to be used together with LR parsing, however, the properties of the instruction set grammar must be restricted. First, only synthesized attributes may appear in nonterminals. This is because an LR parser constructs the parse tree bottom-up and left-to-right, starting from the leaves and working its way up towards the root. Hence an inherited value to be used in guiding the construction of subtrees only becomes available once the subtree has already been created. Second, since predicates may render a rule invalid for rule reduction, all actions must appear at the right-most end of the production; otherwise these may lead to

4Constant folding is the task of precomputing arithmetic expressions appearing in the input program that operate on static values, such as constants.

incorrect effects that cannot be undone. Third, as with the Glanville-Graham approach, the parser has to take decisions regarding one subtree without taking into consideration sibling subtrees that may appear to the right. This can therefore lead to assembly code which could have been improved if all subtrees had been available beforehand, and this is again a limitation due to the use of LR parsing. Ganapathi [111] later made an attempt to resolve this problem by implementing an instruction selector in Prolog – a logic-based programming language – but this incurred exponential worst-case execution time.
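In the spirit of Figure 3.6, the sketch below shows how rules can carry predicate and action functions that are evaluated when a + node is reduced; the attribute encoding, the register handling, and the predicate details are invented for illustration.

    def is_one(attrs):   return attrs.get("value") == 1
    def not_busy(attrs): return not attrs.get("busy", False)

    RULES = [
        # (name, predicates over the two operand attribute sets, action)
        ("incb",  [lambda a, r: is_one(a) and not_busy(r)],
                  lambda a, r: (r["reg"], f"incb {r['reg']}")),
        ("addb2", [lambda a, r: not_busy(r)],
                  lambda a, r: (r["reg"], f"addb2 {a['reg']},{r['reg']}")),
        ("addb3", [],                              # no predicates: the default rule
                  lambda a, r: ("r_new", f"addb3 r_new,{a['reg']},{r['reg']}")),
    ]

    def reduce_add(left, right):
        """Reduce '+ left right': the first rule whose predicates hold is applied,
        and the predicate-free default rule guarantees that reduction never blocks."""
        for name, predicates, action in RULES:
            if all(p(left, right) for p in predicates):
                result_reg, code = action(left, right)
                print(code)
                return {"reg": result_reg, "busy": False}

    reduce_add({"reg": "r1", "value": 1}, {"reg": "r2"})      # incb r2
    reduce_add({"reg": "r3"}, {"reg": "r4", "busy": True})    # addb3 r_new,r3,r4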

3.3.3 Maintaining multiple parse trees for better code quality

Since LR parsers make a single pass over the expression trees – and thus only produce one out of many possible parse trees – the quality of the produced assembly code is heavily dependent on the instruction set grammar to guide the parser in finding "good" parse trees. Christopher et al. [52] attempted to address this concern by using the concepts of the Glanville-Graham approach but extending the parser to produce all parse trees, and then selecting the one which yields the best assembly code. This was achieved by replacing the original LR(1) parser with an implementation of Earley's algorithm [74], and although this approach certainly improves code quality – at least in theory – it does so at the cost of enumerating all parse trees, which is often too expensive in practice.

In 2000 Madhavan et al. [181] extended the Glanville-Graham approach to achieve optimal selection of patterns while allegedly retaining the linear time complexity of LR parsing. By incorporating a new version of LR parsing [230], reductions that were previously executed as soon as matching rules were found are now allowed to be postponed by an arbitrary number of steps. Hence the instruction selector essentially keeps track of multiple parse trees, allowing it to gather enough information about the input program before committing to a decision which could turn out to be suboptimal. In other words, like Christopher et al., the design by Madhavan et al. also covers all parse trees, but immediately discards those which are determined to result in less efficient assembly code. To do this efficiently the approach also incorporates offline cost analysis, which we will explore later in Section 3.6.3. More recently Yang [266] proposed a similar technique involving the use of parser cactuses [sic], where deviating parse trees are branched off a common trunk to reduce space requirements. However, the underlying principle of both still prohibits the modeling of many typical target-machine features such as multi-output instructions.

3.4 Covering trees top-down with recursion

The tree covering approaches examined so far – in particular those based on LR parsing – all operate bottom-up: the instruction selector begins by covering the leaf nodes in the expression tree, and based on these decisions it then subsequently covers the remaining subtrees until it reaches the root, continually matching and selecting applicable patterns along the way. This is by no means the only method of covering, as it can also be done top-down. In such approaches the instruction selector covers the expression tree starting from the root node, and then recursively works its way downwards. Consequently, the flow of semantic information (e.g. the particular register in which a result will be stored) is also different: a bottom-up instruction selector lets this information trickle upwards along the expression tree – either via auxiliary data structures or through tree rewriting – whereas a top-down implementation decides this beforehand and pushes it downwards. The latter are therefore said to be goal-driven, as pattern selection is guided by a set of additional requirements which must be fulfilled by the selected pattern. Since this in turn will incur new requirements for the subtrees, most top-down techniques are implemented using recursion, which also provides backtracking. This is needed because certain selections of patterns can cause the lower parts of the expression tree to become uncoverable.
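A minimal sketch of such goal-driven covering with backtracking is given below; the goals, patterns, and instruction names are invented. A goal is pushed down from the root, each pattern imposes new goals on the subtrees it leaves uncovered, and if some subtree cannot be covered under those goals the recursion backtracks and tries the next matching pattern.

    PATTERNS = [
        # (result goal, tree shape with sub-goals at the leaves, instruction)
        ("reg", ("+", "reg", "imm"), "add-immediate"),
        ("reg", ("+", "reg", "reg"), "add-registers"),
        ("reg", ("Int",),            "move-immediate"),
        ("imm", ("Int",),            None),      # an Int can be used directly as an immediate
    ]

    def cover(tree, goal):
        """Instructions implementing `tree` so that its result satisfies `goal`,
        or None if no combination of patterns works."""
        for result, shape, instruction in PATTERNS:
            if result != goal or shape[0] != tree[0] or len(shape) != len(tree):
                continue
            code, failed = [], False
            for subtree, subgoal in zip(tree[1:], shape[1:]):
                subcode = cover(subtree, subgoal)
                if subcode is None:
                    failed = True                # backtrack: try the next pattern
                    break
                code += subcode
            if not failed:
                return code + ([instruction] if instruction else [])
        return None

    # Cover 'x + 5' (both operands given as Int nodes) with the result in a register:
    print(cover(("+", ("Int",), ("Int",)), "reg"))
    # ['move-immediate', 'add-immediate']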

3.4.1 First approaches

Instruction selection guided by means-end analysis

To the best of my knowledge, Newcomer [194] was the first to develop a method which uses top-down tree covering to address instruction selection. In his 1975 PhD dissertation Newcomer proposes a design that exhaustively finds all combinations of patterns that cover a given expression tree, and then selects the one with the lowest cost. Cattell [45] also describes this in his survey paper, which is the main source for this discussion.

The machine instructions are modeled as T-operators, which are basically tree patterns with costs and attributes attached. The attributes describe various restrictions such as which registers can be used for the operands. There is also a set of T-operators that the instruction selector uses to perform necessary transformations of the input program – this need will become clear as we continue the discussion. With this toolbox the scheme takes an AST as expected input and then covers it following the aforementioned top-down approach: the instruction selector attempts to find all matching patterns for the root node of the AST, and then proceeds to recursively cover the remaining subtrees for each match. Pattern matching is done using a straight-forward technique that we know from before (see Figure 3.2 on page 25), and for efficiency all patterns are indexed according to the type of their root node. The result of this procedure is thus a set of pattern sequences which each covers the entire AST. Afterwards each sequence is checked to see whether the attributes of every pattern are equal to a preferred attribute set (PAS), which thus corresponds to a goal. If not, the instruction selector will attempt to rewrite the subtree using the transformation T-operators until the attributes match. To guide this process Newcomer applied a heuristic search strategy known as means-end analysis, which was introduced by Newell and Simon [195] in 1959. The intuition behind means-end analysis is to recursively minimize the quantitative difference – how this is calculated is not mentioned in [45] – between the current state (i.e. what the subtree looks like now) and a goal state (what it should look like). To avoid infinite looping the transformation process stops once it reaches a certain depth in the search space. If successful, the applied transformations are inserted into the pattern sequence; if not, the sequence is dropped. From the found pattern sequences the one with the lowest total cost is selected, followed by code emission.

Newcomer's approach was pioneering as the application of means-end analysis made it possible to guide the process of modifying the input program until it could be implemented on the target machine without having to resort to target-specific mechanisms. However, the

design was not without flaws. First, the design had little practical application as it only handles arithmetic expressions. Second, the T-operators used for modeling the machine instructions as well as the transformations had to be constructed by hand, which was non-trivial and hindered compiler retargetability. Third, the process of transforming the input program could end prematurely due to the search space cut-off, causing the instruction selector to fail to generate any assembly code whatsoever. Lastly, the search strategy proved much too expensive to be usable in practice except for the smallest of input trees.

Making means-end analysis work in practice

Cattell et al. [44, 47, 176] later improved and extended Newcomer's work into a more practical framework which was implemented in the Production Quality Compiler-Compiler (PQCC), a derivation of the Bliss-11 compiler originally written by Wulf et al. [264]. Instead of performing the means-end analysis as the input program is compiled, their approach does it as a preprocessing step when generating the compiler itself – much like with the Glanville-Graham approach. The patterns are expressed as a set of templates which are formed using recursive composition, and are thus similar to the productions found in instruction set grammars. Unlike Glanville and Graham's and Ganapathi and Fischer's designs, however, the templates are not written by hand but automatically derived from a target-specific machine description. Each machine instruction is described as a set of machine operations which describe the effects of the instruction, and are thus akin to the RTLs introduced by Fraser [102] in the previous chapter. These effects are then used by a separate tool called CGG (Code-Generator Generator) to create the templates which will be used by the instruction selector. In addition to producing the trivial templates that correspond directly with a machine instruction, CGG also produces a set of single-node patterns as well as a set of larger patterns that combine several machine instructions. The former ensures that the instruction selector can produce assembly code for all input programs, as any expression tree can thereby always be trivially covered, while the latter reduces compilation time as it is quicker to match a large pattern than many smaller ones. To do this CGG uses a combination of means-end analysis and heuristic rules which apply a set of axioms (e.g. ¬¬E ⇔ E, E + 0 ⇔ E, ¬(E1 ≥ E2) ⇔ E1 < E2, etc.) to manipulate and combine existing tree patterns into new ones. There are, however, no guarantees that these "interesting" tree patterns will ever be applicable. Once generated, instruction selection is then performed in a greedy, top-down fashion which always selects the template with the lowest cost that matches at the current node in the expression tree (pattern matching is done using a scheme identical to Newcomer's). If there is a tie, the instruction selector picks the template with the least number of memory loads and stores.

Compared to the LR parsing-based approaches discussed previously, Cattell et al.'s approach has both advantages and disadvantages. The main advantage is that regardless of the provided templates, the generated instruction selectors are always capable of generating valid assembly code for all possible input trees.5 At least in Ganapathi and Fischer's design this correctness has to be assured by the grammar designer. The disadvantage is that it is considerably slower:

5There is of course the possibility that the set of predefined templates is insufficient to ensure this property, but CGG is at least capable of detecting such situations and issuing a warning.

whereas the tree parsing-based instruction selectors exhibit linear time complexity, both for pattern matching and selection, Cattell et al.'s instruction selector has to match each template individually, which could take quadratic time in the worst case.

Recent techniques

To the best of my knowledge, the only recent approach (i.e. less than 20 years old) to use this kind of recursive top-down methodology for tree covering is that of Nymeyer et al. [197, 198]. In two papers from 1996 and 1997 Nymeyer et al. describe a method where A* search – another strategy for controlling the search space (see [219]) – is combined with bottom-up rewriting system theory, or BURS theory for short. We will discuss BURS theory in more detail later in this chapter – the eager reader can skip directly to Section 3.6.3 – but for now let it be sufficient to mention that grammars based on BURS allow transformation rules, such as rewriting X + Y into Y + X, to be included, which potentially simplifies and reduces the number of rules required for expressing the machine instructions. However, the authors did not publish any experimental results, making it difficult to judge whether the A*-BURS theory combination would be an applicable technique in practice.

3.5 A note on tree rewriting and tree covering

At this point some readers may feel that tree rewriting – where patterns are iteratively selected for rewriting the expression tree until it consists of a single node of some goal type – is something entirely different compared to tree covering – where compatible patterns are selected for covering all nodes in the expression tree. Indeed there appears to be a subtle difference, but a valid solution to a problem expressed using tree rewriting is also a valid solution to the equivalent problem expressed using tree covering, and vice versa. It could therefore be argued that the two are interchangeable, but I regard tree rewriting as one of the means of solving the tree covering problem, which in turn I regard as a fundamental principle.

3.6 Separating pattern matching and pattern selection

In the previously discussed tree covering approaches, the tasks of pattern matching and pattern selection are unified into a single step. Although this enables single-pass code generation – a trend which has long since fallen out of fashion [227] – it typically also prevents the instruction selector from considering the impact of combinations of patterns. By keeping these two concerns separate, and allowing the instruction selector to make multiple passes over the expression tree, it can gather enough information about all applicable patterns before having to commit to any decisions. However, excluding LR parsing, the pattern matchers we have seen so far have all been implementations of algorithms with quadratic time complexity. Fortunately, we can do better.

3.6.1 Algorithms for linear-time tree pattern matching

Over the years many algorithms have been discovered for finding all occurrences where a set of tree patterns match within another given tree (see for example [51, 55, 73, 140, 149, 210, 211, 229, 257, 265]). In tree covering, however, most pattern matching algorithms have been derived from methods of string pattern matching. This was first discovered by Karp et al. [149] in 1972, and Hoffmann and O'Donnell [140] then extended these ideas to form the algorithms most applied by tree-based instruction selection techniques. So in order to understand pattern matching with trees, let us first explore how this is done with strings.

Matching trees is the same as matching strings

The algorithms most commonly used for string matching were introduced by Aho and Corasick [5] and Knuth et al. [160]. Discovered independently of one another and published in 1975 and 1977, respectively, both algorithms operate in the same fashion and are thus nearly identical in their approach. The intuition is that when a partial match of a substring with a repetitive pattern fails, the matcher does not need to return all the way to the input character where the matching initially started. This is illustrated below, where the substring abcabd is matched against the input string abcabcabd:

  position:     0 1 2 3 4 5 6 7 8
  input string: a b c a b c a b d
  substring:    a b c a b d              (first attempt: fails at position 5)
  substring:          a b c a b d        (matching resumes at position 6)

At first, the substring matches the beginning of the input string up until the last character (position 5). When this fails, instead of returning to position 1 and restarting the matching from scratch, the matcher remembers that the first three characters of the substring (abc) have already been matched at this point and therefore continues to position 6, attempting to match the fourth character in the substring. Consequently all occurrences of the substring can be found in linear time. We continue our discussion with Aho and Corasick's approach as it is capable of matching multiple substrings whereas the algorithm of Knuth et al. only considers a single substring (although it can easily be extended to handle multiple substrings as well). Aho and Corasick's algorithm relies on three functions – goto, failure, and output – where the first implements a state machine and the latter two are simple lookup tables. How these are constructed is out of scope for our purpose – the interested reader can consult the referenced paper – but we will illustrate how it works with an example. In Figure 3.7 we see the corresponding functions for matching the strings he, she, his, and hers. As a character is read from an input string, say shis, it is first given as argument to the goto function whose state machine has been initialized to state 0. With s as input it transitions to state 3, and after h it ends up in state 4. For the next input character i, however, there exists no corresponding edge from the current state. At this point the failure function is consulted, which dictates that the state machine should fall back to state 1. We retry with i in that state, which takes us to state 6, and with the last input character we transition to state 7. For this state the output function also indicates that we have found a match with the pattern string his.

(a) Goto function: a state machine with states 0 through 9 and transitions 0 -h→ 1 -e→ 2 -r→ 8 -s→ 9, 1 -i→ 6 -s→ 7, 0 -s→ 3 -h→ 4 -e→ 5, and a self-loop on state 0 for all characters other than h and s.

(b) Failure function:

  i     1  2  3  4  5  6  7  8  9
  f(i)  0  0  0  1  2  0  3  0  3

(c) Output function: output(2) = {he}, output(5) = {she, he}, output(7) = {his}, output(9) = {hers}.

Figure 3.7: String matching state machine (from [5]).
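
To make the fallback idea concrete, here is a small Python sketch (my own illustration, not code from [5] or [160], and the function names are invented) that precomputes a failure function for a single substring and uses it to report all matches in linear time, applied to the abcabd/abcabcabd example above.

def build_failure(pattern):
    # failure[i] = length of the longest proper prefix of pattern[:i+1]
    # that is also a suffix of it.
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure

def find_all(pattern, text):
    # On a mismatch we fall back via the failure function instead of
    # restarting from the next input character, so the scan is linear.
    failure = build_failure(pattern)
    matches, k = [], 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = failure[k - 1]
    return matches

print(find_all("abcabd", "abcabcabd"))   # [3]: the substring starts at position 3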

The Hoffmann-O'Donnell algorithm

Hoffmann and O'Donnell [140] developed two algorithms which incorporate the ideas of Aho and Corasick and Knuth et al. In their 1982 paper they first present an O(np) algorithm which matches tree patterns in a top-down fashion, and then an O(n + m) bottom-up algorithm which trades linear-time pattern matching for longer preprocessing times (n is the size of the subject tree, p is the number of tree patterns, and m is the number of matches found). Pattern matching for the latter – which is the most applied due to its linear runtime behavior – is simple and outlined in Figure 3.8: starting at the leaves, each node is labeled with an identifier denoting the set of patterns that match the subtree rooted at that node. We call this set the matchset. The label to assign a particular node is retrieved by using the labels of the child nodes as indices to a table which is specific to the type of the current node. For example, label lookups for nodes representing addition are done using one table while lookups for nodes representing subtraction are done using another table. The dimension of the table is equal to the number of children that the node may have, e.g. binary operation nodes have 2-dimensional tables while nodes representing constant values have a 0-dimensional table, thus consisting of just a single value. A fully labeled example is also available in (g) of Figure 3.9 appearing on page 41. The matchsets are then retrieved via a subsequent top-down traversal. Since the bottom-up algorithm introduced by Hoffmann and O'Donnell has had an historical impact on instruction selection, we will spend some time discussing the details of how the lookup tables used for labeling are generated.

LabelTree(expression tree rooted at node n):
1: for each child node mi of n do
2:   LabelTree(mi)
3: end for
4: label n with Tn[labels of m1, ..., mk]

Figure 3.8: Hoffmann-O'Donnell's bottom-up algorithm for labeling expression trees (from [140]).
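
As an illustration of the labeling pass, here is a minimal Python sketch of my own (not code from [140]); it assumes that expression tree nodes are (symbol, children) tuples and that the per-symbol tables have already been generated, here hard-coded to the ones of Figure 3.9 on page 41.

def label_tree(node, tables):
    # Bottom-up labeling: label the children first, then use their labels
    # as indices into the table associated with the node's own symbol.
    symbol, children = node
    child_labels = [label_tree(child, tables) for child in children]
    entry = tables[symbol]
    for label in child_labels:      # one table dimension per child
        entry = entry[label]
    return entry                    # 0-dimensional tables are plain integers

# Tables from Figure 3.9: a has rank 2, b and c have rank 0.
T = {
    "a": [[2, 2, 2, 2, 2], [3, 3, 3, 3, 3],
          [2, 4, 2, 2, 2], [2, 4, 2, 2, 2], [2, 4, 2, 2, 2]],
    "b": 1,
    "c": 0,
}

# The labeled example of Figure 3.9(g): the tree a(a(b,c), b).
tree = ("a", [("a", [("b", []), ("c", [])]), ("b", [])])
print(label_tree(tree, T))   # 4: the root matches a(a(v,v),b)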

Definitions

We begin with introducing some definitions, and to our aid we will use the two tree patterns I and II appearing in (a) and (b), respectively, of Figure 3.9. First, we say that patterns I and II constitute our pattern forest, consisting of nodes with symbols a, b, c, or v, where an a node has exactly two children and b, c, and v nodes have none. The v symbol is a special nullary symbol as such nodes represent placeholders that can match any subtree. We say that these symbols collectively constitute the alphabet of the pattern forest, which we will denote by Σ. The alphabet needs to be finite and ranked, meaning that each symbol in Σ has a ranking function that gives the number of children for a given symbol. Hence, in our case rank(a) = 2 and rank(b) = rank(c) = rank(v) = 0. Following the terminology of Hoffmann and O'Donnell's paper, we also introduce the notion of a Σ-term and define it as follows:

1. For all i ∈ Σ where rank(i) = 0, i is a Σ-term.
2. If i ∈ Σ where m = rank(i) > 0, then i(t1, ..., tm) is a Σ-term provided that every ti is a Σ-term.
3. Nothing else is a Σ-term.

A tree pattern is therefore a Σ-term, allowing us to write patterns I and II as a(a(v,v),b) and a(b,v), respectively. Σ-terms are also ordered – e.g. a(b,v) is different from a(v,b) – meaning that commutative operations such as addition must be handled through pattern duplication, like in the Glanville-Graham approach. We also introduce the following definitions: a pattern p is said to subsume another pattern q (written p ≥ q) if any matchset that includes p always also includes q. For example, given the two patterns a(b,b) and a(v,v) we have that a(b,b) ≥ a(v,v) since the v nodes must obviously also match whenever the b nodes match. By this definition all patterns also subsume themselves. A pattern p strictly subsumes another pattern q (written p > q) if and only if p ≥ q and p ≠ q. Lastly, a pattern p immediately subsumes another pattern q (written p >i q) if and only if p > q and there exists no other pattern r such that p > r and r > q. For completeness, two patterns p and q are said to be inconsistent if neither appears in the matchset of the other, and independent if there exists a tree for which p matches but not q, and another tree for which q matches but not p. Pattern forests which contain no independent patterns are known as simple pattern forests, and our pattern forest is simple since there exists no tree for which both patterns I and II match.
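
The subsumption test itself can be expressed recursively; the following Python sketch is my own formulation of the definition above, with patterns encoded as (symbol, children) tuples and "v" as the wildcard.

def subsumes(p, q):
    # p >= q: whenever p matches a tree, q matches it as well.
    if q == ("v", []):
        return True                       # every pattern subsumes the wildcard
    if p == ("v", []):
        return False
    return (p[0] == q[0] and len(p[1]) == len(q[1])
            and all(subsumes(pc, qc) for pc, qc in zip(p[1], q[1])))

v, b = ("v", []), ("b", [])
print(subsumes(("a", [b, b]), ("a", [v, v])))   # True:  a(b,b) >= a(v,v)
print(subsumes(("a", [v, v]), ("a", [b, v])))   # False: v does not subsume b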

Generating the lookup tables for simple pattern forests

The notion of simple pattern forests is important. In general the size of each lookup table is exponential in the size of the pattern forest, as is the time to generate these tables. However, Hoffmann and O'Donnell recognized that for simple pattern forests the number of possible matchsets is equal to the number of patterns, and that each such matchset can be represented by a single tree pattern – namely the largest pattern appearing in the matchset. The intuition is as follows: if the pattern forest is simple – meaning it is devoid of independent patterns – then for every pair of patterns in every matchset it must hold that one pattern subsumes the other. Consequently, for every matchset we can form an order of subsumption between the patterns

(a) Pattern I: a(a(v,v), b).
(b) Pattern II: a(b, v).
(c) Subsumption graph, loop edges excluded.
(d) Table for symbol a:

  Ta  0  1  2  3  4
  0   2  2  2  2  2
  1   3  3  3  3  3
  2   2  4  2  2  2
  3   2  4  2  2  2
  4   2  4  2  2  2

(e) Table for symbol b: Tb = 1.
(f) Table for symbol c: Tc = 0.
(g) A labeled example: the tree a(a(b,c), b), labeled bottom-up as b: 1, c: 0, a(b,c): 3, b: 1, and 4 for the root.

Figure 3.9: Tree pattern matching using Hoffmann-O'Donnell (from [140]). Nullary nodes v are accentuated with a dashed border. The subpatterns have been labeled as follows: v (0), b (1), a(v,v) (2), a(b,v) (3), and a(a(v,v),b) (4).

BuildSubsumptionGraph(set of subpatterns S):
1: initialize GS with one node and loop edge per s ∈ S
2: for each s = a(s1, ..., sm) ∈ S in increasing order of height do
3:   for each s′ ∈ S s.t. height of s′ ≤ height of s do
4:     if s′ = v, or s′ = a(s′1, ..., s′m) s.t. for 1 ≤ i ≤ m, edge si → s′i ∈ GS, then
5:       add edge s → s′ to GS
6:     end if
7:   end for
8: end for

(a) Algorithm for producing the subsumption graph.

GenerateTable(set of subpatterns S, subsumption graph GS, symbol a ∈ Σ):
1: do topological sort on GS
2: initialize all entries in Ta with v ∈ S
3: for each s = a(s1, ..., sm) ∈ S in increasing order of subsumption do
4:   for each m-tuple ⟨s′1, ..., s′m⟩ s.t. for 1 ≤ i ≤ m, s′i ≥ si do
5:     Ta[s′1, ..., s′m] = s
6:   end for
7: end for

(b) Algorithm for producing the lookup tables.
Figure 3.10: Preprocessing algorithms of Hoffmann-O'Donnell (from [140]).

appearing in that matchset. For example, if we have a matchset containing m patterns we can arrange these patterns such that p1 > p2 > ... > pm. Since the pattern with the "highest" order of subsumption (i.e. the pattern which subsumes all other patterns in the matchset) can only appear in one matchset – otherwise there would exist two patterns in the forest which are pairwise independent – we can uniquely identify the entire matchset using the label of that pattern. Remember also that an index along one dimension in the table will be the label assigned to the child of the current node to be labeled in the expression tree. Consequently, if we assume that labels 1 and 2 represent the subpatterns b and a(v,v), respectively, then a table lookup Ta[1, 2] effectively means that we are currently standing at the root of a tree t where the largest pattern that can possibly match is a(b,a(v,v)), and that the label to assign this node is the label of the largest pattern that is subsumed by a(b,a(v,v)). If we know the order of subsumption then we can easily retrieve this pattern. To find the order of subsumption we first enumerate all unique subtrees that appear in our pattern forest. In our case this includes v, b, a(v,v), a(b,v), and a(a(v,v),b), which we will represent with S. We then assign each subpattern a sequential number, starting from 0. These represent the labels which will be used during pattern matching, so the order in which these are assigned is not important. Next we form the subsumption graph of S, denoted as GS, where each node ni ∈ GS represents a subpattern si ∈ S, and each directed edge ni → nj indicates that si subsumes sj. For our pattern set we end up with the subsumption graph illustrated in (c) of Figure 3.9,6 which we produce using the algorithm given in (a) of Figure 3.10. The algorithm basically works as follows: a pattern p subsumes another pattern q if and only if q is the nullary pattern v, or the roots of p and q are of the same symbol and every child tree of p subsumes the corresponding child tree of q. To check whether one tree subsumes another we simply check the presence of an edge between the corresponding nodes in GS. Since every pattern subsumes itself, we start by adding all loop edges, check whether one pattern subsumes another, add the corresponding edge if this is the case, and then repeat until GS stabilizes. Fortunately we can minimize the number of checks by first ordering the subpatterns in S by increasing order of height and then comparing the subpatterns in that order. Once we have GS, we can find the order of subsumption for all patterns by making a topological sort of the nodes in GS, for which there are known methods (a definition of topological sort is available in Appendix C). Now we generate the lookup tables by following the algorithm outlined in (b) of Figure 3.10; in our example this gives us the tables shown in (d), (e), and (f) of Figure 3.9. Starting with a table initialized using the nullary pattern, the algorithm then incrementally updates each entry with the label of the next larger pattern that matches the corresponding tree. Since we iterate over the patterns in increasing order of subsumption, the last assignment to each entry will be that of the largest pattern that matches in our pattern forest. As already stated, since patterns are required to be ordered we need to duplicate patterns containing commutative operations by swapping the subtrees of the operands. Doing this, however, yields patterns which are pairwise independent, thus destroying the property of the

6 There is also a corresponding immediate subsumption graph, which in general is shaped like a directed acyclic graph (DAG) but for simple pattern forests always results in a tree. However, we do not need this graph for generating the lookup tables.

pattern forest being simple. In such cases the algorithm is still able to produce usable lookup tables, but a choice will be forced between the matchsets which include one commutative pattern but not the other (it comes down to whichever subpattern in S is used last during table generation). Consequently, not all matching instances will be found during pattern matching, which may in turn prevent optimal pattern selection.
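
Returning to the running example, the short Python sketch below (my own reconstruction of the procedure just described, not Hoffmann and O'Donnell's code, with invented helper names) derives the table Ta of Figure 3.9 by applying the patterns rooted at each symbol in increasing order of subsumption.

from itertools import product

V, B = ("v", []), ("b", [])
def a(left, right): return ("a", [left, right])

# Subpatterns of the forest {a(a(v,v),b), a(b,v)}, numbered as in Figure 3.9.
S = [V, B, a(V, V), a(B, V), a(a(V, V), B)]          # labels 0..4

def subsumes(p, q):
    if q == V: return True
    if p == V: return False
    return p[0] == q[0] and all(subsumes(x, y) for x, y in zip(p[1], q[1]))

def table_for(symbol, rank):
    rooted = [p for p in S if p[0] == symbol]
    # Any linear extension of the subsumption order will do; counting how
    # many patterns each one subsumes gives a simple one.
    rooted = sorted(rooted, key=lambda p: sum(subsumes(p, q) for q in rooted))
    if rank == 0:                                    # 0-dimensional table
        label = 0                                    # start with v's label
        for p in rooted:
            label = S.index(p)
        return label
    table = {}
    for idx in product(range(len(S)), repeat=rank):
        table[idx] = 0
        for p in rooted:                             # later = more specific
            if all(subsumes(S[i], child) for i, child in zip(idx, p[1])):
                table[idx] = S.index(p)
    return table

Ta = table_for("a", 2)
print([[Ta[(i, j)] for j in range(5)] for i in range(5)])
# [[2,2,2,2,2], [3,3,3,3,3], [2,4,2,2,2], [2,4,2,2,2], [2,4,2,2,2]]
print(table_for("b", 0), table_for("c", 0))          # 1 0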

Compressing the lookup tables

Chase [49] further advanced Hoffmann and O'Donnell's table generation technique by developing an algorithm that compresses the final lookup tables. The key insight is that the lookup tables often contain redundant information as many rows and columns are duplicates. For example, this can be seen clearly in Ta from our previous example, which is also available in (a) of Figure 3.11. By introducing a set of index maps, the duplicates can be removed by mapping the indices of identical rows or columns to the same row or column in a smaller core table. The lookup table can then be reduced to contain only the minimal amount of information, as seen in (b) of Figure 3.11. By denoting the compressed version of Ta by τa, and the corresponding index maps by µa,0 and µa,1, we replace the previous lookups – which were done via Ti[l0, ..., lm] – with τi[µi,0[l0], ..., µi,m[lm]].

(a) Uncompressed table Ta (from Figure 3.9), with index maps µa,0 and µa,1.
(b) Compressed table τa.
Figure 3.11: Compression of lookup tables (from [49]).

Table compression also provides another benefit: for some pattern forests the lookup tables can be so large that they cannot even be constructed in the first place. Fortunately Chase discovered that the tables can be compressed as they are generated, thus pushing the limit on how large lookup tables can be produced. Cai et al. [40] later improved the asymptotic bounds of Chase's algorithm.
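
The effect of the compression is easy to reproduce. The sketch below is my own illustration of the idea only, not Chase's actual algorithm (which interleaves compression with table generation); it collapses duplicate rows and columns of a two-dimensional table into a core table plus one index map per dimension.

def compress(table):
    # Map identical rows to one compressed row...
    rows, row_map = [], []
    for row in table:
        if row not in rows:
            rows.append(row)
        row_map.append(rows.index(row))
    # ...and identical columns (of the reduced rows) to one compressed column.
    cols, col_map = [], []
    for c in range(len(table[0])):
        col = [r[c] for r in rows]
        if col not in cols:
            cols.append(col)
        col_map.append(cols.index(col))
    core = [[col[i] for col in cols] for i in range(len(rows))]
    return core, row_map, col_map

Ta = [[2, 2, 2, 2, 2], [3, 3, 3, 3, 3],
      [2, 4, 2, 2, 2], [2, 4, 2, 2, 2], [2, 4, 2, 2, 2]]
core, mu0, mu1 = compress(Ta)
print(core)        # [[2, 2], [3, 3], [2, 4]]
print(mu0, mu1)    # [0, 1, 2, 2, 2] [0, 1, 0, 0, 0]
# A lookup Ta[i][j] now becomes core[mu0[i]][mu1[j]]:
print(Ta[3][1] == core[mu0[3]][mu1[1]])   # True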

3.6.2 Optimal pattern selection with dynamic programming

Once it became possible to find all matchsets for the entire expression tree in linear time, techniques started to appear that tackled the problem of optimal pattern selection in linear time. According to the literature, Ripken [216] was the first to propose a viable approach for optimal instruction selection, which is described in a 1977 technical report. Ripken based his method on the dynamic programming algorithm by Aho and Johnson – which was mentioned

in Chapter 1 – and later extended it to handle more realistic instruction sets with multiple register classes and addressing modes. For brevity we will henceforth abbreviate dynamic programming as DP. Although Ripken appears to have been the first to propose a design of an optimal DP-based instruction selector, it only remained at that – a proposal. The first practical attempt at such a system was instead made by Aho et al. [6, 7, 245].

Twig

In 1985 Aho and Ganapathi presented a language called CGL (Code Generator Language) where tree patterns are described using attribute grammars (see Figure 3.12 for examples), which are then processed by a compiler generator called Twig. Like Ripken's design, the generated instruction selector selects the optimal set of patterns using a version of Aho and Johnson's DP algorithm, and makes three passes over the expression tree: a top-down pass, followed by a bottom-up pass, and lastly another top-down pass. In the first pass all matchsets are found using an implementation of the Aho-Corasick string matching algorithm [5]. The second pass applies the DP algorithm for selecting the patterns, and the last pass then emits the corresponding assembly code. We proceed with examining the DP algorithm as we are already quite familiar with how to do tree pattern matching. The DP algorithm is outlined in Figure 3.13. For each node in the expression tree we maintain a data structure for remembering the lowest cost of reducing the node to a particular nonterminal, as well as the rule which enables this reduction. Remember that a pattern found in the matchset corresponds to the right-hand side of a production appearing in the instruction set grammar (which we discussed in Section 3.3). The cost of applying a rule is the sum of the cost of the rule itself and the costs of reducing the child nodes to whatever nonterminals are used by the rule. Costs therefore increase monotonically. It can also happen that not just one rule is applied for the current node, but several. This occurs when the grammar contains

node const mem assign plus ind;
label reg no_value;

reg: const /* Rule 1 */
  { cost = 2; }
  = { NODEPTR regnode = getreg();
      emit(''MOV'', $1$, regnode, 0);
      return(regnode); };

no_value: assign(mem, reg) /* Rule 3 */
  { cost = 2+$%1$->cost; }
  = { emit(''MOV'', $2$, $1$, 0);
      return(NULL); };

reg: plus(reg, ind(plus(const, reg))) /* Rule 6 */
  { cost = 2+$%1$->cost+$%2$->cost; }
  = { emit(''ADD'', $2$, $1$, 0);
      return($1$); };

Figure 3.12: Rule samples from TWIG (from [7]).

ComputeCosts(expression tree rooted at node n):
1: for each child ni of n do
2:   ComputeCosts(ni)
3: end for
4: initialize costs array for n with ∞
5: for each rule ri in matchset of n do
6:   ci ← cost of applying ri at n
7:   li ← left-hand nonterminal of ri
8:   if ci < costs[li] then
9:     costs[li] ← ci
10:    set ri as cheapest rule for reducing n to li
11:  end if
12: end for

Figure 3.13: Computing the costs with dynamic programming (from [7]).

SelectAndEmit(expression tree rooted at node n, goal nonterminal g):
1: r ← cheapest rule that reduces n to g
2: for each nonterminal l that appears on the right-hand side of r do
3:   SelectAndEmit(node to be reduced to l, l)
4: end for
5: emit the assembly code associated with r

Figure 3.14: DP-driven instruction selection algorithm with optimal pattern selection and code emission (from [7]).

chain rules, where the right-hand side of the production consists of a single nonterminal. For example, a reduction of a terminal A to c can either be done via a c → A rule, or via a series of rule invocations, e.g. b → A, b′ → b, and c → b′. The algorithm must therefore take these chain rules into account and check them before any other rule that could depend on them. This is known as computing the transitive closure. Once the costs have been computed for the root node we can then select the cheapest set of rules that reduces the entire expression tree to the desired nonterminal. The selection algorithm is outlined in Figure 3.14: for a given goal, represented by a nonterminal, the cheapest applicable rule to reduce the current node is selected. The same is then done for each nonterminal which appears on the right-hand side of the selected rule, thus acting as the goal for the corresponding subtree. This algorithm also correctly applies the chain rules as the use of such a rule causes SelectAndEmit to be invoked on the same node but with a different goal. Note, however, that since patterns can be arbitrarily large, one must take measures to select the correct subtree, which can appear several levels down from the current node in the expression tree.
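
To make the interplay between cost computation, chain rules, and emission concrete, here is a small self-contained Python sketch of my own. The toy grammar and costs are invented (this is not Twig's rule format), the formulation is a memoized top-down variant of the same DP, and it assumes the chain rules are acyclic.

import functools

# Each rule is (lhs, (operator, operand goals), cost); the pseudo-operator
# "chain" marks a chain rule whose single operand is the nonterminal to use.
RULES = [
    ("reg",  ("Const", ()),              1),   # load an immediate
    ("addr", ("Const", ()),              0),   # a constant can act as an address
    ("reg",  ("Ld",    ("addr",)),       1),   # load from an address
    ("reg",  ("Plus",  ("reg", "reg")),  1),
    ("addr", ("chain", ("reg",)),        0),   # chain rule: addr -> reg
]

@functools.lru_cache(maxsize=None)
def best(node, goal):
    # Cheapest (cost, rule) reducing `node` to nonterminal `goal`.
    result = (float("inf"), None)
    for lhs, (op, operands), cost in RULES:
        if lhs != goal:
            continue
        if op == "chain":
            total = cost + best(node, operands[0])[0]
        elif op == node[0] and len(operands) == len(node[1]):
            total = cost + sum(best(kid, g)[0]
                               for kid, g in zip(node[1], operands))
        else:
            continue
        if total < result[0]:
            result = (total, (lhs, (op, operands), cost))
    return result

def select_and_emit(node, goal):
    # Walk the cheapest rules again, emitting code bottom-up.
    _, (lhs, (op, operands), _) = best(node, goal)
    if op == "chain":
        select_and_emit(node, operands[0])
    else:
        for kid, g in zip(node[1], operands):
            select_and_emit(kid, g)
    print("emit:", lhs, "<-", op, operands)

# Expression tree Plus(Ld(Const), Const), written as nested tuples.
tree = ("Plus", (("Ld", (("Const", ()),)), ("Const", ())))
print(best(tree, "reg")[0])      # 3
select_and_emit(tree, "reg")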

DP versus LR parsing

This DP scheme has several advantages over those based on LR parsing. First, reduction conflicts are automatically handled by the cost computing algorithm, thus removing the need

of ordering the rules, which affects code quality for LR parsers. Second, rule cycles that cause LR parsers to get stuck in an infinite loop no longer need to be explicitly broken. Third, machine descriptions can be made more concise as rules differing only in costs can be combined into a single rule; again taking the VAX machine as an example, Aho et al. reported that the entire Twig specification could be implemented using only 115 rules, which is about half the size of Ganapathi and Fischer's attribute-based instruction set grammar for the same machine. However, the approach requires that the problem of code generation exhibits properties of optimal substructure – i.e. that optimal code can be generated by solving each of its subproblems optimally. This is not always the case, as a set of patterns whose total cost is greater than that of another set of selected patterns can actually lead to better assembly code. For example, let us assume that we need to select patterns for implementing a set of operations that can be executed independently from one another. The instruction set provides two options: either use two machine instructions, each taking 2 cycles; or use a single machine instruction which takes 3 cycles. If the two machine instructions in the first option are executed sequentially then the second option is more efficient. However, if the instruction scheduler is capable of parallelizing the instructions then the first option is better. This further highlights the problem of considering optimal instruction selection in isolation, as we discussed in Chapter 1.

Further improvements

Several improvements of Twig were later made by Yates and Schwartz [267] and Emmelmann et al. [79]. Yates and Schwartz improved the rate of pattern matching by replacing Twig's top-down approach with the faster bottom-up algorithm proposed by Hoffmann and O'Donnell, and also extended the attribute support to allow for more powerful predicates. Emmelmann et al. modified the DP algorithm to run as the IR trees are built by the frontend, and also inlined the code of auxiliary functions directly into the DP algorithm to reduce the overhead. Emmelmann et al. implemented their improvements in a system called BEG (Back End Generator), and a modified version of this is currently used in the CoSy compiler [23]. Fraser et al. [104] made similar improvements in a system called Iburg that is both simpler and faster than Twig – Iburg requires only 950 lines of code compared to Twig's 3,000 lines of C, and generates assembly code of comparable quality at a rate that is 25x faster – and has been used in several compilers (e.g. Record [174, 184] and Redaco [162]). Gough and Ledermann [124, 125] later made some minor improvements to Iburg in an implementation called Mburg. Both Iburg and Mburg have since been reimplemented in various programming languages, such as the Java-based Jburg [243], and OCamlBurg [246] which is implemented in C--. According to [42, 173], Tjiang [244] later merged the ideas of Twig and Iburg into a new implementation called Olive – the name is a spin-off of Twig – and made several additional improvements, such as allowing rules to use arbitrary cost functions instead of fixed numeric values. This allows for more versatile instruction selection as rules can be dynamically deactivated by setting an infinite cost, which can be controlled from the current context. Olive is used in the implementation of Spam [237] – a fixed-point DSP compiler – and Araujo and Malik [16] employed it in an attempt to integrate instruction selection with scheduling and register allocation.

3.6.3 Faster pattern selection with offline cost analysis

In the dynamic programming approach just discussed, the rule costs needed for selecting the patterns are dynamically computed while the pattern matcher is completely table-driven. It was later discovered that these calculations can also be done beforehand and represented as tables, thus improving the speed of the pattern selector just as it did for pattern matching. We will refer to this aspect as offline cost analysis, which means that the costs are precomputed as part of generating the compiler instead of during compilation of the input program.

Extending matchset labels with costs

To make use of offline cost analysis we need to extend the labels to not only represent matchsets, but also the information about which pattern will lead to the lowest covering cost given a specific goal. To distinguish between the two we refer to this extended form of labels as states. A state is essentially a representation of a specific combination of goals, patterns, and costs, where each possible goal g is associated with a pattern p and a relative cost c. A goal in this context typically dictates where the result of an expression must appear (e.g. a particular register class, memory, etc.), and in grammar terms this means that each nonterminal is associated with a rule and a cost. This combination is such that

1. for any expression tree whose root node has been labeled with a particular state,
2. if the goal of the root node must be g,
3. then the entire expression tree can be covered with minimal cost by selecting pattern p at the root. The relative cost of this covering, compared to if the goal had been something else, is equal to c.

A key point to understand here is that a state does not necessarily need to carry information about how to optimally cover the entire expression tree – indeed, such attempts would require an infinite number of states. Instead the states only convey enough information about how to cover the distinct key shapes that can appear in any expression tree. To explain this further, let us observe how most (if not all) target machines operate: between the execution of two machine instructions, the data is synchronized by storing it in registers or in memory. The manner in which some data came to appear in a particular location has in general no impact on the execution of the subsequent machine instructions. Consequently, depending on the available machine instructions one can often break an expression tree at certain key places without compromising code quality. This yields a forest of many, smaller expression trees, each with a specific goal at the root, which then can be optimally covered in isolation. In other words, the set of states only need to collectively represent enough information to communicate where these cuts can be made for all possible expression trees. This does not mean, however, that the expression tree is actually chopped into smaller pieces before pattern selection, but thinking about it in this way helps to understand why we can get away with a finite number of states and still get optimal pattern selection. Since a state is simply an extended form of a label, the process of labeling an expression tree with states is exactly the same as before (see Figure 3.8 on page 39) as we simply need to replace the lookup tables. Pattern selection and code emission is then done as described in

SelectAndEmit(expression tree rooted at node n, goal nonterminal g):
1: rule r ← RT[state assigned to n, g]
2: for each nonterminal l that appears on the right-hand side of r do
3:   SelectAndEmit(node to be reduced to l, l)
4: end for
5: emit the assembly code associated with r

Figure 3.15: Table-driven algorithm for performing optimal pattern selection and code emission (from [208]). RT stands for rule table, which specifies the rule to select given a certain state and goal for optimal covering.

Figure 3.15, which is more or less identical to the algorithm used in conjunction with dynamic programming (compare with Figure 3.14 on page 45). However, we have yet to describe how to go about computing these states.

First technique to apply offline cost analysis

According to [26] the idea of offline cost analysis for pattern selection was first introduced by Henry [138] in his 1984 PhD dissertation, and Hatcher and Christopher [134] then appear to have been the pioneers in actually attempting it. However, this approach appears to be fairly unknown as it is little cited in other literature. Described in their 1986 paper, Hatcher and Christopher's approach is an extension of the work by Hoffmann and O'Donnell. The intuition is to find which rule to apply for some expression tree, whose root node has been assigned a label l, such that the entire tree can be reduced to a given nonterminal nt at the lowest cost. Hatcher and Christopher argued that for optimal pattern selection we can consider each pair of a label l and a nonterminal nt, and then always apply the rule that will reduce the largest expression tree Tl, which is representative of l, into nt at the lowest cost. In Hoffmann and O'Donnell's design, where there is only one nullary symbol that may match any subtree, Tl is equal to the largest pattern appearing in the matchset, but to accommodate instruction set grammars Hatcher and Christopher's version includes one nullary symbol per nonterminal. This means that Tl has to be found by overlapping all patterns appearing in the matchset. We then calculate the cost of transforming a larger pattern p into a subsumed, smaller pattern q (i.e. p > q) for every pair of patterns. This cost, which is later annotated onto the subsumption graph, is calculated by recursively rewriting p using other patterns until it is equal to q, making the cost of this transformation equal to the sum of all applied patterns. We represent this cost with a function reducecost(p →* q). With this information we retrieve the rule that leads to the lowest-cost reduction of Tl into a goal g by finding the rule r for which

reducecost(Tl →* g) = reducecost(Tl →* tree pattern of r) + cost of r.

This will either select the largest pattern appearing in the matchset of l, or, if one exists, a smaller pattern that in combination with others has a lower cost. We have of course glossed over many details, but this covers the main idea of Hatcher and Christopher's approach. By encoding the selected rules into an additional table to be consulted during pattern matching, we achieve a completely table-driven instruction selector which also performs optimal pattern selection. Hatcher and Christopher also augmented the algorithm for retrieving

the matchsets to include the duplicated patterns of commutative operations, which were otherwise omitted from some matchsets. However, if there exist patterns that are truly independent then Hatcher and Christopher's design does not always guarantee that the expression trees can be optimally covered. In addition, it is not clear whether optimal pattern selection for the largest expression trees representative of the labels is an accurate approximation of optimal pattern selection for all expression trees.

Generating the states using BURS theory

A different and more well-known method for generating the states was developed by Pelegrí-Llopart and Graham [202]. In their seminal 1988 paper Pelegrí-Llopart and Graham prove that tree rewrites can always be arranged such that all rewrites occur at the leaves of the tree, resulting in a bottom-up rewriting system (abbreviated BURS). We say that a collection of such rules constitutes a BURS grammar, which is similar to the grammars already seen with the exception that BURS grammars allow multiple symbols – including terminals – to appear on the right-hand side of a production. An example of such a grammar is given in (a) of Figure 3.16, and Dold et al. [71, 273] later developed a method for proving the correctness of BURS grammars using abstract state machines. Using this theory Pelegrí-Llopart and Graham developed an algorithm that computes the tables needed for optimal pattern selection based on a given BURS grammar. The idea is as follows: for a given expression tree E, a local rewrite (LR) graph is formed where a node represents a specific subtree appearing in E, and an edge indicates the application of a particular rewrite rule on that subtree. An example of such a graph is given in (b) of Figure 3.16. Setting some nodes as goals (i.e. desired results of tree rewriting), a subgraph called the uniquely invertible LR (UI LR) graph is then selected from the LR graph such that the number of rewrite possibilities is minimized. Finding this UI LR graph is an NP-complete problem, so Pelegrí-Llopart and Graham applied a heuristic that iteratively removes nodes that are deemed "useless". Each UI LR graph then corresponds to a state, and by generating all LR graphs for all possible expression trees that can be given as input we can find all the necessary states.

(a) BURS grammar (ten rules, including pattern rules such as r → op a a and transformation rules such as + y x → + x y).
(b) Example of an LR graph based on the expression tree + 0 + C C and the grammar shown in (a). Dashed nodes represent subtrees of the expression tree and fully drawn nodes represent goals. Edges indicate rule applications, with the number of the applied rule appearing next to it.
Figure 3.16: BURS example (from [202]).

(a) A rewrite system that may lead to an unbounded number of states: r → Int (cost 1) and r → Ld r (cost 1), where Ld stands for "load".
(b) Input tree labeled with states that incorporate the full cost (requires an infinite number of states).
(c) Input tree labeled with states that only incorporate the delta costs (only requires 2 states).
Figure 3.17: Incorporating costs into states (from [202]).

Achieving a bounded number of states

To achieve optimal pattern selection the LR graphs are augmented such that each node no longer represents a tree pattern but a (p, c) pair, where c denotes the minimal cost of reaching pattern p. This is the information embodied by the states as we discussed earlier. A naïve approach would be to include the full cost of reaching a particular pattern into the state, but depending on the rewrite system this may require an infinite number of states. An example where this occurs is given in (b) of Figure 3.17. A better method is to instead account for the relative cost of a selected pattern. This is achieved by computing c as the difference between the cost of p and the smallest cost associated with any other pattern appearing in the LR graph. This yields the same optimal pattern selection but the number of needed states is bounded, as seen in (c) of Figure 3.17. This cost is called the delta cost and the augmented LR graph is thus known as the δ-LR graph. To limit the data size when generating the δ-LR graphs, Pelegrí-Llopart and Graham used an extension of Chase's table compression algorithm [49] which we discussed on page 43. During testing Pelegrí-Llopart and Graham reported that their implementation yielded state tables only slightly larger than those produced by LR parsing, and generated assembly code of quality comparable to Twig's, but at a rate that was about 5x faster.
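
The bounding effect of delta costs is easy to see on the rewrite system of Figure 3.17. The following Python sketch (my own illustration) labels a chain of Ld nodes once with full costs and once with costs normalized so that the cheapest entry is zero.

# Rewrite system from Figure 3.17: r -> Int (cost 1) and r -> Ld r (cost 1).
# A state maps each goal nonterminal to (chosen rule, cost).
def label(node, use_delta):
    if node == "Int":
        state = {"r": ("r -> Int", 1)}
    else:                                   # node = ("Ld", subtree)
        sub = label(node[1], use_delta)
        state = {"r": ("r -> Ld r", 1 + sub["r"][1])}
    if use_delta:
        # Normalize so that the cheapest entry costs 0 (the delta cost).
        low = min(cost for _, cost in state.values())
        state = {nt: (rule, cost - low) for nt, (rule, cost) in state.items()}
    return state

tree = ("Ld", ("Ld", ("Ld", "Int")))
print(label(tree, use_delta=False))   # {'r': ('r -> Ld r', 4)}: grows with depth
print(label(tree, use_delta=True))    # {'r': ('r -> Ld r', 0)}: one recurring state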

BURS ⇎ offline cost analysis

Since its publication many subsequent papers have mistakenly associated the idea of offline cost analysis with BURS, typically using terms like BURS states, when these two aspects are in fact orthogonal to each other. Although the work by Pelegrí-Llopart and Graham undoubtedly led to making offline cost analysis an established aspect of modern instruction selection, the application of BURS theory is only one means to achieving optimal pattern selection using tables. Shortly after Pelegrí-Llopart and Graham introduced BURS, Balachandran et al. [26] published a paper in 1990 that describes an alternative method for generating the states which is both simpler and more efficient. At its heart their algorithm iteratively creates new states using those already committed to appear in the state tables. Remember that each state represents

a combination of nonterminals, rules, and costs, where the costs have been normalized such that the lowest cost of any rule appearing in that state is 0. Hence two states are identical if the selected rules and the costs are the same for all nonterminals. As a new state is created it is checked whether it has already been seen – if so it is dropped, otherwise it is added to the set of committed states – and the process repeats until no new states can be created. We will go into more details shortly. Compared to Pelegrí-Llopart and Graham, this algorithm is less complicated and also faster as it directly generates a smaller set of states instead of first enumerating all possible states and then reducing them. In addition, Balachandran et al. expressed the machine instructions as a more traditional instruction set grammar – like those used in the Glanville-Graham approach – instead of as a BURS grammar.

Linear-form grammars

In the same paper Balachandran et al. also introduce the idea of linear-form grammars, which means that the right-hand side of every production appearing in the grammar is restricted to one of the following shapes:

1. lhs → Op n1 ... nk, where Op is a terminal representing an operator with rank(Op) = k > 0, and the ni are all nonterminals. This is called a base rule.
2. lhs → T, where T is a terminal. This is also a base rule.
3. lhs → nt, where nt is a nonterminal. This is called a chain rule.

A non-linear grammar can easily be rewritten into linear form by introducing new nonterminals and rules for making the necessary transitions, e.g.:

Original grammar:
  PRODUCTION               COST
  reg → Ld + Const Const   2

Linear-form grammar:
  PRODUCTION               COST
  reg → Ld n1              2
  n1  → + n2 n2            0
  n2  → Const              0

The advantage of this kind of grammar is that the pattern matching problem is reduced to simply comparing the operator at the root of the tree pattern with the operator of a node in the expression tree. This also means that the productions appearing in the rules become more uniform, which greatly simplifies the task of generating the states.

A simpler work-queue approach for state table generation

Another state generating algorithm, similar to that of Balachandran et al., was proposed by Proebsting [207, 208]. This was also implemented by Fraser et al. [105] in a renowned code generation system called Burg,7 which since its publication in 1992 has sparked a naming convention

7The keen reader will notice that Fraser et al. also implemented the DP-based system Iburg which was introduced on page 46. The connection between the two is that Iburg began as a testbench for the grammar specification which was to be used as input to Burg. Fraser et al. later recognized that some of the ideas for the testbench showed some merit, and therefore improved and extended them into a stand-alone generator. However, the authors neglected to say in their papers what these acronyms stand for.

within the compiler community that I have chosen to call the Burger phenomenon.8 Although Balachandran et al. were first, we will continue by studying Proebsting's algorithm as it is better documented. The algorithm is centered around a work queue that contains a backlog of states under consideration. Like Balachandran et al., Proebsting assumes that the instruction set grammar is in linear form. The queue is first initialized with the states that can be generated from all possible leaf nodes. A state is then popped from the queue and used in combination with other already-visited states in an attempt to produce new states. This is done by effectively simulating what would happen if a set of nodes, appearing as children to some operator symbol, are labeled with some combination of states which includes the state that was just popped: if the operator node can be labeled using an already existing state then nothing happens; if not, a new appropriate state is created – making sure that all applicable chain rules have been applied to the state, as this can affect the costs – and appended to the queue. This is checked for every possible combination of states and operator symbols, and the algorithm terminates when the queue becomes empty, which indicates that all states necessary for the instruction set grammar have been generated.
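
A stripped-down version of such a work-queue generator can be sketched as follows. This is a Python illustration of the idea only, with an invented linear-form grammar; Proebsting's actual algorithm and data structures differ.

from itertools import product

# Invented linear-form grammar. Base rules: (lhs, operator, operand goals,
# cost); rank-0 operators are the leaf terminals. Chain rules: (lhs, rhs, cost).
BASE = [
    ("reg",  "Const", (),              1),
    ("addr", "Const", (),              0),
    ("reg",  "Ld",    ("addr",),       1),
    ("reg",  "Plus",  ("reg", "reg"),  1),
]
CHAIN = [("addr", "reg", 0)]
OPERATORS = {"Ld": 1, "Plus": 2}       # non-leaf operators and their ranks

def make_state(raw):
    # Apply chain rules until stable, then normalize so the cheapest cost is 0.
    changed = True
    while changed:
        changed = False
        for lhs, rhs, cost in CHAIN:
            if rhs in raw and cost + raw[rhs][1] < raw.get(lhs, ("", float("inf")))[1]:
                raw[lhs] = (lhs + " <- " + rhs, cost + raw[rhs][1])
                changed = True
    low = min(cost for _, cost in raw.values())
    return tuple(sorted((nt, rule, cost - low) for nt, (rule, cost) in raw.items()))

def state_for(op, child_states):
    raw = {}
    for lhs, rule_op, operands, cost in BASE:
        if rule_op != op or len(operands) != len(child_states):
            continue
        kid = [{nt: c for nt, _, c in cs}.get(goal)
               for cs, goal in zip(child_states, operands)]
        if None in kid:
            continue                       # some operand goal is unreachable
        total = cost + sum(kid)
        if total < raw.get(lhs, ("", float("inf")))[1]:
            raw[lhs] = (lhs + " <- " + op, total)
    return make_state(raw) if raw else None

# Seed the queue with the states of all leaf terminals, then keep combining
# already-seen states under every operator until no new state appears.
seen = {state_for(term, ()) for _, term, operands, _ in BASE if not operands}
queue = list(seen)
while queue:
    popped = queue.pop()
    for op, rank in OPERATORS.items():
        for kids in product(tuple(seen), repeat=rank):
            if popped not in kids:
                continue
            new = state_for(op, kids)
            if new is not None and new not in seen:
                seen.add(new)
                queue.append(new)

print(len(seen), "states generated")       # 3 for this toy grammar
for state in sorted(seen):
    print(state)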

Further improvements

The time required to generate the state tables can be decreased if the number of committed states can be minimized. According to [207] the first attempts to do this were made by Henry [137], whose methods were later improved and generalized by Proebsting [207, 208]. Proebsting developed two methods for reducing the number of generated states: state trimming, which extends and generalizes the ideas of Henry; and a new technique called chain rule trimming. Without going into any details, state trimming increases the likelihood that two created states will be identical by removing the information about nonterminals which can be proven to never take part in a least-cost covering. Chain rule trimming then further minimizes the number of states by attempting to use the same rules whenever possible. This technique was later improved by Kang and Choe [147, 148] who exploited properties of common machine descriptions to decrease the amount of redundant state testing.

More applications

The approach of extending pattern selection with offline cost analysis has been applied in numerous compiler-related systems. Some notable applications that we have not already mentioned include: UNH-Codegen [136], DCG [83], Lburg [132], and Wburg [209]. Burg has also been made available as a Haskell clone called Hburg [240] and was adapted by Boulytchev [32] to assist instruction set selection. Lburg is used in LCC (Little C compiler)

8During the research for this survey I came across the following systems, all with equally creative naming schemes: Burg [105], Cburg [224], Dburg [85], Gburg [103], Iburg [104], Jburg [243], Hburg [240], Lburg [132], Mburg [124, 125], OCamlBurg [246], and Wburg [209].

[132], and was also adopted by Brandner et al. [35] for developing an architecture description language from which the machine instructions can automatically be inferred.

3.6.4 Lazy state table generation

The two main approaches to achieving optimal pattern selection – those that dynamically compute the costs as the input program is compiled, and those that rely on statically computed costs via state tables – both have their respective advantages and drawbacks. The former approaches have the advantage of being able to support dynamic costs, i.e. the cost of a pattern may depend on context rather than remain static, but they are also considerably slower than their purely table-driven counterparts. These latter techniques, however, tend to result in larger instruction selectors due to the use of state tables, which are very time-consuming to generate – for pathological grammars this may even be infeasible – and they only support grammar rules where the costs are fixed.

Combining the best of state tables and DP

In 2006 Ertl et al. [86] developed a method that allows the state tables to be generated lazily and on demand. The intuition is that instead of generating the states for all possible expression trees in advance, one can get away with only generating the states needed for the expression trees that actually appear in the input program. The approach can be outlined as follows: as the instruction selector traverses an expression tree, the states required for covering its subtrees are created using dynamic programming. Once the states have been generated the subtree is labeled and patterns selected using the familiar table-driven techniques. Then, if an identical subtree appears elsewhere – either in the same expression tree or in another expression tree of the input program – the same states can be reused. This allows the cost of state generation to be amortized as the subtree can now be optimally covered faster compared to if it had been processed using a purely DP-based pattern selector. Ertl et al. reported that the overhead of state reuse was minimal compared to purely table-driven implementations, and the time required to first compute the states and then label the expression trees was on par with selecting patterns using ordinary DP-based approaches. Moreover, by generating the states lazily it is possible to handle larger and more complex instruction set grammars which otherwise would require an unmanageable number of states. Ertl et al. also extended this approach to handle dynamic costs by recomputing and storing the states in hash tables whenever the root costs differ. This incurs an additional overhead, the authors reported, but still performed faster when compared to a purely DP-based instruction selector.
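
The caching idea itself can be captured in a few lines. In the sketch below (my own simplification, not Ertl et al.'s implementation) the per-subtree "state" is just a stand-in value, and the cache is keyed on the subtree structure so that identical subtrees are processed only once.

from functools import lru_cache

calls = 0

@lru_cache(maxsize=None)
def state(node):
    # Stand-in for the real state computation (rules, costs, normalization);
    # here the "state" is simply the subtree height to keep the sketch short.
    global calls
    calls += 1
    op, kids = node
    return 1 + max((state(kid) for kid in kids), default=0)

# The subtree Ld(Const) occurs twice; its state is computed only once.
tree = ("Plus", (("Ld", (("Const", ()),)), ("Ld", (("Const", ()),))))
state(tree)
print(calls)   # 3 instead of 5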

3.7 Other techniques for covering trees

So far in this chapter we have discussed the conventional approaches for covering trees: LR parsing, top-down recursion, dynamic programming, and the use of state tables. In this section we will look at other techniques which are also founded on this principle, but address it using more unorthodox methods.

3.7.1 Approaches based on formal frameworks

Homomorphisms and inversion of derivors

In a 1988 paper Giegerich and Schmal [120] propose an algebraic framework intended to allow all aspects of code generation – i.e. not only instruction selection but also instruction scheduling and register allocation – to be formally described in order to simplify machine descriptions and enable formal verification. In brief terms, Giegerich and Schmal reformulated the problem of instruction selection into a "problem of a hierarchic derivor", which essentially entails the specification and implementation of a mechanism

γ : T(Q) → T(Z)

where T(Q) and T(Z) denote the term algebras for expressing programs in an intermediate language and a target machine language, respectively. Hence γ can be viewed as the resultant instruction selector. However, machine descriptions typically contain rules where each machine instruction in Z is expressed in terms of Q. We therefore view the machine specification as a homomorphism

δ : T(Z) → T(Q)

and the task of an instruction selector generator is thus to derive γ by inverting δ. Usually this is achieved by resorting to pattern matching. For optimal instruction selection the generator must also interleave the construction of the inverse of δ (i.e. δ−1) with a choice function ξ whenever some q ∈ T(Q) has several z ∈ T(Z) such that δ(z) = q. Conceptually, this gives us the following functionality:

T(Q) → 2^T(Z) → T(Z)

where the first arrow is δ−1 and the second is ξ. In the same paper Giegerich and Schmal also demonstrate how some other approaches, such as tree parsing, can be expressed using this framework. A similar approach, based on rewriting techniques, was later proposed by Despland et al. [66, 67] in an implementation called Pagode [41]. Readers interested in more details are advised to consult the referenced papers.

Equational logic

In 1991 Hatcher [135] developed an approach similar to that of Pelegrí-Llopart and Graham, which relies on equational logic [199] instead of BURS theory. The two are closely related in that both apply a set of predefined rules – i.e. those derived from the machine instructions in combination with axiomatic transformations – to rewrite an input program into a single-goal term. However, an equational specification has the advantage that all rules are based on a set of built-in operations, each of which has a cost and implicit semantics expressed as code emission. The cost of a rule is then equal to the sum of the costs of all built-in operations applied in the rule, thus removing the need to set these manually. In addition, the built-in operations are not predefined but are instead given as part of the equational specification, which allows for a very general mechanism for describing the target machines. Experimental results from selected problems, with an implementation called UCG – the paper does not say what this acronym stands for – indicated that assembly code of comparable quality could be generated in less time compared to contemporary techniques.

3.7.2 Tree automata and series transducers

In the same manner that the Glanville-Graham approach adopts techniques derived from syntax parsing, Ferdinand et al. [94] recognized that the mature theories of finite tree automata [117] can be used to address instruction selection. In a 1994 paper Ferdinand et al. describe how the problems of pattern matching and pattern selection – with or without optimal solutions – can be solved using tree automata, and also present algorithms for generating them as well as an automaton which incorporates both. An experimental implementation demonstrated the feasibility of this approach, but the results were not compared to those of other techniques. A similar design was later proposed by Borchardt [31] who made use of tree series transducers [81] instead of tree automata.

3.7.3 Rewriting strategies

In 2002 Bravenboer and Visser presented a design where rule-based program transformation systems [252] are adapted to instruction selection. Through a system called Stratego [253], a machine description can be augmented to contain the pattern selector itself, allowing it to be tailored to the specific instruction set of the target machine. Bravenboer and Visser refer to this as providing a rewriting strategy, and their system allows the modeling of several strategies such as exhaustive search, maximum munch, and dynamic programming. Purely table-driven techniques, however, are apparently not supported, thus excluding the application of offline cost analysis. Bravenboer and Visser argue that this setup allows several pattern selection techniques to be combined, but their paper does not contain an example where this would be beneficial.

3.7.4 Genetic algorithms

Shu et al. employed the theory of genetic algorithms (GA) [122, 219] – which attempt to emulate natural evolution – to solve the pattern selection problem. Described in a 1996 paper, the idea is to formulate a solution as a DNA string – called a chromosome – and then split, merge, and mutate such strings in order to hopefully end up with better solutions. For a given expression tree – whose matchsets have been found using an O(nm) pattern matcher – Shu et al. formulated each chromosome as a binary bit string where a 1 indicates the selection of a particular pattern. Likewise, a 0 indicates that the pattern is not used in the tree covering. The length of a chromosome is therefore equal to the sum of the number of patterns appearing in all matchsets. The objective is then to find the chromosome which maximizes a fitness function f, which Shu et al. defined as

f(c) = 1 / (k ∗ pc + nc)

where k is a tweakable constant greater than 1, pc is the number of selected patterns in the chromosome c, and nc is the number of nodes in c which are covered by more than one pattern. First a fixed number of chromosomes is randomly generated and evaluated. The best ones are kept and subjected to standard GA operations, such as fitness-proportionate reproduction, single-point crossover, and one-bit mutations, in order to produce new chromosomes. The process then repeats until some termination criterion is met. When applied to medium-sized

expression trees (at most 50 nodes), the authors claimed to be able to find optimal tree coverings in "reasonable" time.
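
For completeness, the encoding and fitness function can be sketched as follows. This is a Python illustration with invented helpers; the constant k, the example chromosome, and the overlap count are made up, since the paper's exact setup is not reproduced here.

import random

def fitness(chromosome, overlap_count, k=2.0):
    # f(c) = 1 / (k * p_c + n_c): p_c selected patterns, n_c overlapped nodes.
    p_c = sum(chromosome)
    n_c = overlap_count(chromosome)
    return 1.0 / (k * p_c + n_c)

def crossover(a, b):
    point = random.randrange(1, len(a))      # single-point crossover
    return a[:point] + b[point:]

def mutate(c):
    i = random.randrange(len(c))             # one-bit mutation
    return c[:i] + [1 - c[i]] + c[i + 1:]

c = [1, 0, 1, 1, 0, 0]
print(fitness(c, overlap_count=lambda _: 1))   # 1 / (2*3 + 1) ≈ 0.143
print(mutate(crossover(c, [0] * 6)))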

3.7.5 Trellis diagrams

The last approach that we will examine in this chapter is a rather unusual one by Wess [259, 260]. Specifically targeting digital signal processors, Wess's design integrates instruction selection with register allocation through the use of trellis diagrams. A trellis diagram is a graph where each node consists of an optimal value array (OVA). An element in the OVA represents either that the data is stored in memory (m) or in a particular register (rx), and its value indicates the lowest accumulated cost from the leaves to this node. This is computed similarly to the dynamic programming technique. If the data is stored in a register, the OVA element also indicates which other registers are available for that operation.

For example, a target machine with two registers – r1 and r2 – yields the following OVA:

  i     0          1      2          3      4
  TLi   m          r1     r1         r2     r2
  ALi   {r1, r2}   {r1}   {r1, r2}   {r2}   {r1, r2}

where TLi represents the target location of the data, and ALi represents the available locations. We create the trellis diagrams using the following scheme. For each node in the expression tree a new node representing an OVA is added to the trellis diagram. Two trellis diagram nodes are added for nodes that appear as leaves in the expression tree in order to handle situations where the input values first need to be transferred to another location before being used (this is needed if the input value resides in memory). Let us now denote e(i, n) as the OVA element i at a node n. For unary operation nodes, an edge between e(i, n) and e(j, m) is added to the trellis diagram if there exists an instruction that takes the value stored in the location indicated by j in m, executes the operation of n, and stores the result in the location indicated by i. Similarly, for a binary operation node o an edge pair is drawn from e(i, n) and e(j, m) to e(k, o) if there exists such a corresponding instruction which stores the result in k. This can be generalized to n-ary operations, and a full example is given in Figure 3.18. The edges in the trellis diagram thus correspond to the possible combinations of machine instructions and register allocations that implement a particular operation in the expression tree, and a path from every leaf node in the trellis diagram to its root represents a selection of such combinations which collectively implement the entire expression tree. By keeping track of the costs, we can get the optimal instruction sequence by selecting the path which ends at the OVA element of the trellis diagram root with the lowest cost. The strength of Wess's approach is that target machines with asymmetric register files (i.e. where different instructions are needed for accessing different registers) are easily handled as instruction selection and register allocation are done simultaneously. However, the number of nodes in the trellis diagram is exponential in the number of registers. This problem was mitigated by Fröhlich et al. [108] who augmented the algorithm to build the trellis diagram in a lazy fashion. However, these approaches require a one-to-one mapping between expression nodes and machine instructions in order to be effective.9

Figure 3.18: A trellis diagram applied to the expression −(a ∗ b) − c for the two-register target machine described in the text. The variables a, b, and c are assumed to be initially stored in memory, which is handled by inserting additional OVAs into the trellis diagram to allow selection of the necessary data transfer instructions. (a) Expression tree. (b) Instruction set: 1: r1 ∗ r2 → r1; 2: r1 ∗ m → r1; 3: r1 − m → r1; 4: −r1 → r1; 5: m → r1; 6: m → r2; 7: r1 → m; 8: r2 → m; 9: r1 → r2; 10: r2 → r1 (note that these are not grammar productions, as the symbols refer to explicit registers and not register classes; all instructions are assumed to have equal cost). (c) Trellis diagram; gray edges represent available paths, and black edges indicate the (selected) optimal path.
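To give a rough feel for the cost computation that the OVAs embody, the following sketch propagates, bottom-up over the expression −(a ∗ b) − c, the cheapest accumulated cost of having the result in each storage location of a hypothetical two-register machine. The instruction table and costs are invented for the example, and the sketch deliberately omits the available-location (AL) bookkeeping and the register-allocation interplay that make Wess's approach interesting.

    # Rough sketch of bottom-up cost propagation over a trellis-like structure.
    # Locations, instructions, and costs are invented for illustration; the point
    # is only that every node keeps one cost per storage location (as in an OVA)
    # and that edges correspond to instructions moving or combining values.

    LOCATIONS = ["m", "r1", "r2"]

    # (operation, source locations, destination location, cost)
    INSTRUCTIONS = [
        ("load",  ["m"],        "r1", 1),   # m -> r1
        ("load",  ["m"],        "r2", 1),   # m -> r2
        ("store", ["r1"],       "m",  1),   # r1 -> m
        ("neg",   ["r1"],       "r1", 1),   # -r1 -> r1
        ("mul",   ["r1", "r2"], "r1", 1),   # r1 * r2 -> r1
        ("sub",   ["r1", "m"],  "r1", 1),   # r1 - m -> r1
    ]
    INF = float("inf")

    def leaf_costs(initial_location):
        """A leaf value starts in one location; reaching it there costs 0."""
        costs = {loc: INF for loc in LOCATIONS}
        costs[initial_location] = 0
        # Allow pure data transfers ("load"/"store") from the initial location.
        for op, srcs, dst, c in INSTRUCTIONS:
            if op in ("load", "store") and srcs == [initial_location]:
                costs[dst] = min(costs[dst], c)
        return costs

    def combine(op, operand_costs):
        """Cheapest cost of computing `op` into each location, given operand costs."""
        costs = {loc: INF for loc in LOCATIONS}
        for name, srcs, dst, c in INSTRUCTIONS:
            if name != op or len(srcs) != len(operand_costs):
                continue
            total = c + sum(oc[s] for oc, s in zip(operand_costs, srcs))
            costs[dst] = min(costs[dst], total)
        return costs

    if __name__ == "__main__":
        a, b, c = leaf_costs("m"), leaf_costs("m"), leaf_costs("m")
        ab = combine("mul", [a, b])          # a * b
        neg_ab = combine("neg", [ab])        # -(a * b)
        root = combine("sub", [neg_ab, c])   # -(a * b) - c
        print("cheapest cost per location at the root:", root)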

3.8 Summary

In this chapter we have looked at numerous approaches which are based on the principle of tree covering. In contrast to macro expansion, these approaches allow the use of more complex patterns, which in turn yields more efficient instruction selectors; by applying dynamic programming, optimal code can be generated in linear time. Several techniques also incorporate offline cost analysis into the table generator, which further speeds up code generation. In other words, these approaches are very fast and efficient while supporting code generation for a wide array of target machines. Consequently, tree covering has become the best-known – if perhaps no longer the most applied – principle of instruction selection. Restricting oneself to trees, however, comes with several inherent disadvantages.

9This, in combination with how instructions are selected, makes one wonder whether these approaches actually conform to the principles of tree and DAG covering; I certainly struggled with deciding how to categorize them. In the end, though, I opted against putting them in a separate category as that would have become a very short one indeed.

3.8.1 Restrictions that come with trees

The first disadvantage of trees has to do with expression modeling: due to the definition of trees, common subexpressions cannot be properly modeled in an expression tree. For example, the code

    x = a + b;
    y = x + x;

cannot be modeled directly without applying one of the following work-arounds:

1. Repeating the shared operations, which in Polish notation results in

       = y + + a b + a b

2. Splitting the expression into a forest, resulting in

       = x + a b
       = y + x x

The first technique leads to additional machine instructions in the assembly code, while the second approach hinders the use of more complex machine instructions; in both cases code quality is compromised. Since trees only allow a single root node, multi-output machine instructions cannot be represented as tree patterns, since such instructions require multiple root nodes. Consequently, such features – which appear in many ISAs, as this also includes instructions that set condition flags – cannot be exploited during instruction selection. Even disjoint-output instructions, whose individual operations can each be modeled as trees, cannot be selected, as the instruction selector can only consider a single tree pattern at a time. Another disadvantage is that expression trees typically cannot model control flow. For example, a for statement requires a loop edge between basic blocks, which violates the definition of trees. For this reason the scope of tree-based instruction selectors is limited: they can only select instructions for a single expression tree at a time, and code emission for control flow must be handled by a separate component. This in turn excludes matching and selection of internal-loop instructions, whose behavior incorporates some kind of control flow. To summarize, although the principle of tree covering greatly improves code quality over the principle of pure macro expansion (i.e. ignoring peephole optimization), the inherent restrictions of trees prevent full exploitation of the machine instructions provided by most target machines. In the next chapter we will look at a more general principle that addresses some of these issues.

4

DAG Covering

As we saw in the previous chapter the principle of tree covering comes with several disadvantages: common subexpressions cannot be properly expressed, and many machine characteristics such as multi-output instructions cannot be modeled. Since these restrictions are due to relying solely on trees, we can lift them by extending tree covering to DAG covering.

4.1 The principle

If we remove the restriction that every pair of nodes in a tree must be connected by exactly one path, we get a new graph shape called a directed acyclic graph, commonly known as a DAG. As nodes are now allowed to have multiple outgoing edges, intermediate values of an expression can be shared and reused within the same graph, and the graph may contain multiple root nodes. To distinguish expressions modeled as trees from those modeled as DAGs, we refer to the latter as expression DAGs. Once the expression DAG has been formed, instruction selection proceeds by applying the same concepts of pattern matching and pattern selection that we learned from the previous chapter. As patterns are allowed multiple root nodes, we can model and exploit a larger set of machine instructions, such as multi-output instructions like divmod, which computes the quotient and remainder simultaneously.

4.2 Optimal pattern selection on DAGs is NP-complete

Unfortunately, the gain in generality and modeling capability that DAGs give us comes at the cost of a substantial increase in complexity. While selecting an optimal set of patterns over trees can be done in linear time, doing the same for DAGs is an NP-complete problem. Bruno and Sethi [38] and Aho et al. [4] proved this already in 1976, although both were mostly concerned with the optimality of instruction scheduling and register allocation. In 1995 Proebsting [206] gave a very concise proof for optimal instruction selection, which was later reformulated by Koes and Goldstein [161] in 2008. We will paraphrase the proof of Koes and Goldstein in this report. Note that the NP-completeness does not necessarily involve the task of pattern matching – as we will see, this can still be done in polynomial time if the patterns consist of trees.

4.2.1 The proof

The idea is to reduce the SAT (Boolean satisfiability) problem to an optimal (i.e. least-cost) DAG covering problem. The SAT problem is to decide whether a Boolean formula in conjunctive normal form (CNF) can be satisfied. A CNF expression is a formula of the form

    (x11 ∨ x12 ∨ ...) ∧ (x21 ∨ x22 ∨ ...) ∧ ...

A variable x can also be negated, written ¬x. Since this problem has been proven to be NP-complete [62], any problem to which SAT can be reduced in polynomial time must also be NP-complete.

Modeling SAT as a covering problem

First we transform a SAT problem instance S into a DAG. The intuition is that if we can cover the DAG with unit-cost patterns such that the total cost is equal to the number of nodes (assuming that each variable and operator incurs a new node), then there exists a truth assignment to the variables for which the formula evaluates to T. For this purpose, the nodes in the DAG can be of the types {∨, ∧, ¬, v, ◻, stop}. We define type(n) as the type of a node n and refer to nodes of type ◻ and type stop as box nodes and stop nodes, respectively. We also define children(n) as the set of child nodes of n. For every Boolean variable x ∈ S we create two nodes n1 and n2 such that type(n1) = v and type(n2) = ◻, and a directed edge n1 → n2. We do the same for every binary Boolean operator op ∈ S by creating two nodes n1′ and n2′ such that type(n1′) = op and type(n2′) = ◻, along with an edge n1′ → n2′. We connect the operation with its input operands x and y via the box nodes of the operands, i.e. with two edges nx → n1′ and ny → n1′ where type(nx) = type(ny) = ◻. For the unary operation ¬ we obviously only need one such edge, and since Boolean or and and are commutative it does not matter in which order the edges are arranged with respect to the operator node. Consequently, only box nodes will have more than one outgoing edge. By simply traversing the Boolean formula we can construct the corresponding DAG in linear time. An example of a SAT problem translated to a DAG covering problem is given in (b) of Figure 4.1.
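The construction can also be sketched programmatically. The representation of the formula (nested tuples) and the node naming below are illustrative assumptions; note in particular that each variable gets a single node pair shared between all of its occurrences – which is exactly what makes the result a DAG rather than a tree – and that a stop node consuming the root's box node is added to force the whole formula to evaluate to T (this last step is only implicit in the description above).

    # Sketch of the SAT-to-DAG construction described above. The formula
    # representation (nested tuples) and node naming are illustrative assumptions.

    def build_dag(formula):
        """Build (nodes, edges) where nodes maps name -> type. `formula` is a
        nested tuple expression using 'and', 'or', 'not' over variable names."""
        nodes, edges = {}, []
        counter = [0]
        var_box = {}                  # one shared box node per Boolean variable

        def fresh(kind):
            counter[0] += 1
            name = f"{kind}{counter[0]}"
            nodes[name] = kind
            return name

        def visit(expr):
            """Return the box node holding the value of `expr`."""
            if isinstance(expr, str):                    # a Boolean variable
                if expr not in var_box:
                    v, box = fresh("var"), fresh("box")
                    edges.append((v, box))
                    var_box[expr] = box
                return var_box[expr]
            op, *operands = expr                         # ('and' | 'or' | 'not', ...)
            op_node, box = fresh(op), fresh("box")
            edges.append((op_node, box))
            for operand in operands:
                edges.append((visit(operand), op_node))  # operand box -> operator
            return box

        root_box = visit(formula)
        stop = fresh("stop")
        edges.append((root_box, stop))                   # force the result to be T
        return nodes, edges

    if __name__ == "__main__":
        # (x1 or x2) and (not x2), the example from Figure 4.1
        nodes, edges = build_dag(("and", ("or", "x1", "x2"), ("not", "x2")))
        print(len(nodes), "nodes,", len(edges), "edges")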

Boolean operations as patterns

We then construct a set of tree patterns PSAT which will allow us to deduce how the variables in S should be set in order to satisfy the expression (see (a) of Figure 4.1). The patterns are formed such that a pattern has a box node as a leaf if that value is assumed to be set to T (true). Moreover, a pattern has a box node as its root if the result of the operation evaluates to F (false). One way of looking at it is that if an operator in a pattern consumes a box node then that value must be set to T, and if the operator produces a box node then the result must evaluate to F. To force the entire expression to evaluate to T, the only pattern containing a stop node also consumes a box node. In addition to the node types that can appear in the expression DAG, pattern nodes may also use an additional node type, ●, which we will refer to as anchor nodes.

Figure 4.1: Reducing SAT to DAG covering (from [161]). (a) SAT patterns; all patterns have the same unit cost, and for brevity the patterns for the ∧ operation are omitted (but can be easily inferred from the ∨ patterns). (b) Example of a SAT problem, (x1 ∨ x2) ∧ (¬x2), represented as a DAG covering problem.

We now say that a tree pattern with root pr matches a node v ∈ V, where V is the set of nodes in the expression DAG D = (V, E), if and only if:

1. type(v) = type(pr),
2. |children(v)| = |children(pr)|, and
3. ∀cv ∈ children(v), cr ∈ children(pr): (type(cr) = ●) ∨ (cr matches cv).

In other words, the structure of the tree pattern must correspond to the structure of the matched subgraph, with the exception of anchor nodes, which may match at any node. We also introduce two new definitions: matchset(v), which is the set of patterns in PSAT that match v; and mp→v(vp), which is the DAG node v ∈ V that corresponds to the pattern node vp in a matched pattern p. Lastly, we say that a DAG D = (V, E) is covered by a mapping function f : V → 2^PSAT from DAG nodes to patterns if and only if ∀v ∈ V:

1. p ∈ f(v) ⇒ p matches v and p ∈ PSAT,
2. in-degree(v) = 0 ⇒ |f(v)| > 0, and
3. ∀p ∈ f(v), ∀vp ∈ p s.t. type(vp) = ● ⇒ |f(mp→v(vp))| > 0.

The first constraint enforces that each selected pattern matches and is an actual pattern. The second constraint enforces matching of the stop node, and the third constraint enforces matching on the rest of the DAG. An optimal cover of D = (V, E) is thus a mapping f that covers D and minimizes

    ∑_{v ∈ V} ∑_{p ∈ f(v)} cost(p)

where cost(p) is the cost of pattern p.

Optimal solution to DAG covering ⇒ solution to SAT

We now postulate that if the optimal cover has a total cost equal to the number of non-box nodes in the DAG, then the corresponding SAT problem is satisfiable. Since all patterns in

PSAT cover exactly one non-box node and have equal unit cost, if every non-box node in the DAG is covered by exactly one pattern then we can easily infer the truth assignment to the Boolean variables simply by looking at which patterns were selected to cover the variable nodes. We have thereby shown that an instance of the SAT problem can be solved by reducing it, in polynomial time, to an instance of the optimal DAG covering problem. Hence optimal DAG covering – and therefore also optimal instruction selection based on DAG covering – is NP-complete.

4.3 Covering DAGs with tree patterns

Aho et al. [4] were among the first in the literature to provide code generation approaches that operate on DAGs. In their paper, published in 1976, Aho et al. propose some simple, greedy heuristics as well as an optimal code generator that produces code for a commutative one-register machine. However, their methods assume a one-to-one mapping between expression nodes and machine instructions and thus effectively ignore the problem of instruction selection. Since instruction selection on DAGs with optimal pattern selection is computationally difficult, the first proper approaches based on this principle use DAGs to model the expressions of the input program but keep the patterns as trees. By doing so the developers could adopt already-known, linear-time methods from tree covering and extend them to DAG covering. The expression DAGs are then transformed into trees by resolving common subexpressions according to the means we discussed in the summary of the previous chapter. We refer to this as undagging.

4.3.1 Undagging the expression DAG into a tree

An expression DAG can be undagged into a tree in two ways. The first method is to split the edges from the shared nodes (i.e. where reuse occurs), resulting in a set of disconnected expression trees which can then be covered individually using common tree covering techniques. An implicit connection is maintained between the trees by forcing the result of the expression tree rooted at the previously shared node to be written to a specific location (typically main memory). This is then used by the other expression trees, which may need to be augmented to reflect this new source of input. An example of an implementation that uses this approach is Dagon, a technology binder developed by Keutzer [156] which maps technology-independent descriptions onto circuits. The second method is to duplicate the nodes where reuse occurs, which potentially also results in many trees. Both concepts are illustrated in Figure 4.2. While undagging allows instruction selection to be done on DAGs with tree-based covering techniques, it does so at the cost of reduced efficiency: too aggressive edge splitting inhibits the application of large patterns as it produces many small trees, and too aggressive node duplication leads to inefficient assembly code as many operations are needlessly re-executed in the final program. Moreover, the intermediate results of a split DAG must be forcibly stored, which can be troublesome for heterogeneous memory-register architectures. This last problem was examined by Araujo et al. [17].

Figure 4.2: Undagging an expression DAG with a common subexpression. (a) DAG. (b) Result of edge splitting, where the shared result is stored in a temporary t. (c) Result of node duplication.
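A minimal sketch of the two undagging strategies follows, assuming a small tuple-based representation of expression nodes in which shared subexpressions are shared objects; the representation is an assumption of this sketch, not of the cited approaches.

    # Minimal sketch of the two undagging strategies. Expressions are tuples
    # (op, left, right) or leaf strings; shared subexpressions are shared objects.

    def duplicate(node):
        """Undag by node duplication: every shared subtree is simply copied."""
        if isinstance(node, str):
            return node
        op, left, right = node
        return (op, duplicate(left), duplicate(right))

    def count_uses(node, uses=None):
        """Count, for each operation node, how many parent edges reach it."""
        if uses is None:
            uses = {}
        if isinstance(node, str):
            return uses
        uses[id(node)] = uses.get(id(node), 0) + 1
        if uses[id(node)] == 1:          # recurse only the first time we see the node
            _, left, right = node
            count_uses(left, uses)
            count_uses(right, uses)
        return uses

    def split(root):
        """Undag by edge splitting: every node with more than one parent is cut
        out into its own tree whose result goes to a fresh temporary; returns a
        forest of (name, tree) pairs with the root tree last."""
        uses = count_uses(root)
        forest, temps, counter = [], {}, [0]

        def rebuild(node):
            if isinstance(node, str):
                return node
            if id(node) in temps:
                return temps[id(node)]
            op, left, right = node
            tree = (op, rebuild(left), rebuild(right))
            if uses.get(id(node), 0) > 1:          # shared node: cut here
                counter[0] += 1
                name = f"t{counter[0]}"
                temps[id(node)] = name
                forest.append((name, tree))
                return name
            return tree

        forest.append(("result", rebuild(root)))
        return forest

    if __name__ == "__main__":
        ab = ("+", "a", "b")                 # shared subexpression a + b
        dag = ("+", ("*", ab, "x"), ("*", ab, "y"))
        print(duplicate(dag))
        print(split(dag))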

Balancing splitting and duplication

Fauth et al. [92, 191] proposed in 1994 an approach that tries to mitigate the deficiencies of undagging by balancing node duplication against edge splitting. Implemented in CBC (Common Bus Compiler), the instruction selector applies a heuristic algorithm that favors duplication until it is deemed too costly; when this happens the algorithm resorts to splitting. The decision whether to duplicate or split is made by comparing the costs of the two solutions and selecting the cheaper one. The cost is calculated as a weighted sum of the number of nodes in the expression DAG and the expected number of nodes executed along each execution path (rough estimates of code size and execution time, respectively). Once this is done each resulting expression tree is covered by an improved version of Iburg (see page 46) with extended match condition support. However, the experimental data is too limited to judge how efficient this technique is compared to transforming the expression DAGs into trees using only one method and not the other.

Selecting instructions with the goal of assisting register allocation

In 2001 Sarkar et al. [222] developed a greedy approach to instruction selection with the aim of reducing register pressure – i.e. the number of registers demanded by the assembly code – in order to facilitate the subsequent instruction scheduling and register allocation. Hence, instead of the usual number of execution cycles, the cost of each machine instruction reflects the amount of register pressure that the instruction incurs if selected (the paper does not go into any details on how these costs are formulated). Instruction selection is done by first undagging the expression DAG – which has been augmented into a graph with additional data dependencies – into a forest using splitting, and then applying conventional tree covering methods on each expression tree. A heuristic is applied to decide where to perform these cuts. Once patterns have been selected, the nodes which are covered by patterns consisting of more than one node are reduced to super nodes. The algorithm then checks whether the resulting graph contains any cycles; if it does then the covering is illegal as there exist cyclic data dependencies. The cuts are then revised and the process is repeated until a valid covering is achieved.

Sarkar et al. implemented their register-sensitive approach in Jalapeño – a register-based Java virtual machine developed by IBM – and, for a small set of problems, achieved a 10% performance increase due to the use of fewer spill instructions compared to the default instruction selector.

4.3.2 Extending dynamic programming to DAGs

In a 1999 paper Ertl [85] presents an approach where conventional tree covering methods are applied to DAGs without having to first break them up into trees. The idea is to first make a bottom-up pass over the DAG as if it were a tree, and compute the costs using the conventional dynamic programming method of pattern selection which we discussed in the previous chapter in Section 3.6.2. Each node is thus labeled with the same costs as if the DAG had first been reduced to a tree through node duplication. However, Ertl recognized that if several patterns reduce the same node to the same nonterminal, then the reduction to that nonterminal can be shared between rules where this symbol appears on the right-hand side of the production. This is then exploited during code emission by emitting assembly code only once for such nonterminals, which decreases code size while improving performance as the number of duplicated operations is reduced. An example illustrating this situation is given in Figure 4.3. There we see that the addition operation will be implemented twice as it is covered by two separate patterns that each reduces the subtree to a different nonterminal. The Reg node, on the other hand, is reduced twice to the same nonterminal (i.e. reg), and can thus be shared between the rules where this symbol appears in the patterns. With appropriate data structures this can be done in linear time. For certain instruction set grammars Ertl's technique guarantees optimal pattern selection, and Ertl devised a checker called Dburg which detects when a grammar does not belong to this category. The basic idea behind Dburg is to check whether every locally optimal decision is also globally optimal, by performing inductive proofs over the set of all possible input DAGs. To do this efficiently Ertl implemented Dburg using ideas from Burg (hence the name).

Figure 4.3: A tree covering of a DAG where the optimal patterns have been selected. The two shades indicate the relation between rules, and the text along the edges indicates the nonterminal to which each pattern is reduced. Note that the Reg node is covered by two patterns (as indicated by the double-dashed pattern) which both reduce to the same nonterminal, which can thus be shared.
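The following is a rough sketch of the labeling idea just described, using a tiny invented instruction set grammar; it does not reproduce Ertl's implementation, but it shows the key point that the cheapest cost of reducing a shared node to each nonterminal is computed only once and then reused by every parent.

    # Sketch of dynamic programming on a DAG: nodes are labeled with the cheapest
    # cost of reducing them to each nonterminal, and the label of a shared node is
    # computed (and would be emitted) only once. The tiny grammar is an
    # illustrative assumption.

    # Rules: (result nonterminal, operator, operand nonterminals, cost).
    RULES = [
        ("reg",  "Int", [],             1),   # load immediate
        ("addr", "Int", [],             0),   # an immediate can act as an address
        ("reg",  "Reg", [],             0),
        ("addr", "Reg", [],             0),
        ("reg",  "Ld",  ["addr"],       1),   # load from memory
        ("reg",  "+",   ["reg", "reg"], 1),
        ("addr", "+",   ["reg", "reg"], 0),   # address arithmetic is free
        ("reg",  "*",   ["reg", "reg"], 1),
    ]
    INF = float("inf")

    def label(node, memo=None):
        """Return {nonterminal: cheapest cost} for `node`, sharing the result
        between all parents of a shared node via the memo table."""
        if memo is None:
            memo = {}
        if id(node) in memo:
            return memo[id(node)]
        op, *kids = node
        kid_costs = [label(k, memo) for k in kids]
        costs = {}
        for nt, rule_op, operand_nts, cost in RULES:
            if rule_op != op or len(operand_nts) != len(kids):
                continue
            total = cost + sum(kc.get(ont, INF)
                               for kc, ont in zip(kid_costs, operand_nts))
            if total < costs.get(nt, INF):
                costs[nt] = total
        memo[id(node)] = costs
        return costs

    if __name__ == "__main__":
        shared = ("+", ("Int",), ("Reg",))     # shared subexpression
        root = ("*", ("Ld", shared), shared)   # one use wants addr, the other reg
        print(label(root))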

Further developments

Koes and Goldstein [161] recently extended Ertl's ideas by providing a heuristic, similar to that of Fauth et al., which splits the expression DAG at points where duplication is estimated to have a detrimental effect on code quality. Like Ertl's, Koes and Goldstein's algorithm first selects optimal patterns by performing a tree-like bottom-up pass, ignoring the fact that the input is a DAG. Then, at points where multiple patterns overlap, two costs are calculated: overlap-cost, which is an estimate of the cost of letting the patterns overlap and thus incur duplication of operations in the final assembly code; and cse-cost, which is an estimate of the cost of instead splitting the edges at such points. If cse-cost is lower than overlap-cost then the node where overlapping occurs is marked as fixed. Once all such nodes have been processed, a second bottom-up DP pass is performed on the expression DAG, but this time no pattern is allowed to span over fixed nodes (i.e. a fixed node can only be matched at the root of a pattern). Lastly, a top-down pass emits the assembly code. Koes and Goldstein also compared their own implementation, called Noltis, against an integer programming (IP) implementation, and found that Noltis achieved optimal pattern selection in 99.7% of the test cases.

4.3.3 Matching tree patterns directly on DAGs

Unlike the approaches above – which first convert the expression DAGs into trees before covering – there also exist techniques which match the tree patterns directly on the DAGs.

Using Weingart-like pattern trees

In a highly influential 1994 paper Liem et al. [179, 201, 205] presented a design which they implemented in CodeSyn, a well-known code synthesis system which in turn is part of an embedded systems development environment called FlexWare. The approach uses the same pattern matching technique as Weingart [256] – which we discussed in the previous chapter – by combining all available tree patterns into a single pattern tree. Using an O(nm) pattern matcher, all matchsets are found by traversing the pattern tree in tandem with the expression DAG. Pattern selection is then done by the dynamic programming technique which we know from before (although how they went about adapting it to DAGs is not stated). However, due to the use of a single pattern tree, it is doubtful that the approach can be extended to handle DAG patterns.

LLVM

Another direct trees-on-DAGs approach is applied in the well-known LLVM [167] compiler infrastructure.1 According to a blog entry by Bendersky [30] – which at the time of writing provided the only documentation apart from the LLVM code itself – the instruction selector is basically a greedy DAG-to-DAG rewriter, where machine-independent DAG representations of basic blocks are rewritten into machine-dependent DAG representations. Operations for branches and jumps are included in the expression DAG and pattern-matched just like any

1It is also very dissimilar to GCC which applies macro expansion combined with peephole optimization (see Section 2.3.2).

other node. The patterns, which are limited to trees, are expressed in a machine description format that allows common features to be factored out into abstract instructions. Using a tool called TableGen, the machine description is then expanded into complete tree patterns which are processed by a matcher generator. The matcher generator first performs a lexicographical sort on the patterns: first by decreasing complexity, which is the sum of the pattern size and a constant (this can be tweaked to give higher priority to particular machine instructions); then by increasing cost; and lastly by increasing size of the output pattern. Once sorted, each pattern is converted into a recursive matcher, which is essentially a small program that checks whether the pattern matches at a given node in the expression DAG. The generated matchers are then compiled into a byte-code format and assembled into a matcher table which is consulted during instruction selection. The table is arranged such that the patterns are checked in the order of the lexicographical sort. When a match is found the pattern is greedily selected and the matched subgraph is replaced with the output (usually a single node) of the matched pattern. Although powerful and in extensive use, LLVM's instruction selector has several drawbacks. The main disadvantage is that any pattern that cannot be handled by TableGen has to be handled manually through custom C functions. Since patterns are restricted to tree shapes, this includes all multiple-output patterns. In addition, the greedy scheme compromises code quality. We see that a common and recurring disadvantage of the discussed approaches is that they are all limited to handling only patterns shaped as trees. We therefore proceed with techniques where the patterns, too, are allowed to take the form of DAGs.

4.3.4 Selecting patterns with integer programming

In an attempt to achieve an integrated approach to code generation – i.e. where instruction selection, instruction scheduling, and register allocation are done simultaneously – Wilson et al. [263] formulated these problems as an integer programming (IP) model. Integer programming [50, 226] is a method for solving combinatorial optimization problems, and with their 1994 paper Wilson et al. appear to have been the first to use IP for code generation. Valid pattern selection can be expressed as the following linear inequality

    ∀ni ∈ G: ∑_{p ∈ Pi} zp ≤ 1

which reads: for every node ni in the expression DAG G, at most one pattern p from the matchset Pi of ni may be selected (zp is a Boolean variable).2 Similar linear inequalities can be formulated for expressing the constraints of instruction scheduling and register allocation – which were also incorporated into the IP model by Wilson et al. – but these are out of scope for this report. In addition, the model can be extended with whatever additional constraints are demanded by the target machine simply by adding more linear inequalities, making it suitable for code generation for target machines with irregular architectures as well as for supporting interdependent machine instructions.

2The more common constraint is that exactly one pattern must be selected, but Wilson et al. allowed nodes to become inactive, in which case they need not be covered.

Solving this monolithic IP model typically requires considerably more time compared to conventional instruction selection techniques, but the ability to extend the model with additional constraints is a much-valued feature for handling complicated target machines, which anyway cannot be properly modeled using traditional, heuristic approaches. In addition, since the IP model incorporates all aspects of code generation, Wilson et al. reported that, for a set of test cases, the generated assembly code was of comparable quality to hand-optimized assembly code, which means that an optimal solution to the IP problem can be regarded as an optimal solution to code generation. Consequently, several later techniques which we will examine in this chapter are founded on this pioneering work.
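To convey the flavor of such a model, the sketch below states the coverage constraint from above over a pair of nodes and, instead of invoking an actual IP solver, simply enumerates all assignments of the Boolean variables zp; the matchsets and costs are invented for illustration.

    # Sketch of the coverage constraint from the IP formulation above, solved by
    # brute-force enumeration of the Boolean variables z_p instead of an IP
    # solver. Matchsets and costs are invented for illustration.
    from itertools import product

    PATTERNS = {                    # pattern -> (covered nodes, cost)
        "p_add":  ({"n_add"}, 1),
        "p_mul":  ({"n_mul"}, 1),
        "p_madd": ({"n_add", "n_mul"}, 1),   # multiply-accumulate
    }
    NODES = {"n_add", "n_mul"}

    def feasible(selection):
        """Every node must be covered by exactly one selected pattern
        (the at-most-one constraint from the text plus full coverage)."""
        for n in NODES:
            covering = [p for p in selection if n in PATTERNS[p][0]]
            if len(covering) != 1:
                return False
        return True

    def solve():
        best, best_cost = None, float("inf")
        names = list(PATTERNS)
        for bits in product([0, 1], repeat=len(names)):   # all assignments of z_p
            selection = {p for p, z in zip(names, bits) if z}
            if feasible(selection):
                cost = sum(PATTERNS[p][1] for p in selection)
                if cost < best_cost:
                    best, best_cost = selection, cost
        return best, best_cost

    if __name__ == "__main__":
        print(solve())   # expected: ({'p_madd'}, 1)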

4.3.5 Modeling entire processors as DAGs

In an approach for compiling input programs into microcode,3 Marwedel [185] developed a retargetable system called MSS (Mimola Software System). Mimola [272], which stands for “Machine Independent MicrOprogramming LAnguage”, is a description language for modeling the entire data path of a processor, instead of just the instruction set as we have commonly seen. This is commonly used for DSPs, where the processor is small but highly irregular. MSS consists of several tools, but we will concentrate on the MSSQ compiler as its purpose is most aligned with instruction selection. MSSQ was developed by Leupers and Marwedel [172] as a faster version of MSSC [196], which in turn is an extension of the tree-based MSSV [186].

3Microcode is essentially the hardware language which CPUs use internally in order to execute the machine instructions of the compiled input programs. For example, the microcode controls how the registers and program counter should be updated.

Figure 4.4: The CO graph of a simple processor, containing an ALU, two data registers, a program counter, and a control store.

From the Mimola specification – which contains the processor registers as well as all the operations that can be performed on these registers within a single cycle – a hardware DAG called the connection-operation graph (CO graph) is automatically derived. An example is given in Figure 4.4. A pattern matcher then attempts to find subgraphs within the CO graph that will cover the expression trees of the input program. Because the CO graph contains explicit nodes for every register, a pattern match found on this graph – which is called a version – is also an assignment of program variables (and temporaries) to registers. If a match cannot be found, due to a lack of registers, the expression tree will be rewritten by splitting assignments and inserting temporaries. The process then backtracks and repeats in a recursive fashion until the entire expression tree is covered. A subsequent process then selects a specific version from each matchset and tries to schedule the versions so that they can be compacted into bundles for parallel execution. Although microcode generation is at a lower hardware level than assembly code generation – which is usually what we refer to with instruction selection – we see several similarities between the problems that must be solved in each, which is why it is included in this report (further examples include [27, 165, 182]). In the next chapter we will see another approach which also models the entire processor but applies a more powerful technique.

4.4 Covering DAGs with DAG patterns

Many machine instructions can only be properly modeled as DAGs. One example is a load-and-increment instruction, which loads a value from memory at the location specified by an address register and simultaneously increments the address by 1. Such instructions are useful when iterating over arrays, as both the fetch and the increment can be implemented by a single machine instruction. Conceptually, pattern matching with DAG patterns can be done in two ways: either by splitting the patterns into trees, matching them individually, and then attempting to recombine them to their original DAG form; or by matching them directly; in general this can be done in quadratic time. However, I have been unable to find a single approach which performs direct DAG-on-DAG matching whilst also being limited to only DAGs. The reason is, presumably, that performing DAG matching is equally or near-equally as complex as subgraph isomorphism, and since the latter is more powerful it would be an unnecessary and unreasonable restriction to limit such matchers to DAGs. Indeed, as we will later see in this chapter, several DAG-oriented approaches apply general subgraph isomorphism algorithms for tackling the problem of pattern matching.

4.4.1 Splitting DAG patterns into trees

Leupers and Marwedel [168, 175] developed one of the first instruction selection techniques to handle multiple-output machine instructions. In a 1996 paper, Leupers and Marwedel describe an approach where the DAG patterns of multi-output instructions – Leupers and Marwedel refer to these as complex patterns – are first split by their register transfers (RTs) into multiple tree patterns. Register transfers are akin to Fraser's register transfer lists [102], which were

introduced in Chapter 2 on page 16, and essentially mean that each observable effect gets its own tree pattern. Each individual RT may in turn correspond to one or more machine instructions, but this is not strictly necessary for the approach. First, assuming a prior procedure has been run that undags the expression DAG, each expression tree is optimally covered using the RTs as patterns. The grammar used by Iburg is not written manually but is automatically generated from a Mimola hardware description. Once RTs have been selected, the expression tree is reduced to a tree of super nodes, where each super node represents a set of nodes covered by some RT which have been collapsed into a single node. Since multiple-output and disjoint-output machine instructions implement more than one RT, the goal is now to cover the super node graph using the patterns which are formed when the machine instructions are modeled as RTs. Leupers and Marwedel addressed this problem by applying and extending the IP model of Wilson et al. which we saw earlier. However, because the step of selecting RTs to cover the expression tree is separate from the step which implements them with machine instructions, the assembly code is not necessarily optimal for the whole expression tree. To achieve this property the covering of RTs and the selection of machine instructions must be done in tandem.

Extending the IP model to support SIMD instructions

Leupers [170] later extended the aforementioned IP model to handle SIMD instructions. Presented in a paper from 2000, Leupers's approach assumes every SIMD instruction performs two operations which each take a disjoint set of input operands. This is collectively called a SIMD pair, and the idea can easily be extended to n-input SIMD instructions. The linear inequalities describing the constraints for selecting SIMD pairs are then as follows:

    ∑_{rj ∈ R(ni)} xij = 1                                (4.1)

    xij ≤ ∑_{rl ∈ Rm(nk)} xkl                             (4.2)

    ∑_{j : (ni, nj) ∈ P} yij = ∑_{rk ∈ Rhi(ni)} xik       (4.3)

    ∑_{j : (nj, ni) ∈ P} yji = ∑_{rk ∈ Rlo(ni)} xik       (4.4)

Equations 4.1 and 4.2 ensure valid pattern selection in general, while equations 4.3 and 4.4 are specific to the SIMD instructions. We know Equation 4.1 from the discussion of Wilson et al.; it enforces that every node in the expression DAG is covered by some rule; R(ni) represents the set of rules whose patterns match at node ni, and xij is a Boolean variable indicating whether node ni is covered by rule rj. Equation 4.2 – which is new for us – enforces that a rule selected for a child of a node ni reduces to the same nonterminal required by the rule selected for ni; nk is the mth child node of parent node ni, and Rm(nk) represents the set of applicable rules that reduce the mth child node to the mth nonterminal of rule rj. Equation 4.3 and Equation 4.4 enforce that the application of a SIMD instruction actually covers a SIMD pair, and that any node can be covered by at most one such instruction; yij

is a Boolean variable indicating whether two nodes ni and nj can be packed into a SIMD instruction, and Rhi(ni) and Rlo(ni) represent the sets of rules applicable for a node ni that operate on the higher and lower portions of a register, respectively. The goal is then to maximize the use of SIMD instructions, which is achieved by maximizing the following objective function:

    f = ∑_{ni ∈ N} ∑_{rj ∈ S(ni)} xij

where N is the set of nodes in the expression DAG, and

    S(ni) = Rhi(ni) ∪ Rlo(ni) ⊂ R(ni).

The paper contains some experimental data which suggests that usage of SIMD instructions reduces code size by up to 75% for selected test cases and target machines. However, the approach assumes that each individual operation of the SIMD instructions can be expressed as a single node in the input DAG. Hence it is unclear whether the method can be extended to more complex SIMD instructions, and whether it scales to larger input programs. Tanaka et al. [238] later extended this work for selecting SIMD instructions while also taking the cost of data transfers into account by introducing transfer nodes and transfer patterns.

Attacking both pattern matching and pattern selection with IP

In 2006 Bednarski and Kessler [29] developed an integrated approach where both pattern matching and pattern selection are solved using integer programming. The design – which was later applied by Eriksson et al. [84] – is an extension of their earlier work where instruction selection had previously more or less been ignored (see [152, 153]). In broad outline, the IP model assumes that a sufficient number of pattern instances has been generated for a given expression DAG G. This is done using a pattern matching heuristic which computes an upper bound. For each pattern instance p, the model contains solution variables that:

◾ map a pattern node in p to an IR node in G;
◾ map a pattern edge in p to an IR edge in G; and
◾ decide whether p is used in the solution (remember that we may have an excess of pattern instances, so they cannot all be selected).

Hence, in addition to the typical linear inequalities we have seen previously for enforcing coverage, this IP model also includes equations to ensure valid matchings of the pattern instances. Bednarski and Kessler implemented this approach in a framework called Optimist, and then used the IBM CPLEX Optimizer [241] to solve the IP model. When compared against an implementation using an integrated DP approach (also developed by the same authors; see [152]), Bednarski and Kessler found that Optimist substantially reduced code generation time while retaining code quality. However, for several test cases – the largest expression DAG containing only 33 nodes – Optimist failed to generate any assembly code whatsoever within the set time limit. One reasonable cause could be that the IP model also attempts to solve pattern matching – a problem which we have seen can be solved externally – and thus further exacerbates an already computationally difficult problem.

Modeling instruction selection with constraint programming

Although integer programming allows auxiliary constraints to be included in the model, they may be cumbersome to express as linear inequalities. Bashford and Leupers [28] therefore developed a model using constraint programming (CP) [217], which is another method for tackling combinatorial optimization problems but allows a more direct way of expressing the constraints (this is also discussed in [171], which is a longer version that combines both [170] and [28]). In brief terms, a CP model consists of a set of domain variables – each of which has an initial set of values that it can assume – and a set of constraints that essentially specify the valid combinations of values for a subset of the domain variables. A solution to the CP model is then an assignment of all domain variables – i.e. each domain variable takes exactly one value – which is valid for all the constraints. Like that of Wilson et al., Bashford and Leupers's approach is an integrated design which focuses on code generation for DSPs with highly irregular architectures. The instruction set of the target machine is broken down into a set of register transfers which are used to cover individual nodes in the expression DAG. As each RT concerns specific registers on the target machine, the covering problem basically also incorporates register allocation. The goal is then to minimize the cost of the covering by combining multiple RTs that can be executed in parallel via some machine instruction. For each node in the expression DAG a factorized register transfer (FRT) is introduced, which essentially embodies all RTs that match at that node. An FRT is formally defined as

⟨Op, D, [U1, ..., Un], F, C, T, CS⟩.

Op is the operation of the node. D and U1, ..., Un are domain variables representing the storage locations of the result and the respective inputs to the operation. These are typically the registers that can be used for the operation, but also include virtual storage locations which convey that the value is produced as an intermediate result in a chain of operations (e.g. the multiplication term in a multiply-accumulate instruction is such a result). A set of constraints then ensures that, if the storage locations of D and Ui differ between two operations which are adjacent in the expression DAG, there exists a valid data transfer between the registers, or that both are identical if one is a virtual storage location. F, C, and T are all domain variables which collectively represent the extended resource information (ERI) that specifies: on which functional unit the operation will be executed (F); at what cost (C), which is the number of execution cycles; and by which machine instruction type (T). A combination of a functional unit and machine instruction type is later mapped to a particular machine instruction. Multiple RTs can be combined into the same machine instruction by setting C to 0 when the destination of the result is a virtual storage location and letting the last node in the operation chain account for the required number of execution cycles. The last entity, CS, is the set of constraints which define the ranges of values for the domain variables, the dependencies between D and the Ui, as well as other auxiliary constraints that may be required for the target machine. For example, if the set of RTs matching a node consists of {c = a + b, a = c + b}, then the corresponding FRT becomes:

    ⟨+, D, [U1, U2], F, C, T, {D ∈ {c, a}, U1 ∈ {a, c}, U2 = b, D = c ⇒ U1 = a}⟩

For brevity we have omitted several details such as the constraints concerning the ERI. However, according to this model the constraints appear to be limited to within a single FRT, thus hindering support for interdependent machine instructions whose constraints involve multiple FRTs. This CP model is then solved to optimality by a constraint solver – Bashford and Leupers used a solver which is based on Prolog. However, since optimal covering using FRTs is an NP-complete problem, Bashford and Leupers also applied heuristics to curb the complexity by splitting the expression DAG into smaller pieces at points where intermediate results are shared, and then performing instruction selection on each expression tree in isolation. Using constraint programming to solve instruction selection seems promising as it, like IP, allows additional constraints of the target machine to be added to the model. In addition, these constraints should be easier to add to a CP model as they need not necessarily be expressed as linear inequalities. At the time of writing, however, the existing techniques for solving integer programming problems are more mature than those for constraint programming, which potentially makes IP solvers more powerful than CP solvers for addressing instruction selection. Having said that, it is still unclear which technique of combinatorial optimization – which also includes SAT – is best suited for instruction selection (and code generation in general).
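As a toy illustration of what a solution to the FRT example above looks like, the following enumerates assignments to the domain variables D, U1, and U2 and keeps those that satisfy the constraint D = c ⇒ U1 = a; the ERI variables F, C, and T are ignored, and a real constraint solver would of course use propagation and search instead of exhaustive enumeration.

    # Toy illustration of the FRT example: enumerate assignments to the domain
    # variables and keep those satisfying the constraints. F, C and T are ignored.
    from itertools import product

    DOMAINS = {
        "D":  {"c", "a"},
        "U1": {"a", "c"},
        "U2": {"b"},
    }

    def satisfies(assignment):
        # The single constraint from the example: D = c  implies  U1 = a.
        return assignment["D"] != "c" or assignment["U1"] == "a"

    def solutions():
        names = list(DOMAINS)
        for values in product(*(DOMAINS[n] for n in names)):
            assignment = dict(zip(names, values))
            if satisfies(assignment):
                yield assignment

    if __name__ == "__main__":
        for s in solutions():
            print(s)   # e.g. {'D': 'c', 'U1': 'a', 'U2': 'b'}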

Extending grammar rules to contain multiple productions

Influenced by the earlier work of Leupers, Marwedel, and Bashford, Scharwaechter et al. [224] presented in 2007 another method for handling multiple-output machine instructions by allowing productions in an instruction set grammar to have more than one nonterminal on the left-hand side. Rules having only one LHS nonterminal in their productions are distinguished from rules containing multiple LHS nonterminals by referring to these as rules and complex rules, respectively. In addition, the single RHS of a production within a rule is called a simple pattern, while the multiple RHSs of productions within complex rules are called split patterns.4 We also call a combination of split patterns a complex pattern. We illustrate this more clearly below:

    rule:          nt → Op I1 ...    cost    action
                   (the right-hand side Op I1 ... constitutes a simple pattern)

    complex rule:  nt1 nt2 ... → Op1 I1,1 ..., Op2 I2,1 ..., ...    cost    action
                   (each Opk Ik,1 ... is a split pattern; together the split
                   patterns constitute a complex pattern)

4In their paper Scharwaechter et al. call these simple and split rules, but to conform with the terminology we established on page 27 I chose to call them patterns instead of rules.

The approach to pattern matching is then the same as we have seen before: match simple and split patterns, and then recombine the split patterns into complex patterns. The pattern selector checks whether it is worthwhile to apply a complex pattern to cover a certain set of nodes, or if the nodes should instead be covered using simple patterns; selecting a complex pattern may have repercussions on the pattern candidates available for the rest of the expression DAG, as the intermediate results of nodes within the complex pattern can no longer be reused for other patterns and may thus need to be duplicated, which incurs additional overhead. Consequently, if the cost saved by replacing a set of simple patterns by a single complex pattern is greater than the cost of the incurred duplication, then the complex pattern will be selected. After this step the pattern selector may have selected complex patterns which overlap, thus causing conflicts which must be resolved. This is addressed by formulating a maximum weighted independent set (MWIS) problem, in which a set of nodes is selected from an undirected, weighted graph such that no selected nodes are adjacent and the sum of the weights of the selected nodes is maximal. We will discuss this idea in more detail in Section 4.5. Each complex pattern forms a node in the MWIS graph and an edge is drawn between two nodes if the two patterns overlap. The weight is then calculated as the negated sum of the costs of the split patterns in the complex pattern (the paper is ambiguous on how these split pattern costs are calculated). Since the MWIS problem is known to be NP-complete, Scharwaechter et al. employed a greedy heuristic called Gwmin2 by Sakai et al. [221]. Lastly, split patterns which have not been merged into complex patterns are replaced by simple patterns before emitting the code. Scharwaechter et al. implemented their approach by extending Olive into a system called Cburg, and ran experiments by generating assembly code for a MIPS architecture. The code generated by Cburg – which exploited complex instructions – was then compared to assembly code which only used simple instructions. The results indicate that Cburg exhibits near-linear complexity and improved performance and code size by almost 25% and 22%, respectively. Ahn et al. [2] later broadened this work by including scheduling dependency conflicts between complex patterns, and by incorporating a feedback loop with the register allocator to facilitate register allocation. In both the approach of Scharwaechter et al. and that of Ahn et al., however, complex rules can only consist of disconnected simple patterns (i.e. there is no sharing of nodes between the simple patterns). In a 2011 paper – which is a revised and extended version of [224] – Youn et al. [268] address this problem by introducing index subscripts for the operand specification of the complex rules. However, the subscripts are restricted to the input nodes of the pattern, thus still hindering support for completely arbitrary DAG patterns.

Splitting patterns at marked output nodes

In the approaches discussed so far, it is assumed that the outputs of all DAG patterns occur at the root nodes. In 2001 Arnold and Corporaal [18, 19, 20] came up with an approach where the outputs can be marked explicitly. The DAG pattern will then be split such that each output node receives its own tree pattern, called a partial pattern; an example is given in Figure 4.5. Using an O(nm) algorithm the tree patterns are matched over the expression DAG. After matching, an algorithm attempts to merge appropriate combinations of partial pattern instances into full pattern instances.

73 == == + + 0 + 0 x y x y x y

(a) DAG pattern. (b) Tree patterns after splitting.

Figure 4.5: Splitting of a DAG pattern of an add instruction which also sets a condition Wag if the result is equal to 0. The darkly shaded nodes indicate the output nodes. instances into full pattern instances. This is done by maintaining an array for each partial pattern instance that maps the pattern nodes to the covered nodes in the input DAG, and then checking whether two partial patterns belong to the same original DAG pattern and do not clash. In other words, no two pattern nodes that correspond to the same pattern node in the original DAG pattern may cover different nodes in the expression DAG. Pattern selection is then done using conventional dynamic programming techniques. Farfeleder et al. [89] proposed a similar approach by applying an extended version of Lburg for pattern matching followed by a second pass in an attempt to recombine matched patterns that originate from patterns of multiple-output instructions. However, the second pass consists of ad-hoc routines which are not automatically derived from the machine description.
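A small sketch of the merging test just described, assuming that each partial pattern instance records which original DAG pattern it stems from together with a map from the original pattern's nodes to the covered expression nodes (this data layout is an assumption made for the sketch):

    # Sketch of recombining partial pattern instances into a full pattern
    # instance: two instances may be merged if they stem from the same original
    # DAG pattern and their node maps do not clash.

    def can_merge(inst_a, inst_b):
        """Each instance is a dict with the originating pattern id and a map from
        pattern nodes (of the original DAG pattern) to covered expression nodes."""
        if inst_a["pattern"] != inst_b["pattern"]:
            return False
        for pattern_node, dag_node in inst_a["map"].items():
            other = inst_b["map"].get(pattern_node)
            if other is not None and other != dag_node:
                return False      # same pattern node mapped to different DAG nodes
        return True

    def merge(inst_a, inst_b):
        merged = dict(inst_a["map"])
        merged.update(inst_b["map"])
        return {"pattern": inst_a["pattern"], "map": merged}

    if __name__ == "__main__":
        # Two partial instances of the add-and-set-flag pattern from Figure 4.5.
        a = {"pattern": "add_setflag", "map": {"add": "n3", "x": "n1", "y": "n2"}}
        b = {"pattern": "add_setflag", "map": {"eq":  "n4", "x": "n1", "y": "n2"}}
        if can_merge(a, b):
            print(merge(a, b))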

Exhaustive pattern selection with semantic-preserving transformations

Hoover and Zadeck [141] developed a design called Toast (Tailored Optimization and Semantic Translation) with the ultimate goal of automating the generation of entire compiler frameworks – including instruction scheduling and register allocation – from a declarative machine description. In Toast, instruction selection is done using semantic-preserving transformations during pattern selection to make better use of the instruction set. For example, although x ∗ 2 is semantically equivalent to x ≪ 1 – meaning that x is logically shifted 1 bit to the left – most instruction selectors will fail to select machine instructions implementing the latter when the former appears in the expression DAG, as they are syntactically and structurally different. The approach works as follows: first the frontend emits expression DAGs consisting of semantic primitives, which are also used to describe the machine instructions. The expression DAG is then semantically matched using single-output patterns derived from the machine instructions. Semantic matchings are called toe prints and are found by a semantic comparator which applies syntactic matching – i.e. checking whether the nodes are of the same type, implemented as a straightforward O(nm) pattern matcher – in combination with semantic-preserving transformations for when syntactic matching fails. A transformation is only applied if it will lead to a syntactic match later on, resulting in a bounded exhaustive search for all possible toe prints. Once all toe prints have been found, they are combined into foot prints which correspond to the full effects of a machine instruction. These can therefore include only one toe print (as with single-output instructions) or several (as with multi-output instructions). However, the paper lacks details on how exactly this is done. Lastly, all combinations of foot prints are considered in pursuit of the one which leads to the most effective implementation of the expression DAG. To curb the search space, however, this process only considers combinations

where each selected foot print syntactically matches at least one semantic primitive in the expression DAG, and only “trivial amounts” of the expression DAG, such as constants, may be included in more than one foot print. However, due to the exhaustive nature of the approach, in its current form it appears to be impractical for generating assembly code for all but the smallest input programs; in a prototype implemented by the authors, it was reported that almost 10^70 “implied instruction matches” were found for one of the test cases, and it is unclear how many of them are actually useful.

4.5 Reducing pattern selection to a MIS problem

Numerous approaches reduce the pattern selection problem to a maximum independent set (MIS) problem; we have already seen this idea applied by Scharwaechter et al. and Ahn et al. (see Section 4.4.1), and we will encounter more such approaches in this section as well as in the next chapter. We therefore proceed with describing this technique in greater detail. For patterns that overlap one or more nodes in the expression DAG, a corresponding conflict or interference graph can be formed. An example is given in Figure 4.6. In the matched DAG, shown in (a), we see that pattern p1 overlaps with p2, and pattern p4 overlaps with p2 and p3. For each pattern we form a node in a conflict graph C, shown in (b), and draw an edge between the nodes of any two patterns that overlap. By selecting the largest set of vertices from C such that no selected nodes are adjacent in C, we obtain a set of patterns such that each node in the expression DAG is completely covered and no two patterns overlap. This problem is, as expected, known to be NP-complete. To achieve optimal pattern selection we can attach the pattern cost as a weight to each node in the conflict graph, thus augmenting the MIS problem to a maximum weighted independent set (MWIS) problem, i.e. to select a set of vertices S such that it satisfies MIS and also maximizes ∑_{s ∈ S} weight(s). Since MWIS has a maximizing goal, we simply assign each weight as the negated cost of that pattern. Note that this technique is not limited to the selection of DAG patterns but can just as well be used for graph patterns.

Figure 4.6: Example of a matched DAG (a) and its corresponding conflict graph (b).
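The reduction can be sketched as follows: build the conflict graph from the overlap relation and then pick, among the independent sets that cover the whole expression DAG, the one maximizing the total weight (i.e. the negated total cost). The matches and costs below are invented, and the brute-force enumeration is only for illustration – as noted, practical implementations resort to heuristics such as Gwmin2.

    # Sketch of reducing pattern selection to a MWIS problem: build a conflict
    # graph from overlapping matches and pick, by brute force, the covering
    # independent set maximizing the total weight (= negated pattern cost).
    from itertools import combinations

    MATCHES = {                       # match -> (covered DAG nodes, cost)
        "p1": ({"n1", "n2"}, 2),
        "p2": ({"n2", "n3"}, 2),
        "p3": ({"n4"},       1),
        "p4": ({"n3", "n4"}, 1),
        "p5": ({"n1"},       1),
    }
    ALL_NODES = {"n1", "n2", "n3", "n4"}

    def conflict_graph(matches):
        """One conflict-graph node per match; an edge wherever two matches overlap."""
        edges = set()
        for a, b in combinations(matches, 2):
            if matches[a][0] & matches[b][0]:
                edges.add(frozenset((a, b)))
        return edges

    def select_patterns(matches):
        """Among independent sets covering all DAG nodes, maximize the total
        weight, where each weight is the negated pattern cost."""
        edges = conflict_graph(matches)
        names = list(matches)
        best, best_weight = None, float("-inf")
        for r in range(1, len(names) + 1):
            for subset in combinations(names, r):
                if any(frozenset(pair) in edges for pair in combinations(subset, 2)):
                    continue                              # overlapping patterns
                covered = set().union(*(matches[m][0] for m in subset))
                if covered != ALL_NODES:
                    continue                              # not a full covering
                weight = sum(-matches[m][1] for m in subset)
                if weight > best_weight:
                    best, best_weight = set(subset), weight
        return best, best_weight

    if __name__ == "__main__":
        print(select_patterns(MATCHES))   # -> {'p1', 'p4'} with weight -3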

4.5.1 Approaches using MIS and MWIS

In 2001 Kastner et al. [151] developed an approach which targets instruction set generation for hybrid reconfigurable systems. Hence the patterns are not given as input but are generated as part of the problem itself (this is also discussed in the next chapter). Once a pattern set has been generated, a generic subgraph isomorphism algorithm from [75] is used to find all pattern matchings on the expression DAG. Subgraph isomorphism will be described in detail in the next chapter. This is followed by instantiating a corresponding MIS problem which is solved heuristically. Since the matching algorithm does not require the input to be a DAG, the approach can be extended to input of any graph-based format. In an extended version of the paper, Kastner et al. [150] improve the pattern matching algorithm to quickly reject dissimilar subgraphs. However, no guarantees are made for optimal pattern selection.

Handling echo instructions
Brisk et al. [37] also used the idea of MIS for performing instruction selection on target machines with echo instructions. Echo instructions are small markers in the assembly code that refer back to earlier portions for re-execution. The assembly code can therefore essentially be compressed using the same idea applied in the LZ77 algorithm [274], where a string is compressed by replacing common substrings that appear earlier in the string with pointers, and can be reconstructed through simple copy-pasting. Since echo instructions do not incur a branch or a procedure call, the result is reduced code size with zero overhead in terms of performance. Consequently, the pattern set for instruction selection is not fixed and must be determined as a precursor to pattern matching. Patterns are formed by clustering together adjacent nodes in the expression DAG. The pattern which yields the highest decrease in code size is then selected and the expression DAG updated by replacing recurring pattern instances with new nodes to indicate use of echo instructions. This process repeats until no new pattern better than some user-defined criterion can be found. Like Kastner et al., the design of Brisk et al. finds these patterns using subgraph isomorphism but applies a different algorithm called VF2 [60]. Although this algorithm exhibits O(n! n) time complexity in the worst case, the authors reported that it ran efficiently for most expression DAGs in the experiments.
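The greedy loop described above is easiest to see on a flat sequence rather than a DAG. The following sketch is a deliberately simplified stand-in for the approach of Brisk et al.: it operates on a string of operation symbols instead of an expression DAG, repeatedly picks the adjacent pair that recurs most often, replaces every occurrence with a fresh symbol (playing the role of the echo reference), and stops when no repetition remains to exploit. The input string is made up.

#include <stdio.h>
#include <string.h>

#define MAXLEN 64

int main(void) {
    char seq[MAXLEN] = "abcabcababc";   /* hypothetical operation stream  */
    char next = 'A';                    /* fresh symbols for new patterns */
    for (;;) {
        int best = 0; char p0 = 0, p1 = 0;
        int len = (int)strlen(seq);
        for (int i = 0; i + 1 < len; i++) {          /* count each adjacent pair */
            int cnt = 0;
            for (int j = 0; j + 1 < len; j++)
                if (seq[j] == seq[i] && seq[j+1] == seq[i+1]) cnt++;
            if (cnt > best) { best = cnt; p0 = seq[i]; p1 = seq[i+1]; }
        }
        if (best < 2) break;                         /* nothing left worth compressing */
        char out[MAXLEN]; int k = 0;
        for (int j = 0; j < len; ) {                 /* rewrite the sequence           */
            if (j + 1 < len && seq[j] == p0 && seq[j+1] == p1) { out[k++] = next; j += 2; }
            else out[k++] = seq[j++];
        }
        out[k] = '\0';
        printf("%c := %c%c   ->   %s\n", next, p0, p1, out);
        strcpy(seq, out);
        next++;
    }
    return 0;
}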

4.6 Other DAG-based techniques

4.6.1 Extending means-end analysis to DAGs
20 years after [194] and Cattell et al. published their top-down tree covering approaches (see Section 3.4.1 in the previous chapter), Yu and Hu [270, 271] rediscovered means-end analysis as a method of instruction selection and also made two major improvements. First, the new design handles expression DAGs and DAG patterns whereas those of [194] and Cattell et al. are limited to trees. Second, it combines means-end analysis with hierarchical planning [220], a search strategy that recognizes that many problems can be arranged in a hierarchical manner to allow larger and more complex problem instances to be solved. This also enables exhaustive exploration of the search space whilst avoiding situations of dead ends and infinite

looping that straight-forward implementations of means-end analysis may suffer from ([194] and Cattell et al. both circumvented this problem by enforcing a cut-off when reaching a certain depth in the search space). The technique exhibits a worst-case execution time that is exponential in the depth of the search, but Yu and Hu claim in their paper that a depth of 3 is sufficient to yield results of equal quality to that of hand-written assembly code. However, it is unclear whether complex machine instructions with for example disjoint-output and internal-loop characteristics can be handled by this approach.

4.6.2 DAG-like trellis diagrams
In 1998 Hanono and Devadas [130, 131] proposed a technique that is similar to Wess’s idea of using trellis diagrams – which we discussed in Section 3.7.5 – and which they implemented in a system called Aviv. The instruction selector takes an expression DAG as input and duplicates each operation node according to the number of functional units in the target machine on which that operation can run. Special split and transfer nodes are inserted before and after each duplicated operation node to allow the data flow to diverge and then reconverge before passing to the next operation node in the expression DAG. The use of transfer nodes also allows the cost of transferring data from one functional unit to another to be taken into account. Similarly to the trellis diagram, instruction selection is then reduced to finding a path from the leaf nodes in the expression DAG to its root node. Unlike the optimal dynamic programming-oriented approach of Wess, Hanono and Devadas applied a greedy heuristic which starts from the root node and makes its way towards the leaves. However, like Wess’s design this technique assumes a one-to-one mapping between the expression nodes and the available machine instructions; in fact, the main goal of Aviv was to generate efficient assembly code for VLIW (Very Long Instruction Word) architectures, where focus lies on combining as many operations as possible into the same instruction bundle.

4.7 Summary

In this chapter we have investigated several methods which rely on the principle of DAG covering, which is a general form of tree covering. Operating on DAGs instead of trees has several advantages: most importantly, common subexpressions and a larger set of machine instructions – including multiple-output and disjoint-output instructions – can be modeled directly and exploited during instruction selection, leading to improved performance and reduced code size. Consequently, techniques based on DAG covering are today among the most applied techniques for instruction selection in modern compilers. However, the ultimate cost of transitioning from trees to DAGs is that optimal pattern selection can no longer be achieved in linear time as that problem is NP-complete. At the same time, DAGs are not expressive enough to allow the proper modeling of all aspects of the input program and machine instruction patterns. For example, statements such as for loops incur loop-back edges in the expression graphs, thus restricting DAG covering to basic block scope. This obviously also excludes the modeling of internal-loop instructions. Another disadvantage is that opportunities are lost for storing program variables and temporaries in different

forms and at different locations across the function. We will see an example in the next chapter where being able to do this increases the performance of the final assembly code. In the next chapter we will discuss the last and most general principle of instruction selection – graph covering – which addresses some of the aforementioned deficiencies of DAG covering.

5

Graph-Based Approaches

Even though DAG covering, which we examined in the previous chapter, is a more general and powerful form of tree covering, it is still not enough for handling all aspects of the input program and machine instructions. For example, control flow incurred by for loop statements cannot be modeled as part of the expressions since that requires cycles, which violates the definition of DAGs. By removing this restriction, we end up with the most general form of covering which is graph covering.

5.1 The principle

Just like DAG covering is a more general form of tree covering, so is graph covering a more general form of DAG covering. By allowing both the input program and the patterns to assume any generic graphical shape, instruction selection can be extended from its previously local scope – i.e. covering a single expression tree or DAG at a time – to a global scope which encompasses all expressions within entire functions, as graphs are capable of modeling control flow. This is known as global instruction selection and in theory it enables the instruction selector to move operations across the boundaries of basic blocks in order to make better use of complex machine instructions with large patterns that would otherwise not be applicable. For modern target machines where power consumption and heat emission are becoming increasingly important factors, this is one key method of addressing these concerns, as using fewer, large instructions is generally more power-efficient than applying many, small instructions. However, by transitioning from DAG patterns to graph patterns, even pattern matching becomes NP-complete and thus we can no longer resort to the pattern matching techniques of trees and DAGs; Figure 5.1 shows the varying time complexities of pattern matching and optimal pattern selection when using different graphical shapes. Instead we need to apply generic subgraph isomorphism algorithms, and we therefore begin by looking at a few such algorithms.

          Pattern matching    Optimal pattern selection
Trees     Linear              Linear
DAGs      Quadratic           Non-polynomial
Graphs    Non-polynomial      Non-polynomial

Figure 5.1: Time complexities of pattern matching and optimal pattern selection when using different graphical shapes.

5.2 Pattern matching using subgraph isomorphism

The problem of subgraph isomorphism is to detect whether an arbitrary graph Ga can be turned, twisted, or mirrored such that it forms a subgraph of another graph Gb. In such cases one says that Ga is an isomorphic subgraph of Gb, and deciding this is known to be an NP-complete problem [58]. As subgraph isomorphism is found in many other fields, a vast amount of research has been devoted to this problem (see for example [60, 87, 88, 109, 129, 139, 163, 234, 248]). In this section we will look at Ullmann’s algorithm and another commonly used algorithm called VF2. It should be noted that these pattern matching techniques only check for identical structure and not semantics. For example, the expressions a ∗ (b + c) and a ∗ b + a ∗ c are functionally equivalent but will yield differently structured graphs and will thus not match one another. Arora et al. [21] proposed a method where the graphs are normalized prior to pattern matching, but it is limited to arithmetic expression DAGs and does not guarantee that all matchings will be found.

5.2.1 Ullmann’s algorithm
One of the first – and most well-known – methods for deciding subgraph isomorphism is an algorithm developed by Ullmann [248]. In his paper, published 1976, Ullmann reduces the problem of determining whether a graph Ga = (Va, Ea) is subgraph isomorphic to another graph Gb = (Vb, Eb) to finding a Boolean |Va| × |Vb| matrix M such that the following conditions hold:

C = M (M B)^T

∀ i, j, 1 ≤ i ≤ |Va|, 1 ≤ j ≤ |Vb| : (aij = 1) ⇒ (cij = 1)

A and B are the respective adjacency matrices of Ga and Gb, and aij and cij are elements of A and C. When these conditions hold, every row in M will contain exactly one 1, and every column in M will contain at most one 1. Finding M can be done with brute force by initializing every element mij with 1 and then iteratively pruning away 1’s until a solution is found. To reduce this search space Ullmann developed a procedure that eliminates some of the 1’s which for sure will not appear in any solution. According to [60], however, even with this improvement the worst-case time complexity of the algorithm is O(n! n²).
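The two conditions are simple to check mechanically once a candidate matrix M is given. The sketch below does exactly that for a hypothetical three-node pattern graph and four-node host graph, using Boolean matrix products; the search for M itself, together with Ullmann's pruning procedure, is omitted, and the row/column restrictions on M are assumed to hold for the supplied candidate.

#include <stdio.h>

#define NA 3   /* |Va|: nodes of the pattern graph Ga (hypothetical) */
#define NB 4   /* |Vb|: nodes of the host graph Gb                   */

/* Returns 1 if the Boolean matrix m satisfies Ullmann's conditions for
 * Ga (adjacency a) being subgraph isomorphic to Gb (adjacency b), i.e.
 * C = M(MB)^T and aij = 1 implies cij = 1.                            */
static int is_isomorphism(const int m[NA][NB],
                          const int a[NA][NA], const int b[NB][NB]) {
    int mb[NA][NB], c[NA][NA];
    for (int i = 0; i < NA; i++)             /* Boolean product MB        */
        for (int j = 0; j < NB; j++) {
            mb[i][j] = 0;
            for (int k = 0; k < NB; k++)
                mb[i][j] |= m[i][k] & b[k][j];
        }
    for (int i = 0; i < NA; i++)             /* C = M (MB)^T              */
        for (int j = 0; j < NA; j++) {
            c[i][j] = 0;
            for (int k = 0; k < NB; k++)
                c[i][j] |= m[i][k] & mb[j][k];
        }
    for (int i = 0; i < NA; i++)             /* aij = 1  =>  cij = 1      */
        for (int j = 0; j < NA; j++)
            if (a[i][j] && !c[i][j]) return 0;
    return 1;
}

int main(void) {
    /* Ga: undirected path 0-1-2.  Gb: undirected cycle 0-1-2-3.          */
    const int a[NA][NA] = {{0,1,0},{1,0,1},{0,1,0}};
    const int b[NB][NB] = {{0,1,0,1},{1,0,1,0},{0,1,0,1},{1,0,1,0}};
    /* Candidate mapping: pattern node i -> host node i.                  */
    const int m[NA][NB] = {{1,0,0,0},{0,1,0,0},{0,0,1,0}};
    printf("valid mapping: %s\n", is_isomorphism(m, a, b) ? "yes" : "no");
    return 0;
}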

5.2.2 The VF2 algorithm
In a 2001 paper Cordella et al. [60] proposed an algorithm, called VF2, which has been used in several DAG and graph covering-based instruction selectors. In particular, it has been applied to assist instruction set extraction (ISE), which is sometimes known as instruction set selection. Examples of such methods are given in [10, 53, 192], and Galuzzi and Bertels [110] have done a recent survey. In broad outline, the VF2 algorithm recursively constructs a mapping set of (n, m) pairs, where n ∈ Ga and m ∈ Gb. The core of the algorithm is the set of rules that check whether a new candidate pair should be allowed to be included in the mapping set. The rules entail a set of syntactic feasibility checks, implemented by Fsyn, followed by a semantic feasibility check, implemented by Fsem. Without going into any details, we define Fsyn as

Fsyn(s, n, m) = Rpred ∧ Rsucc ∧ Rin ∧ Rout ∧ Rnew

where n ∈ Ga, m ∈ Gb – thus constituting the candidate pair – and s represents the current (partial) mapping set. The first two rules – Rpred and Rsucc – ensure that the new mapping set s′ is consistent with the structure of Ga and Gb. The remaining three rules are used to prune the search space: Rin and Rout look ahead one step in the search process by ensuring that there will still exist enough unmapped nodes in Gb that can be mapped to the remaining nodes in Ga; and similarly Rnew looks ahead two steps. The exact definitions of these rules are explained in the referenced paper, and they can be modified with minimal effort to enforce graph isomorphism instead of subgraph isomorphism; in the former, the structures of Ga and Gb must be rigid and cannot be twisted and turned until they match. Fsem is then a framework that can easily be customized to add additional checks regarding the nodes in the candidate pair. In the worst case this algorithm exhibits O(n! n) time complexity, but the best-case time complexity – which is polynomial – makes it practical for pattern matching over very large graphs; it has been reported that the VF2 algorithm has been successfully used on graphs containing thousands of nodes [60].
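The recursive flavour of the algorithm can be illustrated with a stripped-down sketch: it extends a partial mapping one pattern node at a time and backtracks when the already-mapped edges of Ga and Gb disagree, which roughly corresponds to the Rpred/Rsucc consistency rules. The Rin/Rout/Rnew look-ahead pruning and the Fsem checks, the parts that make VF2 efficient, are left out; the check below is for induced subgraph isomorphism (relaxing the non-edge comparison would give plain subgraph matching), and the two example graphs are hypothetical.

#include <stdio.h>

#define NA 3   /* nodes of the pattern graph Ga */
#define NB 4   /* nodes of the host graph Gb    */

static const int a[NA][NA] = {{0,1,0},{0,0,1},{0,0,0}};                  /* Ga: 0->1->2       */
static const int b[NB][NB] = {{0,1,0,0},{0,0,1,0},{0,0,0,1},{1,0,0,0}};  /* Gb: directed cycle */

/* map[n] = host node matched to pattern node n, or -1 if still unmapped. */
static int consistent(const int map[NA], int n, int m) {
    for (int p = 0; p < NA; p++) {
        if (map[p] < 0 || p == n) continue;
        /* every already-mapped edge (and non-edge) of Ga must agree with Gb */
        if (a[p][n] != b[map[p]][m]) return 0;
        if (a[n][p] != b[m][map[p]]) return 0;
    }
    return 1;
}

static int extend(int map[NA], int used[NB], int n) {
    if (n == NA) return 1;                        /* all pattern nodes mapped */
    for (int m = 0; m < NB; m++) {
        if (used[m] || !consistent(map, n, m)) continue;
        map[n] = m; used[m] = 1;
        if (extend(map, used, n + 1)) return 1;   /* grow the mapping set     */
        map[n] = -1; used[m] = 0;                 /* backtrack                */
    }
    return 0;
}

int main(void) {
    int map[NA] = {-1, -1, -1}, used[NB] = {0};
    if (extend(map, used, 0))
        printf("match: 0->%d 1->%d 2->%d\n", map[0], map[1], map[2]);
    else
        printf("no match\n");
    return 0;
}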

5.3 Graph-based pattern selection techniques

The main distinction between DAG covering and graph covering is that, in the latter, the problem of pattern matching is much more complex as it is NP-complete. The techniques for addressing pattern selection, however, are often general enough to be applicable for either principle. In this section we will look at more pattern selection techniques, which typically made their first appearance in instruction selection via approaches based on graph covering.

5.3.1 Unate and binate covering
The first technique we will discuss is a method where the pattern selection problem is transformed into an equivalent unate or binate covering problem. The concepts behind the two are identical with the exception of one detail. Although binate covering-based techniques appeared first, we will begin by explaining unate covering as binate covering is an extension of unate covering.

      p1  p2  p3  p4  p5  p6  p7
n1    1   1*  0   0   0   0   0
n2    0   1*  1   0   0   0   0
n3    0   0   1   1*  0   0   0
n4    0   0   0   1*  1   1   0
n5    0   0   0   0   0   0   1*

(a) Graph to cover.  (b) Unate covering matrix.
Figure 5.2: Example of unate covering. Unmarked 1s in the matrix represent potential but not selected covers, while the 1s marked with a star (1*) indicate the selected cover, which is optimal (assuming all patterns have the same unit cost).

Unate covering
The idea of unate covering is to create a Boolean matrix M where each row represents a node in the expression graph, and each column represents an instance of a pattern that matches one or more expression nodes. If we let mij denote the element at row i and column j in M, then mij = 1 indicates that node i is covered by pattern j. The problem of pattern selection is thus reduced to finding a valid configuration of M such that the sum of every row is at least 1. This problem, which is also illustrated in Figure 5.2, is obviously NP-complete, but there exist several efficient techniques for solving it heuristically (e.g. [61, 123]). However, unate covering alone does not incorporate all constraints of pattern selection, as some patterns require – or preclude – the selection of other patterns to yield correct assembly code. For example, assume that pattern p3 in Figure 5.2 requires that pattern p6 is selected to cover n4 instead of pattern p5. In instruction set grammars this is enforced via nonterminals, but for unate covering we have no means of expressing this constraint. We therefore turn to binate covering where this is possible.
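Before doing so, the following sketch solves a small unate covering instance by exhaustively enumerating column subsets and keeping the cheapest one whose row sums are all at least 1. The matrix and costs are invented (they are not those of Figure 5.2); practical implementations rely on the heuristic solvers cited above.

#include <stdio.h>

#define NROW 4   /* expression graph nodes (rows), hypothetical instance */
#define NCOL 5   /* pattern matches (columns)                            */

/* m[r][c] = 1 if pattern c covers node r */
static const int m[NROW][NCOL] = {
    {1,1,0,0,0},
    {0,1,1,0,0},
    {0,0,1,1,0},
    {0,0,0,1,1}
};
static const int cost[NCOL] = {1, 2, 2, 2, 1};   /* hypothetical pattern costs */

int main(void) {
    int best = -1, best_cost = 0;
    for (int s = 0; s < (1 << NCOL); s++) {        /* enumerate column subsets    */
        int c = 0, ok = 1;
        for (int j = 0; j < NCOL; j++)
            if (s & (1 << j)) c += cost[j];
        for (int r = 0; r < NROW && ok; r++) {     /* every row must sum to >= 1  */
            int covered = 0;
            for (int j = 0; j < NCOL; j++)
                if ((s & (1 << j)) && m[r][j]) covered = 1;
            ok = covered;
        }
        if (ok && (best < 0 || c < best_cost)) { best = s; best_cost = c; }
    }
    if (best < 0) { printf("no cover exists\n"); return 1; }
    printf("selected patterns (cost %d):", best_cost);
    for (int j = 0; j < NCOL; j++)
        if (best & (1 << j)) printf(" p%d", j + 1);
    printf("\n");
    return 0;
}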

Binate covering
We first rewrite the Boolean matrix from the unate covering problem into a Boolean formula consisting of conjunctions of uncomplemented disjunctions. We refer to each disjunction – which is a combination of literals and ∨ operations – as a clause. Clauses can then in turn be combined with ∧ operations. The Boolean matrix in (b) of Figure 5.2 can thus be rewritten as

f = (p1 ∨ p2) ∧ (p2 ∨ p3) ∧ (p3 ∨ p4) ∧ (p4 ∨ p5 ∨ p6) ∧ p7.

Unlike unate covering, binate covering allows complemented literals to appear in the clauses, which we write as ¬x. The aforementioned constraint regarding the compulsory selection of p6 if p3 is selected can now be expressed as

¬p3 ∨ p6

which is then added to the Boolean formula f. This is called an implication clause as it is logically equivalent to p3 ⇒ p6.

Approaches based on unate and binate covering
According to [56, 177, 178], Rudell [218] pioneered the use of binate covering in his 1989 PhD thesis by applying it to solve DAG covering as part of VLSI synthesis. Liao et al. [177, 178] later adapted it in 1995 to instruction selection in a method that optimizes code size for one-register target machines. To curb the search space, Liao et al.’s approach solves the pattern selection problem in two iterations: in the first iteration the costs of necessary data transfers are ignored. After patterns have been selected the nodes covered by the same pattern are collapsed into single nodes, and a second binate covering problem is constructed to minimize the costs of data transfers. Although these two problems could be solved simultaneously, Liao et al. chose not to as the number of necessary clauses becomes extremely large. Cong et al. [56] recently applied binate covering as part of solving application-specific instruction generation for configurable processor architectures. Unate covering has also been successfully applied by Clark et al. [54] in generating assembly code for acyclic computation accelerators, which can be partially customized in order to increase performance of the currently executed program. Presumably, the target machines are homogeneous enough that implication clauses are unnecessary. This approach was later extended by Hormati et al. [142] to reduce inter-connect and data-centered latencies of accelerator designs. In 2009 Martin et al. [183] also applied the idea of unate covering in solving a similar problem concerning reconfigurable processor extensions, but modeled it as a constraint programming (CP) problem which also incorporated the constraints of instruction scheduling. In addition, they used CP in a separate procedure to find all applicable pattern matchings.

5.3.2 CP-based techniques
Although the aforementioned approach by Martin et al. uses constraint programming, the constraints of pattern selection are simply a direct encoding of unate covering and thus not specific to CP per se. When Floch et al. [97] later adapted the CP model to handle processors with reconfigurable cell fabric, they replaced the method of pattern selection with constraints that are radically different from those of unate covering. In addition, unlike the pioneering CP approach by Bashford and Leupers [28] which we saw in the previous chapter, the design applies the more conventional form of pattern matching and pattern selection instead of breaking down the patterns into RTs and then selecting machine instructions that combine as many RTs as possible. In the CP model by Floch et al., the requirement that every expression node must be covered by exactly one pattern is enforced via a Count(i, var, val) constraint, which dictates that in the set of domain variables var, exactly val number of variables must assume the value i.1 Here val can be a fixed number or represent another domain variable. Let us assume that every node in the expression graph has an associated matchset containing all the patterns that may cover that node, and that each pattern m appearing in the matchset has been assigned a unique integer value im. We introduce a domain variable matchn for each expression node n

1Remember that a domain variable has an initial set of possible values that it may assume, and a solution to a CP model is a configuration of single-value assignments to all domain variables such that it is valid for every constraint appearing in the model.

to represent the pattern selected to cover n. For each pattern m we also introduce a domain variable nodecountm ∈ {0, size(m)}, where size(m) is the number of pattern nodes in m, and define msetm = ⋃_{n∈nodes(m)} matchn as the set of matchn variables in which pattern m may appear. With this we can express that every expression node must be covered exactly once as

∀ m ∈ M : Count(im, msetm, nodecountm)

where M is the total set of matching patterns. This constraint is more restrictive than that of unate covering and yields more propagation, which reduces the search space. To identify which patterns have been selected we simply check whether nodecountm > 0 for every pattern m. The model was also further extended by Arslan and Kuchcinski [22] to accommodate VLIW architectures and disjoint-output instructions. This is essentially done by splitting each disjoint-output instruction into multiple subinstructions, each modeled by a disjoint pattern. A generic subgraph isomorphism algorithm is used to find all pattern matchings, and pattern selection is then modeled as an instance of the CP model with the additional constraints that the subinstructions are scheduled such that they can be replaced by the original disjoint-output machine instruction. This technique thereby differs from the previous approaches we have seen (e.g. Scharwaechter et al. [224], Ahn et al. [2], Arnold and Corporaal [18, 19, 20]) where partial patterns are recombined into complex pattern instances prior to pattern selection. The design assumes that there are no loops in the expression graph, but the graph may consist of multiple disconnected components. However, in all of the CP models by Martin et al., Floch et al., and Arslan and Kuchcinski, the target machine is assumed to have a homogeneous register architecture as they do not model the necessary data transfers between different register classes.
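As a rough illustration of how the Count constraint behaves, the sketch below enumerates all assignments of a matchn variable per node for a tiny invented instance and keeps only those where, for every pattern m, the number of nodes assigned to m is either 0 or size(m); selected patterns are those with a nonzero count, exactly as described above. A real implementation would hand these constraints to a CP solver with propagation rather than enumerate.

#include <stdio.h>

#define NNODE 4
#define NPAT  3

/* innodes[m][n] = 1 if pattern m spans expression node n (invented matches) */
static const int innodes[NPAT][NNODE] = {
    {1,1,0,0},   /* m0 spans n0,n1 */
    {0,0,1,1},   /* m1 spans n2,n3 */
    {0,1,1,0}    /* m2 spans n1,n2 */
};
static const int size[NPAT] = {2, 2, 2};
static const int cost[NPAT] = {1, 1, 3};     /* hypothetical pattern costs */

/* Check the Count constraints: for every pattern m, the number of nodes n
 * with match[n] == m must be 0 or size(m), and match[n] must be in n's matchset. */
static int feasible(const int match[NNODE]) {
    for (int m = 0; m < NPAT; m++) {
        int cnt = 0;
        for (int n = 0; n < NNODE; n++)
            if (match[n] == m) cnt++;
        if (cnt != 0 && cnt != size[m]) return 0;
    }
    for (int n = 0; n < NNODE; n++)
        if (!innodes[match[n]][n]) return 0;
    return 1;
}

int main(void) {
    int match[NNODE], best[NNODE], best_cost = -1;
    for (int a = 0; a < NPAT*NPAT*NPAT*NPAT; a++) {      /* enumerate assignments     */
        int x = a, c = 0;
        for (int n = 0; n < NNODE; n++) { match[n] = x % NPAT; x /= NPAT; }
        if (!feasible(match)) continue;
        for (int m = 0; m < NPAT; m++) {                 /* selected iff nodecount > 0 */
            int used = 0;
            for (int n = 0; n < NNODE; n++) if (match[n] == m) used = 1;
            if (used) c += cost[m];
        }
        if (best_cost < 0 || c < best_cost) {
            best_cost = c;
            for (int n = 0; n < NNODE; n++) best[n] = match[n];
        }
    }
    if (best_cost < 0) { printf("infeasible\n"); return 1; }
    printf("cost %d, covering:", best_cost);
    for (int n = 0; n < NNODE; n++) printf(" n%d->m%d", n, best[n]);
    printf("\n");
    return 0;
}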

5.3.3 PBQP-based techniques
In 2003 Eckstein et al. [77] presented an approach which uses a modeling method called PBQP during pattern selection, and by taking an SSA graph as input it is one of the few techniques that address global instruction selection. We will therefore discuss this approach in greater detail, but first we must explain several concepts which are new to us. We begin by describing SSA.

Rewriting input programs into SSA form
SSA stands for static single assignment and is a form of program representation which is well-explained in most compiler textbooks (I recommend [59]). In essence, SSA restricts each variable or temporary in a function to be defined only once. The effect of this is that the live range of each variable (i.e. the portion of the code within which the value of the variable may not be destroyed) is contiguous, which in turn means that each variable corresponds to a single value. As a consequence many optimization routines can be simplified, and several modern compilers are based on this form (e.g. LLVM and GCC). However, the define-only-once restriction causes problems for variables whose value can come from more than one source, which occurs when if statements and for loops are involved. For example, the following C function (which computes the factorial)

1  function factorial(int n) {
2  init:
3    int i = 0;
4    int f = 1;
5  loop:
6    if (i < 10) goto end;
7    f = f * i;
8    i = i + 1;
9    goto loop;
10 end:
11   return f;
12 }

is not in SSA form as i and f are each defined multiple times (first at lines 3 and 4, and then at lines 7 and 8). Such situations are resolved through the use of φ-functions, which allow variables to be defined using one of several values that each originate from a distinct basic block. By declaring additional variables and inserting φ-functions at the beginning of the loop block, we can rewrite the example above into SSA form:

1  function factorial(int n) {
2  init:
3    int i1 = 0;
4    int f1 = 1;
5  loop:
6    int i2 = φ(i1, i3);
7    int f2 = φ(f1, f3);
8    if (i2 < 10) goto end;
9    int f3 = f2 * i2;
10   int i3 = i2 + 1;
11   goto loop;
12 end:
13   return f2;
14 }

The problem with φ-functions is that we now need to decide which value to use. For example, on line 6 in the SSA code above – and likewise for line 7 – we should use the value i1 for defining i2 on the first loop iteration, and then the value i3 for the following iterations. However, the details of how this is handled are out of scope for our purpose as we will be mostly interested in the SSA graph, which is akin to an expression graph. We will see an example of this shortly.

An example where global instruction selection is beneficial
Eckstein et al. recognized that limiting instruction selection to local scope can lessen the quality of fixed-point arithmetic assembly code generated for dedicated digital signal processors. A common idiosyncrasy of DSPs is that their fixed-point multiplication units will often leave the result shifted one bit to the left. Hence, if a value is computed by accumulating values from fixed-point multiplications, it should remain in shifted mode until all fixed-point multiplications have been performed, or else the accumulated value will needlessly be shifted back and forth. In (a) of Figure 5.3 we see an example of a C function where this can occur. A fixed-point value s is computed as the scalar product of two fixed-point arrays a and b.

1 int f(short* a, short* b, int n) {
2   int s = 0;
3   for (int i = 0; i < n; i++) {
4     s = abs(s) + a[i] * b[i];
5   }
6   return s;
7 }

(a) C function.  (b) Corresponding expression trees.
Figure 5.3: A fixed-point C function for which a local-scope instruction selector most likely would produce inefficient assembly code (from [77]).

1 int f(short* a, short* b, int n) {
2 init:
3   int s1 = 0;
4   int i1 = 0;
5 loop:
6   int i2 = φ(i1, i3);
7   int s2 = φ(s1, s3);
8   if (i2 < n) goto end;
9   s3 = abs(s2) + a[i2] * b[i2];
10  int i3 = i2 + 1;
11 end:
12  return s2;
13 }

(a) C function in SSA form.  (b) Corresponding SSA graph.
Figure 5.4: Same as Figure 5.3 but in SSA form (from [77]).

For efficient execution on a DSP, the value of s should be stored in shifted mode within the for loop (lines 3–5), and only shifted back before the return (line 6). In an instruction set grammar this can be modeled by introducing an additional nonterminal which represents the shifted mode, and then adding rules for transitioning from shifted mode to normal mode by emitting the appropriate shift instructions. However, efficient use of such modes is difficult to achieve if the instruction selector only considers one basic block at a time (i.e. is limited to local scope), since the definitions and uses of s occur in different expression trees (see (b) of Figure 5.3). By contrast, in the SSA graph all definitions and uses appear in the same graph. In Figure 5.4 we see the same function in SSA form, together with its corresponding SSA graph. Note that the SSA graph does not contain any nodes representing control flow, which will have to be dealt with separately. In addition, in order to arrange the generated assembly code into basic blocks the instruction selector has to take the original IR code as additional input.

Solving pattern selection with PBQP
In the approach by Eckstein et al., the pattern selection problem is solved by reducing it to a partitioned Boolean quadratic problem (PBQP). PBQP was first introduced by Scholz and Eckstein [225] in 2002 for solving register allocation, and is an extension of the quadratic assignment problem (QAP), which is a fundamental combinatorial optimization problem. Both QAP and PBQP are NP-complete, and Eckstein et al. therefore developed their own heuristic solver, which is also described in the referenced 2003 paper. We will explain PBQP by building the model bottom-up, starting with the definitions. To begin with, this design assumes that the machine instructions are given as a linear-form grammar – this was introduced on page 51 in Chapter 3 – where each rule is either a base rule or a chain rule. For each node n in the SSA graph we define a Boolean vector rn with a length equal to the number of base rules that match the node. Since every production is in linear form, a pattern match only requires that the operator of the base rule’s tree pattern matches the operator of a node in the SSA graph. The cost of selecting a matching base rule is given as another vector cn of equal length, where each element is the rule cost weighted against the estimated relative execution frequency of the node operation. This is needed to give higher priority to low-cost instructions for operations that reside in loops as these will have a greater impact on performance. With this we can define a cost function f of covering the SSA graph as

f = Σ_{1≤n≤|N|} rn^T · cn

where |N| is the number of nodes. We call this the accumulated base cost as it gives the total cost of applying base rules to cover the SSA graph, and the goal is to cover each SSA node exactly once (i.e. ∀ 1 ≤ n ≤ |N| : rn^T · 1 = 1) such that the value of f is minimized. However, this does not necessarily produce a valid covering as there is no connection between the base rules, meaning that a selected base rule may reduce to a different nonterminal than what is required by another selected base rule. This is addressed by introducing a cost matrix Cnm for every pair of nodes n and m in the SSA graph where there exists a directed edge from m to n. An element cij in Cnm then reflects the cost of selecting rule i for node n and rule j for node m, and the value is set as follows:

1. If rule j reduces m to the nonterminal expected at a certain position on the right-hand side in the production of rule i, then cij = 0.
2. If the nonterminal produced by rule j can be reduced to the expected nonterminal via a series of chain rules, then cij = Σ ck, where ck denotes the cost of an applied chain rule k.
3. Otherwise cij = ∞, which prevents selection of this rule combination.

These chain costs are calculated by computing the transitive closure of all chain rules. For this Eckstein et al. appear to have used the Floyd-Warshall algorithm [98], and Schäfer and Scholz [223] later discovered a method that computes the lowest cost for each cij by finding the optimal sequence of chain rules. Lastly, the costs are weighted according to the execution frequency. We now augment f by adding the accumulated chain cost, resulting in

f = Σ_{1≤n<m≤|N|} rn^T · Cnm · rm + Σ_{1≤n≤|N|} rn^T · cn

which is solved using a heuristic PBQP solver (also developed by Eckstein et al.). The astute reader may at this point have noticed that this scheme assumes that the SSA graph is not a multigraph – i.e. a graph where more than one edge can exist between the same pair of nodes – which prevents direct modeling of expressions such as y = x + x. Fortunately, the need for such edges can be removed by introducing new temporaries and linking these via value copies (e.g. t = x; y = x + t;). Eckstein et al. tested their implementation on a selected set of DSP problems which highlight the need for global instruction selection; the results indicate that the approach improved performance by 40–60% on average – and at most 82% for one problem – compared to a traditional tree-based instruction selector. The high gain in performance comes from a limitation of tree-based approaches where the program variables must be preassigned a nonterminal, which may have a detrimental effect on code quality; if chosen poorly, additional assembly code must be inserted to undo these decisions, thus reducing performance. Unfortunately, the design by Eckstein et al. is restricted to tree patterns, which hinders exploitation of many common target machine features such as multi-output instructions.
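The PBQP objective is easy to evaluate once the cost vectors and matrices are in place. The toy sketch below does so for a hypothetical SSA graph with two nodes, two matching base rules each, and a single edge; since each rn contains exactly one 1, selecting a rule amounts to picking an index, and a forbidden rule combination is modeled with a large constant standing in for ∞. All numbers are made up, and the exhaustive loop merely stands in for the heuristic PBQP solver used by Eckstein et al.

#include <stdio.h>

#define NRULE 2          /* matching base rules per node (hypothetical) */
#define INF   1000000.0  /* stand-in for an infinite (forbidden) cost   */

/* c[n][i]: weighted cost of covering node n with base rule i              */
static const double c[2][NRULE] = { {1.0, 3.0}, {2.0, 1.0} };

/* C01[i][j]: chain cost of selecting rule i for node 0 and rule j for
 * node 1 (there is an edge between the two nodes); INF forbids the pair.  */
static const double C01[NRULE][NRULE] = { {0.0, INF}, {2.0, 0.0} };

/* Evaluate f for one selection: r0 and r1 pick the rule of node 0 and 1.  */
static double f(int r0, int r1) {
    return c[0][r0] + c[1][r1] + C01[r0][r1];
}

int main(void) {
    int best0 = 0, best1 = 0;
    double best = INF;
    for (int r0 = 0; r0 < NRULE; r0++)        /* exhaustive search, toy size only */
        for (int r1 = 0; r1 < NRULE; r1++)
            if (f(r0, r1) < best) { best = f(r0, r1); best0 = r0; best1 = r1; }
    printf("select rule %d for node 0, rule %d for node 1, f = %.1f\n",
           best0, best1, best);
    return 0;
}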

Extending PBQP to DAG patterns
In 2008 Ebner et al. [76] extended the PBQP model by Eckstein et al. to handle DAG patterns. By replacing the default instruction selector in LLVM 2.1 – a greedy DAG rewriter – the performance of the generated assembly code improved by an average of 13% for a set of selected problems compiled for the ARMv5 processor, with negligible impact on compilation time. First the grammar format is extended to allow a rule to contain multiple productions – we refer to these as complex rules – which is done in a similar fashion as that of Scharwaechter et al. [224] (see page 72 in the previous chapter). We say that each production within a complex rule constitutes a proxy rule. The PBQP problem is then augmented to accommodate the selection of complex rules, which essentially entails introducing new vectors and matrices that decide whether a complex rule is selected, together with constraints that enforce that all corresponding proxy rules are also selected. We also need to prevent combinations of complex rules which incur cyclic data dependencies. With more than one cost matrix it becomes necessary to be able to distinguish one from another. We say that all base and proxy rules belong to the B category and all complex rules belong to the C category. A cost matrix is then written as C^{X→Y}, which indicates that it concerns the costs of transitioning from category X to Y. As an example, the cost matrix which we previously used to compute the accumulated chain cost is henceforth written as C^{B→B}_nm since it only concerns the base rules. We can now proceed with augmenting the PBQP model. First, for every node n in the SSA graph the rn vector is extended with the proxy rules that match at n. If two or more proxy rules derived from different complex rules are identical, the length of the vector only increases by 1 element. Second, we create an instance of a complex rule for every permutation of distinct nodes where matched proxy rules can be combined into such a complex rule. Each instance gives rise to a 2-element decision vector di which indicates whether the complex rule instance i is selected or not (i.e. a 1 in the first element indicates not selected and a 1 in the second element indicates selected).2 We then accumulate the costs of the selected complex rules like we did for the base rules by adding

Σ_{1≤i≤|I|} di^T · ci^C

to f, where |I| is the number of complex rule instances and ci^C is a 2-element cost vector whose elements are either 0 or the cost of the complex rule (depending on the arrangement of di). As previously stated, in order to select a complex rule instance i, all proxy rules of i must also be selected. We enforce this via a cost matrix C^{B→C}_ni, where n is a particular node in the SSA graph and i is a particular instance of a complex rule. An element cmj in C^{B→C}_ni is then set as follows:

◾ If j represents that i is not selected, then cmj = 0.
◾ If m is a base or proxy rule not associated with the complex rule of i, then cmj = 0.
◾ Otherwise cmj = ∞.

Hence by adding

Σ_{1≤n≤|N|} Σ_{1≤i≤|I|} rn^T · C^{B→C}_ni · di

to f we force the selection of the necessary proxy rules if di is set as selected. However, if the cost of all proxy rules is 0 then solutions are allowed where all proxy rules of a complex rule are selected but the instance itself is not selected. Ebner et al. solved this by setting a high cost M on all proxy rules and modifying the cost of all complex rules to cost(i) − |i| · M, where |i| is the number of proxy rules required by the complex rule i. This offsets the artificial cost of the selected proxy rules, thus bringing down the total cost of the selected proxy rules and the complex pattern to cost(i). Lastly, we need a cost matrix C^{C→C}_ij that prevents two complex pattern instances i and j from being selected simultaneously if they overlap or if such a combination incurs a cyclic data dependency. This is enforced by setting the elements in C^{C→C}_ij that correspond to such situations to ∞, and otherwise to 0, and augmenting f accordingly. Hence the complete definition of f becomes:

f = Σ_{1≤i<j≤|I|} di^T · C^{C→C}_ij · dj + Σ_{1≤n≤|N|} Σ_{1≤i≤|I|} rn^T · C^{B→C}_ni · di + Σ_{1≤i≤|I|} di^T · ci^C + Σ_{1≤n<m≤|N|} rn^T · C^{B→B}_nm · rm + Σ_{1≤n≤|N|} rn^T · cn

2Here one might wonder: Why not just use a Boolean variable instead of a 2-element vector? The reason has to do with the fact that the PBQP model must consist of only matrices and vectors, and for every vector the sum of its elements must always be exactly 1.

Using rewrite rules instead of grammar rules
Another technique based on PBQP was presented in 2010 by Buchwald and Zwinkau [39]. Unlike Eckstein et al. and Ebner et al., Buchwald and Zwinkau approached instruction selection as a formal graph transformation problem – for which much previous work already exists – and expressed the machine instructions as rewrite rules instead of grammar rules. By using a formal foundation, an automated tool can verify that the instruction selector handles all possible input programs; if this check fails the tool can provide the necessary rewrite rules that are missing from the instruction set. Buchwald and Zwinkau’s approach works as follows: first the SSA graph is converted into a DAG by splitting φ nodes into two nodes, which effectively breaks the cycles. After having found all applicable rewrite rules for the SSA DAG – which is done using traditional pattern matching – a corresponding PBQP instance is formed and solved as before using a PBQP solver. Buchwald and Zwinkau also discovered and addressed flaws in the PBQP solver by Eckstein et al., which may fail to find a solution in certain situations due to inadequate propagation of information. However, Buchwald and Zwinkau cautioned that their own implementation does not scale well when the number of overlapping patterns grows.

5.4 Approaches based on processor modeling

In the previous chapter we saw a microcode generation approach which models the entire data paths of the target processor instead of just the available instruction set. Here we will look at techniques which rely on the same modeling scheme but address the more traditional problem of instruction selection.

5.4.1 Chess In 1990 Lanneer et al. [166] developed an approach which later was adopted by Van Praet et al. [249, 250] in their implementation of Chess, a well-known compiler targeting DSPs and ASIPs (Application-Specific Instruction-set Processors) which was the outcome of a European project. When comparing Chess to MSSQ – which we read about in Section 4.3.5 – there are two striking differences. First, in MSSQ the data paths of the processor are given by a manually written description file, whereas in Chess the data paths are automatically derived from an nML-based specification [90, 91] which requires less manual effort. Second, the method of bundling – i.e. combining operations for parallel execution – is different: in MSSQ the instruction selector relies on DAG covering by attempting to find subgraphs in the data path-modeling graph for covering expression trees, and after pattern selection a subsequent routine attempts to schedule the selected instructions in parallel. In contrast Chess takes a more incremental approach by formulating a chaining graph, where each node represents an operation in the expression graph (which is allowed to contain cycles). The nodes are annotated with the functional units on which the operation can be executed. On a DSP processor the functional units are commonly grouped into functional building blocks

(FBBs), and an edge is added to the chaining graph for every pair of nodes that could potentially be executed on the same FBB (this is referred to as chaining). A heuristic algorithm then attempts to collapse nodes in the chaining graph by selecting an edge, replacing the two nodes with a new node, and then removing edges which can no longer be executed on the same FBB as the operations of the new node. This process iterates until no more nodes can be collapsed, at which point every remaining node in the chaining graph constitutes a bundle. The same authors later extended this approach in [249] to consider selection among all possible bundles via branch-and-bound search, and to allow the same operations in the expression graph to appear in multiple bundles. By forming bundles incrementally through the chaining graph, the approach of Van Praet et al. is capable of modeling entire functions as expression graphs and of bundling operations across basic block boundaries. Its integrated nature also allows efficient assembly code to be generated for complex architectures, making it suitable for DSPs and ASIPs where the data paths are typically very irregular, and since it is graph-based it appears to be extendable to model machine instructions with internal loops. However, it is not clear how interdependent machine instructions are handled in this approach.
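A minimal sketch of the chaining idea, under the assumption that each operation is annotated with a bitmask of the FBBs that can execute it: two nodes are chainable when their masks intersect, collapsing them keeps only the common FBBs, and the loop stops when no chainable pair remains. The FBB annotations are invented, and the real algorithm uses a heuristic to choose which edge to collapse, whereas this sketch simply takes the first one it finds.

#include <stdio.h>

#define NOP 5   /* operations in the expression graph (hypothetical) */

/* fbb[i]: bitmask of functional building blocks that can execute op i */
static int fbb[NOP]   = { 0x3, 0x1, 0x6, 0x4, 0x5 };
static int alive[NOP] = { 1, 1, 1, 1, 1 };
static int bundle[NOP];   /* which (collapsed) node each op ended up in */

int main(void) {
    for (int i = 0; i < NOP; i++) bundle[i] = i;
    for (;;) {                                   /* greedily collapse one edge */
        int a = -1, b = -1;
        for (int i = 0; i < NOP && a < 0; i++)
            for (int j = i + 1; j < NOP; j++)
                if (alive[i] && alive[j] && (fbb[i] & fbb[j])) { a = i; b = j; break; }
        if (a < 0) break;                        /* no chainable pair remains  */
        fbb[a] &= fbb[b];                        /* merged node: common FBBs   */
        alive[b] = 0;
        for (int i = 0; i < NOP; i++)            /* ops of b now belong to a   */
            if (bundle[i] == b) bundle[i] = a;
    }
    for (int i = 0; i < NOP; i++)
        printf("op%d -> bundle %d\n", i, bundle[i]);
    return 0;
}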

5.4.2 Simulated annealing Another, albeit unusual, technique was proposed by Visser [251]. Like MSSQ and Chess, Visser’s design is an integrated code generation technique, but applies the theories of simulated annealing [158] to improve the solutions. In brief terms, an initial solution is found by randomly mapping each node in the expression graph to a node in the hardware graph which models the processors, and a schedule is found using a heuristic list scheduler. A fitness function is then applied to judge the effectiveness of the solution – the exact details are omitted in the referenced paper – and various modifications are then performed in order to find more effective, neighboring solutions. A proof-of-concept prototype was developed and tested on a simple input program, but it appears no further research has been conducted on this idea.

5.5 Summary

In this chapter we have considered a number of instruction selection techniques that rely on graph covering. Compared to designs that operate solely on trees or DAGs, graph-based instruction selectors are the most powerful as both the input program and the machine instruction patterns can take arbitrary graphical shapes. This also enables global instruction selection as the expression graph can model an entire function of the input program instead of just single basic blocks, thus allowing for the generation of more efficient assembly code (as showcased by Eckstein et al.). However, with graphs the pattern matching also becomes an NP-complete problem, meaning that optimal graph covering consists of two NP-complete problems instead of “just” one for DAG covering. Consequently, it is most likely that we will only see such approaches applied in compilers whose users can afford very long compilation times (e.g. in embedded systems with extremely high demands on performance, code size, power consumption, or a combination thereof).

6

Conclusions

In this report we have discussed, examined, and assessed a vast majority of the various approaches to instruction selection.1 Starting with monolithic instruction selectors – which were created ad hoc and by hand – we have seen how the field advanced into retargetable macro-expanding designs, which were later superseded by the more powerful principle of tree covering. With the introduction of formal methodologies, it became possible to automatically generate the instruction selector from a declarative specification which describes the target machine. Tree covering in turn evolved into the more general forms of DAG and graph covering, at the expense of increased complexity. As optimal DAG and graph covering are both NP-complete problems, most such approaches apply heuristics to curb the search space. Simultaneously, developments were made to combine macro expansion with peephole optimization, which has proven a very effective methodology and is currently in use in GCC. An accessible overview of the studied approaches is given in Appendix A, and Appendix B contains a diagram which illustrates how the research has progressed over time.

6.1 Is instruction selection a closed problem?

Despite the tremendous advancements that have been made over the past 40 years, the instruction selection techniques constituting the current state-of-the-art still have several significant shortcomings; most notably, no approach – at least to my knowledge – is capable of modeling machine instructions with internal-loop characteristics. Today this impact is mitigated by augmenting the instruction selector with customized routines which detect when particular machine instructions can be exploited, but this is an error-prone and tedious task to implement. More flexible solutions are to use compiler intrinsics – which can be seen as additional node types in the expression graph that express more complicated operations such as √x – or to make calls to target-specific library functions – which are implemented directly in assembly – to execute particular operations. However, neither is ideal as extending the compiler with additional intrinsics to take advantage of new machine instructions typically requires a significant

1My ambition was to exhaust the entire body of literature, but this proved difficult as it is not clear where to draw the line as there exist many related problems – such as microcode generation and instruction set selection – that share many aspects with instruction selection. In addition, I have no doubts that at least one or two papers managed to escape my scrutiny.

amount of manual labor, and library functions must be rewritten by hand for each target machine. In addition, as we stated in the introduction, all three aspects of code generation must be taken into account simultaneously in order to generate truly optimal assembly code. Optimal instruction selection in isolation is futile for several reasons: for example, making efficient use of condition codes (or status flags as they are also called) is impossible without taking the instruction schedule into consideration, as one must make sure that the flags are not prematurely overwritten by another instruction. The same goes for VLIW architectures where multiple machine instructions can be bundled and executed in parallel. Another problem is rematerialization, which means that instead of retaining a value in a register until its next use, the value is recomputed. If the target machine has few registers, or the cost of spilling is prohibitively high, this can be an effective method of improving code quality. However, rematerialization is only beneficial if it actually assists register allocation, or if it can be done with low overhead (e.g. by exploiting parallel execution). The bond between instruction selection and register allocation becomes even tighter for target machines with multiple register classes that require a special set of instructions for accessing each register class. Having said this, most contemporary techniques only consider instruction selection in isolation, and it is often unclear whether they can be fully and efficiently integrated with instruction scheduling and register allocation.

6.2 Where to go from here?

These problems notwithstanding, the techniques based on methods from combinatorial opti- mization – i.e. integer programming, constraint programming, SAT, etc.; a few such examples covered in this report include Wilson et al. [263], Bashford and Leupers [28], Bednarski and Kessler [29], and Floch et al. [97] – have shown considerable promise: first, the underlying modeling mechanisms facilitate an integrated code generation approach; second, auxiliary constraints can often be added to the already existing model, thus allowing interdependent machine instructions to be exploited without having to resort to hand-written assembly code or customized optimization routines; and third, recent advancements in solver technology have made these kinds of techniques viable options for code generation (see for example Castañeda Lozano et al. [43]). However, current implementations are still orders-of-magnitude slower than their heuristic counterparts, indicating that these ideas are in need of further research. To conclude, although the field has come far since its initial attempts in the 1960s, instruc- tion selection is still – despite common belief – an evasive problem. Moreover, with the current trend being integrated code generation and increasingly more complicated target machines, it appears that instruction selection is in greater need of increased understanding than ever before.

7

Bibliography

[1] Adl-Tabatabai, A.-R., Langdale, G., Lucco, S., and Wahbe, R. “Efficient and Language- Independent Mobile Programs”. In: Proceedings of the ACM SIGPLAN 1996 Conference on Programming Language Design and Implementation. PLDI ’96. Philadelphia, Penn- sylvania, USA: ACM, 1996, pp. 127–136. isbn: 0-89791-795-2. [2] Ahn, M., Youn, J. M., Choi, Y., Cho, D., and Paek, Y. “Iterative Algorithm for Compound Instruction Selection with Register Coalescing”. In: Proceedings of the 12th Euromicro Conference on Digital System Design, Architectures, Methods and Tools. DSD ’09. IEEE Press, 2009, pp. 513–520. isbn: 978-0-7695-3782-5. [3] Aho, A. V. and Johnson, S. C. “Optimal Code Generation for Expression Trees”. In: Journal of the ACM 23.3 (1976), pp. 488–501. issn: 0004-5411. [4] Aho, A. V., Johnson, S. C., and Ullman, J. D. “Code Generation for Expressions with Common Subexpressions”. In: Proceedings of the 3rd ACM SIGACT-SIGPLAN symposium on Principles on Programming Languages. POPL ’76. Atlanta, Georgia, USA: ACM, 1976, pp. 19–31. [5] Aho, A. V. and Corasick, M. J. “Efficient String Matching: An Aid to Bibliographic Search”. In: Communications of the ACM 18.6 (1975), pp. 333–340. issn: 0001-0782. [6] Aho, A. V. and Ganapathi, M. “Efficient Tree Pattern Matching: An Aid to Code Gen- eration”. In: Proceedings of the 12th ACM SIGACT-SIGPLAN symposium on Principles of Programming Languages. POPL ’85. New Orleans, Louisiana, USA: ACM, 1985, pp. 334–340. isbn: 0-89791-147-4. [7] Aho, A. V., Ganapathi, M., and Tjiang, S. W. K. “Code Generation Using Tree Matching and Dynamic Programming”. In: ACM Trans. Program. Lang. Syst. 11.4 (1989), pp. 491–516. issn: 0164-0925. [8] Aho, V. A., Sethi, R., and Ullman, J. D. Compilers: Principles, Techniques, and Tools. 2nd ed. Addison-Wesley, 2006. isbn: 978-0321486813. [9] Aigrain, P., Graham, S. L., Henry, R. R., McKusick, M. K., and Pelegrí-Llopart, E. “Experience with a Graham-Glanville Style Code Generator”. In: Proceedings of the 1984 SIGPLAN symposium on Compiler Construction. SIGPLAN ’84. Montreal, Canada: ACM, 1984, pp. 13–24. isbn: 0-89791-139-3.

94 [10] Almer, O., Bennett, R., Böhm, I., Murray, A., Qu, X., Zuluaga, M., Franke, B., and Topham, N. “An End-to-End Design Flow for Automated Instruction Set Extension and Complex Instruction Selection based on GCC”. In: 1st International Workshop on GCC Research Opportunities. GROW ’09. Paphos, Cyprus, 2009, pp. 49–60. [11] Ammann, U., Nori, K. V., Jensen, K., and Nägeli, H. The PASCAL (P) Compiler Imple- mentation Notes. Tech. rep. Eidgenössische Technishe Hochschule, Zürich, Switzerland: Instituts für Informatik, 1974. [12] Ammann, U. On Code Generation in a PASCAL Compiler. Tech. rep. Eidgenössische Technishe Hochschule, Zürich, Switzerland: Instituts für Informatik, 1977. [13] Anderson, J. P. “A Note on Some Compiling Algorithms”. In: Communications of the ACM 7.3 (1964), pp. 149–150. issn: 0001-0782. [14] Appel, A., Davidson, J., and Ramsey, N. The Zephyr Compiler Infrastructure. Tech. rep. 1998. [15] Appel, A. W. and Palsberg, J. Modern Compiler Implementation in Java. 2nd ed. Cam- bridge University Press, 2002. isbn: 0-521-82060-X. [16] Araujo, G. and Malik, S. “Optimal Code Generation for Embedded Memory Non- Homogeneous Register Architectures”. In: Proceedings of the 8th International sympo- sium on System Synthesis. ISSS ’95. Cannes, France, 1995, pp. 36–41. isbn: 0-89791-771-5. [17] Araujo, G., Malik, S., and Lee, M. T.-C. “Using Register-Transfer Paths in Code Generation for Heterogeneous Memory-Register Architectures”. In: Proceedings of the 33rd annual Design Automation Conference. DAC ’96. Las Vegas, Nevada, USA: ACM, 1996, pp. 591–596. isbn: 0-89791-779-0. [18] Arnold, M. Matching and Covering with Multiple-Output Patterns. Tech. rep. 1-68340-44. Delft, The Netherlands: Delft University of Technology, 1999. [19] Arnold, M. and Corporaal, H. “Automatic Detection of Recurring Operation Patterns”. In: Proceedings of the seventh International Workshop on Hardware/Software Codesign. CODES ’99. Rome, Italy: ACM, 1999, pp. 22–26. isbn: 1-58113-132-1. [20] Arnold, M. and Corporaal, H. “Designing Domain-Specific Processors”. In: Proceed- ings of the 9th International symposium on Hardware/Software Codesign. CODES ’01. Copenhagen, Denmark: ACM, 2001, pp. 61–66. isbn: 1-58113-364-2. [21] Arora, N., Chandramohan, K., Pothineni, N., and Kumar, A. “Instruction Selection in ASIP Synthesis Using Functional Matching”. In: Proceedings of the 23rd International Conference on VLSI Design. VLSID ’10. 2010, pp. 146–151. [22] Arslan, M. A. and Kuchcinski, K. “Instruction Selection and Scheduling for DSP Kernels on Custom Architectures”. In: Proceedings of the 16th EUROMICRO Conference on Digital System Design. DSD ’13. Santanader, Cantabria, Spain: IEEE Press, Sept. 4–6, 2013. [23] Associated Compiler Experts, ACE. COSY Compilers: Overview of Construction and Operation. CoSy-8004-construct. 2003.

95 [24] Auslander, M. and Hopkins, M. “An Overview of the PL.8 Compiler”. In: Proceedings of the 1982 SIGPLAN symposium on Compiler Construction. SIGPLAN ’82. Boston, Massachusetts, USA: ACM, 1982, pp. 22–31. isbn: 0-89791-074-5. [25] Bailey, M. W. and Davidson, J. W. “Automatic Detection and Diagnosis of Faults in Generated Code for Procedure Calls”. In: IEEE Transactions on Software Engineering 29.11 (2003), pp. 1031–1042. issn: 0098-5589. [26] Balachandran, A., Dhamdhere, D. M., and Biswas, S. “Efficient Retargetable Code Generation Using Bottom-Up Tree Pattern Matching”. In: Computer Languages 15.3 (1990), pp. 127–140. issn: 0096-0551. [27] Balakrishnan, M., Bhatt, P. C. P., and Madan, B. B. “An Efficient Retargetable Mi- crocode Generator”. In: Proceedings of the 19th annual workshop on Microprogramming. MICRO 19. New York, New York, USA: ACM, 1986, pp. 44–53. isbn: 0-8186-0736-X. [28] Bashford, S. and Leupers, R. “Constraint Driven Code Selection for Fixed-Point DSPs”. In: Proceedings of the 36th annual ACM/IEEE Design Automation Conference. DAC ’99. New Orleans, Louisiana, United States: ACM, 1999, pp. 817–822. isbn: 1-58113-109-7. [29] Bednarski, A. and Kessler, C. W. “Optimal Integrated VLIW Code Generation with Integer Linear Programming”. In: Proceedings of the 12th International Euro-Par Con- ference. Vol. 4128. Lecture Notes in Computer Science. Dresden, Germany: Springer, 2006, pp. 461–472. [30] Bendersky, E. A Deeper Look into the LLVM Code Generator: Part 1. Feb. 25, 2013. url: http://eli.thegreenplace.net/2013/02/25/a-deeper-look-into-the-llvm-code-generator- part-1/ (visited on 05/10/2013). [31] Borchardt, B. “Code Selection by Tree Series Transducers”. In: Proceedings of the 9th International Conference on Implementation and Application of Automata. CIAA ’04. Sophia Antipolis, France: Springer, 2004, pp. 57–67. isbn: 978-3-540-24318-2. [32] Boulytchev, D. “BURS-Based Instruction Set Selection”. In: Proceedings of the 6th International Andrei Ershov memorial Conference on Perspectives of Systems Informatics. PSI ’06. Novosibirsk, Russia: Springer, 2007, pp. 431–437. isbn: 978-3-540-70880-3. [33] Boulytchev, D. and Lomov, D. “An Empirical Study of Retargetable Compilers”. In: Proceedings of the 4th International Andrei Ershov Memorial Conference on Perspectives of System Informatics (PSI ’01). Ed. by Bjørner, D., Broy, M., and Zamulin, A. V. Vol. 2244. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2001, pp. 328–335. isbn: 978-3-540-43075-9. [34] Brandner, F. “Completeness of Automatically Generated Instruction Selectors”. In: Proceedings of the 21st IEEE International Conference on Application-specific Systems Architectures and Processors. ASAP ’10. 2010, pp. 175–182. [35] Brandner, F., Ebner, D., and Krall, A. “Compiler Generation from Structural Architec- ture Descriptions”. In: Proceedings of the 2007 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems. CASES ’07. Salzburg, Austria: ACM, 2007, pp. 13–22. isbn: 978-1-59593-826-8.

96 [36] Bravenboer, M. and Visser, E. “Rewriting Strategies for Instruction Selection”. In: Proceedings of the 13th International Conference on Rewriting Techniques and Applications (RTA ’02). Ed. by Tison, S. Vol. 2378. Lecture Notes in Computer Science. Springer, 2002, pp. 237–251. isbn: 978-3-540-43916-5. [37] Brisk, P., Nahapetian, A., and Sarrafzadeh, M. “Instruction Selection for Compilers That Target Architectures with Echo Instructions”. In: Proceedings of the 8th Inter- national Workshop on Software and Compilers for Embedded Systems (SCOPES ’04). Ed. by Schepers, H. Vol. 3199. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2004, pp. 229–243. isbn: 978-3-540-23035-9. [38] Bruno, J. and Sethi, R. “Code Generation for a One-Register Machine”. In: Journal of the ACM 23.3 (1976), pp. 502–510. issn: 0004-5411. [39] Buchwald, S. and Zwinkau, A. “Instruction Selection by Graph Transformation”. In: Proceedings of the 2010 International Conference on Compilers, Architectures and Synthesis for Embedded Systems. CASES ’10. Scottsdale, Arizona, USA: ACM, 2010, pp. 31–40. isbn: 978-1-60558-903-9. [40] Cai, J., Paige, R., and Tarjan, R. “More Efficient Bottom-Up Multi-Pattern Matching in Trees”. In: Theoretical Computer Science 106.1 (1992), pp. 21–60. issn: 0304-3975. [41] Canalda, P., Cognard, L., Despland, A., Jourdan, M., Mazaud, M., Parigot, D., Thomas- set, F., and Voluceau, D. de. PAGODE: A Realistic Back-End Generator. Tech. rep. Rocquencourt, France: INRIA, 1995. [42] Cao, Z., Dong, Y., and Wang, S. “Compiler Backend Generation for Application Specific Instruction Set Processors”. In: Proceedings of the 9th Asian Symposium on Programming Languages and Systems (APLAS ’11). Ed. by Yang, H. Vol. 7078. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2011, pp. 121–136. isbn: 978-3-642-25317-1. [43] Castañeda Lozano, R., Carlsson, M., Drejhammar, F., and Schulte, C. “Constraint- Based Register Allocation and Instruction Scheduling”. In: Proceedings of the 18th International Conference on the Principles and Practice of Constraint Programming (CP ’12). Ed. by Milano, M. Vol. 7514. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2012, pp. 750–766. isbn: 978-3-642-33557-0. [44] Cattell, R. G. “Automatic Derivation of Code Generators from Machine Descriptions”. In: ACM Transactions on Programming Languages and Systems 2.2 (1980), pp. 173–190. issn: 0164-0925. [45] Cattell, R. G. G. A Survey and Critique of Some Models of Code Generation. Tech. rep. Pittsburgh, Pennsylvania, USA: School of Computer Science, Carnegie Mellon University, 1979. [46] Cattell, R. G. G. “Formalization and Automatic Derivation of Code Generators”. AAI7815197. PhD thesis. Pittsburgh, Pennsylvania, USA: Carnegie Mellon University, 1978.

97 [47] Cattell, R. G., Newcomer, J. M., and Leverett, B. W. “Code Generation in a Machine- Independent Compiler”. In: Proceedings of the 1979 SIGPLAN symposium on Compiler construction. SIGPLAN ’79. Denver, Colorado, USA: ACM, 1979, pp. 65–75. isbn: 0-89791-002-8. [48] Ceruzzi, P. E. A History of Modern Computing. 2nd ed. The MIT Press, 2003. isbn: 978-0262532037. [49] Chase, D. R. “An Improvement to Bottom-Up Tree Pattern Matching”. In: Proceedings of the 14th ACM SIGACT-SIGPLAN symposium on Principles of programming languages. POPL ’87. Munich, West Germany: ACM, 1987, pp. 168–177. isbn: 0-89791-215-2. [50] Chen, D., Batson, R. G., and Dang, Y. Applied Integer Programming: Modeling and Solution. Wiley, 2010. isbn: 978-0-470-37306-4. [51] Chen, T., Lai, F., and Shang, R. “A Simple Tree Pattern Matching Algorithm for Code Generator”. In: Proceedings of the 19th annual International Conference on Computer Software and Applications. COMPSAC ’95. Dallas, Texas, USA: IEEE Press, 1995, pp. 162–167. [52] Christopher, T. W., Hatcher, P. J., and Kukuk, R. C. “Using Dynamic Programming to Generate Optimized Code in a Graham-Glanville Style Code Generator”. In: Pro- ceedings of the 1984 SIGPLAN symposium on Compiler construction. SIGPLAN ’84. Montreal, Canada: ACM, 1984, pp. 25–36. isbn: 0-89791-139-3. [53] Clark, N., Zhong, H., and Mahlke, S. “Processor Acceleration Through Automated Instruction Set Customization”. In: Proceedings of the 36th annual IEEE/ACM Inter- national Symposium on Microarchitecture. MICRO 36. IEEE Press, 2003, pp. 129–140. isbn: 0-7695-2043-X. [54] Clark, N., Hormati, A., Mahlke, S., and Yehia, S. “Scalable Subgraph Mapping for Acyclic Computation Accelerators”. In: Proceedings of the 2006 International Conference on Compilers, Architecture and Synthesis for Embedded Systems. CASES ’06. Seoul, Korea: ACM, 2006, pp. 147–157. isbn: 1-59593-543-6. [55] Cole, R. and Hariharan, R. “Tree Pattern Matching and Subset Matching in Ran- domized O n log3 m Time”. In: Proceedings of the 29th annual ACM symposium on ( ) Theory of Computing. STOC ’97. El Paso, Texas, USA: ACM, 1997, pp. 66–75. isbn: 0-89791-888-6. [56] Cong, J., Fan, Y., Han, G., and Zhang, Z. “Application-Specific Instruction Generation for Configurable Processor Architectures”. In: Proceedings of the 2004 ACM/SIGDA 12th International symposium on Field Programmable Gate Arrays. FPGA ’04. Monterey, California, USA: ACM, 2004, pp. 183–189. isbn: 1-58113-829-6. [57] Conway, M. E. “Proposal for an UNCOL”. In: Communications of the ACM 1.10 (1958), pp. 5–8. issn: 0001-0782. [58] Cook, S. A. “The Complexity of Theorem-Proving Procedures”. In: Proceedings of the 3rd annual ACM Symposium on Theory Of Computing. STOC ’71. Shaker Heights, Ohia, USA: ACM, 1971, pp. 151–158.

98 [59] Cooper, K. D. and Torczon, L. Engineering a Compiler. Morgan Kaufmann, 2004. isbn: 1-55860-699-8. [60] Cordella, L. P., Foggia, P., Sansone, C., and Vento, M. “An Improved Algorithm for Matching Large Graphs”. In: Proceedings of the 3rd IAPR-TC15 Workshop on Graph-based Representations in Pattern Recognition. 2001, pp. 149–159. [61] Cordone, R., Ferrandi, F., Sciuto, D., and Wolfler Calvo, R. “An Efficient Heuristic Approach to Solve the Unate Covering Problem”. In: Proceedings of the Conference and exhibition on Design, Automation and Test in Europe. DATE ’00. 2000, pp. 364–371. [62] Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. Introduction to Algorithms. 3rd ed. The MIT Press, 2009. isbn: 978-0262033848. [63] Davidson, J. W. and Fraser, C. W. “Code Selection Through Object Code Optimiza- tion”. In: Transactions on Programming Languages and Systems 6.4 (1984), pp. 505–526. issn: 0164-0925. [64] Davidson, J. W. and Fraser, C. W. “Eliminating Redundant Object Code”. In: Pro- ceedings of the 9th ACM SIGPLAN-SIGACT symposium on Principles of Programming Languages. POPL ’82. Albuquerque, New Mexico, USA: ACM, 1982, pp. 128–132. isbn: 0-89791-065-6. [65] Davidson, J. W. and Fraser, C. W. “The Design and Application of a Retargetable Peephole Optimizer”. In: Transactions on Programming Languages and Systems 2.2 (1980), pp. 191–202. issn: 0164-0925. [66] Despland, A., Mazaud, M., and Rakotozafy, R. “Code Generator Generation Based on Template-Driven Target Term Rewriting”. In: Rewriting Techniques and Applications. Bordeaux, France: Springer-Verlag, 1987, pp. 105–120. isbn: 0-387-17220-3. [67] Despland, A., Mazaud, M., and Rakotozafy, R. “Using Rewriting Techniques to Pro- duce Code Generators and Proving Them Correct”. In: Science of Computer Program- ming 15.1 (1990), pp. 15–54. issn: 0167-6423. [68] Deutsch, L. P. and Schiffman, A. M. “Efficient Implementation of the Smalltalk-80 System”. In: Proceedings of the 11th ACM SIGACT-SIGPLAN symposium on Principles of Programming Languages. POPL ’84. Salt Lake City, Utah, USA: ACM, 1984, pp. 297– 302. isbn: 0-89791-125-3. [69] Dias, J. and Ramsey, N. “Automatically Generating Instruction Selectors Using Declar- ative Machine Descriptions”. In: Proceedings of the 37th annual ACM SIGPLAN-SIGACT symposium on Principles of Programming Languages. POPL ’10. Madrid, Spain: ACM, 2010, pp. 403–416. isbn: 978-1-60558-479-9. [70] Dias, J. and Ramsey, N. “Converting Intermediate Code to Assembly Code Using Declarative Machine Descriptions”. In: Proceedings of the 15th International Conference on Compiler Construction. CC ’06. Vienna, Austria: Springer-Verlag, 2006, pp. 217–231. isbn: 3-540-33050-X, 978-3-540-33050-9.

99 [71] Dold, A., Gaul, T., Vialard, V., and Zimmermann, W. “ASM-Based Mechanized Verifi- cation of Compiler Back-Ends”. In: Proceedings of the 5th International Workshop on Abstract State Machines. Ed. by Glässer, U. and Schmitt, P. H. Magdeburg, Germany, 1998, pp. 50–67. [72] Donegan, M. K. “An Approach to the Automatic Generation of Code Generators”. AAI7321548. PhD thesis. Houston, Texas, USA: Rice University, 1973. [73] Dubiner, M., Galil, Z., and Magen, E. “Faster Tree Pattern Matching”. In: Journal of the ACM 41.2 (1994), pp. 205–213. issn: 0004-5411. [74] Earley, J. “An Efficient Context-Free Parsing Algorithm”. In: Communications of the ACM 13.2 (1970), pp. 94–102. issn: 0001-0782. [75] Ebeling, C. and Zaijicek, O. “Validating VLSI Circuit Layout by Wirelist Comparison”. In: Proceedings of the International Conference on Computer-Aided Design. ICCAD ’83. 1983, pp. 172–173. [76] Ebner, D., Brandner, F., Scholz, B., Krall, A., Wiedermann, P., and Kadlec, A. “Gen- eralized Instruction Selection Using SSA-Graphs”. In: Proceedings of the 2008 ACM SIGPLAN-SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems. LCTES ’08. Tucson, Arizona, USA: ACM, 2008, pp. 31–40. isbn: 978-1-60558-104-0. [77] Eckstein, E., König, O., and Scholz, B. “Code Instruction Selection Based on SSA- Graphs”. In: Proceedings of the 7th International Workhops on Software and Compilers for Embedded Systems (SCOPES ’03). Ed. by Krall, A. Vol. 2826. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2003, pp. 49–65. isbn: 978-3-540-20145-8. [78] Elson, M. and Rake, S. T. “Code-Generation Technique for Large-Language Compilers”. In: IBM Systems Journal 9.3 (1970), pp. 166–188. issn: 0018-8670. [79] Emmelmann, H., Schröer, F.-W., and Landwehr, R. “BEG: A Generator for Efficient Back Ends”. In: Proceedings of the ACM SIGPLAN 1989 Conference on Programming Language Design and Implementation. PLDI ’89. Portland, Oregon, USA: ACM, 1989, pp. 227–237. isbn: 0-89791-306-X. [80] Emmelmann, H. “Testing Completeness of Code Selector Specifications”. In: Proceed- ings of the 4th International Conference on Compiler Construction. CC ’92. Springer- Verlag, 1992, pp. 163–175. isbn: 3-540-55984-1. [81] Engelfriet, J., Fülöp, Z., and Vogler, H. “Bottom-Up and Top-Down Tree Series Transformations”. In: Journal of Automata, Languages and Combinatorics 7.1 (July 2001), pp. 11–70. issn: 1430-189X. [82] Engler, D. R. “VCODE: A Retargetable, Extensible, Very Fast Dynamic Code Genera- tion System”. In: Proceedings of the ACM SIGPLAN 1996 Conference on Programming Language Design and Implementation. PLDI ’96. Philadelphia, Pennsylvania, USA: ACM, 1996, pp. 160–170. isbn: 0-89791-795-2.

100 [83] Engler, D. R. and Proebsting, T. A. “DCG: An Efficient, Retargetable Dynamic Code Generation System”. In: Proceedings of the sixth International Conference on Architectural Support for Programming Languages and Operating Systems. ASPLOS VI. San Jose, California, USA: ACM, 1994, pp. 263–272. isbn: 0-89791-660-3. [84] Eriksson, M. V., Skoog, O., and Kessler, C. W. “Optimal vs. Heuristic Integrated Code Generation for Clustered VLIW Architectures”. In: Proceedings of the 11th Interna- tional Workshop on Software & Compilers for Embedded Systems. SCOPES ’08. Munich, Germany: ACM, 2008, pp. 11–20. [85] Ertl, M. A. “Optimal Code Selection in DAGs”. In: Proceedings of the 26th ACM SIGPLAN-SIGACT symposium on Principles Of Programming Languages. POPL ’99. San Antonio, Texas, USA: ACM, 1999, pp. 242–249. isbn: 1-58113-095-3. [86] Ertl, M. A., Casey, K., and Gregg, D. “Fast and Flexible Instruction Selection with On-Demand Tree-Parsing Automata”. In: Proceedings of the 2006 ACM SIGPLAN Conference on Programming language design and implementation. PLDI ’06. Ottawa, Ontario, Canada: ACM, 2006, pp. 52–60. isbn: 1-59593-320-4. [87] Fan, W., Li, J., Ma, S., Tang, N., Wu, Y., and Wu, Y. “Graph Pattern Matching: From Intractable to Polynomial Time”. In: Proceedings of the VLDB Endowment 3.1-2 (2010), pp. 264–275. issn: 2150-8097. [88] Fan, W., Li, J., Luo, J., Tan, Z., Wang, X., and Wu, Y. “Incremental Graph Pattern Matching”. In: Proceedings of the 2011 ACM SIGMOD International Conference on Management Of Data. SIGMOD ’11. Athens, Greece: ACM, 2011, pp. 925–936. isbn: 978-1-4503-0661-4. [89] Farfeleder, S., Krall, A., Steiner, E., and Brandner, F. “Effective Compiler Generation by Architecture Description”. In: Proceedings of the 2006 ACM SIGPLAN/SIGBED Conference on Language, Compilers, and Tool support for Embedded Systems. LCTES ’06. Ottawa, Ontario, Canada: ACM, 2006, pp. 145–152. isbn: 1-59593-362-X. [90] Fauth, A., Van Praet, J., and Freericks, M. “Describing Instruction Set Processors Using nML”. In: Proceedings of the 1995 European Conference on Design and Test. EDTC ’95. IEEE Press, 1995, pp. 503–507. isbn: 0-8186-7039-8. [91] Fauth, A., Freericks, M., and Knoll, A. “Generation of Hardware Machine Models from Instruction Set Descriptions”. In: Proceedings of the IEEE Workshop on VLSI Signal Processing. VI’ 93. 1993, pp. 242–250. [92] Fauth, A., Hommel, G., Knoll, A., and Müller, C. “Global Code Selection of Di- rected Acyclic Graphs”. In: Proceedings of the 5th International Conference on Compiler Construction. CC ’94. Springer-Verlag, 1994, pp. 128–142. isbn: 3-540-57877-3. [93] Feldman, J. and Gries, D. “Translator Writing Systems”. In: Communications of the ACM 11.2 (1968), pp. 77–113. issn: 0001-0782. [94] Ferdinand, C., Seidl, H., and Wilhelm, R. “Tree Automata for Code Selection”. In: Acta Informatica 31.9 (1994), pp. 741–760. issn: 0001-5903.

101 [95] Fernández, M. and Ramsey, N. “Automatic Checking of Instruction Specifications”. In: Proceedings of the 19th International Conference on Software Engineering. ICSE ’97. Boston, Massachusetts, USA: ACM, 1997, pp. 326–336. isbn: 0-89791-914-9. [96] Fischer, C. N., Cytron, R. K., and LeBlanc, R. J. J. Crafting a Compiler. Pearson, 2009. isbn: 978-0138017859. [97] Floch, A., Wolinski, C., and Kuchcinski, K. “Combined Scheduling and Instruction Selection for Processors with Reconfigurable Cell Fabric”. In: Proceedings of the 21st International Conference on Application-specific Systems Architectures and Processors. ASAP ’10. 2010, pp. 167–174. [98] Floyd, R. W. “Algorithm 97: Shortest Path”. In: Communications of the ACM 5.6 (1962), p. 345. issn: 0001-0782. [99] Floyd, R. W. “An Algorithm for Coding Efficient Arithmetic Operations”. In: Com- munications of the ACM 4.1 (1961), pp. 42–51. issn: 0001-0782. [100] Fraser, C. W. “A Language for Writing Code Generators”. In: Proceedings of the ACM SIGPLAN 1989 Conference on Programming Language Design and Implementation. PLDI ’89. Portland, Oregon, USA: ACM, 1989, pp. 238–245. isbn: 0-89791-306-X. [101] Fraser, C. W. and Wendt, A. L. “Automatic Generation of Fast Optimizing Code Generators”. In: Proceedings of the ACM SIGPLAN 1988 Conference on Programming Language Design and Implementation. PLDI ’88. Atlanta, Georgia, USA: ACM, 1988, pp. 79–84. isbn: 0-89791-269-1. [102] Fraser, C. W. “A Compact, Machine-Independent Peephole Optimizer”. In: Proceedings of the 6th ACM SIGACT-SIGPLAN symposium on Principles of Programming Languages. POPL ’79. San Antonio, Texas, USA: ACM, 1979, pp. 1–6. [103] Fraser, C. W. and Proebsting, T. A. “Finite-State Code Generation”. In: Proceedings of the ACM SIGPLAN 1999 Conference on Programming Language Design and Implementa- tion. PLDI ’99. Atlanta, Georgia, USA: ACM, 1999, pp. 270–280. isbn: 1-58113-094-5. [104] Fraser, C. W., Hanson, D. R., and Proebsting, T. A. “Engineering a Simple, Efficient Code-Generator Generator”. In: ACM Letters on Programming Languages and Systems 1.3 (1992), pp. 213–226. issn: 1057-4514. [105] Fraser, C. W., Henry, R. R., and Proebsting, T. A. “BURG – Fast Optimal Instruction Selection and Tree Parsing”. In: SIGPLAN Notices 27.4 (1992), pp. 68–76. issn: 0362- 1340. [106] Fraser, C. W. “A Knowledge-Based Code Generator Generator”. In: Proceedings of the Symposium on Artificial Intelligence and Programming Languages. ACM, 1977, pp. 126– 129. [107] Fraser, C. W. “Automatic Generation of Code Generators”. PhD thesis. New Haven, Connecticut, USA: Yale University, 1977. [108] Fröhlich, S., Gotschlich, M., Krebelder, U., and Wess, B. “Dynamic Trellis Diagrams for Optimized DSP Code Generation”. In: Proceedings of the IEEE International symposium on Circuits and Systems. ISCAS ’99. IEEE Press, 1999, pp. 492–495.

102 [109] Gallagher, B. The State of the Art in Graph-Based Pattern Matching. Tech. rep. UCRL- TR-220300. Livermore, California, USA: Lawrence Livermore National Laboratory, Mar. 31, 2006. [110] Galuzzi, C. and Bertels, K. “The Instruction-Set Extension Problem: A Survey”. In: Reconfigurable Computing: Architectures, Tools and Applications (ARC ’08). Ed. by Woods, R., Compton, K., Bouganis, C., and Diniz, P. C. Vol. 4943. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2008, pp. 209–220. isbn: 978-3-540-78609-2. [111] Ganapathi, M. “Prolog Based Retargetable Code Generation”. In: Computer Languages 14.3 (1989), pp. 193–204. issn: 0096-0551. [112] Ganapathi, M. “Retargetable Code Generation and Optimization Using Attribute Grammars”. AAI8107834. PhD thesis. Madison, Wisconsin, USA: The University of Wisconsin–Madison, 1980. [113] Ganapathi, M. and Fischer, C. N. “Affix Grammar Driven Code Generation”. In: ACM Transactions on Programming Languages and Systems 7.4 (1985), pp. 560–599. issn: 0164-0925. [114] Ganapathi, M. and Fischer, C. N. “Description-Driven Code Generation Using At- tribute Grammars”. In: Proceedings of the 9th ACM SIGPLAN-SIGACT symposium on Principles of programming languages. POPL ’82. Albuquerque, New Mexico, USA: ACM, 1982, pp. 108–119. isbn: 0-89791-065-6. [115] Ganapathi, M. and Fischer, C. N. Instruction Selection by Attributed Parsing. Tech. rep. No. 84-256. Stanford, California, USA: Stanford University, 1984. [116] Ganapathi, M., Fischer, C. N., and Hennessy, J. L. “Retargetable Compiler Code Generation”. In: ACM Computing Surveys 14.4 (1982), pp. 573–592. issn: 0360-0300. [117] Gecseg, F. and Steinby, M. Tree Automata. Akademiai Kiado, 1984. isbn: 978-96305317- 02. [118] Genin, D., De Moortel, J., Desmet, D., and Velde, E. Van de. “System design, Opti- mization and Intelligent Code Generation for Standard Digital Signal Processors”. In: Proceedings of the IEEE International Symposium on Circuits and Systems. ISCAS ’90. 1989, pp. 565–569. [119] Giegerich, R. “A Formal Framework for the Derivation of Machine-Specific Optimiz- ers”. In: Transactions on Programming Languages and Systems 5.3 (1983), pp. 478–498. issn: 0164-0925. [120] Giegerich, R. and Schmal, K. “Code Selection Techniques: Pattern Matching, Tree Parsing, and Inversion of Derivors”. In: Proceedings of the 2nd European Symposium on Programming. Ed. by Ganzinger, H. Vol. 300. ESOP ’88. Nancy, France: Springer, 1998, pp. 247–268. isbn: 978-3-540-19027-1. [121] Glanville, R. S. and Graham, S. L. “A New Method for Compiler Code Generation”. In: Proceedings of the 5th ACM SIGACT-SIGPLAN symposium on Principles of programming languages. POPL ’78. Tucson, Arizona: Springer, 1978, pp. 231–254.

103 [122] Goldberg, D. E. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley, 1989. isbn: 978-0201157673. [123] Goldberg, E. I., Carloni, L. P., Villa, T., Brayton, R. K., and Sangiovanni-Vincentelli, A. L. “Negative Thinking in Branch-and-Bound: The Case of Unate Covering”. In: Transactions of Computer-Aided Design of Integrated Ciruits and Systems 19.3 (2006), pp. 281–294. issn: 0278-0070. [124] Gough, K. J. Bottom-Up Tree Rewriting Tool MBURG. Tech. rep. Brisbane, Australia: Faculty of Information Technology, Queensland University of Technology, July 18, 1995. [125] Gough, K. J. and Ledermann, J. “Optimal Code-Selection using MBURG”. In: Proceed- ings of the 20th Australasian Computer Science Conference. ACSC ’97. Sydney, Australia, 1997. [126] Graham, S. L. “Table-Driven Code Generation”. In: Computer 13.8 (1980). IEEE Press, pp. 25–34. issn: 0018-9162. [127] Graham, S. L., Henry, R. R., and Schulman, R. A. “An Experiment in Table Driven Code Generation”. In: Proceedings of the 1982 SIGPLAN symposium on Compiler con- struction. SIGPLAN ’82. Boston, Massachusetts, USA: ACM, 1982, pp. 32–43. isbn: 0-89791-074-5. [128] Granlund, T. and Kenner, R. “Eliminating Branches Using a Superoptimizer and the GNU C Compiler”. In: Proceedings of the ACM SIGPLAN 1992 Conference on Programming Language Design and Implementation. PLDI ’92. San Francisco, California, USA: ACM, 1992, pp. 341–352. isbn: 0-89791-475-9. [129] Guo, Y., Smit, G. J., Broersma, H., and Heysters, P. M. “A Graph Covering Algorithm for a Coarse Grain Reconfigurable System”. In: Proceedings of the 2003 ACM SIGPLAN Conference on Language, Compiler, and Tool for Embedded Systems. LCTES ’03. San Diego, California, USA: ACM, 2003, pp. 199–208. isbn: 1-58113-647-1. [130] Hanono, S. and Devadas, S. “Instruction Selection, Resource Allocation, and Scheduling in the AVIV Retargetable Code Generator”. In: Proceedings of the 35th annual Design Automation Conference. DAC ’98. San Francisco, California, USA: ACM, 1998, pp. 510– 515. isbn: 0-89791-964-5. [131] Hanono, S. Z. “AVIV: A Retargetable Code Generator for Embedded Processors”. AAI7815197. PhD thesis. Cambridge, Massachusetts, USA: Massachusetts Institute of Technology, 1999. [132] Hanson, D. R. and Fraser, C. W. A Retargetable C Compiler: Design and Implementation. Addison-Wesley, 1995. isbn: 978-0805316704. [133] Harrison, W. H. “A New Strategy for Code Generation the General-Purpose Opti- mizing Compiler”. In: Transactions Software Engineering 5.4 (1979), pp. 367–373. issn: 0098-5589.

104 [134] Hatcher, P. J. and Christopher, T. W. “High-Quality Code Generation via Bottom-Up Tree Pattern Matching”. In: Proceedings of the 13th ACM SIGACT-SIGPLAN symposium on Principles of Programming Languages. POPL ’86. St. Petersburg Beach, Florida, USA: ACM, 1986, pp. 119–130. [135] Hatcher, P. “The Equational Specification of Efficient Compiler Code Generation”. In: Computer Languages 16.1 (1991), pp. 81–95. issn: 0096-0551. [136] Hatcher, P. and Tuller, J. W. “Efficient Retargetable Compiler Code Generation”. In: Proceedings for the International Conference on Computer Languages. Miami Beach, Florida, USA: IEEE Press, 1988, pp. 25–30. isbn: 0-8186-0874-9. [137] Henry, R. R. Encoding Optimal Pattern Selection in Table-Driven Bottom-Up Tree-Pattern Matcher. Tech. rep. 89-02-04. Seattle, Washington, USA: University of Washington, 1989. [138] Henry, R. R. “Graham-Glanville Code Generators”. UCB/CSD-84-184. PhD thesis. Berkeley, California, USA: EECS Department, University of California, May 1984. [139] Hino, T., Suzuki, Y., Uchida, T., and Itokawa, Y. “Polynomial Time Pattern Matching Algorithm for Ordered Graph Patterns”. In: Proceedings of the 22nd International Conference on Inductive Logic Programming. ILP ’12. Dubrovnik, Croatia, 2012. [140] Hoffmann, C. M. and O’Donnell, M. J. “Pattern Matching in Trees”. In: Journal of the ACM 29.1 (1982), pp. 68–95. issn: 0004-5411. [141] Hoover, R. and Zadeck, K. “Generating Machine Specific Optimizing Compilers”. In: Proceedings of the 23rd ACM SIGPLAN-SIGACT symposium on Principles of Programming Languages. POPL ’96. St. Petersburg Beach, Florida, USA: ACM, 1996, pp. 219–229. isbn: 0-89791-769-3. [142] Hormati, A., Clark, N., and Mahlke, S. “Exploiting Narrow Accelerators with Data- Centric Subgraph Mapping”. In: Proceedings of the International Symposium on Code Generation and Optimization. CGI ’07. 2007, pp. 341–353. [143] Johnson, S. C. “A Portable Compiler: Theory and Practice”. In: Proceedings of the 5th ACM SIGACT-SIGPLAN symposium on Principles of Programming Languages. POPL ’78. Tucson, Arizona, USA: ACM, 1978, pp. 97–104. [144] Johnson, S. C. “A Tour Through the Portable C Compiler”. In: Unix ’s Manual. 7th ed. 1981. [145] Joshi, R., Nelson, G., and Randall, K. “Denali: A Goal-Directed Superoptimizer”. In: Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation. PLDI ’02. Berlin, Germany: ACM, 2002, pp. 304–314. isbn: 1-58113-463-0. [146] Joshi, R., Nelson, G., and Zhou, Y. “Denali: A Practical Algorithm for Generating Optimal Code”. In: Transactions on Programming Languages and Systems 28.6 (2006), pp. 967–989. issn: 0164-0925.

105 [147] Kang, K. “A Study on Generating an Efficient Bottom-Up Tree Rewrite Machine for JBURG”. In: Proceedings of the International Conference on Computational Science and Its Applications (ICCSA ’04). Ed. by Laganá, A., Gavrilova, M., Kumar, V., Mun, Y., Tan, C., and Gervasi, O. Vol. 3043. Lecture Notes in Computer Science. Assisi, Italy: Springer Berlin Heidelberg, 2004, pp. 65–72. isbn: 978-3-540-22054-1. [148] Kang, K. and Choe, K. On the Automatic Generation of Instruction Selector Using Bottom-Up Tree Pattern Matching. Tech. rep. CS/TR-95-93. Daejeon, South Korea: Korea Advanced Institute of Science and Technology, 1995. [149] Karp, R. M., Miller, R. E., and Rosenberg, A. L. “Rapid Identification of Repeated Patterns in Strings, Trees and Arrays”. In: Proceedings of the 4th Annual ACM Symposium on Theory of Computing. STOC ’72. Denver, Colorado, USA: ACM, 1972, pp. 125–136. [150] Kastner, R., Kaplan, A., Memik, S. O., and Bozorgzadeh, E. “Instruction Genera- tion for Hybrid Reconfigurable Systems”. In: Transactions on Design Automation of Electronic Systems 7.4 (2002), pp. 605–627. issn: 1084-4309. [151] Kastner, R., Ogrenci-Memik, S., Bozorgzadeh, E., and Sarrafzadeh, M. “Instruction Generation for Hybrid Reconfigurable Systems”. In: Proceedings of the International Conference on Computer-Aided Design. ICCAD ’01. San Jose, California, USA, Nov. 2001, pp. 127–130. [152] Kessler, C. W. and Bednarski, A. “A Dynamic Programming Approach to Optimal Integrated Code Generation”. In: Proceedings of the Conference on Languages, Compilers, and Tools for Embedded Systems. LCTES ’01. ACM, 2001, pp. 165–174. [153] Kessler, C. W. and Bednarski, A. “Optimal Integrated Code Generation for Clus- tered VLIW Architectures”. In: Proceedings of the ACM SIGPLAN joint Conference on Languages, Compilers, and Tools for Embedded Systems and Software and Compilers for Embedded Systems. LCTES ’02/SCOPES ’02. ACM, 2002, pp. 102–111. [154] Kessler, P. B. “Discovering Machine-Specific Code Improvements”. In: Proceedings of the 1986 SIGPLAN symposium on Compiler Construction. SIGPLAN ’86. Palo Alto, California, USA: ACM, 1986, pp. 249–254. isbn: 0-89791-197-0. [155] Kessler, R. R. “PEEP: An Architectural Description Driven Peephole Optimizer”. In: Proceedings of the 1984 SIGPLAN symposium on Compiler Construction. SIGPLAN ’84. Montreal, Canada: ACM, 1984, pp. 106–110. isbn: 0-89791-139-3. [156] Keutzer, K. “DAGON: Technology Binding and Local Optimization by DAG Match- ing”. In: Proceedings of the 24th ACM/IEEE Design Automation Conference. DAC ’87. ACM, 1987, pp. 341–347. [157] Khedker, U. “Workshop on Essential Abstractions in GCC”. Lecture. GCC Resource Center, Department of Computer Science and Engineering, IIT Bombay. Bombay, India, June 30–July 3, 2012. [158] Kirkpatrick, S., Gelatt, C. D., and Vecchi, M. P. “Optimization by Simulated Anneal- ing”. In: Science 220.4598 (1983), pp. 671–680.

106 [159] Knuth, D. E. “Semantics of Context-Free Languages”. In: Mathematical Systems Theory 2.2 (1968), pp. 127–145. [160] Knuth, D. E., Morris, J. H. J., and Pratt, V. R. “Fast Pattern Matching in Strings”. In: SIAM Journal of Computing 6.2 (1977), pp. 323–350. issn: 0097-5397. [161] Koes, D. R. and Goldstein, S. C. “Near-Optimal Instruction Selection on DAGs”. In: Proceedings of the 6th annual IEEE/ACM International symposium on Code Generation and Optimization. CGO ’08. Boston, Massachusetts, USA: ACM, 2008, pp. 45–54. isbn: 978-1-59593-978-4. [162] Kreuzer, W., Gotschlich, W., and B, W. “REDACO: A Retargetable Data Flow Graph Compiler for Digital Signal Processors”. In: Proceedings of the International Conference on Signal Processing Applications and Technology. ICSPAT ’96. 1996. [163] Krissinel, E. B. and Henrick, K. “Common Subgraph Isomorphism Detection by Backtracking Search”. In: Software–Practice & Experience 34.6 (2004), pp. 591–607. issn: 0038-0644. [164] Krumme, D. W. and Ackley, D. H. “A Practical Method for Code Generation Based on Exhaustive Search”. In: Proceedings of the 1982 SIGPLAN symposium on Compiler Construction. SIGPLAN ’82. Boston, Massachusetts, USA: ACM, 1982, pp. 185–196. isbn: 0-89791-074-5. [165] Langevin, M. and Cerny, E. “An automata-theoretic approach to local microcode generation”. In: Proceedings of the 4th European Conference on Design Automation = with the European Event in ASIC Design. EDAC ’93. 1993, pp. 94–98. [166] Lanneer, D., Catthoor, F., Goossens, G., Pauwels, M., Van Meerbergen, J., and De Man, H. “Open-Ended System for High-Level Synthesis of Flexible Signal Processors”. In: Proceedings of the Conference on European Design Automation. EURO-DAC ’90. Glasgow, Scotland: IEEE Press, 1990, pp. 272–276. isbn: 0-8186-2024-2. [167] Lattner, C. and Adve, V. “LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation”. In: Proceedings of the International Symposium on Code Generation and Optimization: Feedback-Directed and Runtime Optimization. CGO ’04. Palo Alto, California: IEEE Computer Society, 2004, pp. 75–86. isbn: 0-7695-2102-9. [168] Leupers, R. and Marwedel, P. “Instruction Selection for Embedded DSPs with Complex Instructions”. In: Proceedings of the Conference on European Design Automation. EURO- DAC ’96/EURO-VHDL ’96. Geneva, Switzerland: IEEE Press, 1996, pp. 200–205. isbn: 0-8186-7573-X. [169] Leupers, R. “Code Generation for Embedded Processors”. In: Proceedings of the 13th International Symposium on System Synthesis. ISSS ’00. Madrid, Spain: IEEE Press, 2000, pp. 173–178. isbn: 1-58113-267-0. [170] Leupers, R. “Code Selection for Media Processors with SIMD Instructions”. In: Pro- ceedings of the Conference on Design, Automation and Test in Europe. DATE ’00. Paris, France: ACM, 2000, pp. 4–8. isbn: 1-58113-244-1.

107 [171] Leupers, R. and Bashford, S. “Graph-Based Code Selection Techniques for Embedded Processors”. In: Transactions on Design Automation of Electronic Systems 5 (4 2000), pp. 794–814. issn: 1084-4309. [172] Leupers, R. and Marwedel, P. “Retargetable Code Generation Based on Structural Processor Description”. In: Design Automation for Embedded Systems 3.1 (1998), pp. 75– 108. issn: 0929-5585. [173] Leupers, R. and Marwedel, P. Retargetable Compiler Technology for Embedded Systems. Kluwer Academic Publishers, 2001. isbn: 0-7923-7578-5. [174] Leupers, R. and Marwedel, P. “Retargetable Generation of Code Selectors from HDL Processor Models”. In: Proceedings of the 1997 European Design and Test Conference. EDTC ’97. Paris, France: IEEE Press, 1997, pp. 140–144. [175] Leupers, R. and Marwedel, P. “Time-Constrained Code Compaction for DSPs”. In: Proceedings of the 8th International Symposium on System Synthesis. ISSS ’95. Cannes, France: ACM, 1995, pp. 54–59. isbn: 0-89791-771-5. [176] Leverett, B. W., Cattell, R. G. G., Hobbs, S. O., Newcomer, J. M., Reiner, A. H., Schatz, B. R., and Wulf, W. A. “An Overview of the Production-Quality Compiler-Compiler Project”. In: Computer 13.8 (1980). IEEE Press, pp. 38–49. issn: 0018-9162. [177] Liao, S., Keutzer, K., Tjiang, S., and Devadas, S. “A New Viewpoint on Code Gen- eration for Directed Acyclic Graphs”. In: ACM Transactions on Design Automation of Electronic Systems 3.1 (1998), pp. 51–75. issn: 1084-4309. [178] Liao, S., Devadas, S., Keutzer, K., and Tjiang, S. “Instruction Selection Using Binate Covering for Code Size Optimization”. In: Proceedings of the 1995 IEEE/ACM Interna- tional Conference on Computer-Aided Design. ICCAD ’95. San Jose, California, USA: IEEE Press, 1995, pp. 393–399. isbn: 0-8186-7213-7. [179] Liem, C., May, T., and Paulin, P. “Instruction-Set Matching and Selection for DSP and ASIP Code Generation”. In: Proceedings of 1994 European Design and Test Conference (EDAC ’94/ETC ’94/EUROASIC ’94). IEEE Press, 1994, pp. 31–37. [180] Lowry, E. S. and Medlock, C. W. “Object Code Optimization”. In: Communications of the ACM 12.1 (1969), pp. 13–22. issn: 0001-0782. [181] Madhavan, M., Shankar, P., Rai, S., and Ramakrishna, U. “Extending Graham-Glanville Techniques for Optimal Code Generation”. In: ACM Trans. Program. Lang. Syst. 22.6 (2000), pp. 973–1001. issn: 0164-0925. [182] Mahmood, M., Mavaddat, F., and Elmastry, M. “Experiments with an Efficient Heuris- tic Algorithm for Local Microcode Generation”. In: Proceedings of the IEEE Interna- tional Conference on Computer Design: VLSI in Computers and Processors. ICCD ’90. 1990, pp. 319–323.

108 [183] Martin, K., Wolinski, C., Kuchcinski, K., Floch, A., and Charot, F. “Constraint- Driven Instructions Selection and Application Scheduling in the DURASE System”. In: Proceedings of the 20th IEEE International Conference on Application-specific Systems, Architectures and Processors. ASAP ’09. IEEE Press, 2009, pp. 145–152. isbn: 978-0-7695- 3732-0. [184] Marwedel, P. “Code Generation for Core Processors”. In: Proceedings of the 1997 Design Automation Conference. DAC ’97. Anaheim, California, USA: IEEE Press, 1997, pp. 232–237. isbn: 0-7803-4093-0. [185] Marwedel, P. “The MIMOLA Design System: Tools for the Design of Digital Proces- sors”. In: Proceedings of the 21st Design Automation Conference. DAC ’84. Albuquerque, New Mexico, USA: IEEE Press, 1984, pp. 587–593. isbn: 0-8186-0542-1. [186] Marwedel, P. “Tree-Based Mapping of Algorithms to Predefined Structures”. In: Proceed- ings of the 1993 IEEE/ACM International Conference on Computer-Aided Design. ICCAD ’93. Santa Clara, California, USA: IEEE Press, 1993, pp. 586–593. isbn: 0-8186-4490-7. [187] Massalin, H. “Superoptimizer: A Look at the Smallest Program”. In: Proceedings of the 2nd International Conference on Architectual Support for Programming Languages and Operating Systems. ASPLOS II. Palo Alto, California, USA: IEEE Press, 1987, pp. 122–126. isbn: 0-8186-0805-6. [188] McKeeman, W. M. “Peephole Optimization”. In: Communications of the ACM 8.7 (July 1965), pp. 443–444. issn: 0001-0782. [189] Miller, P. L. “Automatic Creation of a Code Generator from a Machine Description”. MA thesis. Cambridge, Massachusetts, USA: Massachusetts Institute of Technology, 1971. [190] Muchnick, S. Advanced Compiler Design & Implementation. Morgan Kaufmann, 1997. isbn: 978-1558603202. [191] Müller, C. Code Selection from Directed Acyclid Graphs in the Context of Domain Specific Digital Signal Processors. Tech. rep. Berlin, Germany: Humboldt-Universität, 1994. [192] Murray, A. and Franke, B. “Compiling for Automatically Generated Instruction Set Extensions”. In: Proceedings of the 10th International Symposium on Code Generation and Optimization. CGO ’12. San Jose, California, USA: ACM, 2012, pp. 13–22. isbn: 978-1-4503-1206-6. [193] Nakata, I. “On Compiling Algorithms for Arithmetic Expressions”. In: Communica- tions of the ACM 10.8 (1967), pp. 492–494. issn: 0001-0782. [194] Newcomer, J. M. “Machine-Independent Generation of Optimal Local Code”. Order number: AAI7521781. PhD thesis. Pittsburgh, Pennsylvania, USA: Carnegie Mellon University, 1975. [195] Newell, A and Simon, H. A. “The Simulation of Human Thought”. In: Program on Current Trends in Psychology. The University of Pittsburgh, Pennsylvania, USA, 1959.

109 [196] Nowak, L. and Marwedel, P. “Verification of Hardware Descriptions by Retargetable Code Generation”. In: Proceedings of the 26th ACM/IEEE Design Automation Conference. DAC ’89. Las Vegas, Nevada, USA: ACM, 1989, pp. 441–447. isbn: 0-89791-310-8. [197] Nymeyer, A. and Katoen, J.-P. “Code Generation Based on Formal Bottom-Up Rewrite Systems Theory and Heuristic Search”. In: Acta Informatica 34.4 (1997), pp. 597–635. [198] Nymeyer, A., Katoen, J.-P., Westra, Y., and Alblas, H. “Code Generation = A* + BURS”. In: Proceedings of the 6th International Conference on Compiler Construction (CC ’06). Ed. by Gyimóthy, T. Vol. 1060. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 1996, pp. 160–176. isbn: 978-3-540-61053-3. [199] O’Donnell, M. J. Equational Logic as a Programming Language. The MIT Press, 1985. isbn: 978-0262150286. [200] Orgass, R. J. and Waite, W. M. “A Base for a Mobile Programming System”. In: Communications of the ACM 12.9 (1969), pp. 507–510. issn: 0001-0782. [201] Paulin, P. G., Liem, C., May, T. C., and Sutarwala, S. “DSP Design Tool Requirements for Embedded Systems: A Telecommunications Industrial Perspective”. In: Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology 9.1–2 (1995), pp. 23–47. issn: 0922-5773. [202] Pelegrí-Llopart, E. and Graham, S. L. “Optimal Code Generation for Expression Trees: An Application of BURS Theory”. In: Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of Programming Languages. POPL ’88. San Diego, California, USA: ACM, 1988, pp. 294–308. isbn: 0-89791-252-7. [203] Pennello, T. J. “Very Fast LR Parsing”. In: Proceedings of the 1986 SIGPLAN symposium on Compiler Construction. SIGPLAN ’86. Palo Alto, California, USA: ACM, 1986, pp. 145–151. isbn: 0-89791-197-0. [204] Perkins, D. R. and Sites, R. L. “Machine-Independent PASCAL Code Optimization”. In: Proceedings of the 1979 SIGPLAN symposium on Compiler Construction. SIGPLAN ’79. Denver, Colorado, USA: ACM, 1979, pp. 201–207. isbn: 0-89791-002-8. [205] Pierre, P. P., Liem, C., May, T., and Sutarwala, S. “CODESYN: A Retargetable Code Synthesis System”. In: Proceedings of the 7th International Symposium on High Level Synthesis. San Diego, California, USA, 1994, pp. 94–95. [206] Proebsting, T. A. Least-Cost Instruction Selection in DAGs is NP-Complete. 1995. url: http://web.archive.org/web/20081012050644/http://research.microsoft.com/~toddpro/ papers/proof.htm (visited on 04/23/2013). [207] Proebsting, T. A. “Simple and Efficient BURS Table Generation”. In: Proceedings of the ACM SIGPLAN 1992 Conference on Programming Language Design and Implementation. PLDI ’92. San Francisco, California, USA: ACM, 1992, pp. 331–340. isbn: 0-89791-475- 9. [208] Proebsting, T. A. “BURS Automata Generation”. In: ACM Transactions on Programming Language Systems 17.3 (1995), pp. 461–486. issn: 0164-0925.

110 [209] Proebsting, T. A. and Whaley, B. R. “One-Pass, Optimal Tree Parsing – With or With- out Trees”. In: Proceedings of the 6th International Conference on Compiler Construction (CC ’06). Ed. by Gyimóthy, T. Vol. 1060. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 1996, pp. 294–306. isbn: 978-3-540-61053-3. [210] Purdom Jr., P. W. and Brown, C. A. “Fast Many-to-One Matching Algorithms”. In: Proceedings of the 1st International Conference on Rewriting Techniques and Applications. Dijon, France: Springer, 1985, pp. 407–416. isbn: 0-387-15976-2. [211] Ramesh, R. and Ramakrishnan, I. V. “Nonlinear Pattern Matching in Trees”. In: Journal of the ACM 39.2 (1992), pp. 295–316. issn: 0004-5411. [212] Ramsey, N. and Davidson, J. W. “Machine Descriptions to Build Tools for Embedded Systems”. In: Proceedings of the ACM SIGPLAN Workshop on Languages, Compilers, and Tools for Embedded Systems. LCTES ’98. Springer-Verlag, 1998, pp. 176–192. isbn: 3-540-65075-X. [213] Ramsey, N. and Dias, J. “Resourceable, Retargetable, Modular Instruction Selection Using a Machine-Independent, Type-Based Tiling of Low-Level Intermediate Code”. In: Proceedings of the 38th annual ACM SIGPLAN-SIGACT symposium on Principles of Programming Languages. POPL ’11. Austin, Texas, USA: ACM, 2011, pp. 575–586. isbn: 978-1-4503-0490-0. [214] Redziejowski, R. R. “On Arithmetic Expressions and Trees”. In: Communications of the ACM 12.2 (1969), pp. 81–84. issn: 0001-0782. [215] Reiser, J. F. “Compiling Three-Address Code for C Programs”. In: The Bell System Technical Journal 60.2 (1981), pp. 159–166. [216] Ripken, K. Formale Beschreibung von Maschinen, Implementierungen und Optimierender Maschinencodeerzeugung aus Attributierten Programmgraphen. Tech. rep. TUM-INFO- 7731. Munich, Germany: Institut für Informatik, Technical University of Munich, July 1977. [217] Rossi, F., Van Beek, P., and Walsh, T., eds. Handbook of Constraint Programming. Elsevier, 2006. isbn: 978-0-444-52726-4. [218] Rudell, R. L. “Logic Synthesis for VLSI Design”. AAI9006491. PhD thesis. 1989. [219] Russell, S. J. and Norvig, P. Artificial Intelligence: A Modern Approach. 3rd ed. Pearson Education, 2010. isbn: 0-13-604259-7. [220] Sacerdoti, E. D. “Planning in a Hierarchy of Abstraction Spaces”. In: Proceedings of the 3rd International Joint Conference on Artificial Intelligence. Ed. by Nilsson, N. J. IJCAI ’73. Stanford, California, USA: William Kaufmann, 1973, pp. 412–422. [221] Sakai, S., Togasaki, M., and Yamazaki, K. “A Note on Greedy Algorithms for the Maximum Weghted Independent Set Problem”. In: Discrete Applied Mathematics 126.2-3 (2003), pp. 313–322. issn: 0166-218X. [222] Sarkar, V., Serrano, M. J., and Simons, B. B. “Register-Sensitive Selection, Duplication, and Sequencing of Instructions”. In: Proceedings of the 15th International Conference on Supercomputing. ICS ’01. Sorrento, Italy: ACM, 2001, pp. 277–288. isbn: 1-58113-410-X.

111 [223] Schäfer, S. and Scholz, B. “Optimal Chain Rule Placement for Instruction Selection Based on SSA Graphs”. In: Proceedings of the 10th International Workshop on Software and Compilers for Embedded Systems. SCOPES ’07. Nice, France: ACM, 2007, pp. 91– 100. [224] Scharwaechter, H., Youn, J. M., Leupers, R., Paek, Y., Ascheid, G., and Meyr, H. “A Code-Generator Generator for Multi-Output Instructions”. In: Proceedings of the 5th IEEE/ACM International Conference on Hardware/Software Codesign and System Synthesis. CODES+ISSS ’07. Salzburg, Austria: ACM, 2007, pp. 131–136. isbn: 978-1- 59593-824-4. [225] Scholz, B. and Eckstein, E. “Register Allocation for Irregular Architectures”. In: Pro- ceedings of the joint Conference on Languages, Compilers and Tools for Embedded Systems and Software and Compilers for Embedded Systems. LCTES/SCOPES ’02. Berlin, Ger- many: ACM, 2002, pp. 139–148. isbn: 1-58113-527-0. [226] Schrijver, A. Theory of Linear and Integer Programming. Wiley, 1998. isbn: 978-04719823- 26. [227] Schulte, C. Private correspondence. Apr. 2013. [228] Sethi, R. and Ullman, J. D. “The Generation of Optimal Code for Arithmetic Expres- sions”. In: Journal of the ACM 17.4 (1970), pp. 715–28. issn: 0004-5411. [229] Shamir, R. and Tsur, D. “Faster Subtree Isomorphism”. In: Journal of Algorithms 33.2 (1999), pp. 267–280. issn: 0196-6774. [230] Shankar, P., Gantait, A., Yuvaraj, A. R., and Madhavan, M. “A New Algorithm for Linear Regular Tree Pattern Matching”. In: Theoretical Computer Science 242.1-2 (2000), pp. 125–142. issn: 0304-3975. [231] Shu, J., Wilson, T. C., and Banerji, D. K. “Instruction-Set Matching and GA-based Selection for Embedded-Processor Code Generation”. In: Proceedings of the 9th Interna- tional Conference on VLSI Design: VLSI in Mobile Communication. VLSID ’96. IEEE Press, 1996, pp. 73–76. isbn: 0-8186-7228-5. [232] Simoneaux, D. C. “High-Level Language Compiling for User-Defineable Architectures”. PhD thesis. Monterey, California, USA: Naval Postgraduate School, 1975. [233] Snyder, A. “A Portable Compiler for the Language C”. MA thesis. Cambridge, Mas- sachusetts, USA, 1975. [234] Sorlin, S. and Solnon, C. “A Global Constraint for Graph Isomorphism Problems”. In: Proceedings of the 1st International Conference on Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems (CPAIOR ’04). Ed. by Régin, J.-C. and Rueher, M. Vol. 3011. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2004, pp. 287–301. isbn: 978-3-540-21836-4.

[235] Stallman, R. Internals of GNU CC. Version 1.21. Apr. 24, 1988. url: http://trinity.engr.uconn.edu/~vamsik/internals.pdf (visited on 05/29/2013).

112 [236] Strong, J., Wegstein, J., Tritter, A., Olsztyn, J., Mock, O., and Steel, T. “The Problem of Programming Communication with Changing Machines: A Proposed Solution”. In: Communications of the ACM 1.8 (1958), pp. 12–18. issn: 0001-0782. [237] Sudarsanam, A., Malik, S., and Fujita, M. “A Retargetable Compilation Methodology for Embedded Digital Signal Processors Using a Machine-Dependent Code Optimiza- tion Library”. In: Design Automation for Embedded Systems 4.2–3 (1999), pp. 187–206. issn: 0929-5585. [238] Tanaka, H., Kobayashi, S., Takeuchi, Y., Sakanushi, K., and Imai, M. “A Code Selection Method for SIMD Processors with PACK Instructions”. In: Proceedings of the 7th International Workshop on Software and Compilers for Embedded Systems (SCOPES ’03). Ed. by Krall, A. Vol. 2826. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2003, pp. 66–80. isbn: 978-3-540-20145-8. [239] Tanenbaum, A. S., Staveren, H. van, Keizer, E. G., and Stevenson, J. W. “A Practical Tool Kit for Making Portable Compilers”. In: Communications of the ACM 26.9 (1983), pp. 654–660. issn: 0001-0782. [240] HBURG. url: http://www.bytelabs.org/hburg.html. [241] IBM CPLEX Optimizer. url: http://www- 01.ibm.com/software/commerce/optimization/ cplex-optimizer/. [242] Tirrell, A. K. “A Study of the Application of Compiler Techniques to the Generation of Micro-Code”. In: Proceedings of the SIGPLAN/SIGMICRO interface meeting. Harriman, New York, USA: ACM, 1973, pp. 67–85. [243] JBURG. url: http://jburg.sourceforge.net/. [244] Tjiang, S. W. K. An Olive Twig. Tech. rep. Synopsys Inc., 1993. [245] Tjiang, S. W. K. Twig Reference Manual. Tech. rep. Murray Hill, New Jersey, USA: AT&T Bell Laboratories, 1986. [246] OCAMLBURG. url: http://www.cminusminus.org/tools.html. [247] TMS320C55x DSP Mnemonic Instruction Set Reference Guide. SPRU374G. Texas Instru- ments. Oct. 2002. [248] Ullmann, J. R. “An Algorithm for Subgraph Isomorphism”. In: J. ACM 23.1 (1976), pp. 31–42. issn: 0004-5411. [249] Van Praet, J., Lanneer, D., Geurts, W., and Goossens, G. “Processor Modeling and Code Selection for Retargetable Compilation”. In: ACM Transactions on Design Automation of Electronic Systems 6.3 (2001), pp. 277–307. issn: 1084-4309. [250] Van Praet, J., Goossens, G., Lanneer, D., and De Man, H. “Instruction Set Definition and Instruction Selection for ASIPs”. In: Proceedings of the 7th International Symposium on Systems Synthesis. ISSS ’94. Niagra-on-the-Lake, Ontario, Canada: IEEE Press, 1994, pp. 11–16. isbn: 0-8186-5785-5. [251] Visser, B.-S. “A Framework for Retargetable Code Generation Using Simulated An- nealing”. In: Proceedings of the 25th EUROMICRO ’99 Conference on Informatics: Theory and Practice for the New Millenium. 1999, pp. 1458–1462. isbn: 0-7695-0321-7.

113 [252] Visser, E. “A Survey of Strategies in Rule-Based Program Transformation Systems”. In: Journal of Symbolic Computation 40.1 (2005), pp. 831–873. [253] Visser, E. “Stratego: A Language for Program Transformation Based on Rewriting Strategies - System Description of Stratego 0.5”. In: Rewriting Techniques and Applica- tions. Vol. 2051. RTA ’01. Springer, 2001, pp. 357–361. [254] Živojnovi´c,V., Martínez Velarde, J., Schläger, C., and Meyr, H. “DSPstone: A DSP- Oriented Benchmarking Methodology”. In: Proceedings of the International Conference on Signal Processing Applications and Technology. ICSPAT ’94. Dallas, Texas, USA, Oct. 1994. [255] Wasilew, S. G. “A Compiler Writing System with Optimization Capabilities for Complex Object Order Structures”. AAI7232604. PhD thesis. Evanston, Illinois, USA: Northwestern University, 1972. [256] Weingart, S. W. “An Efficient and Systematic Method of Compiler Code Generation”. AAI7329501. PhD thesis. New Haven, Connecticut, USA: Yale University, 1973. [257] Weisgerber, B. and Wilhelm, R. “Two Tree Pattern Matchers for Code Selection”. In: Proceedings of the 2nd CCHSC Workshop on Compiler Compilers and High Speed Compilation. Springer, 1989, pp. 215–229. isbn: 3-540-51364-7. [258] Wendt, A. L. “Fast Code Generation Using Automatically-Generated Decision Trees”. In: Proceedings of the ACM SIGPLAN 1990 Conference on Programming Language Design and Implementation. PLDI ’90. White Plains, New York, USA: ACM, 1990, pp. 9–15. isbn: 0-89791-364-7. [259] Wess, B. “Automatic Instruction Code Generation Based on Trellis Diagrams”. In: Proceedings of the IEEE International symposium on Circuits and Systems. Vol. 2. ISCAS ’92. IEEE Press, 1992, pp. 645–648. [260] Wess, B. “Code Generation Based on Trellis Diagrams”. In: Code Generation for Em- bedded Processors. Ed. by Marwedel, P. and Goossens, G. 1995. Chap. 11, pp. 188– 202. [261] Wilcox, T. R. “Generating Machine Code for High-Level Programming Languages”. AAI7209959. PhD thesis. Ithaca, New York, USA: Cornell University, 1971. [262] Wilhelm, R. and Maurer, D. Compiler Design. Addison-Wesley, 1995. isbn: 978-0201422- 900. [263] Wilson, T., Grewal, G., Halley, B., and Banerji, D. “An Integrated Approach to Retargetable Code Generation”. In: Proceedings of the 7th International Symposium on High-Level Synthesis. ISSS ’94. Niagra-on-the-Lake, Ontario, Canada: IEEE Press, 1994, pp. 70–75. isbn: 0-8186-5785-5. [264] Wulf, W. A., Johnsson, R. K., Weinstock, C. B., Hobbs, S. O., and Geschke, C. M. The Design of an . Elsevier Science Inc., 1975. isbn: 0444001581. [265] Wuu, H.-T. L., and Yang, W. “A Simple Tree Pattern-Matching Algorithm”. In: In Proceedings of the Workshop on Algorithms and Theory of Computation. 2000.

114 [266] Yang, W. “A Fast General Parser for Automatic Code Generation”. In: Proceedings of the 2nd Russia-Taiwan Conference on Methods and Tools of Parallel Programming multicomputers. MTPP ’10. Vladivostok, Russia: Springer, 2010, pp. 30–39. isbn: 978-3- 642-14821-7. [267] Yates, J. S. and Schwartz, R. A. “Dynamic Programming and Industrial-Strength Instruction Selection: Code Generation by Tiring, but not Exhaustive, Search”. In: SIGPLAN Notices 23.10 (1988), pp. 131–140. issn: 0362-1340. [268] Youn, J. M., Lee, J., Paek, Y., Lee, J., Scharwaechter, H., and Leupers, R. “Fast Graph- Based Instruction Selection for Multi-Output Instructions”. In: Software—Practice & Experience 41.6 (2011), pp. 717–736. issn: 0038-0644. [269] Young, R. “The Coder: A Program Module for Code Generation in High Level Lan- guage Compilers”. MA thesis. Urbana-Champaign, Illinois, USA: Computer Science Department, University of Illinois, 1974. [270] Yu, K. H. and Hu, Y. H. “Artificial Intelligence in Scheduling and Instruction Selection for Digital Signal Processors”. In: Applied Artificial Intelligence 8.3 (1994), pp. 377–392. [271] Yu, K. H. and Hu, Y. H. “Efficient Scheduling and Instruction Selection for Pro- grammable Digital Signal Processors”. In: Transactions on Signal Processing 42.12 (1994), pp. 3549–3552. issn: 1053-587X. [272] Zimmermann, G. “The MIMOLA Design System: A Computer Aided Digital Proces- sor Design Method”. In: Proceedings of the 16th Design Automation Conference. DAC ’79. San Diego, California, USA: IEEE Press, 1979, pp. 53–58. [273] Zimmermann, W. and Gaul, T. “On the Construction of Correct Compiler Back-Ends: An ASM-Approach”. In: Journal of Universal Computer Science 3.5 (May 28, 1997), pp. 504–567. [274] Ziv, J and A., L. “A Universal Algorithm for Sequential Data Compression”. In: Transactions on Information Theory 23.3 (1977), pp. 337–343.

A

List of Approaches

Approach Pub. Prin. Sc. Opt. SO MO DO IL In. Implementations Lowry and Medlock [180] 1969 ME L ✓ Orgass and Waite [200] 1969 ME L ✓ Simcmp Elson and Rake [78] 1970 ME L ✓ Miller [189] 1971 ME L ✓ Dmacs Wilcox [261] 1971 ME L ✓ Wasilew [255] 1972 TC L ✓ Donegan [72] 1973 ME L ✓ Tirrell [242] 1973 ME L ✓ Weingart [256] 1973 TC L ✓ Young [269] 1974 ME L ✓ Newcomer [194] 1975 TC L ✓ Simoneaux [232] 1975 ME L ✓ Snyder [233] 1975 ME L ✓ Fraser [106, 107] 1977 ME L ✓ Ripken [216] 1977 TC L ✓ ✓ Glanville and Graham [121] 1978 TC L ✓ Johnson [143] 1978 TC L ✓ PCC Cattell et al. [44, 47, 176] 1980 TC L ✓ PQCC Ganapathi and Fischer [112, 113, 114, 115] 1982 TC L ✓ Krumme and Ackley [164] 1982 ME L ✓ Christopher et al. [52] 1984 TC L ✓ ✓ Davidson and Fraser [63] 1984 ME+ L ✓ ✓ ✓ GCC, ACK, Zephyr/VPO Henry [138] 1984 TC L ✓ ✓ Aho et al. [6, 7, 245] 1985 TC L ✓ ✓ Twig Hatcher and Christopher [134] 1986 TC L ✓ ✓ Fraser and Wendt [101] 1988 ME+ L ✓ Giegerich and Schmal [120] 1988 TC L ✓ Hatcher and Tuller [136] 1988 TC L ✓ ✓ UNH-Codegen Pelegrí-Llopart and Graham [202] 1988 TC L ✓ ✓ Yates and Schwartz [267] 1988 TC L ✓ ✓ Emmelmann et al. [79] 1989 ME L ✓ ✓ BEG, CoSy + Pub.: date of publication. Prin.: fundamental principle on which the approach is based [macro expansion (ME); macro expansion with peephole optimization (ME ); tree covering (TC); trellis diagrams (TD) – is actually sorted under TC in the report; DAG covering (DC); or graph covering (GC)]. Sc.: scope of instruction selection [local (L) – isolated to a single basic block; or global (G) – considers an entire function]. Opt.: whether the approach claims to be optimal. Supported machine instruction characteristics: single-output (SO), multi-output (MO), disjoint-output (DO), internal-loop (IL), and interdependent instructions (In.).

116 Approach Pub. Prin. Sc. Opt. SO MO DO IL In. Implementations Ganapathi [111] 1989 TC L ✓ ✓ Genin et al. [118] 1989 ME+ L ✓ ✓ Nowak and Marwedel [196] 1989 DC L ✓ ✓ ✓ MSSC Balachandran et al. [26] 1990 TC L ✓ ✓ Despland et al. [41, 66, 67] 1990 TC L ✓ Pagode Wendt [258] 1990 ME+ L ✓ ✓ Hatcher [135] 1991 TC L ✓ ✓ UCG Fraser et al. [104] 1992 TC L ✓ ✓ Iburg, Record, Redaco Proebsting [105, 207, 208, 209] 1992 TC L ✓ ✓ Burg, Hburg, Wburg Wess [259, 260] 1992 TD L ✓ ✓ Marwedel [186] 1993 DC L ✓ ✓ ✓ MSSV Tjiang [244] 1993 TC L ✓ ✓ Olive Engler and Proebsting [83] 1994 TC L ✓ ✓ DCG Fauth et al. [92, 191] 1994 DC L ✓ ✓ CBC Ferdinand et al. [94] 1994 TC L ✓ ✓ Liem et al. [179, 201, 205] 1994 DC L ✓ CodeSyn Van Praet et al. [166, 249, 250] 1994 GC G ✓ ✓ ✓ ✓ Chess Wilson et al. [263] 1994 DC L ✓ ✓ ✓ ✓ Yu and Hu [270, 271] 1994 DC L ✓ ✓ ✓ Hanson and Fraser [132] 1995 TC L ✓ ✓ Lburg, LCC Liao et al. [177, 178] 1995 GC L ✓ ✓ ✓ Hoover and Zadeck [141] 1996 DC L ✓ ✓ ✓ Leupers and Marwedel [168, 175] 1996 DC L ✓ ✓ ✓ Nymeyer et al. [197, 198] 1996 TC L ✓ Shu et al. [231] 1996 TC L ✓ Gough and Ledermann [124, 125] 1997 TC L ✓ ✓ Mburg Hanono and Devadas [130, 131] 1998 TD L ✓ Aviv Leupers and Marwedel [172] 1998 DC L ✓ ✓ ✓ MSSQ Bashford and Leupers [28] 1999 DC L ✓ ✓ ✓ Ertl [85] 1999 DC L ✓ ✓ ✓ Dburg Fraser and Proebsting [103] 1999 ME L ✓ Gburg Fröhlich et al. [108] 1999 TD L ✓ ✓ Visser [251] 1999 GC G ✓ ✓ ✓ Leupers [170] 2000 DC L ✓ ✓ ✓ ✓ Madhavan et al. [181] 2000 TC L ✓ ✓ Arnold and Corporaal [18, 19, 20] 2001 DC L ✓ ✓ Kastner et al. [151] 2001 DC L ✓ ✓ Sarkar et al. [222] 2001 DC L ✓ Bravenboer and Visser [36] 2002 TC L ✓ ✓ Eckstein et al. [77] 2003 GC G ✓ Tanaka et al. [238] 2003 DC L ✓ ✓ ✓ Borchardt [31] 2004 TC L ✓ ✓ Brisk et al. [37] 2004 DC L ✓ ✓ Cong et al. [56] 2004 GC L ✓ ✓ ✓ Lattner and Adve [167] 2004 DC L ✓ LLVM Bednarski and Kessler [29] 2006 DC L ✓ ✓ ✓ ✓ Optimist + Pub.: date of publication. Prin.: fundamental principle on which the approach is based [macro expansion (ME); macro expansion with peephole optimization (ME ); tree covering (TC); trellis diagrams (TD) – is actually sorted under TC in the report; DAG covering (DC); or graph covering (GC)]. Sc.: scope of instruction selection [local (L) – isolated to a single basic block; or global (G) – considers an entire function]. Opt.: whether the approach claims to be optimal. Supported machine instruction characteristics: single-output (SO), multi-output (MO), disjoint-output (DO), internal-loop (IL), and interdependent instructions (In.).

117 Approach Pub. Prin. Sc. Opt. SO MO DO IL In. Implementations Clark et al. [54] 2006 GC L ✓ ✓ ✓ Dias and Ramsey [70] 2006 ME+ L ✓ ✓ Ertl et al. [86] 2006 TC L ✓ ✓ Farfeleder et al. [89] 2006 DC L ✓ ✓ Hormati et al. [142] 2007 GC L ✓ ✓ ✓ Scharwaechter et al. [224] 2007 DC L ✓ ✓ ✓ Cburg Ebner et al. [76] 2008 GC G ✓ ✓ ✓ Koes and Goldstein [161] 2008 DC L ✓ ✓ Noltis Ahn et al. [2] 2009 DC L ✓ ✓ ✓ Martin et al. [183] 2009 GC G ✓ ✓ ✓ ✓ ✓ Buchwald and Zwinkau [39] 2010 GC G ✓ Dias and Ramsey [69, 213] 2010 ME+ L ✓ ✓ Floch et al. [97] 2010 GC G ✓ ✓ ✓ ✓ Yang [266] 2010 TC L ✓ ✓ Arslan and Kuchcinski [22] 2013 GC G ✓ ✓ ✓ ✓ ✓ + Pub.: date of publication. Prin.: fundamental principle on which the approach is based [macro expansion (ME); macro expansion with peephole optimization (ME ); tree covering (TC); trellis diagrams (TD) – is actually sorted under TC in the report; DAG covering (DC); or graph covering (GC)]. Sc.: scope of instruction selection [local (L) – isolated to a single basic block; or global (G) – considers an entire function]. Opt.: whether the approach claims to be optimal. Supported machine instruction characteristics: single-output (SO), multi-output (MO), disjoint-output (DO), internal-loop (IL), and interdependent instructions (In.).

B

Publication Diagram


Figure B.1: Diagram illustrating a publication timeline from 1969 to 2013, categorised by the respective fundamental principles of instruction selection. The width of the bars indicates the relative number of publications in a given year.

C

Graph Definitions

A graph is defined as a tuple G = ⟨N, E⟩ where N is a set of nodes (also known as vertices) and E is a set of edges, each consisting of a pair of nodes n, m ∈ N. A graph is undirected if its edges have no direction, and directed if they do. We write a directed edge from a node n to another node m as n → m, and say that such an edge is outgoing with respect to node n, and ingoing with respect to node m. We also introduce the following functions:

src ∶ E → N        dst ∶ E → N
src(n → m) = n     dst(n → m) = m

Edges for which src(e) = dst(e) are known as loop edges (or simply loops). If there exists more than one edge between a pair of nodes then the graph is a multigraph; otherwise it is a simple graph. A list of edges that describes how to get from one node to another is known as a path. More formally, a path between two nodes n and m is defined as an ordered list of edges p = (e1, ..., el) such that, for a directed graph:

e1, ..., el ∈ E,   n ∈ e1,   m ∈ el,   ∀1 ≤ i < l ∶ dst(ei) = src(ei+1)

Paths for undirected graphs are defined similarly and will thus be skipped. A path for which src(e1) = dst(el) is known as a cycle. Two nodes n and m, n ≠ m, are said to be connected if there exists a path from n to m, and if the path is of length 1 then n and m are also adjacent. An undirected graph is connected if there exists a path for every distinct pair of nodes. For completeness, a directed graph is strongly connected if there exists a path from n to m and a path from m to n for every pair of distinct nodes n and m. Likewise, a directed graph is only weakly connected if replacing all edges with undirected edges yields a connected undirected graph (see Figure C.1 for an example).

Figure C.1: Example of two simple, directed graphs Gm and Gn. Through the mapping functions α and β we see that Gm is an isomorphic subgraph of Gn. Moreover, Gn has a strongly connected component – consisting of n2, n3, and n4, which form a cycle – while the entire Gm is only weakly connected.

A simple graph that is connected, contains no cycles, and has exactly one path between any two nodes is called a tree, and a set of trees constitutes a forest. Nodes in a tree that are adjacent to exactly one other node are known as leaves, and depending on the direction of the edges in which the leaves appear, there exists a node with no outgoing (or no ingoing) edges which is called the root of the tree. In this report we will always draw trees with the root appearing at the top. A directed graph which contains no cycles is known as a directed acyclic graph, or DAG for short.

A graph G = ⟨N, E⟩ is a subgraph of another graph G′ = ⟨N′, E′⟩ if N ⊆ N′ and E ⊆ E′, which is denoted as G ⊆ G′. Moreover, G is an isomorphic subgraph of G′ if there exist two mapping functions α and β such that {α(n) | n ∈ N} ⊆ N′ and {β(e) | e ∈ E} ⊆ E′. An example of this is given in Figure C.1. Likewise, a tree T is a subtree of another tree T′ if T ⊆ T′.

Lastly, we introduce the notion of a topological sort, where the nodes of a graph G = ⟨N, E⟩ are arranged in an ordered list s = (n1, ..., n|N|), with ni ∈ N for all 1 ≤ i ≤ |N|, such that for no pair of nodes ni and nj where i < j does there exist an edge e = nj → ni. In other words, if the nodes are laid out from left to right in the order given by s, then all edges point forward, from left to right.
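To make the last definition concrete, the following is a minimal sketch (in Python, not taken from the report) that represents a directed graph as a set of nodes and a set of (source, destination) edge pairs and computes a topological sort with Kahn's algorithm; the node names and the helper function are illustrative only.

    # Minimal illustrative sketch: a directed graph G = (N, E) as Python sets,
    # with edges stored as (source, destination) pairs, and a topological sort
    # computed with Kahn's algorithm. Raises an error if the graph has a cycle.
    from collections import deque

    def topological_sort(nodes, edges):
        indegree = {n: 0 for n in nodes}
        for (_, dst) in edges:
            indegree[dst] += 1
        ready = deque(n for n in nodes if indegree[n] == 0)
        order = []
        while ready:
            n = ready.popleft()
            order.append(n)
            for (src, dst) in edges:
                if src == n:
                    indegree[dst] -= 1
                    if indegree[dst] == 0:
                        ready.append(dst)
        if len(order) != len(nodes):
            raise ValueError("the graph contains a cycle, so no topological sort exists")
        return order

    # Example DAG: a -> b, a -> c, b -> d, c -> d
    nodes = {"a", "b", "c", "d"}
    edges = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
    print(topological_sort(nodes, edges))  # e.g. ['a', 'b', 'c', 'd']

Any ordering returned by such a procedure has the property stated above: every edge points from an earlier node in the list to a later one.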

D

Taxonomy

The terms and their exact meanings – in particular that of optimal instruction selection – often differ from one paper to another, making the approaches difficult to discuss and compare without a common foundation. A taxonomy with a well-defined vocabulary has therefore been established and is used consistently throughout the report. Although this may seem somewhat daunting and superfluous at first, most items in the taxonomy are easy to understand, and having explicitly defined these terms hopefully mitigates confusion that might otherwise occur.

D.1 Common terms

Several terms are used continually when discussing instruction selection. Most of these are obvious and appear in other, related literature, while others may require a little explanation.

◾ input program – the program under compilation, and therefore the input to the compiler and, subsequently, to the instruction selector. In the former case this refers to the source code, while in the latter it usually refers to the IR code, either the entire code or parts of it (e.g. a function or a single basic block) depending on the scope of the instruction selector.
◾ target machine – the hardware platform that an input program is compiled to run on. Most often this refers to the instruction set architecture (ISA) implemented by its processing unit.
◾ instruction selector – a component or program responsible for implementing and executing the task of instruction selection. If this program is automatically generated from a specification, the term refers to the generated product – not the generator.
◾ frontend – a component or program responsible for parsing, validating, and translating an input program into equivalent IR code, a representation used internally by the compiler. This may also be referred to as the compiler frontend.
◾ code generation – the task of generating assembly code for a given input program by performing instruction selection, instruction scheduling, and register allocation.
◾ backend – a component or program responsible for implementing and executing the task of code generation. This may also be referred to as the compiler backend or code generator backend.

◾ compilation time – the time required to compile a given input program.
◾ pattern matching – the problem of detecting when and where it is possible to use a certain machine instruction for a given input program.
◾ pattern selection – the problem of deciding which instructions to select from the candidate set found when solving the pattern matching problem.
◾ offline cost analysis – the task of shifting the computation of optimization decisions from the input program compilation phase to the compiler building phase, thereby reducing the time it takes to compile an input program at the cost of increasing the time it takes to generate the compiler.

D.2 Machine instruction characteristics

A machine instruction exhibits one or more machine instruction characteristics. For this study, the following characteristics were identified:

◾ single-output – the instruction produces only a single observable output. In this sense, observable means a value that can be accessed by the program. This includes instructions that perform typical arithmetic operations such as addition and multiplication as well as bit operations such as OR and NOT. Note that the instruction must produce only this value and nothing else in order to be considered a single-output instruction (compare this with the next characteristic).
◾ multiple-output (or side-effect) – the instruction produces more than one observable output from the same input. Examples include divrem instructions, which produce both the quotient and the remainder of a division, but also arithmetic instructions that set status flags (also known as condition flags or condition codes) in addition to computing the arithmetic result. A status flag is a bit that signifies additional information about the result (e.g. whether there was a carry overflow or whether the result was 0), and such flags are therefore often called side effects of the instruction. In reality, however, these bits are nothing more than additional outputs produced by the instruction, and such instructions are thus referred to as multiple-output instructions.
◾ disjoint-output – the instruction produces observable output values from disjoint sets of input values. This means that if the expression for computing each observable output value formed a directed graph (see Appendix C for definitions), then these graphs would be disjoint. In comparison, the outputs of a multiple-output instruction all form a single graph. Disjoint-output instructions typically include SIMD (single-instruction-multiple-data) and vector instructions, which execute the same operation simultaneously on many input values.
◾ internal-loop – the internal behavior of the instruction cannot be described without resorting to the use of loops. For example, the TI TMS320C55x [247] processor provides an RPT k instruction that repeats the immediately following instruction k times, where k is an immediate value given as part of the instruction.
◾ interdependent – the instruction enforces additional constraints when appearing in combination with other machine instructions. An example is an ADD instruction from the TMS320C55x instruction set, which cannot be combined with an RPT k instruction if a certain addressing mode is used. As seen in this report, this kind of instruction is very difficult for most approaches to handle.
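To illustrate the difference between the first three characteristics, here is a small Python sketch (purely illustrative; the description format and the classify function are inventions for this example, not part of any surveyed approach) that classifies an instruction from the sets of inputs that its observable outputs depend on.

    def classify(instruction_outputs):
        """Classify an instruction from a description of its observable outputs.

        `instruction_outputs` maps each output name to the set of input values
        that the output is computed from (a hypothetical description format)."""
        if len(instruction_outputs) == 1:
            return "single-output"
        input_sets = list(instruction_outputs.values())
        # If every pair of outputs is computed from non-overlapping inputs,
        # the dataflow graphs of the outputs are disjoint.
        disjoint = all(a.isdisjoint(b)
                       for i, a in enumerate(input_sets)
                       for b in input_sets[i + 1:])
        return "disjoint-output" if disjoint else "multiple-output"

    # add: one result computed from two inputs.
    print(classify({"sum": {"a", "b"}}))                         # single-output
    # divrem: quotient and remainder share the same inputs.
    print(classify({"quot": {"a", "b"}, "rem": {"a", "b"}}))      # multiple-output
    # 2-lane SIMD add: each lane uses its own pair of inputs.
    print(classify({"r0": {"a0", "b0"}, "r1": {"a1", "b1"}}))     # disjoint-output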

D.3 Scope

IR code is typically arranged as a sequence of statements, executed one after another, with special operations that allow execution to jump to another position in the code. By grouping statements such that jumps occur only into the beginning of a group or away from the end of a group, the code can be divided into a set of basic blocks. Let us look at an example. Figure D.1 shows a C code fragment and its corresponding IR code, where jmp x means "jump to label x" and condjmp v x means "jump to label x if v is true, otherwise fall through to the next statement". We see then that:

(a) C code:

    int b = 0;
    for (int i = 0; i < 10; i++) {
        b = b + i;
        if (b == 25) break;
    }
    int c = b;

(b) Corresponding IR code:

    1  init:
    2      b = 0
    3      i = 0
    4  loop:
    5      t1 = b < 10
    6      condjmp t1 end
    7      b = b + i
    8      t2 = b == 25
    9      condjmp t2 end
    10     i = i + 1
    11     jmp loop
    12 end:
    13     c = b

Figure D.1: Basic block example.

◾ lines 1 through 3 form a basic block, since there are jumps to line 4;
◾ lines 4 through 6 form a basic block, since there is a potential jump at line 6;
◾ lines 7 through 9 form a basic block, for the same reason;
◾ as do lines 10 and 11; and
◾ the last basic block is formed by lines 12 and 13.

With this in mind, we can divide instruction selectors into two categories depending on their scope:

◾ local – the instruction selector only considers one basic block at a time.
◾ global – the instruction selector considers more than one basic block at a time, typically all blocks within an entire function.

The main difference between local and global instruction selection is that the latter is capable of combining statements across blocks in order to implement them using a single machine instruction, thus effectively improving code quality.
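As a minimal sketch of how such a division into basic blocks can be computed, the following Python function (an illustration using the toy IR of Figure D.1; the statement format and the function name are assumptions made for this example) starts a new block at every label and ends the current block after every jump.

    def split_into_basic_blocks(statements):
        """Partition IR statements into basic blocks.

        Each statement is a string such as "b = 0", "loop:", "jmp loop" or
        "condjmp t1 end". A new block starts at a label (a jump target) and
        after any jump or conditional jump."""
        blocks, current = [], []
        for stmt in statements:
            if stmt.endswith(":") and current:          # jump target: close the open block
                blocks.append(current)
                current = []
            current.append(stmt)
            if stmt.startswith(("jmp ", "condjmp ")):   # jump: block ends here
                blocks.append(current)
                current = []
        if current:
            blocks.append(current)
        return blocks

    ir = ["init:", "b = 0", "i = 0",
          "loop:", "t1 = b < 10", "condjmp t1 end",
          "b = b + i", "t2 = b == 25", "condjmp t2 end",
          "i = i + 1", "jmp loop",
          "end:", "c = b"]
    for block in split_into_basic_blocks(ir):
        print(block)
    # Prints the five basic blocks identified in the list above.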

D.4 Principles

All techniques reviewed in this report have been fitted into one of four principles. These are only listed and briefly explained here, as an entire chapter is dedicated to each.

◾ macro expansion – each IR construct is expanded into one or more machine instructions using macros. This is a simple strategy but generally produces very inefficient code, as a machine instruction can often implement more than one IR construct. Consequently, modern instruction selectors that apply this approach also incorporate peephole optimization, which combines many machine instructions into single equivalents.
◾ tree covering – the IR code and machine instructions are represented as graphical trees. Each machine instruction gives rise to a pattern which is then matched over the IR tree (this is the pattern matching problem). From the set of matched patterns, a subset is selected such that the entire IR tree is covered at the lowest cost (see the sketch after this list).
◾ DAG covering – the same idea as tree covering but uses directed acyclic graphs (DAGs) instead of trees. Since DAGs are a more general form of trees, DAG covering supersedes tree covering.
◾ graph covering – the same idea as DAG covering but uses generic graphs instead of DAGs. Again, since graphs are a more general form of DAGs, graph covering supersedes DAG covering.
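To make the covering principles more concrete, the sketch below (a deliberately simplified illustration with an invented pattern set and cost model, not the algorithm of any particular system reviewed in this report) performs tree covering on a small expression tree: it matches each pattern against the root of a subtree and recursively selects the cheapest combination of patterns that covers the whole tree.

    # An IR tree node is a tuple (operator, child, child, ...); leaves are plain
    # strings. A pattern has the same form, except that the placeholder "reg"
    # matches any subtree (whose value is assumed to be available in a register).
    PATTERNS = [
        # (name, pattern, cost) -- a hypothetical instruction set for illustration
        ("LOAD",  ("load", "reg"),                        1),
        ("ADD",   ("add", "reg", "reg"),                  1),
        ("MUL",   ("mul", "reg", "reg"),                  2),
        ("MADD",  ("add", ("mul", "reg", "reg"), "reg"),  2),  # multiply-accumulate
        ("CONST", "const",                                1),
    ]

    def match(pattern, tree, holes):
        """Return True if `pattern` matches at the root of `tree`; collect the
        subtrees matched by the "reg" placeholders in `holes`."""
        if pattern == "reg":
            holes.append(tree)
            return True
        if isinstance(pattern, tuple) != isinstance(tree, tuple):
            return False
        if not isinstance(pattern, tuple):
            return pattern == tree
        if pattern[0] != tree[0] or len(pattern) != len(tree):
            return False
        return all(match(p, t, holes) for p, t in zip(pattern[1:], tree[1:]))

    def best_cover(tree):
        """Recursively find the cheapest cover of `tree`: the cheapest pattern
        matching its root plus the cheapest covers of the subtrees left in the
        placeholder positions."""
        best = None
        for name, pattern, cost in PATTERNS:
            holes = []
            if match(pattern, tree, holes):
                total = cost + sum(best_cover(h)[0] for h in holes)
                if best is None or total < best[0]:
                    best = (total, name)
        if best is None:
            raise ValueError(f"no pattern covers {tree!r}")
        return best

    # (a * b) + c: the MADD pattern beats selecting MUL followed by ADD.
    expr = ("add", ("mul", ("load", "const"), ("load", "const")), ("load", "const"))
    print(best_cover(expr))   # (8, 'MADD'): multiply-accumulate plus three loads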

E

Document Revisions

2013-10-04

◾ Merged simulation chapter with macro expansion chapter
◾ Addressed misunderstandings of several approaches
◾ Completely rewrote many parts of the chapters to make the material more accessible and to strengthen the discussion of many approaches
◾ Revised the drawing of all trees and graphs to put the root at the top instead of at the bottom
◾ Added additional references
◾ Added appendix listing the approaches in a table
◾ Added appendix for document revisions
◾ Removed compiler appendix as it became obsolete
◾ Added KTH-TRITA, ISBN, ISSN, and ISRN numbers
◾ Added acknowledgements
◾ Fixed typographical and grammatical errors
◾ Added middle-name initial to clearly indicate the surname
◾ Second public release

2013-06-19

◾ Rewrote and rearranged many passages
◾ Added a few more references
◾ Fixed typographical and grammatical errors
◾ Changed title slightly
◾ Added placeholder for KTH-TRITA number
◾ Added copyright page
◾ Added list of abbreviations
◾ First public release

2013-05-30

◾ First draft

