

GENERALIZED INSTRUCTION SELECTOR GENERATION: THE AUTOMATIC CONSTRUCTION OF INSTRUCTION SELECTORS FROM DESCRIPTIONS OF COMPILER INTERNAL FORMS AND TARGET MACHINES

A Dissertation Presented by

TIMOTHY D. RICHARDS

Submitted to the Graduate School of the University of Massachusetts Amherst in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

February 2010

Department of Computer Science

© 2010 Timothy D. Richards

GENERALIZED INSTRUCTION SELECTOR GENERATION: THE AUTOMATIC CONSTRUCTION OF INSTRUCTION SELECTORS FROM DESCRIPTIONS OF COMPILER INTERNAL FORMS AND TARGET MACHINES

A Dissertation Presented by TIMOTHY D. RICHARDS

Approved as to style and content by:

J. Eliot B. Moss, Chair

Charles C. Weems, Member

David Barrington, Member

Maciej Ciesielski, Member

Andrew Barto, Department Chair
Department of Computer Science

To Jody, Caleb, Hazel, Pixel, Andy, and Lumpy—I love you

EPIGRAPH

“Errors, like straws, upon the surface flow; He who would search for pearls must dive below....” -John Dryden

ACKNOWLEDGMENTS

Eliot Moss and Chip Weems have been excellent advisors during my graduate career. They dedicated a significant amount of their own time to provide me with the best possible graduate experience. Their complementary style has had a profound impact on my success during this time and has allowed me to reach this point in my career. They continually offer advice that helps form my ideas and research goals—for this, I am eternally grateful. I would also like to thank my thesis committee members David Barrington and Maciej Ciesielski for their feedback and time during this process.

I also acknowledge the members of the Architecture and Language Implementation Laboratory (ALI) for providing a fun and interactive environment well suited for generating interesting ideas and solutions. I am especially grateful to the rest of the CoGenT team, Trek Palmer and Ed Walters, for their hard work, dedication, support, and help to make GIST a reality. I also thank the undergraduate students involved as part of the Research Experience for Undergraduates (REU) program who spent their summers working on CoGenT and GIST. A special thanks to Addison Mayberry, Adam Fidel, and Chujiao Ma for their hard work in generating a significant amount of GIST's results during the 2009 summer program.

Finally, I am forever thankful to my wife, Jody Bird, for her commitment to this process and for getting me on track in 1994. Her support and encouragement have been vital, and I could not have accomplished this without her smiles and laughter. I would like to thank my son Caleb for teaching me that one's imagination has no limits and that pulling me away from work to help on the construction site is more important than I can begin to describe. I thank my daughter Hazel for arriving precisely when I could use those extra smiles and gentle giggles that ground me in this life. I also thank my friends and family, especially Nina, for their words (and cards) of encouragement over the past several years, to popie/grampi for teaching me how to sweat pipes, and to the entire Bird family for all their concern, encouragement, phone calls, and laughter. A special thanks goes to Sam and Dana Tomlin for everything they have done to help my family during this time—I am always and forever, in Caleb's words, "ophay".

ABSTRACT

GENERALIZED INSTRUCTION SELECTOR GENERATION: THE AUTOMATIC CONSTRUCTION OF INSTRUCTION SELECTORS FROM DESCRIPTIONS OF COMPILER INTERNAL FORMS AND TARGET MACHINES

FEBRUARY 2010

TIMOTHY D. RICHARDS

B.A., CLARK UNIVERSITY

M.Sc., UNIVERSITY OF MASSACHUSETTS AMHERST

Ph.D., UNIVERSITY OF MASSACHUSETTS AMHERST

Directed by: Professor J. Eliot B. Moss

One of the most difficult tasks a compiler writer faces is the construction of the instruction selector. The instruction selector is that part of the compiler that translates compiler intermediate representation (IR) into instructions for a target machine. Unfortunately, implementing an instruction selector "by hand" is a difficult, time consuming, and error prone task. The details of both the IR and target instruction set must be carefully considered in order to generate correct and efficient code. This, in turn, requires an expert in both the compiler internals as well as the target machine. Even an expert, however, can implement an instruction selector that is difficult to verify and debug. In this dissertation we describe the instruction selector problem, cover previous attempts at solving it, and identify what we believe to be the most prominent factor inhibiting their widespread adoption.

This dissertation proposes a generalized approach toward generating instruction selectors automatically. In particular, we propose CISL, a common machine description language for specifying the semantics of compiler IR and target instructions, and GIST, a machine independent heuristic search procedure that can find equivalent instruction sequences between compiler IR and target instructions. CISL is an object-oriented language leveraging modern programming language constructs (e.g., classes, inheritance) and is capable of describing instructions for a variety of IR and target ISAs (instruction set architectures). GIST leverages CISL's well-defined semantics and a canonicalization process to discover instruction selector patterns automatically: target instruction sequences that implement IR semantics. These instruction selector patterns are then generated in a compiler implementation independent format (XML). Small adapter programs use the generated instruction selector patterns to generate compiler specific implementation code. Our experiments show that instruction selector patterns can be discovered automatically and independent of a particular compiler framework or target machine. In addition, experience proved that adapter programs are easy to implement and instruction selector code is easy to generate from generated patterns. Furthermore, the generated instruction selectors are comparable in performance to the original selectors.

TABLE OF CONTENTS

Page

EPIGRAPH ...... v

ACKNOWLEDGMENTS ...... vi

ABSTRACT ...... viii

LIST OF TABLES...... xiii

LIST OF FIGURES ...... xiv

CHAPTER

1. INTRODUCTION ...... 1
   1.1 Motivation ...... 1
   1.2 Posing the Problem ...... 7
   1.3 Prerequisites for Valid CGG Solutions ...... 10
   1.4 Background ...... 12
   1.5 Overview of GIST ...... 14

2. RELATED WORK ...... 18
   2.1 Retargetability ...... 18
   2.2 Manual Construction of Code Generators ...... 20
       2.2.1 Macro Expansion ...... 21
       2.2.2 Maximal Munch ...... 22
   2.3 Automatic Generation by Pattern Specification ...... 24
       2.3.1 LR Parsing Techniques – Graham/Glanville ...... 24
       2.3.2 Attribute Grammars ...... 27
       2.3.3 Bottom Up Rewrite System (BURS) Techniques ...... 30
   2.4 Automatic Generation by Pattern Discovery ...... 33
   2.5 Same Machine Pattern Mappings ...... 34
       2.5.1 Massalin ...... 34
       2.5.2 GNU Superoptimizer ...... 35
       2.5.3 Denali ...... 36
       2.5.4 Bansal and Aiken ...... 38
   2.6 Different Machine Pattern Mappings ...... 40
       2.6.1 XGEN ...... 41
       2.6.2 Cattell ...... 44
       2.6.3 Dias ...... 49

3. DESCRIBING MACHINE INSTRUCTIONS ...... 52
   3.1 Introduction ...... 52
   3.2 Overview ...... 53
       3.2.1 CISL Types ...... 57
       3.2.2 Store ...... 59
   3.3 Describing Instructions ...... 62
       3.3.1 Defining Instruction Encoding ...... 63
       3.3.2 Specifying Instruction Semantics ...... 66
             3.3.2.1 Variable Expressions ...... 67
             3.3.2.2 Expression Types ...... 68
             3.3.2.3 Operators ...... 70
             3.3.2.4 Statements ...... 73
       3.3.3 Classes and Mixins ...... 77
   3.4 Summary ...... 83

4. SUPPORTING INSTRUCTION SELECTOR DESIGN ...... 86
   4.1 Introduction ...... 86
   4.2 Instruction Selector State ...... 87
   4.3 Mapping the Store ...... 91
   4.4 Summary ...... 96

5. CANONICALIZATION ...... 97
   5.1 Introduction ...... 97
   5.2 Axiom Normalization ...... 98
       5.2.1 Terms and Equations ...... 100
       5.2.2 Term Rewriting Rule of Inference ...... 101
       5.2.3 Axiom Admissibility ...... 106
   5.3 Expression Reassociation ...... 109
       5.3.1 Computing Ranks ...... 109
       5.3.2 Operand Sorting ...... 110
       5.3.3 Compiler Constants ...... 111
       5.3.4 Canonicalization Example ...... 112
   5.4 Summary ...... 113

6. DISCOVERING INSTRUCTION SELECTOR PATTERNS ...... 115
   6.1 Overview ...... 115
   6.2 Instruction Selector Patterns ...... 118
   6.3 Target Pattern Indexing ...... 121
   6.4 Resource Allocation ...... 124
   6.5 Heuristic Search ...... 125
       6.5.1 Search Strategy ...... 127
             6.5.1.1 Finding Candidate ISA Patterns ...... 128
       6.5.2 Bottom-up Matching ...... 131
       6.5.3 Axiom Transformation ...... 137
       6.5.4 Pattern Generation ...... 151
       6.5.5 Heuristics ...... 151
   6.6 Summary ...... 153

7. INSTRUCTION SELECTOR ADAPTERS ...... 154
   7.1 Overview ...... 154
   7.2 Jikes RVM Baseline Compiler Adapter ...... 156
   7.3 LCC Compiler Adapter ...... 160
   7.4 Summary ...... 162

8. EXPERIMENTS AND RESULTS ...... 163
   8.1 Experimental Infrastructure ...... 163
   8.2 Experimental Methodology ...... 163
   8.3 CISL Results ...... 165
   8.4 IS Generation Performance ...... 165
   8.5 Coverage Results ...... 167
   8.6 IS Performance ...... 171
   8.7 Generated Code Performance ...... 172
   8.8 Heuristic Study ...... 172

9. CONCLUSION ...... 182
   9.1 Main Contributions ...... 182
       9.1.1 CISL Instruction Specification Language ...... 183
       9.1.2 Compiler Design Specification ...... 183
       9.1.3 Admissible Axioms and Canonicalization ...... 183
       9.1.4 GIST and Heuristic Beam Search ...... 184
       9.1.5 Compiler Adapters ...... 184
       9.1.6 GIST Evaluation and Validation ...... 184
   9.2 Future Directions ...... 185
       9.2.1 CISL and GIST Extensions ...... 185
       9.2.2 Peephole Optimizers ...... 186
       9.2.3 Binary Rewriters ...... 186
       9.2.4 Very Portable JIT Optimizer ...... 186

APPENDIX: A ...... 188

BIBLIOGRAPHY ...... 235

LIST OF TABLES

Table Page

8.1 GIST Source Coverage Results ...... 170

LIST OF FIGURES

Figure Page

1.1 IR Interface ...... 2

1.2 Retargetable Compiler Phases ...... 3

1.3 CoGenT Project ...... 13

1.4 CoGenT Compiler Generation ...... 13

1.5 Role of GIST Compiler Construction ...... 14

1.6 Scope of GIST ...... 15

2.1 Maximal Munch Input IR Tree ...... 23

2.2 Maximal Munch Patterns ...... 23

2.3 Matching an IR tree with BURS rules ...... 32

3.1 ARM ADD left by immediate ...... 64

3.2 CISL Operators ...... 71

3.3 Architecture Instructions in CISL ...... 84

3.4 IR Instructions in CISL ...... 85

4.1 JVM to ARM Store Mapping ...... 92

5.1 Reduction of the term s(0) + s(s(0)) in Nat...... 103

5.2 Reduction of the term f (A + B) in Con...... 103

5.3 f (1) + f (0) and f (0) cannot be joined...... 105

5.4 Admissible Axioms ...... 108

5.5 Commutativity/Associativity Example ...... 108

5.6 CISL Baseline iadd Instruction ...... 110

5.7 CISL Reassociation Transformations ...... 111

6.1 GIST Architecture ...... 118

6.2 Overview of Search and Match Process ...... 118

6.3 Target Indexing ...... 122

6.4 IR Lookup ...... 124

6.5 Search Graph ...... 128

6.6 Search Flow Graph ...... 129

6.7 IR Subtrees ...... 130

6.8 IR Subtrees ...... 132

7.1 Compiler Adapter ...... 155

8.1 Total GIST Performance ...... 166

8.2 GIST Performance per IR ...... 166

8.3 Average Search Depth ...... 168

8.4 Average Number of Search Nodes per IR Instruction ...... 169

8.5 GIST Source Coverage Results ...... 170

8.6 Jikes RVM-PPC slowdown (lower = better) ...... 173

8.7 - MIPS slowdown (lower = better) ...... 173

8.8 lcc - IA32 slowdown (lower = better) ...... 174

8.9 Variable Size/Fixed Length ...... 175

8.10 Fixed Size/Variable Length ...... 176

8.11 Fixed Axioms/Variable Constraints ...... 176

8.12 Variable Axioms/Fixed Constraints ...... 177

8.13 Variable Size/Fixed Axioms ...... 177

8.14 Fixed Size/Variable Axioms ...... 178

8.15 Variable Size/Fixed Constraints ...... 178

8.16 Fixed Size/Variable Constraints ...... 179

8.17 Variable Length/Fixed Axioms ...... 179

8.18 Fixed Length/Variable Axioms ...... 180

8.19 Variable Length/Fixed Constraints ...... 180

8.20 Fixed Length/Variable Constraints ...... 181

CHAPTER 1

INTRODUCTION

1.1 Motivation

In this work we show that the instruction selector phase of a compiler can be generated automatically from machine specifications describing a compiler intermediate representation (IR), target machine, and memory implementation mapping, in a manner that is independent of the source compiler framework and target architecture. This work is motivated by a recognition of the importance of compilers in the future of computing and their automatic construction [Hall et al., 2009], the lack of a compiler- and architecture-independent solution to the problem, and an appreciation of the difficulty of the problem when it is generalized.

The ubiquity of software systems in today's society is unprecedented. This phenomenon can, in part, be attributed to the continued increase in microprocessor performance as predicted by Gordon Moore in 1965 [Moore, 2000]. At the same time, faster processors allow larger and more complex software systems to be developed that are capable of solving a wider range of problems that were previously considered impossible, including the design of faster processors.

This complexity is caused by the many layers of abstraction that exist between the user of a computer application and the actual hardware that is executing the software. For example, a typical user may interact with a Java-based application that has been developed using a number of independent libraries, all having their own data structures, algorithms, and application programming interfaces (APIs), interpreted by the Java Virtual Machine (JVM), which is natively executed by an underlying microprocessor such as a PowerPC, SPARC, or x86 processor.


Figure 1.1: IR Interface

Furthermore, each processor may in turn have several levels of abstraction including its instruction set architecture (ISA), micro-operations, and cache and pipeline semantics.

At the heart of all these complex layers of abstraction lies an important computer program making much of this possible: the compiler. The compiler, as originally conceived by Rear Admiral Grace Murray Hopper and made successful by the first FORTRAN I compiler, allows high level programming languages (HPLs) to be translated into instructions that can be directly executed by a specific target machine. Compilers free programmers from the difficulties of writing assembly programs, which require intimate knowledge of the underlying computer architecture, are difficult to manage in the large, and are difficult to port. Thus, the compiler is one of the critical links in computing history upon which all modern software, and the HPLs used to create it, are based. It is the compiler that has allowed software to flourish and become a necessity of modern day society.

Although the compiler removes the burden of machine details from the shoulders of most programmers, compilers themselves are not as fortunate, as they are by their very nature machine-specific. To solve this problem, much effort has gone into designing compilers for retargetability (i.e., compilers that can be more easily modified to produce code for new architectures). One way to accomplish this is by dividing compilation into well defined phases where each phase focuses on a specific task. This allows phases that are machine-independent to remain the same across architectures and those that are machine-specific to be replaced more easily.


Figure 1.2: Retargetable Compiler Phases

At some point, however, the machine-independent part of the compiler, also known as the front-end, must interface with the machine-dependent part of the compiler (the back-end). This is accomplished using a compiler's intermediate representation (IR). The IR is an instruction set for an abstract machine that allows the programming language semantics to be captured exactly while still remaining independent of any particular architecture. The benefits of such a representation are two-fold: low-level optimizations such as dead code elimination and branch-to-branch removal can occur independently of the target, and more importantly the translation into actual target instructions can occur independently of the programming language being compiled, as depicted in Figure 1.1.

In Figure 1.2 we highlight the major phases of a retargetable compiler. The code generator (CG) is that part of the compiler that is responsible for translating front-end generated IR into target machine instructions. The CG itself is divided into a number of phases including register allocation, instruction scheduling, and instruction selection. Our emphasis (and that of prior work) is on the instruction selector. However, we use the term "code generator" interchangeably in this dissertation and make a distinction only when necessary. A code generator is implemented by carefully mapping an IR sequence to a target instruction or sequence of instructions. Ideally, when the code generator is presented with a sequence of IR instructions it will produce a sequence of efficient target instructions that correctly implement the IR semantics. As it turns out, building good code generators "by hand" is difficult to achieve in practice without error.

Constructing a good code generator for modern compilers and modern architectures is a formidable task. A good code generator must implement programming language constructs in the target instruction set correctly and it must utilize the most efficient instructions. Efficiency is often measured in terms of execution time or space but can also refer to aspects such as power usage. For example, knowing how many machine cycles a particular instruction requires to execute would allow us to choose instructions that are faster. For variable length instruction encodings (e.g., x86), replacing a longer sequence of instructions with a single instruction could result in a reduction in the overall number of instructions stored in memory, which could result in a reduction in cache misses and page faults. Thus, to build a code generator that is both correct and efficient requires substantial knowledge of the programming language internals, in particular the compiler IR, as well as a mastery of the target architecture.

In addition to the complexities of compiler internals and the intricacies of the target machine, software of this sort must consider efficiency. This concern is becoming increasingly important in systems that employ just-in-time compilation strategies such as the Java Virtual Machine (JVM) and the Common Language Runtime (CLR). In this scenario the programming language (i.e., Java or C#) is translated into an IR, typically referred to as bytecode, that is incrementally compiled as the program is executed by a virtual machine. The enormity of information required by compiler writers and the complexity of these systems lead to longer development times, code generators that are error prone and potentially inefficient, and compilers that are ultimately difficult to debug.

Given the importance of code generators in today's world, is it possible to improve the current state of compiler back-end construction? In particular, is it possible to reduce the amount of time required to build a code generator such that a compiler can target a new architecture with ease? One might propose that we use brute force methods and increase the number of persons developing a particular CG, hoping that in the end this might reduce overall time and increase quality. Unfortunately, the enormity of this kind of software is not attributed to the vast number of modules or lines of code but rather to the immense knowledge base from which the individual must draw in order to implement a few lines of code that correctly implement the semantics of the source language as instructions of the target machine.1 If increasing the number of persons building a code generator is not a viable option, perhaps the solution to this problem is automating the construction of a CG. In particular, can we gather the information that is typically required by the implementor in such a way that it can be processed by a computer program to generate a code generator? Furthermore, can this program build a code generator for any compiler that can target any machine? Can the construction of a code generator be automated similarly to the lexer and parser components that are generated by lex and yacc?

If we can generate a CG automatically, is there sufficient reason to do so? Over the past decade the diversity of general purpose processors has decreased. Two notable examples include the phase-out of the Alpha microprocessor (originally developed by Digital Equipment Corporation (DEC)) in early 2001 (in favor of the x86) after Digital was purchased by Compaq, and the switch from PowerPC chips to the Intel x86 processor by Apple Computer in June 2005. This would indicate a movement toward a single general purpose CPU and less need for generating instruction selectors automatically. If we look toward the future, however, it is becoming clear that Moore's law is coming to an end and computer architecture is at a crossroads. Increasing clock frequency is no longer an option for increasing performance of next-generation machines because of a myriad of physical limitations such as power consumption, heat dissipation, and wire delay.

Because of these problems heterogeneous multicore architectures and machines with extended instruction sets are becoming more prevalent. Building instruction selectors quickly for these new architectures, such as the Cell Broadband Engine [Kahle et al., 2005, Gschwind, 2006], will become increasingly important to allow experimentation to proceed more rapidly. In addition, people are expecting more from their portable devices (e.g., cell phones, hand-helds, netbooks). This has led to a larger diversity of chips in embedded systems, including general purpose CPUs, Harvard architectures, VLIW, DSP, ASIC, and FPGA. This architecture zoo demands a new level of automation and flexibility for compilers, especially the instruction selector, to build next-generation devices and deliver at a pace that people have come to expect.

1 Of course, as Frederick Brooks clearly points out [Brooks, 1995], one should never expect to increase software development productivity by throwing person power at the problem.

The questions of this dissertation have been asked since the advent of the compiler. Most answers have manifested themselves in solutions involving the construction of a program, commonly referred to as a code generator generator (CGG) or instruction selector generator (ISG), that attempts to generate automatically the instruction selector component of the compiler. Although previous proposals have provided great insight into the complexities of the problem, most have fallen short of delivering a general solution.

A general solution is one that maintains a careful balance between generality, automation, and compiler design flexibility. Prior work emphasized automation over generality.

For example, previous systems often assume several aspects of the code generator such as a specific IR, memory model, output (e.g., assembly), register allocator, or instruction scheduler. Although this may lead to a more "automatic" solution, it is typically not possible to use the CGG results with other compiler frameworks or architectures. In addition, it restricts a compiler writer's ability to explore different implementations of the IR semantics on the target. This dissertation introduces a general strategy for generating an instruction selector automatically from a description of the compiler IR, target instruction set, and memory mapping specification. To evaluate this approach we developed GIST, an implementation of the generalized approach to generating instruction selectors automatically and independently of any particular compiler framework and target architecture. We demonstrate the generality of GIST using multiple target architectures (PowerPC, MIPS, Sparc, x86, and ARM) and two very different frameworks: lcc BURS rule patterns and Jikes RVM Baseline compiler expansion of Java bytecode to native code. We show the effectiveness of the generated selector by passing its output to existing frameworks through small adapters and comparing the resulting compiler outputs against their standard versions.

1.2 Posing the Problem

Compiler writers have a broad suite of tools to assist with important front-end compiler tasks such as scanning and parsing, but comparatively fewer for dealing with back-end tasks such as instruction selection and code generation. Those that do exist assume a fixed compiler framework, i.e., they are dependent on a source or target representation associated with a single compiler intermediate representation (IR) or virtual machine (VM) specification. At the same time, the ongoing move to greater parallelism is requiring compiler writers to experiment with novel optimizations that involve IR or VM extensions, and to deal with a proliferation of instruction set architecture (ISA) extensions. To be widely applied and evaluated, such work should not be confined to a specific compiler framework or a limited set of target architectures. The alternative to using framework-specific back-end tools is to construct these components by hand. But that process is time consuming and error-prone, which impedes the flexibility of experimentation that is critical to new discoveries.

For example, say a research team is designing a novel programming language that will target a new VM that has rich run-time semantics. Multiple iterations of experiments with compiler optimizations, IR semantics, and tuning the VM design will necessitate reworking the back end. Similarly, testing transactional memory extensions for a current processor through simulation depends on corresponding adaptations in the instruction selection and code generation compiler phases. There are many examples in which functionality moves back and forth across the hardware-software interface during these kinds of research efforts, before the correct balance can be achieved. But when the cost of modifying that portion of the compiler is high, researchers are deterred from broader exploration and evaluation.

As mentioned previously, one of the main difficulties in automating back-end generation is generating instruction selectors. To appreciate why this is hard, consider the challenge of automatically deriving an instruction selector that maps the instructions in a stack-oriented IR, such as Java bytecode, to a load-store architecture. The semantic gap between these representations may seem easy to bridge to achieve basic functionality for a single instance, but doing so automatically in a manner that takes advantage of modest experimental variations requires extensive analysis of the two semantic domains. Prior instruction selector generators simplify the problem by fixing the IR and allowing variation only in the target, which greatly reduces the search space for semantically equivalent instruction sequences. But this approach deals only with extensions to the target architecture and fails to enable researchers to experiment with novel IR extensions and language features that depend on them. Some instruction selector generators further simplify the problem by making assumptions about the characteristics of the target architectures or the rest of the back end.
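To make the stack-to-register gap concrete, the sketch below (a hypothetical illustration in C; the three-instruction IR and all identifiers are invented for this example and are not GIST's or any particular VM's representation) lowers a small stack-oriented sequence into register-style code by tracking, at translation time, which virtual register holds each operand-stack slot. This compile-time stack model is the kind of analysis an instruction selector for a bytecode IR must perform before load-store instructions can be chosen.

    #include <stdio.h>

    /* Hypothetical stack-IR opcodes: push a local, add the top two slots,
     * store the top slot back into a local. */
    enum op { PUSH_LOCAL, ADD, STORE_LOCAL };

    struct ir { enum op op; int local; };

    /* Translate stack IR into register-style pseudo-assembly by keeping a
     * compile-time model of the operand stack: each slot records the
     * virtual register that holds its value. */
    static void lower(const struct ir *code, int n) {
        int stack[32], sp = 0, next_reg = 0;
        for (int i = 0; i < n; i++) {
            switch (code[i].op) {
            case PUSH_LOCAL: {
                int r = next_reg++;
                printf("load  r%d, local%d\n", r, code[i].local);
                stack[sp++] = r;
                break;
            }
            case ADD: {
                int b = stack[--sp], a = stack[--sp], r = next_reg++;
                printf("add   r%d, r%d, r%d\n", r, a, b);
                stack[sp++] = r;
                break;
            }
            case STORE_LOCAL:
                printf("store r%d, local%d\n", stack[--sp], code[i].local);
                break;
            }
        }
    }

    int main(void) {
        /* local3 = local1 + local2, expressed as stack IR */
        struct ir prog[] = {
            { PUSH_LOCAL, 1 }, { PUSH_LOCAL, 2 }, { ADD, 0 }, { STORE_LOCAL, 3 }
        };
        lower(prog, 4);
        return 0;
    }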

GIST is the first instruction selector generator that is tied neither to a specific compiler framework nor to the type of target. As a result of removing these assumptions we were forced to deal with a much more general instruction selector generation problem. Our solution required us to develop the means of describing a wider range of IRs, VMs, and ISAs, to handle matching of much larger IR graphs, and to apply more search optimizations to reduce the work involved in exploring the exponential growth in potential matches.

We present here the key ideas behind the GIST system by walking through the process of using it. GIST takes as its primary inputs descriptions of a compiler IR and a target, and produces a mapping between the two representations, in the form of instruction selector patterns, via canonicalization and heuristic search. GIST outputs the patterns in a compiler framework independent XML form. One then writes a small adapter that converts the XML into code, tables, etc., suitable to plug in to the host compiler framework. Because of the transparency and regularity of the XML representation, we have found these adapters to be small and simple to write—in fact, several of our adapters were written by undergraduate students with no prior compiler-writing experience, while working in our lab during an eight-week summer program.

GIST is highly configurable and facilitates rapid prototyping of experiments. For example, as a secondary input GIST takes a user-specified store mapping that enables a designer to specify different choices in the correspondence between the resources of the IR and target (e.g., how much of the stack to keep in registers versus memory). GIST also uses a database of semantic axioms, which the user can extend in order to assist GIST in mapping between operations where semantic equivalence depends on an orthogonal rule, such as splitting long arithmetic into multiple cases of standard length arithmetic, or breaking large constants into pieces small enough to fit in the immediate fields of instructions.

The contributions we offer in this dissertation include:

• The design and implementation of the CISL [Moss et al., 2005] description language supporting a wide range of target ISAs, compiler IRs, and VM specifications while maintaining its independence from existing frameworks.

• Developing the means for describing the mapping between the resources of the source and target. We do this separately from the source and target semantics because it embodies an important set of design choices over which a compiler writer must have control, e.g., how can we map a source stack abstraction onto a target memory? Keeping store mappings separate from the source and target description also avoids polluting them with semantic information that should not be directly tied to either.

• Adding the capability to express axioms that extend GIST's knowledge of semantic equivalence, which already includes associativity and commutativity of arithmetic and boolean operators, etc. We note that in a separate project we have developed the means to prove such axioms correct using satisfiability solvers, so we need not simply trust the user to get axioms right.

• Solving the generalized instruction selector generator matching problem, which involved choosing appropriate algorithms and heuristics.

• Minimizing the effort required to adapt the GIST output to an existing framework. While the descriptions and search process for GIST are framework-independent, at some point the generated instruction selector must interface with a code generator and front end for the results to be useful. We have worked to make creating adapters both simple and easy, and indeed we have shown that writing them does not require extensive experience with compilers, and that they are at least an order of magnitude smaller than ISA descriptions.

• Evaluating GIST in the context of two very different compiler frameworks and multiple target architectures, including generation of instruction selectors for targets that were not previously supported by the frameworks.

• Validating GIST for those targets where code generators are available. We show that our generated instruction selectors produce code that has performance essentially identical to that of the existing selectors.

The remainder of this chapter proceeds as follows: In Section 1.3 we provide a definition of what a valid CGG solution is, which will allow for a point of comparison between our research and prior work in this area. Section 1.4 discusses the big picture and where this work resides in a larger context. Finally, Section 1.5 provides an overview of what GIST does and how it works.

1.3 Prerequisites for Valid CGG Solutions

The problem of automatically generating efficient code generators for compilers has a long history—yet the solutions have resisted widespread adoption. There are several factors that contribute to this lack of acceptance. The most obvious issue is code generator generators that either do not exist or do not work. Without a working implementation it is hard to prove or disprove the proposed solution. If we restrict ourselves to the set of proposals that have working implementations, we must consider three important criteria that contribute towards a valid CGG solution: correctness, efficiency, and generality. A CGG must not only have a working implementation but must also guarantee that the target instructions produced from a given sequence of compiler IR are correct. In other words, we must be able to prove that the semantics of the compiler IR and the target instructions are identical on an instruction-by-instruction basis, as well as across sequences of instructions.

Given a correct CGG implementation, we must then consider three criteria of CGG efficiency: the efficiency of the CGG itself, the efficiency of the actual code generator, and the efficiency of the code generated by the code generator. The length of time it takes a CGG to produce a code generator must remain within acceptable limits. In this thesis, we define acceptable as the amount of time it takes the CGG to produce a working code generator such that it can be used for building real compilers. For example, a CGG taking months to produce a code generator is hardly useful compared to a system that completes in a few hours, especially when time is precious.

In addition to the cost of executing the CGG we must also be concerned with the efficiency of the generated code generator. This is becoming increasingly important today with the use of JIT-style compilation systems such as those for the Java Virtual Machine (JVM). Just as the significant overhead of garbage collection algorithms cast a shadow on the LISP systems of the 1970s, we must give particular attention to the cost of online code generation systems. That is, the cost of executing a JIT compiler must not dominate the overall time of the executing program.

Although correctness and efficiency are paramount in a working CGG system, an often overlooked requirement is the practicality and generality of the system. The CGG must be practical in that the work required to run the system must be less than the work required to implement a code generator for a particular compiler system "by hand", and it must be general in that it can be used for many architectures and compiler systems. In particular, the method used to generate a code generator must not only be independent of the target machine and its instruction set (i.e., RISC vs. CISC) but must also be independent of the compiler infrastructure and its IR that will use the code generator (a limiting factor that has plagued prior CGG solutions).

In summary, for a CGG to be considered a valid solution it must have a working implementation that is compiler- and architecture-independent. The implementation must adhere to our three criteria for an efficient CGG: efficiency of the code generator generator, of the generated code generator, and of the code generated by the code generator. Lastly, the CGG must be practical in that the overall work required to use the CGG must be less than the work required to craft a code generator by hand. We use these criteria to validate the GIST system.

1.4 Background

CoGenT stands for Co-Generation of Tools, particularly compilers and simulators. Compiler and simulator tools for systems research are difficult to develop and coordinate since each tool is complex in its own right, and both are dependent on aspects of the target architecture. The CoGenT project, illustrated in Figure 1.3, addresses this problem by providing two sets of components: multiple coordinated specifications that describe the target instruction set architecture, compiler IR, and micro-architecture; and a set of tools that process these specifications and produce compiler and simulator components such as code generators, instruction schedulers, and simulators at varying degrees of detail. Generating a compiler and simulator from the same set of descriptions ensures that the two will always be consistent, and since automatic tool generation is fast, CoGenT allows designers to explore more design space in less time.


Figure 1.3: CoGenT Project


Figure 1.4: CoGenT Compiler Generation


Figure 1.5: Role of GIST Compiler Construction

Prior systems tend to use either a simple language, making descriptions unwieldy, or a machine description language augmented with a complete general purpose language, rendering analysis difficult. We address this issue by providing multiple coordinated languages. Currently, CoGenT has three primary description languages: CISL, CASL, and COSL.

1.5 Overview of GIST

GIST is designed to be used anywhere that a translation between two instruction representations is needed, e.g., binary translators, interpreters, etc., but its primary use is in compilers.

Figure 1.5 shows the relationship between GIST and a compiler. It is important to note in this figure that GIST is not a compiler phase; rather, it generates a compiler phase. In particular, it is necessary to keep in mind that the heuristic search we describe later happens when generating the compiler, not as the compiler runs.

Here we summarize the process one follows in using GIST:

1. Define the source and target of the translation process. In a typical case, the source will be a compiler intermediate representation (IR) or a virtual machine (VM) specification, and the target will be an instruction set architecture (ISA) or another virtual machine.


Figure 1.6: Scope of GIST

Figure 1.6 shows a variety of other combinations for which GIST would be appropriate. For example, for a binary translator both source and target could be ISAs, or a compiler framework could be targeted to a virtual machine. Typical uses of GIST are situations where the target representation is at an equal or lower level than the source, although the reverse may be possible and useful.

2. Create descriptions in the CISL description language for the semantics of both the source and the target. While CISL can describe both syntax and semantics, only the semantics are strictly necessary for matching purposes. Syntax is useful in other tools that one can generate from CISL, such as assemblers, disassemblers, and functional simulators. CISL descriptions are the primary inputs to GIST. We elaborate on their form in Chapter 3.

3. Specify how to map source storage elements to the target. This is an important aspect of the framework that the compiler designer wants to control in experimenting with different optimization options. For example, one can describe the layout of Java arrays, which affects how instructions are selected for fetching the size field when performing bounds checks, and for computing the offsets of elements for array accesses. This is the only input that the GIST user will write for each source and target pair—the other descriptions are independent of the pairing, and can be reused in any combination. In fact, many users will want to write multiple mappings that encapsulate different design decisions. For instance, one could use the store mapping to vary the dispatch mechanism in a virtual machine and set up a head-to-head comparison. We further describe the mappings in Chapter 6.

4. Add any algebraic axioms that will assist matching in cases where additional independent semantics are helpful. For example, one useful axiom breaks 32-bit integers into two 16-bit halves to make it more straightforward for an ISA to manipulate them in certain ways (see the sketch following this list). Axioms tend to be useful in many contexts, and thus are frequently reused, once written. We anticipate that each new target, and possibly some sources, may need a small number of new axioms to achieve the best patterns.

5. (Optional:) Specify how GIST should prioritize among multiple valid output code sequences. For example, one could choose the sequence with the fewest instructions, or with the least number of bytes, or, if one supplies a speed or power estimator, the sequence with the fastest speed or least power consumption. By default GIST selects the sequence with the fewest instructions, but it allows the user to plug in a custom evaluation function for assigning a quality value to instruction sequences. We describe some of the alternatives in Chapter 6.

6. Execute GIST. It goes through its search process and outputs XML patterns. The user must determine if the coverage and quality of the patterns is acceptable, using the detailed feedback supplied from the search process. If the results are not acceptable, then the user iterates through the previous steps as necessary.

7. Write a small adapter that turns the framework independent XML instruction selector patterns into a plug-in that is suitable for inclusion in the instruction selector phase of the specific compiler framework one is targeting (see Figure 1.5). The targeted phase typically comes between the optimization and code generation (register allocation/instruction scheduling) phases.
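As a concrete illustration of the 32-bit/16-bit axiom mentioned in step 4, the following C sketch (an illustration of the underlying identity only; it is not CISL or GIST axiom syntax) checks that a 32-bit value is recovered by shifting its upper half left by 16 bits and OR-ing in its lower half. This is the identity that lets a constant too wide for a 16-bit immediate field be materialized by a pair of target instructions.

    #include <assert.h>
    #include <stdint.h>

    /* The identity behind the axiom: x == (hi << 16) | lo, where hi and lo
     * are the upper and lower 16-bit halves of x.  An instruction selector
     * can use this to build a 32-bit constant from two instructions whose
     * immediate fields hold only 16 bits. */
    int main(void) {
        uint32_t x  = 0xDEADBEEF;
        uint32_t hi = (x >> 16) & 0xFFFFu;   /* upper half: 0xDEAD */
        uint32_t lo = x & 0xFFFFu;           /* lower half: 0xBEEF */
        assert(((hi << 16) | lo) == x);      /* the two halves reassemble x */
        return 0;
    }

On targets whose "add immediate" instruction sign-extends its operand, the low half must be adjusted when its top bit is set; that refinement is omitted here for clarity.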

It is important to note that we have already developed descriptions for two popular compiler framework IRs (LCC and JVM), a partial description of the PQCC compiler (for verifying prior work), and four widely used architectures (PowerPC, IA-32, ARM, MIPS), along with a number of other prototypes. Mappings exist for many IR-ISA pairings of these frameworks and ISAs, along with a large set of commonly useful axioms. We have also implemented several adapters. We continue steadily to expand our library of these components. Thus, most users will jump directly into modifying existing GIST inputs and adapters for experimentation, rather than having to write new descriptions from scratch.

It also bears repeating that the CISL language was designed so that its descriptions can be used to generate the additional components in the back end, plus related development tools such as assemblers, disassemblers, debuggers, and functional simulators. Thus, even when it is necessary to write a new CISL description, the benefits will eventually go well beyond producing instruction selectors.

CHAPTER 2

RELATED WORK

Much work has contributed towards the advancement of code generator generation. This chapter surveys past approaches and existing tools for code generation, code generator generation, and the artificial intelligence and theorem proving methods and practices that have influenced this area. The similarities and differences of GIST are highlighted with respect to prior work as each is presented.

2.1 Retargetability

A compiler that is retargetable is of much greater use than one that is not. Many compiler textbooks emphasize this fact by defining several distinct phases of the compiler [Hanson and Fraser, 1995, Muchnick, 1997, Aho et al., 1986, Fischer and Richard J. LeBlanc, 1991, Cooper and Torczon, 2004, Appel, 1998], most notably the front-end and back-end (code generator). The ability to generate code for multiple target architectures given a single source program and compiler suite greatly simplifies the job of the user of the compiler.

More specifically, a compiler is more versatile when the target code it generates can be easily modified or changed. This is particularly advantageous in the area of architecture and compiler research. For example, experimental architectures require compilers that can easily adapt to changes in the instruction set, and compiler researchers require code generators that can be modified seamlessly to investigate new target architectures. A retargetable compiler is one that has been carefully designed to separate the machine-independent aspects from those that are machine-dependent. This separation of concerns is accomplished primarily through the use of a virtual instruction set, typically referred to as a compiler's intermediate representation (IR). IRs are designed to model machine-level constructs such as registers, main memory, and operations on them, yet remain at the level of abstraction necessary to allow independence of any particular architecture. In other words, an IR acts as an abstract interface to any machine whereas the machine-dependent code generator is the implementation. The goal of GIST is to generate the implementation part (instruction selector) of such a retargetable compiler automatically, given the definition of this interface and the implementation language (i.e., target instruction set).

A prime example of the success of a retargetable compiler is the open source compiler GCC [Stallman]. GCC Version 3.3 can be hosted and generate code for well over a hundred different architecture/operating system combinations (approximately 51 different architectures), and possibly more if you include the many specialized derivatives. This myriad of supported programming languages and target architectures is the result of two important aspects of GCC: the number of developers actively working on various code generator ports and its loosely defined IR generically named register transfer language (RTL). To create a new code generator for GCC involves the definition of a set of C macros (as GCC is written in C) that define machine specific details such as endianness and bit widths of the target, as well as a LISP-like (S-expression) RTL pattern language that maps an RTL pattern to a target machine instruction (or instructions). Code generation then becomes a matching problem on source RTL trees to target instructions to generate machine-specific assembly code.

Although GCC supports a wide range of target architectures it is not ideal for compiler exploration and architecture experimentation. Much of GCC's success is not so much because of an easy to use compiler infrastructure but rather through its large user and developer pool. A relatively large number of developers can work at supporting a new target and many users can work at discovering the bugs. This process continues until the quality of code produced is at an acceptable level. Unfortunately, this kind of development model is not feasible when a target instruction set is prone to changes. In other words, through experimentation with the underlying machine, its instruction set, and compiler, an acceptable level of quality might never be reached.

In addition, it is a single compiler framework that uses a specific implementation language (C). It supports a particular suite of source languages, missing several. Thus, if one wanted to explore a new architecture in the context of a programming language not supported by GCC, a new front-end would be necessary. Furthermore, it targets a particular run-time system which limits important research avenues (e.g., alternate styles of garbage collection). Lastly, its compilation strategy is fixed and only targets ahead-of-time compilation. For these reasons, GCC is not the ideal compiler framework to explore compilers and architectures.

A retargetable compiler is only a small part of what makes a compiler truly portable and useful as a research vehicle. In addition to adapting easily to the source programming language it must also be amenable to a changing target architecture. Furthermore, its reliability and quality must be invariant to modifications of that target machine. A clearly defined IR is important in achieving this goal, but how that IR is matched to the target instruction set is paramount. In the following sections we cover various code generation techniques, spanning manual techniques to more automated methods that are based on the idea of a retargetable compiler and its IR interface.

2.2 Manual Construction of Code Generators

A code generator for a typical compiler, at the most fundamental level, becomes a case analysis of an IR instruction to determine the appropriate target instruction that performs an identical computation. In particular, the IR instruction must be implemented by a target instruction based on the operator of the IR (such as addition or subtraction) as well as its operands. The operands of an IR instruction often refer to programmer-defined variables in the source language. For a register-based target these variables must be allocated and stored into a suitable location (i.e., register, memory) on the target. As such, where those variables have been allocated on the target influences the possible choice of target instructions that can implement the IR instruction. Where these variables reside is determined by the register allocator, a back-end phase that can occur before or after instruction selection.

2.2.1 Macro Expansion

A basic code generator implemented "by hand" must carefully consider all the alternatives and manually write source code that macro expands the IR into a sequence of target instructions based on these criteria. Consider the three-address1 IR instruction a = b + c (where a, b, and c are variables declared in the original program text). The target instructions that we generate depend on where the variables have been allocated on the target. It is entirely possible that one or more of these variables reside in a target register if it has been recently used by a previous IR instruction, in the heap if dynamically allocated, or on the stack if allocated by the current frame. The following pseudo-code of the emitAdd procedure is an example of what might be manually coded to generate PowerPC instructions for the above IR:

1Three-address code is one kind of IR representation; we introduce others as they pertain to the approach being discussed.

Procedure emitAdd(IRInstruction inst)
Input: a = b + c
    if b is not in a register then
        Allocate a register Rb for the value of b;
        Generate "lwz Rb," + gen_location(b);
    end
    if c is not in a register then
        Allocate a register Rc for the value of c;
        Generate "lwz Rc," + gen_location(c);
    end
    Allocate a register Ra for variable a if not already in a register;
    Generate "add Ra,Rb,Rc";

The procedure emitAdd determines if variables b and c are in registers and if they are not generates instructions to do so (i.e., the PowerPC lwz instruction). It uses the register allocation utility function gen_location to retrieve the exact location (possibly generating more code if the variable resides at some offset from a well known location such as the frame). It then allocates a register location for a (if a is not already in a register) and finally emits the add instruction.
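For concreteness, the pseudo-code above might be realized along the following lines in C (a minimal, self-contained sketch: the variable-to-register table, the register numbering, and the varN_location placeholder stand in for the register allocator and the gen_location utility of a real back end):

    #include <stdio.h>

    /* Hypothetical IR instruction a = b + c; variables are identified by index. */
    typedef struct { int dst, src1, src2; } IRAdd;

    static int var_reg[16];      /* register holding each variable, or -1   */
    static int next_reg = 3;     /* hypothetical register allocation cursor */

    static int ensure_in_register(int var) {
        if (var_reg[var] < 0) {                       /* not in a register yet */
            var_reg[var] = next_reg++;                /* "allocate" a register */
            /* gen_location would normally compute a frame offset here */
            printf("lwz r%d, var%d_location\n", var_reg[var], var);
        }
        return var_reg[var];
    }

    /* Macro-expand the IR add into PowerPC-style instructions. */
    static void emitAdd(const IRAdd *inst) {
        int rb = ensure_in_register(inst->src1);
        int rc = ensure_in_register(inst->src2);
        int ra = (var_reg[inst->dst] >= 0) ? var_reg[inst->dst]
                                           : (var_reg[inst->dst] = next_reg++);
        printf("add r%d, r%d, r%d\n", ra, rb, rc);
    }

    int main(void) {
        for (int i = 0; i < 16; i++) var_reg[i] = -1;
        IRAdd a_eq_b_plus_c = { 0, 1, 2 };            /* a = b + c */
        emitAdd(&a_eq_b_plus_c);
        return 0;
    }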

2.2.2 Maximal Munch

Code generation can also be viewed as matching patterns on IR fragments represented as trees. For example, the IR instruction for a = b + c could be represented by the compiler in tree form as shown in Figure 2.1. Code generation then becomes the task of tiling the input IR tree with a minimal set of tree patterns. In addition, these tree patterns can have additional properties (such as cost factors reflecting time and space) that guide the code generator in its selection of the most appropriate pattern to apply. The algorithm for locally optimal tiling of an IR tree is called Maximal Munch [Cattell, 1980].


Figure 2.1: Maximal Munch Input IR Tree

The algorithm is straightforward and begins at the root of the input tree and locates the largest tile/pattern that fits (breaking ties using cost factors). If a tiling does not exist, the algorithm is applied recursively to the subtrees. A tiled subtree is then replaced by the location of the result of that subtree. Once the subtrees have been tiled the root can then be tiled based on these results. If no tiling exists at this point, the IR cannot be tiled by the given patterns. Possible patterns that can tile the IR tree defined above are shown below in Figure 2.2.

[Figure 2.2 pairs four tree patterns with the PowerPC code each emits:
Rule 1: a variable V, emitting "lwz Ri, gen_location(V)"
Rule 2: an assignment pattern over register operands, emitting "addi Ri, gen_location(V), 0"
Rule 3: the sum of two registers, emitting "add Ri, Rj, Rk"
Rule 4: an assignment of a register sum to a variable V, emitting "add Ri, Rj, Rk"]

Figure 2.2: Maximal Munch Patterns

The patterns on the right show the possibilities of tiling a given IR tree and the PowerPC code on the left that is emitted by a successful match. A V indicates a variable and REG a register location. Given these patterns, it is possible to tile our input tree by applying Rule 1 to both leaves containing b and c and then applying Rule 4 to the overall tree. This results in the following PowerPC code:

    lwz Rj, location of b
    lwz Rk, location of c
    add Ri, Rj, Rk

Naturally, maximal munch works in unison with register allocation to determine the current location of each variable. In the above example, we are assuming that b and c are located in memory and a has already been allocated a register (in this case Ri).

Additional code may be generated depending on the exact location of these variables. The implementation of maximal munch would require code representing these patterns and the tiling engine that would apply the patterns, traverse the IR tree, and manage the results of pattern application.
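A compact C sketch of the maximal munch idea is shown below (illustrative only: the tree representation is hypothetical, only patterns resembling Rules 1, 3, and 4 are included, and the cost-based tie-breaking of a real implementation is omitted). The matcher tries the largest pattern at the root first and otherwise recurses into the subtrees, exactly as described above.

    #include <stdio.h>

    /* A tiny IR tree: ASSIGN(var, ADD(var, var)) is enough for a = b + c. */
    enum kind { VAR, ADD, ASSIGN };
    typedef struct node { enum kind kind; const char *name;
                          struct node *left, *right; } Node;

    static int next_reg = 3;

    /* Tile a subtree and return the register that holds its value
     * (maximal munch: try the biggest pattern first, then recurse). */
    static int munch(Node *n) {
        switch (n->kind) {
        case ASSIGN:
            if (n->right->kind == ADD) {              /* like Rule 4: whole tree */
                int rj = munch(n->right->left);
                int rk = munch(n->right->right);
                int ri = next_reg++;
                printf("add r%d, r%d, r%d   ; result of %s = ...\n",
                       ri, rj, rk, n->left->name);
                return ri;
            }
            /* smaller assignment patterns would be tried here */
            return munch(n->right);
        case ADD: {                                   /* like Rule 3: reg + reg  */
            int rj = munch(n->left), rk = munch(n->right), ri = next_reg++;
            printf("add r%d, r%d, r%d\n", ri, rj, rk);
            return ri;
        }
        case VAR: {                                   /* like Rule 1: load leaf  */
            int ri = next_reg++;
            printf("lwz r%d, location_of_%s\n", ri, n->name);
            return ri;
        }
        }
        return -1;
    }

    int main(void) {
        Node b = { VAR, "b", 0, 0 }, c = { VAR, "c", 0, 0 };
        Node sum = { ADD, 0, &b, &c };
        Node a = { VAR, "a", 0, 0 };
        Node root = { ASSIGN, 0, &a, &sum };
        munch(&root);   /* prints the same lwz/lwz/add sequence as the example */
        return 0;
    }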

2.3 Automatic Generation by Pattern Specification

In the previous section we focused on manual techniques for constructing code generators. Here, we introduce techniques that have been proposed for generating a code generator automatically by specifying IR and target instruction patterns. These pattern specifications form rules that define how incoming IR sequences are matched and which target instructions are generated from them. In this sense, the emphasis here is on the generation of the implementation of the code generator rather than on determining automatically which target instructions implement which IR instructions. Prior work for the latter is presented in Section 2.4.

2.3.1 LR Parsing Techniques – Graham/Glanville

The success of lexer and parser generators spurred automatic code generation techniques based on similar methods [Glanville, 1977, Glanville and Graham, 1978, Graham et al., 1982, Aigrain et al., 1984], in particular, the application of LR parsing theory to code generation. In this approach, a grammar of the source IR is defined as a list of productions of the form l → rt. The symbol l to the left of → designates the destination of a computation and the symbol r to the right specifies an IR instruction computation, in a linearized prefix representation, and a corresponding target instruction in assembly format (denoted by t). Prefix form is required to allow LR parsing to make decisions based on limited lookahead. For example, for the expression + e1 e2, where e1 and e2 are arbitrary expressions, a decision must be made based on the prefix of e1, as e1 can have arbitrary length. A left-hand side of λ indicates that there is no register result, that is, the instruction is executed only for its side effects (i.e., a memory store instruction). The following is an example of how we might specify such a grammar:

Rule 1   r.1 → + r.1 r.2   “add r.1,r.2”
Rule 2   r.1 → + k.1 r.2   “add r.1,r.2,k.1”
Rule 3   r.1 → + r.2 k.1   “add r.1,r.2,k.1”
Rule 4   λ → := r.1 k.1    “store r.1,k.1”

The four grammar rules above describe a sample of patterns for matching a simple three-address IR language and its corresponding register-based target instructions. These rules would be processed by a modified LR parser generation tool to produce a state machine to be processed by an LR driver program. The symbol r denotes a general purpose register and k denotes a constant. The symbols following ‘.’ represent semantic qualifiers. That is, they indicate whether two operands of the same type, such as register, are distinct. For example, if they have the same qualification then they refer to the same register. Other ways of indicating semantic restrictions include the specification of commutative operations such as Rule 2 and Rule 3 above. In addition, specific number values for a constant k can be used to indicate a specific value that a target instruction requires. For example, an increment instruction would require the value 1 as one of the operands to the + operator.

For a target instruction to be generated, the right-hand side of a rule matches an input IR instruction and is reduced to the left-hand side. This reduce operation emits the target instruction (defined at the very right of each rule) and returns the result location (i.e., register) on the parser stack. Thus, information regarding a particular code generation process flows only up rather than down. This is not surprising as LR parsing is context-free. This means that the choice of applicable grammar rules is based entirely on the local syntactic context rather than on its surrounding environment. Ultimately, this might lead to suboptimal target code. However, applying subsequent peephole optimization passes may significantly improve this.
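As a concrete illustration of this reduce-and-emit flow, the fragment below sketches the action for Rule 1 (r.1 → + r.1 r.2) in Python. The stack discipline and the emit callback are hypothetical; they are not taken from Glanville and Graham's implementation.

def reduce_rule1(stack, emit):
    # The parser stack holds the matched right-hand side in prefix order:
    # ... '+' r1 r2, with r2 on top.
    r2 = stack.pop()          # right operand register
    r1 = stack.pop()          # left operand register, also the destination
    stack.pop()               # the '+' operator symbol
    emit(f"add {r1},{r2}")    # target instruction named by the rule
    stack.append(r1)          # the result location flows up the parse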

The first observation we make about the pattern specification above is that the IR patterns are described using the same operand types as the target assembly code they match. In fact, a restriction of this particular approach is that it is required that the IR vocabulary be a subset of the target machine vocabulary and that the operators of both the source and target have the same computational effect [Glanville, 1977]. In this sense, the chosen IR language must be nearly identical to the target instruction set. Naturally the source and target must have operators that are similar in meaning for code generation to be possible; however, requiring the operands (memories) to be equivalent is overly restrictive. This impedes the general applicability of the approach and restricts its use in two ways: the grammars tightly couple the IR and target architecture, and the grammar descriptions are not very reusable. In addition, the prefix form of the IR instruction effects is biased towards the left operand of a binary operator. This can lead to suboptimal target sequences as the right operand is considered only after a successful match of the operator and left operand. GIST resolves these issues by making the instruction semantics the driving criteria rather than its syntax, allowing memory equivalences to be defined from the source IR to the target, and by considering the properties of an IR tree in its entirety rather than a left-to-right depth-first search, as we discuss in Section 6.5.2.

2.3.2 Attribute Grammars

Recognizing the limitations imposed by the Graham and Glanville approach, methods were introduced to increase the amount of semantic information available during LR parsing.

Here we discuss the use of attribute grammars exploited by Ganapathi and Fischer [1982, 1985]. In this work, attribute grammars are used as a means for specifying IR patterns and associated target instructions they implement. An attribute grammar is simply a grammar where each non-terminal can be associated with particular labelled values. These attributes can either be synthetic or inherited. Synthetic attributes are used to pass semantic information up a parse tree whereas an inherited attribute is used to pass information down. As an input is parsed, attributes are evaluated and their results are used to annotate the resulting syntax tree and guide the parsing engine. Because more attention is given to semantic context this approach offers a significant improvement over previous parser generator techniques.

In this work, the instruction set architecture is represented by a set of attribute-grammar productions. These productions collectively form the input to a program named cG that generates a code generator for the described machine. Addressing mode productions specify the pattern of an IR addressing mode. These productions are used to generate the address locations where values reside. Instruction selection productions describe IR patterns that correspond to a target instruction sequence to be emitted upon a match. Each production is represented by a left-hand and right-hand side. The left-hand side is a single non-terminal along with its set of synthetic attributes. The right-hand side consists of terminals, non-terminals, disambiguating predicates (each with their respective attributes), and actions that correspond to target instructions to be emitted. The disambiguating predicates are used to allow patterns to be matched based on the values of the attributes (i.e., semantic context). The following example shows how we might specify these production rules:

Rule 1   Address↑a → V↑b   ADDR(BASE(↓b), OFFSET(↓b), ↑a)
Rule 2   Reg↑r → Address↑a   GETREG(↑r) { emit(‘lw ’,↑r,‘,’,↑a) }
Rule 3   Reg↑r → R↑r
Rule 4   Reg↑a → ‘+’ Reg↑b Reg↑c   { emit(‘add ’,↑a,‘,’,↑b,‘,’,↑c) }

Each of the above production rules has a non-terminal on the left-hand side. This non-terminal is decorated with an attribute to be returned on the parser stack (e.g., ↑r). The right-hand side indicates the IR pattern to be recognized. In our example, the terminals V and R represent variable (values in memory) and register (values in registers) respectively. Other terminals that are recognized by their textual representation are enclosed in single quotes (i.e., ‘+’). The emit function takes a variable number of arguments to be emitted into some output stream such as a text file when the rule is matched. Functions such as ADDR and GETREG are implemented externally as operations on IR trees that are executed if a rule is matched. The results are then assigned to their attribute parameters.

The cG program generates from the attribute-grammar description a set of transition tables that are used by a driver program. The driver operates similarly to a typical LR parser: it reads input and uses the transition table to determine valid IR instructions. Conflicts during parsing are resolved using the disambiguating predicates that examine the parser stack and control the parsing. As parsing proceeds, shift operations copy attribute values onto the parser stack and reduce operations process production actions, and synthetic attributes are returned to the left-hand side non-terminal symbol.
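The external functions referenced by the rules can be pictured as ordinary routines invoked by the reduce actions. The Python sketch below is a hypothetical rendering of the ADDR helper and of Rule 2's action; it is not the cG implementation.

def ADDR(base, offset):
    # Compute an address attribute from a variable's base and offset.
    return f"{offset}({base})"

def rule2_action(address_attr, getreg, emit):
    r = getreg()                      # GETREG: allocate a fresh register
    emit(f"lw {r},{address_attr}")    # emit the load when the rule matches
    return r                          # synthetic attribute passed upward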

The production rules we defined above would recognize our three-address IR instruction a = + b c (in linearized prefix form). In this case we will assume that the variable a has already been allocated a register and b and c reside in memory locations that are at offsets from the current frame. In a typical bottom-up LR parse, first a, b, and c would be matched by Rules 3, 2, and 2 respectively. Because a already resides in a register Rule 3 is matched and the corresponding register is returned. For b and c we must move their values from memory to a register. Because their values reside in memory Rule 3 is not applicable; however, Rule 2 tries to recognize the input using Rule 1. Rule 1 matches as they are both variables in memory. This rule then calls the function ADDR to determine the address of the non-terminal V attribute b. This requires the base (BASE) address of b and its offset (OFFSET). The computed result is then stored in attribute a and returned as the overall result of Rule 1. At this point, we continue with Rule 2 by allocating a register (GETREG) and storing this register in attribute r. Lastly, Rule 2 emits the PowerPC assembly instruction to load a value from memory into the register r. r is then returned as the result of Rule 2. Finally, the ‘+’ and ‘=’ are recognized and Rule 4 can emit an add instruction to add the two allocated registers for b and c and store the result in the register allocated for a.

Although this approach is an interesting application of context-sensitive parsing theory to code generation, its general utility is questionable. In particular, disambiguation predicates are primarily a product of the semantics of the IR in the context of a particular target instruction. To that end, much of the effort in determining whether one production is acceptable over another is implemented in external functions. These functions would need to be implemented directly in external code similarly to the manual methods mentioned earlier, thus reducing the overall effectiveness of the approach. This is not surprising as the technique is inherently syntax driven, whereas the requirements of code generation are based almost entirely on the source and target semantics. As such, GIST uses semantics as its main driving factor in determining which target instructions are capable of implementing each IR instruction.

2.3.3 Bottom Up Rewrite System (BURS) Techniques

Although the Graham-Glanville LR parser style code generator is a significant breakthrough in the realm of automatic code generator generators, the approach is not a complete solution [Ganapathi et al., 1982]. For instance, the IR is very low level and is tightly bound to the target instructions (especially addressing modes). For example, an assumption is made that the mapping between IR operators and target machine operators is one to one. This makes it difficult to port the compiler and its IR from one machine to another: changes must be made to the IR. Because the code generator is inherently syntax driven and the notation is in prefix form it is difficult to generate better code based on the local context of the operator and first operand. To alleviate these problems, methods utilizing tree-rewriting techniques typically based on BURS theory were developed [Pelegrí-Llopart and Graham, 1988, Emmelmann et al., 1989, Aho et al., 1989, Hanson and Fraser, 1995, Fraser et al., 1992, Proebsting, 1995, 2002]. A set of IR tree patterns is specified in a description file. This description file is then parsed and analyzed to produce a code generator that implements tree pattern matching on IR trees from the front-end of the compiler and produces instructions for the target. Using tree-pattern matching instead of parsing eliminates the left-to-right bias problem found in previous approaches. In addition, dynamic programming algorithms [Aho et al., 1989] for determining optimal target trees have been integrated into these techniques for choosing optimal pattern rules for any point in the source tree based on provided cost factors. In addition, pattern rules can be specified in any order and the number of patterns that must be specified is significantly less.

Tree patterns are specified using terminals and non-terminals. The terminals represent operators in the source IR language and non-terminals denote a pattern tree. Tree patterns are associated with a cost (i.e., how favorable this match is) and semantic actions that, for example, allocate registers and generate target instructions. A BURS style code generator typically operates in two passes. The first pass works bottom up and finds patterns that cover the tree with minimum cost. The second pass executes the semantic actions that are associated with the minimum-cost patterns. Consider the following list of pattern rules:

1. var ← Var
3. reg ← Indir(disp) (1)
4. const ← Const
5. rc ← const
6. rc ← reg
7. reg ← Add(reg, rc) (1)
8. reg ← Add(Indir(var), rc) (3)
9. stmt ← Asn(disp, reg) (1)
10. disp ← Addr(var)

Each rule specifies a possible rewrite that can be performed given a source IR tree. To the right of ← is an IR tree pattern. The symbols that have their first letter capitalized are terminals (operators of the IR) and the lowercase symbols correspond to variables or non-terminals that can represent any subtree specified by those rules where the non-terminal resides on the left-hand side. Costs for matching a particular rule are indicated by a (n) at the end of the rule. A rule without a cost is assumed to be 0. We left out the specification of pattern actions as they do not help in understanding the BURS approach. Figure 2.3 shows how a code generator generated from a set of BURS rules would operate on an IR tree for the code a = b + 1. The first pass would label the tree in a bottom-up fashion with the matching rules. In addition, it would compute a cost based on the sum of the rule it matched and any chain rules. A chain rule is simply a rule that can be reached through the use of a non-terminal on the right-hand side. For example, the rule reg ← Indir(disp) uses the non-terminal disp as part of its definition and thus reaches any of the disp rules. In Figure 2.3, the BURS algorithm starts by matching and labeling the nodes for Var and Const. Both a and b are Var nodes and thus match Rule 1 with cost 0. The integer 1 is matched by Rule 4 and also has cost 0. Through transitivity, it also matches chain Rule 5 with cost 0.

Figure 2.3: Matching an IR tree with BURS rules (the IR tree for a = b + 1, with each node annotated by its matching rules and accumulated costs, e.g., stmt: Asn(disp, reg) at the root with total cost 0 + 1 + 2 = 3)

With the leaves of the tree matched, the BURS algorithm can now proceed up the tree further matching nodes. The Addr node above Var b is matched by Rule 10 with cost 0 and Rule 6 through chaining. The Indir operator matches Rule 3 with cost 1 and Rule 6 through chaining. The Add operator is matched by Rule 7 with cost 1, Rule 8 with cost 2, and Rule 6 with cost 0 through chaining. On the left-hand side of the assignment we match the Addr operator with Rule 10 with cost 0 and Rule 6 with cost 0 by chaining. Lastly, the Asn operator is matched by Rule 9 with cost 1. Once the tree is labeled, the BURS algorithm will choose those matches that produce the least overall cost. The total cost for a particular match is shown in parentheses at the end of the rule. The total cost is the sum of the cost of the matched rule and the costs of any child nodes. For example, the Add node matches Rule 7 with a total cost of 1+0+1=2. That is, the cost of 1 for its left subtree (Indir), the cost of 0 for its right subtree (Const), and the cost of 1 for itself, respectively.
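The labeling pass itself can be sketched as follows. The Rule and Node interfaces (matches, child_nts, lhs, is_chain, rhs_nt) are hypothetical; the sketch only illustrates how pattern rules and chain rules contribute to the per-node cost table.

def label(node, rules):
    """Bottom-up BURS labeling: node.costs maps nonterminal -> (cost, rule)."""
    for child in node.children:
        label(child, rules)
    node.costs = {}
    INF = float("inf")
    # Pattern rules, e.g. reg <- Add(reg, rc) with cost 1.
    for r in rules:
        if not r.is_chain and r.matches(node):
            c = r.cost + sum(ch.costs.get(nt, (INF, None))[0]
                             for ch, nt in zip(node.children, r.child_nts))
            if c < node.costs.get(r.lhs, (INF, None))[0]:
                node.costs[r.lhs] = (c, r)
    # Chain rules, e.g. rc <- reg, applied until no cost improves.
    changed = True
    while changed:
        changed = False
        for r in rules:
            if r.is_chain and r.rhs_nt in node.costs:
                c = r.cost + node.costs[r.rhs_nt][0]
                if c < node.costs.get(r.lhs, (INF, None))[0]:
                    node.costs[r.lhs] = (c, r)
                    changed = True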

A number of compiler frameworks use BURS technology for generating instruction selectors in an automated fashion. For example, the tool JBurg is used for generating an instruction selector for the Java bytecode optimizing compiler in JikesRVM [Alpern et al., 2000a]. Similarly, the instruction selector patterns for the LCC C compiler [Hanson and Fraser, 1995] are described by BURS rules. It allows an implementor of a code generator to specify patterns that match IR code fragments. These patterns are specified along with action routines that emit target assembly or binary code and perform register allocation as pattern rules are matched. In addition, the technique chooses the optimal patterns based on cost factors using a dynamic programming algorithm. Unfortunately, the implementor must specify the action routines. That is, we still require a significant amount of knowledge of the compiler and target machine to map IR patterns to target instructions and memories. Our approach alleviates this process through heuristic search in the space of target instruction semantics to find these pattern mappings. To leverage the efficiency of BURS, however, it is possible (but not required) that GIST output the mapping it discovers as BURS rules. Indeed, this is how we will evaluate our approach for LCC: replace the hand-written BURS rules of LCC with those that GIST generates automatically.

2.4 Automatic Generation by Pattern Discovery

In the previous sections we looked at techniques for constructing code generators in an automated fashion by specifying IR patterns and corresponding target instructions. In this section, we look at past approaches for automatic discovery of IR patterns and the target instructions that can be generated from them. In particular, given a sequence of IR instructions, some form of heuristic search is employed to find a sequence of target instructions that perform the same function. A successful search yields a set of instruction pattern mappings, pattern rules that map IR instructions to target instructions. In the first part, we discuss methods for generating these pattern mappings when the source and target machine are the same. The second part looks at prior work for generating pattern mappings when they differ.

2.5 Same Machine Pattern Mappings

A special kind of code generator is one that attempts to generate more efficient instructions for some machine M given a sequence of M instructions. To this end, the IR instructions are the same as the M instructions. This kind of code generator, often referred to as a Superoptimizer, is often used for building peephole optimizers [McKeeman, 1965]. A peephole optimizer scans a sequence of instructions (usually within a basic block) with a fixed window, say 2 instructions wide. The optimizer keeps a dictionary of instruction pattern sequences (in this case, 2 instructions wide) that map to a more efficient sequence of instructions (often just one) on the same machine. Although it is easy enough to specify these patterns manually, it is advantageous to have them automatically generated by a superoptimizer, especially if some of the patterns are not obvious.
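A peephole pass driven by such a dictionary can be sketched in a few lines of Python. The instruction encoding is hypothetical; the point is only the fixed two-instruction window and the dictionary lookup.

def peephole(block, table):
    """Rewrite a basic block using a dictionary of 2-instruction patterns."""
    out, i = [], 0
    while i < len(block):
        window = tuple(block[i:i + 2])         # the 2-instruction window
        if len(window) == 2 and window in table:
            out.extend(table[window])          # cheaper replacement sequence
            i += 2
        else:
            out.append(block[i])
            i += 1
    return out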

2.5.1 Massalin

For example, the first superoptimizer program written by Massalin [1987] was able to discover instruction pattern mappings for Motorola's 68020 instruction set. A striking result of this work was the obscurity of the patterns and how they were found. For example, the generated mappings were able to take a sequence of 9 instructions for a function that determined if its argument was positive, negative, or zero and generate code for the same machine that accomplished the same effect in only 4 instructions. As it turns out, the superoptimizer found that it could exploit two's complement arithmetic to eliminate all the branches in this code. In particular, it was able to utilize the setting of the carry flag based on the semantics of particular instructions to return the correct result.

The basic approach taken here is to search for target instructions (in this case, on the same machine) that accomplish the same effect as the original instruction. This is accomplished by an exhaustive search in the space of instructions. In particular, all combinations of the instructions are enumerated, first of length 1, then of length 2, and so on. Each of these generated programs is then tested against the source instructions. This probabilistic test consists of running the source and target instructions on a set of test inputs on the underlying machine. If the outputs of the generated instructions are the same as those of the original instructions, in practice we can assume with high probability that they are semantically the same. A more formal boolean test was also investigated, but it took too long to compute the boolean expressions for every program generated. As such, the work was viewed as being an aid to assembly programmers rather than as a general purpose peephole optimizer generator.
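The generate-and-test loop at the heart of this approach can be sketched as below. The run interpreter, the instruction encoding, and the test vectors are hypothetical; as the text notes, agreement on the tests only suggests, and does not prove, equivalence.

from itertools import product

def superoptimize(goal, instructions, tests, run, max_len=4):
    # Enumerate candidate sequences of length 1, 2, ... and keep the first
    # one that agrees with the goal sequence on every test input.
    for length in range(1, max_len + 1):
        for candidate in product(instructions, repeat=length):
            if all(run(candidate, t) == run(goal, t) for t in tests):
                return list(candidate)   # probably equivalent, and shorter
    return None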

2.5.2 GNU Superoptimizer

This work was followed up some time later with the GNU Superoptimizer (GSO) [Granlund and Kenner, 1992], which uses superoptimizer techniques in an attempt to eliminate branches in the GNU C Compiler (GCC). The GSO approach improved on Massalin's work in several dimensions: portability, configurability, and search strategy. The original superoptimizer was limited in that it was written for the Motorola 68020 and could target only the Motorola 68020. GSO is able to generate code sequences for a variety of architectures. This portability is accomplished by defining a set of generic operations to cover all the instruction sets targeted by GSO. These generic operations are then interpreted by a C program.

Configurability is achieved by having a minimal number of additions to the GSO infrastructure to add a new architecture. First, generic operations for any instructions present on the new machine but on no previous machine must be added and the interpreter must be augmented with their interpretation. The search procedure must be modified to reflect which new operations exist for the new machine. Code must be written to output the operations in assembly.

The GSO search strategy uses more heuristics to choose potential replacement instructions in order to reduce the search space. For example, instructions are chosen that have some kind of relationship between them. That is, if we are generating all instruction sequences of length 2, do not generate those sequences that do not make sense. An instance of a 2-instruction sequence that does not make sense is one where the second instruction uses the carry flag in its computation, but the first instruction does not affect the carry flag. In this case, such a useless sequence would not be considered. Other heuristics include choosing instructions only if their operands have been inputs to or outputs from a previous instruction.

The GSO is an improvement over Massalin's work, but still lacks the level of certainty that is required for generating code. It must be the case that we can prove without a doubt that the generated code is identical in meaning to the original source. This is true whether we are generating code for machines that are different or for those that are the same. The choice to interpret virtual operators is a good one in that the technique is portable across the target machines supported by GSO. In this dissertation we present the CISL machine description language that provides a universal set of operators for describing instructions. One can imagine using CISL and its well defined semantics to implement a semantic interpreter to create a GSO-like superoptimizer. Reducing the search space is an important consideration when attempting to discover equivalent instructions. It is easy for the search space to explode given a real instruction set and a large set of test values over which to interpret instruction semantics. Massalin was unable to use this information as no formal language existed, and the approach in GSO is ad hoc. The semantics of the CISL language could be used to reduce the search space further in a GSO approach—we leave this investigation to future work.
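The operand-relationship heuristic can be sketched as a filter over candidate sequences; the register read/write sets used here are hypothetical attributes of the instruction encoding, not GSO's actual data structures.

def related(seq):
    # Keep a sequence only if every instruction after the first shares data
    # with what came before: it reads or writes something an earlier
    # instruction already touched.
    seen = set(seq[0].reads) | set(seq[0].writes)
    for ins in seq[1:]:
        touched = set(ins.reads) | set(ins.writes)
        if not touched & seen:
            return False          # unrelated instruction; prune this sequence
        seen |= touched
    return True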

2.5.3 Denali

Recognizing that a formal semantics is an important factor in generating code generators and that the correctness of the generated code is important, Joshi et al. [2002, 2006] investigated a goal-directed superoptimizer called Denali. They used automated theorem proving techniques to search for instruction sequences for performance critical code fragments. In contrast to Massalin's work, it was recognized that passing tests is not the same as being correct. To this end, the idea is to use a refutation-based theorem prover to prove a conjecture C by establishing the unsatisfiability of its negation ¬C. In particular, to generate optimal code for a program fragment P, they express a conjecture in formal logic in the following form:

No program of the target architecture computes P in 8 cycles or less.

This conjecture is then submitted to an automated prover. If the proof succeeds, then 8 cycles are not enough, and a new conjecture is proposed with 16 cycles. If the proof fails, then the failed proof shows an (at most) 8-cycle program that computes P. They apply this procedure iteratively with a new conjecture stating that a 4-cycle program does not exist for the program generated from the previous proof. Eventually, the optimal program is found: for some k, a k-cycle program that computes P. Denali programs are described using a C-like language and are limited to small code fragments such as critical inner loops and short procedures. This language is then translated into a sequence of so-called guarded multi-assignments (GMAs). The anatomy of a GMA is simply: guard → (targets) := (newvalues), where targets is a list of lvalues, newvalues is a list of expressions, and the guard is some boolean expression. Thus, the targets get the values of the newvalues if the guard is true.
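A guarded multi-assignment can be rendered directly from this definition; the small Python sketch below is a hypothetical encoding (the state is a simple name-to-value dictionary) and is not Denali's representation.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class GMA:
    guard: Callable[[dict], bool]            # boolean expression over the state
    targets: List[str]                       # lvalues to be updated
    newvalues: List[Callable[[dict], int]]   # expressions producing new values

    def step(self, state: dict) -> None:
        # The targets get the newvalues only if the guard is true; all
        # right-hand sides are evaluated against the old state first.
        if self.guard(state):
            vals = [v(state) for v in self.newvalues]
            for t, val in zip(self.targets, vals):
                state[t] = val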

The above is the Denali ideal; the actual approach differs in its implementation. In particular, it uses matching to determine all equivalent code fragments given an input code fragment and a set of axioms. These axioms define mathematical equivalences such as ∀k, n : k * 2^n = k << n, as well as the operations relevant to a particular target architecture,² and are used to instantiate all possible ways of computing the input code fragment (for a specific target). Given this set of alternative programs and an architectural description, the conjecture mentioned above is formulated in propositional calculus. The architectural description includes tables specifying which functional units can execute which instructions and a table of latencies of the various ALU operations.

²Although these axioms are basically a mapping from the Denali IR to target instructions, the goal here is to find patterns for the optimal target instructions.

Although the overall goal of our work is different from that of Denali, the adherence to a precise language and the importance of generating correct code are not. In addition, their use of matching and rewriting as well as boolean satisfiability is particularly relevant. Denali uses these theorem proving techniques to determine optimal code sequences whereas our application of these methods is in generating canonical representations of source and target semantics.

2.5.4 Bansal and Aiken

Another approach to superoptimization is the recent work of Bansal and Aiken [2006]. They introduce techniques that reduce the overall search space of candidate instruction sequences (those that might be equivalent to the original sequence in question) and an efficient test for equivalence between the original and the candidate sequence. More specifically, a harvester extracts instruction sequences from a set of training applications and an enumerator exhaustively enumerates all possible candidate sequences up to a certain length, checking if each might be an optimal replacement. The harvester extracts sequences of instructions from training applications that meet the following constraint: an instruction sequence I must have a single entry point—no instruction in I (except the first) should be a jump target of any instruction outside of I. The sequence includes all instructions thereafter that are not jump targets. It is important to note that a harvested sequence can have multiple exits since jump instructions are allowed in the sequence. The number of live registers at the end of the sequence is also recorded and used to determine equivalence by the enumerator.

To ensure that duplicate sequences do not exist, it is noted that many sequences are just transformations of each other under renaming of registers and immediate operands. As such, each sequence undergoes a process called canonicalization. In particular, a sequence is canonical if its registers and constants (not literals) are named in order of their first appearance in the instruction sequence. To accomplish this, a set of symbolic registers (starting with r0) and constants (starting with c0) are used to rewrite the original sequence. In addition, the observation is made that an optimization that applies to a specific sequence is also valid for its canonical form.

Instruction sequences are enumerated from a subset of all instructions by the enumerator. In particular, at most one branch instruction is allowed. That branch instruction targets a canonical branch target representing the exit point outside of the sequence in question. In addition, the number of distinct registers and constants is restricted. For instance, for instructions that apply only to a subset of the registers for a particular machine, only those registers are considered. This, in turn, restricts the number of indirect memory accesses to those that use the number of allowable registers. Because the search space of the enumerated sequences is exponential in the length of each sequence, two primary approaches are used to reduce that space: only canonical sequences are used and instructions that are functionally equivalent to other cheaper instructions are eliminated. Following the reduction process, instruction sequences are stored in a table along with information about the registers and constants that are used. In addition, a fingerprinting technique is used to allow for efficient lookup. This table is then used to locate instruction sequences that may implement the sequence discovered by the harvester. If a match is found based on register and constant use, two techniques are used to determine if they are equivalent semantically. The first approach is to execute on the underlying machine the harvested sequence and the enumerator sequence on a set of test vectors. If the output of each is identical, the two sequences might be equivalent. In that case, each is translated into a boolean formula that expresses the equivalence relation as a satisfiability constraint. This formula is then provided as input to a SAT solver to determine equivalence.
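The canonicalization step described above can be sketched as a single pass that renames registers and constants in order of first appearance. The textual instruction encoding used here (register operands starting with "r", immediates with "#") is hypothetical.

def canonicalize(seq):
    regs, consts, out = {}, {}, []
    for op, operands in seq:                  # e.g. ("add", ["r5", "r2", "#12"])
        renamed = []
        for x in operands:
            if x.startswith("r"):             # registers become r0, r1, ...
                renamed.append(regs.setdefault(x, f"r{len(regs)}"))
            elif x.startswith("#"):           # constants become c0, c1, ...
                renamed.append(consts.setdefault(x, f"c{len(consts)}"))
            else:
                renamed.append(x)             # literals and labels are unchanged
        out.append((op, renamed))
    return out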

The boolean formula is represented by a finite set of registers and a model of the memory and stack. Registers are translated into a vector of boolean variables and memory is modeled by a map from address expressions to data bits. Our approach differs in that we rely on a canonicalization process (described in Section 5.1) and a set of transformation axioms that define semantic equivalences. Although these axioms tend to be simple and in many circumstances can be proved correct by inspection, it is possible to translate an axiom into a boolean formula and provide that as input to a SAT solver to determine its validity. To do this, we can encode references to memories into a boolean formula based on an initial unspecified state. Each operation is represented as a boolean circuit transforming an input machine state to an output machine state. Branch targets are handled by predicating the “execution” of instructions on the true and false paths (i.e., the branch condition or its negation). The original input state is shared between the two circuits representing the left- and right-hand side of the axiom. The left- and right-hand side of an axiom are equivalent iff the resulting memories in the final states are equivalent. Although we have preliminary results demonstrating the feasibility of translating our machine description language (CISL) into SAT formulae, we leave this to future work.

2.6 Different Machine Pattern Mappings

In the previous section we reviewed prior work in generating instruction pattern mappings for code generators targeting the same machine as the source. In this section, we look at techniques for generating these pattern mappings when the source and target instruction sets are different. In particular, these approaches pertain to generating code from a compiler IR whereas the previous techniques merely rewrite instruction sequences. In that respect, the following work more closely aligns with the goals of this dissertation.

2.6.1 XGEN

An early attempt at generating code generators automatically was the XGEN system presented by Fraser [1977] in his dissertation work. He argued that a code generator could be constructed in an automated fashion by capturing human skills and knowledge in rules that formed the basis of a production system. These rules are implemented using patterns over the IR of the compiler, called XL, and an ISP machine description formalism [Bell and Newell, 1971]. Thus, to generate a code generator requires a set of rules about code generation and a description of the target machine. Porting the compiler to a new architecture, then, requires a corresponding machine description and possibly additional rules to capture the obscurities of the new machine.

To generate a code generator, XGEN first performs some analysis of the ISP description it takes as input. In particular, registers are classified in various ways, such as accumulators and index registers. Memories are analyzed to determine width and alignment with respect to the various datatypes allowed in the XL language. And instructions are categorized according to their type (e.g., branch, move). This information can then be accessed by the XGEN rules to infer instruction patterns with respect to the properties of the machine.

The code generator rules are evaluated by a production system to determine the sequence of target instructions that is to be generated given a sequence of XL instructions. A production system is composed of a set of rules and an associated state. A rule is associated with a set of assertions and an action. A rule is said to “fire” if its assertions are true based on the state of the system. A rule that fires executes its action, which can modify existing state and/or introduce new state that can be used by other rules.

For the XGEN system, rules encode the steps a human implementor must take in order to build a code generator by hand. The state represents those elements that are to be remembered, such as IR patterns that have already been identified and registers that have been allocated for a particular sequence. As rules are matched and fired, code generator patterns are generated that map IR to target instructions. The following is the English version of a rule taken directly from Fraser:

If compiling some operation with an operand that is not in an accumulator and if the operation otherwise matches some instruction then allocate an accumulator, load it with the operand, and change the original operation to reference it.

Thus, this rule allocates an accumulator and loads it with the operand of a particular operation if the operand is not already in an accumulator and the operation does not already match a target instruction. Of course, loading the accumulator with the operand matches a particular instruction based on the initial analysis of the machine description. It should be clear that a rule such as that mentioned above would pertain only to an accumulator-based instruction set (in fact, one of the machines targeted by Fraser was the PDP-10).

The application of production systems to code generator generation is an interesting one. Indeed, it follows the process the human implementor must take in order to build a code generator manually. That is, patterns on the compiler IR are recognized and translated into corresponding code sequences on the target. In addition, the ability to add rules incrementally to the system allows for extensibility and incremental development. That is, a simple code generator could be produced given a set of rules. These rules could then be extended to special-case particular patterns to produce better code sequences.

Although the overall approach taken by Fraser is unique, there are several drawbacks. In particular, the quality of the generated code is limited by the available rules. Of course, one could always extend the system with a new rule that captures a particular optimization pattern. Unfortunately, as many of the Superoptimizer approaches from the previous section mention, optimal code sequences are not entirely obvious and require a deep understanding of the underlying properties of the machine. As such, human skill and knowledge might not weigh as heavily as heuristic search and the generate-and-test approach.
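Rendered as code, the quoted rule amounts to a predicate over the production-system state plus an action that updates it. The following Python sketch is hypothetical (the state object and its helper methods are invented for illustration) and is not Fraser's XGEN implementation.

def accumulator_rule(state, op):
    # Assertions: the operand is not in an accumulator, and the operation
    # would otherwise match some target instruction.
    if (not state.in_accumulator(op.operand)
            and state.matches_some_instruction(op, ignoring=op.operand)):
        acc = state.allocate_accumulator()    # action: allocate an accumulator,
        state.emit_load(acc, op.operand)      # load it with the operand,
        op.operand = acc                      # and rewrite the operation.
        return True                           # the rule fired
    return False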

In addition, Fraser maintained that as more machines were targeted using his XGEN system, the number of rules required for each additional instruction set decreased. Unfortunately, his argument was based on the fact that those machines were closely related. Thus, adding an entirely new kind of architecture still requires significant work in terms of rule creation. This, in itself, is problematic in the sense that not only must compiler writers have a mastery of the compiler IR and target instruction set, they must also be well versed in the area of production systems and XGEN rule creation. Perhaps this is not much of a problem if the diversity of machines is relatively low, but it might be more of an issue in an experimental setting.

A practical problem with the XGEN approach is that it relies heavily on the structure and interpretation of the XL compiler IR. Thus, to use the XGEN system requires the use of the XL intermediate representation. This is problematic as compiler implementations use a variety of IRs and thus restricts the approach to new compilers utilizing the XL language.

GIST removes this restriction by permitting³ the compiler IR to be specified by a machine description. As such, the approach is both compiler- and architecture-independent.

Another problem associated with the XGEN system is related to the correctness of generated code. There is no firm way to know if the instruction patterns generated are correct as no formal methods are applied. This is in contrast to other approaches that have used SAT solvers to determine the equivalence between instruction semantics. Although Fraser argues that the application of formal methods has the potential to be combinatorially explosive, recent methods have shown that with careful application they can be used in practice. It is important to mention, however, that a correct code generator is better than a code generator producing broken code. Furthermore, the complexity of debugging incorrectly generated code is dramatically increased as the IR instruction patterns are intertwined with the firing of rules in the production system.

³Requiring, actually!

Lastly, the XGEN implementation is interpreted. That is, the ISP description and rules are evaluated at compile time rather than preprocessed to produce a table of patterns at compiler compile time. This leads to a prohibitively slow code generator and limits its application with respect to the wide use of JIT compilation techniques today. Although Fraser does mention that XGEN is not restricted to interpretation and could be adapted to a preprocessing strategy, it does not appear to have been followed up.

2.6.2 Cattell

Another main contribution in this area is the thesis work of Cattell [1978]. The goal of his work was to test the feasibility of taking a formal approach towards automatic generation of code generator patterns. In particular, a formal model of a machine is specified and a set of axioms are used to define the semantic equivalence of instructions between the IR used in the PQCC compiler [Wulf, 1980] and the instructions of the target machine. These axioms are used to rewrite a sequence of IR instructions into a corresponding sequence of target instructions. Thus, in addition to generating an IR pattern that maps to a sequence of equivalent target instructions, it generates a proof of that mapping in the form of valid rewrites.

The machine formalism used by Cattell is genealogically related to the ISP descriptions mentioned previously in Fraser's work. In particular, it allows for the definition of machine memories such as registers and main memory including their size and alignment constraints. Instruction semantics, referred to as machine operations (MOPs), are specified using a LISP-like notation and include support for specifying access modes (addressing modes) as well as classes of addressing modes named operand classes. Instructions are additionally annotated with their binary encoding and assembly-level syntax for purposes of actually generating target code.

Given a machine description of the target machine and a tree-based IR (TCOL) instruction, an attempt is made by heuristic search to generate code sequences for IR operators that do not correspond to a single target instruction. A search procedure recursively visits the nodes of the IR tree examining the operators and operands of each subtree. If there is an operator mismatch, a set of semantic-preserving tree equivalence axioms is applied to the offending operator in an attempt to rewrite the subtree into something that may enable search to proceed further.

The axioms specify a wide range of semantic equivalences. In particular, they cover arithmetic and boolean laws (e.g., DeMorgan's laws), rules about memories, side effects, store access, and sequencing semantics. An axiom has the form l :: r, where l and r are tree patterns. Tree patterns, also referred to as parameterized trees, are used extensively in Cattell's work. A parameter is a variable associated with a particular node in a tree pattern.

When a tree pattern matches all or part of an IR tree the parameter variables represent the tree node that matched the pattern node and can be referred to later. Consider the following pattern tree (using the same notation as Cattell):

(← $1:Reg (+ $1:Reg $2:Opnd))

Here the symbols $1 and $2 are the parameter names. $1 is used to indicate that the source and destination registers can match any register specified by the operand class Reg. In addition, the use of the same parameter $1 in both the source and destination indicates an additional constraint, namely, that they must be the same register. Thus, an IR tree can be matched by the above pattern given the parameters and their constraints. The variables can then be used in later processing to refer to those subtrees that were matched. For example, the axiom pattern below specifies a tree transformation where the left-hand side is a pattern to match on a given IR tree and the right-hand side is the transformation that is applied with substitution of variables:

(AND $1 $2) :: (NOT (OR (NOT $1) (NOT $2)))
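Matching a parameterized tree and substituting its bindings into the right-hand side can be sketched as below, using nested tuples for trees. The representation is hypothetical and is only meant to illustrate how an axiom such as the one above rewrites a subtree.

def match(pattern, tree, env):
    # A "$n" parameter binds to (or must equal) the subtree it matches;
    # any other string is a literal leaf; tuples are matched structurally.
    if isinstance(pattern, str):
        if pattern.startswith("$"):
            return env.setdefault(pattern, tree) == tree
        return pattern == tree
    return (isinstance(tree, tuple) and len(pattern) == len(tree)
            and all(match(p, t, env) for p, t in zip(pattern, tree)))

def subst(pattern, env):
    if isinstance(pattern, str):
        return env.get(pattern, pattern)
    return tuple(subst(p, env) for p in pattern)

def rewrite(lhs, rhs, tree):
    env = {}
    return subst(rhs, env) if match(lhs, tree, env) else None

# rewrite(("AND", "$1", "$2"),
#         ("NOT", ("OR", ("NOT", "$1"), ("NOT", "$2"))),
#         ("AND", ("Reg", "r1"), ("Opnd", "x")))
# => ("NOT", ("OR", ("NOT", ("Reg", "r1")), ("NOT", ("Opnd", "x"))))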

To reduce the search space, the axioms are categorized into three classes: transformations, decompositions, and compensations. The transformation axioms are those that are concerned with arithmetic and boolean equivalence, decompositions are rules regarding control constructs and storage locations, and compensations are focused on side effects. Each category of axioms is applied at different points in the search process to achieve a different goal.

The most influential method used in the search process employs a well known artificial intelligence technique known as means-ends analysis. This technique is used to decide how to get from some starting state to a goal state by identifying the difference between the two and choosing some action that reduces that difference. In this case, the goal is to find target instructions that are semantically equivalent to the source IR. The starting point, of course, is the given IR tree. Because the actual target instruction sequence is unknown, a set of possible target instructions is chosen based on their semantic representation⁴ and a set of heuristics, such as choosing target instructions whose primary operator is the same as the source IR. In addition, instructions are chosen whose operators are related (e.g., + and -). This is achieved by using the transformation axioms to rewrite the IR tree into an equivalent form that uses a different operator.

Each possible target instruction found based on operator matching of the IR and target semantic trees is ordered by source operator and then by source operands to facilitate choosing more likely matches first. For each of the feasible target instructions found, in the specified order, an attempt is made to transform the source tree to match the target tree. More specifically, for each target semantic tree, try to match the IR tree node by node. If a mismatch occurs, select the transformation axioms that apply to the IR tree and rewrite it into new IR trees. These new IR trees are then used to apply further matching and transformations.

In addition to transformation axioms for handling differing operators, rules exist for transforming operands. Consider the case of the IR tree that is adding a value contained in some memory location and a temporary/register. Now imagine the target is some RISC-like processor that does not support register-memory addition directly (i.e., the value in memory must first be loaded into a register and then addition is performed on the register contents). As is, there would be no matching target instruction and search could not proceed. To fix this problem, a pair of transformation axioms named fetch/store decomposition are provided. These specify transformations on trees to introduce movement of operand values. Thus, an operand that does not match can be moved into a location that does match. For the case above, this would involve moving a value from memory into a register location and using that register as the new operand. If the operand mismatch was located on the left-hand side of an assignment (i.e., an addition that stores the result in memory) this could equally be transformed into a data movement instruction that occurs after the operation (i.e., store a register value into memory).

Naturally, it is possible that no target instruction matches the source IR tree's operator and operand(s). To alleviate this problem, decomposition axioms are used to transform a given IR tree into more primitive control structures. For example, an if conditional may be transformed into a label and goto construct. The hope here is that if complicated conditional tests are used in the IR trees that do not match any target instruction semantics, decomposing these into more primitive constructs may enable search to find a corresponding target instruction for each of the smaller pieces. Because most machines have several branching instructions this is a sensible approach.

In order to provide efficient access to the list of target instructions during code generator generation, Cattell uses compensation axioms. These axioms are not specified directly as the previous two cases are, but are really a set of heuristics that are used to preprocess the set of target instruction semantics. For example, if the semantics of a target instruction involves more than one effect, special analysis is performed to determine if any one of those effects can be utilized separately. For example, an instruction may set condition code registers in addition to its primary effect. This kind of instruction can certainly be exploited without using the condition codes; however, care must be taken to ensure that any instructions generated afterward are aware that the condition codes have been modified. A compensation axiom thus introduces an allocation of those condition code registers to inform the search procedure of this case.

⁴Because the semantic descriptions of the target instructions are specified with the same semantic notation as the IR used in the PQCC compiler, semantic comparison is simpler. Unfortunately, this also limits the general applicability of the approach as it is tied to a particular compiler.

Using these techniques, Cattell was able to find target instruction patterns for IR trees that had no corresponding operation on the target. For example, the system was presented with an IR tree for a memory-memory AND operation. The task was to locate a sequence of instructions on the PDP-11 that performed the same computation. The PDP-11, however, does not have an AND operation and thus requires axioms to transform the input tree. The search process discovers that this operation could be performed by using the COM (complement) instruction followed by BIC (bit clear). That is, complement one of the arguments and then use the BIC instruction, which ANDs the complement of that complemented argument (i.e., the original value) with the other argument. This uses different operations but is semantically equivalent.

Cattell was able to generate correct code generators automatically for a variety of architectures including the IBM 360, PDP-10, PDP-11, Intel 8080, Motorola 6800, and the PDP-8. It does not appear, however, that anyone followed up on this work to determine its applicability more generally. In particular, the description language used by Cattell was based on the tree language TCOL used by the PQCC compiler. To this end, it made the descriptions highly compatible with a specific compiler. Although useful for exploring the viability of the approach, its general use is questionable.

Although the fixed compiler issue may have prevented wide-spread adoption of the technique, other problems exist. For example, the compensation axioms and heuristics were integrated into the search algorithm. This makes it difficult to alter them without changing the algorithm. As such, making additions to the compensation axioms or heuristics to accommodate new architectures could reduce the reliability of the system.

In addition, the system assumes there exists a move instruction in the target instruction set between all possible memory locations to be utilized by the store/fetch decomposition axioms. If not, it is possible for the search algorithm to block the generation of target patterns given a valid IR tree. In the best case, this may involve generating unnecessary moves simply to get an operand into the proper location.

Another issue involves the use of memory state during the search procedure. In particular, it fails to recognize equivalent locations such as a value that exists both in memory and a register. This results in emission of possibly redundant or useless operations, such as reloading a value from memory into a register. This is primarily caused by the local context in which tree matching occurs. That is, a search on a subtree of the IR tree loses information with respect to what was previously generated by the parent tree. Of course, a separate peephole optimization pass might improve this situation. GIST eliminates this problem by transforming the memory mapping to reflect these situations. As such, it provides a global context to enable better target candidate selection. Lastly, there is potential for this approach to be combinatorially explosive as axioms are applied arbitrarily across all nodes and in all combinations to an IR tree. Cattell controlled this behavior by limiting the depth of the search. While this worked, it can potentially miss better sequences.

2.6.3 Dias

A more recent proposal for generating the back-end of a compiler automatically is the dissertation of Dias [2008]. He describes an approach for generating the back-end of a Davidson/Fraser style compiler [Davidson and Fraser, 1984a] (DFC) using a declarative machine description. The DFC approach is a portable optimizing compiler that relies on a machine-independent intermediate representation called register transfer lists (RTL). Unlike a typical IR, RTL can be used to represent expression trees rather than single operations as is the case in a typical IR such as tuples or 3-address code. Also unique to this approach is that target instructions are chosen early in the compilation process and are represented by RTLs. These RTLs are then subjected to a machine-independent optimizer to improve target instructions. Throughout the optimization process the compiler ensures that each RTL can be represented by a single instruction of the target machine—this is called the machine invariant.

Dias uses RTL as the basis for describing instructions on the target and the IR form that is generated from the C-- compiler back-end [Jones et al., 1999]. The first phase of the DFC expands compiler RTL (representing the compiler IR) into instruction RTLs for a specific machine. The “expansion” process is performed by the expander, which is factored into a machine-independent procedure that identifies sub-graphs within an RTL control-flow graph called expansion tiles. The traditional DFC approach is that the back-end must provide a machine-dependent implementation of each expansion tile using only RTL from the machine instructions. It is important to note, however, that the set of expansion tiles remains the same across all target machines. Dias shows how the implementation of the tiles can be discovered using search and a description of the target instructions in λ-RTL (a language layer above RTL). The basic approach is to match target RTLs to the machine-independent tile set. If a match cannot be found for a tile, target RTLs are intelligently composed to form larger RTL expression trees that can be used for tile recognition.

After the RTL has been processed by the expander, the machine-specific RTL is then used as input to a DFC phase called the recognizer whose job is to decide if an RTL can be implemented by a single instruction on the target. To do this, the RTLs from the source must have compiler objects (e.g., pseudoregisters, compile-time constants) translated into locations and literals that are acceptable on the target machine; Dias calls this “bridging the semantic gap”. After translation, a pattern matching process is used to match compiler RTL to target RTL patterns. Unlike GIST, the matching technique is simple and does not recognize equivalent expressions under the laws of associativity and commutativity. A canonicalization process is applied to improve matching; however, it is simple and applies only to reducing compile-time expressions into constants. The simplicity restriction during the recognizer phase is important because the approach occurs at compile time, not at compiler compile time as is the case for GIST. The custom recognizer is generated for each target using a BURG-style matcher. The recognizer generates assembly language as output.

Although this approach is similar in spirit to ours (it strives to be compiler- and machine-independent), it differs in that it is applicable only to Davidson/Fraser style compiler frameworks and that it works directly from the RTL that is generated by the front-end (rather than a description of the IR in λ-RTL). In our approach we do not depend on an early translation of IR into machine-specific RTL code and hope to generate instruction selectors that can be adapted to a variety of requirements and frameworks. In addition, we describe both the target instructions and compiler IR using CISL. In contrast to GIST, the Dias approach works by matching target RTL to a predetermined set of expansion tiles. In essence, this works from target to IR whereas the GIST search procedure works from IR to target. Furthermore, our search process attempts to match all subtrees of an IR instruction to a sequence of target subtrees—each IR subtree corresponds to a target instruction. It is not unreasonable to imagine using CISL trees to compose larger trees to match larger IR subtrees more quickly, thus improving search time and eliminating the need for the machine-independent expansion tiles.

CHAPTER 3

DESCRIBING MACHINE INSTRUCTIONS

3.1 Introduction

Automatically generating instruction selectors and other machine-oriented software requires a clear definition of the machine. The definition must be rigorous enough to be manipulated by a program, yet flexible enough to allow one to define a wide variety of machines, including not only traditional processors such as the IA-32 [Intel Corporation, 2003] or PowerPC [May et al., 1994] but also ISA extensions or wholly novel architectures.

With enough flexibility, the same language can be also be used to describe the source

IR (or VM). For this reason, we introduce the CISL language because it provides the com- bination of rigor and flexibility needed to represent all of the GIST inputs and outputs as

shown in Figure 1.6. This is a novel feature of GIST: all previous systems use a single IR, of their own definition, and thus can not be used directly with multiple existing compiler IRs.

Using CISL to describe both the IR and target facilitates the building of instruction selector patterns: we express the source and target in the same language, whose semantics GIST "understands," allowing it to be rigorous about the semantic equivalence of mapped source instructions and target instruction sequences. (By "mapped source instructions" we mean source instructions whose storage resources have been mapped to target resources using the store mapping given by the GIST user, as described in Section 3.2.2.)

Many others have devised machine description formalisms for specific applications [Davidson and Fraser, 1984b, Bell and Newell, 1971, Joshi et al., 2002, Fauth et al., 1995, Ramsey et al., Kessler, 1986, Farfeleder et al., 2006, Hoover and Zadeck, 1996, Ramsey and Davidson, 1998, 1997, Ramsey and Fernandez, 1997], most of which grew out of the needs of a particular problem domain. Although our formalism also supports automatic generation of other systems-related tools, such as functional simulators, assemblers, and symbolic debuggers [Richards et al., 2007], we focus here on aspects related to instruction selector pattern generation.

Similar to other formalisms, our view of a machine consists of a set of locations called the machine store, a list of instructions that manipulate the store, and an execution loop that executes a particular instruction based on the current contents of the store. For instruction selection, the most important aspect is the definition of an instruction, which represents a low-level machine-oriented operation capable of being executed by a processor. CISL thus describes the operations of modern processors, compiler IRs, and virtual machines, but not the higher-level semantics embodied by an abstract syntax tree (AST) representing high-level programming language constructs (e.g., object field access) generated by the compiler front-end (these must be decomposed into machine-like operations referencing actual machine locations). Similarly, one needs to resolve abstract operations that depend on separately defined semantics, such as calling conventions (e.g., invokestatic on the Java Virtual Machine (JVM) [Lindholm and Yellin, 1999]) or specific runtime calls, into a CISL-specifiable form using other formalisms [Lindig and Ramsey, 2004] or an adapter (elaborated on in Chapter 7).

3.2 Overview

We define an instruction in CISL using three important machine abstractions: store, encoding, and semantics. The store specifies the set of locations available on a particular machine and how an instruction can access them; these are collectively referred to as the "machine state". The encoding defines the binary representation of an instruction including size (in bits), fields (e.g., opcode), and fixed values. The encoding information is not required by GIST, except for interfacing to details in an IR, and is largely symbolic in the semantics. The semantics describe the computational effect of an instruction on the machine state. It is important to note that CISL does not describe assembly syntax or other formats that relate to a particular tool or environment; its sole purpose is to describe only those aspects that are relevant to the semantics of an instruction. We expect other related description languages, such as those depicted in Figure 1.3, and adapters (Chapter 7), to interface with CISL to generate target applications. In the rest of this section we highlight the basic structure of a CISL semantic description; in Sections 3.2.1 to 3.3.3 we provide a thorough discussion of the CISL language.

An instruction in CISL is a sequence of guarded effects. A guard represents a condition on the state of the machine and an effect is a side-effecting operation that is "executed" if the guard is satisfied. We can express this notion as a conditional assignment statement similar to that found in most programming languages:

if G1 then E1;
if G2 then E2;
if G3 then E3;
...
if Gn then En;

where Gk and Ek are the guard and effect for the conditional statement k. For the purposes of this work, we represent guarded effects as semantic pattern trees (similar to the AST representation used by the front-end of a compiler). This representation is more easily used by GIST during heuristic search to match IR subtrees to target instruction trees. The details of the matching process are elaborated in Section 6.5.2. Consider the following snippet of CISL semantic code for the JVM if cmplt bytecode:

Listing 3.1: CISL semantics for the if cmplt bytecode

if (S[sp] <= S[sp-1]) {
  PC = target;
}
sp = sp - 2;

This bytecode pops two operand values from the stack and branches if the comparison (≤) succeeds. The first guarded effect expresses the comparison operation between the top two values on the stack, setting the PC (program counter) to the target address if the condition is true. The second guarded effect updates the stack pointer (sp) to "pop" the operands off the stack. Note that in CISL one is not required to specify a guard if an effect always executes; GIST inserts the true condition automatically to canonicalize the semantic representation to match target instructions more effectively, as discussed in Section 5.1. We express the semantics in tree form using a LISP-like s-expression:

Listing 3.2: Pattern tree representation of the if cmplt bytecode

(seq
  (if (le (lget (mem S) (mem sp))
          (lget (mem S) (sub (mem sp) (iconst 1))))
      (lset (lget (mem PC)) (param target)))
  (if true
      (lset (lget (mem sp))
            (sub (lget (mem sp)) (iconst 2)))))

The above pattern tree representation provides semantic information that is not visible in the CISL code in Listing 3.1. For example, it is now clear that S, sp, and PC refer to CISL memory definitions. In addition, instruction parameters, also known as fields, are explicit, as shown by the branch destination location (target). In addition to operations, memory references, fields, and literals, the tree form records type information for each expression, which is important for determining equivalent semantics and constraints during matching (we elide the type information in the above example for the sake of readability). We discuss CISL types in Section 3.2.1 and describe their use during search in Chapter 6.

CISL provides a wide range of operators that are capable of describing the instruction semantics of a wide variety of architectures. This includes operators for standard arithmetic such as addition, subtraction, and multiplication as well as logical and bitwise operations. In addition, CISL is equipped with several bit manipulation operators supporting sign-extension, bit re-sizing, and population count. We have found the current set of operations sufficient for describing a variety of machines: compiler IRs, VM instructions, and real architectures. We describe the operator set in Section 3.3.2.3.

Furthermore, CISL uses modern programming language techniques to reduce the effort required to describe complicated semantics. Consider the instruction set for the Intel x86. The ADD instruction can address a value in a variety of ways, leading to approximately fifty unique definitions of the semantics. To describe each variation individually would be time-consuming and lead to descriptions that are unwieldy. For this reason, CISL borrows concepts from object-oriented programming languages and allows semantics to be described using methods, classes, and mixins.

Just as we use methods to decompose an algorithm into smaller pieces, CISL uses methods to decompose a semantic description. We specify related methods in a class to form part of an instruction semantic definition. A class can then inherit from another class to form a hierarchy of semantics that can be easily re-used and extended (e.g., x86). CISL supports single inheritance because of the problems that can arise when multiple classes can be extended.

Class inheritance is excellent at decomposing semantics along the encoding formats of an instruction set. However, it is not convenient for expressing cross-cutting semantics such as addressing modes. An addressing mode defines how values are accessed on a machine (e.g., immediate, register, memory). CISC architectures (e.g., x86) may have a large number of addressing modes that can be used in a variety of ways across the entire instruction set. As mentioned previously, the x86 architecture has approximately fifty different versions of the ADD instruction, all performing the same operation: they differ in the addressing modes they use. To alleviate the burden of defining slight variants of the same instruction multiple times, CISL supports mixins. A mixin contains methods that define cross-cutting semantics. Multiple mixins can then be "used" by a class, at different points in the class hierarchy, to reference mixin methods.

The application of modern programming language techniques in CISL enables concise, re-usable descriptions that one can easily extend to incorporate additional features of an architecture. In the following sections we elaborate on the details of CISL from the ground up: Section 3.2.1 introduces the CISL type system; in Section 3.2.2 we elaborate on how machine memory is described. Section 3.3 demonstrates how we use CISL to capture the encoding and semantics of instructions. Lastly, in Section 3.3.3 we discuss CISL's use of classes and mixins to show how we can write concise descriptions that enable re-use.

3.2.1 CISL Types

The basic type of data that any modern machine handles is the bit. As such, our model builds upon this fundamental concept by providing the type bit. Although useful in its own right, it is limited to representing only such things as the numeric values 1 or 0, booleans true or false, and on or off. To be truly useful, then, we must be able to organize bits into structures that are capable of representing components of a machine, in particular, storage locations. For this reason, the most heavily used type in our system is the bit array. A bit array is simply that: an array of elements of type bit. More generally, however, a bit array can be an array of bit arrays, as illustrated in the following definition in Backus–Naur Form (BNF) notation:

T ⇒ B | R
B ⇒ bit
B ⇒ B[N]
R ⇒ real(N)
N ⇒ n, if n ∈ Integer

Thus, a type T is either a bit, a bit array of n elements (B), or a real type (R). Floating-point values are specified using the type real along with a specified length n. Currently, we assume that a real type refers to the abstract notion of a real number, not to any particular format or standard, although most present-day machines adhere to the IEEE 754 floating-point standard.

Given the definition of bit arrays we can imagine any number of possible constructions that could represent data in a machine. For example, a bank of 16 32-bit registers can easily

be represented by a bit array of size 16 where each entry contains an array of 32 bits, or more concisely, bit[16][32]. This is not only sufficient for specifying typical memory

structures such as a register file or the JVM heap, but it is also useful for defining the type of an instruction word fetched from an instruction store, namely, bit[32] for a 32-bit instruction.

In addition to defining the structure of bit array types, a bit array also requires an interpretation. More specifically, an interpretation of a type allows us to map its value to a representation of bits. For our purposes, two interpretation modifiers are provided: endianness and signedness. In our model all bit arrays are associated with an endianness, either big or little, indicating whether bit 0 is the most-significant or least-significant bit, respectively. This allows us to reason about types in the native endianness of the machine as well as between different machines with opposite endianness. For example, the layout of bits in a big-endian bit array of size n on one machine is not the same as in a little-endian bit array of size n on another machine, even though they may represent the same value.

Another interpretation that an array of bits has is its signedness. In particular, does a given array of bits represent a signed or unsigned value? Since all machines generally support operations for both, providing such an interpretation is not only convenient but necessary for reasoning about equivalent operations between different architecture models. Unlike endianness, however, two bit arrays may be bit-by-bit identical but potentially represent different values from a signedness point of view. We also note here that we are assuming a two's complement representation. This is not a restriction of the system and could just as easily be ones' complement—indeed, most user-mode instructions are unaware of the actual value representation of the underlying machinery. For example, a signed addition of two registers on a two's complement machine produces the same result as a signed addition on a ones' complement machine. That some programs depend on a specific binary representation is an issue of program semantics rather than of the instruction set.

Ultimately, these interpretation modifiers have implications for how we map compiler IR instructions to target instructions as well as how we prove their equivalence. While endianness tells us how to extract the value from an array of bits, signedness indicates what the bits mean. Using this information we can check our model for correct operator usage when defining instruction semantics, as well as check that operations on different machines are operating on the same kind of value.

To conclude our discussion on types, we show an example of the CISL syntax that allows us to introduce named types into the system. A named type definition associates a symbolic name or identifier with a particular type. This name can then be used in subsequent type definitions, as we show in the following snippet of CISL:

type byte = little signed bit[8];
type word = little signed byte[4];

Here we declare a type byte that is 8 bits in length, is interpreted as a signed value, and whose bit 0 is the least significant bit. We then define the type word, which is an array of 4 bytes, is interpreted in its entirety as a signed value, and whose byte 0 is the least significant byte.

3.2.2 Store

A store is specified in CISL using a store declaration. A store declaration contains a list of memory declarations describing the locations that are available on a particular machine. In CISL, there are two kinds of memories: indexed and addressed. A typical indexed memory is a register file. Accesses to indexed memories do not allow a computed index, as they are usually accessed by a hard-coded field of an instruction (e.g., a register operand) or for a specific purpose (e.g., the frame pointer). In Listing 3.3 we show an example of the store definition for the ARM architecture. The register file, R, is declared as an indexed memory of 16 elements of type word_t. One can also declare indexed memories that consist of a single location, as shown in Listing 3.4 for the JVM stack pointer register (SP).

An addressed memory is usually larger and typically accessed with a computed address. For this reason, one only needs to specify the type used to index the memory. In our ARM example, the memory declaration for main memory, M, must be indexed with an expression of type address_t. Because these kinds of memories often support several data sizes (e.g., byte and word), CISL allows one to define data selectors to indicate the type of the data that one can access as well as any alignment constraints. Alignment is specified as an integer following the data selector definition: a 2 indicates that the access is aligned on two quanta and a 4 on four quanta. One must specify a quantum data selector to indicate the smallest accessible unit of data; other selectors must be aggregations of the quantum. For example, ARM main memory is byte-oriented so the quantum is a byte, but it is also possible to access 16-bit half-words and 32-bit words of data, which we describe using the additional selectors half and word. The heap memory, Heap, for the JVM is defined similarly.

Listing 3.3: Store definition for the ARM

type address_t = little unsigned bit[32];
type byte_t    = little signed bit[8];
type half_t    = little signed bit[16];
type word_t    = little signed bit[32];

store Arm {
  address M[address_t] {
    quantum data byte_t byte;
    data half_t half 2;
    data word_t word 4;
  }

  indexed R[16][word_t] {
    alias PC = R[15];
  }
}

Listing 3.4: Store definition for the JVM

type baddress_t = big unsigned bit[32];
type bbyte_t    = big signed bit[8];
type bword_t    = big signed bit[32];

store JVM {
  address Heap[baddress_t] {
    quantum data bbyte_t byte;
    data bword_t cell;
  }

  address Stack[baddress_t] {
    quantum data bword_t slot;
  }

  indexed SP[1][baddress_t] { }
}

In addition, memory declarations allow locations to be aliased. For example, the ARM program counter (PC) is the same as general purpose register 15. Although it is acceptable to reference R[15] in a semantic definition, using the name PC enhances the readability

of the CISL description and can often be directly translated from an architecture manual.

Our CISL description treats condition codes differently from the memories described above, because of their special control flow semantics. For example, many machines provide a compare instruction that compares two values and sets condition codes based on the comparison; a later branch instruction then uses those codes, providing an implicit connection between the two. Because of this special control flow relationship, we define a set of virtual condition code registers that are used for specifying compare-and-branch style semantics. In particular, we provide condition code registers for the most common comparison operations, such as ==, !=, and signed and unsigned <, >, >=, and <=. For use in a simulator, where one might need to save a processor status word, we can also compute carry, zero, negative, and overflow bits (or whatever the particular processor does), but the virtual bits are more useful for capturing compare and branch semantics. (Also, if they are computed lazily, they describe an important optimization for simulators.) Here is an example of the condition code registers being set in our description of a

PowerPC register compare instruction:

Listing 3.5: PowerPC condition codes

NE = R[RA] != R[RB];    UNE = (uword_t) R[RA] != (uword_t) R[RB];
EQ = R[RA] == R[RB];    UEQ = (uword_t) R[RA] == (uword_t) R[RB];
LT = R[RA] <  R[RB];    ULT = (uword_t) R[RA] <  (uword_t) R[RB];
GT = R[RA] >  R[RB];    UGT = (uword_t) R[RA] >  (uword_t) R[RB];
LE = R[RA] <= R[RB];    ULE = (uword_t) R[RA] <= (uword_t) R[RB];
GE = R[RA] >= R[RB];    UGE = (uword_t) R[RA] >= (uword_t) R[RB];

Note that CISL does not require the description author to indicate how to compute the actual bit that is stored in a virtual condition code register, just the comparison expression. We thus capture the semantics of condition codes in a uniform manner (for both signed and unsigned comparisons, e.g., NE and UNE), which we can exploit in our search procedure when it matches IR and target instruction patterns. One then uses a condition code in a CISL if statement to specify the semantics of the PowerPC branch-if-greater-than instruction:

Listing 3.6: PowerPC branch-if-greater-than

if (GT) {
  word_t delta = (BrDisp::0b00)!32;
  PC = PC + delta;
}

3.3 Describing Instructions

An instruction in CISL has two primary components: encoding and semantics. The encoding of an instruction represents its bit pattern. Typically, the encoding is divided into several fields that are used to identify the instruction type (e.g., opcode) or to provide parameters to the instruction. For example, bits 21-24, 26-27, and 4-6 shown in the encoding for the ARM ADD in Figure 3.1 are used to identify the instruction. The fields labeled by Rn, Rd, and Rm are used as parameters to the instruction semantics. We describe how CISL defines the encoding of an instruction in Section 3.3.1.

The semantics of an instruction specify what the instruction "does" when it is executed by the machine. The semantics operate on values defined by the encoding or memory locations declared by a CISL store. More formally, the semantics of an instruction is a function, op, mapping the machine state σ to a new machine state σ′, or op : Σ → Σ (where σ, σ′ ∈ Σ). How op accomplishes this mapping is called the effect of an instruction and is the primary mechanism we use to compare instructions on different machines for equivalence. The effect is also the basis for construction of functional simulators. CISL provides numerous operators capable of describing the instruction semantics for modern architectures and compiler IR. We elaborate on the details of the semantics specification in Section 3.3.2.

3.3.1 Defining Instruction Encoding

As mentioned previously, to describe the encoding and semantics of an instruction CISL uses object-oriented techniques. In particular, classes are used to define instructions, variables describe the encoding structure of an instruction, and methods define semantics over those variables. We leverage these concepts for two distinct purposes: to provide a language that is familiar to those who have used an OO language previously (in fact, our syntax is very similar to languages such as Java and C#) and to exploit class inheritance and mixins to promote reuse. In this dissertation we discuss points of CISL that are relevant to generating instruction selectors; CISL supports the generation of other system tools including assemblers, disassemblers, and functional simulators [Moss et al., 2005, Richards et al., 2007].

Before we begin our discussion of the semantics, we show how to specify the binary encoding. This is required by the semantics as instructions often use fields in the encoding, referred to as the instruction parameters (such as immediate values and memory addresses), directly or indirectly in expressing effects. An example of the encoding of the

ARM ADD logical shift left by immediate instruction, taken directly from the ARM Architectural Manual [Seal, 2000], is shown in Figure 3.1. This instruction is 32 bits long (in fact, because ARM is a fixed-length instruction set, all instruction words are 32 bits wide) and consists of several instruction fields. Certain fields are fixed values that serve to identify the type of instruction, such as the opcode field (bits 21–24 in Figure 3.1). The values of other fields vary depending on the parameters of the instruction, for example, the destination register Rd. To accommodate these notions we define the encoding of an instruction using variable declarations to indicate the structure of the encoding and constraints to indicate fixed values identifying an instruction.

31-28   27-26   25   24-21     20   19-16   15-12   11-0
cond    0 0     I    0 1 0 0   S    Rn      Rd      shifter operand

shifter operand (bits 11-0):   11-7        6-4     3-0
                               shift_imm   0 0 0   Rm

Figure 3.1: ARM ADD logical shift left by immediate

A single variable declaration describes the overall form of an instruction word (i.e., length and endianness). A reference declaration is then used in conjunction with that variable to define new variables that refer to individual fields within the instruction word variable. For example, the following CISL class declaration defines the instruction word and fields for the ARM ADD instruction, as shown in Listing 3.7:

Listing 3.7: ARM ADD: Field Declarations

instruction class ADD_SHL_IMM {
  var bit[32] inst;
  var bit[4] cond      @ inst[28:31];
  var bit[4] opcode    @ inst[21:24];
  var bit[4] Rn        @ inst[16:19];
  var bit[4] Rd        @ inst[12:15];
  var bit[4] Rm        @ inst[0:3];
  var bit[5] shift_imm @ inst[7:11];
}

This defines the instruction class ADD_SHL_IMM containing several variable declarations. In this example we are assuming the default endianness and signedness to be little-endian and unsigned respectively. We first define a 32-bit variable inst representing an ARM instruction word. The succeeding reference declarations refer to specific fields of bits within the variable inst using the range operator :. For example, cond defines the condition field that is 4 bits wide starting at bit 28 (because inst is little-endian this refers to bits 28–31), opcode is 4 bits wide starting at bit 21, Rn is 4 bits starting at bit 16, and Rd is 4 bits wide starting at bit 12.

As we mentioned previously, we fix certain fields to indicate the type of the instruction. To do this, we extend the instruction class definition in Listing 3.7 with a special method

named encode whose body is a series of assignments that indicate the fixed values of these fields:

Listing 3.8: ARM ADD: Fixed Encoding

instruction class ADD_SHL_IMM {
  // variable declarations defined in previous listing
  fun encode() {
    opcode = 0b0100;
    inst[26:27] = 0b00;
  }
}

This example introduces several points of notation. First, comments begin with // and continue to the end of the line. Second, a method declaration begins with the fun keyword followed by a return type, the name of the method, and a list of formal arguments. The return type may be omitted if the method does not return a value, as is the case for the encode declaration above. The body of the encode method consists of a list of assignment statements indicating the fixed values for particular variables or fields. For example, we set the opcode field to the binary value 0b0100, identifying this as an ARM ADD instruction. In addition to binary, CISL literals can also be specified in a variety of formats including decimal, hexadecimal, and octal.

In some cases a field may not have a name associated with it (i.e., no reference declaration), as we show by setting the value 0b00 for the field inst[26:27]. This expression refers to the two-bit location within the inst array starting at offset 26 and ending at offset 27 (we elaborate on the complete set of operators when we discuss the semantics). In fact, reference declarations are really just syntactic sugar to provide more readable specifications. Indeed, we can specify the complete encoding directly using the range operator. Additional fixed encodings exist for this instruction but we omit them to conserve space.

We list example CISL descriptions of real instructions for real architectures and compiler IRs in Appendix A.

The CISL compiler treats the encode method differently from the semantic methods (discussed in the next section). In particular, the only operator that can be used in an encode method is assignment (=). We record these assignments as field constraints during processing, and the CISL compiler expects back-end generator tools (e.g., GIST) to interpret them appropriately. For example, assemblers and instruction selector generators can use this information to generate machine-code output; disassemblers and functional simulators can use field constraints to decode instructions for debuggers and simulated execution, respectively.

In some instances an instruction may not have an encoding. This is particularly true for virtual instruction sets such as an IR used by a compiler. The omission of the encode method from an instruction specification indicates to the GIST system that the instruction cannot be emitted by a code generator in binary form (which makes sense for IR instructions).
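For example, a purely virtual operation can be written with only an effect method. The following is a minimal sketch (the class name, parameter names, and temporary store T are hypothetical, in the style of the compiler IR examples shown at the end of this chapter):

// Hypothetical three-address IR add. With no encode method, GIST
// treats this instruction as one that cannot be emitted in binary form.
instruction class ir_add {
  var word_t dst;
  var word_t src1;
  var word_t src2;

  fun effect() {
    T[dst] = T[src1] + T[src2];
  }
}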

Now that we have a firm understanding of the mechanics of variable declarations, references, and encoding, we are in a position to discuss the instruction semantics, which define what an instruction does. The semantics specify the operations that take place (e.g., add, subtract) and the underlying stores that the instruction observes and updates (e.g., registers, memory). It is these two components that enable the GIST system to find a semantic mapping of an instruction on one machine into a sequence of instructions on a target machine, if one exists.

3.3.2 Specifying Instruction Semantics

We designed the CISL language to describe instructions for real architectures and compiler IR. This is both novel and important for GIST in finding instruction selector patterns—because we specify IR and target semantics in the same language, it is possible to compare them syntactically to determine equivalent semantics. In CISL, one describes the semantics of an instruction using a combination of expressions and statements. An expression "produces" a value, whereas a statement typically introduces a change in the machine state (e.g., storing a value to memory). Other statements exist to describe parallel assignments and exceptional conditions.

3.3.2.1 Variable Expressions

In a typical programming language, variables are the fundamental objects used to store and retrieve values. In CISL, we use three constructs in expressions to retrieve a value: memories, fields, and local variables. A memory reference in an expression “loads” a value from that memory declared in the store. We reference an indexed memory like an array in a conventional programming language:

R[5] + 12

This describes the addition of the constant 12 to the value stored in memory R at index 5. If we declare an indexed store to have only a single element, it can be referenced using its name alone, as shown in the following example, where we subtract 4 from the stack pointer register (SP):

SP - 4

Addressed memories are different as they declare data selectors that are used to address different sizes of a stored value in memory (e.g., byte, half, word). The following example

shows how ARM main memory M (declared in Listing 3.3) would be referenced:

(a) M[R[Rd]]

(b) M.half[R[Rd]]

(c) M.word[R[Rd]]

In example (a), we do not use a data selector—this indicates that we applied the default quantum data selector (byte). In (b) and (c) we use the data selectors half and word to

aggregate two and four consecutive bytes, respectively. We can index addressed memories by any arbitrary expression (as long as it agrees in type with the store definition). However, we can index indexed memories only by a literal or a field. In each of the above cases, the reference to register R is indexed by the field Rd.

In addition to memories, it is possible to reference fields and local variables in semantic expressions. Unlike memory, however, it is not possible to assign a value to a field in a guarded assignment. This limitation makes sense for encoding fields because the meaning of an instruction dictates the semantics and its parameters, not the reverse. A local variable, however, is not part of the encoding; rather, it is used as a named expression, resulting in semantic descriptions that are easier to read and maintain. A local variable is declared exactly like a field; however, it must provide an initialization expression:

(a) var bit[32] x = M.word[Rd];

(b) x + 4

In the above example, we declare the local variable x in (a), and in (b) we reference x in an add expression. Unlike fields, however, a local variable is accessible only in the method in which it is declared.

3.3.2.2 Expression Types

In CISL, the semantics of each operator have a well-defined meaning—this allows semantics to be matched syntactically. It is possible, however, for two operators of the same kind to not match because their operands have different types. For example, some machines (e.g., MIPS) may differentiate between unsigned and signed operations. One approach would be to define additional operators that syntactically indicate the difference in signedness, endianness, or length (as we have done for floating-point). This approach, however, can lead to semantic descriptions that are cluttered with too much type information. For this reason, the types of operators are inferred by the CISL compiler.

More precisely, operators in CISL are polymorphic in the type of their argument(s). For example, the CISL operator for addition, +, can be described by the type schema τ × τ → τ.

This means addition has two operands whose type is τ and produces a resulting value whose type is also τ. Alternatively, consider the type schema for the concatenation operator (::): τ_j × τ_k → τ_{j+k}, where j and k indicate the bit lengths of its operands. In this case, the resulting type has a bit length that is the sum of the lengths of its two operands. For example, given the expression x :: y, where x is a bit array of size 16 and y is a bit array of size 16, the type of x :: y is a bit array of size 32. These type schemas are used by the CISL system to infer the types of all semantic expressions.

Inferred types are then used by GIST to match operators precisely. For example, a match

between two semantic expressions using + will fail if they disagree in signedness or their bit lengths differ. It is also possible to change the type of an expression by using the cast operator. In

CISL, a cast can change only the signedness or endianness of an expression, but not the length—changing the bit length is accomplished using sign-extension or range. A typical case where a cast and sign-extension are both required is when an unsigned immediate,

encoded in an instruction word, is added to a signed register. For example, the MIPS

architecture provides an add immediate unsigned instruction. Because the CISL front-end compiler enforces strict type rules on an expression, we must take care to ensure that we

describe the semantics precisely as shown in the following example:

R[rd] = (little unsigned bit[32])R[rs] + (imm!32);

The immediate field, imm, is declared to be 16-bit unsigned. This is sign-extended to 32 bits and then added to register R[rs], which is appropriately cast to be unsigned. The resulting type of the addition is then little unsigned bit[32]. The type inference

system automatically inserts a cast for assignment—in this case the result is automatically cast to signed.

3.3.2.3 Operators

CISL provides a significant number of operators to describe the semantics of an instruction. In Figure 3.2, we show all the CISL fixed-point binary and unary operators ordered by precedence. In addition to the operators found in mainstream programming languages (e.g., Java), several bit-level operations are provided, such as population count and sign extension. CISL also supports a range of floating-point operators used to describe the semantics of floating-point instructions. A floating-point operator is the same as its fixed-point counterpart prefixed with a dot: ".". For example, floating-point addition is .+ and multiplication is .*.

Logical Operators The logical operators ==, <, >, <=, >=, and != are used to compare the value of two CISL bit structures of the same type. Although it is possible for different-sized bit arrays to represent the same logical value (e.g., bit[32] and bit[16]), real processors typically require their operands to be the same size. Because compiler IRs are targeting real instructions, they typically require the operands to be type compatible. If they are not, the front-end of the compiler uses conversion rules to promote operand(s) to the appropriate type.

The operators !, &&, ||, and ^^ require their operand(s) to evaluate to the CISL type bit. Because CISL is operating within the context of a single instruction, short-circuit evaluation is not applicable as it is in most high-level programming languages. In other words, the operands to a logical operator are evaluated before the operation is applied.

Operator     Operation               Notes
==           equals
<,>,<=,>=    comparison operations
&&           logical and
||           logical or
^^           logical xor
!            logical not
:            range                   access a range of bits; used in an array index
# n          front resize            resizes bit array by n, adding 1 padding to front
#! n         front resize zero       resizes bit array by n, adding 0 padding to front
## n         back resize             resizes bit array by n, adding 1 padding to back
##! n        back resize zero        resizes bit array by n, adding 0 padding to back
! n          sign-extend             sign-extend bit array by n bits
!0 n         zero-extend             zero-extend bit array by n bits
::           concatenation           appends two bit arrays
&            bitwise and
|            bitwise or
^            bitwise xor
~            bitwise not
<<,>>        left/right shift
!>>          arithmetic right shift
<<<,>>>      left/right rotate
+,-,/,*      add/sub/div/mul
%            modulo
-            two's complement        unary negation
++,--        increment/decrement
$1,$0        population count

Figure 3.2: CISL Operators

Bit Resizing Operators We separate these operators from the bitwise operators as they are not common from a programming language standpoint. The bit resizing operators are used by CISL to resize a bit array. The range operator, :, is the most frequently used operator for describing instruction fields. It is used to access a range of bits within a larger bit array. For example, given the bit array x of size 32 that is little-endian and signed, then x[0:5] results in a bit array of size 6 that is also little-endian and signed. The operators # and ## are used, along with an integer argument n, to resize a bit array to size n by padding 1 to the front and back respectively. #! and ##! are similar except they add 0 padding. The concatenation operator, ::, is used to append two bit arrays with identical type modifiers (e.g., signedness and endianness) of size j and k to produce a bit array of size j + k. Lastly, CISL provides two operators, ! and !0, to sign- or zero-extend a bit array respectively.

Bitwise Operators The operators &, |, ^, and ~ (bitwise and, or, xor, and not) require operands of the same type. The right-shift operator >> shifts the bits of an array to the right by the specified amount. The semantics of >> differ depending on the signedness of the array: if the array is signed, sign-preserving padding is used; otherwise the resulting array is zero padded. The left-shift operator << shifts the bits of an array to the left by a specified amount, padding the resulting array with zeros. The arithmetic shift-right operator, !>>, is similar to right-shift; however, the sign is ignored and the result is padded with zeros. Lastly, the rotate operators, >>> and <<<, rotate the bits in an array by a specified amount right and left respectively.

Arithmetic Operators The arithmetic operators +, -, /, %, and * all require their operands to be of the same type. The value of the increment expression x++, where x is a memory location, is x+1, and it is translated by the CISL compiler into the equivalent effect x=x+1. This also occurs for expressions using the decrement operator. Additional semantics resulting from an arithmetic operation, such as overflow or carry-out, are recorded by the semantic description by assigning to a CISL-supported location. For example, to capture the overflow resulting from a signed addition one would assign to the built-in OV memory: OV = x + y.
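As a small illustrative sketch (the register indices are chosen arbitrarily), these conventions combine as follows:

// Record the overflow of a signed addition in the built-in OV memory,
// then perform the addition; the increment is shorthand that the CISL
// compiler expands into R[3] = R[3] + 1.
OV   = R[1] + R[2];
R[0] = R[1] + R[2];
R[3]++;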

Specialized Operators CISL includes two specialized operators for describing population count. $1 and $0 are used to count the number of 1’s and 0’s in a bit array respectively. Although it is possible to express population count using other operators and a bounded for-loop, it makes sense to fold specialized, yet common, semantics into a single operator as it makes matching faster and easier. It is certainly possible for a machine to introduce an operation that is not covered by the standard CISL operator set. These operations could

be defined using the primitive operators. However, depending on the operation, this may

lead to complicated semantic descriptions that are difficult to match. For this reason, CISL has been designed to allow new operators to be defined. This introduces other problems: the operator must be known in both the IR and target descriptions for matching to succeed, or an axiom must be introduced to rewrite the operator into primitive operators. This does not make the approach less general; however, it requires additional work to map the IR to the target. The CISL descriptions that we have written so far have not required any new operators—all semantics have been easily expressed within the standard operator set.

3.3.2.4 Statements

A statement in CISL is used to describe a change in the state or control flow of the machine. There are five kinds of statements: assignment, if-else, for-loop, all-loop, and throw.

The if Statement

The if-else statement is used to describe instructions that have branching semantics, for example, the if cmplt bytecode shown earlier in Listing 3.1 or the ARM's predicated instruction set. It is also possible to specify multiple branches using else:

if (C1) {
  S1
} else if (C2) {
  S2
} else if (C3) {
  S3
} else {
  S4
}

If the condition Ci is true then the corresponding statements Si are executed. GIST translates an if-else statement into a sequence of conditional assignments—how this is accomplished is elaborated on in Chapter 6.
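As a rough illustration only (the precise translation GIST performs is described in Chapter 6), a two-way branch such as

if (C1) { S1 } else { S2 }

corresponds to the guarded-effect sequence

if C1 then S1;
if !C1 then S2;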

The for Statement

The for-loop is used to describe bounded iteration and has the form:

for control in [start .. end] { S }

The for-loop iterates from start to end assigning the current iteration value to the control "variable". The control variable is special in that a variable declaration is not required; its type is inferred from the start and end variables as well as from how it is used within the body of the for-loop (S). For example, the ARM architecture provides instructions to load and store multiple registers from and to memory. The load multiple instruction loads a set of registers that are specified in a bit field encoded in the instruction word. We can describe these semantics using a CISL for-loop:

for i in [0 .. 15] {
  if (reg_list[i]) {
    R[i] = M.word[R[Rn]];
    R[Rn] = R[Rn] + 4;
  }
}

In this example, we iterate from 0 to 15, covering the 16 registers available on the ARM and the 16 bits that encode which registers to load (reg_list). Inside the loop we also have a conditional that determines if the bit for the specific register is set. If true, register i is loaded from memory at the location specified in register R[Rn].

Because CISL loops are bounded, it is possible to perform loop unrolling in the event that GIST is unable to find a match on the target for the entire loop. Consider an IR that includes an instruction that loops over an array of integers, performing the same operation on each entry. It is very likely that this looping behavior is not supported directly on a real architecture. By unrolling the loop we can generate a straight-line implementation of the semantics for each iteration. In this form, it is more likely that GIST will be able to find a match.
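As a minimal sketch (the loop below is hypothetical rather than taken from one of our descriptions), a two-iteration loop over temporaries T unrolls into straight-line effects that can be matched independently:

// original bounded loop
for i in [0 .. 1] {
  T[i] = T[i] + 1;
}

// after unrolling: one effect per iteration
T[0] = T[0] + 1;
T[1] = T[1] + 1;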

The all Statement

The all-loop allows the definition of guarded effects that are to be executed in parallel and has the form:

all control in [start .. end] { S }

The all statement has a syntactic structure similar to the for-loop, but the semantics are radically different: an all statement indicates that the guarded assignments in the body, S, are to be executed in parallel. The control variable is used to index a memory at end-start+1 different locations in parallel. The typical use case of an all is for describing vector-like instructions. For example, we use all in our description of the Synergistic Processing Element (SPE), a special-purpose vector unit found on the Cell Broadband Engine [Kahle et al., 2005, Gschwind, 2006]. The SPE has a vector of 128 registers that are 128 bits wide. In CISL this is modeled with an indexed store named GPR containing 128 entries, each a bit array 128 bits long. We model the vector add instruction as follows:

all i in [0 .. 3] {
  GPR[rt][i*32:i*32+31] = GPR[ra][i*32:i*32+31]
                        + GPR[rb][i*32:i*32+31];
}

The SPE supports single-instruction, multiple-data (SIMD) capability. Its SIMD instructions can operate in parallel on several consecutive sub-ranges of bits in multiple registers, including sixteen 8-bit values, eight 16-bit values, four 32-bit values, and two double-precision floating-point values. The above example demonstrates how we would use the all statement to model the SPE 32-bit vector add instruction. The control variable, i, indicates which sub-range of bits is being operated on. The CISL variables rt, ra, and rb are specified by the encoding and indicate which 128-bit registers are involved in the operation. The sub-range of bits within each register is specified by the expressions i*32, the first bit, and i*32+31, the last bit. The semantics can be more clearly observed if we "unroll" the all statement:

all {
  GPR[rt][0:31]   = GPR[ra][0:31]   + GPR[rb][0:31];
  GPR[rt][32:63]  = GPR[ra][32:63]  + GPR[rb][32:63];
  GPR[rt][64:95]  = GPR[ra][64:95]  + GPR[rb][64:95];
  GPR[rt][96:127] = GPR[ra][96:127] + GPR[rb][96:127];
}

This is shorthand to indicate that each guarded-effect is to be executed in parallel. It is

equivalent to an all statement where both start and end are 1 and the control variable is not used.

The throw Statement

The throw statement is used to specify exceptional conditions that occur during instruction execution. This typically refers to semantics that are handled externally to a user-mode instruction. For example, an instruction may raise an exception if it attempts to divide by zero. An exceptional condition is not limited to real architectures; for example, the JVM aaload bytecode loads a reference from an array at a given index. If the index is not within the bounds of the array, an ArrayIndexOutOfBoundsException is thrown. Both of these cases can be modeled by the CISL throw statement:

throw exception

The exception is a symbolic identifier naming the exceptional condition. Similar to stores, an exception identifier needs to be mapped to the proper exception on the target. CISL purposely does not describe the control flow semantics or memory structures used by an exception (e.g., the exception vector on MIPS or the interrupt descriptor table on the x86), as the details of the semantics differ from machine to machine. However, the abstract semantics are often identical. The following example shows how one would describe the aaload instruction for the JVM:

// bounds check
var bit[32] ref   = S.slot[sp+4];
var bit[32] index = S.slot[sp];
if (index >= S.slot[arraysize(ref)] || index < 0) {
  throw ArrayIndexOutOfBoundsException;
}

// load array reference
S.slot[sp+4] = Heap.cell[arrayidx(ref,index)];
sp = sp + 4;

In this example we first perform a bounds check: we compare the size of the array to the index (both provided as arguments on the JVM run-time stack). It is important to note that we use a method call, arraysize, to determine the length of the array. Because the retrieval of an array's length on the JVM is implementation-dependent, we can abstract the semantics into a method call and easily replace the implementation of arraysize depending on the specific JVM implementation. (In our experiments we describe two JVM implementations. However, it is easy to see that CISL is powerful enough to describe a generic JVM, leaving implementation details to specific methods and/or classes and mixins.) If the index exceeds the size of the array, the proper exception is thrown (i.e., raised); otherwise execution proceeds as normal and the value in the array at the given index is pushed onto the stack.

3.3.3 Classes and Mixins

As we demonstrated in the previous section, the core language of CISL (statements and expressions) is capable of describing the instruction set encoding and semantics of both compiler IR and real architectures. To allow CISL descriptions to be easily written, re-used, and extended, we formally introduce classes and mixins. CISL provides a class declaration for grouping together related semantics that can be inherited by child classes. Inheritance facilitates concise descriptions of instruction fields and their decomposition. Consider the

instruction formats found in the PowerPC. CISL class definitions readily capture these formats and their encodings, as shown in the following complete definition of the PowerPC addi instruction:

class Instruction {
  var bit[32] inst;
  var bit[6]  OPCD @ inst[0];
}

class DForm extends Instruction {
  var bit[5] RA @ inst[11];
}

class DForm_RT_SI extends DForm {
  var bit[5]  RT @ field1[0];
  var bit[16] SI @ field3[0];
}

instruction class addi extends DForm_RT_SI {
  fun encode() {
    OPCD = 0x0E;
  }

  fun effect() {
    GPR[RT] = GPR[RA] + SI!;
  }
}

instruction class addi_ra_zero extends DForm_RT_SI {
  fun encode() {
    OPCD = 0x0E;
    RA = 0;
  }

  fun effect() {
    GPR[RT] = SI!;
  }
}

In the PowerPC ISA, all instructions are 32 bits long and have a 6-bit opcode field. Therefore, our PowerPC description defines the base class Instruction with variable declarations for the instruction word (inst) and the 6-bit opcode field (OPCD). The Instruction class is extended by the class DForm to declare the encoding field RA used by the DForm format described in the PowerPC architecture manual [May et al., 1994]. This is further refined by the DForm_RT_SI class to describe the RT and SI fields.

One completes the definition of an instruction by indicating that its class is an instruction class and providing an effect method. In this example, we show two instruction class declarations, for addi and addi_ra_zero. The former defines the PowerPC addi instruction without any constraints (except the OPCD), and the latter shows a variant where the RA field has been constrained to be 0 in the encode method. In this case, the semantics are that the immediate field (SI) is assigned directly to the register RT. Although this example uses inheritance to facilitate sharing of encoding information, it is also possible to reuse semantic methods that can be "called" from the effect method. For details see the CISL description listings in Appendix A.
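As a brief sketch of the idea (the helper method and the addi_alt class below are hypothetical and not part of our PowerPC description), a shared semantic method can be placed in a class and invoked from an effect method:

class DFormHelpers extends DForm_RT_SI {
  // hypothetical helper: sign-extend the 16-bit immediate to 32 bits
  fun bit[32] extendedImm() {
    return SI!32;
  }
}

instruction class addi_alt extends DFormHelpers {
  fun encode() { OPCD = 0x0E; }

  fun effect() {
    GPR[RT] = GPR[RA] + extendedImm();
  }
}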

Although classes are sufficient for describing much of the information related to instructions, we often encounter situations where properties are shared in a manner that is not easily captured by single inheritance. Consider the shifter operand addressing mode used by the ARM [Seal, 2000] data-processing instructions. There are 11 different forms represented by this addressing mode, used by 16 different instructions, leading to a total of 176 unique instruction formats and semantics (not to mention the 16 different condition field opcodes). To define the formats for these instructions using only single inheritance would require 11 classes, one for each of the shifter operands, plus 16 classes representing the data-processing instructions, each duplicated 11 times to inherit from a different shifter operand class. This quickly leads to an unwieldy description that is hard to read, understand, and maintain. What we require is the notion of attaching variables and methods

to classes without disrupting the class hierarchy—for this we use mixins. A mixin is similar in content to a class but is used by classes rather than extended. When a class uses a mixin, the declarations in the mixin become available to that class.

In addition, it is possible (and common) for a class to use more than one mixin. This does not mean, however, that all the declarations in the used mixins become available simultaneously. Consider the following simple mixin use case: a class declaration K uses

mixins X and Y. Using both mixins results in two unique class definitions: a class K that uses mixin X and a class K that uses mixin Y. In effect, a mixin can be viewed as a kind of class constructor. The following shows how we use mixins to describe the encoding for a

SPARC ldsb instruction with two different encodings for the i field:

mixin reg_access {
  var bit i @ inst[13];
}

mixin direct_address extends reg_access {
  fun encode() { i = 0; }
}

mixin offset_access extends reg_access {
  fun encode() { i = 1; }
}

instruction class LDSB extends Format3
    uses direct_address, offset_access {
  fun encode() { op3 = 0b001001; }
}

We first define the mixin reg_access that provides the variable reference declaration for the i field (which incidentally resides at offset 13 in the instruction word inst). This mixin is then inherited by two sub-mixins: direct_address and offset_access. These two mixins define different encoding values for the i field (0 and 1 respectively). The LDSB instruction class then uses both mixins for two unique (non-conflicting) encodings defined over the i field.

This demonstrates the use of mixins for defining simple encoding alternatives. The real power of mixins comes to fruition when they are used to define "cross-cutting" semantics that permeate instruction definitions. For example, the ARM architecture defines a predicated instruction set—all instructions are conditionally executed. There are 4 condition code registers on the ARM that can be combined in a variety of ways to yield 13 different combinations used to conditionally execute an instruction. Using class inheritance alone would require each ARM instruction to have 13 different alternatives, each specifying a slight variation for the predication (not to mention the number of alternatives resulting from the shifter-operand addressing modes mentioned previously). For this reason, we use mixins to

define the semantics for each condition in isolation:

mixin Eq {
  fun encode() { cond = 0b0000; }
  fun bit condPassed() { return EQ; }
}

mixin Ne {
  fun encode() { cond = 0b0001; }
  fun bit condPassed() { return NE; }
}

mixin Ge {
  fun encode() { cond = 0b1010; }
  fun bit condPassed() { return GE; }
}

mixin Gt {
  fun encode() { cond = 0b1100; }
  fun bit condPassed() { return GT; }
}

mixin Le {
  fun encode() { cond = 0b1101; }
  fun bit condPassed() { return LE; }
}

mixin Al {
  fun encode() { cond = 0b1110; }
  fun bit condPassed() { return 1; }
}

Here we show 6 of the mixin definitions (to conserve space) for the ARM conditions.

Each mixin contains a condPassed method that describes the condition using the CISL condition code registers. A single bit is returned to indicate if the condition passed or failed. It is important to note that each mixin declares a method of the same name (e.g., condPassed). This allows the CISL compiler to generate multiple unique versions of the semantics, each referring to a different condPassed method. This is a use case similar to the one we described above for the encoding mixins.

The other use case for mixins is when multiple cross-cutting semantics are used within the same instruction. For example, in addition to the predicated execution of each instruction, the ARM data processing instructions define 8 shifter-operand addressing modes. The combination of condition and shifter-operand yields a cross-product of unique instruction semantics. For this reason, when multiple mixin "groups" are used in combination to define an instruction in CISL, a cross-product of unique instruction definitions is generated by the CISL compiler. Consider the following CISL class declarations using the ARM condition mixins and shifter-operand mixins:

class CondInstruction extends Instruction
    uses Eq, Ne, Ge, Lt, Gt, Le, Al {
  var big signed bit[4] cond @ inst[0:3];
}

class DataProcessingInstruction extends CondInstruction
    uses ImmediateShifterOperand, RegisterShifterOperand,
         LogicalShiftRightImm, LogicalShiftLeftImm,
         LogicalShiftRightReg, LogicalShiftLeftReg,
         ArithmeticShiftRightImm, ArithmeticShiftRightReg {
  var big signed bit[4] op @ inst[7:10];
  var big signed bit[1] s  @ inst[11:11];
  var rinx_t rn @ inst[12:15];
  var rinx_t rd @ inst[16:19];
}

// Arithmetic instructions
instruction class mov extends DataProcessingInstruction {
  fun encode() {
    op = 0b1101;
    inst[4:5] = 0b00;
  }
  fun effect() {
    if (condPassed()) {
      R[rd] = shifter_operand();
    }
  }
}

Each of the shifter-operand mixins defines a method named shifter_operand that defines the semantics for the addressing mode. The CondInstruction class uses the subset of condition mixins we illustrated above. The DataProcessingInstruction class extends the CondInstruction class and uses a subset of shifter operand mixins (see Appendix A for the complete definition). At this point, we have not made reference to the methods used in either of these mixin groups. Lastly, we define the mov instruction by inheriting DataProcessingInstruction and defining the effect method. The effect method makes use of both the condPassed and shifter_operand methods. Because the instruction class declaration for the ARM mov instruction references two different mixin methods, unique semantics for the cross-product of the mixins are generated.

This means that for each of the 13 condition mixins and 11 shifter operand mixins, a total of 13 × 11, or 143, unique instruction classes are generated by the CISL compiler. This

drastically reduces the effort required to describe an architecture. In addition, it is easy to make changes to classes or mixins to define architecture variants for exploratory research.

3.4 Summary

In this chapter we introduced the CISL language. We provided an overview of the CISL type system and emphasized its role in semantic matching. We presented the CISL syntax and semantics in detail and demonstrated its unique ability for describing the instruction semantics of compiler IRs and real architectures. We formally described the CISL operator set and the variety of statements that can be used to specify state change in a machine. To conclude our discussion, we show in Figure 3.3 and Figure 3.4 snippets of CISL “code” describing instructions for a number of architectures and compiler IRs respectively.

[CISL instruction snippets for ARM (Mvn), MIPS (AND_R), PDP-8 (ISZ), PowerPC (lwz), SPARC v8 (ADD_imm), CBE SPU (mpyu), and Intel IA-32 (MOV_R2RM_32).]

Figure 3.3: Architecture Instructions in CISL

[CISL bytecode and IR snippets: getfield_boolean for the Jikes RVM implicit and explicit Baseline descriptions, subacc for the PQCC IR, and ADDI4_C for the LCC IR.]

Figure 3.4: IR Instructions in CISL

CHAPTER 4

SUPPORTING INSTRUCTION SELECTOR DESIGN

4.1 Introduction

In the previous chapter we presented CISL, a language that is capable of specifying the semantics of instructions for compiler IR and target instruction sets. Describing IR instructions independently of a compiler framework has a number of advantages. At the very least, it clearly documents their meaning, perhaps to a much greater level of detail than what is provided with some compilers. More importantly, however, formal specification allows the design of the compiler and the implementation of the IR on a target machine to evolve independently. For example, adding a new IR operation to the system requires only adding an instruction class in a CISL description (of course, the compiler front-end must be capable of producing the new IR).

In addition to an IR description, it is important to support a range of design decisions that enable generation of instruction selector patterns with desired characteristics. Such design decisions can have a significant impact on the performance of the instruction selector and on the generated code. For example, the Jikes RVM Baseline compiler [Alpern et al., 1999a,b, 2000b, 2005] (“Baseline” hereafter) translates directly into machine code for a target machine. The initial PowerPC [May et al., 1994] implementation targeted the JVM operand stack [Lindholm and Yellin, 1999] explicitly, reserving a target register for the JVM stack pointer and generating instructions to manage the stack pointer based on the semantics of each bytecode. An improved version removed the explicit stack pointer and used direct offsets into the stack frame, thereby reducing the number of target instructions and improving the speed of the code. The important observation is that the meaning of the bytecodes and target instructions did not change, but the design decisions resulted in different instruction selector patterns.

In this work, we currently recognize two important categories of instruction selector design patterns: IR store mapping and instruction selector state. An IR description specifies IR-level memory spaces that are convenient for expressing the semantics of its instructions. One needs to map these memory spaces to memories on the target that target instructions can access. Without this mapping one cannot match IR to target trees. We use instruction selector state to take into account compile-time information that impacts the kinds of code generator patterns we find. GIST supports these design patterns with a store mapping and an instruction selector state specification, respectively. The rest of this chapter elaborates on how we describe instruction selector state and store mapping and how they provide a novel separation between compiler design decisions and GIST’s ability to take advantage of these choices to discover target sequences based on these decisions.

4.2 Instruction Selector State

While GIST focuses on a functional view of instruction selector patterns, i.e., that they are pure functions from IR instructions to target instruction sequences, we have observed that it is helpful to have some amount of instruction selector state visible to GIST. Above we mentioned the case of supporting an implicit stack pointer (top of stack). In the Jikes RVM Baseline compiler targeting the PowerPC, the implicit stack pointer is modeled as a Java variable named spTopOffset of type int:

// current offset of the sp from fp
public int spTopOffset;

Target sequences are generated with an “emit” method for each JVM bytecode defined by the JVM specification. Consider the Java code for the emit method corresponding to the isub bytecode:

protected final void emit_isub() {
  popInt(T0);
  popInt(T1);
  asm.emitSUBFC(T2, T0, T1);
  pushInt(T2);
}

This bytecode “pops” the top two integer values off the JVM working stack into temporary scratch registers T0 and T1 respectively (T0 and T1 are assigned to fixed registers on the PowerPC). The PowerPC instruction SUBFC is then used to perform the subtraction, storing the result into location T2 (also a fixed register), which is then “pushed” onto the stack. An examination of the Java code for popInt and pushInt reveals the generation of load and store instructions that use and manipulate the compile-time state variable spTopOffset:

private void popInt(int reg) {
  asm.emitLInt(reg, spTopOffset + BYTES_IN_STACKSLOT - BYTES_IN_INT, FP);
  spTopOffset += BYTES_IN_STACKSLOT;
}

private void pushInt(int reg) {
  asm.emitSTW(reg, spTopOffset - BYTES_IN_INT, FP);
  spTopOffset -= BYTES_IN_STACKSLOT;
}

The Java code generating the PowerPC load and store instructions uses spTopOffset to refer to the fixed offset into the frame. Lastly, in each of these cases, the offset is incremented or decremented to reflect the current top-of-stack location. Other parameters (e.g., BYTES_IN_STACKSLOT, BYTES_IN_INT) refer to hard-coded values that are assigned specific values according to the target architecture.

In CISL, we want to model the implicit stack pointer as instruction selector state, updated by each Baseline IR instruction according to the JVM specification’s model of the JVM working stack. Instruction selector state looks like processor (IR) state, modeled in CISL using CISL types, variables, etc. For example, the implicit stack pointer used by Baseline targeting PowerPC is defined by the CISL variable:

var word_t spTopOffset;

This definition is used like other instruction fields to describe the semantics for the JVM isub bytecode for implicit Baseline:

Listing 4.1: isub Bytecode Example
S[spTopOffset] = S[spTopOffset] - S[spTopOffset+1];

As previously mentioned, spTopOffset is used to index the stack rather than using a dedicated stack pointer register. This variable maps directly to the Java integer defined in the Baseline code. For this reason, it is unnecessary to model an update to spTopOffset in CISL, as it is updated by the Baseline Java code. The adapter (described in Chapter 7) used to translate GIST selector patterns into Java code for Baseline ensures that the generated emit methods perform the proper updating of spTopOffset. The resulting pattern discovered by GIST and generated by the Baseline PowerPC adapter is:

protected final void emit_isub() {
  asm.emitLWZ(T0, 4+spTopOffset, 1);
  asm.emitLWZ(T1, spTopOffset, 1);
  asm.emitSUBFC(T2, T1, T0);
  asm.emitSTW(T2, 4+spTopOffset, 1);
  spTopOffset += 4;
}

Because the adapter is translating instruction selector patterns generated by GIST, it is not necessary to use helper methods (e.g., popInt, pushInt) that aid the human implementors in defining hand-written emit procedures. At the same time, adapters are flexible enough to take advantage of pre-existing compiler infrastructure to generate code that can easily “plug in” to the existing instruction selector component. In this case, we use the scratch register definitions defined by Jikes RVM in place of PowerPC registers.1

1 In the Jikes RVM PowerPC Baseline compiler, T0 through T6 are defined as Java integers that are assigned to general purpose registers R3 to R9 reserved as scratch. The adapter could have easily substituted literal constants in place of these named variables; however, for debugging purposes we chose the symbolic temporaries.

We use a similar technique to model implicit Baseline’s allocation of JVM local variables, which are allocated slots in the stack frame. Specific bytecodes (e.g., aload) are used to load and store values to the local variables. Baseline tries to allocate PowerPC registers to as many local variables as possible. The location of each of the local variables is maintained in Java code by an array of integers (localFixedLocations). A positive value indicates a register and a negative value represents an offset in the frame. Both cases lead to different target sequences, so they must be modeled explicitly. We model the local variable array with an indexed store:

indexed LL[32][address_t]

The location is a parameter to the bytecode and is modeled differently for the register and frame offset cases. Consider the ret bytecode, which branches to the address contained in a local variable indicated as part of the bytecode. We model ret using two CISL instruction classes, illustrated in Listing 4.2. ret_reg defines the register semantics by loading the PC with the value stored in LL. We use a store map rule (described in Section 4.3) to implement LL using the general purpose registers on the PowerPC. ret_mem is the alternate semantics that sets the PC with a value stored at an offset in the frame. Both instructions define a location variable: the first is used as a register index, the second as a frame offset. Note that both definitions are required because Baseline has two distinct implementations of the ret bytecode. This is not a deficiency of CISL; rather, it is how Baseline implements the JVM specification, and it illustrates CISL’s ability to describe complicated compiler state.

Listing 4.2: ret Bytecode Example
instruction class ret_reg extends ByteCode {
  var big unsigned bit[5] location;
  fun effect() {
    PC[0] = LL[location];
  }
}

instruction class ret_mem extends ByteCode {
  var word_t location;
  fun effect() {
    PC[0] = S.slot[location-4];
  }
}

It is important to note that although the CISL language is capable of describing external instruction selector state, it does not require special syntax or notation; it looks just like ordinary CISL code. The connection between CISL definitions of selector state and the internals of the compiler is made externally by the adapters. This separation of concerns is important because it does not restrict the re-use of the IR description, allowing it to be used in other compiler frameworks. Because the state is typically a variable or array name in the code generator implementation (e.g., location, LL), the adapter often performs little or no work during the process of generating the instruction selector component.

4.3 Mapping the Store

In addition to the instruction selector state decisions described in the last section, we must consider how we implement IR memories on the target. Similar to decisions regarding instruction selector state, IR memory implementation can impact the possible target sequences that GIST might discover for the IR. The operators of CISL have well defined meanings that allow us to match operator to operator. Memories, however, come in many shapes and sizes. Because of that, mapping from the IR’s view of memory onto a target can also take multiple forms, each having the potential to impact the overall performance of generated code. For this reason, it is not only important that we are able to specify a store mapping from a code generator design perspective, but necessary if we are going to be matching IR and target semantics to determine equivalence. Further, it is an important design decision, so it is not obvious that we want to automate it entirely.

Consider the case of generating an instruction selector for Baseline targeting the ARM architecture.2 The JVM has a memory for the stack and a memory for the heap, and the two are disjoint. The ARM, however, has only a single flat memory space and a set of general purpose registers. Mapping instructions from the JVM to the ARM is thus complicated by the fact that values may be retrieved from different locations yet the semantics of these instructions have the same overall effect. In the simplest case, we designate two (implicitly) disjoint regions of ARM memory to be reserved for the JVM heap and stack.

2 A platform that is not currently supported in Jikes RVM.

[Figure 4.1 depicts the JVM memories (Heap and Stack) mapped onto the ARM memories: the single flat Memory and the registers R0, R1, R2, ..., SP (R13), R14, and PC (R15).]

Figure 4.1: JVM to ARM Store Mapping

In addition, we can specify the JVM stack pointer, SP, as mapped onto register 13 on the ARM (in fact, this is an ARM convention). Alternatively, we could reserve a set of n registers that correspond to the top n locations on the stack for faster access. When the number of values on the stack is greater than n, values would be spilled to memory at some well known location. Figure 4.1 illustrates some of the memory mappings we use for the simple case. The distinguishing feature of store mapping is that a mapped instruction has the same effect on the same values, but the locations of those values and how the instruction retrieves them are different. Thus, before we can determine the sequence of instructions on the ARM that is semantically equivalent to the JVM isub instruction (Listing 4.1), we must first view the semantic description of that isub in terms of the memories of the ARM. In other words, we must have a correspondence or mapping between the specific locations used by the JVM isub instruction and a subset of locations found on the ARM.

To enable the GIST user to control the mapping between the source IR and target machine stores, GIST supports the explicit specification of which memory locations defined by the target machine implement each memory of the source IR. This specification is called a store map schema and represents a specific implementation choice by the compiler implementor. A store map schema allows considerable flexibility for the implementor of an instruction selector. For example, one could specify a simple mapping quickly to start producing a working code generator immediately, whereas a more complicated mapping may require more effort but yield more efficient generated code.

Listing 4.3: Explicit Baseline to PowerPC
map baseline to ppc {
  S.slot[T[$x]+$y] => M.word[R[‘$x+3‘]+$y] if $x <= 7;
  S.slot[SP+$x] => M.word[R[8]-‘$x*4‘];
  LL[$x] => R[$x];
  ...
}

Listing 4.4: Implicit Baseline to PowerPC
map baseline to ppc {
  S.slot[T[$x]+$y] => M.word[R[‘$x+3‘]+$y] if $x <= 7;
  S.slot[spTopOffset+$x] => M.word[spTopOffset-‘$x*4‘];
  LL[$x] => R[$x];
  ...
}

Listing 4.5: Top Three Register Baseline to PowerPC
map baseline to ppc {
  S.slot[T[$x]+$y] => M.word[R[‘$x+3‘]+$y] if $x <= 7;
  S.slot[SP+$x] => M.word[SP-‘($x-3)*4‘] if $x > 2;
  S.slot[SP+$x] => R[‘$x‘] if $x <= 2;
  LL[$x] => R[$x];
  ...
}

A store map schema is a collection of rules that rewrite CISL store expressions that reference IR memories in an IR description into store expressions that reference target memories. In addition, rules can be applied conditionally based on matched values in the IR semantics, and they support an evaluation mechanism to generate relevant offsets on the target. The general form of a store map schema is captured by the following grammar:

schema  →  map ir-name to isa-name { rules }
rules   →  rule ; rules
        |  λ
rule    →  ir-cisl => isa-cisl pred
pred    →  if condition

In addition to CISL patterns, expressions in a store map schema can use pattern variables to match subtrees. A pattern variable looks like a regular CISL identifier, but is prefixed with $. If the right-hand side of a rule uses a variable $x, then the left-hand side must also use the variable $x. Variables can be used to map IR subtrees to ISA subtrees, or they can be used in an evaluation expression to generate new trees. An evaluation expression is specified between two back quotes: ‘ evalexp ‘ and can occur only as part of the right-hand side ISA pattern. The evalexp is an expression that is evaluated when a match is successful, and its result is used to construct the ISA memory reference. For example, the simple rule:

T[$x]+4 => R[‘$x+3‘], contains the expression (‘$x+3‘) that is evaluated if a match is successful. If the IR description contained the CISL code T[3]+4, then $x would bind to 3 and the expression 3+3 would be evaluated to produce R[6] for the target memory.

The use of evaluation expressions occurs most often when we wish to map IR locations to a pre-determined range of target locations or for computing offsets that depend on particular memory widths (i.e., converting references in words into byte offsets).

Lastly, a rule can be conditionally applied as indicated by pred in the store map grammar above. The condition is a simple boolean function used to determine if a rule is applicable with respect to a successful match. The use of a rule predicate occurs most often when we want a rule to apply only to a range of store locations. For example, if we wanted to constrain the rule we defined in the previous paragraph to locations that are less than or equal to 7, we would specify the rule as: T[$x]+4 => R[‘$x+3‘] if $x <= 7. Simple comparisons can be combined to form complicated conditions using &&, ||, and !.
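To illustrate how such a rule behaves, the following sketch (hypothetical helper code, not part of GIST) applies the rule T[$x]+4 => R[‘$x+3‘] if $x <= 7 to a concrete IR reference: it binds the pattern variable $x, checks the predicate, evaluates the back-quoted expression, and builds the target reference:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StoreMapRuleDemo {
    // Matches IR references of the shape T[<n>]+4 and captures <n> as $x.
    private static final Pattern LHS = Pattern.compile("T\\[(\\d+)\\]\\+4");

    // Applies the rule T[$x]+4 => R[`$x+3`] if $x <= 7, returning null
    // when the left-hand side or the predicate does not match.
    static String apply(String irRef) {
        Matcher m = LHS.matcher(irRef);
        if (!m.matches()) {
            return null;                         // left-hand side does not match
        }
        int x = Integer.parseInt(m.group(1));    // bind pattern variable $x
        if (x > 7) {
            return null;                         // predicate "if $x <= 7" fails
        }
        int reg = x + 3;                         // evaluation expression `$x+3`
        return "R[" + reg + "]";
    }

    public static void main(String[] args) {
        System.out.println(apply("T[4]+4"));     // prints R[7]
        System.out.println(apply("T[9]+4"));     // prints null (schema error case)
    }
}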

In Listings 4.3, 4.4, and 4.5 we show three different store map schemas that translate memories from the JVM bytecode to PowerPC memories. Each captures a different implementation decision for mapping the JVM stack to the PowerPC target. The first rule is the same across all implementations and maps a stack access on Baseline to a memory access on PowerPC. The reference to the Baseline memory T[$x] in this rule refers to eight local variables defined in Baseline’s code, as we mentioned previously. We observe that this mapping has a guard, if $x <= 7, to restrict the value of the pattern variable $x, and the expression ‘$x+3‘ (evaluated during the mapping process). Thus, the reference T[4] in Baseline corresponds to register R[7] on PowerPC, whereas a reference to T[8] would not apply in this case and would result in a schema error during the mapping process.

The second rule defines three different implementations of the stack pointer for accessing the JVM working stack. The rule in Listing 4.3 for explicit Baseline maps a stack access using the stack pointer register SP plus an offset ($x) to the reserved register R[8], subtracting the expression ‘$x*4‘. As mentioned above, the JVM defines offsets in terms of cells whereas the PowerPC expects bytes. Thus, this rule can easily translate the CISL semantics SP+2 into R[8]-8. In Listing 4.4, the second rule for implicit Baseline maps a stack access using the compiler state variable spTopOffset, carefully adjusting the offset for byte access. Lastly, in Listing 4.5, the second and third rules define a mapping scheme that observes an implementation where the top three values on the stack are stored in registers R[0], R[1], and R[2]. In this case, we access the stack memory when the offset ($x) is greater than 2; otherwise, the offset is used to index the appropriate register. The last rule in each of these examples shows a mapping from the local variable memory LL to any general purpose register on the PowerPC.

4.4 Summary

The emphasis of this work is on automating instruction selector construction. However, full automation can lead to fixed design decisions, making it difficult to use in other compiler frameworks and limiting its use of target resources that could enable clever performance enhancements. The lack of emphasis on offering design choice in prior work has limited the practicality, portability, and use of those systems. In this chapter we proposed the idea that supporting a degree of design flexibility, at the expense of full automation, is important to improve the portability of the approach and provides a mechanism for generating instruction selectors that take better advantage of the compiler implementation and target machine. We identified two important design patterns highlighting implementation decisions that emphasize instruction selector state and the representation of IR memories on the target. Instruction selector state is specified in a compiler independent fashion using CISL types and field declarations. In CISL, these fields look like other fields and are viewed as instruction parameters. This approach is transparent to GIST and its procedure for discovering target sequences. Adapters are used to bridge the state fields into compiler specific code (we present examples in Chapter 7). The representation of IR memories on the target is described by a set of rules called a store map schema. These rules define a specific implementation of IR memories in the target memory space. They conditionally match IR memory reference patterns and produce reference expressions using target memories. Store mapping leverages target resources to improve the performance and portability of generated code, whereas selector state improves performance by taking advantage of the implementation of the instruction selector.

CHAPTER 5

CANONICALIZATION

5.1 Introduction

GIST first converts CISL descriptions to an annotated abstract syntax tree (AST) representation. To give description writers flexibility and greater convenience, the language does not restrict the description of instruction semantics. As a result it is possible to specify identical semantics in different but algebraically equivalent ways. Of course, this requires the search for instruction selector patterns to consider different forms of the same semantics, making our problem significantly more difficult.

Prior work solved this by applying semantics-preserving transformations on trees during the search for instruction selector patterns. For example, if the search did not find a match for the source expression a + b, it would apply the commutativity of addition axiom to get b + a and try to match that. This approach introduces extra work during search. Because rules such as commutativity and associativity are non-terminating, they can easily introduce cycles in a search path. To alleviate the problem, search must maintain state to determine which non-terminating rules it has already applied and to which trees. In contrast, GIST performs a transformation step before the search to rewrite source and target instruction patterns into a canonical form, reducing, but not eliminating, the need to apply axioms during search. (We discuss that in Chapter 6.)

Axioms, used as rewrite rules to transform a tree into a normalized form, constitute a term rewriting system (TRS) [Baader and Nipkow, 1999]. In a TRS, one can reach a normal or canonical form for every expression only if the rule set is terminating and confluent. Although most of the axioms we apply to instruction patterns are easy to verify by inspection, we must still handle the important cases of commutativity and associativity. For this reason, GIST’s canonicalization process occurs in two steps: first, normalize instruction patterns using a subset of rules that are terminating and confluent; second, use expression reassociation to handle those algebraic properties that are potentially non-terminating.

5.2 Axiom Normalization

A central component of GIST is to syntactically match semantic definitions of IR to target instructions to determine equivalence. We compare each subtree of an IR pattern to the set of target instructions to find an implementation of that IR on the target. If a match fails on a subtree, we conclude that an implementation of that subtree on the target does not exist. The ability to write CISL descriptions in many different, yet semantically equivalent, ways has a significant impact on our ability to perform this syntactic matching process. For example, consider the following IR and target CISL statements:

IR:     if (!(field > 0)) { ... }
Target: if (field <= 0) { ... }

The IR semantics describe a conditional that tests the negation of the condition field > 0, whereas the target semantics describe a conditional that tests if field is less than or equal to 0. Both semantics mean the same thing, yet a straightforward matching procedure would discover that they are syntactically different, thus concluding that they represent different semantics.

Differences can also arise when the target machine does not support the semantics of an IR instruction directly. For example, imagine we are trying to match an IR instruction, performing a basic subtraction operation, against a minimalistic instruction set such as the PDP-8. Unfortunately, the PDP-8 does not provide a subtraction instruction, making it impossible for a match to be found; yet it is certainly possible to represent subtraction in terms of addition and negation1 as shown in the following identity:

X - Y ≈ X + (-Y)

This identity is used as a rewrite rule to translate a subtraction operation on two subtrees, X and Y, into an addition of X to the negation of Y. Using identities as rewrite rules allows IR pattern trees to be matched and rewritten into different, yet semantically equivalent, forms, allowing target instructions to be matched that would otherwise not have been possible.

Prior work emphasized the use of rewrite rules during the search process. For example, in Cattell’s thesis work [Cattell, 1978] he applied semantics preserving axioms when the search process was “stuck” (i.e., a matching target instruction could not be found). In this approach, axioms were used in a bi-directional manner: both the left- and right-hand side of a rule could be considered to rewrite an IR subtree. Although axioms define legal moves in the search space, and search finds the correct combination of axiom applications, the process has the potential to be combinatorially explosive. Consider a rule defining commutativity of addition (a legal axiom in Cattell’s work):

X + Y ↔ Y + X

This rule is non-terminating, i.e., it can be applied ad infinitum, producing cycles in the search path. To counteract this effect, search must maintain state to recognize which axioms have been applied, the application frequency, and/or the depth of the search tree, to prune potentially problematic paths. Although Cattell found this approach to work in most cases, he was dealing with relatively simple semantic trees. In particular, his target instruction semantics were written in terms of the IR, had identical addressing modes, and used only one or two operators in a given tree. In this work, we are dealing with addressing modes that may not be represented exactly on the target, and IR trees involving complicated semantics (e.g., JVM bytecode). Indeed, in a preliminary investigation we attempted Cattell’s approach, which led to long search times that often resulted in a failed search.

1 In fact, the PDP-8 does not even have a negate instruction, requiring additional translation leveraging rules relying on the underlying properties of two’s complement arithmetic.

To fix this problem and benefit from semantics-preserving axioms, GIST applies axiom normalization. Axiom normalization leverages techniques from term rewriting systems to reduce semantic trees to a canonical form. This process departs from earlier work in that the normalization procedure is applied before search and to both IR and target semantic trees. Thus, the objective is to use rewrite rules to impose a standard shape on semantic trees as a preprocessing step before search to eliminate differences in specification, thereby improving the matching process. To show how this works, we describe CISL semantics as terms (a mathematical structure used by a TRS) and describe the necessary properties required by our term rewriting system to produce normal forms: termination, confluence, and admissibility.

5.2.1 Terms and Equations

The fundamental structure used by a term rewriting system is a term. Formally, consider a non-empty set of variables X and a set of operators Σ, where each operator f ∈ Σ has an arity n, denoted by f(n). Each operator of arity i is contained in the set Σ(i), and the operators with arity 0, called constants, are denoted by Σ(0). A term is defined as follows:

1. A variable v ∈ X is a term

2. A constant c ∈ Σ(0) is a term

3. If t1, ..., tn are terms and f ∈ Σ(n), then f(t1, ..., tn) is also a term.
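As a concrete illustration, a term in this sense can be encoded directly; the sketch below is our own minimal representation (not GIST’s internal AST classes), with variables, constants, and operator applications:

import java.util.Arrays;
import java.util.List;

// A minimal term representation: a variable, or an operator applied to
// zero or more subterms (an operator of arity 0 is a constant).
abstract class Term { }

class Var extends Term {
    final String name;
    Var(String name) { this.name = name; }
    @Override public String toString() { return name; }
}

class App extends Term {
    final String op;          // operator symbol from the signature Σ
    final List<Term> args;    // arity = args.size()
    App(String op, Term... args) { this.op = op; this.args = Arrays.asList(args); }
    @Override public String toString() {
        return args.isEmpty() ? op : op + args;
    }
}

class TermDemo {
    public static void main(String[] args) {
        // The CISL expression M[x] viewed as the term s(M, x).
        Term t = new App("s", new App("M"), new Var("x"));
        System.out.println(t);   // prints s[M, x]
    }
}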

Given this definition, it is clear that we can represent CISL semantics as terms. CISL binary and unary operators belong to the sets Σ(2) and Σ(1) respectively. Memory indexing can be viewed as a binary operation s that takes a memory m and an index i and evaluates to the location’s value: i.e., the CISL code M[x] is identical to the term s(M,x). In addition, integer and floating-point literals and field names are represented by term constants. Lastly, variables extend the CISL syntax to allow arbitrary subtrees to be referred to in rewrite rules.

Now let T(X,Σ) be the set of all possible terms, and let ti and tj be elements of T(X,Σ). We can define a set of equations E over T such that ti is congruent to tj if they can be made syntactically identical through equational rewriting and variable substitutions using equations in E. In this work, C = T(X,Σ) are those terms that represent valid CISL semantic definitions and E is a set of equations that define semantics-preserving axioms. To make this clear, consider again the following rule for subtraction:

X - Y ≈ X + (-Y)

In this example, X and Y are variables and + and - are operators in Σ(2) and Σ(1) written in infix notation. By our definitions, this rule is an equation in E over the terms defined for CISL extended with variables. Now assume that tIR = 4 - 5 and tTarget = 4 + (-5), where tIR is a subtree in a CISL definition for an IR instruction and tTarget is a subtree found in a target instruction. Then, tIR is congruent to tTarget precisely when X is 4 and Y is 5.

Showing that two terms are congruent is identical to determining if two CISL subtrees are equivalent. Thus, given the set of terms defining CISL semantics, C, and a set of equations E(C) that define semantic equivalence in C, and the equality predicate =, is tIR = tTarget true in our equational model E(C)? This problem is often referred to as the word problem in an equational theory, and term rewriting systems provide a framework for determining its solution.

5.2.2 Term Rewriting Rule of Inference

To determine if two semantic definitions are congruent we require a reasoning procedure that will allow us to infer equivalent semantics. A term rewriting system provides a single inference process that allows us to accomplish this task. A term rewriting system R is a finite set of rewrite rules, where each rewrite rule is a directed equality of the form l → r, where l and r are terms in T(X,Σ) and l > r according to some defined ordering on terms.

A rewrite rule l → r is applicable to some term t ∈ T(X,Σ) if there is an occurrence of a subterm l in t. The process of applying a rewrite rule x to t is called a reduction on t using x. This process can be more formally stated by the following: let t(e) be the term we are trying to rewrite, where e is a distinguished subterm of t. The term rewriting rule of inference is:

    t(e)    l → r
    -------------
        t(σr)

where σ is a most general substitution such that σl ≡ e.

That is, the left-hand side l of the rule is matched against the subterm e and this subterm is replaced by the right-hand side r. This process of matching e and l is a restricted form of unification known as one-way matching (as only one term contains variables) and the substitution σ is called a unifier of e and l. In other words, the unifier is a mapping of the variables in l to terms in e. This mapping is then applied to the right-hand term r of the rule to produce σr. If no such mapping σ exists for any x ∈ R applied to a term t, the term is said to be irreducible in R. That is, t is irreducible if there are no rules in R that can reduce t. The notation ti =⇒R tj denotes that a term ti is reducible to tj by one or more rewrite rule applications in a term rewriting system R. To illustrate this deduction mechanism, consider the following set of rewrite rules Nat:

(R1) x + 0 → x
(R2) x + s(y) → s(x + y)

These rules axiomatize addition on the natural numbers, where 0 is a constant in Σ(0), x and y are variables in X, s and + are operations in Σ having arity 1 and 2 respectively, and, for all rules l → r, l and r are terms in T(X,Σ).

s(0) + s(s(0))  →(R2)  s(s(0) + s(0))  →(R2)  s(s(s(0) + 0))  →(R1)  s(s(s(0)))

Figure 5.1: Reduction of the term s(0) + s(s(0)) in Nat.

[Figure 5.2 shows two reduction paths for f(A + B) in Con: applying (R1) gives f(1 + B) and applying (R2) gives f(A + 0); both reduce to f(1 + 0), which (R3) rewrites to f(1) + f(0).]

Figure 5.2: Reduction of the term f(A + B) in Con.

Using these definitions, we can represent the natural numbers 1 and 2 as the terms s(0) and s(s(0)) and the addition of 1 and 2 as the term t1 = s(0)+s(s(0)). Using rules (R1) and (R2) and the rewriting rule of inference described above, we show the reduction of t1 ∈ Nat in Figure 5.1. This form of reduction characterizes a form of computation. That is, through successive rule applications in Nat, we deduce that s(0)+s(s(0)) =⇒Nat s(s(s(0))) or, in a more conventional notation, that 1+2 =⇒Nat 3. Thus, the rules of Nat axiomatize the natural numbers (represented by the successor function s) and addition. The reduction process emulates addition, producing a term that represents the result of adding the numbers 1 and 2. The resulting number 3 can be considered a normal form of the expression 1 + 2 when expressed appropriately.
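The computational reading of Nat can be made concrete with a small stand-alone sketch (our own encoding, not GIST code) that represents naturals with a successor constructor and applies (R1) and (R2) as a recursive reduction:

// Natural numbers built from zero and the successor operation s,
// with addition reduced by the rules of Nat:
//   (R1) x + 0    -> x
//   (R2) x + s(y) -> s(x + y)
public class NatDemo {
    interface Nat { }
    record Zero() implements Nat { }
    record S(Nat pred) implements Nat { }

    // Repeatedly applies (R1) and (R2) until the term is irreducible.
    static Nat add(Nat x, Nat y) {
        if (y instanceof Zero) {
            return x;                       // (R1)
        }
        S s = (S) y;
        return new S(add(x, s.pred()));     // (R2)
    }

    static int toInt(Nat n) {
        return (n instanceof Zero) ? 0 : 1 + toInt(((S) n).pred());
    }

    public static void main(String[] args) {
        Nat one = new S(new Zero());
        Nat two = new S(new S(new Zero()));
        // Reduces s(0) + s(s(0)) to s(s(s(0))), i.e., 1 + 2 => 3.
        System.out.println(toInt(add(one, two)));   // prints 3
    }
}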

Although the rules for Nat always terminate with a unique normal form, is this the case for all term rewriting systems? That is, does the application of rules in a term rewriting system always terminate and consistently produce the same reduced term? Consider the following set of rules, Con:

(R1) A → 1
(R2) B → 0
(R3) f(x + y) → f(x) + f(y)

Given the term f(A + B) we apply the rules (R1)-(R3) to show in Figure 5.2 that two reduction paths are possible and each path terminates with the same normal form. We use this example to illustrate two fundamental properties of term rewriting systems:

Termination Is it always possible, given a set of rules, that the process of applying those rules will terminate? For the set of rules in Con this is certainly the case. What happens if we include the following commutativity-of-addition rule in Con:

(R4) x + y → y + x

We now have a situation where (R4) can be applied infinitely many times and the reduction process will never terminate. Furthermore, it is easy to imagine a case where non-termination is the result of a subset of rules that work together to produce an infinite chain of reductions.

Confluence Figure 5.2 demonstrates the possibility for different reduction paths to lead to a common term. More specifically, given the term t it was possible to derive two different terms t1 and t2, and those terms could be joined to derive the term s. This property of term rewriting systems is known as confluence. The question then is, is it possible for these reduction paths to lead to a situation where t1 and t2 cannot be joined? More formally, given some term t and the reductions t =⇒R t1 and t =⇒R t2 for some term rewriting system R, is it possible that t1 =⇒R s and t2 =⇒R s′ such that s ≠ s′? While Con is certainly confluent, consider the addition of the following rule:

(R5) x + 0 → x

This leads to two different reductions that result in different terms that cannot be joined, as depicted in Figure 5.3.

[Figure 5.3 shows the two reduction paths for f(A + B) once rule (R5) is added: one path applies (R1), (R2), and (R3) to reach f(1) + f(0), while the other applies (R2) and then (R5) to reach f(0); the two results cannot be joined.]

Figure 5.3: f(1) + f(0) and f(0) cannot be joined.

If a term rewriting system R is both terminating and confluent then R is said to be a canonical term rewriting system. This is not only interesting from a theoretical standpoint but also has practical implications. In this work it is critical that the application of rules that define equivalence of instruction semantics terminate and that the terms resulting from a chain of reductions lead to a unique canonical form. This means that CISL semantics that are expressed differently, but mean the same thing, will be reduced to the same normal form given a set of semantics-preserving rewrite rules that constitute a canonical term rewriting system. A well known procedure, known as completion, allows us to determine if a given TRS is terminating and confluent. If the rule set is not confluent, the completion procedure will try to generate a new set of confluent rules or fail. If the rule set is non-terminating, the procedure does not terminate [Baader and Nipkow, 1999]. Fortunately, the axioms we use to canonicalize CISL pattern trees are simple and can be verified easily by inspection (see Figure 5.4). Although termination and confluence are restrictive in the kinds of rules we may include, they allow normal forms to be guaranteed. As it turns out, for our purposes, we must further restrict an axiom to be admissible so we do not exclude valid target sequences during search; we discuss axiom admissibility in the next section.
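To give a feel for what such a normalization pass looks like, the sketch below (our own toy encoding, not GIST’s rewriter) applies two semantics-preserving boolean rewrites, double-negation elimination and a De Morgan rewrite, repeatedly until a fixpoint is reached:

public class NormalizeDemo {
    // Tiny boolean-expression terms.
    interface Expr { }
    record VarE(String name) implements Expr { }
    record Not(Expr e) implements Expr { }
    record And(Expr l, Expr r) implements Expr { }
    record Or(Expr l, Expr r) implements Expr { }

    // One bottom-up pass applying two rewrite rules:
    //   !(!X)     -> X
    //   !(X && Y) -> !X || !Y
    static Expr rewrite(Expr e) {
        if (e instanceof Not n) {
            Expr inner = n.e();
            if (inner instanceof Not nn) return rewrite(nn.e());
            if (inner instanceof And a)
                return new Or(rewrite(new Not(a.l())), rewrite(new Not(a.r())));
            return new Not(rewrite(inner));
        }
        if (e instanceof And a) return new And(rewrite(a.l()), rewrite(a.r()));
        if (e instanceof Or o)  return new Or(rewrite(o.l()), rewrite(o.r()));
        return e;   // variable
    }

    // Applies the pass until a fixpoint; with a terminating and confluent
    // rule set this always stops at the unique normal form.
    static Expr normalize(Expr e) {
        Expr prev;
        do { prev = e; e = rewrite(e); } while (!e.equals(prev));
        return e;
    }

    public static void main(String[] args) {
        Expr a = new VarE("a"), b = new VarE("b");
        // !(a && !b) normalizes to !a || b, matching the boolean axiom table.
        Expr t = new Not(new And(a, new Not(b)));
        System.out.println(normalize(t).equals(new Or(new Not(a), b)));  // prints true
    }
}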

5.2.3 Axiom Admissibility

Axioms can be used to describe a wide range of semantic equivalences. In the case of canonicalization, we must take care to define rules that do not violate a canonical term rewriting system or exclude potential target sequences. We call an axiom that satisfies these conditions an admissible axiom. Although a complete TRS can be guaranteed using a completion procedure, our current rule set for canonicalization is straightforward and can be verified by inspection. To ensure an axiom does not exclude target sequences we verify that the following properties do not exist in its definition: constant introduction, constant elimination, and operation elimination by constant. Consider the following axiom that introduces a constant on the right-hand side of the rule:

R[X] → R[X] + 0

This rule transforms all R memory references into equivalent references adding 0. The first problem we notice is that the axiom is non-terminating. If we ignore this issue, however, we observe that by introducing the constant 0 on the right-hand side we are unable to find target patterns that reference just the memory R. In the case of the MIPS architecture, this means we could not match the SW (store word) instruction directly; rather, we would first require an ADDI (add immediate) to produce a result into a sole register and could then perform the store. This excludes the more efficient (and sensible) sequence of just SW; thus the axiom is inadmissible. We discuss more details of the matching process in Section 6.5.2. Alternatively, consider the elimination of a constant:

R[X] + 0 → R[X]

In this case, the rule transforms the addition of the memory reference R by 0 into just the reference to R. This would exclude the possibility of finding a target sequence requiring the addition operation. This is exactly the case when dealing with minimal instruction sets such as the PDP-8. In Section 6.5 we provide an example that demonstrates exactly this case, causing a target sequence to not be found. By the same argument, the following axiom eliminating the multiplication operation by a constant is also inadmissible:

X × 0 → 0

This axiom reduces a multiplication by 0 to just the constant 0. This unnecessarily restricts the number of target sequences we could find on the target.

In Figure 5.4 we show examples of admissible axioms used by canonicalization. Most of these rules do not contain constants and the others clearly satisfy the conditions for admissibility. The majority of these axioms emphasize algebraic properties for boolean and relational operations and provide only a few axioms for arithmetic. This makes sense, as most arithmetic laws are unacceptable either for a complete TRS (e.g., commutativity, associativity) or for the additional conditions we impose for admissibility. It is important to emphasize that the rules we chose for canonicalization are motivated by the need to generate normalized forms for both the IR and ISA pattern trees. In particular, they focus on simplifying trees in a way that enables expression reassociation (discussed in Section 5.3) to reorder subtrees using laws of commutativity and associativity. For example, a CISL expression of the form X-Y+Z can be rewritten to X+(-Y)+Z using the canonicalization axiom X-Y → X+(-Y), allowing it to be reordered by commutativity of addition. Thus, our admissible axioms are specifically chosen to work in conjunction with our expression reassociation procedure to improve the overall effectiveness of the normalization process. The use of inadmissible axioms is still important for finding target sequences; they are used to rewrite IR trees during heuristic search. However, the effectiveness of canonicalization reduces their importance.

Boolean Axioms:
  !(!X) → X
  !(X && !Y) → !X || Y
  !(X || !Y) → !X && Y
  !(X || Y) → !X && !Y
  !(X && Y) → !X || !Y
  !(!X || !Y) → X && Y
  !(!X && !Y) → X || Y
  X && X → X

Arithmetic Axioms:
  -(-X) → X
  -0 → 0
  X - Y → X + (-Y)

Relational Axioms:
  X == Y || X > Y → X ≥ Y
  X == Y || X < Y → X ≤ Y
  !(X == Y) → X ≠ Y
  !(X > Y) → X ≤ Y
  !(X < Y) → X ≥ Y
  !(X ≤ Y) → X > Y
  !(X ≥ Y) → X < Y

Bitwise Axioms:
  ~(~X) → X
  ~(X & ~Y) → ~X | Y
  ~(X | ~Y) → ~X & Y
  ~(X | Y) → ~X & ~Y
  ~(X & Y) → ~X | ~Y
  ~(~X & ~Y) → X | Y
  ~(~X | ~Y) → X & Y
  X | X → X

Figure 5.4: Admissible Axioms

[Figure 5.5 depicts the expressions a + (b + c) and (c + b) + a as expression trees; although equivalent by commutativity and associativity, the two trees have different shapes.]

Figure 5.5: Commutativity/Associativity Example

108 5.3 Expression Reassociation

The goal of GIST is to find target sequences that are semantically equivalent to a compiler IR instruction. This is done by matching the structure of an IR pattern tree to target pattern trees. Trees that are structurally the same are semantically equivalent. This is problematic when a binary operation is commutative or associative. Consider the two expression trees shown in Figure 5.5. By the laws of commutativity and associativity of addition these two arithmetic expressions are equivalent. Below each written expression is its structural representation as a tree. Comparing these trees node by node would indicate that they are not the same. We use expression reassociation, a technique based on global reassociation used to improve partial redundancy elimination, to order the child nodes of commutative/associative operations to produce trees that are structurally identical.

As mentioned previously, CISL semantic descriptions are represented internally as trees.

Each node in the tree is associated with a kind that identifies a statement or operation (e.g., sequence, if, +, -). Each node is also assigned a rank. To re-orient tree nodes we perform the following steps:

1. Compute a rank for every node in the tree

2. Reassociate expressions by sorting their operands

3. Identify compiler constants

The next three sections describe these steps in more detail. To help understand the procedure we provide a running example illustrating each step. In Figure 5.6 we show the CISL code for the Baseline iadd instruction (as discussed in Section 3.3) and its translation into the internal GIST pattern tree.

5.3.1 Computing Ranks

To guide reassociation, our first step is to rank each tree and subtree. Given a GIST pattern tree we visit the nodes bottom-up assigning ranks using the following rules:

S.slot[direct(spTopOffset + 4)] = S.slot[direct(spTopOffset)] + S.slot[direct(spTopOffset + 4)];

CISL Syntax

(lset (lget S slot (+ spTopOffset 4)) (+ (lget S slot spTopOffset) (lget S slot (+ spTopOffset 4))))

GIST Pattern Tree

Figure 5.6: CISL Baseline iadd Instruction

1. Constants receive rank 0.

2. Instruction parameters receive rank 1.

3. Memory references receive rank 2.

4. An expression tree receives a rank equal to its highest ranked operand.

5. All other nodes are statements and do not receive a ranking.

In Figure 5.7 we show the rank assignment for our example pattern tree. All appearances of the instruction parameter spTopOffset get rank 1, the constant 4 gets rank 0, the + operation used to index the stack, S, has rank 1 (equal to its highest ranked operand spTopOffset), the lget memory references receive rank 2, and the final + performing the actual addition gets rank 2.
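A minimal sketch of this ranking pass, using a simplified node representation of our own rather than GIST’s internal classes, might look like the following:

import java.util.ArrayList;
import java.util.List;

// Simplified pattern-tree node: a kind (e.g., "+", "lget", "const", "param"),
// optional children, and a rank filled in by the ranking pass.
class Node {
    final String kind;
    final List<Node> children = new ArrayList<>();
    int rank = -1;   // statements keep rank -1 (unranked)

    Node(String kind, Node... kids) {
        this.kind = kind;
        for (Node k : kids) children.add(k);
    }
}

public class RankDemo {
    // Assigns ranks bottom-up: constants 0, parameters 1, memory references 2,
    // expressions take the highest rank of their operands.
    static void rank(Node n) {
        for (Node c : n.children) rank(c);
        switch (n.kind) {
            case "const" -> n.rank = 0;
            case "param" -> n.rank = 1;
            case "lget"  -> n.rank = 2;
            case "lset"  -> n.rank = -1;   // statement: unranked
            default -> {                   // expression operator such as "+"
                int max = 0;
                for (Node c : n.children) max = Math.max(max, c.rank);
                n.rank = max;
            }
        }
    }

    public static void main(String[] args) {
        // (+ spTopOffset 4): a parameter plus a constant, as in the iadd example.
        Node expr = new Node("+", new Node("param"), new Node("const"));
        rank(expr);
        System.out.println(expr.rank);   // prints 1
    }
}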

5.3.2 Operand Sorting

Given the ranked trees, we are almost in a position to reassociate expressions. Unlike global reassociation, we are not required to rewrite expressions of the form z − y + x into x + (−y) + z because our axiom normalization process takes care of this. We do rewrite binary trees for associative operations of the form (+ x (+ y z)) into their n-ary equivalent (+ x y z) to eliminate any ambiguity.

(lset (lget S slot (+ spTopOffset:1 4:0):1):2 (+ (lget S slot spTopOffset:1):2 (lget S slot (+ spTopOffset:1 4:0):1):2):2)

After Rank Assignment

(lset (lget S slot (+ 4 spTopOffset)) (+ (lget S slot spTopOffset) (lget S slot (+ 4 spTopOffset))))

After Reassociation

(lset (lget S slot (cconst (+ 4 spTopOffset))) (+ (lget S slot spTopOffset) (lget S slot (cconst (+ 4 spTopOffset)))))

After Compiler Constant Identification

Figure 5.7: CISL Reassociation Transformations

We then recursively visit subtrees, sorting the operands of commutative operations from lowest to highest rank. Lower ranked operands (e.g., constants, parameters, memory references) are grouped together, followed by expression trees. Figure 5.7 emphasizes in bold the result of sorting trees by rank. Here we notice that in two cases the constant 4 and parameter spTopOffset are reoriented. Because the ordering and grouping of operands and subtrees are known, we can leverage this information to identify compiler constants (discussed in the next section). Other important results of reassociation include instructions operating on immediate values, such as a register add immediate instruction R[x] + imm. For this case reassociation guarantees that this will always be rewritten into imm + R[x]. This allows semantic trees to be indexed according to their shape, making it easier to choose likely target instructions quickly. We describe the indexing scheme in Chapter 6.
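The flattening and sorting steps can be sketched in the same simplified style (again, illustrative code rather than GIST’s implementation):

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ReassociateDemo {
    // Simplified node: an operator kind, a precomputed rank, and children.
    record Node(String kind, int rank, List<Node> children) {
        Node(String kind, int rank, Node... kids) { this(kind, rank, List.of(kids)); }
    }

    // Rewrites (+ x (+ y z)) into the n-ary form (+ x y z).
    static Node flatten(Node n) {
        if (!n.kind().equals("+")) return n;
        List<Node> flat = new ArrayList<>();
        for (Node c : n.children()) {
            Node fc = flatten(c);
            if (fc.kind().equals("+")) flat.addAll(fc.children());
            else flat.add(fc);
        }
        return new Node("+", n.rank(), flat);
    }

    // Sorts the operands of the commutative operator from lowest to highest rank,
    // so constants and parameters come before memory references and subexpressions.
    static Node sortOperands(Node n) {
        if (!n.kind().equals("+")) return n;
        List<Node> sorted = new ArrayList<>(n.children());
        sorted.sort(Comparator.comparingInt(Node::rank));
        return new Node("+", n.rank(), sorted);
    }

    public static void main(String[] args) {
        // R[x] + imm: a rank-2 memory reference plus a rank-0 immediate.
        Node tree = new Node("+", 2,
                new Node("lget R x", 2), new Node("imm", 0));
        Node canon = sortOperands(flatten(tree));
        System.out.println(canon.children().get(0).kind());   // prints imm
    }
}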

5.3.3 Compiler Constants

When generating code for a target machine it is important to identify constant expressions at compile-time. This allows the expression to be computed by the compiler rather than generating code to perform the operations at run-time. Likewise, it is important for GIST to identify constant expressions found in an instruction’s semantic description. This allows search to treat an entire expression as a constant, leading to a reduction in the number of target instructions required to implement an IR. In our example in Figure 5.7 we see that two expressions of the form (+ 4 spTopOffset) were identified as compiler constants. Identifying compiler constants in a reassociated pattern tree is easy because constants will always appear first and instruction fields second in an operator’s parameter list. Thus, a compiler constant is created by examining the parameters left to right. If the parameter is a constant or field, we include it as part of a compiler constant. The compiler constant is constructed when we encounter a subtree that is neither a constant nor a field.
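A rough sketch of this grouping step is shown below (hypothetical helper code): scanning an already-sorted operand list left to right, it folds the leading compile-time operands into a single compiler constant and leaves the remaining subtrees untouched:

import java.util.ArrayList;
import java.util.List;

public class CompilerConstDemo {
    // An operand is either a compile-time value (constant or instruction field) or not.
    record Operand(String text, boolean compileTime) { }

    // Groups the leading compile-time operands of a sorted operand list into
    // one (cconst ...) operand; remaining operands are kept as they are.
    static List<Operand> identifyCompilerConstant(List<Operand> sorted) {
        List<String> prefix = new ArrayList<>();
        int i = 0;
        while (i < sorted.size() && sorted.get(i).compileTime()) {
            prefix.add(sorted.get(i).text());
            i++;
        }
        List<Operand> result = new ArrayList<>();
        if (prefix.size() > 1) {
            result.add(new Operand("(cconst (+ " + String.join(" ", prefix) + "))", true));
        } else {
            result.addAll(sorted.subList(0, i));
        }
        result.addAll(sorted.subList(i, sorted.size()));
        return result;
    }

    public static void main(String[] args) {
        // The sorted operands of (+ 4 spTopOffset (lget R 1)) from the iadd example.
        List<Operand> ops = List.of(
                new Operand("4", true),
                new Operand("spTopOffset", true),
                new Operand("(lget R 1)", false));
        // Prints the grouped (cconst (+ 4 spTopOffset)) operand followed by (lget R 1).
        System.out.println(identifyCompilerConstant(ops));
    }
}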

Transforming these expressions into compiler constants allows target sequences to be found that do not require two additional add immediate instructions. Indeed, without compiler constants the best PowerPC sequence GIST could find was an add, followed by two memory loads, another add, and a store (it only required a single add for the constant expression because we identify common sub-expressions during search time). With compiler constants we were able to find the optimal sequence of two loads, an add, and a store.

Information about constant expressions is propagated into the instruction selector patterns generated by GIST. We can take advantage of this information in a compiler adapter (discussed in Chapter 7) to generate compiler code to compute the expression at compile time.

5.3.4 Canonicalization Example

Consider the following description of the JVM iadd semantic pattern for Baseline:

Stack[spTopOffset+4+FP] = Stack[spTopOffset+FP] + Stack[spTopOffset+4+FP];

The JVM specification [Lindholm and Yellin, 1999] describes the effect of this instruction as popping the top two values off the operand stack, adding those values, and pushing the result. Baseline modifies these semantics by introducing a new argument, spTopOffset, to the instruction that Baseline manages at compile time (as mentioned previously). We show the canonical form below using a store mapping for PowerPC (Stack ⇒ M and FP ⇒ R[1]). We represent the simplified view of the canonical form in a LISP-like tree format:

(if true
  (lset (lget M (+ (ct-const (+ 4 spTopOffset)) (lget R 1)))
        (+ (lget M (+ spTopOffset (lget R 1)))
           (lget M (+ (ct-const (+ 4 spTopOffset)) (lget R 1))))))

Because the iadd instruction always executes, its guard is the value true. The lset and lget operations indicate a store and load operation to a memory location respectively. (If an lget occurs as the first argument to an lset it is used as a location rather than as a value; this allows easier common subexpression matching for address computations.) Using expression reassociation allows us to identify constants that we can compute at code generation time. We identify these compile-time constants with the ct-const annotation. The actual canonical form includes additional information, such as types for all the expressions in the pattern. We drop this information from the example to make it readable.

5.4 Summary

In this chapter we described GIST’s canonicalization procedure, a novel approach used to improve semantic matching. To allow syntactically different, but semantically equivalent, CISL trees to be matched, GIST applies a two-phase normalization process. The first phase uses a term rewriting system composed of semantics-preserving axioms that are complete and admissible to reduce trees to a normal form. A term rewriting system that is complete terminates after a finite number of rule applications and always reduces trees to a normal form (confluent). An admissible axiom ensures that target instructions are not excluded by the rewriting process. The second phase uses expression reassociation to re-orient subtrees using a well defined ranking scheme. This handles problematic (e.g., non-terminating) axioms such as commutativity and associativity of addition. Through the reassociation process we identify compiler constants, expressions that can be computed at compile-time, which improve the generated target sequences.

CHAPTER 6

DISCOVERING INSTRUCTION SELECTOR PATTERNS

6.1 Overview

The instruction selector (IS) phase of the compiler translates each IR instruction, generated by the front-end, into a sequence of target instructions. Choosing the right target sequence for each IR instruction is dependent on several factors, including the IR operation, the instruction operands (e.g., temporaries, immediates), constraints on those operands, and compiler implementation decisions. Although each IS design is different, they all describe IR patterns for matching IR operations and a corresponding target template for generating a target sequence. Together they form an instruction selector pattern matching rule. These IS pattern matching rules can be implicit in the form of implementation code or explicit as a set of rule descriptions. For example, a typical hand-written IS encodes these patterns as a sequence of if-then-else statements that select the appropriate target instructions by examining IR data structures (as we described in Section 2.2). A BURS-generated IS specifies patterns as trees that are used to generate tables to drive the instruction selection process (see Section 2.3). Although each of these examples (and others outlined in Chapter 2) solves the IS problem differently (and manually), they are similar in that they all use patterns to describe the translation from IR to target sequences.

The input to GIST’s search procedure is a set of source and target instruction patterns

described in CISL (we refer to target templates as patterns since they are described in the

same language.) To keep the discussion simple, we henceforth refer to these as a compiler IR and an ISA. The goal is to find, for each IR pattern, a sequence of ISA patterns that im-

115 plements the semantics of the IR operation on the target architecture. Recently, Dias [2008] proved that finding a mapping from IR to target sequences is an undecidable problem. For

this reason, the application of heuristic search is necessary for generating instruction se- lector patterns automatically. Related work (e.g., [Cattell, 1978]) has mainly employed

depth-first search on the IR tree, attempting to match target patterns to the root. Thus, it builds a target sequence from the last-executed instruction back to the first. In addition to matching the root operator exactly, top-down matching may explore transforming the IR

tree to use a related operator. For example, if a target has no subtraction instruction but does have unary negation, then rewriting a−b to a+(−b) will obtain a match, while a−b will not.

If a target pattern matches everywhere except its leaf nodes, top-down matching decomposes the problem and attempts to find a separate match for the subtree rooted at the mismatching leaf, one that leaves its result in a place that matches the target pattern's leaf

(most frequently a register). This approach has been used with some success in the past, but it spends significant time dealing with mismatches. In addition, because it finds target instructions in reverse execution order it is harder to reuse equivalent locations (i.e., a register that has already been loaded with an equivalent value in memory [Ganapathi et al., 1982]) and common subexpressions more generally.

GIST departs from prior approaches by matching the IR tree bottom-up and breadth-

first. It attempts to match each unique subtree of the IR pattern at each search point against target patterns. Whenever it finds a match, it reduces the IR subtree to the destination location of the target pattern. For example, given the following tree pattern:

(if true (lset (lget M (+ (lget R 4) imm)) (+ (lget R 5) (lget R 6)))) it would simultaneously try to match and reduce the complete tree, and also the left-hand side subtrees under lset, namely:

(lget M (+ (lget R 4) imm))
(+ (lget R 4) imm)
(lget R 4)
imm

and the right-hand side subtrees:

(+ (lget R 5) (lget R 6))
(lget R 5)
(lget R 6)

Leaf nodes such as (lget R 4) and imm might match target data movement instructions, while larger expression trees might correspond to arithmetic operations on the target. This is quite similar to how BURS matches IR trees bottom-up to identify the correct predetermined target mapping. However, we use it here to find those mappings automatically, and to find all mappings, not just the locally cheapest one. The generated successor search states include the entire IR tree with the matched subtree replaced by the target location. Search discards successor states that do not meet a given efficiency metric compared with partial and full matches it has already found. The general search technique used is a bounded breadth-first search called beam search [Russell and Norvig, 2003].
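To make the enumeration of unique subtrees concrete, here is a small illustrative sketch in Python over a hypothetical nested-tuple encoding of the trees above; it is our own simplification, not GIST's implementation.

    def subtrees(tree, acc=None):
        """Collect the unique subtrees of a LISP-like pattern written as nested
        tuples, e.g. ('add', ('lget', 'R', 5), ('lget', 'R', 6)). Store names
        and literal register indices are not subtrees of their own; parameter
        leaves such as 'imm' are."""
        if acc is None:
            acc = []
        if tree not in acc:
            acc.append(tree)
        if not isinstance(tree, tuple):
            return acc
        if tree[0] == 'lget':
            children = [c for c in tree[2:] if isinstance(c, tuple)]  # address expression only
        else:
            children = [c for c in tree[1:] if isinstance(c, (tuple, str))]
        for child in children:
            subtrees(child, acc)
        return acc

    ir = ('lset', ('lget', 'M', ('add', ('lget', 'R', 4), 'imm')),
                  ('add', ('lget', 'R', 5), ('lget', 'R', 6)))
    for t in subtrees(ir):
        print(t)

Run on the pattern above, this yields exactly the whole tree, the four left-hand-side subtrees, and the three right-hand-side subtrees listed earlier.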

Because it matches IR trees bottom-up, GIST finds target sequences in forward execution order, allowing us to leverage locations that hold identical values and apply forward analysis optimizations such as common subexpression elimination during search, leading to an overall reduction in search time and better target sequences. Our search approach is particularly advantageous for register-sensitive machines such as the IA-32. Furthermore, because we consider all possible subtrees, we can find all possible execution orderings of the target patterns (subject to the width of the beam search). This allows us to identify more efficient sequences and handle machines that have heavily constrained memory references. Before we show an example of search and matching, we describe each component of

GIST in detail. Figure 6.1 shows a data flow diagram of each component, inputs, and output of the GIST architecture. In previous chapters we have discussed the CISL language, store mapping and schemas, canonicalization, admissible axioms, expression reassociation, and term rewriting systems. In this chapter we focus on the rest of GIST, including

[Figure 6.1 is a data flow diagram of the GIST architecture: CISL IR and target descriptions pass through store mapping (driven by a store map schema), canonicalization (term rewriting and expression reassociation), and a pattern indexer; the indexed target patterns feed a heuristic beam search that, together with search axioms, search parameters, and search heuristics, allocates resources, matches, reduces, and produces IS patterns (XML); a compiler adapter then turns these into instruction selector code (X/Y) in the target compiler's implementation language.]

Figure 6.1: GIST Architecture


Figure 6.2: Overview of Search and Match Process

how GIST finds candidate target patterns quickly, how it allocates resources (e.g., scratch registers) when a match succeeds, the matching process, semantics preserving axioms, the beam search procedure, and search-related parameters and heuristics. First, we describe instruction selector patterns and their relationship to CISL descriptions.

6.2 Instruction Selector Patterns

Instruction selector patterns translate IR instructions to target sequences that perform the same operation (i.e., they are semantically equivalent). Each IS pattern is composed of

an IR pattern and a sequence of target patterns: IR patterns recognize instances of IR

instructions and target patterns generate instances of target instructions. The difference between an instruction pattern and an instance is that the former specifies an instruction template

whereas the latter corresponds to a specific instruction. These templates can be used for

matching or generation purposes. Consider the following ISA pattern written in CISL:

R[rd] = R[rs] + imm;

In this example, an immediate (imm) and a register (R[rs]) are added together and a result is stored in a destination register R[rd]. The parameters rs, imm, and rd correspond

to instruction variables defined elsewhere in the class definition of this instruction (see Chapter 3 for details). This example defines a pattern because it can refer to any number of possible add immediate instructions where rd, rs, and imm are instantiated with specific values. Now consider the following IR pattern:

T[s] = T[s] + 1;

This example is similar to the previous one. However, it defines an increment by one operation: a temporary register (T[s]) is incremented by the integer constant 1. It is important to note that this is still a pattern as T[s] can refer to any temporary register from an infinite register set. It should be clear that we can use the IR and ISA pattern to form an instruction selector pattern. For example, consider the following two IS patterns:

i) T[s]=T[s]+1 => R[rd]=R[rs]+imm {rd=s, rs=s, imm=1}

ii) T[s]=T[s]+i16 => R[rd]=R[rs]+imm {rd=s, rs=s, imm=imm16, fits(i16,8)}

The left-hand side of => in pattern i) matches an IR increment instance, where s is instantiated with a specific temporary register index, and the right-hand side generates the target instruction. A match is successful provided that the constraints, indicated between { and }, are satisfied. In this simple example, rd and rs must both refer to the same temporary index s, and imm must be the integer constant 1. Pattern ii) is more similar to the target add immediate instruction. However, an additional constraint, fits(i16,8), is introduced. If we imagine the imm ISA parameter to be 8 bits wide and the i16 IR parameter to be 16

bits wide, this constraint ensures that an ISA instance can be generated only if i16 fits in imm, that is, the bit width of the value of i16 is less than or equal to the bit width of imm

(e.g., 8 bits). The following table demonstrates how these IS patterns match instances:

IR Instance       Match          Constraints                  Target Instance
T[4]=T[4]+1       (i) success    rd=4, rs=4, imm=1            R[4]=R[4]+1
T[4]=T[5]+1       (i) failed     s!=s'                        NA
T[4]=T[4]+0xFF    (ii) success   rd=s, rs=s, fits(0xFF,8)     R[4]=R[4]+0xFF

The first entry shows an IR instance that successfully matches IS pattern i) with all the

constraints satisfied. This generates a target instance with each constraint bound to the corresponding target parameters. The second entry fails to match i) because the IR instance does not satisfy the constraints imposed by the IS pattern, namely 4 is not equal to 5. The

third entry successfully matches pattern ii) as the bit width of the hexadecimal value 0xFF fits in the target’s immediate parameter imm. We are assuming that imm is unsigned—note that the match would fail if it is signed. Similar constraints are used to identify mismatches in signedness and endianness. GIST uses IR and ISA patterns defined in CISL descriptions to derive IS patterns automatically. Constraints are generated during the matching process and are used by search to determine the quality of a match and by an adapter to generate constraint checking code for the compiler’s instruction selector. It is important to note that a successful match with constraints may be a more efficient pattern for a specific case, but it might not cover all possible IR instances of this sort. This raises questions about IS pattern completeness and coverage—thus, it is important for GIST to find not only the most efficient (e.g., special case) IS patterns, but also to find a set of IS patterns that collectively cover all IR instances. Proving that a set of IS patterns with constraints, having the lowest possible cost under each constraint, satisfies this condition would be useful for generating

the most efficient target sequences in all cases. We believe it is possible to encode this as a boolean formula suitable for input to a SAT solver. We leave this for future work.
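The fits-style constraints discussed above lend themselves to a simple check at code generation time. The sketch below (Python; our own illustration, not GIST's or an adapter's actual code) shows one way such a test might decide whether a value is representable in an n-bit immediate field, with signedness taken into account:

    def fits(value, bits, signed=False):
        """Return True if 'value' is representable in a 'bits'-wide immediate
        field; signedness changes the admissible range."""
        if signed:
            return -(1 << (bits - 1)) <= value < (1 << (bits - 1))
        return 0 <= value < (1 << bits)

    assert fits(0xFF, 8)                    # pattern ii) succeeds for 0xFF unsigned
    assert not fits(0xFF, 8, signed=True)   # but would fail if imm were signed
    assert fits(1, 8)                       # pattern i)'s constant 1 trivially fits

This mirrors the observation above that the 0xFF match succeeds only under the assumption that imm is unsigned.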

6.3 Target Pattern Indexing

To accelerate matching, GIST indexes all target patterns according to the shape of their trees, i.e., taking into account the operators, but ignoring leaf variables and constants (which typically can unify with parameters of source trees). This is more general than

prior work, which indexes only the root operators, and we gain considerable speed from it.
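One way to picture shape-based indexing is sketched below in Python, using the same hypothetical tuple encoding as before. This is our own illustration; the real indexer, as described next, also creates entries per guarded effect and per assignment right-hand side.

    from collections import defaultdict

    LEAF = '_'   # stands for any variable, parameter, or constant leaf

    def shape(tree):
        """Reduce a nested-tuple tree to its operator shape, hiding leaves."""
        if not isinstance(tree, tuple):
            return LEAF
        return (tree[0],) + tuple(shape(child) for child in tree[1:])

    index = defaultdict(list)

    def add_target_pattern(pattern):
        index[shape(pattern)].append(pattern)

    def lookup(ir_subtree):
        return index.get(shape(ir_subtree), [])

    # A register-register add and an add-immediate hash to different shapes,
    # so a lookup only returns structurally plausible candidates.
    add_target_pattern(('lset', ('lget', 'R', 'rd'),
                        ('add', ('lget', 'R', 'rs'), ('lget', 'R', 'rv'))))
    add_target_pattern(('lset', ('lget', 'R', 'rd'),
                        ('add', 'imm', ('lget', 'R', 'rs'))))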

Because GIST considers all possible subtrees of an IR pattern, we extend primary operator indexing also to index data movement operations. This is important because we must consider matches on subtrees that are data locations (e.g., immediate fields, registers). For example, a target machine might not support an instruction for adding an immediate to a register value, but it might support loading an immediate into a register followed by a register-register add. Thus, we would match the immediate field subtree of the IR to the immediate load instruction. This would reduce the immediate to the target register and allow a subsequent match to find the register-register add. Prior work considered loads and stores separately and applied different techniques to get data into the right locations before matching. GIST folds this into the same basic approach. Furthermore, because we canonicalize instruction patterns before search, we have considerable success without the need for axiomatic transformations.

In Figure 6.1, after the target CISL patterns are canonicalized, they are processed by the pattern indexer and stored in a table for efficient lookup. An ISA pattern is indexed in three primary ways to accommodate all possible IR subtree pattern configurations. In the simplest case, the entire ISA pattern is indexed. This corresponds to the case when an IR subtree matches an entire ISA pattern tree. Next, if an ISA pattern consists of multiple guarded effects (e.g., multiple CISL if statements), a separate index entry is included for each guarded effect. This recognizes the possibility that an IR subtree matches a partial

[Figure 6.3 shows a complete ISA pattern on the right,

(seq (if true (lset (lget R rd) (+ (lget R rs) (lget R rv))))
     (if (!= (lget R rd) inc) (lset (lget R rs) 4)))

and, on the left, the five indexed ISA pattern shapes derived from it: the entire tree, one entry per guarded effect, and one entry per assignment right-hand side, with literal constants and parameters hidden.]

Figure 6.3: Target Indexing

ISA pattern. Lastly, a separate index entry is introduced for each right-hand side of an assignment (e.g., lset). This accounts for the case when an IR subtree matches only the effect of an ISA pattern. Side-effects and constraints are introduced for partial matches and are maintained by search to identify inconsistencies in partial solutions.

In Figure 6.3 we show the result of indexing an ISA pattern tree. The ISA pattern (on the right) is indexed in five different forms as shown by the indexed ISA pattern shapes on the left. In this example, an index entry, surrounded by a dotted box, represents each of the three ways an ISA pattern is indexed. Literal constants and parameters are hidden (gray ovals) as they are not considered by the indexing process—we are only interested in the shape of the tree. The top index entry corresponds to the entire ISA tree, the second and third entries are indexed by guarded effect, and the fourth and fifth entries are indexed by the right-hand side of an assignment. At each search point, all subtrees of a partial IR pattern are used to look up indexed ISA patterns for that subtree. The candidate ISA patterns that are found are then compared,

node-by-node, by a tree matching procedure. Consider the IR and indexed ISA patterns shown in Figure 6.4 (for simplicity we do not include all possible indexed entries). Each subtree of the IR is used to find a candidate ISA pattern. For example, consider all the IR subtrees:

(lset (lget R s) (+ 1 (lget R s)))
(+ 1 (lget R s))
1
(lget R s)

Each subtree is hashed by the pattern indexer to look up a candidate ISA pattern. For the first subtree, the pattern indexer would find a candidate match with ISA pattern 2, an add immediate instruction. The second subtree would find a candidate match with the right-hand side of the assignment in pattern 2. The third subtree matches the right-hand side of ISA pattern 1. In this case, pattern 1 is indexed as a data movement instruction (load immediate) on the imm parameter. It is important to note that we do not require special treatment for data movement operations—they are indexed like other instruction patterns. Prior work matched trees top-down, transforming trees when a mismatch was encountered—data movement is difficult to express with axioms, thus requiring special handling. The last subtree does not yield any candidate ISA patterns and would likely be discarded by search (unless a search axiom is used—this is discussed later in Section 6.5.3). After one or more candidate ISA patterns are identified, a fine-grained matching process compares the IR subtree and the ISA indexed subtree to verify the match and potentially generate constraints (as mentioned earlier). If the IR subtree represents the entire IR pattern, the IR and ISA pattern are used to generate an IS pattern (with constraints). Otherwise, the IR subtree is reduced to the target location of the matched ISA pattern and search continues. For the constant subtree 1 in the above example, the matching ISA pattern loads an immediate into a register. The ISA imm parameter matches the constant (the matching process is elaborated on later), reducing it to an allocated register. We discuss resource allocation in the next section.

[Figure 6.4 shows an IR pattern,

(lset (lget R s) (+ 1 (lget R s)))

being looked up against three indexed ISA pattern shapes:

1: (lset (lget R rd) imm)
2: (lset (lget R rd) (+ imm (lget R rs)))
3: (lset (lget R rd) (+ (lget R rs) (lget R rv)))]

Figure 6.4: IR Lookup

6.4 Resource Allocation

To represent matching progress, when a target pattern matches an IR subtree, we reduce the IR subtree to the destination location of the target pattern. In many cases, that destination can be any one of a set of locations (i.e., general purpose registers). Further matching requires us to use a particular location rather than a pattern. Consider the PowerPC register-register add:

R[RT] = R[RA] + R[RB]

Assuming we find a match for the right-hand side addition, we reduce the IR tree to a specific assigned register, say R[5], and not the pattern R[RT]. Reducing to the pattern would lose contextual information and prevent GIST from determining where a particular result is coming from when using it in later operations. In addition, it would be difficult to track locations containing identical values.

GIST assumes that actual resource allocation will occur in either a compiler adapter (described in Chapter 7) or in a later compilation phase (e.g., register allocation). Our

motivation for this is two-fold: universality and context. Some compilers may choose to perform register allocation in an earlier back-end phase, whereas others apply it as a final

lowering step. Performing register allocation during search would preclude using GIST with many compilers and fail to meet our vision of generality. Because the code generator patterns generated by GIST (and by most other code generators) are local (a single IR instruction maps to a target sequence), it makes sense for register allocation to occur after generating longer target sequences, to take advantage of more global context.

For these reasons, GIST views the target machine as having an unbounded set of locations

for each indexed store of the CISL description. When we reduce an IR subtree, we allocate a new scratch location from the target indexed store, i.e., one that does not conflict with any previously defined locations in the store. We track allocations that happen during a search and include them as part of each generated code generator pattern. Notice that if an IR pattern, perhaps via store mapping to the target, requires a particular element of

the store, e.g., R[1], the IR pattern will mention it explicitly so there will be no unbound pattern variables (such as RT in the PowerPC add pattern above). Rather, matching will bind RT to 1 in that case.
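The bookkeeping this implies is small. A minimal sketch of handing out non-conflicting scratch locations per indexed store might look like the following (Python; the names are ours, not GIST's, and we model the allocation counter as shared across search nodes, which is why the symbolic numbers seen in later traces are not consecutive within a single pattern):

    import itertools

    class ScratchAllocator:
        """Hands out fresh symbolic locations ('@9', '@21', ...) for the
        target's indexed stores, never reusing a previously allocated index."""
        _counter = itertools.count(1)            # shared across all search nodes

        def __init__(self):
            self.allocated = []                  # allocations made along this path

        def allocate(self, store):
            loc = (store, '@%d' % next(ScratchAllocator._counter))
            self.allocated.append(loc)
            return loc

    alloc = ScratchAllocator()
    rt = alloc.allocate('R')    # e.g. a destination for the PowerPC add's free RT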

6.5 Heuristic Search

Finding an implementation of an IR instruction on the target is an undecidable problem.

There are several factors that contribute toward this conclusion, including the difference in memory structures, the availability of ISA operations, the semantic complexity of the IR, and the variety of constraints imposed on particular operations. The level of difficulty

increases when we generalize the problem to be compiler and target independent (i.e., we describe both IR and target machine). In particular, we can’t make any assumptions about the structure of the compiler framework, its implementation, and the kind of instruction

selector algorithms that it uses. This includes the compiler’s definition of an efficient target

sequence. For example, are we generating code in a memory-sensitive environment? Are shorter sequences important? Or is power a concern?

The difficult nature of the automatic instruction selector generation problem indicates that the solution requires heuristic search. In addition, we are not just looking for any

solution (e.g., an IS pattern); rather, we are looking for the most efficient solution (typically referred to as an optimization problem). This provides guidance as to the kind of heuristic search technique we choose. Previous work focused on depth-first search, but spent a

significant amount of time dealing with mismatches and backtracking. Another choice would be A∗ [Russell and Norvig, 2003] as it finds the least-cost path from an initial starting state (or node) to a goal node (out of several possible goal nodes). A∗ works best when an

admissible heuristic function, h, can estimate the cost of reaching a goal from some node x—an admissible heuristic never overestimates the actual minimal cost to reach the solution.

In our case, however, it is difficult to provide an admissible estimate because it is hard to determine which ISA instructions can be used next until we reduce a subtree. For example, matching an IR subtree to an ISA instruction may reduce that subtree to a specific location, which might allow the successor IR tree to be matched completely by a single ISA instruction in the next step. Unfortunately, this is entirely dependent on the ISA and thus difficult to generalize and to determine a priori. To do so would require a search on the ISA just to generate an estimate. In effect, we would need to find a solution in order to generate a good heuristic estimate.

Alternatively, GIST uses a technique known as bounded best-first search, also called beam search. Beam search uses breadth-first search to build its search graph—at each search point it generates all possible successor states. Successor states are sorted in order of increasing heuristic values and a predetermined number of states are stored at each depth.

We have found this approach capable of discovering IS patterns efficiently and quickly.

The rest of this section provides an overview of GIST’s version of beam search, the search

process, heuristics used to generate successor states, and the bottom-up matching process used to compare IR and ISA semantics.

6.5.1 Search Strategy

Search is depicted by a graph, where each node represents state that is maintained at each point during search. A node corresponds to either a partial solution to the problem or a goal node (an actual solution). In our case, a partial solution is a partially matched IR

tree and a sequence of ISA patterns implementing the matched subtrees. Additional state is maintained for resources that have been allocated, constraints that have been generated, and axioms that have been applied. A valid move in the search graph occurs when an

IR subtree can be implemented by an ISA instruction, thus expanding or generating a new node representing a new partial solution (reduced IR tree) or goal node. A path in the search

graph is a sequence of IR-to-ISA matches that lead to a solution (if one exists). Figure 6.5

shows the structure and generation of search nodes in GIST. On the left, search begins with an initial IR pattern tree representing a complete IR instruction (IRs). The initial state of this node is the CISL IR pattern tree. From IRs three successor nodes, IR1, IR2, and IR3, are generated, each holding information related to a partially matched solution. From each of these nodes additional successor nodes are generated until we reach the solution nodes,

G1 and G2. We show part of the final state of G1: an empty IR tree (fully matched), the matching target sequence (an add immediate instruction for some target ISA), and a set of constraints generated from the matching process.

Nodes are maintained in a single bounded priority queue (beam) ordered by heuristic value (discussed later). Thus, nodes toward the front of the queue represent partial solutions that are likely to lead to more efficient solutions. GIST uses a single queue (unlike traditional beam search) to compare partial solutions from a global perspective and to reduce the large number of duplicate partial solutions that are generated (the queue removes identical partial solutions with identical heuristic cost values).
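Such a queue can be pictured as follows (a Python sketch under our own assumptions about node representation, not GIST's data structure): entries are kept sorted by cost, a duplicate arriving with an identical key and cost is dropped, and anything past the beam width falls off the back.

    import bisect

    class Beam:
        """A bounded, ordered queue of search nodes keyed by heuristic cost.
        'key' is assumed to be a canonical rendering of the partial IR tree,
        so identical partial solutions with identical cost collide."""
        def __init__(self, width):
            self.width = width
            self.entries = []                    # kept sorted by (cost, key)

        def push(self, cost, key, node):
            pos = bisect.bisect_left(self.entries, (cost, key))
            if pos < len(self.entries) and self.entries[pos][:2] == (cost, key):
                return                           # duplicate partial solution: drop
            self.entries.insert(pos, (cost, key, node))
            del self.entries[self.width:]        # prune past the beam width

        def pop(self):
            return self.entries.pop(0)[2]        # most promising node first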

[Figure 6.5 depicts the search graph: the initial IR tree node IRs on the left, partial-solution nodes (IR1, IR2, IR3 and their successors) in the middle, and solution nodes G1 and G2 on the right. A detail box for a node shows its IR tree, target sequence, constraints, and applied axioms; the detail for goal node G1 shows a fully matched IR tree, a target sequence consisting of an add immediate (ADDI) instruction R[100]=$i16, and the constraints rd=100, imm=i16 generated by matching.]

Figure 6.5: Search Graph

We choose a bounded queue because it reduces memory and time (the number of nodes that need to be maintained and sorted) and bounds the potentially infinite number of partial (but

equivalent) solutions that can be generated at each search point. For example, a register-register add instruction is semantically equivalent to a sequence that stores the register values to memory, loads them back into registers, and then performs the addition. The store/load

combination can be expanded infinitely, still resulting in a semantically equivalent, but less efficient, sequence. Because there will always be a more efficient version of these spurious sequences, they typically reside at the back of the queue until they are pushed out by

partial solutions more favored by the heuristics. Although this can exclude possible ISA sequences, we found that most sequences that are pruned exemplify this behavior.

6.5.1.1 Finding Candidate ISA Patterns

In Figure 6.6 we show a diagram of the GIST search process. First, search initializes the

beam with a starting node (e.g., IRs) containing a store-mapped and canonicalized CISL IR tree. The search algorithm then iterates, pulling nodes from the search queue, trying to find target patterns that match subtrees of a partially matched IR pattern. Termination is

[Figure 6.6 shows the search flow: a partial solution (partial IR tree) is taken from the ordered priority queue (beam); (1) if it is a goal, the solution is recorded; otherwise (2) all unique subtrees are generated and successor nodes are added to the queue; for each subtree, (3) candidate ISA patterns are selected from the indexed target patterns (failing if none exist), (4) the IR subtree is matched against each candidate ISA pattern, (5) semantic axioms are applied when no direct match is found, (6) a resource is allocated and the subtree is reduced, and (7) a heuristic cost value is generated for the successor node.]

Figure 6.6: Search Flow Graph

based on a number of criteria including the number of solutions found, the number of nodes generated, the depth of the search graph, duration, etc. These parameters are provided by the user in an externally defined configuration file. However, most parameters assume reasonable defaults that work well with the compiler and target combinations we tested.

The first action performed by search is to check the partial IR tree to see if it is a goal node (step 1). A goal node is identified by a fully matched IR pattern; this is represented internally as an empty sequence tree: (seq). If it is a goal node, an ISA pattern sequence has been found and the solution can be recorded. Otherwise, GIST decomposes the partial IR tree into all possible subtrees (step 2). Because each subtree can be matched independently,

[Figure 6.7 shows an IR tree decomposed into subtrees: IR0 at the top, with subtrees IR01 and IR02; IR01 has subtrees IR011 and IR012, and IR02 has the single subtree IR023.]

Figure 6.7: IR Subtrees

the decomposition process must ensure that a matched subtree, and the generated ISA instances, will not break the semantic correctness of the IR. For example, an IR pattern that is a

sequence of multiple guarded effects, g1, g2, ..., gn, must find ISA patterns that match

subtrees of g1 before looking for an implementation for g2, ..., gn. This makes sense as each guarded effect represents an assignment that can change the state of the machine. Having

these state changes occur in a different order can potentially result in a different machine state.

A CISL pattern tree is decomposed into several subtrees, where each subtree represents

an operator or a leaf (e.g., literal constant, compiler constant, instruction parameter). In Figure 6.7 we show an example of an IR tree that has been decomposed into subtrees. Each subtree is represented by a small gray-filled triangle labeled by its location in the tree. The

top-most triangle, IR0, represents the entire tree and has two subtrees, IR01 and IR02. IR01 is a tree that also has two subtrees, IR011 and IR012, and IR02 is a tree that has a single subtree, IR023. At this point, the IR subtrees (IR0, IR01, IR02, IR011, IR012, and IR023) are transformed by semantics preserving axioms to generate new IR pattern trees (step 5 in Figure 6.6), and to look up candidate ISA patterns (step 3) that have been indexed by the pattern indexer (see Section 6.3). If we do not find any candidate ISA patterns, search fails along this path and the node is discarded. We discuss the axiom transformation process later; in the next section we focus on GIST's bottom-up matching process.

6.5.2 Bottom-up Matching

For each possible ISA pattern, we recursively match the IR subtree, node by node, against the ISA pattern (step 4). We treat ISA parameters not fixed by the encoding as variables in ISA patterns. This makes sense as the ISA parameters must be assigned values to generate instances. If a parameter is unassigned (free) in an IR pattern, we consider it to be a parameter to the instruction, and treat it as a symbolic constant (e.g., the spTopOffset parameter in the previously described iadd instruction in Section 5.1). This also makes sense as IR instances have all their parameters fixed, thus we can assume that parameters represent actual constants that have been assigned by the compiler front-end. When a free ISA parameter matches an IR tree node, we bind it to the value of that IR node, and it influences the remainder of that IR subtree match. This is similar to one-way unification.
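A compact way to picture this one-way matching is the Python sketch below, over the illustrative tuple encoding used earlier; it is not GIST's matcher, and it ignores types and constraints for the moment.

    def match(ir, target, bindings=None):
        """One-way match of an IR tree against a target pattern. Strings
        starting with '$' in the target are free parameters; returns a binding
        dictionary on success, or None on mismatch. (A production matcher
        would also undo bindings when a candidate fails.)"""
        if bindings is None:
            bindings = {}
        if isinstance(target, str) and target.startswith('$'):
            if target in bindings:
                return bindings if bindings[target] == ir else None
            bindings[target] = ir
            return bindings
        if isinstance(ir, tuple) and isinstance(target, tuple):
            if len(ir) != len(target) or ir[0] != target[0]:
                return None
            for a, b in zip(ir[1:], target[1:]):
                if match(a, b, bindings) is None:
                    return None
            return bindings
        return bindings if ir == target else None

    # T[s] = T[s] + 1 (shown after store mapping T onto R) against R[rd] = R[rs] + imm
    ir     = ('lset', ('lget', 'R', 's'), ('add', ('lget', 'R', 's'), 1))
    target = ('lset', ('lget', 'R', '$rd'), ('add', ('lget', 'R', '$rs'), '$imm'))
    print(match(ir, target))   # {'$rd': 's', '$rs': 's', '$imm': 1}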

GIST enhances the matching process by also considering type information. In addition to comparing tree nodes, we determine if they have compatible CISL types. For example, if we are matching a constant in the IR tree to an immediate parameter on the target, we must determine if the constant can “fit” (in terms of bits and signedness) into the target immediate. If they are compatible, the match succeeds and we continue. If not, we record a constraint at this point and continue matching. We return constraints as part of the matching process and they are included along with the solution IS pattern—the compiler adapter can use them in the end to take proper measures (as we mentioned in Section 6.2). After we match an IR subtree, if the destination of the ISA pattern needs allocation (it contains a free parameter), we allocate a scratch location, use it as the destination, and record the fact. In any case, we then replace the matched IR subtree with the destination location of the matching ISA pattern (with or without constraints), and use the resulting tree to form a new search node. States where the (rewritten) IR tree consists of a single (seq) tree are goal nodes. In Figure 6.8 we show the bottom-up matching process graphically. The top left tree represents an initial IR pattern (a single assignment) decomposed into all its subtrees that

[Figure 6.8 illustrates the bottom-up matching process against target pattern trees: the target operation T1op matches IR subtree IR01, a resource A(T1d) is allocated, and IR01 is reduced to it; T2op then matches IR023, allocating and reducing to A(T2d); finally, T3op matches the remaining tree, allocating A(T3d) and reducing the entire IR pattern.]

Figure 6.8: IR Subtrees

are to be matched against the target ISA T. We first find a successful match of the target operation T1op to the IR subtree IR01. We allocate A(T1d), an instance of the target pattern's destination location, and reduce IR01 to that location. Next, we find a match between IR023 and the ISA operation T2op and reduce the IR subtree to the allocated target destination location A(T2d). Lastly, a match is found for the remaining IR tree against the target operation T3op. We allocate the final destination location T3d and reduce to match the entire IR pattern. Although there can be several possible match/reduce paths in the search graph that lead to solutions (goal nodes), here we show only one such path for clarity.

To demonstrate the GIST matching process we show a complete trace output generated

by GIST during search of a successful match of the store mapped canonicalized form of

the JikesRVM Baseline compiler iadd instruction pattern (introduced in Section 5.1) to a

PowerPC pattern sequence. The initial CISL pattern description for iadd is:

S.slot[direct(spTopOffset + 4)] = S.slot[direct(spTopOffset)] + S.slot[direct(spTopOffset + 4)];

We apply a store map schema mapping the JVM stack to PowerPC main memory, and frame

pointer register (referenced in the CISL method direct) to register 1 on the PowerPC. The

resulting canonicalized form is:

if (true) M.word[<4+spTopOffset!32>+R[1]] = M.word[spTopOffset+R[1]]+M.word[<4+spTopOffset!32>+R[1]];

Main memory on the PowerPC (M) is declared as an addressed store, and an indexed store (R) specifies the register file. Compiler constants identified during expression reassociation are surrounded by '<' and '>' (see Section 5.3.3). We abbreviate parts of the output to

conserve space and annotate the important aspects of the IR subtree search process. GIST generates a trace for each path (solutions and failures) in the search graph; the following is an example of the output for one successful path leading to a goal node.

Search iadd:
// We begin with the initial store-mapped canonicalized iadd pattern
// tree; expressions surrounded by '<' and '>' indicate compiler

// constants discovered during expression reassociation:
if (true) M.word[<4+spTopOffset!32>+R[1]] = M.word[spTopOffset+R[1]]+M.word[<4+spTopOffset!32>+R[1]];
Tree:
// This is the full LISP-like form used internally,
// annotated with type information (i.e., bu3 indicates a big-endian
// unsigned CISL bit array 3 bits wide.)
(seq
 (gact (true):bit
  (lset
   (lget (mem (sym[M]))
    (add
     (ccnst (sext (add (iconst (int[4])):bu3 (param (sym[spTopOffset]))):bs16
            (iconst (int[32])):bu6):bs32):bs32
     (lget (mem (sym[R])) (iconst (int[1])):bu5 (sym[defaultSelector])):bs32)
    (sym[word])):bs32
   (add
    (lget (mem (sym[M]))
     (add (param (sym[spTopOffset])):bs16
      (lget (mem (sym[R])) (iconst (int[1])):bu5 (sym[defaultSelector])):bs32)
     (sym[word]))
    (lget (mem (sym[M]))
     (add
      (ccnst (sext (add (iconst (int[4])):bu3 (param (sym[spTopOffset]))):bs16
             (iconst (int[32])):bu6):bs32):bs32
      (lget (mem (sym[R])) (iconst (int[1])):bu5 (sym[defaultSelector])):bs32)
     (sym[word]))):bs32):bs32))
// These are values generated during search including the number
// of search nodes in the queue (f), the depth (d), the
// tree size (s), the cost + heuristic value (f(n)),
// and the current number of search nodes added (e).
f=0, d=0, s=58, f(n)=0, e=0
Checking goal tree.
Checking node expansion limit.
Checking depth limit.
Attempting to match.
// The IR subtree for loading a value from the stack
// matches the PowerPC lwz instruction; a matching subtree is
// shown between '{{' and '}}' (we boldface the match for
// readability).
Match found:
if (true) M.word[<4+spTopOffset!32>+R[1]] = M.word[spTopOffset+R[1]]+{{M.word[<4+spTopOffset!32>+R[1]]}};
// The CISL ISA pattern that matched:
Target instruction lwz:
if (true) R[RT] = M.word[D+R[RA]];
// The ISA tree form that matched:
Target instruction tree:

(seq
 (gact (true):bit
  (lset
   (lget (mem (sym[R])) (param (sym[RT])):bu5 (sym[defaultSelector])):bs32
   (lget (mem (sym[M]))
    (add (param (sym[D])):bs16
     (lget (mem (sym[R])) (param (sym[RA])):bu5 (sym[defaultSelector])):bs32):bs32
    (sym[word])):bs32):bs32))
// The matching process generates bindings from IR
// constants to target parameters. In this case, the ISA
// displacement parameter 'D' is bound to the compiler constant
// <4+spTopOffset!32>, the register index 'RA' is bound to a
// specific target register '1' (this is the frame pointer that
// is provided in the store map schema), and the destination
// register index 'RT' is allocated the symbolic value '@9'.
// One constraint is generated from this match to indicate that
// the computed stack offset value must "fit" into the 16 bits
// allowed by the lwz 'D' parameter.
Bindings : D => <4+spTopOffset!32>, RT => @9, RA => 1,
Constraints: FITS(<4+spTopOffset!32>,16)

Search iadd:
// Matching continues with the previously matched subtree
// replaced by the allocated destination register 'R[@9]'
if (true) M.word[<4+spTopOffset!32>+R[1]] = R[@9]+M.word[spTopOffset+R[1]];
Tree:
// ...
f=8, d=1, s=44, f(n)=49, e=12
Checking goal tree.
Checking node expansion limit.
Checking depth limit.
Attempting to match.
Match found:
// Another match to the PowerPC lwz instruction is
// found on the IR subtree loading a value from a different
// stack location.
if (true) M.word[<4+spTopOffset!32>+R[1]] = R[@9]+{{M.word[spTopOffset+R[1]]}};
Target instruction lwz:
if (true) R[RT] = M.word[D+R[RA]];
Target instruction tree:
// ...
// The match generates bindings for the displacement
// 'D' ISA parameter to the 'spTopOffset' IR constant, 'RA' is
// bound to '1', and the destination index 'RT' is allocated
// location '@21'. Again, a fits constraint is generated for

// the 'spTopOffset'.
Bindings : D => spTopOffset, RT => @21, RA => 1,
Constraints: FITS(spTopOffset,16)

Search iadd:
// The resulting partial solution IR subtree is shown
// with the matched subtree replaced by the allocated destination
// location 'R[@21]'
if (true) M.word[<4+spTopOffset!32>+R[1]] = R[@21]+R[@9];
Tree:
// ...
// Note this partial solution is only at depth 2 in the
// search graph with 154 nodes expanded (partial solutions).
f=29, d=2, s=37, f(n)=47, e=154
Checking goal tree.
Checking node expansion limit.
Checking depth limit.
Attempting to match.
// Another match is found for the addition of the two
// top values of the stack (now in registers 'R[@21]' and 'R[@9]')
// against the PowerPC register-register add instruction.
Match found:
if (true) M.word[<4+spTopOffset!32>+R[1]] = {{R[@21]+R[@9]}};
Target instruction add:
if (true) R[RT] = R[RB]+R[RA];
Target instruction tree:
// ...
// Bindings are generated for the source registers 'RB'
// and 'RA' and a new ISA location '@180' is allocated for
// register index 'RT'. No constraints are generated.
Bindings : RB => @9, RT => @180, RA => @21,
Constraints:

Search iadd:
// The IR subtree reduced to the allocated destination
// register 'R[@180]'.
if (true) M.word[<4+spTopOffset!32>+R[1]] = R[@180];
Tree:
// ...
f=29, d=3, s=30, f(n)=45, e=207
Checking goal tree.
Checking node expansion limit.
Checking depth limit.
Attempting to match.
// A final match is found on the remaining unmatched
// IR tree that pushes the result of the addition onto the
// stack. This matches an entire PowerPC stw instruction pattern.
Match found:
{{if (true) M.word[<4+spTopOffset!32>+R[1]] = R[@180]}};
Target instruction stw:
if (true) M.word[D+R[RA]] = R[RS];
Target instruction tree:

// The correct bindings are generated along with the
// constraint.
Bindings : D => <4+spTopOffset!32>, RS => @180, RA => 1,
Constraints: <4+spTopOffset!32> => Fits(16)

Search iadd:
// A goal tree is found at depth 4, 224 nodes expanded
Tree: (seq)
f=29, d=4, s=1, f(n)=0, e=224
Checking goal tree.
Goal tree found.

We can summarize the resulting target pattern using an assembly-like format with instruction parameters set according to the match:

lwz  R[@9]   <- M[R[1]+spTopOffset+4]
lwz  R[@21]  <- M[R[1]+spTopOffset]
add  R[@180] <- R[@9]+R[@21]
stw  R[@180] -> M[R[1]+spTopOffset+4]

Although we show only one resulting target pattern, search also generates other patterns that implement the same IR. This pattern comes out first because of the efficiency metric (heuristic cost value), which prefers smaller trees, thus leading to smaller target code sequences. Although we have found that matching canonical forms tends to be fast, we further improve search efficiency by caching previous match results. The search can then use the cached matches directly rather than repeating the process. For example, here is another solution, found by re-using the stack offset computation stored in register R[@52]:

addi R[@52]  <- R[1]+spTopOffset+4
lwz  R[@80]  <- M[R[@52]]
lwz  R[@110] <- M[R[1]+spTopOffset]
add  R[@111] <- R[@80]+R[@110]
stw  R[@111] -> M[R[@52]]

6.5.3 Axiom Transformation

GIST uses semantics preserving axioms for two different purposes: as part of the canonicalization process to help reduce IR and ISA CISL trees into a normal form to improve matching, and during search to transform IR subtrees into semantically equivalent forms

that cannot be inferred automatically—this typically involves deeper semantic properties that violate termination, confluence, and axiom admissibility (see Chapter 5). Prior work

relied entirely on axiom application during search with little heuristic guidance [Cattell, 1978]. This provided some success, but the problem was simplified in two dimensions: the

ISA is described in terms of the IR and the generated IR uses symbols to refer to addressing modes available on the target. Suitable target addressing modes are pre-computed by the compiler’s front-end (i.e., the IR is decomposed to conform to target addressing modes a

priori), leaving the matching process to handle only an IR's primary operations. Despite the potential combinatorial explosion in search nodes generated by axiom applications, these simplifications often averted the problem in practice. In other cases, a

simple heuristic was applied to terminate a search path if it got too deep. GIST does not restrict the definitions of IR to conform to any ISA addressing modes. It relies on search to discover ISA instructions that match addressing modes exactly. If it can’t find a direct match, the IR is decomposed as part of the normal matching process to determine which

ISA instructions can be used to compute the address. The second solution GIST found for the iadd instruction in the previous section shows how an ISA instruction (addi) can be used to compute the value of a memory address. (This solution could not be found with prior techniques.)

Because, in favor of generality, GIST drops the restrictions imposed by earlier approaches, the potential for problems introduced by axiom application during search is increased. We reduce this issue by using canonicalization to limit the number of applicable axioms and by applying more sophisticated heuristics to bound the search space. We discuss these heuristics in the next section and focus here on the necessity of search-time axiom application. Consider the following simple rule of addition by zero:

X → X + 0

We exclude this rule from our canonicalization set as it violates the termination condition for a complete term rewriting system (i.e., we can apply the rule infinitely many times).

At the same time, it is a useful transformation in the case when a target ISA includes only an add immediate instruction and we are trying to match a register-register move IR operation—without this basic rule, we could not find a matching ISA sequence. Consider the problem of finding a sequence of target instructions on the PDP-8 for the following IR generated by the PQCC compiler [Leverett et al., 1980] (as originally described by Cattell [1978]):

ACC = M.val[offset];

This IR loads an accumulator register (ACC) with a value from memory.[1] The PDP-8 is an interesting test target because of its minimalistic nature (a sensible requirement for its time). For example, the PDP-8 does not have an instruction to load the accumulator directly. Here is abbreviated output of the solution found by GIST:

Search load:
if (true) ACC[0] = M.val[offset];
f=0, d=0, s=16, f(n)=0, e=0
Attempting to match.
// We apply an axiom...
Axiom: $x => $x + 0

// Continue search after applying the "add by zero" axiom
Search load:
if (true) ACC[0] = 0+M.val[offset];
f=1, d=1, s=19, f(n)=1, e=1
Attempting to match.
// We match the CLA (clear accumulator) instruction
// on the constant 0
Match found:
if (true) ACC[0] = {{0}}+M.val[offset];
Target instruction CLA:
if (true) ACC[0] = 0;
Bindings :
Constraints:

// The constant 0 is reduced to the ISA destination
// location, which is the accumulator (ACC). Note that
// canonicalization has re-oriented the tree.
Search load:
if (true) ACC[0] = M.val[offset]+ACC[0];
f=5, d=2, s=23, f(n)=1, e=12
Attempting to match.
// Another match is found on the entire residual IR tree
// against the TAD (two's complement add) instruction.
Match found:
{{if (true) ACC[0] = M.val[offset]+ACC[0]}};
Target instruction TAD:
if (true) ACC[0] = M.val[offs]+ACC[0];

[1] The CISL pattern for this IR was written as described in Cattell's dissertation. It shows how the IR is tailored to the target instruction set (PDP-8) in that it uses identical memories.

// The bindings and constraints are recorded
Bindings : offs => offset
Constraints: offset => Fits(7)

// A solution is found at depth 3 after expanding
// 17 search nodes
Search load:
Tree: (seq)
f=8, d=3, s=1, f(n)=0, e=17
Checking goal tree.
Goal tree found.

First, GIST applies the “add by zero” axiom to transform the memory reference into

M.val[offset]+0. Canonicalization re-orients the pattern so that the constant 0 appears first: 0+M.val[offset]. Next, we match the clear accumulator (CLA) instruction on the constant 0 and reduce it to the accumulator register (the tree is re-oriented again by canonicalization). Lastly, the two's complement addition (TAD) operation matches the complete residual IR tree, binding the ISA parameter offs with the IR constant offset, and generating a constraint indicating that offset must fit in 7 bits. The resulting sequence found by GIST in assembly-like format is:

CLA ACC <- 0
TAD ACC <- ACC+M[offset]

Although GIST finds the same PDP-8 sequence as Cattell did, our solution is found in instruction order, uses the same matching process throughout, and requires only one axiom application and two direct matches. Cattell's system dealt with two mismatches, applied the same add by zero axiom, and required special handling (fetch decomposition) to load 0 into the accumulator. It is worth mentioning that additional axioms (e.g., commutativity) might have also been applied in Cattell's case if the pattern descriptions were written differently.

In this next example, we show how the combination of GIST's canonicalization and search-time axioms works together to find a solution to a PQCC subtraction instruction:

ACC = M.val[offset] - ACC;

This IR instruction loads the accumulator with the result of subtracting the accumulator’s value from a value in memory. The problem in this case is that the PDP-8 does not include

a subtraction or negation operation, making it difficult for a target sequence to be found without axiomatic transformations. This is GIST's output for the solution:

// GIST begins with a canonicalized pattern tree. Prior work
// required an axiom for translating subtraction into negated addition
Search subacc:
if (true) ACC[0] = M.val[offset]+-ACC[0];
f=0, d=0, s=24, f(n)=0, e=0
Attempting to match.
// We apply an axiom for two's complement arithmetic
Axiom: -$x => ~$x + 1

// Search continues with the transformed canonicalized tree
Search subacc:
if (true) ACC[0] = 1+M.val[offset]+~ACC[0];
f=3, d=1, s=27, f(n)=1, e=3
Attempting to match.
// We match the COMA (complement accumulator) instruction
// against the IR subtree
Match found:
if (true) ACC[0] = 1+M.val[offset]+{{~ACC[0]}};
Target instruction COMA:
if (true) ACC[0] = ~ACC[0];
Bindings :
Constraints:

// The ~ACC subtree is reduced to the COMA destination location,
// which is the accumulator (ACC)
Search subacc:
if (true) ACC[0] = 1+M.val[offset]+ACC[0];
f=15, d=2, s=26, f(n)=1, e=18
Attempting to match.
// Another match is found against the TAD (two's
// complement addition) instruction
Match found:
if (true) ACC[0] = 1+{{M.val[offset]+ACC[0]}};
Target instruction TAD:
if (true) ACC[0] = M.val[offs]+ACC[0];
// Bindings and constraints are generated
Bindings : offs => offset,
Constraints: offset => Fits(7)

// The subtree is reduced to ACC and search continues
// on the residual subtree
Search subacc:
if (true) ACC[0] = 1+ACC[0];
f=47, d=3, s=19, f(n)=1, e=66
Attempting to match.
// The complete IR tree is matched to the INCA (increment
// accumulator) instruction
Match found:
{{if (true) ACC[0] = 1+ACC[0]}};
Target instruction INCA:
if (true) ACC[0] = 1+ACC[0];
Bindings :
Constraints:

// A solution is found at depth 4 after expanding
// 74 search nodes
Search subacc:
Tree: (seq)

f=49, d=4, s=1, f(n)=0, e=74
Checking goal tree.
Goal tree found.

In this example, GIST starts with an IR tree that has been canonicalized by the admissible

axiom $x-$y => $x+(-$y), transforming subtraction into negated addition. Next, a search-time axiom for two's complement arithmetic, -$x => ~$x+1, is used to transform the tree into ACC = M.val[offset]+~ACC+1, which is then re-oriented into ACC =

1+M.val[offset]+~ACC. A match against the complement accumulator (COMA) instruction is found and the IR subtree is reduced to ACC. This is followed by a match against the two's complement addition (TAD) instruction. Lastly, the INCA (increment accumulator)

instruction matches the entire residual tree and a solution is found. The resulting

PDP-8 sequence found by GIST is:

COMA ACC <- ~ACC
TAD  ACC <- ACC+M[offset]
INCA ACC <- ACC+1

With the help of the pattern indexer and canonicalization, the solution found by GIST is straightforward: a single axiom application and three direct matches. Previous work applied several axiom transformations (including commutativity of addition) and generated several mismatches before finding a solution. GIST does generate mismatches along other search paths, but it does not try to fix the problem; it simply drops the path from search and deals only with search points representing valid matches, making its approach uniform and efficient.
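The search-time axiom used in this example, -$x => ~$x + 1, is just the standard two's complement identity; a quick numeric check (a Python sketch, modelling 32-bit wraparound) confirms it:

    MASK = 0xFFFFFFFF                            # model 32-bit two's complement
    for x in (0, 1, 7, 123456, 0x7FFFFFFF):
        assert (-x) & MASK == ((~x) + 1) & MASK  # -x and ~x+1 agree bit-for-bit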

Our last example comparing our techniques to previous work illustrates GIST's handling of multiple CISL statements and control flow axioms. In this case, GIST must find a

PDP-8 sequence for the following PQCC instruction:

if (ACC == 0) ACC = 1;

In this example, the accumulator is set to 1 if the accumulator is 0. To find a PDP-8 sequence,

GIST relies on a search-time axiom describing control flow semantics: if ($x)

{ $y } => if (!$x) { PC = PC+1 }; $y. This rule says that a conditional can be rewritten into a form that “skips” the statements $y if the condition $x is not true. We

admit that this axiom fails unless $y is 1 word (the distance incremented by PC = PC+1).

To fix the problem we need to extend GIST's tree representation to include the notion of labels and code points. This would allow the distance of the skip to refer to a named label that

points to the end of the instruction; thus, the axiom could then be written as: if ($x) { $y } => if (!$x) { PC = inst-end }; $y, where inst-end refers to the

point after the guarded assignments specified by $y. Currently, it is a convention in CISL

to use PC as the “program counter” memory. This does not limit the applicability of CISL, however; architectures that name the program counter differently simply require the same

axioms with a different name (e.g., IP). Future versions of CISL hope to remove this restriction

by allowing indexed memories to be annotated as being the program counter. Here

is GIST’s output:

Search setacc:
if (0==ACC[0]) ACC[0] = 1;
f=0, d=0, s=20, f(n)=0, e=0
Attempting to match.
// We apply the axiom
Axiom: if ($x) $y => if (!$x) PC = PC+1; $y

// Search continues with the rewritten IR tree
Search setacc:
if (!0==ACC[0]) PC[0] = 1+PC[0]; if (true) ACC[0] = 1;
Attempting to match.
// The first CISL conditional statement matches
// the PDP-8 skip if not equal (SKPNE) instruction
Match found:
{{if (!0==ACC[0]) PC[0] = 1+PC[0]}}; if (true) ACC[0] = 1;
Target instruction SKPNE:
if (!0==ACC[0]) PC[0] = 1+PC[0];
Bindings :
Constraints:

// The tree is reduced by eliminating the statement
Search setacc:
if (true) ACC[0] = 1;
f=10, d=2, s=12, f(n)=1, e=16
Attempting to match.
// The next statement is matched against the

// SETA (set accumulator) instruction.
Match found:
{{if (true) ACC[0] = 1}};
Target instruction SETA:
if (true) ACC[0] = 1;
Bindings :
Constraints:

// A solution is found at depth 3 after expanding
// 21 search nodes
Search setacc:
Tree: (seq)
f=13, d=3, s=1, f(n)=0, e=21
Checking goal tree.
Goal tree found.

GIST finds a solution by applying a control flow axiom and matching two PDP-8 instructions. The SKPNE instruction skips the SETA instruction if the value in the accumulator is not 0. The resulting code sequence is summarized by:

SKPNE
SETA ACC <- 1

Other rules are important to ensure we cover the IR instructions completely. For example, the target sequence found previously by GIST for iadd is constrained to constants that fit into 16 bits (4+spTopOffset). Although this is the most efficient sequence (in terms of the number of target instructions), it does not help in cases when the constant is larger.

For this we need a rule to decompose a constant into parts:

P(X):bs32 → ((P(X) !>> 16) << 16) | (P(X) & 0xFFFF)
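A quick numeric sanity check of this rule (a Python sketch under the assumption of 32-bit values, with !>> read as a logical right shift) confirms that the two pieces recombine to the original value:

    def decompose(x):
        """Rebuild a 32-bit value from the pieces named by the axiom:
        ((x !>> 16) << 16) | (x & 0xFFFF), where !>> is a logical shift."""
        x &= 0xFFFFFFFF
        hi = (x >> 16) & 0xFFFF          # x !>> 16 on a 32-bit value
        lo = x & 0xFFFF
        return (hi << 16) | lo

    for v in (0, 1, 0xFFFF, 0x12345678, 0xFFFFFFFF):
        assert decompose(v) == v & 0xFFFFFFFF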

This rule indicates that a parameter P(X) that is big-endian, signed, and 32 bits wide is equivalent to the expression that performs a bitwise OR on the upper and lower 16 bits. This allows a more general ISA sequence to be found. The axiom is not used by canonicalization because it would exclude the more efficient, special-case sequences, thus making it inadmissible. We show the application of this search axiom with the following trace output generated by GIST when searching for a PowerPC sequence that implements the JVM iconst instruction:

Search iconst_m1:
if (true) M.word[spTopOffset+R[1]] = i;
f=0, d=0, s=19, f(n)=0, e=0
Attempting to match.
// We apply the constant decomposition axiom
Axiom: param($x) => (param($x)!>>16)<<16 | param($x)&0xFFFF

// Search continues with the rewritten IR tree.
// Canonicalization identifies two compiler constants:
// <65535&i> and <i!>>16>.
Search iconst_m1:
if (true) M.word[spTopOffset+R[1]] = <65535&i>|<i!>>16><<16;
f=49, d=1, s=33, f(n)=38, e=89
Attempting to match.
// The compiler constant <i!>>16> is matched by
// the load immediate (li) instruction
Match found:
if (true) M.word[spTopOffset+R[1]] = <65535&i>|<i!>>16><<16;
Target instruction li:
if (true) R[RT] = SI;
// A register is allocated and the signed
// immediate parameter (SI) is bound to the constant.
Bindings : RT => @68, SI => <i!>>16>
// No constraints!
Constraints:

// The constant is reduced to the destination location
Search iconst_m1:
if (true) M.word[spTopOffset+R[1]] = <65535&i>|R[@68]<<16;
f=49, d=2, s=33, f(n)=43, e=344
Attempting to match.
// The slwi (shift left by immediate) instruction
// matches a subtree
Match found:
if (true) M.word[spTopOffset+R[1]] = <65535&i>|R[@68]<<16;
Target instruction slwi:
if (true) R[RA] = R[RS]<<MB;
Bindings : RS => @68, MB => 16, RA => @287
// No constraints!
Constraints:

// The operation is reduced to the destination R[@287]
Search iconst_m1:
if (true) M.word[spTopOffset+R[1]] = <65535&i>|R[@287];
f=49, d=3, s=30, f(n)=45, e=584
Attempting to match.
// The ori (or immediate) instruction matches an IR subtree
Match found:
if (true) M.word[spTopOffset+R[1]] = <65535&i>|R[@287];
Target instruction ori:
if (true) R[RA] = D|R[RS];
// A destination is allocated and D is bound to the
// compiler constant <65535&i>.
Bindings : RS => @287, RA => @517, D => <65535&i>
// No constraints!

Constraints:

// The operation is reduced to the destination R[@517]
Search iconst_m1:
if (true) M.word[spTopOffset+R[1]] = R[@517];
f=49, d=4, s=23, f(n)=43, e=602
Attempting to match.
// The residual IR pattern is matched completely by
// the stw (store word) instruction
Match found:
{{if (true) M.word[spTopOffset+R[1]] = R[@517]}};
Target instruction stw:
if (true) M.word[D+R[RA]] = R[RS];
Bindings : RS => @517, RA => 1, D => spTopOffset,
// No constraints!
Constraints:

// A solution is found at depth 5 after expanding
// 623 search nodes
Search iconst_m1:
Tree: (seq)
f=49, d=5, s=1, f(n)=0, e=623
Checking goal tree.
Goal tree found.

In this example, GIST uses the constant decomposition axiom to split the iconst immediate (32 bits wide) into pieces in order to find a PowerPC sequence that is not constrained

by the size of its value. The matches that are found are further improved by GIST's ability to identify compiler constants. It matches half of the immediate against the PowerPC load immediate (li) instruction followed by a shift left by immediate (slwi) instruction. The remaining part of the immediate is taken care of by matching the or immediate (ori)

instruction, and then the residual IR tree matches the store word (stw) instruction, completing the search. The solution sequence it finds is:

li   R[@68]  <- i!>>16
slwi R[@287] <- R[@68]<<16
ori  R[@517] <- R[@287]|<65535&i>
stw  R[@517] -> M[R[1]+spTopOffset]

This ISA pattern is one of several generated by GIST for iconst. It was selected as the most general sequence because it was not restricted by any constraints on the immediate parameter i. The shortest sequence it found was a load immediate (li) followed by a store; however, the immediate i was constrained to fit in 16 bits. Because the types of the

matched parameters are part of the generated IS patterns, the adapters can use this information to generate sensible instruction selector code. For example, the load immediate in

the most general pattern found above attaches type information to the compiler constant

to indicate that it is a 16 bit CISL value. This allows the adapter to take proper measures

to ensure it is represented appropriately in compiler implementation code. Multiple patterns, with different constraints, are handled correctly using constructs in the compiler’s implementation language (e.g., if-then-else, switch).

To demonstrate GIST’s unique ability to handle complex IR semantics with grace, we show a final example that discovers a target sequence for the JVM iaload instruction as implemented by the Jikes Baseline compiler:

// bounds check
T[1] = S.slot[direct(spTopOffset)];
T[0] = S.slot[direct(spTopOffset+4)];
T[2] = H.cell[arrayRef(T[0], -4)];
if ((type uword_t) T[2] <= (type uword_t) T[1]) {
  throw ArrayIndexOutOfBoundsException
}

// load
T[1] = T[1] << 2;
T[2] = H.cell[regIndexed(0,1)];
S.slot[direct(spTopOffset + 4)] = T[2];

This instruction loads an integer value from an index in a Java array. The first value popped off the stack is the array index, which is stored into the Jikes register T[1]. The second value popped off the stack is the array reference and is stored in T[0]. Next, we need to retrieve the length of the array to perform a bounds check against the index. On Jikes for the PowerPC, the array length is stored in a 4-byte word before the array reference on the heap. This is reflected in the CISL code, storing the length value in T[2]. An unsigned comparison is performed on the array index (T[1]) and the length of the array (T[2]) and an exception is thrown if the length is less than or equal to the index. A left shift is performed on the index to convert the index into an offset and the integer value is loaded

from the heap (H.cell[regIndexed(0,1)]). Lastly, the integer is pushed onto the stack. GIST finds a solution as shown in the following output trace:

Search iaload: if (true) R[4] = M.word[spTopOffset+R[1]]; if (true) R[3] = M.word[<4+spTopOffset!32>+R[1]]; if (true) R[5] = M.word[<-4!32>+R[3]]; if (R[5]<=R[4]) throw ArrayIndexOutOfBoundsException; if (true) R[4] = R[4]<<2; if (true) R[5] = M.word[R[3]+R[4]]; if (true) M.word[<4+spTopOffset!32>+R[1]] = R[5]; f=0, d=0, s=168, f(n)=0, e=0

Attempting to match. // GIST matches the first pop to load word (lwz) Match found: {{if (true) R[4] = M.word[spTopOffset+R[1]]}}; if (true) R[3] = M.word[<4+spTopOffset!32>+R[1]]; if (true) R[5] = M.word[<-4!32>+R[3]]; if (R[5]<=R[4]) throw ArrayIndexOutOfBoundsException; if (true) R[4] = R[4]<<2; if (true) R[5] = M.word[R[3]+R[4]]; if (true) M.word[<4+spTopOffset!32>+R[1]] = R[5]; Target instruction lwz: if (true) R[RT] = M.word[D+R[RA]]; Bindings : RT => 4, RA => 1, D => spTopOffset,

Search iaload: if (true) R[3] = M.word[<4+spTopOffset!32>+R[1]]; if (true) R[5] = M.word[<-4!32>+R[3]]; if (R[5]<=R[4]) throw ArrayIndexOutOfBoundsException; if (true) R[4] = R[4]<<2; if (true) R[5] = M.word[R[3]+R[4]]; if (true) M.word[<4+spTopOffset!32>+R[1]] = R[5]; f=5, d=1, s=146, f(n)=151, e=8

Attempting to match. // GIST matches the second pop to load word (lwz) Match found: {{if (true) R[3] = M.word[<4+spTopOffset!32>+R[1]]}}; if (true) R[5] = M.word[<-4!32>+R[3]]; if (R[5]<=R[4]) throw ArrayIndexOutOfBoundsException; if (true) R[4] = R[4]<<2; if (true) R[5] = M.word[R[3]+R[4]]; if (true) M.word[<4+spTopOffset!32>+R[1]] = R[5]; Target instruction lwz: if (true) R[RT] = M.word[D+R[RA]]; Bindings : RT => 3, RA => 1, D => <4+spTopOffset!32>,

Search iaload: if (true) R[5] = M.word[<-4!32>+R[3]]; if (R[5]<=R[4]) throw ArrayIndexOutOfBoundsException; if (true) R[4] = R[4]<<2; if (true) R[5] = M.word[R[3]+R[4]];

if (true) M.word[<4+spTopOffset!32>+R[1]] = R[5]; f=9, d=2, s=117, f(n)=127, e=13

Attempting to match. // Loading the array length from the heap is // matched to the load word (lwz) instruction Match found: {{if (true) R[5] = M.word[<-4!32>+R[3]]}}; if (R[5]<=R[4]) throw ArrayIndexOutOfBoundsException; if (true) R[4] = R[4]<<2; if (true) R[5] = M.word[R[3]+R[4]]; if (true) M.word[<4+spTopOffset!32>+R[1]] = R[5]; Target instruction lwz: if (true) R[RT] = M.word[D+R[RA]]; Bindings : RT => 5, RA => 3, D => <-4!32>,

Search iaload: if (R[5]<=R[4]) throw ArrayIndexOutOfBoundsException; if (true) R[4] = R[4]<<2; if (true) R[5] = M.word[R[3]+R[4]]; if (true) M.word[<4+spTopOffset!32>+R[1]] = R[5]; f=13, d=3, s=90, f(n)=105, e=18

Attempting to match. // The bounds check matches against the twlle // PowerPC instruction. This instruction traps if the // comparison is less than or equal. Match found: {{if (R[5]<=R[4]) throw ArrayIndexOutOfBoundsException}}; if (true) R[4] = R[4]<<2; if (true) R[5] = M.word[R[3]+R[4]]; if (true) M.word[<4+spTopOffset!32>+R[1]] = R[5]; Target instruction twlle: if (R[RA]<=R[RB]) throw ArrayIndexOutOfBoundsException; Bindings : RB => 4, RA => 5,

Search iaload: if (true) R[4] = R[4]<<2; if (true) R[5] = M.word[R[3]+R[4]]; if (true) M.word[<4+spTopOffset!32>+R[1]] = R[5]; f=28, d=4, s=74, f(n)=94, e=34

Attempting to match. // The offset computation is matched against // the slwi (shift left by immediate) instruction. Match found: {{if (true) R[4] = R[4]<<2}}; if (true) R[5] = M.word[R[3]+R[4]]; if (true) M.word[<4+spTopOffset!32>+R[1]] = R[5]; Target instruction slwi: if (true) R[RA] = R[RS]<<MB; Bindings : MB => 2, RA => 4, RS => 4,

Search iaload: if (true) R[5] = M.word[R[3]+R[4]]; if (true) M.word[<4+spTopOffset!32>+R[1]] = R[5];

f=40, d=5, s=56, f(n)=81, e=47

Attempting to match. // The array access matches against the // lwzx load instruction. Match found: {{if (true) R[5] = M.word[R[3]+R[4]]}}; if (true) M.word[<4+spTopOffset!32>+R[1]] = R[5]; Target instruction lwzx: if (true) R[RT] = M.word[R[RA]+R[RB]]; Bindings : RB => 3, RT => 5, RA => 4,

Search iaload: if (true) M.word[<4+spTopOffset!32>+R[1]] = R[5]; f=44, d=6, s=30, f(n)=60, e=52

Attempting to match. // Lastly, the push is matched against a store word (stw). Match found: {{if (true) M.word[<4+spTopOffset!32>+R[1]] = R[5]}}; Target instruction stw: if (true) M.word[D+R[RA]] = R[RS]; Bindings : RA => 1, D => <4+spTopOffset!32>, RS => 5,

// A solution is found at depth 7 after expanding // 69 search nodes. Search iaload: Tree: (seq) f=49, d=7, s=1, f(n)=0, e=69 Checking goal tree. Goal tree found.

This example shows GIST’s ability to handle complex IR semantics that involve several guarded effects, exceptional conditions, and details that model run-time state (array length).

The stack access and array length semantics are matched by the PowerPC load word (lwz) instruction. The bounds check semantics are handled by the twlle instruction. This matches because our store map schema maps the ArrayIndexOutOfBoundsException

to a trap on the PowerPC. Additionally, because we are performing an unsigned comparison against the array length (<=), both the index >= length and index < 0 cases are accounted for (unsigned comparison captures both of these conditions). The array offset computation is matched against the shift left (slwi) instruction and the array access matches the lwzx instruction. The final push is implemented by a PowerPC store word (stw). The

summary of GIST’s solution is shown in the following assembly-like listing:

lwz   R[4] <- M[R[1]+spTopOffset]
lwz   R[3] <- M[R[1]+spTopOffset+4]
lwz   R[5] <- M[R[3]+(-4)]
twlle R[5] <= R[4]
slwi  R[4] <- R[4] << 2
lwzx  R[5] <- M[R[4]+R[3]]
stw   R[5] -> M[R[1]+spTopOffset+4]
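As noted above, the single unsigned comparison performed by twlle covers both out-of-bounds cases; a minimal Java illustration of the check (illustrative only; the generated code uses the trap instruction, not explicit branches):

```java
public class BoundsCheck {
    // Unsigned "length <= index" is true exactly when index >= length or index < 0,
    // because a negative index reinterpreted as unsigned becomes a very large value.
    static boolean outOfBounds(int length, int index) {
        return (length & 0xFFFFFFFFL) <= (index & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        System.out.println(outOfBounds(10, 3));   // false: in bounds
        System.out.println(outOfBounds(10, 10));  // true: index == length
        System.out.println(outOfBounds(10, -1));  // true: -1 is 0xFFFFFFFF unsigned
    }
}
```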

6.5.4 Pattern Generation

After GIST finds matching ISA pattern sequences for each IR instruction, the results are summarized and generated in a compiler-independent format (XML) as instruction selector patterns. Each instruction selector pattern records the IR pattern and matching ISA sequence pattern. The IR pattern lists its parameters along with any constraints that were found during the matching process. The ISA pattern sequence is associated with the parameter bindings that occurred during the matching process. The output patterns are then used by an adapter to generate compiler-specific instruction selector code. We describe this process in the next chapter.
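The exact XML schema is not reproduced here, but the information each output pattern carries can be pictured as a small record; the following Java sketch is illustrative only (the class and field names are not GIST's):

```java
import java.util.List;
import java.util.Map;

// Illustrative shape of one generated instruction selector pattern.
class SelectorPattern {
    String irName;                     // IR operation, e.g., "iconst"
    List<String> irParameters;         // IR parameters, e.g., ["i"]
    List<String> constraints;          // e.g., ["fits(i, 16)"]; empty for the most general form
    List<TargetInstruction> sequence;  // matched ISA instructions, in emission order
}

class TargetInstruction {
    String mnemonic;                   // e.g., "li", "slwi", "ori", "stw"
    Map<String, String> bindings;      // CISL field name -> bound value, e.g., {"RT": "@68"}
}
```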

6.5.5 Heuristics

As we mentioned previously, GIST relies on a small set of heuristics that are used to guide the search process. These heuristics are based on the size of the pattern tree, the length of the target sequence generated so far, the number of axioms applied, and the number of constraints generated. Here we summarize each heuristic:

Tree Size

The size of the pattern tree is the most important criterion used to determine progress during search. Smaller pattern trees indicate that we are closer to reaching a goal state. Thus, choosing the smallest pattern tree out of all the current partial solutions will lead

GIST toward finding a solution more quickly.

Sequence Length

There are several possible performance metrics that can be used to determine efficient code sequences. These include the length of the ISA pattern, the code density (i.e., how much

memory space does the generated code occupy), power consumption, and others. In this work, we chose target sequence length to be the most important performance criterion.

Thus, the shorter the sequence, the more efficient the IS pattern is. This heuristic can easily be replaced by anything that can gauge efficiency.

Axiom Application

Axioms are typically applied when a match cannot be found for an IR subtree. However,

GIST applies axioms to all subtrees at each search point, leading to an abundance of derived trees that are often not useful. To control this behavior, we penalize IR trees that have been generated by search-time axioms. This makes partial solutions with several axiom applications less preferable while still allowing trees transformed by a few axioms to be considered. This heuristic works with the tree size heuristic: applying axioms typically makes a tree bigger, thus farther away from a potential goal state.

Generated Constraints

It is important that GIST discover not only IS patterns that cover all possible IR instructions, but also that it find those that are more efficient. Typically, more efficient target sequences are constrained by their input parameters. To balance the search space in both directions, we penalize partial solutions that have generated constraints during the matching process. This allows more general target sequences to benefit and highly constrained sequences to be favored less. This heuristic works in conjunction with the sequence length heuristic: a shorter target sequence is more preferable, but it is typically constrained in at least one parameter.

These heuristics are computed as a weighted sum and used by GIST to determine the most preferable next search state to visit. We have had considerable success using these four heuristics and in Chapter 8 we present a study on the behavior of search with a variety of weights imposed on each heuristic measure.
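As a concrete illustration, the weighted sum h(n) = s×w1 + l×w2 + a×w3 + c×w4 given in Chapter 8 can be pictured as follows; the class names and the particular weight values are illustrative, not GIST's implementation:

```java
// Illustrative scoring of a partial solution; lower scores are preferred,
// since each heuristic acts as a penalty.
class PartialSolution {
    int treeSize;        // s: size of the residual IR pattern tree
    int targetLength;    // l: length of the target sequence so far
    int axiomsApplied;   // a: number of search-time axioms applied
    int constraints;     // c: number of constraints generated while matching
}

class HeuristicScore {
    double w1 = 0.2, w2 = 1.0, w3 = 1.0, w4 = 1.0;  // example weights only

    double h(PartialSolution n) {
        return n.treeSize * w1 + n.targetLength * w2
             + n.axiomsApplied * w3 + n.constraints * w4;
    }
}
```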

6.6 Summary

In this chapter we described GIST’s heuristic search procedure. We presented an overview of the search process and each of the important components used by GIST. We described instruction selector patterns and how they relate to semantic descriptions written in CISL. We introduced pattern indexing and discussed its advantage during search and matching. This was followed by an overview of resource allocation and how it contributes to the matching process. Lastly, we described the details of GIST’s heuristic search process including search strategy, how candidate ISA patterns are found, the bottom-up matching process, the axiom application process, and IS pattern generation. We finished by presenting the four heuristics used to guide GIST during the search process.

CHAPTER 7

INSTRUCTION SELECTOR ADAPTERS

A novel aspect of GIST is that it can generate instruction selector patterns independent

of a specific compiler framework and target machine. Previous work assumes a fixed IR and IS algorithm [Cattell, 1978, Dias, 2008]. Although this allowed research to progress in this area by simplifying the problem, it makes it difficult to apply results in alternative

compiler settings. In the previous chapter we discussed how GIST discovers IS patterns

from CISL descriptions of the IR and target machine: we use heuristic search to match IR subtrees to ISA patterns and generate as output implementation-independent IS patterns. In this chapter we illustrate how to use a compiler adapter to translate GIST-generated output patterns into instruction selector implementation code.

7.1 Overview

The hardest part of implementing an instruction selector—and the most difficult to get

right—is choosing which target patterns to use for each IR instruction. GIST generates

correct IS mappings assuming that the CISL code describes correct semantics and the axioms are valid. Although not important for this work, CISL is used to generate functional

simulators that are capable of executing real programs. This allows the correctness of CISL instruction semantics to be verified by observing the behavior of an executing program. As mentioned previously, we believe axioms can be verified by translating them into Boolean formulae suitable for input into a SAT solver (however, most are simple and easy to verify by inspection). Although we automate the difficult part of IS implementation, we must still translate the generated IS patterns into real IS code.

[Figure: block diagram showing GIST-produced IS patterns (XML) flowing into Compiler Adapter X, which produces Instruction Selector X for Compiler X (front-end and back-end).]

Figure 7.1: Compiler Adapter

This translation process is accomplished by a compiler adapter program that generates “code” for a specific IS. For the GIST approach to be successful, the adapter must be significantly easier to implement, extend, and maintain than an IS itself.

In Figure 7.1 we show where the compiler adapter fits in the GIST approach. It is important to observe that a compiler adapter is dependent only on the host compiler framework, not on the target machines. Thus, just one adapter is required for each compiler to

generate instruction selector code. For example, if we use GIST to generate IS patterns for compiler X targeting the PowerPC and compiler adapter X to translate the patterns into

instruction selector X code, we can use the same adapter to generate an instruction selector

for GIST-generated patterns targeting a different architecture (e.g., ARM, MIPS) without change. The rest of this chapter describes two compiler adapter implementations for two

different compiler frameworks with very different IS strategies: the Jikes RVM Baseline compiler [Alpern et al., 2005] and the LCC C compiler [Hanson and Fraser, 1995].

7.2 Jikes RVM Baseline Compiler Adapter

The Jikes RVM Baseline compiler is a just-in-time (JIT) compiler with a manual instruction selector implementation, written in Java, that translates JVM bytecode directly into target machine code. For each bytecode it has an “emit” method that performs the translation. The Baseline instruction selector provides target-specific support code for generating target instructions directly into memory. For the PowerPC implementation of Baseline the stack pointer is maintained implicitly by the integer variable spTopOffset and updated at compile time to generate an offset from the frame pointer used at run time. Here is an example: the emit method for the iadd instruction targeting the PowerPC:

// current offset of the sp from fp
public int spTopOffset;

/**
 * Emit code to implement the iadd bytecode
 */
@Override
protected final void emit_iadd() {
  popInt(T0);
  popInt(T1);
  asm.emitADD(T2, T1, T0);
  pushInt(T2);
}
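For orientation, the kind of work a popInt-style helper performs can be sketched as follows; this is not the actual Jikes RVM code, and the FP constant naming the frame-pointer register is an assumption (the GIST-generated code later in this section uses register 1 directly):

```java
// Illustrative sketch, not the Jikes RVM implementation: load the top Java
// stack slot into a scratch register and adjust the compile-time offset.
void popInt(int reg) {
    asm.emitLWZ(reg, spTopOffset, FP);   // load word from frame pointer + offset
    spTopOffset += BYTES_IN_STACKSLOT;   // the operand stack shrinks by one slot
}
```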

The emit_iadd method calls support routines that implement popping and pushing an integer from the runtime stack. The popInt method generates a load instruction and updates the spTopOffset instruction selector state variable. The pushInt method works similarly. Temporary registers (e.g., T0) are used to store intermediate results as we described before, and the emitADD instruction generates machine code for the PowerPC register-register add instruction using the specified temporary scratch registers. JVM instructions that require input parameters are implemented in Baseline using method parameters. For example, the bipush bytecode pushes a byte value onto the stack. The Baseline emit method for bipush takes an input parameter that is used to generate target machine code:

@Override
protected final void emit_bipush(int val) {
  asm.emitLVAL(T0, val);
  pushInt(T0);
}

In this example, the 32-bit Java integer val holds the byte value and is used to generate code to load the value into a scratch location (emitLVAL) and subsequently to push it onto the stack. Because some of the JVM bytecodes describe high-level semantics, the corresponding emit routines are not as straightforward. For example, getfield retrieves the value stored in an object’s field and requires a different target implementation depending on the JVM type of the field (e.g., integer, byte, object reference). Thus, the getfield bytecode is really several different instructions folded into one. These semantics are described in the

Jikes CISL description by decomposing the bytecode into parts that represent machine-like instructions that can be handled by CISL and matched by GIST. It is important to mention that this is not a problem for other compilers that deal with lower-level IR representations

(e.g., LCC) as the higher-level semantics have already been broken down into machine-like

IR instructions by the front-end. It is also worth saying that GIST is the first system that is capable of discovering IS patterns for higher-level IR semantics such as JVM bytecode.

The adapter program needs to translate IS patterns generated by GIST into Java code implementing the emit routines described above. Our Baseline adapter implementation is 230 lines of commented Java code and 255 lines of string templates [Parr, 2009] used to generate parts of the emit methods. The templates are used to map GIST patterns into Baseline implementation code. For example, each target instruction is associated with a template to generate Jikes machine code emit routines as shown in the following excerpt taken from the template file:

...
srw(RA,RS,RB)   ::= "asm.emitSRW($RA$, $RS$, $RB$);"
add(RT,RA,RB)   ::= "asm.emitADD($RT$, $RA$, $RB$);"
adde(RT,RA,RB)  ::= "asm.emitADDE($RT$, $RA$, $RB$);"
subf(RT,RA,RB)  ::= "asm.emitSUBFC($RT$, $RA$, $RB$);"
subfze(RT,RA)   ::= "asm.emitSUBFZE($RT$, $RA$);"
subfe(RT,RA,RB) ::= "asm.emitSUBFE($RT$, $RA$, $RB$);"
extsb(RA,RS)    ::= "asm.emitEXTSB($RA$, $RS$);"
bclr_special()  ::= "asm.emitBCLR();"
mtlr(RS)        ::= "asm.emitMTLR($RS$);"
...

The template name (on the left) is the same name used by the CISL instruction classes for the target, and the parameters refer to CISL variables used to define instruction fields. The right-hand side of the template is the Java code that is generated by the adapter. The implicit stack pointer (spTopOffset) must be updated appropriately for each

Baseline emit method. We use templates to indicate how each bytecode’s emit method is supposed to update the implicit stack pointer:

pe_ldc(inst)  ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;"
pe_pop(inst)  ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;"
pe_pop2(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;"
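The $-delimited parameters in these excerpts are in the style of the StringTemplate library cited above; a minimal sketch of rendering one target-instruction template is shown below (this assumes the StringTemplate 4 API and uses illustrative binding values; the adapter's actual code may differ):

```java
import org.stringtemplate.v4.ST;

public class RenderExample {
    public static void main(String[] args) {
        // Template for the PowerPC register-register add, with '$' delimiters.
        ST add = new ST("asm.emitADD($RT$, $RA$, $RB$);", '$', '$');
        // Fill the parameters from a pattern's bindings (values are illustrative).
        add.add("RT", "T2");
        add.add("RA", "T1");
        add.add("RB", "T0");
        System.out.println(add.render());  // prints: asm.emitADD(T2, T1, T0);
    }
}
```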

Additional templates are used to generate conditional code handling constraints on IS patterns and to represent the complete emit method. Targeting another architecture with this adapter requires only changing the templates for the target emit methods (shown above). For example, to generate Baseline targeting the x86 we copied the PowerPC templates (removing the PowerPC instruction templates) and added templates for each x86 instruction—this took only a few hours before we were generating Baseline Java code for x86. The following examples show output generated by our Baseline adapter for the iadd, iaload, and iconst bytecodes for the PowerPC:

// GIST GENERATED
@Override
protected final void emit_iadd() {
  asm.emitLWZ(T0, 4+spTopOffset, 1);
  asm.emitLWZ(T1, spTopOffset, 1);
  asm.emitADD(T2, T1, T0);
  asm.emitSTW(T2, 4+spTopOffset, 1);
  spTopOffset += BYTES_IN_STACKSLOT;
}

// GIST GENERATED.
@Override
protected final void emit_iaload() {
  asm.emitLWZ(4, spTopOffset, 1);
  asm.emitLWZ(3, 4+spTopOffset, 1);
  asm.emitLWZ(5, -4, 3);
  asm.emitTWLLE(5, 4);
  asm.emitSLWI(4, 4, 2);
  asm.emitLWZX(5, 3, 4);
  asm.emitSTW(5, 4+spTopOffset, 1);
  spTopOffset += BYTES_IN_STACKSLOT;
}

// GIST GENERATED.
@Override
protected final void emit_iconst(int i) {
  if (asm.fits(i,16)) {
    asm.emitLI(3, i);
    asm.emitSTW(3, spTopOffset, 1);
    spTopOffset += BYTES_IN_STACKSLOT;
  } else {
    asm.emitLI(3, i>>>16);
    asm.emitSLWI(4, 3, 16);
    asm.emitORI(5, 4, 65535&i);
    asm.emitSTW(5, spTopOffset, 1);
    spTopOffset += BYTES_IN_STACKSLOT;
  }
}

In the above Java code, the Baseline scratch registers (e.g., T0) are replaced by their corresponding integer constant values and the calls to Baseline helper routines (e.g., popInt) have been replaced with direct calls that generate target instructions into memory. Compiler constants have been properly generated to compute stack offsets and spTopOffset is properly updated as part of the generated instruction selector pattern’s emit method. We used our adapter to generate emit methods for the PowerPC and x86 Baseline instruction selector. The generated code compiled correctly in the Jikes environment—we highlight our results in Chapter 8.

7.3 LCC Compiler Adapter

The LCC C compiler is written in C and uses BURS technology [Fraser et al., 1992] for its instruction selector. As we mentioned in Chapter 2, a set of IS patterns is provided by

the compiler implementor and used as input to a bottom-up rewriting process to determine

optimal target code sequences. In this case study, we use GIST to generate the IS patterns automatically and an adapter to translate the patterns into the BURS format used by the

LCC instruction selector. Here is an excerpt of the LCC BURS description:

...
reg: ADDU4(reg,rc)  "addu $%c,$%0,%1\n" 1
reg: BANDI4(reg,rc) "and $%c,$%0,%1\n"  1
reg: BORI4(reg,rc)  "or $%c,$%0,%1\n"   1
reg: BXORI4(reg,rc) "xor $%c,$%0,%1\n"  1
reg: BANDU4(reg,rc) "and $%c,$%0,%1\n"  1
...

Each line corresponds to a single IS pattern with the format: destloc:irtree targetasm cost.

The destloc is the target destination location the IR tree is reduced to after a successful match, the irtree is the IR tree pattern to match, the targetasm is the target assembly to generate upon a successful match, and the cost is a cost metric for the target instruction.

The IR parameters (e.g., reg, rc) refer to other BURS patterns representing IR subtrees and are accessed in the target assembly template as %0 for the first parameter, %1 for the second, and so on. Register allocation is handled by the BURS process and can be accessed using the template variable %c to refer to the allocated resource. Other template variables are used to access constants and evaluate compiler constant expressions at compile-time. The cost metric is used by BURS to determine the quality of a target sequence. These costs are different from the heuristic costs used by GIST during the search process, but relate to its efficiency metric in terms of the length of the target sequence for the IS pattern. We

do not pollute CISL descriptions with this information because efficiency depends on the target execution environment—we expect this information to be provided separately or as part of user provided efficiency heuristics in future versions of GIST.

160 The adapter used to generate LCC BURS rules is similar to the Jikes adapter: it consists of 469 lines of commented Java code (most of this is string related manipulation to convert

GIST patterns into BURS-formatted rules) and 618 lines of string templates (a significant portion of this refers to boilerplate BURS-related C code support that is required by LCC’s

BURS generator program) used to generate the BURS rules for the 32-bit MIPS architecture; the x86 and SPARC template files are similar. The following is an excerpt from our MIPS template file:

...
REM(rd,rs,rt)    ::= <<"rem $%c,$%0,$%1\n" 1>>
REMU(rd,rs,rt)   ::= <<"remu $%c,$%0,$%1\n" 1>>
MUL(rd,rs,rt)    ::= <<"mul $%c,$%0,$%1\n" 1>>
MULTU(rs,rt)     ::= <<"multu $%0,$%1\n" 1>>
ADD_R(rd,rs,rt)  ::= <<"addu $%c,$%0,$%1\n" 1>>
ADDI(rd,rs,imm)  ::= <<"addiu $%c,$%0,$%1\n" 1>>
ADDU_R(rd,rs,rt) ::= <<"addu $%c,$%0,$%1\n" 1>>
ADDIU(rd,rs,imm) ::= <<"addiu $%c,$%0,$%1\n" 1>>
...

Similar to the Jikes templates, the template name (on the left) refers to CISL instruction classes defined for the target and the right-hand side is part of the BURS rule to be generated—other templates are used to generate the appropriate BURS pattern tree for each IR. The adapter’s job, in this case, is easier because it does not need to generate CISL parameters as part of the target template as was the case for Jikes; this is handled by BURS independently as part of its register allocation process or inserted directly as part of its internal matching process.

Because LCC’s IR is low-level, in most cases GIST found a one-to-one match with a target instruction that was identical to the hand-written BURS file. In some cases, the GIST-generated rules lacked sufficient compiler internal state representation (i.e., the instruction selector state was not included as part of the CISL description). For example, the original BURS rule for the CVII4 IR instruction (convert to integer) accesses part of the IR data structure to generate a single rule to handle conversions from shorts and halves to integers:

reg: CVII4(reg) "sll $%c,$%0,8*(4-%a); sra $%c,$%c,8*(4-%a)\n" 2

The hand-written pattern uses the %a template variable to access the size of the conversion to compute at compile time the correct shifting distance. GIST generated the conversions directly:

reg: CVII4(reg) "sll $%c,$%0,8; sra $%c,$%c,8\n" 2
reg: CVII4(reg) "sll $%c,$%0,16; sra $%c,$%c,16\n" 2
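The sll/sra pair implements sign extension of a narrower value already held in a 32-bit register: shifting left by k moves the value's sign bit into bit 31, and the arithmetic right shift brings it back down, replicating that sign bit. A minimal Java check of the idea (illustrative only):

```java
public class SignExtend {
    // (x << k) >> k sign-extends the low (32 - k) bits of x.
    static int extend(int x, int k) {
        return (x << k) >> k;   // >> is arithmetic shift in Java, like MIPS sra
    }

    public static void main(String[] args) {
        System.out.println(extend(0x0000FFFF, 16));  // -1: 16-bit 0xFFFF is negative
        System.out.println(extend(0x00007FFF, 16));  // 32767: positive values unchanged
    }
}
```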

The CISL description for LCC can be modified to include compiler state representing the

IR’s conversion size value and the CVII4 instruction could take advantage of that fact to describe the generic form. Canonicalization would then identify the computation as a compile-time constant and the identical target sequence would be generated by the adapter.

We used our LCC adapter to generate BURS rules for the MIPS, x86, and SPARC architectures and successfully built LCC for each case—we highlight our results in Chapter 8.

7.4 Summary

In this chapter we demonstrated that generating instruction selectors automatically in a compiler- and target-independent manner is possible using GIST and compiler adapters.

We discussed two case studies involving two very different instruction selector implementations for the Jikes RVM Baseline compiler and LCC. We showed that our adapters are small and easy to implement and generate correct output in different compiler frameworks targeting different architectures. Although anecdotal, it is interesting to mention that these adapters were implemented over the course of a summer program by undergraduate students without previous compiler experience.

CHAPTER 8

EXPERIMENTS AND RESULTS

8.1 Experimental Infrastructure

We implemented the CISL compiler and GIST in Java and ran all experiments on standard desktop/laptop machines containing Intel Core 2 Duo and PowerPC G5 processors. We wrote substantial CISL descriptions for two IRs and four different architectures and a partial description of the PDP-8 and PQCC compiler IR to reproduce Cattell’s earlier results.

8.2 Experimental Methodology

We designed our experiments to answer the following questions: For what fraction of IR patterns does GIST find at least one IS pattern for each IR and target combination (coverage)? How long does it take GIST to find instruction selector patterns (IS generator performance)? How fast does a compiler with a GIST generated instruction selector run (IS performance)? And, how fast is the resulting compiled code (execution performance

/ IS quality)? We use the term coverage to measure “usefulness”, i.e. how many patterns could GIST find to cover the functionality of the IR. Since selectors translate from IR to target, IR coverage is a priority. We are not seeking one-to-one translation in the general case—indeed, it can be detrimental to pursue total coverage for some targets. For example, an instruction selector for a JVM to IA32 compiler does not need to cover IA32 BCD manipulation instructions.

For GIST to be practical, it must run in time comparable to a typical compiler build.

That is, running GIST should not slow down the development of the compiler we are targeting. In the typical case, GIST is executed once to generate the IS patterns, so GIST performance is not as important. However, if we are experimenting with the IR or ISA (i.e.,

we are adding/removing instructions), GIST performance is more important. Since GIST is designed to be used for real compilers and compiler/architecture exploration, GIST performance is important. We investigate the performance of GIST in eight different IR/ISA IS combinations and compare the execution time in terms of the complexity of the ISA (e.g., number of instructions). We show results for total execution time and how long it took on average for GIST to find an IS pattern.

In addition to GIST performance, we are concerned with the performance of the compiler using a GIST generated IS and the efficiency of the code it generates. Thus, the generated IS should not have a negative impact on the overall execution time of the compiler—it should run in time no worse than a hand-coded (and optimized) IS. In addition, the target code generated by a GIST generated IS must be high-quality: it must have performance that is no worse than code generated by a hand-coded IS. We compare the execution time of three compiler-framework/ISA pairs: Jikes RVM Baseline compiler targeting PowerPC, lcc targeting x86, and lcc targeting MIPS. For two of these combinations, Baseline/PowerPC and lcc/x86, we show that a GIST generated IS produces code that is no worse, and in some cases better, than the original hand-crafted IS. This sample covers both the virtual machine and conventional (front-end/back-end) compiler space along with different targets, and allows us to measure performance. Additionally, we wanted to understand the behavior of our search procedure with respect to our set of chosen heuristics. For this we designed experiments to answer the question: Is there an optimal setting of weights that can be applied to each heuristic that can be used across all IR and ISA combinations to produce the best GIST results? For this we compared GIST execution time, coverage, search depth, and node expansion for different weights on pairs of heuristics.

Since GIST is written in Java, we were able to run it on a wide variety of OS and hardware platforms, from laptops to large servers. Running time is variable, depending mostly on the target instruction set (CISC takes longer than RISC), but is in the range of seconds to tens of seconds. Since GIST is used only at compiler build time, we find its performance quite acceptable.

8.3 CISL Results

To show that CISL is capable of describing a variety of IR and architectures we described three substantially different types of IR, lcc, Jikes RVM bytecode (both implicit and explicit stack), and a partial description of the PQCC IR, and several architectures including

PowerPC, ARM, MIPS, x86, and partial SPARC and PDP-8 descriptions. The ability of

CISL to span both compiler IR and architecture ISA is novel in this area of study and is a substantial contribution. We have shown a number of examples of CISL’s descriptive

capabilities and include additional excerpts from each of the descriptions in Appendix A.

8.4 IS Generation Performance

We show GIST’s performance results in Figure 8.1 and Figure 8.2. We show in Figure 8.1 the average execution time of GIST on each IR/ISA combination. The y-axis is the total number of seconds (log scale) that it took GIST to run on an IR/ISA pair. The x-axis

shows each of the 8 combinations that were tested. The results show that for CISC-like

architectures (e.g., x86), GIST takes the most time (i.e., 7 minutes for lcc/IA32 versus 16 seconds for lcc/MIPS32). Combinations that include the ARM architecture took more

time (2 minutes on average). This makes sense because the ARM provides some CISC-like features. In particular, its shifter operand addressing mode allows data processing instructions to have 16 possible forms that range from immediates to operands that can be

shifted by registers.

[Figure: bar chart titled “Total GIST Execution Time”; y-axis: time in seconds; x-axis: the eight IR/ISA combinations (LCC MIPS32, LCC ARM, LCC IA32, LCC PPC, JIKES ARM, JIKES PPC, JIKES IA32, JIKES MIPS32).]

Figure 8.1: Total GIST Performance

We show in Figure 8.2 the average time it takes GIST to find an IR match on the target ISA. Again, these numbers show that the more complicated the ISA (e.g., CISC), the longer it takes for GIST to find a match. The same arguments for ARM also apply. In the best case, it took GIST less than half a second to find an IR match for lcc/MIPS32. In the

worst case, it took GIST an average of 5 seconds to find an IR match for JIKES/IA32.

[Figure: bar chart titled “Average GIST Execution Time / IR”; y-axis: time in seconds; x-axis: the eight IR/ISA combinations.]

Figure 8.2: GIST Performance per IR

These results show that GIST’s performance is good and its use in real compiler development is tolerable. GIST performance could be further improved if it performed subsequent searches based only on the modified parts of CISL IR descriptions. For example, adding an IR operation should only require search for an IS pattern for that IR operation, not running through the entire IR instruction set. This capability currently exists: GIST can be executed on a subset of the IR instructions. To further increase performance, CISL could be pre-compiled to an intermediate format to eliminate parsing and semantic checking time.

Regardless, GIST’s performance is acceptable in terms of compiler development time. In Figure 8.3 and Figure 8.4 we show the average1 search depth (number of nodes along a path) and average number of nodes expanded before GIST reached a solution. In most cases, a solution was found at an average depth of 2, whereas in other cases (e.g., Jikes/ARM) at depth 4. Although not included on this graph, solutions were found by GIST at a maximum depth of 10 for all the Jikes combinations except the x86 target. This makes sense as these results show only the depth of successful patterns; the longer, more complex, IR trees matching a large and complicated target are very difficult. The second graph shows the average number of nodes expanded for a solution. An expanded node refers to a point in the search graph that has been generated by a match, contains a residual tree, but has not been visited yet. Not surprisingly, for a typical IR (e.g., lcc) targeting a RISC-style architecture, the number of nodes remains fairly small, whereas the x86, ARM, and Jikes RVM cases require additional exploration.

8.5 Coverage Results

We provide results for GIST source coverage in Table 8.1, in which columns stand for the source/target combination, number of source operations in the VM/IR specification

(Source), how many source instructions we effectively covered (Covered), the number of

[1] For this average we use the mode (the most common value) to give the best sense of the typical behavior.

[Figure: bar chart titled “Average Search Depth”; y-axis: node depth; x-axis: the eight IR/ISA combinations.]

Figure 8.3: Average Search Depth

target instructions (Target), and the coverage percentage with respect to the source operations. Effective operations are the subset of the source IR that we consider for matching. We omit non-effective operators based on various criteria, typically because the source semantics are highly variable, e.g., invokevirtual has nearly infinite variations in the called method signature; it needs to be broken down into instruction-like pieces, or the operations

are not supported on the target architecture, e.g., the Baseline description has 42 bytecodes that depend heavily on the semantics of the run-time system, which we omit from the effective set (we can map them to specific run-time calls). We expect to support run-time

calls in the future by describing their interface without having to give all the details of how they are implemented. This will allow an IR to match targets supporting a specific run-time using the same syntactic matching process. The most common versions of the

ARM do not include floating-point or division, which removes 35 + 2 = 37 operators from matching consideration. Many of these would be implemented instead by library calls. We also currently do not include floating-point operations for the IA-32 instruction set due to

[Figure: bar chart titled “Average Node Expansion”; y-axis: expanded nodes; x-axis: the eight IR/ISA combinations.]

Figure 8.4: Average Number of Search Nodes per IR Instruction

the 64-bit/80-bit representational difference between lcc and IA-32 (although we plan to add support for this in the future). This removes 36 operations from the effective set.

The results show some interesting patterns. We see higher matching success rates for the lcc BURS specification versus the JVM specification. JVM semantic descriptions are more difficult to match than conventional compiler architectures for two reasons. First, semantic trees produced from JVM specifications are often larger on a per-instruction (or bytecode) basis. We call this the higher-level of semantics problem, since each description unit has more semantics packed into it. GIST can handle many of these situations because ISA semantics tend to represent smaller units of functionality, but one must be careful with the heuristics to limit the size of the search space.

Second is the run-time problem, where the semantics of the JVM depend on the JVM run-time system. We excluded many of these from the effective operation set, and the remainder are the source of most of the unmatched JVM bytecodes. However, we can match them eventually by extending the target description to include available run-time calls. For the moment this remains future work.

Combination     Source   Target   Covered   Percent
LCC MIPS32         227      104       223       99%
LCC ARM            227      589       210       93%
LCC IA32           227   143577       209       93%
LCC PPC            227       96       223       99%
JIKES ARM           96      589        87       91%
JIKES PPC          117       96       114       98%
JIKES IA32          77   143577        33       43%
JIKES MIPS32       117      104        96       83%

Table 8.1: GIST Source Coverage Results

[Figure: bar chart of source coverage percentage (0% to 100%) for the eight IR/ISA combinations.]

Figure 8.5: GIST Source Coverage Results

From the figures we see lower coverage rates for Jikes MIPS32 and Jikes IA-32. Jikes MIPS32 is still a work in progress: it needs additional store mapping rules and search axioms to attain success rates comparable to lcc MIPS32. We include this case because it is an example of how quickly a new combination can start to produce useful results: we covered 83% of the Jikes RVM bytecodes with less than an hour of work using a simple store map and two existing CISL descriptions. The 43% coverage rate for the Jikes IA-32 combination stems from the fact that

CISL is also used to generate functional simulators. Because of this, our IA-32 description specifies the full set of IA-32 instructions (approximately 400K unique instructions, although we manually restricted the target set to 150K for this test). Pattern discovery is thus

orders of magnitude more difficult, although we are able to achieve 93% coverage for lcc.

However, JVM semantics typically involve larger CISL trees, thus matching them against a target with a large (and complicated) instruction pool explodes the search space. Figure 8.4

shows that the average number of nodes expanded for a successful Jikes IA-32 match is 600. It takes on average 5 seconds to find a solution for this combination. We note that we could easily improve performance for this case by reducing the IA-32 instructions to a

number (i.e., a few hundred) that is more comparable to the subset used in a typical IA-32 code generator, and in prior instruction selector generator research. In fact, we saw much higher coverage when our IA-32 description was only partially complete. Developing a

CISL extension to facilitate automatic subsetting of a large ISA description for different tools is a future research problem. Presently, however, we are indulging our curiosity to see

what GIST can discover in instruction sets that are many times larger than those tackled by

previous systems—experimentation that is only possible because GIST is so efficient.

8.6 IS Performance

We compared the execution time of the GIST-generated ISs for lcc/MIPS, lcc/x86, and Jikes RVM/PPC against their original implementations. In all cases, the run-time performance of the GIST-generated instruction selectors was statistically the same as the originals. This is not surprising because our adapter programs for each are producing code that

works in pre-existing IS frameworks with identical IS algorithms and data structures. That

is, the goal of GIST is not to change the fundamental approach of the instruction selector

algorithms; rather, it “adapts” the GIST patterns so they can be used by those algorithms.

For example, the BURS rules generated by GIST for the lcc compiler are almost identical to the original hand-written patterns. The Jikes RVM Baseline adapter generates Java code

that emits target instructions using a built-in assembler. Unlike the original, the GIST version calls the assembler’s emit procedures directly rather than using programmer-friendly helper routines—which are probably inlined by the Java compiler.

8.7 Generated Code Performance

As stated above, we tested the generated instruction selectors in two compiler frameworks:

Baseline targeting PowerPC, and lcc targeting MIPS. We compared the running time of code generated from our selector with that generated by the original framework. We used a subset of the DaCapo benchmarks [Blackburn et al., 2006] for Baseline and the lcc test

suite [Hanson and Fraser, 1995] for lcc. These results appear in Figures 8.6, 8.7, and 8.8 as a slowdown ratio (less than one indicates a speedup). While it is comforting to see that we obtain a speedup in many cases, we do not claim that this is statistically significant—

note the scale of the graphs. Essentially the performance of the generated selectors does not differ from that of the originals. Note that this process also validates the functionality of the instruction selectors.

8.8 Heuristic Study

To understand better the effect of our chosen heuristics we ran GIST on all IR/ISA combina-

tions with different pairs of heuristics and weights. GIST computes the heuristic value as a

weighted sum of the individual heuristic measures: h(n) = s×w1 + l×w2 + a×w3 + c×w4,

[Figure: bar chart titled “Jikes RVM - PPC Instruction Selector Performance”; y-axis: execution time slowdown; x-axis: DaCapo benchmarks antlr, bloat, fop, jython, lusearch, luindex, pmd, xalan.]

Figure 8.6: Jikes RVM-PPC slowdown (lower = better)

[Figure: bar chart titled “LCC Instruction Selector Performance”; y-axis: execution time slowdown; x-axis: lcc test programs 8q, array, fields, incr, limits, spill, stdarg.]

Figure 8.7: lcc - MIPS slowdown (lower = better)

where s is tree size, l is target length, a is axiom application, and c is the number of constraints (see Section 6.5.5). We paired tree size with target length and axioms with constraints. For each pair (x,y), we fixed the weight of x at 1.0 and then ran 10 GIST runs

applying weights from 0.0 to 1.0, in 0.1 increments, to y. We then performed the same set of experiments with y fixed to 1.0 and varied x. The purpose of these experiments was to determine if a setting of the weights exists that can maximize coverage across all IR/ISA

combinations.
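A rough sketch of this sweep protocol follows; runGist is a hypothetical stand-in for launching a GIST run with the given heuristic weights and reporting IR coverage:

```java
public class WeightSweep {
    public static void main(String[] args) {
        // Vary the target-length weight while the tree-size weight is fixed at 1.0
        // (one of the studied pairs); the other pair is swept the same way.
        for (int step = 0; step <= 10; step++) {
            double varied = step / 10.0;  // 0.0, 0.1, ..., 1.0
            double coverage = runGist(1.0, varied, 1.0, 1.0);
            System.out.printf("length weight %.1f -> coverage %.1f%%%n", varied, coverage * 100);
        }
    }

    // Hypothetical: would invoke GIST with weights (size, length, axioms, constraints)
    // and return the fraction of IR operations covered.
    static double runGist(double wSize, double wLength, double wAxioms, double wConstraints) {
        return 0.0;
    }
}
```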

[Figure: bar chart titled “LCC-IA32 Instruction Selector Performance”; y-axis: execution time slowdown; x-axis: lcc test programs 8q, array, cf, init, sort, stdarg, struct, switch, wf1.]

Figure 8.8: lcc - IA32 slowdown (lower = better)

The most interesting result is found in Figure 8.9, which clearly shows that a weight of 0.1-0.3 on tree size has a significant impact across all IR/ISA pairs: the coverage is increased by 20% in each case. In most cases, the application of larger weights to tree size worsens coverage—for the Jikes/MIPS case it drops to 40%. In Figure 8.10 we fixed the tree size and varied the target length weight. Although not as drastic as the previous figure, this shows that the higher the weight, the better the coverage. This is especially true for Jikes/PPC as it jumps from 60% to over 80% coverage between weights of 0.6 and 0.7. Figure 8.11 and Figure 8.12 show results for the axiom/constraint heuristic pair. Not surprisingly, the coverage for each IR/ISA pair is not impacted nearly as much. This makes sense as axioms apply to only a small number of cases (unlike prior work) and work in conjunction with tree size (i.e., axioms tend to increase tree size). Constraints are similar in that they do not limit a solution’s being found; rather, they limit the type of solution that is found (i.e., special case versus general). In Figure 8.13 we vary the size and fix the axiom heuristic. We notice that coverage increases for each case except for JIKES PPC and JIKES MIPS32.

This behavior is similar in Figure 8.15 when we vary the size and fix the constraint heuristic. The decrease in coverage also shows up for the JIKES MIPS32 case in Figure 8.17. This indicates that the complexity of the IR and ISA trees may require different settings for the

[Figure: coverage (%) versus weight (0 to 1), one line per IR/ISA combination.]

Figure 8.9: Variable Size/Fixed Length

heuristic weights. We could use the size of the original IR tree and the number and size of the ISA to choose settings of the weights that will provide the best possible coverage. The other pairs do not have an impact on IR coverage.

[Figure: coverage (%) versus weight (0 to 1), one line per IR/ISA combination.]

Figure 8.10: Fixed Size/Variable Length

[Figure: coverage (%) versus weight (0 to 1), one line per IR/ISA combination.]

Figure 8.11: Fixed Axioms/Variable Constraints

[Figure: coverage (%) versus weight (0 to 1), one line per IR/ISA combination.]

Figure 8.12: Variable Axioms/Fixed Constraints

[Figure: coverage versus weight (0 to 1), one line per IR/ISA combination.]

Figure 8.13: Variable Size/Fixed Axioms

[Figure: coverage versus weight (0 to 1), one line per IR/ISA combination.]

Figure 8.14: Fixed Size/Variable Axioms

[Figure: coverage versus weight (0 to 1), one line per IR/ISA combination.]

Figure 8.15: Variable Size/Fixed Constraints

[Figure: coverage versus weight (0 to 1), one line per IR/ISA combination.]

Figure 8.16: Fixed Size/Variable Constraints

[Figure: coverage versus weight (0 to 1), one line per IR/ISA combination.]

Figure 8.17: Variable Length/Fixed Axioms

[Figure: coverage versus weight (0 to 1), one line per IR/ISA combination.]

Figure 8.18: Fixed Length/Variable Axioms

[Figure: coverage versus weight (0 to 1), one line per IR/ISA combination.]

Figure 8.19: Variable Length/Fixed Constraints

[Figure: coverage versus weight (0 to 1), one line per IR/ISA combination.]

Figure 8.20: Fixed Length/Variable Constraints

CHAPTER 9

CONCLUSION

This dissertation extends the state of the art in automated compiler construction. It contributes to our understanding of instruction-level semantics and how to derive instruction selector patterns automatically. It also provides insight into how search is impacted by our chosen heuristics. In this chapter, we recap our main contributions and discuss directions for future research.

9.1 Main Contributions

This dissertation describes a language and search procedure for deriving IS patterns. Our efforts led to the following main contributions: (1) the CISL machine description language, (2) parameterized compiler design, (3) admissible axioms and canonicalization, (4) solving the generalized instruction selector generator matching problem, (5) adapter programs,

(6) evaluating GIST between multiple IR and ISA combinations, and (7) validating GIST against two instruction selectors (lcc/JikesRVM). To summarize, GIST largely solves the

“UNCOL Problem” [Conway, 1958] for instruction selection. UNCOL intended to make it easy to create compilers for each new target machine and programming language. The concept of a universal IR was introduced, but was never adopted because of the wide variety of IRs that exist to support different kinds of programming language paradigms. GIST supports any IR and any target by describing both in the CISL language and discovering the ISA implementation for each IR, thus allowing compilers to be easily created for any

IR and ISA combination.

9.1.1 CISL Instruction Specification Language

In Chapter 3, we presented the design and implementation of the CISL [Moss et al., 2005]

instruction specification language, which supports a wide range of target ISAs, compiler IRs, and VM specifications while maintaining its independence from existing frameworks. We showed that it is possible and beneficial to use a universal language to describe both IR

and target ISAs. CISL’s compiler and target independent properties allow GIST to discover IS patterns independently of a compiler framework and target machine.

9.1.2 Compiler Design Specification

In Chapter 4, we presented the means for describing the mapping between the resources of

the source and target. We do this separately from the source and target semantics because it embodies an important set of design choices over which a compiler writer must have control, e.g., how can we map a source stack abstraction onto a target memory? Keeping

store mappings separate from the source and target description also avoids polluting them with semantic information that should not be directly tied to either. We also discussed

CISL’s support for describing complex compiler state in a fashion independent of compiler

framework.

9.1.3 Admissible Axioms and Canonicalization

In Chapter 5, we described the importance of canonicalization and how CISL semantic

patterns are normalized to produce canonical forms. GIST uses two techniques to normalize

CISL trees: term rewriting systems and expression reassociation. Term rewriting system techniques were used to reduce the number of search-time axioms compared to previous work. We showed that canonicalization axioms are restricted to those that are terminating, confluent, and admissible—important criteria that allow IR and ISA trees to be reduced to a normal form.

9.1.4 GIST and Heuristic Beam Search

In Chapter 6 we presented GIST, a solution to the generalized instruction selector generator matching problem. We described the search algorithm we used (beam search), the GIST search strategy, how target instruction patterns are indexed, target resource allocation, the necessary application of search-time axioms, how GIST evaluates partial IS solutions using

heuristics, and what those heuristics are. We demonstrated the effectiveness of GIST with trace output generated during search.

9.1.5 Compiler Adapters

In Chapter 7, we presented compiler adapters that are used to adapt GIST output to an existing compiler framework. While the descriptions and search process for GIST are framework-independent, at some point the generated instruction selector must interface with a code generator and front end for the results to be useful. We have worked to make creating adapters both simple and easy, and our case studies on the lcc C compiler and Jikes RVM Baseline bytecode compiler show that writing them does not require extensive experience with compilers, and that they are at least an order of magnitude smaller than ISA

descriptions.

9.1.6 GIST Evaluation and Validation

We evaluated GIST in Chapter 8 in the context of two very different compiler frameworks and multiple target architectures, including generation of instruction selectors for targets

that were not previously supported by the frameworks. We validated GIST for those targets where code generators were available. We showed that our generated instruction selectors produce code that has performance essentially identical to that of the existing selectors.

9.2 Future Directions

We made significant progress in demonstrating the effectiveness and practicality of generating instruction selectors automatically in a compiler- and target-independent fashion, but work remains to ensure that CISL and GIST can flourish with next-generation architectures

(as we believe they can). This will require additional CISL descriptions, possibly more axioms to describe complex interactions in a heterogeneous multi-core environment, and potentially new heuristics to guide GIST. We summarize additional future directions with

CISL and GIST in the following sections.

9.2.1 CISL and GIST Extensions

We demonstrated GIST’s effectiveness at generating instruction selectors automatically in a machine- and compiler-independent manner. At the same time, this work has demonstrated that high-level IR semantics (e.g., JVM bytecode) are still a difficult area, especially when matched against an ISA (e.g., x86) that supports upward of 100K instructions. To solve this problem requires further exploration and experimentation with the kind of search technique that is used and the choice of heuristics to apply. There are several opportunities in this area that could lead to improved coverage for problematic cases. In particular, we believe the application of machine learning can improve coverage by specializing our heuristic function to the IR/ISA pair. Furthermore, it could be advantageous to use different search strategies at different times depending on the complexity of the IR and ISA. For example, GIST currently uses beam search for all combinations. Perhaps applying first a slower depth-first or breadth-first search can lead to a solution quickly, and that information can be used to guide beam search to find specialized forms. In addition, statistics from previous search runs can be used to improve the performance (i.e., quickly prune unlikely paths) and coverage of GIST. In addition, extensions are required to support IR instructions that involve calling conventions and case analysis (e.g., JVM tableswitch bytecode). Instructions of this sort

require a translation into a form that can be handled by GIST and a state machine engine to record and use allocated locations, i.e., argument 3 is in a register, argument 4 is in memory

at location l1, etc. Currently, CISL can describe these translated forms and GIST can find matching ISA sequences. This has not been a problem in earlier techniques or for GIST on lcc combinations because most of the translation has already occurred by the front-end of the compiler to produce simpler IR operations. Additional CISL extensions are necessary to easily manage ISA subsets to allow functional simulators to be generated using a complete

ISA description and instruction selectors to be generated using a subset of the instructions useful for compilers. Lastly, CISL can be used for generating additional compiler tools, such as register allocators, schedulers, assemblers, disassemblers, and linkers.
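To make the strategy-switching idea above more concrete, the following is a minimal sketch of how a fallback search might wrap the existing beam search. All names here (IrPattern, InsnSequence, SearchStrategy, and so on) are hypothetical and do not correspond to GIST's actual classes; the sketch only illustrates the control flow of trying the cheap search first, falling back to an exhaustive one, and then reusing its result as a bound.

    // Hypothetical types for illustration only; these are not GIST's classes.
    interface IrPattern { }

    interface InsnSequence {
        int length(); // number of target instructions in the sequence
    }

    interface SearchStrategy {
        // Returns a sequence of at most maxLength target instructions implementing ir,
        // or null if none is found within the time budget.
        InsnSequence search(IrPattern ir, int maxLength, long timeoutMillis);
    }

    final class FallbackSearch implements SearchStrategy {
        private final SearchStrategy beam;       // fast but incomplete (current behavior)
        private final SearchStrategy exhaustive; // slow DFS/BFS, complete up to maxLength

        FallbackSearch(SearchStrategy beam, SearchStrategy exhaustive) {
            this.beam = beam;
            this.exhaustive = exhaustive;
        }

        @Override
        public InsnSequence search(IrPattern ir, int maxLength, long timeoutMillis) {
            // Try the cheap beam search first.
            InsnSequence found = beam.search(ir, maxLength, timeoutMillis / 4);
            if (found != null) {
                return found;
            }
            // Fall back to an exhaustive search to obtain some covering sequence.
            InsnSequence baseline = exhaustive.search(ir, maxLength, timeoutMillis / 2);
            if (baseline == null) {
                return null; // no coverage for this IR pattern within the budget
            }
            // Re-run beam search with the baseline's length as a tighter bound,
            // hoping to specialize the result; otherwise keep the exhaustive solution.
            InsnSequence improved = beam.search(ir, baseline.length(), timeoutMillis / 4);
            return improved != null ? improved : baseline;
        }
    }

The particular division of the time budget and the use of sequence length as the bound are arbitrary choices; the point is only that a complete but slow pass can seed the faster heuristic pass, which is also one way the statistics mentioned above could be fed back into the search.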

9.2.2 Peephole Optimizers

A peephole optimizer can be viewed as an instruction selector that translates from an ISA to itself, producing shorter or cheaper instruction sequences. Although the requirements are slightly different, it would be interesting to see if we can apply GIST in this context. This would allow GIST-generated instruction selector patterns to be further optimized by a GIST-generated peephole optimizer.
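As a rough illustration of what the driver for such an optimizer could look like once same-ISA rewrite patterns were available, consider the following sketch. The types and method names are hypothetical (they are not GIST's pattern classes); the driver simply applies window-sized rewrite rules over an instruction list until no rule matches.

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical types for illustration only.
    interface Insn { }

    interface PeepholeRule {
        int windowSize();                      // number of adjacent instructions inspected
        List<Insn> rewrite(List<Insn> window); // replacement sequence, or null if no match
    }

    final class PeepholeOptimizer {
        static List<Insn> optimize(List<Insn> code, List<PeepholeRule> rules) {
            List<Insn> out = new ArrayList<>(code);
            boolean changed = true;
            while (changed) {                  // repeat until no rule applies (fixed point)
                changed = false;
                for (int i = 0; i < out.size() && !changed; i++) {
                    for (PeepholeRule rule : rules) {
                        int end = i + rule.windowSize();
                        if (end > out.size()) continue;
                        List<Insn> replacement = rule.rewrite(out.subList(i, end));
                        if (replacement != null) {
                            List<Insn> next = new ArrayList<>(out.subList(0, i));
                            next.addAll(replacement);
                            next.addAll(out.subList(end, out.size()));
                            out = next;
                            changed = true;
                            break;
                        }
                    }
                }
            }
            return out;
        }
    }

A GIST-generated rule set would supply the rewrite rules; the driver itself is the standard fixed-point peephole loop.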

9.2.3 Binary Rewriters

A binary rewriter translates code for one machine into code that can execute on another. Because GIST views machines in terms of CISL, it would be possible to use GIST to find instruction selector patterns from one architecture to another (e.g., PowerPC to x86, MIPS to ARM). It would be interesting to see what extensions would be required for GIST to solve the binary rewriter problem.

9.2.4 Very Portable JIT Optimizer

In this dissertation we focused on the idea of CISL being used to describe target machines and compiler IRs for multiple frameworks. What if we used CISL directly to represent a standard compiler IR? We could use this as the basis for constructing a compiler- and architecture-independent JIT optimizer that performs transformations on CISL pattern representations. Furthermore, we could use CISL to describe compiler IRs for other frameworks and use GIST to generate rules to translate the other IR into the standard form used by our optimizer.

APPENDIX A

CISL ISA Descriptions

ARM Excerpt

class Instruction {
  var inst_t inst = M.word[fetch(R[15])];

  fun stepPC(bit passed, rinx_t dest) {
    if (!passed || (passed && (dest != 15))) {
      R[15] = R[15] + 4;
    }
  }
}

mixin Eq extends Condition {
  fun encode() {
    cond = 0b0000;
  }
  fun bit condPassed() {
    return getZ();
  }
}

class CondInstruction extends Instruction
    uses Eq, Ne, Cs, Cc, Mi, Pl, Vs, Vc, Hi, Ls, Ge, Lt, Gt, Le, Al, Nv {
  var big signed bit[4] cond @ inst[0:3];
}

class DataProcessingInstruction extends CondInstruction
    uses ImmediateShifterOperand, RegisterShifterOperand, LogicalShiftLeftImm,
         LogicalShiftRightImm, RotateRightImm, LogicalShiftLeftReg,
         LogicalShiftRightReg, ArithmeticShiftRightReg, RotateRightReg,
         SetCPSR, LeaveCPSR, ShifterOperandOut, RotateRightExtend {
  var big signed bit[4] op @ inst[7:10];
  var big signed bit[1] s @ inst[11:11];
  var rinx_t rn @ inst[12:15];
  var rinx_t rd @ inst[16:19];
}

instruction class Mov extends DataProcessingInstruction {
  fun encode() {
    op = 0b1101;
    inst[4:5] = 0b00;
  }

  fun effect() {
    R[rd] = shifter_operand();
  }
}

instruction class Orr extends DataProcessingInstruction {
  fun encode() {
    op = 0b1100;
    inst[4:5] = 0b00;
  }

  fun effect() {
    R[rd] = R[rn] | shifter_operand();
  }
}

instruction class Ldrsb extends MiscMemInstruction {
  fun encode() {
    L = 0b1;
    S = 0b1;
    H = 0b0;
  }

  fun effect() {
    var address_t addr = get_addr();
    var byte_t val = M.byte[id(addr)];
    R[rd] = val ! 32;
  }
}

IA32 CISL Excerpt

mixin ModRM16_BX_SI_EA {
  fun encode() {
    mod = 0b00;
    r_m = 0b000;
  }
  fun word_t getLoc() {
    var addr_t offset = (BX + SI) ! 32;
    return M.word[flat(offset)];
  }
  fun setLoc(word_t word) {
    var addr_t offset = (BX + SI) ! 32;
    M.word[flat(offset)] = word;
  }
}

mixin RegEAX {
  fun encode() {
    reg = 0b000;
  }
  fun dword_t getDWordReg() {
    return EAX;
  }
  fun setDWordReg(dword_t word) {

189 EAX = word; } fun word_t getWordReg() { return AX; } fun setWordReg(word_t word) { AX = word; } fun byte_t getByteReg() { return AL; } fun setByteReg(byte_t word) { AL = word; } } instruction class MOV_R2RM_16 extends OneByteOpcode uses RegEAX, RegECX, RegEDX, RegEBX, RegESP, RegEBP, RegESI, RegEDI, ModRM16_BX_SI_EA, ModRM16_BX_DI_EA, ModRM16_BP_SI_EA, ModRM16_BP_DI_EA, ModRM16_SI_EA, ModRM16_DI_EA, ModRM16_Disp16_EA, ModRM16_BX_EA, ModRM16_BX_SI_Disp8_EA, ModRM16_BX_DI_Disp8_EA, ModRM16_BP_SI_Disp8_EA, ModRM16_BP_DI_Disp8_EA, ModRM16_SI_Disp8_EA, ModRM16_DI_Disp8_EA, ModRM16_BP_Disp8_EA, ModRM16_BX_Disp8_EA, ModRM16_BX_SI_Disp16_EA, ModRM16_BX_DI_Disp16_EA, ModRM16_BP_SI_Disp16_EA, ModRM16_BP_DI_Disp16_EA, ModRM16_SI_Disp16_EA, ModRM16_DI_Disp16_EA, ModRM16_BP_Disp16_EA, ModRM16_BX_Disp16_EA, ModRM16_AX, ModRM16_CX, ModRM16_DX, ModRM16_BX, ModRM16_SP, ModRM16_BP, ModRM16_SI, ModRM16_DI { var little unsigned bit[2] mod @ mod_rm[7:6]; var little unsigned bit[3] reg @ mod_rm[5:3]; var little unsigned bit[3] r_m @ mod_rm[2:0]; fun encode() { D_FLAG = 0; inst = 0x89; } fun effect() { setLoc(getWordReg()); } } instruction class CMOVE_16 extends TwoByteOpcode uses RegEAX, RegECX, RegEDX, RegEBX, RegESP, RegEBP, RegESI, RegEDI, ModRM16_BX_SI_EA, ModRM16_BX_DI_EA, ModRM16_BP_SI_EA, ModRM16_BP_DI_EA, ModRM16_SI_EA, ModRM16_DI_EA, ModRM16_Disp16_EA, ModRM16_BX_EA, ModRM16_BX_SI_Disp8_EA, ModRM16_BX_DI_Disp8_EA, ModRM16_BP_SI_Disp8_EA, ModRM16_BP_DI_Disp8_EA, ModRM16_SI_Disp8_EA, ModRM16_DI_Disp8_EA, ModRM16_BP_Disp8_EA, ModRM16_BX_Disp8_EA,

190 ModRM16_BX_SI_Disp16_EA, ModRM16_BX_DI_Disp16_EA, ModRM16_BP_SI_Disp16_EA, ModRM16_BP_DI_Disp16_EA, ModRM16_SI_Disp16_EA, ModRM16_DI_Disp16_EA, ModRM16_BP_Disp16_EA, ModRM16_BX_Disp16_EA, ModRM16_AX, ModRM16_CX, ModRM16_DX, ModRM16_BX, ModRM16_SP, ModRM16_BP, ModRM16_SI, ModRM16_DI { var byte_t mod_rm = M.byte[fetchByte()]; var little unsigned bit[2] mod @ mod_rm[7:6]; var little unsigned bit[3] reg @ mod_rm[5:3]; var little unsigned bit[3] r_m @ mod_rm[2:0]; fun encode() { inst2 = 0x44; D_FLAG = 0; } fun effect() { if (ZF) { setWordReg(getLoc()); } } }

PowerPC CISL Excerpt

class Instruction {
  var inst_t inst;
  var opcd_t OPCD @ inst[0:5];
}

class AForm extends Instruction {
  var reg_t FRT @ inst[6:10];
  var reg_t FRA @ inst[11:15];
  var reg_t FRB @ inst[16:20];
  var reg_t FRC @ inst[21:25];
  var reg_t XO @ inst[26:30];
}

class IForm extends Instruction {
  var li_t LI @ inst[6:29];
  var aa_t AA @ inst[30:30];
  var lk_t LK @ inst[31:31];
}

class DForm_RT_RA_D extends Instruction {
  var reg_t RT @ inst[6:10];
  var reg_t RA @ inst[11:15];
  var disp16_t D @ inst[16:31];
}

class DForm_RT_RA_SI extends Instruction {
  var reg_t RT @ inst[6:10];
  var reg_t RA @ inst[11:15];
  var si_t SI @ inst[16:31];
}

191 class DForm_BF_RA_SI extends Instruction { var reg_t BF @ inst[6:10]; var reg_t RA @ inst[11:15]; var si_t SI @ inst[16:31]; } class DForm_RS_RA_D extends Instruction { var reg_t RS @ inst[6:10]; var reg_t RA @ inst[11:15]; var disp16_t D @ inst[16:31]; } class DForm_FRT_RA_D extends Instruction { var reg_t FRT @ inst[6:10]; var reg_t RA @ inst[11:15]; var disp16_t D @ inst[16:31]; } class DForm_FRS_RA_D extends Instruction { var reg_t FRS @ inst[6:10]; var reg_t RA @ inst[11:15]; var disp16_t D @ inst[16:31]; } class DForm_TO_RA_SI extends Instruction { var reg_t TO @ inst[6:10]; var reg_t RA @ inst[11:15]; var disp16_t SI @ inst[16:31]; } class XOForm_RT_RA_RB_XO extends Instruction { var reg_t RT @ inst[6:10]; var reg_t RA @ inst[11:15]; var reg_t RB @ inst[16:20]; var xo_t XO @ inst[21:30]; } class XOForm_FRT_RA_RB extends Instruction { var reg_t FRT @ inst[6:10]; var reg_t RA @ inst[11:15]; var reg_t RB @ inst[16:20]; var xo_t XO @ inst[21:30]; } class XForm_TO_RA_RB extends Instruction { var to_t TO @ inst[6:10]; var reg_t RA @ inst[11:15]; var reg_t RB @ inst[16:20]; var xo_t XO @ inst[21:30]; } class XForm_RS_RA_RB extends Instruction { var to_t RS @ inst[6:10];

192 var reg_t RA @ inst[11:15]; var reg_t RB @ inst[16:20]; var xo_t XO @ inst[21:30]; } class XForm_FRT_FRA_FRB extends Instruction { var to_t FRT @ inst[6:10]; var reg_t FRA @ inst[11:15]; var reg_t FRB @ inst[16:20]; var xo_t XO @ inst[21:30]; } class XForm_FRS_RA_RB extends Instruction { var reg_t FRS @ inst[6:10]; var reg_t RA @ inst[11:15]; var reg_t RB @ inst[16:20]; var xo_t XO @ inst[21:30]; } class XForm_RS_RA_SH extends Instruction { var to_t RS @ inst[6:10]; var reg_t RA @ inst[11:15]; var reg_t SH @ inst[16:20]; var xo_t XO @ inst[21:30]; } instruction class lwz extends DForm_RT_RA_D uses LoadWord, LoadWordZero { fun encode() { OPCD = 32; }

fun effect() { R[RT] = loadWord(); } } mixin LoadByteIndexed { fun word_t loadByte() { return M.byte[regIndexed(RA, RB)] !0 32; } } instruction class lbz extends DForm_RT_RA_D uses LoadByte, LoadByteZero { fun encode() { OPCD = 34; }

fun effect() { R[RT] = loadByte(); } }

193 instruction class addis extends DForm_RT_RA_SI uses RegisterValueZero, RegisterValueNonZero { fun encode() { OPCD = 14; }

fun effect() { var word_t x = (type word_t) (SI!32); R[RT] = registerValue() + (x<<16); } } instruction class cmpi extends DForm_BF_RA_SI { fun encode() { OPCD = 11; BF = 0; }

fun effect() { EQ = (R[RA] == SI); NE = (R[RA] != SI); GT = (R[RA] > SI); LT = (R[RA] < SI); GE = (R[RA] >= SI); LE = (R[RA] <= SI); } } instruction class beq extends BForm { fun encode() { OPCD = 16; BI = 0; }

fun effect() { if (EQ) { PC[0] = PC[0] + (type word_t)(BD!32); } } }

Sparc8 CISL Excerpt

class Instruction {
  var little signed bit[32] inst;
  var little signed bit[2] op @ inst[31:30];
}

class Format1 extends Instruction {
  var little signed bit[30] disp30 @ inst[29:0];
}

class Format2a extends Instruction {

194 var rinx_t rd @ inst[29:25]; var little signed bit[3] op2 @ inst[24:22]; var little signed bit[22] imm22 @ inst[21:0];

fun encode() { op = 0; } } mixin IConditionCode { fun bit getCC_N() { return PSR[23]; } fun bit getCC_Z() { return PSR[22]; } fun bit getCC_V() { return PSR[21]; } fun bit getCC_C() { return PSR[20]; } fun setCC_N(bit val) { PSR[23] = val; } fun setCC_Z(bit val) { PSR[22] = val; } fun setCC_V(bit val) { PSR[21] = val; } fun setCC_C(bit val) { PSR[20] = val; } } instruction class BE extends Format2b uses IConditionCode { fun encode() { op2 = 2; cond = 1; } fun effect() { if (getCC_Z()) { var little signed bit[32] extended = disp22!32; var little signed bit[32] res = 4 * extended; PC = PC + (type addr_t) (res); } } } instruction class SWAP extends Format3 uses direct_address, offset_address{ fun encode() { op3 = 0b001111; }

195 fun effect() { var word_t temp = R[gen_idx(rd)]; R[gen_idx(rd)] = Mem.word(gen_address()); Mem.word(gen_address()) = temp; } }

MIPS CISL Excerpt

instruction class DIVU extends R_Format {
  fun encode() {
    op = 0b000000;
    rd = 0;
    shamt = 0;
    funct = 0b011010;
  }
  fun effect() {
    var unsigned bit[32] urs = (type unsigned bit[32]) R[rs];
    var unsigned bit[32] urt = (type unsigned bit[32]) R[rt];
    var unsigned bit[32] lo_res = urs / urt;
    var unsigned bit[32] hi_res = urs % urt;

var word_t s_lo_res = (type word_t) lo_res; var word_t s_hi_res = (type word_t) hi_res;

LO[0] = s_lo_res; HI[0] = s_hi_res; } } instruction class MFLO extends R_Format { fun encode() { op = 0b000000; rs = 0; rt = 0; shamt = 0; funct = 0b010010; } fun effect() { R[rd] = LO[0]; } } instruction class BGE extends I_Format { fun encode() { op = 0b000100; }

fun effect() { if (R[rs] >= R[rt]) { PC[0] = PC[0] + (imm << 2)!32; }

196 } }

PDP-8 CISL Excerpt

type addr_t = big unsigned bit[12];
type val_t = big unsigned bit[12];
type inst_t = big unsigned bit[12];
type op_t = big unsigned bit[3];
type i_t = big unsigned bit[1];
type z_t = big unsigned bit[1];
type a_t = big unsigned bit[7];

store pdp {
  indexed PC[1][addr_t] { }

indexed ACC[1][val_t] { }

address M[addr_t] { quantum data val_t val;

mode addr_t direct(addr_t addr) { return addr; } } } class Inst uses zOn, zOff { var inst_t word; var op_t op @ word[0:2]; var i_t i @ word[3:3]; var z_t z @ word[4:4]; var a_t offs @ word[5:11]; } mixin zOn { fun encode() { z = 1; }

fun addr_t effAddr(a_t addr) { return addr!12; } } mixin zOff { fun encode() { z = 0; }

fun addr_t effAddr(a_t addr) {

197 return offs#!12; } } instruction class AND extends Inst { fun encode() { op = 0b000; }

fun effect() { ACC = ACC & M.val[direct(effAddr(offs))]; } } instruction class TAD extends Inst { fun encode() { op = 0b001; }

fun effect() { ACC = ACC + M.val[direct(effAddr(offs))]; } } instruction class ISZ extends Inst { fun encode() { op = 0b010; }

fun effect() { M.val[direct(effAddr(offs))] = M.val[direct(effAddr(offs))] + 1; if (M.val[direct(effAddr(offs))] == 0) { PC = PC + 1; } } } instruction class DCA extends Inst { fun encode() { op = 0b010; }

fun effect() { M.val[direct(effAddr(offs))] = ACC; ACC = 0; } } instruction class CLA extends Inst { fun encode() {

}

198 fun effect() { ACC = 0; } } instruction class SKPNE extends Inst { fun encode() {

}

fun effect() { if (ACC != 0) { PC = PC + 1; } } } instruction class SKPE extends Inst { fun encode() {

}

fun effect() { if (ACC == 0) { PC = PC + 1; } } } instruction class SETA extends Inst { fun encode() {

}

fun effect() { ACC = 1; } } instruction class JMP extends Inst { var bit[8] imm;

fun encode() {

}

fun effect() { PC = imm!32; } } instruction class COMA extends Inst { fun encode() {

199 }

fun effect() { ACC = ˜ACC; } } instruction class INCA extends Inst { fun encode() {

}

fun effect() { ACC = ACC + 1; } } instruction class CLRA extends Inst { fun encode() {

}

fun effect() { ACC = 0; } }

CISL IR Descriptions

LCC CISL Excerpt

mixin MEM_DIRECT {
  fun set_byte(addr_t base, addr_t addr, byte_t item) {
    M.byte[direct(addr)] = item;
  }
  fun set_half(addr_t base, addr_t addr, half_t item) {
    M.half[direct(addr)] = item;
  }
  fun set_word(addr_t base, addr_t addr, word_t item) {
    M.word[direct(addr)] = item;
  }
  fun set_dword(addr_t base, addr_t addr, dword_t item) {
    M.dword[direct(addr)] = item;
  }
}

mixin REG_PARAM {
  var addr_t src1;
  var addr_t src2;

fun word_t get_param1() { return T[src1]; } fun word_t get_param2() {

200 return T[src2]; } fun shamt_t get_shamt() { return T[src1][27:31]; } } instruction class ASGNU1 uses MEM_DIRECT, MEM_DISP { var addr_t addr; var addr_t base; var addr_t regx;

fun effect() { var byte_t item = T[regx][24:31]; set_byte(base, addr, item); } } instruction class SUBU4 uses REG_PARAM, IMM_PARAM, MEM_DISP_PARAM { var addr_t src1; var word_t const; var addr_t dst;

fun effect() { T[dst] = get_param1() - get_param2(); } }

Jikes Explicit CISL Excerpt

instruction class dup_x2 extends ByteCode {
  fun encode() {
    op = 91;
  }

fun effect() { //// pop the top three values from the stack T[0] = S.slot[direct(SP)]; SP = SP + 1; T[1] = S.slot[direct(SP)]; SP = SP + 1; T[2] = S.slot[direct(SP)]; //// push the values in the required order S.slot[direct(SP)] = T[0]; SP = SP - 1; S.slot[direct(SP)] = T[2]; SP = SP - 1; S.slot[direct(SP)] = T[1]; SP = SP - 1; S.slot[direct(SP)] = T[0]; } }

201 instruction class getfield_boolean extends ByteCode { var word_t fieldOffset;

fun encode() { op = 0xb4; }

fun effect() { T[0] = S.slot[direct(SP)]; SP = SP - 1; S.slot[direct(SP)] = H.byte[fieldRef(T[0],fieldOffset)]!032; } } instruction class iadd extends ByteCode { fun encode() { op = 96; }

fun effect() { S.slot[direct(SP + 1)] = S.slot[direct(SP)] + S.slot[direct(SP+1)]; SP = SP + 1; } }

Jikes Implicit CISL Excerpt

instruction class dup_x2 extends ByteCode {
  fun encode() {
    op = 91;
  }

fun effect() { T[0] = S.slot[direct(spTopOffset)]; T[1] = S.slot[direct(spTopOffset + 4)]; T[2] = S.slot[direct(spTopOffset + 8)]; S.slot[direct(spTopOffset + 8)] = T[0]; S.slot[direct(spTopOffset + 4)] = T[2]; S.slot[direct(spTopOffset)] = T[1]; S.slot[direct(spTopOffset - 4)] = T[0]; } } instruction class getfield_boolean extends ByteCode { var word_t fieldOffset;

fun encode() { op = 0xb4; }

fun effect() { T[0] = S.slot[direct(spTopOffset)]; S.slot[direct(spTopOffset)] =

202 H.byte[fieldRef(T[0],fieldOffset)]!032; } } instruction class iadd extends ByteCode { fun encode() { op = 96; }

fun effect() { S.slot[direct(spTopOffset+4)] = S.slot[direct(spTopOffset)]+S.slot[direct(spTopOffset+4)]; } }

PQCC CISL Excerpt

type addr_t = big unsigned bit[12];
type val_t = big unsigned bit[12];

// The instruction word type type inst_t = big unsigned bit[12]; type op_t = big unsigned bit[3]; type i_t = big unsigned bit[1]; type z_t = big unsigned bit[1]; type a_t = big unsigned bit[7]; store pqcc { indexed PC[1][addr_t] { }

indexed ACC[1][val_t] { }

address M[addr_t] { quantum data val_t val; } } instruction class setacc { fun effect() { if (ACC == 0) { ACC = 1; } } } instruction class subacc { var addr_t offset;

fun effect() { ACC = M.val[direct(offset)] - ACC; } }

instruction class load {
  var addr_t offset;

  fun effect() {
    ACC = M.val[direct(offset)];
  }
}

Store Schema Description Examples

Jikes Implicit Baseline to PowerPC

map jikes to ppc {

rule { pattern { S.slot[direct(spTopOffset)] => M.word[basePlusOffset(1,spTopOffset)]; } }

rule { pattern { S.slot[direct(spTopOffset + $x)] => M.word[basePlusOffset(1,spTopOffset + $x)]; } }

rule { pattern { S.slot[direct(spTopOffset - $x)] => M.word[basePlusOffset(1,spTopOffset - $x)]; } }

rule { pattern { S.slot[direct(location)] => M.word[basePlusOffset(1,location)]; } }

rule { pattern { S.slot[direct(location - $x)] => M.word[basePlusOffset(1,location - $x)]; } }

rule { pattern { S.slot[direct(location + $x)] => M.word[basePlusOffset(1,location + $x)]; } }

204 rule { pattern { S.slot[offset(T[$x], $y)] => M.word[basePlusOffset(‘$x+3‘,$y)]; } } rule { pattern { S.slot[offset(T[$x], T[$y])] => M.word[regIndexed(‘$x+3‘,‘$y+3‘)]; } } rule { pattern { S.slot[regIndexed($x, $y)] => M.word[regIndexed(‘$x+3‘,‘$y+3‘)]; } } rule { pattern { S.dslot[direct(spTopOffset)] => M.dword[basePlusOffset(1,spTopOffset)]; } } rule { pattern { S.dslot[direct(spTopOffset + $x)] => M.dword[basePlusOffset(1,spTopOffset + $x)]; } } rule { pattern { S.dslot[direct(spTopOffset - $x)] => M.dword[basePlusOffset(1,spTopOffset - $x)]; } } rule { pattern { S.single[direct(spTopOffset)] => M.single[basePlusOffset(1,spTopOffset)]; } } rule { pattern { S.single[direct(spTopOffset + $x)] => M.single[basePlusOffset(1,spTopOffset + $x)];

205 } } rule { pattern { S.single[direct(spTopOffset - $x)] => M.single[basePlusOffset(1,spTopOffset - $x)]; } } rule { pattern { S.dslot[regIndexed($x,$y)] => M.dword[regIndexed(‘$x+3‘,‘$y+3‘)]; } } rule { pattern { S.dslot[direct(location + $x)] => M.dword[basePlusOffset(1,location + $x)]; } } rule { pattern { S.dslot[direct(location - $x)] => M.dword[basePlusOffset(1,location - $x)]; } } rule { pattern { S.dslot[regIndexed($x,$y)] => M.dword[regIndexed(‘$x+3‘,‘$y+3‘)]; } } rule { pattern { S.slot[localvar(index)] => M.word[basePlusOffset(1,index)]; } } rule { pattern { S.half[direct(spTopOffset)] => M.half[basePlusOffset(1,spTopOffset)]; } } rule { pattern {

206 S.half[direct(spTopOffset + $x)] => M.half[basePlusOffset(1,spTopOffset + $x)]; } } rule { pattern { S.half[direct(spTopOffset - $x)] => M.half[basePlusOffset(1,spTopOffset - $x)]; } } rule { pattern { S.half[regIndexed($x,$y)] => M.half[regIndexed(‘$x+3‘,‘$y+3‘)]; } } rule { pattern { S.byte[regIndexed($x,$y)] => M.byte[regIndexed(‘$x+3‘,‘$y+3‘)]; } } rule { pattern { H.cell[constants(index)] => M.word[basePlusOffset(2,index)]; } } rule { pattern { H.cell[arrayRef(T[$x],$y)] => M.word[basePlusOffset(‘$x+3‘,$y)]; } } rule { pattern { H.cell[regIndexed($x, $y)] => M.word[regIndexed(‘$x+3‘,‘$y+3‘)]; } } rule { pattern { H.dcell[constants(offset)] => M.dword[basePlusOffset(2,offset)]; } }

207 rule { pattern { H.dcell[constants($x)] => M.dword[basePlusOffset(2,$x)]; } }

rule { pattern { H.cell[fieldRef(T[$x],$y)] => M.word[basePlusOffset(‘$x+3‘,$y)]; } }

rule { pattern { H.byte[fieldRef(T[$x],$y)] => M.byte[basePlusOffset(‘$x+3‘,$y)]; } }

rule { pattern { H.short[fieldRef(T[$x],$y)] => M.half[basePlusOffset(‘$x+3‘,$y)]; } }

rule { pattern { T[$x] => R[‘$x+3‘]; } }

rule { pattern { F[$x] => FR[‘$x+1‘]; } }

rule { pattern { C[$x] => CA[$x]; } }

rule { pattern { LL[$x] => R[$x]; } }

}

Jikes Explicit Baseline to ARM

map jikes to arm {

rule { pattern { H.dcell[constants(offset)] => M.dword[regimm_inc(2,offset)]; } }

rule { pattern { S.dslot[direct(location - $x)] => M.dword[regimm_inc(1,location - $x)]; } }

rule { pattern { S.slot[direct(location-4)] => M.word[regimm_inc(13,location-4)]; } }

rule { pattern { S.slot[direct(location)] => M.word[regimm_inc(13,location)]; } }

rule { pattern { S.slot[localvar(index)] => M.word[regimm_inc(11,index)]; } }

rule { pattern { S.slot[offset(T[$x], $y)] => M.word[regimm_inc(‘$x+3‘,$y)]; } }

rule { pattern { S.slot[offset(T[$x], T[$y])] => M.word[regreg_inc(‘$x+3‘,‘$y+3‘)]; } }

209 rule { pattern { S.slot[offset(T[$x], T[$y])] => M.word[regIndexed(‘$x+3‘,‘$y+3‘)]; } } rule { pattern { S.slot[regIndexed($x, $y)] => M.word[regreg_inc(‘$x+3‘,‘$y+3‘)]; } } rule { pattern { S.dslot[regIndexed($x,$y)] => M.dword[regIndexed(‘$x+3‘,‘$y+3‘)]; } } rule { pattern { S.dslot[regIndexed($x,$y)] => M.dword[regIndexed(‘$x+3‘,‘$y+3‘)]; } } rule { pattern { S.half[regIndexed($x,$y)] => M.half[regIndexed(‘$x+3‘,‘$y+3‘)]; } } rule { pattern { S.byte[regIndexed($x,$y)] => M.byte[regIndexed(‘$x+3‘,‘$y+3‘)]; } } rule { pattern { H.cell[constants(index)] => M.word[regimm_inc(2,index)]; } } rule { pattern { H.cell[fieldRef(T[$x],$y)] => M.word[regimm_inc(‘$x+3‘,$y)]; }

210 } rule { pattern { H.byte[fieldRef(T[$x],$y)] => M.byte[regimm_inc(‘$x+3‘,$y)]; } } rule { pattern { H.short[fieldRef(T[$x],$y)] => M.half[regimm_inc(‘$x+3‘,$y)]; } } rule { pattern { S.slot[direct(SP)] => M.word[id(R[13])]; } } rule { pattern { S.slot[direct(SP + $x)] => M.word[id(R[13] + $x)]; } } rule { pattern { S.slot[direct(SP - $x)] => M.word[id(R[13] - $x)]; } } rule { pattern { T[$x] => R[‘$x+3‘]; } } rule { pattern { C[0] => CPSR[1][29]; } } rule { pattern { SP-1 => R[13]-4; } }

211 rule { pattern { SP => R[13]; } }

rule { pattern { SP+1 => R[13]+4; } }

rule { pattern { LL[$x] => R[$x]; } }

rule { pattern { F[$x] => FR[‘$x+1‘]; } }
}

LCC to x86

map lcc to x86linux {

rule { pattern { M.dword[disp($reg, $off)] => M.qword[flat(GPR[$reg] + $off)]; } }

rule { pattern { M.word[disp($reg, $off)] => M.dword[flat(EAX + $off)]; } }

rule { pattern { M.half[disp($reg, $off)] => M.word[flat(BX)]; } }

rule {

212 pattern { M.byte[disp($reg, $off)] => M.byte[flat(GPR[$reg] + $off)]; } } rule { pattern { M.dword[direct(addr)] => M.qword[flat(addr)]; } } rule { pattern { M.word[direct(addr)] => M.dword[flat(addr)]; } } rule { pattern { M.half[direct(addr)] => M.word[flat(addr)]; } } rule { pattern { M.byte[direct(addr)] => M.byte[flat(addr)]; } } rule { pattern { PC[0] => EIP; } } rule { pattern { T[$x][16..31] => BX; } } rule { pattern { T[$x][24..31] => AL; } } rule { pattern {

213 T[regx][24..31] => AL; } } rule { pattern { T[regx][16..31] => BX; } } rule { pattern { T[src] => EBX; } } rule { pattern { T[regx] => EAX; } } rule { pattern { T[src1] => EBX; } } rule { pattern { T[dst] => EAX; } } rule { pattern { T[src2] => ECX; } } rule { pattern { T[0] => EAX; } } rule { pattern { T[1] => EBX; } } rule {

214 pattern { T[2] => ECX; } }

rule { pattern { T[3] => EDX; } }

rule { pattern { F[$x] => FPR[$x]; } }

}

LCC to MIPS

map lcc to mips {

rule { pattern { T[$x] => R[$x]; } }

rule { pattern { F[$x] => FPR[$x]; } }

rule { pattern { L[0] => R[31]; } }

rule { pattern { M.dword[direct($x)] => M.dword[base_offset(10, $x)]; } }

rule { pattern { M.word[direct($x)] => M.word[base_offset(10, $x)]; } }

215 rule { pattern { M.half[direct($x)] => M.word[base_offset(10, $x)]; } }

rule { pattern { M.byte[direct($x)] => M.byte[base_offset(10, $x)]; } }

rule { pattern { M.dword[disp($temp,$imm)] => M.dword[base_offset($temp, $imm)]; } }

rule { pattern { M.word[disp($temp,$imm)] => M.word[base_offset($temp, $imm)]; } }

rule { pattern { M.half[disp($temp, $imm)] => M.half[base_offset($temp, $imm)]; } }

rule { pattern { M.byte[disp($temp, $imm)] => M.byte[base_offset($temp, $imm)]; } }
}

Search Time Axioms

axiom param_16_decomp LSet($l,Param($x)) =>
  LSet($l, BOr(ShRA(Param($x),IConst(16)), BAnd(Param($x),IConst(65535))))

axiom lbz_rule SExt($op,$amount) => SExt(ZExt($op,$amount),$amount)

axiom param_sext Neg(Param($x)) => Neg(SExt(Param($x),IConst(32)))

axiom bitcomp BNot($x) => BNot(BOr($x,$x))

axiom twos_comp Neg($x) => Add(BNot($x), IConst(1))

axiom add_zero LSet($x,LGet($y,$z)) => LSet($x,Add(IConst(0), LGet($y,$z)))

axiom mul_one LSet($x,LGet($y,$z)) => LSet($x,Mul(IConst(1), LGet($y,$z)))

axiom mul_zero LSet($x,Mul(IConst(0),$y)) => LSet($x,IConst(0))

axiom skip_decomp GAct(Eq(IConst(0),$g),$a) =>
  Seq(GAct(LNot(Eq(IConst(0),$g)),
           LSet(LGet(Mem(PC),IConst(0)),
                Add(LGet(Mem(PC),IConst(0)), IConst(1)))),
      GAct(true, $a))

axiom cmp_true GAct(Eq($x,$x),$a) => GAct(true,$a)

Adapters

JikesRVM Adapter

package experiments.adapters;

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.antlr.stringtemplate.StringTemplate;

import edu.umass.cs.cogent.unicogg.selector.GistPattern;
import edu.umass.cs.cogent.unicogg.selector.Parameter;
import edu.umass.cs.cogent.unicogg.selector.ParameterConstraint;
import edu.umass.cs.cogent.unicogg.selector.SourcePattern;
import edu.umass.cs.cogent.unicogg.selector.TargetPattern;
import edu.umass.cs.cogent.unicogg.selector.TargetSequence;

/** * An adapter for generating JikesRVM patterns for PPC. * * @author richards * */ public class JikesPPCAdapter extends Adapter { private final static int NUM_TEMP_REGISTERS = 6; private final static String COMPILER_CONSTANT = "@([0-9]+)"; private Writer fileWriter; private boolean print_whole_file;

/** * Makes a JikesRVM Baseline adapter. */ public JikesPPCAdapter(String filename, boolean w) { super("/experiments/adapters/templates/jikes/ppc/JikesPPCAsm.stg", filename); print_whole_file = w; try { File f = new File("BaselineCompilerImpl.java"); fileWriter = new BufferedWriter(new FileWriter(f)); } catch (IOException e) { e.printStackTrace(); } }

private Set filterParams(Set p) { Set np = new HashSet(); for (String s : p) { if (s.equals("inst")) continue; if (s.equals("spTopOffset")) continue; StringTemplate t = templates.getInstanceOf(s); np.add(t.toString()); } return np; }

private String replaceCompilerConstants(String instructions) { Pattern regex = Pattern.compile(COMPILER_CONSTANT); Matcher regexMatcher = regex.matcher(instructions); int registerCount = 0; HashMap tempRegisters = new HashMap(); // while there’s still compiler constants while (regexMatcher.find()) { // if we’ve already seen this constant before, use the // same temp. register if (tempRegisters.containsKey(regexMatcher.group())) { String reg = tempRegisters.get(regexMatcher.group()); instructions = instructions.replaceFirst(COMPILER_CONSTANT, reg); // if we haven’t seen this before, try to use a new temp. register

218 } else { if (registerCount > NUM_TEMP_REGISTERS) { throw new RegisterAllocationException(); } else { instructions = instructions.replaceFirst(COMPILER_CONSTANT, "T"+registerCount); tempRegisters.put(regexMatcher.group(), "T"+registerCount); registerCount++; } } }

// small hack to turn "compiler constant" arithmetic // expressions into a java-compliant form instructions = instructions.replaceAll("!32", "");

return instructions; } private String genPattern(GistPattern p) { return genEmitMethod(p.getSource(), genTargets(p)); } private String genEmitMethod(SourcePattern source, String genTargets) { StringTemplate t = templates.getInstanceOf("emitMethod"); t.setAttribute("name", source.getName()); t.setAttribute("args", filterParams(source.getFreeParameterNames())); t.setAttribute("body", genTargets); return t.toString(); } private String genTargets(GistPattern p) { String targets = ""; List constraints = p.constraints(); if (constraints.size() != 0) { targets += genIf(constraints.get(0), genTargets(p.getSource().getName(), constraints.get(0))); }

List noConstraints = p.noConstraints(); if (noConstraints.size() == 0) { System.err.printf("did not find most general sequence for %s%n", p.getSource().getName()); return targets; }

String general = genTargets(p.getSource().getName(), noConstraints.get(0)); StringTemplate t = targets.equals("") ? templates.getInstanceOf("constraintNone") : templates.getInstanceOf("constraintElse"); t.setAttribute("sequence", general); targets += t;

return targets; } private String genIf(TargetSequence targetSequence, String genTargets) { String cond = ""; for (ParameterConstraint c : targetSequence.getConstraints()) { if (c.isFits()) { if (cond.equals("")) { cond += String.format("asm.fits(%s, %s)", c.getParameter(), c.getValue()); } else { cond += String.format("&& asm.fits(%s, %s)", c.getParameter(), c.getValue()); }

219 } } StringTemplate t = templates.getInstanceOf("constraintIf"); t.setAttribute("cond", cond); t.setAttribute("sequence", genTargets); return t.toString(); } private String genTargets(String sourceName, TargetSequence targetSequence) { String targets = ""; for (TargetPattern p : targetSequence.getTargets()) { String name = p.getName(); if (templates.isDefined(name.replace(’.’,’-’))) { StringTemplate t = templates.getInstanceOf(name.replace(’.’, ’-’)); for (Parameter arg : p.getParameters()) { String val = substituteVal(arg.getValue()); if (!val.equals("?")) { try { t.setAttribute(arg.getName(), val); } catch (NoSuchElementException e) { // do nothing } } } targets += String.format("%s%n", t); } else { System.err.printf("error: missing template for target %s%n", name); } }

// prologue/epilogue if (templates.isDefined("pe_"+sourceName.toLowerCase())) { StringTemplate pe = templates.getInstanceOf("pe_"+sourceName.toLowerCase()); pe.setAttribute("inst", targets); targets = pe.toString() + "\n"; }

return replaceCompilerConstants(targets); } private String substituteVal(String value) { if (value.equals("i!>>16")) return "asm.maskUpper16(i)"; if (value.equals("65535&i")) return "asm.maskLower16(i)"; if (value.equals("index")) return "getGeneralLocalLocation(index)"; return value; }

/** * Executes the adapter. */ public void execute() { String allInstructions = "";

//// Iterate over the selector patterns for (GistPattern p : patterns.getPatterns()) { try { allInstructions += genPattern(p); } catch (Exception e) { System.err.println(e.getMessage()); // continue } }

System.out.println(allInstructions);

220 try { // if we want to write the entire java file if (print_whole_file) { StringTemplate st = templates.getInstanceOf("java_stuff"); st.setAttribute("gist",allInstructions); fileWriter.write(st.toString()); } else { fileWriter.write(allInstructions); } fileWriter.close(); } catch (IOException e) { e.printStackTrace(); } }

/** * The main entry point. * * @param args * program arguments */ public static void main(String[] args) { try { if (args.length < 1) { System.out.println("usage: JikesPPCAdapter "); System.exit(1); }

boolean wholeFile = false;

if (args.length > 1 && args[1].equals("java")) { wholeFile = true; }

JikesPPCAdapter adapter = new JikesPPCAdapter(args[0], wholeFile); adapter.execute(); } catch (Exception e) { e.printStackTrace(); } }

}

JikesRVM PowerPC String Templates

group JikesPPCAsm;

emitMethod(name, args, body) ::= <<
protected final void emit_$name$($args; separator=", "$) {
  $body$
}

>> constraintIf(cond, sequence) ::= << if ($cond$) { $sequence$ } >> constraintElseIf(cond, sequence) ::= << else if ($cond$) { $sequence$ } >>

221 constraintElse(sequence) ::= << else { $sequence$ } >> constraintNone(sequence) ::= << $sequence$ >> srcPat(name, args) ::= << /** * Emit code to implement the $name$ bytecode */ @Override protected final void emit_$name$($args; separator=", "$) {

>>

/** * These templates map cisl instruction class names to jikes baseline * compiler emit code that implements the bytecode instruction. */ beq(bTarget) ::= "genCondBranch(EQ, bTarget);" bne(bTarget) ::= "genCondBranch(NE, bTarget);" bge(bTarget) ::= "genCondBranch(GE, bTarget);" ble(bTarget) ::= "genCondBranch(LE, bTarget);" blt(bTarget) ::= "genCondBranch(LT, bTarget);" bgt(bTarget) ::= "genCondBranch(GT, bTarget);" cmpd(RA,RB) ::= "asm.emitCMP($RA$, $RB$);" cmpl(RA,RB) ::= "asm.emitCMPL($RA$, $RB$);" cmpi(RA,SI) ::= "asm.emitCMPI($RA$, $SI$);" andi(RA,RS,D) ::= "asm.emitANDI($RA$, $RS$, $D$);" slw(RA,RS,RB) ::= "asm.emitSLW($RA$, $RS$, $RB$);" sraw(RA,RS,RB) ::= "asm.emitSRAW($RA$, $RS$, $RB$);" srawi(RA,RS,SH) ::= "asm.emitSRAWI($RA$, $RS$, $SH$);" or(RA,RS,RB) ::= "asm.emitOR($RA$, $RS$, $RB$);" xor(RA,RS,RB) ::= "asm.emitXOR($RA$, $RS$, $RB$);" and(RA,RS,RB) ::= "asm.emitAND($RA$, $RS$, $RB$);" divw(RT,RA,RB) ::= "asm.emitDIVW($RT$, $RA$, $RB$);" mullw(RT,RA,RB) ::= "asm.emitMULLW($RT$, $RA$, $RB$);" neg(RT,RA) ::= "asm.emitNEG($RT$, $RA$);" srw(RA,RS,RB) ::= "asm.emitSRW($RA$, $RS$, $RB$);" add(RT,RA,RB) ::= "asm.emitADD($RT$, $RA$, $RB$);" adde(RT,RA,RB) ::= "asm.emitADDE($RT$, $RA$, $RB$);" subf(RT,RA,RB) ::= "asm.emitSUBFC($RT$, $RA$, $RB$);" subfze(RT,RA) ::= "asm.emitSUBFZE($RT$, $RA$);" subfe(RT,RA,RB) ::= "asm.emitSUBFE($RT$, $RA$, $RB$);" extsb(RA,RS) ::= "asm.emitEXTSB($RA$, $RS$);" bclr_special() ::= "asm.emitBCLR();" mtlr(RS) ::= "asm.emitMTLR($RS$);" twlle(RA,RB) ::= "asm.emitTWLLE($RA$, $RB$);" tweq0(RA) ::= "asm.emitTWEQ0($RA$);" slwi(RA,RS,MB) ::= "asm.emitSLWI($RA$, $RS$, $MB$);" fctiwz(FRT,FRB) ::= "asm.emitFCTIWZ($FRT$, $FRB$);" LoadWord-lwz(RT,D,RA) ::= "asm.emitLWZ($RT$, $D$, $RA$);" RegisterValueNonZero-addi(RA,RT,SI) ::= "asm.emitADDI($RT$, $RA$, $SI$);" RegisterValueZero-addi(RA,RT,SI) ::= "asm.emitADDI($RT$, $RA$, $SI$);" RegisterValueNonZero-addis(RA,RT,SI) ::= "asm.emitADDIS($RT$, $RA$, $SI$);" RegisterValueZero-addis(RA,RT,SI) ::= "asm.emitADDIS($RT$, $RA$, $SI$);"

LoadWordIndexed-lwzx(RT,RA,RB) ::= "asm.emitLWZX($RT$, $RA$, $RB$);" LoadWordZero-lwz(RT,D,RA) ::= "asm.emitLWZ($RT$, $D$, $RA$);" LoadByte-lbz(RT,D,RA) ::= "asm.emitLBZ($RT$, $D$, $RA$);" LoadByteZero-lbz(RT,D,RA) ::= "asm.emitLBZ($RT$, $D$, $RA$);" LoadHalf-lhz(RT,D,RA) ::= "asm.emitLHZ($RT$, $D$, $RA$);" LoadHalfZero-lhz(RT,D,RA) ::= "asm.emitLHZ($RT$, $D$, $RA$);" LoadHalfIndexed-lhzx(RT,RA,RB) ::= "asm.emitLHZX($RT$, $RA$, $RB$);"

222 LoadHalfZeroIndexed-lhzx(RT,RA,RB) ::= "asm.emitLHZX($RT$, $RA$, $RB$);" LoadHalfAlg-lha(RT,D,RA) ::= "asm.emitLHA($RT$, $D$, $RA$);" LoadHalfAlgZero-lha(RT,D,RA) ::= "asm.emitLHA($RT$, $D$, $RA$);" LoadHalfAlgIndexed-lhax(RT,RA,RB) ::= "asm.emitLHAX($RT$, $RA$, $RB$);" LoadHalfAlgIndexedZero-lhax(RT,RA,RB)::= "asm.emitLHAX($RT$, $RA$, $RB$);" LoadDWord-lfd(FRT,D,RA) ::= "asm.emitLFD($FRT$,$D$, $RA$);" LoadDWordZero-lfd(FRT,D,RA) ::= "asm.emitLFD($FRT$,$D$, $RA$);" LoadDWordIndexed-lfdx(FRT,RA,RB) ::= "asm.emitLFDX($FRT$, $RA$, $RB$);" LoadDWordIndexedZero-lfdx(FRT,RA,RB) ::= "asm.emitLFDX($FRT$, $RA$, $RB$);" LoadSWord-lfs(FRT,RA,D) ::= "asm.emitLFS($FRT$, $D$, $RA$);" LoadSWordZero-lfs(FRT,RA,D) ::= "asm.emitLFS($FRT$, $D$, $RA$);"

StoreWordZero-stw(RS,D,RA) ::= "asm.emitSTW($RS$, $D$, $RA$);" StoreWord-stw(RS,D,RA) ::= "asm.emitSTW($RS$, $D$, $RA$);" StoreWordIndexed-stwx(RS,RA,RB) ::= "asm.emitSTWX($RS$, $RA$, $RB$);" StoreWordIndexedZero-stwx(RS,RA,RB) ::= "asm.emitSTWX($RS$, $RA$, $RB$);" StoreDWord-stfd(FRS,D,RA) ::= "asm.emitSTFD($FRS$, $D$, $RA$);" StoreDWordZero-stfd(FRS,D,RA) ::= "asm.emitSTFD($FRS$, $D$, $RA$);" StoreDWordIndexed-stfdx(FRS,RA,RB) ::= "asm.emitSTFDX($FRS$, $RA$, $RB$);" StoreDWordIndexedZero-stfdx(FRS,RA,RB)::= "asm.emitSTFDX($FRS$, $RA$, $RB$);" StoreSWordZero-stfs(FRS,RA,D) ::= "asm.emitSTFS($FRS$, $D$, $RA$);" StoreSWord-stfs(FRS,RA,D) ::= "asm.emitSTFS($FRS$, $D$, $RA$);" StoreByteIndexed-stbx(RS,RA,RB) ::= "asm.emitSTBX($RS$, $RA$, $RB$);" StoreByteIndexedZero-stbx(RS,RA,RB) ::= "asm.emitSTBX($RS$, $RA$, $RB$);" StoreHalfIndexed-sthx(RS,RA,RB) ::= "asm.emitSTHX($RS$, $RA$, $RB$);" StoreHalfZeroIndexed-sthx(RS,RA,RB) ::= "asm.emitSTHX($RS$, $RA$, $RB$);" lwz_specialized(RT,D,RA) ::= "asm.emitLWZ($RT$, $D$, $RA$);" lbz_specialized(RT,D,RA) ::= "asm.emitLBZ($RT$, $D$, $RA$);" lhz_specialized(RT,D,RA) ::= "asm.emitLHZ($RT$, $D$, $RA$);" lha_specialized(RT,D,RA) ::= "asm.emitLHA($RT$, $D$, $RA$);" swz_specialized(RA,D) ::= "asm.emitSWZ($RA$, $D$, $RA$);" li(RT,RA,SI) ::= "asm.emitADDI($RT$, $SI$, $RA$);" ori(RA,RS,D) ::= "asm.emitORI($RA$, $RS$, $D$);"

fadd(FRT,FRA,FRB) ::= "asm.emitFADD($FRT$, $FRA$, $FRB$);" fadds(FRT,FRA,FRB) ::= "asm.emitFADDS($FRT$, $FRA$, $FRB$);" fsub(FRT,FRA,FRB) ::= "asm.emitFSUB($FRT$, $FRA$, $FRB$);" fsubs(FRT,FRA,FRB) ::= "asm.emitFSUBS($FRT$, $FRA$, $FRB$);" fmul(FRT,FRA,FRB) ::= "asm.emitFMUL($FRT$, $FRA$, $FRB$);" fmuls(FRT,FRA,FRB) ::= "asm.emitFMULS($FRT$, $FRA$, $FRB$);" fdiv(FRT,FRA,FRB) ::= "asm.emitFDIV($FRT$, $FRA$, $FRB$);" fdivs(FRT,FRA,FRB) ::= "asm.emitFDIVS($FRT$, $FRA$, $FRB$);" fneg(FRT,FRB) ::= "asm.emitFNEG($FRT$, $FRB$);"

/** * The following templates are used to map cisl parameters to * parameters used in the jikes baseline compiler. */ index() ::= "int index" i() ::= "int i" spTopOffset() ::= "" op() ::= "" inst() ::= "" bTarget() ::= "int bTarget" fieldOffset() ::= "int fieldOffset" val() ::= "int val" location() ::= "int location"

pe_aload(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_astore(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_dup(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_dup_x1(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_dup_x2(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_dup2(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT*2;" pe_dup2_x1(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT*2;" pe_dup2_x2(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT*2;" pe_iadd(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;"

223 pe_iand(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_idiv(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_imul(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_iload(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_ior(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_irem(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_ishl(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_ishr(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_istore(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_isub(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_iushr(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_ixor(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_ldc(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_pop(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_pop2(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_ifeq(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_ifne(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_ifgt(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_iflt(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_ifge(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_ifle(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_if_icmpeq(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_if_icmpne(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_if_icmplt(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_if_icmpgt(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_if_icmple(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_if_icmpge(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_if_acmpeq(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_if_acmpne(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_istore0(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_istore1(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_istore2(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_istore3(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_iconst_5(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_iconst_2(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_iconst_1(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_iconst_4(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_iconst_3(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_iconst_0(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_aload0(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_aload1(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_aload2(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_aload3(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_aload4(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_aload5(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_astore0(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_astore1(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_astore2(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_astore3(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_astore4(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_astore5(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_iconst_m12(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_iload0(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_iload1(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_iload2(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_iload3(inst) ::= "$inst$spTopOffset -= 
BYTES_IN_STACKSLOT;" pe_iconst_m1(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_aconst_null(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_i2l(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;" pe_l2i(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_iaload(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_baload(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_saload(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_caload(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_faload(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_iastore(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*3;" pe_fastore(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*3;" pe_bastore(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*3;"

224 pe_sastore(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*3;" pe_castore(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*3;" pe_dastore(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*4;" pe_lastore(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*4;" pe_d2i(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_d2f(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_f2d(inst) ::= "$inst$spTopOffset -= BYTES_IN_STACKSLOT;"

pe_ladd(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_lsub(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_land(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_lor(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_lxor(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_dadd(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_dsub(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_dmul(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_ddiv(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT*2;" pe_fadd(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_fsub(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_fmuls(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" pe_fdivs(inst) ::= "$inst$spTopOffset += BYTES_IN_STACKSLOT;" edu.umass.cs.cogent.unicogg.match.FitsConstraint(value,param) ::= "asm.fits($param$,$i$)" constraint_block(constraints, instructions) ::= << if ($constraints; seperator=" &&") { $instructions$ }

>> lcc Adapter package experiments.adapters; import java.io.FileWriter; import java.io.IOException; import org.antlr.stringtemplate.StringTemplate; import edu.umass.cs.cogent.unicogg.selector.InstructionPattern; import edu.umass.cs.cogent.unicogg.selector.PatternSequence; import edu.umass.cs.cogent.unicogg.selector.SelectorPattern; import edu.umass.cs.cogent.unicogg.selector.SelectorPatternList; public class Lccx86LinuxBursAdapter extends Adapter {

/** * The template group to use with string template library. */ private static final String templateResource = "/experiments/adapters/templates/lcc/x86/x86linux.stg";

public Lccx86LinuxBursAdapter(String gistOutput) { super(templateResource, gistOutput, ’<’); }

/** * The name of the burs output file. */ private static final String bursFile = "mips.md";

/** * Emits the burs file.

225 * * @param prologue * the prologue of the md file * @param support * the support rules of the md file * @param rules * the generated rules from gist * @param orules * the original rules from md file * @param epilogue * the epilogue of the md file */ private void emitBurs(String prologue, String support, String rules, String orules, String epilogue) { try { FileWriter fw = new FileWriter(bursFile); fw.append(prologue); fw.append(support); fw.append(rules); fw.append(orules); fw.append(epilogue); fw.close(); } catch (IOException e) { System.err.printf("error: could not emit burs file: %s", e .getMessage()); } } private String genBursRules(SelectorPatternList patterns) { StringBuilder b = new StringBuilder(); b.append(String.format("//// Gist Generated Rules Begin%n")); for (SelectorPattern p : patterns) { try { b.append(genBursRule(p)); } catch (IllegalArgumentException e) { //// error should have been reported---continue } } b.append(String.format("//// Gist Generated Rules End%n")); b.append(String.format("%n")); return b.toString(); } private String genBursRule(SelectorPattern p) { return String.format("%s %s%n", genHead(p.getSourceSequence()), genTail(p.getTargetSequence())); } private String genHead(PatternSequence s) { //// throw an exception if we have more than one ir pattern //// ---this approach expects //// only a single ir pattern. if (s.length() > 1) { System.err.printf("error: ir sequence if greater than 1"); throw new IllegalArgumentException(); } //// get the single ir pattern and name InstructionPattern p = s.get(0);

//// grab the ir template String name = normalizeIrName(p.getName()); StringTemplate t = templates.getInstanceOf(name);

//// throw exception if no template exists if (t == null) { System.err.printf("warning: template does not exist for %s", name); throw new IllegalArgumentException(); }

226 return t.toString(); }

private String normalizeIrName(String name) { if (name.endsWith("_disp")) return name.replace("_disp", ""); if (name.endsWith("_direct")) return name.replace("_direct", ""); return name; }

private String genTail(PatternSequence s) { StringBuilder b = new StringBuilder(); for (InstructionPattern p : s) { StringTemplate t = templates.getInstanceOf(p.getName()); b.append(t); } return b.toString(); }

@Override public void execute() { String prologue = templates.getInstanceOf("prologue").toString(); String support = templates.getInstanceOf("support").toString(); //String rules = genBursRules(super.patterns); String orules = templates.getInstanceOf("rules").toString(); String epilogue = templates.getInstanceOf("epilogue").toString(); //emitBurs(prologue, support, rules, orules, epilogue); }

public static void main(String[] args) { Adapter adapter = new Lccx86LinuxBursAdapter(args[0]); adapter.execute(); }

} lcc MIPS String Templates group mips; irPostfixes() ::= << _disp _direct _COMPR >> targetPostfixes() ::= << _oversized >> mixins() ::= << FPConditionCode Overflow >> connect() ::= "; "

//// This template represents an individual burs rule bursRule(head, tail) ::= " "

//// The following templates are the ir part of the burs rule //// Assignments MEM_DISP-ASGNI1() ::= "stmt: ASGNI1(addr,reg)" MEM_DISP-ASGNU1() ::= "stmt: ASGNU1(addr,reg)" MEM_DISP-ASGNI2() ::= "stmt: ASGNI2(addr,reg)"

227 MEM_DISP-ASGNU2() ::= "stmt: ASGNU2(addr,reg)" MEM_DISP-ASGNI4() ::= "stmt: ASGNI4(addr,reg)" MEM_DISP-ASGNU4() ::= "stmt: ASGNU4(addr,reg)" MEM_DISP-ASGNP4() ::= "stmt: ASGNP4(addr,reg)" ASGNF4() ::= "stmt: ASGNF4(addr,reg)" ASGNF8() ::= "stmt: ASGNF8(addr,reg)"

//// Indirects INDIRI1() ::= "reg: INDIRI1(addr)" INDIRU1() ::= "reg: INDIRU1(addr)" INDIRI2() ::= "reg: INDIRI2(addr)" INDIRU2() ::= "reg: INDIRU2(addr)" INDIRI4() ::= "reg: INDIRI4(addr)" INDIRU4() ::= "reg: INDIRU4(addr)" INDIRP4() ::= "reg: INDIRP4(addr)" INDIRF4() ::= "reg: INDIRF4(addr)" INDIRF8() ::= "reg: INDIRF8(addr)"

//// Conversion CONVERT_1B-CVII4(cost="(a->syms[0]->u.c.v.i==1?2:LBURG_MAX)") ::= "reg: CVII4(reg)" CONVERT_2B-CVII4(cost="(a->syms[0]->u.c.v.i==2?2:LBURG_MAX)") ::= "reg: CVII4(reg)" CONVERT_3B-CVII4(cost="(a->syms[0]->u.c.v.i==3?2:LBURG_MAX)") ::= "reg: CVII4(reg)" CONVERT_1B-CVUI4(cost="(a->syms[0]->u.c.v.i==1?1:LBURG_MAX)") ::= "reg: CVUI4(reg)" CONVERT_2B-CVUI4(cost="(a->syms[0]->u.c.v.i==2?1:LBURG_MAX)") ::= "reg: CVUI4(reg)" CONVERT_3B-CVUI4(cost="(a->syms[0]->u.c.v.i==3?1:LBURG_MAX)") ::= "reg: CVUI4(reg)" CONVERT_1B-CVUU4(cost="(a->syms[0]->u.c.v.i==1?1:LBURG_MAX)") ::= "reg: CVUU4(reg)" CONVERT_2B-CVUU4(cost="(a->syms[0]->u.c.v.i==2?1:LBURG_MAX)") ::= "reg: CVUU4(reg)" CONVERT_3B-CVUU4(cost="(a->syms[0]->u.c.v.i==3?1:LBURG_MAX)") ::= "reg: CVUU4(reg)" CVFF4() ::= "reg: CVFF4(reg)" CVFF8() ::= "reg: CVFF8(reg)" CVIF4() ::= "reg: CVIF4(reg)" CVIF8() ::= "reg: CVIF8(reg)" CVFI4_single(cost="(a->syms[0]->u.c.v.i==4?2:LBURG_MAX)") ::= "reg: CVFI4(reg)" CVFI4_double(cost="(a->syms[0]->u.c.v.i==8?2:LBURG_MAX)") ::= "reg: CVFI4(reg)"

//// Arithmetic DIVI4() ::= "reg: DIVI4(reg,reg)" DIVU4() ::= "reg: DIVU4(reg,reg)" MODI4() ::= "reg: MODI4(reg,reg)" MODU4() ::= "reg: MODU4(reg,reg)" MULI4() ::= "reg: MULI4(reg,reg)" MULU4() ::= "reg: MULU4(reg,reg)" ADDI4() ::= "reg: ADDI4(reg,rc)" ADDP4() ::= "reg: ADDP4(reg,rc)" ADDU4() ::= "reg: ADDU4(reg,rc)" SUBI4() ::= "reg: SUBI4(reg,rc)" SUBP4() ::= "reg: SUBP4(reg,rc)" SUBU4() ::= "reg: SUBU4(reg,rc)" NEGI4() ::= "reg: NEGI4(reg)" REG_PARAM-REG_PARAM-ADDI4(cost="2") ::= "reg: ADDI4(reg,reg)" IMM_PARAM-REG_PARAM-ADDI4() ::= "reg: ADDI4(reg,con)" REG_PARAM-REG_PARAM-ADDP4(cost="2") ::= "reg: ADDP4(reg,reg)" IMM_PARAM-REG_PARAM-ADDP4() ::= "reg: ADDP4(reg,con)" REG_PARAM-REG_PARAM-ADDU4(cost="2") ::= "reg: ADDU4(reg,reg)" IMM_PARAM-REG_PARAM-ADDU4() ::= "reg: ADDU4(reg,con)" REG_PARAM-REG_PARAM-SUBI4(cost="2") ::= "reg: SUBI4(reg,reg)" IMM_PARAM-REG_PARAM-SUBI4() ::= "reg: SUBI4(reg,con)" REG_PARAM-REG_PARAM-SUBP4(cost="2") ::= "reg: SUBP4(reg,reg)" IMM_PARAM-REG_PARAM-SUBP4() ::= "reg: SUBP4(reg,con)" REG_PARAM-REG_PARAM-SUBU4(cost="2") ::= "reg: SUBU4(reg,reg)" IMM_PARAM-REG_PARAM-SUBU4() ::= "reg: SUBU4(reg,con)"

//// Logical BANDI4() ::= "reg: BANDI4(reg,rc)" BANDU4() ::= "reg: BANDU4(reg,rc)" BORI4() ::= "reg: BORI4(reg,rc)" BORU4() ::= "reg: BORU4(reg,rc)"

228 BXORI4() ::= "reg: BXORI4(reg,rc)" BXORU4() ::= "reg: BXORU4(reg,rc)" REG_PARAM-REG_PARAM-BANDI4(cost="2") ::= "reg: BANDI4(reg,reg)" IMM_PARAM-REG_PARAM-BANDI4() ::= "reg: BANDI4(reg,con)" REG_PARAM-REG_PARAM-BANDU4(cost="2") ::= "reg: BANDU4(reg,reg)" IMM_PARAM-REG_PARAM-BANDU4() ::= "reg: BANDU4(reg,con)" REG_PARAM-REG_PARAM-BORI4(cost="2") ::= "reg: BORI4(reg,reg)" IMM_PARAM-REG_PARAM-BORI4() ::= "reg: BORI4(reg,con)" REG_PARAM-REG_PARAM-BORU4(cost="2") ::= "reg: BORU4(reg,reg)" IMM_PARAM-REG_PARAM-BORU4() ::= "reg: BORU4(reg,con)" REG_PARAM-REG_PARAM-BXORI4(cost="2") ::= "reg: BXORI4(reg,reg)" IMM_PARAM-REG_PARAM-BXORI4() ::= "reg: BXORI4(reg,con)" REG_PARAM-REG_PARAM-BXORU4(cost="2") ::= "reg: BXORU4(reg,reg)" IMM_PARAM-REG_PARAM-BXORU4() ::= "reg: BXORU4(reg,con)" BCOMI4() ::= "reg: BCOMI4(reg)" BCOMU4() ::= "reg: BCOMU4(reg)"

//// Shift IMM_PARAM-LSHI4() ::= "reg: LSHI4(reg,con5)" IMM_PARAM-LSHU4() ::= "reg: LSHU4(reg,con5)" IMM_PARAM-RSHI4() ::= "reg: RSHI4(reg,con5)" IMM_PARAM-RSHU4() ::= "reg: RSHU4(reg,con5)" REG_PARAM-LSHI4(cost="2") ::= "reg: LSHI4(reg,reg)" REG_PARAM-LSHU4(cost="2") ::= "reg: LSHU4(reg,reg)" REG_PARAM-RSHI4(cost="2") ::= "reg: RSHI4(reg,reg)" REG_PARAM-RSHU4(cost="2") ::= "reg: RSHU4(reg,reg)"

//// Load LOADI1(cost="move(a)") ::= "reg: LOADI1(reg)" LOADU1(cost="move(a)") ::= "reg: LOADU1(reg)" LOADI2(cost="move(a)") ::= "reg: LOADI2(reg)" LOADU2(cost="move(a)") ::= "reg: LOADU2(reg)" LOADI4(cost="move(a)") ::= "reg: LOADI4(reg)" LOADU4(cost="move(a)") ::= "reg: LOADU4(reg)" LOADP4(cost="move(a)") ::= "reg: LOADP4(reg)" LOADF4(cost="move(a)") ::= "reg: LOADF4(reg)" LOADF8(cost="move(a)") ::= "reg: LOADF8(reg)"

//// Comparison/Branching JUMPV() ::= "stmt: JUMPV(acon)" EQI4() ::= "stmt: EQI4(reg,reg)" EQU4() ::= "stmt: EQU4(reg,reg)" GEI4() ::= "stmt: GEI4(reg,reg)" GEU4() ::= "stmt: GEU4(reg,reg)" GTI4() ::= "stmt: GTI4(reg,reg)" GTU4() ::= "stmt: GTU4(reg,reg)" LEI4() ::= "stmt: LEI4(reg,reg)" LEU4() ::= "stmt: LEU4(reg,reg)" LTI4() ::= "stmt: LTI4(reg,reg)" LTU4() ::= "stmt: LTU4(reg,reg)" NEI4() ::= "stmt: NEI4(reg,reg)" NEU4() ::= "stmt: NEU4(reg,reg)" CALLI4() ::= "reg: CALLI4(ar)" CALLP4() ::= "reg: CALLP4(ar)" CALLU4() ::= "reg: CALLU4(ar)" CALLF4() ::= "reg: CALLF4(ar)" CALLF8() ::= "reg: CALLF8(ar)" CALLV() ::= "stmt: CALLV(ar)"

//// Floating-Point Arithmetic ADDF4() ::= "reg: ADDF4(reg,reg)" ADDF8() ::= "reg: ADDF8(reg,reg)" SUBF4() ::= "reg: SUBF4(reg,reg)" SUBF8() ::= "reg: SUBF8(reg,reg)" MULF4() ::= "reg: MULF4(reg,reg)" MULF8() ::= "reg: MULF8(reg,reg)" DIVF4() ::= "reg: DIVF4(reg,reg)" DIVF8() ::= "reg: DIVF8(reg,reg)"

229 NEGF4() ::= "reg: NEGF4(reg)" NEGF8() ::= "reg: NEGF8(reg)"

//// Floating-Point Comparison/Branching EQF4() ::= "stmt: EQF4(reg,reg)" EQF8() ::= "stmt: EQF8(reg,reg)" GEF4() ::= "stmt: GEF4(reg,reg)" GEF8() ::= "stmt: GEF8(reg,reg)" GTF4() ::= "stmt: GTF4(reg,reg)" GTF8() ::= "stmt: GTF8(reg,reg)" LEF4() ::= "stmt: LEF4(reg,reg)" LEF8() ::= "stmt: LEF8(reg,reg)" LTF4() ::= "stmt: LTF4(reg,reg)" LTF8() ::= "stmt: LTF8(reg,reg)" NEF4() ::= "stmt: NEF4(reg,reg)" NEF8() ::= "stmt: NEF8(reg,reg)"

//// MIPS-Specific Nested Intermediates INDI1CVI4() ::= "reg: CVII4(INDIRI1(addr))" INDI2CVI4() ::= "reg: CVII4(INDIRI2(addr))" INDU1CVU4() ::= "reg: CVUU4(INDIRU1(addr))" INDU2CVU4() ::= "reg: CVUU4(INDIRU2(addr))" INDU1CVI4() ::= "reg: CVUI4(INDIRU1(addr))" INDU2CVI4() ::= "reg: CVUI4(INDIRU2(addr))"

//// These templates are for the target instructions SB(rs,imm,rt) ::= <<"sb $%1,%0\n" 1>> SH(rs,imm,rt) ::= <<"sh $%1,%0\n" 1>> SW(rt,imm,rs) ::= <<"sw $%1,%0\n" 1>> SWC1(rt,imm,rs) ::= <<"s.s $f%1,%0\n" 1>> SDC1(rt,imm,rs) ::= <<"s.d $f%1,%0\n" 1>> LB(rt,imm,rs) ::= <<"lb $%c,%0\n" 1>> LBU(rt,imm,rs) ::= <<"lbu $%c,%0\n" 1>> LH(rt,imm,rs) ::= <<"lh $%c,%0\n" 1>> LHU(rt,imm,rs) ::= <<"lhu $%c,%0\n" 1>> LW(rt,imm,rs) ::= <<"lw $%c,%0\n" 1>> LWC1(rt,imm,rs) ::= <<"l.s $f%c,%0\n" 1>> LDC1(rt,imm,rs) ::= <<"l.d $f%c,%0\n" 1>> DIV_PSEUDO(rd,rs,rt) ::= <<"div $%c,$%0,$%1\n" 1>> DIVU_PSEUDO(rd,rs,rt) ::= <<"divu $%c,$%0,$%1\n" 1>> REM(rd,rs,rt) ::= <<"rem $%c,$%0,$%1\n" 1>> REMU(rd,rs,rt) ::= <<"remu $%c,$%0,$%1\n" 1>> MUL(rd,rs,rt) ::= <<"mul $%c,$%0,$%1\n" 1>> MULTU(rs,rt) ::= <<"multu $%0,$%1\n" 1>> ADD_R(rd,rs,rt) ::= <<"addu $%c,$%0,$%1\n" 1>> ADDI(rd,rs,imm="%1") ::= <<"addiu $%c,$%0,\n" 1>> ADDU_R(rd,rs,rt) ::= <<"addu $%c,$%0,$%1\n" 1>> ADDIU(rd,rs,imm="%1") ::= <<"addiu $%c,$%0,\n" 1>> SUB_R(rd,rs,rt) ::= <<"subu $%c,$%0,$%1\n" 1>> SUB_C(rd,rs,imm="%1") ::= <<"subu $%c,$%0,\n" 1>> SUBU_R(rd,rs,rt) ::= <<"subu $%c,$%0,$%1\n" 1>> SUBU_C(rd,rs,imm="%1") ::= <<"subu $%c,$%0,\n" 1>> SUBU_SPECIALIZED(rt,rs) ::= <<"negu $%c,$%0\n" 1>> AND_R(rd,rs,rt) ::= <<"and $%c,$%0,$%1\n" 1>> ANDI(rd,rs,imm="%1") ::= <<"and $%c,$%0,\n" 1>> OR_R(rd,rs,rt) ::= <<"or $%c,$%0,$%1\n" 1>> ORI(rd,rs,imm="%1") ::= <<"or $%c,$%0,\n" 1>> XOR_R(rd,rs,rt) ::= <<"xor $%c,$%0,$%1\n" 1>> XORI(rd,rs,imm="%1") ::= <<"xor $%c,$%0,\n" 1>> NOT(rt,rs) ::= <<"not $%c,$%0\n" 1>> SLLV(rd,rs,rt) ::= <<"sllv $%c,$%0,$%1\n" 1>> SRAV(rd,rs,rt) ::= <<"srav $%c,$%0,$%1\n" 1>> SRLV(rd,rs,rt) ::= <<"srlv $%c,$%0,$%1\n" 1>> SLL(rt,rs,imm="%1") ::= <<"sll $%c,$%0,\n" 1>> SRL(rt,rs,imm="%1") ::= <<"srl $%c,$%0,\n" 1>> SRA(rt,rs,imm="%1") ::= <<"sra $%c,$%0,\n" 1>> MOVE(rt,rs) ::= <<"move $%c,$%0\n" 0>> B(imm) ::= <<"b %0\n" 1>>

230 BEQ(rs,rt,imm) ::= <<"beq $%0,$%1,%a\n" 1>> BGE(rs,rt,imm) ::= <<"bge $%0,$%1,%a\n" 1>> BGEU(rs,rt,imm) ::= <<"bgeu $%0,$%1,%a\n" 1>> BGT(rs,rt,imm) ::= <<"bgt $%0,$%1,%a\n" 1>> BGTU(rs,rt,imm) ::= <<"bgtu $%0,$%1,%a\n" 1>> BLE(rs,rt,imm) ::= <<"ble $%0,$%1,%a\n" 1>> BLEU(rs,rt,imm) ::= <<"bleu $%0,$%1,%a\n" 1>> BLT(rs,rt,imm) ::= <<"blt $%0,$%1,%a\n" 1>> BLTU(rs,rt,imm) ::= <<"bltu $%0,$%1,%a\n" 1>> BEQ(rs="$%0",rt="$%1",imm="%a") ::= <<"beq ,,\n" 1>> BNE(rs,rt,imm) ::= <<"bne $%0,$%1,%a\n" 1>> JAL(imm) ::= <<"jal %0\n" 1>> MFLO(rd) ::= <<"mflo $%c\n" 1>> ADD_S(rd,rs,rt) ::= <<"add.s $f%c,$f%0,$f%1\n" 1>> ADD_D(rd,rs,rt) ::= <<"add.d $f%c,$f%0,$f%1\n" 1>> SUB_S(rd,rs,rt) ::= <<"sub.s $f%c,$f%0,$f%1\n" 1>> SUB_D(rd,rs,rt) ::= <<"sub.d $f%c,$f%0,$f%1\n" 1>> MUL_S(rd,rs,rt) ::= <<"mul.s $f%c,$f%0,$f%1\n" 1>> MUL_D(rd,rs,rt) ::= <<"mul.d $f%c,$f%0,$f%1\n" 1>> DIV_S(rd,rs,rt) ::= <<"div.s $f%c,$f%0,$f%1\n" 1>> DIV_D(rd,rs,rt) ::= <<"div.d $f%c,$f%0,$f%1\n" 1>> NEG_S(rd,rs) ::= <<"neg.s $f%c,$f%0\n" 1>> NEG_D(rd,rs) ::= <<"neg.d $f%c,$f%0\n" 1>> MOV_S(rt,rs) ::= <<"mov.s $f%c,$f%0\n" 0>> MOV_D(rt,rs) ::= <<"mov.d $f%c,$f%0\n" 0>> BC1T(offset) ::= <<"bc1t %a\n" 1>> BC1F(offset) ::= <<"bc1f %a\n" 1>> C_eq_S(rs,rt) ::= <<"c.eq.s $f%0,$f%1\n" 1>> C_eq_D(rs,rt) ::= <<"c.eq.d $f%0,$f%1\n" 1>> C_ngt_S(rs,rt) ::= <<"c.ule.s $f%0,$f%1\n" 1>> C_ngt_D(rs,rt) ::= <<"c.ule.d $f%0,$f%1\n" 1>> C_nge_S(rs,rt) ::= <<"c.ult.s $f%0,$f%1\n" 1>> C_nge_D(rs,rt) ::= <<"c.ult.d $f%0,$f%1\n" 1>> C_lt_S(rs,rt) ::= <<"c.lt.s $f%0,$f%1\n" 1>> C_lt_D(rs,rt) ::= <<"c.lt.d $f%0,$f%1\n" 1>> C_le_S(rs,rt) ::= <<"c.le.s $f%0,$f%1\n" 1>> C_le_D(rs,rt) ::= <<"c.le.d $f%0,$f%1\n" 1>> MTC1(rs,rt) ::= <<"mtc1 $%0,$f%c\n" 1>> MFC1(rs,rt) ::= <<"mfc1 $%c,$f2\n" 1>> CVT_S_W(rs,rt) ::= <<"cvt.s.w $f%c,$f%c\n" 1>> CVT_S_D(rs,rt) ::= <<"cvt.s.d $f%c,$f%0\n" 1>> CVT_D_W(rs,rt) ::= <<"cvt.d.w $f%c,$f%c\n" 1>> CVT_D_S(rs,rt) ::= <<"cvt.d.s $f%c,$f%0\n" 1>> TRUNC_W_S(rs,rt) ::= <<"trunc.w.s $f2,$f%0,$%c\n" 1>> TRUNC_W_D(rs,rt) ::= <<"trunc.w.d $f2,$f%0,$%c\n" 1>> lcc IA32 String Templates group x86linux; targetPostfixes() ::= << _R2RM _RM2R _IMM2RM _RM >> mixins() ::= << ModRM16_Disp16_EA ModRM32_EAX ModRM32_EAX_Disp32_EA ModRM32_EBX ModRM32_EBX_EA ModRM32_ECX ModRM32_EDI ModRM32_EBP ModRM32_Disp32_EA

lcc IA32 String Templates

group x86linux;

targetPostfixes() ::= << _R2RM _RM2R _IMM2RM _RM >>
mixins() ::= << ModRM16_Disp16_EA ModRM32_EAX ModRM32_EAX_Disp32_EA ModRM32_EBX ModRM32_EBX_EA ModRM32_ECX ModRM32_EDI ModRM32_EBP ModRM32_Disp32_EA RegEAX RegEBX RegEBP RegECX RegEDI RegESI >>
connect() ::= "\\n"
bursRule(head, tail) ::= " "

//// Assignment
ASGNI1() ::= "stmt: ASGNI1(addr,rc)"
ASGNI2() ::= "stmt: ASGNI2(addr,rc)"
ASGNI4() ::= "stmt: ASGNI4(addr,rc)"
ASGNU1() ::= "stmt: ASGNU1(addr,rc)"
ASGNU2() ::= "stmt: ASGNU2(addr,rc)"
ASGNU4() ::= "stmt: ASGNU4(addr,rc)"
ASGNP4() ::= "stmt: ASGNP4(addr,rc)"

//// Arithmetic
REG_PARAM-REG_PARAM-ADDI4() ::= "reg: ADDI4(reg,mrc)"
REG_PARAM-REG_PARAM-ADDP4() ::= "reg: ADDP4(reg,mrc)"
REG_PARAM-REG_PARAM-ADDU4() ::= "reg: ADDU4(reg,mrc)"
INCI(cost="memop(a)") ::= "stmt: ASGNI4(addr,ADDI4(mem4,con1))"
INCU(cost="memop(a)") ::= "stmt: ASGNI4(addr,ADDU4(mem4,con1))"
INCP(cost="memop(a)") ::= "stmt: ASGNP4(addr,ADDP4(mem4,con1))"
DECI(cost="memop(a)") ::= "stmt: ASGNI4(addr,SUBI4(mem4,con1))"
DECU(cost="memop(a)") ::= "stmt: ASGNI4(addr,SUBU4(mem4,con1))"
DECP(cost="memop(a)") ::= "stmt: ASGNP4(addr,SUBP4(mem4,con1))"
ADDI_M(cost="memop(a)") ::= "stmt: ASGNI4(addr,ADDI4(mem4,rc))"
ADDU_M(cost="memop(a)") ::= "stmt: ASGNU4(addr,ADDU4(mem4,rc))"
SUBI_M(cost="memop(a)") ::= "stmt: ASGNI4(addr,SUBI4(mem4,rc))"
SUBU_M(cost="memop(a)") ::= "stmt: ASGNU4(addr,SUBU4(mem4,rc))"
ANDI_M(cost="memop(a)") ::= "stmt: ASGNI4(addr,BANDI4(mem4,rc))"
ANDU_M(cost="memop(a)") ::= "stmt: ASGNU4(addr,BANDU4(mem4,rc))"
ORI_M(cost="memop(a)") ::= "stmt: ASGNI4(addr,BORI4(mem4,rc))"
ORU_M(cost="memop(a)") ::= "stmt: ASGNU4(addr,BORU4(mem4,rc))"
XORI_M(cost="memop(a)") ::= "stmt: ASGNI4(addr,BXORI4(mem4,rc))"
XORU_M(cost="memop(a)") ::= "stmt: ASGNU4(addr,BXORU4(mem4,rc))"
NOTI_M(cost="memop(a)") ::= "stmt: ASGNI4(addr,BCOMI4(mem4))"
NOTU_M(cost="memop(a)") ::= "stmt: ASGNU4(addr,BCOMU4(mem4))"
NEG_M(cost="memop(a)") ::= "stmt: ASGNI4(addr,NEGI4(mem4))"
REG_PARAM-REG_PARAM-SUBI4() ::= "reg: SUBI4(reg,mrc)"
REG_PARAM-REG_PARAM-SUBP4() ::= "reg: SUBP4(reg,mrc)"
REG_PARAM-REG_PARAM-SUBU4() ::= "reg: SUBU4(reg,mrc)"
NEGI4() ::= "reg: NEGI4(reg)"
MULI4() ::= ""
MULU4() ::= "reg: MULU4(reg,mr)"
DIVI4() ::= "reg: DIVI4(reg,reg)"
DIVU4() ::= "reg: DIVU4(reg,reg)"
MODI4() ::= "reg: MODI4(reg,reg)"
MODU4() ::= "reg: MODU4(reg,reg)"
NEGI4() ::= "reg: NEGI4(reg)"
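The rules above with cost="memop(a)" describe read-modify-write forms (INCI, DECI, ADDI_M, and so on) in which the location that is loaded must be the same location that is stored back. Such a rule is only safe to select when that condition holds, so the cost acts as a guard: a small constant when the two addresses match, and an effectively infinite cost otherwise. The following is a minimal sketch of that idea; the Node type and its accessors are illustrative and are not GIST's or lcc's actual API.

// Hypothetical guard cost for the read-modify-write rules above:
// assign = ASGNx(addr, op(INDIRx(addr2), ...)); cheap only when addr == addr2.
public final class MemopCost {
    interface Node {
        Node kid(int i);                    // i-th child of the IR tree node
        boolean sameAddress(Node other);    // structural address comparison
    }

    static final int NO_MATCH = Integer.MAX_VALUE / 2;  // effectively "never select"

    static int memop(Node assign) {
        Node storeAddress = assign.kid(0);
        Node loadAddress  = assign.kid(1).kid(0).kid(0); // op -> INDIR -> address
        return storeAddress.sameAddress(loadAddress) ? 3 : NO_MATCH;
    }
}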

//// Logical
REG_PARAM-REG_PARAM-BANDI4() ::= "reg: BANDI4(reg,reg)"
REG_PARAM-REG_PARAM-BANDU4() ::= "reg: BANDU4(reg,reg)"
REG_PARAM-REG_PARAM-BORI4() ::= "reg: BORI4(reg,reg)"
REG_PARAM-REG_PARAM-BORU4() ::= "reg: BORU4(reg,reg)"
REG_PARAM-REG_PARAM-BXORI4() ::= "reg: BXORI4(reg,reg)"
REG_PARAM-REG_PARAM-BXORU4() ::= "reg: BXORU4(reg,reg)"
IMM_PARAM-REG_PARAM-BANDI4() ::= "reg: BANDI4(reg,con)"
IMM_PARAM-REG_PARAM-BANDU4() ::= "reg: BANDU4(reg,con)"
IMM_PARAM-REG_PARAM-BORI4() ::= "reg: BORI4(reg,con)"
IMM_PARAM-REG_PARAM-BORU4() ::= "reg: BORU4(reg,con)"
IMM_PARAM-REG_PARAM-BXORI4() ::= "reg: BXORI4(reg,con)"
IMM_PARAM-REG_PARAM-BXORU4() ::= "reg: BXORU4(reg,con)"
BCOMI4() ::= "reg: BCOMI4(reg)"
BCOMU4() ::= "reg: BCOMU4(reg)"

//// Shift
REG_PARAM-REG_PARAM-LSHI4() ::= "reg: LSHI4(reg,reg5)"
REG_PARAM-REG_PARAM-LSHU4() ::= "reg: LSHU4(reg,reg5)"
REG_PARAM-REG_PARAM-RSHI4() ::= "reg: RSHI4(reg,reg5)"
REG_PARAM-REG_PARAM-RSHU4() ::= "reg: RSHU4(reg,reg5)"
IMM_PARAM-REG_PARAM-LSHI4() ::= "reg: LSHI4(reg,con5)"
IMM_PARAM-REG_PARAM-LSHU4() ::= "reg: LSHU4(reg,con5)"
IMM_PARAM-REG_PARAM-RSHI4() ::= "reg: RSHI4(reg,con5)"
IMM_PARAM-REG_PARAM-RSHU4() ::= "reg: RSHU4(reg,con5)"

//// Comparison/Branching
REG_PARAM-MEM_DISP_PARAM-EQI4(cost="5") ::= "stmt: EQI4(mem4,reg)"
REG_PARAM-MEM_DISP_PARAM-GEI4(cost="5") ::= "stmt: GEI4(mem4,reg)"
REG_PARAM-MEM_DISP_PARAM-GEU4(cost="5") ::= "stmt: GEU4(mem4,reg)"
REG_PARAM-MEM_DISP_PARAM-GTI4(cost="5") ::= "stmt: GTI4(mem4,reg)"
REG_PARAM-MEM_DISP_PARAM-GTU4(cost="5") ::= "stmt: GTU4(mem4,reg)"
REG_PARAM-MEM_DISP_PARAM-LEI4(cost="5") ::= "stmt: LEI4(mem4,reg)"
REG_PARAM-MEM_DISP_PARAM-LEU4(cost="5") ::= "stmt: LEU4(mem4,reg)"
REG_PARAM-MEM_DISP_PARAM-LTI4(cost="5") ::= "stmt: LTI4(mem4,reg)"
REG_PARAM-MEM_DISP_PARAM-LTU4(cost="5") ::= "stmt: LTU4(mem4,reg)"
REG_PARAM-MEM_DISP_PARAM-NEI4(cost="5") ::= "stmt: NEI4(mem4,reg)"
IMM_PARAM-MEM_DISP_PARAM-EQI4(cost="5") ::= "stmt: EQI4(mem4,con)"
IMM_PARAM-MEM_DISP_PARAM-GEI4(cost="5") ::= "stmt: GEI4(mem4,con)"
IMM_PARAM-MEM_DISP_PARAM-GEU4(cost="5") ::= "stmt: GEU4(mem4,con)"
IMM_PARAM-MEM_DISP_PARAM-GTI4(cost="5") ::= "stmt: GTI4(mem4,con)"
IMM_PARAM-MEM_DISP_PARAM-GTU4(cost="5") ::= "stmt: GTU4(mem4,con)"
IMM_PARAM-MEM_DISP_PARAM-LEI4(cost="5") ::= "stmt: LEI4(mem4,con)"
IMM_PARAM-MEM_DISP_PARAM-LEU4(cost="5") ::= "stmt: LEU4(mem4,con)"
IMM_PARAM-MEM_DISP_PARAM-LTI4(cost="5") ::= "stmt: LTI4(mem4,con)"
IMM_PARAM-MEM_DISP_PARAM-LTU4(cost="5") ::= "stmt: LTU4(mem4,con)"
IMM_PARAM-MEM_DISP_PARAM-NEI4(cost="5") ::= "stmt: NEI4(mem4,con)"
MEM_DISP_PARAM-REG_PARAM-EQI4(cost="4") ::= "stmt: EQI4(reg,memx)"
MEM_DISP_PARAM-REG_PARAM-EQU4(cost="4") ::= "stmt: EQU4(reg,memx)"
MEM_DISP_PARAM-REG_PARAM-GEI4(cost="4") ::= "stmt: GEI4(reg,memx)"
MEM_DISP_PARAM-REG_PARAM-GEU4(cost="4") ::= "stmt: GEU4(reg,memx)"
MEM_DISP_PARAM-REG_PARAM-GTI4(cost="4") ::= "stmt: GTI4(reg,memx)"
MEM_DISP_PARAM-REG_PARAM-GTU4(cost="4") ::= "stmt: GTU4(reg,memx)"
MEM_DISP_PARAM-REG_PARAM-LEI4(cost="4") ::= "stmt: LEI4(reg,memx)"
MEM_DISP_PARAM-REG_PARAM-LEU4(cost="4") ::= "stmt: LEU4(reg,memx)"
MEM_DISP_PARAM-REG_PARAM-LTI4(cost="4") ::= "stmt: LTI4(reg,memx)"
MEM_DISP_PARAM-REG_PARAM-LTU4(cost="4") ::= "stmt: LTU4(reg,memx)"
MEM_DISP_PARAM-REG_PARAM-NEI4(cost="4") ::= "stmt: NEI4(reg,memx)"
MEM_DISP_PARAM-REG_PARAM-NEU4(cost="4") ::= "stmt: NEU4(reg,memx)"
REG_PARAM-REG_PARAM-EQI4(cost="4") ::= "stmt: EQI4(reg,reg)"
REG_PARAM-REG_PARAM-EQU4(cost="4") ::= "stmt: EQU4(reg,reg)"
REG_PARAM-REG_PARAM-GEI4(cost="4") ::= "stmt: GEI4(reg,reg)"
REG_PARAM-REG_PARAM-GEU4(cost="4") ::= "stmt: GEU4(reg,reg)"
REG_PARAM-REG_PARAM-GTI4(cost="4") ::= "stmt: GTI4(reg,reg)"
REG_PARAM-REG_PARAM-GTU4(cost="4") ::= "stmt: GTU4(reg,reg)"
REG_PARAM-REG_PARAM-LEI4(cost="4") ::= "stmt: LEI4(reg,reg)"
REG_PARAM-REG_PARAM-LEU4(cost="4") ::= "stmt: LEU4(reg,reg)"
REG_PARAM-REG_PARAM-LTI4(cost="4") ::= "stmt: LTI4(reg,reg)"
REG_PARAM-REG_PARAM-LTU4(cost="4") ::= "stmt: LTU4(reg,reg)"
REG_PARAM-REG_PARAM-NEI4(cost="4") ::= "stmt: NEI4(reg,reg)"
REG_PARAM-REG_PARAM-NEU4(cost="4") ::= "stmt: NEU4(reg,reg)"
IMM_PARAM-REG_PARAM-EQI4(cost="4") ::= "stmt: EQI4(reg,con)"
IMM_PARAM-REG_PARAM-EQU4(cost="4") ::= "stmt: EQU4(reg,con)"
IMM_PARAM-REG_PARAM-GEI4(cost="4") ::= "stmt: GEI4(reg,con)"
IMM_PARAM-REG_PARAM-GEU4(cost="4") ::= "stmt: GEU4(reg,con)"
IMM_PARAM-REG_PARAM-GTI4(cost="4") ::= "stmt: GTI4(reg,con)"
IMM_PARAM-REG_PARAM-GTU4(cost="4") ::= "stmt: GTU4(reg,con)"
IMM_PARAM-REG_PARAM-LEI4(cost="4") ::= "stmt: LEI4(reg,con)"
IMM_PARAM-REG_PARAM-LEU4(cost="4") ::= "stmt: LEU4(reg,con)"
IMM_PARAM-REG_PARAM-LTI4(cost="4") ::= "stmt: LTI4(reg,con)"
IMM_PARAM-REG_PARAM-LTU4(cost="4") ::= "stmt: LTU4(reg,con)"
IMM_PARAM-REG_PARAM-NEI4(cost="4") ::= "stmt: NEI4(reg,con)"
IMM_PARAM-REG_PARAM-NEU4(cost="4") ::= "stmt: NEU4(reg,con)"
CALLI4() ::= "reg: CALLI4(addrj)"
CALLP4() ::= "reg: CALLP4(addrj)"
CALLU4() ::= "reg: CALLU4(addrj)"
CALLF4() ::= "freg: CALLF4(addrj)"
CALLV() ::= "stmt: CALLV(addrj)"
JUMPV() ::= "stmt: JUMPV(addrj)"
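Each rule above pairs a name-mangled key encoding how the operands are supplied (REG_PARAM, IMM_PARAM, MEM_DISP_PARAM) with a BURS pattern over lcc's IR operators and, where needed, a cost. Combined with one of the target-instruction templates listed next, such an entry corresponds to an lburg-style rule consisting of a pattern, an emission template, and a cost. The following is a minimal sketch of that combination, using a hypothetical helper rather than GIST's actual glue code.

// Hypothetical sketch: combine a BURS pattern, an instruction template, and a
// cost into a single lburg-style rule line.
public class RuleBuilder {
    static String rule(String pattern, String asmTemplate, String cost) {
        // Escape real newlines so the template prints as a one-line rule string.
        return pattern + " \"" + asmTemplate.replace("\n", "\\n") + "\" " + cost;
    }

    public static void main(String[] args) {
        System.out.println(rule("stmt: EQI4(mem4,reg)", "cmpl %1,%0\nje %a\n", "5"));
        // prints: stmt: EQI4(mem4,reg) "cmpl %1,%0\nje %a\n" 5
    }
}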

//// Templates for target instructions
ADD_32() ::= <<"?movl %0,%c\naddl %1,%c\n" 1>>
AND_32() ::= <<"?movl %0,%c\nandl %1,%c\n" 1>>
CMP_32() ::= <<"cmpl %1,%0\n" 0>>
CMOVC_32() ::= <<"cmovc" 0>>
CMOVE_32() ::= <<"cmove" 0>>
DEC_32() ::= <<"decl %1\n" 0>>
JE_32() ::= <<"je %a\n" 0>>
JGE_32() ::= <<"jge %a\n" 0>>
JG_32() ::= <<"jg %a\n" 0>>
JLE_32() ::= <<"jle %a\n" 0>>
JL_32() ::= <<"jl %a\n" 0>>
JNE_32() ::= <<"jne %a\n" 0>>
JMP_abs16() ::= <<"call %0\n" 1>>
JMP_rel16() ::= <<"jmp %0\n" 3>>
MOV_16() ::= <<"movw %1,%0\n" 1>>
MOV_32() ::= <<"movl %1,%0\n" 1>>
NEG_32() ::= <<"?movl %0,%c\nnegl %c\n" 2>>
NOT_32() ::= <<"?movl %0,%c\nnotl %c\n" 2>>
OR_32() ::= <<"?movl %0,%c\norl %1,%c\n" 1>>
SAL_32() ::= <<"?movl %0,%c\nsall %1,%c\n" 2>>
SAR_32() ::= <<"?movl %0,%c\nsarl %1,%c\n" 2>>
SHL_32() ::= <<"?movl %0,%c\nshll %1,%c\n" 2>>
SHR_32() ::= <<"?movl %0,%c\nshrl %1,%c\n" 2>>
SUB_32() ::= <<"?movl %0,%c\nsubl %1,%c\n" 1>>
XOR_32() ::= <<"?movl %0,%c\nxorl %1,%c\n" 1>>
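IA32 is a two-address architecture, so most of these templates first copy the left operand into the result register and then overwrite that register. The leading '?' marks the copy as one that can be dropped when the register allocator assigns the first operand and the result to the same register, following the convention used in lcc's machine descriptions. The following is a minimal sketch of that expansion, with hypothetical names.

// Hypothetical sketch of two-address expansion for templates such as
// ADD_32 ("?movl %0,%c\naddl %1,%c\n"): the '?' line is emitted only when
// the result register does not already hold the first operand.
public class TwoAddressEmitter {
    static String emit(String template, String src0, String src1, String result) {
        StringBuilder out = new StringBuilder();
        for (String line : template.split("\n")) {
            boolean optional = line.startsWith("?");
            String text = optional ? line.substring(1) : line;
            if (optional && src0.equals(result)) {
                continue;  // skip the copy: source and result coincide
            }
            out.append(text.replace("%0", src0)
                           .replace("%1", src1)
                           .replace("%c", result))
               .append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String add32 = "?movl %0,%c\naddl %1,%c\n";
        System.out.print(emit(add32, "%eax", "%ebx", "%eax"));  // addl %ebx,%eax
        System.out.print(emit(add32, "%ecx", "%ebx", "%eax"));  // movl %ecx,%eax then addl %ebx,%eax
    }
}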
