The Platform-Aware Compilation Environment¹: Status and Future Directions

June 13, 2012

¹ The Platform-Aware Compilation Environment project (PACE) is funded by the Defense Advanced Research Projects Agency (DARPA) through Air Force Research Laboratory (AFRL) Contract FA8650-09-C-7915 with Rice University. PACE is part of the Architecture-Aware Compilation Environment program (AACE). The opinions and findings in this document do not necessarily reflect the views of either the United States Government or Rice University.

Credits

The Platform-Aware Compilation Environment (PACE) project is an inter-institutional collaboration.

Organization               Location              Principal Contacts
Rice University (lead)     Houston, TX, USA      Keith D. Cooper, PI
                                                 John Mellor-Crummey
                                                 Erzsébet Merényi
                                                 Krishna Palem
                                                 Vivek Sarkar
                                                 Linda Torczon
ET International           Newark, DE, USA       Rishi Khan
Ohio State University      Columbus, OH, USA     P. Sadayappan
Stanford University        Palo Alto, CA, USA    Sanjiva Lele
Texas Instruments, Inc.    Dallas, TX, USA       Reid Tatge

The PACE team includes a large number of additional colleagues and collaborators:

Laksono Adhianto,¹ Rajkishore Barik,¹ Heba Bevan,¹ Milind Chabbi,¹ Jean-Christophe Beyler,² Zoran Budimlić,¹ Michael Burke,¹ Vincent Cavé,¹ Lakshmi Chakrapani,¹ Phillipe Charles,¹ Jack Dennis,² Sebastien Donadio,² Mike Fagan,¹ Guohua Jin,¹ Paul Hahn,¹ Timothy Harvey,¹ Thomas Henretty,³ Justin Hoelwinski,³ Zhao Jishen,¹ Sam Kaplan,² Kirk Kelsey,² Mark Krentel,¹ Abid Malik,¹ Dung “Zung” Nguyen,¹ Rene Pečnik,⁴ Louis-Noël Pouchet,³ Atanas Rountev,³ Jeffrey Sandoval,¹ Arnold Schwaighofer,¹ Jun Shirako,¹ Ray Simar,¹ Brian West,¹ Yonghong Yan,¹ Anna Youseffi,¹ Jisheng Zhao¹

¹ Rice University
² ET International
³ Ohio State University
⁴ Stanford University
⁵ Texas Instruments, Incorporated

Technical Contacts:
    Keith D. Cooper    713-348-6013    [email protected]
    Linda Torczon      713-348-5177    [email protected]
    Vivek Sarkar       713-348-5304    [email protected]

Design Document Master:
    Michael Burke      713-348-4476    [email protected]

Administrative Contacts:
    Penny Anderson     713-348-5186    [email protected]
    Lena Sifuentes     713-348-6325    [email protected]

Web Site: http://pace.rice.edu

Contents

1 Overview of the PACE System
    1.1 Introduction
        1.1.1 Motivation
        1.1.2 Document Roadmap
    1.2 Structure of the PACE System
        1.2.1 Information Flow in the PACE System
            1.2.1.1 The Compiler
            1.2.1.2 The Runtime System
            1.2.1.3 The Characterization Tools
            1.2.1.4 Machine Learning Tool
        1.2.2 Storing Knowledge in a Distributed Fashion
    1.3 Adaptation in the PACE Compiler
        1.3.1 Characteristic Driven Optimization
        1.3.2 Offline Feedback-Driven Optimization
        1.3.3 Online Feedback-Driven Optimization
        1.3.4 Machine Learning
    1.4 Status

2 Resource Characterization in the PACE System
    2.1 Introduction
        2.1.1 Motivation
        2.1.2 Approach
    2.2 Functionality
        2.2.1 Interfaces
        2.2.2 Inputs
        2.2.3 Output
    2.3 Method
        2.3.1 Reporting Characteristic Values
            2.3.1.1 Interface to Other PACE Tools

3 An Overview of the PACE Compiler
    3.1 Introduction
    3.2 Functionality
        3.2.1 Input and Output
        3.2.2 Interfaces
        3.2.3 The Refactored Program Unit
        3.2.4 The Optimization Plan
    3.3 Components of the PACE Compiler
        3.3.1 Compiler Driver
        3.3.2 Platform-Aware Optimizer
            3.3.2.1 Polyhedral Analysis and Transformation Tools
        3.3.3 PAO→TAO IR Translator
        3.3.4 Target-Aware Optimizer
    3.4 Paths Through the PACE Compiler
    3.5 Optimization in the PACE Compiler
    3.6 Software Base for the PACE Compiler

4 PACE Platform-Aware Optimizer Overview
    4.1 Introduction
    4.2 Functionality
        4.2.1 Input
        4.2.2 Output
    4.3 Method
        4.3.1 Front end
        4.3.2 Program Analyses
        4.3.3 Legality Analysis
        4.3.4 Cost Analysis: Memory Hierarchy
        4.3.5 Cost Analysis: PAO-TAO Query Interface
        4.3.6 Transcription
        4.3.7 The Optimization Plan
        4.3.8 PAO Parameters for Runtime System
        4.3.9 Guidance from Runtime System

5 PolyOpt: The Polyhedral Optimization Framework
    5.1 Introduction
        5.1.1 Motivation
        5.1.2 Background
    5.2 Functionality
        5.2.1 Static Control Part (SCoP) Code Fragments
        5.2.2 SCoP Detection and Extraction of Polyhedra
        5.2.3 Polyhedral Dependence Analysis with Candl
        5.2.4 Pluto Transformation Generator
        5.2.5 Polyhedral Code Generation with CLooG
        5.2.6 Parametric Tiling with PTile
        5.2.7 Translation to Sage ASTs
    5.3 Method
        5.3.1 SCoP Detection and Extraction of Polyhedra
        5.3.2 Polyhedral Dependence Analysis with Candl
        5.3.3 Pluto Transformation Generator
        5.3.4 Polyhedral Code Generation with CLooG
        5.3.5 Translation to Sage ASTs
        5.3.6 Parametric Tiling with PTile

6 AST-based Transformations in the Platform-Aware Optimizer
    6.1 Introduction and Motivation
    6.2 Functionality
        6.2.1 Input
        6.2.2 Output
    6.3 Method
        6.3.1 Pattern-driven Idiom Recognition
        6.3.2 AST-based Loop Tiling
        6.3.3 Selection of Tile Size
            6.3.3.1 DL Model
            6.3.3.2 ML Model
            6.3.3.3 Bounding Search Space and Selecting Initial Tile Size
        6.3.4 Loop Interchange
        6.3.5 Unrolling of Nested Loops
            6.3.5.1 Cost Driven Loop Unroll-and-Jam
            6.3.5.2 Pruning the Search Space
        6.3.6 Scalar Replacement
        6.3.7 Incremental Reanalysis

7 The Rose to LLVM Translator
    7.1 Introduction
        7.1.1 Motivation
    7.2 Functionality
        7.2.1 Input

