C Compiler Aided Design of Application-Specific Instruction-Set Processors Using the Machine Description Language LISA

Dissertation approved by the Faculty of Electrical Engineering and Information Technology of the Rheinisch-Westfälische Technische Hochschule Aachen (RWTH Aachen) for the academic degree of Doctor of Engineering (Doktor der Ingenieurwissenschaften)

submitted by Diplom-Ingenieur Oliver Wahlen from Jülich, North Rhine-Westphalia

Reviewers:
University Professor Dr. rer. nat. Rainer Leupers
University Professor Dr. sc. techn. Heinrich Meyr
University Professor Dr.-Ing. Stefan Heinen

Date of the oral examination: 4 May 2004

This dissertation is available online on the website of the university library.

Series: Berichte aus der Elektrotechnik
D 82 (Diss. RWTH Aachen)
Shaker Verlag, Aachen 2004

Bibliographic information published by Die Deutsche Bibliothek: Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available on the internet at http://dnb.ddb.de.

Also: Aachen, Techn. Hochsch., Diss., 2004

Copyright Shaker Verlag 2004. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the publishers.

Printed in Germany.
ISBN 3-8322-3035-1
ISSN 0945-0718
Shaker Verlag GmbH, P.O. Box 101818, D-52018 Aachen
Phone: 0049/2407/9596-0, Telefax: 0049/2407/9596-9, Internet: www.shaker.de

Dedicated to my wife Manuela and my parents Peter and Hilde

Contents

1 Introduction
  1.1 Motivation
  1.2 Organization of this Thesis
2 ASIP Design Methodology
  2.1 ASIP Design Phases
    2.1.1 Architecture Exploration
    2.1.2 Architecture Implementation
    2.1.3 Software Application Design
    2.1.4 System Integration and Verification
  2.2 Design Environments
3 The LISA Processor Design Platform
  3.1 Field of Application
  3.2 The LISA Language
  3.3 Applicability for Compiler Generation
4 Retargetable Compilation
  4.1 Compilation Phases
  4.2 Compiler Frontend
  4.3 Compiler Backend
    4.3.1 Instruction Selection
    4.3.2 Register Allocation
    4.3.3 Scheduling and Code Compaction
    4.3.4 Code Emitter
  4.4 Compiler Environments
    4.4.1 GCC
    4.4.2 SUIF2/MachineSUIF
    4.4.3 SPAM
    4.4.4 LCC
    4.4.5 LANCE
    4.4.6 IMPACT
    4.4.7 Trimaran
    4.4.8 CoSy
    4.4.9 Related Code Generation Techniques
5 Test Case: Two ASIP Design Approaches
  5.1 The Application
  5.2 The ICORE2 ASIP
  5.3 ALICE Architecture Template
  5.4 C Compiler based Architecture Exploration
    5.4.1 Exploring the Number of ALICE Functional Units
    5.4.2 Exploring Special Purpose Units
    5.4.3 Exploring Latencies, Forwarding, and Register File Size
  5.5 Architecture Exploration Results
    5.5.1 Execution Cycles
  5.6 Results of the Case Study
    5.6.1 Hardware Efficiency
    5.6.2 Design and Verification Time
  5.7 Conclusions
6 Generating C Compilers from the LISA ADL
  6.1 Information Required by the Compiler Generator
    6.1.1 Instruction Selector
    6.1.2 Register Allocator
    6.1.3 Scheduler
    6.1.4 Code Emitter
  6.2 Compiler Design Flow
  6.3 Interfacing with the Designer
7 Scheduler Generation
  7.1 LISA Operation Hierarchy
  7.2 Generation of Reservation Tables
  7.3 Generating Port Constraints for Reservation Tables
  7.4 Allocating Issue Slots in Reservation Tables
    7.4.1 Coding Constraints
    7.4.2 Virtual Resources
    7.4.3 Reducing the Number of Virtual Resources
  7.5 Generation of Latency Tables
  7.6 Backtracking Schedulers
    7.6.1 OperBTScheduler and ListBTScheduler
    7.6.2 MixedBTScheduler
  7.7 Scheduler Integration into the CoSy Environment
8 Results
  8.1 Driver Architectures
    8.1.1 PP32 Network Processor
    8.1.2 ST200 Multimedia VLIW architecture
  8.2 Compiler Evaluation
  8.3 Conclusions
9 Summary
A Compiler Companion Graphical User Interface
  A.1 Registers Dialog
  A.2 Data Layout Dialog
  A.3 Stack Layout Dialog
  A.4 Prologue and Epilogue Mapping Dialog
  A.5 Nonterminals Dialog
  A.6 Calling Conventions Dialog
  A.7 Scheduler Dataflow Dialog
  A.8 Scheduler Structure Dialog
  A.9 Mapping Dialog
B Scheduler Descriptions
  B.1 Handcrafted CoSy Scheduler Description for PP32
  B.2 Handcrafted CoSy Scheduler Description for ST200
  B.3 Generated Scheduler Data for PP32
    B.3.1 Reservation Tables
    B.3.2 Latency Tables
  B.4 Generated Scheduler Data for ST200
    B.4.1 ST200 Reservation Tables
    B.4.2 ST200 Latency Tables
Bibliography

Chapter 1
Introduction

1.1 Motivation

The growing complexity of mobile and automotive devices places high demands on the robustness, performance, power efficiency, flexibility, development time, and price of these systems. The only way to meet all of these opposing design criteria is to integrate the application functionality on fewer silicon chips. The result of this integration process is a system-on-a-chip (SoC) design. SoC technology is the packaging of all the necessary electronic circuits and parts for a “system” (such as a cell phone, a digital camera, or a PDA) on a single silicon chip. Besides application-specific integrated circuits (ASICs), SoCs typically contain one or more processor cores with dedicated RAM and ROM and other peripherals.

The growth of complexity was and is driven by Moore’s Law [113], which predicts a doubling of the processing power of silicon chips every 18 months.
The increasing performance of chips is based on decreasing structure sizes: on the one hand, smaller transistors have shorter switching times, allowing for higher clock frequencies. On the other hand, the number of transistors per area can be increased, which better supports parallel computations and also allows larger memories.

Unfortunately, the decreasing structure size has the drawback of exponentially increasing non-recurring engineering (NRE) costs. The major factors in the NRE cost breakdown are chip design, chip verification, and the development of the mask sets. So-called platform-based designs [97] are becoming an important instrument for cutting down the NRE costs. The advantage of design platforms is the opportunity to reuse IP blocks at all levels of abstraction, i.e., in all phases of the design process. The ultimate goal is to create a library of functions, along with associated hardware and software implementations, that can be used for all new designs.

The most important building blocks of any platform library are the programmable components. The advantage of programmable hardware is its flexibility: processors can be used to implement many different functions on a small piece of hardware. Furthermore, improvements to the functionality or error corrections can easily be added by software updates, even in very late phases of the development process.

Unfortunately, functionality that is implemented in an ASIC is typically faster and more power-efficient than a software implementation on a general-purpose processor core. Using the product of silicon area (A), time per sample (T), and energy per sample (E) as a cost function (C = A · T · E), [26] compares hardware implementations for a set of small kernel operations. It was demonstrated that compared to a physically optimized ASIC an
