An Improved Version of SPEAR
DIPLOMA THESIS

SPEAR2 - An Improved Version of SPEAR

carried out at the Institut für Technische Informatik, Embedded Computing Systems Group, Technische Universität Wien

under the supervision of Ao.Univ.Prof. Dipl.-Ing. Dr.techn. Andreas Steininger and Univ.Ass. Dipl.-Ing. Dr.techn. Martin Delvai

by Martin Fletzer, Kreuzgassee 6A, 2130 Mistelbach

Mistelbach, 16 November 2007

For my parents

Acknowledgements

Abstract

A soft core processor is a configurable microcontroller defined in software. Such a processor may suit a simple system whose only function is the manipulation of general-purpose I/O. It may equally suit a complex system that requires an operating system and interfaces such as Ethernet or DDR SDRAM.

In the course of this master's thesis, the soft core processor SPEAR2 was developed. SPEAR2 is a 16/32-bit architecture based on SPEAR (Scalable Processor for Embedded Applications in Real-time Environments). The motives for developing an improved version are manifold: fitting the code to new target technologies, eliminating some disadvantages of SPEAR, enabling configurability, and adding useful features such as byte-addressed memory.

To meet these goals, SPEAR2 was written from scratch. A configuration framework was created to provide adjustable memory sizes and the option to change the width of the data path. SPEAR2 is basically a 16-bit architecture, but the data path can be extended to 32 bits. Considerable effort was required to make the two data path widths interact correctly with the other components of the processor. The chief difficulties were memory access and the development of a consistent bus interface. Both configurations use the same instruction set, so the same toolchain can be used for both.

SPEAR2 was developed with the aim of being an efficient 16-bit processor. If required, more computational power can be provided by extending the data path.
Experience showed that a 16-bit processor with an extended data path cannot match the performance of 32-bit processors because of its limited instruction set. Without the extended data path, SPEAR2 is a small and efficient processor that is already used in some projects. Experience also shows that the 16-bit configuration can compete with other soft core processors.

Contents

1 Introduction
  1.1 Motivation
  1.2 Outline
2 State of the Art
  2.1 MicroBlaze
    2.1.1 Overview
    2.1.2 Instruction Set Architecture
    2.1.3 Registers
    2.1.4 Pipeline Architecture
    2.1.5 Memory Architecture
    2.1.6 Exceptions
  2.2 Nios II
    2.2.1 Overview
    2.2.2 Instruction Set Architecture
    2.2.3 Registers
    2.2.4 Pipeline Architecture
    2.2.5 Memory Architecture
    2.2.6 Exceptions
  2.3 LatticeMico32
    2.3.1 Overview
    2.3.2 Instruction Set Architecture
    2.3.3 Registers
    2.3.4 Pipeline Architecture
    2.3.5 Memory Architecture
    2.3.6 Exceptions
3 SPEAR - Basis for a new Architecture
  3.1 Overview
    3.1.1 Pipeline
    3.1.2 Memory Architecture
  3.2 Exceptions
  3.3 Register File
    3.3.1 Frame Pointer Registers
    3.3.2 RTSX- and RTSY-Register
    3.3.3 RTE-Register
  3.4 Instruction Set Architecture
    3.4.1 Structure of Instructions
    3.4.2 Conditional Instructions
  3.5 Extension Modules
    3.5.1 Processor Control Module
    3.5.2 Programmer Module
4 Analysing the Old Architecture
  4.1 Three Processor Cores
  4.2 Analysing SPEAR
5 SPEAR2
  5.1 Overview
  5.2 Customizable Data Path
    5.2.1 Implementation Overview
    5.2.2 Performance Improvement
    5.2.3 Addressable Memory
  5.3 Processor Architecture
    5.3.1 First Stage
    5.3.2 Second Stage
    5.3.3 Third Stage
    5.3.4 Fourth Stage
  5.4 Instruction Set Architecture
    5.4.1 Instruction Format
    5.4.2 Conditional Instructions
  5.5 Implementation
    5.5.1 Program Counter
    5.5.2 Instruction Memories
    5.5.3 Decoder
    5.5.4 Register File
    5.5.5 Forwarding Unit
    5.5.6 ALU
    5.5.7 Frame Pointer
    5.5.8 Data Memory
    5.5.9 Memory Access
    5.5.10 Exceptions
    5.5.11 Sleep Mode
    5.5.12 Implementation Details
  5.6 Extension Modules
    5.6.1 System Control Module
    5.6.2 Programmer Module
  5.7 Differences: 16 vs. 32 bit Version
    5.7.1 Interface
    5.7.2 Instruction Set Architecture
    5.7.3 Addressable Memory
6 Specification
  6.1 Configuration
  6.2 Interface
7 Results
8 Conclusion
A Appendix - Instruction Set Reference
  A.1 Overview
  A.2 Description

List of Figures

1 Block Diagram of SPEAR
2 Exception Vector Table of SPEAR
3 Organization of a Frame
4 Interface for Extension Modules
5 Block Diagram of SPEAR2
6 Parts Affected by Configuration
7 Performance of SPEAR2
8 The Fetch Stage in More Detail
9 The Decode Stage in More Detail
10 The Execute Stage in More Detail
11 The Write Back Stage in More Detail
12 The old Implementation of the Program Counter
13 8 bit Barrel Shifter
14 Organisation of Data Memory
15 Architecture of Data Memory
16 Generic Status Byte
17 Generic Config Byte
18 Interface of the System Control Module
19 Customized Status Byte of the System Control Module
20 Interface of the Programmer Module
21 Customized Config Byte of the System Control Module

List of Tables

1 Instruction Formats used by SPEAR
2 Configuration Options of SPEAR2

1 Introduction

Embedded systems play an important role in our lives. They are used inside products even where we do not expect them. Often a combination of several embedded systems is needed to provide the features of a product. A car, for example, already contains many embedded systems, and their use is growing rapidly.
An embedded system comprises hardware (e.g. processing unit, sensors, communication interfaces, etc.) and software. The application areas of embedded systems vary widely, so there is a great diversity of embedded systems and their components. Many different products try to satisfy the market.

1.1 Motivation

The requirements for embedded systems differ widely, and so do the solutions. When starting a new embedded systems project, an appropriate processor has to be chosen. For small applications a cheap 8-bit microcontroller can be sufficient. If much computational power is needed and no specific features are required (e.g. an unusually high number of I/O pins or more UARTs than usual), then a dedicated processing unit (e.g. a 32-bit embedded processor or a digital signal processor (DSP)) may satisfy the needs. Between these two extremes, many different requirement specifications are possible. Dedicated hard cores, however, have limited flexibility.

The alternative to a dedicated hard core is a soft core processor. A soft core processor is a microprocessor defined in software using a Hardware Description Language (HDL). It can be synthesized and run in a Field Programmable Gate Array (FPGA) or an Application Specific Integrated Circuit (ASIC). Soft core processors are very flexible and can be configured with exactly what is needed - no more, no less. This makes it possible to tune the processor for less area or more performance. For example, the same processor can be used with or without caches and with a different number of pipeline stages. A dedicated hard core processor cannot offer this flexibility. In addition, the hardware used to implement the soft core processor can also implement any part of the intended task, which allows an optimal design implementation.

In general, implementing an algorithm only in software is a flexible solution and saves development time.
On the other hand, implementing an algorithm in hardware can enable great performance improvements and at the same time require less energy to fulfil the task. Some algorithms can be accelerated up to 100 times if implemented in hardware. Implementing some parts of a problem in software and the other parts in hardware is a powerful solution and sometimes the only choice. Real-time video compression, for example, requires a lot of computational power. If low power consumption is also required, the compression algorithm has to be implemented partially in hardware.

Another big advantage of FPGAs and ASICs is their flexible interface. Nearly all known interfaces can be realized, ranging from a simple UART to a PCI Express interface. Packages are available with up to several hundred pins and in a variety of sizes. The advantage of FPGAs over ASICs is their flexibility, since an FPGA can be updated like software. Developing a design for an ASIC, on the other hand, takes more time, but ASICs provide much more performance than FPGAs.

The concept of using a soft core processor has several advantages. Only one chip is necessary, which simplifies the board layout and results in a cheaper design. Soft core processors can also be customized to provide the required performance without wasting resources.