Page 1 of 17 Computers & Digital Techniques

Reconfigurable Computing: Architectures and Design Methods

Journal: IEE Proc. Computers & Digital Techniques

Manuscript ID: CDT-2004-5086.R1

Manuscript Type: Research Paper

Date Submitted by the Author: n/a

Keyword:


Reconfigurable Computing: Architectures and Design Methods

T.J. Todman∗, G.A. Constantinides∗∗, S.J.E. Wilton∗∗∗, O. Mencer∗, W. Luk∗ and P.Y.K. Cheung∗∗
∗Department of Computing, Imperial College London, UK
∗∗Department of Electrical and Electronic Engineering, Imperial College London, UK
∗∗∗Department of Electrical and Computer Engineering, University of British Columbia, Canada

Abstract— Reconfigurable computing is becoming increasingly attractive for many applications. This survey covers two aspects of reconfigurable computing: architectures and design methods. Our paper includes recent advances in reconfigurable architectures, such as the Stratix II and Virtex 4 FPGA devices. We identify major trends in general-purpose and special-purpose design methods. It is shown that reconfigurable computing designs are capable of achieving up to 500 times speedup and 70% energy savings over microprocessor implementations for specific applications.

I. INTRODUCTION

Reconfigurable computing is rapidly establishing itself as a major discipline that covers various subjects of learning, including both computing science and electronic engineering. Reconfigurable computing involves the use of reconfigurable devices, such as Field Programmable Gate Arrays (FPGAs), for computing purposes. Reconfigurable computing is also known as configurable computing or custom computing, since many of the design techniques can be seen as customising a computational fabric for specific applications [76].

Reconfigurable computing systems often have impressive performance. Consider, as an example, the point multiplication operation in Elliptic Curve cryptography. For a key size of 270 bits, it has been reported [120] that a point multiplication can be computed in 0.36 ms with a reconfigurable computing design implemented in an XC2V6000 FPGA at 66 MHz. In contrast, an optimised software implementation requires 196.71 ms on a dual-Xeon computer at 2.6 GHz; so the reconfigurable computing design is more than 540 times faster, while its clock speed is almost 40 times slower than the Xeon processors. This example illustrates a hardware design implemented on a reconfigurable computing platform. We regard such implementations as a subset of reconfigurable computing, which in general can involve the use of runtime reconfiguration and soft processors.

Is this speed advantage of reconfigurable computing over traditional microprocessors a one-off or a sustainable trend? Recent research suggests that it is a trend rather than a one-off for a wide variety of applications: from image processing [51] to floating-point operations [124].

Sheer speed, while important, is not the only strength of reconfigurable computing. Another compelling advantage is reduced energy and power consumption. In a reconfigurable system, the circuitry is optimized for the application, such that the power consumption will tend to be much lower than that for a general-purpose processor. A recent study [113] reports that moving critical software loops to reconfigurable hardware results in average energy savings of 35% to 70% with an average speedup of 3 to 7 times, depending on the particular device used.

Other advantages of reconfigurable computing include a reduction in size and component count (and hence cost), improved time-to-market, and improved flexibility and upgradability. These advantages are especially important for embedded applications. Indeed, there is evidence [125] that embedded systems developers show a growing interest in reconfigurable computing systems, especially with the introduction of soft cores which can contain one or more instruction processors [7], [43], [74], [103], [104], [136].

In this paper, we present a survey of modern reconfigurable system architectures and design methods. Although we also provide background information on notable aspects of older technologies, our focus is on the most recent architectures and design methods, as well as the trends that will drive each of these areas in the near future. In other words, we intend to complement other survey papers [16], [27], [77], [101], [121] by:

1) providing an up-to-date survey of material that appears after the publication of the papers mentioned above;
2) identifying explicitly the main trends in architectures and design methods for reconfigurable computing;
3) examining reconfigurable computing from a perspective different from existing surveys, for instance classifying design methods as special-purpose and general-purpose;
4) offering various direct comparisons of technology options according to a selected set of metrics from different perspectives.

The rest of the paper is organised as follows. Section 2 contains background material that motivates the reconfigurable computing approach. Section 3 describes the structure of reconfigurable fabrics, showing how various researchers and vendors have developed fabrics that can efficiently accelerate time-critical portions of applications. Section 4 then covers recent advances in the development of design methods that map applications to these fabrics, and distinguishes between those which employ special-purpose and general-purpose optimization methods. Finally, Section 5 concludes the paper and


summarises the main trends in architectures, design methods and applications of reconfigurable computing.

II. BACKGROUND

Many of today's compute-intensive applications require more processing power than ever before. Applications such as streaming video, image recognition and processing, and highly interactive services are placing new demands on the computation units that implement these applications. At the same time, the power consumption targets, the acceptable packaging and manufacturing costs, and the time-to-market requirements of these computation units are all decreasing rapidly, especially in the embedded hand-held devices market. Meeting these performance requirements under the power, cost, and time-to-market constraints is becoming increasingly challenging.

In the following, we describe three ways of supporting such processing requirements: high-performance microprocessors, application-specific integrated circuits, and reconfigurable computing systems.

High-performance microprocessors provide an off-the-shelf means of addressing the processing requirements described earlier. Unfortunately, for many applications a single processor, even an expensive state-of-the-art processor, is not fast enough. In addition, the power consumption (100 watts or more) and cost (possibly thousands of dollars) of state-of-the-art processors place them out of reach for many embedded applications. Even if microprocessors continue to follow Moore's Law, so that their density doubles every 18 months, they may still be unable to keep up with the requirements of some of the most aggressive embedded applications.

Application-specific integrated circuits (ASICs) provide another means of addressing these processing requirements. Unlike a software implementation, an ASIC implementation provides a natural mechanism for implementing the large amount of parallelism found in many of these applications. In addition, an ASIC circuit does not need to suffer from the serial (and often slow and power-hungry) instruction fetch, decode, and execute cycle that is at the heart of all microprocessors. Furthermore, ASICs consume less power than reconfigurable devices. Finally, an ASIC can contain just the right mix of functional units for a particular application; in contrast, an off-the-shelf microprocessor contains a fixed set of functional units which must be selected to satisfy a wide variety of applications.

Despite the advantages of ASICs, they are often infeasible or uneconomical for many embedded systems. This is primarily due to two factors: the cost of producing an ASIC, often dominated by the mask cost (up to $1 million [100]), and the time needed to develop a custom integrated circuit can both be unacceptable. Only for the very highest-volume applications would the improved performance and lower per-unit price warrant the high non-recurring engineering (NRE) cost of designing an ASIC.

A third means of providing this processing power is a reconfigurable computing system. A reconfigurable computing system typically contains one or more processors and a reconfigurable fabric upon which custom functional units can be built. The processor(s) executes sequential and non-critical code, while code that can be efficiently mapped to hardware can be "executed" by processing units that have been mapped to the reconfigurable fabric. Like a custom integrated circuit, the functions that have been mapped to the reconfigurable fabric can take advantage of the parallelism achievable in a hardware implementation. Also like an ASIC, the embedded system designer can produce the right mix of functional and storage units in the reconfigurable fabric, providing a computing structure that matches the application. Unlike an ASIC, however, a new fabric need not be designed for each application. A given fabric can implement a wide variety of functional units. This means that a reconfigurable computing system can be built out of off-the-shelf components, significantly reducing the long design time inherent in an ASIC implementation. Also unlike an ASIC, the functional units implemented in the reconfigurable fabric can change over time. This means that as the environment or usage of the embedded system changes, the mix of functional units can adapt to better match the new environment. The reconfigurable fabric in a handheld device, for instance, might implement large matrix multiply operations when the device is used in one mode, and large signal processing functions when the device is used in another mode.

Typically, not all of the embedded system functionality needs to be implemented by the reconfigurable fabric. Only those parts of the computation that are time-critical and contain a high degree of parallelism need to be mapped to the reconfigurable fabric, while the remainder of the computation can be implemented by a standard instruction processor. The interface between the processor and the fabric, as well as the interface between the memory and the fabric, are therefore of the utmost importance. Modern reconfigurable devices are large enough to implement instruction processors within the programmable fabric itself: soft processors. These can be general purpose, or customised to a particular application; Application Specific Instruction Processors and Flexible Instruction Processors are two such approaches. Section IV-C part 2 deals with soft processors in more detail.

Other devices show some of the flexibility of reconfigurable computers. Examples include Graphics Processor Units and Application Specific Array Processors. These devices perform well on their intended application, but cannot run more general computations, unlike reconfigurable computers and microprocessors.

Despite the compelling promise of reconfigurable computing, it has limitations of which designers should be aware. For instance, the flexible routing at the bit level tends to produce large silicon area and performance overheads when compared with ASIC technology. Hence, for large-volume production of designs in applications without the need for field upgrade, ASIC technology or gate array technology can still deliver higher-performance designs at lower unit cost than reconfigurable computing technology. However, since FPGA technology tracks advances in memory technology and


has demonstrated impressive advances in the last few years, many are confident that the current rapid progress in FPGA speed, capacity and capability will continue, together with the reduction in price.

It should be noted that the development of reconfigurable systems is still a maturing field. There are a number of challenges in developing a reconfigurable system. We describe three such challenges below.

First, the structure of the reconfigurable fabric and the interfaces between the fabric, processor(s), and memory must be very efficient. Some reconfigurable computing systems use a standard Field-Programmable Gate Array [3], [6], [67], [87], [94], [135] as a reconfigurable fabric, while others adopt custom-designed fabrics [26], [41], [42], [50], [54], [79], [82], [86], [99], [106], [108], [118].

Another challenge is the development of computer-aided design and compilation tools that map an application to a reconfigurable computing system. This involves determining which parts of the application should be mapped to the fabric and which should be mapped to the processor; determining when and how often the reconfigurable fabric should be reconfigured, which changes the functional units implemented in the fabric; as well as the specification of algorithms for efficient mappings to the reconfigurable system.

In this paper, we provide a survey of reconfigurable computing, focusing our discussion on both the issues described above. In the next section, we provide a survey of various architectures that are found useful for reconfigurable computing; material on design methods will follow.

III. ARCHITECTURES

We shall first describe system-level architectures for reconfigurable computing. We then present various flavours of reconfigurable fabric. Finally, we identify and summarise the main trends.

A. System-level architectures

A reconfigurable system typically consists of one or more processors, one or more reconfigurable fabrics, and one or more memories. Reconfigurable systems are often classified according to the degree of coupling between the reconfigurable fabric and the CPU. Compton and Hauck [27] present the four classifications shown in Figure 1(a-d).

[Fig. 1. Five classes of reconfigurable systems (the first four are adapted from [27]): (a) external stand-alone processing unit; (b) attached processing unit; (c) co-processor; (d) reconfigurable functional unit; (e) processor embedded in a reconfigurable fabric.]

In Figure 1(a), the reconfigurable fabric is in the form of one or more stand-alone devices. The existing input and output mechanisms of the processor are used to communicate with the reconfigurable fabric. In this configuration, the data transfer between the fabric and the processor is relatively slow, so this architecture only makes sense for applications in which a significant amount of processing can be done by the fabric without processor intervention. Emulation systems often take on this sort of architecture [18], [85].

Figure 1(b) and Figure 1(c) show two intermediate structures. In both cases, the cost of communication is lower than that of the architecture in Figure 1(a). Architectures of these types are described in [8], [50], [54], [68], [99], [108], [126], [132].


TABLE I
SUMMARY OF SYSTEM ARCHITECTURES.

Class                                             | Example                    | CPU to memory bandwidth | Shared memory size | Fine or Coarse Grained | Example application
(a) External stand-alone processing unit          | RC2000 [23]                | 528 MB/s                | 152 MB             | Fine                   | Video processing
(b)/(c) Attached processing unit / co-processor   | Pilchard [73]              | 1064 MB/s               | 20 Kbytes          | Fine                   | DES encryption
(b)/(c) Attached processing unit / co-processor   | MorphoSys [108]            | 800 MB/s                | 2048 bytes         | Coarse                 | Video compression
(d) Reconfigurable functional unit                | Chess [79]                 | 6400 MB/s               | 12288 bytes        | Coarse                 | Video processing
(e) Processor embedded in a reconfigurable fabric | Xilinx Virtex II Pro [135] | 1600 MB/s               | 1172 KB            | Fine                   | Video compression

Next, Figure 1(d) shows an architecture in which the processor and the fabric are very tightly coupled; in this case, the reconfigurable fabric is part of the processor itself, perhaps forming a reconfigurable sub-unit that allows for the creation of custom instructions. Examples of this sort of architecture have been described in [79], [86], [96], [118].

Figure 1(e) shows a fifth organization. In this case, the processor is embedded in the programmable fabric. The processor can either be a "hard" core [5], [134], or can be a "soft" core which is implemented using the resources of the programmable fabric itself [7], [43], [74], [103], [104], [136].

A summary of the above organizations can be found in Table I. Note that the bandwidth is the theoretical maximum available to the CPU: for example, in Chess [79], we assume that each block RAM is being accessed at its maximum rate. Organization (a) is by far the most common, and accounts for all commercial reconfigurable platforms.

B. Reconfigurable fabric

The heart of any reconfigurable system is the reconfigurable fabric. The reconfigurable fabric consists of a set of reconfigurable functional units, a reconfigurable interconnect, and a flexible interface to connect the fabric to the rest of the system. In this section, we review each of these components, and show how they have been used in both commercial and academic reconfigurable systems.

A common theme runs through this entire section: in each component of the fabric, there is a tradeoff between flexibility and efficiency. A highly flexible fabric is typically much larger and much slower than a less flexible fabric. On the other hand, a more flexible fabric is better able to adapt to the application requirements. In the following discussions, we will see how this tradeoff has influenced the design of every part of every reconfigurable system. A summary of the main features of various architectures can be found in Table II.

[Fig. 2. Fine-grained reconfigurable functional units: (a) three-input lookup table; (b) cluster of lookup tables.]

1) Reconfigurable functional units: Reconfigurable functional units can be classified as either coarse-grained or fine-grained. A fine-grained functional unit can typically implement a single function on a single bit, or a small number of bits. The most common kind of fine-grained functional units are the small lookup tables that are used to implement the bulk of the logic in a commercial field-programmable gate array. A coarse-grained functional unit, on the other hand, is typically much larger, and may consist of arithmetic and logic units (ALUs) and possibly even a significant amount of storage. In this section, we describe the two types of functional units in more detail.


TABLE II
COMPARISON OF RECONFIGURABLE FABRICS AND DEVICES.

Fabric or Device           | Fine or Coarse Grained | Base Logic Component                                    | Routing Architecture                           | Embedded Memory                              | Special Features
ProASIC+ [3]               | Fine                   | 3-input block                                           | Horizontal and vertical tracks                 | 256x9 bit blocks                             | Flash-based
Altera Excalibur [5]       | Fine                   | 4-input lookup tables                                   | Horizontal and vertical tracks                 | 2 Kbit memory blocks                         | ARMv4T embedded processor
Altera Stratix II [6]      | Fine/Coarse            | 8-input Adaptive Logic Module                           | Horizontal and vertical tracks                 | 512 bit, 4 Kbit and 512 Kbit blocks          | DSP blocks
Garp [54]                  | Fine                   | Logic or arithmetic functions on four 2-bit input words | 2-bit buses in horizontal and vertical columns | External to fabric                           | -
Xilinx Virtex II Pro [134] | Fine                   | 4-input lookup tables                                   | Horizontal and vertical tracks                 | 18 Kbit blocks                               | Embedded multipliers, PowerPC 405 processor
Xilinx Virtex II [135]     | Fine                   | 4-input lookup tables                                   | Horizontal and vertical tracks                 | 18 Kbit blocks                               | Embedded multipliers
DReAM [11]                 | Coarse                 | 8-bit ALUs                                              | 16-bit local and global buses                  | Two 16x8 dual-port memories                  | Targets mobile applications
Elixent D-Fabrix [42]      | Coarse                 | 4-bit ALUs                                              | 4-bit buses                                    | 256x8 memory blocks                          | -
HP Chess [79]              | Coarse                 | 4-bit ALUs                                              | 4-bit buses                                    | 256x8 bit memories                           | -
IMEC ADRES [82]            | Coarse                 | 32-bit ALUs                                             | 32-bit buses                                   | Small register files in each logic component | -
Matrix [86]                | Coarse                 | 8-bit ALUs                                              | Hierarchical 8-bit buses                       | 256x8 bit memories                           | -
MorphoSys [108]            | Coarse                 | ALU, multiplier and shift units                         | Buses                                          | External to fabric                           | -
Piperench [50]             | Coarse                 | 8-bit ALUs                                              | 8-bit buses                                    | External to fabric                           | Functional units arranged in 'stripes'
RaPiD [41]                 | Coarse                 | ALUs                                                    | Buses                                          | Embedded memory blocks                       | -
Silicon Hive Avispa [106]  | Coarse                 | ALUs, shifters, accumulators and multipliers            | Buses                                          | Five embedded memories                       | -

Many reconfigurable systems use commercial FPGAs as a reconfigurable fabric. These commercial FPGAs contain many three to six input lookup tables, each of which can be thought of as a very fine-grained functional unit. Figure 2(a) illustrates a lookup table; by shifting in the correct pattern of bits, this functional unit can implement any single function of up to three inputs – the extension to lookup tables with larger numbers of inputs is clear. Typically, lookup tables are combined into clusters, as shown in Figure 2(b). Figure 3 shows clusters in two popular FPGA families. Figure 3(a) shows a cluster in the Altera Stratix device; Altera calls these clusters "Logic Array Blocks" [6]. Figure 3(b) shows a cluster in the Xilinx architecture [135]; Xilinx calls these clusters "Configurable Logic Blocks" (CLBs). In the Altera diagram, each block labeled "LE" is a lookup table, while in the Xilinx diagram, each "slice" contains two lookup tables. Other commercial FPGAs are described in [3], [67], [87], [94].

Reconfigurable fabrics containing lookup tables are very flexible, and can be used to implement any digital circuit. However, compared to the coarse-grained structures described in the next subsection, these fine-grained structures incur significantly more area, delay, and power overhead. Recognizing that these fabrics are often used for arithmetic purposes, FPGA companies have added features such as carry-chains and cascade-chains to reduce the overhead when implementing common arithmetic and logic functions. Figure 4 shows how the carry and cascade chains, as well as the ability to break a 4-input lookup table into four two-input lookup tables, can be exploited to efficiently implement carry-select adders [6]. The multiplexers and the exclusive-or gate in Figure 4 are included as part of each logic array block, and need not be implemented using other lookup tables.

The example in Figure 4 shows how the efficiency of commercial FPGAs can be improved by adding architectural support for common functions. We can go much further than this, though, and embed significantly larger, but far less flexible, reconfigurable functional units. There are two kinds of devices that contain coarse-grained functional units. First, modern FPGAs, which are primarily composed of fine-grained functional units, are increasingly being enhanced by the inclusion of larger blocks. As an example, the Xilinx Virtex device contains embedded 18-bit by 18-bit multiplier units [135]. When implementing algorithms requiring a large amount of multiplication, these embedded blocks can significantly improve the density, speed and power of the device. On the other hand, for algorithms which do not perform multiplication, these blocks are rarely useful. The Altera Stratix devices contain a larger, but more flexible, embedded block called a DSP block, shown in Figure 5 [6]. Each of these blocks can perform accumulate functions as well as multiply operations. The comparison between the two devices clearly illustrates the flexibility and overhead tradeoff: the Altera DSP block may be more flexible than the Xilinx multiplier; however, it consumes more chip area and runs somewhat slower.


[Fig. 3. Commercial logic block architectures: (a) Altera Logic Array Block [6]; (b) Xilinx Configurable Logic Block [135].]

[Fig. 4. Implementing a carry-select adder in an Altera Stratix device [6]. 'LUT' denotes 'Lookup Table'.]

[Fig. 5. Altera DSP Block [6].]

The commercial FPGAs described above contain both fine-grained and coarse-grained blocks. There are also devices which contain only coarse-grained blocks [26], [41], [50], [79], [82], [108]. An example of a coarse-grained architecture is the ADRES architecture, which is shown in Figure 6 [82]. Each reconfigurable functional unit in this device contains a 32-bit ALU which can be configured to implement one of several functions including addition, multiplication, and logic functions, with two small register files. Clearly, such a functional unit is far less flexible than the fine-grained functional units described earlier; however, if the application requires functions which match the capabilities of the ALU, these functions can be very efficiently implemented in this architecture.

2) Reconfigurable interconnects: Regardless of whether a device contains fine-grained functional units, coarse-grained functional units, or a mixture of the two, the functional units need to be connected in a flexible way. Again, there is a tradeoff between the flexibility of the interconnect (and hence the reconfigurable fabric) and the speed, area, and power-efficiency of the architecture.

[Fig. 6. ADRES reconfigurable functional unit [82].]

[Fig. 7. Routing architectures: (a) fine-grained; (b) coarse-grained.]

As before, reconfigurable interconnect architectures can be classified as fine-grained or coarse-grained. The distinction is based on the granularity with which wires are switched. This is illustrated in Figure 7, which shows a flexible interconnect between two buses. In the fine-grained architecture in Figure 7(a), each wire can be switched independently, while in Figure 7(b), the entire bus is switched as a unit. The fine-grained routing architecture in Figure 7(a) is more flexible, since not every bit needs to be routed in the same way; however, the coarse-grained architecture in Figure 7(b) contains far fewer programming bits, and hence suffers much less overhead.

Fine-grained routing architectures are usually found in commercial FPGAs. In these devices, the functional units are typically arranged in a grid pattern, and they are connected using horizontal and vertical channels. Significant research has been performed on the optimization of the topology of this interconnect [13], [72]. Coarse-grained routing architectures are commonly used in devices containing coarse-grained functional units. Figure 8 shows two examples of coarse-grained routing architectures: (a) the Totem reconfigurable system [26]; and (b) the Silicon Hive reconfigurable system [106], which is less flexible but faster and smaller.

[Fig. 8. Example coarse-grained routing architectures: (a) Totem coarse-grained routing architecture [26]; (b) Silicon Hive coarse-grained routing architecture [106].]

3) Emerging directions: Several emerging directions are covered in the following. These directions include low-power techniques, asynchronous architectures, and molecular microelectronics.

• Low-power techniques. Early work explores the use of low-swing circuit techniques to reduce the power consumption in a hierarchical interconnect for a low-energy FPGA [47]. Recent work involves: (a) activity reduction in power-aware design tools, with energy savings of 23% [66]; (b) leakage current reduction methods such as gate biasing and multiple supply-voltage integration, with up to two times leakage power reduction [95]; and (c) dual supply-voltage methods with the lower voltage assigned to non-critical paths, resulting in an average power reduction of 60% [46].

• Asynchronous architectures. There is an emerging interest in asynchronous FPGA architectures. An asynchronous version of Piperench [50] is estimated to improve performance by 80%, at the expense of a significant increase in configurable storage and wire count [60]. Other efforts in this direction include fine-grained asynchronous pipelines [119], quasi delay-insensitive architectures [133], and globally-asynchronous locally-synchronous techniques [98].

• Molecular microelectronics. In the long term, molecular techniques offer a promising opportunity for increasing the capacity and performance of reconfigurable computing architectures [17]. Current work is focused on developing Programmable Logic Arrays based on molecular-scale nano-wires [38], [130].

C. Architectures: main trends

The following summarises the main trends in architectures for reconfigurable computing.

1) Coarse-grained fabrics: As reconfigurable fabrics are migrated to more advanced technologies, the cost (in terms of both speed and power) of the interconnect part of a reconfigurable fabric is growing. Designers are responding to this by increasing the granularity of their logic units, thereby reducing the amount of interconnect needed. In the Stratix II device, Altera moved away from simple 4-input lookup tables, and used a more complex logic block which can implement functions of up to 7 inputs. We should expect to see a slow migration to more complex logic blocks, even in stand-alone FPGAs.

2) Heterogeneous functions: As devices are migrated to more advanced technologies, the number of transistors that can be devoted to the reconfigurable logic fabric increases. This provides new opportunities to embed complex non-programmable (or semi-programmable) functions, creating heterogeneous architectures with both general-purpose logic resources and fixed-function embedded blocks. Modern Xilinx parts have embedded 18-bit by 18-bit multipliers, while modern Altera parts have embedded DSP units which can perform a variety of multiply/accumulate functions. Again, we should expect to see a migration to more heterogeneous architectures in the near future.

3) Soft cores: The use of "soft" cores, particularly for instruction processors, is increasing. A "soft" core is one in which the vendor provides a synthesisable version of the function, and the user implements the function using the reconfigurable fabric. Although this is less area- and speed-efficient than a hard embedded core, the flexibility and the ease of integrating these soft cores make them attractive. The extra overhead becomes less of a hindrance as the number of transistors devoted to the reconfigurable fabric increases. Altera and Xilinx both provide numerous soft cores, including soft instruction processors such as NIOS [7] and MicroBlaze [136]. Soft instruction processors have also been developed by a number of researchers, ranging from customisable JVM and MIPS processors [103] to ones specialised for machine learning [43] and data encryption [74].

IV. DESIGN METHODS

Hardware compilers for high-level descriptions are increasingly recognised to be the key to reducing the productivity gap for advanced circuit development in general, and for reconfigurable designs in particular. This section looks at high-level design methods from two perspectives: special-purpose design and general-purpose design. Low-level design methods and tools, covering topics such as technology mapping, floorplanning, and place and route, are beyond the scope of this paper – interested readers are referred to [27].

A. General-purpose design

This section describes design methods and tools based on a general-purpose programming language such as C, possibly adapted to facilitate hardware development. Of course, traditional hardware description languages like VHDL and Verilog are widely available, especially on commercial reconfigurable platforms.

A number of compilers from C to hardware have been developed; some of the significant ones are reviewed here. These range from compilers which only target hardware to those which target complete hardware/software systems; some also partition designs into hardware and software components.

We can classify different design methods into two approaches: the annotation and constraint-driven approach, and the source-directed compilation approach. The first approach preserves the source programs in C or C++ as much as possible, and makes use of annotations and constraint files to drive the compilation process. The second approach modifies the source language to let the designer specify, for instance, the amount of parallelism or the size of variables.

1) Annotation and constraint-driven approach: The systems mentioned below employ annotations in the source code and constraint files to control the optimisation process. Their strength is that usually only minor changes are needed to produce a compilable program from a software description – no extensive re-structuring is required. Five representative methods are SPC [128], Streams-C [49], Sea Cucumber [59], SPARK [53] and Catapult-C [80].

SPC [128] combines vectorisation, loop transformations and retiming with automatic memory allocation to improve performance. SPC accelerates C loop nests with data dependency restrictions, compiling them into pipelines. Based on the SUIF framework [131], this approach uses loop transformations, and can take advantage of run-time reconfiguration and memory access optimisation. Similar methods have been advocated by other researchers [55], [102].

Streams-C [49] compiles a C program to synthesisable VHDL. Streams-C exploits coarse-grained parallelism in stream-based computations; low-level optimisations such as pipelining are performed automatically by the compiler.

Sea Cucumber [59] compiles Java programs to hardware using a similar scheme to Handel-C, which we detail in the next section. Unlike Handel-C, no language extensions are needed; like Streams-C, users must call a library, in this case based on Communicating Sequential Processes (CSP [56]). Multiple circuit implementations of the library primitives enable tradeoffs.

SPARK [53] is a high-level synthesis framework targeting multimedia and image processing. It compiles C code with the following steps: (a) list scheduling based on speculative code motions and loop transformations, (b) resource binding


pass with minimisation of interconnect, (c) finite state machine controller generation for the scheduled datapath, (d) code generation producing synthesisable register-transfer level VHDL. Logic synthesis tools then synthesise the output.

Catapult-C synthesises Register Transfer Level (RTL) descriptions from unannotated C++, using characterisations of the target technology from RTL synthesis tools [80]. Users can set constraints to explore the design space, controlling loop pipelining and resource sharing.

2) Source-directed compilation approach: A different approach adapts the source language to enable explicit description of parallelism, communication and other customisable hardware resources such as variable size. Examples of design methods following this approach include ASC [84], Handel-C [22], Haydn-C [37] and Bach-C [137].

ASC [84] adopts C++ custom types and operators to provide a C++ programming interface on the algorithm, architecture, arithmetic and gate levels. This enables the user to program on the desired level for each part of the application. Semi-automated design space exploration further increases design productivity, while supporting the optimisation process on all available levels of abstraction. The object-oriented design enables efficient code-reuse, and includes an integrated arithmetic unit generation library [83]. A floating-point library [75] provides over 200 different floating-point units, each with custom bitwidths for mantissa and exponent.

Handel-C [22] extends a subset of C to support flexible-width variables, signals, parallel blocks, bit-manipulation operations, and channel communication. A distinctive feature is that the timing of the compiled circuit is fixed at one cycle per C statement. This allows Handel-C programmers to schedule hardware resources manually. Handel-C compiles to a one-hot state machine using a token-passing scheme developed by Page and Luk [92]; each assignment of the program maps to exactly one control flip-flop in the state machine. These control flip-flops capture the flow of control (represented by the token) in the program: if the control flip-flop corresponding to a particular statement is active, then control has passed to that statement, and the circuitry compiled for that statement is activated. When the statement has finished execution, it passes the token to the next statement in the program.

Haydn-C [37] extends Handel-C for component-based design. Like Handel-C, it supports description of parallel blocks, bit-manipulation operations, and channel communication. The principal innovation of Haydn-C is a framework of optional annotations to enable users to describe design constraints, and to direct source-level transformations such as scheduling and resource allocation. There are automated transformations, so that a single high-level design can be used to produce many implementations with different trade-offs. This approach has been evaluated using various case studies, including FIR filters, fractal generators, and morphological operators. The fastest morphological erosion design is 129 times faster and 3.4 times larger than the smallest design.

Bach-C [137] is similar to Handel-C but has an untimed semantics, only synchronising between parallel threads on synchronous communications between them, possibly giving greater scope for optimisation. It also allows asynchronous communications, but otherwise resembles Handel-C, using the same basic one-hot compilation scheme.

Table III summarises the various compilers discussed in this section, showing their approach, source and target languages, target architecture and some example applications. Note that the compilers discussed are not necessarily restricted to the architectures reported; some can usually be ported to a different architecture by using a different library of hardware primitives.

B. Special-purpose design

Within the wide variety of problems to which reconfigurable computing can be applied, there are many specific problem domains which deserve special consideration. The motivation is to exploit domain-specific properties: (a) to describe the computation, such as using MATLAB for digital signal processing, and (b) to optimise the implementation, such as using the word-length optimisation techniques described later.

We shall begin with an overview of digital signal processing and relevant tools which target reconfigurable implementations. We then describe the word-length optimisation problem, the solution to which promises rich rewards; an example of such a solution will be covered. Finally we summarise other domain-specific design methods which have been proposed for video and image processing and networking.

1) Digital signal processing: One of the most successful applications for reconfigurable computing is Real-time Digital Signal Processing (DSP). This is illustrated by the inclusion of hardware support for DSP in the latest FPGA devices, such as the embedded DSP blocks in Altera Stratix II chips [6]. DSP problems tend to share the following properties: design latency is usually less of an issue than design throughput, algorithms tend to be numerically intensive but have very simple control structures, controlled numerical error is acceptable, and standard metrics, such as signal-to-noise ratio, exist for measuring numerical precision quality.

DSP algorithm design is often initially performed directly in a graphical programming environment such as Mathworks' MATLAB Simulink [107]. Simulink is widely used within the DSP community, and has recently been incorporated into the Xilinx System Generator [58] and Altera DSP Builder [4] design flows. Design approaches such as this are based on the idea of data-flow graphs (DFGs) [69].

Tools working with this form of description vary in the level of user intervention required to specify the numerical properties of the implementation. For example, in the Xilinx System Generator flow [58], it is necessary to specify the number of bits used to represent each signal, the scaling of each signal (namely the binary point location), and whether to use saturating or wrap-around arithmetic [30].

Ideally, these implementation details could be automated. Beyond a standard DFG-based algorithm description, only one piece of information should be required: a lower bound on the output signal-to-quantization-noise ratio acceptable to the user. Such a design tool would thus represent a truly 'behavioural'


TABLE III
SUMMARY OF GENERAL-PURPOSE HARDWARE COMPILERS.

System | Approach | Source Language | Target Language | Target Architecture | Example Applications
Streams-C [49] | Annotation / constraint-driven | C + library | RTL VHDL | Xilinx FPGA | Image contrast enhancement, pulsar detection [45]
Sea Cucumber [59] | Annotation / constraint-driven | Java + library | EDIF | Xilinx FPGA | None given
SPARK [53] | Annotation / constraint-driven | C | RTL VHDL | LSI, Altera FPGAs | MPEG-1 predictor, image tiling
SPC [128] | Annotation / constraint-driven | C | EDIF | Xilinx FPGAs | String pattern matching, image skeletonisation
ASC [84] | Source-directed compilation | C++ using class library | EDIF | Xilinx FPGAs | Wavelet compression, encryption
Handel-C [22] | Source-directed compilation | Extended C | Structural VHDL, Verilog, EDIF | Actel, Altera, Xilinx FPGAs | Image processing, polygon rendering [114]
Haydn-C [37] | Source-directed compilation | Extended C (Handel-C) | Extended C | Xilinx FPGAs | FIR filter, image erosion
Bach-C [137] | Source-directed compilation | Extended C | Behavioural and RTL VHDL | LSI FPGAs | Viterbi decoders, image processing

synthesis route, exposing to the DSP engineer only those aspects of design naturally expressed in the DSP application domain.

2) The word-length optimization problem: Unlike microprocessor-based implementations, where the word-length is defined a-priori by the hard-wired architecture of the processor, reconfigurable computing based on FPGAs allows the size of each variable to be customised to produce the best trade-offs in numerical accuracy, design size, speed, and power consumption. The use of such custom data representation for optimising designs is one of the main strengths of reconfigurable computing.

Given this flexibility, it is desirable to automate the process of finding a good custom data representation. The most important implementation decision to automate is the selection of an appropriate word-length and scaling for each signal [32] in a DSP system.

It has been argued that, often, the most efficient hardware implementation of an algorithm is one in which a wide variety of finite-precision representations of different sizes are used for different internal variables [28]. The accuracy observable at the outputs of a DSP system is a function of the word-lengths used to represent all intermediate variables in the algorithm. However, accuracy is less sensitive to some variables than to others, as is implementation area. It is demonstrated in [32] that by considering error and area information in a structured way, using analytical and semi-analytical noise models, it is possible to achieve highly efficient DSP implementations.

In [35] it has been demonstrated that the problem of word-length optimization is NP-hard, even for systems with special mathematical properties that simplify the problem from a practical perspective [31]. There are, however, several published approaches to word-length optimization. These can be classified as heuristics offering an area / signal quality tradeoff [28], [65], [127], approaches that make some simplifying assumptions on error properties [20], [65], or optimal approaches that can be applied to algorithms with particular mathematical properties [29].

Some published approaches to the word-length optimization problem use an analytic approach to scaling and/or error estimation [89], [111], [127], some use simulation [20], [65], and some use a hybrid of the two [25]. The advantage of analytic techniques is that they do not require representative simulation stimulus, and can be faster; however, they tend to be more pessimistic. There is little analytical work on supporting data-flow graphs containing cycles, although in [111] finite loop bounds are supported, while [31] supports cyclic data-flow when the nodes are of a restricted set of types, extended to a semi-analytic technique with fewer restrictions in [34].

Some published approaches use worst-case instantaneous error as a measure of signal quality [20], [89], [127], whereas some use signal-to-noise ratio [28], [65].

The remainder of this section reviews in some detail particular research approaches in the field.

The Bitwise Project [111] proposes propagation of integer variable ranges backwards and forwards through data-flow graphs. The focus is on removing unwanted most-significant bits (MSBs). Results from integration in a synthesis flow indicate that area savings of between 15% and 86%, combined with speed increases of up to 65%, can be achieved compared to using 32-bit integers for all variables.

The MATCH Project [89] also uses range propagation through data-flow graphs, except variables with a fractional component are allowed. All signals in the model of [89] must have equal fractional precision; the authors propose an analytic worst-case error model in order to estimate the required number of fractional bits. Area reductions of 80% combined with speed increases of 20% are reported when


compared to a uniform 32-bit representation.

Wadekar and Parker [127] have also proposed a methodology for word-length optimization. Like [89], this technique also allows controlled worst-case error at system outputs; however, each intermediate variable is allowed to take a word-length appropriate to the sensitivity of the output errors to quantization errors on that particular variable. Results indicate area reductions of between 15% and 40% over the optimum uniform word-length implementation.

Kum and Sung [65] and Cantin et al. [20] have proposed several word-length optimization techniques to trade off system area against system error. These techniques are heuristics based on bit-true simulation of the design under various internal word-lengths.

In Bitsize [1], [2], Abdul Gaffar et al. propose a hybrid method based on the mathematical technique known as automatic differentiation to perform bitwidth optimisation. In this technique, the gradients of outputs with respect to the internal variables are calculated and then used to determine the sensitivities of the outputs to the precision of the internal variables. The results show that it is possible to achieve an area reduction of 20% for floating-point designs, and 30% for fixed-point designs, when given an output error specification of 0.75% against a reference design.

A useful survey of algorithmic procedures for word-length determination has been provided by Cantin et al. [21]. In this work, existing heuristics are classified under various categories. However, the 'exhaustive' and 'branch-and-bound' procedures described in [21] do not necessarily capture the optimum solution to the word-length determination problem, due to non-convexity in the constraint space: it is actually possible to have a lower error at a system output by reducing the word-length at an internal node [33]. Such an effect is modeled in the MILP approach proposed in [29].

A comparative summary of existing optimization systems is provided in Table IV. Each system is classified according to the several defining features described below.
• Is the word-length and scaling selection performed through analytic or simulation-based means?
• Can the system support algorithms exhibiting cyclic data flow (such as infinite impulse response filters)?
• What mechanisms are supported for Most Significant Bit (MSB) optimizations? (Such as ignoring MSBs that are known to contain no useful information, a technique determined by the scaling approach used.)
• What mechanisms are supported for Least Significant Bit (LSB) optimizations? These involve the monitoring of word-length growth. In addition, for those systems which support error trade-offs, further optimizations include the quantization (truncation or rounding) of unwanted LSBs.
• Does the system allow the user to trade off numerical accuracy for a more efficient implementation?

3) An example optimization flow: One possible design flow for word-length optimization, used in the Right-Size system [34], is illustrated in Figure 9 for Xilinx FPGAs. The inputs to this system are a specification of the system behaviour (e.g. using Simulink), a specification of the acceptable signal-to-noise ratio at each output, and a set of representative input signals. From these inputs, the tool automatically generates a synthesisable structural description of the architecture and a bit-true behavioural VHDL testbench, together with a set of expected outputs for the provided set of representative inputs. Also generated is a makefile which can be used to automate the post-Right-Size synthesis process.

[Fig. 9. Design flow for the Right-Size tool [34]. The shaded portions are FPGA vendor-specific.]

Application of Right-Size to various adaptive filters implemented in a Xilinx Virtex FPGA has resulted in area reduction of up to 80%, power reduction of up to 98%, and speed-up of up to 36% over common alternative design methods without word-length optimisation.

4) Other design methods: Besides signal processing, video and image processing is another area that can benefit from special-purpose design methods. Three examples will be given to provide a flavour of this approach. First, the CHAMPION system [90] maps designs captured in the Cantata graphical programming environment to multiple reconfigurable computing platforms. Second, the IGOL framework [122] provides a layered architecture for facilitating hardware plug-ins to be incorporated in various applications in the Microsoft Windows operating system, such as Premiere, Winamp, VirtualDub and DirectShow. Third, the SA-C compiler [15] maps a high-level single-assignment language specialised for image processing into hardware, using various optimisation methods including loop unrolling, array value propagation, loop-carried array elimination, and multi-dimensional stripmining.

Recent work indicates that another application area that can benefit from special-purpose techniques is networking. Two examples will be given. First, a framework has been developed to compile descriptions in the network policy language Ponder [36] into reconfigurable hardware implementations [70]. Second, it is shown [63] how descriptions in the Click networking language can produce efficient reconfigurable designs.
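The word-length trade-off underlying the DSP methods above can be illustrated in miniature. The following Python sketch (illustrative only: the function names are ours, and no surveyed tool works exactly this way) uniformly quantizes a reference signal at increasing fractional word-lengths until a user-specified signal-to-quantization-noise ratio is met, mimicking the simulation-based heuristics discussed earlier.

```python
import math

def quantize(x, frac_bits):
    """Round x onto a fixed-point grid with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return round(x * scale) / scale

def snr_db(reference, quantized):
    """Signal-to-quantization-noise ratio in dB."""
    sig = sum(r * r for r in reference)
    noise = sum((r - q) ** 2 for r, q in zip(reference, quantized))
    if noise == 0:
        return float("inf")
    return 10 * math.log10(sig / noise)

def min_frac_bits(signal, target_snr_db, max_bits=32):
    """Smallest fractional word-length whose bit-true simulation meets the SNR target."""
    for bits in range(1, max_bits + 1):
        q = [quantize(x, bits) for x in signal]
        if snr_db(signal, q) >= target_snr_db:
            return bits
    return max_bits

# Example: a sampled sine wave as the representative input, 60 dB quality target.
signal = [math.sin(2 * math.pi * k / 64) for k in range(64)]
bits = min_frac_bits(signal, 60.0)
```

Analytic approaches replace the inner simulation loop with a noise model, avoiding the need for representative input stimulus at the cost of some pessimism.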


TABLE IV
A COMPARISON OF WORDLENGTH AND SCALING OPTIMIZATION SYSTEMS AND METHODS.

System | Analytic / Simulation | Cyclic data flow? | MSB optimization | LSB optimization | Error trade-off | Comments
Benedetti [12] | analytic | none | through interval arithmetic | through 'multi-interval' approach | no | error can be pessimistic
Stephenson [111], [112] | analytic | for finite loop bounds | through forward and backward range propagation | none | no | error less pessimistic than [12] due to backwards propagation
Nayak [89] | analytic | not supported for error analysis | through forward and backward range propagation | through fixing number of fractional bits for all variables | user-specified or inferred absolute bounds on error | fractional parts have equal wordlength
Wadekar [127] | analytic | none | through forward range propagation | through genetic algorithm search for suitable wordlengths | user-specified absolute bounds | uses Taylor series at limiting values to determine error propagation
Keding [62], [129] | hybrid | with user intervention | through user-annotations and forward range propagation | through user-annotations and forward wordlength propagation | not automated | possible truncation error
Cmar [25] | hybrid for scaling; simulation for error only | with user intervention | through combined simulation and forward range propagation | wordlength bounded through hybrid fixed or floating simulation | not automated | less pessimistic than [12] due to input error propagation
Kum [64], [65], [116], [117] | simulation (hybrid for multiply-accumulate in [64], [65]) | yes | through measurement of signal mean and standard deviation | through heuristics based on simulation results | user-specified bounds and metric | long simulation time possible
Constantinides [32] | analytic | yes | through tight analytic bounds on signal range and automatic design of saturation arithmetic | through heuristics based on an analytic noise model | user-specified bounds on noise power and spectrum | only applicable to linear time-invariant systems
Constantinides [34] | hybrid | yes | through simulation | through heuristics based on a hybrid noise model | user-specified bounds on noise power and spectrum | only applicable to differentiable non-linear systems
Abdul Gaffar [1], [2] | hybrid | with user intervention | through simulation-based range propagation | through automatic differentiation based dynamic analysis | user-specified bounds and metric | covers both fixed-point and floating-point
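Several of the systems compared above infer MSB requirements by propagating value ranges forwards through the data-flow graph using interval arithmetic. The following Python sketch illustrates that idea in miniature, under the simplifying assumption of two's-complement signed values (the function names and the example graph are our own; real tools also propagate ranges backwards and handle many more operators):

```python
import math

def add_range(a, b):
    """Interval of x + y for x in interval a and y in interval b."""
    return (a[0] + b[0], a[1] + b[1])

def mul_range(a, b):
    """Interval of x * y for x in interval a and y in interval b."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))

def int_bits(interval):
    """Integer (MSB-side) bits needed for a signed value within the interval."""
    m = max(abs(interval[0]), abs(interval[1]))
    if m == 0:
        return 1
    return 1 + math.ceil(math.log2(m + 1))  # sign bit + magnitude bits

# Propagate ranges through y = (x1 + x2) * x3, given known input ranges.
x1, x2, x3 = (-1.0, 1.0), (0.0, 3.0), (-2.0, 2.0)
s = add_range(x1, x2)   # intermediate sum
y = mul_range(s, x3)    # output
```

Backward propagation, as in [111], can tighten these ranges further by also taking into account how each value is used downstream.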

C. Other design methods

In the following, we describe various other design methods in brief.

1) Run-time customisation: Many aspects of run-time reconfiguration have been explored [27], including the use of directives in high-level descriptions [71]. Effective run-time customisation hinges on appropriate design-time preparation for such customisation. To illustrate this point, consider a run-time customisable system that supports partial reconfiguration: one part of the system continues to be operational, while another part is being reconfigured. As FPGAs get larger, partial reconfiguration is becoming increasingly important as a means of reducing reconfiguration time. To support partial reconfiguration, appropriate circuits must be built at fabrication time as part of the FPGA fabric. Then at compile time, an initial configuration bitstream and incremental bitstreams have to be produced, together with run-time customisation facilities which can be executed, for instance, on a microprocessor serving as part of the run-time system [105]. Run-time customisation facilities can include support for condition monitoring, design optimisation and reconfiguration control.

Opportunities for run-time design optimisation include: (a) run-time constant propagation [39], which produces a smaller circuit with higher performance by treating run-time data as constant, and optimising the circuit principally by Boolean algebra; (b) library-based compilation – the DISC compiler [24] makes use of a library of precompiled logic modules which can be loaded into reconfigurable resources by the procedure call mechanism; (c) exploiting information about program branch probabilities [115] – the idea is to promote utilisation by dedicating more resources to branches which execute more frequently. A hardware compiler has been developed to produce a collection of designs, each optimised for a particular branch probability; the best can be selected at run time by incorporating observed branch probability information from a queueing network performance model.

2) Soft instruction processors: FPGA technology can now support one or more soft instruction processors implemented using reconfigurable resources on a single chip; proprietary instruction processors, like MicroBlaze and Nios, are now available from FPGA vendors. Often such processors support customisation of resources and custom instructions. Custom instructions have two main benefits. First, they reduce the time for instruction fetch and decode, provided that each custom instruction replaces several regular instructions. Second, additional resources can be assigned to a custom instruction


to improve performance. Bit-width optimisation, described in program to extract parallelism for an efficient result, which Section IV-B, can also be applied to customise instruction pro- is necessary if compilation from languages such as C is to cessors at compile time. A challenge of customising instruction compete with traditional methods for designing hardware. processors is that the tools for producing and analysing instruc- One example is the work of Babb et al. [10], targeting tions also need to be customised. For instance, the flexible custom, fixed-logic implementation while also applicable to instruction processor framework [103] has been developed reconfigurable hardware. The compiler uses the SUIF infras- to automate the steps in customising an instruction processor tructure to do several analyses to find what computations affect and the corresponding tools. Other researchers have proposed exactly what data, as far as possible. A tiled architecture is similar approaches [61]. synthesised, where all computation is kept as local as possible Instruction processors can also run declarative langauges. to one tile. More recently, Ziegler et al. [138] have used For instance, a scalable architecture [43], consisting of mul- loop transformations in mapping loop nests onto a pipeline tiple processors based on the Warren Abstract Machine, has spanning several FPGAs. A further effort is given by the Garp been developed to support the execution of the Progol system project [19]. [88], based on the declarative language Prolog. Its effective- ness has been demonstrated using the mutagenesis data set D. Emerging directions containing 12000 facts about chemical compounds. 1) Verification: As designs are becoming more complex, 3) Multi-FPGA compilation: Peterson et al. have developed techniques for verifying their correctness are becoming in- a C compiler which compiles to multi-FPGA systems [93]. creasingly important. 
Four approaches are described: (1) The The available FPGAs and other units are specified in a library InterSim framework [97] provides a means of combining file, allowing portability. The compiler can generate designs software simulation and hardware prototyping. (2) The Lava using speculative and lazy execution to improve performance system [14] can convert designs into a form suitable for input and ultimately they aim to partition a single program be- to a model checker; a number of FPGA design libraries have tween host and reconfigurable resource (hardware/software been verified in this way [109]. (3) The Ruby language [52] codesign). Duncan et al. have developed a system with similar supports correctness-preserving transformations, and a wide capabilities [40]. This is also retargetable, using hierarchical variety of hardware designs have been produced. Fourth, the architecture descriptions. It synthesises a VLIW architecture Pebble [78] hardware design language has been formally that can be partitioned across multiple FPGAs. Both methods specified [81], so that provably-correct design tools can be can split designs across several FPGAs, and are retargetable developed. via hardware description libraries. Other C-like languages that 2) Customisable hardware compilation: Recent work [123] have been developed include MoPL-3, a C extension support- explains how customisable frameworks for hardware compi- ing data procedural compilation for the Xputer architecture lation can enable rapid design exploration, and reusable and which comprises an array of reconfigurable ALUs [9], and extensible hardware optimisation. The framework compiles spC, a systolic parallel C variant for the Enable++ board [57]. 
a parallel imperative language like Handel-C, and supports 4) Hardware/software codesign: Several research groups multiple levels of design abstraction, transformational devel- have studied the problem of compiling C code to both hard- opment, optimisation by compiler passes, and metalanguage ware and software. The Garp compiler [19] is intended to facilities. The approach has been used in producing designs accelerate plain C, with no annotations to help the compiler, for applications such as signal and image processing, with making it more widely applicable. The work targets one different trade-offs in performance and resource usage. architecture only: the Garp chip, which integrates a RISC core and reconfigurable logic. This compiler also uses the E. Design methods: main trends SUIF framework. The compiler uses a technique first devel- We summarise the main trends in design methods for oped for VLIW architectures called hyperblock scheduling, reconfigurable computing below. which optimises for instruction-level parallelism across several 1) Special-purpose design: As explained earlier, special- common paths, at the expense of rarer paths. Infeasible or purpose design methods and tools enable both high-level rare paths are implemented on the processor with the more design as well as domain-specific optimisation. Existing meth- common, easily parallelisable paths synthesised into logic for ods, such as those compiling MATLAB Simulink descrip- the reconfigurable resource. Similarly, the NAPA C compiler tions into reconfigurable computing implementations [1], [4], targets the NAPA architecture [48], which also integrates [34], [58], [89], [91], allow application developers without a RISC processor reconfigurable logic. This compiler can electronic design experience to produce efficient hardware also work on plain C code but the programmer can add C implementations quickly and effectively. 
This is an area that pragmas to indicate large-scale parallelism and the bit-widths would assume further importance in future. of variables to the code. The compiler can synthesise pipelines 2) Low-power design: Several hardware compilers aim to from loops. minimise the power consumption of their generated designs. 5) Annotation-free compilation: Some researchers aim to Examples include special-purpose design methods such as compile a sequential program, without any annotations, into Right-Size [34] and PyGen [91], and general-purpose meth- efficient hardware design. This requires analysis of the source ods that target loops for configurable hardware implementa-

IEE Proceedings Review Copy Only Page 15 of 17 Computers & Digital Techniques

tion [113]. These design methods, when combined with low-power architectures [46] and power-aware low-level tools [66], can provide significant reductions in power consumption.

3) High-level transformations: Many hardware design methods [15], [53], [128] involve high-level transformations: loop unrolling, loop restructuring and static single assignment are three examples. The development of powerful transformations for design optimisation will continue for both special-purpose and general-purpose designs.

V. SUMMARY

This paper surveys two aspects of reconfigurable computing: architectures and design methods. The main trends in architectures are coarse-grained fabrics, heterogeneous functions, and soft cores. The main trends in design methods are special-purpose design methods, low-power techniques, and high-level transformations. We wonder what a survey paper on reconfigurable computing, written in 2015, will cover.

ACKNOWLEDGMENTS

Our thanks to Ray Cheung and Sherif Yusuf for their support in preparing this paper. The support of Celoxica, Xilinx and UK EPSRC (Grant numbers GR/R 31409, GR/R 55931, GR/N 66599) is gratefully acknowledged.

REFERENCES

[1] A. Abdul Gaffar, O. Mencer, W. Luk, P.Y.K. Cheung and N. Shirazi, “Floating-point bitwidth analysis via automatic differentiation”, Proc. Int. Conf. on Field-Programmable Technology, IEEE, 2002.
[2] A. Abdul Gaffar, O. Mencer, W. Luk and P.Y.K. Cheung, “Unifying bit-width optimisation for fixed-point and floating-point designs”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2004.
[3] Actel Corp., ProASIC Plus Family Flash FPGAs, v3.5, April 2004.
[4] Altera Corp., DSP Builder User Guide, Version 2.1.3 rev. 1, July 2003.
[5] Altera Corp., Excalibur Device Overview, May 2002.
[6] Altera Corp., Stratix II Device Handbook, February 2004.
[7] Altera Corp., Nios II Processor Reference Handbook, May 2004.
[8] Annapolis Microsystems, Inc., Wildfire Reference Manual, 1998.
[9] A. Ast, J. Becker, R. Hartenstein, R. Kress, H. Reinig and K. Schmidt, “Data-procedural languages for FPL-based machines”, Field-Programmable Logic and Applications, LNCS 849, Springer, 1994.
[10] J. Babb, M. Rinard, C. Andras Moritz, W. Lee, M. Frank, R. Barua and S. Amarasinghe, “Parallelizing applications into silicon”, Proc. Symp. on FPGAs for Custom Computing Machines, IEEE Computer Society Press, 1999.
[11] J. Becker and M. Glesner, “A parallel dynamically reconfigurable architecture designed for flexible application-tailored hardware/software systems in future mobile communication”, The Journal of Supercomputing, Vol. 19, No. 1, 2001, pp. 105–127.
[12] A. Benedetti and B. Perona, “Bit-width optimization for configurable DSP’s by multi-interval analysis”, Proc. 34th Asilomar Conference on Signals, Systems and Computers, 2000.
[13] V. Betz, J. Rose and A. Marquardt, Architecture and CAD for Deep-Submicron FPGAs, Kluwer Academic Publishers, February 1999.
[14] P. Bjesse, K. Claessen, M. Sheeran and S. Singh, “Lava: hardware design in Haskell”, Proc. ACM Int. Conf. on Functional Programming, ACM Press, 1998.
[15] W. Bohm, J. Hammes, B. Draper, M. Chawathe, C. Ross, R. Rinker and W. Najjar, “Mapping a single assignment programming language to reconfigurable systems”, The Journal of Supercomputing, Vol. 21, 2002, pp. 117–130.
[16] K. Bondalapati and V.K. Prasanna, “Reconfigurable computing systems”, Proc. IEEE, Vol. 90, No. 7, pp. 1201–1217, July 2002.
[17] M. Butts, A. DeHon and S. Goldstein, “Molecular electronics: devices, systems and tools for gigagate, gigabit chips”, Proc. Int. Conference on Computer-Aided Design, IEEE, 2002.
[18] Cadence Inc., Palladium Datasheet, 2004.
[19] T. Callahan and J. Wawrzynek, “Instruction-level parallelism for reconfigurable computing”, Field-Programmable Logic and Applications, LNCS 1482, Springer, 1998.
[20] M.-A. Cantin, Y. Savaria and P. Lavoie, “An automatic word length determination method”, Proc. IEEE International Symposium on Circuits and Systems, pp. V-53–V-56, 2001.
[21] M.-A. Cantin, Y. Savaria and P. Lavoie, “A comparison of automatic word length optimization procedures”, Proc. IEEE Int. Symp. on Circuits and Systems, 2002.
[22] Celoxica, Handel-C Language Reference Manual for DK2.0, Document RM-1003-4.0, 2003.
[23] Celoxica, RC2000 Development and Evaluation Board Data Sheet, version 1.1, 2004.
[24] D. Clark and B. Hutchings, “The DISC programming environment”, Proc. Symp. on FPGAs for Custom Computing Machines, IEEE Computer Society Press, 1996.
[25] R. Cmar, L. Rijnders, P. Schaumont, S. Vernalde and I. Bolsens, “A methodology and design environment for DSP ASIC fixed point refinement”, Proc. Design Automation and Test in Europe, 1999.
[26] K. Compton and S. Hauck, “Totem: custom reconfigurable array generation”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2001.
[27] K. Compton and S. Hauck, “Reconfigurable computing: a survey of systems and software”, ACM Computing Surveys, Vol. 34, No. 2, pp. 171–210, June 2002.
[28] G.A. Constantinides, P.Y.K. Cheung and W. Luk, “The multiple wordlength paradigm”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2001.
[29] G.A. Constantinides, P.Y.K. Cheung and W. Luk, “Optimum wordlength allocation”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2002.
[30] G.A. Constantinides, P.Y.K. Cheung and W. Luk, “Optimum and heuristic synthesis of multiple wordlength architectures”, IEEE Trans. on Computer Aided Design of Integrated Circuits and Systems, Vol. 22, No. 10, pp. 1432–1442, October 2003.
[31] G.A. Constantinides, P.Y.K. Cheung and W. Luk, “Synthesis of saturation arithmetic architectures”, ACM Trans. on Design Automation of Electronic Systems, Vol. 8, No. 3, July 2003.
[32] G.A. Constantinides, P.Y.K. Cheung and W. Luk, Synthesis and Optimization of DSP Algorithms, Kluwer Academic, Dordrecht, 2004.
[33] G.A. Constantinides, High Level Synthesis and Word Length Optimization of Digital Signal Processing Systems, PhD thesis, Imperial College London, 2001.
[34] G.A. Constantinides, “Perturbation analysis for word-length optimization”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2003.
[35] G.A. Constantinides and G.J. Woeginger, “The complexity of multiple wordlength assignment”, Applied Mathematics Letters, Vol. 15, No. 2, pp. 137–140, February 2002.
[36] N. Damianou, N. Dulay, E. Lupu and M. Sloman, “The Ponder policy specification language”, Proc. Workshop on Policies for Distributed Systems and Networks, LNCS 1995, Springer, 2001.
[37] J.G. De Figueiredo Coutinho and W. Luk, “Source-directed transformations for hardware compilation”, Proc. Int. Conf. on Field-Programmable Technology, IEEE, 2003.
[38] A. DeHon and M.J. Wilson, “Nanowire-based sublithographic programmable logic arrays”, Proc. Int. Symp. on FPGAs, ACM Press, 2004.
[39] A. Derbyshire and W. Luk, “Compiling run-time parametrisable designs”, Proc. Int. Conf. on Field-Programmable Technology, IEEE, 2002.
[40] A. Duncan, D. Hendry and P. Gray, “An overview of the COBRA-ABS high-level synthesis system for multi-FPGA systems”, Proc. IEEE Symposium on FPGAs for Custom Computing Machines, IEEE Computer Society Press, 1998.
[41] C. Ebeling, D. Cronquist and P. Franklin, “RaPiD – reconfigurable pipelined datapath”, Field-Programmable Logic and Applications, LNCS 1142, Springer, 1996.
[42] Elixent Corporation, DFA 1000 Accelerator Datasheet, 2003.
[43] A. Fidjeland, W. Luk and S. Muggleton, “Scalable acceleration of inductive logic programs”, Proc. Int. Conf. on Field-Programmable Technology, IEEE, 2002.

[44] R. Franklin, D. Carver and B.L. Hutchings, “Assisting network intrusion detection with reconfigurable hardware”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2002.
[45] J. Frigo, D. Palmer, M. Gokhale and M. Popkin-Paine, “Gamma-Ray pulsar detection using reconfigurable computing hardware”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2003.
[46] A. Gayasen, K. Lee, N. Vijaykrishnan, M. Kandemir, M.J. Irwin and T. Tuan, “A dual-VDD low power FPGA architecture”, Field-Programmable Logic and Applications, LNCS Series, Springer, 2004.
[47] V. George, H. Zhang and J. Rabaey, “The design of a low energy FPGA”, Proc. Int. Symp. on Low Power Electronics and Design, 1999.
[48] M. Gokhale and J. Stone, “NAPA C: compiling for a hybrid RISC/FPGA architecture”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 1998.
[49] M. Gokhale, J.M. Stone, J. Arnold and M. Kalinowski, “Stream-oriented FPGA computing in the Streams-C high level language”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2000.
[50] S.C. Goldstein, H. Schmit, M. Budiu, S. Cadambi, M. Moe and R. Taylor, “PipeRench: a reconfigurable architecture and compiler”, IEEE Computer, Vol. 33, No. 4, 2000, pp. 70–77.
[51] Z. Guo, W. Najjar, F. Vahid and K. Vissers, “A quantitative analysis of the speedup factors of FPGAs over processors”, Proc. Int. Symp. on FPGAs, ACM Press, 2004.
[52] S. Guo and W. Luk, “An integrated system for developing regular array design”, Journal of Systems Architecture, Vol. 47, 2001.
[53] S. Gupta, N.D. Dutt, R.K. Gupta and A. Nicolau, “SPARK: a high-level synthesis framework for applying parallelizing compiler transformations”, Proc. International Conference on VLSI Design, January 2003.
[54] J.R. Hauser and J. Wawrzynek, “Garp: a MIPS processor with a reconfigurable coprocessor”, IEEE Symposium on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 1997.
[55] T. Harriss, R. Walke, B. Kienhuis and E. Deprettere, “Compilation from Matlab to process networks realized in FPGA”, Design Automation of Embedded Systems, Vol. 7, No. 4, 2002.
[56] C.A.R. Hoare, Communicating Sequential Processes, Prentice Hall, 1985.
[57] H. Högl, A. Kugel, J. Ludvig, R. Männer, K. Noffz and R. Zoz, “Enable++: a second-generation FPGA processor”, IEEE Symposium on FPGAs for Custom Computing Machines, IEEE Computer Society Press, 1995.
[58] J. Hwang, B. Milne, N. Shirazi and J.D. Stroomer, “System level tools for DSP in FPGAs”, Field-Programmable Logic and Applications, LNCS 2147, Springer, 2001.
[59] P.A. Jackson, B.L. Hutchings and J.L. Tripp, “Simulation and synthesis of CSP-based interprocess communication”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2003.
[60] H. Kagotani and H. Schmit, “Asynchronous PipeRench: architecture and performance evaluations”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2003.
[61] V. Kathail et al., “PICO: automatically designing custom computers”, Computer, Vol. 35, No. 9, Sept. 2002.
[62] H. Keding, M. Willems, M. Coors and H. Meyr, “FRIDGE: a fixed-point design and simulation environment”, Proc. Design Automation and Test in Europe, 1998.
[63] C. Kulkarni, G. Brebner and G. Schelle, “Mapping a domain specific language to a platform FPGA”, Proc. Design Automation Conference, 2004.
[64] K. Kum and W. Sung, “Word-length optimization for high-level synthesis of digital signal processing systems”, Proc. IEEE Int. Workshop on Signal Processing Systems, 1998.
[65] K.-I. Kum and W. Sung, “Combined word-length optimization and high-level synthesis of digital signal processing systems”, IEEE Trans. Computer Aided Design, Vol. 20, No. 8, pp. 921–930, August 2001.
[66] J. Lamoureux and S.J.E. Wilton, “On the interaction between power-aware FPGA CAD algorithms”, Int. Conf. on Computer-Aided Design, IEEE, 2003.
[67] Lattice Semiconductor Corp., ispXPGA Family, January 2004.
[68] R. Laufer, R. Taylor and H. Schmit, “PCI-PipeRench and the SwordAPI: a system for stream-based reconfigurable computing”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 1999.
[69] E.A. Lee and D.G. Messerschmitt, “Static scheduling of synchronous data flow programs for digital signal processing”, IEEE Trans. Computers, January 1987.
[70] T.K. Lee, S. Yusuf, W. Luk, M. Sloman, E. Lupu and N. Dulay, “Compiling policy descriptions into reconfigurable firewall processors”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2003.
[71] T.K. Lee, A. Derbyshire, W. Luk and P.Y.K. Cheung, “High-level language extensions for run-time reconfigurable systems”, Proc. Int. Conf. on Field-Programmable Technology, IEEE, 2003.
[72] G. Lemieux and D. Lewis, Design of Interconnect Networks for Programmable Logic, Kluwer Academic Publishers, Feb. 2004.
[73] P. Leong, M. Leong, O. Cheung, T. Tung, C. Kwok, M. Wong and K. Lee, “Pilchard – a reconfigurable computing platform with memory slot interface”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2001.
[74] P.H.W. Leong and K.H. Leung, “A microcoded Elliptic Curve Processor using FPGA technology”, IEEE Transactions on Very Large Scale Integration Systems, Vol. 10, No. 5, 2002, pp. 550–559.
[75] J. Liang, R. Tessier and O. Mencer, “Floating point unit generation and evaluation for FPGAs”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2003.
[76] W. Luk, “Customising processors: design-time and run-time opportunities”, Computer Systems: Architectures, Modeling, and Simulation, LNCS 3133, Springer, 2004.
[77] W. Luk, P.Y.K. Cheung and N. Shirazi, “Configurable computing”, Electrical Engineer’s Handbook, W.K. Chen (ed.), Academic Press, 2004.
[78] W. Luk and S.W. McKeever, “Pebble: a language for parametrised and reconfigurable hardware design”, Field-Programmable Logic and Applications, LNCS 1482, Springer, 1998.
[79] A. Marshall, T. Stansfield, I. Kostarnov, J. Vuillemin and B. Hutchings, “A reconfigurable arithmetic array for multimedia applications”, ACM/SIGDA International Symposium on FPGAs, Feb 1999, pp. 135–143.
[80] S. McCloud, Catapult C Synthesis-based Design Flow: Speeding Implementation and Increasing Flexibility, White Paper, 2004.
[81] S.W. McKeever, W. Luk and A. Derbyshire, “Compiling hardware descriptions with relative placement information for parametrised libraries”, Formal Methods in Computer-Aided Design, LNCS 2517, Springer, 2002.
[82] B. Mei, S. Vernalde, D. Verkest, H. De Man and R. Lauwereins, “ADRES: an architecture with tightly coupled VLIW processor and coarse-grained reconfigurable matrix”, Field-Programmable Logic and Applications, LNCS 2778, Springer, 2003.
[83] O. Mencer, “PAM-Blox II: design and evaluation of C++ module generation for computing with FPGAs”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2002.
[84] O. Mencer, D.J. Pearce, L.W. Howes and W. Luk, “Design space exploration with A Stream Compiler”, Proc. Int. Conf. on Field-Programmable Technology, IEEE, 2003.
[85] Mentor Graphics, Vstation Pro: High Performance System Verification, 2003.
[86] E. Mirsky and A. DeHon, “MATRIX: a reconfigurable computing architecture with configurable instruction distribution and deployable resources”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 1996.
[87] K. Morris, “Virtex 4: Xilinx details its next generation”, FPGA and Programmable Logic Journal, June 2004.
[88] S.H. Muggleton, “Inverse entailment and Progol”, New Generation Computing, Vol. 13, 1995.
[89] A. Nayak, M. Haldar, A. Choudhary and P. Banerjee, “Precision and error analysis of MATLAB applications during automated hardware synthesis for FPGAs”, Proc. Design Automation and Test in Europe, 2001.
[90] S. Ong, N. Kerkiz, B. Srijanto, C. Tan, M. Langston, D. Newport and D. Bouldin, “Automatic mapping of multiple applications to multiple adaptive computing systems”, Proc. Int. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2001.
[91] J. Ou and V. Prasanna, “PyGen: a MATLAB/Simulink based tool for synthesizing parameterized and energy efficient designs using FPGAs”, Proc. Int. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2004.

[92] I. Page and W. Luk, “Compiling occam into FPGAs”, FPGAs, Abingdon EE&CS Books, 1991.
[93] J. Peterson, B. O’Connor and P. Athanas, “Scheduling and partitioning ANSI-C programs onto multi-FPGA CCM architectures”, Int. Symp. on FPGAs for Custom Computing Machines, IEEE Computer Society Press, 1996.
[94] Quicklogic Corp., Eclipse-II Family Datasheet, January 2004.
[95] A. Rahman and V. Polavarapuv, “Evaluation of low-leakage design techniques for Field Programmable Gate Arrays”, Proc. Int. Symp. on Field-Programmable Gate Arrays, ACM Press, 2004.
[96] R. Razdan and M.D. Smith, “A high performance microarchitecture with hardware programmable functional units”, International Symposium on Microarchitecture, 1994, pp. 172–180.
[97] T. Rissa, W. Luk and P.Y.K. Cheung, “Automated combination of simulation and hardware prototyping”, Proc. Int. Conf. on Engineering of Reconfigurable Systems and Algorithms, CSREA Press, 2004.
[98] A. Royal and P.Y.K. Cheung, “Globally asynchronous locally synchronous FPGA architectures”, Field-Programmable Logic and Applications, LNCS 2778, Springer, 2003.
[99] C.R. Rupp, M. Landguth, T. Garverick, E. Gomersall, H. Holt, J. Arnold and M. Gokhale, “The NAPA adaptive processing architecture”, IEEE Symposium on Field-Programmable Custom Computing Machines, May 1998, pp. 28–37.
[100] T. Saxe and B. Faith, “Less is More with FPGAs”, EE Times, September 13, 2004, http://www.eetimes.com/showArticle.jhtml?articleID=47203801.
[101] P. Schaumont, I. Verbauwhede, K. Keutzer and M. Sarrafzadeh, “A quick safari through the reconfiguration jungle”, Proc. Design Automation Conf., ACM Press, 2001.
[102] R. Schreiber et al., “PICO-NPA: high-level synthesis of nonprogrammable hardware accelerators”, Journal of VLSI Signal Processing Systems, Vol. 31, No. 2, June 2002.
[103] S.P. Seng, W. Luk and P.Y.K. Cheung, “Flexible instruction processors”, Proc. Int. Conf. on Compilers, Architecture and Synthesis for Embedded Systems, ACM Press, 2000.
[104] S.P. Seng, W. Luk and P.Y.K. Cheung, “Run-time adaptive flexible instruction processors”, Field-Programmable Logic and Applications, LNCS 2438, Springer, 2002.
[105] N. Shirazi, W. Luk and P.Y.K. Cheung, “Framework and tools for run-time reconfigurable designs”, IEE Proc.-Comput. Digit. Tech., May 2000.
[106] Silicon Hive, “Avispa Block Accelerator”, Product Brief, 2003.
[107] Simulink, http://www.mathworks.com
[108] H. Singh, M.-H. Lee, G. Lu, F. Kurdahi, N. Bagherzadeh and E. Chaves, “MorphoSys: an integrated reconfigurable system for data-parallel and compute intensive applications”, IEEE Trans. on Computers, Vol. 49, No. 5, May 2000, pp. 465–481.
[109] S. Singh and C.J. Lillieroth, “Formal verification of reconfigurable cores”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 1999.
[110] I. Sourdis and D. Pnevmatikatos, “Fast, large-scale string matching for a 10Gbps FPGA-based network intrusion detection system”, Field-Programmable Logic and Applications, LNCS 2778, Springer, 2003.
[111] M. Stephenson, J. Babb and S. Amarasinghe, “Bitwidth analysis with application to silicon compilation”, Proc. SIGPLAN Programming Language Design and Implementation, June 2000.
[112] M.W. Stephenson, Bitwise: Optimizing Bitwidths Using Data-Range Propagation, Master’s Thesis, Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, May 2000.
[113] G. Stitt, F. Vahid and S. Nematbakhsh, “Energy savings and speedups from partitioning critical software loops to hardware in embedded systems”, ACM Trans. on Embedded Computing Systems, Vol. 3, No. 1, Feb. 2004, pp. 218–232.
[114] H. Styles and W. Luk, “Customising graphics applications: techniques and programming interface”, Proc. Symp. on Field-Programmable Custom Computing Machines, IEEE Computer Society Press, 2000.
[115] H. Styles and W. Luk, “Branch optimisation techniques for hardware compilation”, Field-Programmable Logic and Applications, LNCS 2778, Springer, 2003.
[116] W. Sung and K. Kum, “Word-length determination and scaling software for a signal flow block diagram”, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 1994.
[117] W. Sung and K. Kum, “Simulation-based word-length optimization method for fixed-point digital signal processing systems”, IEEE Trans. Signal Processing, Vol. 43, No. 12, December 1995, pp. 3087–3090.
[118] M. Taylor et al., “The RAW microprocessor: a computational fabric for software circuits and general purpose programs”, IEEE Micro, Vol. 22, No. 2, March/April 2002, pp. 25–35.
[119] J. Teifel and R. Manohar, “Programmable asynchronous pipeline arrays”, Field-Programmable Logic and Applications, LNCS 2778, Springer, 2003.
[120] N. Telle, C.C. Cheung and W. Luk, “Customising hardware designs for Elliptic Curve Cryptography”, Computer Systems: Architectures, Modeling, and Simulation, LNCS 3133, Springer, 2004.
[121] R. Tessier and W. Burleson, “Reconfigurable computing and digital signal processing: a survey”, Journal of VLSI Signal Processing, Vol. 28, May/June 2001, pp. 7–27.
[122] D. Thomas and W. Luk, “A framework for development and distribution of …”, Reconfigurable Technology: FPGAs and Reconfigurable Processors for Computing and Communications, Proc. SPIE, Vol. 4867, 2002.
[123] T. Todman, J.G.F. Coutinho and W. Luk, “Customisable hardware compilation”, Proc. Int. Conf. on Engineering of Reconfigurable Systems and Algorithms, CSREA Press, 2004.
[124] K. Underwood, “FPGAs vs. CPUs: trends in peak floating-point performance”, Proc. Int. Symp. on FPGAs, ACM Press, 2004.
[125] L. Vereen, “Soft FPGA cores attract embedded developers”, Embedded Systems Programming, 23 April 2004, http://www.embedded.com/showArticle.jhtml?articleID=19200183.
[126] J. Vuillemin, P. Bertin, D. Roncin, M. Shand, H. Touati and P. Boucard, “Programmable Active Memories: reconfigurable systems come of age”, IEEE Transactions on VLSI Systems, Vol. 4, No. 1, March 1996, pp. 56–69.
[127] S.A. Wadekar and A.C. Parker, “Accuracy sensitive word-length selection for algorithm optimization”, Proc. International Conference on Computer Design, 1998.
[128] M. Weinhardt and W. Luk, “Pipeline vectorization”, IEEE Trans. on Computer-Aided Design, Vol. 20, No. 2, 2001.
[129] M. Willems, V. Bürsgens, H. Keding, T. Grötker and M. Meyer, “System-level fixed-point design based on an interpolative approach”, Proc. 34th Design Automation Conference, June 1997.
[130] R.S. Williams and P.J. Kuekes, “Molecular nanoelectronics”, Proc. Int. Symp. on Circuits and Systems, IEEE, 2000.
[131] R.P. Wilson, R.S. French, C.S. Wilson, S.P. Amarasinghe, J.M. Anderson, S.W.K. Tjiang, S.-W. Liao, C.-W. Tseng, M.W. Hall, M.S. Lam and J.L. Hennessy, “SUIF: an infrastructure for research on parallelizing and optimizing compilers”, ACM SIGPLAN Notices, Vol. 29, No. 12, December 1994.
[132] R.D. Wittig and P. Chow, “OneChip: an FPGA processor with reconfigurable logic”, IEEE Symposium on FPGAs for Custom Computing Machines, 1996.
[133] C.G. Wong, A.J. Martin and P. Thomas, “An architecture for asynchronous FPGAs”, Proc. Int. Conf. on Field-Programmable Technology, IEEE, 2003.
[134] Xilinx, Inc., PowerPC 405 Processor Block Reference Guide, October 2003.
[135] Xilinx, Inc., Virtex II Datasheet, June 2004.
[136] Xilinx, Inc., MicroBlaze Processor Reference Guide, June 2004.
[137] A. Yamada, K. Nishida, R. Sakurai, A. Kay, T. Nomura and T. Kambe, “Hardware synthesis with the Bach system”, Proc. ISCAS, IEEE, 1999.
[138] H. Ziegler, B. So, M. Hall and P. Diniz, “Coarse-grain pipelining on multiple-FPGA architectures”, IEEE Symposium on Field-Programmable Custom Computing Machines, IEEE, 2002, pp. 77–88.
