A Modular Peripheral to Support Self-Reconfiguration in Socs

A Modular Peripheral to Support Self-Reconfiguration in SoCs Andrés Otero, Ángel Morales-Cas, Jorge Portilla, Eduardo de la Torre, and Teresa Riesgo Centro de Electrónica Industrial, Universidad Politécnica de Madrid {joseandres.otero, jorge.portilla, eduardo.delatorre, teresa.riesgo}@upm.es, [email protected] Abstract— In this paper, a solution to support the run-time configuration files to reconfigure only the desired portion of readback, relocation and replication of cores in embedded the device, the tools to relocate these modules in any arbitrary systems with dynamic and partial reconfiguration capabilities is position, or the control of the configuration port and the presented. The proposal shows a peripheral structure that allows communication infrastructure compatible with the DPR an easy integration and communication with the rest of the technique [6]. system, including an API to make the reconfiguration details to be more transparent to software applications. Differently to other Regarding the configuration ports, different interfaces are proposals, all functionality is implemented in hardware, available. Xilinx FPGAs contain the ICAP, an internal port that achieving a higher reconfiguration speed. In addition, different allows the partial reconfiguration of the device. This leads to design decisions have been taken in order to increase the the concept of self-reconfigurable embedded system, a system portability of the solution to existing and, possibly, future that can modify its functionality autonomously at run-time. To FPGAs. Finally, a use case is provided, which shows the features control the ICAP, Xilinx provides a wrapper called HWICAP of this module applied to the run-time scaling of a hardware that can be integrated in the system like a peripheral through coprocessor. the IBM PLB CoreConnect bus [7]. Keywords-component; FPGA; Dynamic Reconfiguration; core Regarding the generation of partial bitstreams, that is, relocation; Scalability configuration files to reconfigure only the desired resources of the device, traditionally there has always been an important gap I. INTRODUCTION between the silicon availability and the apparition of design tools and flows supporting this feature. In addition, partial The emergence of commercial FPGAs with dynamic and reconfiguration files are addressed to the position of the device partial reconfiguration (DPR) features has opened promising where they were initially mapped. As a result, either a different research opportunities in the field of reconfigurable computing. partial bitstream is generated in advance for each possible The benefits and possibilities offered by this technique have reconfigurable region, or the final position of each block is been extensively reported, mainly in academic works [1] [2] restricted to the original one. With the second option, the [3] [4]. An important one is that partial reconfiguration reduces system can enter into stall states, if it requires the execution of time and memory overheads compared with full a certain hardware block, and the corresponding reconfigurable reconfiguration. In addition, it allows the design of specific region is not available. An alternative, with low computing hardware for each particular application, providing advantages, overhead and bitstream storage requirements, is to modify the both regarding power consumption and performance. target position of the hardware modules at run-time, during the Furthermore, dynamic partial reconfiguration allows the reconfiguration process. Some solutions exist to carry it out, implementation of complex systems in constrained devices, including both software and hardware implementations, like and enhances run-time adaptability for embedded systems those reported in [8] and [8]. However, hardware approaches working under variable environments. Consequently, due to have restricted functionality and are very device dependent, these advantages, and the limits of silicon integration, while software ones provide lower reconfiguration rates, reconfigurable computing is seen like a must for future because of their high overhead. embedded systems. The work described in this paper provides an easy and fast Despite these advantages, the main obstacles to provide hardware based solution for the readback, relocation and partially reconfigurable solutions with commercial purposes replication of partial bitstreams. The main purpose of this are the lack of commercial design flows and tools supporting it solution is to provide digital systems designers the possibility [5], as well as the need of having specific knowledge on of making logical designs that take advantage of the partial dynamic reconfiguration techniques. In addition, the dynamic reconfiguration, without having a previous particularity of each device, and the changes from one preparation in this field. Additionally, the hardware system was generation of reconfigurable devices to the next one, makes designed to be easily ported to new devices and families of difficult to create portable and upgradable solutions. To deal Xilinx FPGAs. This module is designed as a peripheral with dynamic reconfiguration from scratch, different issues module, including integration with the rest of the system by have to be solved. Some of them are the creation of partial means of the CoreConnect technology, and a software API to increase independence between the family-specific hardware implementation and the software applications using the dynamic reconfiguration capabilities offered by the peripheral. This independence will be proved with the adaptation of this solution not only across devices within a family, but also between different families. Also, the reconfiguration control process provided by the block, may allow future online modifications (also called mutations) of the bitstream. The rest of the paper is organized as follows. In section II, a brief description of the Xilinx Virtex FPGAs reconfiguration details is provided. Section III includes an overview of the state-of-the art of the existing hardware solutions for bitstream manipulation, while in section IV the approach of this paper is introduced. In sections V and VI, implementation details, both at hardware and software levels are described, and finally, in section VII, implementation results are given. II. XILINX VIRTEX RECONFIGURATION DETAILS Before introducing the details of the solution proposed in Figure 1. Two dimensional partial reconfiguration in Virtex-5 this paper, it is necessary to explain some concepts about the Xilinx Virtex architecture and its bitstream structure. Afterwards, the Major Address, that is, the column inside the clock region has to be selected. Finally, the Minor Address, Partial reconfiguration of Xilinx FPGAs is based on the the position of the frame inside a column is identified. The modification of the content of the SRAM that controls the logic number of frames needed to configure a column depends on the configuration of each element of the device. The smaller kind of logical elements that it contains. Other field, called addressable unit of this configuration memory is called a block type, selects the configuration layer of the device. In this frame, and defines the minimum reconfiguration granularity. work, only type 0 elements will be addressed, that is, the Frames for Virtex families prior to Virtex-4 are arranged in interconnect and block configuration type. The concrete value columns that span the device entirely from top to bottom. On of all these parameters, as well as the number of resources of the other side, Virtex-4, Virtex-5 and Virtex-6 devices have the FPGA and their relative positions, depend on the particular more complex architectures, since they are composed by device. stacked rows of configurable blocks, called clock regions. Each clock region is composed by columns of configurable elements. Each frame expands the height of a row/clock region. As a result, not the full height of the device has to be reconfigured at each time, permitting a 2D reconfiguration scenario. This can be seen in Figure 1. Access to the internal configuration memory may be done, among other possibilities, through the internal registers of the ICAP port. Partial reconfiguration implies to provide some configuration commands, the register programming for those registers in the area being reconfigured, and the bitstream data itself. For the purpose of this paper, the registers of interest are the FAR, where the address of the frame to be accessed has to be stored; the CRC, a checksum of the configuration data; the COR, where the configuration options are selected; the FDRI, where the FAR data to be configured is stored, and the FDRO, that provides readback configuration information. In addition, a command register (CMD) is used to trigger the internal state machine of the ICAP. Details of this process can be found in the configuration user guides like [10] [11]. To reconfigure a region, the FAR values corresponding to Figure 2. Virtex-5 FAR format this region have to be generated. In Virtex devices, the FAR is created like the composition of different fields, as it can be As it has been already said, the solution provided by Xilinx seen in Figure 2. The first one is the position of the frame, in to create self-reconfigurable platforms is called HWICAP. This the top or in the bottom of the device. Then, the clock region is a wrapper of the ICAP primitive, including some internal where it is situated has to be indicated. memories, an ICAP

A Modular Peripheral to Support Self-Reconfiguration in Socs

An Architecture and Compiler for Scalable On-Chip Communication

On-Chip Interconnect Schemes for Reconfigurable System-On-Chip

AXI Reference Guide

Wishbone Bus Architecture – a Survey and Comparison

An Overview of Soc Buses

Vitex-II Pro: the Platfom for Programmable Systems

Computing Platforms Chapter 4

Xilinx XAPP1000: Reference System : Plbv46 PCI Express in a ML555

Design and Implementation of Clocked Open Core Protocol Interfaces for Intellectual Property Cores and On-Chip Network Fabric

New Virtex™-II Pro Family Extends Platform FPGA Capability with Multi

View of the Thesis

Automated Bus Generation for Multi-Processor Soc Design