A Scalable FPGA Implementation of Cellular Neural Networks for Gabor-Type Filtering

2006 International Joint Conference on Neural Networks Sheraton Vancouver Wall Centre Hotel, Vancouver, BC, Canada July 16-21, 2006 A Scalable FPGA Implementation of Cellular Neural Networks for Gabor-type Filtering Ocean Y. H. Cheung, Philip H. W. Leong, Eric K. C. Tsang, Bertram E. Shi Abstract— We describe an implementation of Gabor-type stage of processing, such as the primary visual cortex (V1), filters on field programmable gate arrays using the cellu- additional selectivity to orientation, direction of motion and lar neural network (CNN) architecture. The CNN template binocular disparity have been recorded. A large proportion of depends upon the parameters (e.g., orientation, bandwidth) of the Gabor-type filter and can be modified at runtime so cells in V1 can be modeled by a orientation selective neuron that the functionality of Gabor-type filter can be changed which consists of a Gabor-type filter followed by a nonlinear dynamically. Our implementation uses the Euler method to element [2]. solve the ordinary differential equation describing the CNN. In this paper, we first present a high performance FPGA- The design is scalable to allow for different pixel array sizes, based cellular neural network which implements a Gabor- as well as simultaneous computation of multiple filter outputs 1 tuned to different orientations and bandwidths. For 1024 pixel type filter. This is a building block element which we frames, an implementation on a Xilinx Virtex XC2V1000-4 intend to use with different parameters to model cells in device uses 1842 slices, operates at 120 MHz and achieves 23,000 V1. The application of the Gabor-type filter in a neuro- Euler iterations over one frame per second. morphic system consisting of an analog VLSI retina chip interfaced to our Gabor chip is also presented. The retina I. INTRODUCTION chip serves as an imager and transmits its output to the FPGA It is clear that biological systems perform feats of signal via an address event representation (AER) transceiver. By processing that we have not been able to approach using combining both analog VLSI chips and digital chips (e.g. even the most sophisticated computers and signal processing DSP chips, microprocessors, and FPGA chips), we hope to techniques. Generally, biological based systems operate with make real-time implementations of early vision models and greater functionality, lower power consumption and increased visualize their temporal behaviour, while achieving a level robustness than their man-made electrical counterparts. of performance, integration, power consumption, area and Although biological visual systems have been widely flexibility not possible using any technology in isolation. studied at the physiological, psychophysical and functional An analog VLSI silicon retina is used for image ac- levels, our understanding of its signal processing mechanisms quisition. Neuronal activity in the retina is transmitted to is still very rudimentary. One of the obstacles in this pro- an FPGA using the address event representation (AER) cess is the difficulty of dealing with the vast amounts of communication protocol [3], [4], [5], [6]. The AER scheme processing necessary to test real-time temporal models of is not widely used in reconfigurable computing circles and the visual system. Although parallel clusters offer enormous offers an efficient communications scheme when activity is amounts of computing power, issues concerning latency, sparse, as in the case of most biologically inspired systems. power consumption and volume constraints are limiting in Using AER, multiple neuromorphic chips can be intercon- many applications. To address this issue, implementations nected in a modular manner and large networks of Gabor using subthreshold analog VLSI techniques have been pro- chips, each modeling different cells in the visual system can posed and real-time models of early vision systems have been be implemented. A preliminary version of this paper was developed. published in [7]. Analog VLSI is an excellent match to the distributed par- The design presented has the following contributions: allel computation present in biological systems. It offers high • We present a novel, flexible and scalable architecture speed, high density and low power consumption. However, for modeling early vision using cellular neural networks it has the disadvantages of long design time, low signal to (CNNs) on Field Programmable Gate Arrays noise ratio and is best suited for simple, regular designs. Field • The functionality of the Gabor-type filter can be programmable gate arrays (FPGAs) have complementary changed at runtime by downloading different CNN properties in that design times are short, arbitrary dynamic templates range can be achieved and complex control logic is relatively • We describe how a system can be developed using a easy to implement. We believe that hybrid systems which front end analog VLSI retina system to study the early process information in both domains are able to combine vision models in V1, and which operates in real time their benefits. • We propose and explore the simulation of neural activity In visual systems, as information travels from the retina on FPGAs using an AER interface to the cortex, neurons become progressively more selective to complex stimuli [1]. Cells in the retina are sensitive to 1This research was supported in part by Hong Kong Research Grants position, size, temporal frequency and color. In the higher Council. 0-7803-9490-9/06/$20.00/©2006 IEEE 15 ∆t The rest of the paper is organized as follows. In Section II, Where fij is defined in equation 2 and h = τ . The Gabor-type filters and their implementation using cellular accumulated error for Euler's method is O(h). neural networks are introduced. Next, in Section III the For a 3 × 3 CNN, the feedback template, akl is architecture of an FPGA-based Gabor-type filter is described. a a a Results obtained from our implementation are in Section IV. 00 10 20 0 a a a 1 Finally, conclusions are drawn in Section V. 01 11 21 @ a02 a12 a22 A II. MODEL and for the special case of a Gabor-type filter [15], the A Gabor filter can be represented by the complex valued template is 2 −jΩy convolutional kernel [8] 0:0 αye 0:0 0 2 jΩx 2 2 2 −jΩx 1 1 x2+y2 αxe −2(αx + αy) αxe : − 2 j(!xx+!y y) g(x; y) = e 2σ e (1) 2 jΩy 2πσ2 @ 0:0 αye 0:0 A where σ; !x; !y are real constants. The even kernel is cosine The control template, bkl is given by modulated and the odd kernel is sine modulated and hence 0 0 0 two filers are 90 degrees out of phase. Pollen and Ronner 0 0 β 0 1 : suggested that adjacent neurons in visual cortex have even @ 0 0 0 A and odd symmetry [9] and therefore Gabor-filter can model receptive field profiles. The spatial transfer function of this filter for an input Such filters can be used to model the receptive fields of image that is constant in time is neurons in the visual cortex [2], [10] and have applications in stereo vision [11], texture segmentation [12], and mo- HΩ H(!x; !y) = − − tion analysis [13]. Cellular neural networks (CNN) can be 2−2 cos(!x−Ωx) 2 2 cos(!y Ωy ) 1 + (∆Ω )2 + (∆Ω )2 used to implement Gabor-type filters, where the functions x y HΩ modulating the complex exponentials are not necessarily H(!x; !y) ≈ 2 2 (5) (!x−Ωx) (!y −Ωy ) 1 + 2 + 2 Gaussian [14]. In this section, we describe the equations (∆Ωx) (∆Ωy ) necessary to implement a Gabor-type filter and refer readers where to [14] for a full derivation. The CNN formulation leads to a particularly simple paral- β H(Ω) = 2 2 1 + α (2 − 2 cos Ωx) + α (2 − 2 cos Ωy) lel implementation and is adopted in this work. Let uij ; xij x y and yij represent the input, state and output pixels of an 2 β (∆Ωx) = 2 image respectively. The dynamics of the CNN are governed HΩαx by the differential equation 2 β (∆Ωy) = 2 dxij HΩαy τ = fij = −xij + aklykl + bklukl (2) dt X X The constant HΩ is the maximum gain of the filter, kl2Nr kl2Nr which occurs at the spatial frequency (!x; !y) = (Ωx; Ωy), where akl represents the feedback template and bkl represents corresponding to a sine wave grating with spatial frequency the control template. 2 2 Ωy magnitude Ω = Ωx + Ωy and orientation θ = arctan( ). The output of the CNN is commonly assumed to be a q Ωx The terms ∆Ω and ∆Ω are referred as the half bandwidths sigmoidal nonlinearity, x y in x and y, since the transfer function drops to half its 1 maximum value at spatial frequencies (!x; !y) = (Ωx yij = (jxij + 1j − jxij − 1j) (3) 2 ∆Ω; Ωy) and (!x; !y) = (Ωx; Ωy ∆Ω). The parameter β which is linear in the range [−1; 1] and saturates at −1 or +1 provides control over the filter gain. In the following, we outside this range. Since the Gabor-type template is intended assume that β is chosen so that the filter has unity gain, i.e., to operate in the linear region, for simplicity we will assume HΩ = 1. By changing the values of Ωx; Ωy; ∆Ωx and ∆Ωy (or in the following that yi;j = xi;j . Finite difference methods can be used to solve the ordinary equivalently Ωx; Ωy; αx and αy), the Gabor-type filter can be tuned to respond to different orientations and spatial scales. differential equation (ODE) in Equation 2. Although tech- π π π π niques such as Runga Kutta methods with higher accuracy For example, if we choose Ωx = 12 cos( 4 ); Ωy = 12 sin( 4 ), and π (i.

Load more