A generic library for structured real-time computations: GPU implementation applied to retinal and cortical vision processes

Jean-Charles Quinton

To cite this version: Jean-Charles Quinton. A generic library for structured real-time computations: GPU implementation applied to retinal and cortical vision processes. Machine Vision and Applications, Springer Verlag, 2010, 21 (4), pp.529-540. 10.1007/s00138-008-0180-9. hal-01966569

HAL Id: hal-01966569
https://hal.archives-ouvertes.fr/hal-01966569
Submitted on 28 Dec 2018

Computer Science Research Institute, National Polytechnic Institute, University of Toulouse, 2 rue Charles Camichel BP 7122 - F 31071 Toulouse Cedex 7, France

Abstract Most graphics cards in standard personal computers are now equipped with several pixel pipelines running shader programs. Taking advantage of this technology by transferring parallel computations from the CPU side to the GPU side increases the overall computational power, even in non-graphical applications, by freeing the main processor from heavy work. A generic library is presented to show how anyone can benefit from modern hardware by combining various techniques with few hardware-specific programming skills. Its shader implementation is applied to retinal and cortical simulation. The purpose of this sample application is not to provide a correct approximation of real center-surround ganglion or middle temporal cells, but to illustrate how easily intertwined spatiotemporal filters can be applied to raw input pictures in real time. Requirements and interconnection complexity really depend on the vision framework adopted; therefore, various hypotheses that may benefit from such a library are introduced.

1 Introduction

Whether one tries to implement biological or artificial vision systems, low-level processes are massively parallel. Visual perception is considered active in frameworks such as interactivism, enaction or ecological vision. Such approaches introduce temporal patterns of local anticipations in the retinal field resulting from eye saccades, body movements or environment dynamics. To some extent, top-down modulations provide a way to reduce computations, concentrate on specific features and cope with environmental/perceptual noise. Yet unanticipated motion, rapid changes in brightness or any striking event relative to current activity attract perception and must be processed to be taken into account.

Fig. 1 Simplified one-dimensional representation of anticipations performed by a Middle Temporal cortical cell (MT). (a) Input cells (1 and 2) get activated when an object moves in the field of view. The higher the input cell index (from 1 to 5), the lower the propagation time to the MT cell. (b) An assimilated movement will lead to the activation of the MT cell resulting from the synchrony of the signals. (c) Any movement that is either too fast, too slow or inverse will not lead to recognition. (Diagram labels: nervous signal, equitemporal, assimilated movement, non assimilated movement.)

When high-resolution frames are fully filtered, segmented and analyzed, for example when using computer vision algorithms such as edge/feature point detectors or motion estimators, the high computational load required is obvious. Modeling retinal/cortical processes that reproduce the behavior of center-surround ganglion cells or middle temporal (MT) cells on wide receptive fields also introduces large-scale computations, due to the variability of parameters such as speed, direction or receptive field size.

This article does not introduce a new technology or algorithm, but combines different techniques and interfaces different programming languages to produce a portable generic library which can be used and extended by anyone without any knowledge of optimization or graphics card programming. Though this library might be used as is, source code is provided not only to show how the wrapping of hardware-specific functions is done, but also to convince the reader that good performance can be achieved at low cost and with little effort.

The library architecture consists of several layers manipulating filters. Filters not already implemented can be defined by the user in a standard language, then called from a script or a graphical user interface. The library then structures, orders and applies the filters on the fly depending on temporal and spatial dependencies. The lowest-level layer can target central processing units (CPU) or graphics processing units (GPU) depending on available hardware. The library has been partly ported to support the Cell Broadband Engine and take advantage of its synergistic processing elements. The huge benefit from using vector instructions and such parallel architectures is now affordable, as the Cell BE is at the core of the PlayStation 3 (installing Linux and the IBM SDK is however required). But since PCs with powerful graphics cards are widespread today, the shader implementation will be detailed.

Recent changes and extensions on graphics cards now allow us to implement parallel algorithms and use the GPU rather than the CPU for real-time applications and higher performance. Many companies and laboratories are investigating graphics hardware capabilities for general purpose computations, but this article concentrates on how to easily make the most of them in a generic way for vision processes, which share similarities with the original aim of graphics card pipelines. Though several libraries exist for general purpose computations using graphics hardware (GPGPU), such as RapidMind (previously known as Lib Sh) [9], Brook [6] or CUDA [2], they deal with data streams rather than pictures, integrate complex optimizations and include abstraction layers to be effective on any application. On the other hand, high integration level image processing frameworks such as CoreImage [1] or OpenVIDIA [12] generally neither provide access to intermediate results, nor handle time dependencies by default, nor support arbitrary hardware. Of course many task-specific GPU-enabled applications have also been developed across the world [12,17,24], but this paper is an attempt to find a golden mean between generality and task dependence in the field of vision.

The portable implementation presented here is coded in Java 1.6 using JOGL (Java bindings for OpenGL API) and runs on low-cost computers, allowing anyone to implement and apply spatial and temporal filters on any input source (computer generated or captured). Any intermediate output is stored as a texture and can be further processed or extracted on request. As long as the user does not ask for a transfer to the computer main memory, all computations are performed on the graphics card for speed improvement. Even if the hardware is in constant evolution (render-to-texture for instance is now widespread), readers wishing to learn the specifics and limitations of a GPU compared to a standard CPU may read the paper by Michael Shantzis on the subject [23]. An example of application in the field of retinal and cortical processes is detailed throughout this article. Although its purpose is neither to give an accurate description of biological processes nor to give a full account of how to model and optimize such processes, it illustrates the main capabilities of the library.

2 Theoretical framework

Fast visual processes can be useful in any framework and a generic library needs to be compatible with all. The variability in vision theories is illustrated by the interactivist framework [5,7], implying a radical shift in the requirements for a vision library. Considering this approach, concepts and objects emerge from a network of interactions, each continuously anticipating events resulting from actions taken. What are called events can be the activity of other internal processes or direct perceptions from various modalities. It is quite similar to Jean Piaget's sensori-motor theory [22] or Gibson's affordance theory [13,14]. More specifically, when applied to vision, it shares many aspects of O'Regan's conclusions [20,21]. Local signals coming out of the optic nerve as well as higher-level features can be anticipated when eye saccades are performed. Though saccades might be considered as hindering the already uneven visual perception, several studies focus on the usefulness and intentionality of saccades [27,19].

Fig. 2 Textures combination by a GPU pixel shader. (Diagram labels: graphics card memory, input textures, optional parameters, shader source code, pixel shader, output texture.)

matching is correct. If the context is unambiguous enough, there is no need for further exploration as soon as the activity in a highly connected part of the network, corresponding to a concept, starts to rise above the others. Moreover, due to regulation and relative comparison, such processes are highly resistant to noise and easily adapt to variations in the environment. The resulting behavior also allows reducing the space of evoked potentialities, concentrating on specific features which are contextually relevant [18]. For instance when walking in a crowd or driving in dense traffic, human beings do
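
The texture combination of Fig. 2 can be pictured with a small CPU-side sketch in plain Java. It is only an illustration under stated assumptions, not the library's API: the names PixelShaderSketch, PixelKernel, render and texel are hypothetical, textures are modelled as float[][] grayscale arrays, borders use clamp-to-edge addressing, and the 3x3 center-surround kernel is a crude stand-in rather than a model of real ganglion cells. A GPU would run the kernel for all pixels in parallel, whereas the loops below run sequentially. The sketch targets a recent JDK (it uses lambdas), not Java 1.6.

```java
/**
 * Hypothetical CPU stand-in for the pixel shader of Fig. 2.
 * All names here are illustrative assumptions, not the library's API.
 */
final class PixelShaderSketch {

    /** A per-pixel program: reads the input textures, returns one output value. */
    interface PixelKernel {
        float shade(float[][][] inputs, int x, int y, float[] params);
    }

    /** Runs the kernel once per output pixel (a GPU would do this in parallel). */
    static float[][] render(PixelKernel kernel, float[][][] inputs, float[] params) {
        int height = inputs[0].length;
        int width = inputs[0][0].length;
        float[][] output = new float[height][width];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                output[y][x] = kernel.shade(inputs, x, y, params);
            }
        }
        return output;
    }

    /** Reads a texel with clamp-to-edge addressing (assumed border policy). */
    static float texel(float[][] tex, int x, int y) {
        int cy = Math.max(0, Math.min(tex.length - 1, y));
        int cx = Math.max(0, Math.min(tex[0].length - 1, x));
        return tex[cy][cx];
    }

    /** Spatial kernel: center value minus the 3x3 neighbourhood mean, scaled by params[0]. */
    static final PixelKernel CENTER_SURROUND = (inputs, x, y, params) -> {
        float[][] tex = inputs[0];
        float sum = 0f;
        for (int dy = -1; dy <= 1; dy++) {
            for (int dx = -1; dx <= 1; dx++) {
                sum += texel(tex, x + dx, y + dy);
            }
        }
        return params[0] * (texel(tex, x, y) - sum / 9f);
    };

    /** Temporal kernel combining two textures: absolute difference between two frames. */
    static final PixelKernel FRAME_DIFFERENCE = (inputs, x, y, params) ->
            Math.abs(texel(inputs[0], x, y) - texel(inputs[1], x, y));
}
```

Because render returns an ordinary array, its result can be passed as an input texture to a later call, which mirrors the way intermediate outputs are kept as textures and processed further.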