A Parallel Camera Image Signal Processor for SIMD Architecture Seung-Hyun Choi1, Junguk Cho2, Yong-Min Tai2 and Seong-Won Lee1*

Choi et al. EURASIP Journal on Image and Video Processing (2016) 2016:29 EURASIP Journal on Image DOI 10.1186/s13640-016-0137-2 and Video Processing RESEARCH Open Access A parallel camera image signal processor for SIMD architecture Seung-Hyun Choi1, Junguk Cho2, Yong-Min Tai2 and Seong-Won Lee1* Abstract An image signal processor (ISP) for a camera image sensor consists of many complicated functions; in this paper, a full chain of the ISP functions for smart devices is presented. Each function in the proposed ISP full chain is designed to handle high-quality images. Every function in the chain is fully converted to a fixed-point arithmetic, and a special function is not used for easy porting to a Samsung Reconfigurable Processor (SRP). Several parallelizing optimization techniques are applied to the proposed ISP full chain for real-time operation on a given 600-MHz reconfigurable processor. To verify the performance of the proposed ISP full chain, a series of tests was performed, and all of the measured values satisfy the quality and performance requirements. Keywords: CMOS image sensor, Image signal processor, Reconfigurable processor, Parallel processing optimization 1 Introduction performance at the expense of scalability and flexibility, Image sensors are used in numerous types of image whereas the implementation of an ISP on a general- acquisition devices such as digital cameras, camcorders, purpose processor can be appropriate not only for the and CCTV cameras. Recently, their application region high image quality of complicated algorithms, but also has broadened to include smart devices, and the acquired for sound scalability and flexibility; however, the imple- images are not merely for storage but also for interaction mentation cost of the latter is high due to the large com- between a human and a computer. To satisfy the many putational amount, and a high-performance platform goals of image sensors, the role of image enhancement is such as a desktop PC is necessary. The high processing more important than ever before. performance and low power consumption of a parallel- An image signal processor (ISP) is one of the non-op- computing processor are accompanied by scalability and tical devices that enhance the image quality of captured flexibility for software implementation. The implemen- raw images and consists of several image processing tation of an ISP algorithm on a parallel-computing pro- algorithms including demosaicing, denoising, and white cessor, however, requires further optimization for the balancing, as well as other image enhancement algo- utilization of multiple processing elements in parallel. rithms. The latest ISP algorithms that include iterations The conventional parallel-ISP-optimization methodology with adaptive selections according to the image charac- requires the division of the algorithm into data processing teristics produce an excellent image quality. The high parts and control processing parts first, followed by their image quality costs vast amount of calculation, however, operation in parallel because of the adaptivity of the and also require complicated adaptive routines that ISP algorithm. Very Long Instruction Words (VLIW) cannot be executed in parallel. architecture can therefore be an easy choice for ISP im- An ISP can be implemented on a dedicated hardware, plementation, even though Single Instruction Multiple a general-purpose processor, or a parallel-computing Data (SIMD) architecture can exploit a greater extent processor. A dedicated hardware implementation, of parallelism. however, shows a high image quality and processing The ISP full chain that is suitable for parallel processing is proposed in this paper, and the chain is implemented through an optimization process for SIMD processor archi- * Correspondence: [email protected] 1Department of Computer Engineering, Kwangwoon University, 20, tecture to achieve both a high image quality and perform- Kwangwoon-ro, Nowon-gu, Seoul, Republic of Korea ance goals. The proposed ISP full chain is shown in Fig. 1. Full list of author information is available at the end of the article © 2016 The Author(s). Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. Choi et al. EURASIP Journal on Image and Video Processing (2016) 2016:29 Page 2 of 14 Color White Balance interpolation (GWA) (Modified 10 bits 10 bits AHD) 10 bits Bayer pattern Bayer pattern Red, Green, Blue 8 bits Color Noise Color Y Conversion Reduction Correction (RGB to YCoCg) (Modified BF) 8 bits Co, Cg Detail Color Gamma Enhancement Conversion Correction (Modified LTI) (YCoCg to RGB) 8 bits Red, Green, Blue Fig. 1 Proposed ISP full chain In Fig. 1, GWA is Gray World Assumption, AHD is the parallel operations in the ISP algorithms, the proposed Adaptive Homogeneity-Directed Demosaicing, BF is Bi- ISP algorithm can take advantage of the parallel perform- lateral Filter, AC is Auto Contrast, and LTI is Luminance ance of a SIMD processor while maintaining an image Transient Improvement. quality that can pass the commercial image quality test of The way that the high-quality images are processed by Skype [31]. The proposed ISP can handle the resolution of all of the algorithms that are present in the proposed full HD video (1920 × 1080, 30 frames per second) on a ISP chain means that there are no iterations in the algo- 600-MHz SRP that is suitable for smart devices. rithm to reduce the execution time of the real-time This paper comprises the following: Section 2 describes budget [1]. While the basic idea of the algorithm is the existing research; Section 3 describes the implemen- maintained, the operations in the algorithm have been tation of the proposed ISP full chain; Section 4 describes simplified for easy parallelization on the SIMD architec- the performance verification process and the results of ture; in addition, heavy memory accesses and excessive the proposed ISP full chain; and the conclusion is computational overheads are reduced by limiting the op- presented in Section 5. erational ranges. Each complicated special operation is replaced by a simple operation that performs a similar 2 Background research function and the result was verified by experiments. 2.1 Algorithms of the ISP full chain The proposed parallel ISP algorithm is targeted to run The functions of the ISP full chain mainly support re- on the Samsung Reconfigurable Processor (SRP) [2–7] covering non-existing pixels, noise reduction, and image that can be configured as an SIMD processor. Numerous enhancement. The proposed ISP full chain consists of high-quality image processing algorithms form the basis white balancing, demosaicing, color correction, color of each of the functional components of the proposed space conversion, denoising, detail enhancement, and ISP full chain [8–30]. By increasing the homogeneity of gamma correction. Choi et al. EURASIP Journal on Image and Video Processing (2016) 2016:29 Page 3 of 14 The color images that enter through an image sensor neighboring pixels that will also be generated. The can show colors that are different to those that are seen homogeneity map is defined by Eq. (2), as follows: by the naked eye; to correct this, the White Balance Bx; y ; δ ∩L x; y ; ε ∩C x; y ; ε (WB) process can be used. The WB algorithm GWA [8, 9] ; ; δ; ε ; ε ðÞðÞ f ðÞðÞL f ðÞðÞC Hf ðÞ¼ðÞx y L C ; ; δ assumes that the average of the image is gray; similarly, jjBxðÞðÞy the white-patch Retinex (WR) algorithm [10] assumes that ð2Þ the maximum-intensity pixel is white. Since these as- BxðÞ¼ðÞ; y ; δ fgðp∈Xjd ðÞðÞx; y ; p ≤δ 3Þ sumptions can be statistically false, Iterative White Balan- X cing (IWB) [11] iteratively refines the white pixels while Lf ðÞ¼ðÞx; y ; εL fgp∈XjdLðÞfxðÞ; y ; fpðÞ≤εL ð4Þ illuminant voting [12] checks the lighting conditions. The ; ; ε ∈ ; ; ≤ε GWA is chosen for the proposed ISP, since it allows for Cf ðÞ¼ðÞx y C fgp XjdCðÞfxðÞy fpðÞ C ð5Þ an optimal parallelization during implementation that is where B is a set of the δ distance from (x, y) ∈ X; X is a due to a relatively structured computation compared with set of 2D pixel positions; B is defined by Eq. (3); Lf and the existing algorithms, as follows: Cf are in the neighborhood that is established by the dis- !tance of the luminance and color in the CIELab color Xn Xn Xn Xn Xn Xn Xn Xn ; ; = space and are defined by Eqs. (4) and (5), respectively; E CWBðÞ¼x y CxðÞy R þ G þ B 3 C δ ε ε ∈ x y x y x y x y is a set of tolerance values and , L, C E; and dL and ð1Þ dC are distance functions, where luminance and the ab plane in the CIELab color space are used. A detailed implementation of AHD is introduced in Hirakawa and where C represents one of R, G, and B and CWB repre- Parks [15]. sents the color value after white balancing. Inevitably, the acquired images comprise a variety of After the WB process, demosaicing is an algorithm for noises due to the characteristics of the sensor and con- the production of full RGB channels, which is achieved verter circuits that are used—especially with the low by the interpolation of the color pixels that are lacking light of an indoor environment. To remove these noises in image sensor-captured images. Many algorithms effectively, highly adaptive noise reduction methods including heuristic methods, directional interpolations, such as the Bilateral Filter (BF) [18–21] or a 3D noise frequency domain approaches, wavelet-based methods, reduction filter [22, 23] can be used.

Load more