applied sciences

Article

Sampling Based on Kalman Filter for Shape from Focus in the Presence of Noise
Hoon-Seok Jang 1,†, Mannan Saeed Muhammad 2,†, Guhnoo Yun 3 and Dong Hwan Kim 3,*
1 Center for Imaging Media Research, Korea Institute of Science and Technology, Seoul 02792, Korea
2 College of Information and Communication, Natural Sciences Campus, Sungkyunkwan University, Suwon 16419, Korea
3 Center for Intelligent and Interactive Robotics, Korea Institute of Science and Technology, Seoul 02792, Korea
* Correspondence: [email protected]; Tel.: +82-2-958-6719
† These authors contributed equally as first authors to this work.

Received: 10 July 2019; Accepted: 7 August 2019; Published: 9 August 2019
Abstract: Recovering the three-dimensional (3D) shape of an object from two-dimensional (2D) information is one of the major domains of computer vision applications. Shape from Focus (SFF) is a passive optical technique that reconstructs the 3D shape of an object from 2D images taken with different focus settings. When a 2D image sequence is obtained with a constant step size in SFF, mechanical vibrations, referred to as jitter noise, occur at each step. Since jitter noise changes the focus values of the 2D images, it causes erroneous recovery of the 3D shape. In this paper, a new filtering method for estimating optimal image positions is proposed. First, jitter noise is modeled as a Gaussian or speckle function; second, the focus curves acquired by one of the focus measure operators are modeled as a quadratic function for application of the filter. Finally, a Kalman filter is designed and applied as the proposed method for removing jitter noise. The proposed method is evaluated using image sequences of synthetic and real objects. The performance is assessed through various metrics to show the effectiveness of the proposed method in terms of reconstruction accuracy and computational complexity. Root Mean Square Error (RMSE), correlation, Peak Signal-to-Noise Ratio (PSNR), and computational time of the proposed method are improved on average by about 48%, 11%, 15%, and 5691%, respectively, compared with conventional filtering methods.
Keywords: shape from focus (SFF); jitter noise; focus curve; Kalman filter
1. Introduction

Inferring the three-dimensional (3D) shape of an object from two-dimensional (2D) images is a fundamental problem in computer vision applications. Many 3D shape recovery techniques have been proposed in the literature [1–5]. These methods can be divided into two categories based on the optical reflective model. The first category includes active techniques, which use projected light rays. The second consists of passive techniques, which utilize reflected light rays without projection. The passive methods can further be classified into Shape from X, where X denotes the cue used to reconstruct the 3D shape, such as Stereo [6], Texture [7], Motion [8], Defocus [9], and Focus [10]. Shape from Focus (SFF) is a passive optical method that utilizes a series of 2D images with different focus levels to estimate the 3D information of an object [11]. In SFF, a focus measure is applied to each pixel of the image sequence to evaluate the focus quantity at every point. The best focused position is acquired by maximizing the focus measure values along the optical axis. Many focus measures have been reported in the literature [12–16]. The initial depth map, obtained through any of the focus measure operators, has the problem of information loss between consecutive frames
Appl. Sci. 2019, 9, 3276; doi:10.3390/app9163276 www.mdpi.com/journal/applsci
due to the discreteness of the predetermined sampling step size. To solve this problem, a refined depth map is acquired using approximation techniques, as reported in the literature [17–22]. As an important issue of SFF, when images are obtained by translating the object plane with a constant step size, mechanical vibration, referred to as jitter noise, occurs at each step, as shown in Figure 1 [21].
Figure 1. Image acquisition for Shape from Focus.

Since this noise changes the focus values of images by oscillating along the optical axis, the accuracy of 3D shape recovery is considerably degraded. Unlike typical image noise [23,24], this noise is not detectable by simply observing the images.

Many filtering methods for removing the jitter noise have been reported [25–27]. In [25,26], Kalman and Bayes filtering methods for removing Gaussian jitter noise were proposed, respectively. In [27], a modified Kalman filtering method for removing Lévy noise was presented.

In this paper, a new filtering method for removing the jitter noise is proposed as an extended version of [25]. First, the jitter noise is modeled as a Gaussian or speckle function to reflect more types of noise that can occur in SFF. At the second stage, the focus curves acquired by one of the focus measure operators are modeled as Gaussian functions for application of the filter and a clearer performance comparison of various filters. Finally, a Kalman filter is designed and applied as the proposed method. The Kalman filter is a recursive filter that tracks the state of a linear dynamic system containing noise, and is used in many fields such as computer vision, robotics, and radar [28–33]. In many cases, the algorithm is based on measurements made over time; more precise results can be expected than from using only the measurement at the current moment. As the filter recursively processes input data, including noise, optimal statistical prediction of the current state can be performed. The proposed method is evaluated using image sequences of synthetic and real objects. The performance is analyzed through various metrics to show its effectiveness in terms of reconstruction accuracy and computational complexity. RMSE, correlation, PSNR, and computational time of the proposed method are improved on average by about 48%, 11%, 15%, and 5691%, respectively, compared with conventional filtering methods. In the remainder of this paper, Section 2 presents the concept of SFF and a summary of previously proposed focus measures as background. Sections 3 and 4 provide the modeling of jitter noise and focus curves, respectively. Section 5 explains the Kalman filter as the proposed method in detail. Experimental results and discussion are presented in Section 6. Finally, Section 7 concludes this paper.
2. Related Work
2.1. Shape from Focus

In SFF methods, images with different focus levels (such that some parts are well focused and the rest are defocused with some blur) are obtained by translating the object plane at a predetermined step size along the optical axis [11]. By applying a focus measure, the best focused frame for each object point is acquired to find the depth of the object with an unknown surface. The distance of the corresponding object point is computed by using the camera parameters for the frame, and utilizing the lens formula as follows:
\[ \frac{1}{f} = \frac{1}{u} + \frac{1}{v} \tag{1} \]

where, $f$ is the focal length, and $u$ and $v$ are the distances of the object and image from the lens, respectively. Figure 2 shows the image formation in the optical lens. The object point at the distance $u$ is focused to the image point at the distance $v$.
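As a quick illustration (not part of the original paper), Equation (1) can be solved for the image distance $v$ given $f$ and $u$; the function name and example values below are arbitrary:

```python
def image_distance(f, u):
    """Image distance v from the thin-lens formula 1/f = 1/u + 1/v (Equation (1)).

    f: focal length, u: object distance (same units); u must differ from f,
    otherwise the rays focus at infinity.
    """
    return 1.0 / (1.0 / f - 1.0 / u)

# An object 100 mm in front of a 50 mm lens focuses 100 mm behind the lens.
v = image_distance(50.0, 100.0)  # -> 100.0
```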
Figure 2. Image formation in optical lens.

2.2. Focus Measures

A focus measure operator calculates the focus quality of each pixel in the image sequence, and is evaluated locally. As the image sharpness increases, the value of the focus measure increases. When the image sharpness is maximum, the best focused image is attained. Some of the popular gradient-based, statistics-based, and Laplacian-based operators are briefly given in [12].

First, there are Modified Laplacian (ML) and Sum of Modified Laplacian (SML) as Laplacian-based operators. When the Laplacian is used in textured images, the $x$ and $y$ components of the Laplacian operator may cancel out and provide no response. ML is calculated by adding the squared second derivatives for each pixel of the image $I$ as:

\[ F_{ML}(x, y) = \left( \frac{\partial^2 I(x, y)}{\partial x^2} \right)^2 + \left( \frac{\partial^2 I(x, y)}{\partial y^2} \right)^2 \tag{2} \]

If the image has rich textures with high variability at each pixel, the focus measure can be evaluated for each pixel. In order to improve robustness for weak-textured images, SML is computed by adding the ML values in a $W \times W$ window as:

\[ F_{SML}(i, j) = \sum_{x \in W} \sum_{y \in W} \left[ \left( \frac{\partial^2 I(x, y)}{\partial x^2} \right)^2 + \left( \frac{\partial^2 I(x, y)}{\partial y^2} \right)^2 \right] \tag{3} \]
where, $i$ and $j$ are the $x$ and $y$ coordinates of the center pixel in a $W \times W$ window, respectively.

Next, there is Tenenbaum (TEN) as a gradient-based operator. TEN is calculated by adding the squared responses of the horizontal and vertical Sobel operators. For robustness, it is also computed by adding the TEN values in a $W \times W$ window as:

\[ F_{TEN}(i, j) = \sum_{x \in W} \sum_{y \in W} \left[ (G_x(x, y))^2 + (G_y(x, y))^2 \right] \tag{4} \]

where, $G_x(x, y)$ and $G_y(x, y)$ are images acquired through convolution with the horizontal and vertical Sobel operators, respectively.

Finally, there is Gray-Level Variance (GLV) as a statistics-based operator. It has been proposed on the basis of the idea that the variance of gray levels in a sharp image is higher than in a blurred image. GLV for a central pixel in a $W \times W$ window is calculated as:

\[ F_{GLV}(i, j) = \frac{1}{N^2} \sum_{x \in W} \sum_{y \in W} (I(x, y) - \mu)^2 \tag{5} \]

where, $\mu$ is the mean of the gray values in a $W \times W$ window.
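The three operators above can be sketched with NumPy as follows. This is an illustrative implementation, not the authors' code: finite-difference second derivatives, 3×3 Sobel kernels, and edge padding at image borders are simplifying assumptions.

```python
import numpy as np

def _window_sum(a, w):
    """Sum values of `a` over a w x w window around each pixel (edge-padded)."""
    pad = w // 2
    p = np.pad(a, pad, mode="edge")
    out = np.zeros_like(a)
    for dy in range(w):
        for dx in range(w):
            out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def sml(img, w=3):
    """Sum of Modified Laplacian, Equations (2)-(3), via finite differences."""
    img = np.asarray(img, dtype=float)
    dxx = np.zeros_like(img)
    dyy = np.zeros_like(img)
    dxx[:, 1:-1] = img[:, 2:] - 2.0 * img[:, 1:-1] + img[:, :-2]
    dyy[1:-1, :] = img[2:, :] - 2.0 * img[1:-1, :] + img[:-2, :]
    return _window_sum(dxx**2 + dyy**2, w)  # squared terms as in Eq. (2)

def ten(img, w=3):
    """Tenenbaum focus measure, Equation (4), using 3x3 Sobel responses."""
    img = np.asarray(img, dtype=float)
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    p = np.pad(img, 1, mode="edge")
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    for dy in range(3):
        for dx in range(3):
            patch = p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
            gx += kx[dy, dx] * patch
            gy += kx[dx, dy] * patch  # vertical Sobel is the transpose
    return _window_sum(gx**2 + gy**2, w)

def glv(img, w=3):
    """Gray-Level Variance, Equation (5): local variance in a w x w window."""
    img = np.asarray(img, dtype=float)
    n = float(w * w)
    mean = _window_sum(img, w) / n
    return _window_sum(img**2, w) / n - mean**2
```

For a flat (textureless) patch all three measures respond with zero, while sharp edges produce large values, which is the behavior the depth-map step relies on.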
3. Noise Modeling

When a sequence of 2D images is obtained by translating the object at a constant step size along the optical axis, mechanical vibrations, referred to as jitter noise, occur at each step. In this manuscript, two probability density functions are used for modeling the jitter noise. First, the jitter noise is modeled as a Gaussian function with mean $\mu_n$ and standard deviation $\sigma_n$, as shown in Figure 3.
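As a sketch (not from the paper), the Gaussian jitter model can be simulated by perturbing ideal, equally spaced frame positions; the step size, units, and NumPy-based sampling below are illustrative assumptions, with $\sigma_n$ chosen inside the bound stated in the text:

```python
import numpy as np

def jittered_positions(n_frames, step=20.0, sigma_n=10.0, seed=0):
    """Simulate frame positions along the optical axis with Gaussian jitter.

    Each ideal position mu_n = k * step is perturbed by zero-mean Gaussian
    noise with standard deviation sigma_n (micrometers here; sigma_n <= 10 um
    follows the bound used in this manuscript).
    """
    rng = np.random.default_rng(seed)
    ideal = step * np.arange(n_frames)           # positions without jitter
    jitter = rng.normal(0.0, sigma_n, n_frames)  # per-step mechanical noise
    return ideal + jitter

positions = jittered_positions(30)
```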
Figure 3. Noise modeling through Gaussian function.

$\mu_n$ represents the position of each image frame without the jitter noise, and $\sigma_n$ represents the amount of jitter noise occurring in each image frame. $\sigma_n$ is determined by checking the depth of field and the corresponding image position. The depth of field is affected by magnification and other factors. $\sigma_n \leq 10\,\mu\mathrm{m}$ is selected through repeated experiments with the real objects used in this manuscript.

Second, the jitter noise is modeled as a speckle function as follows [34,35]:
\[ f(\zeta) = \frac{1}{2\sigma_n^2} \times e^{-\frac{\zeta}{2\sigma_n^2}} \tag{6} \]

where, $\zeta$ is the amount of jitter noise before or after filtering.

4. Focus Curve Modeling

In order to filter out jitter noise, the focus curve obtained by one of the focus measure operators is modeled by Gaussian approximation with mean $z_f$ and standard deviation $\sigma_f$ [11]. This focus curve modeling is shown in Figure 4.
Figure 4. Gaussian fitting of the focus curve.

The related equation for this method is given as:

\[ F(z) = F_p \times e^{-\frac{1}{2}\left(\frac{z - z_f}{\sigma_f}\right)^2} \tag{7} \]

where, $z$ is the position of each image frame, $F(z)$ is the focus value at $z$, $z_f$ is the best-focused position of the object point, $\sigma_f$ is the standard deviation of the approximated focus curve (by Gaussian function), and $F_p$ is the amplitude of the focus curve. Taking the natural logarithm of (7), (8) is obtained:

\[ \ln(F(z)) = \ln F_p - \frac{1}{2}\left(\frac{z - z_f}{\sigma_f}\right)^2 \tag{8} \]

Using (8), the initial best-focused position $z_i$ obtained through one of the focus measure operators, the positions below and above the initial best-focused position $z_{i-1}$ and $z_{i+1}$, and their corresponding focus values $F_i$, $F_{i-1}$ and $F_{i+1}$, (9) and (10) are obtained:

\[ \ln(F_i) - \ln(F_{i-1}) = -\frac{(z_i - z_f)^2 - (z_{i-1} - z_f)^2}{2\sigma_f^2} \tag{9} \]
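A minimal sketch of the three-point Gaussian interpolation implied by Equations (7)–(9): fitting the parabola of Equation (8) through the log focus values around the initial maximum yields a closed form for $z_f$. The function name and synthetic focus curve are illustrative, and equally spaced frames with positive focus values are assumed:

```python
import numpy as np

def refine_peak(z, F, i):
    """Refine the best-focused position by Gaussian interpolation.

    Fits the Gaussian focus-curve model of Equation (7) through the three
    samples (z[i-1], z[i], z[i+1]) around the initial maximum index i, using
    the log-linear relation of Equation (8), and returns the refined z_f.
    """
    a, b, c = np.log(F[i - 1]), np.log(F[i]), np.log(F[i + 1])
    d = z[i] - z[i - 1]  # frame spacing (assumed uniform)
    # Vertex of the parabola through (−d, a), (0, b), (d, c) in log space.
    delta = d * (a - c) / (2.0 * (a - 2.0 * b + c))
    return z[i] + delta

# Sanity check on a synthetic Gaussian focus curve peaked at z_f = 12.5,
# sampled every 5 units; the interpolation recovers the true peak.
z = np.arange(0.0, 25.0, 5.0)
F = np.exp(-0.5 * ((z - 12.5) / 6.0) ** 2)
zf = refine_peak(z, F, int(np.argmax(F)))  # -> 12.5
```

Because the log of a Gaussian is exactly a parabola, the three-point fit is exact for the model of Equation (7); on real focus curves it serves as a sub-frame refinement between the discrete sampling positions.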