Microlens Array Grid Estimation, Light Field Decoding, and Calibration
Total Page:16
File Type:pdf, Size:1020Kb
1 Microlens array grid estimation, light field decoding, and calibration Maximilian Schambach and Fernando Puente Leon,´ Senior Member, IEEE Karlsruhe Institute of Technology, Institute of Industrial Information Technology Hertzstr. 16, 76187 Karlsruhe, Germany fschambach, [email protected] Abstract—We quantitatively investigate multiple algorithms lenslet images, while others perform calibration utilizing the for microlens array grid estimation for microlens array-based raw images directly [4]. In either case, this includes multiple light field cameras. Explicitly taking into account natural and non-trivial pre-processing steps, such as the detection of the mechanical vignetting effects, we propose a new method for microlens array grid estimation that outperforms the ones projected microlens (ML) centers and estimation of a regular previously discussed in the literature. To quantify the perfor- grid approximating the centers, alignment of the lenslet image mance of the algorithms, we propose an evaluation pipeline with the sensor, slicing the image into a light field and, in utilizing application-specific ray-traced white images with known the case of hexagonal MLAs, resampling the light field onto a microlens positions. Using a large dataset of synthesized white rectangular grid. These steps have a non-negligible impact on images, we thoroughly compare the performance of the different estimation algorithms. As an example, we apply our results to the quality of the decoded light field and camera calibration. the decoding and calibration of light fields taken with a Lytro Hence, a quantitative evaluation is necessary where possible. Illum camera. We observe that decoding as well as calibration Here, we will focus on the estimation of the MLA grid benefit from a more accurate, vignetting-aware grid estimation, parameters (to which we refer to as pre-calibration), which especially in peripheral subapertures of the light field. is the basis for all decoding and calibration schemes found in the literature. In spite of the importance of the pre-calibration I. INTRODUCTION pipeline, the literature focuses mostly on the camera models and decoding but pays little or no attention to the necessary Computational cameras, that is, cameras utilizing combined details emerging in the pre-calibration, most importantly non optical and digital image processing techniques, have been trivial effects such as mechanical and natural vignetting. While gaining attention both in consumer applications, such as multi- for a correct pre-calibration, the detection of the perspectively lens camera systems in mobile devices, as well as scientific projected ML centers is necessary, all methods proposed and industrial applications, such as light field cameras [1], [16] in the literature rely on estimating the center of each ML or snapshot hyperspectral cameras [2]. Computational imaging image brightness distribution, approximating the orthogonally systems can usually well be described using the so-called 4D projected centers. Due to natural and mechanical vignetting, light field L (u; v; a; b), where λ describes a wavelength, λ,t this results in severe deviations from the true projected centers, t the time, and the coordinates (u; v; a; b) correspond to a in particular in off-center MLs. This is the main scope of this certain parametrization of the spatio-angular dependency of article. In particular, our contributions are as follows: the light field of which there are numerous. For computational cameras, one usually uses the plane-plane parametrization: • We propose a camera model and ray tracer implementa- a light ray inside a camera is uniquely described by the tion to synthesize application-specific white images with intersection points u = (u; v) and a = (a; b) of two parallel known ML centers as reference data. planes, e.g. the main lens plane and the sensor plane. • We propose a new pre-calibration algorithm motivated by In particular, microlens arrays (MLAs) are used in computa- our physical camera model, taking into account natural arXiv:1912.13298v1 [eess.IV] 31 Dec 2019 tional imaging applications allowing for a complex coding (or and mechanical vignetting effects. multiplexing) scheme of the light field onto an imaging sensor. • We present detailed accuracy requirements that the pre- Most prominently, MLAs are used in compact MLA-based calibration pipeline has to fulfill and show that the light field cameras [16], [13], but also in other applications proposed algorithm, in case the of a Lytro Illum camera, such as multi- or hyperspectral imaging [19], [20]. As usual, a fulfills these requirements. We compare our algorithm to certain camera model is then used to calibrate the intrinsic (and different schemes proposed in the literature (which we extrinsic) parameters of the camera to relate the image-side to show fail the accuracy requirements). the object-side light field. The model is then evaluated, e.g. • We evaluate the full light field decoding pipeline for the by using the ray re-projection error. In most light field camera different pre-calibration algorithms for simulated as well models [7], [6], [22], [23], the calibration is performed using as real light field data. image-side light fields that have been decoded from the sensor • We investigate the influence of the quality of the pre- calibration on the full calibration of a Lytro Illum light © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, field camera for different calibration methods. including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers Since the ML grid parameters are application-specific, we or lists, or reuse of any copyrighted component of this work in other works. make the source code for a full evaluation pipeline (image 2 x y a F f p ( 1, 2) (0, 2) (1, 2) (2, 2) − c2 ( 1, 1) (0, 1) (1, 1) (2, 1) h c2 p − 2 c1 ( 1, 0) (0, 0) (1, 0) (2, 0) c1 − p z c0 x c0 dx p c 1 c 1 dy − − p c 2 c − 2 β − w Aperture Lens MLA Sensor 2 (a) (b) Figure 1. Schematic drawings of the used camera model of the unfocused plenoptic camera with exaggerated ML size. (a) A 2D section (y = 0) of the used p camera model. The coordinates c i denote the centers of the ML (which are not explicitly depicted) and their perspective projections c i. (b) The used ± ± MLA model before rotation and tilt. The ML centers cij are depicted in orange with corresponding index labels (i; j). synthesis, ML grid estimation, and decoding) freely available A. Camera model [10] to be used in scientific research for any kind of MLA- The camera model used is depicted in Fig.1. In our model, based application. This includes the release of an open source the camera consists of a main lens and a collection of MLs, Python framework for light field decoding and analysis as well arranged in a hexagonal grid, which may be rotated (not as for applications in hyperspectral imaging. Even though the depicted in the figure) and tilted. All lenses are modeled as thin following presentation relies on a light field camera in the lenses. As is usual in the focused design, f-number matching unfocused design [16], the proposed pre-calibration method is of main lens and MLs is assumed. To simulate irregularities of equally applicable to cameras in the focused design [13]. the grid, we add independent uncorrelated Gaussian noise to The remainder of this paper is organized as follows. We the ideal grid point’s x- and y-coordinates. Natural vignetting introduce the used camera model and chosen camera pa- is implemented by using the cos4 Φ law and the ray’s incident rameters in SectionII. In Section III, we review different angle Φ. Finally, an object-side aperture with variable entrance methods for the estimation of ML grids and formulate precise pupil is placed at distance a to the main lens to account for accuracy requirements which the algorithms ought to fulfill. mechanical vignetting effects. Note, that we do not model Furthermore, we introduce a new estimation method which systematic, non-rigid deformations of the MLA as considered we thoroughly motivate using the physical camera model. All in [18]. We argue that these irregularities should be eradicated estimation methods are then quantitatively evaluated. In the in the manufacturing process of high-quality MLAs as they remaining SectionsIV andV, we investigate the influence of introduce irreducible blur in the light field (on which we will the ML grid estimation accuracy on light field decoding and elaborate in Section II-B). calibration, respectively. The ideal, unrotated, untilted and unshifted ML center coordinates are given by II. CAMERA MODEL AND REFERENCE DATA 0±i ± 1 j mod 2 d 1 The pre-calibration of MLA-based cameras is usually per- 2 x cid = o + ±jd + ; (1) formed using so-called white images (WIs)—images of a ±i±j g @ y A ±i±j 0 white scene, for example taken using an optical diffuser. In order to quantitatively evaluate the performance of the 2 for (i; j) 2 N . Here, dx; dy denote the ideal grid spacing, ML grid estimation algorithms, appropriate reference data is T T og = (og;x; og;y; 0) the grid offset, and = (, , 0) the needed. Of course, real WIs, as for example provided by the 2 grid noise with variance σg . The ideal hexagonal grid is de- Lytro cameras, are unsuited since the actual ML centers are p termined by a single grid spacing d via dx = d, dy = 3 d=2, unknown. Therefore, reference data has to be synthesized. where the ML radius is given by r = d=2. The ideal grid Previously, Hog et al.