<<

Linear calibration of a rotating and zooming camera ∗

Lourdes de Agapito, Richard I. Hartley and Eric Hayman Department of Engineering, Oxford University and G.E. Corporate Research and Development 1 Research Circle, Niskayuna, NY 12309

Abtract apply in case of a minimal assumption of zero-skew. A linear self-calibration method is given for com- These methods applied to moving cameras, and were puting the calibration of a stationary but rotating iterative1. camera. The internal parameters of the camera are al- Recent papers ([1, 9]) have extended calibration ca- lowed to vary from image to image, allowing for zoom- pability by giving algorithms for rotating cameras, al- ing (change of ) and possible variation of lowing changing internal parameters, similar in intent the principal point of the camera. In order for calibra- to the algorithms of [5, 6, 8]. The particular scenario tion to be possible some constraints must be placed on of a rotating and zooming camera is common in prac- the calibration of each image. The method works un- tice. For instance a camera at a sports arena under- der the minimal assumption of zero-skew (rectangular goes this sort of motion as it rotates and zooms to pixels), or the more restrictive but reasonable condi- follow a game. Unlike the algorithm of Hartley ([4]), tions of square pixels, known pixel aspect ratio, and these methods use interation, based on the Levenberg- known principal point. Being linear, the algorithm is Marquardt method to minimize a non-linear cost func- extremely rapid, and avoids the convergence problems tion. The linear methods of ([4]) do not appear to be characteristic of iterative algorithms. immediately extendable to the case of varying internal parameters. 1 Introduction In this paper however it is shown that a simple trick, The subject of self-calibration of a camera has re- first used in a different context in [11] transfers the ba- ceived considerable attention, following the ground- sic calibration equation ((3) below) to one for which breaking paper of Maybank and Faugeras [7]. This a simple linear method applies. The resulting linear idea opens the possibility of calibration of a camera method is extremely simple involving a least-squares in the field, without the aid of jigs, or knowledge of solution of a set of homogeneous equations in 6 un- the position of world-points. A subsequent paper of knowns. Each image in the sequence leads to one to Hartley ([4]) gave a method for the self-calibration of four equations, depending on the amount of assumed a rotating, but stationary camera, for which the the- knowledge of the camera. It turns out that the linear ory of [7] does not apply. The algorithm of [4] had the method, apart from being quicker than the iterative advantage of being linear and hence very simple and methods gives results of comparable quality. rapid, unlike the Maybank-Faugeras method, which was somewhat complex. 2 Rotating cameras The methods of these two papers required the cal- We consider a set of images taken with cameras ibration of the camera to be fixed over a sequence all located at the same point in space, which will be of images – no zooming was allowed. Subsequently taken to be the coordinate origin. As has been shown interest in zooming cameras led to a method for self- in (for instance) [4] one may analyse this situation by calibration of cameras with changing internal param- representing each of the cameras as a 3×3matrixPi.A eters ([5]). These results were strengthened in [6, 8] to 1It may be noted that these papers were all predated by ∗This work was sponsored by DARPA contract F33615-94- the paper [3] that gave a non-iterative algorithmfor two-view C-1549 calibration in the case of known aspect ratio and principal point.

1 point in the i-th image, represented by a homogeneous 2. The square-pixel constraint : For each cam- 3-vector xi corresponds to a ray in space consisting era, s =0andαx = αy. −1 of points of the form λPi xi. Points on this ray are −1 3. The known principal point constraint : For mapped into the j-th image to a point xj = PjPi xi. Denoting the transformation each camera, (x0,y0)=(0, 0).

− If the pixels have a known aspect ratio other than H = P P 1 (1) ij j i 1, or the principal point is at a different know point one sees that the i-th and j-th images are related by other than the origin, then a simple change of image coordinates converts to one of the cases above. a projective transformation Hij . In practice, one may compute these projective transformations between im- Now, a constraint such as the zero-skew constraint is not reflected in any simple way in the entries of ages by finding matching points in the two images and ∗  computing the projective transformation that relates ω = KK . There seems to be no easy way to use the points. At least four matched points are necessary (3) to develop an algorithm to compute the camera for computing the projective transformation between calibration, enforcing constraints of this type. There- two images. Practical methods for computing these fore, in [1, 9], iterative algorithms are proposed, us- transformations are given in [4, 2]. ing Levenberg-Marquardt iteration to find a non-linear The projective transformation may be related to least squares solution, parametrized directly by the en- the calibration matrices of each of the cameras, as tries of the Ki. This may potentially cause problems with lack of convergence, or convergence to local min- follows. Each camera matrix Pi may be decomposed ima, not to mention the greater complexity of coding. into an upper-triangular calibration matrix Ki and a A simple observation, however, leads to a linear al- rotation matrix Ri representing the orientation of the gorithm. Taking the inverse of (3) gives camera : Pi = KiRi. Substituting in (1) one obtains

− − − − −1 1 1 1 Hij ωiH = ωj (5) Hij = KjRjRi Ki = KjRij Ki . (2) ij

 − −1 Since Rij is a rotation, Rij Rij = I. Straight-forward where ωi = Ki Ki . Now, one may verify that with    computation then shows that Hij KiKi Hij = KjKj . K of the form (4), with s =0, This formula is often written as − −1 ω = K K  ∗  ∗ 2 − 2 Hij ωi Hij = ωj (3) 1/αx 0 x0/αx  2 − 2  = 01/αy y0/αy ∗  − 2 − 2 2 2 2 2 where ωi = KiKi is the dual image of the absolute x0/αx y0/αy 1+x0/αx + y0/αy conic ([4]) in the i-th image. This formula then repre- (6) sents the transformation of a conic under a projective transformation. In the case where Ki = Kj,thisfor- so ω represents a conic of the form mula may be used to generate a set of linear equations ∗ − 2 2 − 2 2 in the entries of ωi which may be used to solve for (x x0) /αx +(y y0) /αy +1=0 , ∗ ωi , and subsequently for Ki by Choleski factorization. This is briefly the calibration method described in [4]. the image of the absolute conic. The important points The difficulty with extending this linear solution to to note here are the case of a camera with varying intrinsic parame- ters under minimal assumptions such as zero skew or Proposition 2.1. 1. If s = K12 =0,thenω12 =0. known aspect ratio is that such constraints are not eas- ∗ 2. If s =0and αx = αy,thenω11 = ω22. ily related to the entries of ωi . Consider a calibration matrix   3. If s =0and x0 =0,thenω13 =0. Similarly if αx sx0 y =0then ω =0.   0 23 K = αy y0 (4) 1 3 Generation and solution of equations Three constraints are possible, of which the first two Select one image as being a reference image, and were tested in detail in this paper. let H0j be the homography relating the reference im- age to the j-th image. For j = 0, one has H00 = I. 1. The zero-skew constraint : For each camera, Let the image of the absolute conic in the refer- s =0. ence image be ω0. Since it is symmetric, it may be parametrized by its sixdiagonal and above-diagonal camera motion contains some component of rotation entries, which may be denoted in some designated or- about the pricipal axis (Z-axis)ofthecamera.Ifsome −1 der as (a1,a2,...,a6). Since the homographies H0j Z axis motion is included, however, then results are are known, the entries of ωj may be expressed lin- good. Thus, to obtain good results, one must either early in terms of the entries ai of ω0.Noweachof have some Z-rotations, or else include square-pixel the equation types in Proposition 2.1 gives one linear constraints. It should be realized that this failure equation in the entries of ωj, hence a linear equation mode does not represent a weaknesses in the linear in the entries ai of ω0. For each image each condition algorithm of this paper, but rather arises from a situ- gives one equation, and the set of all equations may ation in which self-calibration is intrinsically unstable. be written as Ea =0,wherea =(a ,a ,...,a ), 1 2 6 X-axis rotation. Consider the case where the and each row of E represents one equation. Given at rotation is about the X axis of the camera and the least five equations one can find a solution up to (non- cameras have zero skew. In addition, suppose for the essential) scale. With more than 5 equations, one finds a least-squares solution in the usual way ([4]), that is present that the coordinate origin is at the principal point, so that the principal point (x ,y )=(0, 0). In by finding a that minimizes ||Ea|| subject to ||a|| =1. 0 0 this case the calibration matrixof each camera is di- In forming these equations, one should include agonal, K =diag(αx,αy, 1), and the transformation equations for the reference image. Consider the zero- −1  matrix H0j = K0RX Kj is of the form skew condition. With H00 = I, the corresponding equation becomes simply a = 0, assuming that a   2 2 100 reprents the (1, 2) entry of ω . This equation should − 0 H 1 =  0 xx be included with the equations arising from the other 0j 0 xx images in the matrix E of all equations. A (possibly in- ferior) alternative would be to parametrize ω in terms where the x values represent non-zero entries, not all of only 5 entries, setting the (1, 2) entry to zero. Then the same. Now (5) may be written as each image other than the reference image gives one equation. The difference is that in this latter case, the − −1 ωj = H0j ω0H0j (7) computed skew will be exactly zero in the reference       image but non-zero (because the equations will not be 100 a1 a2 a3 100       satisfied exactly) in all other images. This approach = 0 xx a2 a4 a5 0 xx . would unreasonably single out the reference image for 0 xx a3 a5 a6 0 xx special treatment. The same remarks apply in the (8) square-pixel and known-principal point cases as well. Since one requires at least 5 equations in the 6 in- From this one easily verifies that the expression for dependent homogeneous entries of ω0,onerequiresa the (1, 2) entry of ωj involves only the entries a2 and total of 5 images to solve in the zero-skew case, 3 im- a3 of ω0. Assuming that a2 = 0 (the skew is zero) and ages in the square pixel case, and only 2 images if one referring to the expression (6) for ω0, one deduces that also knows the principal point. the zero-skew constraint imposes a constraint only on By solving for the vector (a0,a1,...,a6) one deter- x0, allowing one to deduce that x0 = 0. There is no mines the value of ω0. The other ωj may be computed constraint on y0, αx or αy, which may vary freely. from (5). From this one may compute the individual Thus, although y0 = 0 is assumed in this analysis, calibration matrices Kj by inversion and Choleski fac- there is no way this may be deduced using the zero- torization of ωj, or else directly from ωj using (6). skew constraint. 2 The algorithm will fail if the computed value of αx Onemayverifythatthechoiceofcoordinatesys- 2 or αy is negative, which is equivalent to ωj not being tems used to compute the homographies H0j does not positive definite. This rarely occurs in practice except materially alter the constraints imposed on the cali- in unstable situations in which calibration is intrinsi- bration by a rotation about the X-axis, as analyzed cally infeasible, or with inaccurate or incorrect input above. data. Y -axis rotation. In a similar way, if the rotation 4 Unstable configurations is about the Y axis, then the expression for (ωj)12 In the experimental evaluation of these algorithms, involves only a2 and a5, as may be deduced in a similar it was observed that the results were quite unstable manner. Thus, x0, αx and αy may vary freely. If all if only the zero-skew constraint is used, unless the rotations R0j are about the X or Y axis, then there is no way to determine the parameters αx and αy of the shows the results of this experiment. They show that camera. If on the other hand, one includes a square- for typical image noise levels of σ smaller than 0.5 pix- pixel constraints, or rotations about the Z-axis, then els the algorithm performs well. The plot for the prin- it turns out that the stability problems disappear. cipal point shows a large variation when σ =1pixel. However, it is well known that the principal point is Pan- motion. A further important situation a poorly constrained parameter which easily fits its that leads to failure is that of a camera mounted on value to the noise. Note also that the small devia- an alt-azimuth mounting, allowing about a tion of the camera (5◦) represents a very demanding vertical axis, and tilting about a horizontal elevation test for the algorithm. With wider excursions of the axis. In this case, the rotation of the camera with re- camera, better results would be achieved ([4]). spect to a horizonal reference position may be written In figure 1 we also show the performance of the in the form RxRy, (pan followed by tilt). One easily linear algorithms versus the iterative non-linear algo- verifies that this rotation matrixhas the distinguish- rithm described in [1]. The non-linear algorithm was ing property that the (1, 2) element is zero. Mimicking run imposing equivalent constraints of zero skew and the analysis for X-axis rotation above, and mutiplying square pixels on the intrinsic parameters to minimize a out as in (7) reveals that the (1, 2) element of ωj does non-linear cost function using a Levenberg-Marquardt not depend on the element a4 =(ω0)22. According to method. For clarity we have only shown the result (6), this implies that αy is not determined. Further- when the image noise level of σ =0.5pixelswhich more, although element a5 of ω0 may be determined, 2 shows that the results obtained with the linear algo- it represents the quotient y0/αy. Hence y0 depends on rithm are comparable with those from the non-linear αy, and so can not be determined either. iterative method. 5 Experiments 5.2 Real image sequences In this section we present experimental results ob- The image sequences used in our experiments were tained using the linear algorithm on both synthetic taken using a camera with a mounted on and real image sequences. a Yorick stereo head/eye platform [10]. The camera 5.1 Synthetic data was rotated using one of the two independent vergence Experiments were first carried out with synthetic axes to pan the camera, and the common elevation data to evaluate the performance of the linear algo- axis to tilt it. Note therefore that the motion selected rithm using the zero-skew and the square-pixel con- for these experiments falls in the case of the pan-tilt straints described in section 2. The data was created degeneracy described in section 4. The mechanics of to simulate a camera with a zoom lense providing a our head do not permit rotations about the Z axis. total focal length range of 12.5 mm to 35 mm. A cloud This experiment will therefore illustrate how the skew of 250 points was randomly generated within a con- zero constraint does not resolve this ambiguity and the fined spheric space of diameter 2 m lying in front of the square pixels constraint must be imposed. rotating camera at a distance of 5 m. The points were Two image sequences were taken. In the first se- then projected onto each of the image planes arising quence, the focal length of the camera remained fixed, from the different orientations of the camera and the while the pan and the tilt of the camera were varied to location of each image point was then perturbed in a perform a circular trajectory. In the second sequence, random direction by a distance governed by a Gaus- the focal length of the camera was set to increase lin- sian distribution with zero mean and standard devia- early, using the controlled zoom lens, while the camera tion σ measured in pixels. The size of the image planes performed a similar circular motion. The encoders of was 384×288 pixels. The skew of the image axes was the head/eye platform provided ground truth values taken to be zero, the aspect ratio of the image pix- for the pan and tilt angles of the camera which are els equal to one and the principal point was asumed accurate to 0.01 of a degree. The servo control of the to be located at the centre of the image. The camera zoom lens provided ground truth values of the position motion was such that the principal ray described a cir- of the zoom lens for each frame in the image sequence. cular trajectory of radius θ =5◦ measured from the The camera was then calibrated, using an accurately positive Z axis. To avoid the degenerate configuration machined calibration grid and classical calibration al- described at the end of section 4 a small amount of cy- gorithm, to obtain ground truth values for the inter- clorotation about the principal axis was then added to nal parameters at each of the different positions of the each camera position. The focal length of the camera zoom lens. Radial lens distortion was modelled using increased linearly throughout the sequence. Figure 1 a one parameter model and the images were appropri- FOCAL LENGTH - var focal length FOCAL LENGTH - var focal length 5000 5000 4500 sigma 0.1 4500 sigma 0.1 sigma 0.25 sigma 0.25 4000 sigma 0.5 4000 sigma 0.5 3500 LM sigma 0.5 3500 LM sigma 0.5 sigma 0.75 sigma 0.75 3000 sigma 1.0 3000 sigma 1.0 2500 2500 2000 2000 Focal length (pixels) 1500 Focal length (pixels) 1500 1000 1000 0 5 10 15 20 25 30 0 5 10 15 20 25 30 Frame Frame PRINCIPAL POINT - var focal length PRINCIPAL POINT - var focal length

250 sigma 0.1 250 sigma 0.1 sigma 0.25 sigma 0.25 sigma 0.5 sigma 0.5 200 LM sigma 0.5 200 LM sigma 0.5 sigma 0.75 sigma 0.75 150 sigma 1.0 150 sigma 1.0

v_o (pixels) 100 v_o (pixels) 100

50 50

0 0 0 50 100 150 200 250 300 350 0 50 100 150 200 250 300 350 u_o (pixels) u_o (pixels)

ASPECT RATIO- var focal length 1.3 sigma 0.1 1.2 sigma 0.25 sigma 0.5 LM sigma 0.5 1.1 sigma 0.75 sigma 1.0 1 Aspect ratio 0.9

0.8 0 5 10 15 20 25 30 Frame Figure 1: Calibration results with synthetic data in presence of various degrees of image noise, with one run at each noise level. Computed values for the focal length (top), the location of the principal point (middle) and the aspect ratio (bottom) imposing the zero skew (left) and square-pixels (right) constraints. Results obtained using the non-linear iterative method of [1] imposing the zero skew and the square-pixels constraint are shown for σ =0.5. The aspect ratio was set to one by the algorithm when the square-pixels constraint was used Figure 2: Mosaics constructed from the two bookshelf sequences during which the camera panned and tilted while the focal length remained fixed (left) and was varied (right). ately warped to correct for this factor. case of critical rotation sequences for which the cali- The homographies that relate corresponding points bration problem is inherently unstable. This serves as between views were computed in two stages. First, the a warning that the data used does not support a use- inter-image homographies were computed from cor- ful estimate of the cameras’ calibration parameters. It responding corners and second, they were refined by is probable that in these cases a calibration estimate minimizing the global reprojection image error using given by any other algorithm, such as the previous a bundle-adjustment technique [4, 2]. Figure 2 shows iterative algorithms, would be virtually useless. the mosaics constructed by registering both image se- The linear methods do not apply in some cases for quences. which the iterative algorithms may be used, such as When the zero skew constraint was imposed very fixed, but unknown aspect ratio, or fixed, but un- poor results were obtained for the calibration param- known principal point. The common cases of zero eters. The fundamental ambiguity described in sec- skew and known aspect ratio are covered, however. In tion 4 was confirmed by observing that the two small- practice, skew is almost always zero, and the aspect est singular values of the equation matrix E were very ratio is usually known to be one, or is available from close to zero, implying that there is a one parameter a spec-sheet for the camera. In any case the aspect family of possible solutions to the calibration. Impos- ratio is essentially invariant, and could be determined ing the zero skew constraint also failed to provide a off-line. Further experiments (not explained in detail solution when used in the non-linear iterative mini- here) show that the linear algorithm can be used in a mization. linear search over a range of feasible aspect ratios to Figure 3 shows the results obtained with the lin- determine the aspect ratio that gives the best fit to ear algorithm imposing the square-pixels constraint. the data. The results confirm the good performance of the lin- ear method, which in these particular experiments give References better estimates than the iterative algorithm. [1] L. de Agapito, E. Hayman, and I. D. Reid Self- 6Conclusion calibration of a rotating camera withvarying intrinsic parameters. In Proc. British Machine Vision Confer- The key idea of the paper is to focus on the image of ence, pages 105–114, 1998. the absolute conic rather than its dual. This leads to a linear algorithm for the constrained calibration prob- [2] D. Capel and A. Zisserman. Automated mosaicing lem, rather than the iterative algorithms previously withsuper-resolution zoom. In Proceedings of the reported. The linear algorithm is extremely simple IEEE Conference on Computer Vision and Pattern to implement and performs very well compared with Recognition, pages 885–891, June 1998. iterative algorithms, ofter giving better results. The [3] R. I. Hartley Estimation of relative camera positions method fails in the case where the computed image of for uncalibrated cameras. In Proc. European Confer- the absolute conic is not positive-definite. However, ence on Computer Vision, LNCS 588, pages 579–587. this did not occur in our experiments, except in the Springer-Verlag, 1992. FOCAL LENGTH - fixed focal length FOCAL LENGTH - var focal length 950 ground truth 2400 ground truth 900 Linear 2200 Linear 850 L M fixed pp L M fixed pp L M var pp 2000 L M var pp 800 1800 750 1600 700 1400 Focal length (pixels) 650 Focal length (pixels) 1200 600 1000 0 5 10 15 20 25 30 0 5 10 15 20 25 30 Frame Frame PRINCIPAL POINT - fixed focal length PRINCIPAL POINT - var focal length 170 ground truth 250 ground truth 165 Linear Linear 160 L M fixed pp L M fixed pp L M var pp 200 L M var pp 155 150 150

v_o (pixels) v_o (pixels) 100 145 140 50 135 0 160 170 180 190 200 210 0 50 100 150 200 250 300 350 u_o (pixels) u_o (pixels) MOTION - fixed focal length MOTION - var focal length 24 22 ground truth 21 ground truth 22 linear linear LM fix pp 20 LM fix pp 20 LM var pp LM var pp 19 18 18 tilt (deg) pan (deg) 17 16 16 14 15 12 14 -10 -8 -6 -4 -2 0 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 tilt (deg) pan (deg)

Figure 3: Ground truth and computed values for the focal length (top), the location of the principal point (middle) and the motion of the camera (bottom) for the fixed focal length (left) and the variable focal length (right) bookshelf sequences. Results are shown for (i) the linear algorithm imposing the square-pixels constraint (ii) the Levenberg-Marquardt algorithm imposing the square-pixels constraint and (iii) the Levenberg-Marquardt algorithm imposing both the square-pixels and the fixed principal point constraints. For visualization purposes, the motion was represented by plotting pan versus elevation angles. Also for clarity, only the central section of the image is shown in the plot depicting the principal point. Note that the value of the focal length obtained from the linear algorithm is more accurate than the LM algorithm run using the same calibration assumptions (LM-var-pp). However, in both sequences, focal length estimation is improved by using the linear estimate as a starting estimate for iteration fixing the principal point (LM-fixed-pp). The principal point (in reality invariant during the fixed focal length sequence, and varying only slightly during the variable focal length sequence) appears to be better estimated by the linear algorithm. [4] R. I. Hartley. Self-calibration of Stationary Cameras. International Journal of Computer Vision, volume 22, number 1, pages 5–23, February, 1997. [5] A. Heyden and K. Astr¨˚ om. Euclidean reconstruction from image sequences withvarying and unknown fo- cal lengthand principal point. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1997. [6] A. Heyden and K. Astr¨˚ om. Minimal conditions on intrinsic parameters for euclidean reconstruction. In Proc. Asian Conference on Computer Vision,Hong Kong, 1998. [7] S. Maybank and O. Faugeras. A theory of self- calibration of a moving camera. International Journal of Computer Vision, 8(2):123–151, 1992. [8] M. Pollefeys, R. Koch, and L. Van Gool. Self cali- bration and metric reconstruction in spite of varying and unknown internal camera parameters. In Proc. 6th International Conf. on Computer Vision, Bom- bay, pages 90–96, 1998. [9] Yongduek Seo and Ki Sang Hong, Auto-calibration of a Rotating and Zooming Camera, In Proc. of IAPR workshop on Machine Vision Applications (MVA’98), Makuhari, Chiba, Japan, Nov. 17 – 19, 1998. [10] P. M. Sharkey, D. W. Murray, S. Vandevelde, I. D. Reid, and P. F. McLauchlan. A modular head/eye platform for real-time reactive vision. Mechatronics, 3(4):517–535, 1993. [11] A. Zisserman, D. Liebowitz, and M. Armstrong. Re- solving ambiguities in auto-calibration. Philosophi- cal Transactions of the Royal Society of London. Vol 356(1740):1193–1211, 1998.