
Introduction to the pinhole camera model

Lesson given by Sébastien Piérard in the course “Introduction aux techniques audio et vidéo” (ULg, Pr. J.J. Embrechts)

INTELSIG, Montefiore Institute, University of Liège, Belgium

October 30, 2013

1 / 21 Optics : light travels along straight line segments

Light travels along straight lines if one assumes a single material (air, glass, etc.) and a single frequency. Otherwise . . .

2 / 21 Optics : behavior of a lens

Under Gauss’ assumptions (paraxial rays), all light rays coming from the same direction converge to a single point on the focal plane.

3 / 21 Optics : an

4 / 21 Optics : a pinhole

A pinhole camera is a camera with a very small aperture. The lens becomes completely useless: the camera is just a small hole. Long exposure times are needed due to the limited amount of light received.

5 / 21 What is a camera ?

A camera is a function:

$$\text{point in the 3D world} \;\longrightarrow\; \text{point in the image} \;:\; (x\ y\ z) \;\longrightarrow\; (u\ v)$$

which decomposes into three successive changes of coordinates:

$$(x\ y\ z) \;\longrightarrow\; (x\ y\ z)^{(c)} \;\longrightarrow\; (u\ v)^{(f)} \;\longrightarrow\; (u\ v)$$

- (x y z): the world 3D Cartesian coordinate system;
- (x y z)^(c): the 3D Cartesian coordinate system located at the camera’s optical center (the hole), with the z^(c) axis along its optical axis and the y^(c) axis pointing upwards;
- (u v)^(f): the 2D Cartesian coordinate system located in the focal plane;
- (u v): the 2D coordinate system of the screen or image.

6 / 21 (x y z) → (x y z)^(c)

[Figure: the world axes (x, y, z) and the camera axes (x^(c), y^(c), z^(c)), related by a rotation and a translation t.]

 (c)     x x tx  (c)     y  = R3×3 y + ty  (c) z z tz 7 / 21 (x y z) → (x y z)(c)

 (c)     x x tx  (c)     y  = R3×3 y + ty  (c) z z tz R is a matrix that gives the rotation from the world coordinate system to the camera coordinate system. The columns of R are the base vectors of the world coordinate system expressed in the camera coordinate system. In the following, we will assume that the two coordinate systems are orthonormal. In this case, we have RT R = I ⇔ R−1 = RT .

8 / 21 (x y z)^(c) → (u v)^(f)

[Figure: perspective projection geometry; the optical center, the focal length f between the center and the image plane, the camera axes (x^(c), y^(c), z^(c)) and the focal-plane axes (u^(f), v^(f)).]

9 / 21 (x y z)^(c) → (u v)^(f)

We suppose that the image plane is orthogonal to the optical axis. We have:

$$u^{(f)} = \frac{f\, x^{(c)}}{z^{(c)}}, \qquad v^{(f)} = \frac{f\, y^{(c)}}{z^{(c)}}$$
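A minimal sketch of this projection onto the focal plane (NumPy assumed; the focal length and the point are arbitrary example values, and camera_to_focal_plane is a name chosen for the sketch):

```python
import numpy as np

f = 0.035  # focal length in metres (arbitrary example value)

def camera_to_focal_plane(p_cam):
    """(x y z)^(c) -> (u v)^(f): perspective division by z^(c)."""
    x_c, y_c, z_c = p_cam
    return np.array([f * x_c / z_c, f * y_c / z_c])

print(camera_to_focal_plane(np.array([0.1, 0.05, 2.0])))
```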

10 / 21 (u v)^(f) → (u v)

[Figure: the image axes (u, v) overlaid on the focal-plane axes (u^(f), v^(f)): origin offset (u_0, v_0), scale factors 1/k_u and 1/k_v, and angle θ between the u and v axes.]

11 / 21 (u v)^(f) → (u v)

We have:

$$u = \left(u^{(f)} - \frac{v^{(f)}}{\tan\theta}\right) k_u + u_0, \qquad v = v^{(f)}\, k_v + v_0$$

The parameters k_u and k_v are scaling factors, and (u_0, v_0) are the coordinates of the point where the optical axis crosses the image plane. We set $s_{uv} = -\frac{k_u}{\tan\theta}$ and obtain:

     (f ) su ku suv u0 su      (f ) sv =  0 kv v0 sv  s 0 0 1 s

Often, the grid of photosensitive cells can be considered nearly rectangular. The parameter s_uv is then neglected and set to 0.
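In matrix form, this last change of coordinates can be sketched as follows (NumPy assumed; the values of k_u, k_v, u_0, v_0 and of the focal-plane point are fictitious, chosen only for the example):

```python
import numpy as np

# Fictitious intrinsic parameters.
k_u, k_v = 12000.0, 12000.0   # scaling factors, in pixels per metre
u0, v0 = 320.0, 240.0         # where the optical axis crosses the image
s_uv = 0.0                    # skew neglected: nearly rectangular grid

K = np.array([[k_u, s_uv, u0],
              [0.0,  k_v, v0],
              [0.0,  0.0, 1.0]])

uv_f = np.array([0.00175, 0.000875])     # a point in the focal plane
uv_h = K @ np.append(uv_f, 1.0)          # homogeneous pixel coordinates
print(uv_h[:2] / uv_h[2])                # (u, v) = (341.0, 250.5)
```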

12 / 21 The complete pinhole camera model

With homogeneous coordinates, the pinhole model can be written as a linear relation:

$$\begin{pmatrix} s\,u \\ s\,v \\ s \end{pmatrix} =
\begin{pmatrix} k_u & s_{uv} & u_0 \\ 0 & k_v & v_0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} R_{3\times3} & \begin{matrix} t_x \\ t_y \\ t_z \end{matrix} \\ 0\;\;0\;\;0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}$$

$$\Leftrightarrow\;
\begin{pmatrix} s\,u \\ s\,v \\ s \end{pmatrix} =
\begin{pmatrix} \alpha_u & S_{uv} & u_0 & 0 \\ 0 & \alpha_v & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} R_{3\times3} & \begin{matrix} t_x \\ t_y \\ t_z \end{matrix} \\ 0\;\;0\;\;0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}$$

$$\Leftrightarrow\;
\begin{pmatrix} s\,u \\ s\,v \\ s \end{pmatrix} =
\begin{pmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}$$

with $\alpha_u = f k_u$, $\alpha_v = f k_v$, and $S_{uv} = f s_{uv}$.
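As an illustration of the complete model, here is a minimal sketch that assembles the 3×4 matrix as K [R | t] and projects a 3D point (NumPy assumed; the values of K, R and t are made up for the example, not calibrated parameters):

```python
import numpy as np

# Made-up intrinsic parameters (alpha_u = f k_u, alpha_v = f k_v, S_uv = 0).
K = np.array([[420.0,   0.0, 320.0],
              [  0.0, 420.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Made-up extrinsic parameters: identity rotation, camera 3 m behind origin.
R = np.eye(3)
t = np.array([0.0, 0.0, 3.0])

M = K @ np.hstack([R, t.reshape(3, 1)])     # the 3x4 pinhole matrix

p_world = np.array([0.5, -0.25, 1.0, 1.0])  # homogeneous 3D point
su, sv, s = M @ p_world                     # (s u, s v, s)
print(su / s, sv / s)                       # pixel coordinates (u, v)
```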

13 / 21 Questions

Question: What is the physical meaning of the homogeneous coordinates?

Think about all the concurrent lines intersecting at the origin, and about the first three elements of the homogeneous coordinates with (x y z) on a sphere . . .

Question: How many degrees of freedom does $M_{3\times4}$ have?

$(m_{31}\ m_{32}\ m_{33})$ is a unit vector, since $(m_{31}\ m_{32}\ m_{33}) = (r_{31}\ r_{32}\ r_{33})$; the 12 entries of $M_{3\times4}$ are therefore subject to one constraint, leaving 11 degrees of freedom.

14 / 21 The calibration step: finding the matrix M3×4

15 / 21 The calibration step: finding the matrix M3×4

x su m m m m  11 12 13 14 y sv = m m m m       21 22 23 24 z s m m m m   31 32 33 34 1 therefore

$$u = \frac{m_{11} x + m_{12} y + m_{13} z + m_{14}}{m_{31} x + m_{32} y + m_{33} z + m_{34}}, \qquad
v = \frac{m_{21} x + m_{22} y + m_{23} z + m_{24}}{m_{31} x + m_{32} y + m_{33} z + m_{34}}$$

and

$$(m_{31} u - m_{11})\,x + (m_{32} u - m_{12})\,y + (m_{33} u - m_{13})\,z + (m_{34} u - m_{14}) = 0$$

$$(m_{31} v - m_{21})\,x + (m_{32} v - m_{22})\,y + (m_{33} v - m_{23})\,z + (m_{34} v - m_{24}) = 0$$

16 / 21 The calibration step: finding the matrix M3×4

$$\begin{pmatrix}
x_1 & y_1 & z_1 & 1 & 0 & 0 & 0 & 0 & -u_1 x_1 & -u_1 y_1 & -u_1 z_1 & -u_1 \\
x_2 & y_2 & z_2 & 1 & 0 & 0 & 0 & 0 & -u_2 x_2 & -u_2 y_2 & -u_2 z_2 & -u_2 \\
\vdots & & & & & & & & & & & \vdots \\
x_n & y_n & z_n & 1 & 0 & 0 & 0 & 0 & -u_n x_n & -u_n y_n & -u_n z_n & -u_n \\
0 & 0 & 0 & 0 & x_1 & y_1 & z_1 & 1 & -v_1 x_1 & -v_1 y_1 & -v_1 z_1 & -v_1 \\
0 & 0 & 0 & 0 & x_2 & y_2 & z_2 & 1 & -v_2 x_2 & -v_2 y_2 & -v_2 z_2 & -v_2 \\
\vdots & & & & & & & & & & & \vdots \\
0 & 0 & 0 & 0 & x_n & y_n & z_n & 1 & -v_n x_n & -v_n y_n & -v_n z_n & -v_n
\end{pmatrix}
\begin{pmatrix} m_{11} \\ m_{12} \\ m_{13} \\ m_{14} \\ m_{21} \\ m_{22} \\ m_{23} \\ m_{24} \\ m_{31} \\ m_{32} \\ m_{33} \\ m_{34} \end{pmatrix} = 0$$
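A sketch of how this 2n × 12 system can be assembled from n correspondences between 3D points and pixels (NumPy assumed; build_calibration_matrix is a name chosen for the sketch, and the row ordering follows the slide: all u-equations first, then all v-equations):

```python
import numpy as np

def build_calibration_matrix(points_3d, points_2d):
    """Assemble the 2n x 12 homogeneous system A m = 0 shown above."""
    n = len(points_3d)
    A = np.zeros((2 * n, 12))
    for i, ((x, y, z), (u, v)) in enumerate(zip(points_3d, points_2d)):
        A[i]     = [x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z, -u]
        A[n + i] = [0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z, -v]
    return A
```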

17 / 21 Questions

Question: What is the minimum value for n?

5.5: each point correspondence gives 2 equations and $M_{3\times4}$ has 11 degrees of freedom, so in practice at least 6 points are needed.

Question: How can we solve the homogeneous system of the previous slide?

Apply an SVD and use the column of the matrix V corresponding to the smallest singular value.

Question: Is there a condition on the set of 3D points used for calibrating the camera?

Yes, they should not be coplanar.
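A sketch of that SVD step (NumPy assumed; it reuses the hypothetical build_calibration_matrix helper sketched above, and the name calibrate is chosen for the sketch):

```python
import numpy as np

def calibrate(points_3d, points_2d):
    """Estimate M (up to scale) from at least 6 non-coplanar points."""
    A = build_calibration_matrix(points_3d, points_2d)
    _, _, Vt = np.linalg.svd(A)
    m = Vt[-1]              # right singular vector associated with the
    return m.reshape(3, 4)  # smallest singular value, reshaped into M
```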

18 / 21 Intrinsic and extrinsic parameters

$$\begin{pmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \end{pmatrix} =
\underbrace{\begin{pmatrix} \alpha_u & S_{uv} & u_0 \\ 0 & \alpha_v & v_0 \\ 0 & 0 & 1 \end{pmatrix}}_{\text{intrinsic parameters}}
\underbrace{\begin{pmatrix} R_{3\times3} & \begin{matrix} t_x \\ t_y \\ t_z \end{matrix} \end{pmatrix}}_{\text{extrinsic parameters}}$$

The decomposition can be achieved via the Gram-Schmidt orthonormalisation process, or via a QR decomposition since

$$\begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{pmatrix} =
\begin{pmatrix} \alpha_u & S_{uv} & u_0 \\ 0 & \alpha_v & v_0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{pmatrix}$$

$$\Leftrightarrow\;
\begin{pmatrix} m_{33} & m_{23} & m_{13} \\ m_{32} & m_{22} & m_{12} \\ m_{31} & m_{21} & m_{11} \end{pmatrix} =
\begin{pmatrix} r_{33} & r_{23} & r_{13} \\ r_{32} & r_{22} & r_{12} \\ r_{31} & r_{21} & r_{11} \end{pmatrix}
\begin{pmatrix} 1 & v_0 & u_0 \\ 0 & \alpha_v & S_{uv} \\ 0 & 0 & \alpha_u \end{pmatrix}$$
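Numerically, factoring the left 3×3 block of M into an upper-triangular matrix times an orthogonal matrix is exactly an RQ decomposition. A sketch assuming NumPy and SciPy's scipy.linalg.rq; decompose is a name chosen for the sketch, and the sign/scale normalisation below is one common convention that glosses over the global sign ambiguity of M:

```python
import numpy as np
from scipy.linalg import rq

def decompose(M):
    """Split M ~ K [R | t] into intrinsic and extrinsic parameters."""
    K, R = rq(M[:, :3])               # K upper triangular, R orthogonal
    # Force alpha_u, alpha_v (and K[2, 2]) to be positive.
    S = np.diag(np.sign(np.diag(K)))
    K, R = K @ S, S @ R
    scale = K[2, 2]                   # M is only estimated up to scale
    K = K / scale
    t = np.linalg.solve(K, M[:, 3] / scale)
    return K, R, t
```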

19 / 21 Questions

Question: What is the minimum number of points to use for recalibrating a camera that has moved?

3, since there are 6 degrees of freedom in the extrinsic parameters and each point gives 2 equations.

Question: What is this?

20 / 21 Bibliography

D. Forsyth and J. Ponce, Computer Vision: A Modern Approach. Prentice Hall, 2003.

R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, 2nd ed. Cambridge University Press, 2004.

Z. Zhang, “A flexible new technique for camera calibration,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 11, pp. 1330–1334, 2000.

21 / 21