Description of 2D and 3D Coordinate Systems and Derivation of their Rotation Matrices

Conventions: In a 3D coordinate system, Xs, Ys, Zs will be used for object coordinates in the scanner coordinate system. This is the coordinate system from which the transformation is made. Xc, Yc, Zc will be used for the object coordinates expressed in the camera coordinate system after they have been scaled by the camera lens. This is the coordinate system to which the scanner coordinates are transformed. The Zc value for the object is the location of the image plane in the camera (the CCD sensor) and is equal to the focal length of the camera, f (at the CCD sensor, Zc = f). Xi, Yi, Zi will be used for the object coordinates after they have been transformed to a coordinate system whose origin is the camera coordinate system origin, but before scaling by the lens occurs. Xi, Yi, Zi are unaffected by the camera lens optics. x, y, z will be used for the translation needed to move the origin of the scanner coordinate system to the origin of the camera coordinate system.

The derivation will first be explained using a 2D example. In a 2D planar coordinate system, Xs and Ys will be used for the coordinate system that corresponds to the scanner coordinates, that is, the system from which the transformation is made. Xc and Yc will be used for the coordinate system that corresponds to the camera coordinates, that is, the system to which the X and Y coordinates are transformed. The terms camera and scanner are used here only to maintain continuity between the 2D and the 3D derivations. A 2D transformation does not apply to an actual transformation between scanner and camera coordinates.

In a 3D coordinate system, Omega (ω) will describe rotation about the X-axis, Phi (φ) will describe rotation about the Y-axis, and Kappa (κ) will describe rotation about the Z-axis. Theta (θ) will describe rotation in a 2D planar coordinate system.

Derivation of 2D transformation

In a 2D planar coordinate system, a counter-clockwise rotation from the scanner coordinates to the camera coordinates can be accomplished with the transformation matrix derived below, assuming that the origins of the two coordinate systems coincide. In a counter-clockwise system, positive rotation is in the counter-clockwise direction and negative rotation is in the clockwise direction.

Counter-clockwise rotation will be used in all transformations. This appears to be the most commonly used convention, although the transformations can be performed equally well with a clockwise convention. Counter-clockwise rotation may be the preferable convention because the cross-product of two vectors defined relative to each other in a counter-clockwise sense points toward the viewer. That is, with the two vectors in the horizontal plane, the cross-product is positive if it points up.

The black coordinate axes in Figure 1 are rotated counter-clockwise to align them with the blue coordinate axes.

Lionel White Original 1/2007 modified 8/27/2008 1

Figure 1. Counterclockwise rotation of coordinate systems, transformation of X, Y to x, y. Point P is transformed from the black coordinate system to the blue coordinate system.

Conversion from one coordinate system to the other is derived below.

Y = Xb + bP

x = oa + ab + bx = X*cosθ + Xb*sinθ + bP*sinθ = X*cosθ + (Xb + bP)*sinθ = X*cosθ + Y*sinθ

y = -cY + od = -aX + od = -X*sinθ + Y*cosθ

The above equations for x and y are expressed below in matrix notation.

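\[
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
\begin{bmatrix} X \\ Y \end{bmatrix}
\]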

If the two coordinate systems have their origins separated by ΔX and ΔY distances, then a translation of the origin of the XY coordinate system must be made to the xy origin by adding the difference between their locations.

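\[
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
\begin{bmatrix} X + \Delta X \\ Y + \Delta Y \end{bmatrix}
\]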

The transformation matrix

\[
M = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
\]

is a 2X2 matrix.
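The 2D transformation above can be sketched in a few lines of Python. This is only an illustration: the function name scanner_to_camera_2d and its argument names are hypothetical, and the offsets are added before rotating because the text speaks of adding the difference between the origins.

```python
import math

def scanner_to_camera_2d(X, Y, theta, dX=0.0, dY=0.0):
    """Express point (X, Y) in axes rotated counter-clockwise by theta (radians).

    dX and dY are the origin offsets. They are added before rotating,
    which is an assumed sign convention taken from the wording of the text.
    """
    Xt, Yt = X + dX, Y + dY
    x = Xt * math.cos(theta) + Yt * math.sin(theta)
    y = -Xt * math.sin(theta) + Yt * math.cos(theta)
    return x, y

# A point on the old X-axis, seen from axes rotated 90 degrees
# counter-clockwise, lies on the negative new y-axis.
x, y = scanner_to_camera_2d(1.0, 0.0, math.pi / 2)
print(round(x, 12), round(y, 12))
```

Note that the matrix rotates the axes, not the point, which is why the point appears to move clockwise in the new system.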

A clockwise rotation from the scanner coordinates to the camera coordinates uses the following transformation matrix. The only difference is that the signs of the sinθ terms are reversed.

\[
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
\begin{bmatrix} X + \Delta X \\ Y + \Delta Y \end{bmatrix}
=
M \begin{bmatrix} X + \Delta X \\ Y + \Delta Y \end{bmatrix}
\]

Derivation of the 3D transformation matrix

In a 3D coordinate system the terms right-hand and left-hand coordinate systems are used. This method of defining a 3D coordinate system has the positive direction of the X-axis aligned with the thumb, the positive direction of the Y-axis aligned with the index finger, and the positive direction of the Z-axis aligned with the middle finger. The thumb and index finger are spread at 90° and aligned with the plane of the palm. The middle finger is bent at 90° to the palm. If the coordinate system aligns with the appropriate fingers of the left hand, it is called a left-hand coordinate system; if it aligns with the fingers of the right hand, it is called a right-hand coordinate system. There is no rotation that will align a left-hand coordinate system with a right-hand coordinate system. However, the problem is solved by simply negating (multiplying by -1) the values of one of the axes of one of the coordinate systems. It makes no difference which system or which axis.

Note: The Riegl Laser Scanner uses a right-hand coordinate system. A camera with the positive direction of the Z-axis exiting the lens toward the object, X being the horizontal dimension with positive to the right, and Y being the vertical dimension with positive being up, uses a left-hand coordinate system. The camera can be converted to a right-hand system by negating the values on the Z-axis.
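Handedness can be checked numerically with the scalar triple product (X × Y) · Z, which is positive for a right-hand system and negative for a left-hand system. The sketch below is illustrative only. It assumes the axis directions are written in a right-hand reference frame with X to the right, Y up, and Z toward the viewer, so that "toward the object" is (0, 0, -1); the function names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))

def handedness(x_axis, y_axis, z_axis):
    """Return 'right' if (x cross y) . z is positive, otherwise 'left'."""
    return "right" if dot(cross(x_axis, y_axis), z_axis) > 0 else "left"

# Camera axes as described in the note: X right, Y up, Z toward the object.
camera = handedness((1, 0, 0), (0, 1, 0), (0, 0, -1))
# The same camera after negating the Z-axis.
camera_negated_z = handedness((1, 0, 0), (0, 1, 0), (0, 0, 1))
print(camera, camera_negated_z)
```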

Positive rotation about an axis is determined by aligning the thumb of the hand (right or left) in the positive direction of the axis and curling the fingers. The direction of the curled fingers is the direction of positive rotation. A right hand rotation has the same matrix form as a counter-clockwise rotation in a 2D coordinate system. A left hand rotation has the same matrix form as a clockwise rotation in a 2D coordinate system.

Having three axes, a 3D coordinate system will require a 3X3 transformation matrix. The matrices to transform the scanner coordinates to the camera coordinates are described below. Right-hand rotation will be used. The equations could as well be derived using a left-hand rotation.

1 0 0    Mx  0 cos sin  Rotation about the camera x-axis 0  sin cos

Lionel White Original 1/2007 modified 8/27/2008 3 cos 0  sin   My   0 1 0  Rotation about the camera y-axis sin 0 cos   cos sin 0   Mz   sin cos 0 Rotation about the camera z-axis  0 0 1
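The three single-axis matrices can be written as small Python functions. This is a minimal sketch using nested lists; the names rot_x, rot_y, rot_z, and apply are hypothetical, and angles are assumed to be in radians.

```python
import math

def rot_x(omega):
    """Rotation about the x-axis by omega."""
    c, s = math.cos(omega), math.sin(omega)
    return [[1, 0, 0], [0, c, s], [0, -s, c]]

def rot_y(phi):
    """Rotation about the y-axis by phi."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, 0, -s], [0, 1, 0], [s, 0, c]]

def rot_z(kappa):
    """Rotation about the z-axis by kappa."""
    c, s = math.cos(kappa), math.sin(kappa)
    return [[c, s, 0], [-s, c, 0], [0, 0, 1]]

def apply(M, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(M[r][k] * v[k] for k in range(3)) for r in range(3)]

# The upper-left 2x2 block of rot_z is the 2D counter-clockwise matrix,
# so a point on the old X-axis lands on the negative new y-axis at 90 degrees.
r = apply(rot_z(math.pi / 2), [1.0, 0.0, 0.0])
print([round(value, 12) for value in r])
```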

Note: In the derivation below, the negation of the Z-axis value is disregarded; the camera and the scanner coordinates are assumed to use the same coordinate system. This will be corrected at the end of the derivation to reflect the reality of the Riegl scanner and the camera coordinate systems.

A single rotation about the X-axis would have the form shown below.

\[
\begin{bmatrix} X_i \\ Y_i \\ Z_i \end{bmatrix}
=
M_x \begin{bmatrix} X_s + x \\ Y_s + y \\ Z_s + z \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\omega & \sin\omega \\ 0 & -\sin\omega & \cos\omega \end{bmatrix}
\begin{bmatrix} X_s + x \\ Y_s + y \\ Z_s + z \end{bmatrix}
\]

Rotation about the three camera axes can be done in any order, but the same order must then be used consistently throughout the computations. In general, matrix multiplication is not commutative, that is, [M] * [N] does not equal [N] * [M]. Rotation about the X-axis followed by rotations about the Y and Z-axes is the sequence commonly used. Thus, a transformation from the scanner coordinate system to the camera coordinate system is accomplished with the following matrix operations.
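The dependence on rotation order is easy to confirm numerically. The sketch below multiplies an x-axis rotation and a z-axis rotation in both orders and compares the results; the helper names are hypothetical and the 90 degree angles are chosen only to make the difference obvious.

```python
import math

def rot_x(omega):
    c, s = math.cos(omega), math.sin(omega)
    return [[1, 0, 0], [0, c, s], [0, -s, c]]

def rot_z(kappa):
    c, s = math.cos(kappa), math.sin(kappa)
    return [[c, s, 0], [-s, c, 0], [0, 0, 1]]

def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

angle = math.pi / 2
xz = matmul(rot_x(angle), rot_z(angle))
zx = matmul(rot_z(angle), rot_x(angle))

# Largest element-wise difference between the two products.
max_diff = max(abs(xz[i][j] - zx[i][j]) for i in range(3) for j in range(3))
print(max_diff > 0.5)
```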

The full transformation would have the form shown below.

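\[
\begin{bmatrix} X_i \\ Y_i \\ Z_i \end{bmatrix}
=
M_z M_y M_x \begin{bmatrix} X_s + x \\ Y_s + y \\ Z_s + z \end{bmatrix}
\]

or, expressed in complete form below

\[
\begin{bmatrix} X_i \\ Y_i \\ Z_i \end{bmatrix}
=
\begin{bmatrix} \cos\kappa & \sin\kappa & 0 \\ -\sin\kappa & \cos\kappa & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \cos\phi & 0 & -\sin\phi \\ 0 & 1 & 0 \\ \sin\phi & 0 & \cos\phi \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\omega & \sin\omega \\ 0 & -\sin\omega & \cos\omega \end{bmatrix}
\begin{bmatrix} X_s + x \\ Y_s + y \\ Z_s + z \end{bmatrix}
\]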

The combined transformation matrix M can be expressed as M = Mz * My * Mx, and the transformation from scanner to camera coordinates can be expressed as

\[
\begin{bmatrix} X_i \\ Y_i \\ Z_i \end{bmatrix}
=
M \begin{bmatrix} X_s + x \\ Y_s + y \\ Z_s + z \end{bmatrix}
\]

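Multiplying out the rotation matrices Mz, My, and Mx gives the matrix shown below.

\[
M = \begin{bmatrix}
\cos\phi\cos\kappa & \cos\omega\sin\kappa + \sin\omega\sin\phi\cos\kappa & \sin\omega\sin\kappa - \cos\omega\sin\phi\cos\kappa \\
-\cos\phi\sin\kappa & \cos\omega\cos\kappa - \sin\omega\sin\phi\sin\kappa & \sin\omega\cos\kappa + \cos\omega\sin\phi\sin\kappa \\
\sin\phi & -\sin\omega\cos\phi & \cos\omega\cos\phi
\end{bmatrix}
\]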

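Each element of the matrix is named according to its row/column position.

\[
\begin{aligned}
m_{11} &= \cos\phi\cos\kappa \\
m_{12} &= \cos\omega\sin\kappa + \sin\omega\sin\phi\cos\kappa \\
m_{13} &= \sin\omega\sin\kappa - \cos\omega\sin\phi\cos\kappa \\
m_{21} &= -\cos\phi\sin\kappa \\
m_{22} &= \cos\omega\cos\kappa - \sin\omega\sin\phi\sin\kappa \\
m_{23} &= \sin\omega\cos\kappa + \cos\omega\sin\phi\sin\kappa \\
m_{31} &= \sin\phi \\
m_{32} &= -\sin\omega\cos\phi \\
m_{33} &= \cos\omega\cos\phi
\end{aligned}
\]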

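\[
\begin{bmatrix} X_i \\ Y_i \\ Z_i \end{bmatrix}
=
\begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{bmatrix}
\begin{bmatrix} X_s + x \\ Y_s + y \\ Z_s + z \end{bmatrix}
\]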
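The nine elements listed above can be checked against the explicit product Mz * My * Mx. The following sketch builds the matrix both ways and compares them element by element; the function names and the test angles are hypothetical choices for illustration.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_product(omega, phi, kappa):
    """M = Mz * My * Mx, multiplied out numerically."""
    co, so = math.cos(omega), math.sin(omega)
    cp, sp = math.cos(phi), math.sin(phi)
    ck, sk = math.cos(kappa), math.sin(kappa)
    Mx = [[1, 0, 0], [0, co, so], [0, -so, co]]
    My = [[cp, 0, -sp], [0, 1, 0], [sp, 0, cp]]
    Mz = [[ck, sk, 0], [-sk, ck, 0], [0, 0, 1]]
    return matmul(Mz, matmul(My, Mx))

def rotation_closed_form(omega, phi, kappa):
    """M built directly from the element formulas m11 ... m33."""
    co, so = math.cos(omega), math.sin(omega)
    cp, sp = math.cos(phi), math.sin(phi)
    ck, sk = math.cos(kappa), math.sin(kappa)
    return [
        [cp * ck, co * sk + so * sp * ck, so * sk - co * sp * ck],
        [-cp * sk, co * ck - so * sp * sk, so * ck + co * sp * sk],
        [sp, -so * cp, co * cp],
    ]

angles = (math.radians(10), math.radians(-20), math.radians(35))
P = rotation_product(*angles)
C = rotation_closed_form(*angles)
max_diff = max(abs(P[i][j] - C[i][j]) for i in range(3) for j in range(3))
print(max_diff < 1e-12)
```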

In order to bring the camera and the scanner coordinate systems into the right-hand coordinate system of the scanner, it is necessary to multiply Zi by -1.

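\[
\begin{bmatrix} X_i \\ Y_i \\ -Z_i \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}
\begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{bmatrix}
\begin{bmatrix} X_s + x \\ Y_s + y \\ Z_s + z \end{bmatrix}
\]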

In order to scale the coordinates of the camera to that of the image plane (Xc, Yc) in the camera, Xi and Yi must be scaled by the ratio of the focal length (f) to that of the Zi value for each point. The first and second rows of the matrix equation are multiplied by the scaling factor f/Zi. The third row of the matrix equation is left unchanged.

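\[
\begin{bmatrix} X_c \\ Y_c \\ Z_i \end{bmatrix}
=
\begin{bmatrix} \frac{f}{Z_i} X_i \\ \frac{f}{Z_i} Y_i \\ Z_i \end{bmatrix}
=
\begin{bmatrix} \frac{f}{Z_i} & 0 & 0 \\ 0 & \frac{f}{Z_i} & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{bmatrix}
\begin{bmatrix} X_s + x \\ Y_s + y \\ Z_s + z \end{bmatrix}
\]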
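The scaling step can be illustrated with a short function that takes a point already expressed in the unscaled camera system (Xi, Yi, Zi) and returns its image-plane coordinates. The function name and the sample numbers are hypothetical; all lengths are assumed to be in the same unit (meters here).

```python
def scale_to_image_plane(Xi, Yi, Zi, f):
    """Scale Xi and Yi by f/Zi to obtain (Xc, Yc) on the image plane."""
    if Zi == 0:
        raise ValueError("Zi must be non-zero: the point lies in the lens plane")
    scale = f / Zi
    return scale * Xi, scale * Yi

# A point 2 m to the right, 1 m up and 100 m away, with a 25 mm lens.
Xc, Yc = scale_to_image_plane(2.0, 1.0, 100.0, 0.025)
print(Xc, Yc)  # image-plane coordinates in meters
```

Because the scale factor depends on Zi, each point has its own factor; points farther from the camera are reduced more.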

Refer to Derivation of Transformation Parameter Computation for a 2D/3D Coordinates System From Laser Scanner Coordinates to Camera Coordinates for a more complete explanation of the transformation to camera focal plane coordinates.

Note: The magnification of a camera lens system is the ratio of the focal length to the distance to the object. If a camera lens has a focal length of 25 mm and this is considered to be 1X, then replacing the 25 mm lens with a 50 mm lens will increase the magnification by a factor of two and reduce the width of the imaged area by a factor of ½. Assume that the CCD sensor has a width of 25 mm (this is very close to the actual sensor width of high quality Digital Single Lens Reflex (DSLR) cameras). A lens with a focal length of 25 mm is used. The distance to the outcrop is 100 m. The picture that is taken will cover 100 m of the width of the outcrop, assuming that the camera is aligned perpendicular to the outcrop and that the outcrop runs straight left to right. If a lens with a focal length of 50 mm is put on the camera, the magnification will be about 2X and the picture will cover only 50 m of the outcrop. Likewise, if a lens with a focal length of 100 mm is put on the camera, the image will appear magnified by 4X relative to the first image and will cover only 25 m of the outcrop.
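The numbers in the note follow from similar triangles: the covered width is the sensor width multiplied by the ratio of object distance to focal length. A minimal sketch, with a hypothetical function name and the simplifying assumption that the camera faces a flat outcrop squarely:

```python
def covered_width_m(sensor_width_mm, focal_length_mm, distance_m):
    """Width of the scene covered by the sensor, from similar triangles."""
    return sensor_width_mm / focal_length_mm * distance_m

# Sensor width 25 mm and outcrop distance 100 m, as in the note.
widths = [covered_width_m(25.0, f_mm, 100.0) for f_mm in (25.0, 50.0, 100.0)]
print(widths)  # [100.0, 50.0, 25.0]
```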

Derivation of the Collinearity Equations (basis for scanner to camera transformation)

Below is a 3D transformation matrix without scaling by the camera lens.

\[
\begin{bmatrix} X_i \\ Y_i \\ Z_i \end{bmatrix}
=
\begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{bmatrix}
\begin{bmatrix} X_s + x \\ Y_s + y \\ Z_s + z \end{bmatrix}
\]

The camera coordinate system has the origin located at the rear nodal point of the lens system. The Z-axis of the camera is parallel to the axis of the lens system and is perpendicular to the CCD sensor in the camera. The positive direction of the Z-axis is toward the object being photographed. Since the CCD sensor is located behind the focal point, the CCD sensor is located at a Z-coordinate of (-f), where f is the focal length of the lens. The CCD sensor represents the X-Y plane of the camera coordinate system and is orthogonal to the Z-axis. Note that with the CCD sensor (X-Y plane) located behind the focal point, inversion in the X and Y dimensions occurs. This is corrected by virtually moving the CCD sensor a distance (f) in front of the focal point. This places the image on the CCD sensor in the same orientation that the object has in space. It would be the same as taking a picture of the object and then holding the picture up in front of the camera. What is on the left side of the object appears on the left side of the picture. What is on the upper side of the object appears on the upper side of the picture.

The above transformation matrix transforms the scanner coordinates to camera coordinates but does not scale the dimensions to the scale of the CCD sensor. That is, the dimensional reduction that occurs due to the camera lens is not reflected in the transformation matrix at this point.

An image dimension is related to the object dimension by the ratio of the focal length of the lens to the distance to the object. Using the relationship of similar triangles, Xc = (f/Zi)*Xi, where f is the focal length of the camera, Zi is the distance from the rear nodal point of the camera lens to the object, and Xi is the dimension of the object.

The Xi and Yi values will be scaled by the ratio f/Zi, yielding Xc and Yc, which are the coordinates on the CCD sensor. Thus, the matrix equation above, when translated to the plane of the CCD sensor, can be expressed as

\[
\begin{bmatrix} X_c \\ Y_c \\ Z_i \end{bmatrix}
=
\begin{bmatrix} \frac{f}{Z_i} X_i \\ \frac{f}{Z_i} Y_i \\ Z_i \end{bmatrix}
=
\begin{bmatrix} \frac{f}{Z_i} & 0 & 0 \\ 0 & \frac{f}{Z_i} & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{bmatrix}
\begin{bmatrix} X_s + x \\ Y_s + y \\ Z_s + z \end{bmatrix}
\]

The above complete transformation matrix is expressed in equation form as

\[
X_c = \frac{f}{Z_i}\left[ m_{11}(X_s + x) + m_{12}(Y_s + y) + m_{13}(Z_s + z) \right]
\]

\[
Y_c = \frac{f}{Z_i}\left[ m_{21}(X_s + x) + m_{22}(Y_s + y) + m_{23}(Z_s + z) \right]
\]

\[
Z_i = m_{31}(X_s + x) + m_{32}(Y_s + y) + m_{33}(Z_s + z)
\]

What are called the collinearity equations are formed by substituting for Zi in the equations for Xc and Yc. The collinearity equations are

f m11X s  x m12Ys  y m13Z s  z X c  m31X s  x m32Ys  y m33Z s  z

f m21X s  x m22Ys  y m23Z s  z Yc  m31X s  x m32Ys  y m33Z s  z

The scanner and camera systems are not compatible: one is left-handed and the other is right-handed. They are made compatible by negating the values of one of the axes. By negating the value of the Z-axis, the equations become

 f m11X s  x m12Ys  y m13Z s  z X c  m31X s  x m32Ys  y m33Z s  z

 f m21X s  x m22Ys  y m23Z s  z Yc  m31X s  x m32Ys  y m33Z s  z

The following assignments are made in order to simplify the notation later. U and V are the numerators and W is the denominator of the above equations.

\[
U = -f\left[ m_{11}(X_s + x) + m_{12}(Y_s + y) + m_{13}(Z_s + z) \right]
\]

\[
V = -f\left[ m_{21}(X_s + x) + m_{22}(Y_s + y) + m_{23}(Z_s + z) \right]
\]

\[
W = m_{31}(X_s + x) + m_{32}(Y_s + y) + m_{33}(Z_s + z)
\]

Xc and Yc are shown below expressed in terms of U, V, W.

\[
X_c = \frac{U}{W}, \qquad Y_c = \frac{V}{W}
\]
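The U, V, W form maps directly onto code. The sketch below evaluates the collinearity equations (with the Z-axis negated) for one scanner point. The function name collinearity and its argument layout are hypothetical, M is any 3X3 rotation matrix given as nested lists, and the translation is added to the scanner coordinates, which is an assumed sign convention.

```python
def collinearity(M, scanner_point, translation, f):
    """Return (Xc, Yc) = (U/W, V/W) for one scanner point.

    M            3x3 rotation matrix as nested lists
    scanner_point (Xs, Ys, Zs)
    translation   (x, y, z), added to the scanner point (assumed convention)
    f             focal length, in the same unit as the result
    """
    d = [p + t for p, t in zip(scanner_point, translation)]
    U = -f * sum(M[0][k] * d[k] for k in range(3))
    V = -f * sum(M[1][k] * d[k] for k in range(3))
    W = sum(M[2][k] * d[k] for k in range(3))
    if W == 0:
        raise ValueError("W is zero: the point lies in the plane of the lens")
    return U / W, V / W

# With no rotation and no translation the equations reduce to -f*Xs/Zs.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
Xc, Yc = collinearity(identity, (2.0, 1.0, -100.0), (0.0, 0.0, 0.0), 0.025)
print(Xc, Yc)
```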

Xc and Yc are the coordinates of a point on the camera CCD sensor with the origin of the coordinate system at the center of the CCD sensor. These are not UV coordinates. UV coordinates have the origin at the upper left corner of a photograph and have values between 0 and 1.0. Xc and Yc have dimensions which relate to the size of the CCD sensor and range from -1/2 the sensor width to +1/2 the sensor width and from -1/2 the sensor height to +1/2 the sensor height.

In solving for the transformation parameters (ω, φ, κ, x, y, and z), four points common to the photograph and the 3D model are identified. Thus we have four sets of values for Xc, Yc, Xs, Ys, and Zs. We do not have values for Zi in the camera coordinate system. By substituting for Zi in the equations for Xc and Yc, the need to solve for Zi is eliminated.
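The difference between sensor coordinates (Xc, Yc) and UV coordinates described above can be made concrete with a small conversion. This is only a sketch: the function name is hypothetical, and it assumes that U increases to the right and V increases downward from the upper left corner, a direction the text does not state explicitly.

```python
def sensor_to_uv(Xc, Yc, sensor_width, sensor_height):
    """Convert centered sensor coordinates to UV coordinates in [0, 1].

    Assumes Xc is positive to the right and Yc positive up, with UV
    measured from the upper left corner (V increasing downward).
    """
    u = Xc / sensor_width + 0.5
    v = 0.5 - Yc / sensor_height
    return u, v

# Hypothetical 25 mm x 16 mm sensor, coordinates in millimeters.
center = sensor_to_uv(0.0, 0.0, 25.0, 16.0)
upper_left = sensor_to_uv(-12.5, 8.0, 25.0, 16.0)
lower_right = sensor_to_uv(12.5, -8.0, 25.0, 16.0)
print(center, upper_left, lower_right)
```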

A different derivation of the collinearity equations is provided by Mikhail and is described below.

Starting with the 3D transformation below, the system is turned into a 2D/3D transformation by adding a uniform scaling factor (a) and setting the image Z value, Zi, to –f, the focal length of the camera.

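\[
\begin{bmatrix} X_i \\ Y_i \\ Z_i \end{bmatrix}
=
\begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{bmatrix}
\begin{bmatrix} X_s + x \\ Y_s + y \\ Z_s + z \end{bmatrix}
\]

becomes

\[
\begin{bmatrix} X_c \\ Y_c \\ -f \end{bmatrix}
=
a \begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{bmatrix}
\begin{bmatrix} X_s + x \\ Y_s + y \\ Z_s + z \end{bmatrix}
\]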

Writing this in equation form

\[
X_c = a\left[ m_{11}(X_s + x) + m_{12}(Y_s + y) + m_{13}(Z_s + z) \right]
\]

\[
Y_c = a\left[ m_{21}(X_s + x) + m_{22}(Y_s + y) + m_{23}(Z_s + z) \right]
\]

\[
-f = a\left[ m_{31}(X_s + x) + m_{32}(Y_s + y) + m_{33}(Z_s + z) \right]
\]

Division of the first two equations by the third equation eliminates the scaling factor a.

 f m11X s  x m12Ys  y m13Z s  z X c  m31X s  x m32Ys  y m33Z s  z

 f m21X s  x m22Ys  y m23Z s  z Yc  m31X s  x m32Ys  y m33Z s  z

The least squares solutions for these equations are provided in the other documents.

References:

Mikhail, E.M., Bethel, J.S., McGlone, J.C.; Introduction to Modern Photogrammetry, 479 pages, John Wiley and Sons, 2001.

Larson, R.E., Edwards, B.H.; Elementary Linear Algebra, 568 pages, D.C. Heath and Company, 3rd Edition, 1996.

Wolf, P.R., Dewitt, B.A.; Elements of Photogrammetry with Applications in GIS, McGraw Hill, Inc., 3rd Edition, 2000.
