CS 351 Introduction to Computer Graphics
EECS 351-1: Introduction to Computer Graphics
Building the Virtual Camera, ver. 1.4

3D Transforms (cont'd):

3D Transformation Types: did we really describe ALL of them? No!
--All fit in a 4x4 matrix, suggesting up to 16 'degrees of freedom'. We already have 9 of them: 3 kinds of Translate (in the x, y, z directions) + 3 kinds of Rotate (around the x, y, z axes) + 3 kinds of Scale (along the x, y, z directions). Where are the other 7 degrees of freedom?

3 kinds of 'shear' (aka 'skew'), the transforms that turn a square into a parallelogram:
--Sxy sets the x-y shear amount; similarly, Sxz and Syz set the x-z and y-z shear amounts:

    Shear(Sxy, Sxz, Syz) = [ 1    Sxy  Sxz  0 ]
                           [ Sxy  1    Syz  0 ]
                           [ Sxz  Syz  1    0 ]
                           [ 0    0    0    1 ]

3 kinds of 'perspective' transform:
--px causes perspective distortion along the x-axis direction; py and pz act along the y and z axes:

    Persp(px, py, pz) = [ 1   0   0   0 ]
                        [ 0   1   0   0 ]
                        [ 0   0   1   0 ]
                        [ px  py  pz  1 ]

--Finally, we have that lower-right '1' in the matrix. Remember how we convert homogeneous coordinates [x,y,z,w] to 'real' or 'Cartesian' 3D coordinates [x/w, y/w, z/w]? If we change that '1' in the lower right, it scales 'w', and thus acts as a simple scale factor on all 'real' coordinates; it is redundant with the Scale transforms we already have, so our 4x4 matrix holds only 15 useful degrees of freedom.

We can describe any 4x4 transform matrix as a combination of three simpler classes (a short numpy sketch of the shear and perspective matrices appears at the end of this section):
'rigid body' transforms == preserve angles and lengths; never distort or reshape a model; include any combination of rotation and translation.
'affine' transforms == preserve parallel lines, but not angles or line lengths; include rotation, translation, scale, and shear or skew.
'perspective' transforms == link x, y, or z to the homogeneous coordinate w; they provide the math behind a 'pinhole' camera image, but in orderly 4x4 matrix form!

Cameras and 3D Viewing: Algebra and Intuition

Camera: a device that 'flattens' 3D space onto a 2D plane. Get the intuition first, then the math:

[Figure: a planar perspective camera. The 'center of projection' looks along the z axis at a 3D model (connected vertices); f is the focal length, and a model vertex (xv, yv, zv, 1) maps to (xv, yv, zv, zv/f) on the '2D focal plane'.]

A planar perspective camera, in its simplest, most general form, is just a plane and a point. The plane splits the universe into two halves: the half that contains the point, which is the location of our eye or camera, and the other half of the universe that we'll see with our eye or camera. The point is known by many names: the 'camera location', the 'center of projection' (COP), the 'viewpoint', the 'view reference point', etc. The plane is 'the image plane', the 'near clip plane', or even the 'focal plane'.

We paint our planar perspective image onto that plane in a very simple way. Draw a line from our point (viewpoint, camera point, etc.) through the plane, and trace it into the other half of the universe until it hits something. Copy the color we find on that 'something' and paint it onto the plane where our tracing-line pierced it. Use this method to set the colors of all the plane's points, and together they form the complete planar perspective image of one half of the universe. However, real cameras are finite: we can't capture the entire infinitely large picture plane, so instead we usually select the picture within a rectangular region of that plane.

First, let's define the camera's 'focal length' f as the shortest distance from its center of projection (COP) to the universe-splitting image plane. We make a vector perpendicular to the image plane that measures the distance from our eye-point to the plane, and call that our direction of gaze, or 'lookAt' direction.
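As promised above, here is a minimal numpy sketch of the two new transform classes; the function names shear() and persp() and the sample point are my own illustration, not part of the notes. It builds the matrices exactly as written and shows the key difference: shear leaves w = 1 (an affine transform), while the perspective row changes w, so the 'real' Cartesian point moves once we divide through.

```python
import numpy as np

def shear(sxy, sxz, syz):
    # Shear (skew) matrix as written in the notes: unit diagonal,
    # symmetric off-diagonal shear amounts, w row untouched.
    return np.array([[1.0, sxy, sxz, 0.0],
                     [sxy, 1.0, syz, 0.0],
                     [sxz, syz, 1.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])

def persp(px, py, pz):
    # Perspective transform: the bottom row links x, y, z to w.
    return np.array([[1.0, 0.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 0.0],
                     [ px,  py,  pz, 1.0]])

p = np.array([1.0, 1.0, 0.0, 1.0])   # a unit-square corner, homogeneous

print(shear(0.5, 0.0, 0.0) @ p)      # [1.5 1.5 0. 1.]  w stays 1: affine
q = persp(0.2, 0.0, 0.0) @ p         # [1. 1. 0. 1.2]   w changed!
print(q[:3] / q[3])                  # divide by w: the 'real' point moved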
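The painting procedure above turns into simple arithmetic. Here is a minimal sketch (pierce() and all the sample values are my own illustration) of one tracing-line: it finds where the line from the eye through a scene point pierces the focal plane, for an eye at the origin and a focal plane one unit down the -z axis. The next passage derives the same answer algebraically.

```python
import numpy as np

def pierce(eye, s, p0, n):
    # Where does the line from 'eye' through scene point 's' pierce
    # the plane through point 'p0' with normal 'n'?
    d = s - eye                            # tracing-line direction
    t = np.dot(n, p0 - eye) / np.dot(n, d)
    return eye + t * d

eye = np.zeros(3)                          # center of projection at origin
p0  = np.array([0.0, 0.0, -1.0])           # focal plane z = -f, with f = 1
n   = np.array([0.0, 0.0, 1.0])            # plane normal: the lookAt direction

s = np.array([2.0, 1.0, -4.0])             # a vertex in the visible half-universe
print(pierce(eye, s, p0, n))               # [ 0.5  0.25 -1. ]
```

Note that the result equals f·(xv/-zv, yv/-zv, -1) with f = 1, exactly the formula derived next.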
We define a rectangular region of the plane, usually (but not always) centered at the point nearest the COP, and record its colors as our perspective image. The size and placement of that rectangle, along with the focal length f, describe all the basic requirements for simple lenses in any kind of planar perspective camera.

Now let's formalize these ideas. Given a plane and a point, suppose your eye is located at the point and that it becomes the origin of a coordinate system. Call the origin point the 'eye' point, the 'center of projection', or the 'view reference point' (VRP), and use your eye to look down the -z axis. Why not +z? Because we want to use right-handed coordinate systems, and we want to keep the x and y axes of our 3D camera for use as the x and y axes of our 2D image, aimed rightward and upward. At z = -f = -znear, we construct the 'focal plane' that splits the universe in half, and we draw a picture on that 2D 'focal' plane. How? Trace rays from the eye origin (VRP) to points (vertices, usually) in that other half of the universe, and find their locations on the focal plane.

Given a 3D vertex point (xv, yv, zv), where do we draw a point on our image plane?
Ans: the image point is (f·xv/-zv, f·yv/-zv, f·zv/-zv) = f·(xv/-zv, yv/-zv, -1).

As we change the focal distance f between the camera origin (the 'center of projection') and the 2D image plane, the image gets scaled up or down, larger or smaller, but does not otherwise change: adjusting the f parameter is equivalent to adjusting a zoom lens. However, if we instead keep f fixed and move the vertices of a 3D model forwards or backwards along the zv axis, the effect is more complex. Vertices nearer the camera, those with smaller |zv| values, change their on-screen image positions far more than vertices farther away. Therefore, moving a 3D object closer or farther away is NOT equivalent to adjusting a zoom lens, because it causes depth-dependent image changes (compare the two cases in the sketch below): some call it foreshortening, others call it 'changing perspective' or 'flattening the image'.

As a 3D object moves away from the camera, the rays that carry color from its surface points to our camera's center of projection (COP) become more and more parallel; shorter distances mean larger angles between those rays. Changing those angles yields a different appearance due to color changes (imagine the surface of a CD), and may change occlusion: a surface that moves near the camera may block your view of more distant points. No matter how large the 3D object may be, as -zv grows toward infinity the image of the object shrinks to a single point. This is the image's z-axis 'vanishing point', one of the points painters use to help them make an accurate perspective drawing. For scenes with world-space axes that don't match the camera axes, lines drawn parallel to each of those world-space axes will converge on other points; if they fall within the picture itself, we can see 1, 2, or 3 vanishing points. Any parallel lines in 3D that aren't parallel to our image plane will form vanishing points somewhere on our image plane, so you can have as many 'vanishing points' as you wish in a drawing; the number of vanishing points doesn't really tell you much about the camera that made an image, only about the content of the scene the camera captured. ('Vanishing points' in paintings mark the convergence of the major axes of viewed objects, such as horizontal lines where walls meet floors, or vertical lines where walls meet each other.)
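A minimal sketch of the projection formula just derived (project() and the sample vertices are my own illustration), contrasting a zoom (change f) with a dolly (move vertices along zv). Zooming scales every image point by the same factor; dollying moves near vertices' images far more than distant ones, which is exactly the foreshortening described above.

```python
def project(f, v):
    # Perspective projection, camera at origin looking down -z:
    # image point = f * (xv/-zv, yv/-zv), as derived above.
    xv, yv, zv = v
    return (f * xv / -zv, f * yv / -zv)

near = (1.0, 1.0, -2.0)    # vertex close to the camera
far  = (1.0, 1.0, -20.0)   # vertex ten times farther away

# Zoom (double f): BOTH image points scale by exactly 2.
print(project(1.0, near), project(2.0, near))  # (0.5, 0.5)   (1.0, 1.0)
print(project(1.0, far),  project(2.0, far))   # (0.05, 0.05) (0.1, 0.1)

# Dolly (push both vertices 2 units farther): the near vertex's image
# moves from 0.5 to 0.25; the far one barely budges (0.05 -> ~0.045).
print(project(1.0, (1.0, 1.0, -4.0)))
print(project(1.0, (1.0, 1.0, -22.0)))
```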
--Easy Question: suppose your camera DIDN'T divide by zv: where, then, does the 3D vertex point (xv, yv, zv) appear on the image plane? We form no vanishing points, and the model stays the same size no matter how far away it is. This is known as 'orthographic' projection, and if we define 'd' as the size of the image, we can use it to replace f and to specify the 'distance from the origin' like this:
Answer: the image point is (d·xv, d·yv, d·zv) = d·(xv, yv, zv).

Homogeneous Coordinates: Clever Matrix Math for Cameras and 3D Viewing:

The 'intuitive' camera-construction method above isn't linear, and isn't suitable for reliable graphics programs. It requires us to divide by numbers that can change; if we use it to draw pictures by the millions, eventually our 3D drawing program will fail with a divide-by-zero problem. It is messy to use, too: how can you 'move' the camera freely? What if you want an image plane that isn't perpendicular to the z axis, or one that doesn't have its 'vanishing point' in the very center of the picture it makes? THIS is the real reason why we use homogeneous coordinates and 4x4 matrices for graphics:
--they let us make a 4x4 matrix that maps 3D points to a 2D image plane, like a camera;
--they let us position the camera separately from the model, placing both in 'world' space;
--they never cause 'divide-by-zero' errors while we compute our image;
--they let us adjust our camera easily, and even seem to let us 'tilt' the image plane (e.g. to emulate an architectural 'view' camera), though we actually just adjust the position of the rectangle on the plane that we use as our image;
--they let us transform just the vertices of a model to the image plane, and then draw the lines and polygons in 2D.

The most basic perspective camera matrix (the same as the algebraic one above) just 'messes around' with the bottom-most row of the 4x4 matrix; it avoids the divide by coupling zv to the 'w' value, and now 'w' is doing something very useful:

    Tpers = [ 1  0   0   0 ]
            [ 0  1   0   0 ]
            [ 0  0   1   0 ]
            [ 0  0  1/f  0 ]    (note the last row's final two elements)

(With the look-down-the-negative-z convention we chose above, put -1/f in that slot, so that w = -zv/f matches our divide by -zv.) That's all you need for any perspective camera! Compare it with the coordinates in the drawing above.
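To close the loop, a minimal numpy sketch (tpers() and the sample vertex are my own illustration) that applies Tpers to a homogeneous vertex and defers the division to a single final divide-by-w. It uses the -1/f variant noted above so the result matches the algebraic image point f·(xv/-zv, yv/-zv, -1).

```python
import numpy as np

def tpers(f):
    # Perspective camera matrix; -1/f in the bottom row makes
    # w_out = -zv/f for our look-down-negative-z convention.
    return np.array([[1.0, 0.0,  0.0,    0.0],
                     [0.0, 1.0,  0.0,    0.0],
                     [0.0, 0.0,  1.0,    0.0],
                     [0.0, 0.0, -1.0/f,  0.0]])

f = 2.0
v = np.array([3.0, 1.5, -6.0, 1.0])   # homogeneous vertex, zv = -6

h = tpers(f) @ v                      # [3.  1.5 -6.  3.]: w = -zv/f = 3
image = h[:3] / h[3]                  # the single, final divide-by-w

# Direct algebraic projection from the earlier section, for comparison:
direct = f * np.array([v[0] / -v[2], v[1] / -v[2], -1.0])
print(image, direct)                  # both: [ 1.   0.5 -2. ]
```

Note that the matrix itself never divides: every vertex passes through the same linear transform, and the one divide-by-w happens at the very end, which is what keeps the pipeline safe and lets us compose the camera with modeling transforms.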