for Imaging Systems Course Notes for IMGS-321 11 December 2013

Roger Easton Chester F. Carlson Center for Imaging Science Rochester Institute of Technology 54 Lomb Memorial Drive Rochester, NY 14623 1-585-475-5969 [email protected]

December 11, 2013 Contents

Preface
  0.1 References:

1 Introduction
  1.1 Models of Light and Propagation
    1.1.1 Ray model of light (“geometrical optics”)
    1.1.2 Wave model of light (“physical optics”):
    1.1.3 Photon model of light (“quantum optics”):

2 Ray (Geometric) Optics
  2.1 What is an imaging system?
    2.1.1 Simplest Imaging System: Pinhole in Absorber
  2.2 First-Order Optics
  2.3 Third-Order Optics
    2.3.1 Higher-Order Approximations
  2.4 Notations and Sign Conventions
    2.4.1 Nature of Objects and Images:
  2.5 Human Eye
  2.6 Principle of Least Time
  2.7 Fermat’s Principle for Reflection
    2.7.1 Plane Mirrors
  2.8 Fermat’s Principle for Refraction:
    2.8.1 Dispersion
    2.8.2 Refractive Constants for Glasses
  2.9 Image Formation in the Ray Model
    2.9.1 Refraction at a Spherical Surface
    2.9.2 Imaging with Spherical Mirrors
  2.10 First-Order Imaging with Thin Lenses
    2.10.1 Examples of Thin Lenses
    2.10.2 Spherical Mirror
  2.11 Image Magnifications
    2.11.1 Transverse Magnification:
    2.11.2 Longitudinal Magnification:
    2.11.3 Angular Magnification
  2.12 Single Thin Lenses
    2.12.1 Positive Lens
    2.12.2 Negative Lens
    2.12.3 Meniscus Lenses
    2.12.4 Simple Microscope (magnifier, “magnifying glass,” “loupe”)
  2.13 Systems of Thin Lenses
    2.13.1 Two-Lens System
    2.13.2 Effective (Equivalent) Focal Length


    2.13.3 Summary of Distances for Two-Lens System
    2.13.4 “Effective Power” of Two-Lens System
    2.13.5 Lenses in Contact: t = 0
    2.13.6 Positive Lenses Separated by tf1 + f2
    2.13.11 Compound Microscopes
    2.13.12 Two Positive Lenses with Different Focal Lengths and Different Separations
    2.13.13 Systems of One Positive and One Negative Lens
    2.13.14 Newtonian Form of Imaging Equation
    2.13.15 Example (1) of Two-Lens System
    2.13.16 Example (2) of Two-Lens System: Telephoto Lens
    2.13.17 Images from Telephoto System:
    2.13.18 Example (3) of Two-Lens System: Two Negative Lenses
  2.14 Plane and Spherical Mirrors
    2.14.1 Comparison of Thin Lens and Concave Mirror
  2.15 Stops and Pupils
    2.15.1 Focal Ratio: f-number
    2.15.2 Example: Focal Ratio of Lens-Aperture Systems
    2.15.3 Example: Exit Pupils of Telescopic Systems
    2.15.4 Pupils and Diffraction
    2.15.5 Field Stop
  2.16 Marginal and Chief Rays
    2.16.1 Telecentricity
    2.16.2 Marginal and Chief Rays for Telescopes

3 Tracing Rays Through Optical Systems
  3.1 Paraxial Ray Tracing Equations
    3.1.1 Paraxial Refraction
    3.1.2 Paraxial Transfer
    3.1.3 Linearity of the Paraxial Refraction and Transfer Equations
    3.1.4 Paraxial Ray Tracing
  3.2 Matrix Formulation of Paraxial Ray Tracing
    3.2.1 Refraction Matrix
    3.2.2 Ray Transfer Matrix
    3.2.3 “Vertex-to-Vertex Matrix” for System
    3.2.4 Example 1: System of Two Positive Thin Lenses
    3.2.5 Example 2: Telephoto Lens
    3.2.6 $\mathcal{M}_{VV'}$ Derived From Two Rays
  3.3 Object-to-Image (Conjugate) Matrix
    3.3.1 Matrix of the “Relaxed” Eye (focused at $\infty$)
  3.4 Vertex-Vertex Matrices of Simple Imaging Systems
    3.4.1 Magnifier (“magnifying glass,” “loupe”)
    3.4.2 Galilean Telescope of Thin Lenses
    3.4.3 Keplerian Telescope of Thin Lenses
    3.4.4 Thick Lenses
    3.4.5 Microscope
  3.5 Image Location and Magnification
  3.6 Marginal and Chief Rays for the System
    3.6.1 Examples of Marginal and Chief Rays for Systems

4 Depth of Field and Depth of Focus
    4.0.2 Examples of Depth of Field from Video and Film
  4.1 Criterion for “Acceptable Blur”
  4.2 Depth of Field via Rayleigh’s Quarter-Wave Rule
  4.3 Hyperfocal Distance
  4.4 Methods for Increasing Depth of Field
  4.5 Sidebar: Transverse Magnification vs. Focal Length

5 Aberrations
  5.1 Chromatic Aberration
  5.2 Third-Order Optics, Monochromatic Aberrations
    5.2.1 Names of Aberrations
    5.2.2 Aberration Coefficients
    5.2.3 Fourth-Order (Third-Order Ray) Aberrations:
    5.2.4 Zernike Polynomials
  5.3 Structural Aberration Coefficients
  5.4 Optical Imaging Systems and Sampling
  5.5 Optical System “Rules of Thumb”

Preface

This book is intended to introduce the mathematical tools that can be applied to model and predict the action of optical imaging systems.


0.1 References:

Many references exist for the subject of wave optics, some from the point of view of physics and many others from the subdiscipline of optics. Unfortunately, relatively few from either camp concentrate on the aspects that are most relevant to imaging.

Useful Optics Texts:

[P3] (the three) Pedrottis, Introduction to Optics, Pearson Prentice-Hall, 2007.
[G] Gaskill, Jack D., Linear Systems, Fourier Transforms, and Optics, John Wiley, 1978.
[JG] Goodman, Joseph, Introduction to Fourier Optics, Third Edition, Roberts & Company, 2005.
[H] Eugene Hecht, Optics, 4th Edition, Addison-Wesley, 2002.
[PON] Reynolds, DeVelis, Parrent, Thompson, The New Physical Optics Notebook, SPIE, 1989.
[BW] Max Born and Emil Wolf, Principles of Optics, 7th Expanded Edition, Cambridge University Press, 2005.
[GF] Grant R. Fowles, Introduction to Modern Optics (Second Edition), Dover Publications, 1975.
[RHW] Robert H. Webb, Elementary Wave Optics, Dover Publications, 1997.
[FLS] R. Feynman, R. Leighton, M. Sands, The Feynman Lectures on Physics, Addison-Wesley, 1964.
[KF] M.V. Klein and T.E. Furtak, Optics, Second Edition, Wiley, 1986.
[JW] F. Jenkins and H. White, Fundamentals of Optics, 4th Edition, McGraw-Hill, 1976.
[NP] A. Nussbaum and R. Phillips, Contemporary Optics for Scientists and Engineers, Prentice-Hall, 1976.
[I] K. Iizuka, Engineering Optics, Springer-Verlag, 1985.
[FBS] D. Falk, D. Brill, and D. Stork, Seeing the Light, Harper and Row, 1986.
Lawrence Mertz, Transformations in Optics, John Wiley & Sons, 1965.

Physics Texts with useful discussions:

[HR] D. Halliday and R. Resnick, Physics, 3rd Edition, Wiley, 1978.
[C] F. Crawford, Waves, Berkeley Physics Series Vol. III, McGraw-Hill, 1968.
John D. Jackson, Classical Electrodynamics, Third Edition, Wiley, 1998, §6.
Feynman, Leighton, and Sands, Lectures on Physics, particularly Volume I §25-§33 and Volume II §32-§33.

Curriculum: and Imaging

1. Models for light propagation
   (a) ray model (“geometric optics”)
   (b) wave model (“physical optics”)
   (c) photon model (quantum optics)

2. First-order optics
   (a) third-order optics, aberrations
   (b) higher-order approximations

3. Sign conventions for distances and angles
   (a) Nature of objects and images (real and virtual)

4. Human eye

5. Refractive index
   (a) Optical path length
   (b) Fermat’s principle of least time (P3 §2.2, H §4.5, BW §3.3)
   (c) Snell’s law for reflection: $\theta_2 = -\theta_1$
       i. plane
   (d) Snell’s law for refraction: $n_1 \sin[\theta_1] = n_2 \sin[\theta_2]$
       i. plane interface between two media
   (e) Dispersion (variation in $n$ with $\lambda$)
       i. relationship between mean refractive index and dispersion
       ii. crown and flint glasses
   (f) Dispersing prisms

6. Refraction at a Spherical Surface
   (a) Paraxial approximation, imaging equation
   (b) Reflection at a spherical surface

7. Imaging with thin lenses
   (a) Imaging equation in terms of object and image distances and focal length
   (b) system “power”
   (c) spherical mirrors
   (d) object/image conjugates
   (e) Image magnifications
       i. Transverse magnification
       ii. Longitudinal magnification
       iii. Angular magnification
   (f) Single thin lenses
       i. positive lens
       ii. negative lens
       iii. meniscus lens
       iv. simple microscope
   (g) Systems of thin lenses
       i. lenses in contact
       ii. effective focal length and power of two-lens system
       iii. focal and principal points
       iv. afocal systems (telescopes)
       v. eyeglasses
       vi. compound microscopes
       vii. Newtonian form of imaging equation
       viii. telephoto lens
       ix. Stops and pupils
           A. aperture stop
           B. entrance and exit pupils
           C. field stop
   (h) Marginal and chief (principal) rays
       i. telecentricity

8. Tracing rays through optical systems
   (a) paraxial ray tracing equations
       i. paraxial refraction
       ii. paraxial transfer
       iii. linearity of equations
   (b) matrix formulation of paraxial ray tracing
       i. refraction matrix
       ii. transfer matrix
       iii. Lagrangian invariant
       iv. vertex-to-vertex matrix for imaging system
       v. object-to-image (conjugate) matrix
       vi. matrix for eye model
   (c) Examples of imaging system matrices
       i. magnifier
       ii. Galilean telescope
       iii. Keplerian telescope
       iv. thick lens
       v. microscope
   (d) image location and magnification
   (e) Depth of field and depth of focus
       i. examples from film and video
       ii. criterion for “acceptable blur”
       iii. depth of field via Rayleigh’s quarter-wave rule
       iv. hyperfocal distance
       v. methods for increasing depth of field
       vi. transverse magnification vs. focal length
   (f) Aberrations
       i. Chromatic aberration
          A. achromatic doublet
          B. apochromatic triplet
       ii. Third-Order (Seidel) Aberrations
          A. spherical aberration (relation to defocus)
          B. coma
          C. astigmatism
          D. distortion
          E. curvature of field
          F. piston error

9. Computed Ray Tracing, OSLO™

Chapter 1

Introduction

The obvious first question to consider is “what is optics” (or perhaps “what are optics?” heh, heh). One reasonable definition of optics is the application of physical principles and observed phenomena to manipulate “light” in useful ways. This presupposes the definition of “light,” which I specify as electromagnetic radiation of any “color,” temporal frequency, and wavelength. This is more general than the definition put forth by humanocentrics (e.g., color scientists), but is much more reasonable in our field, where we want to take advantage of all measurable radiation to learn information about objects that emit, reflect, refract, or otherwise modify radiation. The definition in imaging is somewhat narrower: the application of the properties of materials and of light to form “images,” which are “recognizable (though approximate) replicas of the spatial and spectral distribution of light reflected, transmitted, and/or emitted by an object.”

To design optical image-forming systems, we must model the propagation of light from the object (source) to the optic, the action of the optic on the incident light distribution, and finally propagation from the optic to the sensor. The last step, conversion of the spatial (and possibly spectral) distribution of incident light into measurable physical and/or chemical changes in some medium by the sensor, is outside the scope of this discussion.

We hope to find a mathematical model of optical imaging as a “system,” where an output distribution $g$ is created from an input object distribution $f$ by the action of an imaging system $\mathcal{O}$, e.g., $g[x, y, \lambda] = \mathcal{O}\{f[x, y, z, \lambda]\}$. We generally use this model to (try to) solve the inverse imaging problem by inferring the input object from the output image and knowledge of the system. The task may be difficult or even impossible; it is easy to see one difficulty because most sensors measure only a 2-D distribution of monochromatic light and therefore cannot possibly recover the three spatial dimensions of a realistic object from a single image.

Schematic of an optical system that acts on an input with three spatial dimensions, time, and wavelength, $f[x, y, z, t, \lambda]$, to produce a 2-D monochrome (gray scale) image $g[x', y']$.

1.1 Models of Light and Propagation

To be able even to write down, let alone solve, the imaging equation(s) for optical systems, we need to specify the mathematical model of light that will describe its behavior as it propagates and interacts with input objects, optical systems, and output sensors. To simplify the descriptions in the different contexts, three physical models for light and its interactions are used that are (loosely speaking) distinguished by the physical scale of the phenomena:

1.1.1 Ray model of light (“geometrical optics”)

1. macroscopic-scale phenomena (e.g., reflection, refraction)

(a) light propagates as RAYS that travel in straight lines until encountering a change in the properties of a medium or an interface between media. Except to differentiate the color of light, the wavelength $\lambda$ and temporal frequency $\nu$ of the light are assumed to be zero and infinity, respectively ($\lambda \to 0$, $\nu \to \infty$), which means that there are no effects due to diffraction;
(b) uses Fermat’s principle of least time to derive Snell’s law, which describes the phenomena of reflection and refraction;
(c) useful for designing imaging systems (to locate the images and determine their magnifications);
(d) calculations for modeling the behavior of optical systems (lenses and/or mirrors) are (relatively) simple and may be easily implemented in software;
(e) the quality of images from the system is assessed in terms of aberrations of the optical system, which describe deviations of the image from ideal behavior.

1.1.2 Wave model of light (“physical optics”):

1. microscopic-scale phenomena (diffraction/interference, reflection, refraction, refractive index, ...)

(a) considers light (electromagnetic radiation) to propagate as WAVES;
(b) propagation and interaction of light are described by Maxwell’s equations;
(c) light propagates with velocity $c$ in vacuum ($c \lesssim 3 \times 10^{8}\ \mathrm{m\,s^{-1}}$) and with velocity $v$ in a medium;
(e) the oscillation frequency $\nu_0$ of waves emitted by a particular light source is constant regardless of medium and is related to the vacuum wavelength $\lambda_0$ via:

$$\lambda_0 \cdot \nu_0 = c$$

(f) the ratio of the propagation velocities in vacuum and in a medium is the index of refraction of the medium:

$$n \equiv \frac{c}{v}$$

(g) the wavelength of the wave in a medium is shorter than the “vacuum wavelength” $\lambda_0$ and is related to it via:

$$\lambda_{\mathrm{medium}} = \frac{\lambda_0}{n}$$

(h) wave optics explains the image-forming phenomena of reflection, refraction, diffraction (and interference, which is really just another name for diffraction) and the phenomena of polarization and dispersion that affect the quality of images;
(i) mathematical calculations in wave optics are more “complicated” than those in ray optics and often not easy to implement in computers. For example, it is difficult to evaluate the exact form of light after propagating a short distance from the source;
(j) uses the Huygens-Fresnel principle to derive the mathematical model for propagation of light, which is often divided into three regions:
    i. linear, shift-invariant model in the Rayleigh-Sommerfeld diffraction region (valid everywhere)
    ii. linear, shift-invariant approximation in the near field for propagation by a “sufficiently large” distance from the source (Fresnel diffraction)
    iii. linear, shift-variant approximation in the far field for propagation to “very large” distances from the source (Fraunhofer diffraction);
(k) wave/physical optics is useful for assessing the quality of the images produced by systems.
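The relations among vacuum wavelength, frequency, refractive index, and wavelength in a medium listed above can be checked numerically. The following is a minimal Python sketch; the function names and the example values (500 nm light, a round-number index of 1.5) are illustrative assumptions, not values taken from these notes.

```python
# Sketch of the wave-model relations: lambda0 * nu0 = c, n = c / v,
# and lambda_medium = lambda0 / n.
# All function names and example numbers are illustrative assumptions.

C_VACUUM = 299_792_458.0  # speed of light in vacuum [m/s] (the notes round this to 3e8)


def frequency_from_vacuum_wavelength(lambda0_m: float) -> float:
    """Temporal frequency nu0 [Hz] for a vacuum wavelength lambda0 [m]."""
    return C_VACUUM / lambda0_m


def velocity_in_medium(n: float) -> float:
    """Propagation velocity v = c / n [m/s] in a medium of refractive index n."""
    return C_VACUUM / n


def wavelength_in_medium(lambda0_m: float, n: float) -> float:
    """Wavelength lambda0 / n [m] inside a medium of refractive index n."""
    return lambda0_m / n


if __name__ == "__main__":
    lambda0 = 500e-9  # assumed example wavelength [m]
    n_glass = 1.5     # assumed round-number refractive index
    nu0 = frequency_from_vacuum_wavelength(lambda0)
    print(f"nu0             = {nu0:.4e} Hz")
    print(f"v in medium     = {velocity_in_medium(n_glass):.4e} m/s")
    print(f"lambda in medium = {wavelength_in_medium(lambda0, n_glass) * 1e9:.1f} nm")
```

Because the frequency does not change between media, the product of `nu0` and the wavelength in the medium reproduces the velocity in the medium, which is a convenient consistency check.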

1.1.3 Photon model of light (“quantum optics”):

1. atomic-scale phenomena (emission and absorption of radiation)

(a) light is composed of PHOTONS with both wave and particle characteristics;
(b) used to explain/analyze the physical interaction of light and matter, such as emission by sources (e.g., lasers), and the photoelectric effect in sensors;
(c) Fundamental relationships:

$$E_0 = h\nu_0 = \frac{hc}{\lambda_0} \quad \text{and momentum} \quad p = \frac{E_0}{c} = \frac{h}{\lambda_0}$$

where $h$ is Planck’s constant:

$$h \cong 6.626 \times 10^{-34}\ \mathrm{J\,s} \cong 4.136 \times 10^{-15}\ \mathrm{eV\,s}$$

Phenomena described by the ray and wave models are most relevant to imaging, though the quantum model is vital for understanding the properties and artifacts of light sensing. You probably have seen some consideration of ray optics in undergraduate physics, and any such experience will be useful in this course. The most common treatments of optics consider rays first because the mathematical models and calculations are simpler. However, the preparation of linear systems you just had makes it possible and even desirable to consider the wave model first by applying the concepts of the impulse response and transfer function; these may significantly simplify the concepts and calculations.

There are several goals to be reached by the conclusion of this discussion; we want to have the capabilities to do several things:

• locate the image(s) of an object generated by the lens, , or system of lenses and/or mirrors;

• determine the “character” (real or virtual) and the size(s) (i.e., the transverse magnification) of the image(s);

• determine the “field of view” of the imaging system, i.e., the angular subtense of the object that is imaged;

• determine the range of distances in the scene from the optical system that appears to be “in focus” (the depth of field);

• determine the capability of the optics to distinguish closely spaced objects; this is the “spatial resolution” of the system (often specified in terms of measurements from the “point spread function” or the “modulation transfer function” = “MTF,” which are optical analogues of the “impulse response” and “transfer function” that are considered in the course on Fourier methods);

• understand the constraints on system performance due to the properties of materials used in the imaging system, such as the variation in refractive index of glass with wavelength (dispersion).

Much of this discussion (especially about depth of field and spatial resolution) will benefit from concepts derived in the course on Fourier methods, but we must also be aware of the limitations in these concepts due to nonlinearities and/or shift-variant properties of the optical system.

Chapter 2

Ray (Geometric) Optics

Ray optics (commonly, though unfortunately, called “geometric optics”) uses the model of light as a ray to evaluate the locations and properties of images created by systems of lenses and/or mirrors. It does not consider any effects due to the wave model of light, such as interference or diffraction (which are actually just different words for the same phenomenon: “interference” considers few light sources and “diffraction” considers an infinite number, or just “many”). The subject of ray optics may be subdivided into categories of “first-order,” “third-order,” and even higher-order optical computations. It also cannot explain other wave-propagation phenomena, such as total internal reflection.

2.1 What is an imaging system?

As a simple definition, we may consider an imaging system to map the distribution of the input “object” to a “similar” distribution at the output “image” (where the meaning of “similar” is to be determined). Often the input and output amplitudes are represented in different units. For example, the input often is electromagnetic radiation with units of, say, watts per unit area, while the output may be a transparent negative emulsion measured in dimensionless units of “density” or “transmittance.” In other words, the system often changes the form of the energy; it is a “transducer.” In the ray model, we can think of the imaging system as “selecting” and/or “redirecting” rays of light to map the energy onto the image sensor. The “selection” or “redirection” process uses some type of physical interaction between light and matter to remap the energy emitted or modified by the object onto the sensor. Among the more obvious physical interactions in our experience are refraction and reflection, but these are not the only, nor even the simplest, possible mechanisms. The very simplest interaction between light and matter is absorption, where the light energy is transferred to matter and “disappears” (of course, it does not really “vanish”; most often it is converted into heat in the matter, but it is no longer available to create an image, so it may as well have “disappeared”). We can use an absorber to create the simplest imaging system: the pinhole camera.

2.1.1 Simplest Imaging System: Pinhole in Absorber

Consider a 3-D volume of space that contains the object. Occasionally, a ray of light emitted (or reflected) from a location in the volume is selected by the pinhole and reaches the sensor.

• every point in space is “in focus” on the sensor
• transverse magnification $M_T$ is determined by the relative distances:

$$M_T = -\frac{z_2}{z_1}$$

• the negative sign means that the image is inverted
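The pinhole magnification rule is simple enough to evaluate directly. This is a minimal Python sketch that assumes $z_1$ is the (positive) distance from object to pinhole and $z_2$ is the (positive) distance from pinhole to sensor; the function names and the numbers in the example are hypothetical.

```python
# Sketch of the pinhole-camera transverse magnification M_T = -z2 / z1.
# Assumption: z1 = object-to-pinhole distance, z2 = pinhole-to-sensor
# distance, both positive and in the same units. Names are illustrative.


def pinhole_magnification(z1: float, z2: float) -> float:
    """Transverse magnification of a pinhole camera (negative: inverted image)."""
    if z1 == 0:
        raise ValueError("object distance z1 must be nonzero")
    return -z2 / z1


def pinhole_image_height(object_height: float, z1: float, z2: float) -> float:
    """Image height on the sensor; the sign carries the image orientation."""
    return pinhole_magnification(z1, z2) * object_height


if __name__ == "__main__":
    # Hypothetical scene: a 2 m tall object 10 m from the pinhole,
    # with the sensor 0.1 m behind the pinhole.
    m_t = pinhole_magnification(z1=10.0, z2=0.1)
    h_image = pinhole_image_height(2.0, z1=10.0, z2=0.1)
    print(f"M_T = {m_t:.3f}, image height = {h_image * 1000:.1f} mm")
```

The negative image height in the example is the numerical statement that the image is inverted.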


The number of rays from the object that actually reach the image is small. The interaction with the sensor requires the quantum model of discrete energy packets, so the number of packets is small if the hole diameter is small. If the object is a uniformly emitting planar source, the numbers of packets measured from different locations in the field are different (Poisson statistics); these numerical variations in what should be identical measurements appear as “noise.” The metric of noise is determined by the mean value $\mu$ of the signal and the variation about that mean, which is described by the standard deviation $\sigma$. The signal-to-noise ratio is a dimensionless quantity that may be defined many ways, but we’ll use a simple definition that will suit this purpose (for Poisson statistics, $\sigma = \sqrt{\mu}$):

$$\mathrm{SNR} \equiv \frac{\mu}{\sigma} = \frac{\mu}{\sqrt{\mu}} = \sqrt{\mu}$$

More photons lead to larger signals ($\mu \uparrow$) and larger standard deviation ($\sigma \uparrow$), but the mean increases faster than the standard deviation $\sigma = \sqrt{\mu}$, so the SNR increases: better statistics and less relative noise.
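To see how slowly the signal-to-noise ratio improves with photon count, the definition above can be tabulated. This is a small Python sketch of $\mathrm{SNR} = \mu/\sigma$ with the Poisson assumption $\sigma = \sqrt{\mu}$; the function name and the sample photon counts are illustrative assumptions.

```python
import math

# Sketch of the Poisson-limited signal-to-noise ratio SNR = mu / sigma
# with sigma = sqrt(mu). The function name and sample counts are
# illustrative assumptions.


def poisson_snr(mean_count: float) -> float:
    """SNR for a Poisson-distributed photon count with the given mean."""
    if mean_count <= 0:
        raise ValueError("mean photon count must be positive")
    sigma = math.sqrt(mean_count)  # Poisson standard deviation
    return mean_count / sigma


if __name__ == "__main__":
    for mu in (100, 10_000, 1_000_000):
        print(f"mean count = {mu:>9d}   SNR = {poisson_snr(mu):8.1f}")
```

The table makes the square-root behavior explicit: under this model, a hundredfold increase in collected photons buys only a tenfold increase in SNR.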

“Quality” of image depends on diameter $d_0$ of pinhole:

• Improve statistics by increasing the number of photons: larger dose or larger pinhole.
• The “blur” quality of the image is better for a smaller pinhole because there is less uncertainty in the ray path.

How to improve?

• Longer exposure time
• multiple pinholes

Depth of field

Redirect rays: reflective pinholes Reflection Refraction Diffraction (wave property), e.g., holography

2.2 First-Order Optics

Of most concern to us will be “first-order,” “paraxial,” or “Gaussian” optics, where the angles of light rays measured relative to the optical axis are assumed to be small, so that the ray heights remain small as the rays propagate down the optical axis. This is the source of the common term “paraxial optics,” meaning that the ray remains near the optical axis. If the ray angle $\theta \cong 0$, then we can approximate trigonometric functions by the first terms in their power-series expansions (the “Taylor series”):

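$$\begin{aligned}
f[x] &= \frac{(x-x_0)^0}{0!} \cdot f[x_0] + \frac{(x-x_0)^1}{1!} \cdot \left( \left. \frac{df}{dx} \right|_{x=x_0} \right) + \frac{(x-x_0)^2}{2!} \cdot \left( \left. \frac{d^2 f}{dx^2} \right|_{x=x_0} \right) + \cdots + \frac{1}{n!} \cdot \left. \frac{d^n f}{dx^n} \right|_{x=x_0} \cdot (x-x_0)^n + \cdots \\
&= \sum_{n=0}^{\infty} \frac{(x-x_0)^n}{n!} \cdot f^{(n)}[x_0]
\end{aligned}$$

If the base value and the derivatives are evaluated at the origin, we have a “Maclaurin series:”

$$f[x] = \sum_{n=0}^{\infty} \frac{1}{n!} \, f^{(n)}[0] \cdot x^n$$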

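The Maclaurin series for the sine is:

$$\begin{aligned}
\sin[\theta] &= \sum_{n=0}^{\infty} \frac{1}{n!} \cdot \left( \left. \frac{d^n}{d\theta^n} \sin[\theta] \right|_{\theta=0} \right) \cdot \theta^n \\
&= \frac{1}{0!} \cdot \sin[0] \cdot \theta^0 + \frac{1}{1!} \cdot (+\cos[0]) \cdot \theta^1 + \frac{1}{2!} \cdot (-\sin[0]) \cdot \theta^2 + \frac{1}{3!} \cdot (-\cos[0]) \cdot \theta^3 + \frac{1}{4!} \cdot (+\sin[0]) \cdot \theta^4 + \cdots \\
&= 0 + \theta + 0 - \frac{\theta^3}{3!} + 0 + \frac{\theta^5}{5!} - \cdots \\
&= \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots \\
&= \theta - \frac{\theta^3}{6} + \frac{\theta^5}{120} - \cdots
\end{aligned}$$

Note that only odd powers of $\theta$ are present in the series for $\sin[\theta]$, because the sine is an odd (antisymmetric) function that satisfies the condition $\sin[-\theta] = -\sin[+\theta]$.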

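The corresponding series for the even (or symmetric) cosine includes only even powers of $\theta$:

$$\begin{aligned}
\cos[\theta] &= 1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots = \sum_{n=0}^{\infty} (-1)^n \frac{\theta^{2n}}{(2n)!} \\
&\Longrightarrow \lim_{\theta \to 0} \{\cos[\theta]\} = 1 \\
&\Longrightarrow \cos[\theta] \cong 1 - \frac{\theta^2}{2}
\end{aligned}$$

So the approximation of the cosine with two terms is the difference of a constant and a parabola.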

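The series for the (odd, antisymmetric) tangent is less commonly known and includes only the odd powers of $\theta$:

$$\tan[\theta] = \theta + \frac{\theta^3}{3} + \frac{2}{15}\theta^5 + \cdots = \sum_{n=1}^{\infty} \frac{2^{2n}\left(2^{2n} - 1\right) \left| B_{2n} \right|}{(2n)!} \, \theta^{2n-1} \Longrightarrow \lim_{\theta \to 0} \{\tan[\theta]\} \cong \theta$$

where $B_n$ is the $n$th Bernoulli number. The first-, third-, and fifth-order series approximations for the tangent are:

$$\begin{aligned}
\tan[\theta] &\cong \theta \quad \text{for } \frac{\pi}{2} > |\theta| \gtrsim 0 \\
\tan[\theta] &\cong \theta + \frac{\theta^3}{3} \\
\tan[\theta] &\cong \theta + \frac{\theta^3}{3} + \frac{2}{15}\theta^5
\end{aligned}$$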

The validity of these approximations is perhaps more obvious from the graphs, where we can see that $\sin[\theta] \lesssim \theta$ and $\tan[\theta] \gtrsim \theta$ for small positive values of $\theta$.

Comparison of $\theta$ (black), $\sin[\theta]$ (red), and $\tan[\theta]$ (blue) for $0 \le \theta \le +0.5$ radians, showing that $\sin[\theta] \lesssim \theta$ and $\tan[\theta] \gtrsim \theta$ over this domain.

The corresponding first-order approximation to the cosine is the unit constant:

$$\lim_{\theta \to 0} \{\cos[\theta]\} = 1$$

$\cos[\theta]$ (red) compared to its first-order approximation, the unit constant (black), for $0 \le \theta \le 0.2$ radians, showing that the two are very similar for small values of $\theta$.

The advantage of the first-order approximation is that evaluation of the ray heights and angles becomes simple because of the proportionality.

2.3 Third-Order Optics

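It likely is obvious from the definition of first-order optics that “third-order” optics includes the second term in the expansions:

$$\begin{aligned}
\sin[\theta] &\cong \theta - \frac{\theta^3}{3!} = \theta - \frac{\theta^3}{6} \\
\tan[\theta] &\cong \theta + \frac{\theta^3}{3} \\
\cos[\theta] &\cong 1 - \frac{\theta^2}{2!} = 1 - \frac{\theta^2}{2}
\end{aligned}$$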

Comparison of the third-order approximations of $\sin[\theta]$ (red) and $\tan[\theta]$ (blue) to the linear term $\theta$ (black) for $0 \le \theta \le 0.5$ radians.

Note that the third-order approximation for the cosine is a biased parabola:

$\cos[\theta]$ (black) and its third-order approximation $1 - \frac{\theta^2}{2}$ (red) for $0 \le \theta \le 0.5$ radians.

The results for ray angles using third-order optics will differ from those of first-order optics; these differences lead to image aberrations.
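The size of the difference between the first-order and third-order approximations can be tabulated for a few ray angles. The following Python sketch compares the truncated series with the exact functions from the standard library; the function names and the sample angles are illustrative choices, not values from these notes.

```python
import math

# Sketch comparing first-order and third-order approximations of sin,
# tan, and cos with the exact values. Function names and the sample
# angles are illustrative assumptions.


def sin_first_order(theta: float) -> float:
    return theta


def sin_third_order(theta: float) -> float:
    return theta - theta**3 / 6.0


def tan_third_order(theta: float) -> float:
    return theta + theta**3 / 3.0


def cos_third_order(theta: float) -> float:
    # The notes call 1 - theta^2/2 the "third-order" cosine approximation.
    return 1.0 - theta**2 / 2.0


def relative_error(approx: float, exact: float) -> float:
    return abs(approx - exact) / abs(exact)


if __name__ == "__main__":
    print(" theta   sin 1st-order   sin 3rd-order   tan 3rd-order   cos 3rd-order")
    for theta in (0.05, 0.1, 0.3, 0.5):  # radians
        print(
            f"{theta:6.2f}"
            f"   {relative_error(sin_first_order(theta), math.sin(theta)):13.3e}"
            f"   {relative_error(sin_third_order(theta), math.sin(theta)):13.3e}"
            f"   {relative_error(tan_third_order(theta), math.tan(theta)):13.3e}"
            f"   {relative_error(cos_third_order(theta), math.cos(theta)):13.3e}"
        )
```

The printed columns are relative errors, so the rate at which each approximation degrades as the ray angle grows can be read off directly.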

2.3.1 Higher-Order Approximations

We clearly can add additional terms to the power series that will increase the accuracy of any calculations at the cost of significantly more complexity.

2.4 Notations and Sign Conventions

One of the simplest and most difficult aspects of ray optics is the set of conventions to be adopted for all of the quantities to be measured. As in many aspects of optics, there are competing choices for conventions that have their own distinct advantages, but that lead to different equations for image locations, etc. We are going to use the directed distance convention, where distances are positive if measured from left to right. The problem becomes remembering which are the points measured “from” and “to,” respectively. The figure shows sign conventions for the different quantities. Note that in all cases, light travels from left to right in all media with positive refractive index ($n > 0$), so the distances are positive if measured in the direction of light travel and negative if measured in the other direction.

Sign conventions for distances, heights, angles, and curvatures. The distance is positive if measured from left to right; the height is positive if the endpoint is above the axis; the angle from the axis or from a normal is positive if measured in the counterclockwise direction (positive θ); and the curvature is positive if its center is to the right of the vertex (intersection of the surface and the optical axis).

Now consider the example in the figure where an optical system acts on a red “object” (the upright red arrow) located at the object point labeled by O to produce an “image” at O′. The horizontal black line is the line of symmetry of the optical system and is called the “optical axis.”

Sign conventions for a specific case: the object height at O is positive, while the image height at O′ is negative. The angle $\theta$ of the (blue) ray from the base of the object to the (green) first surface is positive. The radius of curvature $R$ of the first surface is positive.

The front and rear surfaces of the optical system are shown in green; their intersections with the optical axis are the vertices of the system. The object space includes all features to the left of the vertex V that is closer to the object, so V is the object-space vertex of the imaging system. Similarly, the image space includes all features to the right of the vertex V′ that is closer to the image O′, so V′ is the image-space vertex. The ray shown in blue from the object O to the green optical surface makes an angle $\theta$ measured from the optical axis to the ray; since this angle is measured counterclockwise, it is a positive angle, $\theta > 0$. The image-space ray from V′ to O′ measured from the axis is a clockwise angle, so $\theta' < 0$.

The front surface of the optical system has a radius of curvature $R$ that is measured from the vertex to the center of curvature, i.e., $R = \overline{VC}$, where the overscored pair of letters denotes the distance from the first feature to the second. In this case, the distance from V to C is measured from left to right, so $R \equiv \overline{VC} > 0$. In the same manner, the distance from the rear vertex V′ to its center of curvature C′ is measured from right to left, so $R' \equiv \overline{V'C'} < 0$; $R'$ is negative in this example.

Two other features are shown in the figure that we have not yet described, one each in object and image space. F and F′ are object-space and image-space focal points, respectively. They are endpoints of the object-space and image-space focal lengths; the other endpoints are either the vertices (if the lenses are “thin”) or the principal points (which we shall label as H and H′, respectively). That discussion will have to wait until later.

We will often have the need to propagate a light ray through an optical system consisting of a set of different thin lenses or a set of surfaces separated by different media. The cascade of calculations requires distances measured from the object to the lens or front surface and from lens or back surface to the image. The need to express multiple distances will be addressed by both subscripts and “primed” notation, depending on context, where the “unprimed” notation will refer to the distance before the lens or surface and the “primed” notation to that after. When multiple surfaces are needed, the first will be denoted by the subscript “1,” the next by “2,” etc.

Notation can also be a problem. The two different lower-case Greek letters for “phi” (straight $\phi$ and cursive $\varphi$) will be used in different ways: $\phi$ represents the “power” of a lens or surface and is measured in reciprocal length, most commonly reciprocal meters ($\mathrm{m}^{-1}$), which is named the diopter. The cursive phi ($\varphi$) will be used to represent an angle, and therefore is dimensionless. The cursive letter $f$ is used to represent a function, e.g., $f[x, y, t]$, whereas the “straight” letter $\mathrm{f}$ will be used to denote the focal length with dimensions of length. This means that:

$$\phi = \frac{1}{\mathrm{f}}$$
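The reciprocal relation between power and focal length is easy to exercise numerically. This is a minimal Python sketch, assuming the focal length is expressed in meters so that the power comes out in diopters; the function names and the example focal lengths are hypothetical.

```python
# Sketch of the relation between lens power (diopters) and focal length
# (meters): power = 1 / focal length. Function names and example values
# are illustrative assumptions.


def power_from_focal_length(focal_length_m: float) -> float:
    """Power in diopters (1/m) for a focal length given in meters."""
    if focal_length_m == 0:
        raise ValueError("focal length must be nonzero")
    return 1.0 / focal_length_m


def focal_length_from_power(power_diopters: float) -> float:
    """Focal length in meters for a power given in diopters."""
    if power_diopters == 0:
        raise ValueError("zero power corresponds to an infinite focal length")
    return 1.0 / power_diopters


if __name__ == "__main__":
    # A positive lens with a 0.5 m focal length and a negative lens with
    # a -0.25 m focal length (hypothetical examples).
    print(power_from_focal_length(0.5))    # 2.0 diopters
    print(power_from_focal_length(-0.25))  # -4.0 diopters
    print(focal_length_from_power(2.0))    # 0.5 m
```

Note that the sign of the focal length carries over to the power, so a negative lens has negative power under this convention.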

2.4.1 Nature of Objects and Images:

1. Real Object: Rays incident on the lens are diverging from the source; the object distance is positive

2. Virtual Object: Rays into the lens are converging toward the “source” located “behind” the lens; object distance is negative

3. Real Image: Rays emerging from the lens are converging toward the image; image distance is positive

4. Virtual Image: Rays emerging from the lens are diverging, so that the “image” is behind the lens and the image distance is negative
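The four cases above reduce to a test of the sign of the object or image distance. The following Python sketch encodes that rule; the function names are hypothetical, and rejecting a distance of exactly zero is an assumption of this sketch, since the list above gives no rule for that case.

```python
# Sketch of the real/virtual classification by the sign of the object
# and image distances, following the four cases listed above.
# Function names are illustrative; treating a zero distance as an error
# is an assumption of this sketch.


def object_nature(object_distance: float) -> str:
    """Return 'real' for a positive object distance, 'virtual' for negative."""
    if object_distance == 0:
        raise ValueError("zero object distance is not classified here")
    return "real" if object_distance > 0 else "virtual"


def image_nature(image_distance: float) -> str:
    """Return 'real' for a positive image distance, 'virtual' for negative."""
    if image_distance == 0:
        raise ValueError("zero image distance is not classified here")
    return "real" if image_distance > 0 else "virtual"


if __name__ == "__main__":
    # Hypothetical distances in millimeters.
    print(object_nature(+200.0))  # real
    print(object_nature(-50.0))   # virtual
    print(image_nature(+120.0))   # real
    print(image_nature(-75.0))    # virtual
```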

2.5 Human Eye

Since this course considers optics of imaging systems, and since the images generated by many optical systems are viewed by human eyes, we need to at least introduce the optics of the eye; we will consider it in more detail when we trace rays through the “standard” eye model later. The optics of the human eye include the curved surface (the “cornea,” which exhibits most of the power of the system) and a deformable lens. The system is intended to form an image on the retina, which is a fixed distance from the cornea. The lens is deformed by action of ciliary muscles to change the plane that is viewed “in focus.” When the muscles are relaxed, the lens is “flatter,” i.e., the radii of curvature of the surfaces are larger. To view an object “close up,” the focal length of the eye lens must be shortened by making the lens shape more spherical. This is accomplished by tightening the ciliary muscles (which is the reason why your eyes get tired after an extended time of viewing objects up close). If the retina is located “too far” from the cornea, so that the image is “in front” of the retina when the muscles are relaxed, then the eye sees a “blurry” image of distant objects, but nearby objects may be well focused. This is the condition of “nearsightedness” or “myopia.” If the retina is “too close” to the cornea, the image is focused behind it and the eye sees distant objects more sharply (“hyperopia” or “farsightedness.”)

2.6 Principle of Least Time

The mathematical model of ray optics is based on a principle stated by Fermat. Long before that, Hero of Alexandria hypothesized a model of light propagation that could be called the principle of least distance:

A ray of light traveling between two arbitrary points traverses the shortest possible path in space. (Hero of Alexandria)

This statement applies to reflection and transmission through homogeneous media (i.e., the medium is characterized by a single index of refraction). However, Hero’s principle is not valid if the object and observation points are located in different media (as is the normal situation for refraction) or if multiple media are present between the points. In 1657, Pierre Fermat modified Hero’s statement to formulate the principle of least time (which actually works):

A light ray travels the path that requires the least time to traverse. (Fermat)

The laws of reflection and refraction may be easily derived from Fermat’s principle. A moving ray (or car, bullet, or baseball) traveling a distance $s$ at a velocity $v$ requires $t$ seconds:

$$t = \frac{s}{v}$$

If the ray travels at different velocities for different increments of distance, the total travel time is the summation over the different distances and different velocities:

$$t = \sum_{m=1}^{M} \frac{s_m}{v_m}$$

If we define the velocity of a light ray in a medium of index $n$ to be $v = \frac{c}{n}$, then:

$$t = \sum_{m=1}^{M} \frac{s_m}{\left(c / n_m\right)} = \frac{1}{c} \sum_{m=1}^{M} \left(n_m s_m\right) \equiv \frac{\mathrm{OPL}}{c}$$

where the optical path length (OPL) is defined:

$$\mathrm{OPL} \equiv \sum_{m=1}^{M} \left(n_m s_m\right)$$

For a single medium, the optical path length is:

$$\mathrm{OPL} \equiv n \cdot s$$

Note that the optical path length is at least as long as the physical path length; it is the distance that a ray would travel in vacuum in the same time that it would take to travel the physical distance $s$. The optical path is longer than the physical path because light travels more slowly in the medium ($n_m \geq 1$). The principle of least time may be restated: a light ray requires the least time to traverse the path with the shortest optical path length, or:

A ray traverses the route with the shortest optical path length.

This suggests a philosophical question, “How does the light ray know which path to take before it leaves the source?” I leave it to you to ponder this question, but will say that the difficulty of formulating an answer suggests the limitation of the (simple) ray model for light propagation.
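The definition of the optical path length as a sum of index-weighted distances can be checked numerically. The following is a minimal sketch; the three-segment path (air, glass, water) and its indices and lengths are hypothetical values chosen only for illustration.

```python
# Optical path length (OPL) and travel time for a ray crossing several media.
C = 299_792_458.0  # speed of light in vacuum [m/s]

def optical_path_length(segments):
    """Sum n_m * s_m over (refractive index, physical length in m) pairs."""
    return sum(n * s for n, s in segments)

def travel_time(segments):
    """Travel time t = OPL / c, in seconds."""
    return optical_path_length(segments) / C

# Hypothetical path: 1 m of air, 10 mm of glass, 0.5 m of water
path = [(1.0, 1.0), (1.5, 0.010), (1.33, 0.5)]

opl = optical_path_length(path)            # index-weighted length [m]
physical = sum(s for _, s in path)         # plain geometric length [m]
t_direct = sum(s / (C / n) for n, s in path)  # sum of s_m / v_m

print(f"physical length = {physical:.3f} m, OPL = {opl:.3f} m")
print(f"travel time = {travel_time(path):.4e} s")
```

The time computed segment by segment from $s_m / v_m$ agrees with OPL/$c$, which is the content of the derivation above.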

2.7 Fermat’s Principle for Reflection

Now consider the path traveled upon reflection that minimizes an easily evaluated optical path length:

Schematic for determining the angle of reflection using Fermat’s principle.

As drawn, the angle θ1 is positive (measured from the normal to the ray) and θ2 is negative (from the normal to the ray). The ray travels in the same medium of index n both before and after reflection. The components of the optical path length are:

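$$\overline{so} = \sqrt{h^2 + x^2}$$

$$\overline{op} = \sqrt{b^2 + (a - x)^2}$$

And the expression for the total optical path length (OPL) is:

$$\begin{aligned}
\mathrm{OPL} &= n \cdot \left(\overline{so} + \overline{op}\right) \\
&= n \left(\sqrt{h^2 + x^2} + \sqrt{b^2 + (a - x)^2}\right) \\
&= \mathrm{OPL}[x] \quad \text{(a function of } x\text{)}
\end{aligned}$$

By Fermat’s principle, the path traveled is the one that minimizes the optical path length, so the position of $o$ along the $x$-axis is found by setting the derivative of OPL with respect to $x$ to zero:

$$\begin{aligned}
\frac{d\,\mathrm{OPL}}{dx} &= \frac{d}{dx}\left(n \left(\sqrt{h^2 + x^2} + \sqrt{b^2 + (a - x)^2}\right)\right) = 0 \\
&= n \cdot \left(\frac{2x}{2\sqrt{h^2 + x^2}} + \frac{-2(a - x)}{2\sqrt{b^2 + (a - x)^2}}\right) = 0 \\
&\implies \frac{x}{\sqrt{h^2 + x^2}} - \frac{a - x}{\sqrt{b^2 + (a - x)^2}} = 0 \\
&\implies \frac{x}{\sqrt{h^2 + x^2}} = \frac{a - x}{\sqrt{b^2 + (a - x)^2}}
\end{aligned}$$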

From the drawing, note that:

$$\sin[\theta_1] = \frac{x}{\sqrt{h^2 + x^2}}, \qquad \sin[-\theta_2] = \frac{a - x}{\sqrt{b^2 + (a - x)^2}}$$

$$\implies \sin[\theta_1] = \sin[-\theta_2] \implies \theta_2 = -\theta_1$$

In words, the magnitudes of the angles of incidence and reflection are equal (as already derived by evaluating Maxwell’s equations at the boundary). The negative sign is necessary because of the sign convention for the angle: the angle is measured from the normal and increases in the counterclockwise direction. Because reflection reverses the propagation direction of the ray, the result also may be “explained” by assuming that the index of refraction for the image space is the negative of that for the object space.
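The minimization above can also be carried out numerically as a sanity check. This is a sketch under stated assumptions: the heights `h`, `b` and separation `a` are hypothetical, and the ternary search is just one simple way to minimize a function with a single minimum; it is not part of the derivation in the notes.

```python
import math

def opl_reflection(x, h, b, a, n=1.0):
    """OPL of the path s -> o -> p when the mirror is struck at position x."""
    return n * (math.hypot(h, x) + math.hypot(b, a - x))

def minimize(f, lo, hi, tol=1e-9):
    """Ternary search for the minimum of a function with a single minimum."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Hypothetical geometry: source height h, observer height b, separation a
h, b, a = 2.0, 3.0, 5.0
x_min = minimize(lambda x: opl_reflection(x, h, b, a), 0.0, a)

sin_theta1 = x_min / math.hypot(h, x_min)                 # sin[theta1]
sin_minus_theta2 = (a - x_min) / math.hypot(b, a - x_min)  # sin[-theta2]

print(f"x at minimum OPL = {x_min:.6f}")
print(f"sin(theta1) = {sin_theta1:.6f}, sin(-theta2) = {sin_minus_theta2:.6f}")
```

For this geometry the condition $x/h = (a - x)/b$ gives $x = ah/(h + b) = 2$, and the two sines agree to within the tolerance of the search.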

Snell’s law for reflection at interface.

Note that Snell’s law for reflection does not include either refractive index $n$, which means that the outgoing ray angle is not affected by the refractive indices of the two media, so the image location and quality are not influenced by the indices. The “amount” of the ray that is reflected IS affected by the two refractive indices via the Fresnel equations, which require the principles of wave optics for explanation. At this point, we will just introduce the relationship without proof. If light is incident normally to the interface between two media ($\theta = 0$) with refractive indices $n_1$ and $n_2$, the reflectivity of the surface obeys:

$$R = \left(\frac{n_1 - n_2}{n_1 + n_2}\right)^2 \quad \text{if } \theta = 0$$

If the first medium is air with $n_1 \cong 1$ and the second is glass with $n_2 \cong 1.5$, the reflectivity is:

$$R = \left(\frac{1 - 1.5}{1 + 1.5}\right)^2 = 0.04$$

Note that the reflectivity is the same if the first medium is glass and the second is air:

$$R = \left(\frac{1.5 - 1}{1.5 + 1}\right)^2 = 0.04$$

The reflectivity at different incident angles obeys more complicated expressions, in part because the light must be decomposed into different polarizations depending on the direction of oscillation of the electric field.
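The normal-incidence reflectivity is easy to evaluate for other index pairs. A small sketch follows; the function name is an illustrative choice, and the two-surface estimate at the end is an added illustration that ignores absorption and multiple reflections.

```python
def normal_reflectivity(n1, n2):
    """Reflectivity R = ((n1 - n2) / (n1 + n2))**2 at normal incidence."""
    return ((n1 - n2) / (n1 + n2)) ** 2

r_air_glass = normal_reflectivity(1.0, 1.5)  # air -> glass
r_glass_air = normal_reflectivity(1.5, 1.0)  # glass -> air (same value)

# Illustration only: fraction transmitted through both surfaces of an
# uncoated glass plate, ignoring absorption and multiple reflections.
t_two_surfaces = (1.0 - r_air_glass) * (1.0 - r_glass_air)

print(f"R(air->glass) = {r_air_glass:.4f}")
print(f"R(glass->air) = {r_glass_air:.4f}")
print(f"two-surface transmission = {t_two_surfaces:.4f}")
```

Because the index difference is squared, swapping the two media leaves $R$ unchanged, as noted above.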

2.7.1 Plane Mirrors

Other than perhaps the pinhole, the simplest image forming system is the plane mirror, which is so familiar that it may seem hardly worth mentioning. Clearly its action obeys Snell’s reflection law that $\theta_2 = -\theta_1$, which means that the appearance of an image is “reversed” relative to the object, i.e., the parity of the image is inverted. It also allows introduction of the concepts of object space and image space, which will be used thenceforth and forevermore. The object space is the locus of points where objects may exist, which is all points “in front of” the mirror (real objects) and “behind” the mirror (virtual objects). A real object forms a virtual image “behind” the mirror, and a virtual object forms a real image “in front of” the mirror. In other words, the object and image spaces for reflection by a plane mirror both include the entire 3-D space.

Object and image space for a plane mirror. Rays diverging from a real object form a virtual image “behind” the mirror, but rays converging to a virtual object “behind” the mirror form a real image “in front of” the mirror.

2.8 Fermat’s Principle for Refraction:

Schematic for refraction using Fermat’s principle.

In this drawing, both θ1 and θ2 are positive (measured from the normal to the interface in the counterclockwise direction). The optical path length is:

$$\begin{aligned}
\mathrm{OPL} &= n_1 \cdot \overline{so} + n_2 \cdot \overline{op} \\
&= n_1 \sqrt{h^2 + x^2} + n_2 \sqrt{b^2 + (a - x)^2}
\end{aligned}$$

By Fermat’s principle, the path traveled is the one for which the OPL is minimized, so we again set the derivative of the OPL with respect to $x$ to zero and identify trigonometric functions for the resulting ratios.

$$\frac{d\,\mathrm{OPL}}{dx} = n_1 \frac{2x}{2\sqrt{h^2 + x^2}} + n_2 \frac{-2(a - x)}{2\sqrt{b^2 + (a - x)^2}} = 0$$

$$\implies n_1 \frac{x}{\sqrt{h^2 + x^2}} = n_2 \frac{a - x}{\sqrt{b^2 + (a - x)^2}}$$

$$\sin[\theta_1] = \frac{x}{\sqrt{h^2 + x^2}}, \qquad \sin[\theta_2] = \frac{a - x}{\sqrt{b^2 + (a - x)^2}}$$

$$\implies n_1 \sin[\theta_1] = n_2 \sin[\theta_2] \quad \text{(Snell’s Law for refraction)}$$

Note that with this sign convention, Snell’s law may be applied to reflection by setting the refractive index of the second medium to be the negative of the first:

$$\begin{aligned}
n_1 \sin[\theta_1] &= n_2 \sin[\theta_2] \\
\implies n_1 \sin[\theta_1] &= -n_1 \sin[\theta_2] \\
\implies -\sin[\theta_1] &= \sin[\theta_2] \\
\implies \theta_2 &= -\theta_1
\end{aligned}$$
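As with reflection, the result can be checked by minimizing the optical path length numerically and then testing whether the minimizing point satisfies Snell’s law. This is a sketch under assumptions: the geometry (`h`, `b`, `a`) and the indices are hypothetical, and the ternary search is an illustrative minimizer, not a method from the notes.

```python
import math

def opl_refraction(x, h, b, a, n1, n2):
    """OPL of the path s -> o -> p when the interface is crossed at x."""
    return n1 * math.hypot(h, x) + n2 * math.hypot(b, a - x)

def minimize(f, lo, hi, tol=1e-9):
    """Ternary search for the minimum of a function with a single minimum."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Hypothetical geometry and media: air above the interface, glass below
h, b, a = 2.0, 3.0, 5.0
n1, n2 = 1.0, 1.5

x_min = minimize(lambda x: opl_refraction(x, h, b, a, n1, n2), 0.0, a)
sin_theta1 = x_min / math.hypot(h, x_min)
sin_theta2 = (a - x_min) / math.hypot(b, a - x_min)

print(f"n1*sin(theta1) = {n1 * sin_theta1:.6f}")
print(f"n2*sin(theta2) = {n2 * sin_theta2:.6f}")
```

The two printed products agree to within the search tolerance, which is Snell’s law recovered from the least-time condition.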

The expression of Snell’s law for refraction is general, but we can easily apply the first-order paraxial approximation that $\sin[\theta] \cong \theta$ if the ray angles are small ($\theta_n \cong 0$):

$$n_1 \sin[\theta_1] = n_2 \sin[\theta_2] \implies n_1 \cdot \theta_1 \cong n_2 \cdot \theta_2 \quad \text{in paraxial approximation}$$

$$\implies \theta_2 \cong \frac{n_1}{n_2} \cdot \theta_1 \quad \text{in paraxial approximation}$$
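To get a feeling for how small “small” must be, the exact and paraxial refraction angles can be compared directly. The sketch below assumes an air-to-glass interface with $n_2 = 1.5$; the list of incidence angles is arbitrary.

```python
import math

def refract_exact(theta1, n1, n2):
    """Refraction angle from Snell's law: n1 sin(theta1) = n2 sin(theta2)."""
    return math.asin(n1 * math.sin(theta1) / n2)

def refract_paraxial(theta1, n1, n2):
    """Paraxial (small-angle) approximation: theta2 = (n1 / n2) * theta1."""
    return n1 * theta1 / n2

def relative_error(theta1, n1=1.0, n2=1.5):
    """Fractional error of the paraxial angle relative to the exact angle."""
    exact = refract_exact(theta1, n1, n2)
    return abs(refract_paraxial(theta1, n1, n2) - exact) / exact

for degrees in (1, 5, 10, 20, 40):
    theta1 = math.radians(degrees)
    print(f"theta1 = {degrees:2d} deg: relative error = {relative_error(theta1):.2e}")
```

The error grows rapidly with incidence angle, which is why first-order results are reliable only near the optical axis.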

2.8.1 Dispersion

Unlike the reflection law, Snell’s law for refraction DOES include the refractive indices. This means that the angle of refraction will change as the indices change, as with wavelength. All (or perhaps I should say ALL) transparent materials exhibit a variation in refractive index with wavelength, which is called dispersion. Note that the features of dispersion depend on the material (e.g., glass). The full explanation of dispersion is beyond the scope of this course, so we will just describe its effects.

In a transparent material over the range of visible wavelengths, the refractive index $n$ DECREASES with increasing $\lambda$. In the study of wave optics, this ensures that the phase velocity for the “average” wave, $v_\phi = \frac{\omega}{k}$, is larger than the group or modulation velocity $\frac{d\omega}{dk}$. Among other things, this ensures that a signal transmitted as a modulation of a light wave cannot travel at a speed faster than the velocity of light. A schematic dispersion curve for a hypothetical glass is shown in the figure; note that the magnitude of the slope of the dispersion curve decreases with increasing $\lambda$; the curve “flattens out” as $\lambda$ increases in the visible range.

Typical dispersion curve for glass at visible wavelengths, showing the decrease in n with increasing λ and the three spectral wavelengths specified by Fraunhofer and used to specify the “refractivity”, “mean dispersion”, and “partial dispersion” of a material.

The refractive indices for several real glasses show an additional feature of dispersion curves: the relationship between the “amount” of dispersion and the refractive index. Glasses with lower refractive index ($n \cong 1.5$, the so-called crown glasses) have a “flatter” graph and therefore less dispersion. In other words, $n_{blue}$ is larger than $n_{red}$, but not much larger, so that the smaller the refractive index, the smaller the dispersion. Flint glasses have larger values of the refractive index ($n \cong 1.7$) and larger variations across the visible spectrum:

$$\left(n_{blue} - n_{red}\right)_{flint} > \left(n_{blue} - n_{red}\right)_{crown}$$

Dispersion curves for various optical glasses as a function of wavelength $\lambda$ in the visible region of the spectrum (measured in Angstroms, where 1 Å = 0.1 nm = $10^{-10}$ m, 4000 Å = 400 nm). The rapid rise in the index at wavelengths in the ultraviolet region is due to the atomic resonances there.

If we use the paraxial approximation for rays in air entering a glass with refractive index $n_2$, the outgoing ray angle $\theta_2$ is:

$$\theta_2 \cong \frac{1}{n_2} \cdot \theta_1 \quad \text{in paraxial approximation}$$

Dispersion ensures that $(n_2)_{blue} > (n_2)_{red}$, which means that $(\theta_2)_{blue} < (\theta_2)_{red}$ and the deviation angle $\delta_{blue} > \delta_{red}$. Since the outgoing ray angles are different for different colors, images will be formed at different distances in different colors. This is the source of chromatic aberration in imaging systems.
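The size of this color-dependent difference can be estimated with the exact form of Snell’s law. The sketch below borrows the crown-glass indices at the Fraunhofer F (blue) and C (red) lines from the table in the section on refractive constants; the 30° incidence angle is a hypothetical choice.

```python
import math

def refraction_angle(theta1, n2, n1=1.0):
    """Exact refraction angle for a ray passing from index n1 into index n2."""
    return math.asin(n1 * math.sin(theta1) / n2)

N_BLUE = 1.52225  # crown glass at the F line (486.13 nm)
N_RED = 1.51418   # crown glass at the C line (656.28 nm)

theta1 = math.radians(30.0)  # hypothetical incidence angle in air

theta2_blue = refraction_angle(theta1, N_BLUE)
theta2_red = refraction_angle(theta1, N_RED)

# Deviation from the original direction of travel
delta_blue = theta1 - theta2_blue
delta_red = theta1 - theta2_red

print(f"theta2 blue = {math.degrees(theta2_blue):.4f} deg")
print(f"theta2 red  = {math.degrees(theta2_red):.4f} deg")
print(f"delta_blue - delta_red = {math.degrees(delta_blue - delta_red):.4f} deg")
```

The blue ray is bent closer to the normal and is therefore deviated more than the red ray, in agreement with the inequalities above.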

Effect of dispersion on refraction: since the refractive index for red light is smaller, the angle of refraction measured from the normal is larger. Put another way, this means that the deviation angle due to refraction is smaller for red light than for blue light.

In imaging, we often think of dispersion in refractive elements as an unfortunate “bug” in the system, but you probably also know that it can be a very useful feature; it provides a tool for spreading white light into its constituent spectrum in a dispersing prism.

Dispersing prism with the two refractions, showing that the angle of deviation from the original path is larger for blue light than for red light.

From the figure, note that the angle of deviation of the ray from the original path is larger for blue light due to the dispersion of light:

$$\delta_{blue} > \delta_{red} \quad \text{for prism}$$

The relationship between the wavelength and the deviation angle is complicated for refraction. As a side comment, note that light may also be dispersed into its spectrum by the phenomenon of diffraction in gratings. However, the relationship between the wavelength and the deviation angle for diffraction is very simple: the angle of deviation is proportional to the wavelength (for small angles):

$$\delta \propto \lambda \implies \delta_{blue} < \delta_{red} \quad \text{for grating}$$

This means that it is easier to construct an accurate spectrometer based on diffraction than based on refractive dispersion.

2.8.2 Refractive Constants for Glasses

The refractive properties of glass are approximately specified by the refractivity and the measured differences in refractive index at the three Fraunhofer wavelengths F, D, and C:

Refractivity: $n_D - 1$, with $1.5 \leq n_D \leq 1.75$

Mean Dispersion: $n_F - n_C > 0$ (difference between blue and red indices)

Partial Dispersion: $n_D - n_C > 0$ (difference between yellow and red indices)

Abbé Number: $\nu \equiv \frac{n_D - 1}{n_F - n_C}$ (ratio of refractivity and mean dispersion), with $25 \leq \nu \leq 65$ (note that larger dispersions result in smaller Abbé numbers)

Glasses are specified by six-digit numbers abcdef, where $n_D = 1.abc$, to three decimal places, and the Abbé number $\nu = de.f$. Note that larger values of the refractivity mean that the refractive index is larger and thus so is the deviation angle in Snell’s law. A larger Abbé number means that the mean dispersion is smaller and thus there will be a smaller difference in the angles of refraction. Such glasses with larger Abbé numbers and smaller indices and less dispersion are crown glasses, while glasses with smaller Abbé numbers are flint glasses, which are “denser”. Examples of glass specifications include Borosilicate crown glass (BSC), which has a specification number of 517645, so its refractive index in the D line is 1.517 and its Abbé number is $\nu = 64.5$. The specification number for a common flint glass is 619364, so $n_D = 1.619$ (relatively large) and $\nu = 36.4$ (smallish). Now consider the refractive indices in the three lines for two different glasses: “crown” (with a smaller $n$) and “flint”:

Line    λ [nm]    n for Crown    n for Flint

C       656.28    1.51418        1.69427
D       589.59    1.51666        1.70100
F       486.13    1.52225        1.71748

The glass specification numbers for the two glasses are evaluated to be:

For the crown glass:

refractivity: $n_D - 1 = 0.51666 \cong 0.517$

Abbé number: $\nu = \frac{1.51666 - 1}{1.52225 - 1.51418} \cong 64.0$

Glass number = 517640

For the flint glass:

refractivity: $n_D - 1 = 0.70100 \cong 0.701$

Abbé number: $\nu = \frac{1.70100 - 1}{1.71748 - 1.69427} \cong 30.2$

Glass number = 701302
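The arithmetic for the two glass numbers can be reproduced in a few lines. This is a minimal sketch; the function names are illustrative, and the rounding rule (three decimals of refractivity followed by the Abbé number to one decimal) simply follows the abcdef convention described above.

```python
def abbe_number(n_c, n_d, n_f):
    """Abbe number: refractivity (n_D - 1) divided by mean dispersion (n_F - n_C)."""
    return (n_d - 1.0) / (n_f - n_c)

def glass_number(n_c, n_d, n_f):
    """Six-digit code abcdef with n_D = 1.abc and Abbe number = de.f."""
    abc = round((n_d - 1.0) * 1000)                  # refractivity, 3 decimals
    def_ = round(abbe_number(n_c, n_d, n_f) * 10)    # Abbe number, 1 decimal
    return f"{abc:03d}{def_:03d}"

# Indices at the C, D, and F lines from the table above
crown = dict(n_c=1.51418, n_d=1.51666, n_f=1.52225)
flint = dict(n_c=1.69427, n_d=1.70100, n_f=1.71748)

for name, glass in (("crown", crown), ("flint", flint)):
    print(f"{name}: Abbe number = {abbe_number(**glass):.1f}, "
          f"glass number = {glass_number(**glass)}")
```

Running the sketch gives 517640 for the crown glass and 701302 for the flint glass, matching the hand calculation.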

Dispersion curve of a material from very short to very long wavelengths. The index increases with increasing λ as additional resonances are passed, but the index of refraction decreases with increasing wavelength in the visible wavelengths (bold face).

The dispersion curves for optically transparent materials, such as glass and air, exhibit some very similar features, though the details may be significantly different. Starting at very short wavelengths ($\lambda \cong 0$), the refractive index $n$ is approximately unity. In words, the wavelength is so short (and the oscillation frequency so large) that the energy per photon is very large, so that photons pass through the material without interacting with the atoms; the material appears to be vacuum. For longer (but still very short) wavelengths (“hard” X rays), the refractive index actually is slightly less than unity, which means that X rays incident on a prism are refracted away from the prism’s base, rather than towards the base in the manner of visible light. This is the reason why X rays can be totally reflected at grazing incidence, which is the focusing mechanism used in X-ray telescopes (such as Chandra). As the wavelength of the incident light increases further, though still within the X-ray region, the radiation incident on the material is heavily absorbed; this is the “K-absorption edge” where the energy of the incident X rays is just sufficient to ionize an electron in the innermost atomic “shell” (the “K shell”). For example, the wavelength of this absorption is $\lambda_K \cong 0.67$ nm for silicon. Other absorptions occur at yet longer wavelengths (smaller incident photon energies), where electrons in the L and M shells, etc., of the atom are ionized. The spectrum of a material with a large atomic number (and thus several filled electron shells) will exhibit several such resonant absorptions.

Ionization of a K-shell electron by an incoming X ray of sufficient energy. This is the reason for the large absorptions of “hard” X rays by materials. Lower-energy (longer-wavelength) X rays will ionize electrons in the L or M shells, thus producing other absorption “edges.”

As the wavelength of the incident radiation increases further, into the “far ultraviolet” region of the spectrum, the real part of the refractive index decreases to a value much less than unity within a wide band of anomalous dispersion. The fact that $n < 1$ in this region may be confusing because it seems that the velocity of light exceeds $c$, but these waves do not propagate in the material due to the strong absorption (large value of $\kappa$). The wavelength of maximum absorption corresponds to the largest of the several “natural oscillation frequencies” of bound electrons in the material.

In the visible region of the spectrum, the dispersion curve exhibits the familiar decrease in $n$ with $\lambda$ that was shown above. For example, the index of air is $n \cong 1.000279$ at $\lambda = 486.1$ nm (Fraunhofer’s “F” line) and $n \cong 1.000276$ at $\lambda = 656.3$ nm (“C” line). The corresponding values for diamond are $n_F = 2.4354$ and $n_C = 2.4100$. The closer the nearest ultraviolet absorption to the visible spectrum, the steeper will be the slope $\frac{dn}{d\lambda}$ in the visible region and thus the larger the visible dispersion (defined below).

The dispersion curve descends yet more steeply somewhere in the near infrared region and then rises due to anomalous dispersion in the vicinity of an infrared absorption band (labeled “$\lambda_2$” on the graph). For quartz (crystalline SiO$_2$), the center of this band is located at $\lambda \cong 8.5$ μm, but the absorption already is quite strong for wavelengths as short as $\lambda \cong 4$ μm. Most optical materials have several such infrared absorption bands and the “base level” of the index of refraction is larger after each such band. This behavior is confirmed by far-infrared measurements of the refractive index of quartz (crystalline SiO$_2$), which lies in the interval $2.14 \leq n \leq 2.40$ for 51 μm $\leq \lambda \leq$ 63 μm. The large values of $n$ ensure that the focal length of a convex quartz lens is much shorter at far-infrared wavelengths than at visible wavelengths.

As the wavelength is increased still further into the radio region of the spectrum after the last absorption band, the refractive index decreases slowly due to normal dispersion from that last absorption and approaches a limiting value of $\sqrt{\epsilon / \epsilon_0} = \sqrt{\epsilon_r}$, the square root of the relative permittivity.

2.9 Image Formation in the Ray Model

We know that light rays are deviated at interfaces between media with different refractive indices. The goal in this section is to use interfaces of specified shapes to “collect” the light and “reshape” the wavefronts in a way that recreates “images” of the original sources.

2.9.1 Refraction at a Spherical Surface

Optical systems typically are used to form images of the source distribution by constructing optical elements (“lenses”) made out of transparent media with different refractive indices to redirect the electromagnetic radiation. Until rather recently, lenses were fabricated almost exclusively from glass, which required the optical surfaces to be ground to the desired curvature and polished to remove scratches, etc., from the grinding. Two pieces of glass are typically employed in the grinding process: the “optic” and the “tool.” Water and a grinding compound composed of flecks of some hard substance resembling sand are placed on the surface of one glass and the two surfaces rubbed together with some force applied to the top optic. In the grinding process, the surface that is easiest to fabricate is a sphere, because the two surfaces will be in contact at all translations. Glass is ground out of the center of the top piece and off of the edges of the bottom piece, leaving a concave sphere on top and a convex sphere on the bottom. The “grit” of the grinding compound is reduced gradually to leave a smoother surface. The surface is then polished using very fine “jeweler’s rouge” to produce smooth surfaces of “optical” quality. More recently, optical elements have been fabricated from thin plates cemented over a hollowed-out “grid” to lighten the weight. Also, plastics and other materials have been developed that may be cast to produce optical surfaces of various shapes with minimal polishing.

Grinding optical surfaces: a slurry of water and grinding compound (e.g., carborundum) is placed between two glass surfaces. The top glass is pushed down and moved around to grind glass from the center region of the top piece. The resulting surfaces must be spherical because they are the only curves that remain in contact at all locations.

Consider the action of a spherical surface of a medium with index n2 on an incident ray in a medium of index n1:

Refraction at a spherical surface between two media of refractive index n1 and n2.

The point source is located at $s$ and its distance to the vertex $v$ is $\overline{sv} \equiv z_1 > 0$. The distance from vertex $v$ to the observation point $p$ is $\overline{vp} \equiv z_2 > 0$. The physical distance traveled by a ray in medium $n_1$ to the surface is $\overline{sa} \equiv \ell_1$ and that in medium $n_2$ is $\overline{ap} \equiv \ell_2$. The radius of curvature of the surface is $\overline{vc} = \overline{ac} \equiv R > 0$ as drawn. For emphasis, we repeat that $z_1$, $z_2$, and $R$ are all positive in our convention. The ray intersects the surface at the point $a$, located at angle $\varphi$ (the “position angle”) measured at the center of curvature $c$. The optical path length of the ray from $s$ to $p$ through $a$ is

$$\mathrm{OPL} = n_1 \ell_1 + n_2 \ell_2 = n_1 \left(\overline{sa}\right) + n_2 \left(\overline{ap}\right)$$

The triangle $\triangle sac$ has sides $\ell_1$ and $R$ with third side $z_1 + R$, while $\triangle acp$ has sides $R$ and $z_2 - R$, with third side $\overline{ap} \equiv \ell_2$. The physical lengths $\ell_1$ and $\ell_2$ may be evaluated from the other two sides and the included angle ($\varphi$ for $\triangle sac$, $\pi - \varphi$ for $\triangle acp$) via the law of cosines:

$$\begin{aligned}
\triangle sac \implies \ell_1^2 &= (z_1 + R)^2 + R^2 - 2R (z_1 + R) \cos[\varphi] \\
\implies \ell_1 &= \sqrt{(z_1 + R)^2 + R^2 - 2R (z_1 + R) \cos[\varphi]} \\
\triangle acp \implies \ell_2^2 &= (z_2 - R)^2 + R^2 - 2R (z_2 - R) \cos[\pi - \varphi] \\
\implies \ell_2 &= \sqrt{(z_2 - R)^2 + R^2 + 2R (z_2 - R) \cos[\varphi]} \\
&= \sqrt{(z_2 - R)^2 + R^2 - 2R (R - z_2) \cos[\varphi]}
\end{aligned}$$

The corresponding optical path length is:

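$$\begin{aligned}
\mathrm{OPL} &= n_1 \ell_1 + n_2 \ell_2 \\
&= n_1 \cdot \sqrt{(z_1 + R)^2 + R^2 - 2R (R + z_1) \cos[\varphi]} \\
&\quad + n_2 \cdot \sqrt{(z_2 - R)^2 + R^2 - 2R (R - z_2) \cos[\varphi]}
\end{aligned}$$

which is obviously a function of the position angle $\varphi$. We can now apply Fermat’s principle to find the angle $\varphi$ for which the OPL is a minimum:

$$\begin{aligned}
\frac{d}{d\varphi}(\mathrm{OPL}) = 0 &= \frac{n_1 \cdot 2R (R + z_1) \sin[\varphi]}{2\sqrt{(z_1 + R)^2 + R^2 - 2R (R + z_1) \cos[\varphi]}} + \frac{n_2 \cdot 2R (R - z_2) \sin[\varphi]}{2\sqrt{(z_2 - R)^2 + R^2 - 2R (R - z_2) \cos[\varphi]}} \\
&= R \sin[\varphi] \left(\frac{n_1 (R + z_1)}{\ell_1} + \frac{n_2 (R - z_2)}{\ell_2}\right)
\end{aligned}$$

which may be rearranged to:

$$\begin{aligned}
0 &= R \sin[\varphi] \left(\frac{n_1 (R + z_1)}{\ell_1} + \frac{n_2 (R - z_2)}{\ell_2}\right) \\
\implies 0 &= \frac{n_1 (R + z_1)}{\ell_1} + \frac{n_2 (R - z_2)}{\ell_2} \\
\implies \frac{n_1 R}{\ell_1} + \frac{n_2 R}{\ell_2} &= \frac{n_2 z_2}{\ell_2} - \frac{n_1 z_1}{\ell_1} \\
\implies \frac{n_1}{\ell_1} + \frac{n_2}{\ell_2} &= \frac{1}{R} \left(\frac{n_2 z_2}{\ell_2} - \frac{n_1 z_1}{\ell_1}\right)
\end{aligned}$$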

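This last relation between the physical path lengths $\ell_1$ and $\ell_2$ and the distances $z_1$ and $z_2$ is exact. Now we use the expression for the physical path length $\ell_1$ to find its ratio relative to the axial distance $z_1$ and use simple algebra to rearrange:

$$\begin{aligned}
\frac{\ell_1}{z_1} &= \frac{\sqrt{(z_1 + R)^2 + R^2 - 2R (z_1 + R) \cos[\varphi]}}{z_1} \\
&= \left(\frac{(z_1 + R)^2 + R^2 - 2R (z_1 + R) \cos[\varphi]}{z_1^2}\right)^{\frac{1}{2}} \\
&= \left(\frac{z_1^2 + R^2 + 2R z_1 + R^2 - 2R^2 \cos[\varphi] - 2R z_1 \cos[\varphi]}{z_1^2}\right)^{\frac{1}{2}} \\
&= \left(1 + \left(\frac{2R^2}{z_1^2} + \frac{2R}{z_1}\right) \left(1 - \cos[\varphi]\right)\right)^{\frac{1}{2}}
\end{aligned}$$

This relation also is exact, but may be approximated by applying a truncated series for $\cos[\varphi]$:

$$\cos[\varphi] = 1 - \frac{\varphi^2}{2!} + \frac{\varphi^4}{4!} - \frac{\varphi^6}{6!} + \cdots \cong 1 \quad \text{if } \varphi \cong 0$$

$$\begin{aligned}
\implies 1 - \cos[\varphi] &= 1 - \left(1 - \frac{\varphi^2}{2!} + \frac{\varphi^4}{4!} - \frac{\varphi^6}{6!} + \cdots\right) \\
&= \frac{\varphi^2}{2!} - \frac{\varphi^4}{4!} + \frac{\varphi^6}{6!} - \cdots \\
&\cong 0 \quad \text{if } \varphi \cong 0
\end{aligned}$$

This leads to the first-order approximation that the path length and axial length are approximately equal:

$$\frac{\ell_1}{z_1} \cong 1 \implies \ell_1 \cong z_1$$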

Similarly, we can show that:

$$\ell_2 \cong z_2$$

This paraxial or Gaussian approximation (also called first-order optics because the cosine series is truncated after the terms of first order in $\varphi$, leaving $\cos[\varphi] \cong 1$) is valid only for small ray angles $\varphi$ measured from the optical axis. In words, the optical path lengths of rays that travel along the optical axis and rays that travel “away” from the axis (but still with $\varphi \cong 0$) are equal. The simplified imaging equation has the form:

$$\frac{1}{R} \left(\frac{n_2 z_2}{\ell_2} - \frac{n_1 z_1}{\ell_1}\right) \cong \frac{1}{R} (n_2 - n_1)$$

$$\implies \frac{n_1}{z_1} + \frac{n_2}{z_2} \cong \frac{1}{R} (n_2 - n_1)$$

This is the paraxial imaging equation for a single surface; clearly it is an approximation to the true equation, and also clearly it is similar to the imaging equation we have already considered.

Object at Infinite Distance

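Now consider some pairs of object and image distances $z_1$ and $z_2$. If the object is located at $-\infty$ (so that the object distance is $z_1 = +\infty$), then:

$$\frac{n_1}{\infty} + \frac{n_2}{z_2} = \frac{n_2}{z_2} \cong \frac{1}{R} (n_2 - n_1)$$

$$\implies z_2 \cong \frac{n_2 R}{n_2 - n_1} \equiv f_2 \quad \text{the “image-space focal length”}$$

which is what we “normally” think of as being the focal length of the optic.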

Image at Infinite Distance

If the image is located at $+\infty$, the object distance must be

$$\frac{n_1}{z_1} \cong \frac{1}{R} (n_2 - n_1) \implies z_1 \cong \frac{n_1 R}{n_2 - n_1} \equiv f_1 \quad \text{the “object-space focal length”}$$

$$\frac{n_1}{f_1} = \frac{1}{R} (n_2 - n_1)$$

Also note that:

$$\frac{f_1}{f_2} = \frac{\left(\dfrac{n_1 R}{n_2 - n_1}\right)}{\left(\dfrac{n_2 R}{n_2 - n_1}\right)} = \frac{n_1}{n_2} \implies n_1 \cdot f_2 = n_2 \cdot f_1$$

In words, the ratio of the focal lengths in the two spaces (object and image) is the ratio of the indices of refraction in the two spaces.

Rule of Thumb: Estimating focal lengths of converging lenses: For a single positive (converging) lens (i.e., not a lens “system” with multiple elements), it is easy to estimate the focal length of a lens by finding the distance from the lens to the image of a distant bright object. The requirement for “distant” is not critical: forming the image of a ceiling lamp on the floor or a tabletop will give a useful estimate for a positive lens with a short focal length.
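The paraxial imaging equation for a single surface and its two focal lengths can be explored with a short calculation. This is a sketch under assumptions: the function names are illustrative, the surface (air to glass, $R = 0.1$ m) is hypothetical, and the sign convention is the one used above ($z_1$, $z_2$, $R$ positive as drawn).

```python
import math

def image_distance(z1, n1, n2, R):
    """Solve n1/z1 + n2/z2 = (n2 - n1)/R for the image distance z2."""
    vergence = (n2 - n1) / R - n1 / z1
    return math.inf if vergence == 0 else n2 / vergence

def focal_lengths(n1, n2, R):
    """Object-space and image-space focal lengths (f1, f2) of the surface."""
    f1 = n1 * R / (n2 - n1)
    f2 = n2 * R / (n2 - n1)
    return f1, f2

# Hypothetical surface: air (n1 = 1) to glass (n2 = 1.5), R = 0.1 m
n1, n2, R = 1.0, 1.5, 0.1
f1, f2 = focal_lengths(n1, n2, R)

print(f"f1 = {f1:.3f} m, f2 = {f2:.3f} m")
print(f"object at infinity -> z2 = {image_distance(math.inf, n1, n2, R):.3f} m")
print(f"object at 0.6 m    -> z2 = {image_distance(0.6, n1, n2, R):.3f} m")
```

An object at infinity is imaged at $f_2$, and the two focal lengths satisfy $n_1 f_2 = n_2 f_1$, as derived above.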

2.9.2 Imaging with Spherical Mirrors

The equation for a single refractive surface may be used to derive the focal length of a spherical mirror by setting the refractive index of image space to the negative of that in object space:

$$\phi = \frac{1}{f} = \frac{1}{R} (-n_1 - n_1) = -\frac{2 n_1}{R}$$

The equation for the focal length of a spherical mirror is therefore:

$$f = -\frac{R}{2n} \to -\frac{R}{2} \quad \text{in air}$$

In words, the magnitude of the focal length of a spherical mirror is half of the radius of curvature; the focal length is positive (converging) if $R < 0$ and negative if $R > 0$, as shown.

Spherical mirrors: concave mirror with negative radius of curvature R = VC < 0 makes outgoing light rays converge and so f > 0; convex mirror with positive radius of curvature makes rays diverge and f < 0.
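The sign behavior described in the caption can be checked with a one-line function. This is a minimal sketch; the radii are hypothetical and the function name is illustrative.

```python
def mirror_focal_length(R, n=1.0):
    """Focal length f = -R / (2 n) of a spherical mirror in a medium of index n."""
    return -R / (2.0 * n)

# Hypothetical radii, with R measured from vertex to center of curvature
f_concave = mirror_focal_length(-0.5)  # center to the left of vertex: R < 0
f_convex = mirror_focal_length(+0.5)   # center to the right of vertex: R > 0

print(f"concave mirror (R = -0.5 m): f = {f_concave:+.2f} m (converging)")
print(f"convex mirror  (R = +0.5 m): f = {f_convex:+.2f} m (diverging)")
```

The concave mirror ($R < 0$) has a positive focal length and the convex mirror ($R > 0$) a negative one, each with magnitude $|R|/2$.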

2.10 First-Order Imaging with Thin Lenses

Normally we do not consider the case of an object in one medium with the image in another — usually both object and image are in air and a lens (a “device” composed of material with different refractive index n and curved surfaces) diverts the rays to form the image. We can derive the formula for the object and image distances if we know the radii of the lens surfaces and the indices of refraction. We merely cascade the formula for a single surface:

$$\text{At first surface:} \quad \frac{n_1}{z_1} + \frac{n_2}{z_1'} = \frac{n_2 - n_1}{R_1}$$

$$\text{At second surface:} \quad \frac{n_2}{z_2} + \frac{n_3}{z_2'} = \frac{n_3 - n_2}{R_2}$$

where $z_1$ is the (usually known) object distance, $z_1'$ is the image distance for rays refracted by the first surface, $z_2$ is the object distance for the second surface, and $z_2'$ is the image distance for rays exiting the second surface (and thus from the lens). For the common “convex-convex” lens, the center of curvature of the first surface is to the right of the vertex, and thus the radius $R_1$ of the first surface is positive. Since the vertex is to the right of the center of curvature of the second surface, then $R_2 < 0$. If the lens is “thin”, then the ray encounters the second surface immediately after refraction at the first surface, so the ray heights at the two surfaces are the same. The object distance for the second surface is the negated image distance from the first: $z_2 = -z_1'$. Put another way, the absolute value of the image distance for the front surface $|z_1'|$ is the same as the object distance for the second surface $|z_2|$. If the lens is “thick”, then the object distance for the second surface is different from the image distance for the first, and the ray heights will be different if the ray angle is not zero. The thickness $t$ of the lens must satisfy the relationship:

$$z_1' + z_2 = t \implies z_2 = t - z_1' \quad \text{for thick lens}$$

For a thin lens with $t = 0$:

$$z_2 = 0 - z_1' \implies z_2 = -z_1' \quad \text{for thin lens}$$

The equations for the two surfaces may be added and the RHS may be rearranged to obtain a single imaging equation for a lens with two surfaces:

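$$\begin{aligned}
\left(\frac{n_1}{z_1} + \frac{n_2}{z_1'}\right) + \left(\frac{n_2}{z_2} + \frac{n_3}{z_2'}\right) &= \left(\frac{n_2 - n_1}{R_1}\right) + \left(\frac{n_3 - n_2}{R_2}\right) \\
&= \frac{n_3}{R_2} + n_2 \left(\frac{1}{R_1} - \frac{1}{R_2}\right) - \frac{n_1}{R_1}
\end{aligned}$$

For a thin lens with $t = 0$, substitute $z_2 = -z_1'$ (i.e., $z_1' = -z_2$) to obtain:

$$t = 0 \implies \frac{n_1}{z_1} + \frac{n_3}{z_2'} = \left(\frac{n_1}{z_1} + \frac{n_2}{-z_2}\right) + \left(\frac{n_2}{z_2} + \frac{n_3}{z_2'}\right)$$

$$\frac{n_1}{z_1} + \frac{n_3}{z_2'} = \frac{n_3}{R_2} + n_2 \left(\frac{1}{R_1} - \frac{1}{R_2}\right) - \frac{n_1}{R_1}$$

where the object is immersed in index $n_1$, the lens has index $n_2$, and the image is immersed in index $n_3$.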

In the usual case of both object and image in air so that $n_3 = n_1 = 1$, the equation simplifies to:

$$\frac{1}{z_1} + \frac{1}{z_2'} = \frac{1}{R_2} + n_2 \left(\frac{1}{R_1} - \frac{1}{R_2}\right) - \frac{1}{R_1}$$

$$\frac{1}{z_1} + \frac{1}{z_2'} = (n_2 - 1) \left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$

Note the similarity between this equation and that we inferred from the derivation of the image plane using wave optics:

$$\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}$$

where the distances $z_1$ and $z_2$ from the object to the lens and lens to image are what we had called $z_1$ and $z_2$ previously, and we identify:

$$(n_2 - 1) \left(\frac{1}{R_1} - \frac{1}{R_2}\right) = \frac{1}{f} = \frac{1}{z_1} + \frac{1}{z_2'} \quad \text{(Lensmaker’s Equation)}$$

which defines the focal length of the thin lens in terms of its physical parameters. This is the so-called lensmaker’s equation for thin lenses IN AIR; it determines the distance $z_2'$ to the image from the object distance $z_1$, the radii of curvature $R_1$ and $R_2$ of the spherical surfaces, and the index of refraction $n_2$ of the glass. Note that the object distance $z_1$ and the image distance $z_2'$ both appear with the same algebraic sign, which may be interpreted as demonstrating an “equivalence” of the object and image because the propagation of light rays may be reversed to exchange the roles of object and image. Corresponding object and image points (or object and image lines or object and image planes) are called conjugate points (or lines or planes). In the more general case where the refractive index of object space is $n_1$ and that of image space is $n_3 \neq n_1$, the image-space focal length $f$ of the thin lens (the image distance $z_2'$ for an object at $z_1 = \infty$) follows from the thin-lens equation above:

$$\frac{n_3}{f} = \frac{n_2 - n_1}{R_1} + \frac{n_3 - n_2}{R_2}$$
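The cascade of two single-surface equations can be traced numerically, which also shows how the thin-lens result emerges at $t = 0$. The following is a sketch under assumptions: function names are illustrative, the lens (double convex, $n_2 = 1.5$, $|R| = 0.1$ m) is hypothetical, and the transfer between surfaces uses $z_2 = t - z_1'$ from above.

```python
import math

def surface_image_distance(z, n_in, n_out, R):
    """Solve n_in/z + n_out/z' = (n_out - n_in)/R for z' (paraxial)."""
    vergence = (n_out - n_in) / R - n_in / z
    return math.inf if vergence == 0 else n_out / vergence

def lens_image_distance(z1, n2, R1, R2, t=0.0, n1=1.0, n3=1.0):
    """Image distance z2' after refraction at both surfaces of a lens."""
    z1_prime = surface_image_distance(z1, n1, n2, R1)  # image from surface 1
    z2 = t - z1_prime                                   # object for surface 2
    return surface_image_distance(z2, n2, n3, R2)

# Hypothetical double-convex lens in air
n2, R1, R2 = 1.5, 0.1, -0.1

thin = lens_image_distance(0.3, n2, R1, R2)            # t = 0
thick = lens_image_distance(0.3, n2, R1, R2, t=0.01)   # 10 mm thick
focal = lens_image_distance(math.inf, n2, R1, R2)      # object at infinity

print(f"thin lens:  z2' = {thin:.4f} m")
print(f"thick lens: z2' = {thick:.4f} m (measured from the second surface)")
print(f"thin-lens focal length = {focal:.4f} m")
```

With $t = 0$ the two-step trace reproduces the lensmaker’s equation ($f = 0.1$ m here, so an object at 0.3 m is imaged at 0.15 m); a nonzero thickness shifts the image slightly.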

2.10.1 Examples of Thin Lenses

1. Plano-convex lens, curved side forward (“convexo-planar lens”)

$$R_1 = |R_1| > 0$$

$$R_2 = \pm\infty \quad \text{(sign has no effect)}$$

$$\frac{1}{z_1} + \frac{1}{z_2'} = (n_2 - 1) \left(\frac{1}{|R_1|} - \frac{1}{\infty}\right) = \frac{n_2 - 1}{|R_1|} > 0$$

If $z_1 = +\infty$, then $z_2' = f > 0$, the focal length:

$$\frac{1}{f} = \frac{n_2 - 1}{R_1} = \phi \quad \text{system power (measured in meters}^{-1} = \text{diopters)}$$

$$f = \frac{R_1}{n_2 - 1} \cong 2 R_1 \quad \text{(since } n_2 \cong 1.5 \text{ for glass)}$$

We often use the “power” $\phi = f^{-1}$ (measured in m$^{-1}$ = diopters) instead of the focal length $f$ to describe the lens, since powers of different lenses combine by addition, instead of as reciprocals of sums of reciprocals. The power measures the ability of the lens or lens system to deviate rays, i.e., to change the ray angle.

2. Plano-convex lens, plane side forward:

$$R_1 = \pm\infty$$

$$R_2 = -|R_2| < 0$$

$$\frac{1}{z_1} + \frac{1}{z_2'} = -\frac{(n_2 - 1)}{R_2} = +\frac{(n_2 - 1)}{|R_2|} > 0$$

$$f = \frac{|R_2|}{n_2 - 1} \cong 2 |R_2|$$

So the focal length of the lens is the same regardless of its orientation (front-to-back). Since the focal lengths for the two configurations (curved side in front or behind lens) are the same, you might assume that the same image quality can be expected for the two configurations. This is NOT the case, but the explanation requires the theory of aberrations. At this point, we will just try to give a bit of motivation for another rule of thumb, while postponing the proof.

Rule of Thumb: Orientation of Plano-Convex Lens: When using a plano-convex lens to form an image, the quality of the image is better if the power is more evenly divided among the two surfaces. This means that the curved side of the lens is placed towards the longer conjugate (which usually is towards the object) and the plane side towards the shorter conjugate. This minimizes the spherical aberration that causes rays from a point object to cross the optical axis at different distances from the lens. This perhaps may be visualized better if we consider the case of a distant object (assume $z_1 = \infty$) and a plano-convex lens with the flat side towards the object. For an object at infinity, the rays incident upon the lens are parallel (“collimated”) both when they are incident to and when they exit the flat surface. In other words, the flat side contributes no power to the imaging, so all of the focusing power comes from the curved surface.

Rule of thumb: when using a plano-convex lens, place the curved side towards the longer conjugate to get a better image.

3. Plano-concave, plane side forward:

$$R_1 = \pm\infty, \qquad R_2 = +|R_2| > 0$$

$$\frac{1}{z_1} + \frac{1}{z_2'} = (n_2 - 1)\left(\frac{1}{\infty} - \frac{1}{+|R_2|}\right) = -\frac{n_2 - 1}{|R_2|} < 0$$

$$f = -\frac{|R_2|}{n_2 - 1} \cong -2|R_2|$$

4. Double convex lens with equal radii:

$$R_1 = |R| > 0, \qquad R_2 = -R_1 = -|R|$$

$$\frac{1}{z_1} + \frac{1}{z_2'} = (n_2 - 1)\left(\frac{1}{|R|} - \left(-\frac{1}{|R|}\right)\right) = 2\,\frac{n_2 - 1}{|R|} > 0$$

$$\frac{1}{f} = \phi = \frac{2 \cdot (n_2 - 1)}{|R|}$$

$$f = \frac{|R|}{2 \cdot (n_2 - 1)} \cong |R| > 0 \quad \text{if } n_2 \cong 1.5$$
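The four examples above can be checked numerically. The following is a minimal sketch, assuming a thin lens in air with index 1.5 and a hypothetical radius magnitude of 100 mm; the function name `thin_lens_focal_length` is illustrative and not part of the notes.

```python
import math


def thin_lens_focal_length(n, r1, r2):
    """Lensmaker's equation for a thin lens in air: 1/f = (n - 1)(1/R1 - 1/R2).

    A plane surface is represented by an infinite radius (math.inf).
    """
    power = (n - 1.0) * (1.0 / r1 - 1.0 / r2)  # 1/inf evaluates to 0.0
    return math.inf if power == 0 else 1.0 / power


R = 100.0  # hypothetical radius magnitude in mm
n = 1.5    # approximate index of glass, as assumed in the text

examples = {
    "1. plano-convex, curved side forward": (R, math.inf),
    "2. plano-convex, plane side forward": (math.inf, -R),
    "3. plano-concave, plane side forward": (math.inf, R),
    "4. double convex, equal radii": (R, -R),
}

for name, (r1, r2) in examples.items():
    print(f"{name}: f = {thin_lens_focal_length(n, r1, r2):.1f} mm")
```

With these assumed values the two plano-convex orientations give the same focal length (about 2|R|), the plano-concave lens gives its negative, and the equiconvex lens gives about |R|, in agreement with the expressions above.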

2.10.2 Spherical Mirror

The mirror changes the direction of rays by reflection that obeys the law of reflection, so that the angle of reflection is the negative of the angle of incidence (measured from the normal to the surface). For a concave spherical mirror, the incident ray angle varies with height above the optical axis. The difference in analysis between the single refractive surface and the mirror may be simplified by recognizing that the mirror “reverses” the direction of propagation of light, which may be modeled by setting $n_2 = -n_1 = -1$:

$$\frac{1}{f} = -\frac{1}{R} - \frac{1}{R} = -\frac{2}{R} \implies f = -\frac{R}{2}$$

In words, the magnitude of the focal length of a spherical mirror is half of the radius of curvature. A concave mirror, whose radius is negative (center of curvature to the left of the vertex), has a positive focal length.

2.11 Image Magnifications

The most common use for a lens is to change the apparent size of an object (or image) via the magnifying properties of the lens. The mapping of object space to image space “distorts” the size and shape of the image, i.e., some regions of the image are larger and some are smaller than the original object. We can define three types of magnification: transverse, longitudinal, and angular, where the first two describe the impact of the imaging system on lengths that are respectively perpendicular to and parallel to the optical axis, while the last refers to the action on the angles of rays measured from the optical axis. Note that the very name of “magnification” is rather misleading because most imaging systems produce images that are smaller than the object; they actually “minify” the features because the magnifications are smaller than unity.

2.11.1 Transverse Magnification:

The transverse magnification $M_T$ is what we usually think of as magnification: it is the ratio of image to object dimension measured transverse to the optical axis. In the figure, note the two similar triangles $\triangle a_1 b_1 c$ and $\triangle a_2 b_2 c$:

The transverse magnification of the image is the ratio of the height of the image to that of the object: $M_T = y_2 / y_1$. It is easy to see that:

$$\frac{y_1}{z_1} = \frac{|y_2|}{z_2} = -\frac{y_2}{z_2} \quad (\text{because } y_2 < 0)$$

$$\implies \frac{y_2}{y_1} \equiv M_T = -\frac{z_2}{z_1}$$

If $|M_T|$ is larger than or smaller than unity, the image is magnified or minified, respectively. If $M_T > 0$, the image is upright or erect, and if $M_T < 0$, the image is inverted (“upside down”).
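As a quick numerical illustration, the sketch below evaluates $M_T = -z_2/z_1$ for a thin lens, assuming the Gaussian imaging equation $1/z_1 + 1/z_2 = 1/f$; the function names and the 100 mm focal length are hypothetical choices for illustration.

```python
def image_distance(z1, f):
    """Image distance z2 from 1/z1 + 1/z2 = 1/f (fails if z1 == f: image at infinity)."""
    return 1.0 / (1.0 / f - 1.0 / z1)


def transverse_magnification(z1, f):
    """M_T = -z2 / z1 for a single thin lens."""
    return -image_distance(z1, f) / z1


f = 100.0  # hypothetical focal length in mm
for z1 in (1000.0, 200.0, 50.0):
    m_t = transverse_magnification(z1, f)
    size = "magnified" if abs(m_t) > 1 else "minified or unit size"
    orientation = "upright" if m_t > 0 else "inverted"
    print(f"z1 = {z1:6.1f} mm: M_T = {m_t:+.3f} ({size}, {orientation})")
```

The object inside the focal length (z1 = 50 mm) produces a negative image distance, i.e., an upright virtual image.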

2.11.2 Longitudinal Magnification:

The longitudinal magnification ML is the ratio of the “length” or “depth” of the image measured along the optical axis to the corresponding length of the object; the longitudinal magnification is the ratio of differential elements of length of the image and object, which approach an infinitesimal in the limit:

$$M_L = \frac{\Delta z_2}{\Delta z_1}, \qquad \lim_{\Delta z_1 \to 0} \frac{\Delta z_2}{\Delta z_1} = \frac{dz_2}{dz_1}$$

The expression may be derived by evaluating the total derivative of the lensmaker’s equation:

$$\frac{1}{z_1} + \frac{1}{z_2} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$

Since the imaging equation relates the reciprocal distances $z_1^{-1}$ and $z_2^{-1}$, the longitudinal magnification varies for different object distances. The total derivative of the left-hand side of the imaging equation is:

$$d\left(\frac{1}{z_1} + \frac{1}{z_2}\right) = d\left(\frac{1}{z_1}\right) + d\left(\frac{1}{z_2}\right) = -\frac{1}{z_1^2}\,dz_1 - \frac{1}{z_2^2}\,dz_2$$

The derivative of the right-hand side is:

$$d\left((n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)\right) = (n - 1) \cdot d\left[\frac{1}{R_1} - \frac{1}{R_2}\right] = 0 \quad (\text{because } n,\ R_1, \text{ and } R_2 \text{ are constants})$$

We combine these to see that:

$$-\frac{1}{z_1^2}\,dz_1 - \frac{1}{z_2^2}\,dz_2 = 0 \implies -\frac{1}{z_1^2}\,dz_1 = \frac{1}{z_2^2}\,dz_2 \implies \frac{dz_2}{dz_1} = -\left(\frac{z_2}{z_1}\right)^2$$

We can now identify the ratio of the two differential lengths along the axis as the longitudinal magnification $M_L$:

$$M_L \equiv \frac{dz_2}{dz_1} = -\left(\frac{z_2}{z_1}\right)^2 = -(M_T)^2 < 0$$

The longitudinal magnification is negative because the image moves away from the lens (increasing $z_2$) as the object moves towards the lens (decreasing $z_1$). The longitudinal magnification affects the irradiance of the image (i.e., the “flux density” of the rays at the image); if $|M_L|$ is large, then the light in the vicinity of an on-axis location is “spread out” over a longer longitudinal dimension at the image, which requires the irradiance of the image to decrease.
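The relation $M_L = -(M_T)^2$ can be checked numerically with a finite-difference derivative. This is a minimal sketch, assuming a thin lens obeying $1/z_1 + 1/z_2 = 1/f$; the function names, the step size, and the sample values (f = 100 mm, z1 = 300 mm) are hypothetical.

```python
def image_distance(z1, f):
    """Image distance z2 from 1/z1 + 1/z2 = 1/f."""
    return 1.0 / (1.0 / f - 1.0 / z1)


def longitudinal_magnification(z1, f, dz=1e-3):
    """Central-difference estimate of dz2/dz1 at object distance z1."""
    return (image_distance(z1 + dz, f) - image_distance(z1 - dz, f)) / (2.0 * dz)


f, z1 = 100.0, 300.0                    # hypothetical values in mm
m_t = -image_distance(z1, f) / z1       # transverse magnification
m_l = longitudinal_magnification(z1, f)

print(f"M_T = {m_t:.4f}, -(M_T)^2 = {-m_t**2:.4f}, numerical M_L = {m_l:.4f}")
```

For these values the image is at 150 mm, so the transverse magnification is -1/2 and both expressions for the longitudinal magnification evaluate to about -1/4.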

The scaling of the 3-D “image” along the three axes. The scaling along the “transverse” axes x and y defines the transverse magnification, while the scaling of the image along the z-axis is determined by the longitudinal magnification.

The effect of longitudinal magnification on the irradiance of the image of a uniformly luminous rod of length ab. The section at $z_1 = 2f$ is imaged with unit negative transverse magnification at $z_2 = 2f$. Sections of the rod with $z_1 > 2f$ are imaged at $z_2 < 2f$, and the energy density is remapped to account for the nonlinear distance relationship $\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}$.

2.11.3 Angular Magnification

This is the ratio of the angles of the outgoing ray and the corresponding incoming ray measured relative to the optical axis. Angular magnification is particularly relevant for systems that do not form images, e.g., afocal telescopes. We shall shortly utilize this concept when considering the single-lens magnifier.

$$M_\theta = \frac{\theta_{out}}{\theta_{in}}$$

If $|M_\theta| > 1$, then the angle of the emerging ray is larger than that of the corresponding entering ray. This will increase the angular separation between rays generated by two objects so that it will be easier for the eye to resolve them. The angular magnification is sometimes called the magnifying power of the lens.

2.12 Single Thin Lenses

2.12.1 Positive Lens

The power of a single lens with two surfaces is determined by the lensmaker’s equation:

$$\phi = \frac{1}{f} = \phi_1 + \phi_2 = (n_2 - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$

The power is positive if $R_1 > 0$ and $R_2 < 0$, which means that the ray encounters positive power at both surfaces. The action of a single thin positive lens with known focal length on an object with known location may be solved graphically by sketching three specific rays from the tip of the object:

1. the ray parallel to the optical axis; this ray is refracted by the lens to pass through the image- space focal point F,

2. the ray through the center of the lens, which is not refracted by the thin lens and so maintains the same angle relative to the optical axis, and

3. the ray through the object-space focal point F′ to the lens; this ray is refracted and travels parallel to the optical axis.

The intersection of these three rays (or obviously of any two) is the location of the image of the tip of the object:

The example in the figure closely matches the situation where the image is an inverted replica of the object, so that $h' = -h$ and $M_T = -1$. The two equations that must be satisfied are

$$z_2 = z_1 \implies M_T = -1$$

$$\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f} \implies z_1 = z_2 = 2 \cdot f$$

This situation where the object and image distances are twice the focal length is often called imaging at equal conjugates. This drawing assumes that the indices of refraction in object and image space are identical. If the indices are different (e.g., if the object is in water and the image in air), then the imaging equation must be modified (here $n$ is the index of the lens and $n_1$, $n_2$ are the indices of the object and image spaces):

$$\phi = \frac{n - n_1}{R_1} - \frac{n - n_2}{R_2} = \frac{n_1}{z_1} + \frac{n_2}{z_2}$$

If the refractive indices in object and image spaces are larger than that of the lens, such as a case where the object and image are in glass or water and the lens is “made of” air, the curvatures must be reversed, so that R1 < 0 and R2 > 0 to make a positive lens.

Lens made of rare medium (e.g., air) within a dense medium (e.g., glass, water). The reversal of refractive indices requires inverting of the signs of the radii of curvature.

2.12.2 Negative Lens

A lens with negative power at both surfaces may be constructed if R1 is negative and R2 is positive. Two (or more) rays that have passed through a lens with negative power will exhibit a larger divergence on the output side than on the input side.

2.12.3 Meniscus Lenses

A lens with radii of curvature with the same sign on both surfaces is a meniscus lens. If both radii are positive, then the powers of the two surfaces are:

$$\phi = \frac{n - 1}{|R_1|} + \frac{1 - n}{|R_2|} = (n - 1) \cdot \left(\frac{1}{|R_1|} - \frac{1}{|R_2|}\right)$$

which may be positive or negative depending on the relative sizes of $R_1$ and $R_2$; the power is positive if $R_2 > R_1$ and negative if $R_2 < R_1$.

Meniscus lens with positive power; the radii of curvature of both surfaces are positive since the vertices are to the left of the centers, but the fact that $R_2 > R_1$ ensures that $\phi > 0$.

Examples of meniscus lenses with positive and negative power are also shown:

Meniscus lenses with positive and negative powers from the Newport optics catalog. The red lines represent rays that show the respective converging and diverging actions of the lenses.

2.12.4 Simple Microscope (magnifier, “magnifying glass,” “loupe”)

This is arguably the simplest imaging system, but some of the concepts it illustrates are sufficiently sophisticated that many optickers and/or imaging scientists may not understand them entirely. The simple microscope is a single lens with positive focal length that is used to form a larger image on the retina than could be formed with the eye alone. It also may be called a magnifying glass if handheld or a loupe if designed to rest on the object. You may know already that the eye lens is deformed by ciliary muscles that are relaxed when the lens is “flatter,” i.e., the radii of curvature of the surfaces are larger so the focal length is longer. To view an object “close up,” the focal length of the eye lens must be shortened by making the lens shape more spherical. This is accomplished by tightening the ciliary muscles (which is the reason why your eyes get tired after an extended time of viewing objects up close).

The closest distance to an object that appears to be sharply focused by the unaided eye is the near point, which depends on the flexibility of the deformable eye lens and the capability of the ciliary muscles; both vary among individuals, and with age for a single individual. The distance to the near point may be as close as 50 mm ≅ 2 in for a young child and in the range from 1000 mm to 2000 mm for an elderly person. This reduction in “accommodation” for close objects is one of the signs of aging. The near point of an “ideal” eye is assumed to be 250 mm ≅ 10 in from the front surface. For nearsighted individuals, the near point is closer to the eye, thus increasing the angular subtense of fine details for those individuals. For this reason, nearsighted individuals in ancient times (before optical correction) often were attracted to professions requiring fine work, such as goldsmithing. Since nearsightedness can be a genetic trait, descendants often continued in these crafts.

The reference for angular magnification is the angle subtended by the object if viewed at the near point of the average eye so that $z_1 = 250$ mm. If the object height is $y$, the angle when viewed at the near point is:

$$\theta_{250\text{ mm}} = \tan^{-1}\left[\frac{y}{250\text{ mm}}\right] \cong \frac{y}{250\text{ mm}}$$

where the first-order approximation $\tan[\theta] \cong \theta$ if $\theta \cong 0$ is used in the last step.

Magnifier with Object at Focal Point of Positive Lens

If the object is positioned at the object-space (front) focal point of a positive lens with focal length $f_{lens}$, then the rays from the “tip” of the object are parallel when they exit the lens and so may be viewed “in focus” by an eye with a relaxed lens, as for an object at an infinite distance away. The angle subtended by the object one focal length away is:

$$\theta_{lens} = \tan^{-1}\left[\frac{y}{f_{lens}}\right] \cong \frac{y}{f_{lens}}$$

Magnifier with object at focal point of lens. Figure (a) at top shows the angle θ250 mm subtended by the object when located at the near point; (b) shows the angle θlens subtended by the object when located at the object-space focal point of the lens. The blue ray in (b) emerges parallel to the optic axis, which shows that the object distance z1 = f.

The angular magnification or magnifying power of the magnifier is the ratio of the angle subtended by the object when viewed at the closer distance through the lens to the angular subtense viewed at the near point:

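$$M_\theta = \frac{\theta_{lens}}{\theta_{250\text{ mm}}} = \frac{\tan^{-1}\left[\dfrac{y}{f_{lens}}\right]}{\tan^{-1}\left[\dfrac{y}{250\text{ mm}}\right]} \cong \frac{\left(\dfrac{y}{f_{lens}}\right)}{\left(\dfrac{y}{250\text{ mm}}\right)}$$

$$M_\theta = \frac{250\text{ mm}}{f_{lens}}, \quad \text{object at focal point}$$

If the focal length of the magnifying lens is, say, $f = 50$ mm, then the magnifying power of the lens for the object at the focal point is:

$$M_\theta = \frac{250\text{ mm}}{50\text{ mm}} = 5$$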

Magnifier with Image Formed at Near Point

We can instead use the magnifying lens held close to the eye to form a virtual image at the near point of the eye. This means that the distance from the lens to the virtual image formed by the lens is the distance to the near point: $\overline{V'O'} = z_2 = -250$ mm. Substitute this distance into the imaging equation:

$$\frac{1}{z_1} + \frac{1}{-250\text{ mm}} = \frac{1}{f} \implies \frac{1}{z_1} = \frac{1}{f} + \frac{1}{250\text{ mm}} \implies z_1 = \frac{250\text{ mm} \cdot f}{250\text{ mm} + f}$$

The angle subtended by the object at the near point is the same as before:

$$\theta_{250\text{ mm}} = \tan^{-1}\left[\frac{y_1}{250\text{ mm}}\right] \cong \frac{y_1}{250\text{ mm}}$$

but the angle subtended by the image when positioned at the near point viewed through the lens is different:

$$\theta_{lens} = \tan^{-1}\left[\frac{y_2}{|-250\text{ mm}|}\right] \cong \frac{y_2}{250\text{ mm}} = \frac{y_1}{z_1}$$

where the similarity of the triangles has been used. This expression may be recast by substituting the expression for $z_1$:

$$\theta_{lens} \cong \frac{y_1}{z_1} = \frac{y_1}{\left(\dfrac{250\text{ mm} \cdot f}{250\text{ mm} + f}\right)} = \frac{y_1}{f} \cdot \left(\frac{250\text{ mm} + f}{250\text{ mm}}\right)$$

The magnifying power is:

$$M_\theta = \frac{\theta_{lens}}{\theta_{250\text{ mm}}} = \frac{\dfrac{y_1}{f} \cdot \left(\dfrac{250\text{ mm} + f}{250\text{ mm}}\right)}{\left(\dfrac{y_1}{250\text{ mm}}\right)} = \frac{250\text{ mm} + f}{f}$$

$$M_\theta = \frac{250\text{ mm}}{f_{lens}} + 1, \quad \text{image at near point}$$
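The two magnifier configurations can be compared numerically. This is a minimal sketch using the small-angle results derived above and the assumed standard near point of 250 mm; the function names are hypothetical.

```python
NEAR_POINT_MM = 250.0  # near point of the "ideal" eye assumed in the text


def power_object_at_focus(f_mm):
    """Magnifying power with the object at the focal point (relaxed eye)."""
    return NEAR_POINT_MM / f_mm


def power_image_at_near_point(f_mm):
    """Magnifying power with the virtual image formed at the near point."""
    return NEAR_POINT_MM / f_mm + 1.0


def object_distance_for_near_point_image(f_mm):
    """Object distance z1 that places the virtual image at z2 = -250 mm."""
    return NEAR_POINT_MM * f_mm / (NEAR_POINT_MM + f_mm)


f = 50.0  # focal length used in the worked example, in mm
print(f"object at focal point: M = {power_object_at_focus(f):.1f}")
print(f"image at near point:   M = {power_image_at_near_point(f):.1f}")
print(f"object distance for near-point image: z1 = "
      f"{object_distance_for_near_point_image(f):.2f} mm")
```

For the 50 mm lens, the magnifying power rises from 5 to 6 when the image is placed at the near point, at the cost of holding the object slightly inside the focal length and accommodating the eye.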

Magnifier with image at near point of eye. The top figure again shows the angle $\theta_{250\text{ mm}}$ subtended by the object when located at the near point. The second figure shows the image at the near point, which is more distant than the object.

2.13 Systems of Thin Lenses

The images produced by systems of thin lenses may be located by finding the “intermediate” image produced by the first lens, which then becomes in turn the object for the second lens, which generates an image that is the object for the third lens, etc. This type of analysis also may be applied directly to the more realistic case of “thick” lenses, where the first “lens” actually represents the first surface of the thick lens and the light propagates through the glass between the surfaces. Though straightforward, this “sequential” solution for the image may be tedious and also not very illuminating (pun intended) about the action of the system of lenses. The object distance for the $n^{\text{th}}$ lens will be denoted by $z_n$ and the corresponding image distance by the primed quantity $z_n'$.

2.13.1 Two-Lens System

Consider a two-lens system with first lens L1 and second lens L2 separated by the distance $t$. The object for the system shown in the figure is labelled by O and the corresponding image by O′, the object- and image-space focal points are F and F′, and the object- and image-space vertices (first and last surfaces of the system) are V and V′.

Imaging by a system of two thin lenses L1 and L2 separated by the distance $t$. The object and image distances for the first lens are $z_1$ and $z_1'$ and for the second lens are $z_2$ and $z_2'$.

From the diagram, we see that $z_1'$ (the image distance from the first lens), $z_2$ (the object distance for the second lens), and the lens separation $t$ are related by:

$$z_1' + z_2 = t$$

so the object distance for the second lens is $z_2 = t - z_1'$. The imaging equation for the first lens determines $z_1'$:

$$\frac{1}{z_1} + \frac{1}{z_1'} = \frac{1}{f_1} \implies \frac{1}{z_1'} = \frac{1}{f_1} - \frac{1}{z_1} = \frac{z_1 - f_1}{z_1 f_1} \implies z_1' = \frac{z_1 f_1}{z_1 - f_1}$$

If $z_1 = \infty$, then

$$z_1' = \lim_{z_1 \to \infty} \frac{z_1 f_1}{z_1 - f_1} = f_1 \cdot \lim_{z_1 \to \infty} \left(\frac{z_1}{z_1 - f_1}\right) = f_1 \cdot 1 = f_1$$

In words, the image distance from the first lens for an object at $\infty$ is the focal length of the first lens, as it should be. The object distance to the second lens is $z_2 = t - z_1'$, which may be rewritten in terms of $z_1$, $f_1$, and $t$ for the general case:

$$z_2 = t - z_1' = t - \frac{z_1 f_1}{z_1 - f_1} = \frac{z_1 t - f_1 t - z_1 f_1}{z_1 - f_1} = \frac{z_1 (t - f_1) - f_1 t}{z_1 - f_1}$$

In the limit of infinite object distance, the object distance to the second lens is:

$$z_2\,[\text{for } z_1 = \infty] = \lim_{z_1 \to \infty} \left(\frac{z_1}{z_1 - f_1} \cdot (t - f_1) - \frac{f_1 t}{z_1 - f_1}\right) = 1 \cdot (t - f_1) - 0 = t - f_1$$

which is the difference between the separation of the lenses and the distance to the image-space focal point of the first lens; this often is a negative distance (i.e., a virtual object for the second lens).

In the general case, apply the imaging equation for the second lens and substitute the expression for $z_2$:

$$\frac{1}{z_2'} = \frac{1}{f_2} - \frac{1}{z_2} = \frac{1}{f_2} - \frac{z_1 - f_1}{z_1 (t - f_1) - f_1 t}$$

$$\frac{1}{z_2'} = \frac{t - \dfrac{z_1}{z_1 - f_1}\,(f_1 + f_2) + \dfrac{f_1 f_2}{z_1 - f_1}}{f_2 \cdot t - f_1 \cdot f_2 \cdot \dfrac{z_1}{z_1 - f_1}}$$

$$\implies z_2' = \frac{f_2 \cdot t - f_1 \cdot f_2 \cdot \dfrac{z_1}{z_1 - f_1}}{t - \dfrac{z_1}{z_1 - f_1}\,(f_1 + f_2) + \dfrac{f_1 f_2}{z_1 - f_1}} = \frac{f_2 \cdot \left(t - f_1 \cdot \dfrac{z_1}{z_1 - f_1}\right)}{t - \dfrac{z_1 \cdot (f_1 + f_2) - f_1 f_2}{z_1 - f_1}}$$

The image distance for a specified (non-infinite) object location is called the back focal distance by some authors:

$$BFD = z_2' = \overline{V'O'} = \frac{f_2 \cdot \left(t - f_1 \cdot \dfrac{z_1}{z_1 - f_1}\right)}{t - \dfrac{z_1 \cdot (f_1 + f_2) - f_1 f_2}{z_1 - f_1}}$$

In the limit of infinite object distance, the BFD becomes the back focal length BFL:

$$\lim_{z_1 \to \infty} [z_2'] = z_2'\,[f_1, f_2, t;\ z_1 = \infty] \equiv \overline{V'F'}$$

$$= \lim_{z_1 \to \infty} \left(\frac{f_2 \cdot \left(t - f_1 \cdot \dfrac{z_1}{z_1 - f_1}\right)}{t - \dfrac{z_1 \cdot (f_1 + f_2) - f_1 f_2}{z_1 - f_1}}\right) = \frac{f_2 \cdot (t - f_1 \cdot 1)}{t - 1 \cdot (f_1 + f_2) + 0 \cdot f_1 f_2} = \frac{t \cdot f_2 - f_1 f_2}{t - (f_1 + f_2)} = \frac{f_1 \cdot f_2 - f_2 \cdot t}{(f_1 + f_2) - t}$$

$$BFL = \overline{V'F'} = \frac{f_1 \cdot f_2 - f_2 \cdot t}{(f_1 + f_2) - t} = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t}$$

These complicated expressions for the image distances measured from the second lens, in terms of the two focal lengths $f_1$ and $f_2$, the separation $t$, and the distance $z_1$ from the object to the first lens, are useful, but they tell us little on their face about the entire “lens system.” We would much prefer establishing relationships from the object to the lens system and from the system to the image. The first step in this analysis is to define an equivalent or effective focal length for the entire system, which is the focal length of the equivalent single thin lens.

2.13.2 Effective (Equivalent) Focal Length

We can use the results just derived to find an expression for the imaging action of a two-lens system by finding the location and focal length of the equivalent single lens that would generate the same image. This is an important concept, so we will do a rigorous derivation, which is perhaps simplified by adding some details to the figure:

Ray diagram of system of two positive thin lenses to illustrate the concept of “effective” (or “equivalent”) focal length $f_{eff}$, back focal length $BFL = z_2' = \overline{V'F'}$, and principal point H′.

The continuations of the incoming and outgoing rays intersect at B, whose projection onto the optical axis is at H′; this is the location of the equivalent single lens that would generate the same outgoing ray from the incoming ray. The distance from H′, the image-space principal point, to F′ is the image-space effective (or equivalent) focal length:

$$\overline{H'F'} \equiv f_{eff}$$

We have already evaluated the back focal length, which is the image location for an object at infinity:

$$\overline{V'F'} = z_2'\,[z_1 = \infty] = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t}$$

Compare two sets of similar triangles, $\triangle(AVF_1') \sim \triangle(CV'F_1')$ and $\triangle(BH'F') \sim \triangle(CV'F')$, shown in the figures:

From the first pair of triangles $\triangle(AVF_1') \sim \triangle(CV'F_1')$, we can construct ratios of their “heights” and “axial lengths”:

$$\frac{h_1}{\overline{VF_1'}} = \frac{h_2}{\overline{V'F_1'}} \implies \frac{h_2}{h_1} = \frac{\overline{V'F_1'}}{\overline{VF_1'}}$$

Now note that the distance $\overline{VF_1'} = f_1$, while $\overline{V'F_1'}$ may be rewritten:

$$\overline{V'F_1'} = \overline{VF_1'} - \overline{VV'} = f_1 - t$$

so the ratio may be rewritten:

$$\frac{h_2}{h_1} = \frac{f_1 - t}{f_1}$$

From the second pair of similar triangles $\triangle(BH'F') \sim \triangle(CV'F')$, we can define the distances $\overline{H'F'} \equiv f_{eff}$ and $\overline{V'F'} = BFL = z_2'\,[z_1 = \infty]$, so we now have a second expression for the ratio:

$$\frac{h_2}{h_1} = \frac{\overline{V'F'}}{\overline{H'F'}} = \frac{BFL}{f_{eff}}$$

Equate the two expressions for the ratio $h_2 / h_1$:

$$\frac{f_1 - t}{f_1} = \frac{BFL}{f_{eff}} \implies \frac{1}{f_{eff}} = \frac{1}{BFL} \cdot \frac{f_1 - t}{f_1}$$

Now substitute the formula for the back focal length BFL, which is $z_2'$ if $z_1 = \infty$:

$$z_2' = \frac{f_2 \cdot (t - f_1)}{t - (f_1 + f_2)} \implies \frac{1}{z_2'} = \frac{(f_1 + f_2) - t}{(f_1 - t) \cdot f_2}$$

$$\frac{1}{f_{eff}} = \frac{1}{BFL} \cdot \frac{f_1 - t}{f_1} = \frac{(f_1 + f_2) - t}{(f_1 - t) \cdot f_2} \cdot \frac{f_1 - t}{f_1}$$

which may be rearranged to obtain a relationship for the reciprocal of the effective focal length in terms of the reciprocals of the individual focal lengths:

$$\frac{1}{f_{eff}} = \frac{(f_1 + f_2) - t}{(f_1 - t) \cdot f_2} \cdot \frac{f_1 - t}{f_1} = \frac{(f_1 + f_2) - t}{f_1 \cdot f_2} = \frac{1}{f_2} + \frac{1}{f_1} - \frac{t}{f_1 f_2}$$

$$\frac{1}{f_{eff}} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 f_2} \implies f_{eff} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t}$$

These two equivalent expressions specify what is certainly the most important equation we have derived to date and arguably the most important to be derived in this class. It determines the effect on the image of separating two thin lenses by some distance $t$. This expression may also be written in terms of the powers of the two lenses, where the power of the $n^{\text{th}}$ lens is the reciprocal of the focal length, $\phi_n \equiv f_n^{-1}$:

$$\phi_{eff} = \phi_1 + \phi_2 - \phi_1 \cdot \phi_2 \cdot t$$

Note that if

$$t = f_1 + f_2 = \frac{1}{\phi_1} + \frac{1}{\phi_2} = \frac{\phi_1 + \phi_2}{\phi_1 \phi_2}$$

then $f_{eff} = \infty \implies BFL = +\infty$ and $\phi_{eff} = 0$; the object and image are both an infinite distance from the system. The focal points are located at $\pm\infty$ and the system is called afocal. Such a system has infinite focal length and no power, which means that the image of an object at infinity is also at infinity. Since $z_1 = z_2' = \infty$, the ratio $-z_2'/z_1$ that defines the transverse magnification is indeterminate and is not a useful measure here. However, such a system exhibits a useful angular magnification, as we shall see.
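The power form of the result is convenient for quick calculations. The sketch below evaluates $\phi_{eff} = \phi_1 + \phi_2 - \phi_1 \phi_2 t$ with distances in meters and powers in diopters; the function name and the sample lenses (f1 = 100 mm, f2 = 25 mm) are illustrative assumptions.

```python
import math


def effective_power(phi1, phi2, t):
    """Effective power (diopters) of two thin lenses in air separated by t (meters)."""
    return phi1 + phi2 - phi1 * phi2 * t


phi1 = 1.0 / 0.100  # f1 = 100 mm -> 10 diopters
phi2 = 1.0 / 0.025  # f2 = 25 mm  -> 40 diopters

for t in (0.0, 0.075, 0.125):
    phi = effective_power(phi1, phi2, t)
    if math.isclose(phi, 0.0, abs_tol=1e-9):
        print(f"t = {t*1000:5.1f} mm: power = 0 (afocal, t = f1 + f2)")
    else:
        print(f"t = {t*1000:5.1f} mm: power = {phi:.1f} D, f_eff = {1000.0/phi:.1f} mm")
```

With these values the lenses in contact have 50 diopters (f_eff = 20 mm), a separation of 75 mm gives 20 diopters (f_eff = 50 mm), and a separation equal to the sum of the focal lengths gives zero power.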

Back Focal Length and Image-Space Principal Point

We have evaluated the back focal length:

$$BFL = \overline{V'F'} = \frac{f_1 \cdot f_2 - f_2 \cdot t}{(f_1 + f_2) - t}$$

and the system focal length:

$$f_{eff} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t}$$

We now define the image-space principal point H′ to be the point that is located one effective focal length from the image-space focal point, i.e., so that $\overline{H'F'} = f_{eff}$:

$$\overline{H'F'} \equiv f_{eff} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t}$$

We can think of H′ as the location of the single equivalent thin lens that generates the same outgoing ray that emerges from the two-lens system. For a single thin lens, H′ coincides with the image-space vertex V′, which in turn coincides with the object-space vertex V since the thin lens has thickness $t = 0$. From the equation for the BFL and the definition of the principal point, we can also specify the distance from the principal point to the vertex:

$$f_{eff} \equiv \overline{H'F'} = \overline{H'V'} + \overline{V'F'} = \overline{H'V'} + BFL$$

$$\implies \overline{H'V'} = f_{eff} - BFL = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} - \frac{f_1 \cdot f_2 - f_2 \cdot t}{(f_1 + f_2) - t}$$

$$\overline{H'V'} = \frac{f_2 \cdot t}{(f_1 + f_2) - t}$$

We can (and will) derive corresponding results in the object space, i.e., object-space principal and focal points.

A pair of positive thin lenses showing the image-space principal and focal points H′ and F′, respectively.

Compare Back Focal “Length” and Back Focal “Distance”

As the object distance decreases from $\infty$, the distance from the rear vertex to the image typically increases, so that the BFD for a finite object distance typically is larger than the BFL for an infinite object distance. This can be seen by comparing the two expressions for some specimen focal lengths. For $f_1 = 100$ mm, $f_2 = 25$ mm, and $t = 75$ mm, the focal length of the equivalent single lens is:

$$f_{eff} = \left(\frac{1}{100\text{ mm}} + \frac{1}{25\text{ mm}} - \frac{75\text{ mm}}{100\text{ mm} \cdot 25\text{ mm}}\right)^{-1} = +50\text{ mm}$$

The back focal length (distance from rear vertex to focal point) is:

$$BFL = z_2'\,[z_1 = \infty] = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t} = \frac{25\text{ mm} \cdot (75\text{ mm} - 100\text{ mm})}{75\text{ mm} - (100\text{ mm} + 25\text{ mm})} = 12.5\text{ mm}$$

If the object distance is decreased from $z_1 = \infty$ to $z_1 = 1000$ mm, the back focal distance is:

$$BFD = \frac{f_2 \cdot \left(t - f_1 \cdot \dfrac{z_1}{z_1 - f_1}\right)}{(t - f_2) - f_1 \cdot \dfrac{z_1}{z_1 - f_1}} = \frac{f_2 \cdot t - f_1 \cdot f_2 \cdot \dfrac{z_1}{z_1 - f_1}}{t - \dfrac{z_1}{z_1 - f_1}\,(f_1 + f_2) + \dfrac{f_1 f_2}{z_1 - f_1}}$$

$$BFD\,[z_1 = 1\text{ m}] = \frac{25\text{ mm} \cdot \left(75\text{ mm} - 100\text{ mm} \cdot \dfrac{1000\text{ mm}}{1000\text{ mm} - 100\text{ mm}}\right)}{(75\text{ mm} - 25\text{ mm}) - 100\text{ mm} \cdot \dfrac{1000\text{ mm}}{1000\text{ mm} - 100\text{ mm}}} \approx 14.773\text{ mm} > BFL$$

In words, as the object distance decreases from infinity, the image distance moves “back” away from the focal point.
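The numbers in this comparison can be reproduced by tracing the image through the two lenses in sequence. This is a minimal sketch, assuming thin lenses in air and the sign conventions of the text; the function names are hypothetical.

```python
import math


def thin_lens_image(z, f):
    """Image distance z' from 1/z + 1/z' = 1/f; z may be math.inf."""
    inverse = 1.0 / f - 1.0 / z  # 1/inf evaluates to 0.0
    return math.inf if inverse == 0 else 1.0 / inverse


def two_lens_image_distance(z1, f1, f2, t):
    """Image distance from the second lens (BFD) for object distance z1."""
    z1_prime = thin_lens_image(z1, f1)  # intermediate image from lens 1
    z2 = t - z1_prime                   # object distance for lens 2
    return thin_lens_image(z2, f2)


def back_focal_length(f1, f2, t):
    """Closed-form BFL = (f1 - t) f2 / ((f1 + f2) - t)."""
    return (f1 - t) * f2 / ((f1 + f2) - t)


f1, f2, t = 100.0, 25.0, 75.0  # values from the text, in mm
bfl = two_lens_image_distance(math.inf, f1, f2, t)
bfd = two_lens_image_distance(1000.0, f1, f2, t)
print(f"BFL (traced)      = {bfl:.3f} mm")
print(f"BFL (closed form) = {back_focal_length(f1, f2, t):.3f} mm")
print(f"BFD (z1 = 1 m)    = {bfd:.3f} mm")
```

The sequential trace and the closed-form expression agree on the back focal length, and the back focal distance for the finite object is the larger of the two.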

Front Focal Length

The front focal length (FFL) $\overline{FV}$ is the distance $z_1$ in the case where $z_2' = \infty$. It is calculated by setting the denominator of the expression for $z_2'$ to zero:

$$(t - f_2) - \frac{z_1 f_1}{z_1 - f_1} = 0$$

$$\implies \frac{z_1 f_1}{z_1 - f_1} = t - f_2$$

$$\implies \frac{z_1}{z_1 - f_1} = \frac{t - f_2}{f_1}$$

$$\implies z_1 f_1 = (t - f_2)(z_1 - f_1)$$

$$\implies z_1 f_1 = t z_1 - t f_1 - z_1 f_2 + f_1 f_2$$

$$\implies z_1 (f_1 + f_2 - t) = f_1 f_2 - t f_1$$

$$\lim_{z_2' \to \infty} z_1 = \overline{FV} = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} = FFL$$

Note that this expression has the same form as the back focal length except that $f_1$ and $f_2$ are “swapped”.

Front Focal Distance

Also note that the front focal distance (FFD) is the axial distance from an object to the first surface (front vertex) of the imaging system, and applies for finite object distances. This is synonymous with the working distance, a concept often used in microscopy.

$$FFD = \overline{OV} = \frac{f_1 \cdot \left(t - f_2 \cdot \dfrac{z_2}{z_2 - f_2}\right)}{t - \dfrac{z_2 \cdot (f_1 + f_2) - f_1 f_2}{z_2 - f_2}}$$

Object-Space Principal Point

We have already shown how to find the location of the equivalent single lens on the “output side” by extending the rays entering and exiting the system until they meet. We can locate the equivalent single lens in “object space” by “reversing” the system and introducing rays from the left again. Since we know the distance from the object-space focal point to the object-space vertex and the effective focal length, we can find the distance from the vertex to the principal point in object space.

$$\overline{FH} = f_{eff} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} = \overline{FV} + \overline{VH} = FFL + \overline{VH} = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} + \overline{VH}$$

This implies that the distance from the object-space vertex to the object-space principal point is:

$$\overline{VH} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} - \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t}$$

$$\overline{VH} = \frac{f_1 \cdot t}{(f_1 + f_2) - t}$$

2.13.3 Summary of Distances for Two-Lens System

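$$f_{eff} = \overline{H'F'} = \overline{FH} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t}$$

$$BFL = \overline{V'F'} = \frac{f_2 \cdot (f_1 - t)}{(f_1 + f_2) - t}$$

$$\overline{H'V'} = \overline{H'F'} - \overline{V'F'} = \frac{f_2 \cdot t}{(f_1 + f_2) - t}$$

$$FFL = \overline{FV} = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t}$$

$$\overline{VH} = \overline{FH} - \overline{FV} = \frac{f_1 \cdot t}{(f_1 + f_2) - t}$$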
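The five distances in this summary share a common denominator, so they are easy to compute together. The following is a minimal sketch; the function name and dictionary keys are hypothetical, and the sample values are the lenses used in the earlier BFL/BFD comparison (f1 = 100 mm, f2 = 25 mm, t = 75 mm).

```python
def two_lens_distances(f1, f2, t):
    """Summary distances for two thin lenses in air separated by t.

    Returns f_eff, BFL (V'F'), H'V', FFL (FV), and VH in the units of the inputs.
    """
    denom = (f1 + f2) - t
    if denom == 0:
        raise ValueError("afocal system (t = f1 + f2): focal points are at infinity")
    return {
        "f_eff": f1 * f2 / denom,
        "BFL": f2 * (f1 - t) / denom,
        "HpVp": f2 * t / denom,   # H'V' = H'F' - V'F'
        "FFL": f1 * (f2 - t) / denom,
        "VH": f1 * t / denom,     # VH = FH - FV
    }


d = two_lens_distances(100.0, 25.0, 75.0)
for name, value in d.items():
    print(f"{name:>5s} = {value:8.2f} mm")

# Consistency checks implied by the summary: H'V' + BFL = f_eff and VH + FFL = f_eff
print(d["HpVp"] + d["BFL"], d["VH"] + d["FFL"])
```

A negative front focal length, as in this example, indicates that the object-space focal point lies to the right of the front vertex.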

2.13.4 “Effective Power” of Two-Lens System

The expression for the power of the system composed of two lenses in air with focal lengths $f_1$ and $f_2$ is:

$$\phi_{eff}\,[\text{Diopters}] \equiv \frac{1}{f_{eff}\,[\text{m}]} = \frac{1}{f_1\,[\text{m}]} + \frac{1}{f_2\,[\text{m}]} - \frac{t}{f_1 f_2}$$

$$\phi_{eff}\,[\text{Diopters}] = \phi_1 + \phi_2 - \phi_1 \phi_2\, t$$

Clearly the power is zero if the separation distance $t$ is equal to the sum of focal lengths; this is the recipe for a telescope. If the two lenses have positive power and the separation is just less than the sum of focal lengths, the effective focal length can be very large. This is also the case if one of the two lenses has negative power (so that the numerator $f_1 f_2$ of $f_{eff}$ is negative) and the separation is just larger than the sum of the focal lengths (so that the denominator $(f_1 + f_2) - t$ is negative and approximately zero).

2.13.5 Lenses in Contact: t = 0

If the lenses are in contact, then $t = 0$ and the front and back focal lengths are equal to the focal length of the “equivalent single thin lens”:

$$FFL = BFL = \frac{f_1 f_2}{f_1 + f_2} = f_{eff}, \quad \text{if } t = 0$$

$$\implies \frac{1}{f_{eff}} = \frac{1}{f_1} + \frac{1}{f_2}, \quad \text{if } t = 0$$

Two “thin” positive lenses in contact. The focal length of the system is shorter than the focal length of either lens, and may be evaluated to see that $f_{eff} = \frac{f_1 f_2}{f_1 + f_2}$. The image-space principal point is the location of the “equivalent thin lens”. Since both lenses are “thin”, the principal point coincides with the locations of both lenses, so that V′ = H′ = H = V.

The power of the system composed of two thin lenses in contact is the sum of the powers:

$$\phi_{eff}\,[\text{Diopters}] = \phi_1 + \phi_2 - \phi_1 \phi_2 \cdot 0 = \phi_1 + \phi_2 \quad \text{for two thin lenses in contact}$$

This is the assumed system for the magnifier with the lens held “close to the eye.”

2.13.6 Positive Lenses Separated by t < f1 + f2

If two positive thin lenses are separated by less than the sum of the focal lengths, the image-space focal point F′ is closer to the first lens than it would have been had the second lens been absent. As shown, the effective focal length of the system is $f_{eff} < f_1$ (which holds whenever $t < f_1$). We can apply the equation for $f_{eff}$ to this case to see that:

$$f_{eff} = \frac{f_1 f_2}{(f_1 + f_2) - t} > 0 \quad \text{if } f_1 + f_2 > t > 0$$

A pair of positive thin lenses separated by less than the sum of the focal lengths.

Consider a specific example with $f_1 = 100$ mm, $f_2 = 50$ mm, and $t = 75$ mm. The focal length of the equivalent single lens is:

$$f_{eff} = \frac{f_1 f_2}{(f_1 + f_2) - t} = \frac{(100\text{ mm})(50\text{ mm})}{(100\text{ mm} + 50\text{ mm}) - 75\text{ mm}} = \frac{200}{3}\text{ mm} = 66\tfrac{2}{3}\text{ mm}$$

The image formed by the first lens is located at its focal point:

$$z_1' = \left(\frac{1}{f_1} - \frac{1}{z_1}\right)^{-1} = \left(\frac{1}{100\text{ mm}} - \frac{1}{\infty}\right)^{-1} = 100\text{ mm}$$

The object distance to the second lens is therefore the difference $t - z_1'$:

$$z_2 = t - z_1' = 75\text{ mm} - 100\text{ mm} = -25\text{ mm}$$

The image of an object located at $z_1 = \infty$ appears at $z_2'$:

$$z_2' = \left(\frac{1}{f_2} - \frac{1}{z_2}\right)^{-1} = \left(\frac{1}{50\text{ mm}} - \frac{1}{-25\text{ mm}}\right)^{-1} = \frac{50}{3}\text{ mm} = 16\tfrac{2}{3}\text{ mm}$$

$$\overline{V'F'} = 16\tfrac{2}{3}\text{ mm}$$

measured from the rear vertex V′ of the system. We already know that the system focal length is $66\tfrac{2}{3}$ mm, so the image-space principal point H′ (the position of the equivalent thin lens) is located $66\tfrac{2}{3}$ mm IN FRONT of the system focal point, i.e., 50 mm in front of the second lens and 25 mm behind the first lens.

$$\overline{H'F'} = f_{eff} = 66\tfrac{2}{3}\text{ mm}$$

$$\overline{V'F'} = BFL = 16\tfrac{2}{3}\text{ mm}$$

$$\overline{H'V'} = \overline{H'F'} - \overline{V'F'} = 66\tfrac{2}{3}\text{ mm} - 16\tfrac{2}{3}\text{ mm} = 50\text{ mm}$$

We have already shown how to find the location of the equivalent single lens on the “output side” by extending the rays entering and exiting the system until they meet. We can locate the equivalent single lens in “object space” by “reversing” the system, as shown in the figure. The “first” lens in the system is now (what we have called the second lens) L2 with $f_2 = 50$ mm. The “second” lens is L1 with $f_1 = 100$ mm and the separation is $t = 75$ mm. The resulting effective focal length remains unchanged at $f_{eff} = \frac{200}{3}\text{ mm} = 66\tfrac{2}{3}$ mm. If we bring in a ray from an object at $\infty$, the “intermediate” image formed by L2 is located at the focal point of L2:

$$z_1' = \left(\frac{1}{f_2} - \frac{1}{z_1}\right)^{-1} = \left(\frac{1}{50\text{ mm}} - \frac{1}{\infty}\right)^{-1} = 50\text{ mm}$$

Thus the object distance to L1 is:

$$z_2 = t - z_1' = 75\text{ mm} - 50\text{ mm} = +25\text{ mm}$$

The image of the object at $z_1 = \infty$ produced by the entire system is located at $z_2'$:

$$z_2' = \left(\frac{1}{f_1} - \frac{1}{z_2}\right)^{-1} = \left(\frac{1}{100\text{ mm}} - \frac{1}{+25\text{ mm}}\right)^{-1} = -\frac{100}{3}\text{ mm} = -33\tfrac{1}{3}\text{ mm}$$

measured from the “second” lens L1 (or equivalently from the second vertex). The image distance is negative, so the image is “behind” the second lens and is thus virtual. The object-space principal point H is the point such that the distance $\overline{FH} = f_{eff} = 66\tfrac{2}{3}$ mm. Since the focal point lies $33\tfrac{1}{3}$ mm from L1 on the side towards L2, H is located $66\tfrac{2}{3}\text{ mm} + 33\tfrac{1}{3}\text{ mm} = 100$ mm from L1, which is 25 mm IN FRONT of L2 in the reversed system. This agrees with the general result $\overline{VH} = f_1 t / ((f_1 + f_2) - t) = 100$ mm.
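The forward and reversed traces of this example can be repeated with exact rational arithmetic, which avoids rounding in the thirds. This is a minimal sketch using Python's `fractions` module; the helper name `thin_lens_image` and the use of `None` to represent an object at infinity are assumptions made for illustration.

```python
from fractions import Fraction


def thin_lens_image(z, f):
    """Image distance from 1/z + 1/z' = 1/f; z=None stands for an object at infinity."""
    inverse = Fraction(1) / f - (0 if z is None else Fraction(1) / z)
    return 1 / inverse


f1, f2, t = Fraction(100), Fraction(50), Fraction(75)  # mm
f_eff = f1 * f2 / (f1 + f2 - t)

# Forward system (L1 first): back focal length and image-space principal point
z1_prime = thin_lens_image(None, f1)       # 100 mm
bfl = thin_lens_image(t - z1_prime, f2)    # V'F'
hp_to_vp = f_eff - bfl                     # H'V'

# Reversed system (L2 first): front focal length and object-space principal point
z1_prime_rev = thin_lens_image(None, f2)     # 50 mm
ffl = thin_lens_image(t - z1_prime_rev, f1)  # FV (negative: F lies between the lenses)
v_to_h = f_eff - ffl                         # VH

print("f_eff =", f_eff, "mm")
print("BFL   =", bfl, "mm;  H'V' =", hp_to_vp, "mm")
print("FFL   =", ffl, "mm;  VH   =", v_to_h, "mm")
```

The result VH = 100 mm places H 25 mm beyond L2 as measured from L1, which matches the closed-form expression $f_1 t / ((f_1 + f_2) - t)$.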

The “object-space” principal point H may be located by “reversing” the system and bringing in a ray from an object at infinity.

When we “re-reverse” the system to graph the object- and image-space principal points, H is located “behind” the lens L2, as shown in the graphical rendering of the entire system:

The principal and focal points of the two-lens imaging system in both object and image spaces. The object-space principal point is the location of the equivalent thin lens if the imaging system is reversed. We can now use these locations of the equivalent thin lens in the two spaces to locate the images by applying the thin-lens (Gaussian) imaging equation, BUT the distances z and z0 are respectively measured from the object V to the object-space principal point H and from the image- space principal point H0 to the image point O0. The process is demonstrated after first locating the images via a direct calculation.

“Brute Force” Calculation of Image

Now consider the location and magnification of the image created by the original two-lens imaging system (with L1 in front) for an object located 1000 mm in front of the system (so that $OV = 1000$ mm). We can locate the image step by step:

Intermediate image created by L1:

$$z_1' = \left(\frac{1}{f_1} - \frac{1}{z_1}\right)^{-1} = \left(\frac{1}{100\,\text{mm}} - \frac{1}{1000\,\text{mm}}\right)^{-1} = \frac{1000}{9}\,\text{mm} \cong 111.11\,\text{mm}$$

Transverse magnification of intermediate image:

$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{\frac{1000}{9}\,\text{mm}}{1000\,\text{mm}} = -\frac{1}{9}$$

Distance from intermediate image to L2:

$$z_2 = t - z_1' = 75\,\text{mm} - \frac{1000}{9}\,\text{mm} = -\frac{325}{9}\,\text{mm} \cong -36.11\,\text{mm}$$

Distance from L2 to final image:

$$z_2' = \left(\frac{1}{f_2} - \frac{1}{z_2}\right)^{-1} = \left(\frac{1}{50\,\text{mm}} - \frac{1}{-\frac{325}{9}\,\text{mm}}\right)^{-1} = +\frac{650}{31}\,\text{mm} \cong +20.97\,\text{mm}$$

Transverse magnification of second image:

$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{\frac{650}{31}\,\text{mm}}{-\frac{325}{9}\,\text{mm}} = +\frac{18}{31}$$

The transverse magnification of the image from the entire system is the product of the transverse magnifications from each lens:

$$M_T = (M_T)_1 \cdot (M_T)_2 = \left(-\frac{1}{9}\right) \cdot \left(+\frac{18}{31}\right) = -\frac{2}{31}$$

which indicates that the image is minified and inverted.
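The step-by-step calculation can be checked numerically. The following is a minimal sketch, not part of the original notes; the function names `thin_lens_image` and `trace_two_lenses` are hypothetical, and exact rational arithmetic is used so that the fractions can be compared directly with the hand calculation.

```python
from fractions import Fraction

def thin_lens_image(z, f):
    """Image distance z' from 1/z + 1/z' = 1/f (hypothetical helper)."""
    return 1 / (Fraction(1) / f - Fraction(1) / z)

def trace_two_lenses(z1, f1, f2, t):
    """Trace an object at distance z1 in front of L1 through L1 and then L2."""
    z1p = thin_lens_image(z1, f1)    # intermediate image distance from L1
    z2 = t - z1p                     # object distance to L2 (negative: virtual object)
    z2p = thin_lens_image(z2, f2)    # final image distance from L2
    mt = (-z1p / z1) * (-z2p / z2)   # product of the two transverse magnifications
    return z2p, mt

z2p, mt = trace_two_lenses(Fraction(1000), Fraction(100), Fraction(50), Fraction(75))
print(z2p, mt)  # 650/31 -2/31
```

The printed values are the image distance from L2 (in mm) and the system transverse magnification.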

Imaging Equation using Principal Points

We have just seen that the object- and image-space principal points are the “reference” locations from which the system focal length is measured:

$$f_{\text{eff}} = FH = H'F'$$

In exactly the same way, these principal points are the “reference” locations from which the object and image distances are measured:

$$z = OH$$

$$z' = H'O'$$

The ray entering the system can be modeled as traveling from the object O to the object-space principal point H. The resulting outgoing (image) ray travels from the image-space principal point $H'$ to the image point $O'$. This may seem a little “weird”, but actually makes perfect sense if we relate the measurements to the equation for a single thin lens. In that situation, focal lengths are measured from the object-space focal point to the thin lens and from the lens to the image-space focal point. In other words, the object- and image-space vertices V and $V'$ of a thin lens coincide with the principal points H and $H'$. We know that an object located at the lens ($z = 0$) generates an image at the lens ($z' = 0$) with magnification of +1; the heights of the object and image at the principal points are identical. In the realistic system where the object- and image-space principal points are at different locations, the image of an object located at the object-space principal point is formed at the image-space principal point with unit transverse magnification $M_T = +1$. In other words, the principal points are the locations of conjugate points with unit transverse magnification.

Notice the difference to the situation where the object distance $OH = 2f$, so that the image distance $H'O' = 2f$ with transverse magnification $M_T = -1$:

$$OH = z = 2f$$

$$\frac{1}{z} + \frac{1}{z'} = \frac{1}{f} \implies z' = H'O' = 2f$$

$$M_T = -\frac{2f}{2f} = -1$$

This case where the object and image distances are equal so that the transverse magnification is $-1$ often is called imaging at equal conjugates.

Note the positions of the principal and focal planes of the system we just analyzed: $f_1 = +100$ mm, $f_2 = +50$ mm, and $t = +75$ mm. The principal points are “crossed,” which means that the object-space principal point is farther towards image space than the image-space principal point (H is “behind” $H'$). Such a system is more “compact,” because the image is closer to the object-space principal point, so that $F'$ is closer than $V'O'$

Principal points of an imaging system: The dashed ray from the object at O reaches the object-space principal point H with height $h$. The image ray (solid line) departs from the image-space principal point $H'$ with the same height $h$ and goes to the image point $O'$, so that the distances $OH = z$ and $H'O' = z'$ satisfy the imaging equation $\frac{1}{z} + \frac{1}{z'} = \frac{1}{f_{\text{eff}}}$.

Location of Image using Principal Points

We can also analyze this system by using the model of the single thin lens located at the object- and image-space principal points. We have already shown that the focal length of the system is:

$$f_{\text{eff}} = FH = H'F' = +\frac{200}{3}\,\text{mm}$$

The object and image distances $z$ and $z'$ of the single lens equivalent to the two-lens system are respectively measured from the principal points: $z = OH$ and $z' = H'O'$.

The object distance is measured to the object-space principal point, which is 100 mm behind L1 (or V); thus the object distance is the distance from O to L1 plus 100 mm:

$$z = OV + VH = 1000\,\text{mm} + 100\,\text{mm} = 1100\,\text{mm}$$

The single-lens imaging equation may be used to find the image distance $z'$, which now is MEASURED FROM THE IMAGE-SPACE PRINCIPAL POINT $H'$ (and NOT from the image-space vertex $V'$).

$$z' = \left(\frac{1}{f_{\text{eff}}} - \frac{1}{z}\right)^{-1} = \left(\frac{1}{\frac{200}{3}\,\text{mm}} - \frac{1}{1100\,\text{mm}}\right)^{-1} = H'O' = \frac{2200}{31}\,\text{mm} \cong 70.97\,\text{mm}$$

The image distance from the vertex is calculated by subtracting the distance from the image-space principal point $H'$ to the image-space vertex $V'$:

$$V'O' = H'O' - H'V' = \frac{2200}{31}\,\text{mm} - 50\,\text{mm} = \frac{650}{31}\,\text{mm} \cong +20.97\,\text{mm}$$

The resulting transverse magnification is:

$$M_T = -\frac{z'}{z} = -\frac{\frac{2200}{31}\,\text{mm}}{1100\,\text{mm}} = -\frac{2}{31} \cong -0.065$$

Both the image distance and the transverse magnification match the values obtained with the step-by-step calculation performed above (as they must!).
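The principal-point calculation can be reproduced in a few lines. This is a sketch under stated assumptions rather than part of the original notes: the variable names are hypothetical, and the expressions used for $VH$ and $H'V'$ are the ones listed in the summary of distances for the two-lens system.

```python
from fractions import Fraction

f1, f2, t = Fraction(100), Fraction(50), Fraction(75)   # mm
D = (f1 + f2) - t                 # common denominator of the two-lens formulas

f_eff = f1 * f2 / D               # FH = H'F' = 200/3 mm
VH = f1 * t / D                   # front vertex V to object-space principal point H
HpVp = f2 * t / D                 # image-space principal point H' to rear vertex V'

OV = Fraction(1000)               # object 1000 mm in front of L1
z = OV + VH                       # object distance measured to H
zp = 1 / (1 / f_eff - 1 / z)      # image distance measured from H'
VpOp = zp - HpVp                  # image distance measured from the rear vertex V'
MT = -zp / z                      # transverse magnification

print(zp, VpOp, MT)  # 2200/31 650/31 -2/31
```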

2.13.7 Cardinal Points

The object-space and image-space focal and principal points are four of the six so-called cardinal points that determine the paraxial properties of an imaging system. There are three pairs of locations where one of each pair is in object space and the other is in image space. The object- and image-space focal points are F and $F'$, while the principal points H and $H'$ are the locations on the axis in object and image space that are images of each other with transverse magnification $M_T = +1$. The nodal points N and $N'$ are the points in object and image space where the ray angle of the entering and exiting rays are identical, which means that the angular magnification of rays “into” and “out of” the nodal points is $M_\theta = +1$. The principal and nodal points coincide for systems with the object and image spaces in the same medium (e.g., both object space and image space in air). A table of significant points on the axis of a paraxial system is given below:

| Axial Point | Object Space (front) | Image Space (back) | Conjugate Points? (object and image?) |
|---|---|---|---|
| Focal Points | F | $F'$ | No |
| Nodal Points | N | $N'$ | Yes: $M_\theta = +1$ |
| Principal Points | H | $H'$ | Yes: $M_T = +1$ |
| Vertices | V | $V'$ | No |
| Object/Image | O | $O'$ | Yes: $M_T = -\frac{H'O'}{OH} = -\frac{z'}{z}$ |
| Entrance/Exit Pupils | E | $E'$ | Yes, $M_T$ varies |
| “Equal Conjugates” | $OH = 2f_{\text{eff}}$ | $z' = H'O' = 2f_{\text{eff}}$ | Yes: $M_T = -1$ |

2.13.8 Lenses separated by t = f1 + f2: Afocal System (Telescope)

If the two lenses are separated by the sum of the focal lengths, then an object at $\infty$ forms an image at $\infty$; the system focal length is infinite. Since the focal points are both located at infinity, we say that the system is afocal; it has zero power, i.e., rays that enter the system parallel to one another also exit parallel to one another. If the focal length of the first lens is longer than that of the second, the system is a telescope.

Two thin lenses separated by the sum of their focal lengths. An object located an infinite distance from the first lens forms an “intermediate” image at the image-space focal point $F_1'$ of the first lens. The second lens forms an image at infinity. Both object- and image-space focal lengths of the equivalent system are infinite: $f = f' = \infty$. The system has “no” focal points; it is afocal.

The focal length of this system is infinite:

$$\frac{1}{f_{\text{eff}}} = 0 \implies \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 \cdot f_2} = 0$$

$$\left(\frac{1}{f_1} + \frac{1}{f_2}\right) - \left(\frac{f_1 + f_2}{f_1 f_2}\right) = 0 \implies t = f_1 + f_2$$

which shows that the separation between the two lenses is $t = f_1 + f_2$.

Angular Magnification of a Telescope

The telescope has infinite focal length and therefore no “power,” but you already know that it does “something.” Consider the system’s effect on a ray that enters the first lens at its center at angle $\theta$, so it is transmitted through the lens with no change in angle. The ray crosses the axis at the first lens and travels the distance $z_2 = f_1 + f_2$ to the second lens, where it is deviated to make the angle $\theta'$ with the optical axis. We need to relate $\theta$ and $\theta'$ to evaluate the angular magnification.

Angular magnification of a telescope: the red ray strikes the center of the first lens at angle $\theta$ and is transmitted without deviation (because the sides are parallel at the center and the lens is thin). The ray is deviated by the second lens at angle $\theta'$. The angular magnification is the ratio of these two angles.

From the figure, note that the angle of the entering ray is positive and that of the exiting ray is negative. The angle of the entering ray may be determined from the triangle “between” the lenses with sides $(f_1 + f_2)$ and $h$:

$$\tan[\theta] = \frac{h}{f_1 + f_2} \cong \theta$$

To find the exiting angle $\theta'$, we need to find the distance from the second lens to the point where the ray crosses the axis. This is easy to find using the imaging equation for a thin lens in air:

$$\frac{1}{z_2} + \frac{1}{z_2'} = \frac{1}{f_2} \implies z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2}$$

where the object distance $z_2$ is the distance between the lenses:

$$z_2 = t = f_1 + f_2$$

so the image distance for the red ray is:

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(f_1 + f_2) \cdot f_2}{(f_1 + f_2) - f_2} = (f_1 + f_2) \cdot \frac{f_2}{f_1}$$

The angle $\theta'$ satisfies the condition:

$$\tan[\theta'] = -\frac{h}{z_2'} = -\frac{h}{(f_1 + f_2) \cdot \frac{f_2}{f_1}} = -\frac{f_1}{f_2} \cdot \frac{h}{f_1 + f_2} \cong \theta'$$

So the angular magnification is:

$$M_\theta = \frac{\theta'}{\theta} \cong \frac{-\frac{f_1}{f_2} \cdot \frac{h}{f_1 + f_2}}{\frac{h}{f_1 + f_2}} = -\frac{f_1}{f_2}$$

where the negative sign means that the two angles have different algebraic signs. In words, the magnitude of the angular magnification of a telescope is the ratio of the focal length of the first lens to that of the second. If the two lenses are both positive (Keplerian telescope), then the angular magnification is negative. If the objective (first lens) has positive power and the ocular (second lens) is negative (Galilean telescope), then the angular magnification is positive.

The angular magnification shows that two distant objects separated by a small angle (as a double star in the sky) will be separated by a larger angle if viewed through a telescope.
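The result $M_\theta = -f_1/f_2$ can be checked by tracing the same ray numerically. The sketch below is an illustration only: the function name is hypothetical, and it assumes the standard paraxial refraction rule for a thin lens (a ray at height $h$ has its angle changed by $-h/f$), which is equivalent to the imaging equation used above but is not derived in this section.

```python
def telescope_angular_magnification(f1, f2, theta=1e-3):
    """Trace a paraxial ray through the center of L1 of an afocal pair (t = f1 + f2)."""
    t = f1 + f2                  # afocal separation
    h = theta * t                # ray height at L2 (the ray crossed the axis at L1)
    theta_out = theta - h / f2   # assumed thin-lens rule: angle changes by -h/f
    return theta_out / theta

# Keplerian telescope: both lenses positive, so the magnification is negative.
print(telescope_angular_magnification(100.0, 25.0))
# Galilean telescope: negative ocular, so the magnification is positive.
print(telescope_angular_magnification(100.0, -25.0))
```

For these hypothetical focal lengths the two ratios come out close to $-4$ and $+4$ respectively (up to floating-point rounding), in agreement with $-f_1/f_2$.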

2.13.9 Positive Lenses Separated by t = f1 or t = f2

We now continue the sequence of examples for two positive lenses separated by increasing distances. If two positive lenses are separated by the focal length of the first lens, then the focal length of the system is:

$$f_{\text{eff}} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - f_1} = \frac{f_1 \cdot f_2}{f_2} = f_1 \quad (\text{if } t = f_1)$$

In words, the focal length of a system of two lenses separated by the focal length of the first lens is equal to the focal length of the first lens.

If the two lenses are separated by the focal length of the second lens, then the system focal length is $f_2$:

$$f_{\text{eff}} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - f_2} = \frac{f_1 \cdot f_2}{f_1} = f_2 \quad (\text{if } t = f_2)$$

Recall that the transverse magnification is approximately proportional to the focal length if the object is distant:

$$\begin{aligned}
M_T = -\frac{z'}{z} &= -\frac{\left(\frac{z \cdot f}{z - f}\right)}{z} = -\frac{f}{z - f} = -\frac{f}{z} \cdot \left(\frac{1}{1 - \frac{f}{z}}\right) \\
&= -\frac{f}{z} \cdot \sum_{n=0}^{+\infty} \left(\frac{f}{z}\right)^{n} = -\sum_{n=0}^{+\infty} \left(\frac{f}{z}\right)^{n+1} \\
&\cong -\frac{f}{z} \propto -f \quad \text{if } z \gg f
\end{aligned}$$

where the formula for the converging geometric series has been used. In words, the transverse magnification of a distant object formed by an imaging system is approximately proportional to the focal length (which is why long focal lengths are used to image distant objects).

For the purpose of this example, we analyze the second case because it is the basis for probably the most common application of imaging optics. The extension to the first case is trivial. Since the focal length of the system is identical to the focal length of the second lens, this raises the question of how the image changes if the front lens is added.

Effect of adding lens L1 at the object-space focal point of lens L2, so that $t = f_2$ and $f_{\text{eff}} = f_2$. The upper sketch is the lens L2 alone, and the lower drawing shows the situation with L1 added.

Consider a specific case with $f_2 = 100$ mm and $f_1 = 200$ mm. If only L2 is present and the object distance is $z_2 = 1100$ mm, then the image distance is:

$$z_2' = \left(\frac{1}{f_2} - \frac{1}{z_2}\right)^{-1} = \left(\frac{1}{100\,\text{mm}} - \frac{1}{1100\,\text{mm}}\right)^{-1} = 110\,\text{mm}$$

The associated transverse magnification is:

$$(M_T)_{L2\ \text{alone}} = -\frac{z_2'}{z_2} = -\frac{+110\,\text{mm}}{+1100\,\text{mm}} = -\frac{1}{10}$$

Now add L1 at the front focal point of L2 and find the associated image. The object distance to L1 is $1100\,\text{mm} - 100\,\text{mm} = 1000\,\text{mm}$. The first lens forms an image at distance:

$$z_1' = \left(\frac{1}{f_1} - \frac{1}{z_1}\right)^{-1} = \left(\frac{1}{200\,\text{mm}} - \frac{1}{1000\,\text{mm}}\right)^{-1} = 250\,\text{mm}$$

with transverse magnification:

$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{+250\,\text{mm}}{+1000\,\text{mm}} = -\frac{1}{4}$$

The object distance to the second lens is:

$$z_2 = t - z_1' = 100\,\text{mm} - 250\,\text{mm} = -150\,\text{mm}$$

and the resulting image distance behind lens L2 is:

$$z_2' = \left(\frac{1}{f_2} - \frac{1}{z_2}\right)^{-1} = \left(\frac{1}{100\,\text{mm}} - \frac{1}{-150\,\text{mm}}\right)^{-1} = +60\,\text{mm}$$

Compare the image distances behind lens L2 and the system focal lengths without and with L1 in the system:

$$z_2'\ (\text{without L1}) = V'O'\ (\text{without L1}) = +110\,\text{mm} > V'O'\ (\text{with L1}) = +60\,\text{mm}$$

the image has moved “closer” to lens L2.

$$f_{\text{eff}}\ (\text{without L1}) = 100\,\text{mm} = f_{\text{eff}}\ (\text{with L1})$$

Now check the other attributes of the image. Recall that $M_T = -0.1$ if using L2 alone. If using both lenses, the transverse magnification of the image formed by the second lens is:

$$(M_T)_2 = -\frac{60\,\text{mm}}{-150\,\text{mm}} = +\frac{2}{5}$$

The magnification of the system is the product of the magnifications due to each lens:

$$M_T\ (\text{system with L1 and L2}) = (M_T)_1 \cdot (M_T)_2 = \left(-\frac{1}{4}\right)\left(+\frac{2}{5}\right) = -\frac{1}{10} = M_T\ (\text{L2 alone})$$

$$M_T\ (\text{without L1}) = M_T\ (\text{with L1}) \quad \text{if } t = f_2$$

which is the same as for lens L2 alone! The transverse magnification of the system is not changed by the addition of lens L1 with focal length $f_1$ placed at the front focal point of lens L2. If $f_1 > 0$, the image distance measured from L2 is shorter if L1 is present than if L1 is missing. Obviously, if the first lens has negative power ($f_1 < 0$), the image distance measured from L2 is longer if L1 is present than if L1 is missing. Put another way, the addition of lens L1 located at the object-space focal point of lens L2 moves the principal points and focal points by equal distances either “forward” (towards L2) if $f_1 > 0$ or “backwards” (farther from L2) if $f_1 < 0$, but the focal length is unchanged.

This system demonstrates the principle of eyeglass lenses, where the ideal location for the corrective lens is at the object-space focal point of the eyelens (this is the reason that eyeglasses are “on your nose”). The corrective action of a negative lens L1 placed at the front focal point of L2 moves the image location “backwards” (away from L2) to correct “nearsightedness” without changing the transverse magnification of the imaging system. A positive lens L1 placed at the front focal point of L2 will move the image “forwards” (towards L2) to correct “farsightedness.”
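The invariance of the magnification can be verified for both signs of $f_1$. The following is a minimal sketch with hypothetical function names; the negative-lens case ($f_1 = -200$ mm) is an assumed value chosen for illustration and does not appear in the example above.

```python
from fractions import Fraction

def image(z, f):
    """Thin-lens image distance from 1/z + 1/z' = 1/f."""
    return z * f / (z - f)

def with_front_lens(f1, f2, z_from_L2):
    """Add L1 at the front focal point of L2 (t = f2); return (z2', M_T)."""
    t = f2
    z1 = z_from_L2 - t             # object distance measured from L1
    z1p = image(z1, f1)            # intermediate image distance from L1
    z2 = t - z1p                   # object distance to L2
    z2p = image(z2, f2)            # final image distance from L2
    return z2p, (-z1p / z1) * (-z2p / z2)

f2, z = Fraction(100), Fraction(1100)          # mm
alone = image(z, f2)
print("L2 alone:", alone, -alone / z)
for f1 in (Fraction(200), Fraction(-200)):     # -200 mm is an assumed example value
    print("f1 =", f1, "->", *with_front_lens(f1, f2, z))
```

With these numbers the image moves from 110 mm to 60 mm for the positive front lens and to 160 mm for the negative one, while the transverse magnification stays at $-1/10$ in all three cases.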

2.13.10 Positive Lenses Separated by t>f1 + f2

If the two positive lenses are separated by more than the sum of the focal lengths, the focal length of the resulting system is negative:

$$f_{\text{eff}} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} < 0$$

If the object distance is $\infty$, the first lens forms an “intermediate” image at its image-space focal point, i.e., at $z_1' = f_1$. Since the object distance $z_2$ measured from the second lens is larger than $f_2$, a “real” image is formed by the second lens at the system focal point $F'$. If we extend the exiting ray until it intersects the incoming ray from the object at infinity, we can locate the equivalent single thin lens for the system, i.e., the image-space principal point $H'$. In this case, this is located farther from the second lens than the focal point. The effective focal length $f_{\text{eff}} = H'F' < 0$, so the system has negative power.

The system composed of two thin lenses separated by $t > f_1 + f_2$. The image-space focal point $F'$ of the system is beyond the second lens, but the image-space principal point $H'$ is located even farther from L2. The distance $H'F' = f_{\text{eff}} < 0$, so the system has negative power!

2.13.11 Compound Microscopes

We have already discussed the simple magnifier, where the object is located closer to the positive lens than the focal length, thus forming a larger upright virtual image close to the near point of the eye. In the compound magnifier (more commonly called the compound microscope) formed from two lenses, the objective and eyelens generally have a short positive focal length and a longer focal length, respectively. The focal points of the two lenses are separated by a fixed distance, the “tube length,” which is now standardized by the Royal Microscopical Society as $t = 160$ mm, though some companies manufacture other lengths (e.g., Leitz with $t = 170$ mm). Not that it matters in this class, but it is important to ensure that the objective is used with the correct tube length to minimize aberrations in the final image.

Modern microscope systems are often “infinity corrected,” which means that the object is located in the front focal plane of the objective so that the rays emerging are parallel (collimated). This feature allows a beamsplitter to be introduced in the light path for a second eyelens, camera, or other apparatus. A lens within the microscope tube (the “tube lens,” duh) creates an intermediate image that is viewed by the eyelens.

In more traditional microscopes, the object typically is located just beyond the focal point of the short-focal-length positive objective lens (so that the object distance $z_1 \gtrsim f_1$), thus forming a large real inverted image that is positioned at the front focal point of the ocular (eye lens). The eye lens then forms an image at infinity, i.e., the parallel rays emerging from the ocular are viewed by a relaxed eye. Microscope objectives and eyepieces are labeled by “magnifying powers,” e.g. 10X - 40X for the objective and 10X for the ocular. The total magnification is the product, so that a 10X objective and 10X ocular yield a magnification of 100X.

The magnifying power of an objective with focal length $f_1$ and tube length 160 mm is:

$$M_1 = -\frac{160\,\text{mm}}{f_1}$$

For example, objectives with these focal lengths have magnifying powers:

$$f_1 = 16\,\text{mm} \implies M_1 = 10\text{X}$$

$$f_1 = 1.6\,\text{mm} \implies M_1 = 100\text{X}$$

The magnifying power of the eyelens is calculated from the same formula used for the simple magnifier:

$$(M_\theta)_2 = \frac{250\,\text{mm}}{f_2}$$

with sample value:

$$f_2 = 25.4\,\text{mm} \implies M_2 \cong 10\text{X}$$

The magnifying power of the compound microscope is the product of the two magnifying powers:

$$\text{M.P.} = (M_\theta)_1 \cdot (M_\theta)_2 = -\frac{160\,\text{mm}}{f_1} \cdot \frac{250\,\text{mm}}{f_2} = -\frac{160\,\text{mm}}{1.6\,\text{mm}} \cdot \frac{250\,\text{mm}}{25.4\,\text{mm}} \cong -1000\text{X}$$

where again the negative sign means that the image is inverted.
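The arithmetic for the magnifying power is easy to script. This is a small sketch with hypothetical function names; the 160 mm tube length and the 250 mm near-point distance are the values used in the formulas above.

```python
def objective_power(f1_mm, tube_length_mm=160.0):
    """Magnifying power of the objective: -(tube length) / f1."""
    return -tube_length_mm / f1_mm

def eyepiece_power(f2_mm, near_point_mm=250.0):
    """Magnifying power of the eyelens (simple-magnifier formula): 250 mm / f2."""
    return near_point_mm / f2_mm

def microscope_power(f1_mm, f2_mm):
    """Magnifying power of the compound microscope: product of the two."""
    return objective_power(f1_mm) * eyepiece_power(f2_mm)

print(round(microscope_power(1.6, 25.4)))  # -984
```

The unrounded product is about $-984$, which is quoted as approximately $-1000\text{X}$ because the 25.4 mm eyelens is nominally 10X.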

2.13.12 Two Positive Lenses with Different Focal Lengths and Different Separations

From the list of distances for a two-lens system:

$$f_{\text{eff}} = H'F' = FH = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t}$$

$$\text{BFL} = V'F' = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t}$$

$$H'V' = H'F' - V'F' = \frac{f_2 \cdot t}{(f_1 + f_2) - t}$$

$$\text{FFL} = FV = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t}$$

$$VH = FH - FV = \frac{f_1 \cdot t}{(f_1 + f_2) - t}$$

we can determine the impact of the lens separation $t$ for the specific example:

$$f_1 = +100\,\text{mm}$$

$$f_2 = +25\,\text{mm}$$

| $t$ | FFL | BFL | $f_{\text{eff}}$ |
|---|---|---|---|
| 0 mm | +20 mm | +20 mm | +20 mm |
| +25 mm $= f_2$ | 0 mm | +18.75 mm | +25 mm $= f_2$ |
| +50 mm | $-33\tfrac{1}{3}$ mm | $+16\tfrac{2}{3}$ mm | $+33\tfrac{1}{3}$ mm |
| +75 mm | $-100$ mm | +12.5 mm | +50 mm |
| +100 mm $= f_1$ | $-300$ mm | 0 mm | +100 mm $= f_1$ |
| +125 mm $= f_1 + f_2$ (afocal) | $\infty$ | $\infty$ | $\infty$ |
| +150 mm | +500 mm | +50 mm | $-100$ mm |
| +175 mm | +300 mm | +37.5 mm | $-50$ mm |

The effect of varying the lens separation $t$ on the effective focal length $f_{\text{eff}}$ for $f_1 = +100$ mm and $f_2 = +25$ mm, with a magnified view in (b). The system is afocal if $t = f_1 + f_2 = 125$ mm; $f_{\text{eff}} > 0$ for $t < f_1 + f_2$ and $f_{\text{eff}} < 0$ for $t > f_1 + f_2$.
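The entries of the table follow directly from the three expressions for FFL, BFL, and $f_{\text{eff}}$. The sketch below regenerates them; the function name `two_lens_distances` is hypothetical, and the afocal case is reported separately because all three distances are infinite there.

```python
from fractions import Fraction

def two_lens_distances(f1, f2, t):
    """Return (FFL, BFL, f_eff) in the units of the inputs, or None if afocal."""
    D = (f1 + f2) - t            # common denominator of all three expressions
    if D == 0:
        return None              # t = f1 + f2: afocal system
    f_eff = f1 * f2 / D
    bfl = (f1 - t) * f2 / D      # V'F'
    ffl = f1 * (f2 - t) / D      # FV
    return ffl, bfl, f_eff

f1, f2 = Fraction(100), Fraction(25)   # mm
for t in (0, 25, 50, 75, 100, 125, 150, 175):
    row = two_lens_distances(f1, f2, Fraction(t))
    if row is None:
        print(f"t = {t:3d} mm: afocal")
    else:
        ffl, bfl, f_eff = (float(v) for v in row)
        print(f"t = {t:3d} mm: FFL = {ffl:8.2f}  BFL = {bfl:7.2f}  f_eff = {f_eff:8.2f}")
```

Changing `f2` to `Fraction(-25)` gives the corresponding values for one positive and one negative lens.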

2.13.13 Systems of One Positive and One Negative Lens

We also consider the case where $f_1 = +100$ mm and $f_2 = -25$ mm. The focal length for $t = 0$ is:

$$f_{\text{eff}} = \left(\frac{1}{f_1} + \frac{1}{f_2}\right)^{-1} = \left(\frac{1}{+100\,\text{mm}} + \frac{1}{-25\,\text{mm}}\right)^{-1} = -\frac{100}{3}\,\text{mm} \cong -33.33\,\text{mm}$$

The system focal length is negative for $t < 75$ mm and positive for $t > 75$ mm.

The effect of varying the lens separation $t$ on the effective focal length $f_{\text{eff}}$ for $f_1 = +100$ mm and $f_2 = -25$ mm, with a magnified view in (b). The system is afocal if $t = f_1 + f_2 = 75$ mm; $f_{\text{eff}} < 0$ for $t < f_1 + f_2$ and $f_{\text{eff}} > 0$ for $t > f_1 + f_2$.

2.13.14 Newtonian Form of Imaging Equation

We have already seen the familiar Gaussian form of the imaging equation:

$$\frac{1}{z} + \frac{1}{z'} = \frac{1}{f}$$

An equivalent form is obtained by defining the distances $x$ and $x'$ that are the differences between the object and image distances and the focal length:

$$z = x + f \implies x = z - f$$

$$z' = x' + f \implies x' = z' - f$$

In the case of a real object O and real image $O'$ as shown in the figure, both $x$ and $x'$ are positive.

The definition of the parameters $x$, $x'$ in the Newtonian form of the imaging equation. For a real image, both $x$ and $x'$ are positive.

By simple substitution into the imaging equation, we obtain:

$$\begin{aligned}
\frac{1}{f} = \frac{1}{x + f} + \frac{1}{x' + f} &= \frac{(x' + f) + (x + f)}{(x + f) \cdot (x' + f)} = \frac{x + x' + 2f}{x x' + (x + x') f + f^2} \\
\implies f &= \frac{x x' + (x + x') \cdot f + f^2}{(x + x') + 2f} \\
\implies x \cdot x' + f^2 &= 2 f^2 \\
\implies x \cdot x' &= f^2
\end{aligned}$$

This is the Newtonian form of the imaging equation. The same expression applies for virtual images, but the sign of the distances must be adjusted, as shown:

The parameters $x$, $x'$ of the Newtonian form for a virtual image.
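The equivalence of the Gaussian and Newtonian forms can be spot-checked with numbers. The sketch below uses a hypothetical focal length of 100 mm and three assumed object distances (the last one lies inside the focal length and therefore gives a virtual image); the function name is also hypothetical.

```python
from fractions import Fraction

def newtonian_pair(z, f):
    """Return (x, x') for an object at Gaussian distance z from a lens of focal length f."""
    zp = z * f / (z - f)     # Gaussian image distance from 1/z + 1/z' = 1/f
    return z - f, zp - f     # distances measured from the focal points

f = Fraction(100)            # hypothetical focal length in mm
for z in (Fraction(150), Fraction(300), Fraction(50)):
    x, xp = newtonian_pair(z, f)
    print(z, x, xp, x * xp == f**2)
```

For the virtual-image case both $x$ and $x'$ come out negative, so their product is still $f^2$.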

2.13.15 Example (1) of Two-Lens System

Find the cardinal points of the two-lens system

$$f_1 = +100\,\text{mm}$$

$$f_2 = +25\,\text{mm}$$

$$t = +50\,\text{mm}$$

The effective focal length is:

$$f_{\text{eff}} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} = \frac{100\,\text{mm} \cdot 25\,\text{mm}}{100\,\text{mm} + 25\,\text{mm} - 50\,\text{mm}} = +\frac{100}{3}\,\text{mm} = +33\tfrac{1}{3}\,\text{mm}$$

Now find the location of the focal point from the formula for the back focal length:

$$\text{BFL} = V'F' = \frac{f_2 \cdot (f_1 - t)}{(f_1 + f_2) - t} = \frac{25\,\text{mm} \cdot (100\,\text{mm} - 50\,\text{mm})}{(100\,\text{mm} + 25\,\text{mm}) - 50\,\text{mm}} = \frac{50}{3}\,\text{mm}$$

Alternatively, we can track a ray from infinity through the system. The image distance from the first lens is $f_1 = +100$ mm, so the object distance to the second lens is

$$z_2 = t - f_1 = 50\,\text{mm} - 100\,\text{mm} = -50\,\text{mm}$$

The image distance from the second lens is:

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(-50\,\text{mm}) \cdot (+25\,\text{mm})}{(-50\,\text{mm}) - (+25\,\text{mm})} = \frac{50}{3}\,\text{mm} = V'F'$$

(parenthetical note, this is half the focal length). We can now draw the image-space focal and principal points:

To find the object-space focal point, we can evaluate the front focal length for the same system ($f_1 = +100$ mm, $f_2 = +25$ mm, $t = +50$ mm):

$$\text{FFL} = FV = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} = \frac{(+100\,\text{mm}) \cdot (25\,\text{mm} - 50\,\text{mm})}{(100\,\text{mm} + 25\,\text{mm}) - 50\,\text{mm}} = -\frac{100}{3}\,\text{mm}$$

which says that the object-space focal point is to the right of the object-space vertex. From the effective focal length, we can locate the object-space principal point:

$$FH = f_{\text{eff}} = +\frac{100}{3}\,\text{mm}$$

$$FV = FH + HV$$

$$-\frac{100}{3}\,\text{mm} = +\frac{100}{3}\,\text{mm} + HV \implies HV = -\frac{100}{3}\,\text{mm} - \frac{100}{3}\,\text{mm} = -\frac{200}{3}\,\text{mm}$$

Alternatively, we “turn the system around” and bring in light from the left. The image distance from the “first lens” (actually L2) is equal to its focal length:

$$z_1' = f_2 = +25\,\text{mm}$$

So the object distance to the lens with $f_1 = +100$ mm is:

$$z_2 = t - z_1' = 50\,\text{mm} - 25\,\text{mm} = +25\,\text{mm}$$

So the distance from this lens to the system image-space focal point is:

$$z_2' = \frac{z_2 \cdot f_1}{z_2 - f_1} = \frac{(+25\,\text{mm}) \cdot (+100\,\text{mm})}{(+25\,\text{mm}) - (+100\,\text{mm})} = -\frac{100}{3}\,\text{mm}$$

The object-space focal point is virtual and the object-space principal point is located at the distance $f_{\text{eff}}$ behind it in the reversed system.

We can now reverse the second case and plot the four cardinal points ($F$, $F'$, $H$, $H'$) on the same graph:

Object-space and image-space cardinal points for two-lens system with $f_1 = +100$ mm, $f_2 = +25$ mm, $t = +50$ mm. The ray from infinity on the object side is in red, that from infinity on the image side is in blue.

In this case, the object-space focal point F just happens to coincide with the image-space principal point $H'$ and the same is true for the object-space principal point H and the image-space focal point $F'$. This is of no real significance, since the two spaces are independent.
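The coincidence of the cardinal points in this example can be confirmed by placing all four on a common axis. In the sketch below the coordinate convention (positions measured from the front vertex V, positive toward image space) and the variable names are assumptions made for illustration; the distance formulas are those listed for the two-lens system.

```python
from fractions import Fraction

f1, f2, t = Fraction(100), Fraction(25), Fraction(50)   # mm
D = (f1 + f2) - t

FV = f1 * (f2 - t) / D     # front focal length (signed distance from F to V)
VH = f1 * t / D            # front vertex V to H
BFL = (f1 - t) * f2 / D    # V'F'
HpVp = f2 * t / D          # H'V'

# Axial positions measured from V (at L1); the rear vertex V' (at L2) is at +t.
pos = {
    "F": -FV,
    "H": VH,
    "H'": t - HpVp,
    "F'": t + BFL,
}
for name, z in pos.items():
    print(f"{name:2s} at {float(z):6.2f} mm")
```

F and $H'$ both land at $33\tfrac{1}{3}$ mm behind V, and H and $F'$ both land at $66\tfrac{2}{3}$ mm behind V.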

Images from System: (1) Object at Object-Space Focal Point

An object located at the object-space (“front”) focal point of the system is at the distance equal to the FFL from the first lens. In this case:

$$z_1 = \text{FFL} = -\frac{100}{3}\,\text{mm}$$

$$z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{\left(-\frac{100}{3}\,\text{mm}\right) \cdot 100\,\text{mm}}{\left(-\frac{100}{3}\,\text{mm}\right) - (100\,\text{mm})} = +25\,\text{mm}$$

The object distance to the second lens is:

$$z_2 = t - z_1' = +50\,\text{mm} - 25\,\text{mm} = 25\,\text{mm}$$

which is the same as the focal length of the second lens, which means that the image distance from the second lens is infinite (as expected).

Images from System: (2) Object at Object-Space Principal Point

An object located at the object-space (“front”) principal point of the system is at the distance $\text{FFL} - f_{\text{eff}}$ from the first lens. In this case:

$$z_1 = \text{FFL} - f_{\text{eff}} = -\frac{100}{3}\,\text{mm} - \frac{100}{3}\,\text{mm} = -\frac{200}{3}\,\text{mm}$$

$$z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{\left(-\frac{200}{3}\,\text{mm}\right) \cdot 100\,\text{mm}}{\left(-\frac{200}{3}\,\text{mm}\right) - (100\,\text{mm})} = +40\,\text{mm}$$

$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{40\,\text{mm}}{-\frac{200}{3}\,\text{mm}} = +\frac{3}{5}$$

The object distance to the second lens is:

$$z_2 = t - z_1' = +50\,\text{mm} - 40\,\text{mm} = +10\,\text{mm}$$

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{10\,\text{mm} \cdot 25\,\text{mm}}{10\,\text{mm} - 25\,\text{mm}} = -\frac{50}{3}\,\text{mm}$$

$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{-\frac{50}{3}\,\text{mm}}{10\,\text{mm}} = +\frac{5}{3}$$

The system magnification for that object distance is the product of the two:

$$(M_T)_{\text{system}} = (M_T)_1 \cdot (M_T)_2 = \left(+\frac{3}{5}\right) \cdot \left(+\frac{5}{3}\right) = +1$$

as expected for the object and image at the principal points.

Images from System: (3) Equal Conjugates

If we move the object so that it is one focal length from the focal point and two focal lengths from the principal point, the object distance is:

$$z_1 = \text{FFL} + f_{\text{eff}} = -\frac{100}{3}\,\text{mm} + \frac{100}{3}\,\text{mm} = 0\,\text{mm}$$

$$z_1' = 0\,\text{mm}$$

$$(M_T)_1 = +1$$

The object distance to the second lens is:

$$z_2 = t - z_1' = +50\,\text{mm} - 0\,\text{mm} = +50\,\text{mm}$$

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{+50\,\text{mm} \cdot 25\,\text{mm}}{+50\,\text{mm} - 25\,\text{mm}} = +50\,\text{mm}$$

$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{50\,\text{mm}}{50\,\text{mm}} = -1$$

The system magnification for that object distance is the product of the two:

$$(M_T)_{\text{system}} = (M_T)_1 \cdot (M_T)_2 = (+1) \cdot (-1) = -1$$

as expected for the object and image at the equal-conjugate points.

2.13.16 Example (2) of Two-Lens System: Telephoto Lens

Now consider a system composed of a positive lens and a negative lens separated by just a bit more than the sum of the focal lengths: $f_1 = +100$ mm, $f_2 = -25$ mm, and $t = +80$ mm. The focal length of the equivalent thin lens is $f_{\text{eff}} = 500$ mm:

$$f_{\text{eff}} = \frac{f_1 \cdot f_2}{f_1 + f_2 - t} = \frac{100\,\text{mm} \cdot (-25\,\text{mm})}{100\,\text{mm} + (-25\,\text{mm}) - 80\,\text{mm}} = +500\,\text{mm}$$

Note that the focal length of the system is MUCH longer than the focal lengths of either lens.

Now locate the image-space focal point and principal point. For an object located at $\infty$, the BFL is found by substitution into the appropriate equation:

$$\text{BFL} = V'F' = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t} = \frac{(100\,\text{mm} - 80\,\text{mm}) \cdot (-25\,\text{mm})}{(100\,\text{mm} + (-25\,\text{mm})) - 80\,\text{mm}} = 100\,\text{mm}$$

The image of an object at $\infty$ is located 100 mm behind the second lens, and thus 180 mm behind the first lens; this distance $VF' = 180$ mm is the physical length, which is MUCH shorter than the focal length of 500 mm. This is the advantage of a telephoto lens; the focal length is much longer than the lens itself.

The location of the image-space principal point is determined from the back and equivalent focal lengths:

$$H'F' = H'V' + V'F'$$

$$500\,\text{mm} = H'V' + 100\,\text{mm}$$

$$H'V' = +400\,\text{mm}$$

$$H'V = H'V' - VV' = 400\,\text{mm} - 80\,\text{mm} = +320\,\text{mm}$$

so the principal point is located 320 mm in front of the object-space vertex V. A sketch of the system and the image-space cardinal points is shown below:

Image-space focal and principal points of the telephoto system. The equivalent focal length of the system is $f_{\text{eff}} = +500$ mm, but the image-space focal point is only +100 mm behind the rear vertex $V'$. The image-space principal point is 500 mm in front of the focal point.

The object-space focal point is located by applying the expression for the “front focal distance”:

$$\text{FFL} = FV = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} = \frac{(+100\,\text{mm}) \cdot ((-25\,\text{mm}) - 80\,\text{mm})}{(100\,\text{mm} + (-25\,\text{mm})) - 80\,\text{mm}} = +2100\,\text{mm}$$

which is far in front of the object-space vertex V. The object-space principal point is found from:

$$FH = FV + VH$$

$$+500\,\text{mm} = +2100\,\text{mm} + VH$$

$$VH = 500\,\text{mm} - 2100\,\text{mm} = -1600\,\text{mm} \implies HV = -VH = +1600\,\text{mm}$$

So the object-space principal point is very far in front of the first vertex.

Object-space focal and principal points of the telephoto system. Both are located far ahead of the front vertex V.

We can locate the image of an object at a finite distance, say, 3 m in front of the first lens ($OV = 3000$ mm), using three methods: (1) “brute-force” calculation, (2) by applying the Gaussian imaging formula for distances measured from the principal points, and (3) from the Newtonian imaging equation.

(1) “Brute-Force Calculation”

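The distance from the object to the first thin lens is 3000 mm, so the intermediate image distance satisfies:

$$\frac{1}{z_1} + \frac{1}{z_1'} = \frac{1}{f_1}$$

$$z_1' = \left(\frac{1}{100\,\text{mm}} - \frac{1}{3000\,\text{mm}}\right)^{-1} = \frac{3000}{29}\,\text{mm} \cong 103.45\,\text{mm}$$

The transverse magnification of the image from the first lens is:

$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{1}{29}$$

The object distance to the second lens is negative:

$$z_2 = t - z_1' = 80\,\text{mm} - \frac{3000}{29}\,\text{mm} = -\frac{680}{29}\,\text{mm} \cong -23.45\,\text{mm}$$

so the object is virtual. The image distance from the second lens is:

$$\frac{1}{z_2} + \frac{1}{z_2'} = \frac{1}{f_2}$$

$$z_2' = \left(\frac{1}{-25\,\text{mm}} - \left(\frac{29}{-680\,\text{mm}}\right)\right)^{-1} = +\frac{3400}{9}\,\text{mm} \cong +377.8\,\text{mm}$$

The corresponding transverse magnification is:

$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{\left(+\frac{3400}{9}\,\text{mm}\right)}{\left(-\frac{680}{29}\,\text{mm}\right)} \cong +16.1$$

The system magnification is the product of the component transverse magnifications:

$$M_T = (M_T)_1 \cdot (M_T)_2 = -\frac{1}{29} \cdot \left(-\frac{\left(+\frac{3400}{9}\,\text{mm}\right)}{\left(-\frac{680}{29}\,\text{mm}\right)}\right) = -\frac{5}{9}$$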

(2) Gaussian Formula

Now evaluate the same image using the Gaussian formula for distances measured from the principal points. The distance from the object to the object-space principal point is:

$$z = OH = OV + VH = 3000\,\text{mm} + (-1600\,\text{mm}) = +1400\,\text{mm}$$

The image distance measured from the image-space principal point is found from the Gaussian image formula:

$$\frac{1}{z'} = \frac{1}{f_{\text{eff}}} - \frac{1}{z} \implies z' = H'O' = \left(\frac{1}{500\,\text{mm}} - \frac{1}{1400\,\text{mm}}\right)^{-1} = +\frac{7000}{9}\,\text{mm} \cong 777.8\,\text{mm}$$

The distance from the rear vertex to the image is found from the known value for $H'V' = +400$ mm:

$$V'O' = H'O' - H'V' = +\frac{7000}{9}\,\text{mm} - 400\,\text{mm} = \frac{3400}{9}\,\text{mm} \cong 377.8\,\text{mm}$$

thus matching the distance obtained using “brute force”. The transverse magnification of the image created by the system is:

$$M_T = -\frac{z'}{z} = -\frac{+\frac{7000}{9}\,\text{mm}}{+1400\,\text{mm}} = -\frac{5}{9}$$

(3) Newtonian Lens Formula

Now repeat the calculation for the image position using the Newtonian lens formula. The distance from the object to the object-space focal point is:

$$x = OF = OV + VF = OV - FV = 3000\,\text{mm} - 2100\,\text{mm} = 900\,\text{mm}$$

Therefore the distance from the image-space focal point to the image is:

$$x' = F'O' = \frac{f_{\text{eff}}^2}{x} = \frac{(500\,\text{mm})^2}{900\,\text{mm}} = \frac{2500}{9}\,\text{mm} \cong 277.8\,\text{mm}$$

So the distance from the rear (image-space) vertex $V'$ to the image is:

$$V'O' = V'F' + F'O' = 100\,\text{mm} + \frac{2500}{9}\,\text{mm} = \frac{3400}{9}\,\text{mm} \cong 377.8\,\text{mm}$$

which again agrees with the result obtained by the other two methods.
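The agreement of the three methods can be confirmed with exact arithmetic. This is a minimal sketch with hypothetical variable names, using the telephoto parameters of this example and the two-lens distance formulas from the earlier summary.

```python
from fractions import Fraction

def image(z, f):
    """Gaussian thin-lens equation solved for the image distance."""
    return z * f / (z - f)

f1, f2, t = Fraction(100), Fraction(-25), Fraction(80)   # mm
OV = Fraction(3000)                                      # object 3 m in front of L1
D = (f1 + f2) - t

f_eff = f1 * f2 / D          # +500 mm
BFL = (f1 - t) * f2 / D      # V'F' = +100 mm
FFL = f1 * (f2 - t) / D      # FV = +2100 mm
VH = f1 * t / D              # -1600 mm
HpVp = f2 * t / D            # H'V' = +400 mm

# (1) Brute force: image through L1, then through L2.
z2 = t - image(OV, f1)
brute = image(z2, f2)

# (2) Gaussian form with distances measured from H and H'.
gauss = image(OV + VH, f_eff) - HpVp

# (3) Newtonian form: x * x' = f_eff**2 with x = OF = OV - FV.
newton = BFL + f_eff**2 / (OV - FFL)

print(brute, gauss, newton)  # 3400/9 3400/9 3400/9
```

All three values are the distance $V'O'$ from the rear vertex to the image, in mm.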

2.13.17 Images from Telephoto System: Image (1): Object at Object-Space Focal Point An object located at the object-space (“front”) focal point of the system is at the distance equal to the FFL from the first lens. In this case:

z1 = FFL = +2100 mm z1 f1 (+2100 mm) 100 mm z10 = · = · = +105 mm z1 f1 (+2100 mm) (100 mm) − − The object distance to the second lens is:

z2 = t z0 =+80mm 105 mm = 25 mm − 1 − − which is the same as the focal length of the second lens, which means that the image distance from the second lens is infinite (as expected).

z2 f2 ( 25 mm) ( 25 mm) z20 = · = − · − = z2 f2 ( 25 mm) ( 25 mm) ∞ − − − −

Image (2) from Telephoto System: Object at Object-Space Principal Point An object located at the object-space (“front”) principal point of the system is at the distance equal to the FFL from the first lens. In this case:

z1 = FFL feff = 2100 mm 500 mm = 1600 mm − − z1 f1 (1600 mm) 100 mm 320 z10 = · = · =+ mm z1 f1 (1600 mm) (100 mm) 3 − 320 − z10 + 3 mm 1 (MT )1 = = = −z1 − 1600 mm −15 2.13 SYSTEMS OF THIN LENSES 73

: The object distance to the second lens is: 320 80 z2 = t z0 =+80mm mm = mm − 1 − 3 − 3 80 z2 f2 3 mm ( 25 mm) z20 = · = −80 · − = 400 mm z2 f2 mm ( 25 mm) − − ¡− 3 ¢− − z20 (¡ 400 mm)¢ (MT )2 = = − 80 = 15 −z2 − mm − − 3 The system magnification for that object¡ distance is¢ the product of the two:

1 (MT ) =(MT ) (MT ) = ( 15) = +1 system 1 · 2 −15 · − µ ¶ which again confirms that the transverse magnification is that expected for the object and image at the principal points.

Image (3) from Telephoto System: Equal Conjugates

If we move the object so that it is one focal length from the focal point and two focal lengths from the principal point, the object distance is:

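$$z_1 = FFL + f_{eff} = 2100\ \mathrm{mm} + 500\ \mathrm{mm} = 2600\ \mathrm{mm}$$

$$z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{(+2600\ \mathrm{mm}) \cdot 100\ \mathrm{mm}}{(+2600\ \mathrm{mm}) - (100\ \mathrm{mm})} = +104\ \mathrm{mm}$$

$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{(+104\ \mathrm{mm})}{(2600\ \mathrm{mm})} = -\frac{1}{25}$$

The object distance to the second lens is:

$$z_2 = t - z_1' = +80\ \mathrm{mm} - 104\ \mathrm{mm} = -24\ \mathrm{mm}$$

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(-24\ \mathrm{mm}) \cdot (-25\ \mathrm{mm})}{(-24\ \mathrm{mm}) - (-25\ \mathrm{mm})} = +600\ \mathrm{mm}$$

$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{(+600\ \mathrm{mm})}{(-24\ \mathrm{mm})} = +25$$

The system magnification for that object distance is the product of the two:

$$(M_T)_{system} = (M_T)_1 \cdot (M_T)_2 = \left(-\frac{1}{25}\right) \cdot (25) = -1$$

as expected for the object and image at the equal-conjugate points.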
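The three telephoto cases all follow the same recipe: image through the first lens, use $z_2 = t - z_1'$ as the object distance for the second lens, image again, and multiply the magnifications. A minimal sketch of that cascade is given below. The function names are assumptions made for illustration, and the sketch does not handle the special cases where the intermediate image is at infinity ($z_1 = f_1$) or falls exactly on the second lens ($z_2 = 0$).

```python
from fractions import Fraction as F

def image_distance(z, f):
    """Thin-lens imaging equation solved for z' = z*f/(z - f); None means image at infinity."""
    z, f = F(z), F(f)
    if z == f:
        return None
    return z * f / (z - f)

def trace_two_lenses(z1, f1, f2, t):
    """Return (z2', M_T of system) for an object at z1 in front of lens 1.

    Returns (None, None) when the final image is at infinity.
    """
    z1p = image_distance(z1, f1)
    z2 = F(t) - z1p                      # object distance for the second lens
    z2p = image_distance(z2, f2)
    if z2p is None:
        return None, None
    m_system = (-z1p / F(z1)) * (-z2p / z2)
    return z2p, m_system

# Telephoto example from the text: f1 = +100 mm, f2 = -25 mm, t = 80 mm
for z1 in (2100, 1600, 2600):            # focal point, principal point, equal conjugates
    print(z1, trace_two_lenses(z1, 100, -25, 80))
```

Exact fractions are used so that intermediate values such as 320/3 mm are reproduced without rounding.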

2.13.18 Example (3) of Two-Lens System: Two Negative Lenses

Now consider a system composed of two negative lenses separated by a distance equal to the magnitude of the sum of their focal lengths: $f_1 = -100\ \mathrm{mm}$, $f_2 = -25\ \mathrm{mm}$, and $t = +125\ \mathrm{mm}$. The focal length of the equivalent thin lens is:

$$f_{eff} = \frac{f_1 \cdot f_2}{f_1 + f_2 - t} = H'F' = FH$$

$$f_{eff} = \frac{(-100\ \mathrm{mm}) \cdot (-25\ \mathrm{mm})}{(-100\ \mathrm{mm}) + (-25\ \mathrm{mm}) - 125\ \mathrm{mm}} = -10\ \mathrm{mm}$$

Note that the focal length of the system is negative and shorter in magnitude than that of either lens.

Now locate the image-space focal point and principal point. For an object located at $\infty$, the BFL and FFL are found by substitution into the appropriate equation:

$$BFL = V'F' = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t} = \frac{(-100\ \mathrm{mm} - 125\ \mathrm{mm}) \cdot (-25\ \mathrm{mm})}{(-100\ \mathrm{mm}) + (-25\ \mathrm{mm}) - 125\ \mathrm{mm}} = -\frac{45}{2}\ \mathrm{mm} = -22.5\ \mathrm{mm}$$

$$BFL = -22.5\ \mathrm{mm}$$

$$FFL = FV = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} = \frac{(-100\ \mathrm{mm}) \cdot (-25\ \mathrm{mm} - 125\ \mathrm{mm})}{(-100\ \mathrm{mm}) + (-25\ \mathrm{mm}) - 125\ \mathrm{mm}} = -60\ \mathrm{mm}$$

$$FFL = -60\ \mathrm{mm}$$
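The three expressions for $f_{eff}$, BFL, and FFL share the same denominator, so they are conveniently evaluated together. The sketch below is an illustration only (the function name `two_lens_cardinals` is an assumption); it reproduces the values for the two-negative-lens system and for the telephoto system of the previous example.

```python
from fractions import Fraction as F

def two_lens_cardinals(f1, f2, t):
    """Return (f_eff, BFL, FFL) for two thin lenses f1, f2 separated by t."""
    f1, f2, t = F(f1), F(f2), F(t)
    denom = f1 + f2 - t
    if denom == 0:
        # t = f1 + f2: the system is afocal (a telescope) and has no finite focal length
        raise ValueError("afocal system: f1 + f2 - t = 0")
    f_eff = f1 * f2 / denom
    bfl = (f1 - t) * f2 / denom          # V'F'
    ffl = f1 * (f2 - t) / denom          # FV
    return f_eff, bfl, ffl

print(two_lens_cardinals(-100, -25, 125))   # two negative lenses
print(two_lens_cardinals(100, -25, 80))     # telephoto example
```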

(1) Object at Object-Space Focal Point

An object located at the object-space ("front") focal point of the system is at the distance equal to the FFL from the first lens. In this case:

$$z_1 = FFL = -60\ \mathrm{mm} \quad \text{(virtual object)}$$

$$z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{(-60\ \mathrm{mm}) \cdot (-100\ \mathrm{mm})}{(-60\ \mathrm{mm}) - (-100\ \mathrm{mm})} = +150\ \mathrm{mm}$$

The object distance to the second lens is:

$$z_2 = t - z_1' = +125\ \mathrm{mm} - 150\ \mathrm{mm} = -25\ \mathrm{mm}$$

which is the same as the focal length of the second lens, which means that the image distance from the second lens is infinite (as expected):

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(-25\ \mathrm{mm}) \cdot (-25\ \mathrm{mm})}{(-25\ \mathrm{mm}) - (-25\ \mathrm{mm})} = \frac{625\ \mathrm{mm}^2}{0\ \mathrm{mm}} = \infty$$

Images from System: (2) Object at Object-Space Principal Point

An object located at the object-space ("front") principal point of the system is at the distance $FFL - f_{eff}$ from the first lens. In this case:

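$$z_1 = FFL - f_{eff} = -60\ \mathrm{mm} - (-10\ \mathrm{mm}) = -50\ \mathrm{mm}$$

$$z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{(-50\ \mathrm{mm}) \cdot (-100\ \mathrm{mm})}{(-50\ \mathrm{mm}) - (-100\ \mathrm{mm})} = +100\ \mathrm{mm}$$

$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{+100\ \mathrm{mm}}{-50\ \mathrm{mm}} = +2$$

The object distance to the second lens is:

$$z_2 = t - z_1' = +125\ \mathrm{mm} - 100\ \mathrm{mm} = +25\ \mathrm{mm}$$

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(+25\ \mathrm{mm}) \cdot (-25\ \mathrm{mm})}{(+25\ \mathrm{mm}) - (-25\ \mathrm{mm})} = -12.5\ \mathrm{mm}$$

$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{(-12.5\ \mathrm{mm})}{(+25\ \mathrm{mm})} = +\frac{1}{2}$$

The system magnification for that object distance is the product of the two:

$$(M_T)_{system} = (M_T)_1 \cdot (M_T)_2 = (+2) \cdot \left(+\frac{1}{2}\right) = +1$$

which again confirms that the transverse magnification is that expected for the object and image at the principal points.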

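Images from System: (3) Equal Conjugates

If we move the object so that it is one focal length from the focal point and two focal lengths from the principal point, the object distance is:

$$z_1 = FFL + f_{eff} = -60\ \mathrm{mm} + (-10\ \mathrm{mm}) = -70\ \mathrm{mm}$$

$$z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{(-70\ \mathrm{mm}) \cdot (-100\ \mathrm{mm})}{(-70\ \mathrm{mm}) - (-100\ \mathrm{mm})} = +\frac{700}{3}\ \mathrm{mm} = 233\tfrac{1}{3}\ \mathrm{mm}$$

$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{\left(\frac{700}{3}\ \mathrm{mm}\right)}{(-70\ \mathrm{mm})} = +\frac{10}{3}$$

The object distance to the second lens is:

$$z_2 = t - z_1' = +125\ \mathrm{mm} - \frac{700}{3}\ \mathrm{mm} = -\frac{325}{3}\ \mathrm{mm} \cong -108.3\ \mathrm{mm}$$

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{\left(-\frac{325}{3}\ \mathrm{mm}\right) \cdot (-25\ \mathrm{mm})}{\left(-\frac{325}{3}\ \mathrm{mm}\right) - (-25\ \mathrm{mm})} = -32.5\ \mathrm{mm}$$

$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{(-32.5\ \mathrm{mm})}{\left(-\frac{325}{3}\ \mathrm{mm}\right)} = -\frac{3}{10}$$

The system magnification for that object distance is the product of the two:

$$(M_T)_{system} = (M_T)_1 \cdot (M_T)_2 = \left(+\frac{10}{3}\right) \cdot \left(-\frac{3}{10}\right) = -1$$

as expected for the object and image at the equal-conjugate points.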

2.14 Plane and Spherical Mirrors

One of the most familiar optical elements is the plane mirror (you probably see one every morning!). For each ray incident at angle $\theta$ measured from the normal to the surface, a reflected ray is generated at angle $-\theta$ relative to the normal.

Consider a full sphere with reflective surface on the inside and a point object O at the center, as shown in (a) in the figure. All rays from the object encounter the surface at normal incidence and reflect back to form an image at the center. We can infer the focal length of the spherical concave mirror from this observation by noting that the object and image distances are identically R, so the focal length is determined by the thin-lens imaging equation:

$$\frac{1}{f} = \frac{1}{z_1} + \frac{1}{z_2}$$

$$z_1 = z_2 = R \implies \frac{1}{f} = \frac{1}{R} + \frac{1}{R} = \frac{2}{R} \implies f = \frac{R}{2}$$

Note that in this case of a complete sphere, the algebraic sign of the radius of curvature is not well defined, but since rays converge to form the image, the focal length clearly must be positive. Because the object and image distances are equal, this clearly is imaging at equal conjugates with transverse magnification $M_T = -1$:

$$M_T = -\frac{z_2}{z_1} = -\frac{2 \cdot f}{2 \cdot f} = -1$$

The negative sign on $M_T$ means that if the object source is moved "upward" from its position on the horizontal axis at the center, then the reflected rays will converge to a point "below" the optic axis, as shown in part (b) of the figure.

In part (c) of the figure, half of the spherical mirror surface is removed so that all rays emitted towards the left will escape without striking the mirror and all rays emitted towards the right will strike the surface one time before returning to the "image" at the center and then escaping to the left. This mirror surface clearly makes rays converge to a real image coincident with the object and so must have a positive focal length EVEN THOUGH the radius of curvature R is negative (because V is to the right of C).

Spherical mirror: (a) rays from point source at center of sphere are all normal to the surface and reflect back upon themselves to form a point image at object, so that $z_1 = z_2 = R$; (b) if the point source is moved "upward", the image moves "downward," which shows that $M_T = -1$; (c) half the sphere is removed leaving a hemisphere with $R = VC < 0$.

Derivation of the focal length of a concave spherical mirror. The magnified section at the bottom shows the triangles used to evaluate f in terms of R: $f = \frac{|R|}{2}$ in the paraxial approximation.

We can consider the hemispherical concave mirror with radius of curvature $R = VC < 0$. Even though the radius is negative, we have already inferred that the focal length of this system is positive since the image rays converge, so we have:

$$f = \frac{|R|}{2} = \frac{-R}{2} = -\frac{R}{2}$$

Consider a ray from an object at infinity that is close to (and parallel to) the optical axis, as shown in the figure. From triangle $\Delta CAV$ in the magnified view, it is apparent that:

$$\sin[\theta] = \frac{x}{CV} = \frac{x}{-VC} = \frac{x}{-R}$$

From $\Delta F'AV$, we see that:

$$\tan[2\theta] = \frac{x}{F'V'}$$

Now apply the paraxial approximation that $\sin[\theta] \cong \tan[\theta] \cong \theta$ if $\theta \cong 0$:

$$\sin[\theta] = \frac{x}{-R} \cong \theta \implies x = -R \cdot \theta$$

$$\tan[2\theta] = \frac{x}{f} \cong 2\theta \implies x = f \cdot 2\theta$$

Now equate the two terms to find a relationship between f and R:

$$-R \cdot \theta = f \cdot 2\theta \implies f = -\frac{R}{2}$$

This expression for the focal length may be substituted into the imaging equation for a single thin lens:

$$\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f} = -\frac{2}{R}$$

For the case just considered of a concave surface, $R < 0$ and $f > 0$. If the object distance $z_1 > f$, then the image distance $z_2$ is positive, BUT IS MEASURED FROM RIGHT TO LEFT. If the mirror is a convex spherical surface with $R = VC > 0$, the image of a ray from an object at infinity crosses the axis at the image-space focal point behind the mirror, so the optic makes rays diverge and therefore has negative power.

Convex mirror has positive radius of curvature ($R > 0$) but the reflected rays diverge and so the surface has negative focal length via $f = -\frac{R}{2}$.
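The sign conventions for mirrors are easy to get wrong, so a short numerical check may help. The sketch below is an illustration under the conventions stated above ($R = VC$, $f = -R/2$, image distance measured from right to left for a mirror); the function names and the radius of 200 mm are hypothetical values chosen for the example, not values from the notes.

```python
def mirror_focal_length(R):
    """Paraxial focal length of a spherical mirror, f = -R/2 (R < 0 for concave)."""
    return -R / 2

def mirror_image_distance(z1, R):
    """Solve 1/z1 + 1/z2 = -2/R for z2; returns inf for an object at the focal point."""
    f = mirror_focal_length(R)
    if z1 == f:
        return float("inf")
    return z1 * f / (z1 - f)

# Concave mirror, |R| = 200 mm (hypothetical): object at the center of curvature
z2 = mirror_image_distance(200.0, -200.0)
print(z2, -z2 / 200.0)           # image at the center, M_T = -1

# Convex mirror with the same |R|: image is virtual (negative z2)
print(mirror_image_distance(100.0, 200.0))
```

The first case reproduces the equal-conjugate imaging of the full-sphere argument; the second shows the negative image distance expected for a diverging (convex) mirror.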

2.14.1 Comparison of Thin Lens and Concave Mirror

Comparison of the vertices, focal points, principal points, and equal-conjugate points of a concave mirror and a thin lens. The vertices and the principal points coincide in both cases so that $M_T = +1$ for object and image at the vertex of the mirror and at the surfaces of the lens. The object- and image-space focal points of the mirror coincide at the distance $f_{eff} = -\frac{R}{2}$ for the mirror, and the equal conjugate points are located at the center of curvature so that $z_1 = z_2 = 2 f_{eff}$. For the lens, the equal conjugate points are also located such that $z_1 = z_2 = 2 f_{eff}$ with $M_T = -1$.

2.15 Stops and Pupils

In any multielement optical system, the beam of light that passes through the system is shaped like a solid circular "spindle" with different radii at different axial locations. The diameter of one optical element limits the size of the ray spindle that exits the system; this limiting element is the aperture stop of the system and may be a lens or an aperture with no power (an iris diaphragm) that is placed specifically to limit the diameter of the ray cone. A larger exiting ray cone means that more light reaches the image to make it brighter, so the diameter of the aperture stop is the limiting factor for image "brightness." Consider the example of a two-lens system with an iris positioned between them shown in the figure. The iris limits the cone of rays from the object at O.

Schematic of the aperture stop S and entrance and exit pupils E and E0, respectively for a system formed from two positive lenses and an iris with no power. The entrance pupil E is the image of the stop S seen from the left through the first lens L1, while the exit pupil is the image of S seen from the right through the second lens L3. Note that the element that is the stop may vary with object location O.

Obviously, the aperture stop in an imaging system composed of a single lens is that lens. In a two-element system, the stop will be one of the two lenses, determined by the relative diameters and the locations of the lenses. The image of the stop seen from the input “side” of the lens is the entrance pupil, which determines the angular spread of the ray cone from an object point that “gets into” the optical system, and thus determines the “brightness” of the image. The image of the stop seen from the output “side” is the exit pupil (once called the Ramsden disk ). In an imaging system intended for viewing by eye, it is useful to locate the exit pupil at the iris of the eye and to match its diameter to that of the iris of the eye to ensure that all light through the optical system makes it into the eye to form the viewable image.

2.15.1 Focal Ratio (f-number)

For multilens systems, the size of the entrance pupil determines the angular extent of the ray cone that enters the system from a point source. The figure shows a simple hypothetical imaging system with object-space and image-space principal points H and H', respectively, and aperture stop of diameter $d_0$ as the first element in the system (the same analysis applies for systems with the entrance pupil at other locations for an object at infinity). In this system, the stop is also the entrance pupil. A point source at infinity creates a plane wave through the entrance pupil, which is then incident on the object-space principal plane H with the same diameter. The unit transverse magnification of the two principal planes ensures that the light emerging from the image-space principal plane H' has that same diameter $d_0 = d_{NP}$. The cone angle of rays incident on the image plane at the image-space focal point F' is the ratio of the diameter to the distance $H'F' = f_{eff}$:

$$\frac{d_0}{f_{eff}} = \frac{d_{NP}}{f_{eff}}$$

This means that the focal ratio of the system is:

$$f/\# = \frac{f_{eff}}{d_{NP}}$$

Note that a corresponding expression could be constructed based on the diameter of the exit pupil, but the propagation distance then would have to be the distance from the exit pupil to the image, which (in this case) is longer than the effective focal length.

Specification of the system focal ratio: the plane wave from a point source at infinity is incident through the aperture stop with diameter $d_0$ onto the object-space principal plane H. The light emerging from the image-space principal plane H' has the same diameter $d_0$. The light propagates the focal length $f_{eff}$ to the image. The angle of the ray cone is $\frac{d_0}{f_{eff}}$, which is the reciprocal of the system focal ratio f/#.

This f-number specifies the ability of the system to collect light.

2.15.2 Example: Focal Ratio of Lens-Aperture Systems

The focal ratio of a single thin lens obviously is the ratio of the focal length to the diameter of the lens:

$$f/\# = \frac{f}{d_0}$$

Note that the smallest possible focal ratio exists for a full sphere (which is anything but thin, and the paraxial approximation certainly does not apply over its full diameter). It might be useful to determine the focal ratio for such a case with "normal" glass ($n = 1.5$). The focal length of the sphere in the (ridiculously invalid) thin-lens paraxial approximation where $R = 12.5\ \mathrm{mm}$ is obtained from the lensmaker's equation:

$$f = \left((n_2 - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)\right)^{-1} = \left((1.5 - 1)\left(\frac{1}{12.5\ \mathrm{mm}} - \frac{1}{-12.5\ \mathrm{mm}}\right)\right)^{-1} = 12.5\ \mathrm{mm}$$

The focal ratio is:

$$f/\# = \frac{f}{d_0} = \frac{12.5\ \mathrm{mm}}{25\ \mathrm{mm}} = \frac{1}{2}$$

This is ridiculously invalid because it assumes that the sphere is simultaneously "thin" and "fat." If we instead assume that the spherical lens is composed of two thin lenses at the vertices, each with the power of a single surface:

$$f_1 = f_2 = \left(\frac{1.5 - 1}{12.5\ \mathrm{mm}}\right)^{-1} = 25\ \mathrm{mm}, \qquad t = 25\ \mathrm{mm}$$

$$f_{eff} = \frac{f_1 \cdot f_2}{f_1 + f_2 - t} = \frac{25\ \mathrm{mm} \cdot 25\ \mathrm{mm}}{25\ \mathrm{mm} + 25\ \mathrm{mm} - 25\ \mathrm{mm}} = 25\ \mathrm{mm}$$

$$BFL = \frac{(f_1 - t) \cdot f_2}{f_1 + f_2 - t} = 0$$

Single Thin Lens + Aperture “in front”

Consider a system with a diaphragm (iris or aperture) of diameter $d_0$ located at a distance t "in front" of the lens with focal length $f_1$ and diameter $d_1$. Since the aperture has no power to refract light ($\varphi = 0$ diopters), its "focal length" is infinite ($f_0 = \infty$). The focal length of the two-"lens" system is:

$$f_{eff} = \frac{f_0 \cdot f_1}{(f_0 + f_1) - t} = f_1 \cdot \lim_{f_0 \to \infty}\left(\frac{f_0}{(f_0 + f_1) - t}\right) = f_1$$

which makes sense: the focal length of a system consisting of one refracting element and one "non-refracting" element is that of the refracting lens. For an object at infinity ($z_1 = \infty \implies z_2 = f_1$), the diaphragm is the aperture stop if its diameter is smaller than that of the lens:

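$$d_0 < d_1$$

In that case the iris is also the entrance pupil, and the exit pupil is the image of the iris seen through the lens, located at the distance:

$$z_{XP} = \frac{t \cdot f_1}{t - f_1}$$

which shows that the exit pupil is virtual ("behind" the lens as seen from image space) if $t < f_1$ and real if $z_{XP} > 0 \implies t > f_1$.

Consider some examples with $f_1 = 100\ \mathrm{mm}$, $d_1 = 25\ \mathrm{mm}$, $t = 25\ \mathrm{mm}$, and $d_0 = 10\ \mathrm{mm}$. If the iris is deleted, then the focal ratio is:

$$f/\# = \frac{f_{eff}}{d_1} = \frac{100\ \mathrm{mm}}{25\ \mathrm{mm}} = f/4$$

With the iris in place, it is the stop and entrance pupil. The location of the exit pupil is:

$$z_{XP} = \frac{t \cdot f_1}{t - f_1} = \frac{25\ \mathrm{mm} \cdot 100\ \mathrm{mm}}{25\ \mathrm{mm} - 100\ \mathrm{mm}} = -\frac{100}{3}\ \mathrm{mm}$$

$$M_{XP} = -\frac{-\frac{100}{3}\ \mathrm{mm}}{25\ \mathrm{mm}} = +\frac{4}{3}$$

$$d_{XP} = d_0 \cdot M_{XP} = \frac{40}{3}\ \mathrm{mm} = 13\tfrac{1}{3}\ \mathrm{mm}$$

Because the iris is the stop and entrance pupil, the focal ratio is:

$$f/\# = \frac{f_{eff}}{d_{NP}} = \frac{100\ \mathrm{mm}}{10\ \mathrm{mm}} = f/10$$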

Single Thin Lens + Aperture “behind”

If the lens comes first in the system, then we need to find the condition of the iris diameter to determine if it is the aperture stop. At some risk of confusion, we'll maintain the notation where the diameter of the lens is $d_1$ and that of the aperture is $d_0$ even though it is second in the system. For an object at infinity, the figure shows that the distance to the iris must be less than the focal length to have any possibility of being the aperture stop. The image of the aperture seen from object space is located at

$$z = \frac{t \cdot f_1}{t - f_1}$$

which is positive (so the entrance pupil is real) if $t > f_1$ and negative (so the entrance pupil is virtual) if $t < f_1$, which is the case of interest here. The diameter of the entrance pupil is the diameter of the aperture scaled by the transverse magnification:

$$d_0' = M_T \cdot d_0$$

If we use the same numerical values as before but with the iris "behind," the distance to the entrance pupil is:

$$z_{NP} = \frac{t \cdot f_1}{t - f_1} = \frac{25\ \mathrm{mm} \cdot 100\ \mathrm{mm}}{25\ \mathrm{mm} - 100\ \mathrm{mm}} = -\frac{100}{3}\ \mathrm{mm}$$

$$M_{NP} = -\frac{z_{NP}}{25\ \mathrm{mm}} = -\frac{-\frac{100}{3}\ \mathrm{mm}}{25\ \mathrm{mm}} = +\frac{4}{3}$$

$$d_{NP} = +\frac{4}{3} \cdot 10\ \mathrm{mm} = \frac{40}{3}\ \mathrm{mm}$$

This is the diameter of the incoming beam at the lens, so the focal ratio is:

$$f/\# = \frac{f_{eff}}{d_{NP}} = \frac{100\ \mathrm{mm}}{\frac{40}{3}\ \mathrm{mm}} = f/7.5$$

Three examples of systems: the first is a single thin lens with the aperture stop at the lens, so the stop coincides with the entrance and exit pupils; the second moves the iris "in front" of the lens so that it is also the entrance pupil; in the third, the iris is behind the lens and the magnified diameter of the entrance pupil is the relevant parameter for the focal ratio.
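The "in front" and "behind" cases use the same imaging calculation for the iris and differ only in which diameter enters the focal ratio. The following sketch reproduces the numbers above; the helper name `image_of_aperture` is an assumption introduced here, and the code does not cover the case $t = f_1$, where the pupil is at infinity.

```python
from fractions import Fraction as F

def image_of_aperture(t, f, d):
    """Image an aperture of diameter d, a distance t from a thin lens of focal length f.

    Returns (z, M, d_image): image distance, transverse magnification, image diameter.
    """
    t, f, d = F(t), F(f), F(d)
    z = t * f / (t - f)
    M = -z / t
    return z, M, M * d

f1, t, d0 = 100, 25, 10          # values from the text, in mm

# Iris in front: the iris is the entrance pupil; its image is the exit pupil
z_xp, m_xp, d_xp = image_of_aperture(t, f1, d0)
f_number_front = F(f1, d0)

# Iris behind: its image through the lens is the entrance pupil
z_np, m_np, d_np = image_of_aperture(t, f1, d0)
f_number_behind = F(f1) / d_np

print(float(f_number_front), float(f_number_behind))
```

Because the separation and focal length are the same in both cases, the pupil image has the same location and magnification; what changes is whether that image acts as the exit pupil (iris in front) or the entrance pupil (iris behind).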

2.15.3 Example: Exit Pupils of Telescopic Systems

Galilean Telescope

Consider a telescopic system, such as binoculars, composed of an objective lens $L_1$ with diameter $d_1$ and an eyelens $L_2$ with diameter $d_2$, where the two lenses are separated by the sum of their focal lengths. Consider the specific example of a Galilean telescope with $f_1 = +200\ \mathrm{mm}$, $d_1 = 50\ \mathrm{mm}$, $f_2 = -25\ \mathrm{mm}$, $d_2 = 25\ \mathrm{mm}$, and $t = f_1 + f_2 = 175\ \mathrm{mm}$. We have already seen that the angular magnification of the system is the ratio of the focal lengths of the two lenses:

$$M_\theta = -\frac{f_1}{f_2} = -\frac{+200\ \mathrm{mm}}{-25\ \mathrm{mm}} = +8$$

To determine which element is the aperture stop for an object at infinity, we need to determine where the ray through the edge of the first lens strikes the second lens. In this case, it strikes well within the lens diameter; the ray height at the second lens is:

$$y = \frac{d_1}{2} \cdot \left(1 - \frac{t}{f_1}\right) = 25\ \mathrm{mm} \cdot \left(1 - \frac{175\ \mathrm{mm}}{200\ \mathrm{mm}}\right) = \frac{25}{8}\ \mathrm{mm} = 3.125\ \mathrm{mm} < \frac{d_2}{2}$$

so the first lens is the aperture stop, and therefore also the entrance pupil.

Location of aperture stop for the specified Galilean telescope. Since the ray from infinity that strikes the edge of the positive lens passes well within the boundary of the negative lens, the aperture stop is the positive lens for an object at infinity.

The exit pupil is the image of the aperture stop (first lens) seen through the second lens, which has negative focal length, ensuring that the exit pupil will be virtual. The distance from the stop to the second lens is:

$$z_2 = t = f_1 + f_2 = 175\ \mathrm{mm}$$

and the image distance from the second lens is:

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{175\ \mathrm{mm} \cdot (-25\ \mathrm{mm})}{175\ \mathrm{mm} - (-25\ \mathrm{mm})} = -\frac{175}{8}\ \mathrm{mm} = -21.875\ \mathrm{mm}$$

Figure 2.1:

The size of the exit pupil is determined from the transverse magnification:

$$M_T = -\frac{z_2'}{z_2} = -\frac{-\frac{175}{8}\ \mathrm{mm}}{175\ \mathrm{mm}} = +\frac{1}{8}$$

Since the diameter of the stop is $d_1 = 50\ \mathrm{mm}$, the diameter of the exit pupil is:

$$d_{XP} = M_T \cdot d_{Stop} = +\frac{1}{8} \cdot 50\ \mathrm{mm} = +6.25\ \mathrm{mm}$$

For the Galilean telescope, the exit pupil is virtual (located 21.875 mm "behind" the eyelens) and small.

Keplerian Telescope

Now repeat the analysis for a corresponding Keplerian telescope with $f_1 = +200\ \mathrm{mm}$, $d_1 = 50\ \mathrm{mm}$, $f_2 = +25\ \mathrm{mm}$, $d_2 = 25\ \mathrm{mm}$, $t = f_1 + f_2 = 225\ \mathrm{mm}$ and angular magnification:

$$M_\theta = -\frac{f_1}{f_2} = -\frac{+200\ \mathrm{mm}}{+25\ \mathrm{mm}} = -8$$

Again, the ray from an object at infinity that passes through the edge of the first lens has the following height at the second lens:

$$y = \frac{d_1}{2} \cdot \left(1 - \frac{t}{f_1}\right) = 25\ \mathrm{mm} \cdot \left(1 - \frac{225\ \mathrm{mm}}{200\ \mathrm{mm}}\right) = -\frac{25}{8}\ \mathrm{mm} = -3.125\ \mathrm{mm}$$

$$|y| < \frac{d_2}{2}$$

The first element is still the stop and the entrance pupil. The image of the first lens through the second is the exit pupil; its location and size are determined using the thin-lens imaging equation:

$$z_2 = t = f_1 + f_2 = 225\ \mathrm{mm}$$

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{225\ \mathrm{mm} \cdot 25\ \mathrm{mm}}{225\ \mathrm{mm} - 25\ \mathrm{mm}} = \frac{225}{8}\ \mathrm{mm} = +28.125\ \mathrm{mm}$$

$$M_T = -\frac{z_2'}{z_2} = -\frac{\frac{225}{8}\ \mathrm{mm}}{225\ \mathrm{mm}} = -\frac{1}{8}$$

$$d_{XP} = d_{Stop} \cdot M_T = 50\ \mathrm{mm} \cdot \left(-\frac{1}{8}\right) = -6.25\ \mathrm{mm}$$

The exit pupil is "real" (outside of the system at a distance of 28.125 mm beyond the eyelens) and inverted.

In both of the telescopes just considered, note that the diameter of the exit pupil is, in magnitude, the ratio of the focal length of the eyepiece and the focal ratio of the objective lens:

$$d_{XP} = \frac{-f_2}{\left(\frac{f_1}{d_1}\right)} = \frac{d_1}{\left(-\frac{f_1}{f_2}\right)} = \frac{d_1}{M_\theta}$$

$$(d_{XP})_{Galilean} = \frac{50\ \mathrm{mm}}{+8} = 6.25\ \mathrm{mm}$$

$$(d_{XP})_{Keplerian} = \frac{50\ \mathrm{mm}}{-8} = -6.25\ \mathrm{mm}$$

In words, the diameter of the exit pupil is equal to the ratio of the diameter of the entrance pupil (which is the objective in this case) and the magnifying power; more power means a smaller exit pupil. Common binoculars used for birdwatching are listed as "10 × 50," which means that the angular magnification (magnifying power) is 10 and the diameter of the entrance pupil (which is that of the objective lens) is 50 mm (about 2 in). The diameter of the exit pupil is:

$$d_{XP} = \frac{50\ \mathrm{mm}}{10} = 5\ \mathrm{mm}$$

Until recently, the most common variety of binocular was the "7 × 50," which has a magnifying power of 7 and objectives with $d = 50\ \mathrm{mm}$, so the diameter of the exit pupil is:

$$d_{XP} = \frac{50\ \mathrm{mm}}{7} \simeq 7\ \mathrm{mm}$$

This is a close match to the diameter of the iris of the dark-adapted eye, which makes these binoculars a good choice for astronomical viewing; for that reason, 7 × 50 binoculars were known as "night glasses." When used with the smaller iris diameter of the eye during daytime, much of the diameter of the exit pupil would illuminate the opaque iris and not contribute to the brightness of the image on the retina.
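The rule "exit pupil diameter = objective diameter / magnifying power" can be turned into a quick comparison of binocular specifications. The sketch below is illustrative only: the function names are assumptions, and the eye-pupil diameters of 7 mm (dark-adapted) and 3 mm (daytime) are assumed round values, not numbers taken from these notes. The fraction of light that enters the eye is estimated as the ratio of the areas of the eye pupil and the exit pupil, capped at 1.

```python
def exit_pupil_diameter(objective_diameter_mm, magnifying_power):
    """Magnitude of the exit pupil diameter, d_XP = d_1 / |M_theta|."""
    return objective_diameter_mm / abs(magnifying_power)

def fraction_entering_eye(d_exit_pupil_mm, d_eye_pupil_mm):
    """Fraction of the exit-pupil area that fits within the eye pupil (at most 1)."""
    return min(1.0, (d_eye_pupil_mm / d_exit_pupil_mm) ** 2)

for power, objective in ((10, 50), (7, 50)):
    d_xp = exit_pupil_diameter(objective, power)
    night = fraction_entering_eye(d_xp, 7.0)   # assumed dark-adapted pupil: 7 mm
    day = fraction_entering_eye(d_xp, 3.0)     # assumed daytime pupil: 3 mm
    print(f"{power}x{objective}: d_XP = {d_xp:.2f} mm, night {night:.2f}, day {day:.2f}")
```

Under these assumptions the 7 × 50 binocular delivers nearly all of its light to the dark-adapted eye but only a small fraction to the daytime eye, which is the point made in the paragraph above.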

For a formerly common amateur telescope with a mirror objective with $d_1 = 6\ \mathrm{in} \cong 150\ \mathrm{mm}$ and a focal length $f_1 = 48\ \mathrm{in} \cong 1220\ \mathrm{mm}$, the focal ratio is:

$$f/\# = \frac{48\ \mathrm{in}}{6\ \mathrm{in}} = 8$$

so the diameter of the exit pupil when viewed through an eyelens with focal length $f_2$ is:

$$d_{XP} = \frac{f_2}{f/\#} = \frac{f_2}{8}$$

If the focal length of the eyelens is $f_2 = 25\ \mathrm{mm} \cong 1\ \mathrm{in}$, then the diameter of the exit pupil is about 3 mm, which is pretty small. If the focal length of the eyelens is $f_2 = 4\ \mathrm{mm} \cong \frac{1}{6}\ \mathrm{in}$, the magnifying power of the system is:

$$M_\theta = \frac{f_1}{f_2} \cong \frac{48\ \mathrm{in}}{\frac{1}{6}\ \mathrm{in}} = +288$$

which is a large number that will impress a naive user. BUT the diameter of the exit pupil is very small:

$$d_{XP} = \frac{f_2}{8} = \frac{\frac{1}{6}\ \mathrm{in}}{8} = \frac{1}{48}\ \mathrm{in} \cong 0.5\ \mathrm{mm}$$

so it would be very difficult to "see" anything through this telescope. This illustrates the flaw in the strategy that was once used often by manufacturers of cheap telescopes intended as gifts for children; the manufacturers would often quote a very large value for the magnifying power that required an eyepiece with a very short focal length and therefore a very small exit pupil. The images were very difficult to see by novices and experienced users alike.

The location of the exit pupil also is important. It is useful to have it placed "outside" the imaging system where the eye would be located so that it is feasible to get all of the light through the pupil into the eye. The distance from the rear vertex of the system to the exit pupil is the eye relief:

$$V'E' = \text{eye relief}$$

An imaging system with "lots of" eye relief may be easier to view through, since the location where the eye is optimally placed is back away from the eyelens. An example of a system that needs a large eye relief is a rifle scope, where the eyepiece lens will be located "far" in front of the viewing eye.

For different object distances, it is possible for the aperture stop to "move around," i.e., the element that defines the aperture stop may change with object distance. The locations and sizes of the pupils are determined by applying the ray-optics imaging equation to these objects.
To some, the concept of finding the “image of a lens” may seem confusing, but it is no different from before — just think of the lens as a regular opaque object at its location and find the images through the optics that come after (for the exit pupil) or that came before (entrance pupil). Which element in a multielement system is the “stop” depends on the relative sizes of the lenses. In the first case shown below, the first lens (the objective) is small enough that it acts as the stop (and thus also the entrance pupil). The image of the objective lens seen through the eyelens is the exit pupil, and is “between” the two lenses and very small. Because the exit pupil is small and “remote” (located “within” the optical system), so is the field of view of the Galilean telescope. In the second example, the smaller eyelens is the stop and also the exit pupil, while the image of the eyelens seen through the objective is the entrance pupil and is far behind the eyelens and relatively large.

More Examples of Galilean and Keplerian Telescopes

Consider two two-lens telescope designs. The Galilean telescope has a positive-power objective and a negative-power ocular or eyelens. The Keplerian telescope has a positive objective and a positive eyelens. Assume that the objective is identical in the two cases with $f_1 = +100\ \mathrm{mm}$ and $d_1 = 30\ \mathrm{mm}$. The focal lengths and diameters of the oculars (eyepieces) are $f_2 = \pm 15\ \mathrm{mm}$ and $d_2 = 15\ \mathrm{mm}$ (these are the approximate dimensions and focal lengths of the lenses in the OSA Optics Discovery Kit). The lenses of a telescope are separated by $f_1 + f_2$ ($f_1 + f_2 \cong 85\ \mathrm{mm}$ and $115\ \mathrm{mm}$ for the Galilean telescope and Keplerian telescope, respectively). We want to locate the stops and pupils. The stop is found by tracing a ray from an object at $\infty$ through the edge of the first element and finding the ray height at the second lens. If this ray height is small enough to pass through the second lens, then the first lens is the stop; if not, then the second lens is the stop.

Galilean telescope for object at $z_1 = +\infty$: (a) the objective lens is the aperture stop and entrance pupil because it limits the cone of entering rays. The image of the stop seen through the eyelens is the (very small) exit pupil; (b) the larger objective means that the eyelens is the aperture stop and the exit pupil. The image of the eyelens seen through the objective is the entrance pupil, and is behind the eyelens because the object distance to the objective is less than the focal length.

Consider the Galilean telescope first. The ray height at the first lens is the "semidiameter" of the lens: $\frac{d_1}{2} = 15\ \mathrm{mm}$; it is not called the "radius" to avoid confusion with a "radius of curvature." From there, the ray height would decrease to 0 mm at a distance of $f_1 = +100\ \mathrm{mm}$, but it first encounters the negative lens at a distance of $t = +85\ \mathrm{mm}$. The ray height at this lens is

$$\frac{100\ \mathrm{mm} - 85\ \mathrm{mm}}{100\ \mathrm{mm}} \cdot 15\ \mathrm{mm} = 2.25\ \mathrm{mm}$$

which is much smaller than the lens semidiameter of $\frac{d_2}{2} = 7.5\ \mathrm{mm}$. Hence the first lens (the objective lens) is the stop. The entrance pupil is the image of the stop through all of the elements that come before the stop. In this case, the first lens is also the entrance pupil and its transverse magnification is unity. The exit pupil is the image of the stop through all elements that come afterwards, which is the negative lens. The distance to the "object" is $f_1 + f_2 = 85\ \mathrm{mm}$, so the imaging equation is used to locate the exit pupil and determine its magnification:

$$\frac{1}{85\ \mathrm{mm}} + \frac{1}{z'} = \frac{1}{f_2} = -\frac{1}{15\ \mathrm{mm}}$$

$$z' = \left(-\frac{1}{15\ \mathrm{mm}} - \frac{1}{85\ \mathrm{mm}}\right)^{-1} = -\frac{51}{4}\ \mathrm{mm} = -12.75\ \mathrm{mm}$$

$$M_T = -\frac{z'}{z} = -\frac{-12.75\ \mathrm{mm}}{85\ \mathrm{mm}} = 0.15$$

The exit pupil is upright, but more important, its distance from the second lens is negative; the exit pupil is a virtual image and not accessible to the eye. The viewer "sees" the exit pupil in front of the eye. This limits the field of view of the Galilean telescope.

Follow the same procedure to determine the stop and locate the pupils and their magnifications for the Keplerian telescope. The ray height at the first lens for an object located at $\infty$ is again 15 mm. The ray height decreases to 0 mm at the focal point, but then decreases still farther (becoming negative) until encountering the ocular lens at a distance of $f_1 + f_2 = 115\ \mathrm{mm}$. The ray height h at this lens is determined from similar triangles:

$$\frac{15\ \mathrm{mm}}{h} = \frac{100\ \mathrm{mm}}{-15\ \mathrm{mm}} \implies h = -2.25\ \mathrm{mm}$$

Since $|h| = 2.25\ \mathrm{mm}$ is smaller than the semidiameter of 7.5 mm, the first lens is the stop and entrance pupil (with unit magnification) in this case too. The distance from the stop to the second lens is $f_1 + f_2 = 115\ \mathrm{mm}$, so the imaging equation for locating the exit pupil and determining its magnification is:

$$\frac{1}{115\ \mathrm{mm}} + \frac{1}{z'} = \frac{1}{f_2} = \frac{1}{+15\ \mathrm{mm}}$$

$$z' = \left(+\frac{1}{15\ \mathrm{mm}} - \frac{1}{115\ \mathrm{mm}}\right)^{-1} = +\frac{69}{4}\ \mathrm{mm} = +17.25\ \mathrm{mm}$$

$$M_T = -\frac{z'}{z} = -\frac{+17.25\ \mathrm{mm}}{115\ \mathrm{mm}} = -0.15$$

The exit pupil is a real image of the aperture stop in the Keplerian telescope, so we can place our eye at it and see a larger field of view.
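The stop-finding procedure used for both telescopes (trace the edge ray from an object at infinity to the eyelens, compare with its semidiameter, then image the objective through the eyelens) can be sketched as a single function. The name `telescope_pupils` and the returned tuple layout are assumptions made for this illustration; the exit-pupil values it returns are meaningful only when the objective is found to be the stop.

```python
from fractions import Fraction as F

def telescope_pupils(f1, d1, f2, d2):
    """Afocal two-lens telescope (t = f1 + f2), object at infinity.

    Returns (objective_is_stop, z_xp, m_xp, d_xp), where z_xp is the exit-pupil
    distance from the eyelens, m_xp its transverse magnification, d_xp its signed diameter.
    """
    f1, d1, f2, d2 = (F(v) for v in (f1, d1, f2, d2))
    t = f1 + f2
    y2 = (d1 / 2) * (1 - t / f1)          # edge-ray height at the eyelens
    objective_is_stop = abs(y2) <= d2 / 2
    z_xp = t * f2 / (t - f2)              # image of the objective through the eyelens
    m_xp = -z_xp / t
    return objective_is_stop, z_xp, m_xp, m_xp * d1

print(telescope_pupils(100, 30, -15, 15))   # Galilean, Optics Discovery Kit values
print(telescope_pupils(100, 30, 15, 15))    # Keplerian, same objective
```

For both designs the magnitude of the exit-pupil magnification comes out as $|f_2 / f_1|$, which is the reciprocal of the magnifying power, as noted earlier for the 200 mm / 25 mm telescopes.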

Vignetting

The location of the aperture stop is determined for an object located "on" the optical axis. If the object is "off" the axis, the cone of rays that get through the system is "skewed" or "tilted." If other elements in the system (lenses or diaphragms) constrain parts of the skewed cone of rays, then the cone of rays is truncated and the brightness of the image is reduced; this phenomenon is "vignetting."

Example of vignetting; the brightness of the scene at the edges is reduced due to the presence of an “out-of-focus” aperture in the system.

2.15.4 Pupils and Diffraction

The concept of pupils may be combined with diffraction to evaluate the effective focal ratio (f/number) of the imaging system. For a single thin lens, the diffraction spot is determined by the size and shape specified by the pupil function $p[x, y]$ or $p(r)$ and the distance to the image. If the lens has a circular pupil of diameter $d_0$, the pupil function

$$p(r) = CYL\left(\frac{r}{d_0}\right)$$

determines the extent of the ray cone that enters the system. We derived the resulting diffraction pattern, which is proportional to a scaled circularly symmetric sombrero function, which is the analogue of the SINC function using the first-order Bessel function, and therefore is sometimes called the "besinc" function.

$$h(r) \propto \frac{\pi d_0^2}{4} \cdot SOMB\left(\frac{r}{\left(\frac{\lambda_0 z_2}{d_0}\right)}\right)$$

If the object distance is large, then the image distance $z_2 \simeq f$ and the amplitude of the impulse response is:

$$h(r) \propto SOMB\left(\frac{r}{\left(\frac{\lambda_0 f}{d_0}\right)}\right)$$

The diameter of the Airy disk is approximately:

$$D_0 \cong 2.44\,\lambda_0 \left(\frac{f}{d_0}\right) \cong 2.44 \cdot \lambda_0 \cdot f/\#$$

2.15.5 Field Stop

As suggested by its name, a field stop limits the field of view of the system. It may be as simple as the finite size of the sensor (e.g., a rectangular piece of photosensitive emulsion or a CCD sensor), or it may be placed at an intermediate image within the system or even at the object itself. Images of the field stop are located at the same locations as intermediate images of the object.

2.16 Marginal and Chief Rays

Many important characteristics of an optical system, including the possible presence of vignetting, are determined by the trace of two specific rays through the imaging system. For an object O with image O', aperture stop S and entrance pupil E and exit pupil E', the marginal ray traces from the center of O to the edge of S and back to the center of O'. The chief ray (or principal ray) is traced from the edge of O (or edge of the "field of view") through the center of S to the edge of O'. Since E and E' are images of the stop S, the marginal and chief rays also go through the edges and centers of the pupils, respectively.

The marginal ray is specified by its ray heights y and ray angle u at different points on the optical axis; the corresponding notation for the chief ray includes "overscores" or "bars:" $\bar{y}$, $\bar{u}$. Heights and angles of the marginal ray after refraction at a surface are "primed," e.g., $y'$ and $u'$. The corresponding quantities for the chief ray are $\bar{y}'$ and $\bar{u}'$. From the definition of the marginal ray, an object or image is located at any location (value of z) where $y = 0$. Similarly, the aperture stop, entrance pupil, and exit pupil are located at values of z where $\bar{y} = 0$. An image exists wherever the marginal ray crosses the axis and the aperture stop or pupils are located wherever the chief ray crosses the axis. Complete specification of these two rays is sufficient to characterize the location of object and image(s), the field of view, and the magnifications.

The chief ray is the axis of the unvignetted light beam from a point at the edge of the field of view. The radius of the unvignetted light beam (or perhaps more appropriately called the semidiameter to avoid potential confusion with the "radius of curvature") is the sum of the heights of the marginal and chief rays:

$$\left(\frac{d}{2}\right)_{unvignetted} = y + \bar{y} \quad \text{at any location } z$$

Figure 2.2: The marginal and chief rays for a two-element imaging system where the second element is the stop. The marginal ray comes from the center of the object O, grazes the edge of the stop, and passes through the center of the image O'. The chief ray travels from the edge of the object through the center of the stop to the edge of the image.

Because paraxial calculations are linear, it is customary to normalize the ray heights and angles for the calculation and then scale the results to satisfy the conditions of the specific system. For example, we generally select the chief ray height $\bar{y} = 1$ and the marginal ray angle $u = 1$ at the object. Clearly the choice of unit ray angle (in radians) is inconsistent with the paraxial approximation, but this is just a computational convenience because all quantities are scalable.

2.16.1 Telecentricity

If the aperture stop is located such that the entrance and/or exit pupils are at infinity, then the system is telecentric. One way to do this is to place the aperture stop at one of the focal points of the system, which means that the corresponding pupil is at the same location and the other pupil is at infinity. As shown in the figure, if the stop is located at the object-space focal point of a single thin lens, then the entrance pupil is at the same location and the exit pupil is at infinity in image space; this is an image-space telecentric system.

Telecentric system consisting of a single thin lens with the aperture stop placed at the object-space focal point, showing the chief ray (solid blue) and marginal ray (red). The chief ray intersects the optical axis at that focal point and so emerges from the lens parallel to the optical axis. The dashed blue lines parallel to the chief ray intersect at the image. The defocused image is the same height as the focused image.

If the stop is located at the image-space focal plane, then the entrance pupil is at infinity, forming an object-space telecentric system. If either the entrance or exit pupil is at infinity, then the chief ray must be parallel to the optical axis on that side of the imaging system. This means that the system transverse magnification will be constant even if the image is blurry. Put another way, a blurred image has the correct magnification. A “double telecentric” system is an afocal system (telescope) with the stop located at the common focal plane of the two lenses. This means that both the entrance and exit pupils are at infinity. The fact that the magnification of the system does not depend on accuracy of focusing makes telecentric systems particularly useful for metrology.
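The parallel chief ray of an image-space telecentric system can be checked numerically. The following is a minimal sketch, assuming a single thin lens in air with a hypothetical focal length of 50 mm and the stop at the object-space focal point; the function names are illustrative and not part of these notes. It uses the paraxial transfer and refraction steps that are developed in Chapter 3.

```python
def transfer(y, nu, reduced_t):
    """Paraxial transfer: the height changes, the reduced angle does not."""
    return y + reduced_t * nu, nu

def refract(y, nu, phi):
    """Paraxial refraction at a thin lens of power phi: the angle changes, the height does not."""
    return y, nu - y * phi

def chief_ray_after_lens(f_mm, nu_at_stop):
    """Chief ray crosses the axis at the stop (front focal point), then meets the lens."""
    y, nu = 0.0, nu_at_stop
    y, nu = transfer(y, nu, f_mm)          # stop to lens, in air (n = 1)
    return refract(y, nu, 1.0 / f_mm)

for angle in (0.02, 0.05, 0.10):
    y_out, nu_out = chief_ray_after_lens(50.0, angle)
    print(f"input angle {angle:.2f} rad -> height {y_out:.2f} mm, exit angle {nu_out:.1e} rad")
```

For every input angle the exit angle should vanish to within floating-point rounding, which is the statement that the chief ray is parallel to the axis in image space.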

Double telecentric system with the aperture stop at the common focal point of the two lenses. The marginal ray is shown in red and the chief ray in solid blue.

2.16.2 Marginal and Chief Rays for Telescopes

The marginal ray of an afocal system used to image an object at infinity travels parallel to the optical axis before the first lens and after the last ($u = 0$, $u' = 0$). The relative sizes of the two lenses determine which is the aperture stop; for a Galilean telescope, the aperture stop is usually the negative ocular lens.

MORE TO COME

Chapter 3

Tracing Rays Through Optical Systems

The imaging equation(s) become quite complicated in systems with more than a very few lenses. However, we can determine the effect of the optical system by ray tracing, where the action on two (or more) rays is determined. Ray tracing may be paraxial or exact. Historically, graphical, matrix, or worksheet ray tracing was commonly used in optical design, but most ray tracing is now implemented in computer software, so exact solutions are more common than they once were.

3.1 Paraxial Ray Tracing Equations

Consider the schematic of a two-element optical system made of thick lenses, so the vertices and principal planes of individual lenses do not coincide at the same points.

Schematic of ray tracing of a provisional marginal ray from an object at an infinite distance. The system has two elements and the locations $H_n$ and $H_n'$ are the principal planes of the $n^{\text{th}}$ element. The ray height at the $n^{\text{th}}$ element is $y_n$ and the ray angle during transfer between elements $n-1$ and $n$ is $u_n$.

The two elements are represented by their two principal “planes,” which are the planes of unit magnification. The refractive power of the first element changes the ray angle of the input ray. In the example shown, the input ray angle is $u_1 = 0$ radians, i.e., the ray is parallel to the optical axis. The height of this ray above the axis at the object-space principal plane $H_1$ is $y_1$ units. The ray emerges from the principal plane $H_1'$ at the same height $y_1$ but with a new ray angle $u_2$. The ray “transfers” to the second element through the distance $t_2$ in the index $n_2$ and has ray height $y_2$ at principal plane $H_2$. The ray emerges from the principal plane at the same height but a new angle $u_3$.

Figure 3.1: Refraction of a paraxial ray at a surface with radius of curvature $R$ between media with refractive indices $n$ and $n'$. The ray height and angle at the surface are $y$ and $u$, respectively. The angle of the ray measured at the center of curvature is $\alpha$. The height and angle immediately after refraction are $y$ and $u'$. The object and image distances are $s$ and $s'$ (which are now called $z$ and $z'$ in the text).

3.1.1 Paraxial Refraction

Consider refraction of a paraxial ray emitted from the object O at a surface with radius of curvature R. For a paraxial ray, the surface may be drawn as “vertical”. The height of the ray at the surface is y. From the drawing, the incoming ray angle u measured from the optical axis is:

$$u = \tan^{-1}\left[\frac{y}{z}\right] \cong \frac{y}{z} > 0$$

and the corresponding equation for the outgoing ray measured from the optical axis is:

$$u' = \tan^{-1}\left[\frac{y}{z'}\right] \cong \frac{y}{z'} > 0$$

The angle of the height of the ray at the refractive surface measured from the center of curvature is:

$$\alpha = -\tan^{-1}\left[\frac{y}{R}\right] \cong -\frac{y}{R}$$

The incident and refracted angles measured relative to the surface normal at height $y$ are the angles of incidence and refraction. From the drawing:

$$i = u - \alpha \qquad\qquad i' = u' - \alpha$$

Now apply Snell’s law in the paraxial approximation:

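$$
\begin{aligned}
n \sin[i] = n' \sin[i'] &\implies n \cdot i \cong n' \cdot i' \\
n\,(u - \alpha) &= n'\,(u' - \alpha) \\
\implies n'u' &\cong nu - n\alpha + n'\alpha = nu + \alpha\,(n' - n) \\
&= nu + \left(-\frac{y}{R}\right)(n' - n) \\
&= nu - y \cdot \frac{(n' - n)}{R} \\
&\equiv nu - y \cdot \phi
\end{aligned}
$$

$$n'u' \cong nu - y \cdot \phi$$

The paraxial refraction equation, solved for the surface power in terms of the incident angle $u$, refracted angle $u'$, ray height $y$, surface power $\phi = \frac{1}{f}$, and indices of refraction $n$ and $n'$, is:

$$\phi = \frac{nu - n'u'}{y}$$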

3.1.2 Paraxial Transfer

Paraxial transfer from one surface to the next in a medium with refractive index n0.

The transfer equation determines the ray height y0 at the next surface given the initial ray height y, the physical distance t0 and the ray angle u0 in the medium with index n0. From the drawing, we have:

$$
\begin{aligned}
y' &= y + t' \cdot u' \\
y' &= y + \left(\frac{t'}{n'}\right) \cdot (n'u')
\end{aligned}
$$

where the substitution was made to put the ray angle in the same form $n'u'$ that appeared in the refraction equation. The distance $\frac{t'}{n'} \le t'$ is called the reduced thickness (note the potential for confusing the reduced thickness $\frac{t'}{n'}$ and the optical path length $n't'$).

3.1.3 Linearity of the Paraxial Refraction and Transfer Equations

Note that both the paraxial refraction and transfer equations are linear in the height and angle, i.e., neither includes any operations involving squares or nonlinear functions (such as sine, tangent, or logarithm). Among other things, this means that they may be scaled by direct multiplication to obtain other “equivalent” rays, for example to match the marginal ray height to the semidiameter of the aperture stop or the chief ray angle to the semidiameter of the field stop. For example, the output angle may be scaled by scaling the input ray angle and the height by a constant factor $\alpha$:

$$\alpha\,(nu - y\phi) = \alpha\,(nu) - (\alpha \cdot y)\,\phi = \alpha\,(n'u')$$

We will often take advantage of this linear scaling property to find the exact marginal and chief rays from their provisional counterparts.
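The scaling property can be illustrated with a short numerical sketch. The surface power, thickness, index, and scale factor below are hypothetical values chosen only for illustration, and the function names are not from these notes.

```python
def refract(y, nu, phi):
    """Paraxial refraction: n'u' = nu - y*phi, height unchanged."""
    return y, nu - y * phi

def transfer(y, nu, t, n):
    """Paraxial transfer: y' = y + (t/n)*(n'u'), reduced angle unchanged."""
    return y + (t / n) * nu, nu

def trace(y, nu):
    """One refraction followed by one transfer (hypothetical values)."""
    y, nu = refract(y, nu, phi=0.02)          # surface power in 1/mm
    return transfer(y, nu, t=10.0, n=1.5)     # thickness in mm, index of the medium

k = 3.5                                        # arbitrary scale factor
y1, nu1 = trace(1.0, 0.1)
yk, nuk = trace(k * 1.0, k * 0.1)
print(abs(yk - k * y1) < 1e-12, abs(nuk - k * nu1) < 1e-12)
```

Scaling the input ray by `k` scales the output ray by the same factor, so a provisional ray needs to be traced only once.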

3.1.4 Paraxial Ray Tracing

To characterize the paraxial properties of a system, two provisional rays are traced:

1. Initial height of marginal ray at first surface: $y = 1.0$, initial marginal ray angle $nu = 0$;

2. Initial height of chief ray at first surface: $\bar{y} = 0.0$, initial chief ray angle $n\bar{u} = 1$.

We have already named these rays; the first is the provisional marginal ray that intersects the optical axis at the object (and thus also at every image of the object). The second ray (distinguished by the overscore) is called the provisional chief (or principal) ray and travels from the edge of the object to the edge of the field of view through the center of the stop (and thus through the center of the pupils, which are images of the stop). Since the paraxial ray tracing equations are linear, these provisional rays may be scaled to the parameters of the system.

The process of ray tracing is perhaps best introduced by example. Consider a two-element, three-surface system. The first surface is the cornea, with radius of curvature in the model of $R_1 = +7.8$ mm. The “aqueous humor” between the cornea and the lens has a thickness in the model of 3.6 mm and refractive index of $n_2 = 1.336$. The surfaces of the lens have radii of curvature $R_2 = +10$ mm and $R_3 = -6$ mm, thickness of 3.6 mm, and refractive index $n_3 = 1.413$. The “vitreous humor” between the lens and the retina has the same refractive index of $n_4 = 1.336$ as the “aqueous humor.”

Marginal and chief rays traced through the three-surface optical system.

The refraction at the first surface changes the angle but not the height of a ray from the object. If the incident ray angle is 0 radians, then the new ray angle for the provisional marginal ray is:

$$
\begin{aligned}
(n'u')_1 &= (nu)_1 - y_1\,[\text{mm}] \cdot \phi_1\,[\text{mm}^{-1}] \\
&= 0 - (1.0)(+0.043077) \\
&= -0.043077 \text{ radian}
\end{aligned}
$$

Note that we are retaining 6 decimal places in this calculation to ensure the best result at the end. We will then truncate (round) the value to a more reasonable accuracy.

The transfer equation for the provisional marginal ray between the first and second surface changes the height of the ray but not the angle. The height at the second surface is:

$$
\begin{aligned}
y_1' &= y_1 + \left(\frac{t'}{n'}\right)_1 (n'u')_1 \ [\text{mm}] \\
&= 1 + \frac{3.6}{1.336}\,(-0.043077) = +0.883924 \text{ mm}
\end{aligned}
$$

Thus the ray exits the first surface at the “reduced angle” $n'u' \cong -0.04$ radians and arrives at the second surface at height $y' \cong +0.88$ units. The corresponding equations for the chief ray at the first surface are:

$$
\begin{aligned}
(n'\bar{u}')_1 &= (n\bar{u})_1 - \bar{y}_1\,\phi_1 \\
&= 1 - (0.0)(+0.043077) \\
&= 1 \text{ radian}
\end{aligned}
$$

$$
\begin{aligned}
\bar{y}_1' &= \bar{y}_1 + \left(\frac{t'}{n'}\right)_1 (n'\bar{u}')_1 \\
&= 0 + \frac{3.6}{1.336}\,(1) = +2.694611 \text{ mm} \cong 2.695 \text{ mm}
\end{aligned}
$$

Since the provisional chief ray went through the center of the lens, its angle did not change. The height of the chief ray at the second surface is proportional to the ray angle.

Ray-Tracing Table

The equations may be evaluated in sequence to compute the rays through the system. These are presented in the table. Each column in the table represents a surface in the system and the “primed” quantities refer to distances and angles following the surface. In words, the values of $t'$ are the distances from the surface in the column to the next surface.

| Parameter | Initial | Surface 1 | Surface 2 | Surface 3 | Image Surface |
|---|---|---|---|---|---|
| $R$ | | +7.8 mm | +10.0 mm | −6.0 mm | |
| $t'$ | | 3.6 mm | 3.6 mm | | |
| $n'$ | 1.0 | 1.336 | 1.413 | 1.336 | |
| $\phi = \frac{n' - n}{R}$ | | 0.043077 mm⁻¹ | 0.007700 mm⁻¹ | 0.012833 mm⁻¹ | |
| $\frac{t'}{n'}$ | | 3.6 mm / 1.336 = 2.694611 mm | 3.6 mm / 1.413 = 2.547771 mm | 12.699 mm (boxed) | |
| **Rays** | | | | | |
| $y$ | 1 mm | 1 mm | 0.883924 mm | 0.756833 mm | 0 mm |
| $n'u'$ | 0 | −0.043077 radian | −0.049883 radian | −0.059596 radian | −0.059596 radian |
| $\bar{y}$ | | 0 mm | 2.694611 mm | 5.189519 mm | 16.779317 mm |
| $n'\bar{u}'$ | 1 radian | 1 radian | 0.979251 radian | 0.912654 radian | |

The ray trace indicates that the provisional marginal ray emerges from the last surface with height and angle

$$\begin{bmatrix} y \\ n'u' \end{bmatrix} = \begin{bmatrix} 0.756833 \text{ mm} \\ -0.059596 \text{ radians} \end{bmatrix}$$

These are used to calculate the (boxed) distance to the image location (where the marginal ray height is 0):

$$
\begin{aligned}
y' = 0 &= y + \frac{t'}{n'}\,(n'u') \\
0 &= (+0.756833) + \frac{t'}{n'}\,(-0.059596) \\
\implies \frac{t'}{n'} &= \frac{+0.756833}{0.059596} \cong +12.699 \text{ mm}
\end{aligned}
$$

This is the “reduced distance” in the image medium with index $n_4$; the physical distance $t'$ is:

$$\implies t' = \frac{+0.756833}{0.059596} \text{ mm} \cdot n' = 12.699 \cdot 1.336 \cong 16.966 \text{ mm}$$

The height and angle of the provisional chief ray at the image location are $\bar{y} \cong 16.78$ mm and $n'\bar{u}' \cong 0.91$ radians, respectively, which may be scaled to the size of a known sensor to determine the field of view. This particular system is often used as a model for the human eye with the lens “relaxed” to view objects at $\infty$. The first surface represents the cornea of the eye, while the other two surfaces are the front and back of the lens. Note that the power of the cornea ($0.043077 \text{ mm}^{-1} \cong 43$ diopters) is considerably larger than the powers of the lens surfaces (7.7 diopters and 12.8 diopters, respectively).
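The table above can be reproduced by applying the refraction and transfer equations in sequence. The following is a minimal sketch; the function name `trace` and its argument layout are assumptions made for illustration, while the radii, indices, and thicknesses are those of the eye model in the text. Because it keeps full floating-point precision instead of six decimal places, the last digits may differ slightly from the tabulated values.

```python
def trace(y, nu, radii, indices, thicknesses):
    """Return a list of (y, n'u') pairs: height at each surface, reduced angle after it.

    radii[k]        radius of curvature of surface k (mm)
    indices[k]      index before surface k; indices[k + 1] is the index after it
    thicknesses[k]  physical distance from surface k to surface k + 1 (mm)
    """
    rows = []
    for k, R in enumerate(radii):
        n, n_after = indices[k], indices[k + 1]
        phi = (n_after - n) / R                  # surface power
        nu = nu - y * phi                        # paraxial refraction
        rows.append((y, nu))
        if k < len(thicknesses):
            y = y + (thicknesses[k] / n_after) * nu   # paraxial transfer
    return rows

radii = [7.8, 10.0, -6.0]
indices = [1.0, 1.336, 1.413, 1.336]
thicknesses = [3.6, 3.6]

marginal = trace(1.0, 0.0, radii, indices, thicknesses)   # y = 1, nu = 0
chief = trace(0.0, 1.0, radii, indices, thicknesses)      # ybar = 0, nubar = 1

y3, nu3 = marginal[-1]
reduced_image_distance = -y3 / nu3          # where the marginal ray height reaches 0
image_distance = reduced_image_distance * indices[-1]
ybar3, nubar3 = chief[-1]
chief_height_at_image = ybar3 + reduced_image_distance * nubar3

print(round(image_distance, 3), round(chief_height_at_image, 2))
```

The printed values should agree with the image distance and chief ray height quoted in the text to within rounding.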

3.2 Matrix Formulation of Paraxial Ray Tracing

The same linear paraxial ray tracing equations may be conveniently implemented as matrices acting on ray vectors for the marginal and chief rays whose components are the height and angle. The ray vectors may be defined as:

$$\text{paraxial marginal ray vector: } \begin{bmatrix} y \\ nu \end{bmatrix} \qquad\qquad \text{paraxial chief ray vector: } \begin{bmatrix} \bar{y} \\ n\bar{u} \end{bmatrix}$$

Note that there is nothing magical about the convention for the ordering of $y$ and $nu$ (i.e., which goes “on top” of the vector); this is the convention used by Roland Shack at the Optical Sciences Center at the University of Arizona, but Willem Brouwer’s book “Matrix Methods in Optical Instrument Design” uses the opposite order. Note that the choice of convention here determines the form of the system matrix, but the two choices are equivalent.

In this notation, the two column vectors that represent the marginal and chief rays may be combined to form a ray matrix $L$:

$$L \equiv \left[ \begin{pmatrix} y \\ nu \end{pmatrix} \begin{pmatrix} \bar{y} \\ n\bar{u} \end{pmatrix} \right] = \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix}$$

which may be evaluated at any point in the system. The determinant of this ray matrix is:

$$\det[L] = y \cdot (n\bar{u}) - (nu) \cdot \bar{y} \equiv \aleph$$

which we shall show to be a constant, the so-called Lagrange invariant. In words, the Lagrange invariant is the product of the chief ray height and marginal ray angle subtracted from the product of the marginal ray height and chief ray angle. We denote it by the symbol $\aleph$ (“aleph,” chosen here for the simple reason that it is distinctive). We shall see that $\aleph$ is unaffected by both refraction and transfer, and therefore is invariant as we progress through different locations in the system.

3.2.1 Refraction Matrix

Given the ray vectors or the ray matrix, we can now define operators for refraction and transfer. Recall that paraxial refraction of a marginal ray and of a chief ray at a surface with power φ changes the ray angles but not the heights (at the surfaces):

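$$
\begin{aligned}
n'u' &= nu - y \cdot \phi && \text{for marginal ray} \\
n'\bar{u}' &= n\bar{u} - \bar{y} \cdot \phi && \text{for chief ray}
\end{aligned}
$$

The refraction process for the marginal ray may be written as a matrix $\mathcal{R}$, and the output is the product with the ray vector, which will have the same ray height and a different angle:

$$\mathcal{R} \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} = \begin{bmatrix} y & \bar{y} \\ n'u' & n'\bar{u}' \end{bmatrix}, \qquad \mathcal{R} = \begin{bmatrix} a & c \\ b & d \end{bmatrix}$$

where we need to evaluate the four values $a$ through $d$. Consider the action of the refraction matrix on the marginal ray:

$$
\begin{aligned}
\mathcal{R} \begin{bmatrix} y \\ nu \end{bmatrix} &= \begin{bmatrix} a & c \\ b & d \end{bmatrix} \begin{bmatrix} y \\ nu \end{bmatrix} = \begin{bmatrix} y \\ n'u' \end{bmatrix} \\
a\,y + c \cdot (nu) &= y \implies a = 1,\ c = 0 \\
b\,y + d \cdot (nu) &= n'u' = nu - y \cdot \phi \implies b = -\phi,\ d = 1
\end{aligned}
$$

Substitute these values to see the form of the refraction matrix:

$$\mathcal{R} = \begin{bmatrix} 1 & 0 \\ -\phi & 1 \end{bmatrix}$$

The determinant of the refraction matrix is:

$$\det \mathcal{R} = \det \begin{bmatrix} 1 & 0 \\ -\phi & 1 \end{bmatrix} = (1)(1) - (-\phi)(0) = 1$$

The action of a refraction matrix $\mathcal{R}$ on a ray matrix $L$ is:

$$
\begin{aligned}
\mathcal{R} L &= L' \\
\begin{bmatrix} 1 & 0 \\ -\phi & 1 \end{bmatrix} \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} &= \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix} = \begin{bmatrix} y & \bar{y} \\ nu - y \cdot \phi & n\bar{u} - \bar{y} \cdot \phi \end{bmatrix}
\end{aligned}
$$

The determinant of the ray matrix after refraction is:

$$
\begin{aligned}
\det[L'] &= y\,(n\bar{u} - \bar{y} \cdot \phi) - \bar{y}\,(nu - y \cdot \phi) \\
&= y \cdot n\bar{u} - y\bar{y} \cdot \phi - \bar{y} \cdot nu + \bar{y}y \cdot \phi \\
&= y \cdot n\bar{u} - \bar{y} \cdot nu = \aleph = \det[L]
\end{aligned}
$$

which confirms that the Lagrange invariant is not affected by refraction.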

3.2.2 Ray Transfer Matrix

The transfer of the marginal ray from one surface to the next within the medium with index n0 is

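$$y' = y + \frac{t'}{n'}\,(n'u')$$

which also may be written as the product of a transfer matrix $\mathcal{T}$ with the marginal ray vector:

$$
\begin{aligned}
\mathcal{T} \begin{bmatrix} y \\ n'u' \end{bmatrix} &= \begin{bmatrix} y + (n'u') \left( \frac{t'}{n'} \right) \\ n'u' \end{bmatrix} \\
\begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y \\ n'u' \end{bmatrix} &= \begin{bmatrix} y' \\ n'u' \end{bmatrix}
\end{aligned}
$$

so the determinant of the transfer matrix also is 1:

$$\det \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} = (1)(1) - (0)\left( \frac{t'}{n'} \right) = 1$$

The action of the transfer matrix $\mathcal{T}$ on the ray matrix $L$ is:

$$
L' = \mathcal{T} L = \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix} = \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y & \bar{y} \\ n'u' & n'\bar{u}' \end{bmatrix} = \begin{bmatrix} y + \left( \frac{t'}{n'} \right) \cdot n'u' & \bar{y} + \left( \frac{t'}{n'} \right) \cdot n'\bar{u}' \\ n'u' & n'\bar{u}' \end{bmatrix}
$$

and the determinant of the ray matrix after the transfer operation is:

$$
\begin{aligned}
\det[L'] = \det[\mathcal{T} L] &= \left( y + \left( \frac{t'}{n'} \right) n'u' \right) (n'\bar{u}') - \left( \bar{y} + \left( \frac{t'}{n'} \right) n'\bar{u}' \right) (n'u') \\
&= y \cdot n'\bar{u}' + \left( \frac{t'}{n'} \right) n'u' \cdot n'\bar{u}' - \bar{y} \cdot n'u' - \left( \frac{t'}{n'} \right) n'\bar{u}' \cdot n'u' \\
&= y \cdot n'\bar{u}' - \bar{y} \cdot n'u' = \aleph = \det[L]
\end{aligned}
$$

so the determinants of the ray matrix before and after transfer are also identically the Lagrange invariant $\aleph$; in other words, neither the refraction nor the transfer matrix has any effect on the determinant of a ray matrix, so the Lagrange invariant is preserved by refraction or transfer (hence its name!).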
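The invariance of the determinant can be checked numerically. The following is a minimal sketch in plain Python; the ray values, powers, and reduced thicknesses are hypothetical numbers chosen only for illustration, and the helper names are not from these notes.

```python
def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det(P):
    return P[0][0] * P[1][1] - P[0][1] * P[1][0]

def refraction(phi):
    """Refraction matrix for a surface of power phi."""
    return ((1.0, 0.0), (-phi, 1.0))

def transfer(reduced_t):
    """Transfer matrix for a reduced thickness t'/n'."""
    return ((1.0, reduced_t), (0.0, 1.0))

# Columns: marginal ray (y, nu) and chief ray (ybar, nubar); hypothetical values.
L = ((2.0, -0.5), (0.01, 0.08))
aleph = det(L)

steps = [refraction(0.02), transfer(12.0), refraction(-0.005), transfer(30.0)]
for step in steps:
    L = matmul(step, L)                  # each new operation multiplies from the left
    assert abs(det(L) - aleph) < 1e-12   # determinant is unchanged at every step

print(aleph, det(L))
```

The two printed numbers should agree to within floating-point rounding, whatever sequence of refractions and transfers is used.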

Ray Transfer Matrix for an Optical System

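The refraction and transfer matrices may be combined in sequence to model a complete system. If we start with the marginal ray vector at the input object, the first operation is transfer to the first surface. The next is refraction by that surface, transfer to the next, and so forth until a final transfer to the output image:

$$L_{\text{image}} = \left( \mathcal{T}_n \mathcal{R}_n \cdots \mathcal{T}_2 \mathcal{R}_2 \mathcal{T}_1 \mathcal{R}_1 \mathcal{T}_0 \right) L_{\text{object}}$$

If the initial ray matrix is located at the object (as usual), the marginal ray height is zero, so the ray matrix at the object and any images has the form:

$$L_{\text{object}} = \begin{bmatrix} 0 & \bar{y}_{\text{in}} \\ (nu)_{\text{in}} & (n\bar{u})_{\text{in}} \end{bmatrix} \qquad\qquad L_{\text{image}} = \begin{bmatrix} 0 & \bar{y}_{\text{out}} \\ (nu)_{\text{out}} & (n\bar{u})_{\text{out}} \end{bmatrix}$$

so the system from object to image is:

$$
\begin{aligned}
\mathcal{S} &\equiv \mathcal{T}_n \mathcal{R}_n \cdots \mathcal{T}_2 \mathcal{R}_2 \mathcal{T}_1 \mathcal{R}_1 \mathcal{T}_0 \\
\mathcal{S} \cdot L_{\text{object}} &= L_{\text{image}} \\
\left( \mathcal{T}_n \mathcal{R}_n \cdots \mathcal{T}_2 \mathcal{R}_2 \mathcal{T}_1 \mathcal{R}_1 \mathcal{T}_0 \right) \begin{bmatrix} 0 & \bar{y}_{\text{in}} \\ (nu)_{\text{in}} & (n\bar{u})_{\text{in}} \end{bmatrix} &= \begin{bmatrix} 0 & \bar{y}_{\text{out}} \\ (nu)_{\text{out}} & (n\bar{u})_{\text{out}} \end{bmatrix}
\end{aligned}
$$

Note that the individual refraction and transfer matrices are sequenced in inverse order, i.e., the last matrix is the first in the sequence for the system. The transfer matrix $\mathcal{T}_0$ acts on the input ray matrix, so it must appear on the right.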

Ray Matrix for Provisional Marginal and Chief Rays

The system is characterized by using provisional marginal and chief rays located at the object. The linearity of the computations ensures that the rays may be scaled subsequently to satisfy other system constraints, such as the diameter of the stop. The provisional marginal ray at the object has height $y = 0$ and ray angle $nu = +1$, while the provisional chief ray at the object has height $\bar{y} = +1$ and angle $n\bar{u} = 0$. Thus the provisional ray matrix at the object is:

$$L_0 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

3.2.3 “Vertex-to-Vertex Matrix” for System

We can construct a matrix that represents JUST the optical system by excluding the input ray matrix, the transfer matrix from object to object-space vertex, the transfer from image-space vertex to image, and the output ray matrix. This subset is the “vertex-to-vertex matrix” $\mathcal{M}_{VV'}$ of the system and is a complete specification of the paraxial properties of the system. The general form for the matrix is:

$$\mathcal{M}_{VV'} = \left( \mathcal{R}_n \cdots \mathcal{T}_2 \mathcal{R}_2 \mathcal{T}_1 \mathcal{R}_1 \right) = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$$

where $A$, $B$, $C$, $D$ are factors to be determined from the various refractions and transfers for a specific system. The entries $A$ and $D$ in the matrix are “pure” numbers (without units), while $B$ and $C$ have dimensions of length and reciprocal length, respectively. From matrix algebra, it is possible to show that the determinant of the matrix product is the product of the determinants. We already know that the determinant of the matrix for any transfer or refraction is unity, which establishes a constraint on the vertex-to-vertex matrix:

$$
\begin{aligned}
\det[\mathcal{M}_{VV'}] &= \det \mathcal{R}_n \cdot \det \mathcal{T}_{n-1} \cdots \det \mathcal{R}_2 \cdot \det \mathcal{T}_1 \cdot \det \mathcal{R}_1 \\
&= 1 \cdot 1 \cdots 1 \cdot 1 = 1 \\
\implies \det[\mathcal{M}_{VV'}] &= 1 \\
\implies AD - BC &= 1
\end{aligned}
$$

Consider a simple example of the matrix $\mathcal{M}_{VV'}$ for a two-lens system with powers $\phi_1 = (f_1)^{-1}$ and $\phi_2 = (f_2)^{-1}$ separated by $t$. The product of the two refraction matrices and the transfer matrix is:

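$$
\begin{aligned}
\mathcal{M}_{VV'} &= \mathcal{R}_2 \mathcal{T}_1 \mathcal{R}_1 \\
&= \begin{bmatrix} 1 & 0 \\ -\phi_2 & 1 \end{bmatrix} \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\phi_1 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 1 - \phi_1 t & t \\ -(\phi_1 + \phi_2 - \phi_1 \phi_2 t) & 1 - \phi_2 t \end{bmatrix} \\
\mathcal{M}_{VV'} &= \begin{bmatrix} 1 - \phi_1 t & t \\ -\phi_{\text{eff}} & 1 - \phi_2 t \end{bmatrix}
\end{aligned}
$$

where the known expression for the system power

$$\frac{1}{f_{\text{eff}}} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 \cdot f_2} \implies \phi_{\text{eff}} = \phi_1 + \phi_2 - \phi_1 \cdot \phi_2 \cdot t$$

has been substituted in the last expression. It is easy to confirm that the determinant of this system matrix is unity. We have four equations in the four unknowns $A$, $B$, $C$, $D$, which may be combined to find useful system metrics in terms of the elements in the vertex-to-vertex matrix $\mathcal{M}_{VV'}$:

| Quantity | Expression |
|---|---|
| effective focal length of system | $f_{\text{eff}} = \frac{1}{\phi_{\text{eff}}} = -\frac{1}{C}$ |
| front focal length | $\frac{\text{FFL}}{n} = \frac{\overline{FV}}{n} = -\frac{D}{C}$ |
| back focal length | $\frac{\text{BFL}}{n'} = \frac{\overline{V'F'}}{n'} = -\frac{A}{C}$ |
| distance from front vertex to object-space principal point | $\frac{\overline{VH}}{n} = \frac{D - 1}{C}$ |
| distance from image-space principal point to rear vertex | $\frac{\overline{H'V'}}{n'} = \frac{A - 1}{C}$ |
| distance from rear vertex to image (if object distance $t_1$ is known) | $\frac{\overline{V'O'}}{n'} = \frac{t_2}{n'} = \frac{m - A}{C} = -\frac{B + A t_1}{D + C t_1}$ |
| distance from object to front vertex (if image distance $t_2$ is known) | $\frac{\overline{OV}}{n} = \frac{t_1}{n} = \frac{\frac{1}{m} - D}{C} = -\frac{B + D t_2}{A + C t_2}$ |

Here $m$ is the transverse magnification, and $\overline{H'V'} = f_{\text{eff}} - \text{BFL}$ for a system in air. When evaluating matrices, note that you need to retain plenty of significant figures in the calculation (at least 6) to ensure that the derived values are sufficiently accurate.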
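The two-lens matrix and the focal-length metrics can be evaluated exactly with rational arithmetic, which sidesteps the significant-figure concern. The following is a minimal sketch for thin lenses in air ($n = n' = 1$); the function names are assumptions made for illustration, and the sample values are the focal lengths and separation of the two-lens example used in these notes.

```python
from fractions import Fraction

def two_lens_matrix(f1, f2, t):
    """Vertex-to-vertex matrix elements (A, B, C, D) for two thin lenses in air."""
    phi1, phi2 = 1 / Fraction(f1), 1 / Fraction(f2)
    t = Fraction(t)
    A = 1 - phi1 * t
    B = t
    C = -(phi1 + phi2 - phi1 * phi2 * t)
    D = 1 - phi2 * t
    return A, B, C, D

def focal_metrics(A, B, C, D):
    """Effective, front, and back focal lengths for a system in air."""
    assert A * D - B * C == 1            # unit determinant
    return {"f_eff": -1 / C, "FFL": -D / C, "BFL": -A / C}

A, B, C, D = two_lens_matrix(100, 50, 75)    # focal lengths and separation in mm
print(focal_metrics(A, B, C, D))
```

An afocal system has $C = 0$, so this sketch would raise a `ZeroDivisionError` there; handling that case is left out for brevity.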

3.2.4 Example 1: System of Two Positive Thin Lenses

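To illustrate, consider the system of two thin lenses in the last section with $f_1 = +100$ mm, $f_2 = +50$ mm, and $t = 75$ mm, which we showed to have $f_{\text{eff}} = +\frac{200}{3}$ mm $\cong 66.7$ mm. The system matrix is:

$$
\begin{aligned}
\mathcal{M}_{VV'} &= \begin{bmatrix} 1 - \phi_1 t & t \\ -(\phi_1 + \phi_2 - \phi_1 \phi_2 t) & 1 - \phi_2 t \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \\
&= \begin{bmatrix} 1 & 0 \\ -\frac{1}{50 \text{ mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & 75 \text{ mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\frac{1}{100 \text{ mm}} & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{4} & 75 \text{ mm} \\ -\frac{3}{200 \text{ mm}} & -\frac{1}{2} \end{bmatrix}
\end{aligned}
$$

and its determinant evaluates to one:

$$\det \begin{bmatrix} \frac{1}{4} & 75 \text{ mm} \\ -\frac{3}{200 \text{ mm}} & -\frac{1}{2} \end{bmatrix} = 1$$

From the values in the last section, we can see that

$$B = 75 \text{ mm} = t \qquad\qquad -\frac{1}{C} = \frac{200}{3} \text{ mm} = f_{\text{eff}}$$

which in turn demonstrates our old result that the power of a two-lens system is:

$$C = -\frac{1}{f_{\text{eff}}} \implies \phi = \phi_1 + \phi_2 - \phi_1 \phi_2 t = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 f_2}$$

The input ray matrix consists of the provisional marginal and chief rays at the object, which “pass through” the transfer matrix from object to front surface. For example, if the object is located 1000 mm from the front vertex, the transfer matrix is:

$$\mathcal{T}_0 = \begin{bmatrix} 1 & 1000 \text{ mm} \\ 0 & 1 \end{bmatrix}$$

If a ray is “cast out” from the center of the object ($y = 0$) at an angle of 1 radian, the ray at the front vertex is:

$$\mathcal{T}_0 \begin{bmatrix} y \\ nu \end{bmatrix} = \mathcal{T}_0 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix} = \begin{bmatrix} 1000 \text{ mm} \\ 1 \end{bmatrix}$$

In words, the height of the provisional marginal ray at the front vertex is 1000 mm and the angle is 1 radian, a HUGE angle, but remember that all equations in this paraxial assumption are linear, so the angle and ray height can be scaled to any value. The emerging provisional marginal ray is:

$$\begin{bmatrix} \frac{1}{4} & 75 \text{ mm} \\ -\frac{3}{200 \text{ mm}} & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} 1000 \text{ mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 325 \text{ mm} \\ -\frac{31}{2} \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix}$$

In words, the marginal ray from an object 1000 mm in front of the lens, cast at an angle of 1 radian, emerges from the image-space vertex with height $y' = 325$ mm and angle $n'u' = -\frac{31}{2}$ radians. To find the location of the image, find the distance until the marginal ray height is $y = 0$:

$$
\begin{aligned}
\mathcal{T}_{V'O'} \begin{bmatrix} 325 \text{ mm} \\ -\frac{31}{2} \end{bmatrix} &= \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 325 \text{ mm} \\ -\frac{31}{2} \end{bmatrix} = \begin{bmatrix} 0 \\ -\frac{31}{2} \end{bmatrix} \\
\implies 325 \text{ mm} + \left( -\frac{31}{2} \right) \cdot \left( \frac{t'}{n'} \right) &= 0 \\
\implies \frac{t'}{1} &= 325 \text{ mm} \cdot \left( +\frac{2}{31} \right) = +\frac{650}{31} \text{ mm} \cong +20.97 \text{ mm}
\end{aligned}
$$

which agrees with the result obtained earlier. We observed that the transverse magnification of the image in this configuration is

$$M_T = -\frac{z'}{z} = -\frac{\overline{H'O'}}{\overline{OH}} = -\frac{2}{31} \cong -0.064$$

so the provisional marginal ray at the image point is:

$$\begin{bmatrix} y' \\ n'u' \end{bmatrix} = \begin{bmatrix} 0 \\ -\frac{31}{2} \end{bmatrix} = \begin{bmatrix} 0 \\ (M_T)^{-1} \end{bmatrix}$$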

The marginal ray out of the vertex-to-vertex matrix for the object distance $\overline{OV} = 1000$ mm.
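The image location and magnification found above can be checked by tracing the provisional marginal ray with exact fractions. This is a minimal sketch assuming $n = n' = 1$; the helper names are illustrative and not from these notes. It relies on the result stated above that the emerging reduced angle of a ray launched with $nu = 1$ is $(M_T)^{-1}$.

```python
from fractions import Fraction

def apply(M, ray):
    """Apply a 2x2 matrix ((A, B), (C, D)) to a ray (y, nu)."""
    (A, B), (C, D) = M
    y, nu = ray
    return A * y + B * nu, C * y + D * nu

def image_from_vertex(M_vv, object_distance):
    """Trace the provisional marginal ray (y = 0, nu = 1).

    Returns (distance from rear vertex to image, transverse magnification).
    """
    to_vertex = ((1, object_distance), (0, 1))           # transfer from object to front vertex
    ray = apply(to_vertex, (Fraction(0), Fraction(1)))
    y_out, nu_out = apply(M_vv, ray)                      # through the system
    return -y_out / nu_out, 1 / nu_out                    # height reaches zero at the image

M_vv = ((Fraction(1, 4), Fraction(75)), (Fraction(-3, 200), Fraction(-1, 2)))
t2, m_t = image_from_vertex(M_vv, Fraction(1000))
print(t2, m_t)
```

A negative returned distance would indicate a virtual image on the object side of the rear vertex.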

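Back Focal Length (BFL)

The image of an object located at $\infty$ is the image-space focal point of the system. This ray enters the system with angle $nu = 0$ and arbitrary height, which we can model as $y = 1$. The emerging ray is:

$$\begin{bmatrix} \frac{1}{4} & 75 \\ -\frac{3}{200} & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{4} \\ -\frac{3}{200} \end{bmatrix}$$

The ray height is $\frac{1}{4}$ mm and the angle is $n'u' = -\frac{3}{200}$. The distance to the point where the ray height is zero is the back focal distance:

$$
\begin{aligned}
\text{BFL} = \overline{V'F'}: \quad \mathcal{T} \begin{bmatrix} \frac{1}{4} \\ -\frac{3}{200} \end{bmatrix} &= \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{1}{4} \\ -\frac{3}{200} \end{bmatrix} = \begin{bmatrix} 0 \\ -\frac{3}{200} \end{bmatrix} \\
\implies \frac{1}{4} + \left( -\frac{3}{200 \text{ mm}} \right) \cdot \left( \frac{t'}{n'} \right) &= 0 \\
\implies \frac{t'}{1} &= \frac{1}{4} \times \frac{200 \text{ mm}}{3} = \frac{100}{6} \text{ mm} \cong 16.7 \text{ mm}
\end{aligned}
$$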

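Front Focal Length (FFL): Ray Through “Reversed” System

To find the front focal distance, we can trace the “provisional” marginal ray “backwards” through the system, or trace it through the “reversed” system where the lenses are placed in the opposite order. The “reversed” system matrix is:

$$\left( \mathcal{M}_{VV'} \right)_{\text{reversed}} = \begin{bmatrix} 1 & 0 \\ -\frac{1}{100} & 1 \end{bmatrix} \begin{bmatrix} 1 & 75 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\frac{1}{50} & 1 \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} & 75 \\ -\frac{3}{200} & \frac{1}{4} \end{bmatrix}$$

Note that the “diagonal” elements of the “forward” and “reversed” vertex-to-vertex matrices are “swapped”, while the “off-diagonal” elements are identical.

If the input ray height is 1 and the angle is 0, the outgoing ray from the reversed matrix is:

$$\begin{bmatrix} -\frac{1}{2} & 75 \\ -\frac{3}{200} & \frac{1}{4} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} \text{ mm} \\ -\frac{3}{200} \end{bmatrix} \implies \text{FFL} = \overline{FV} = -\frac{\left( -\frac{1}{2} \text{ mm} \right)}{\left( -\frac{3}{200} \right)} = -\frac{100}{3} \text{ mm}$$

This is the same value as $-\frac{D}{C}$ for the forward matrix. The magnitude of the front focal distance is $\frac{100}{3}$ mm; the negative sign indicates that the object-space focal point lies between the two lenses rather than in front of the first lens.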

3.2.5 Example 2: Telephoto Lens

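To illustrate, we apply the vertex-to-vertex matrix for the thin-lens telephoto considered in the last section with $f_1 = +100$ mm, $f_2 = -25$ mm, and $t = +80$ mm:

$$
\begin{aligned}
\mathcal{M}_{VV'} &= \begin{bmatrix} 1 & 0 \\ -\frac{1}{-25 \text{ mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & +80 \text{ mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\frac{1}{100 \text{ mm}} & 1 \end{bmatrix} \\
&= \begin{bmatrix} \frac{1}{5} & 80 \text{ mm} \\ -\frac{1}{500 \text{ mm}} & \frac{21}{5} \end{bmatrix} = \begin{bmatrix} 1 - \phi_1 t & t \\ -(\phi_1 + \phi_2 - \phi_1 \phi_2 t) & 1 - \phi_2 t \end{bmatrix}
\end{aligned}
$$

$$
\begin{aligned}
\implies f_{\text{eff}} &= -\frac{1}{C} = +500 \text{ mm} \\
\implies \text{BFL} &= -\frac{A}{C} = -\left( \frac{1}{5} \right) \cdot (-500 \text{ mm}) = +100 \text{ mm} \\
\implies \text{FFL} &= -\frac{D}{C} = -\left( \frac{21}{5} \right) \cdot (-500 \text{ mm}) = +2100 \text{ mm} \\
\implies \frac{\overline{VH}}{n} &= \frac{D - 1}{C} = \left( \frac{21}{5} - 1 \right) \cdot (-500 \text{ mm}) = -1600 \text{ mm} \implies \overline{HV} = +1600 \text{ mm} \\
\implies \frac{\overline{H'V'}}{n'} &= \frac{A - 1}{C} = \left( \frac{1}{5} - 1 \right) \cdot (-500 \text{ mm}) = +400 \text{ mm}
\end{aligned}
$$

Both principal points therefore lie in front of the system: $H$ is 1600 mm ahead of the front vertex and $H'$ is 400 mm ahead of the rear vertex. If the object is located 1000 mm from the first surface, the ray at the front vertex of the system is:

$$\mathcal{T}_0 \begin{bmatrix} y \\ nu \end{bmatrix} = \mathcal{T}_0 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 1000 \text{ mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1000 \text{ mm} \\ 1 \end{bmatrix}$$

The height of the provisional marginal ray at the front vertex is 1000 mm and the angle is 1 radian, which are huge values, but can be scaled to any value because all equations are linear.

$$\begin{bmatrix} \frac{1}{5} & 80 \text{ mm} \\ -\frac{1}{500 \text{ mm}} & \frac{21}{5} \end{bmatrix} \begin{bmatrix} 1000 \text{ mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 280 \text{ mm} \\ \frac{11}{5} \end{bmatrix} = \begin{bmatrix} y \\ nu \end{bmatrix}$$

In words, the marginal ray from an object 1000 mm in front of the lens emerges with height 280 mm and angle of $+\frac{11}{5}$ radians.

To find the location of the image, find the distance until the marginal ray height is $y = 0$:

$$
\begin{aligned}
\mathcal{T}_{V'O'} \begin{bmatrix} 280 \text{ mm} \\ \frac{11}{5} \end{bmatrix} &= \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 280 \text{ mm} \\ \frac{11}{5} \end{bmatrix} = \begin{bmatrix} 0 \\ \frac{11}{5} \end{bmatrix} \\
\implies 280 \text{ mm} + \left( +\frac{11}{5} \right) \cdot \left( \frac{t'}{n'} \right) &= 0 \\
\implies \frac{t'}{1} &= 280 \text{ mm} \cdot \left( -\frac{5}{11} \right) = -\frac{1400}{11} \text{ mm} \cong -127.3 \text{ mm}
\end{aligned}
$$

which indicates that the image is virtual. (Figure out why!) The magnification of the image in this configuration is

$$M_T = -\frac{z'}{z} = -\frac{\overline{H'O'}}{\overline{OH}} = -\frac{\left( 400 - \frac{1400}{11} \right) \text{ mm}}{(1000 - 1600) \text{ mm}} = +\frac{5}{11} \cong +0.45$$

which is the reciprocal of the emerging marginal ray angle $\frac{11}{5}$.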

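3.2.6 $\mathcal{M}_{VV'}$ Derived From Two Rays

Consider the action of the vertex-vertex matrix on two rays that we know both before and after the system. For two arbitrary (but noncollinear) rays, we have:

$$\mathcal{M}_{VV'} \begin{bmatrix} y_1 \\ nu_1 \end{bmatrix} = \begin{bmatrix} y_1' \\ n'u_1' \end{bmatrix} \qquad\qquad \mathcal{M}_{VV'} \begin{bmatrix} y_2 \\ nu_2 \end{bmatrix} = \begin{bmatrix} y_2' \\ n'u_2' \end{bmatrix}$$

In actual use, the marginal ray and chief ray are the rays of choice. The marginal ray goes from the center of the object to the center of the image while grazing the edge of the aperture stop (and therefore the edge of the entrance and exit pupils), while the chief ray goes from the edge of the object through the center of the aperture stop (and therefore of the pupils) to the edge of the image. The vertex-vertex matrix applied to the incoming marginal ray from the center of the object yields the emerging marginal ray:

$$\mathcal{M}_{VV'} \begin{bmatrix} y \\ nu \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix}$$

and the same relation for the chief ray is:

$$\mathcal{M}_{VV'} \begin{bmatrix} \bar{y} \\ n\bar{u} \end{bmatrix} = \begin{bmatrix} \bar{y}' \\ n'\bar{u}' \end{bmatrix}$$

We can combine the two vectors to form a $2 \times 2$ matrix:

$$
\begin{aligned}
\mathcal{M}_{VV'} \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} &= \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix} \\
\mathcal{M}_{VV'}\, L &= L'
\end{aligned}
$$

We can now use the properties of the $2 \times 2$ matrix to derive the form of the vertex-vertex matrix:

$$
\begin{aligned}
\left( \mathcal{M}_{VV'} L \right) L^{-1} &= L' L^{-1} \\
\left( \mathcal{M}_{VV'} L \right) L^{-1} &= \mathcal{M}_{VV'} \left( L L^{-1} \right) = \mathcal{M}_{VV'} \cdot I = \mathcal{M}_{VV'} \\
\implies \mathcal{M}_{VV'} &= L' L^{-1}
\end{aligned}
$$

In words, we can evaluate the vertex-vertex matrix from its action on the marginal and chief rays. The inverse of the input-ray matrix is easy to derive:

$$
\begin{aligned}
L &= \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} \\
\implies L^{-1} &= \frac{1}{\det L} \cdot \begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix} \\
&= \frac{1}{y \cdot n\bar{u} - \bar{y} \cdot nu} \begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix} \\
&\equiv \frac{1}{\aleph} \begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix}
\end{aligned}
$$

where $\aleph \equiv y \cdot n\bar{u} - \bar{y} \cdot nu$ is the previously defined Lagrange invariant. So the vertex-vertex matrix has the form:

$$
\begin{aligned}
\mathcal{M}_{VV'} &= \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix} \cdot \left( \frac{1}{y \cdot n\bar{u} - \bar{y} \cdot nu} \right) \begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix} \\
&= \frac{1}{\aleph} \cdot \left( \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix} \begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix} \right) \\
&= \frac{1}{\aleph} \cdot \begin{bmatrix} y' \cdot n\bar{u} - \bar{y}' \cdot nu & y \cdot \bar{y}' - \bar{y} \cdot y' \\ n'u' \cdot n\bar{u} - n'\bar{u}' \cdot nu & n'\bar{u}' \cdot y - n'u' \cdot \bar{y} \end{bmatrix} \\
&= \frac{1}{\aleph} \cdot \begin{bmatrix} \begin{vmatrix} y' & \bar{y}' \\ nu & n\bar{u} \end{vmatrix} & \begin{vmatrix} y & \bar{y} \\ y' & \bar{y}' \end{vmatrix} \\[2ex] -\begin{vmatrix} nu & n\bar{u} \\ n'u' & n'\bar{u}' \end{vmatrix} & \begin{vmatrix} y & \bar{y} \\ n'u' & n'\bar{u}' \end{vmatrix} \end{bmatrix}
\end{aligned}
$$

where we have used the shorthand notation for the determinant in the last expression:

$$\det \begin{bmatrix} y' & \bar{y}' \\ nu & n\bar{u} \end{bmatrix} = \begin{vmatrix} y' & \bar{y}' \\ nu & n\bar{u} \end{vmatrix}$$

3.3 Object-to-Image (Conjugate) Matrix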

The vertex-vertex matrix applied to a “test ray” with height $y$ and angle $u$ in index $n$ from the object to the front vertex is:

$$
\begin{aligned}
\mathcal{M}_{VV'} \begin{bmatrix} y \\ nu \end{bmatrix} &= \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} y \\ nu \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix} \\
y' &= A \cdot y + B \cdot (nu) \\
n'u' &= C \cdot y + D \cdot (nu)
\end{aligned}
$$

For rays emerging from one plane and converging to the corresponding “conjugate” plane (the image), the output ray height at the image is a function ONLY of the object ray height; the angles of all rays at the object do not matter, since they all converge to the image. In mathematical terms:

$$
\begin{aligned}
y' &= A y + B \cdot (nu) = f[y] \quad \text{(does not depend on angle)} \\
\implies B &= 0 \\
\implies y' &= A \cdot y
\end{aligned}
$$

We know the relationship between $y'$ and $y$ is the transverse magnification:

$$\frac{y'}{y} = M_T = A$$

Rays $(a, b, c)$ diverge from the object and converge as $(a', b', c')$ to form the image; the choice of specific ray angle at the object has no effect on the location of the convergence, since only the heights of the rays at the object matter.

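If we define the angular magnification to be the ratio of the angles “from” the object and “to” the image:

$$M_\theta \equiv \frac{\Delta u'}{\Delta u}$$

we can find a relationship from the matrices:

$$
\begin{aligned}
n'u_1' &= C \cdot y + D \cdot (nu_1) \\
n'u_2' &= C \cdot y + D \cdot (nu_2)
\end{aligned}
$$

Evaluate the difference of these:

$$
\begin{aligned}
n'\,(u_2' - u_1') &= C \cdot y - C \cdot y + D \cdot (nu_2 - nu_1) \\
n' \cdot \Delta u' &= n \cdot D \cdot (\Delta u) \\
\implies \frac{\Delta u'}{\Delta u} &\equiv M_\theta = \frac{n}{n'} \cdot D \\
\implies D &= \frac{n'}{n} \cdot M_\theta
\end{aligned}
$$

We can combine these two observations to see the form of the “conjugate-to-conjugate” matrix:

$$\mathcal{M}_{OO'} = \begin{bmatrix} M_T & 0 \\ -\frac{1}{f_{\text{eff}}} & \frac{n'}{n} \cdot M_\theta \end{bmatrix}$$

We know that the determinant of this matrix must also be one, which implies that:

$$M_T \cdot \frac{n'}{n} M_\theta = 1 \implies \frac{n'}{n} M_\theta = \frac{1}{M_T}$$

so we can also write the conjugate matrix as:

$$\mathcal{M}_{OO'} = \begin{bmatrix} M_T & 0 \\ -\frac{1}{f_{\text{eff}}} & \frac{1}{M_T} \end{bmatrix}$$

The principal planes $H$ and $H'$ are those for which $M_T = +1$:

$$\mathcal{M}_{HH'} = \begin{bmatrix} +1 & 0 \\ -\frac{1}{f_{\text{eff}}} & +1 \end{bmatrix}$$

The points of equal conjugates are related by $M_T = -1$, so the object-image matrix for these points is:

$$\mathcal{M}_{OO'} = \begin{bmatrix} -1 & 0 \\ -\frac{1}{f_{\text{eff}}} & -1 \end{bmatrix}$$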

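We can include the translation matrices from object to vertex and from vertex to image along with the vertex-to-vertex matrix $\mathcal{M}_{VV'}$:

$$\mathcal{M}_{VV'} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$$

The matrix that relates two conjugate planes (object $O$ and image $O'$) may be obtained by adding transfer matrices for the appropriate distances from the object to the front vertex, $t_1 = n_1 \cdot \left( \overline{OV} \right)$, and from the rear vertex to the image, $t_2 = n_2 \cdot \left( \overline{V'O'} \right)$, which yields for $n_1 = n_2 = 1$:

$$
\begin{aligned}
\mathcal{M}_{OO'} &= \begin{bmatrix} 1 & t_2 \\ 0 & 1 \end{bmatrix} \cdot \mathcal{M}_{VV'} \cdot \begin{bmatrix} 1 & t_1 \\ 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 1 & t_2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} 1 & t_1 \\ 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} A + t_2 C & (A + t_2 C)\, t_1 + B + t_2 D \\ C & C t_1 + D \end{bmatrix} \\
&= \begin{bmatrix} M_T & 0 \\ -\phi & \frac{1}{M_T} \end{bmatrix}
\end{aligned}
$$

$$
\begin{aligned}
\implies M_T &= A + t_2 C = (C t_1 + D)^{-1} \\
\phi &= -C \\
0 &= (A + t_2 C)\, t_1 + B + t_2 D
\end{aligned}
$$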

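We know that the marginal ray heights at the object and image are zero ($y_{in} = y_{out} = 0$), which sets some limits on the “conjugate-to-conjugate” matrix. Apply this matrix to the ray matrix $\mathcal{L}$ at the object and at the image, where the first column is the marginal ray and the second column (marked with bars) is the chief ray:

\[ \mathcal{M}_{OO'}\,\mathcal{L} = \mathcal{L}' \]

\[ \begin{bmatrix} A + t_2 C & (A + t_2 C)\,t_1 + B + t_2 D \\ C & C t_1 + D \end{bmatrix} \begin{bmatrix} 0 & \bar{y}_{in} \\ (nu)_{in} & (n\bar{u})_{in} \end{bmatrix} = \begin{bmatrix} 0 & \bar{y}_{out} \\ (nu)_{out} & (n\bar{u})_{out} \end{bmatrix} \]

Evaluate the inverse matrix $\mathcal{L}^{-1}$ and apply to both sides from the right:

\[ \left(\mathcal{M}_{OO'}\,\mathcal{L}\right)\mathcal{L}^{-1} = \mathcal{L}'\,\mathcal{L}^{-1} \]

\[ \begin{aligned} \mathcal{M}_{OO'} &= \begin{bmatrix} 0 & \bar{y}_{out} \\ (nu)_{out} & (n\bar{u})_{out} \end{bmatrix} \cdot \begin{bmatrix} 0 & \bar{y}_{in} \\ (nu)_{in} & (n\bar{u})_{in} \end{bmatrix}^{-1} \\ &= \begin{bmatrix} \dfrac{\bar{y}_{out}}{\bar{y}_{in}} & 0 \\ \dfrac{(n\bar{u})_{out}\cdot(nu)_{in} - (nu)_{out}\cdot(n\bar{u})_{in}}{\bar{y}_{in}\,(nu)_{in}} & \dfrac{(nu)_{out}}{(nu)_{in}} \end{bmatrix} \end{aligned} \]

The ratio of the chief ray heights at the object and image is the transverse magnification $\left(\dfrac{\bar{y}_{out}}{\bar{y}_{in}} \equiv M_T\right)$, whereas the ratio of the marginal ray angles is $\dfrac{(nu)_{out}}{(nu)_{in}} = \dfrac{1}{M_T}$.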

Example: System with Two Positive Thin Lenses

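Again, consider the example of a system composed of two thin lenses with $f_1 = +100$ mm, $f_2 = +50$ mm, and $t = +75$ mm:

\[ \mathcal{M}_{VV'} = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{50\ \mathrm{mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & 75\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{100\ \mathrm{mm}} & 1 \end{bmatrix} = \begin{bmatrix} \dfrac{1}{4} & 75\ \mathrm{mm} \\ -\dfrac{3}{200\ \mathrm{mm}} & -\dfrac{1}{2} \end{bmatrix} \]

From the table of properties of the matrix, we see that:

\[ \begin{aligned} f_{\mathrm{eff}} &= -\frac{1}{C} = +\frac{200}{3}\ \mathrm{mm} \\ \mathrm{FFL} = \overline{FV} &= -\frac{D}{C} = -\frac{100}{3}\ \mathrm{mm} \\ \mathrm{BFL} = \overline{V'F'} &= -\frac{A}{C} = +\frac{50}{3}\ \mathrm{mm} \\ \overline{VH} &= \frac{D - 1}{C} = +100\ \mathrm{mm} \\ \overline{H'V'} &= \frac{A - 1}{C} = +50\ \mathrm{mm} \end{aligned} \]

which again match the results obtained before. The matrix that relates the object and image planes for the two-lens system presented above (object distance $1000$ mm, image distance $\tfrac{650}{31}$ mm) is:

\[ \mathcal{T}_2\,\mathcal{M}_{VV'}\,\mathcal{T}_1 = \begin{bmatrix} 1 & \dfrac{650}{31} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \dfrac{1}{4} & 75 \\ -\dfrac{3}{200} & -\dfrac{1}{2} \end{bmatrix} \begin{bmatrix} 1 & 1000 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -\dfrac{2}{31} & 0 \\ -\dfrac{3}{200} & -\dfrac{31}{2} \end{bmatrix} \]

which has the form of the principal plane matrix except the diagonal elements are not both unity. However, note that they are reciprocals of each other, so that

\[ \det \begin{bmatrix} -\dfrac{2}{31} & 0 \\ -\dfrac{3}{200} & -\dfrac{31}{2} \end{bmatrix} = 1 \]

We had evaluated the transverse magnification in this configuration to be $M_T = -\tfrac{2}{31}$, so we note that the upper-left component of the conjugate-to-conjugate matrix is the transverse magnification. The general form of a conjugate-to-conjugate matrix is:

\[ \mathcal{M}_{OO'} = \begin{bmatrix} M_T & 0 \\ -\phi & \dfrac{1}{M_T} \end{bmatrix} \]

and the specific form that relates the principal planes with $M_T = 1$ is

\[ \mathcal{M}_{HH'} = \begin{bmatrix} 1 & 0 \\ -\phi & 1 \end{bmatrix} \]

This is the matrix of the equivalent “single thin lens.”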
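
As a hedged numerical check of this example, the short Python sketch below rebuilds the vertex-to-vertex matrix from the values quoted in the text ($f_1 = 100$ mm, $f_2 = 50$ mm, $t = 75$ mm, object distance $1000$ mm) and solves the condition $B = 0$ for the image distance. The helper names `matmul`, `thin_lens`, and `translation` are illustrative choices that do not appear in the notes, and the use of exact rational arithmetic is an assumption made only to avoid rounding.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product a·b of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def thin_lens(f):
    """Refraction matrix of a thin lens in air with focal length f [mm] (hypothetical helper)."""
    return ((F(1), F(0)), (F(-1) / f, F(1)))

def translation(t):
    """Transfer matrix over a distance t [mm] in air (hypothetical helper)."""
    return ((F(1), F(t)), (F(0), F(1)))

# Vertex-to-vertex matrix: L1 (f = 100 mm), 75 mm gap, L2 (f = 50 mm).
m_vv = matmul(thin_lens(50), matmul(translation(75), thin_lens(100)))
(A, B), (C, D) = m_vv

# Conjugate condition: (A + t2*C)*t1 + B + t2*D = 0, solved for t2.
t1 = F(1000)
t2 = -(A * t1 + B) / (C * t1 + D)

# Object-to-image matrix T2 · M_VV' · T1.
m_oo = matmul(translation(t2), matmul(m_vv, translation(t1)))

print("image distance V'O' =", t2, "mm")
print("M_T =", m_oo[0][0], "and 1/M_T =", m_oo[1][1])
```

Under these assumptions the image distance comes out as $\tfrac{650}{31}$ mm and the diagonal elements as $-\tfrac{2}{31}$ and $-\tfrac{31}{2}$, in agreement with the conjugate matrix above.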

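3.3.1 Matrix of the “Relaxed” Eye (focused at ∞)

The vertex-to-vertex matrix for the three refractions and two transfers is:

\[ \mathcal{M}_{VV'} = \begin{bmatrix} 1 & 0 \\ -\phi_3 & 1 \end{bmatrix} \begin{bmatrix} 1 & \dfrac{t'_2}{n'_2} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\phi_2 & 1 \end{bmatrix} \begin{bmatrix} 1 & \dfrac{t'_1}{n'_1} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\phi_1 & 1 \end{bmatrix} \]

where the individual terms evaluate to:

\[ \begin{aligned} \phi_1 &= \frac{n'_1 - n_1}{R_1} = \frac{1.336 - 1}{7.8\ \mathrm{mm}} = 4.3077\times10^{-2}\ \mathrm{mm}^{-1} = 43.077\ \mathrm{m}^{-1} = 43.077\ \text{diopters} \\ \frac{t'_1}{n'_1} &= \frac{3.6\ \mathrm{mm}}{1.336} = 2.6946\ \mathrm{mm} \end{aligned} \]

\[ \begin{aligned} \phi_2 &= \frac{n'_2 - n_2}{R_2} = \frac{1.413 - 1.336}{10\ \mathrm{mm}} = 0.77\times10^{-2}\ \mathrm{mm}^{-1} = 7.7\ \text{diopters} \\ \frac{t'_2}{n'_2} &= \frac{3.6\ \mathrm{mm}}{1.413} = 2.5478\ \mathrm{mm} \end{aligned} \]

\[ \phi_3 = \frac{n'_3 - n_3}{R_3} = \frac{1.336 - 1.413}{-6\ \mathrm{mm}} = 1.2833\times10^{-2}\ \mathrm{mm}^{-1} = 12.833\ \text{diopters} \]

so the vertex-to-vertex matrix has the form:

\[ \mathcal{M}_{VV'} = \begin{bmatrix} 0.75683 & 5.1895\ \mathrm{mm} \\ -5.9596\times10^{-2}\ \mathrm{mm}^{-1} & 0.91265 \end{bmatrix} \]

\[ \begin{aligned} \implies f_{eye} &= \left(5.9596\times10^{-2}\ \mathrm{mm}^{-1}\right)^{-1} = +16.780\ \mathrm{mm} \\ \implies \phi_{eye} &= 5.9596\times10^{-2}\ \mathrm{mm}^{-1} = 59.596\ \mathrm{m}^{-1} \cong 60\ \text{diopters} \end{aligned} \]

A ray from infinity has a ray angle of zero, but the ray height is determined from the diameter of the iris. If we assume that the iris diameter is 1 mm, then the output ray vector is:

\[ \begin{bmatrix} 0.75683 & 5.1895\ \mathrm{mm} \\ -5.9596\times10^{-2}\ \mathrm{mm}^{-1} & 0.91265 \end{bmatrix} \begin{bmatrix} 1\ \mathrm{mm} \\ 0 \end{bmatrix} = \begin{bmatrix} 0.75683\ \mathrm{mm} \\ -5.9596\times10^{-2} \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix} \]

3.4 Vertex-Vertex Matrices of Simple Imaging Systems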

We now get to where the “rubber meets the road;” the discussion of simple examples of actual imaging systems. It is useful to emphasize the point that optical systems may create a real image that may be “sensed” by a CCD or photographic emulsion, while those for human viewing will produce virtual images or are afocal (image at infinity).

3.4.1 Magnifier (“magnifying glass,” “loupe”)

The magnifier or loupe is a lens (or system of lenses) with positive focal length that is used to form a larger image on the retina than could be formed with the eye alone. Recall that when the ciliary muscles that deform the eye lens are relaxed, the lens becomes “flatter,” increasing the focal length. To view an object “close up,” the focal length of the lens must shorten by making the lens more spherical. The closest distance to an object that appears to be sharply focused by the unaided eye is the “near point,” which depends on the flexibility of the deformable eyelens and the capability of the ciliary muscles; both vary from individual to individual, and with age for a single individual. The distance to the near point may be as close as 50 mm for a young child and 1000 mm to 2000 mm for an elderly person. This reduction in “accommodation” is one of the signs of aging. The near point of an “ideal” eye is assumed to be 250 mm ≅ 10 in from the front surface. For nearsighted individuals, the near point is closer to the eye, thus increasing the angular subtense of fine details for those individuals. For this reason, nearsighted individuals in ancient times (before optical correction) often were attracted to professions requiring fine work, such as goldsmithing. Since nearsightedness can be a genetic trait, descendants often continued in these crafts.

In use, the object is held closer to the eye than the near point and viewed through the positive lens, which in turn is held closer to the eye than its focal length to create a virtual image “behind” the lens at the near point. If the focal length of the magnifying lens is $f = 50$ mm, the image distance is $z_2 = -250$ mm (a virtual image at the near point), and the object distance is $z$, the object-to-image matrix is:

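\[ \begin{aligned} \mathcal{M}_{OO'} &= \begin{bmatrix} 1 & -250\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{50\ \mathrm{mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & z\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \\ &= \begin{bmatrix} 6 & (6\cdot z - 250)\ \mathrm{mm} \\ -\dfrac{1}{50\ \mathrm{mm}} & 1 - \dfrac{1}{50}z \end{bmatrix} \end{aligned} \]

Since this has the form of an “object-to-image” matrix, the off-diagonal element in the upper-right corner must evaluate to zero:

\[ (6\cdot z - 250)\ \mathrm{mm} = 0 \implies z = \frac{250\ \mathrm{mm}}{6} = 41\tfrac{2}{3}\ \mathrm{mm} \]

The diagonal element in the upper-left corner of the “object-to-image” matrix is the transverse magnification:

\[ M_T = +6 = 1 + \frac{250\ \mathrm{mm}}{f} \]

This is the transverse magnification of the magnifier if the image is at the near point. If the object is located at the object-space focal point, then the image is at infinity: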

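\[ \mathcal{M}_{OO'} = \begin{bmatrix} 1 & \infty\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{50\ \mathrm{mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & 50\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} \infty & 0 \\ -\dfrac{1}{50\ \mathrm{mm}} & 0 \end{bmatrix} \]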
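
The near-point calculation for the magnifier can be sketched in a few lines of Python. This is only an illustration under the values used above ($f = 50$ mm, virtual image at $-250$ mm); the helper functions `matmul`, `thin_lens`, and `translation` are hypothetical names introduced here, not part of the notes.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product a·b of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def thin_lens(f):
    """Thin-lens refraction matrix, focal length f [mm] (hypothetical helper)."""
    return ((F(1), F(0)), (F(-1) / f, F(1)))

def translation(t):
    """Transfer matrix over distance t [mm] (hypothetical helper)."""
    return ((F(1), F(t)), (F(0), F(1)))

f = 50            # focal length of the magnifier [mm]
z_image = -250    # virtual image at the near point [mm]

# Lens followed by the transfer to the image plane.
partial = matmul(translation(z_image), thin_lens(f))
(A, B), (C, D) = partial

# Appending the object-side transfer T(z) gives upper-right element A*z + B,
# which must vanish for conjugate planes.
z_object = -B / A
m_oo = matmul(partial, translation(z_object))

print("object distance =", z_object, "mm")   # 125/3 mm, i.e. 41 2/3 mm
print("transverse magnification =", m_oo[0][0])
```

The lower-right element of `m_oo` evaluates to $\tfrac{1}{6}$, the reciprocal of the magnification, as required for a conjugate matrix with unit determinant.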

3.4.2 Galilean Telescope of Thin Lenses

The Galilean telescope is an afocal system formed from an objective lens with positive power and an eyelens with negative power separated by the sum of the focal lengths. If the focal lengths of the objective and eyelens are $f_1 = +200$ and $f_2 = -25$ units, the separation is $t = (200 - 25) = 175$ units. The system matrix is:

\[ \mathcal{M}_{VV'} = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(-25\ \mathrm{mm})} & 1 \end{bmatrix} \begin{bmatrix} 1 & 175\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(+200\ \mathrm{mm})} & 1 \end{bmatrix} = \begin{bmatrix} \dfrac{1}{8} & 175\ \mathrm{mm} \\ 0 & 8 \end{bmatrix} \]

Note that the system power $\phi = 0 \implies f_{\mathrm{eff}} = \infty$, as it must be for an afocal system (both object- and image-space focal points at infinity). The ray from an object at $\infty$ with unit height generates the outgoing ray:

\[ \begin{bmatrix} \dfrac{1}{8} & 175\ \mathrm{mm} \\ 0 & 8 \end{bmatrix} \begin{bmatrix} 1\ \mathrm{mm} \\ 0 \end{bmatrix} = \begin{bmatrix} y'\ [\mathrm{mm}] \\ n'u' \end{bmatrix} = \begin{bmatrix} \dfrac{1}{8}\ \mathrm{mm} \\ 0 \end{bmatrix} \]

so the outgoing ray is at height $\tfrac{1}{8}$ and the angle is zero; both incoming and outgoing rays are parallel to the axis. Note that the diagonal elements of $\mathcal{M}_{VV'}$ are positive and the determinant is 1.

For a “provisional” chief ray into the system with height 0 and angle 1, the outgoing ray is:

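\[ \begin{bmatrix} \dfrac{1}{8} & 175\ \mathrm{mm} \\ 0 & 8 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} y'\ [\mathrm{mm}] \\ n'u' \end{bmatrix} = \begin{bmatrix} 175\ \mathrm{mm} \\ 8 \end{bmatrix} \]

So the outgoing ray angle is 8 times larger; this is the angular magnification of the telescope. The image is upright since the incoming and outgoing ray angles are both positive. The form of the matrix of an afocal system (between conjugate reference planes, for which $B = 0$) is:

\[ \mathcal{M}\,(\text{afocal system}) = \begin{bmatrix} \dfrac{1}{m_\theta} & 0 \\ 0 & m_\theta \end{bmatrix} \]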

3.4.3 Keplerian Telescope of Thin Lenses

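Consider the Keplerian telescope with $f_1 = +200$ and $f_2 = +25$ units and separation $t = (200 + 25) = 225$ units. The system matrix is:

\[ \mathcal{M}_{VV'} = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(25\ \mathrm{mm})} & 1 \end{bmatrix} \begin{bmatrix} 1 & 225\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(+200\ \mathrm{mm})} & 1 \end{bmatrix} = \begin{bmatrix} -\dfrac{1}{8} & 225\ \mathrm{mm} \\ 0 & -8 \end{bmatrix} \]

The diagonal elements are negative, the determinant is 1, and the system power $\phi = 0 \implies f_{\mathrm{eff}} = \infty$. The lower-right element is $-8$, which specifies that the angular magnification is 8 and the image is inverted.

The ray from an object at $\infty$ with unit height generates the outgoing ray:

\[ \begin{bmatrix} -\dfrac{1}{8} & 225\ \mathrm{mm} \\ 0 & -8 \end{bmatrix} \begin{bmatrix} 1\ \mathrm{mm} \\ 0 \end{bmatrix} = \begin{bmatrix} y'\ [\mathrm{mm}] \\ n'u' \end{bmatrix} = \begin{bmatrix} -\dfrac{1}{8}\ \mathrm{mm} \\ 0 \end{bmatrix} \]

so the outgoing ray is at height $-\tfrac{1}{8}$ (the image is “inverted”) and the angle is zero.

The “provisional” chief ray into the system has height 0 and angle 1; the outgoing ray is:

\[ \begin{bmatrix} -\dfrac{1}{8} & 225\ \mathrm{mm} \\ 0 & -8 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} y'\ [\mathrm{mm}] \\ n'u' \end{bmatrix} = \begin{bmatrix} 225\ \mathrm{mm} \\ -8 \end{bmatrix} \]

So the outgoing ray angle is 8 times larger than the incoming ray angle but negative (which implies that the image is inverted).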
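
The two telescope matrices can be compared side by side with a brief Python sketch. It assumes the focal lengths used above ($f_1 = 200$, $f_2 = \mp 25$ units) and thin lenses in air; the function names, including `telescope`, are illustrative and not taken from the notes.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product a·b of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def apply(m, ray):
    """Apply a 2x2 system matrix to a ray (y, nu)."""
    (a, b), (c, d) = m
    y, nu = ray
    return (a * y + b * nu, c * y + d * nu)

def thin_lens(f):
    return ((F(1), F(0)), (F(-1) / f, F(1)))

def translation(t):
    return ((F(1), F(t)), (F(0), F(1)))

def telescope(f1, f2):
    """Two thin lenses separated by f1 + f2, i.e. an afocal arrangement (hypothetical helper)."""
    return matmul(thin_lens(f2), matmul(translation(f1 + f2), thin_lens(f1)))

galilean = telescope(200, -25)
keplerian = telescope(200, 25)

for name, m in (("Galilean", galilean), ("Keplerian", keplerian)):
    (A, B), (C, D) = m
    chief_out = apply(m, (F(0), F(1)))   # provisional chief ray: height 0, angle 1
    print(name, "| power -C =", -C, "| angular magnification D =", D,
          "| chief ray out =", chief_out)
```

In both cases the element $C$ vanishes, and the sign of $D$ distinguishes the upright Galilean image ($+8$) from the inverted Keplerian image ($-8$).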

3.4.4 Thick Lenses

The matrix method is convenient for thick lenses. Consider a thick lens made of glass with $n' = 1.5$, radii of curvature $R_1 = +50$ mm and $R_2 = -100$ mm, and thickness $t'$ (which we shall vary). It is useful to evaluate the focal length of the single “thin” lens with these radii and refractive index from the lensmaker’s equation:

\[ \begin{aligned} \frac{1}{f} &= (n - 1)\cdot\left(\frac{1}{R_1} - \frac{1}{R_2}\right) \\ f &= \left((1.5 - 1.0)\cdot\left(\frac{1}{50\ \mathrm{mm}} - \frac{1}{-100\ \mathrm{mm}}\right)\right)^{-1} = +\frac{200}{3}\ \mathrm{mm} = 66\tfrac{2}{3}\ \mathrm{mm} \end{aligned} \]

The powers of the two surfaces are:

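\[ \begin{aligned} \phi_1 &= \frac{n' - n}{R_1} = \frac{1.5 - 1}{50\ \mathrm{mm}} = +\frac{0.5}{50\ \mathrm{mm}} = +\frac{1}{100\ \mathrm{mm}} \\ \phi_2 &= \frac{n - n'}{R_2} = \frac{1 - 1.5}{-100\ \mathrm{mm}} = \frac{-0.5}{-100\ \mathrm{mm}} = +\frac{1}{200\ \mathrm{mm}} \end{aligned} \]

so if the thickness is zero, the focal length evaluates to:

\[ \begin{aligned} \phi_{\mathrm{eff}} &= \phi_1 + \phi_2 - \phi_1\cdot\phi_2\cdot t \\ &= \left(+\frac{1}{100\ \mathrm{mm}}\right) + \left(+\frac{1}{200\ \mathrm{mm}}\right) - \left(+\frac{1}{100\ \mathrm{mm}}\right)\cdot\left(+\frac{1}{200\ \mathrm{mm}}\right)\cdot 0 = \frac{3}{200\ \mathrm{mm}} \\ \frac{1}{f_{\mathrm{eff}}} &= \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1\cdot f_2} \implies f_{\mathrm{eff}} = +\frac{200}{3}\ \mathrm{mm} \end{aligned} \]

which agrees with the result obtained from the lensmaker’s equation.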

The system matrix for the lens with thickness t0 may be evaluated with this parameter:

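\[ \begin{aligned} \mathcal{M}_{VV'} &= \mathcal{R}_2\,\mathcal{T}_1\,\mathcal{R}_1 \\ &= \begin{bmatrix} 1 & 0 \\ -\left(+\dfrac{1}{200\ \mathrm{mm}}\right) & 1 \end{bmatrix} \begin{bmatrix} 1 & \dfrac{t'}{1.5}\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\left(+\dfrac{1}{100\ \mathrm{mm}}\right) & 1 \end{bmatrix} \\ &= \begin{bmatrix} 1 - 0.0066667\cdot t' & 0.6666667\cdot t'\ \mathrm{mm} \\ \dfrac{1}{100\ \mathrm{mm}}\left(0.0033333\cdot t' - 1\right) - \dfrac{1}{200\ \mathrm{mm}} & 1 - 0.0033333\cdot t' \end{bmatrix} \end{aligned} \]

Note that the thickness $t'$ is present in each of the four terms in the matrix. Now we can derive matrices for different values of the thickness: $t' = 0$ mm, 1 mm, 2 mm, 5 mm, and 10 mm, where we substitute into the table of properties to find the BFL, FFL, $\overline{VH}$, and $\overline{H'V'}$:

t' = 0 mm (thin lens)

\[ \mathcal{M}_{VV'}(t' = 0\ \mathrm{mm}) = \begin{bmatrix} 1 - 0.0066667\cdot 0 & 0.6666667\cdot 0\ \mathrm{mm} \\ \dfrac{1}{100\ \mathrm{mm}}\left(0.0033333\cdot 0 - 1\right) - \dfrac{1}{200\ \mathrm{mm}} & 1 - 0.0033333\cdot 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -\dfrac{3}{200\ \mathrm{mm}} & 1 \end{bmatrix} \]

\[ \begin{aligned} f_{\mathrm{eff}} &= -\frac{1}{C} = +\frac{200}{3}\ \mathrm{mm} = 66\tfrac{2}{3}\ \mathrm{mm} \\ \mathrm{FFL} = \overline{FV} &= -\frac{D}{C} = -\frac{1}{\left(-\frac{3}{200\ \mathrm{mm}}\right)} = +\frac{200}{3}\ \mathrm{mm} = f_{\mathrm{eff}} \\ \mathrm{BFL} = \overline{V'F'} &= -\frac{A}{C} = -\frac{1}{\left(-\frac{3}{200\ \mathrm{mm}}\right)} = +\frac{200}{3}\ \mathrm{mm} = f_{\mathrm{eff}} \\ \overline{VH} &= \frac{D - 1}{C} = \frac{(1 - 1)}{\left(-\frac{3}{200\ \mathrm{mm}}\right)} = 0\ \mathrm{mm} \\ \overline{H'V'} &= \frac{A - 1}{C} = \frac{(1 - 1)}{\left(-\frac{3}{200\ \mathrm{mm}}\right)} = 0\ \mathrm{mm} \end{aligned} \]

All quantities correspond to the values we would expect for the single thin lens: the front and back focal lengths are identical to the effective focal length, which means that the principal points coincide with the vertices; they are all located AT the lens.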

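t' = 1 mm

\[ \mathcal{M}_{VV'}(t' = 1\ \mathrm{mm}) = \begin{bmatrix} 1 - 0.0066667\cdot 1 & 0.6666667\cdot 1\ \mathrm{mm} \\ \dfrac{1}{100\ \mathrm{mm}}\left(0.0033333\cdot 1 - 1\right) - \dfrac{1}{200\ \mathrm{mm}} & 1 - 0.0033333\cdot 1 \end{bmatrix} = \begin{bmatrix} 0.99333 & 0.66667\ \mathrm{mm} \\ -\dfrac{1}{66.814\ \mathrm{mm}} & 0.99667 \end{bmatrix} \]

\[ \begin{aligned} f_{\mathrm{eff}} &= -\frac{1}{C} \cong 66.814\ \mathrm{mm} \\ \mathrm{FFL} = \overline{FV} &= -\frac{D}{C} = -\frac{0.99667}{\left(-\frac{1}{66.814\ \mathrm{mm}}\right)} = 66.592\ \mathrm{mm} \\ \mathrm{BFL} = \overline{V'F'} &= -\frac{A}{C} = -\frac{0.99333}{\left(-\frac{1}{66.814\ \mathrm{mm}}\right)} = 66.368\ \mathrm{mm} \\ \overline{VH} &= \frac{D - 1}{C} = \frac{(0.99667 - 1)}{\left(-\frac{1}{66.814\ \mathrm{mm}}\right)} = 0.2225\ \mathrm{mm} \\ \overline{H'V'} &= \frac{A - 1}{C} = \frac{(0.99333 - 1)}{\left(-\frac{1}{66.814\ \mathrm{mm}}\right)} = 0.4456\ \mathrm{mm} \end{aligned} \]

So the object- and image-space principal planes are within the lens and close to the surfaces. Note that the front and back focal lengths are slightly different: the image-space principal point is “more within the lens” since the second surface has less power than the front surface.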

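t' = 2 mm

\[ \mathcal{M}_{VV'}(t' = 2\ \mathrm{mm}) = \begin{bmatrix} 1 - 0.0066667\cdot 2 & 0.6666667\cdot 2\ \mathrm{mm} \\ \dfrac{1}{100\ \mathrm{mm}}\left(0.0033333\cdot 2 - 1\right) - \dfrac{1}{200\ \mathrm{mm}} & 1 - 0.0033333\cdot 2 \end{bmatrix} = \begin{bmatrix} 0.98667 & 1.3333\ \mathrm{mm} \\ -\dfrac{1.4933\times10^{-2}}{\mathrm{mm}} & 0.99333 \end{bmatrix} \]

\[ \begin{aligned} f_{\mathrm{eff}} &= -\frac{1}{C} = -\frac{1}{\left(-\frac{1.4933\times10^{-2}}{\mathrm{mm}}\right)} \cong 66.966\ \mathrm{mm} \\ \mathrm{FFL} = \overline{FV} &= -\frac{D}{C} = -\frac{0.99333}{\left(-\frac{1.4933\times10^{-2}}{\mathrm{mm}}\right)} = 66.519\ \mathrm{mm} \\ \mathrm{BFL} = \overline{V'F'} &= -\frac{A}{C} = -\frac{0.98667}{\left(-\frac{1.4933\times10^{-2}}{\mathrm{mm}}\right)} = 66.073\ \mathrm{mm} \\ \overline{VH} &= \frac{D - 1}{C} = \frac{(0.99333 - 1)}{\left(-\frac{1.4933\times10^{-2}}{\mathrm{mm}}\right)} = 0.4467\ \mathrm{mm} \\ \overline{H'V'} &= \frac{A - 1}{C} = \frac{(0.98667 - 1)}{\left(-\frac{1.4933\times10^{-2}}{\mathrm{mm}}\right)} = 0.8926\ \mathrm{mm} \end{aligned} \]

Note that the same “behavior” exists for this lens: the image-space principal point is farther “inside” the lens than the object-space principal point.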

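t' = 5 mm

\[ \mathcal{M}_{VV'}(t' = 5\ \mathrm{mm}) = \begin{bmatrix} 1 - 0.0066667\cdot 5 & 0.6666667\cdot 5\ \mathrm{mm} \\ \dfrac{1}{100\ \mathrm{mm}}\left(0.0033333\cdot 5 - 1\right) - \dfrac{1}{200\ \mathrm{mm}} & 1 - 0.0033333\cdot 5 \end{bmatrix} = \begin{bmatrix} 0.96667 & 3.3333\ \mathrm{mm} \\ -\dfrac{1.4833\times10^{-2}}{\mathrm{mm}} & 0.98333 \end{bmatrix} \]

\[ \begin{aligned} f_{\mathrm{eff}} &= -\frac{1}{C} = -\frac{1}{\left(-\frac{1.4833\times10^{-2}}{\mathrm{mm}}\right)} \cong 67.417\ \mathrm{mm} \\ \mathrm{FFL} = \overline{FV} &= -\frac{D}{C} = -\frac{0.98333}{\left(-\frac{1.4833\times10^{-2}}{\mathrm{mm}}\right)} = 66.293\ \mathrm{mm} \\ \mathrm{BFL} = \overline{V'F'} &= -\frac{A}{C} = -\frac{0.96667}{\left(-\frac{1.4833\times10^{-2}}{\mathrm{mm}}\right)} = 65.170\ \mathrm{mm} \\ \overline{VH} &= \frac{D - 1}{C} = \frac{(0.98333 - 1)}{\left(-\frac{1.4833\times10^{-2}}{\mathrm{mm}}\right)} = 1.1238\ \mathrm{mm} \\ \overline{H'V'} &= \frac{A - 1}{C} = \frac{(0.96667 - 1)}{\left(-\frac{1.4833\times10^{-2}}{\mathrm{mm}}\right)} = 2.247\ \mathrm{mm} \end{aligned} \]

t' = 10 mm

\[ \mathcal{M}_{VV'}(t' = 10\ \mathrm{mm}) = \begin{bmatrix} 1 - 0.0066667\cdot 10 & 0.6666667\cdot 10\ \mathrm{mm} \\ \dfrac{1}{100\ \mathrm{mm}}\left(0.0033333\cdot 10 - 1\right) - \dfrac{1}{200\ \mathrm{mm}} & 1 - 0.0033333\cdot 10 \end{bmatrix} = \begin{bmatrix} 0.93333 & 6.6667\ \mathrm{mm} \\ -\dfrac{1.4667\times10^{-2}}{\mathrm{mm}} & 0.96667 \end{bmatrix} \]

\[ \begin{aligned} f_{\mathrm{eff}} &= -\frac{1}{C} = -\frac{1}{\left(-\frac{1.4667\times10^{-2}}{\mathrm{mm}}\right)} \cong 68.180\ \mathrm{mm} \\ \mathrm{FFL} = \overline{FV} &= -\frac{D}{C} = -\frac{0.96667}{\left(-\frac{1.4667\times10^{-2}}{\mathrm{mm}}\right)} = 65.909\ \mathrm{mm} \\ \mathrm{BFL} = \overline{V'F'} &= -\frac{A}{C} = -\frac{0.93333}{\left(-\frac{1.4667\times10^{-2}}{\mathrm{mm}}\right)} = 63.635\ \mathrm{mm} \\ \overline{VH} &= \frac{D - 1}{C} = \frac{(0.96667 - 1)}{\left(-\frac{1.4667\times10^{-2}}{\mathrm{mm}}\right)} = 2.2724\ \mathrm{mm} \\ \overline{H'V'} &= \frac{A - 1}{C} = \frac{(0.93333 - 1)}{\left(-\frac{1.4667\times10^{-2}}{\mathrm{mm}}\right)} = 4.5456\ \mathrm{mm} \end{aligned} \]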

From these results, we see that the effective focal length gets LONGER as the lens gets THICKER for the same radii of curvature and that the image-space principal point “penetrates” more inside the lens as the lens thickness is increased.
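
The thickness sweep can be reproduced with a compact Python sketch. It assumes the lens parameters stated above ($n' = 1.5$, $R_1 = +50$ mm, $R_2 = -100$ mm, lens in air) and the table of matrix properties used in this section ($f_{\mathrm{eff}} = -1/C$, $\mathrm{FFL} = -D/C$, $\mathrm{BFL} = -A/C$, $\overline{VH} = (D-1)/C$, $\overline{H'V'} = (A-1)/C$). The function names `thick_lens` and `cardinal_points` are hypothetical.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product a·b of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def refraction(power):
    """Refraction matrix of a surface with power phi [1/mm]."""
    return ((F(1), F(0)), (-F(power), F(1)))

def translation(t):
    """Transfer matrix over reduced distance t = thickness / index [mm]."""
    return ((F(1), F(t)), (F(0), F(1)))

def thick_lens(n, r1, r2, t):
    """Vertex-to-vertex matrix R2 · T · R1 of a thick lens in air (hypothetical helper)."""
    n = F(n)
    phi1 = (n - 1) / F(r1)
    phi2 = (1 - n) / F(r2)
    return matmul(refraction(phi2), matmul(translation(F(t) / n), refraction(phi1)))

def cardinal_points(m):
    """Distances read from the system matrix, following the table of properties."""
    (A, B), (C, D) = m
    return {
        "f_eff": -1 / C,
        "FFL": -D / C,
        "BFL": -A / C,
        "VH": (D - 1) / C,
        "H'V'": (A - 1) / C,
    }

for t in (0, 1, 2, 5, 10):
    points = cardinal_points(thick_lens(F(3, 2), 50, -100, t))
    print(f"t' = {t:2d} mm:", {k: round(float(v), 3) for k, v in points.items()})
```

Because the arithmetic is exact, the printed values can differ from the tabulated ones in the last digit, where the text rounds the matrix elements before dividing.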

3.4.5 Microscope

A simple microscope is also composed of two lenses (assumed to be “thin” in this discussion, though the optical components generally are composed of multiple elements). The distance $t$ between the image-space (rear) focal point of the first lens and the object-space (front) focal point of the ocular (the “tube length”) is fixed, often at $t = 160$ mm. The first lens (the “objective”) has a (very) short focal length and the object typically is placed just “outside” its object-space focal point so that $z_1 \gtrsim f_1$. The objective generates a real image between the objective and eyepiece (or “ocular”), which is a lens with a short focal length used as a simple magnifier.

Assume $f_1 = 5$ mm, $f_2 = 50$ mm

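\[ \begin{aligned} \mathcal{M}_{VV'} &= \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(-50\ \mathrm{mm})} & 1 \end{bmatrix} \begin{bmatrix} 1 & 160\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(5\ \mathrm{mm})} & 1 \end{bmatrix} \\ &= \begin{bmatrix} -31 & 160\ \mathrm{mm} \\ -\dfrac{41}{50\ \mathrm{mm}} & \dfrac{21}{5} \end{bmatrix} \end{aligned} \]

\[ \det \begin{bmatrix} -31 & 160\ \mathrm{mm} \\ -\dfrac{41}{50\ \mathrm{mm}} & \dfrac{21}{5} \end{bmatrix} = 1 \]

\[ \begin{aligned} f_{\mathrm{eff}} &= -\frac{1}{C} = +\frac{50}{41}\ \mathrm{mm} \cong +1.220\ \mathrm{mm} \\ \mathrm{FFL} = \overline{FV} &= -\frac{D}{C} = -\frac{\left(\frac{21}{5}\right)}{\left(-\frac{41}{50\ \mathrm{mm}}\right)} = +\frac{210}{41}\ \mathrm{mm} \cong +5.12\ \mathrm{mm} \\ \mathrm{BFL} = \overline{V'F'} &= -\frac{A}{C} = -\frac{-31}{\left(-\frac{41}{50\ \mathrm{mm}}\right)} = -\frac{1550}{41}\ \mathrm{mm} \cong -37.8\ \mathrm{mm} \\ \overline{VH} &= \frac{D - 1}{C} = \frac{\left(\frac{21}{5} - 1\right)}{\left(-\frac{41}{50\ \mathrm{mm}}\right)} = -\frac{160}{41}\ \mathrm{mm} \cong -3.902\ \mathrm{mm} \\ \overline{H'V'} &= \frac{A - 1}{C} = \frac{-31 - 1}{\left(-\frac{41}{50\ \mathrm{mm}}\right)} = \frac{1600}{41}\ \mathrm{mm} \cong 39.02\ \mathrm{mm} \end{aligned} \]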

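\[ \begin{aligned} \mathcal{M}_{OV'} &= \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(-50\ \mathrm{mm})} & 1 \end{bmatrix} \begin{bmatrix} 1 & 160\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(5\ \mathrm{mm})} & 1 \end{bmatrix} \begin{bmatrix} 1 & 3\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \\ &= \begin{bmatrix} -31 & 160\ \mathrm{mm} \\ -\dfrac{41}{50\ \mathrm{mm}} & \dfrac{21}{5} \end{bmatrix} \begin{bmatrix} 1 & 3\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -31 & 67\ \mathrm{mm} \\ -\dfrac{41}{50\ \mathrm{mm}} & \dfrac{87}{50} \end{bmatrix} \end{aligned} \]

3.5 Image Location and Magnification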

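\[ \begin{aligned} \frac{1}{z_1} + \frac{1}{z_2} &= \frac{1}{f} \\ M_T = -\frac{z_2}{z_1} &\cong -\frac{f}{z_1} \quad \text{in usual case} \end{aligned} \]

\[ \begin{aligned} \frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f} &\implies z_2 = \left(\frac{1}{f} - \frac{1}{z_1}\right)^{-1} = \frac{z_1 f}{z_1 - f} \\ M_T = -\frac{z_2}{z_1} &= -\frac{f}{z_1 - f} \cong -\frac{f}{z_1} \propto f \quad \text{if } z_1 \gg f \end{aligned} \]

In words, if the object distance $z_1$ is large (compared to the focal length $f$), then the transverse magnification is (approximately) proportional to the focal length. Therefore, doubling the focal length doubles the magnification if the object is distant (with the caveat that the magnification is still negative and smaller than unity in magnitude).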

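3.6 Marginal and Chief Rays for the System

\[ \mathcal{L} = \left[ \begin{pmatrix} y \\ nu \end{pmatrix} \begin{pmatrix} \bar{y} \\ n\bar{u} \end{pmatrix} \right] = \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} \]

\[ \det[\mathcal{L}] = y\cdot n\bar{u} - \bar{y}\cdot nu \equiv \aleph \]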

The marginal ray goes through the center of the object and any image(s) (i.e., the point where the marginal ray crosses the optical axis is either the object or an image of the object). It also “grazes” the edge of the aperture stop, so if we know the location and the diameter of the aperture stop in the system, we can scale the height of the marginal ray so that its height matches the semidiameter of the aperture stop at that location. The chief ray goes through the center of the stop (and of the entrance and exit pupils), so we set the chief ray height at the location of the stop to be zero and its angle to be arbitrary (say unity), then propagate that provisional ray “forward” towards the image-space vertex and “backwards” towards the object-space vertex (note that when tracing “backwards” toward the first lens, the matrices in the ray trace must be inverted). During the tracing, we find the element that most constrains the chief ray, and then scale the height of the provisional chief ray to make sure that it gets “through” the other elements. The angle of the chief ray emerging from the front vertex to the object is the half-angle of the field of view; the angle of the chief ray emerging from the image-space vertex is the half angle of the image field at the sensor.
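
Because every system matrix in this chapter has unit determinant, the determinant of the ray matrix, $\det[\mathcal{L}] = y\cdot n\bar{u} - \bar{y}\cdot nu$, is unchanged when both rays are propagated through the system. The Python sketch below illustrates this with the Galilean telescope matrix from Section 3.4.2 and the unit marginal and provisional chief rays used there; the function names `apply` and `invariant` are illustrative assumptions, not notation from the notes.

```python
from fractions import Fraction as F

def apply(m, ray):
    """Apply a 2x2 system matrix to a ray (y, nu)."""
    (a, b), (c, d) = m
    y, nu = ray
    return (a * y + b * nu, c * y + d * nu)

def invariant(marginal, chief):
    """det[L] = y * (n u_bar) - y_bar * (n u) for a marginal/chief ray pair."""
    return marginal[0] * chief[1] - chief[0] * marginal[1]

# Galilean telescope of Section 3.4.2: [[1/8, 175 mm], [0, 8]].
system = ((F(1, 8), F(175)), (F(0), F(8)))

marginal_in = (F(1), F(0))   # unit height, parallel to the axis
chief_in = (F(0), F(1))      # provisional chief ray: height 0, angle 1

marginal_out = apply(system, marginal_in)
chief_out = apply(system, chief_in)

before = invariant(marginal_in, chief_in)
after = invariant(marginal_out, chief_out)
print("det[L] before:", before, " after:", after)
```

Scaling either ray, as is done when fitting the rays to the stop and to the limiting element, scales the determinant by the same factor at every plane, so its constancy through the system is preserved.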

3.6.1 Examples of Marginal and Chief Rays for Systems

In the lab, you constructed Keplerian and/or Galilean telescopes with an iris diaphragm at various locations. We can use this as a model for demonstrating how to evaluate the marginal and chief rays. To evaluate the location of the stop, we must know the diameters as well as the locations of the lenses. We can cast a provisional marginal ray into the system from the object to determine which element is the aperture stop. We then scale the provisional marginal ray so that its height and the semidiameter of the stop “match.” We then propagate a provisional chief ray forward and backward from the center of the stop and scale its angle so that it grazes the element that constrains it. From the angle of the chief ray entering and exiting the system, we can determine the field of view. We will use the Galilean telescope as the first example.

Example 1: Galilean telescope, object at ∞

Consider a telescope with the following parameters.

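$L_1$: $f_1 = +200$ mm, $d_1 = 40$ mm

$L_2$: $f_2 = -40$ mm, $d_2 = 5$ mm

$t = f_1 + f_2 = 160$ mm

\[ \mathcal{R}_1 = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \mathrm{mm}} & 1 \end{bmatrix}, \qquad \mathcal{T} = \begin{bmatrix} 1 & 160\ \mathrm{mm} \\ 0 & 1 \end{bmatrix}, \qquad \mathcal{R}_2 = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{-40\ \mathrm{mm}} & 1 \end{bmatrix} \]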

The vertex-vertex matrix of this system is

\[ \mathcal{M}_{VV'} = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{-40\ \mathrm{mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & 160\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \mathrm{mm}} & 1 \end{bmatrix} = \begin{bmatrix} \dfrac{1}{5} & 160\ \mathrm{mm} \\ 0 & 5 \end{bmatrix} \]

for which element $C = 0$, which is characteristic of an afocal system. For an object at infinity, the provisional marginal ray into the system has an angle of zero and a height equal to the semidiameter of the first element:

\[ \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} \dfrac{d_1}{2} \\ 0 \end{bmatrix} = \begin{bmatrix} 20\ \mathrm{mm} \\ 0 \end{bmatrix} \]

We can propagate this ray through the first lens and translate it to the second lens:

\[ \mathcal{T}\,\mathcal{R}_1 \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 1 & 160\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \mathrm{mm}} & 1 \end{bmatrix} \begin{bmatrix} 20\ \mathrm{mm} \\ 0 \end{bmatrix} = \begin{bmatrix} 4\ \mathrm{mm} \\ -\dfrac{1}{10} \end{bmatrix} \]

In words, the height of the provisional marginal ray at the second lens is 4 mm. Note that the ray after the second lens has the form:

\[ \mathcal{M}_{VV'} \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} \dfrac{1}{5} & 160\ \mathrm{mm} \\ 0 & 5 \end{bmatrix} \begin{bmatrix} 20\ \mathrm{mm} \\ 0 \end{bmatrix} = \begin{bmatrix} 4\ \mathrm{mm} \\ 0 \end{bmatrix} \]

so that the height of the provisional marginal ray at the second lens is the same before and after refraction (no surprise there) and the ray angle after the second lens is 0 (parallel to the optical axis, again no surprise). Note that the ray height at $L_2$ is larger than the specified semidiameter of the second lens:

\[ y' > \frac{d_2}{2} = \frac{5\ \mathrm{mm}}{2} = 2.5\ \mathrm{mm} \implies L_2 \text{ is the aperture stop} \]

This means two things: (1) that the second lens is the aperture stop, and (2) that we must scale the height and angle of the provisional marginal ray to ensure that it grazes the edge of the stop. The scaling factor is the ratio of the semidiameter of the stop to the height of the provisional marginal ray at the stop:

\[ \frac{\left(\frac{d_2}{2}\right)}{y \text{ at } L_2} = \frac{2.5\ \mathrm{mm}}{4\ \mathrm{mm}} = \frac{5}{8} \]

We apply this scale factor to the marginal ray at all locations in the system. The marginal ray at the first lens from an object at infinite distance is:

\[ \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at } L_1} = \frac{5}{8}\cdot\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} = \frac{5}{8}\begin{bmatrix} 20\ \mathrm{mm} \\ 0 \end{bmatrix} = \begin{bmatrix} 12.5\ \mathrm{mm} \\ 0 \end{bmatrix} \]

which means that the marginal ray strikes the first lens well inside of the semidiameter; the entering “tube” of rays does not fill the lens.

Now that we know that the second lens is the aperture stop, we can propagate a provisional chief ray from the center of the stop in both directions. One possible choice for the provisional chief ray is:

\[ \begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} \]

where again an angle of 1 radian is HUGE, but we will scale it based on the parameters of the rest of the system. Propagate this ray through the system (towards image space) to obtain

\[ \mathcal{R}_2 \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{-40\ \mathrm{mm}} & 1 \end{bmatrix} \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} \]

so the height and angle of the provisional chief ray are unchanged by the action of the lens $L_2$ that is the stop, because the ray passes through the center of the lens.

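The provisional chief ray may be propagated from the stop “backwards” towards the first lens. The translation matrix is inverted because we are tracing the ray from right to left, opposite to the direction in which the light travels.

\[ \mathcal{T}^{-1} \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} = \left(\begin{bmatrix} 1 & +160\ \mathrm{mm} \\ 0 & 1 \end{bmatrix}\right)^{-1} \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} -160\ \mathrm{mm} \\ 1 \end{bmatrix} \]

The height of the provisional chief ray at the first element is negative, which means that it is BELOW the optical axis at a MUCH LARGER distance than the semidiameter $\tfrac{d_1}{2} = 20$ mm of $L_1$. To ensure that the chief ray “gets through” the first lens, we have to scale its angle by the factor:

\[ \frac{\left(\frac{d_1}{2}\right)}{|y|} = \frac{20\ \mathrm{mm}}{160\ \mathrm{mm}} = \frac{1}{8} \]

So now go back to the original prescription for the provisional chief ray and scale it to obtain the “actual” chief ray:

\[ \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} \implies \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at } L_2} = \frac{1}{8}\cdot\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \mathrm{mm} \\ \dfrac{1}{8} \end{bmatrix} \]

\[ \begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{at } L_2} = \begin{bmatrix} 0\ \mathrm{mm} \\ \dfrac{1}{8} \end{bmatrix}, \qquad \begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{at } L_1} = \begin{bmatrix} -20\ \mathrm{mm} \\ \dfrac{1}{8} \end{bmatrix} \]

We can now propagate this ray through $L_1$. The chief ray emerging from the front vertex is:

\[ \mathcal{R}_1^{-1}\,\mathcal{T}^{-1} \begin{bmatrix} 0\ \mathrm{mm} \\ \dfrac{1}{8} \end{bmatrix} = \left(\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \mathrm{mm}} & 1 \end{bmatrix}\right)^{-1} \left(\begin{bmatrix} 1 & +160\ \mathrm{mm} \\ 0 & 1 \end{bmatrix}\right)^{-1} \begin{bmatrix} 0\ \mathrm{mm} \\ \dfrac{1}{8} \end{bmatrix} = \begin{bmatrix} -20\ \mathrm{mm} \\ \dfrac{1}{40} \end{bmatrix} \]

Now propagate this chief ray forwards through the system by multiplying by $\mathcal{M}_{VV'}$:

\[ \begin{bmatrix} \dfrac{1}{5} & 160\ \mathrm{mm} \\ 0 & 5 \end{bmatrix} \begin{bmatrix} -20\ \mathrm{mm} \\ \dfrac{1}{40} \end{bmatrix} = \begin{bmatrix} 0\ \mathrm{mm} \\ \dfrac{1}{8} \end{bmatrix} \]

which has height of zero emerging from $L_2$ (the aperture stop), as expected.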

The field of view of the system is twice the angle at the front of L1:

\[ \mathrm{FoV} = 2\cdot\frac{1}{40}\ \text{radian} = \frac{1}{20}\ \text{radian} = \frac{1}{20}\cdot\frac{180^\circ}{\pi} \cong 2.865^\circ \]

The exit pupil is located at the aperture stop $L_2$, while the entrance pupil is the image of the stop in object space, so we can evaluate the location of the entrance pupil from the calculation of the chief ray emerging from the front vertex:

\[ \begin{bmatrix} y' \\ n'u' \end{bmatrix} (\text{emerging from front vertex}) = \begin{bmatrix} -20\ \mathrm{mm} \\ \dfrac{1}{40} \end{bmatrix} \]

The height is $-20$ mm and the angle is $\tfrac{1}{40}$ radian, so the distance to the location where the ray crosses the optical axis is:

\[ z_{V\,NP} = -\frac{-20\ \mathrm{mm}}{\frac{1}{40}} = +800\ \mathrm{mm} \]

The distance from the vertex to the entrance pupil is positive, so the pupil is behind the objective and is virtual. The transverse magnification of the entrance pupil is:

\[ M_T = -\frac{800\ \mathrm{mm}}{-160\ \mathrm{mm}} = +5 \]

so the diameter of the entrance pupil is magnified:

\[ d_{NP} = 5\cdot 5\ \mathrm{mm} = 25\ \mathrm{mm} \]

Marginal ray (red) and chief ray (blue) from object at infinity traced through Galilean telescope with aperture stop at second lens (eyepiece).
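
The steps of Example 1 can be retraced with a short Python sketch: scale the provisional marginal ray to the stop, scale the provisional chief ray to the limiting element, and read off the field of view and entrance pupil location. It assumes thin lenses in air and the prescription given above; the helper names are hypothetical, and the inverse matrices are written using the facts that the inverse of a transfer over $t$ is a transfer over $-t$ and the inverse of a thin-lens matrix with focal length $f$ is the matrix with focal length $-f$.

```python
import math
from fractions import Fraction as F

def apply(m, ray):
    """Apply a 2x2 system matrix to a ray (y, nu)."""
    (a, b), (c, d) = m
    y, nu = ray
    return (a * y + b * nu, c * y + d * nu)

def thin_lens(f):
    return ((F(1), F(0)), (F(-1) / f, F(1)))

def translation(t):
    return ((F(1), F(t)), (F(0), F(1)))

f1, d1 = 200, 40     # objective: focal length and diameter [mm]
f2, d2 = -40, 5      # eyelens: focal length and diameter [mm]
t = f1 + f2          # separation [mm]

# Provisional marginal ray from infinity grazing the edge of L1.
provisional = (F(d1, 2), F(0))
y_at_L2, _ = apply(translation(t), apply(thin_lens(f1), provisional))
marginal_scale = F(d2, 2) / abs(y_at_L2)      # smaller than 1, so L2 is the stop
marginal_at_L1 = (marginal_scale * provisional[0], provisional[1])

# Provisional chief ray (height 0, angle 1) traced backwards from the stop to L1.
y_at_L1, _ = apply(translation(-t), (F(0), F(1)))
chief_scale = F(d1, 2) / abs(y_at_L1)
chief_at_stop = (F(0), chief_scale)

# Chief ray emerging from the front vertex: R1^-1 · T^-1 applied to the scaled ray.
chief_in = apply(thin_lens(-f1), apply(translation(-t), chief_at_stop))

fov_deg = math.degrees(float(2 * chief_in[1]))
entrance_pupil = -chief_in[0] / chief_in[1]   # axis crossing measured from V [mm]

print("marginal ray at L1:", marginal_at_L1)
print("chief ray into system:", chief_in)
print("field of view [deg]:", round(fov_deg, 3))
print("entrance pupil distance [mm]:", entrance_pupil)
```

The entrance pupil semidiameter implied by the scaled marginal ray (12.5 mm) agrees with the pupil diameter of 25 mm obtained from the magnification.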

Example 2: Galilean telescope with aperture stop at FIRST lens, object at ∞

We already know that the height of the provisional marginal ray at the second lens was $y = 4$ mm, so we can select a diameter for $L_2$ that exceeds this value, so that the aperture stop is now the first lens:

$L_1$: $f_1 = +200$ mm, $d_1 = 40$ mm

$L_2$: $f_2 = -40$ mm, $d_2 = 10$ mm

$t = f_1 + f_2 = 160$ mm

The vertex-vertex matrix is the same as before:

\[ \mathcal{M}_{VV'} = \begin{bmatrix} \dfrac{1}{5} & 160\ \mathrm{mm} \\ 0 & 5 \end{bmatrix} \]

We know from the results just calculated that if $d_2 = 10$ mm, then its semidiameter exceeds the height of the provisional marginal ray, so the aperture stop then becomes the first lens. The marginal ray we calculated for the first lens then becomes the actual marginal ray; at the first lens, the marginal ray is:

\[ \begin{bmatrix} y \\ nu \end{bmatrix} (\text{at } L_1) = \begin{bmatrix} 20\ \mathrm{mm} \\ 0 \end{bmatrix} \]

and the marginal ray leaving the system after $L_2$ is:

\[ \begin{bmatrix} y' \\ n'u' \end{bmatrix} (\text{after } L_2) = \mathcal{M}_{VV'} \begin{bmatrix} 20\ \mathrm{mm} \\ 0 \end{bmatrix} = \begin{bmatrix} \dfrac{1}{5} & 160\ \mathrm{mm} \\ 0 & 5 \end{bmatrix} \begin{bmatrix} 20\ \mathrm{mm} \\ 0 \end{bmatrix} = \begin{bmatrix} 4\ \mathrm{mm} \\ 0 \end{bmatrix} \]

Since the aperture stop has moved to $L_1$ from $L_2$, we have to evaluate a different chief ray; it will go through the center of $L_1$, so the provisional chief ray at $L_1$ is:

\[ \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} (\text{at } L_1) = \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} \]

After the first refraction, the provisional chief ray is:

\[ \begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} (\text{after } L_1) = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \mathrm{mm}} & 1 \end{bmatrix} \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} \]

which again should be no surprise: since the chief ray goes through the center of $L_1$, the lens has no impact on the ray.

Now propagate the provisional chief ray to L2 by applying the translation matrix:

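\[ \mathcal{T} \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 160\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 160\ \mathrm{mm} \\ 1 \end{bmatrix} \]

so the ray height of the chief ray is again MUCH larger than the semidiameter of the lens. The scaling factor that must be applied to the provisional chief ray is the ratio of the semidiameter of $L_2$ to the ray height:

\[ \frac{\left(\frac{d_2}{2}\right)}{y} = \frac{5\ \mathrm{mm}}{160\ \mathrm{mm}} = \frac{5}{160} = \frac{1}{32} \]

Therefore the true chief ray at the first lens is:

\[ \begin{bmatrix} y \\ nu \end{bmatrix} (\text{at } L_1) = \frac{1}{32}\cdot\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} = \frac{1}{32}\cdot\begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 0\ \mathrm{mm} \\ \dfrac{1}{32} \end{bmatrix} \]

In words, the angle of the chief ray into the first lens (and therefore into the aperture stop) is $\tfrac{1}{32}$ radians, so the full-angle field of view of the system is:

\[ \mathrm{FoV} = 2\cdot u = \frac{1}{16}\ \text{radian} = \frac{1}{16}\cdot\frac{180^\circ}{\pi} \cong 3.58^\circ \]

which is larger than the field of view in the first case with the smaller diameter for $L_2$. Just for fun, propagate both the marginal and chief rays through the system at the same time:

\[ \mathcal{M}_{VV'} \left[ \begin{pmatrix} y \\ nu \end{pmatrix} \begin{pmatrix} \bar{y} \\ n\bar{u} \end{pmatrix} \right] = \left[ \begin{pmatrix} y' \\ nu' \end{pmatrix} \begin{pmatrix} \bar{y}' \\ n\bar{u}' \end{pmatrix} \right] \]

\[ \begin{bmatrix} \dfrac{1}{5} & 160\ \mathrm{mm} \\ 0 & 5 \end{bmatrix} \begin{bmatrix} 20\ \mathrm{mm} & 0\ \mathrm{mm} \\ 0 & \dfrac{1}{32} \end{bmatrix} = \begin{bmatrix} 4\ \mathrm{mm} & 5\ \mathrm{mm} \\ 0 & \dfrac{5}{32} \end{bmatrix} \]

So the ray height of the marginal ray after the second lens is 4 mm and the ray angle is 0 radians (it propagates to the image at infinity), while the chief ray height after $L_2$ is 5 mm and the angle is $\tfrac{5}{32}$ radians. The full angle of the image field is $\tfrac{10}{32} = \tfrac{5}{16}$ radians $\cong 17.9^\circ$.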

Marginal ray (red) and chief ray (blue) from object at infinity traced through Galilean telescope with stop at first lens.

The entrance pupil coincides with the aperture stop in this system, while the exit pupil is the image of the aperture stop seen through $L_2$. The object distance to the stop is $f_1 + f_2 = 160$ mm, so the exit pupil distance is:

\[ z_{XP} = \frac{z_1\cdot f_2}{z_1 - f_2} = \frac{160\ \mathrm{mm}\cdot(-40\ \mathrm{mm})}{160\ \mathrm{mm} - (-40\ \mathrm{mm})} = -32\ \mathrm{mm} \]

and the diameter of the exit pupil is:

\[ d_{XP} = M_T\cdot 40\ \mathrm{mm} = -\frac{-32\ \mathrm{mm}}{160\ \mathrm{mm}}\cdot 40\ \mathrm{mm} = +8\ \mathrm{mm} \]
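
The exit pupil calculation is a single application of the thin-lens imaging equation to the stop. A minimal Python sketch, assuming the values of Example 2 and using the hypothetical function name `image_distance`, is:

```python
from fractions import Fraction as F

def image_distance(z1, f):
    """Thin-lens imaging equation 1/z1 + 1/z2 = 1/f solved for z2 (hypothetical helper)."""
    return z1 * f / (z1 - f)

z1 = F(160)      # distance from the stop (L1) to L2 [mm]
f2 = F(-40)      # focal length of the eyelens [mm]
d_stop = F(40)   # diameter of the aperture stop (L1) [mm]

z_xp = image_distance(z1, f2)   # exit pupil distance from L2 [mm]
m_t = -z_xp / z1                # transverse magnification of the pupil
d_xp = m_t * d_stop             # exit pupil diameter [mm]

print("exit pupil distance:", z_xp, "mm")
print("pupil magnification:", m_t)
print("exit pupil diameter:", d_xp, "mm")
```

The negative distance indicates a virtual exit pupil located 32 mm in front of the eyelens, and the 8 mm diameter is twice the 4 mm height of the marginal ray leaving the system.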

Example 3: Galilean telescope with aperture stop between lenses, object at ∞

Now consider the result if we place an iris diaphragm with diameter $d = 8$ mm midway between $L_1$ and $L_2$. The prescription for the system is:

$L_1$: $f_1 = +200$ mm, $d_1 = 40$ mm

$L_2$: $f_2 = -40$ mm, $d_2 = 10$ mm

$t = f_1 + f_2 = 160$ mm

$S$: $\overline{VS} = 80$ mm, $\overline{SV'} = 80$ mm, $d_{\mathrm{Stop}} = 8$ mm

The matrix for the imaging elements is unchanged:

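\[ \mathcal{M}_{VV'} = \begin{bmatrix} \dfrac{1}{5} & 160\ \mathrm{mm} \\ 0 & 5 \end{bmatrix} \]

but we need to confirm that the new iris is the aperture stop. Cast in a provisional marginal ray from an object at infinity:

\[ \mathcal{R}_1 \begin{bmatrix} 20\ \mathrm{mm} \\ 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \mathrm{mm}} & 1 \end{bmatrix} \begin{bmatrix} 20\ \mathrm{mm} \\ 0 \end{bmatrix} = \begin{bmatrix} 20\ \mathrm{mm} \\ -\dfrac{1}{10} \end{bmatrix} \]

Now propagate this ray to the iris, located at a distance of 80 mm after $L_1$:

\[ \mathcal{T} \begin{bmatrix} 20\ \mathrm{mm} \\ -\dfrac{1}{10} \end{bmatrix} = \begin{bmatrix} 1 & 80\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 20\ \mathrm{mm} \\ -\dfrac{1}{10} \end{bmatrix} = \begin{bmatrix} 12\ \mathrm{mm} \\ -\dfrac{1}{10} \end{bmatrix} \implies y = 12\ \mathrm{mm} > \frac{d_{\mathrm{Stop}}}{2} = \frac{8\ \mathrm{mm}}{2} = 4\ \mathrm{mm} \text{ at iris} \]

So again we need to scale the provisional marginal ray by the ratio:

\[ \frac{\left(\frac{d_{\mathrm{Stop}}}{2}\right)}{y} = \frac{4\ \mathrm{mm}}{12\ \mathrm{mm}} = \frac{1}{3} \]

So the marginal ray at the first lens is:

\[ \begin{bmatrix} y \\ nu \end{bmatrix} = \frac{1}{3}\begin{bmatrix} 20\ \mathrm{mm} \\ 0 \end{bmatrix} = \begin{bmatrix} \dfrac{20}{3}\ \mathrm{mm} \\ 0 \end{bmatrix} = \begin{bmatrix} 6\tfrac{2}{3}\ \mathrm{mm} \\ 0 \end{bmatrix} \]

Now propagate this ray through the first surface to the iris:

\[ \begin{bmatrix} 1 & 80\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \mathrm{mm}} & 1 \end{bmatrix} \begin{bmatrix} \dfrac{20}{3}\ \mathrm{mm} \\ 0 \end{bmatrix} = \begin{bmatrix} 4\ \mathrm{mm} \\ -\dfrac{1}{30} \end{bmatrix} \]

We can now propagate this from the iris to and through the second lens:

\[ \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{-40\ \mathrm{mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & 80\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 4\ \mathrm{mm} \\ -\dfrac{1}{30} \end{bmatrix} = \begin{bmatrix} \dfrac{4}{3}\ \mathrm{mm} \\ 0 \end{bmatrix} \]

So the marginal ray exiting the system is at a height of $\tfrac{4}{3}$ mm and an angle of 0 radians (parallel to the axis, as expected for a telescope).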

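Now propagate the provisional chief ray backward (toward $L_1$) from the iris; the translation from the iris is:

\[ \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at stop}} = \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} \]

\[ \left(\begin{bmatrix} 1 & +80\ \mathrm{mm} \\ 0 & 1 \end{bmatrix}\right)^{-1} \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & -80\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} -80\ \mathrm{mm} \\ 1 \end{bmatrix} \]

If we propagate the provisional chief ray from the iris towards $L_2$, we obtain:

\[ \begin{bmatrix} 1 & +80\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} +80\ \mathrm{mm} \\ 1 \end{bmatrix} \]

Note that both ray heights are too large, but the ray height of the provisional chief ray at $L_2$ exceeds the semidiameter by a much larger factor than its height at $L_1$; the ratios are:

\[ \frac{\left(\frac{d_1}{2}\right)}{80\ \mathrm{mm}} = \frac{20\ \mathrm{mm}}{80\ \mathrm{mm}} = \frac{1}{4}, \qquad \frac{\left(\frac{d_2}{2}\right)}{80\ \mathrm{mm}} = \frac{5\ \mathrm{mm}}{80\ \mathrm{mm}} = \frac{1}{16} \]

So the second lens constrains the chief ray. Apply the scaling factor to the provisional chief ray to find the true chief ray at the iris:

\[ \frac{1}{16}\cdot\begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 0\ \mathrm{mm} \\ \dfrac{1}{16} \end{bmatrix} = \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at stop}} \]

Propagate it “backward” towards and through $L_1$ to find the prescription for the chief ray entering the system:

\[ \left(\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \mathrm{mm}} & 1 \end{bmatrix}\right)^{-1} \left(\begin{bmatrix} 1 & +80\ \mathrm{mm} \\ 0 & 1 \end{bmatrix}\right)^{-1} \begin{bmatrix} 0\ \mathrm{mm} \\ \dfrac{1}{16} \end{bmatrix} = \begin{bmatrix} -5\ \mathrm{mm} \\ \dfrac{3}{80} \end{bmatrix} = \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{into } L_1} \]

The field of view of the system is twice the chief ray angle into the system:

\[ \mathrm{FoV} = 2\cdot\frac{3}{80}\ \text{radians} = \frac{3}{40}\ \text{radians} = \frac{3}{40}\cdot\frac{180^\circ}{\pi} \cong 4.30^\circ \]

Propagate the chief ray towards and through $L_2$ to find the chief ray exiting the system:

\[ \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{-40\ \mathrm{mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & +80\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0\ \mathrm{mm} \\ \dfrac{1}{16} \end{bmatrix} = \begin{bmatrix} +5\ \mathrm{mm} \\ \dfrac{3}{16} \end{bmatrix} = \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{out of } L_2} \]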

Marginal ray (red) and chief ray (blue) from object at infinity traced through Galilean telescope with iris diaphragm between lenses.
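
The search for the aperture stop and for the element that limits the chief ray can be written generically. The Python sketch below is one possible arrangement under the prescription of Example 3 (thin lenses in air, iris midway between them); the element list `ELEMENTS`, the function `clearance_ratios`, and the other helper names are hypothetical and serve only to illustrate the procedure.

```python
from fractions import Fraction as F

def apply(m, ray):
    """Apply a 2x2 system matrix to a ray (y, nu)."""
    (a, b), (c, d) = m
    y, nu = ray
    return (a * y + b * nu, c * y + d * nu)

def thin_lens(f):
    return ((F(1), F(0)), (F(-1) / f, F(1)))

def translation(t):
    return ((F(1), F(t)), (F(0), F(1)))

# (name, axial position [mm], semidiameter [mm], focal length [mm] or None for an iris)
ELEMENTS = [("L1", 0, 20, 200), ("iris", 80, 4, None), ("L2", 160, 5, -40)]

def clearance_ratios(elements, ray):
    """Trace `ray` from the first element; return {name: semidiameter / |ray height|}."""
    ratios, position = {}, elements[0][1]
    for name, z, semi, f in elements:
        ray = apply(translation(z - position), ray)
        position = z
        if ray[0] != 0:
            ratios[name] = F(semi) / abs(ray[0])
        if f is not None:
            ray = apply(thin_lens(f), ray)
    return ratios

# Provisional marginal ray from infinity at the edge of L1.
marginal_ratios = clearance_ratios(ELEMENTS, (F(20), F(0)))
stop = min(marginal_ratios, key=marginal_ratios.get)   # most restrictive element
marginal_scale = marginal_ratios[stop]

# Provisional chief ray (height 0, angle 1) leaving the iris in both directions.
y_L1, _ = apply(translation(-80), (F(0), F(1)))
y_L2, _ = apply(translation(80), (F(0), F(1)))
chief_scale = min(F(20) / abs(y_L1), F(5) / abs(y_L2))
chief_at_stop = (F(0), chief_scale)

# thin_lens(-200) is the inverse of the L1 matrix, used when tracing backwards.
chief_in = apply(thin_lens(-200), apply(translation(-80), chief_at_stop))
chief_out = apply(thin_lens(-40), apply(translation(80), chief_at_stop))

print("aperture stop:", stop, "| marginal scale:", marginal_scale)
print("chief ray into system:", chief_in, "| out of system:", chief_out)
```

Taking the minimum of the clearance ratios is what selects the iris as the stop (ratio $\tfrac{1}{3}$) and $L_2$ as the element that constrains the chief ray (ratio $\tfrac{1}{16}$).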

Example 4: Keplerian telescope, object at ∞

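Substitute a positive lens with the diameter of 5 mm for $L_2$, which also means that we have to change the distance between the lenses:

$L_1$: $f_1 = +200$ mm, $d_1 = 40$ mm

$L_2$: $f_2 = +40$ mm, $d_2 = 5$ mm

$t = f_1 + f_2 = 240$ mm

The vertex-vertex (system) matrix is:

\[ \mathcal{M}_{VV'} = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+40\ \mathrm{mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & 240\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \mathrm{mm}} & 1 \end{bmatrix} = \begin{bmatrix} -\dfrac{1}{5} & +240\ \mathrm{mm} \\ 0 & -5 \end{bmatrix} \]

The provisional marginal ray into the system from an object at infinity has the same ray height as the semidiameter of $L_1$:

\[ \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 20\ \mathrm{mm} \\ 0 \end{bmatrix} \]

The outgoing provisional marginal ray from the system is:

\[ \mathcal{M}_{VV'} \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} -\dfrac{1}{5} & 240\ \mathrm{mm} \\ 0 & -5 \end{bmatrix} \begin{bmatrix} 20\ \mathrm{mm} \\ 0 \end{bmatrix} = \begin{bmatrix} -4\ \mathrm{mm} \\ 0 \end{bmatrix} \]

Since the magnitude of the ray height of the provisional ray is larger than the semidiameter of $L_2$, $L_2$ is the aperture stop:

\[ |y'| > \frac{d_2}{2} \implies L_2 \text{ is the aperture stop} \]

so we must scale the provisional marginal ray by a factor

\[ \begin{bmatrix} y \\ nu \end{bmatrix} = \left(\frac{\left(\frac{d_2}{2}\right)}{|y'|}\right)\cdot\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} = \frac{\left(\frac{5\ \mathrm{mm}}{2}\right)}{4\ \mathrm{mm}}\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} = \frac{5}{8}\cdot\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} \]

\[ \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at } L_1} = \frac{5}{8}\cdot\begin{bmatrix} 20\ \mathrm{mm} \\ 0 \end{bmatrix} = \begin{bmatrix} 12.5\ \mathrm{mm} \\ 0 \end{bmatrix} \]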

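Now to the chief ray; the provisional chief ray emerging from the center of the aperture stop has zero height and an angle of unit magnitude (taken as $-1$ here):

\[ \begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \mathrm{mm} \\ -1 \end{bmatrix} \]

The ray is propagated to the first lens:

\[ \mathcal{T}^{-1} \begin{bmatrix} 0\ \mathrm{mm} \\ -1 \end{bmatrix} = \left(\begin{bmatrix} 1 & +240\ \mathrm{mm} \\ 0 & 1 \end{bmatrix}\right)^{-1} \begin{bmatrix} 0\ \mathrm{mm} \\ -1 \end{bmatrix} = \begin{bmatrix} +240\ \mathrm{mm} \\ -1 \end{bmatrix} \]

so the height of the provisional chief ray at the first element is $|y| = 240$ mm, which is MUCH larger than the semidiameter $\tfrac{d_1}{2} = 20$ mm of $L_1$. To ensure that the chief ray “gets through” the first lens, we have to scale its angle by the factor:

\[ \frac{20\ \mathrm{mm}}{240\ \mathrm{mm}} = \frac{1}{12} \]

So now go back to the original prescription for the provisional chief ray:

\[ \begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \mathrm{mm} \\ -1 \end{bmatrix} \implies \begin{bmatrix} y' \\ n'u' \end{bmatrix} = \frac{1}{12}\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \mathrm{mm} \\ -\dfrac{1}{12} \end{bmatrix} \]

We can now propagate it from the rear vertex to and through the front vertex of the system. The chief ray emerging from the front vertex is:

\[ \left(\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \mathrm{mm}} & 1 \end{bmatrix}\right)^{-1} \left(\begin{bmatrix} 1 & +240\ \mathrm{mm} \\ 0 & 1 \end{bmatrix}\right)^{-1} \left(\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+40\ \mathrm{mm}} & 1 \end{bmatrix}\right)^{-1} \begin{bmatrix} 0\ \mathrm{mm} \\ -\dfrac{1}{12} \end{bmatrix} = \begin{bmatrix} +20\ \mathrm{mm} \\ +\dfrac{1}{60} \end{bmatrix} \]

In words, the chief ray height at the front surface is $y = 20$ mm and the chief ray angle is $nu = +\tfrac{1}{60}$ radian (the change of sign again just means that the ray angle into the system has the opposite sign of that emerging therefrom). The field of view of the system is twice the angle:

\[ \mathrm{FoV} = 2\cdot\frac{1}{60}\ \text{radian} = \frac{1}{30}\ \text{radian} = \frac{1}{30}\cdot\frac{180^\circ}{\pi} \cong 1.91^\circ \]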

Marginal ray (red) and chief ray (blue) from object at infinity traced through Keplerian telescope with aperture stop at second lens.
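The hand calculation of Example 4 can be cross-checked numerically. The following is a minimal sketch, assuming the same thin-lens refraction and translation matrices acting on the ray vector (y, nu); the helper names `lens`, `gap`, `matmul`, and `apply` are hypothetical and exist only for this illustration, and exact fractions are used so that the results can be compared directly with the values above.

```python
from fractions import Fraction as F
import math

def lens(f):
    """Thin-lens refraction matrix for focal length f (hypothetical helper)."""
    return ((F(1), F(0)), (F(-1) / F(f), F(1)))

def gap(t):
    """Translation matrix for an axial distance t (hypothetical helper)."""
    return ((F(1), F(t)), (F(0), F(1)))

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def apply(m, ray):
    """Apply a 2x2 matrix to a ray (y, nu)."""
    y, nu = ray
    return (m[0][0] * y + m[0][1] * nu, m[1][0] * y + m[1][1] * nu)

f1, d1 = F(200), F(40)   # objective: focal length and diameter in mm
f2, d2 = F(40), F(5)     # eyepiece: focal length and diameter in mm
t = f1 + f2              # lens separation in mm

# Vertex-to-vertex system matrix: the ray meets L1, then the gap, then L2.
system = matmul(lens(f2), matmul(gap(t), lens(f1)))

# Provisional marginal ray from infinity at the rim of L1, rescaled to fit L2.
provisional = (d1 / 2, F(0))
y_out, _ = apply(system, provisional)
scale = min(F(1), (d2 / 2) / abs(y_out))
marginal_at_L1 = (scale * provisional[0], scale * provisional[1])

# Chief ray: start at the center of the stop (L2) and trace backwards.
# The inverse of lens(f) is lens(-f); the inverse of gap(t) is gap(-t).
y_back, _ = apply(gap(-t), (F(0), F(-1)))
k = (d1 / 2) / abs(y_back)
chief_at_stop = (F(0), -k)
chief_in = apply(lens(-f1), apply(gap(-t), apply(lens(-f2), chief_at_stop)))

fov_deg = math.degrees(float(2 * abs(chief_in[1])))
print(system, marginal_at_L1, chief_in, round(fov_deg, 2))
```

Under these assumptions the script reproduces the system matrix, the rescaled marginal ray height at L1, and the chief ray at the front vertex; it does not model lens thickness or aberrations.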

Example 5: Keplerian telescope, stop at eyepiece, nearby object (OV = 500 mm)

Consider a telescope with the following parameters:

L1: f1 = +200 mm, d1 = 40 mm

L2: f2 = +40 mm, d2 = 5 mm

t = f1 + f2 = 240 mm

z1 = OV = 500 mm

The provisional marginal ray goes from the center of the object to the edge of the first lens, through the system, and to the center of the image. The first provisional ray is:

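$$
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} (\text{at object}) = \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix}
$$

It is useful to locate the image by propagating this provisional ray through the system:

$$
\left(
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+40\ \mathrm{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 240\ \mathrm{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\ \mathrm{mm}} & 1 \end{bmatrix}
\right)
\cdot \begin{bmatrix} 1 & 500\ \mathrm{mm} \\ 0 & 1 \end{bmatrix}
\cdot \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 140\ \mathrm{mm} \\ -5 \end{bmatrix}
$$

So the image location relative to the rear vertex is:

$$
V'O' = -\frac{y}{u} = -\frac{140\ \mathrm{mm}}{-5\ \text{radians}} = +28\ \mathrm{mm}
$$

so the image is real.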

Now find the height of the provisional marginal ray at L1:

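$$
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} (\text{at } L_1)
= \begin{bmatrix} 1 & 500\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 500\ \mathrm{mm} \\ 1 \end{bmatrix}
$$

where the ray height is MUCH too large and must be scaled to “fit” into the lens. The scale factor is:

$$
\frac{d_1/2}{y\ (\text{at lens})} = \frac{20\ \mathrm{mm}}{500\ \mathrm{mm}} = \frac{1}{25}
$$

So the second iteration of the provisional marginal ray at the front of the first lens is:

$$
\frac{1}{25} \cdot \begin{bmatrix} 500\ \mathrm{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 20\ \mathrm{mm} \\ \frac{1}{25} \end{bmatrix}
$$

which has a much smaller incident angle.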

Now propagate this ray through the first lens to the second lens:

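$$
\mathcal{T}\,\mathcal{R}_1 \begin{bmatrix} 20\ \mathrm{mm} \\ \frac{1}{25} \end{bmatrix}
= \begin{bmatrix} 1 & 240\ \mathrm{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\ \mathrm{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 20\ \mathrm{mm} \\ \frac{1}{25} \end{bmatrix}
= \begin{bmatrix} \frac{28}{5}\ \mathrm{mm} \\ -\frac{3}{50} \end{bmatrix}
= \begin{bmatrix} 5.6\ \mathrm{mm} \\ -\frac{3}{50} \end{bmatrix}
$$

so the ray height is still too large; it is blocked by L2 (which therefore is the aperture stop); scale this ray to fit into the second lens by applying the factor:

$$
\frac{d_2/2}{y\ (\text{at } L_2)} = \frac{2.5\ \mathrm{mm}}{\frac{28}{5}\ \mathrm{mm}} = \frac{12.5}{28} = \frac{25}{56}
$$

So the third iteration produces the actual marginal ray from an object at a distance of 500 mm from L1:

$$
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at object}}
= \frac{25}{56} \cdot \begin{bmatrix} 0\ \mathrm{mm} \\ \frac{1}{25} \end{bmatrix}
= \begin{bmatrix} 0\ \mathrm{mm} \\ \frac{1}{56} \end{bmatrix}
\cong \begin{bmatrix} 0\ \mathrm{mm} \\ 0.017857 \end{bmatrix}
$$

The prescription for the marginal ray at L1 is:

$$
\begin{bmatrix} 1 & 500\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0\ \mathrm{mm} \\ \frac{1}{56} \end{bmatrix}
= \begin{bmatrix} \frac{125}{14}\ \mathrm{mm} \\ \frac{1}{56} \end{bmatrix}
\cong \begin{bmatrix} 8.929\ \mathrm{mm} \\ \frac{1}{56} \end{bmatrix}
$$

where the ray height is much smaller than the semidiameter of L1, so the lens is overly large.

We can propagate this through the system to find the actual prescription for the exiting marginal ray:

$$
\mathcal{M}_{VV'} \cdot \begin{bmatrix} 1 & 500\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 0\ \mathrm{mm} \\ \frac{1}{56} \end{bmatrix}
= \begin{bmatrix} -\frac{1}{5} & 240\ \mathrm{mm} \\ 0 & -5 \end{bmatrix}
\begin{bmatrix} 1 & 500\ \mathrm{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 0\ \mathrm{mm} \\ \frac{1}{56} \end{bmatrix}
$$

$$
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at } V'} = \begin{bmatrix} \frac{5}{2}\ \mathrm{mm} \\ -\frac{5}{56} \end{bmatrix}
$$

Just to check, find the distance to the image to make sure it matches the result for the provisional marginal ray:

$$
V'O' = -\frac{y}{nu} = -\frac{\frac{5}{2}\ \mathrm{mm}}{-\frac{5}{56}} = +28\ \mathrm{mm}
$$

which agrees with what we found earlier.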

Now that we know that L2 is the aperture stop for the specified object location, we can propagate a provisional chief ray from the center of the stop in both directions. (We will find that the chief ray is unaffected by the location of the object.) The provisional chief ray is:

$$
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \mathrm{mm} \\ +1 \end{bmatrix}
$$

Propagate through the system towards image space to obtain

$$
\mathcal{R}_2 \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ -\frac{1}{40\ \mathrm{mm}} & 1 \end{bmatrix} \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix}
$$

so the height and angle of the provisional chief ray are unchanged by the action of the lens L2 that is the stop, because the ray passes through the center of the lens. The provisional chief ray may be propagated from the stop “forwards” towards the first lens. The translation matrix yields the ray height and angle at the first lens:

$$
\mathcal{T}^{-1} \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix}
= \left( \begin{bmatrix} 1 & +240\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \right)^{-1} \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} -240\ \mathrm{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} (\text{at } L_1)
$$

Note that the height of the provisional chief ray at L1 is $y = -240$ mm, which means that it is BELOW the optical axis at a MUCH larger distance than the semidiameter $d_1/2 = 20$ mm of L1. To ensure that the chief ray “gets through” the first lens, we have to scale its angle by the factor:

$$
\frac{d_1/2}{|y|} = \frac{20\ \mathrm{mm}}{240\ \mathrm{mm}} = \frac{1}{12}
$$

So now go back to the original prescription for the provisional chief ray and scale it to obtain the “actual” chief ray:

$$
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \mathrm{mm} \\ 1 \end{bmatrix}
\Longrightarrow
\begin{bmatrix} y' \\ n'u' \end{bmatrix} = \frac{1}{12} \begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \mathrm{mm} \\ \frac{1}{12} \end{bmatrix}
$$

Note that, apart from the arbitrary choice of sign, this is the same chief ray as for the case where the object is at infinity. In words, the chief ray is determined by the stop and the diameters of the other elements, not by the location of the object.

We can now propagate the scaled chief ray from the rear vertex to and through the front vertex of the system. The chief ray emerging from the front vertex is:

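$$
\left( \begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\ \mathrm{mm}} & 1 \end{bmatrix} \right)^{-1}
\left( \begin{bmatrix} 1 & +240\ \mathrm{mm} \\ 0 & 1 \end{bmatrix} \right)^{-1}
\begin{bmatrix} 0\ \mathrm{mm} \\ \frac{1}{12} \end{bmatrix}
= \begin{bmatrix} -20\ \mathrm{mm} \\ -\frac{1}{60} \end{bmatrix}
$$

which has the correct ray height magnitude (the semidiameter of L1), $|y| = 20$ mm, and angle $nu = -\frac{1}{60}$ radian. The field of view of the system is twice the magnitude of the angle:

$$
\mathrm{FoV} = 2 \cdot \frac{1}{60}\ \text{radian} = \frac{1}{30}\ \text{radian} = \frac{1}{30} \cdot \frac{180^\circ}{\pi} \cong 1.91^\circ
$$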

The exit pupil is (obviously) located at the aperture stop L2, while the entrance pupil is the image of the stop in object space, so we can evaluate the location of the entrance pupil from the calculation of the chief ray emerging from the front vertex:

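$$
\begin{bmatrix} y' \\ n'u' \end{bmatrix} (\text{emerging from front vertex}) = \begin{bmatrix} -20\ \mathrm{mm} \\ -\frac{1}{60} \end{bmatrix}
$$

The height has magnitude 20 mm and the angle has magnitude $\frac{1}{60}$ radian, so the distance from the front vertex to the location where the ray crosses the optical axis is:

$$
\left| z_{V \to NP} \right| = \frac{|y'|}{|n'u'|} = \frac{20\ \mathrm{mm}}{\frac{1}{60}} = 1200\ \mathrm{mm}
$$

in front of the objective; the entrance pupil is real and the magnitude of its magnification is:

$$
|M_T| = \frac{1200\ \mathrm{mm}}{240\ \mathrm{mm}} = 5
$$

so the diameter of the entrance pupil is:

$$
d_{NP} = 5 \cdot d_{\text{Stop}} = 5 \cdot 5\ \mathrm{mm} = 25\ \mathrm{mm}
$$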

Marginal ray (red) and chief ray (blue) from object at a distance of 500 mm from the first lens traced through Keplerian telescope with aperture stop at second lens.
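The iterative rescaling of Example 5 is easy to mechanize. The sketch below is one possible way to do it, assuming thin lenses and the same (y, nu) ray convention; `lens`, `gap`, and `trace` are hypothetical helper names, and the entrance pupil is located simply from where the back-traced chief ray crosses the axis.

```python
from fractions import Fraction as F

def lens(f):
    """Thin-lens refraction matrix for focal length f (hypothetical helper)."""
    return ((F(1), F(0)), (F(-1) / F(f), F(1)))

def gap(t):
    """Translation matrix for an axial distance t (hypothetical helper)."""
    return ((F(1), F(t)), (F(0), F(1)))

def trace(ray, *elements):
    """Apply the elements to the ray in the order the ray meets them."""
    y, nu = ray
    for (a, b), (c, d) in elements:
        y, nu = a * y + b * nu, c * y + d * nu
    return (y, nu)

f1, d1, f2, d2 = F(200), F(40), F(40), F(5)   # all in mm
t = f1 + f2
z1 = F(500)                                   # object distance OV in mm

# Iteration 1: unit-angle ray from the axial object point, scaled to fit L1.
y_L1, _ = trace((F(0), F(1)), gap(z1))
scale = min(F(1), (d1 / 2) / abs(y_L1))

# Iteration 2: carry the rescaled ray to L2 and rescale again if it is blocked.
y_L2, _ = trace((F(0), scale), gap(z1), lens(f1), gap(t))
if abs(y_L2) > d2 / 2:
    scale *= (d2 / 2) / abs(y_L2)             # L2 limits the cone: aperture stop

marginal = (F(0), scale)
y_out, nu_out = trace(marginal, gap(z1), lens(f1), gap(t), lens(f2))
image_distance = -y_out / nu_out              # measured from the rear vertex

# Chief ray: back-trace from the center of the stop to the front vertex.
y_back, _ = trace((F(0), F(1)), gap(-t))
k = (d1 / 2) / abs(y_back)
chief_front = trace((F(0), k), gap(-t), lens(-f1))
pupil_distance = abs(chief_front[0] / chief_front[1])   # in front of L1
pupil_diameter = d2 * pupil_distance / t

print(marginal, (y_out, nu_out), image_distance, pupil_distance, pupil_diameter)
```

With the parameters of this example the sketch returns the marginal ray angle, the image distance behind the rear vertex, and the entrance pupil distance and diameter derived above.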

Chapter 4

Depth of Field and Depth of Focus

From experience with snapshots or movies, we all know that the optical images are not “in focus” for objects at all distances from the lens; objects at distances other than that focused appear blurry. This is not necessarily bad — it is used as a creative tool by photographers and cinematographers to concentrate the attention of the viewer on particular objects of interest. However, in many (if not all) scientific applications, this limitation to the region of “good” imaging is detrimental; we’d like to see the entire 3-D object “in sharp focus.” For this reason, it is essential to understand the factors that affect the depth of the region of “sharp focus,” which is the so-called “depth of field” on the object as “seen” through the imaging system.

The concept of depth of field and focus and the dependence on f/# is illustrated in the figure for a specified linear dimension of “acceptable sharpness.” The extent of the cone of rays between the two locations truncated by this sharpness criterion is the “depth of focus.” Clearly this range is larger for a smaller cone angle (larger f/#). This would lead us to the conclusion that the depth of focus (and also its object-space equivalent, the depth of field) is proportional to the f/#:

$$
\Delta z \propto f/\#
$$

A more accurate criterion requires application of the principles of wave optics to show that diffraction produces a “blur spot” whose linear dimension also increases with the focal ratio, and which defines the dimension of “acceptable” blur. A hybrid combination of the principles of ray and wave optics leads to a criterion that the depths of field and of focus actually vary with the square of the f/#:

$$
\Delta z \propto (f/\#)^2
$$

This hybrid criterion is discussed after illustrating the concept of depths of field and focus using examples from film and television.


The depth of focus for a known linear dimension of “acceptable sharpness” depends on the angle of the cone of rays, which is determined by the focal ratio (f/#) of the system. If the cone of rays is large (small f/#), then the extent of the cone in front of and behind the point of best focus is small; if the angle of the ray cone is small (large f/#), then a wider range of depths appears “in focus.”

4.0.2 Examples of Depth of Field from Video and Film

Extensive discussion in Wikipedia at http://en.wikipedia.org/wiki/Depth_of_field

1. The Colbert Report, video image with “normal lens,” shows the difference in apparent sharpness with depth in the scene. This naturally draws attention to the object that is in focus and often serves as a cue to the audience about which is the object of interest. There are three areas of interest at different distances from the lens, which is focused on the nearest plane (Stephen Colbert); the more distant plane where Jon Stewart sits is noticeably blurry, and the bookshelf in the most distant plane is very blurry.

Note the difference in sharpness with depth; Stephen Colbert in the foreground is in sharp focus, Jon Stewart is clearly less sharp, and the items in the background are quite blurry.

2. Sherlock (© 2011), Masterpiece Mystery from the BBC, using limited depth of field to draw attention to the point of interest. This example shows how the director draws the attention of the audience to the desired point of interest. The two frames are from A Scandal in Belgravia, the first episode in the second season of Sherlock broadcast by the BBC and PBS. The two frames are taken from the same camera position and separated in time by approximately two seconds. In the first frame, “Sherlock” (Benedict Cumberbatch) is speaking about the camera phone of “Irene Adler” (Lara Pulver). After he finishes speaking, the camera focus shifts rapidly to Adler in the background for her reply. Note that her form is barely distinguishable in the first frame, which focuses the viewer’s attention upon Sherlock in the foreground.

Use of limited depth of field to draw the attention of the audience to the subject of interest. The camera shifts focus rapidly from the foreground character (at top) to the background character (at bottom).

3. Citizen Kane by Orson Welles, small aperture (large f/#) ⇒ large depth of field

Both foreground and background are in focus — note cheek of “Mr. Bernstein” (Everett Sloane) in near foreground on right and venetian blinds in the windows at the back. “Walter Thatcher” (George Coulouris) on left and “Charles Foster Kane” (Orson Welles) in center are in focus. The distance to the windows appears to be small because of the sharp focus.

Different frame of same scene from “Citizen Kane” shot with same focus setting. George Coulouris (as “Walter Thatcher”) and Everett Sloane (as “Mr. Bernstein”) remain in focus in the foreground. Orson Welles (as “Charles Foster Kane”) has walked to the windows, which are now clearly many feet from the foreground characters. “Kane’s” stature appears to have been diminished.

The film “Citizen Kane” (© 1941 RKO Pictures, Inc.) is famous for its creative cinematography by Gregg Toland and the director/star Orson Welles, including original camera angles (especially upward shots from the floor or even from beneath the floor plane), movements, transitions, and the use of “deep focus.” Consider the two frames from the film of a group of three characters: the standing Orson Welles in the center (at age 26 as the elderly “Charles Foster Kane,” a testament to the skill of makeup artist Maurice Seiderman), George Coulouris on the left (as “Walter Parks Thatcher,” who had been Kane’s guardian), and Everett Sloane on the right (as Kane’s assistant “Mr. Bernstein”). In the first frame, the three characters are grouped together and the entire scene appears to be in focus, from the skin on Bernstein’s face on the right to the venetian window blinds in the back. From the sharp focus of the background windows and expectations about depth of field based on past experience, viewers likely will surmise that the windows must be physically close to the characters and therefore that Kane is much taller than the background window sill. Between the first and second images, the standing Kane has taken 18 steps to walk to the windows (perhaps 35-50 feet from the foreground characters), while remaining in focus the entire time. His height is now shown to be approximately the same as the height of the window sill. The apparent “shrinking” of his size during the walk may be interpreted as an artistic metaphor for the diminishing stature of Kane due to the partial failure of his media empire during the Depression. He subsequently walks back to the foreground to sign the agreement held by Mr. Thatcher that sells much of his publishing/broadcasting empire back to Thatcher’s bank.

The very large depth of field can only be obtained by a small aperture stop, which reduces the light reaching the sensor. Clearly the emulsion must have good sensitivity (it must have been a “fast film”) and the lighting must be sufficiently strong to record “useful” images. The sequence is available on YouTube at http://www.youtube.com/watch?v=WTmVlDh2V2g. Interested readers might want to view the documentary about the movie (http://www.youtube.com/watch?v=eCkYlCBFV6w). Another scene in the movie that is interesting from the perspective of optics is the so-called “mirror scene,” which is at the end of the 1-minute clip at http://www.youtube.com/watch?v=8fIP7g9en10

Still from the “mirror scene” in “Citizen Kane.” Again, note the depth of field.

4. Spellbound, by Alfred Hitchcock (© Selznick International Pictures, Vanguard Films 1945). The climactic scene in this classic movie is a confrontation between “Dr. Murchison” (Leo G. Carroll) and “Dr. Constance Petersen” (Ingrid Bergman), where Petersen reveals she has evidence that Murchison murdered Dr. Anthony Edwardes, whose “substitute imposter” is played by Gregory Peck. Frames from the scene are shown in the figure. The frames from the viewpoint of Dr. Murchison show the view of his hand, the gun, and Ingrid Bergman, with all apparently “in focus.” To avoid problems with depth of field, the hand and gun are actually models that are larger than life size that were positioned closer to Bergman than to the camera. The website for Turner Classic Movies states that the scene took a week to set up and 19 takes to get the final result (http://www.tcm.com/this-month/article/18621%7C0/Spellbound.html). YouTube clip available from http://www.youtube.com/watch?v=8rDMotFmCJc.

Scenes from “Spellbound” (© Selznick International Pictures 1945), showing (a) Leo G. Carroll holding a revolver; (b) Ingrid Bergman walking towards the door as Carroll’s character aims the revolver; (c) and (d) after Bergman’s exit, the hand and gun turn towards the camera and the gun fires.

An additional note of interest in this black-and-white film is that two color frames as the gun fires were spliced into each print by hand.

One of the two color frames of the gunshot spliced into each print of the film “Spellbound.”

5. Somewhere in Time, split-diopter lens to focus on two distances simultaneously, giving the appearance of expanded depth of field

Split-diopter lens (Fig. 5.13 from Visual effects cinematography By Zoran Perisic), which is attached to the front of a normal lens and which adds power on one side of the field of view.

The frame from “Somewhere in Time” (© Universal Studios, 1980) illustrates the action of the “split-diopter” lens added to the normal camera lens. Both the foreground field on the right (with Christopher Reeve as “Richard Collier”) and left-hand background field (with Jane Seymour as “Elyse McKenna,” the white garden bench, and the trees) appear to be “in focus.” The split diopter lens adds refractive power (thus shortening the focal length) for half the field. Because the sensor is the same distance from the rear vertex of these two “half-systems,” the object plane that is in focus in the half field with the additional power is closer to the lens. In this example, the split-diopter lens is oriented to “split” the fields through the vertical white pillar and adds power to the right half of the field. The left side of the vertical pillar is “fuzzier” than the right side, where the features of the wood grain are visible. Note that the trees in the background on the right are out of focus, while those on the left are sharp. The audience likely does not notice the discrepancies in the image planes.

Frame from “Somewhere in Time” (© Universal Studios, 1980) showing use of “split-diopter lens.” Both foreground and background are “in focus” but note that the left side of the foreground pillar is “fuzzy” while the right side is “sharp.”

A system consisting of both optics and sensor is “diffraction-limited” if the pixel size of the sensor (smallest resolvable spot) is smaller than the linear dimension of the diffraction spot. The system is “detector-limited” / “sensor-limited” if the linear dimension of the individual sensor elements is larger than the diffraction spot.

4.1 Criterion for “Acceptable Blur”

The discussion of the limiting “blur” of an imaging system may be extended to characterize the range of “distances” (or “depths”) over which images of point objects exhibit the “same” (or at least “similar”) blur dimensions. If specified in object space, the distance range is called the “depth of field;” the same metric in image space is the “depth of focus.” The depth of field may be thought of as the “zone of acceptable sharpness” for object locations. There is no one way to define the depths of field and focus, but we can rather easily derive a metric based on ray optics and a hybrid metric that includes the concept of “diffraction” from wave optics (where the aspects must be taken “on faith” at this point). The measurement is based upon the linear dimension $B'$ of the “acceptable blur.” This may be due to a metric of acceptable spatial resolution or the size of the sensor elements, or the diameter of the diffraction spot in the hybrid metric. Consider a hypothetical value of $B'$ shown in the figure. From this value, it is easy to determine the range of possible axial distances that correspond to $B'$ in the ray model and use that to evaluate the corresponding dimension $B$ in object space via the transverse magnification

$$
M_T = -\frac{z'}{z} = \frac{B'}{B}.
$$

The calculation of depth of field: $B'$ is the linear dimension of the blur for the system (either the diameter of the diffraction spot in a diffraction-limited system or the dimension of the sensor element in a detector-limited system). The locations $z' \pm \delta'$ specify locations in image space where the geometrical blur has the same linear size. The corresponding locations in object space are the limits of the “depth of field.”

As shown in the figure for a given $B'$, the “blur” spots are located at two positions equidistant from the “in-focus” image. We assign the name $\delta'$ to the distance between the “in-focus” image and the geometrically blurred images, so these two planes are located at $z' \pm \delta'$. The depth of focus in this model is twice $\delta'$:

$$
\Delta z' = 2 \cdot \delta'
$$

In the ray model, the drawing shows that:

$$
\frac{D}{z'} = \frac{B'}{\delta'} \Longrightarrow \delta' = B' \cdot \frac{z'}{D} \cong B' \cdot f/\#
$$

(in the case where the object distance is “many” focal lengths so that the image distance is only slightly longer than a focal length). If $B'$ is small, so must be $\delta'$; if the f/# is large, so must be $\delta'$.

The object distances $z_1$ and $z_2$ corresponding to these image locations may be evaluated from the imaging equation for the corresponding image distances $z'_1 = z' - \delta'$ and $z'_2 = z' + \delta'$. It is easy to see that the absolute magnification $|M_T|$ is smaller for the smaller image distance, i.e., $|M_T|$ for $z'_1 = z' - \delta'$ is smaller than $|M_T|$ for the larger image distance $z'_2 = z' + \delta'$. The nonlinearity of the imaging equation ensures that the distances between the in-focus object distance $z$ and the extrema are not equal, i.e., $z_1 - z \neq z - z_2$, thus requiring labels for both: $z_1 = z + \delta_1$ and $z_2 = z - \delta_2$. However, if $\delta'$ is small, then the concept of longitudinal magnification $M_L$ allows simple approximate expressions for the object distances. We already derived a simple expression for $M_L$ in terms of the transverse magnification $M_T$:

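Differentiate both sides of the imaging equation:

$$
d\left( \frac{1}{z_1} + \frac{1}{z_2} \right) = d\left( \frac{1}{f} \right) = 0
$$

$$
d\left( \frac{1}{z_1} + \frac{1}{z_2} \right) = \left( -\frac{1}{z_1^2} \right) dz_1 + \left( -\frac{1}{z_2^2} \right) dz_2 = 0
$$

$$
\Longrightarrow \frac{dz_2}{dz_1} = -\frac{z_2^2}{z_1^2} = -\left( \frac{z_2}{z_1} \right)^2 = -(M_T)^2 < 0
$$

$$
M_L = \frac{(\Delta z)'}{\Delta z} = -\left( \frac{z_2}{z_1} \right)^2 = -(M_T)^2 < 0
$$

The increments in object distance are related to the increments in image distance via the longitudinal magnification:

$$
\delta' \cong |M_L| \cdot \delta_1 \cong |M_L| \cdot \delta_2 \Longrightarrow \delta_1 \cong \delta_2 \cong \frac{\delta'}{|M_L|}
$$

$$
z_1 = z + \delta_1 \cong z + \frac{\delta'}{|M_L|} = z + \frac{\delta'}{M_T^2}
$$

$$
z_2 = z - \delta_2 \cong z - \frac{\delta'}{|M_L|} = z - \frac{\delta'}{M_T^2}
$$

So the depth of field is proportional to the f/# and to the linear dimension of the acceptable blur:

$$
\Delta z = z_1 - z_2 = \delta_1 + \delta_2 \cong 2 \cdot \frac{\delta'}{|M_L|} = 2 \cdot \frac{\delta'}{M_T^2} = 2 \cdot \frac{B' \cdot f/\#}{M_T^2}
$$

$$
\Delta z \cong \left( \frac{2 B'}{M_T^2} \right) \cdot f/\# \propto f/\#
$$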

In the detector-limited case where the blur dimension is determined by the pixel dimension $b'$, the depth of field is proportional to the f/#:

$$
\Delta z \cong 2 \cdot \frac{b'}{M_T^2} \cdot f/\# \propto f/\# \quad \text{(in ray model)}
$$

Note that the depth of field is larger in “slower” systems (with large f-numbers and small cone angles).
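The quality of the longitudinal-magnification approximation can be checked against the imaging equation itself. The following is a minimal sketch, assuming a single thin lens and illustrative values that are not taken from the notes (a 50 mm lens at f/8, an object at 1 m, and an acceptable blur of 5 μm); the function names are hypothetical.

```python
def conjugate_distance(z, f):
    """Thin-lens imaging equation solved for the conjugate distance.

    Because 1/z1 + 1/z2 = 1/f is symmetric, the same expression maps an
    object distance to an image distance and vice versa.
    """
    return 1.0 / (1.0 / f - 1.0 / z)

# Illustrative (assumed) values, all lengths in mm.
f = 50.0          # focal length
f_number = 8.0    # focal ratio
z = 1000.0        # in-focus object distance
blur = 0.005      # acceptable blur B' in image space

D = f / f_number                       # aperture diameter
z_img = conjugate_distance(z, f)       # in-focus image distance z'
m_t = -z_img / z                       # transverse magnification
delta_img = blur * z_img / D           # delta' from similar triangles

# Exact limits of the depth of field from the imaging equation.
z_far = conjugate_distance(z_img - delta_img, f)    # shorter image distance
z_near = conjugate_distance(z_img + delta_img, f)   # longer image distance
dof_exact = z_far - z_near

# Approximation via the longitudinal magnification |M_L| = M_T**2.
dof_approx = 2.0 * delta_img / m_t**2

print(f"delta_1 = {z_far - z:.2f} mm, delta_2 = {z - z_near:.2f} mm")
print(f"exact = {dof_exact:.2f} mm, approximate = {dof_approx:.2f} mm")
```

For these assumed values the far and near increments differ slightly, as the nonlinearity of the imaging equation requires, while their sum agrees closely with the approximate expression.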

If we add the wave concept of “diffraction,” the linear dimension $B'$ is determined by the diffraction pattern, which may be written in terms of the wavelength and the focal ratio. Assume that the linear dimension of image blur has been measured for a particular imaging system at the specific pair of object and image distances ($z$ and $z'$ respectively) of interest:

Blur in a diffraction-limited system with aperture diameter D. The image of the point source is a diffraction pattern at the image plane whose linear dimension (using some criterion) is $B'$.

For example, the image of a point source located a distance $z$ from the system could be measured to find this limiting “blur diameter” $B'$, where the prime indicates that the measurement is made in image space. In a diffraction-limited system, the discussion of Fraunhofer diffraction in imaging shows that one possible measure for $B'$ is the diameter of the central lobe of the diffraction spot:

$$
B' = 2.44 \cdot \lambda_0 \cdot \frac{z'}{D} \cong 2.44 \cdot \lambda_0 \cdot \frac{f}{D} = 2.44 \cdot \lambda_0 \cdot f/\#
$$

$$
B' \cong 2.44 \cdot \lambda_0 \cdot f/\#
$$

$$
\Delta z \cong 2 \cdot (2.44 \cdot \lambda_0 \cdot f/\#) \cdot \frac{f/\#}{M_T^2} = 4.88 \cdot \frac{\lambda_0 \cdot (f/\#)^2}{M_T^2}
$$

$$
\Delta z \cong 4.88 \cdot \frac{\lambda_0 \cdot (f/\#)^2}{M_T^2} \quad \text{(if accounting for diffraction)}
$$

So the depths of field and of focus are proportional to the square of the f/# in the diffraction-limited case.
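The quadratic dependence on focal ratio can be seen directly by evaluating the hybrid expression. This is only a sketch with hypothetical function names, and the magnification of -0.05 is an assumed illustrative value rather than one from the notes.

```python
def diffraction_blur_um(wavelength_um, f_number):
    """Diameter of the central diffraction lobe, B' = 2.44 * lambda_0 * f/#."""
    return 2.44 * wavelength_um * f_number

def depth_of_field_hybrid_um(wavelength_um, f_number, m_t):
    """Ray-model depth of field 2 * B' * (f/#) / M_T**2 with the diffraction blur."""
    blur = diffraction_blur_um(wavelength_um, f_number)
    return 2.0 * blur * f_number / m_t**2

wavelength_um = 0.5   # visible light
m_t = -0.05           # assumed transverse magnification

dof_f8 = depth_of_field_hybrid_um(wavelength_um, 8.0, m_t)
dof_f16 = depth_of_field_hybrid_um(wavelength_um, 16.0, m_t)

print(f"f/8:  {dof_f8 / 1000:.1f} mm")
print(f"f/16: {dof_f16 / 1000:.1f} mm")
print(f"ratio: {dof_f16 / dof_f8:.1f}")
```

Doubling the focal ratio doubles both the blur diameter and the length of the ray cone that fits inside it, which is where the factor of four comes from.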

4.2 Depth of Field via Rayleigh’s Quarter-Wave Rule

We can also derive the depth of focus by finding the range of image locations that satisfy Rayleigh’s rule applied to defocus, and then transform those image distances back into object space via the imaging relation to find the depth of field. The necessary task is to find the change in the image location for change in the wavefront error at the edge of the pupil. In the figure, the ideal reference wavefront has radius $R_1$ ($R_1 \cong f$ if the object is a large distance away) and the wavefront with defocus has radius $R_2 = R_1 + \delta' \cong f + \delta'$, where $\delta'$ is the change in location of the focal plane with an added quadratic phase of $\Delta W_{020} = \pm \frac{\lambda_0}{4}$. The quadratic-phase approximation to the new wavefront is:

$$
W[x, y] = \frac{x^2 + y^2}{2 R_2} = \frac{x^2 + y^2}{2 (R_1 + \delta')} = \frac{x^2 + y^2}{2 R_1 \left( 1 + \frac{\delta'}{R_1} \right)}
$$

$$
= \frac{x^2 + y^2}{2 R_1} \left( 1 + \frac{\delta'}{R_1} \right)^{-1} = \frac{x^2 + y^2}{2 R_1} \cdot \sum_{n=0}^{+\infty} (-1)^n \left( \frac{\delta'}{R_1} \right)^n
$$

$$
= \frac{x^2 + y^2}{2 R_1} \left( 1 + (-1) \frac{\delta'}{R_1} + \frac{(-1)(-2)}{2!} \left( \frac{\delta'}{R_1} \right)^2 + \frac{(-1)(-2)(-3)}{3!} \left( \frac{\delta'}{R_1} \right)^3 + \cdots \right)
$$

$$
\cong \frac{x^2 + y^2}{2 R_1} \left( 1 - \frac{\delta'}{R_1} \right) \quad \left( \text{if } \left| \frac{\delta'}{R_1} \right| \cong \left| \frac{\delta'}{f} \right| \ll 1 \right)
$$

$$
= \frac{x^2 + y^2}{2 R_1} - \delta' \cdot \frac{x^2 + y^2}{2 R_1^2}
$$

where the first term is the quadratic-phase approximation to the ideal wavefront and the second term is the additional effect of the defocus.

Change in image position $\delta'$ as a function of the wavefront error $\Delta W = W_{020}$ for defocus.

In the limit where the object distance is large, the image distance R1 is approximately equal to the focal length f, so this expression simplifies to:

$$
W[x, y] \cong \frac{x^2 + y^2}{2f} - \delta' \left( \frac{x^2 + y^2}{2 f^2} \right)
$$

$$
\Delta W[x, y] = W[x, y] - \frac{x^2 + y^2}{2f} \Longrightarrow \Delta W[x, y] \cong -\delta' \left( \frac{x^2 + y^2}{2 f^2} \right)
$$

If the wavefront error is positive, $\Delta W > 0 \Longrightarrow \delta' < 0$, which means that the image moves “towards” the lens as shown in the figure. The magnitude of the wavefront error at the edge of the pupil (where, say, $x = \frac{d_0}{2}$ and $y = 0$) is:

$$
|\Delta W| = \left| W\left[ x = \frac{d_0}{2}, y = 0 \right] \right| = |\delta'| \cdot \frac{\left( \frac{d_0}{2} \right)^2 + 0^2}{2 f^2} = |\delta'| \cdot \frac{d_0^2}{8 f^2}
$$

We can now apply Rayleigh’s rule that the image is effectively ideal if the maximum wavefront error is less than a quarter wave, so that the single-sided depth of focus is easy to evaluate:

$$
\frac{\lambda_0}{4} > |\Delta W| = \delta' \cdot \frac{d_0^2}{8 f^2} \Longrightarrow \delta' \cong \frac{\lambda_0}{4} \cdot \frac{8 f^2}{d_0^2} = 2 \lambda_0 \frac{f^2}{d_0^2} = 2 \lambda_0 \left( \frac{f}{d_0} \right)^2
$$

$$
\Longrightarrow \delta' \cong 2 \lambda_0 \cdot (f/\#)^2 \quad \text{using Rayleigh’s rule for ideal imaging}
$$

In visible light with $\lambda_0 \cong 0.5\ \mu\mathrm{m}$, the change in image position under the Rayleigh criterion is

$$
\delta' [\lambda_0 \cong 0.5\ \mu\mathrm{m}] \cong (f/\#)^2\ [\mu\mathrm{m}]
$$

In words, an image in visible light appears to be “in focus” if the distance of the actual image plane from the ideal image plane in micrometers is no larger than the square of the f/#. For example, if the lens is used at f/4, the actual image plane must be within 16 μm of the ideal location; if at f/16, the actual image plane must be within 256 μm ≅ 0.25 mm of the ideal location. Note the similarities and the differences with the rule of thumb that the size of the diffraction spot in micrometers is equal to the f/#.
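As a consistency check on this rule of thumb, the defocus allowed by Rayleigh’s rule can be substituted back into the difference between the two quadratic wavefronts, without the small-defocus approximation. The sketch below assumes a 50 mm focal length (an illustrative choice) and uses a hypothetical function name.

```python
def defocus_wavefront_error_um(r_um, f_um, delta_um):
    """Difference between the defocused and ideal quadratic wavefronts at radius r."""
    return r_um**2 / (2.0 * (f_um + delta_um)) - r_um**2 / (2.0 * f_um)

wavelength_um = 0.5
f_um = 50_000.0            # assumed focal length of 50 mm, in micrometers

results = {}
for f_number in (4.0, 16.0):
    delta_um = 2.0 * wavelength_um * f_number**2   # Rayleigh defocus limit
    r_um = (f_um / f_number) / 2.0                 # pupil semidiameter d0 / 2
    dw_um = defocus_wavefront_error_um(r_um, f_um, delta_um)
    results[f_number] = (delta_um, dw_um)
    print(f"f/{f_number:g}: delta' = {delta_um:g} um, edge error = {dw_um:.4f} um")
```

In both cases the magnitude of the error at the edge of the pupil comes out close to a quarter of the wavelength, and its sign is negative for a positive shift of the image plane, in agreement with the sign convention above.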

The depth of focus is twice this value because we can defocus on either side of the ideal image plane:

$$
\text{Depth of focus: } (\Delta z)' = 2 \delta' \cong 4 \lambda_0 (f/\#)^2 \cong 2 \cdot (f/\#)^2\ [\mu\mathrm{m}]
$$

Now convert this to the object space via the longitudinal magnification to find the depth of field:

$$
M_L = \frac{\delta'}{\delta} = \frac{(\Delta z)'}{\Delta z} = -(M_T)^2
$$

$$
\Delta z = 2 \cdot \delta = \frac{(\Delta z)'}{|M_L|} = \frac{(\Delta z)'}{(M_T)^2}
$$

$$
\Delta z \cong \frac{4 \lambda_0 (f/\#)^2}{(M_T)^2}
$$

which again is proportional to the square of the f-ratio and is quite similar to the “hybrid” metric for depth of field in the diffraction-limited case from the last section:

$$
\text{Depth of field: } \quad \Delta z_{\text{Hybrid}} \cong 4.88 \cdot \left( \frac{\lambda_0 (f/\#)^2}{(M_T)^2} \right) \simeq \Delta z_{\text{Rayleigh}} \cong 4 \cdot \left( \frac{\lambda_0 (f/\#)^2}{(M_T)^2} \right)
$$

These two expressions are quite similar; the fact that these are not identical should be no surprise since they were derived using different assumptions.

Note that the depth of field increases as the square of the f/#, so stopping down the lens by a factor of 2 has a big impact — it increases the depth of field by about a factor of 4. Since the transverse magnification is less than unity for most real imaging setups (and a lot less for distant objects), the depth of field increases rapidly as the object distance increases.

It might be useful to do an example. Consider a normal lens with $f = 50$ mm acting in visible light ($\lambda_0 = 500$ nm = 0.5 μm) with the aperture wide open (say, f/2, so that the diameter of the entrance pupil is $d_0 = 25$ mm) imaging a nearby object with $z_1 = 1$ m:

$$
z_2 = \left( \frac{1}{50\ \mathrm{mm}} - \frac{1}{1000\ \mathrm{mm}} \right)^{-1} \cong 52.63\ \mathrm{mm}
$$

$$
M_T = -\frac{z_2}{z_1} = -\frac{52.63\ \mathrm{mm}}{1000\ \mathrm{mm}} = -0.05263
$$

where (again) the negative sign on the transverse magnification means that the image is “upside down” compared to the object. The depth of focus is:

$$
\text{depth of focus at } f/2: \quad (\Delta z)' = 2 \delta' \cong 4 \cdot 0.5\ \mu\mathrm{m} \cdot 2^2 = 8\ \mu\mathrm{m}
$$

And the depth of field is obtained by dividing by the square of the transverse magnification:

$$
\text{depth of field at } f/2: \quad \Delta z \cong \frac{(\Delta z)'}{M_T^2} \cong \frac{8\ \mu\mathrm{m}}{(-0.05263)^2} \cong 2.9\ \mathrm{mm}
$$

If we stop the lens down to, say, f/16 (a factor of 8), the depths of focus and field are much larger:

$$
\text{depth of focus at } f/16: \quad (\Delta z)' = 2 \delta' \cong 4 \cdot 0.5\ \mu\mathrm{m} \cdot 16^2 = 512\ \mu\mathrm{m} \cong 0.5\ \mathrm{mm}
$$

$$
\text{depth of field at } f/16: \quad \Delta z \cong \frac{512\ \mu\mathrm{m}}{(-0.05263)^2} \cong 185\ \mathrm{mm}
$$

If the object is a large distance away, say $z_1 = 100$ m with the lens wide open at f/2, the transverse magnification is much smaller:

$$
z_2 = \left( \frac{1}{50\ \mathrm{mm}} - \frac{1}{100\ \mathrm{m}} \right)^{-1} \cong 50.025\ \mathrm{mm}
$$

$$
M_T = -\frac{z_2}{z_1} = -\frac{50.025\ \mathrm{mm}}{100\ \mathrm{m}} = -5.0025 \times 10^{-4}
$$

The depth of focus is the same as it was for the close-up image at f/2:

$$
(\Delta z)' = 4 \cdot 0.5\ \mu\mathrm{m} \cdot 2^2 = 8\ \mu\mathrm{m}
$$

but the much smaller value for the transverse magnification means that the depth of field is much larger, both at f/2 and at f/16:

$$
\text{at } f/2: \quad \Delta z \cong \frac{8\ \mu\mathrm{m}}{(-5.0025 \times 10^{-4})^2} \cong 32\ \mathrm{m}
$$

$$
\text{at } f/16: \quad \Delta z \cong \frac{512\ \mu\mathrm{m}}{(-5.0025 \times 10^{-4})^2} \cong 2\ \mathrm{km}
$$
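The arithmetic of this example is easy to reproduce. The sketch below assumes the Rayleigh expressions for depth of focus and depth of field and a single thin lens; the function names are hypothetical. Note that for the object at 1 m the magnification is -52.63 mm / 1000 mm, i.e. about -0.0526.

```python
def image_distance_mm(z1_mm, f_mm):
    """Thin-lens imaging equation solved for the image distance."""
    return 1.0 / (1.0 / f_mm - 1.0 / z1_mm)

def rayleigh_depth_of_focus_mm(wavelength_mm, f_number):
    """Two-sided depth of focus, 4 * lambda_0 * (f/#)**2."""
    return 4.0 * wavelength_mm * f_number**2

def rayleigh_depth_of_field_mm(z1_mm, f_mm, f_number, wavelength_mm):
    """Depth of focus divided by the square of the transverse magnification."""
    m_t = -image_distance_mm(z1_mm, f_mm) / z1_mm
    return rayleigh_depth_of_focus_mm(wavelength_mm, f_number) / m_t**2

f_mm = 50.0
wavelength_mm = 0.0005      # 500 nm

cases = {}
for z1_mm in (1000.0, 100_000.0):          # 1 m and 100 m
    for f_number in (2.0, 16.0):
        cases[(z1_mm, f_number)] = rayleigh_depth_of_field_mm(
            z1_mm, f_mm, f_number, wavelength_mm
        )

for (z1_mm, f_number), dof_mm in cases.items():
    print(f"z1 = {z1_mm / 1000:g} m, f/{f_number:g}: depth of field = {dof_mm:.4g} mm")
```

The same functions can be reused for other focal lengths or wavelengths, subject to the same thin-lens and quarter-wave assumptions.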

Depth of field of lens focused at $z_1 = 20$ ft ≅ 6 m for three focal ratios: f/1.8, f/5.6, and f/16, showing increase in depth of field with increasing focal ratio (from http://www.engadget.com).

4.3 Hyperfocal Distance

The last example just presented, where the object distance $z_1 = 100$ m and the depth of field $\Delta z \cong 2$ km, suggests another useful imaging metric: the shortest object distance for which the depth of field extends to infinity, which is called the hyperfocal distance $(z_1)_{\text{hyperfocal}}$. The corresponding image distance $(z_2)_{\text{hyperfocal}}$ is the sum of the focal length and the “defocus distance” $\delta'$:

$$
(z_1)_{\text{hyperfocal}} + \delta_1 = \infty \Longrightarrow (z_2)_{\text{hyperfocal}} - \delta' = f
$$

$$
\Longrightarrow (z_2)_{\text{hyperfocal}} = f + \delta'
$$
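Given this image distance, the object distance follows directly from the imaging equation. The following is a minimal sketch, assuming that the defocus distance is taken from Rayleigh’s rule of the previous section (two wavelengths times the square of the focal ratio) and that the lens is thin; the function name and the sample values are illustrative.

```python
def hyperfocal_distance_mm(f_mm, f_number, wavelength_mm):
    """Object distance conjugate to the image distance f + delta'.

    Assumes delta' = 2 * lambda_0 * (f/#)**2 (Rayleigh quarter-wave rule).
    """
    delta_mm = 2.0 * wavelength_mm * f_number**2
    z2_mm = f_mm + delta_mm
    return 1.0 / (1.0 / f_mm - 1.0 / z2_mm)

f_mm = 50.0
wavelength_mm = 0.0005   # 500 nm

h_f2 = hyperfocal_distance_mm(f_mm, 2.0, wavelength_mm)
h_f16 = hyperfocal_distance_mm(f_mm, 16.0, wavelength_mm)

print(f"f/2:  {h_f2 / 1000:.1f} m")
print(f"f/16: {h_f16 / 1000:.2f} m")
```

Because the defocus distance is tiny compared with the focal length, the result is dominated by the square of the focal length divided by the defocus distance, so stopping down brings the hyperfocal distance much closer to the lens.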

The hyperfocal object distance $(z_1)_{\text{hyperfocal}}$ satisfies the imaging equation for this image distance:

$$
\frac{1}{(z_1)_{\text{hyperfocal}}} + \frac{1}{(z_2)_{\text{hyperfocal}}} = \frac{1}{f}
$$

$$
\text{Hyperfocal Distance: } \quad (z_1)_{\text{hyperfocal}} = \left( \frac{1}{f} - \frac{1}{f + \delta'} \right)^{-1} = \frac{f^2 + \delta' f}{\delta'} = f + \frac{f^2}{\delta'}
$$

$$
\cong f + \frac{f^2}{2 \lambda_0 (f/\#)^2} \cong \frac{f^2}{2 \lambda_0 (f/\#)^2}
$$

where we can also interpret this in terms of the diameter of the diffraction spot:

$$
(z_1)_{\text{hyperfocal}} \cong \frac{f^2}{(2 \lambda_0 \cdot f/\#) \cdot (f/\#)} = \frac{f^2}{(f/\#) \cdot d_{\text{diffraction spot}}}
$$

where $d_{\text{diffraction spot}} \cong 2 \cdot \lambda_0 \cdot f/\#$. So if we have a so-called “normal lens” with $f = 50$ mm acting at f/2 (close to wide open) and in light with $\lambda_0 = 500$ nm, the hyperfocal distance is:

$$
(z_1)_{\text{hyperfocal}} \cong \frac{(50\ \mathrm{mm})^2}{2 \cdot 500\ \mathrm{nm} \cdot 2^2} \cong 625\ \mathrm{m}
$$

which is quite distant. If we stop the lens down to f/16, we get:

$$
(z_1)_{\text{hyperfocal}} \cong \frac{(50\ \mathrm{mm})^2}{2 \cdot 500\ \mathrm{nm} \cdot 16^2} \cong 9.8\ \mathrm{m}
$$

which is quite a lot closer to the lens. This means that objects at all distances in the interval $10\ \mathrm{m} \lesssim z_1 < \infty$ should appear to be “in focus” if the lens is used at f/16.

4.4 Methods for Increasing Depth of Field

1. Google Lens: http://www.google.com/patents/US6320979

2. Focus stacking: digital combinations of images collected at different focus settings. Different images are combined based on local sharpness to produce an image with extended depth of field.

3. Light-field camera = plenoptic camera that captures the four-dimensional field [x, y, z, t]. An example of such a camera is the Lytro, which uses a matrix of microlenses to collect ray direction information in addition to color and lightness. This stored information allows recovery of focused information at different depths.

4. Cameras with different focal settings for different colors of light. The information is combined digitally to extract the sharp edge data from the color with the large f/# and merge it with the blurrier structure in other colors.
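The focus-stacking idea in item 2 of the list above can be illustrated with a toy example. The sketch below is not the algorithm of any particular product; it assumes registered grayscale images stored as nested lists, uses the magnitude of a discrete Laplacian as a stand-in for “local sharpness,” and simply copies each pixel from whichever image is sharpest there. The names `sharpness` and `focus_stack` are hypothetical.

```python
def sharpness(image, row, col):
    """Magnitude of the 4-neighbor Laplacian at (row, col), edges replicated."""
    rows, cols = len(image), len(image[0])

    def pixel(r, c):
        r = min(max(r, 0), rows - 1)
        c = min(max(c, 0), cols - 1)
        return image[r][c]

    neighbors = (
        pixel(row - 1, col) + pixel(row + 1, col)
        + pixel(row, col - 1) + pixel(row, col + 1)
    )
    return abs(4 * pixel(row, col) - neighbors)

def focus_stack(images):
    """Pick each output pixel from the input image that is locally sharpest."""
    rows, cols = len(images[0]), len(images[0][0])
    return [
        [max(images, key=lambda im: sharpness(im, r, c))[r][c] for c in range(cols)]
        for r in range(rows)
    ]

# Two synthetic one-row "exposures" of the same striped scene: each one
# resolves the stripes on one half and shows uniform gray blur on the other.
near_focus = [[0, 10, 0, 10, 5, 5, 5, 5]]
far_focus = [[5, 5, 5, 5, 0, 10, 0, 10]]

stacked = focus_stack([near_focus, far_focus])
print(stacked)
```

Practical implementations typically smooth the sharpness map and blend between images rather than switching pixel by pixel, which this sketch does not attempt.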

4.5 Sidebar: Transverse Magnification vs. Focal Length

It may be useful to derive the relationship between transverse magnification and focal length for a given object distance. We know the imaging equation for object distance $z_1$, image distance $z_2$, and focal length $f$:

$$\frac{1}{f} = \frac{1}{z_1} + \frac{1}{z_2}$$

We already know that for an imaging system consisting of two or more lenses, the object distance is measured to the object-space principal point, the image distance is measured from the image-space principal point, and the focal length is replaced by the effective focal length. For a specific object distance $z_1$ and a fixed focal length $f$, the equation may be rearranged to determine the image distance:

$$z_2 = \frac{z_1 \cdot f}{z_1 - f}$$

We can substitute this into the expression for the transverse magnification:

$$M_T = -\frac{z_2}{z_1} = -\frac{\left(\dfrac{z_1 \cdot f}{z_1 - f}\right)}{z_1} = -\frac{f}{z_1 - f} = -\frac{f}{z_1}\cdot\frac{1}{1 - \dfrac{f}{z_1}}$$

If the focal length is shorter than the object distance, then $\left|\frac{f}{z_1}\right| < 1$ and the geometric series for $(1-t)^{-1}$ may be used:

$$M_T = -\frac{f}{z_1}\cdot\sum_{n=0}^{\infty}\left(\frac{f}{z_1}\right)^n = -\frac{f}{z_1}\cdot\left(1 + \frac{f}{z_1} + \left(\frac{f}{z_1}\right)^2 + \cdots\right) = -\frac{f}{z_1} - \left(\frac{f}{z_1}\right)^2 - \left(\frac{f}{z_1}\right)^3 - \cdots$$

$$M_T \cong -\frac{f}{z_1} \quad \text{if } f \ll z_1$$

For a lens with a fixed focal length $f$ but two object distances $(z_1)_a$ and $(z_1)_b$, the transverse magnifications are:

$$(M_T)_a \cong -\frac{f}{(z_1)_a}, \qquad (M_T)_b \cong -\frac{f}{(z_1)_b}$$

so the difference in transverse magnifications is:

$$\Delta M_T = (M_T)_a - (M_T)_b \cong \left(-\frac{f}{(z_1)_a}\right) - \left(-\frac{f}{(z_1)_b}\right) = (-f)\cdot\left(\frac{1}{(z_1)_a} - \frac{1}{(z_1)_b}\right) = (-f)\cdot\left(\frac{(z_1)_b - (z_1)_a}{(z_1)_a\cdot(z_1)_b}\right)$$

$$\Delta M_T \cong f\cdot\frac{(z_1)_a - (z_1)_b}{(z_1)_a\cdot(z_1)_b} = f\cdot\frac{\Delta z_1}{(z_1)_a\cdot(z_1)_b}$$

We have already seen that the transverse magnification varies with the focal length of the lens:

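$$\frac{1}{f} = \frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{z_2}\cdot\left(\frac{z_2}{z_1} + 1\right) = \frac{1}{z_2}\cdot\left(1 - \left(-\frac{z_2}{z_1}\right)\right) = \frac{1}{z_2}\cdot(1 - M_T)$$

$$\implies \frac{z_2}{f} = 1 - M_T \implies \frac{f}{z_2} = \frac{1}{1 - M_T}$$

If the object distance $z_1$ is large, then $|M_T| \cong 0$, which means that we can substitute the geometric series:

$$\frac{1}{1-t} = \sum_{\ell=0}^{+\infty} t^{\ell} \quad \text{if } |t| < 1$$

$$\frac{f}{z_2} = \frac{1}{1 - M_T} = \sum_{\ell=0}^{+\infty}(M_T)^{\ell} = 1 + M_T + (M_T)^2 + \cdots \cong 1 + M_T \quad \text{if } |M_T| \ll 1 \implies z_2 \simeq f$$

which implies that the magnification increases with the focal length.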

We should check this for some known cases: if the object distance $z_1 = +\infty$, then $z_2 = f$ and:

$$z_1 = \infty \implies \frac{f}{z_2} = 1 \cong 1 + M_T \implies |M_T| \cong 0$$

which is the correct answer.

If the object distance is $z_1 = 100\cdot f$, then the image distance and approximate transverse magnification are:

$$z_2 = \frac{100}{99}\,f \implies \frac{f}{z_2} = \frac{99}{100} \cong 1 + M_T \implies M_T \cong -\frac{1}{100}$$

The actual transverse magnification is:

$$M_T = -\frac{\left(\frac{100}{99}\right)}{100} = -\frac{1}{99} \cong -\frac{1}{100}$$

so the approximation is still quite good.
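The check above can be repeated for other object distances. The following is a minimal sketch, assuming a single thin lens in air; the function names and the 50 mm focal length are illustrative choices, not part of the notes. It compares the exact magnification $-f/(z_1 - f)$ with the leading term $-f/z_1$ of the geometric series.

```python
def transverse_magnification(z1, f):
    """Exact thin-lens transverse magnification M_T = -z2/z1 = -f/(z1 - f)."""
    if z1 == f:
        raise ValueError("object at the front focal point: image is at infinity")
    return -f / (z1 - f)


def transverse_magnification_far(z1, f):
    """Leading term of the geometric series, appropriate when f << z1."""
    return -f / z1


if __name__ == "__main__":
    f = 50.0  # focal length in mm (illustrative value)
    for ratio in (2, 10, 100, 1000):
        z1 = ratio * f
        exact = transverse_magnification(z1, f)
        approx = transverse_magnification_far(z1, f)
        rel_err = abs(approx - exact) / abs(exact)
        print(f"z1 = {ratio:5d} f   exact = {exact:+.6f}   "
              f"approx = {approx:+.6f}   relative error = {rel_err:.4f}")
```

The relative error of the one-term approximation equals $f/z_1$, so it is 1% for $z_1 = 100\cdot f$, in agreement with the comparison of $-1/99$ and $-1/100$.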

Now consider two distant objects a and b at object distances $(z_1)_a > (z_1)_b \gg f$; we have:

$$\frac{(z_1)_a}{f} - \frac{(z_1)_b}{f} = \frac{\Delta z_1}{f}$$

$$\frac{(z_1)_a}{f} - \frac{(z_1)_b}{f} \cong (1 + M_T)_a - (1 + M_T)_b = (M_T)_a - (M_T)_b = \Delta M_T$$

$$\frac{\Delta z_1}{f} \cong \Delta M_T$$

which shows that the difference in transverse magnifications decreases as the focal length $f$ increases for fixed $\Delta z_1$. In words, if two distant objects are separated along the optical axis by the distance $\Delta z_1$, the transverse magnifications for the two objects are more similar if the focal length $f$ is large, which gives the impression to the viewer that the objects are “close together.”

Consider the example shown below; the subjects are a pair of 15-in-diameter Rodman smoothbore cannon dating from 1864 that are preserved on restored carriages at Fort Foote, Maryland, near my childhood home (when I was growing up, the two barrels had not been mounted, but were lying on the ground). The near and distant cannons are separated by the fixed distance $\Delta z_1$. The images were taken with a zoom lens: the first used a “telephoto” setting with equivalent focal length $f_1 = 140\,\text{mm}$ for the 35 mm film format (the actual focal length was $f_1 = 22.2\,\text{mm}$). The second image was taken with equivalent focal length $f_2 = 32\,\text{mm}$ for the 35 mm format (a “wide-angle” lens; the actual focal length was $f_2 = 6.6\,\text{mm}$). The difference in transverse magnifications clearly is smaller with the long focal length (first image) as the distant cannon is readily visible; the tiny distant cannon is barely visible in the second image. The transverse magnifications for the background cannon differ by nearly a factor of 2.5 for the two images. This effect leads to the statement that telephoto lenses “compress” the depth of field (though some vigorously dispute this statement for psychological reasons!).

Illustration of the variation in transverse magnification with focal length of the lens. The equivalent focal length of the lens used to make the top image is $f \cong 140\,\text{mm}$ (telephoto) and that for the bottom is $f \cong 32\,\text{mm}$ (wide angle). The background cannon is MUCH smaller in the second image.

Chapter 5

Aberrations

Aberrations may be loosely defined as deviations from predicted behavior of an optical system. Chromatic aberrations describe deviations from predicted behavior due to variations in the refractive index for different wavelengths of light. Monochromatic aberrations are variations from calculated behavior due to the approximations used. For example, if we use just the first-order approximation

$$\sin[\theta] \cong \tan[\theta] \cong \theta$$

we can describe the deviations from predicted first-order behavior as the third-order aberrations. The aberrations may be described in terms of waves or of rays. The wave aberration is the departure of the wavefront from the ideal spherical wave that “should” emerge from the exit pupil of the system to the image:

$$p[x,y]\cdot\exp\left[+i\,\Phi[x,y]\right] = p[x,y]\cdot\exp\left[+i\pi\,W[x,y]\right]$$

where $W[x,y]$ is the scalar wave aberration function measured in units of π radians at each point in the exit pupil. Note that the spherical wave “converges” to a real image or “diverges” from a virtual image. The wave aberration function is the difference of the actual emerging wave from the ideal sphere, which has the form:

$$x^2 + y^2 + z^2 = R^2 \implies z = R\cdot\sqrt{1 - \frac{x^2 + y^2}{R^2}}$$

5.1 Chromatic Aberration

In the earliest days of optics, all optical systems were constructed from single lenses (“singlets”) and therefore suffered from chromatic aberrations due to the physical mechanism of dispersion. We saw that the index of refraction of optical materials decreases with increasing wavelength λ in regions of normal dispersion. At longer wavelengths in a regime with normal dispersion, a lens with positive power will have less refractive power φ (longer focal length f). Conversely, a lens with negative power will have a longer negative focal length at longer wavelengths. The impact of chromatic aberration on the image is minimized if the focal length is long and the focal ratio is large. For this reason, early telescopes for astronomical viewing were made very long, in part for magnification and in part to reduce the visibility of chromatic aberrations.


The aerial telescope of Johannes Hevelius with a focal length of f = 45 m ≅ 148 ft and an aperture diameter of d ≅ 220 mm ≅ 8.5 in.

The observation that different glasses have different dispersions is the basis for the principle of achromatization (from the Greek words for without color), where two optical elements made from glasses with different dispersion characteristics are combined to match the focal lengths at two different wavelengths (typically red and blue). An achromatic doublet is fabricated from a positive element made from crown glass with a lower refractive index and lower dispersion, and a negative element made of flint glass with a larger refractive index and a larger dispersion. For an achromat with a positive focal length (converging lens), the lens is made of a positive lens from crown glass and a negative lens from flint glass so that the chromatic aberrations act in opposition to match at the two wavelengths. If the component lenses are in contact (and often the curvatures are designed to match so that they may be cemented together), then the power of the positive element must be larger in magnitude than that of the negative element (its focal length must be shorter).

Lens systems may be built that correct for three or more wavelengths. It may be obvious that the number of elements must match or exceed the number of corrected wavelengths. Apochromats have at least three elements to correct the focal length at three different wavelengths (typically red, green, and blue) and are fabricated from three glass elements with different dispersion characteristics. Of course, the need for the additional element(s) means that apochromats tend to be more expensive than achromats.

Principle of the achromat: the first singlet lens exhibits chromatic aberration because of the dispersion of the glass (nred

Apochromat made of three elements to correct focus at three wavelengths.

The traditional wavelengths used to design optics were specified by Fraunhofer based on absorption lines in the solar spectrum:

Line    λ [nm]    n for Crown    n for Flint
C       656.28    1.51418        1.69427
D       589.59    1.51666        1.70100
F       486.13    1.52225        1.71748

The design of achromats is based on the dispersion of the glass, which we already specified:

Refractivity: $n_D - 1$, where $1.5 \le n_D \le 1.75$

Mean Dispersion: $n_F - n_C > 0$, the difference between the blue and red indices

Partial Dispersion: $n_D - n_C > 0$, the difference between the yellow and red indices

Abbé Number: $\nu \equiv \dfrac{n_D - 1}{n_F - n_C}$, the ratio of refractivity and mean dispersion, where $25 \le \nu \le 65$

For a single thin lens, the power of the system is:

$$\phi = \frac{1}{f} = (n-1)\cdot\left(\frac{1}{R_1} - \frac{1}{R_2}\right) \equiv (n-1)\cdot(C_1 - C_2)$$

where $C \equiv \dfrac{1}{R}$. The effect of dispersion on the power is obtained by differentiating:

$$\frac{d\phi}{dn} = (C_1 - C_2) = \frac{\phi}{n-1} \implies d\phi = \phi\cdot\frac{dn}{n-1} = \phi\cdot\frac{n_F - n_C}{n-1} \equiv \frac{\phi}{\nu}$$

where ν is the Abbé number.
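As a numerical illustration of the Abbé number and of $d\phi = \phi/\nu$, the sketch below uses the crown and flint indices from the table of Fraunhofer lines. The function names and the 100 mm focal length are assumptions made for the example. The focal shift $f/\nu$ between the F and C lines follows from $f = 1/\phi$, so that $|df| = f^2\,d\phi = f/\nu$.

```python
def abbe_number(n_C, n_D, n_F):
    """Abbe number: refractivity (n_D - 1) divided by mean dispersion (n_F - n_C)."""
    return (n_D - 1.0) / (n_F - n_C)


def power_spread(phi, nu):
    """Change in power between the F and C lines for a thin lens: d(phi) = phi / nu."""
    return phi / nu


# Refractive indices taken from the table of Fraunhofer lines (C, D, F).
CROWN = {"n_C": 1.51418, "n_D": 1.51666, "n_F": 1.52225}
FLINT = {"n_C": 1.69427, "n_D": 1.70100, "n_F": 1.71748}

if __name__ == "__main__":
    f = 100.0  # focal length of a singlet in mm (illustrative value)
    for name, glass in (("crown", CROWN), ("flint", FLINT)):
        nu = abbe_number(**glass)
        d_phi = power_spread(1.0 / f, nu)
        # |df| = f**2 * d(phi) = f / nu is the red-to-blue focal shift
        print(f"{name}: nu = {nu:.2f}, d(phi) = {d_phi:.3e} per mm, "
              f"focal shift = {f / nu:.2f} mm")
```

For these indices the crown glass has ν of about 64 and the flint glass about 30, both within the range quoted above.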

For a two-lens system, we have already determined the formula for the power:

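$$\phi_{\text{eff}} = \phi_1 + \phi_2 - \phi_1\cdot\phi_2\cdot t$$

$$\implies d\phi_{\text{eff}} = d\phi_1 + d\phi_2 - \phi_2 t\cdot d\phi_1 - \phi_1 t\cdot d\phi_2 = (1 - \phi_2 t)\cdot d\phi_1 + (1 - \phi_1 t)\cdot d\phi_2$$

The power at the two wavelengths is matched so that:

$$d\phi_{\text{eff}} = 0 = (1 - \phi_2 t)\cdot d\phi_1 + (1 - \phi_1 t)\cdot d\phi_2 = (1 - \phi_2 t)\cdot\frac{\phi_1}{\nu_1} + (1 - \phi_1 t)\cdot\frac{\phi_2}{\nu_2}$$

$$\implies -(1 - \phi_2 t)\cdot\frac{\phi_1}{\nu_1} = (1 - \phi_1 t)\cdot\frac{\phi_2}{\nu_2}$$

$$\implies t = \frac{\dfrac{\nu_1}{\phi_1} + \dfrac{\nu_2}{\phi_2}}{\nu_1 + \nu_2} = \frac{f_1\nu_1 + f_2\nu_2}{\nu_1 + \nu_2}$$

$$\phi_{\text{eff}} = \frac{\phi_1\nu_1 + \phi_2\nu_2}{\nu_1 + \nu_2}$$

If the two lenses are in contact so that $t = 0$, then:

$$\frac{\phi_2}{\phi_1} = \frac{f_1}{f_2} = -\frac{\nu_2}{\nu_1}$$

This is the condition for an achromat that has the same focal length for red light (C line, λ = 656.28 nm) and blue light (F line, λ = 486.13 nm). Note that it is possible to use the same glass and adjust the focal lengths and distance to achromatize. If $\nu_1 = \nu_2 \equiv \nu$, then

$$t = \frac{f_1\nu_1 + f_2\nu_2}{\nu_1 + \nu_2} \to t = \frac{(f_1 + f_2)\,\nu}{2\nu} = \frac{f_1 + f_2}{2}$$

$$\phi_{\text{eff}} = \frac{1}{f_{\text{eff}}} = \frac{\phi_1 + \phi_2}{2} \implies f_{\text{eff}} = \frac{2}{\dfrac{1}{f_1} + \dfrac{1}{f_2}} = 2\cdot\frac{f_1 f_2}{f_1 + f_2}$$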
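The two achromatizing conditions can be tried numerically. The sketch below is illustrative only: the function names, the Abbé numbers (about 64 for crown and 30 for flint), and the focal lengths are assumed values. For the cemented case it solves $\phi_1 + \phi_2 = \phi$ together with $\phi_2/\phi_1 = -\nu_2/\nu_1$, and for the separated case it evaluates $t = (f_1\nu_1 + f_2\nu_2)/(\nu_1 + \nu_2)$.

```python
def contact_achromat_powers(phi_total, nu1, nu2):
    """Element powers for t = 0: phi1 + phi2 = phi_total and phi1/nu1 + phi2/nu2 = 0."""
    if nu1 == nu2:
        raise ValueError("a cemented achromat needs two glasses with different Abbe numbers")
    phi1 = phi_total * nu1 / (nu1 - nu2)
    phi2 = -phi_total * nu2 / (nu1 - nu2)
    return phi1, phi2


def achromatizing_separation(f1, f2, nu1, nu2):
    """Separation t that makes d(phi_eff) = 0 for two thin lenses."""
    return (f1 * nu1 + f2 * nu2) / (nu1 + nu2)


def effective_power(phi1, phi2, t):
    """Power of two thin lenses separated by t: phi1 + phi2 - phi1*phi2*t."""
    return phi1 + phi2 - phi1 * phi2 * t


if __name__ == "__main__":
    # Cemented crown/flint doublet with f_eff = 100 mm (illustrative values)
    phi1, phi2 = contact_achromat_powers(1.0 / 100.0, nu1=64.0, nu2=30.0)
    print(f"crown element f1 = {1.0 / phi1:.2f} mm, flint element f2 = {1.0 / phi2:.2f} mm")

    # Two lenses of the same glass: t = (f1 + f2)/2 and f_eff = 2 f1 f2/(f1 + f2)
    f1, f2 = 100.0, 50.0
    t = achromatizing_separation(f1, f2, nu1=64.0, nu2=64.0)
    f_eff = 1.0 / effective_power(1.0 / f1, 1.0 / f2, t)
    print(f"same glass: t = {t:.1f} mm, f_eff = {f_eff:.2f} mm")
```

In the cemented case the crown element comes out with a shorter focal length than the doublet as a whole, which is the statement that the positive power must be the larger of the two.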

5.2 Third-Order Optics, Monochromatic Aberrations

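Aberrations may be interpreted as corrections to the paraxial imaging behavior of optics that result by adding the second term to the approximations for the trigonometric functions:

$$\sin[\varphi] \cong \varphi - \frac{\varphi^3}{3!}$$

$$\cos[\varphi] \cong 1 - \frac{\varphi^2}{2!}$$

$$\tan[\varphi] \cong \varphi + \frac{\varphi^3}{3}$$

The expression for the cosine may be substituted into the formula for the path length $\ell_1$ of the ray in terms of the object distance $z_1$, the angle $\varphi$, and the radius of curvature $R$:

$$\frac{\ell_1}{z_1} = \left(1 + \left(\frac{2R^2}{z_1^2} + \frac{2R}{z_1}\right)\cdot(1 - \cos[\varphi])\right)^{\frac{1}{2}}$$

$$\left(\frac{\ell_1}{z_1}\right)_{\text{third order}} = \left(1 + \left(\frac{2R^2}{z_1^2} + \frac{2R}{z_1}\right)\cdot\left(1 - \left(1 - \frac{\varphi^2}{2!}\right)\right)\right)^{\frac{1}{2}} = \left(1 + \frac{R\varphi^2}{z_1}\cdot\left(\frac{R}{z_1} + 1\right)\right)^{\frac{1}{2}}$$

$$\implies \ell_1 \cong \left(z_1^2 + R\varphi^2\cdot(R + z_1)\right)^{\frac{1}{2}}$$

which is a significantly more complicated expression than the first-order solution:

$$\left(\frac{\ell_1}{z_1}\right)_{\text{first order}} \cong 1 \implies \ell_1 \cong z_1$$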

The wavefront emerging from the aperture of the system (the exit pupil) may be characterized by its shape or by rays at different locations in the pupil that are orthogonal to the wavefront. The rays are defined by the end-point coordinates in the pupil plane (with height $r$ from which they emerge) and in the image plane (with height $r_0$ to which they travel). The deviations of the wavefront or of the rays from the ideal behavior are characterized by the concept of ray aberrations, which typically are specified as a set of numerical values (coefficients) that describe the amount of deviation of the ray or of the wavefront from the ideal. The order of the aberrations is determined by the highest power of the term kept in the expansion for the sine in Snell’s law:

$$\sin[\theta] = \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots$$

The inclusion of these higher powers in the expansion accounts for the larger deviations of the actual behavior from the first-order calculation at larger off-axis angles.
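To see how quickly the first-order approximation degrades with angle, the sketch below compares truncations of the sine series with `math.sin`. The function name `sine_series` and the sample angles are assumptions made for the illustration.

```python
import math


def sine_series(theta, order):
    """Maclaurin series for sin(theta), keeping odd powers up to `order`."""
    total = 0.0
    term = theta  # current term, starting with theta**1 / 1!
    k = 1
    while k <= order:
        total += term
        # next odd term: multiply by -theta**2 / ((k + 1)(k + 2))
        term *= -theta * theta / ((k + 1) * (k + 2))
        k += 2
    return total


if __name__ == "__main__":
    for degrees in (5, 10, 20, 30):
        theta = math.radians(degrees)
        exact = math.sin(theta)
        errors = [abs(sine_series(theta, order) - exact) / exact for order in (1, 3, 5)]
        print(f"{degrees:2d} deg: relative error  first order {errors[0]:.2e}, "
              f"third order {errors[1]:.2e}, fifth order {errors[2]:.2e}")
```

The first-order error grows roughly as $\theta^2/6$, which is why rays at the margin of a fast lens are the first to depart from the paraxial prediction.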

We can also consider deviations of the actual wavefront from the ideal in first-order paraxial or Gaussian optics. For example, a translation of the ideal wavefront down the z-axis from the “ideal” image location may be characterized by an “aberration” that is called defocus.

The decomposition of the wavefront into deviations from the ideal requires six coefficients of powers of $r$ and $r_0$:

Spherical Aberration    $r^4$
Coma                    $r^3\, r_0 \cos[\theta]$
Astigmatism             $r^2\, r_0^2 \cos^2[\theta]$
Curvature of Field      $r^2\, r_0^2$
Distortion              $r\, r_0^3 \cos[\theta]$
Piston Error            $r_0^4$

The last of these, piston error, is a measure of a z-axis translation of the wavefront analogous to defocus. As such, it has no effect on the image and often is not included in the list of aberrations.

In spherical aberration with positive coefficients, the rays from the margin of the pupil cross the axis closer to the optic than the paraxial rays. The image of a point object created by a system with spherical aberration shows a bright central region surrounded by a “halo” of light from the margin of the pupil.

Spherical aberration describes the deviation of the rays emerging from the pupil from the ideal convergence to an image point. If the aberration coefficient is positive, the rays emerging from the margin of the pupil cross the optical axis closer to the optic than the paraxial rays close to the axis. In other words, the focal length for marginal rays is shorter than that for paraxial rays. Spherical aberration is a circularly symmetric deviation of the wavefront from the quadratic-phase ideal of Gaussian optics. The resulting wavefront emerging from the pupil is a 4th power of the pupil coordinates, which has the shape of a china bowl. This shows that the rays near the edge of the pupil are directed towards a point on the axis that is closer to the optic. Since spherical aberration is a function only of the pupil-plane coordinates, it describes a shift-invariant deviation that may be characterized by an impulse response.

The shape of the wavefronts emerging from the pupil for spherical aberration (black) and defocus (red). Marginal rays emerging from a pupil that exhibits spherical aberration will cross the axis (i.e., “focus”) closer to the pupil than the paraxial rays.

For coma, the deviations from ideal performance are larger for larger values of the image plane coordinate $r_0$. If a point source and its image are located on axis, coma in the system will have no effect on the image, but the image of a point source located off axis will be spread differently at different values of the image plane coordinates. The image of an off-axis point source will be “teardrop” shaped. To introduce the concept of monochromatic aberrations, consider the complex amplitude of the wavefront diverging from a specific object point $[x_0, y_0]$ to the location $[x, y]$ in the entrance pupil:

$$w[x,y;x_0,y_0] = p[x,y]\cdot\exp\left[+i\,\Phi[x,y;x_0,y_0]\right]$$

where:

$$\exp\left[+i\,\Phi[x,y;x_0,y_0]\right] = \exp\left[+2\pi i\,\frac{z_1}{\lambda_0}\right]\cdot\exp\left[+i\pi\,\frac{r^2}{\lambda_0}\left(\frac{1}{z_1} - \frac{1}{f}\right)\right]\cdot\exp\left[+2\pi i\cdot\Delta\Phi[x,y;x_0,y_0]\right]$$

is the phase factor at the pupil due to a point source located at $[x_0, y_0]$ in the object plane, which includes the quadratic phase of the ideal “spherical” wavefront converging to the image point plus any phase error $\Delta\Phi[x,y;x_0,y_0]$, and $p[x,y]$ specifies the magnitude function of the pupil (the so-called apodization function). A similar expression may be written for light converging to the image point $[x_0, y_0]$ from the location $[x, y]$ in the exit pupil. If the actual wavefront at $[x, y]$ in the pupil lags behind the ideal sphere (actually a paraboloid), then the light from that location converging to the image plane must have been emitted earlier in time; the phase difference $\Delta\Phi$ at that location $[x, y]$ in the pupil is positive. The map of $\Delta\Phi[x,y;x_0,y_0]$ may be decomposed into different “shapes” described by different powers of the object coordinates $[x_0, y_0]$ and of the pupil coordinates $[x, y]$. The weights of each of these different shapes present in the actual wavefront are the aberration coefficients, which are commonly used to specify the differences of the behavior from the ideal.

Comparison of ideal and actual wavefronts emerging from optical system. The difference between the wavefronts may be specified by the difference in phase or by the intersections of rays normal to the wavefront.

Alternatively, we can describe the difference in action of the optic from the ideal in terms of the “rays” from different points in the pupil. The rays are (of course) perpendicular to the wavefront emerging from the pupil. Unaberrated rays should all cross the optical axis exactly at the image point. Rays from an aberrated wavefront will cross at different locations.

Rays from different points on the wavefront emerging from the pupil of an optic with spherical aberration; the rays cross the optical axis at different locations.

The aberration function specifies the difference in optical phase between the actual and ideal wavefronts that converge to the ideal real image point (or diverge from the ideal virtual image point). Since the shape of the wavefront due to a point object generally varies with its location in the object plane, the aberration function generally depends on coordinates in both the object and pupil planes; it is a 4-D function. The coordinates used in the calculations of the rays are shown in the figure:

Coordinates used to evaluate aberrations. Light propagates from the pupil plane (coordinates without subscripts) over the distance z2 to the image plane (coordinates with subscripts). Note that the pupil and image plane coordinates are normalized so that rmax =(r0)max =1.

A ray of light with wavelength λ0 that emerges from the exit pupil at [x, y] and crosses the image plane at [x0,y0] has the form:

$$w[x,y;x_0,y_0] = p[x,y]\cdot\exp\left[+2\pi i\cdot\Phi[x,y;x_0,y_0]\right]$$

where $p[x,y]$ specifies the magnitude of the pupil transmittance of the exit pupil (the so-called apodization function) and $\Phi[x,y;x_0,y_0]$ is the phase at the pupil for an object point at coordinates $[x_0, y_0]$ emerging from the pupil at $[x, y]$. The phase includes the converging “spherical” (actually parabolic) wave and the phase difference term:

$$\Phi[x,y;x_0,y_0] = +\frac{r^2}{2\lambda_0}\left(\frac{1}{f} - \frac{1}{z_2}\right) + \Delta\Phi[x,y;x_0,y_0]$$

We consider the locations in polar coordinates: the image location is $[x_0, y_0] = (r_0, \alpha)$ and the pupil coordinates are $[x, y] = (r, \theta)$. If the optical system has a circular cross-section (i.e., if the optical system is rotationally symmetric), then the behavior of the aberration does not depend on the absolute azimuthal coordinates but only on their difference, so that we can consider a three-dimensional description based on radial coordinates $r$, $r_0$, and relative azimuthal angle $\theta - \alpha \equiv \varphi$; i.e., we can write the phase error function in the form $\Delta\Phi[r, r_0, \varphi]$. The relative phase between the object point and a location in the pupil is 2π radians (per cycle) multiplied by the number of cycles, which is the distance between the locations in the object plane and in the pupil divided by the wavelength $\lambda_0$:

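$$\text{distance: } R = \left\{z^2 + (r\cos\theta - r_0\cos\alpha)^2 + (r\sin\theta - r_0\sin\alpha)^2\right\}^{\frac{1}{2}}$$

$$\begin{aligned}
\Phi[x,y;x_0,y_0,z] = 2\pi\frac{R}{\lambda_0} &= \frac{2\pi}{\lambda_0}\left\{z^2 + (r\cos\theta - r_0\cos\alpha)^2 + (r\sin\theta - r_0\sin\alpha)^2\right\}^{\frac{1}{2}} \\
&= \frac{2\pi}{\lambda_0}\left\{z^2 + \left(r^2\cos^2\theta + r_0^2\cos^2\alpha - 2rr_0\cos\theta\cos\alpha\right) + \left(r^2\sin^2\theta + r_0^2\sin^2\alpha - 2rr_0\sin\theta\sin\alpha\right)\right\}^{\frac{1}{2}} \\
&= \frac{2\pi}{\lambda_0}\left\{z^2 + r^2 + r_0^2 - 2rr_0\,(\cos\theta\cos\alpha + \sin\theta\sin\alpha)\right\}^{\frac{1}{2}} \\
&= \frac{2\pi}{\lambda_0}\left\{z^2 + r^2 + r_0^2 - 2rr_0\cos[\theta - \alpha]\right\}^{\frac{1}{2}} \\
&= 2\pi\frac{z}{\lambda_0}\cdot\left\{1 + \left[\left(\frac{r^2 + r_0^2}{z^2}\right) - \left(\frac{2rr_0}{z^2}\right)\cos[\theta - \alpha]\right]\right\}^{\frac{1}{2}} \\
&\equiv 2\pi\frac{z}{\lambda_0}\cdot\left\{1 + \left[\left(\frac{r^2 + r_0^2}{z^2}\right) - \left(\frac{2rr_0}{z^2}\right)\cos[\varphi]\right]\right\}^{\frac{1}{2}}
\end{aligned}$$

This expression may be expanded into a power series via the binomial theorem:

$$(1+u)^n = 1 + \frac{n}{1!}u + \frac{n(n-1)}{2!}u^2 + \cdots \implies (1+u)^{\frac{1}{2}} = 1 + \frac{1}{2}u - \frac{1}{8}u^2 + \frac{1}{16}u^3 - \cdots$$

In the current expression, we can identify:

$$u \equiv \left(\frac{r^2 + r_0^2}{z^2}\right) - \left(\frac{2rr_0}{z^2}\right)\cos[\varphi] \implies \frac{1}{2}u = \left(\frac{r^2 + r_0^2}{2z^2}\right) - \left(\frac{rr_0}{z^2}\right)\cos[\varphi]$$

$$\begin{aligned}
\implies -\frac{1}{8}u^2 &= -\frac{1}{8}\left[\left(\frac{r^2 + r_0^2}{z^2}\right) - \left(\frac{2rr_0}{z^2}\right)\cos[\varphi]\right]^2 \\
&= -\frac{1}{8}\left[\left(\frac{r^2 + r_0^2}{z^2}\right)^2 + \left(\frac{2rr_0}{z^2}\cos[\varphi]\right)^2 - 2\left(\frac{r^2 + r_0^2}{z^2}\right)\left(\frac{2rr_0}{z^2}\cos[\varphi]\right)\right] \\
&= -\frac{1}{8}\left[\left(\frac{r^4 + r_0^4 + 2r^2r_0^2}{z^4}\right) + \left(\frac{4r^2r_0^2}{z^4}\right)\cos^2[\varphi] - 4\left(\frac{r^2 + r_0^2}{z^2}\right)\left(\frac{rr_0}{z^2}\right)\cos[\varphi]\right] \\
&= -\frac{1}{8}\left[\left(\frac{r^4 + r_0^4 + 2r^2r_0^2}{z^4}\right) + \left(\frac{4r^2r_0^2}{z^4}\right)\cos^2[\varphi] - 4\left(\frac{r^3r_0}{z^4}\cos[\varphi] + \frac{rr_0^3}{z^4}\cos[\varphi]\right)\right] \\
-\frac{1}{8}u^2 &= -\left(\frac{r^4 + r_0^4 + 2r^2r_0^2}{8z^4}\right) - \left(\frac{r^2r_0^2}{2z^4}\right)\cos^2[\varphi] + \left(\frac{r^3r_0}{2z^4}\right)\cos[\varphi] + \left(\frac{rr_0^3}{2z^4}\right)\cos[\varphi]
\end{aligned}$$

So the power series for the phase function truncated to the second order becomes:

$$\begin{aligned}
\Phi[x,y;x_0,y_0,z] \cong\ & 2\pi\frac{z}{\lambda_0}\left(1 + \left(\frac{r^2 + r_0^2}{2z^2}\right) - \left(\frac{rr_0}{z^2}\right)\cos[\varphi]\right) \\
& + 2\pi\frac{z}{\lambda_0}\left(-\left(\frac{r^4 + r_0^4 + 2r^2r_0^2}{8z^4}\right) - \left(\frac{r^2r_0^2}{2z^4}\right)\cos^2[\varphi] + \left(\frac{r^3r_0}{2z^4}\right)\cos[\varphi] + \left(\frac{rr_0^3}{2z^4}\right)\cos[\varphi]\right)
\end{aligned}$$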

Now we can multiply through by the leading factor of $2\pi\frac{z}{\lambda_0}$, which produces 10 terms: a constant, three terms from the first-order polynomial, and six from the second-order polynomial:

$$\begin{aligned}
\Phi[x,y;x_0,y_0,z] \cong\ & 2\pi\frac{z}{\lambda_0} + 2\pi\left(\frac{r^2 + r_0^2}{2\lambda_0 z}\right) - 2\pi\frac{rr_0}{\lambda_0 z}\cos[\varphi] \\
& - 2\pi\left(\frac{r^4 + r_0^4 + 2r^2r_0^2}{8\lambda_0 z^3}\right) - 2\pi\left(\frac{r^2r_0^2}{2\lambda_0 z^3}\right)\cos^2[\varphi] + 2\pi\frac{r^3r_0}{2\lambda_0 z^3}\cos[\varphi] + 2\pi\frac{rr_0^3}{2\lambda_0 z^3}\cos[\varphi] \\
=\ & 2\pi\frac{z}{\lambda_0} \\
& + 2\pi\frac{r^2}{2\lambda_0 z} + 2\pi\frac{r_0^2}{2\lambda_0 z} - 2\pi\frac{rr_0}{\lambda_0 z}\cos[\varphi] \\
& - 2\pi\frac{r^4}{8\lambda_0 z^3} - 2\pi\frac{r_0^4}{8\lambda_0 z^3} - 2\pi\frac{r^2r_0^2}{4\lambda_0 z^3} - 2\pi\frac{r^2r_0^2}{2\lambda_0 z^3}\cos^2[\varphi] + 2\pi\frac{r^3r_0}{2\lambda_0 z^3}\cos[\varphi] + 2\pi\frac{rr_0^3}{2\lambda_0 z^3}\cos[\varphi]
\end{aligned}$$

which may be reordered into:

$$\begin{aligned}
\Phi[x,y;x_0,y_0,z] \cong\ & 2\pi\frac{z}{\lambda_0} \\
& + \frac{\pi}{\lambda_0 z}r^2 + \frac{\pi}{\lambda_0 z}r_0^2 - \frac{2\pi}{\lambda_0 z}\,rr_0\cos[\varphi] \\
& - \frac{\pi}{4\lambda_0 z^3}r^4 + \frac{\pi}{\lambda_0 z^3}\,r^3r_0\cos[\varphi] - \frac{\pi}{2\lambda_0 z^3}\,r^2r_0^2 \\
& - \frac{\pi}{\lambda_0 z^3}\,r^2r_0^2\cos^2[\varphi] + \frac{\pi}{\lambda_0 z^3}\,rr_0^3\cos[\varphi] - \frac{\pi}{4\lambda_0 z^3}r_0^4
\end{aligned}$$

In other words, we have “decomposed” the phase of the spherical wave into terms with different powers of the coordinate in the pupil plane (with coordinates $[x, y] = (r, \theta)$) and in the image plane (with coordinates $[x_0, y_0] = (r_0, \alpha)$) in a manner analogous to the decomposition into sinusoidal components in the Fourier transform. Our goal will be to decompose the phase difference between the ideal and actual wavefronts using these same terms. Again, since the system is assumed circularly symmetric, only the difference in azimuthal coordinates $\theta - \alpha \equiv \varphi$ is relevant.
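The ten-term expansion can be checked against the exact phase $2\pi R/\lambda_0$. The sketch below is an illustration under assumed values (wavelength 0.5 μm, propagation distance 100 mm, pupil radius 5 mm, image height 2 mm); the function names are hypothetical and all lengths are in millimeters.

```python
import math


def phase_exact(r, r0, phi, z, wavelength):
    """Exact phase 2*pi*R/lambda (radians) for the pupil-to-image distance R."""
    R = math.sqrt(z * z + r * r + r0 * r0 - 2.0 * r * r0 * math.cos(phi))
    return 2.0 * math.pi * R / wavelength


def phase_series(r, r0, phi, z, wavelength, include_fourth_order=True):
    """Binomial expansion of the phase, truncated after the u**2 term."""
    c = math.cos(phi)
    k = 2.0 * math.pi / wavelength
    constant = k * z
    second_order = k * ((r * r + r0 * r0) / (2.0 * z) - r * r0 * c / z)
    fourth_order = k * (-(r**4 + r0**4 + 2.0 * r * r * r0 * r0) / (8.0 * z**3)
                        - (r * r0 * c) ** 2 / (2.0 * z**3)
                        + (r**3 * r0 + r * r0**3) * c / (2.0 * z**3))
    if include_fourth_order:
        return constant + second_order + fourth_order
    return constant + second_order


if __name__ == "__main__":
    # Illustrative geometry; all lengths in mm.
    args = dict(r=5.0, r0=2.0, phi=math.pi / 3.0, z=100.0, wavelength=0.5e-3)
    exact = phase_exact(**args)
    print("error with second-order terms only [rad]:",
          phase_series(include_fourth_order=False, **args) - exact)
    print("error including fourth-order terms [rad]:",
          phase_series(**args) - exact)
```

For this geometry the quadratic (Fresnel) terms alone miss the exact phase by roughly half a radian, while adding the six fourth-order terms reduces the error to well under a milliradian.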

5.2.1 Names of Aberrations

The difference in the shape of the “actual” wavefront from the ideal spherical wavefront is decomposed into the same terms as the phase; each term has its unique “shape” and name, and will be described by a coefficient that determines “how much” of each “shape” is present in the phase difference. From the series above, we can apply weighting coefficients to the three relevant coordinates distinguished by subscripts: the index $j$ of the power of the radial coordinate $r_0$ at the image (the “image height”), the index $m$ of the power of the radial coordinate $r$ at the pupil, and the index $n$ of the power of $\cos[\varphi]$. From the series above we can see that only some powers are included in the summation, so we can write the phase difference as

$$\begin{aligned}
\Delta\Phi[x,y;x_0,y_0,z] &= \Phi_{\text{ideal}}[x,y;x_0,y_0,z] - \Phi_{\text{actual}}[x,y;x_0,y_0,z] \\
&= \sum_{j,m,n} W_{jmn}\, r_0^{\,j}\, r^m \cos^n\varphi \\
&= W_{000} \quad \text{(propagation from pupil to image)} \\
&\quad + W_{200}\, r_0^2 \ \text{(piston error)} + W_{111}\, r_0\, r \cos\varphi \ \text{(tip-tilt)} + W_{020}\, r^2 \ \text{(defocus)} \\
&\quad + W_{040}\, r^4 \ \text{(spherical aberration)} + W_{131}\, r_0\, r^3 \cos\varphi \ \text{(coma)} \\
&\quad + W_{220}\, r_0^2\, r^2 \ \text{(curvature of field)} + W_{222}\, r_0^2\, r^2 \cos^2\varphi \ \text{(astigmatism)} \\
&\quad + W_{311}\, r_0^3\, r \cos\varphi \ \text{(distortion)} + W_{400}\, r_0^4 \ \text{(piston error)} + \cdots
\end{aligned}$$

The coefficients Wjmn measure the “amplitudes” of the individual terms and typically are spec- ified in units of wavelengths (the “number of waves” of the aberration) at the edge of the pupil (i.e., at r =1); they must be multiplied by 2π radians per wavelength to convert to phase angle. For example, a sample system might be specified as having “one-half wave of spherical and a quarter wave of astigmatism.”
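The sum over $W_{jmn}$ and the conversion from waves to phase angle can be written compactly. The sketch below is illustrative: the dictionary layout keyed by $(j, m, n)$ and the function names are assumptions, and the sample coefficients are the “one-half wave of spherical and a quarter wave of astigmatism” mentioned above.

```python
import math


def wave_aberration(W, r0, r, phi):
    """Sum of W_jmn * r0**j * r**m * cos(phi)**n, in waves.

    W maps (j, m, n) to the coefficient in waves; r0 and r are the
    normalized image height and pupil radius (0 to 1).
    """
    return sum(w * r0**j * r**m * math.cos(phi) ** n
               for (j, m, n), w in W.items())


def phase_error_radians(W, r0, r, phi):
    """Phase error: waves multiplied by 2*pi radians per wavelength."""
    return 2.0 * math.pi * wave_aberration(W, r0, r, phi)


if __name__ == "__main__":
    # One-half wave of spherical (W040) and a quarter wave of astigmatism (W222)
    W = {(0, 4, 0): 0.5, (2, 2, 2): 0.25}
    for phi_deg in (0, 45, 90):
        waves = wave_aberration(W, r0=1.0, r=1.0, phi=math.radians(phi_deg))
        print(f"edge of pupil, full field, azimuth {phi_deg:2d} deg: {waves:.3f} waves")
    # On axis (r0 = 0) only the spherical term survives
    print("on axis:", wave_aberration(W, r0=0.0, r=1.0, phi=0.0), "waves")
```

Because spherical aberration has $j = 0$, its contribution is the same at every image height, while the astigmatism term vanishes on axis and varies with azimuth.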

Shift Invariant or Not?

Note that phase errors that depend on $r_0$ will produce different images for different image “heights” and therefore are shift-variant effects that strictly cannot be characterized by impulse responses and/or transfer functions. That being said, it is common practice to examine the “impulse response” and/or the “transfer function” in a local region as though the aberration were shift invariant, which allows the analyst to create a (“pseudo”) frequency-domain description of the action of the aberration.

5.2.2 Aberration Coefficients

To get an idea of the behavior in the wavefront due to these terms, we can plot graphs of these “shapes” at the pupil for specified locations in the object plane. The examples are plotted for different object locations and assuming that $\lambda_0 = z_2 = 1$. The aberrations are grouped by the numerical powers of the radial terms in the series, e.g., $j + m = 0$ for $W_{000}$, $j + m = 2$ for $W_{200}$, $W_{111}$, and $W_{020}$, $j + m = 4$ for $W_{040}$, $W_{131}$, etc. You might expect that the second-order grouping would include $W_{200}$ (piston error), $W_{111}$ (tip-tilt), and $W_{020}$ (defocus). However, for historical reasons, the groupings are based on the powers for the “rays” derived from the “wavefronts” via the gradient operator (a first-order derivative), so these three form the group of the first-order aberrations. The terms with $j + m = 4$ are the third-order aberrations, etc.

Zero-Order Term:

Propagation:

constant phase (zero-order piston error = propagation from pupil to image):

$$\Delta\Phi[x,y;x_0,y_0,z] = 2\pi\cdot W_{000}\cdot\begin{cases}\dfrac{1}{\lambda_0} & \text{if } \sqrt{x^2 + y^2} \le 1 \\[1ex] 0 & \text{if } \sqrt{x^2 + y^2} > 1\end{cases}$$

The coefficient $W_{000}$ is the number of incremental wavelengths due to propagation “downstream” from the object to the pupil, which is a normal part of the imaging; it is not considered to be an aberration. In any event, its only effect on the irradiance is the constant attenuation of the image field due to the inverse square law, identical to the constant phase term in the Fresnel and Fraunhofer diffraction terms.

zero-order term, constant phase, piston error aberration

Second-Order Wave (First-Order Ray) Aberrations:

These include the three terms for which the sums of the powers of $r$ and $r_0$ equal two. Since the rays are oriented orthogonal to the wavefront (and must be calculated by derivatives), these correspond to the “first-order” aberrations for rays. In fact, these three terms often are not considered to be aberrations since the only one that has a degrading effect on an irradiance image is defocus, which may (of course) be compensated by changing the location of the sensor so that it coincides with the image.

Constant Phase — First-Order Piston Error

constant phase (first-order piston error):

$$\Delta\Phi[x,y;x_0,y_0] = 2\pi\cdot W_{200}\cdot\begin{cases}+\dfrac{r_0^2}{2\lambda_0 z} & \text{if } \sqrt{x^2 + y^2} \le 1 \\[1ex] 0 & \text{if } \sqrt{x^2 + y^2} > 1\end{cases}$$

This is an additional constant phase due to the off-axis location in the image plane; it is quadratic in the image coordinate, but constant in the pupil coordinate, so it is a constant for a particular image location. Since this measures the “constant” phase difference, it has no effect on the measured irradiance and therefore no impact on the quality of the image.

constant phase from first-order terms: piston error

Bilinear-Phase — “Tip-Tilt”

linear phase from both object and pupil (tip or tilt):

$$\Delta\Phi[x,y;x_0,y_0] = 2\pi\cdot W_{111}\cdot\begin{cases}-\dfrac{rr_0}{\lambda_0 z}\cos[\varphi] & \text{if } \sqrt{x^2 + y^2} \le 1 \\[1ex] 0 & \text{if } \sqrt{x^2 + y^2} > 1\end{cases}$$

A phase that has linear contributions from the pupil location $r$ and image location $r_0$ (a “bilinear” phase) means that the shape of the field emerging from the pupil for a particular object location is a “flat” plane tilted in proportion to the off-axis position of the object and the image. Because it is a linear phase in the pupil, it displaces the resulting image towards the direction where the phase is negative.

In atmospheric imaging scenarios (imaging along a vertical path through turbulence), the time- varying tip-tilt aberration is dominant. For example, the centers of the images of individual stars appear to move around over short time intervals of the order of hundredths of a second. The correction of tip-tilt aberration has a very significant positive effect on the quality of the resulting image. For an example, see the animated GIF file at URL:

http://www.ast.cam.ac.uk/~optics/Lucky_Web_Site/100Her_10ms_200fr.gif

first-order linear term, tip-tilt error

Quadratic-Phase Error, Focus Shift = “Defocus”

quadratic phase ⟹ defocus = focus shift

$$\Delta\Phi[x,y;x_0,y_0] = 2\pi\cdot W_{020}\cdot\begin{cases}+\dfrac{r^2}{2\lambda_0 z} & \text{if } \sqrt{x^2 + y^2} \le 1 \\[1ex] 0 & \text{if } \sqrt{x^2 + y^2} > 1\end{cases}$$

This quadratic term is the error in the Fresnel propagation from the exit pupil if the observation plane does not coincide with the image plane and is therefore called “defocus.” Since it is not a result of flaws in the optics, it is often not considered to be an “aberration,” but there is reason to do so in some applications. As an example, consider the atmospheric imaging scenario mentioned under tip-tilt; any time-varying quadratic contribution to the relative phase displaces the focal plane (slightly), so images through atmospheric turbulence with quadratic contributions appear to go in and out of focus over short time intervals (but, as already mentioned, the tip-tilt aberration is dominant, totalling 87% of the light energy under certain assumptions; see Noll, JOSA, 66, pp. 207-211, 1976 and van Dam & Lane, JOSA A, 19, pp. 745-752).

first-order quadratic term, focus shift error = “defocus”

Since defocus is a function only of the pupil-plane coordinates, it is shift invariant at the image plane; the effect of defocus does not vary with “image height” and therefore may be described by an impulse response and a transfer function. For example, consider a small first-order focus error of π radians at the edge of a rectangular pupil with linear dimension $d_0 = 1$ unit. The complex-valued wavefront has the form shown:

Pupil function with defocus of π radians at edge of the pupil (“half-wave of defocus”): (a) real part; (b) imaginary part; (c) magnitude; (d) phase, showing quadratic nature.

The incoherent transfer function is the scaled autocorrelation of the pupil and the impulse response is the inverse Fourier transform. The MTF has a zero at the normalized spatial frequency $\rho \cong 0.5$. Note that the image with defocus is “wider” and the peak irradiance is “smaller” than the diffraction-limited image.
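The zero at $\rho \cong 0.5$ can be reproduced with a one-dimensional autocorrelation of the defocused pupil (for a square aperture the transfer function is separable, so the one-dimensional result applies along either axis). The sketch below is a minimal numerical illustration; the function name, the midpoint-rule integration, and the sample count are assumptions. The pupil has unit width and phase $2\pi W\,(2x)^2$, so that $W$ is the defocus in waves at the edge. The function returns the signed transfer function; the MTF is its magnitude.

```python
import cmath
import math


def defocus_otf_1d(rho, waves_at_edge, samples=2000):
    """Normalized autocorrelation of a unit-width 1-D pupil with quadratic phase.

    rho is the spatial frequency normalized to the cutoff (0 <= |rho| <= 1).
    """
    rho = abs(rho)
    if rho >= 1.0:
        return 0.0
    half_overlap = (1.0 - rho) / 2.0  # the two shifted pupils overlap on |x| <= half_overlap
    dx = 2.0 * half_overlap / samples
    total = 0j
    for i in range(samples):
        x = -half_overlap + (i + 0.5) * dx  # midpoint rule
        phase_plus = 2.0 * math.pi * waves_at_edge * (2.0 * (x + rho / 2.0)) ** 2
        phase_minus = 2.0 * math.pi * waves_at_edge * (2.0 * (x - rho / 2.0)) ** 2
        total += cmath.exp(1j * (phase_plus - phase_minus)) * dx
    return total.real  # the imaginary part cancels by symmetry


if __name__ == "__main__":
    for rho in (0.0, 0.25, 0.5, 0.75):
        print(f"rho = {rho:.2f}:  no defocus {defocus_otf_1d(rho, 0.0):+.4f}   "
              f"half wave {defocus_otf_1d(rho, 0.5):+.4f}   "
              f"one wave {defocus_otf_1d(rho, 1.0):+.4f}")
```

With one-half wave of defocus the transfer function just touches zero at $\rho = 0.5$; with a full wave it becomes negative near $\rho = 0.25$, which is the contrast reversal visible in a defocused image.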

(a) MTF of incoherent optical system with square aperture with one-half wave of defocus compared to MTF without defocus (red); (b) psf with one-half wave of defocus (black) and without defocus (red).

Other examples of transfer functions (MTFs) and impulse responses for square apertures with different amounts of defocus (measured in waves at the edge of the pupil) are shown. Note in particular that the intermediate frequencies are degraded more rapidly than either the smallest or largest spatial frequencies. Note that the MTF at certain frequencies is negative, which means that the modulation has changed sign (“lighter” regions in the original object become “darker” in the defocused image). This can be seen in an object with different spatial frequencies.

MTF and corresponding psfs for square pupil with different amounts of defocus from $\lambda_0/4$ at the edge of the pupil to $1.5\lambda_0$. Note that the decrease in MTF is most pronounced at intermediate spatial frequencies. For larger amounts of defocus, the MTF goes negative over regions of the frequency domain (contrast reversal). The psf widens with increasing defocus.

The spatial frequency of a “radial grating” f [x, y] increases as the reciprocal of the distance from the center. In the examples shown, the irradiance is biased up so that its normalized maximum and minimum amplitudes are 1 and 0, respectively. The grating is imaged through a real optical system onto a CCD sensor that samples the image, and thus the image is aliased at large spatial frequencies (near the center). The three images are at the focal plane (i.e., “in focus”) and with two increments of defocus. Track a radial line in the original (in red) to see that the amplitude of the in-focus image does not vary from unity (except where there is aliasing), while the defocused image exhibits several changes in phase, from light to dark to light, etc. The contrast of the smallest spatial frequency (at the edge of the image) is reversed in the image with more defocus, and this image also exhibits more changes in phase.

Effect of two increments of defocus on the image of a radial grating. The negative regions of the MTF of defocus imply that the contrast of those spatial frequencies is “reversed” (darker gray → lighter gray and vice versa). Track the “lightness” along the red lines to see the contrast reversals. Note that the “in-focus” image exhibits some sampling (“aliasing”) artifacts in the center where the azimuthal spatial frequency is large.

This artifact is often called “spurious resolution,” because the object is not reproduced at the locations of the phase change.

5.2.3 Fourth-Order (Third-Order Ray) Aberrations:

the “Seidel aberrations”

$-\dfrac{r^4}{2\lambda_0 z^3}$ ⇒ no variation at object, quartic phase at pupil ⇒ spherical aberration, $W_{040}$ (LSI)

$+\dfrac{r\,r_0^3}{2\lambda_0 z^3}\cos[\phi]$ ⇒ cubic phase at object, linear phase at pupil ⇒ coma, $W_{131}$

$-\dfrac{r^2 r_0^2}{4\lambda_0 z^3}$ ⇒ quadratic phase at object and pupil ⇒ field curvature, $W_{220}$

$-\dfrac{r^2 r_0^2}{2\lambda_0 z^3}\cos^2[\phi]$ ⇒ quadratic phase at object and pupil + azimuth variation ⇒ astigmatism, $W_{222}$

$+\dfrac{r^3 r_0}{2\lambda_0 z^3}\cos[\phi]$ ⇒ linear phase at object, cubic phase at pupil ⇒ distortion, $W_{311}$

$-\dfrac{r_0^4}{8\lambda_0 z^3}$ ⇒ quartic phase at object, no variation at pupil ⇒ third-order piston error, $W_{400}$

Note that four of these six terms have even powers of both the pupil coordinate r and the image coordinate r0, whereas coma and distortion include odd powers of both.

Spherical Aberration

This is the simplest third-order aberration to describe mathematically since it depends only on the coordinates in the pupil plane; its effect is constant across the image plane. This means that spherical aberration is the only one of the six Seidel terms that is shift invariant (and may therefore be described as a convolution). The wavefront shape for spherical aberration resembles a deeper “bowl” than the paraboloid for defocus. Note that the negative sign on the phase means that the spherical aberration is negative if the phase contribution is positive.

no variation at object, quartic phase at pupil:

$$\Delta\Phi[x,y;x_0,y_0] = 2\pi\cdot W_{040}\cdot\begin{cases}\left(-\dfrac{r^4}{2\lambda_0 z^3}\right) & \text{if } \sqrt{x^2+y^2}\le 1\\[1ex] 0 & \text{if } \sqrt{x^2+y^2}>1\end{cases}$$

quadratic term from second order of expansion: spherical aberration

If the numerical coefficient of spherical aberration is positive, then rays from the marginal regions of the pupil have a steeper slope than those from the paraxial region near the optical axis. In other words, the “marginal focus” is closer to the lens than the ideal “paraxial focus.” The paraxial image of a point object is not “sharp” but exhibits a halo of light around a bright central core.

Negative coefficient of spherical aberration of positive lens: rays from the margin of the pupil cross axis closer to the optic than paraxial rays. The image of a point object at the paraxial focus exhibits a bright central region surrounded by a “halo” of light from the margin of the pupil.

Because it is a shift-invariant effect at the image plane, spherical aberration may be described by an impulse response and by a transfer function. Spherical aberration is a distortion of the true spherical wavefront that makes a “deeper bowl” so that the incremental phase error is large near the edge of the pupil (far from the optical axis, for the marginal part of the wave) and small near the center of the pupil (near the optical axis, for the paraxial part of the wave).

Example of quartic wavefront error of spherical aberration compared to quadratic error from defocus. Spherical aberration error is a “deeper bowl.”
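The “deeper bowl” can be made concrete by tabulating both phase errors with the same value at the pupil edge. This is only an illustrative sketch: the normalized pupil height u (0 on the axis, 1 at the edge) and the function name are assumptions introduced here.

```python
import math


def edge_normalized_phase(u, power, phase_at_edge=math.pi):
    """Phase error (radians) at normalized pupil height u for a power-law wavefront."""
    return phase_at_edge * abs(u) ** power


for u in (0.0, 0.25, 0.5, 0.75, 1.0):
    defocus = edge_normalized_phase(u, 2)    # quadratic (defocus)
    spherical = edge_normalized_phase(u, 4)  # quartic (spherical aberration)
    print(f"u = {u:4.2f}  defocus = {defocus / math.pi:6.4f} pi  "
          f"spherical = {spherical / math.pi:6.4f} pi")
```

At half the pupil height the quadratic error is π/4 while the quartic error is only π/16, so the quartic error stays small over most of the pupil and rises steeply near the edge.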

Consider an example for spherical aberration where the phase error is π radians at the edge of a square pupil, the same phase error at the edge that was considered for defocus. The profiles of the phase in the pupil are:

Pupil function for one-half wave of spherical aberration: (a) real part; (b) imaginary part; (c) magnitude; (d) phase in units of π radians, showing the fourth-power behavior.

The incoherent MTF shows a significant decrease as the frequency approaches cutoff and the psf is noticeably wider and “shorter”:

(a) MTF of incoherent optical system with square aperture with one-half wave of negative spherical aberration at the edge of the pupil compared to MTF without aberration (red); (b) psf with one-half wave of aberration (black) and without aberration (red). Note that the image with spherical aberration is “shorter” and “fatter.”

MTF and corresponding psfs for square pupil with different amounts of spherical aberration from λ0/4 at the edge of the pupil to 1.5λ0. The MTF behaves similarly to that for defocus; it decreases most rapidly at the middle frequencies rather than at the smallest or largest, and it may go negative at some frequencies. The MTF for spherical aberration decreases more slowly than for defocus because the phase changes more slowly except near the edge of the pupil.
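The comparison with defocus can be sketched numerically. The model below is an assumption made for illustration: a one-dimensional pupil of unit width with phase proportional to (2x)^4 stands in for spherical aberration (the true two-dimensional r^4 phase on a square pupil is not separable), and both errors are set to π radians at the edge. The function names are hypothetical.

```python
import cmath
import math


def pupil_samples(power, phase_at_edge=math.pi, n=400):
    """Samples of exp(i*phase) across a unit-width 1-D pupil.

    Assumed model: phase = phase_at_edge * (2x)**power, i.e. power=2 for
    defocus and power=4 for a 1-D analogue of spherical aberration.
    """
    return [cmath.exp(1j * phase_at_edge * (2.0 * ((k + 0.5) / n - 0.5)) ** power)
            for k in range(n)]


def otf_value(pupil, rho):
    """Normalized pupil autocorrelation at normalized frequency rho (0..1)."""
    n = len(pupil)
    lag = round(rho * n)
    acc = sum(pupil[k + lag] * pupil[k].conjugate() for k in range(n - lag))
    return (acc / n).real


defocus = pupil_samples(power=2)
spherical = pupil_samples(power=4)
for rho in (0.1, 0.25, 0.5, 0.75):
    print(f"rho = {rho:4.2f}  defocus: {otf_value(defocus, rho):+.3f}  "
          f"spherical: {otf_value(spherical, rho):+.3f}")
```

In this simplified model the quartic phase retains more modulation at ρ = 0.5 than the quadratic phase does for the same error at the pupil edge.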

The uncorrected optical system in the Hubble Space Telescope suffered from significant spherical aberration due to flaws in the primary mirror that were disguised during mirror testing. Spherical aberration of the wave emerging from different parts of the pupil may be partially balanced by changing the focus, i.e., by “adding defocus.” For example, the phase at the edge of the pupil may be compensated by applying a defocus aberration in the opposite direction so that

$$2\pi\cdot W_{040}\cdot\left(-\frac{1^4}{2\lambda_0 z^3}\right) + 2\pi\cdot W_{020}\cdot\frac{1^2}{2\lambda_0 z} = 0 \implies W_{020} = \frac{W_{040}}{z^2}$$

If we use defocus to cancel the phase error due to spherical aberration at the edge of the pupil, the resulting transfer function and image have the form shown, so that the image is improved markedly by using the appropriate amount of defocus.

Application of defocus to balance spherical aberration at edge of square pupil: (a) MTF without aberrations (black), with 1/2 wave of spherical aberration (red), and after balancing with -1/2 wave of defocus; (b) corresponding impulse responses.
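The benefit of balancing can be estimated from the on-axis irradiance, which is proportional to the squared magnitude of the pupil function averaged over the pupil. The sketch below is an assumption-laden illustration: it uses a one-dimensional pupil with normalized height u, +1/2 wave of quartic error at the edge, and −1/2 wave of quadratic error added to cancel it there. The name `strehl_1d` is hypothetical.

```python
import cmath
import math


def strehl_1d(phase, n=2000):
    """On-axis irradiance relative to the unaberrated value for a 1-D pupil.

    phase : function of the normalized pupil height u in [-1, +1] (radians).
    The on-axis amplitude is the mean of exp(i*phase) over the pupil,
    estimated here at the midpoints of n equal cells.
    """
    total = sum(cmath.exp(1j * phase(-1.0 + (2.0 * k + 1.0) / n)) for k in range(n))
    return abs(total / n) ** 2


def spherical(u):
    return math.pi * u ** 4              # +1/2 wave of quartic error at the edge


def balanced(u):
    return math.pi * (u ** 4 - u ** 2)   # same, plus -1/2 wave of defocus


print(f"spherical only : {strehl_1d(spherical):.3f}")
print(f"with defocus   : {strehl_1d(balanced):.3f}")
```

The residual phase π(u^4 − u^2) never exceeds π/4 in magnitude, which is why the balanced case recovers most of the on-axis irradiance.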

Coma

cubic phase at object, linear phase at pupil:

$$\Delta\Phi[x,y;x_0,y_0] = 2\pi\cdot W_{131}\cdot\begin{cases}+\dfrac{r\,r_0^3}{2\lambda_0 z^3}\cos[\phi] & \text{if } \sqrt{x^2+y^2}\le 1\\[1ex] 0 & \text{if } \sqrt{x^2+y^2}>1\end{cases}$$

The surface shape is proportional to the cube of the image height and to the height of the ray in the pupil. This produces a different phase error, and therefore different images, for different values of the image height r0 as shown in the example. The images have a “comet-like” shape, hence the name for the aberration.

Star field imaged through optical system with coma; elongation of the star images increases with distance from optical axis (which is located below bottom of the image). Credit: “Star Gazing with Telescope and Camera,” George T. Keene, Amphoto, Garden City, 1967, p. 93.

Curvature of Field

quadratic phase from object and pupil

$$\Delta\Phi[x,y;x_0,y_0] = 2\pi\cdot W_{220}\cdot\begin{cases}-\dfrac{r^2 r_0^2}{2\lambda_0 z^3} & \text{if } \sqrt{x^2+y^2}\le 1\\[1ex] 0 & \text{if } \sqrt{x^2+y^2}>1\end{cases}$$

As indicated by the name, the “best” images in systems with this aberration are on a curved surface.

Some imaging systems (e.g., Schmidt cameras) are deliberately designed with curved fields because this produces good images over wide fields of view. The sensors used in wide-field Schmidt astronomical cameras were glass plates that were “predistorted” prior to being installed in the camera. Since the plates could be as large as 14" square, this was a touchy operation.

Astigmatism

The Latin word for “points” is “stigmata,” so that a system with astigmatism is not capable of producing points. It focuses “horizontal” and “vertical” patterns at different focal planes, as shown:

Astigmatism focuses vertical and horizontal lines at different planes (horizontal lines in the “sagittal” plane and vertical lines in the “meridional” plane). Ref: http://www.olympusmicro.com/primer/anatomy/aberrations.html

The aberration coefficient for astigmatism is:

quadratic phase from object and pupil and azimuthal variation

$$\Delta\Phi[x,y;x_0,y_0] = 2\pi\cdot W_{222}\cdot\begin{cases}-\dfrac{1}{2\lambda_0 z^3}\, r_0^2\, r^2\cos^2[\phi] & \text{if } \sqrt{x^2+y^2}\le 1\\[1ex] 0 & \text{if } \sqrt{x^2+y^2}>1\end{cases}$$

The error is quadratic with an azimuthal dependence; the additional quadratic is maximized along the azimuthal direction ϕ = 0 and is zero along the orthogonal direction. It therefore adds an azimuthally dependent “focusing” power. In other words, object lines oriented along different directions are focused at different distances from the optic. The eye systems of many people exhibit astigmatism, which means that the corrective lenses must have different powers along the orthogonal axes; in other words, lenses with cylindrical power are needed.

Lenses that have been corrected for astigmatism are known as anastigmats.

Distortion

cubic phase at pupil, linear phase at object, azimuthal variation

$$\Delta\Phi[x,y;x_0,y_0] = 2\pi\cdot W_{311}\cdot\begin{cases}+\dfrac{r^3 r_0}{2\lambda_0 z^3}\cos[\phi] & \text{if } \sqrt{x^2+y^2}\le 1\\[1ex] 0 & \text{if } \sqrt{x^2+y^2}>1\end{cases}$$

This is a cubic dependence on the pupil coordinate and a linear variation in the image coordinate. Like coma, the effect of distortion varies with image height.

The image shapes resulting from distortion with coefficients of different algebraic signs are different. If W311 < 0 or W311 > 0, the images suffer from “pincushion distortion” or “barrel distortion,” respectively.

Images of a grid object through systems with (a) no aberrations; (b) “pincushion” distortion ( W311 < 0); (c) “barrel” distortion ( W311 > 0).

Piston Error

quartic phase at object

$$\Delta\Phi[x,y;x_0,y_0] = 2\pi\cdot W_{400}\cdot\begin{cases}-\dfrac{r_0^4}{2\lambda_0 z^3} & \text{if } \sqrt{x^2+y^2}\le 1\\[1ex] 0 & \text{if } \sqrt{x^2+y^2}>1\end{cases}$$

This is a constant phase due to the off-axis distance at the image plane and has no effect on the irradiance of the image, hence it often is not considered to be an aberration. However, it does have an important effect on optical systems with “sparse” primary elements, such as multiple-mirror telescopes.

constant term from second-order expansion: piston error

Of course, the ultimate resolution of optical systems may be due in part to other uncontrollable factors. For example, ground-based astronomical telescopes are ultimately limited by random variations in local air temperature that create random variations in the refractive index of atmospheric “patches.” These variations are often decomposed into the Seidel aberrations. The constant phase (“piston”) error has no effect on the irradiance (the squared magnitude of the amplitude). Linear phase errors move the image from side to side and/or top to bottom (“tip-tilt”). Quadratic phase errors (“defocus”) add or subtract power from the lens to move the image plane along the axis forwards (towards the optic) or backwards (away from the optic), respectively. In correction for atmospheric phase errors, the tip-tilt error is most significant, which means that correcting this aberration significantly improves the image quality. The field of correcting atmospheric aberrations is called “adaptive optics,” and is an active research area.
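The statements about piston and tip-tilt follow from elementary properties of the Fourier transform and can be illustrated with a small discrete example. This is a sketch under simplifying assumptions: the pupil is a one-dimensional array of n samples that are all open, the psf is taken as the squared magnitude of its discrete Fourier transform, and the function names are hypothetical.

```python
import cmath


def psf_1d(pupil):
    """Squared magnitude of the discrete Fourier transform of the pupil samples."""
    n = len(pupil)
    out = []
    for j in range(n):
        amp = sum(pupil[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        out.append(abs(amp) ** 2)
    return out


def peak_index(values):
    """Index of the largest sample."""
    return max(range(len(values)), key=values.__getitem__)


n = 32
flat = [1.0 + 0.0j] * n                                          # no phase error
piston = [cmath.exp(1j * 0.7) * p for p in flat]                 # constant phase
tilt = [cmath.exp(2j * cmath.pi * 3 * k / n) for k in range(n)]  # linear phase, 3 cycles

print(peak_index(psf_1d(flat)), peak_index(psf_1d(piston)), peak_index(psf_1d(tilt)))
```

The constant phase leaves the psf unchanged, while the linear phase of three cycles across the pupil translates the peak by three samples without changing its shape.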

5.2.4 Zernike Polynomials

It should be no surprise that other useful decompositions of the wavefront errors exist. Another common set of basis functions is the Zernike polynomials, which are often used for fitting data from interferometric optical testing (though NOT in the presence of air turbulence; Zernikes have little value in this situation). The Zernike polynomials are functions of radial and azimuthal coordinates that describe “surfaces” on the unit circle such that the average value of each is zero:

$$Z_n^{\ell}(r,\phi) = R_n^{\ell}(r)\cdot\cos(\ell\cdot\phi)$$

$$Z_n^{-\ell}(r,\phi) = R_n^{\ell}(r)\cdot\sin(\ell\cdot\phi)$$

where the radial part is defined as:

$$R_n^{\ell}(r) = \begin{cases}\displaystyle\sum_{k=0}^{(n-\ell)/2}\frac{(-1)^k\,(n-k)!}{k!\cdot\left(\frac{n+\ell}{2}-k\right)!\cdot\left(\frac{n-\ell}{2}-k\right)!}\; r^{\,n-2k} & \text{if } n-\ell \text{ is even}\\[2ex] 0 & \text{if } n-\ell \text{ is odd}\end{cases}$$

So that:

$$R_0^0(r) = \frac{0!}{0!\cdot 0!\cdot 0!}\cdot r^0 = 1(r) \implies Z_0^0(r,\phi) = 1(r)\cdot\cos(0\cdot\phi) = 1(r)$$

$$R_1^1(r) = \frac{(-1)^0\cdot 1!}{0!\cdot(1)!\cdot(0)!}\cdot r^1 \implies Z_1^1(r,\phi) = r\cdot\cos(1\cdot\phi) = r\cdot\cos(\phi)$$

$$Z_1^{-1}(r,\phi) = R_1^1(r)\cdot\sin(1\cdot\phi) = r\cdot\sin(\phi)$$

etc.
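The radial sum is easy to evaluate directly. The following is a minimal sketch that assumes the standard definition of the Zernike radial polynomial with radial index n and azimuthal index written here as `ell`; the function names and the use of the sign of `ell` to select the cosine or sine term are conventions chosen for this example.

```python
import math


def zernike_radial(n, ell, r):
    """Radial polynomial R_n^ell(r) for 0 <= ell <= n (standard definition assumed)."""
    if (n - ell) % 2:
        return 0.0          # the radial part vanishes when n - ell is odd
    total = 0.0
    for k in range((n - ell) // 2 + 1):
        numerator = (-1) ** k * math.factorial(n - k)
        denominator = (math.factorial(k)
                       * math.factorial((n + ell) // 2 - k)
                       * math.factorial((n - ell) // 2 - k))
        total += numerator / denominator * r ** (n - 2 * k)
    return total


def zernike(n, ell, r, phi):
    """Z_n^ell: cosine term for ell >= 0, sine term for ell < 0."""
    radial = zernike_radial(n, abs(ell), r)
    return radial * (math.cos(ell * phi) if ell >= 0 else math.sin(-ell * phi))


print(zernike_radial(2, 0, 0.5))          # 2 r^2 - 1 evaluated at r = 0.5
print(zernike(1, -1, 1.0, math.pi / 2))   # r sin(phi) at r = 1, phi = pi/2
```

The same function gives, for example, 6r^4 − 6r^2 + 1 for n = 4, ℓ = 0, which is the Zernike counterpart of the quartic spherical-aberration term.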

One advantage of the Zernike polynomials is that distinct polynomials are orthogonal over the unit circle (i.e., the scalar product of any pair of distinct Zernike polynomials vanishes):

$$\int_{r=0}^{r=1} R_n^{\ell}(r)\cdot R_m^{\ell}(r)\; r\, dr \;\propto\; \delta_{nm} \equiv \begin{cases}1 & \text{if } n = m\\ 0 & \text{if } n \neq m\end{cases}$$

where δnm is the Kronecker delta function. The set of the first 36 (nonconstant) Zernike polynomials yields a decomposition with minimum RMS wavefront error. Since they all represent wavefront errors at the exit pupil, the corresponding impulse responses and transfer functions may be calculated; the former are shown in a figure.
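The radial orthogonality can be checked by numerical integration. This sketch assumes the standard radial polynomial and a common azimuthal index `ell` for both factors; the midpoint rule and the function names are choices made for this example. In the standard normalization the integral for n = m equals 1/(2n + 2).

```python
import math


def radial(n, ell, r):
    """Zernike radial polynomial R_n^ell(r) (standard definition assumed)."""
    if (n - ell) % 2:
        return 0.0
    return sum((-1) ** k * math.factorial(n - k)
               / (math.factorial(k)
                  * math.factorial((n + ell) // 2 - k)
                  * math.factorial((n - ell) // 2 - k))
               * r ** (n - 2 * k)
               for k in range((n - ell) // 2 + 1))


def radial_inner_product(n, m, ell, steps=2000):
    """Midpoint-rule estimate of the integral of R_n^ell * R_m^ell * r over 0 <= r <= 1."""
    dr = 1.0 / steps
    return sum(radial(n, ell, (i + 0.5) * dr) * radial(m, ell, (i + 0.5) * dr)
               * (i + 0.5) * dr for i in range(steps)) * dr


for n, m in ((2, 2), (2, 4), (4, 4), (4, 6)):
    print(n, m, round(radial_inner_product(n, m, 0), 6))
```

The weight r in the integrand comes from the area element of the unit circle; without it the radial polynomials are not orthogonal.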

First 28 Zernike polynomials ordered by azimuthal index (horizontally) and radial index (vertically). Ref: http://scien.stanford.edu/class/psych221/projects/03/pmaeda/index_files.

psfs (impulse responses) of the aberrations for each of the first 28 Zernike Polynomials (ref: http://scien.stanford.edu/class/psych221/projects/03/pmaeda/index_files/image096.gif)

5.3 Structural Aberration Coefficients

Structural aberration coefficients are due to the “configuration” or “orientation” of the lens. We have just seen that the lensmaker’s equation ensures that there are many prescriptions for a thin lens with a fixed focal length made from one glass. For example, if n2 = 1.5 and f = 100 mm, we can have R1 = R2 = 100 mm (double convex) or R1 = 50 mm and R2 = ∞ (plano-convex, curved side towards object) or R1 = ∞ and R2 = 50 mm (plano-convex, curved side towards image), and many other possibilities. It is perhaps logical that the aberrations from these different prescriptions will be different too. The calculation leads to one of the “rules of thumb” for optical systems: a better image is generated by an optical system if the side of the optic with the larger radius is on the side with the shorter conjugate, which “divides” the power of the lens more equally between the two surfaces. For example, for a plano-convex lens with the source point at infinity (so that the image is at the focal point), the image exhibits better quality if the curved side of the lens is towards the object. With the flat side towards the object, the front flat surface contributes no power to the image.
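The three prescriptions can be verified with the thin-lens form of the lensmaker’s equation. In this sketch the radii are signed (the text quotes magnitudes), using the assumed convention that a radius is positive when its center of curvature lies on the image side of the surface; the function name is illustrative.

```python
import math


def thin_lens_focal_length(n_glass, r1, r2):
    """Thin lens in air: 1/f = (n - 1) * (1/R1 - 1/R2).

    Assumed sign convention: a radius is positive when the center of
    curvature lies on the image side; use math.inf for a flat surface.
    """
    power = (n_glass - 1.0) * (1.0 / r1 - 1.0 / r2)
    return 1.0 / power


prescriptions = {
    "double convex": (100.0, -100.0),
    "plano-convex, curved side towards object": (50.0, math.inf),
    "plano-convex, curved side towards image": (math.inf, -50.0),
}
for name, (r1, r2) in prescriptions.items():
    print(f"{name}: f = {thin_lens_focal_length(1.5, r1, r2):.1f} mm")
```

All three shapes give the same first-order focal length, so any difference in image quality among them comes from the aberrations rather than from the paraxial power.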

5.4 Optical Imaging Systems and Sampling

Q factor

5.5 Optical System “Rules of Thumb”

1. If imaging with a singlet lens, the aberrations are smaller if the lens surface with more curvature (shorter radius of curvature) is on the side of the longer conjugate. Since the transverse magnification is smaller than 1 in most cases (distant object), the “more curved” side of the lens should be towards the distant object. This divides the power of the surfaces more evenly and minimizes the spherical aberration.

2. If imaging in visible light, the diameter of the diffraction spot in micrometers is approximately equal to the f-number of the system.

3. The MTF at the Rayleigh limit is about 9% (www.normankoren.com/Tutorials/MTF1A.html). Lenses are sharpest in the interval of about two stops between the (small) aperture where diffraction starts to dominate and two stops smaller than the maximum aperture. For 35mm lenses, the maximum aperture often is of the order of f/2, so two stops smaller is typically f/5.6. The aperture at which diffraction starts to dominate depends on wavelength, but is generally accepted as about f/22. Therefore the sharpest range for a 35mm lens is between about f/5.6 and f/11. At larger apertures (smaller f/ numbers), resolution is limited by aberrations (astigmatism, coma, etc.); at small apertures, resolution is limited by diffraction. The MTF of a lens used “wide open” is almost always poorer than its MTF at f/8 because of the aberrations. Note that this discussion does not consider the effects of the sensor, just the lens.

4. Image is visually unaberrated if the Strehl ratio $D \gtrsim 0.8 \implies \sigma_{(\Delta W)} \lesssim 0.075\cdot\lambda_0 \implies \Delta W_{\max} \lesssim \dfrac{\lambda_0}{4} \implies \sigma_{\Delta W} \lesssim \dfrac{\lambda_0}{14}$.

5. If imaging in visible light, the image appears to be “in focus” if the defocus distance measured in micrometers is smaller than $(f/\#)^2$.

6. Depending on source, the resolution r of a lens in line pairs per mm is approximately $\dfrac{1390}{f/\#} \lesssim r \lesssim \dfrac{1600}{f/\#}$.

7. More to come...
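Several of these rules can be evaluated for a given focal ratio. The sketch below is illustrative only: rules 5 and 6 are coded exactly as stated in the list, while the Airy-disk diameter 2.44·λ·(f/#), the wavelength of 0.5 µm, and the exponential approximation for the Strehl ratio are textbook assumptions added here for comparison and are not taken from the list itself. All function names are hypothetical.

```python
import math


def airy_diameter_um(f_number, wavelength_um=0.5):
    """Diameter of the Airy disk to its first dark ring: 2.44 * lambda * (f/#)."""
    return 2.44 * wavelength_um * f_number


def defocus_tolerance_um(f_number):
    """Rule 5 as stated: tolerable defocus in micrometers is about (f/#)**2."""
    return f_number ** 2


def resolution_range_lp_per_mm(f_number):
    """Rule 6 as stated: between 1390/(f/#) and 1600/(f/#) line pairs per mm."""
    return 1390.0 / f_number, 1600.0 / f_number


def strehl_estimate(rms_waves):
    """Assumed approximation exp(-(2*pi*sigma)**2), with sigma in waves."""
    return math.exp(-(2.0 * math.pi * rms_waves) ** 2)


for f_number in (2.0, 5.6, 8.0, 22.0):
    low, high = resolution_range_lp_per_mm(f_number)
    print(f"f/{f_number:g}: spot ~ {airy_diameter_um(f_number):.1f} um, "
          f"defocus < {defocus_tolerance_um(f_number):.0f} um, "
          f"r ~ {low:.0f} to {high:.0f} lp/mm")
print(f"Strehl for sigma = 0.075 waves: {strehl_estimate(0.075):.3f}")
```

With λ = 0.5 µm the Airy diameter is 1.22·(f/#) micrometers, which is the origin of the approximation in rule 2, and an RMS error of 0.075 waves gives a Strehl estimate close to 0.8, consistent with rule 4.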