Deconvolution & Wavefront Coding

From xkcd. Frédo Durand, MIT EECS

Depth of field challenge

Hardware depth of field solution

✦ Stop down! (use smaller aperture) ✦ problem: noise

http://www.cdm-optics.com/site/publications.php

Why you need shallow depth of field

Photo with fake shallow depth of field

Why you need shallow depth of field

Original photo

Today’s plan

✦ Remove blur computationally
  • Understand blur
  • A bit of Fourier
  • Deconvolution
  • Noise, optimal deconvolution, frequency response
✦ Computational optics for DoF extension
  • single image capture + deconvolution
✦ Focal stack
  • Multiple-exposure solution

Blur: Linear shift-invariant filtering

• Replace each pixel by a linear combination of its neighbors.
  – only depends on relative position of neighbors
• The prescription for the linear combination is called the “kernel”.

Local image data:        Kernel:
10  5  3                 0  0    0
 4  5  1                 0  0.5  0
 1  1  7                 0  1    0.5

Modified image data (shown at one pixel): 0.5·5 + 1·1 + 0.5·7 = 7
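To make the prescription concrete, here is a minimal numpy check of the example above (the 3×3 layout is reconstructed from the slide, and the operation is written as a plain weighted sum at one pixel):

```python
import numpy as np

# Local 3x3 image data and kernel, as laid out above.
data = np.array([[10, 5, 3],
                 [ 4, 5, 1],
                 [ 1, 1, 7]], dtype=float)
kernel = np.array([[0, 0.0, 0.0],
                   [0, 0.5, 0.0],
                   [0, 1.0, 0.5]])

# At one pixel, the output is the weighted sum of the neighborhood:
# 0.5*5 + 1*1 + 0.5*7 = 7. (Strictly this is correlation; convolution
# flips the kernel first, which matters for asymmetric kernels.)
print(np.sum(data * kernel))  # 7.0
```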

Convolution

[Plot: filter coefficients (up to 0.3) vs. pixel offset, next to the original image]

Blurring

[Plot: filter coefficients vs. pixel offset; original image and blurred result (filter applied in both dimensions)]

Studying convolution

• Convolution is complicated
  – But at least it’s linear: (f + kg) ⊗ h = f ⊗ h + k (g ⊗ h)
• We want to find a better expression
  – Let’s study functions whose behavior is simple under convolution

Blurring: convolution

[Figure: input signal (a sine wave) convolved with a kernel]

Same shape, just reduced contrast!!! This is an eigenvector (output is the input multiplied by a constant)

Convolution theorem

A convolution in the primal is a multiplication in Fourier

Primal: f ⊗ g   ↔   Fourier: F · G

i.e. Fourier bases are eigenvectors of convolution (convolution is diagonal in the Fourier domain)
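A quick numerical check of the theorem, not from the slides: circular convolution computed from its definition matches multiplication of DFT spectra (the DFT diagonalizes circular convolution):

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal(64)
g = rng.standard_normal(64)

# Circular convolution straight from the definition...
direct = np.array([sum(f[m] * g[(n - m) % 64] for m in range(64))
                   for n in range(64)])

# ...and via the convolution theorem: multiply spectra, transform back.
via_fourier = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)))

print(np.allclose(direct, via_fourier))  # True
```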

Big Motivation for Fourier analysis

• (Complex) sine waves are eigenvectors of the convolution operator –They diagonalize convolution –Convolution theorem

Second motivation for Fourier analysis: sampling

• The sampling grid is a periodic structure
  – Fourier is pretty good at handling that
• Sampling is a linear process
  – (but not shift-invariant)

Sampling

• If we’re lucky, sampling density is enough

[Figure: input signal and its reconstruction from samples]

Sampling

• If we insufficiently sample the signal, it may be mistaken for something simpler during reconstruction (that's aliasing!)
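A tiny illustration of this, my example rather than the slide’s: a sine above the Nyquist rate produces exactly the same samples as a lower-frequency sine, so reconstruction picks the simpler one.

```python
import numpy as np

fs = 8.0                      # sampling rate: 8 samples per unit time
t = np.arange(0, 2, 1 / fs)   # sample positions

# 7 Hz is above the Nyquist limit fs/2 = 4 Hz: its samples coincide
# exactly with those of a -1 Hz sine (aliasing).
print(np.allclose(np.sin(2 * np.pi * 7 * t),
                  np.sin(2 * np.pi * -1 * t)))  # True
```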

Recap: motivation for sine waves

• Blurring sine waves is simple
  – You get the same sine wave, just scaled down
  – The sine functions are the eigenvectors of the convolution operator
• Sampling sine waves is interesting
  – Get another sine wave
  – Not necessarily the same one! (aliasing)

If we represent functions (or images) with a sum of sine waves, convolution and sampling are easy to study

Questions?

Fourier as change of basis

• Shuffle the data to reveal other information
• E.g., take average & difference (a change-of-basis matrix)

[Figure: geometric interpretation of a 2-sample signal in the original basis (basis functions 1 and 2) and after rotation into the average/difference basis, a “pseudo-Fourier” transform]
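For concreteness, my example of the smallest such change of basis, on a two-sample signal:

```latex
\begin{pmatrix} a \\ d \end{pmatrix}
= \frac{1}{\sqrt{2}}
\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
```

The matrix is orthonormal (a 45° rotation): a is the scaled average, the “low frequency”, and d the scaled difference, the “high frequency”.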

Fourier as change of basis

• Same thing with infinite-dimensional vectors

[Figure: the same geometric interpretation with infinite-dimensional basis functions: signal, after rotation, pseudo-Fourier]

Questions?


Other presentations of Fourier

• Start with periodic signal
• Heat equation
  – more or less a special case of convolution
  – iterate → exponential on eigenvalues


Motivations

• Insights & mathematical beauty
• Sampling rate and filtering bandwidth
• Computation bases
  – FFT: faster convolution
  – E.g. finite elements, fast filtering, heat equation, vibration modes
• Optics: wave nature of light & diffraction

Primal vs. dual

• Often, we use the Fourier domain only for analysis –convergence, well-posedness • Computation is performed in the primal

• In other cases, computation is better in Fourier –faster because diagonal


Questions?

Can we undo blur?

[Figure: original → blurred, and blurred → original?]
Not easy, even when we know the kernel

Recall convolution theorem

✦ Convolution in space is a multiplication in Fourier
✦ Denote by y the observed blurry image and by x the original sharp one
✦ y = g ⊗ x in the spatial domain
✦ Y = GX in the Fourier domain
  • Each frequency is independent of the others

Invert the convolution theorem

✦ Given y = g ⊗ x and g, we seek an estimate x’ of x
✦ How do you invert a multiplication? • Division!
✦ X’(ω) = Y(ω) / G(ω)

✦ DECONVOLUTION IS A DIVISION IN THE FOURIER DOMAIN!
✦ Which means it is also a convolution in the spatial domain, by the inverse Fourier transform of 1/G
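A 1D sketch of naive deconvolution, my code with an illustrative box blur (whose spectrum here has small values but no exact zeros):

```python
import numpy as np

n = 256
x = np.zeros(n); x[60:120] = 1.0   # sharp 1D "image"
g = np.zeros(n); g[:9] = 1.0 / 9   # 9-tap box blur (circular)

G = np.fft.fft(g)
y = np.real(np.fft.ifft(np.fft.fft(x) * G))   # blurry observation y = g ⊗ x

# Naive deconvolution: divide the spectra. Perfect without noise...
x_hat = np.real(np.fft.ifft(np.fft.fft(y) / G))
print(np.max(np.abs(x_hat - x)))   # essentially zero: exact recovery

# ...but noise gets multiplied by 1/G, which is huge where |G| is small.
y_noisy = y + 0.01 * np.random.default_rng(1).standard_normal(n)
x_bad = np.real(np.fft.ifft(np.fft.fft(y_noisy) / G))
print(np.std(x_bad - x))           # much larger than the 0.01 noise level
```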

Questions?

Potential problem?

✦ Deconvolution is a division in the Fourier domain
✦ Division by zero is bad!
  • Information is lost at the zeros of the kernel spectrum G

Noise problem

✦ Even when there is no zero, noise is a big problem
✦ If G has small values, division amplifies noise
✦ If y = g ⊗ x + v, where v is additive noise, then
✦ Y = GX + V
✦ X’ = (GX + V)/G = X + V/G
✦ V is amplified by 1/G. This is why you typically get more high-frequency noise with deconvolution

Noise problem

[Figure: blurry image without noise and its deconvolution; blurry image with noise and its deconvolution]

http://www.mathworks.com/products/demos/image/deblur_wiener/deblur.html

Questions?

Noise: extreme case

✦ If G(ω) = 0, then Y(ω) = V(ω)
✦ What is the best estimate of X(ω)? X’(ω) = 0

✦ Even if G(ω) is tiny, dividing by a tiny number is a bad idea, and an estimate much closer to zero is better

✦ The strategy should depend on the relative noise
  • low noise: just divide
  • high noise: under-estimate, closer to zero

Noise, even without convolution

✦ y=x+v ✦ or in Fourier Y=X+V

(additive White Gaussian Noise)

Noise, even without convolution

✦ Say we know that E(X²) = 1, E(V²) = 16
  • Pretty bad signal-to-noise ratio!
✦ We observe Y = X + V = 5
✦ It’s more likely to be X = 0.5 + V = 4.5 than X = 4 + V = 1
✦ How can we optimize our bet?

http://www.norbertwiener.umd.edu/NW/index.html

Wiener denoising

✦ Optimal estimation given known noise and signal powers
✦ Derive in Fourier domain because SNR is best known per frequency
  • Spectrum of images usually falls off as 1/ω²
  • Noise is often white (flat spectrum) or high frequency

Wiener denoising (in Fourier)

✦ Given Y = X + V, find X’ = HY to minimize E(|X − X’|²)

✦ argmin E(|X − HY|²) ⇒ argmin E(|X − H(X + V)|²)

✦ argmin E(|(1 − H)X − HV|²)

✦ X and V are assumed independent, E(XV) = 0: expand and ignore cross terms

✦ argmin (1 − H)² E(|X|²) + H² E(|V|²)

✦ Differentiate w.r.t. H, set to zero:

✦ (2H − 2) E(|X|²) + 2H E(|V|²) = 0

✦ H = E(|X|²) / (E(|X|²) + E(|V|²))

✦ Divide numerator and denominator by E(|X|²) to get a function of SNR:

H = 1 / (1 + 1/SNR)

Wiener denoising

X’ = Y / (1 + 1/SNR)

✦ When SNR is high, gain goes towards 1 ✦ When SNR is low, gain goes to zero
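A per-frequency sketch of the Wiener gain, assuming the power models from the earlier slide (image spectrum falling as 1/ω², white noise); the constants are illustrative:

```python
import numpy as np

def wiener_denoise(y, S, N):
    """Apply the per-frequency gain H = S/(S+N) = 1/(1 + 1/SNR)."""
    H = S / (S + N)
    return np.real(np.fft.ifft(H * np.fft.fft(y)))

n = 512
omega = np.abs(np.fft.fftfreq(n))
S = 1.0 / np.maximum(omega, 1.0 / n) ** 2   # E(|X|^2) ~ 1/omega^2
N = np.full(n, 50.0)                        # E(|V|^2): white noise, flat

H = S / (S + N)
print(H[1], H[n // 2])   # ~1 at low frequency (high SNR), ~0 at high frequency
```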

Questions?

Back to convolution

✦ Assume we know the expected noise power spectrum and the expected signal power spectrum

✦ Can we tweak 1/G to reduce output noise?
✦ Maybe if we use something smaller than 1/G
  • we won’t amplify noise as much
  • but the inversion won’t be as correct

http://www.norbertwiener.umd.edu/NW/index.html

Wiener deconvolution

✦ Find the gain H that minimizes |X’(ω) − X(ω)|², where X’ = HY
✦ We need to know the signal-to-noise ratio SNR(ω) = E(|X(ω)|²) / E(|V(ω)|²)
✦ Optimal filter:

H(ω) = (1/G(ω)) · |G(ω)|² / (|G(ω)|² + 1/SNR(ω))

✦ See http://en.wikipedia.org/wiki/Wiener_deconvolution
  • careful, their notations are different from mine

Wiener deconvolution derivation

✦ Find H to minimize |X’(ω) − X(ω)|², where X’ = HY (ω dropped below for simplicity)
✦ argmin E(|HY − X|²) ⇒ argmin E(|H(GX + V) − X|²)
✦ argmin E(|(HG − 1)X + HV|²)
✦ X and V are assumed independent: E(XV) = 0; expand and ignore cross terms
✦ argmin |HG − 1|² E(|X|²) + |H|² E(|V|²)
✦ E(|X|²) and E(|V|²) are given by the expected spectra
✦ argmin |HG − 1|² + 1/SNR · |H|²
✦ Differentiate w.r.t. H, set to zero, get the Wiener filter
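The same derivation in code, a sketch: note that conj(G)/(|G|² + 1/SNR) is the optimal filter above written without an explicit division by G, so it behaves even at zeros of G. A scalar SNR stands in for SNR(ω) for simplicity:

```python
import numpy as np

def wiener_deconvolve(y, g, snr):
    """X' = H·Y with H = conj(G) / (|G|^2 + 1/SNR), per frequency."""
    Y, G = np.fft.fft(y), np.fft.fft(g, len(y))
    H = np.conj(G) / (np.abs(G) ** 2 + 1.0 / snr)
    return np.real(np.fft.ifft(H * Y))

# Box-blur test signal as before, now observed with noise.
n = 256
x = np.zeros(n); x[60:120] = 1.0
g = np.zeros(n); g[:9] = 1.0 / 9
rng = np.random.default_rng(0)
y = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(g)))
y = y + 0.01 * rng.standard_normal(n)

x_naive = np.real(np.fft.ifft(np.fft.fft(y) / np.fft.fft(g)))
x_wiener = wiener_deconvolve(y, g, snr=1e3)
# Wiener attenuates frequencies where |G| is small instead of amplifying noise:
print(np.std(x_naive - x), ">", np.std(x_wiener - x))
```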

Wiener result

http://cnx.org/content/m15167/latest/

Results from Wiener

[Figure: blurry with noise; naive deconvolution; Wiener deconvolution]

http://www.mathworks.com/products/demos/image/deblur_wiener/deblur.html

Questions?

Note

✦ Wiener is derived in the Fourier domain
✦ But in some cases it can be applied directly in the primal
✦ In particular when SNR is 1/ω²
  • Rely on the image gradient
  • min ||y − x ⊗ g||² + k ||∇x||²
    – where k depends on SNR
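A sketch of that primal-prior form, my code: the quadratic gradient penalty still has a closed form in Fourier, where k·|D(ω)|² (≈ k·ω² at low frequencies) plays the role of 1/SNR(ω) under a 1/ω² signal spectrum.

```python
import numpy as np

def deconv_gradient_prior(y, g, k):
    """argmin_x ||y - x ⊗ g||^2 + k ||∇x||^2 (1D, circular), closed form:
       X = conj(G)·Y / (|G|^2 + k·|D|^2), D = spectrum of the difference filter."""
    n = len(y)
    Y, G = np.fft.fft(y), np.fft.fft(g, n)
    d = np.zeros(n); d[0], d[-1] = 1.0, -1.0   # circular finite-difference filter
    D = np.fft.fft(d)
    X = np.conj(G) * Y / (np.abs(G) ** 2 + k * np.abs(D) ** 2)
    return np.real(np.fft.ifft(X))
```

With k matched to the noise level this behaves much like the Wiener filter above.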

At a high level

✦ Wiener can be seen as an example of regularization with a prior on the signal
✦ We know the power spectrum
✦ Other priors are possible
  • Sparsity of the gradient
  • other filters
✦ Active idea in computer vision

Recap

✦ Fourier bases diagonalize convolution
  • Convolution → multiplication in Fourier
✦ Naive deconvolution is division
✦ Optimal deconvolution takes SNR into account:

H(ω) = (1/G(ω)) · |G(ω)|² / (|G(ω)|² + 1/SNR(ω))

✦ Deconvolution quality depends on the blur frequency response
  • We want a high blur spectrum

Questions?


Blind deconvolution

✦ So far we have assumed we know the kernel
✦ When both x and g are unknown, the problem is badly ill-posed
  • It is called blind deconvolution

✦ See e.g. Removing Camera Shake from a Single Photograph, Rob Fergus, Barun Singh, Aaron Hertzmann, Sam T. Roweis, William T. Freeman (MIT CSAIL / University of Toronto)
  • http://www.wisdom.weizmann.ac.il/~levina/papers/deconvLevinEtal09-MIT-TR.pdf
  • http://cs.nyu.edu/~fergus/research/deblur.html

[Slide: first page of “Removing Camera Shake from a Single Photograph” (Fergus et al.), including Figure 1. Left: an image spoiled by camera shake. Middle: result from Photoshop “unsharp mask”. Right: result from their algorithm.]

Questions?

Wavefront coding

Is depth of field a blur?

• Depth of field is NOT a convolution of the image
• The circle of confusion varies with depth
• There are interesting occlusion effects
• (If you really want a convolution, there is one, but in the light field...)

From Macro Photography

Wavefront coding

• CDM-Optics, U of Colorado, Boulder
• Improve depth of field using weird optics & deconvolution
• http://www.cdm-optics.com/site/publications.php
  – The worst title ever: “A New Paradigm for Imaging Systems”, Cathey and Dowski, Appl. Optics, 2002

Wavefront coding

• Idea: deconvolution to deblur out-of-focus regions
• Problem 1: depth of field blur is not shift-invariant
  – Depends on depth
  – If depth of field is not a convolution, it’s harder to use deconvolution ;-(
• Problem 2: depth of field blur “kills information”
  – The Fourier transform of the blurring kernel has a low frequency response

Wavefront coding

• Idea: deconvolution to deblur out-of-focus regions
  – Problem 1: depth of field blur is not shift-invariant
  – Problem 2: depth of field blur “kills information”
• Solution: change the optical system so that
  – Rays don’t converge anymore
  – Image blur is the same for all depths
  – The blur spectrum is higher

Wavefront coding

• Idea: deconvolution to deblur out-of-focus regions
  – Problem 1: depth of field blur is not shift-invariant
  – Problem 2: depth of field blur “kills information”
• Solution: change the optical system so that
  – Rays don’t converge anymore
  – Image blur is the same for all depths
  – The blur spectrum is higher
• How it’s done
  – Phase plate (cubic profile z = y³ + x³)
  – Does things similar to spherical aberration
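A toy scalar-diffraction sketch of why the cubic plate helps (my code, with unit wavelength/aperture and defocus modeled as a quadratic pupil phase; all constants are illustrative):

```python
import numpy as np

n = 128
u = np.linspace(-1, 1, n)
xp, yp = np.meshgrid(u, u)                     # pupil-plane coordinates
aperture = (xp**2 + yp**2 <= 1).astype(float)  # circular pupil

def psf(defocus, alpha):
    # Pupil phase = cubic plate alpha*(x^3+y^3) plus defocus psi*(x^2+y^2);
    # the incoherent PSF is the squared magnitude of the pupil's Fourier transform.
    phase = alpha * (xp**3 + yp**3) + defocus * (xp**2 + yp**2)
    pupil = aperture * np.exp(2j * np.pi * phase)
    h = np.abs(np.fft.fftshift(np.fft.fft2(pupil))) ** 2
    return h / h.sum()

# Without the plate the PSF changes drastically with defocus; with a strong
# cubic term its peak (and shape) stays nearly constant across depths, so a
# single deconvolution kernel works for the whole range.
for psi in (0.0, 4.0, 8.0):
    print(psi, psf(psi, alpha=0.0).max(), psf(psi, alpha=20.0).max())
```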

Ray version

Frequency response (MTF)

Results

Questions?

Philosophy: Image capture

✦ A sensor placed alone in the middle of the visual world does not record an image

Image capture

✦ Pinhole allows you to select light rays

Image formation: optics

✦ Optics forms an image: selects and integrates light rays

Image formation: computation

✦ The combination of optics & computation forms the image: selects and combines rays

[Diagram: Generalized optics → Intermediate optical image → Computation → Final image]

Computational imaging goals

✦ Better capture information

✦ Form image as a post-process

[Diagram: Generalized optics → Intermediate optical image → Computation → Final image]

Better capture information

✦ Same as communication theory: optics encodes, computation decodes

✦ Code seeks to minimize distortion

Form images as a post-process

✦ The computational part of formation can be done later and multiple times

✦ e.g., enable refocusing

Questions?

Other forms of coded imaging

• Tomography
  – e.g. http://en.wikipedia.org/wiki/Computed_axial_tomography
  – Lots of cool Fourier transforms there
• X-ray telescopes & coded aperture
  – e.g. http://universe.gsfc.nasa.gov/cai/coded_intr.html
• Radar, synthetic aperture
• Raskar’s motion blur
• and to some extent, Bayer mosaics

See Berthold Horn's course at MIT

Other computational depth of field extension

• Focus sweep (Hausler 72, Nagahara et al. 08): depth-invariant blur
• Wavefront coding (Dowski & Cathey 94): depth-invariant blur
• Coded aperture (Levin et al. 07, Veeraraghavan et al. 07): depth-varying blur, depth estimation required

And in the next slides: the lattice-focal lens, Levin et al. 09

New solution: The lattice-focal lens

Levin et al. 09: assembly of subsquares with different focal powers

each element focuses on a different depth

toy lattice-focal lens with 4 elements

Hardware construction

Proof of concept

• 12 subsquares cut from plano-convex spherical lenses

• Attached to the main lens; the extra focal power needed is very low

• Modest DOF extension with only 12 subsquares

Depth estimation

• Defocus kernels vary with depth

defocus kernels at different depths

• Depth estimation as for the coded aperture camera [Levin et al. 07]

[Figure: input image and estimated depth map]

Standard lens reference

Lattice-focal lens

Lattice-focal lens

[Slide: first page of “4D Frequency Analysis of Computational Cameras for Depth of Field Extension” (Levin et al.), including Figure 1. Left: image from a standard lens showing limited depth of field, with only the rightmost subject in focus. Center: input from their lattice-focal lens, whose defocus kernel is designed to preserve high frequencies over a wide depth range. Right: an all-focused image processed from the lattice-focal lens input.]

Application: Refocusing from single captured image

Application: Refocusing from single captured image

Application: Refocusing from single captured image

Depth of field analysis

• How do different cameras compare?
• What is the best that can be done?

Our new theoretical analysis

✦ http://www.wisdom.weizmann.ac.il/~levina/papers/lattice/

✦ In the 4D light field
  • Fourier analysis
✦ Shows that only a 3D subset of the 4D spectrum is useful (the dimensionality gap)
✦ Inspires a new lens design: the lattice-focal lens

Previous designs spend energy outside the useful subset

Comparison of different cameras

Comparing image reconstruction (simulation)


[Figure: simulated reconstructions for a standard lens, coded aperture, focus sweep, wavefront coding, and the lattice-focal lens]

Object at in-focus depth

Object at extreme depth

References

• http://www.cdm-optics.com/site/publications.php
• http://www.wisdom.weizmann.ac.il/~levina/papers/lattice/
• http://www1.cs.columbia.edu/CAVE/publications/pdfs/Nagahara_ECCV08.pdf
• http://groups.csail.mit.edu/graphics/CodedAperture/
• http://web.media.mit.edu/~raskar/Mask/

Focal stacks

Focal stack DoF extensions

• Capture N images focused at different distances
• For each output pixel, choose the sharpest image
  – e.g. look at local variance, gradient (see the sketch below)

From Agarwala et al.
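A minimal sketch of the selection rule (my code, using Laplacian local energy as the sharpness measure; methods like Agarwala et al.’s photomontage additionally optimize seam placement with graph cuts):

```python
import numpy as np
from scipy import ndimage

def all_in_focus(stack):
    """Merge a focal stack (N, H, W): per pixel, keep the slice with the
    highest local sharpness (local energy of the Laplacian response)."""
    stack = np.asarray(stack, dtype=float)
    sharpness = np.empty_like(stack)
    for i, img in enumerate(stack):
        lap = ndimage.laplace(img)                    # edges respond when in focus
        sharpness[i] = ndimage.uniform_filter(lap ** 2, size=9)
    best = np.argmax(sharpness, axis=0)               # sharpest slice per pixel
    return np.take_along_axis(stack, best[None], axis=0)[0]
```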

Focal stack

Montage

Macro montage
• 55 images here

Thursday, February 11, 2010 Stanford Tech Report CTSR 2005-02

Stanford Tech Report CTSR 2005-02

Figure 16: Refocusing of a portrait. Left shows what the conventional photo would have looked like (autofocus mis-focused by only 10 cm on the girl’s hair). Right shows the refocused photograph. Focal stack & plenoptic camera Stanford Tech Report CTSR 2005-02 Light Field Photography with a Hand-Held Plenoptic Camera, Ren Ng, Marc Levoy, Mathieu Brédif, Gene Duval, Mark Horowitz, Pat Hanrahan Stanford Tech Report CTSR 2005-02 • Capture light field C

B Figure 17: Light field photograph of water splashing out of of a broken A wine glass, refocused at different depths. (a) (b) (c) (d) C Figure 13: A complete light field captured by our prototype. Careful examination (zoom in on electronic version, or useFigure magnifying 16: Refocusing glass in print) of a reveals portrait. Left shows what the conventional photo would have looked like (autofocus mis-focused by only 10 cm on the 292 292 microlens images, each approximately 0.5 mm wide in print. Note the corner-to-corner clarity andB detail of microlens images across the light field,× which illustrates the quality of our microlens focusing. (a), (b) and (c) show magnified views of regions outlined ongirl’s the key hair). in (d).Right These shows close-u the refocusedps are photograph. representative of three types of edges that can be found througout the image. (a) illustrates microlensesA at depths closer than the focal plane. In these right-side up microlens images, the woman’s cheek appears on the left, as it appears in the macroscopic image. In contrast, (b) illustrates microlenses at depths further (a) (b) (c) (d) than the focal plane. In these invertedFiguremicrolens 13: A complete lightimages, field captured the by ourman’s prototype. cheek Careful examination appears (zoom on in onthe electronicright version,, opposite or use magnifying the glass macroscopic in print) reveals world. This effect is due to inversion 292 292 microlens images, each approximately 0.5 mm wide in print. Note the corner-to-corner clarity and detail of microlens images across the light of the microlens’ rays as they pass throughfield,× which illustrates the world the quality focal of our microlens plane focusing. before (a), (b) arriving and (c) show magnified at the views main of regions lens. outlined Finally, on the key in (c) (d). These illustrates close-ups are microlenses on edges at the focal plane representative of three types of edges that can be found througout the image. (a) illustrates microlenses at depths closer than the focal plane. In these right-side (the fingers that are clasped together).up microlens The microlensesimages, the woman’s cheek at appearsthis depthon the left, asare it appears constant in the macroscopic in color image. because In contrast, (b) illustratesall the microlenses rays arriving at depths further at the microlens originate from the same than the focal plane. In these inverted microlens images, the man’s cheek appears on the right, opposite the macroscopic world. This effect is due to inversion point on the fingers, which reflect lightof the microlens’ diffusely. rays as they pass through the world focal plane before arriving at the main lens. Finally, (c) illustrates microlenses on edges at the focal plane (the fingers that are clasped together). The microlenses at this depth are constant in color because all the rays arriving at the microlens originate from the same pointFigure on the fingers, which reflect14: light diffusely. Refocusing after a single exposure of the light field camera. Top is the photo that would have resulted from a conventional camera, focused 9 • Refocus to create focal stack on the clasped fingers.9 The remaining images are photographs refocused at different depths: middle row is focused on first and second figures; last row is focused on third and last figures. 
CompareFigure 17: Light especially field photograph of water middle splashing left out of ofand a broken bottom right for full effective depth of field.wine glass, refocused at different depths.

Figure 14: Refocusing after a single exposure of the light field camera. Top is the photo that would have resulted from a conventional camera, focused on the clasped fingers. The remaining images are photographs refocused at different depths: middle row is focused on first and second figures; last row is focused on third and last figures. Compare especially middle left and • Use photomontage to bottom right for full effective depth of field. generate all-focus image

Figure 18: Moving the observer in the macrophotography regime (1:1 mag- nification), computed after a single light field camera exposure. Top row Thursday, February 11, 2010 shows movement of the observer laterally within the lens plane, to pro- duce changes in parallax. Bottom row illustrates changes in perspective by moving along the optical axis, away from the scene to produce a near- orthographic rendering (left) and towards the scene to produce a medium Figure 15: Left: Extended depth of field computed from a stack of pho- wide angle (right). In the bottom row, missing rays were filled with closest tographs focused at different depths. Right: A single sub-aperture image, available (see Figure 7). which has equal depth of field but is noisier. Figure 18: Moving the observer in the macrophotography regime (1:1 mag-

10 nification), computed after a single light field camera exposure. Top row shows movement of the observer laterally within the lens plane, to pro- duce changes in parallax. Bottom row illustrates changes in perspective by moving along the optical axis, away from the scene to produce a near- orthographic rendering (left) and towards the scene to produce a medium Figure 15: Left: Extended depth of field computed from a stack of pho- wide angle (right). In the bottom row, missing rays were filled with closest tographs focused at different depths. Right: A single sub-aperture image, available (see Figure 7). which has equal depth of field but is noisier.

10 Stanford Tech Report CTSR 2005-02

Figure 16: Refocusing of a portrait. Left shows what the conventional photo would have looked like (autofocus mis-focused by only 10 cm on the girl’s hair). Right shows the refocused photograph.

Figure 17: Light field photograph of water splashing out of of a broken wine glass, refocused at different depths.

Figure 14: Refocusing after a single exposure of the light field camera. Top is the photo that would have resulted from a conventional camera, focused on the clasped fingers. The remaining images are photographs refocused at different depths: middle row is focused on first and second figures; last row is focused on third and last figures. Compare especially middle left and bottomFocal right for full stack effective depth of& field. plenoptic camera

Figure 18: Moving the observer in the macrophotography regime (1:1 mag- nification), computed after a single light field camera exposure. Top row shows movement of the observer laterally within the lens plane, to pro- Fromduce Ng changes et al. in parallax. http:// Bottom row illustrates changes in perspective by moving along the optical axis, away from the scene to produce a near- graphics.stanford.eduorthographic rendering (left) and towards the scene to produce a medium Figure 15: Left: Extended depth of field computed from a stack of pho- /papers/lfcamera/wide angle (right). In the bottom row, missing rays were filled with closest tographs focused at different depths. Right: A single sub-aperture image, available (see Figure 7). which has equal depth of field but is noisier.

Thursday, February 11, 2010

References

• http://www.janrik.net/ptools/ExtendedFocusPano12/index.html
• http://www.outbackphoto.com/workflow/wf_72/essay.html
• http://grail.cs.washington.edu/projects/photomontage/
• http://people.csail.mit.edu/hasinoff/timecon/

• http://graphics.stanford.edu/papers/lfcamera/

Slicing

Scanning: combination in 1 exposure

From Macro photography: Learning from a Master
