Single-Shot Reflectance Measurement from Polarized Gradient Illumination

Graham Fyffe    Paul Debevec
USC Institute for Creative Technologies, 12015 Waterfront Drive, Los Angeles, CA 90094
[email protected]

Abstract

We present a method for acquiring the per-pixel diffuse albedo, specular albedo, and surface normal maps of a subject at a single instant in time. The method is single-shot, requiring no optical flow, and per-pixel, making no assumptions regarding albedo statistics or surface connectivity. We photograph the subject inside a spherical illumination device emitting a static lighting pattern of vertically polarized RGB color gradients aligned with the XYZ axes, and horizontally polarized RGB color gradients inversely aligned with the XYZ axes. We capture simultaneous photographs using one of two possible setups: a single-view setup using a coaxially aligned camera pair with a polarizing beam splitter, and a multi-view stereo setup with different orientations of linear polarizing filters placed on the cameras, enabling high-quality geometry reconstruction. From this lighting we derive full-color diffuse albedo, single-channel specular albedo suitable for dielectric materials, and polarization-preserving surface normals which are free of corruption from subsurface scattering. We provide simple formulae to estimate the diffuse albedo, specular albedo, and surface normal maps in the single-view and multi-view cases, and show error bounds which are small for many common subjects including faces.

1. Introduction

As the physical and digital worlds converge, there is an increasing need for creating digital models of people and objects. A digital model typically consists of geometric information, indicating the shape of the object's surfaces, and reflectance information, indicating how each part of the object reflects light. Acquiring the geometry of an object can be done in many ways, from laser scanning, to structured light, to passive photogrammetry, the latter of which can be performed from photos all shot at the same instant. Acquiring the reflectance of an object – the coloration of each surface, which parts are diffuse, which parts are shiny, and high-resolution surface detail – typically involves photographing the object under a series of lighting conditions and fitting the observations to a reflectance model. With the reflectance captured, the digital model can be digitally rendered as if lit by the light of any desired environment, making the object a useful digital asset.

Single-shot scanning techniques, where the acquisition takes place at a single brief moment of time, make model acquisition more efficient and much easier to apply to dynamic subjects such as facial performances. However, since the subject is lit by just one lighting condition, not much about the object's reflectance can be captured beyond its appearance under diffuse illumination. Far more useful would be to have, at each surface point, a measurement of the subject's diffuse color, its specular component, and a high-resolution measurement of its surface normal.

In this paper, we present a novel single-shot scanning technique which uses a color polarized illumination setup to record precisely these measurements. The subject is placed in a sphere of red, green, and blue LEDs, with horizontally-oriented and vertically-oriented linear polarizers distributed throughout. With this setup, different gradient directions of light are produced on the different polarizations of the color channels, and the subject is photographed with a set of cameras, some of which are polarized horizontally and some of which are polarized vertically. We leverage the fact that for dielectric materials including skin, the specular reflection component preserves the polarization, and the diffuse component depolarizes the light. These lighting conditions and cameras allow the diffuse color, specular intensity, and photometric surface orientation to be estimated at each pixel location on the object, yielding a single-shot scanning technique for both geometry and reflectance. We demonstrate the technique with two experimental setups: one using two DSLR cameras and a polarizing beam splitter to record reflectance from a single viewpoint, and a multi-view setup with a set of DSLR cameras placed around the subject with differently oriented polarization filters.

2. Related Work

Reflectance measurement is a well-studied topic, and many methods have been proposed to obtain meaningful reflectance information for a subject within a small budget of photographs. One hugely successful method is photometric stereo, introduced by Woodham [14], wherein multiple observations of a subject are made from a fixed viewpoint while varying the illumination. Using as few as three incident illumination directions, the surface normal and albedo of Lambertian surfaces may be recovered. Woodham also describes the use of three colored light sources to perform photometric stereo in a single photograph, which is possible under the constant chromaticity assumption that all scene materials vary only in brightness, and not in hue or saturation. Scenes with multi-color materials violate this assumption. Still within the realm of single-shot methods, the constant chromaticity assumption may be relaxed, so that surfaces can be categorized into a few regions of constant chromaticity [2]. Yet, subjects having subtle chromaticity variations violate this assumption, so while surface normal estimates may be accurate, the recovered surface albedo maps appear quantized. Alternatively, the constant chromaticity assumption may be lifted altogether, by introducing the assumption that optical flow can align photographs sharing common illumination in some color channels while varying in others [10], or photographs taken under complementary illumination conditions [13, 9]. Scenes with complex motion violate this assumption, such as the facial motions of speech. Optical flow may be avoided using dichroic color filters to separate the color signal into six simultaneously photographed color channels [8], using an apparatus with complementary light sources that appear white to the naked eye; however, this introduces the assumption that the reflectance spectra of the scene materials exist in a low-dimensional linear basis. Our proposed approach uses a polarizing beam splitter or polarizing filters to separate the reflected light into six simultaneously photographed channels. We employ complementary color spherical gradients to encode the surface normal and specular albedo in the polarization-preserving signal. While previous methods employing color photometric stereo are adversely affected by subsurface scattering in the diffuse component, our method is unaffected by subsurface scattering as the polarization-preserving signal consists primarily of achromatic single scattering and specular reflection. Further, the diffuse albedo is encoded in the depolarizing signal, enabling polarization-based diffuse-specular separation in a single shot where previous methods require multiple photographs or color-space heuristics. Our approach is a form of polarization multiplexing [5] where the object is simultaneously illuminated by different polarizations, but we use area light sources and multiple color channels to enable multi-view single-shot photometric stereo from the single lighting condition.

Another category of single-shot works employs multi-view stereo reconstruction to estimate scene geometry, followed by further analysis of the corresponded photographs to estimate surface detail. Surface details may be estimated without photometric stereo, based on an assumed relationship between dark texture features and surface relief [3], but this assumption is violated by dark convex features or light concave features. A shape-from-shading approach may be employed that estimates the relationship between surface normal and observed intensity using the coarse stereo base mesh [15], however the assumption that this relationship holds over sufficiently large regions of the object is violated in general scenes. We extend our proposed polarization-encoded photometric stereo approach to the multi-view case, where inexpensive linear polarizing filters are used in place of a polarizing beam splitter. Thus with a single shot we perform multi-view stereo reconstruction, photometric stereo, and polarization-based diffuse-specular separation, enabling highly detailed geometry reconstruction and reflectance estimation.

3. Apparatus

Our apparatus consists of a 2.7m spherical geodesic structure outfitted with 2,040 Luxeon Rebel LEDs arranged into 680 evenly distributed clusters, each having one red, one green and one blue LED, half of which are vertically polarized and half horizontally polarized.
The LEDs are controlled by a driver to emit color gradient illumination that is a function of the lighting direction Θ relative to the center of the sphere, which we break into two functions: L_v(Θ) for vertically polarized LEDs, and L_h(Θ) for horizontally polarized LEDs. Writing RGB color values as (R, G, B), the illumination functions are:

  L_v(\Theta) = \tfrac{1}{2}\,(1 + \Theta_x,\; 1 + \Theta_y,\; 1 + \Theta_z);   (1)

  L_h(\Theta) = \tfrac{1}{2}\,(1 - \Theta_x,\; 1 - \Theta_y,\; 1 - \Theta_z).   (2)

We place subjects at the center of the sphere and photograph them using one or more pairs of cameras, each pair having one camera with vertically polarizing optics and the other with horizontally polarizing optics. We explore two types of polarizing optics: a polarizing beam splitter, allowing the two cameras to be coaxially aligned producing an image pair with no parallax; and linear polarizing filters, which do not allow for coaxial images but are inexpensive and suitable for multi-view stereo capture. We calibrate the exposure and white balance of the cameras and image processing software such that photographs of a 50% reflective diffuse card at the center of the sphere produce pixel values of (0.25, 0.25, 0.25) in all camera views (for both horizontally and vertically polarized cameras).

4. Lighting Model

For a given pixel, consider the 6-channel intensity vector I = (r_v, g_v, b_v, r_h, g_h, b_h), where i_v = (r_v, g_v, b_v) are the vertically polarized color intensities and i_h = (r_h, g_h, b_h) are the horizontally polarized color intensities. We model the observed reflectance function as a sum of three components: a depolarizing RGB multiple-scattering component m(Θ), a polarization-preserving achromatic single-scattering component t(Θ), and a polarization-preserving achromatic specular component s(Θ), which is modulated by the Fresnel equations depending on the polarization axis according to the Mueller calculus [4]. Thus we have:

  i_* = \int_\Omega \Big( m(\Theta) \circ \tfrac{1}{2}\big(L_v(\Theta) + L_h(\Theta)\big) + t(\Theta)\, L_*(\Theta) + s(\Theta)\, *^{T} R_\phi M R_\psi \,[1,\; L_h(\Theta) - L_v(\Theta),\; 0,\; 0]^{T} \Big)\, d\omega_\Theta,   (3)

where * denotes either the subscript v or h, ◦ is the element-wise product, h and v are the Stokes vectors [\tfrac{1}{2}, \tfrac{1}{2}, 0, 0]^{T} and [\tfrac{1}{2}, -\tfrac{1}{2}, 0, 0]^{T} for horizontal and vertical linear polarization, R_φ is the Stokes rotation matrix from the surface frame to the camera frame, R_ψ is the Stokes rotation matrix from the light source frame to the surface frame, M is the Mueller reflection matrix, 1 = (1, 1, 1) and 0 = (0, 0, 0). For dielectric materials, M takes the form of a polarizer:

  M = \begin{bmatrix} \tfrac{1}{2}(p_\perp^2 + p_\parallel^2) & \tfrac{1}{2}(p_\perp^2 - p_\parallel^2) & 0 & 0 \\ \tfrac{1}{2}(p_\perp^2 - p_\parallel^2) & \tfrac{1}{2}(p_\perp^2 + p_\parallel^2) & 0 & 0 \\ 0 & 0 & p_\perp p_\parallel & 0 \\ 0 & 0 & 0 & p_\perp p_\parallel \end{bmatrix},   (4)

where p_∥ and p_⊥ are the Fresnel factors:

  p_\parallel = \frac{\cos\theta_t - \eta\cos\theta_i}{\cos\theta_t + \eta\cos\theta_i}; \qquad p_\perp = \frac{\cos\theta_i - \eta\cos\theta_t}{\cos\theta_i + \eta\cos\theta_t},   (5)

with θ_i the angle of incidence and θ_t the angle of transmission (related by sin θ_i = η sin θ_t), and η the relative index of refraction of the surface material in air. The rotation matrix R_φ (and likewise R_ψ) has the form:

  R_\phi = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos 2\phi & -\sin 2\phi & 0 \\ 0 & \sin 2\phi & \cos 2\phi & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.   (6)

Substituting L_v(Θ) and L_h(Θ) with (1) and (2) and simplifying, we consider the following quantities:

  i_v + i_h = \int_\Omega \Big( m(\Theta) + \big(f\, s(\Theta) + t(\Theta)\big)\,\mathbf{1} \Big)\, d\omega_\Theta;   (7)

  i_v - i_h = \int_\Omega \Big( \big(t(\Theta) + k_1\, s(\Theta)\big)\, C\,\Theta + k_2\, s(\Theta)\,\mathbf{1} \Big)\, d\omega_\Theta,   (8)

where k_1 = \cos 2\phi \cos 2\psi\, f + \sin 2\phi \sin 2\psi\, p_\perp p_\parallel, k_2 = \tfrac{1}{2}\cos 2\phi\,(p_\perp^2 - p_\parallel^2), f = \tfrac{1}{2}(p_\perp^2 + p_\parallel^2), and C is the color crosstalk matrix between illumination primaries and camera primaries, obtained via a simple calibration process. We refer to these quantities as i^+ and i^-.
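To give a feel for these polarization factors, the sketch below (ours; the function names are illustrative) evaluates p_∥, p_⊥, f, k_1, and k_2 for a chosen incidence angle, index of refraction, and polarizer angles φ and ψ, making it easy to check numerically where k_2 vanishes and how k_1 compares to f:

import numpy as np

def fresnel_factors(theta_i, eta):
    """Fresnel amplitude factors of Eq. (5); theta_i in radians, eta the
    relative index of refraction, with sin(theta_i) = eta * sin(theta_t)."""
    theta_t = np.arcsin(np.sin(theta_i) / eta)
    p_par = (np.cos(theta_t) - eta * np.cos(theta_i)) / (np.cos(theta_t) + eta * np.cos(theta_i))
    p_perp = (np.cos(theta_i) - eta * np.cos(theta_t)) / (np.cos(theta_i) + eta * np.cos(theta_t))
    return p_par, p_perp

def polarization_coefficients(theta_i, eta, phi, psi):
    """k1, k2 and f as defined after Eq. (8); phi and psi in radians."""
    p_par, p_perp = fresnel_factors(theta_i, eta)
    f = 0.5 * (p_perp**2 + p_par**2)
    k1 = np.cos(2 * phi) * np.cos(2 * psi) * f + np.sin(2 * phi) * np.sin(2 * psi) * p_perp * p_par
    k2 = 0.5 * np.cos(2 * phi) * (p_perp**2 - p_par**2)
    return k1, k2, f

# cos(2*phi) = 0 at phi = 45 degrees, so k2 vanishes there for any incidence angle:
print(polarization_coefficients(np.radians(50.0), 1.4, np.radians(45.0), np.radians(45.0)))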

Defining the overall energy, or albedo, of the multiple-scattering component α_m, and likewise the albedos of the specular and single-scattering components α_s and α_t (absorbing the Fresnel factor f into α_s), we find:

  i^+ = \alpha_m + (\alpha_s + \alpha_t)\,\mathbf{1}.   (9)

In the following, we rely on the idea that i^- encodes Θ without knowing the surface normal in advance, which requires k_2 ≈ 0. We note this is satisfied when p_⊥ ≈ p_∥, which is true at normal incidence and less true towards the Brewster's angle. It is also satisfied when φ is a multiple of 45°, regardless of incidence. We also require k_1 > 0 and ideally k_1 ≈ f, which is possible using the tuned polarization arrangement in [11] where φ = −ψ. Henceforth we substitute k_1 = f and k_2 = 0, and discuss the impact of this simplification later on. Thus:

  i^- \approx \int_\Omega \big(t(\Theta) + f\, s(\Theta)\big)\, C\,\Theta\, d\omega_\Theta.   (10)

We presume the specular component is statistically similar (in the 0th and 1st statistical moment) to a Phong cosine lobe with some exponent n_s, and the single-scattering component is statistically similar to a spherical cosine lobe [7] with some exponent n_t. Then:

  i^- = C\,\Big( \frac{n_s+1}{n_s+2}\,\alpha_s\, r_s + \frac{n_t}{n_t+2}\,\alpha_t\, r_t \Big),   (11)

where r_s is the mean specular reflection direction (presumed coincident with the ideal specular reflection direction), and r_t is the mean single-scattering reflection direction (presumed roughly coincident with the surface normal). We may estimate the specular albedo as:

  \hat\alpha_s = \big\| C^{-1} i^- \big\| = \Big\| \frac{n_s+1}{n_s+2}\,\alpha_s\, r_s + \frac{n_t}{n_t+2}\,\alpha_t\, r_t \Big\|,   (12)

where the error |\hat\alpha_s - \alpha_s| is bounded by \frac{\alpha_s}{n_s+2} + \frac{\alpha_t n_t}{n_t+2} via the triangle inequality. We note that many subjects (such as human faces) have n_s ≫ 1, n_t ≈ 1 and α_t < α_s, and hence the error is small (a few percent). We estimate the diffuse albedo α_d = α_m + α_t 1 as:

  \hat\alpha_d = i^+ - \hat\alpha_s\,\mathbf{1} = \alpha_d + (\alpha_s - \hat\alpha_s)\,\mathbf{1},   (13)

noting that the error in each diffuse albedo color channel is the same as the error in the specular albedo. Finally we estimate the surface normal via:

  \hat{n} \propto C^{-1} i^- + \hat\alpha_s\, v \propto n\,(n \cdot v) + \tfrac{1}{2}\Big( \frac{(n_s+2)\,\hat\alpha_s}{(n_s+1)\,\alpha_s} - 1 \Big) v + \frac{(n_s+2)\, n_t\, \alpha_t}{2(n_s+1)(n_t+2)\,\alpha_s}\, r_t,   (14)

normalizing to obtain n̂, where v is the view vector and ∝ denotes proportionality. Again, for many subjects the error is small, since n_s ≫ 1, α_s ≈ α̂_s, n_t ≈ 1, α_t < α_s, and r_t ≈ n. Note that as the surface normal is derived only from polarization-preserving reflectance, it is uncorrupted by variations in the diffuse color or by blurring due to subsurface scattering. Yet, the surface normal estimate does suffer from artifacts near the Brewster's angle, and where k_1 < 0 in (8). We simulated the technique with the full Fresnel equations for a specular sphere. Fig. 1 shows the angular error in the surface normal estimate for various parameters. While the polarizer arrangement of [11] produces superior results for a single view, it quickly falls apart if the camera is repositioned. The polarizer arrangement of [9] has more artifacts at the top and bottom of the sphere, but remains stable as the camera moves to different viewpoints around the equator. We employ the latter arrangement, so that we may capture multi-view data.
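As a concrete illustration of the single-view estimates (12)-(14), here is a per-pixel sketch (our own illustrative code, not the authors' implementation; the array layout and function name are assumptions) that recovers the three maps from one coaxial, beam-splitter image pair:

import numpy as np

def estimate_single_view(i_v, i_h, C, v):
    """Per-pixel single-view estimates from a coaxial (beam-splitter) pair.

    i_v, i_h : RGB intensities from the vertically / horizontally polarized
               cameras, linear and calibrated as described in Section 3.
    C        : 3x3 color crosstalk matrix (illumination to camera primaries).
    v        : unit view vector at this pixel, in the same XYZ frame as the
               illumination gradients.
    Returns (diffuse albedo RGB, specular albedo, unit surface normal).
    """
    i_v, i_h, v = (np.asarray(a, dtype=float) for a in (i_v, i_h, v))
    i_plus = i_v + i_h                   # depolarized sum, Eqs. (7)/(9)
    i_minus = i_v - i_h                  # polarization-preserving difference, Eqs. (8)/(10)
    d = np.linalg.solve(C, i_minus)      # C^{-1} i^-
    alpha_s = np.linalg.norm(d)          # specular albedo estimate, Eq. (12)
    alpha_d = i_plus - alpha_s           # diffuse albedo estimate, Eq. (13)
    n = d + alpha_s * v                  # unnormalized surface normal, Eq. (14)
    n /= max(np.linalg.norm(n), 1e-12)
    return alpha_d, alpha_s, n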

Figure 1. Simulation results for a black specular sphere, showing the angular error of the surface normal estimate from 0° (blue-black) to 120°. (a-d) Simulations using the polarizer arrangement of [11]; (e-h) simulations using the arrangement of [9]. The first column uses η = 2; the other columns use η = 1.4. The first two columns show the frontal view; the third column has the viewpoint rotated by 30° in yaw, and the fourth column by 60°. The first row improves with greater index of refraction (a), yet degrades rapidly as the viewpoint changes (c, d). The second row is stable across viewpoints, with good surface normal estimates in the central region.

Figure 2. Photographs from our multi-view setup, with 7 cameras. Views (a, c, e) have vertical polarizing filters over the lens, while (b, d, f, g) have horizontal filters. (h) Schematic view of the camera arrangement, with red vertical lines indicating vertical polarizing filters and blue horizontal lines indicating horizontal polarizing filters. The cameras are spaced approximately 7° apart. Any horizontally or vertically neighboring cameras may be used as a camera pair for reflectance estimation. More cameras could be added, expanding the checkerboard polarization pattern.

4.1. Non-Coaxial Camera Pairs

We also consider the case where polarizing beam splitters are not employed, and instead each camera pair captures two independent, but close, views of the subject, with one camera having a vertical linear polarizing filter on the lens and the other horizontal. In this case, reflectance estimation is still possible if a correspondence between the images can be established, for example through stereo photogrammetry. Multiple camera pairs may be efficiently distributed around the subject, with lens polarization orientations alternating in a checkerboard fashion, so that polarization separation may be performed for any vertically or horizontally neighboring cameras (see Fig. 2). We find that for many subjects, including human faces, standard stereo photogrammetry algorithms are able to reconstruct reasonably good geometry from photographs captured under the described conditions, despite the view dependence of the specular color gradient reflections, due to the low angular variation in the illumination. (As in [11, 9], moderately to highly specular surfaces will not be well reconstructed, as the discrete light sources will become apparent in the specular reflection.) With pixel correspondence established between views, we consider the case where the vertically and horizontally polarized pixel values may be sourced from different views, finding:

  i^+ = \alpha_m + \frac{n_s+1}{2(n_s+2)}\, C\,(\alpha_{sv} r_{sv} - \alpha_{sh} r_{sh}) + \Big( \tfrac{1}{2}(\alpha_{sv} + \alpha_{sh}) + \alpha_t \Big)\,\mathbf{1};   (15)

  i^- = C\,\Big( \frac{n_s+1}{2(n_s+2)}\,(\alpha_{sv} r_{sv} + \alpha_{sh} r_{sh}) + \frac{n_t}{n_t+2}\,\alpha_t\, r_t \Big) + \tfrac{1}{2}(\alpha_{sv} - \alpha_{sh})\,\mathbf{1},   (16)

where r_{sv} and r_{sh} are the mean specular reflection vectors associated with the vertically and horizontally polarized source pixels, which differ as they are sourced from different views, and α_{sv} and α_{sh} are the specular albedos associated with the vertically and horizontally polarized source pixels, which may also differ due to the view-dependence of Fresnel reflection. We estimate the specular albedo as:

  \hat\alpha_s = \frac{\big\| C^{-1} i^- \big\|}{\big\| \tfrac{1}{2}(v_v + v_h) \big\|} = \frac{\Big\| \frac{n_s+1}{n_s+2}\,(2nn^{T} - I)\,\tfrac{1}{2}(\alpha_{sv} v_v + \alpha_{sh} v_h) + \frac{n_t}{n_t+2}\,\alpha_t\, r_t + \tfrac{1}{2}(\alpha_{sv} - \alpha_{sh})\,\mathbf{1} \Big\|}{\big\| \tfrac{1}{2}(v_v + v_h) \big\|},   (17)

where v_v and v_h are the view vectors associated with the vertically and horizontally polarized source pixels, and I is the identity matrix. Defining α_s = ½(α_{sv} + α_{sh}), the error |\hat\alpha_s - \alpha_s| is bounded by \big| \frac{n_s+1}{n_s+2}\cdot\tfrac{1}{2}(\alpha_{sv} + \alpha_{sh}) - \alpha_s \big| + \big( \frac{2 n_t \alpha_t}{n_t+2} + \sqrt{3}\,|\alpha_{sv} - \alpha_{sh}| \big)\,\| v_v + v_h \|^{-1}, which is small as n_s ≫ 1, α_{sv} ≈ α_{sh} ≈ α_s, n_t ≈ 1, and α_t < α_s. We estimate the diffuse albedo as:

  \hat\alpha_d = i^+ - \hat\alpha_s\,\Big( \mathbf{1} + \tfrac{1}{2}\, C\,(2\hat{n}\hat{n}^{T} - I)(v_v - v_h) \Big)
              = \alpha_d + (\alpha_s - \hat\alpha_s)\,\mathbf{1} + \tfrac{1}{2}\, C\,\Big( \frac{n_s+1}{n_s+2}\,(2nn^{T} - I)(\alpha_{sv} v_v - \alpha_{sh} v_h) - (2\hat{n}\hat{n}^{T} - I)\,\hat\alpha_s\,(v_v - v_h) \Big),   (18)

which shares the same error as the specular albedo (in each color channel), plus an additional error term that is small since n_s ≫ 1, n ≈ n̂, and α_{sv} ≈ α_{sh} ≈ α̂_s. We estimate the surface normal via:

  \hat{n} \propto C^{-1} i^- + \tfrac{1}{2}\,\hat\alpha_s\,(v_v + v_h)
          \propto n\,\Big( n \cdot \big( \tfrac{\alpha_{sv}}{\alpha_s} v_v + \tfrac{\alpha_{sh}}{\alpha_s} v_h \big) \Big) + \tfrac{1}{2}\Big( \frac{(n_s+2)\,\hat\alpha_s}{(n_s+1)\,\alpha_s} - \frac{\alpha_{sv}}{\alpha_s} \Big) v_v + \tfrac{1}{2}\Big( \frac{(n_s+2)\,\hat\alpha_s}{(n_s+1)\,\alpha_s} - \frac{\alpha_{sh}}{\alpha_s} \Big) v_h + \frac{(n_s+2)\, n_t\, \alpha_t}{(n_s+1)(n_t+2)\,\alpha_s}\, r_t + \frac{(n_s+2)(\alpha_{sv} - \alpha_{sh})}{2(n_s+1)\,\alpha_s}\,\mathbf{1},   (19)

normalizing to obtain the estimate. The error is again small, as n_s ≫ 1, α_{sv} ≈ α_{sh} ≈ α_s ≈ α̂_s, n_t ≈ 1, α_t < α_s, and r_t ≈ n. If a point on the surface is associated with multiple camera pairs, we obtain multiple estimates (one per camera pair). The error of each estimate is approximately inversely proportional to the dot product term n · (α_{sv}/α_s v_v + α_{sh}/α_s v_h), so we may optimally blend the estimates as a weighted sum, weighting each estimate by its associated dot product term. As the dot product term contributes most of the variation in magnitude of each unnormalized estimate, we obtain the optimal blend simply by summing the unnormalized estimates, and then normalizing the sum. We note that this scheme automatically favors data from front-facing views via the dot product term, which is common practice in multi-view texture blending. We correspondingly blend α̂_s and α̂_d weighted by the magnitude of the unnormalized surface normal estimate. We also found that weighting by α̂_d produces good blending, which we employ in Fig. 13. Furthermore, for all quantities we modulate the weight for each view by a visibility estimate obtained via rasterized depth images, eroded and blurred by a 5-pixel radius to suppress seams due to discontinuities.

Detecting polarization reversal

The polarization pattern of our multi-view setup, like that in [9], violates our assumption of good polarization separation as the surface normal points more upwards or downwards, which contributes to the artifacts in the surface normal estimates visible in e.g. Fig. 5. We note, however, that the polarization in these regions is approximately reversed (with k_1 < 0 in (8)), with vertical polarization exchanged with horizontal, as visualized in Fig. 3. Thus, if coarse geometry is available, for example from multi-view stereo, it may be utilized to distinguish reversed polarization from non-reversed polarization by computing the surface normal estimate for both cases, and retaining the one that is closer to the coarse geometric normal. We apply this strategy before blending the surface normal estimates from multiple camera pairs.

Figure 3. Polarization reversal. A synthetic rendering (a) of a pattern of blue lines of latitude and red lines of longitude reflected on a mirror ball exhibits regions where the horizontal and vertical axes appear exchanged (highlighted in (b)).

Contrast with previous work

We compare our method to [11, 9], who estimate diffuse and specular albedo maps using polarization difference imaging under white spherical illumination. A significant advantage of our method is that it employs only a single photograph per camera, shot simultaneously, allowing for rapid capture. While most results shown made use of a prototype apparatus with longer exposure times, we show results in Fig. 13 for a performance captured at 8 fps with an exposure time of 12.5 milliseconds per frame. In contrast, the previous works [11, 9] require 14 photographs per camera, making dynamic capture difficult. Under our lighting model, the previous works estimate α̂′_d = α_m and α̂′_s = α_s + α_t, with the single-scattering component being wholly included in the specular albedo. (The surface normal estimates are equivalent.) It may be more advantageous to include the single-scattering component in the diffuse albedo, as single-scattering reflectance behaves more similarly to diffuse than to specular. Of course, if all the photographs from [11, 9] were available, one could estimate α_t as α̂_t = α̂′_s − α̂_s, yielding separate estimates of all three components. An advantage of the previous works over our proposed method is that the polarization differencing of each independent gradient direction entirely avoids the artifacts in the surface normal estimates shown in Fig. 1.
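To illustrate the multi-view estimates, the reversal disambiguation, and the blending described above, here is a simplified per-pixel sketch (our own code, not the authors' pipeline; it keeps only the leading terms of Eqs. (17)-(19) and omits the visibility weighting):

import numpy as np

def estimate_pair(i_v, i_h, C, v_v, v_h):
    """Per-pixel estimate from one non-coaxial camera pair (Section 4.1).
    i_v is the pixel from the vertically polarized view with view vector v_v,
    i_h from the horizontally polarized view with view vector v_h, after
    reprojection onto the same surface point.  Returns the diffuse albedo,
    specular albedo, and the unnormalized normal so pairs can be blended."""
    i_v, i_h, v_v, v_h = (np.asarray(a, dtype=float) for a in (i_v, i_h, v_v, v_h))
    d = np.linalg.solve(C, i_v - i_h)                       # C^{-1} i^-
    v_mean = 0.5 * (v_v + v_h)
    alpha_s = np.linalg.norm(d) / np.linalg.norm(v_mean)    # Eq. (17), leading term
    alpha_d = (i_v + i_h) - alpha_s                         # Eq. (18), dropping the small view-difference correction
    n_unnormalized = d + alpha_s * v_mean                   # Eq. (19)
    return alpha_d, alpha_s, n_unnormalized

def resolve_reversal(i_v, i_h, C, v_v, v_h, coarse_normal):
    """Polarization-reversal disambiguation: evaluate both the nominal and the
    reversed labeling and keep the normal closer to a coarse geometric normal
    (e.g. from multi-view stereo)."""
    nominal = estimate_pair(i_v, i_h, C, v_v, v_h)
    flipped = estimate_pair(i_h, i_v, C, v_h, v_v)
    n1 = nominal[2] / np.linalg.norm(nominal[2])
    n2 = flipped[2] / np.linalg.norm(flipped[2])
    return nominal if np.dot(n1, coarse_normal) >= np.dot(n2, coarse_normal) else flipped

def blend_pairs(estimates):
    """Blend estimates from several camera pairs: summing the unnormalized
    normals implicitly weights each pair by its dot-product term; albedos are
    blended with the same weights (visibility weighting omitted here)."""
    n_sum = np.sum([e[2] for e in estimates], axis=0)
    w = np.array([np.linalg.norm(e[2]) for e in estimates])
    w /= max(w.sum(), 1e-12)
    alpha_d = np.sum([wi * e[0] for wi, e in zip(w, estimates)], axis=0)
    alpha_s = float(np.sum([wi * e[1] for wi, e in zip(w, estimates)]))
    return alpha_d, alpha_s, n_sum / max(np.linalg.norm(n_sum), 1e-12)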

5. Results and Discussion

We show reflectance estimation results using the proposed method for a variety of subjects and materials, first using the polarizing beam splitter arrangement and then using the multi-view stereo arrangement with 7 cameras and linear polarizing filters. Fig. 4 shows the two input photographs for a plastic orange subject, which has surface bumps and moderately glossy specular reflectance, along with the estimated diffuse albedo, specular albedo, and surface normal. We compare to the results from [9], which makes use of a similar arrangement of polarizing filters on the light sources, but has either only vertical (preferred) or only horizontal (non-preferred) polarizing filters in front of the camera lens. Real materials tend to specularly reflect polarized light to a greater degree along the axis parallel to the surface, reaching an extreme at the Brewster's angle where only light polarized along this axis is reflected. As we ignored this effect in our lighting model, we observe the artifacts predicted in Fig. 1(f). Despite using a combination of the preferred and non-preferred filters via the polarizing beam splitter, the diffuse-specular separation produced by our proposed method does not appear to suffer from any of the artifacts observed in the non-preferred arrangement in [9], though a small amount of specular pollution is visible in the four corners of the diffuse map where polarization separation is weakest, also faintly visible in the preferred arrangement in [9]. The surface normal estimates exhibit artifacts in areas with polarization reversal, where the reflection vector is essentially flipped.

Figure 4. Reflectance estimation results using a polarizing beam splitter for a plastic orange. (Specular albedo shown ×10.) (a, b): the simultaneous photographs under differing linear polarization; (c, d, e): the estimated reflectance maps; (f, g): result reproduced from [9], showing diffuse and specular albedo for a similar plastic orange with a vertical polarizing filter in front of the lens; (h, i): result from [9], with a horizontal polarizing filter in front of the lens. The proposed method appears to produce a diffuse-specular separation (c, d) comparable in quality to the preferred method (f, g), though the surface normals suffer from artifacts in the regions where the non-preferred method (h, i) is poor.

Results on subjects with saturated colors

Fig. 5 illustrates the robustness of the proposed surface normal estimates to strongly saturated diffuse material colors. Unlike methods that rely on Lambertian reflection, our method estimates surface normals from achromatic polarization-preserving reflectance, remaining accurate within the region of angles near the equator for all surface colors. Surface normals outside this region are biased towards the diffuse color, as the Fresnel term attenuating the diffuse reflectance differs in the two input photographs.

Results on subjects with varying specularity

The surface normal estimate for the green statue with rough specularity in Fig. 6 appears somewhat flattened, and interestingly does not appear to suffer from inconsistent surface normals. We suppose both of these phenomena may be explained by the polarization-preserving reflection being dominated by single scattering. There is also a low-frequency bias appearing as a vertical gradient over the image, which we attribute to view-dependence of the polarization separation behavior of the polarizing beam splitter cube. The surface normal estimate for the moderately rough specular red statue appears consistent, except for largely upward or downward surface normals. The surface normal estimate for the glossy specular statue is similarly consistent, but has additional noise wherever the highlights from individual lights break apart.

Figure 5. Reflectance estimation results using a polarizing beam splitter for spherical objects having various diffuse colors and moderately rough specularity. (Specular albedo shown ×5.) As observed in Fig. 4, the surface normal estimate is consistent across materials within the central band around the equator, and becomes inconsistent above and below this region, accompanied by two dark bands in the specular albedo estimate.

Figure 6. Reflectance estimation results using a polarizing beam splitter for objects with relief details, having materials with varying specular properties. (Specular albedo shown ×5.) Top row: rough specularity produces a surface normal estimate with some flattening bias; middle row: moderately rough specularity produces a good surface normal estimate except for upward or downward normals; bottom row: high-gloss specularity produces a surface normal estimate broken up by discrete specular highlights.

Results on additional subjects

Fig. 7 shows results for subjects with a variety of materials. The dark plastic body of the camera is a good example illustrating the small amount of specular pollution in the estimated diffuse albedo, due to violations of the ideal polarization-preserving assumption. This pollution only occurs towards grazing angles. The woven basket illustrates that spatially-varying specular albedo is recovered even when the subject has significant diffuse reflectance. The human face result shows good diffuse-specular separation, and good surface normal estimates except for largely upward or downward pointing normals. (The haze towards the bottom of the surface normal image is the same defect in our polarizing beam splitter that causes the gradient haze in Fig. 6(c).)

Results with non-coaxial cameras and multi-view stereo

We tested the PMVS2 software from [6] and the commercial software Agisoft PhotoScan [1] on photographs captured using our multi-view setup for four human subjects and a small statue, producing the meshes shown in Fig. 9. Despite the view-dependence of the colored specular reflections, reasonably good meshes are obtained. We then employed the Agisoft PhotoScan meshes to reproject vertical and horizontal polarized images into neighboring views, producing the reflectance estimates shown in Fig. 8. Estimates from all pairs of neighboring cameras were blended using the method in Section 4.1. The surface normal disambiguation guided by the estimated geometry improves the estimate in upward-pointing and downward-pointing regions.
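The reprojection step used to establish pixel correspondence can be illustrated with a minimal pinhole-projection sketch (ours; the camera conventions, sampling, and occlusion handling are simplified placeholders rather than the paper's actual pipeline): each surface point seen in one polarized view is projected into the neighboring, oppositely polarized view to fetch the second trio of color channels.

import numpy as np

def project(point_world, K, R, t):
    """Pinhole projection of a 3D point (world coordinates) into a camera
    with intrinsics K and world-to-camera extrinsics [R | t]."""
    p_cam = R @ point_world + t
    uvw = K @ p_cam
    return uvw[:2] / uvw[2]            # pixel coordinates (u, v)

def fetch_pair(point_world, image_v, cam_v, image_h, cam_h):
    """Sample the vertically and horizontally polarized intensities that
    correspond to one mesh surface point (nearest-neighbor lookup for brevity;
    bilinear sampling and visibility tests would be used in practice)."""
    samples = []
    for image, (K, R, t) in ((image_v, cam_v), (image_h, cam_h)):
        u, v = np.round(project(point_world, K, R, t)).astype(int)
        samples.append(image[v, u])    # assumes the point is visible in both views
    return samples[0], samples[1]      # i_v, i_h for the Section 4.1 estimates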



Figure 7. Reflectance estimation results using a polarizing beam splitter for additional subjects: a camera, a woven basket, and a human face. (Specular albedo shown ×5.)

Figure 8. Reflectance estimates from the multi-view setup, shown in unwrapped UV texture space. (Specular albedo shown ×5.) Black regions in albedo maps were not visible in at least two views. Row (a) is the subject in Fig. 9(c); row (b) is the subject in Fig. 9(e); row (c) is the subject in Fig. 9(g).

We finally refined the meshes with the surface normal estimates using a method similar to [12], producing the detailed meshes shown in Figs. 10 and 11. These meshes and reflectance maps are suitable for rendering the subject under novel views and illumination, shown in Fig. 12. Finally, we show selected frames in Fig. 13 from a short performance captured at 8 fps with an exposure time of 12.5 milliseconds per frame, using six cameras. Diffuse albedo, specular albedo, and surface normal are all estimated at a single point in time, without any optical flow.
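For intuition about the refinement step, here is a minimal sketch in the spirit of [12] (a simplified, orthographic depth-map analogue of our own devising, not the authors' mesh-based implementation): the coarse geometry provides a data term while the photometric normals constrain the local gradients, combined in a screened least-squares solve.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def refine_depth(z0, normals, lam=0.1):
    """Refine an orthographic depth map z0 (H, W) so that its finite-difference
    gradients match the slopes implied by unit normals (H, W, 3), while staying
    close to z0 with weight lam.  Assumes camera-space normals with n_z > 0;
    the gradient signs depend on the chosen image and depth axis conventions."""
    H, W = z0.shape
    idx = np.arange(H * W).reshape(H, W)
    rows, cols, vals, rhs = [], [], [], []
    eq = 0
    # Data term: lam * z = lam * z0
    for r in range(H):
        for c in range(W):
            rows.append(eq); cols.append(idx[r, c]); vals.append(lam)
            rhs.append(lam * z0[r, c]); eq += 1
    # Gradient terms: z[r, c+1] - z[r, c] = -n_x / n_z,  z[r+1, c] - z[r, c] = -n_y / n_z
    for r in range(H):
        for c in range(W):
            nx, ny, nz = normals[r, c]
            if c + 1 < W:
                rows += [eq, eq]; cols += [idx[r, c + 1], idx[r, c]]; vals += [1.0, -1.0]
                rhs.append(-nx / nz); eq += 1
            if r + 1 < H:
                rows += [eq, eq]; cols += [idx[r + 1, c], idx[r, c]]; vals += [1.0, -1.0]
                rhs.append(-ny / nz); eq += 1
    A = sp.coo_matrix((vals, (rows, cols)), shape=(eq, H * W)).tocsr()
    z = spla.lsqr(A, np.asarray(rhs))[0]
    return z.reshape(H, W)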

6. Limitations and Future Work

Our method shares some limitations with other polarization separation methods for surface normal acquisition. If a latitude-longitude polarization arrangement is used (as in [9], and in all our examples), surface normals pointing significantly upward or downward are estimated incorrectly. This can be partially mitigated using the polarization arrangement from [11] in the single-view setup, and by detecting polarization reversals in the multi-view setup. It would be interesting to see if adding more camera viewpoints and perhaps further varying the angle of polarization on the cameras could provide better coverage of surface normal orientations. Since denser camera arrays are becoming more popular for stereo reconstruction (50 to 100 cameras is not uncommon), this would be a natural extension. The surface normal estimate relies on the polarization-preserving signal in the photographs, so very diffuse materials yield noisy surface normal estimates, and the surface normal estimates are very sensitive to noise ("digital grain") in the photographs. Additionally, as our lighting apparatus uses discrete LED light sources, the specular signal breaks apart into individual highlights for high-gloss (mirror-like) surfaces. See Fig. 6 for examples. The method in [11, 9] shares similar limitations, however an alternative surface normal estimate (the so-called "diffuse normal") is available for regions with low polarization-preserving signal, and the use of multiple photographs in the process helps to mitigate noise. Nevertheless, our method is suitable for many subjects, including human faces.

Figure 11. Zoomed views of the refined geometry in Fig. 10.


Figure 9. Reconstructed geometry using standard multi-view stereo techniques on a mix of vertically and horizontally polarized views under polarized spherical RGB gradient illumination. (a, c, e, f, g) Subjects processed using Agisoft PhotoScan. (b, d) Same subjects as (a, c) processed using PMVS2 software.



Figure 10. Refined geometry results corresponding to the base geometry in Fig. 9, using the photometric surface normal estimates in Fig. 8. Note the skin pores and lip wrinkles visible in (a, c, e, f), the fine forehead wrinkles in (e), and the scale and brick detail in (g), which were not discernible in the base geometry.

Figure 12. Renderings of recovered geometry and reflectance with (a-d) novel viewpoint and HDRI lighting environments; and (e-i) novel point-light illumination. All renderings use a modified Blinn-Phong specular lobe with exponent 64 and measured albedo.

7. Conclusions

We have presented a method for single-shot capture of geometry and reflectance. We showed that using complementary colored spherical gradients to encode direction information into the polarization-preserving signal enables polarization-based diffuse-specular separation and simultaneous photometric surface normal estimation. We demonstrated the technique using two setups: one with a polarizing beam splitter to coaxially align two cameras, and another with linear polarizing filters on the cameras, using multi-view stereo reconstruction to bring disparate views into correspondence. We show high-quality diffuse-specular separation and surface normal estimates with both setups, and detailed geometry reconstruction in the multi-view case, including dynamic capture. To our knowledge, this is the first technique for estimating geometry and reflectance in a single shot, without overly restrictive BRDF assumptions, material clustering heuristics, or optical flow.

Acknowledgements

The authors thank the following for their support and assistance: Ari Shapiro, Maria Mikhisor, Antionne Scott, Shanhe Wang, Koki Nagano, Xueming Yu, Jay Busch, Kathleen Haase, Bill Swartout, Cheryl Birch, Randall Hill and Randolph Hall. This work was sponsored by the University of Southern California Office of the Provost and the U.S. Army Research, Development, and Engineering Command (RDECOM). The content of the information does not necessarily reflect the position or the policy of the US Government, and no official endorsement should be inferred.

Figure 13. Recovered reflectance for three frames from a short performance captured at 8 fps. (Specular albedo shown ×5.)

References

[1] Agisoft PhotoScan. http://www.agisoft.com/, 2014.
[2] R. Anderson, B. Stenger, and R. Cipolla. Color photometric stereo for multicolored surfaces. In D. N. Metaxas, L. Quan, A. Sanfeliu, and L. J. V. Gool, editors, ICCV, pages 2182-2189. IEEE, 2011.
[3] T. Beeler, B. Bickel, P. Beardsley, B. Sumner, and M. Gross. High-quality single-shot capture of facial geometry. ACM Transactions on Graphics (Proc. SIGGRAPH), 29(4):40:1-40:9, 2010.
[4] E. Collett. Field Guide to Polarization. SPIE Press, 2005.
[5] O. G. Cula, K. J. Dana, D. K. Pai, and D. Wang. Polarization multiplexing for bidirectional imaging. In Proc. of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR'05, vol. 2, pages 1116-1123, Washington, DC, USA, 2005. IEEE Computer Society.
[6] Y. Furukawa and J. Ponce. Accurate, dense, and robust multi-view stereopsis. IEEE Trans. on Pattern Analysis and Machine Intelligence, 32(8):1362-1376, 2010.
[7] G. Fyffe and P. Debevec. Cosine lobe based relighting from gradient illumination photographs. JVRB - Journal of Virtual Reality and Broadcasting, 9(2012)(2), 2012.
[8] G. Fyffe, X. Yu, and P. Debevec. Single-shot photometric stereo by spectral multiplexing. In IEEE International Conference on Computational Photography, 2011.
[9] A. Ghosh, G. Fyffe, B. Tunwattanapong, J. Busch, X. Yu, and P. Debevec. Multiview face capture using polarized spherical gradient illumination. ACM Transactions on Graphics, pages 1-10, 2011.
[10] H. Kim, B. Wilburn, and M. Ben-Ezra. Photometric stereo for dynamic surface orientations. In Proc. of the 11th European Conference on Computer Vision: Part I, ECCV'10, pages 59-72, Berlin, Heidelberg, 2010. Springer-Verlag.
[11] W.-C. Ma, T. Hawkins, P. Peers, C.-F. Chabert, M. Weiss, and P. Debevec. Rapid acquisition of specular and diffuse normal maps from polarized spherical gradient illumination. In J. Kautz and S. Pattanaik, editors, Eurographics Symposium on Rendering, 2007.
[12] D. Nehab, S. Rusinkiewicz, J. Davis, and R. Ramamoorthi. Efficiently combining positions and normals for precise 3D geometry. ACM Transactions on Graphics (Proc. of ACM SIGGRAPH 2005), 24(3), Aug. 2005.
[13] C. A. Wilson, A. Ghosh, P. Peers, J.-Y. Chiang, J. Busch, and P. Debevec. Temporal upsampling of performance geometry using photometric alignment. ACM Transactions on Graphics, 29(2), Mar. 2010.
[14] R. J. Woodham. Photometric method for determining surface orientation from multiple images. Optical Engineering, 19(1):191139, 1980.
[15] C. Wu, B. Wilburn, Y. Matsushita, and C. Theobalt. High-quality shape from multi-view stereo and shading under general illumination. In Proc. 2011 IEEE Conf. on Computer Vision and Pattern Recognition, CVPR'11, pages 969-976, Washington, DC, USA, 2011. IEEE Computer Society.