Directional Super-Resolution by means of Coded Sampling and Guided Upsampling

David C. Schedl, Clemens Birklbauer, Oliver Bimber
Johannes Kepler University Linz
[email protected]

Abstract

We present a simple guided super-resolution technique for increasing directional resolution without reliance on depth estimation or image correspondences. Rather, it searches for best-matching multidimensional (4D or 3D) patches within the entire captured data set to compose new directional images that are consistent in both the spatial and the directional domains. We describe algorithms for guided upsampling, iterative guided upsampling, and sampling code estimation. Our experimental results reveal that the outcomes of existing light-field camera arrays and light-stage systems can be improved without additional hardware requirements or recording effort, simply by realigning cameras or light sources to change their sampling patterns.

1. Introduction and Contributions

Sampling and processing images under directionally varying (viewing or lighting) conditions is the key to several modern visual effects production techniques in digital photography and movie making. Image-based rendering of light fields [11, 9] or spatio-temporal recordings [18] and image-based relighting of light-stage data [4] generally do not rely on depth estimation or image correspondences, as these can be difficult or even impossible to compute for realistically complex sceneries. To prevent undersampling artifacts, however, a substantial number of images from many directions must be captured and combined. This leads to complex and dense camera or light arrays, and to performance constraints that are due to bandwidth limitations. In this paper, we present a simple guided super-resolution technique for the directional domain that does not require scene depth, disparity estimation, or exact image correspondences, but searches for best-matching multidimensional (4D or 3D) patches within the entire captured data set to compose new directional images that are consistent in both the spatial and the directional domains. The applicability of our approach is only limited by the maximum disparity/distance among neighboring sample positions. We present results for existing light-field camera arrays and light-stage systems, and show how the directional sampling artifacts of these devices can be reduced simply by realigning their sampling positions.

2. Related Work

2.1. Spatial and Temporal Super-Resolution

A rich literature on super-resolution techniques exists. Single-image super-resolution techniques, for example, use patch databases from various scales of the input image (e.g., [8, 5]). For temporal super-resolution, these methods were extended to the time domain by utilizing the recurrence of space-time patches across spatio-temporal scales of the input video to reduce motion aliasing and motion blur (e.g., [17]). One drawback of most light-field cameras is their low spatial resolution. Therefore, various light-field super-resolution approaches for computing single high-resolution images have been proposed [1, 7]. A hybrid imaging system that consists of a regular high-resolution camera and a low-resolution light-field camera was presented in [2]. The input of the high-resolution camera serves as guidance for upsampling the spatial resolution of the light field. Recent compressive sensing approaches capture light fields by placing masks in front of the image sensor of a regular camera (e.g., [12, 14]), but they require computationally expensive reconstruction and only allow small perspective changes.

2.2. Directional Super-Resolution

Directional undersampling in the case of too few perspective recordings or too large disparities can also lead to strong visual artifacts (see [3] for an analysis).
View interpolation reduces these artifacts by upsampling the number of views used for rendering, for example via an estimation framework that utilizes the epipolar plane structure of a light field [20] or trained Gaussian mixture models for light-field patches [15], where disparity values are estimated for each patch.

For image-based relighting, a limited number of lighting bases introduces aliasing artifacts for high-frequency lighting effects (e.g., sharp shadows, highly specular surface reflections, and more complex light transports). The limits are usually hardware constraints, such as the frame rate of the cameras or the activation time of the LEDs [4]. Therefore, one way of avoiding artifacts is to upsample the recorded set of lighting bases (e.g., [13]). In [19], the number of images required for relighting was drastically reduced by using spherical harmonics. However, due to the small set of local lighting bases, aliasing artifacts continued to be introduced for high-frequency lighting cases. In [6], the lighting bases are upsampled by segmenting the scene into layers (highlights, diffuse, shadows) and by handling them separately. Flow algorithms are applied to the highlight and shadow layers (the layers that cause the strongest interpolation artifacts). Incorrect segmentation or wrong flow correspondences introduce errors.

Finding exact image correspondences for depth reconstruction, disparity estimation, or optical flow computation, however, is only possible for adequately textured isotropic sceneries. More realistic scenes with, for instance, anisotropically reflecting surfaces or transparent refracting objects do not support robust estimation of image correspondences. Nevertheless, these are the cases that require image-based rendering rather than geometry-based rendering. If dense depth estimation and optical flow computation are possible, then image-based light-field rendering and relighting are superfluous. Linear view synthesis [10], which calculates novel views from focal stacks without requiring image correspondences, covers only a 3D subset of 4D light fields. In this paper, we present an entirely image-based and simple guided super-resolution technique for increasing directional resolution without reliance on image correspondences as required for depth-based morphing or for interpolation based on optical flow.

Figure 1. Guided upsampling. Black dots in the sampling code represent captured samples, while empty circles are samples computed by our method. The guidance area is indicated in green. Red dots are interpolated samples. A processed patch is indicated by the black rectangle. Green filled circles illustrate upsampled samples with high-frequency details from the guidance patches pr and p′r, respectively.

3. Upsampling Camera Arrays

Let us assume an input grid Di(S,T,U,V) of U × V directional samples of images with spatial resolution S × T (e.g., as recorded with a U × V array of cameras, each having a resolution of S × T pixels). Our goal is to upsample the directional resolution to an output grid Do by an upsampling factor of f while retaining the spatial resolution at each sample. For guided upsampling, we generally consider n directional guidance regions Gr (r = 0..n−1), i.e., areas with denser sampling, that are distributed within the sampling area. The sampling rate of each Gr equals the sampling rate of Do, and is by a factor f higher than the sampling rate of Di. As the name suggests, these regions of higher sampling rate serve as guidance for our directional upsampling. We extract high-frequency information from these guidance areas and add these details to novel views. How many guidance regions are used, how large they are, and where they are located for data acquisition is discussed in section 3.2.

Our first step is to low-pass filter each Gr in the directional domain to simulate the sampling rate of Di. This is achieved by downsampling (with nearest-neighbor sampling) all Gr by the factor 1/f, and subsequently upsampling the results by f again while linearly interpolating the missing directional samples (i.e., the corresponding images at each directional sample position).
Note that up- and downsampling are applied exclusively to the directional domain. The spatial domain (i.e., the resolution of each sample image) remains unaffected. This results in (directionally) low-pass filtered guidance regions G′r with partly interpolated image samples. The next step is to upsample Di by f while linear interpolation is again applied to approximate the missing directional samples. This also results in an upsampled input grid D′i with partly interpolated image samples. Note that, again, upsampling is applied only to the directional domain. Note also that D′i, Do, G′r, and Gr now all have the same sampling rate. We synthesize Do as follows: For a given s,t,u,v coordinate in Do, we extract a four-dimensional neighborhood patch p′i of size δs × δt × δu × δv around D′i(s,t,u,v), and find the closest match p′r in all G′r. The final output patch is then approximated as po = p′i + (pr − p′r), where pr is the patch in Gr that corresponds to p′r in G′r. Note that the subtraction of pr and p′r corresponds to high-pass filtering. Thus, we add only the high frequencies (pr − p′r) to the low frequencies contained in p′i for estimating the output patch. Note that multiple estimates in Do that result from patch overlaps are averaged. Finally, we transfer all non-interpolated sample images of Di to Do. Thus, our estimates are overwritten whenever a physically captured sample is available.

Figure 1 illustrates our method based on the example of a regular 3×3 input grid that is upsampled to a regular 5×5 output grid with n = 1 guidance region of an f = 5/3 higher sampling rate and 3 × 3 guidance samples. In this case, δu and δv are 3 (the size of the guidance region). Note that p′i and p′r are only comparable if they share the same sampling pattern. Since p′r consists of non-interpolated samples at the four corners and interpolated samples elsewhere, we must constrain u,v during patch matching to coordinates that lead to patches p′i with the same sampling pattern (referred to as cells in the following). For our example in figure 1, this applies only to four u,v coordinates (dashed in D′i). Other u,v coordinates are invalid and cannot be considered. Algorithm 1 summarizes our guided upsampling approach.

Algorithm 1 Guided upsampling

function GUIDED UPSAMPLING(Di, G, n, δs, δt)
  Do = init()
  f = samplingrate(Di) / samplingrate(G0)
  (δu, δv) = size(G0)
  for r = 0..(n − 1) do
    G′r = downupsample(Gr, f)
  end for
  D′i = upsample(Di, f)
  for all s,t,u,v in Do do
    if u,v valid /*same sampling pattern*/ then
      p′i = getpatch(D′i, s, t, u, v, δs, δt, δu, δv)
      p′r = patchmatch(G′, p′i)
      pr = getcorrespondingpatch(G, p′r)
      po = p′i + (pr − p′r)
      averageandsetpatch(Do, s, t, u, v, po)
    end if
    if u,v not interpolated in D′i then
      Do(u,v) = Di(u,v)
    end if
  end for
  return Do
end function
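To make the patch-synthesis step concrete, the following Python/NumPy sketch outlines the inner loop of algorithm 1 for a single guidance region. It is not the authors' MATLAB implementation: the helper name guided_upsample_patches and the array layout are assumptions, brute-force matching replaces the FLANN search, and the cell-validity test as well as the averaging of overlapping patch estimates are omitted for brevity. The directionally interpolated grid D′i, the guidance region Gr, and its low-pass filtered version G′r are assumed to be given as 4D arrays.

import numpy as np

def guided_upsample_patches(D_up, G, G_lp, du, dv, ds, dt):
    # D_up : directionally interpolated input grid D'_i, shape (U, V, S, T)
    # G    : densely sampled guidance region G_r, shape (Ug, Vg, S, T)
    # G_lp : directionally low-pass filtered guidance G'_r, same shape as G
    # (du, dv, ds, dt): directional and spatial patch size
    U, V, S, T = D_up.shape
    Ug, Vg, _, _ = G.shape
    D_o = D_up.copy()

    # collect all guidance patch positions and their low-pass feature vectors
    # (brute-force stand-in for the FLANN kd-tree used in the paper)
    positions = [(u, v, s, t)
                 for u in range(Ug - du + 1) for v in range(Vg - dv + 1)
                 for s in range(S - ds + 1) for t in range(T - dt + 1)]
    features = np.stack([G_lp[u:u+du, v:v+dv, s:s+ds, t:t+dt].ravel()
                         for (u, v, s, t) in positions])

    # visit non-overlapping patches only (overlap averaging and the
    # cell-validity test of algorithm 1 are omitted for brevity)
    for u in range(0, U - du + 1, du):
        for v in range(0, V - dv + 1, dv):
            for s in range(0, S - ds + 1, ds):
                for t in range(0, T - dt + 1, dt):
                    p_i = D_up[u:u+du, v:v+dv, s:s+ds, t:t+dt]
                    # closest low-pass guidance patch p'_r
                    idx = int(np.argmin(((features - p_i.ravel()) ** 2).sum(axis=1)))
                    gu, gv, gs, gt = positions[idx]
                    p_r = G[gu:gu+du, gv:gv+dv, gs:gs+ds, gt:gt+dt]
                    p_r_lp = G_lp[gu:gu+du, gv:gv+dv, gs:gs+ds, gt:gt+dt]
                    # p_o = p'_i + (p_r - p'_r): add only the missing high frequencies
                    D_o[u:u+du, v:v+dv, s:s+ds, t:t+dt] = p_i + (p_r - p_r_lp)
    return D_o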

Figure 2. Iterative guided upsampling. Two embedded guidance regions with different sampling rates (green and red). Black dots in the sampling code represent captured samples, while empty circles are samples computed by upsampling in two iterations. The color of the filled circles indicates in which upsampling iteration a sample is computed.

3.1. Iterative Guided Upsampling

The sampling pattern of the guidance patches and of the upsampled patches must be identical to be comparable (sampling rate and patch size). This implies that the guidance regions must be densely sampled for large upsampling factors f. To avoid recording an unnecessarily large number of guidance samples, we support upsampling over multiple iterations. Our iterative upsampling requires the sampling code to contain nested guidance regions with multiple sampling rates. Figure 2 illustrates an example of a sampling code with two guidance regions of f2 = 5/3 (green/blue) and f1 = 9/5 (red/green). In the first iteration, the input grid is upsampled with the lowest upsampling factor. In our example, the f2-guidance is used to upsample the 3 × 3 input grid to a 5 × 5 output grid. This is identical to the example shown in figure 1. In the second iteration, the 5 × 5 output grid of the first iteration is used as input grid together with the f1-guidance and is upsampled to a 9 × 9 output grid. Our iterative guided upsampling is summarized in algorithm 2, where m is the number of iterations and G = {Gr^q | r = 0..n−1, q = 0..m−1} are the n directional guidance regions per iteration (i.e., n = 1 and m = 2 with G0^0 and G0^1 for the example shown in figure 2).

Algorithm 2 Iterative guided upsampling

function IT GUIDED UPSAMPLING(Di, G, n, m, δs, δt)
  Do = init()
  for q = (m − 1)..0 do
    Dq = GUIDED UPSAMPLING(Di, G^q, n, δs, δt)
    Do = merge(Do, Dq)
    Di = Do
  end for
  return Do
end function
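A minimal Python sketch of this iterative wrapper, assuming a guided_upsample routine such as the one sketched above and one guidance set per iteration (hypothetical names, not the authors' code); the merge with the originally captured samples is only indicated by a comment.

def iterative_guided_upsample(D_i, guidance_per_iteration, guided_upsample):
    # guidance_per_iteration: list of (G, G_lp) pairs ordered from the
    # coarsest (q = m-1, lowest upsampling factor) to the finest (q = 0).
    D_o = D_i
    for G, G_lp in guidance_per_iteration:      # q = (m-1) .. 0
        D_o = guided_upsample(D_o, G, G_lp)     # one guided-upsampling pass
        # here, samples physically captured in D_i (and in the guidance
        # regions) would be merged back into D_o, overwriting estimates
    return D_o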
3.2. Estimation of Sampling Code

Our upsampling algorithm can only be applied to data that has been sampled with an adequate sampling code. Therefore, the estimation of valid sampling patterns is essential. Let us assume a given upper limit cmax of the number of samples we want to capture. This is, for example, the largest number of cameras available for a light-field camera array. Furthermore, we assume a given output resolution uo,vo (i.e., the number of final samples we want to generate). The goal is to find a sampling code which supports upsampling the data captured with it to the desired output resolution without the code's number of samples exceeding the defined upper limit. Code estimation generally corresponds to modeling the guidance regions: their number, their size, their positions within the sampling grid, and their nesting levels, which correlate with the number of upsampling iterations. Here, we apply the following basic guidelines:

1) A larger number of iterations reduces the number of samples required, but too many iterations will lead to inaccurate patch matching and upsampling artifacts in the final results. In our experiments, more than two iterations were not considered to be useful.

2) High-resolution samples should be captured such that they are representative of the entire sampling range. Therefore, a maximum number of guidance regions should be evenly distributed within the sampling grid.

3) The distance between patches to be upsampled and the nearest guidance regions should be as small as possible. Otherwise, representative matches will not be found.

4) As explained earlier, the sampling pattern of guidance patches must match the sampling pattern of the patches to be upsampled. If multiple sampling patterns are required, they should also be evenly distributed.

Based on guidelines 1 and 2, our goal is to estimate a sampling code with a maximum number of guidance areas and a minimum number of iterations, all under the constraint of not exceeding cmax samples. Algorithm 3 explains our estimation approach. From the uniform output grid of resolution uo,vo, we first determine the maximum number of iterations mmax and the corresponding sequence of directional upsampling factors f1..fmmax. These factors (varying over all iterations) are all valid sampling rates that support consecutive downsampling of a grid of the output resolution uo,vo = size(D0). For the camera array examples discussed in section 3.1, we sought to estimate an optimal sampling code for a 9 × 9 output resolution of a light field with no more than 25 cameras. For the 9 × 9 output grid, the maximum number of iterations that can be applied is mmax = 3, and the upsampling factors are f1 = 9/5, f2 = 5/3, and f3 = 3/2. For downsampling a regular grid of rectangular cells, we consecutively transfer every (δu − 1)th and (δv − 1)th sample (starting with the first) from grid Dm−1 to grid Dm. Since for our example δu × δv is 3 × 3, every second sample is transferred (in both dimensions), resulting in grids D0 of 9 × 9, D1 of 5 × 5, D2 of 3 × 3, and D3 of 2 × 2. D3 cannot be downsampled further. We then test how many iterations should be applied under our given constraints (starting with 1 and increasing to mmax) and, for each iteration count, how many guidance regions should be applied (starting with the largest number, which cannot exceed the number of cells nmax in the grid that corresponds to iteration m, and decreasing to 1). For each guidance count and each iteration count, we determine a guidance code G that contains all n guidance areas of all m iterations. The guidance code Gm−1 at iteration m corresponds to the sampling rate of Dm−1. Finally, the guidance code is merged with the corresponding base grid Dm (i.e., the samples that do not directly belong to the guidance), where Dm is downsampled from Dm−1. If the size of the resulting sampling code SC exceeds cmax, we continue our search by successively (1) reducing the number of guidance patches and (2) increasing the number of iterations. Thus, we stop at the sampling code with the lowest number of iterations and the largest number of guidance areas. The best result for our example of a 9 × 9 output resolution with 25 cameras is two iterations and one guidance region (as illustrated in figure 2). We position the n guidance areas for each iteration j independently while satisfying guidelines 3 and 4. The sampling code shown in figure 2, for instance, can improve existing 5 × 5 light-field camera arrays, such as Point Grey's ProFusion25, by realigning their 25 cameras and by applying guided upsampling to 9 × 9 views.

Algorithm 3 Sampling code estimation

function GENERATE SAMPLING CODE(uo, vo, cmax)
  D0 = SC = init(uo, vo)
  f = determineFactors(D0)
  mmax = size(f) /*upper limit of iterations*/
  for m = 1..mmax do /*number of tested iterations*/
    Dm = downsample(Dm−1, fm)
    nmax = numberOfCells(Dm)
    for n = nmax..1 do /*number of tested guidance areas*/
      for j = 1..m do /*for each iteration*/
        G^(j−1) = posguidance(n, cell(Dj))
      end for
      SC = merge(G, Dm)
      if size(SC) ≤ cmax then
        return SC
      end if
    end for
  end for
end function
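As a concrete illustration of the factor sequence used by algorithm 3, the following hypothetical Python helper (a stand-in for determineFactors, not the authors' code) reproduces the 9 × 9 example for regular rectangular grids by repeatedly keeping every second sample in both dimensions.

def downsample_chain(u_o, v_o):
    # valid grid sizes and directional upsampling factors obtained by
    # repeatedly keeping every second sample (including both ends) in
    # each dimension, e.g. 9 -> 5 -> 3 -> 2 for the paper's example
    sizes, factors = [(u_o, v_o)], []
    u, v = u_o, v_o
    while u > 2 and v > 2:
        nu, nv = (u + 1) // 2, (v + 1) // 2
        factors.append((u / nu, v / nv))   # e.g. 9/5, 5/3, 3/2
        sizes.append((nu, nv))
        u, v = nu, nv
    return sizes, factors

# downsample_chain(9, 9) yields sizes [(9,9), (5,5), (3,3), (2,2)] and
# factors 9/5, 5/3, 3/2, i.e. mmax = 3 iterations for the 9 x 9 example.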
Fig- consecutive downsampling of a grid of the output resolu- ure 3 illustrates other examples for computed sample code tion uo,vo=size(D0). For the camera array examples patterns. With the 128 cameras of the 16 × 8 Stanford light- discussed in section 3.1, we sought to estimate an optimal field camera array [21], for example, a 31 × 13 resolution sampling code for a 9 × 9 output resolution of a light field can be achieved (figure 3a). The 21 perspectives of a Time with no more than 25 cameras. For the 9 × 9 output grid, Track linear camera array [18] can be upsampled to 49 per- the maximum number of iterations that can be applied is spectives (figure 3b). 9 5 mmax = 3, and the upsampling factors are, f1 = 5 , f2 = 3 , 3 and f3 = 2 . For downsampling a regular grid of rectan- th 4. Application to Light Stages gular cells, we consecutively transfer every (δu − 1) and th (δv − 1) sample (starting with the 1st) from grid Dm−1 Section 3 illustrates how our method is applied to a reg- to grid Dm. Since for our example δu × δv is 3 × 3, every ular grid of rectangular cells, as would be required for light- second sample is transferred (in both dimensions), resulting field camera arrays. However, the same technique is appli- in grids D0 of 9 × 9, D1 of 5 × 5, D2 of 3 × 3, and D3 cable to other grid types as long as up- and down-sampling of 2 × 2. D3 cannot be further downsampled. We then test operations to higher and lower topological subdivisions as how many iterations should be applied under our given con- well as interpolation functions within the corresponding straints (starting with 1 and increasing to mmax) and—for grid cells are clearly defined. This is, for instance, also are evenly distributed within the direct neighborhood of the initial patterns (guideline 4) while choosing those neighbor cells that share a maximum number of sample points. Note that, since icosahedra do not provide symmetric sampling grids, the code patterns for the northern and for the south- ern hemispheres are different.

Figure 4. Icosahedron subdivisions. 1V with 12 vertices, 2V with 42 vertices, 4V with 162 vertices, and 8V with 642 vertices.

Figure 5. Estimated sampling code for light stage 6. The 362 LEDs of light stage 6 (6V icosahedron) are realigned to support upsampling to an 8V icosahedron with 642 LEDs. The renderings show the single-iteration code in different orientations.

5. Implementation

All algorithms were implemented in MATLAB. For patch matching, we apply the kd-tree search of FLANN (Fast Library for Approximate Nearest Neighbors) [16] with 8 trees and 64 checks per tree, where 4D patches are represented as 1D feature vectors. Patches are considered similar if the squared difference of their normalized feature vectors is small. Patch matching is carried out exclusively in the luminance channel. Since for each patch a cross-reference to the corresponding color version exists, the approximation of the final patches (po = p′i + (pr − p′r), as explained in section 3) is performed in the RGB channels.
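The sketch below illustrates the same matching flow in Python, with SciPy's cKDTree as an exact-search stand-in for FLANN's approximate kd-tree search: luminance-only patches are flattened to 1D feature vectors, normalized, and queried for their nearest neighbor, whose index is then used to look up the corresponding RGB guidance patch. Function names are assumptions, not the authors' code.

import numpy as np
from scipy.spatial import cKDTree

def build_patch_index(guidance_lum_patches):
    # guidance_lum_patches: (N, K) array of flattened, luminance-only patches
    norms = np.linalg.norm(guidance_lum_patches, axis=1, keepdims=True) + 1e-8
    feats = guidance_lum_patches / norms
    return cKDTree(feats)

def match_patch(tree, query_lum_patch):
    # returns the index of the closest guidance patch and the squared
    # distance of the normalized feature vectors, as used in the paper
    q = query_lum_patch.ravel()
    q = q / (np.linalg.norm(q) + 1e-8)
    dist, idx = tree.query(q, k=1)
    return int(idx), float(dist) ** 2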
6. Results

With the sampling code shown in figure 2, we built an irregular array of 25 cameras. The recorded images were upsampled with our method to a directional resolution of 9 × 9. For comparison, we realigned the 25 cameras on a regular 5 × 5 grid and linearly interpolated the captured images to a resolution of 9 × 9 perspectives. Figure 6 presents refocus renderings of two light-field examples. While the sparse light-field resolution leads to strong bokeh artifacts in out-of-focus regions (due to undersampling), the interpolated light fields are blurred in focused regions (due to image misalignment caused by the interpolated perspectives). Light fields created with coded sampling and guided upsampling show clear improvements in both out-of-focus and in-focus regions. Note that interpolation artifacts are reduced if the synthetic focus approaches the optical focal plane of the camera array. In this case, the disparity among all perspectives is smallest.

Figure 6. Synthetic focus rendering. Two scenes are shown at two focus settings. The scenes were captured with two different alignments of 25 cameras (uniformly spaced and coded, as shown in the top row). The presented renderings compare sparse and interpolated uniformly captured light fields with coded light fields computed with our iterative guided upsampling method.

Figure 7 presents single-perspective images of the cars scene computed with our iterative guided upsampling and compares them with the corresponding interpolated images. While minor patch-matching artifacts can be detected in the upsampled results, they are far less noticeable than the artifacts caused by interpolation. Note that non-linear interpolation did not improve these results over those of linear interpolation.

Figures 8 and 10 illustrate image-based relighting results that could be achieved if the sampling code shown in figure 5 is used for light stage 6, rather than a uniform distribution of LEDs. In both cases, 362 LEDs were applied. However, our approach supports effective upsampling to an 8V icosahedron with 642 LEDs, leading to fewer aliasing artifacts for high-frequency lighting effects (e.g., sharp shadows or highly specular surface reflections).

Figure 7. Light-field perspectives. Single light-field perspectives used for image-based rendering, computed by our iterative guided upsampling method (top row) and by interpolation (bottom row). The yellow circles indicate the positions of the perspectives in the sampling patterns.

Note that the presented results are simulations that approximate the conditions of light stage 6 as closely as possible. For instance, the dimensions of the objects, the sphere diameter, and the LED distances were the same as in reality, the head scene was rendered with subsurface scattering, and all renderings were high-dynamic range. In addition to providing visual results, our simulation also allows a quantitative comparison by computing the mean squared error (MSE) in relation to the simulated ground truth. Again, our results indicate that coded sampling and guided upsampling perform significantly better than sparse uniform sampling (i.e., a 6V icosahedron with 362 LEDs) or interpolation (e.g., from a 6V to a 12V icosahedron with 1442 LEDs). In figures 9 and 11 we compare several upsampled lighting bases computed by our approach with the corresponding ground truth.

6.1. Heuristically Determined Patch Sizes

We experimentally determined the optimal patch sizes for guided upsampling. For the light stage experiments, we upsampled the scenes with varying patch sizes. Since ground truth data is available in this case, the MSEs can be calculated for each resulting lighting-basis image (see table 1). The finally applied patch sizes (15 × 15 for head, 5 × 5 for helmet, and 9 × 9 for full body) were chosen based on the minimal MSE value.

Figure 8. Image-based relighting. Two scenes are illuminated with different radiance environment maps (right). (a) The ground truth results achieved with 642 lighting directions. (b) Results achieved with 362 lighting directions (light stage 6), and (c) with our guided upsampling from 362 to 642 lighting directions. (d) Interpolation to the next higher icosahedron subdivision (12V) from 6V. White boxes in (b) highlight typical aliasing artifacts.

Figure 9. Lighting bases. Various guided upsampled lighting bases used for image-based relighting compared with the ground truth bases.

Figure 10. Image-based relighting. The full body scene is illuminated with a radiance environment map (right). (a) The ground truth achieved with 642 lighting directions. (b) Result achieved with 362 lighting directions (light stage 6), and (c) with our guided upsampling from 362 to 642 lighting directions. (d) Interpolation to the next higher icosahedron subdivision (12V) from 6V. White arrows in (b) highlight typical aliasing artifacts.

patch size    | head        | helmet      | full body
5 × 5 × 6     | 3.876·10⁻⁵  | 1.878·10⁻⁴  | 2.880·10⁻⁵
7 × 7 × 6     | 3.737·10⁻⁵  | 2.077·10⁻⁴  | 2.692·10⁻⁵
9 × 9 × 6     | 3.687·10⁻⁵  | 2.237·10⁻⁴  | 2.655·10⁻⁵
11 × 11 × 6   | 3.666·10⁻⁵  | 2.409·10⁻⁴  | 2.658·10⁻⁵
13 × 13 × 6   | 3.611·10⁻⁵  | 2.551·10⁻⁴  | 2.688·10⁻⁵
15 × 15 × 6   | 3.555·10⁻⁵  | 2.688·10⁻⁴  | 2.720·10⁻⁵

Table 1. Determining the patch size for the light stage. MSEs for the light stage scenes head, helmet and full body.
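A minimal sketch of how such a table can be used to pick the patch size: compute the MSE of every upsampled lighting basis against its ground truth and keep the patch size with the smallest mean error (hypothetical Python helpers, not the authors' evaluation code).

import numpy as np

def mse(a, b):
    return float(np.mean((np.asarray(a, float) - np.asarray(b, float)) ** 2))

def select_patch_size(candidates, upsample, ground_truth_bases):
    # candidates: iterable of patch sizes, e.g. [(5, 5), (7, 7), ...]
    # upsample(size) -> list of upsampled lighting-basis images
    # ground_truth_bases: reference images in the same order
    scores = {size: np.mean([mse(est, gt) for est, gt in
                             zip(upsample(size), ground_truth_bases)])
              for size in candidates}
    return min(scores, key=scores.get), scores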

Figure 11. Lighting bases. Various guided upsampled lighting bases of the full body scene compared with the ground truth bases.

patch size        | cars: iter. 1 | cars: iter. 2 | laboratory: iter. 1 | laboratory: iter. 2
3 × 3 × 3 × 3     | 275.52        | 127.08        | 141.05              | 75.92
5 × 5 × 3 × 3     | 212.68        | 106.89        | 120.67              | 74.28
7 × 7 × 3 × 3     | 187.20        | 102.32        | 117.21              | 74.84
9 × 9 × 3 × 3     | 178.95        | 103.14        | 116.81              | 76.60
11 × 11 × 3 × 3   | 178.34        | 103.98        | 115.63              | 77.45
13 × 13 × 3 × 3   | 173.35        | 105.04        | 116.28              | 78.39
15 × 15 × 3 × 3   | 166.76        | 105.88        | 116.90              | 79.39
21 × 21 × 3 × 3   | 170.12        | 106.41        | 118.16              | 79.78

Table 2. Determining the patch size for the light fields. Averaged MSEs of upsampled perspectives for the light field scenes cars and laboratory over two iterations.

Figure 12. Camera array. (d) Our camera array consists of 49 cameras and simultaneously records light fields with (a) a uniform 5 × 5 and (b) our coded sampling. (c) 16 additional views belong to the desired 9 × 9 resolution.

Similarly, we determined the optimal patch sizes for the light field scenes. A ground truth of the recorded light fields for comparison with our results, however, is per se not available. To record at least some ground truth samples for determining the best patch size, we built a camera array as shown in figure 12d. It consists of 49 cameras, which are aligned in such a way that light fields with a uniform 5 × 5 sampling (figure 12a) and with our coded alignment (figure 12b) can be captured simultaneously. This requires 33 cameras in total. Furthermore, we add 16 additional views (figure 12c) that belong to the desired 9 × 9 resolution at selected sampling positions (for the first and second iterations). These additional views (figure 12a and figure 12c) are only used as ground truth perspectives for estimating the best patch size and for evaluation. Only the code shown in figure 12b is used for upsampling. Table 2 presents the resulting MSE values for the cars (15 × 15 and 7 × 7) and laboratory (11 × 11 and 5 × 5) scenes over the two iterations. We chose the patch sizes with minimal MSE.

Our experiments show that the varying optimal patch sizes depend on the disparity ranges of the scenes. The laboratory scene, for example, has less depth variation (distance between the focus plane of the foremost objects and the background objects) than the cars scene. Therefore, smaller patch sizes are sufficient for upsampling in this case.

7. Limitations and Future Work

We have presented a simple guided super-resolution technique for increasing directional resolution that, rather than relying on depth estimation, disparity computations, or exact image correspondences, searches for best-matching multidimensional (4D or 3D) patches within the entire captured data set to compose new directional images that are consistent in both the spatial and the directional domains. We have described algorithms for guided upsampling, iterative guided upsampling, and sampling code estimation. Our experimental results reveal that the outcomes of existing light-field cameras and light-stage systems can be improved without additional hardware requirements or recording effort, simply by changing their sampling patterns through a realignment of cameras or light sources. The greatest limitation of our directional super-resolution approach is that it fails if the disparity among neighboring samples is too large. This happens if cameras or light sources are too far apart or their distance to the scene is too short. In this case, good patch matches are difficult to find. Our experiments revealed that disparities larger than 10 pixels at an image resolution of 512 × 512 are unsuitable for our current patch-matching method. Figure 13 illustrates this limitation based on the example of a 1D camera array (a 180 degree TimeTrack array with 49 cameras used for spatio-temporal blending effects in cinematography and advertising [18]). We used only 29 of the 49 camera perspectives for guided upsampling with the sampling code shown in figure 13. The remaining 20 perspectives served as ground truth. Note that the patches to be matched were 3D rather than 4D in this example.

Figure 13. One-dimensional camera array. Twenty perspectives of a 1D (180 degree) camera array are interpolated / upsampled with guidance (sampling code shown at top) to 49 images. Example results and the corresponding ground truth images are shown in the first three rows. Close-ups of guided upsampling results are shown on the right. Two integral images (the integrated perspectives are indicated at the sampling code) that are computed from the ground truth, the interpolated, and the guided upsampled data are shown at the bottom.

scene       | target resolution  | memory   | time
head        | 512 × 512 × 642    | 15.0 GB  | 18:53
helmet      | 512 × 512 × 642    | 2.6 GB   | 4:02
full body   | 512 × 512 × 642    | 6.4 GB   | 9:13
cars        | 520 × 480 × 9 × 9  | 12.3 GB  | 0:18
laboratory  | 520 × 480 × 5 × 5  | 7.3 GB   | 0:11

Table 3. Benchmarks. Memory consumption and computation times (h:min) for the light stage scenes head, helmet and full body and the light field scenes cars and laboratory.

Compared to linear interpolation, our upsampling results are better. However, they remain insufficient for virtual camera movements (the frozen-moment camera movement effect), where only a single image is shown at a time. For synthetic focus rendering of light fields and image-based relighting of light-stage data, minor patch-matching artifacts are less critical, as they are usually averaged by blending multiple images for the final result. This is also illustrated in figure 13 with the integral images computed from multiple perspectives. Another limitation is the computational complexity, and thus the memory requirements, of patch matching (see table 3). Even with a fast approximate nearest neighbor search, as implemented in FLANN, processing time and memory usage for the presented results were 5–9 minutes per iteration and 7.3–12.3 GB of peak memory (for 520 × 480 × 9 × 9 light fields), and 4–19 hours and 2.6–15 GB (for a light stage with 642 lighting bases and an image resolution of 512 × 512 pixels). All computations were carried out on an Intel Core i5 at 3.2 GHz with 16 GB of RAM.

In the future, we intend to investigate better patch-matching techniques that allow larger disparities. These will support sparser capturing systems and the use of individual (non-blended) output images, such as required for frozen-moment camera movement effects. Furthermore, we are planning to extend our approach to the temporal domain. Multiple time frames will enlarge the source of matchable 4D/3D patches and can consequently enhance the quality of single frames. To ensure temporal consistency, however, our method needs to be extended to 5D/4D patches.
References

[1] T. E. Bishop, S. Zanetti, and P. Favaro. Light field superresolution. In ICCP, 2009.
[2] V. Boominathan, K. Mitra, and A. Veeraraghavan. Improving resolution and depth-of-field of light field cameras using a hybrid imaging system. In ICCP, 2014.
[3] J.-X. Chai, X. Tong, S.-C. Chan, and H.-Y. Shum. Plenoptic sampling. In SIGGRAPH, pages 307–318, 2000.
[4] P. Einarsson, C.-F. Chabert, A. Jones, W.-C. Ma, B. Lamond, T. Hawkins, M. Bolas, S. Sylwan, and P. Debevec. Relighting human locomotion with flowed reflectance fields. In EGSR, 2006.
[5] G. Freedman and R. Fattal. Image and video upscaling from local self-examples. ACM Trans. Graph., 28(3):1–10, 2010.
[6] M. Fuchs, H. P. A. Lensch, V. Blanz, and H. Seidel. Super-resolution reflectance fields: Synthesizing images for intermediate light directions. Comput. Graph. Forum, 26(3):447–456, 2007.
[7] T. Georgiev, G. Chunev, and A. Lumsdaine. Superresolution with the focused plenoptic camera. In SPIE Electronic Imaging, 2011.
[8] D. Glasner, S. Bagon, and M. Irani. Super-resolution from a single image. In ICCV, 2009.
[9] S. J. Gortler, R. Grzeszczuk, R. Szeliski, and M. F. Cohen. The lumigraph. In SIGGRAPH, pages 43–54, 1996.
[10] A. Levin and F. Durand. Linear view synthesis using a dimensionality gap light field prior. In CVPR, pages 1–8, 2010.
[11] M. Levoy and P. Hanrahan. Light field rendering. In SIGGRAPH, pages 31–42, 1996.
[12] K. Marwah, G. Wetzstein, Y. Bando, and R. Raskar. Compressive light field photography using overcomplete dictionaries and optimized projections. ACM Trans. Graph., 32(4):1–11, 2013.
[13] V. Masselus, P. Peers, P. Dutré, and Y. D. Willems. Smooth reconstruction and compact representation of reflectance functions for image-based relighting. In EGSR, pages 287–298, 2004.
[14] K. Mitra, O. Cossairt, and A. Veeraraghavan. Can we beat Hadamard multiplexing? Data driven design and analysis for computational imaging systems. In ICCP, pages 1–9, May 2014.
[15] K. Mitra and A. Veeraraghavan. Light field denoising, light field superresolution and stereo camera based refocussing using a GMM light field patch prior. In CVPRW, pages 22–28, 2012.
[16] M. Muja and D. G. Lowe. Scalable nearest neighbor algorithms for high dimensional data. PAMI, 36, 2014.
[17] O. Shahar, A. Faktor, and M. Irani. Super-resolution from a single video. In CVPR, 2011.
[18] D. Taylor. Virtual camera movement: The way of the future? American Cinematographer, 77:93–100, September 1996.
[19] B. Tunwattanapong, A. Ghosh, and P. Debevec. Practical image-based relighting and editing with spherical-harmonics and local lights. In CVMP, 2011.
[20] S. Wanner and B. Goldluecke. Spatial and angular variational super-resolution of 4D light fields. In ECCV, 2012.
[21] B. Wilburn, M. Smulski, H.-H. K. Lee, and M. Horowitz. The light field video camera. In Media Processors, SPIE Electronic Imaging, 2002.