Display-Aware Image Editing
Won-Ki Jeong (Harvard University), Micah K. Johnson (Massachusetts Institute of Technology), Insu Yu (University College London), Jan Kautz (University College London), Hanspeter Pfister (Harvard University), Sylvain Paris (Adobe Systems, Inc.)

Abstract

We describe a set of image editing and viewing tools that explicitly take into account the resolution of the display on which the image is viewed. Our approach is twofold. First, we design editing tools that process only the visible data, which is useful for images larger than the display. This encompasses cases such as multi-image panoramas and high-resolution medical data. Second, we propose an adaptive way to set viewing parameters such as brightness and contrast. Because we deal with very large images, different locations and scales often require different viewing parameters. We let users set these parameters at a few places and interpolate satisfying values everywhere else. We demonstrate the efficiency of our approach on different display and image sizes. Since the computational complexity to render a view depends on the display resolution and not on the actual input image resolution, we achieve interactive image editing even on a 16-gigapixel image.

1. Introduction

Gigapixel images are now commonplace, with dedicated devices to automate the image capture [1, 2, 24, 37] and image stitching software [5, 11]. These large pictures have a unique appeal compared to normal-sized images. Fully zoomed out, they convey a global sense of the scene, while zooming in lets one dive in, revealing the smallest details, as if one were there. In addition, modern scientific instruments such as electron microscopes or sky-surveying telescopes are able to generate very high-resolution images for scientific discovery at the nano- as well as at the cosmological scale. We are interested in two problems related to these large images: editing them and viewing them.

Editing such large pictures remains a painstaking task. Although after-exposure retouching plays a major role in the rendition of a photo [3], and enhancing scientific images is critical to their interpretation [8], these operations are still mostly out of reach for images above 100 megapixels. Standard editing techniques are designed to process images that have at most a few megapixels. While significant speedups have been obtained at these intermediate resolutions, e.g. [14, 16, 17, 28], major hurdles remain to interactively edit larger images. For instance, optimization tools such as least-squares systems and graph cuts become impractical when the number of unknowns approaches or exceeds a billion. Furthermore, even simple editing operations become costly when repeated for hundreds of millions of pixels. The basic insight of our approach is that the image is viewed on a display with a limited resolution, and only a subset of the image is visible at any given time. We describe a series of image editing operators that produce results only for the visible portion of the image and at the displayed resolution. A simple and efficient multi-resolution data representation (§2) allows each image operator to quickly access the currently visible pixels. Because the displayed view is computed on demand, our operators are based on efficient image pyramid manipulations and designed to be highly data-parallel (§3). When the user changes the view location or resolution, we simply recompute the result on the fly.

Further, existing editing tools do not support the fact that very large images can be seen at multiple scales. For instance, a high-resolution scan as shown in the companion video reveals both the overall structure of the brain as well as the fine entanglement between neurons. In existing software, settings such as brightness and contrast are the same whether one looks at the whole image or at a small region. In comparison, we let the user specify several viewing settings for different locations and scales. This is useful for emphasizing different structures, e.g. on a brain scan, or expressing different artistic intents, e.g. on a photo. We describe an interpolation scheme motivated by a user study to infer the viewing parameters where the user has not specified any settings (§4). This adaptive approach enables the user to obtain a pleasing rendition at all zoom levels and locations while setting viewing parameters only at a few places.

The novel contributions of this work are twofold. First, we describe editing operators such as image stitching and seamless cloning that are output-sensitive, i.e., the associated computational effort depends only on the display resolution. Our algorithms are based on Laplacian pyramids, which we motivate by a theoretical study of the constraints required to be display-aware. Second, we propose an interpolation scheme motivated by a user study to infer viewing parameters where the user has not specified any settings. We illustrate our approach with a prototype implemented on the GPU and show that we can interactively edit images as large as 16 gigapixels.

1.1. Related Work

The traditional strategy with large images is to process them at full resolution and then rescale and crop according to the current display. As far as we know, this is commonly done in commercial software. However, this simple approach quickly becomes impractical with large images, especially with optimizations such as graph cuts and Poisson solvers.

Fast image filters have been proposed to speed up operations such as edge-aware smoothing [4, 14, 15, 17, 32], seamless compositing [5, 16, 22], inpainting [9], and selection [28]. Although these algorithms reduce the computation times, they have been designed for standard-size images, and the entire picture at full resolution is eventually processed. In comparison, we propose display-aware algorithms that work locally in space and scale such that only the visible portion of the data is processed.

Berman et al. [10] and Velho and Perlin [36] describe multi-scale painting systems for large images based on wavelets. From an application perspective, our work is complementary, as we do not investigate methods for painting but rather for adaptive viewing and more advanced editing such as seamless cloning. Technically speaking, our methods operate in a display-aware fashion, and not in a multi-scale fashion. That is, we apply our edits on the fly to the current view and never actually propagate the results to all scales. Further, it is unclear how to extend the proposed approach from painting to algorithms such as seamless cloning.

Pinheiro and Velho [34] and Kopf et al. [24] propose a multi-resolution tiled memory management system for viewing large data. Our data management follows similar design principles, but supports multiple input images that can be aligned to form a local image pyramid on the fly without managing a pre-built global multi-resolution image pyramid. It also naturally supports out-of-core computations on graphics hardware with limited memory. Jeong et al. [21] proposed an interactive gigapixel viewer suitable for […]

[…] allows users to make adjustments that adapt to the current view and may reflect subjective intents. Furthermore, we propose more complex output-sensitive algorithms for tasks such as seamless cloning.

Shantzis [35] describes a method to limit the amount of computation by processing only the data within the bounding box of each operator. We extend this approach in several ways. Unlike Shantzis, we deal with changes of zoom level and ignore the high-frequency data when they are not visible. This property is nontrivial, as we shall see (§3.1). We also design new algorithms that enable display-aware processing, such as our stitching method based on local computation only. In comparison, the standard method based on graph cuts is global, i.e., the bounding box would cover the entire image. Further, we also deal with viewing parameters, which is not in the scope of Shantzis' work.

2. Data Representation

A major aspect of our approach is that the view presented to the user is always computed on the fly. From a data structure point of view, this implies that the displayed pixel data have to be readily available and that we can rapidly determine how to process them. To achieve this, we use several mechanisms detailed in the rest of this section.

Global Space and Image Tiles. Our approach is organized around a coordinate system in which points are located by their (x, y) spatial location and the scale s at which they are observed. A unit scale s = 1 corresponds to the full-resolution data, while s = 1/n corresponds to the image downsampled by a factor of n. We use this coordinate system to define a global space in which we locate data with respect to the displayed image that the user observes (Fig. 1). Typically, we have several input images that make up, e.g., a panorama. For each image Ii, we first compute a geometric transformation gi that aligns it with the others by specifying its location in global space. If we have only one input image, then g is the identity function. The geometric alignment can either be pre-computed before the editing session, or each image can be aligned on the fly when it is displayed. In the former case, we use feature point detection and homography alignment, e.g. [11]. In the latter case, the user interactively aligns the images in an approximate manner. We then automatically register them in a display-aware fashion by maximizing the cross-correlation between the visible overlapping areas.
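To make the (x, y, s) global coordinate system concrete, the mapping from a requested scale to a pyramid level and level-local pixel coordinates can be sketched as follows. This is our own illustration, not the paper's implementation; the function names are hypothetical, and the paper's actual data structure is a tiled, GPU-resident representation.

```python
import math

def scale_to_level(s):
    """Map a scale s (1 = full resolution, 1/n = downsampled by n)
    to the nearest coarser pyramid level; level 0 is full resolution."""
    assert 0 < s <= 1
    return max(0, math.ceil(math.log2(1 / s)))

def global_to_level_coords(x, y, s):
    """Convert a global-space point (x, y) observed at scale s into
    pixel coordinates in the pyramid level serving that scale."""
    level = scale_to_level(s)
    factor = 2 ** level  # downsampling factor of that level
    return x / factor, y / factor, level

# A point viewed at full resolution maps to level 0 unchanged,
# while the same point at quarter scale lands on level 2.
print(global_to_level_coords(512, 256, 1.0))   # (512.0, 256.0, 0)
print(global_to_level_coords(512, 256, 0.25))  # (128.0, 64.0, 2)
```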
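The final registration step, maximizing cross-correlation over the visible overlap, can be sketched as a brute-force search over small integer translations. The paper does not specify its search strategy, so this sketch and its names are ours:

```python
import numpy as np

def refine_shift(visible_a, visible_b, search=4):
    """Find the integer shift of visible_b, within [-search, search]^2,
    that maximizes normalized cross-correlation with visible_a.
    Both inputs are the currently visible overlap region only."""
    a = visible_a - visible_a.mean()
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            b = np.roll(visible_b, (dy, dx), axis=(0, 1))
            b = b - b.mean()
            denom = np.sqrt((a * a).sum() * (b * b).sum())
            if denom == 0:
                continue  # flat region, correlation undefined
            score = (a * b).sum() / denom
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Because only displayed pixels enter the search, the cost is bounded by the display resolution; for larger search ranges one would typically switch to FFT-based phase correlation.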
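As background for the pyramid-based operators (§3), a minimal NumPy sketch of a Laplacian pyramid and its exact reconstruction is given below. For brevity it uses 2x2 box filtering and nearest-neighbor upsampling where a production pyramid would use Gaussian kernels; the function names are ours, not the paper's.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging 2x2 blocks."""
    return (img[0::2, 0::2] + img[1::2, 0::2] +
            img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def upsample(img):
    """Double resolution by nearest-neighbor replication."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def laplacian_pyramid(img, levels):
    """Return [L0, ..., L_{levels-1}, residual], where L_k stores the
    detail lost between level k and the coarser level k+1."""
    assert img.shape[0] % 2**levels == 0 and img.shape[1] % 2**levels == 0
    pyramid = []
    for _ in range(levels):
        coarse = downsample(img)
        pyramid.append(img - upsample(coarse))
        img = coarse
    pyramid.append(img)  # low-frequency residual
    return pyramid

def collapse(pyramid):
    """Invert laplacian_pyramid exactly."""
    img = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        img = upsample(img) + detail
    return img
```

The key property exploited by a display-aware operator is that rendering a coarse view needs only the residual and the few detail levels at or above the displayed resolution, never the full-resolution data.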