Image and video decomposition and editing.

JIANCHAO TAN, George Mason University

My research has centered on understanding the colorful appearance of physical paintings, time-lapse painting histories, and natural images and videos. My work focuses on decomposing images or videos into more editable data structures called layers, to enable efficient image and video re-editing.

Given a time-lapse painting video, we can recover translucent stroke layers from each pair of consecutive frames by maximizing the translucency of the layers for maximum re-usability, under either a digital color compositing model or a physically inspired nonlinear color layering model. We then apply a spatial-temporal clustering on strokes to obtain semantic layers for further editing, such as global and local recoloring, spatial-temporal gradient recoloring, and so on.

With a single image as input, we use the convex geometry of the distribution of color points in RGB space to extract a small palette from the image, and then solve an optimization to extract translucent RGBA layers under the digital alpha compositing model. The translucent layers are well suited for efficient global and local image recoloring and for inserting new objects as layers.

Alternatively, we can apply an alternating least squares optimization to extract multi-spectral physical pigment parameters from a single digitized physical painting, under a physically inspired nonlinear color mixing model, with the help of priors on multi-spectral pigment parameters. With these multi-spectral pigment parameters and their mixing layers, we demonstrate tonal adjustments, selection masking, recoloring, physical pigment understanding, palette summarization, and edge enhancement.

Our recent ongoing work introduces an extremely scalable and efficient yet simple palette-based image decomposition algorithm to extract additive mixing layers from a single image. Our approach is based on the geometry of images in RGBXY-space. This new geometric approach is orders of magnitude more efficient than previous work and requires no numerical optimization. We demonstrate a real-time layer-updating GUI. We also present a palette-based framework for color composition for visual applications, such as image and video harmonization, color transfer, and so on.

CCS Concepts: • Computing methodologies → Image manipulation; Image processing;

Additional Key Words and Phrases: images, layers, time-lapse, video, painting, palette, generalized barycentric coordinates, harmonization, contrast, convex hull, RGB, color space, recoloring, compositing, mixing

ACM Reference Format:
Jianchao Tan. 2018. Image and video decomposition and editing. 1, 1 (June 2018), 4 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn

Author's address: Jianchao Tan, George Mason University, [email protected].

1 INTRODUCTION

My research investigates the colorful appearance of images and videos for re-editing purposes. I target digital paintings, digitized physical paintings, and natural images and videos.

The colorful appearance of digital paintings is modeled by the Porter and Duff "over" model (PD), as described in Porter and Duff [1984], which is also called the alpha compositing model. This is a standard color compositing model in industry. It models the current canvas color as a linear interpolation between the previous canvas color and the newly applied color; the interpolation value is called the alpha or opacity value. Artists usually use a data structure called layers to organize paintings in painting software. These layers are RGBA images, where the fourth channel stores per-pixel alpha values, as used in Tan et al. [2016]. There is also an additive linear mixing model, which models each pixel color as a convex combination of a small set of pigment colors, as used in Aksoy et al. [2017]. Artists use a palette to create either digital paintings with painting software or real-world paintings with brushes. The palette colors are representative colors of a painting. Many recent works focus on extracting palettes from paintings or photographs, such as Abed [2014]; Aharoni-Mack et al. [2017]; Chang et al. [2015]; Lin and Hanrahan [2013]. Some works use the extracted palette to help recolor images, as described in Aharoni-Mack et al. [2017]; Chang et al. [2015]; Mellado et al. [2017].

The colorful appearance of a physical painting is determined by the distribution of paint pigments across the canvas, which can be modeled either as a per-pixel linear mixture of a small number of palette pigments with absorption and scattering coefficients (mixing model), or a per-pixel layering of a small number of palette pigments with reflectance and transmittance coefficients (layering model). This model was proposed by Kubelka and Munk (KM), as described in Barbarić-Mikočević and Itrić [2011]; Kubelka and Munk [1931], and is a nonlinear color compositing model. It is also widely used in recent painting analysis and non-photorealistic rendering works, such as Abed [2014]; Aharoni-Mack et al. [2017]; Baxter et al. [2004]; Curtis et al. [1997]; Tan et al. [2018a, 2015].
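For concreteness, the two compositing models referenced above can be written as follows. These are the standard textbook forms; the notation (layer color c, opacity alpha, reflectance R, transmittance T, absorption K, scattering S) is mine rather than copied from the cited papers.

```latex
% Porter-Duff "over" compositing: a layer with color c and opacity \alpha
% placed over the existing canvas color C_{\text{prev}}.
C_{\text{new}} = \alpha\, c + (1 - \alpha)\, C_{\text{prev}}

% Kubelka-Munk mixing model: pigments with absorption K_i and scattering S_i,
% mixed with concentrations w_i, give an effective K/S ratio and a
% masstone reflectance R_\infty per wavelength.
\frac{K}{S} = \frac{\sum_i w_i K_i}{\sum_i w_i S_i},
\qquad
R_{\infty} = 1 + \frac{K}{S} - \sqrt{\left(\tfrac{K}{S}\right)^2 + 2\,\tfrac{K}{S}}

% Kubelka-Munk layering model: a thin layer with reflectance R_1 and
% transmittance T_1 over a background with reflectance R_2.
R = R_1 + \frac{T_1^{2} R_2}{1 - R_1 R_2},
\qquad
T = \frac{T_1 T_2}{1 - R_1 R_2}
```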


Decomposing layers from time-lapse paintings. The creation of a painting, in the physical world or digitally, is a process that occurs over time. Later strokes cover earlier strokes, and strokes painted at a similar time are likely to be part of the same object. In the final painting, this temporal history is lost, and a static arrangement of color is all that remains. The rich literature for interacting with editing history cannot be used. To enable these interactions, we presented a set of techniques to decompose a time-lapse video of a painting (defined generally to include pencils, markers, etc.) into a sequence of translucent "stroke" images. We presented translucency-maximizing solutions for recovering physical (Kubelka and Munk layering) or digital (Porter and Duff "over" compositing) paint parameters from before/after image pairs, enabling spatial-temporal video recoloring. We also presented a pipeline for processing real-world videos of paintings capable of handling long-term occlusions, such as the painter's hand and its shadow, color shifts, and noise. This work is described in Tan et al. [2015].
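As a minimal illustration of the translucency-maximizing idea under the Porter and Duff "over" model, the sketch below recovers, for each pixel independently, the smallest opacity (i.e., the most translucent stroke) whose implied stroke color stays inside the RGB cube. It omits the spatial regularization and the Kubelka-Munk variant described in Tan et al. [2015]; the function and variable names are my own.

```python
import numpy as np

def max_translucency_stroke(before, after, eps=1e-6):
    """Per-pixel recovery of the most translucent stroke layer explaining an
    (H, W, 3) before/after frame pair under the Porter-Duff "over" model:
        after = alpha * color + (1 - alpha) * before.
    Returns (alpha, color) with alpha in [0, 1] and color in [0, 1]^3.
    """
    before = before.astype(np.float64)
    after = after.astype(np.float64)

    # Requiring the stroke color (after - (1-alpha)*before) / alpha to lie in
    # [0, 1] per channel gives a lower bound on alpha for each channel.
    lower_from_zero = (before - after) / np.maximum(before, eps)       # color >= 0
    lower_from_one = (after - before) / np.maximum(1.0 - before, eps)  # color <= 1
    alpha = np.maximum(lower_from_zero, lower_from_one).max(axis=-1)
    alpha = np.clip(alpha, 0.0, 1.0)

    # Recover the stroke color wherever the stroke is not fully transparent.
    a = alpha[..., None]
    color = np.where(a > eps, (after - (1.0 - a) * before) / np.maximum(a, eps), 0.0)
    return alpha, np.clip(color, 0.0, 1.0)
```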

Decomposing layers from images. We presented a technique to decompose an image into layers via RGB-space geometry. In our decomposition, each layer represents a single-color coat of paint applied with varying opacity. In RGB-space, the linear nature of the standard Porter-Duff "over" pixel compositing operation implies a convex hull structure. The vertices of the convex hull of image pixels in RGB-space correspond to a palette of paint colors. These colors may be "hidden" and inaccessible to algorithms based on clustering visible colors (Chang et al. [2015]). For our layer decomposition, users choose the palette size (the degree of simplification to perform on the convex hull), as well as a layer order for the paint colors (vertices). We apply a modified progressive hull method (Sander et al. [2000]) to simplify the convex hull down to the target palette size. We then solve a constrained optimization problem to find translucent, spatially coherent opacities for each layer, such that the composition of the layers reproduces the original image. We demonstrate the utility of the resulting decompositions for recoloring (global and local) and object insertion. Our layers can be interpreted as generalized barycentric coordinates, as described in Floater [2015]; we compare to these and other recoloring approaches. This work is described in Tan et al. [2016].

Fig. 1. Visualization of the two convex hulls. The simplified RGB convex hull is the basis for the methods in Tan et al. [2016], capturing the colors of an image but not their spatial relationships. Our 5D RGBXY convex hull captures color and spatial relationships at the same time. We visualize its vertices as small circles; its 5D simplices are difficult to visualize. Our approach splits image decomposition into a two-level geometric problem. The first level is the RGBXY convex hull vertices that mix to produce any pixel in the image. The second level is the simplified RGB convex hull vertices, which serve as the palette RGB colors. Since the RGBXY convex hull vertices lie inside the RGB convex hull, we find mixing weights that control the color of the RGBXY vertices. The two levels combined allow instant recoloring of the whole image. The left figure shows the locations of the RGBXY vertices in image space. The right figure shows the geometric relationships between the 3D and 5D convex hull vertices, and how the simplified RGB convex hull captures the same color palette for both algorithms.
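A minimal sketch of the geometric palette idea from the "Decomposing layers from images" paragraph above: compute the convex hull of the image pixels in RGB-space and treat its vertices as candidate palette colors. It omits the progressive-hull simplification of Sander et al. [2000] and the constrained opacity optimization of Tan et al. [2016]; the names are illustrative.

```python
import numpy as np
from scipy.spatial import ConvexHull

def rgb_hull_palette(image):
    """Candidate palette colors = vertices of the RGB-space convex hull.

    image: (H, W, 3) array of colors in [0, 1].
    Returns a (V, 3) array of hull-vertex colors. A real pipeline would
    further simplify this hull down to a small, user-chosen palette size.
    """
    pixels = image.reshape(-1, 3).astype(np.float64)
    hull = ConvexHull(pixels)
    return pixels[hull.vertices]
```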

Pigment-based image analysis and editing. We also presented an alternating nonlinear least squares optimization based on the Kubelka-Munk color mixing model to efficiently recover multi-spectral pigment parameters (absorption and scattering) from a single RGB image, yielding a plausible set of pigments and their mixture maps with a low RGB reconstruction error. Using our decomposition, we repose standard digital image editing operations as operations in pigment space rather than RGB space, with interestingly novel results. We demonstrate tonal adjustments, selection masking, cut-copy-paste, recoloring, palette summarization, and edge enhancement. This work is described in Tan et al. [2018a].
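To make the nonlinear mixing model concrete, here is a small sketch of the Kubelka-Munk forward step that such an alternating optimization would repeatedly evaluate: given per-pigment absorption and scattering spectra and per-pixel mixing weights, predict reflectance. The conversion from spectral reflectance to RGB and the actual alternating updates of Tan et al. [2018a] are omitted; the names and shapes are my own simplifications.

```python
import numpy as np

def km_mixture_reflectance(weights, K, S, eps=1e-8):
    """Kubelka-Munk masstone reflectance of pigment mixtures.

    weights: (N, P) per-pixel mixing weights over P pigments (rows sum to 1).
    K, S:    (P, L) absorption and scattering coefficients over L wavelengths.
    Returns  (N, L) predicted reflectance per pixel and wavelength.
    """
    K_mix = weights @ K                   # (N, L) effective absorption
    S_mix = np.maximum(weights @ S, eps)  # (N, L) effective scattering
    ks = K_mix / S_mix                    # effective K/S ratio
    return 1.0 + ks - np.sqrt(ks * ks + 2.0 * ks)
```

An alternating scheme would hold K and S fixed while solving for the weights that minimize reconstruction error against the observed image, then hold the weights fixed while updating K and S, repeating until convergence.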
Ongoing: Image decomposition and harmonization. A detailed description of this work is in our first draft on arXiv (Tan et al. [2018b]). At the core of our and other recent approaches to image editing (Aksoy et al. [2017]; Chang et al. [2015]; Tan et al. [2016]; Zhang et al. [2017]), images are decomposed into a palette and associated per-pixel compositing or mixing parameters. We propose a new, extremely efficient yet simple and robust algorithm to do so. Our approach is inspired by the geometric palette extraction technique of Tan et al. [2016]. We consider the geometry of 5D RGBXY-space, which captures color as well as spatial relationships and eliminates numerical optimization. After an initial palette is extracted automatically (given an RMSE reconstruction threshold), the user can edit the palette in our GUI and obtain new decompositions instantaneously. Our algorithm is extremely efficient even for very high resolution images (≥ 100 megapixels), running 20× faster than the state of the art (Aksoy et al. [2017]). We demonstrate that applications like color harmonization and color transfer are greatly simplified by our framework.

2 IMAGE DECOMPOSITION

2.1 Image decomposition via RGBXY convex hull

In this work, we extract additive linear mixing layers from an input image. We provide a fast and simple, yet spatially coherent, geometric construction.

Spatial coherence. To provide spatial coherence, our key insight is to manipulate color mixing weights in 5D RGBXY-space, where XY are the coordinates of a pixel in image space, so that spatial relationships are considered along with color in a unified way (Figure 1). We first compute the convex hull of the image in RGBXY-space. We then compute Delaunay generalized barycentric coordinates (weights) for every pixel in the image in terms of the 5D convex hull. Pixels that have similar colors or are spatially adjacent will end up with similar weights, meaning that our layers will be smooth both in RGB-space and XY-space. These mixing weights form a Q × N matrix W_RGBXY, where N is the number of image pixels and Q is the number of RGBXY convex hull vertices. We also compute W_RGB, the generalized barycentric coordinates (weights) of the RGBXY convex hull vertices with respect to the 3D simplified convex hull. W_RGB is a P × Q matrix, where P is the number of vertices of the simplified RGB convex hull (the palette colors). The final weights for the image are obtained via matrix multiplication: W = W_RGB W_RGBXY, a P × N matrix that assigns each pixel weights solely in terms of the simplified RGB convex hull. These weights are smooth both in color space and image space. To decompose an image with a different RGB palette, one only needs to recompute W_RGB and then perform the matrix multiplication. Computing W_RGB is extremely efficient, since it depends only on the palette size and the number of RGBXY convex hull vertices. It is independent of the image size and allows users to experiment with image decompositions based on interactive palette editing (Figure 3b).
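The following sketch mirrors the two-level construction described above using SciPy's qhull bindings. For brevity it handles both levels with Delaunay-based barycentric coordinates and uses a crude fallback for points that fall slightly outside a tessellation; it is an approximation of the pipeline under those assumptions, not the released implementation, and it uses a row-per-pixel (N × P) convention, i.e. the transpose of the P × N matrix W in the text.

```python
import numpy as np
from scipy.spatial import ConvexHull, Delaunay

def barycentric_weights(points, vertices):
    """Weights of `points` over `vertices` via a Delaunay tessellation.

    Returns a (len(points), len(vertices)) row-stochastic matrix.
    Assumes `vertices` (approximately) enclose `points`; out-of-hull points
    are crudely clamped to simplex 0 here for brevity.
    """
    tri = Delaunay(vertices)
    simplex = np.maximum(tri.find_simplex(points, tol=1e-6), 0)
    X = tri.transform[simplex, :-1]                 # (M, d, d) inverse maps
    Y = points - tri.transform[simplex, -1]         # (M, d) offsets
    b = np.einsum('mij,mj->mi', X, Y)               # first d barycentric coords
    bary = np.hstack([b, 1.0 - b.sum(axis=1, keepdims=True)])
    W = np.zeros((len(points), len(vertices)))
    np.put_along_axis(W, tri.simplices[simplex], bary, axis=1)
    return W

def rgbxy_decomposition(image, palette):
    """Per-pixel weights over (P, 3) `palette` colors for an (H, W, 3) image."""
    H, W_, _ = image.shape
    xy = np.stack(np.meshgrid(np.linspace(0, 1, W_), np.linspace(0, 1, H)), -1)
    rgbxy = np.concatenate([image, xy], axis=-1).reshape(-1, 5)

    hull_idx = ConvexHull(rgbxy).vertices                      # Q vertices in 5D
    W_rgbxy = barycentric_weights(rgbxy, rgbxy[hull_idx])      # (N, Q)
    W_rgb = barycentric_weights(rgbxy[hull_idx, :3], palette)  # (Q, P), tiny
    return W_rgbxy @ W_rgb                                     # (N, P)
```

Recomputing only the small W_rgb factor after a palette edit and re-multiplying is what makes the interactive layer updates described below nearly instantaneous.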


2.2 Evaluation

We generate state-of-the-art decompositions in terms of quality. Figure 2b compares recolorings created with our layers to those from Aksoy et al. [2017], Tan et al. [2016], and Chang et al. [2015]. Figure 2a shows a direct comparison between our additive mixing layers and those of Aksoy et al. [2017] for direct inspection. In Figure 2c, we compare the running time of additive mixing layer decomposition techniques. We also ran an additional 6 extremely large images containing 100 megapixels (not shown in the plot). The 100 megapixel images took on average 12.6 minutes to compute. Peak memory usage was 15 GB. For further improvement, our approach could be parallelized by dividing the image into tiles, since the convex hull of a set of convex hulls is the same as the convex hull of the underlying data. A working implementation (48 lines of code) of the RGBXY decomposition method can be found in this code link. The "Layer Updating" performance is nearly instantaneous, taking a few milliseconds, or a few tens of milliseconds for 10 MP images, to re-compute the layer decomposition given a new palette.
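The near-instant layer updating amounts to one small barycentric solve plus matrix products; a sketch under the row-per-pixel convention of the earlier snippet (hypothetical names, not the released 48-line implementation):

```python
import numpy as np

def update_and_recolor(W_rgbxy, W_rgb_new, palette_rgb, height, width):
    """Layer update and recoloring after a palette edit.

    W_rgbxy:     (N, Q) fixed per-image weights over the RGBXY hull vertices.
    W_rgb_new:   (Q, P) freshly computed weights of those Q vertices over the
                 edited palette (tiny: independent of the image size).
    palette_rgb: (P, 3) edited palette colors in [0, 1].
    """
    weights = W_rgbxy @ W_rgb_new     # (N, P): new per-pixel layer weights
    image = weights @ palette_rgb     # (N, 3): recolored pixels
    return weights, image.reshape(height, width, 3)
```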

Interactive layer decompositions. To take advantage of our extremely fast layer decomposition, we implemented a web GUI for viewing and interacting with layer decompositions (Figure 3a). An editing session begins when a user loads an image and precomputes RGBXY weights. Our GUI allows users to edit palettes and see the resulting layer decomposition in real time. See Figure 3b for a result created with our GUI. In this example, the mixing weights corresponding to the automatic palette (right) become sparser as a result of interactive editing.

3 APPLICATIONS

3.1 Layer-based editing model

My research enables various editing applications on images and videos with the help of the various extracted layers. For translucent layers extracted from a time-lapse painting video, we can enable temporal-spatial gradient recoloring, as shown in Figure 4a. We can also apply spatial-temporal clustering to obtain a semantic structure for further editing, as shown in Figure 4b. For alpha compositing layers from a single image, we can recolor by changing layer palette colors in real time, as shown in Figure 2b. We can also insert a new object as a new layer that merges naturally into the original image, as shown in Figure 5a. For additive mixing layers extracted from digitized physical paintings, we can enable more editing beyond RGB, as shown in Figure 5b.

3.2 Harmonization and color transfer

We can simplify the color harmonization fitting procedure of Cohen-Or et al. [2006] by replacing their full pixel histograms with just our palette colors, and we project the palette colors onto template axes in LCh-space to enable color harmonization. Because we use a spatially coherent image decomposition, no additional work is needed to prevent discontinuous recoloring as in Cohen-Or et al. [2006]. Figure 5c shows different harmonic templates enforced over the same input image. Our pipeline naturally extends to video input by simply applying our image decomposition and harmonization on each frame independently. Surprisingly, we obtain a fairly temporally coherent layer decomposition without additional processing beyond the proposed framework. Additionally, our palette-based harmonization template fitting enables color transfer between an input image and a reference image. More details can be found in Tan et al. [2018b].
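A toy sketch of the palette-level template fitting idea: snap each palette hue to the nearest sector of a chosen harmonic template and let the precomputed weights propagate the change to every pixel. The paper works in LCh-space; this sketch substitutes HSV via Python's colorsys purely for brevity, and the template definition and names are illustrative.

```python
import colorsys
import numpy as np

def harmonize_palette(palette, sector_centers_deg, rotation_deg=0.0):
    """Move each palette color's hue to the nearest template sector center.

    palette: (P, 3) RGB colors in [0, 1].
    sector_centers_deg: hue centers of a harmonic template, e.g. a
    complementary template could be [0, 180] (illustrative values only).
    """
    centers = (np.asarray(sector_centers_deg, dtype=float) + rotation_deg) % 360.0
    out = []
    for r, g, b in palette:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        hue = (h * 360.0) % 360.0
        # Signed angular distance to each sector center on the hue circle.
        diffs = (centers - hue + 180.0) % 360.0 - 180.0
        new_hue = (hue + diffs[np.argmin(np.abs(diffs))]) % 360.0
        out.append(colorsys.hsv_to_rgb(new_hue / 360.0, s, v))
    return np.asarray(out)

# Recoloring the whole image then reuses the fixed per-pixel weights, e.g.:
# harmonized = (weights @ harmonize_palette(palette, [0, 180])).reshape(H, W, 3)
```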
4 CONCLUSION

My work focuses on manipulating the color of images and videos, exploring several methods to decompose images and videos into more editable layers, followed by several editing applications, such as recoloring and insertion. Our ongoing work presents a very efficient, intuitive, and capable framework for color composition by exploiting RGBXY-space geometry. It allows us to formulate previous and novel approaches to color harmonization and color transfer with very robust results. Our palette manipulations can be plugged into any palette-based system. Our image decomposition can be used generally by artists for manual editing or in other algorithms.

4.1 Future work

Palette for vector graphics. We are also exploring color palette editing for vector graphics input, for example palette-based suggestive colorization of grayscale vector graphics, which may involve several deep learning techniques.

Spatial-temporal palette of videos. We also plan to extend our framework to video, exploring the spatial-temporal coherence of palettes, to potentially provide more robust color grading methods. Additionally, by combining semantic segmentation information from video frames, we can provide semantic spatial-temporal palettes to enable better semantic video color grading.



(a) Layer comparison: Aksoy et al. [2017] creates an overabundance of layers (two red layers above) and does not extract the blueish tint, which appears primarily in mixture. Our RGBXY technique identifies mixed colors and is able to separate the translucent purple haze in front of the girl's face. Additionally, our GUI allows editing the palette to modify layers in real time; results are shown in Figure 3b. (b) To evaluate our RGBXY decomposition algorithm, we compare our layers with previous approaches in a recoloring application. From left to right: (a) Aksoy et al. [2017], (b) Tan et al. [2016], (c) Chang et al. [2015], and (d) our approach. Our recoloring quality is similar to the state of the art, but our method is orders of magnitude faster and allows interactive layer decomposition while editing palettes. (c) Running time comparison. We evaluated our RGBXY algorithm on 170 images up to 12 megapixels and an additional six 100 megapixel images (not shown; average running time 12.6 minutes). Our algorithm's performance is orders of magnitude faster and scales extremely well with image size. Our layer updating is real-time.

Fig. 2. Layer comparison, recoloring comparison and running time comparison.


above between Original Increase scattering Decrease scattering (a) Our GUI for inter- (b) The user edits the automat- (a) Inserting graphics as (b) Adjusting the mix- (c) Results of our har- actively editing palettes. ically generated palette in GUI new layers between our ing weight of yellow monic templates fit to As users either drag the to ensure that the background decomposed layers pro- pigment and the scat- the an input image. palette colors or click lay- and hair colors are directly rep- duces a more natural re- tering coefficients of ers to edit color wheel, resented. As a result, editing sult than pasting graphics the green pigment. the layer decomposition the purple haze and hair no above. updates lively. longer affects the background color. Fig. 5. Different editing applications based on different layers Fig. 3. Live updating GUI and recoloring comparison. Nicolas Mellado, David Vanderhaeghe, Charlotte Hoarau, Sidonie Christophe, Math- ieu Brédif, and Loic Barthe. 2017. Constrained palette-space exploration. ACM Transactions on Graphics (TOG) 36, 4 (2017), 60. Thomas Porter and Tom Duff. 1984. Compositing digital images. In ACM Siggraph Computer Graphics, Vol. 18. ACM, 253–259. Pedro V Sander, Xianfeng Gu, Steven J Gortler, Hugues Hoppe, and John Snyder. 2000. Silhouette clipping. In Proceedings of the 27th annual conference on Computer graphics and interactive techniques. ACM Press/Addison-Wesley Publishing Co., 327–334. Jianchao Tan, Stephen DiVerdi, Jingwan Lu, and Yotam Gingold. 2018a. Pigmento: Pigment-Based Image Analysis and Editing. IEEE Transactions on Visualization and Computer Graphics, to appear (2018). (a) From a time lapse recording of the painting (b) The user draws Jianchao Tan, Marek Dvorožňák, Daniel Sýkora, and Yotam Gingold. 2015. Decomposing process (bottom) we extract translucent paint lay- a selection scribble, Time-lapse Paintings into Layers. ACM Trans. Graph. 34, 4, Article 61 (July 2015), ers into a temporal creation history. This allows the system then sug- 10 pages. https://doi.org/10.1145/2766960 Jianchao Tan, Jose Echevarria, and Yotam Gingold. 2018b. Pigmento: Pigment-Based artists to produce complex effects such as color gests several tempo- Image Analysis and Editing. arXiv preprint arXiv:1804.01225 (2018). gradients controlled by time (top left) or temporal ral clusters. Jianchao Tan, Jyh-Ming Lien, and Yotam Gingold. 2016. Decomposing Images into eraser (top right). Layers via RGB-space Geometry. ACM Trans. Graph. 36, 1, Article 7 (Nov. 2016), 14 pages. https://doi.org/10.1145/2988229 Qing Zhang, Chunxia Xiao, Hanqiu Sun, and Feng Tang. 2017. Palette-Based Image Fig. 4. Timelapse painting decomposition and editing application Recoloring Using Color Decomposition Optimization. IEEE Transactions on Image Processing 26, 4 (2017), 1952–1964.

REFERENCES
Farhad Moghareh Abed. 2014. Pigment identification of paintings based on Kubelka-Munk theory and spectral images. (2014).
Elad Aharoni-Mack, Yakov Shambik, and Dani Lischinski. 2017. Pigment-Based Recoloring of Watercolor Paintings. (2017).
Yağiz Aksoy, Tunç Ozan Aydin, Aljoša Smolić, and Marc Pollefeys. 2017. Unmixing-based soft color segmentation for image manipulation. ACM Transactions on Graphics (TOG) 36, 2 (2017), 19.
Vesna Džimbeg-Malčić, Željka Barbarić-Mikočević, and Katarina Itrić. 2011. Kubelka-Munk theory in describing optical properties of paper (I). Technical Gazette 18, 1 (2011), 117–124.
William Baxter, Jeremy Wendt, and Ming C Lin. 2004. IMPaSTo: a realistic, interactive model for paint. In Proceedings of the 3rd International Symposium on Non-Photorealistic Animation and Rendering. ACM, 45–148.
Huiwen Chang, Ohad Fried, Yiming Liu, Stephen DiVerdi, and Adam Finkelstein. 2015. Palette-based Photo Recoloring. ACM Trans. Graph. 34, 4 (July 2015).
Daniel Cohen-Or, Olga Sorkine, Ran Gal, Tommer Leyvand, and Ying-Qing Xu. 2006. Color Harmonization. In ACM SIGGRAPH 2006 Papers (SIGGRAPH '06). ACM, New York, NY, USA, 624–630. https://doi.org/10.1145/1179352.1141933
Cassidy J Curtis, Sean E Anderson, Joshua E Seims, Kurt W Fleischer, and David H Salesin. 1997. Computer-generated watercolor. In Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques. ACM Press/Addison-Wesley Publishing Co., 421–430.
Michael S Floater. 2015. Generalized barycentric coordinates and applications. Acta Numerica 24 (2015), 161–214.
Paul Kubelka and Franz Munk. 1931. An article on optics of paint layers. Z. Tech. Phys 12, 593–601 (1931).
Sharon Lin and Pat Hanrahan. 2013. Modeling how people extract color themes from images. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 3101–3110.
Nicolas Mellado, David Vanderhaeghe, Charlotte Hoarau, Sidonie Christophe, Mathieu Brédif, and Loic Barthe. 2017. Constrained palette-space exploration. ACM Transactions on Graphics (TOG) 36, 4 (2017), 60.
Thomas Porter and Tom Duff. 1984. Compositing digital images. In ACM SIGGRAPH Computer Graphics, Vol. 18. ACM, 253–259.
Pedro V Sander, Xianfeng Gu, Steven J Gortler, Hugues Hoppe, and John Snyder. 2000. Silhouette clipping. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques. ACM Press/Addison-Wesley Publishing Co., 327–334.
Jianchao Tan, Stephen DiVerdi, Jingwan Lu, and Yotam Gingold. 2018a. Pigmento: Pigment-Based Image Analysis and Editing. IEEE Transactions on Visualization and Computer Graphics, to appear (2018).
Jianchao Tan, Marek Dvorožňák, Daniel Sýkora, and Yotam Gingold. 2015. Decomposing Time-Lapse Paintings into Layers. ACM Trans. Graph. 34, 4, Article 61 (July 2015), 10 pages. https://doi.org/10.1145/2766960
Jianchao Tan, Jose Echevarria, and Yotam Gingold. 2018b. Pigmento: Pigment-Based Image Analysis and Editing. arXiv preprint arXiv:1804.01225 (2018).
Jianchao Tan, Jyh-Ming Lien, and Yotam Gingold. 2016. Decomposing Images into Layers via RGB-space Geometry. ACM Trans. Graph. 36, 1, Article 7 (Nov. 2016), 14 pages. https://doi.org/10.1145/2988229
Qing Zhang, Chunxia Xiao, Hanqiu Sun, and Feng Tang. 2017. Palette-Based Image Recoloring Using Color Decomposition Optimization. IEEE Transactions on Image Processing 26, 4 (2017), 1952–1964.
