On Demand Solid Texture Synthesis Using Deep 3D Networks
Jorge Gutierrez, Julien Rabin, Bruno Galerne, Thomas Hurtut
Preprint hal-01678122v1, submitted on 8 Jan 2018 (v1), last revised 20 Dec 2019 (v3). https://hal.archives-ouvertes.fr/hal-01678122v1

SOLID TEXTURE GENERATIVE NETWORK FROM A SINGLE IMAGE

Jorge Gutierrez*, Julien Rabin†, Bruno Galerne‡, Thomas Hurtut*

*Polytechnique Montréal, Montréal, Canada
†Normandie Univ., UniCaen, ENSICAEN, CNRS, GREYC, Caen, France
‡Laboratoire MAP5, Université Paris Descartes and CNRS, Sorbonne Paris Cité

(This work was partially funded by CONACYT and I2T2 Mexico.)

ABSTRACT

This paper addresses the problem of generating volumetric data for solid texture synthesis. We propose a compact, memory-efficient convolutional neural network (CNN) which is trained from an example image. The features to be reproduced are analyzed from the example using deep CNN filters that are previously learnt from a dataset. After training, the generator is capable of synthesizing solid textures of arbitrary sizes controlled by the user. The main advantage of the proposed approach in comparison to previous patch-based methods is that the creation of a new volume of data can be done at interactive rates.

Index Terms— Texture synthesis, convolutional neural networks, volumetric data generation, solid textures.

1. INTRODUCTION

Motivation and context Besides imaging problems which deal with volumetric data (such as MRI in medical imaging, or 3D seismic geological data), the main application of solid texture synthesis originates from computer graphics. Since [1, 2], this approach has been proposed to create 3D objects as an alternative to texture mapping or synthesis onto 3D meshes. It is an efficient and elegant way of dealing with many problems that arise with the latter approaches, such as time-consuming artist design, mapping distortion and seams, etc. Once the volumetric information is generated, surfaces can be carved inside without further parametrization, no matter how intricate they are. However, several reasons still limit the use of such techniques in practice: i) storing volumetric data is prohibitive (HD quality requires several GB of memory); ii) learning to generate such data is a difficult task as there are no 3D examples; iii) previous synthesis approaches based on optimizing a matching criterion do not achieve real-time performance.

In this work, we investigate the use of Convolutional Neural Networks (CNN) for solid texture synthesis. The general principle of CNNs is to first train a network off-line to achieve a specific task, usually by optimizing its parameters given a loss function. Then, at run-time, the network is applied to new images to perform the task, which is often achieved in real-time.

Contributions We introduce a solid texture generator based on convolutional neural networks that is capable of synthesizing solid textures from a random input at interactive rates. Taking inspiration from previous work on image texture synthesis with CNN, a framework is proposed for the training of this solid texture generator. A volumetric loss function based on deep CNN filters is defined by comparing multiple-viewpoint slices of the generated textures to a single example image. This model ensures that each cross-section of the generated solid shares the same visual characteristics as the input example. To the best of our knowledge, no such architecture has been proposed in the literature to generate volumetric color data.

Outline The paper is organized as follows: Section 2 gives a short overview of the literature about solid texture synthesis methods and 2D texture synthesis using neural networks, Section 3 describes the proposed generative model and Section 4 gives the implementation details and some visual results.
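As a minimal illustration of this size flexibility, the sketch below (PyTorch) shows how a fully convolutional 3D generator maps a noise volume of any spatial size to a color volume of the same size. It is only one plausible realization: the class name, layer widths and noise dimensionality are assumptions made for the example, not the architecture detailed in Section 3.

import torch
import torch.nn as nn

class TinySolidGenerator(nn.Module):
    """Hypothetical stand-in for a solid texture generator g(z)."""
    def __init__(self, noise_channels=8, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(noise_channels, hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(hidden, 3, kernel_size=3, padding=1),  # RGB voxel grid
            nn.Sigmoid(),
        )

    def forward(self, z):
        return self.net(z)

g = TinySolidGenerator()
# Once (hypothetically) trained, any noise size yields a solid of that size,
# only limited by the available memory.
z_small = torch.randn(1, 8, 32, 32, 32)
z_large = torch.randn(1, 8, 64, 48, 80)
print(g(z_small).shape)  # torch.Size([1, 3, 32, 32, 32])
print(g(z_large).shape)  # torch.Size([1, 3, 64, 48, 80])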
2. RELATED WORKS ON TEXTURE SYNTHESIS

Let us give a quick review of example-based texture synthesis, first with a focus on solid texture synthesis and second on convolutional networks for image generation.

2.1. Solid texture synthesis

Procedural texture synthesis Procedural methods are a very special category which consists of evaluating a specific noise function depending on the spatial coordinates to achieve real-time synthesis with minimal memory requirements [2, 3]. Unfortunately, these methods often fail at reproducing real-world examples with structured features.

Example-based solid texture synthesis Most successful methods focus on generating volumetric color data (grids of voxels) whose cross-sections are visually similar to a respective input texture example. In [4, 5] an analogous "slicing" strategy is used, where the texture is generated iteratively, alternating between independent 2D synthesis of the slices along each of the three orthogonal axes of the Cartesian grid, and 3D aggregation by averaging. The main difference between the two approaches is the features being considered: steerable filter coefficients and a Laplacian pyramid in [4], and spectral information in [5]. Generalizing some previous work on texture synthesis, Wei [6] designed an algorithm which modifies the voxels of a randomly initialized solid texture by copying the average color values from an image, in a raster scan order, based on the most similar neighborhood extracted along each axis. This strategy has been refined and accelerated by Dong et al. [7] by pre-processing compatible 3D neighborhoods according to the examples given along some axis.
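The following schematic sketch (NumPy) illustrates the alternation used in [4, 5]: independent 2D updates of the slices along the three orthogonal axes, followed by 3D aggregation by averaging. The per-slice synthesis step is only a placeholder here; the actual methods match steerable filter, Laplacian pyramid or spectral statistics of the example, and the function names are illustrative.

import numpy as np

def synthesize_slice_2d(slice_2d, example_2d):
    # Placeholder for the per-slice statistical matching performed in [4, 5].
    return slice_2d

def slicing_iteration(volume, example_2d):
    candidates = []
    for axis in range(3):                        # the three orthogonal viewpoints
        moved = np.moveaxis(volume, axis, 0)     # slices stacked along axis 0
        updated = np.stack(
            [synthesize_slice_2d(s, example_2d) for s in moved], axis=0
        )
        candidates.append(np.moveaxis(updated, 0, axis))
    return np.mean(candidates, axis=0)           # 3D aggregation by averaging

volume = np.random.rand(32, 32, 32)              # random initialization (grayscale)
example = np.random.rand(64, 64)                 # stand-in for the 2D exemplar
for _ in range(5):                               # iterative synthesis
    volume = slicing_iteration(volume, example)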
Closely related to these methods, patch-based approaches [8, 9] iteratively modify the local 2D neighborhood of a voxel to match similar neighborhoods in the examples, under statistical constraints on colors or spatial coordinates.

While recent methods achieve state-of-the-art results, their main limitation is the required computation time due to the iterative update of each voxel (several minutes, except for [7]), thus prohibiting real-time applications.

2.2. Neural networks on texture synthesis

Several methods based on deep CNNs have recently been proposed for 2D texture synthesis that outperform previous ones based on more "shallow" representations (wavelets, neighborhood-based representations, etc.). In the following, we present and discuss the most successful of such architectures.

Image optimization Gatys et al. [10] proposed a variational framework inspired from [11] to generate an image which aims at matching the features of an example image. The most important aspect of this framework is that the discrepancy is defined as the distance between Gram matrices of the input and the example images; these matrices encode the filters' responses of an image at some layers of a deep convolutional neural network which is pre-trained on a large image dataset [12]. Starting from a random initialization, the input image is then optimized using back-propagation and a stochastic gradient descent algorithm. Since then, several variants have built on this framework to improve the quality of the synthesis, for instance for structured textures.

Feed-forward texture networks Another line of work trains a generative network to produce texture samples from random inputs that exhibit the same visual features as the texture example. As in the previous framework, the objective loss function to be minimized relies on the comparison of feature activations, in a pre-trained discriminative CNN, between the output and the texture example. However, this time, instead of optimizing the generated output image, the training aims at tuning the parameters of the generative network. Such optimization is more intensive as many more variables are now involved, but it only needs to be done once and for all. This is achieved in practice using back-propagation and a gradient-based optimization algorithm on batches of random inputs. Once trained, the generator is able to quickly generate samples similar to the input example by forward evaluation. Feed-forward texture network methods generally produce results of slightly lower quality than CNN-based image optimization, but with much shorter computation times.

Originally, these methods train one network per texture sample. Li et al. [17] proposed a training scheme that allows one network to generate and mix several different textures. In order to prevent the generator from producing identical images, they also incorporate in the objective function a term that encourages diversity in synthesized batches, similarly to [18].

Aside from texture synthesis, other computer vision applications dealing with volumetric data benefit from CNNs. As an example, [19] recently proposed a GAN for 3D shape inpainting and [20] proposed Octree Networks for 3D shape generation. In the more specific problem of generating 3D objects from several 2D view images, which is close to the problem considered in this work, deep learning methods have also been recently introduced, including [21] for dynamic facial texture, and [22, 23, 24] for 3D object generation from 2D views.

3. SOLID TEXTURE NETWORKS

Similar to the 2D feed-forward texture networks, our method consists of a solid texture generator and a training framework incorporating a descriptor network and a 3D loss. One generator is to be trained for each input example, with most of the computations occurring in that process. Once the generator is trained, it can be used to quickly generate new solid samples of arbitrary sizes (only limited by the available memory). Here we describe each element of the framework.
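As a preview of these ingredients, the sketch below (PyTorch) illustrates, under simplifying assumptions, a slice-based 3D loss in the spirit of the Gram matrices of Section 2.2: cross-sections of a candidate solid along the three axes are passed through a descriptor network, and their Gram matrices are compared to that of the single 2D example. The two-layer random-weight descriptor is only a stand-in for the pre-trained descriptor network, a single feature layer is used, and the function names are illustrative.

import torch
import torch.nn as nn

descriptor = nn.Sequential(                       # stand-in for the descriptor network
    nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
)

def gram(features):
    """Gram matrices of a batch of feature maps: (b, c, h, w) -> (b, c, c)."""
    b, c, h, w = features.shape
    f = features.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def solid_texture_loss(volume, example):
    """volume: (1, 3, D, H, W) candidate solid; example: (1, 3, h, w) exemplar."""
    target = gram(descriptor(example))[0]         # Gram matrix of the example
    v = volume.squeeze(0)                         # (3, D, H, W)
    loss = 0.0
    for axis in (1, 2, 3):                        # slice along the three axes
        slices = v.movedim(axis, 0)               # (N, 3, ., .) batch of cross-sections
        g = gram(descriptor(slices))              # one Gram matrix per slice
        loss = loss + ((g - target) ** 2).sum(dim=(1, 2)).mean()
    return loss

example = torch.rand(1, 3, 64, 64)
volume = torch.rand(1, 3, 24, 24, 24, requires_grad=True)
print(solid_texture_loss(volume, example))

In the actual framework, such a loss is not minimized over the voxels directly; it is back-propagated through the generator so that its parameters are optimized over batches of random inputs, as described for the 2D feed-forward texture networks above.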