Developing Creative AI to Generate Sculptural Objects
Songwei Ge*, Austin Dill*, Eunsu Kang*, Chun-Liang Li*, Lingyao Zhang, Manzil Zaheer, Barnabas Poczos
Carnegie Mellon University, Pittsburgh, PA, United States
{chunlial, eunsuk, songweig, lingyaoz, abdill, manzil, [email protected]
*Equal contribution

Abstract

We explore the intersection of human and machine creativity by generating sculptural objects through machine learning. This research raises questions about both the technical details of automatic art generation and the interaction between AI and people, as both artists and the audience of art. We introduce two algorithms for generating 3D point clouds and then discuss their actualization as sculpture and incorporation into a holistic art installation. Specifically, the Amalgamated DeepDream (ADD) algorithm solves the sparsity problem caused by the naive DeepDream-inspired approach and generates creative and printable point clouds. The Partitioned DeepDream (PDD) algorithm further allows us to explore more diverse 3D object creation by combining point cloud clustering algorithms and ADD.

[Figure 1: Sculpture generated by creative AI, PDD.]

Keywords
Partitioned DeepDream, Amalgamated DeepDream, 3D, Point Cloud, Sculpture, Art, Interactive Installation, Creative AI, Machine Learning

Introduction

Will Artificial Intelligence (AI) replace human artists, or will it show us a new perspective on creativity? Our team of artists and AI researchers explores artistic expression using Machine Learning (ML) and designs creative ML algorithms to be possible co-creators for human artists.

In terms of AI-generated and AI-enabled visual artwork, a good amount of exploration has been done over the past three years in the 2D image area, which traditionally belongs to the realm of painting. Meanwhile, there has been very little exploration in the area of 3D objects, which traditionally would belong to the realm of sculpture and could easily be extended into the area of art installations. Research on the creative generation of 3D objects by Lehman et al. successfully generated "evolved" forms; however, the final form was not far from the original form and could be said to merely mimic the original [22]. Another relevant study, Neural 3D Mesh Renderer [20], focuses on adjusting the rendered mesh based on DeepDream [28] textures. In the art field, artist Egor Kraft's Content Aware Studies project [11] explores the possibilities of AI to reconstruct lost parts of antique Greek and Roman sculptures. Nima et al. [27] introduced Vox2Net, adapted from the pix2pix GAN, which can extract the abstract shape of an input sculpture. DeepCloud [5] exploited an autoencoder to generate point clouds and constructed a web interface with analog input devices for users to interact with the generation process. Similar to the work of Lehman et al. [22], the results generated by DeepCloud are limited to previously seen objects.

In this paper, we introduce two ML algorithms to create original sculptural objects. As opposed to previous methods, our results do not mimic any given forms and are original and creative enough not to be categorized into the dataset categories. Our method extends an idea inspired by DeepDream for 2D images to 3D point clouds. However, simply translating the method generates low-quality objects with local and global sparsity issues that undermine the integrity of reconstruction by 3D printing. Instead, we propose Amalgamated DeepDream (ADD), which utilizes union operations during the process to generate both creative and realizable objects. Furthermore, we designed another ML generation algorithm, Partitioned DeepDream (PDD), to create more diverse objects by allowing multiple transformations to happen on a single object. With the aid of mesh generation software and 3D printing technology, the generated objects can be physically realized. In our latest artwork, they have been incorporated into an interactive art installation.
Creative AI: ADD and PDD

Artistic images created by AI algorithms have drawn great attention from artists. AI algorithms are also attractive for their ability to generate 3D objects, which can assist the process of sculpture creation. In graphics, there are many methods to represent a 3D object, such as meshes, voxels, and point clouds. In our paper, we focus on point cloud data obtained from two large-scale 3D CAD model datasets: ModelNet40 [38] and ShapeNet [9]. These datasets are preprocessed by uniformly sampling from the surface of the CAD models to attain the desired number of points. A point cloud is a compact way to represent a 3D object as a set of points on its external surfaces together with their coordinates, such as those shown in Figures 2 and 3. We introduce two algorithms, ADD and PDD, inspired by 2D DeepDream, to generate creative and realizable point clouds. In this section, we briefly discuss existing 3D object generation methods, then elaborate on the ADD and PDD algorithms and present some of the generated results.

[Figure 2: ModelNet40 [38]]
[Figure 3: ShapeNet [9]]

Learning to Generate Creative Point Clouds

Using deep learning to generate new objects has been studied in different data types, such as music [35], images [15], 3D voxels [39] and point clouds [1, 23]. An example of a simple generative model is shown in Figure 4. Here a low-dimensional latent space h of the original object space x is learned in accordance with the underlying probability distribution:

    p(x) = ∫ p(h) p(x|h) dh    (1)

[Figure 4: An example of a generative model.]

A discriminator is usually used to help distinguish the generated object x̂ from a real object x. Once the discriminator is fooled, the generator can create objects that look very similar to the original ones [15]. However, these generative models only learn to generate examples from the "same" distribution as the given training data, instead of learning to generate "creative" objects [13, 27].

One alternative is to decode the convex combination ĥ of the latent codes h1, h2 of two objects in an autoencoder to get a final object x̂. Specifically, an autoencoder is a kind of feedforward neural network used for dimensionality reduction, whose hidden units can be viewed as latent codes that capture the most important aspects of the object [30]. A representation of the process of encoding to and decoding from latent space can be seen in Figure 5.

    ĥ = t · h1 + (1 − t) · h2
    x̂ = Generator(ĥ),

where 0 ≤ t ≤ 1. Empirical evidence shows that decoding mixed codes usually produces semantically meaningful objects with features from the corresponding objects. This approach has also been applied to image creation [7]. One could adopt encoding-decoding algorithms for point clouds [40, 17, 23] based on the same idea. The sampling and interpolation results based on Li et al. [23] are shown in Figure 6.

[Figure 5: Encoding and decoding in latent space]
[Figure 6: Results of Li et al. [23]: (a)-(c) sampling, (d)-(f) interpolation.]

Inspiration: DeepDream for Point Clouds

In contrast to simple generative models and interpolations, DeepDream leverages trained deep neural networks by enhancing patterns to create psychedelic and surreal images. For example, in Figure 7 the neural network detects features in the input images similar to the object classes of bird, fish, and dog, and then exaggerates these underlying features. Given a trained neural network fθ and an input image x, DeepDream aims to modify x to maximize (amplify) fθ(x; a), where a is an activation function of fθ.
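The gradient-ascent idea just described can be sketched in a few lines. The snippet below is a minimal NumPy illustration, not the paper's implementation: `toy_classifier` is a stand-in for a trained point-cloud network fθ (here just a linear map on the mean of the points, so its gradient can be written in closed form), and `W`, `b`, and the learning rate are illustrative assumptions.

```python
import numpy as np

def toy_classifier(points, W, b):
    """Stand-in for a trained point-cloud classifier f_theta.
    Maps an (N, 3) point cloud to class activations via the mean point."""
    return W @ points.mean(axis=0) + b          # shape: (num_classes,)

def deepdream_step(points, W, b, class_idx, lr=0.01):
    """One DeepDream-style update: ascend the gradient of the chosen
    class activation with respect to the input points. For this linear
    toy model, the gradient is W[class_idx] / N for every point."""
    n = points.shape[0]
    grad = np.tile(W[class_idx] / n, (n, 1))    # d activation / d points
    return points + lr * grad
```

Iterating this step amplifies whatever features of the chosen class the network already detects in the input; with a real trained network the gradient would come from automatic differentiation rather than a closed form.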
After this process, x is expected to display some features that are captured by fθ(x; a). Algorithmically, DeepDream iteratively modifies x via a gradient update with a certain learning rate γ:

    x_t = x_{t−1} + γ ∇x fθ(x_{t−1}; a).    (2)

[Figure 7: 2D DeepDream visualized]

An extension of DeepDream is to use a classification network as fθ and replace a with the outputs of the final layer corresponding to certain labels. This is related to adversarial attacks [33] and unrestricted adversarial examples [6, 24].

Unfortunately, directly extending the idea of DeepDream to point clouds with neural networks for sets [29, 41] results in undesirable point clouds suffering from both local and global sparsity, as shown in Figure 8. To be specific, iteratively applying the gradient update without any restrictions creates local holes in the surface of the object. Besides, the number of input points per object is limited by the classification model. Consequently, the generated object is not globally dense enough to transform into a mesh structure, let alone be realized in the physical world.

[Figure 8: Naive DeepDream with undesirable sparse surface.]

To avoid local sparsity in the generated point cloud, one compromise is to run the naive 3D point cloud DeepDream with fewer iterations or a smaller learning rate, but it re-

For an object X with any number of points, we first randomly partition it into several subsets with the number of points required by the classification model, and then modify each subset through the gradient update (2). In the end, we amalgamate all the transformed subsets into a final object. By doing this we can generate objects as dense as we want, as long as the input object allows. To solve the local sparsity, when running the gradient update (2) for each subset we amalgamate the transformed point cloud with the input point cloud after every 10 iterations.

Algorithm 1: Amalgamated DeepDream (ADD)
    input : trained fθ and input X
    output: generated object X̂
    for x = X_0, ..., X_n do
        x_0 = x
        for t = 1 ... T do
            x̂ = x_{t−1} + γ ∇x fθ(x_{t−1}; a)
            x_t = x̂
            if t is divisible by 10 then
                x_t = x_t ∪ x
                down-sample x_t to fix the number of points
            end
        end
        X̂ = X̂ ∪ x_T
    end

[Figure 9: Amalgamated input for ADD: (a) union of airplane and bottle; (b) union of cone, vase and lamp.]
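The partition-dream-amalgamate loop of Algorithm 1 can be sketched as follows. This is a NumPy sketch under stated assumptions, not the authors' code: `dream_step` stands in for one application of the gradient update (2) with any trained classifier, and the subset size, iteration count, and random down-sampling strategy are illustrative choices.

```python
import numpy as np

def add_generate(X, dream_step, subset_size=1024, T=100, rng=None):
    """Sketch of Amalgamated DeepDream (ADD), following Algorithm 1:
    partition X into fixed-size subsets, dream on each subset, take the
    union with the input every 10 iterations and down-sample, then
    amalgamate all transformed subsets into the final object."""
    if rng is None:
        rng = np.random.default_rng()
    # Randomly partition X into subsets of the size the classifier expects.
    idx = rng.permutation(len(X))
    parts = [X[idx[i:i + subset_size]]
             for i in range(0, len(X) - subset_size + 1, subset_size)]
    out = []
    for x0 in parts:
        x = x0
        for t in range(1, T + 1):
            x = dream_step(x)                      # gradient update, Eq. (2)
            if t % 10 == 0:
                x = np.concatenate([x, x0])        # union with the input subset
                keep = rng.choice(len(x), size=subset_size, replace=False)
                x = x[keep]                        # down-sample to a fixed size
        out.append(x)
    return np.concatenate(out)                     # amalgamate all subsets
```

Because every subset keeps a fixed number of points, the amalgamated output is as dense as the input allows, and the periodic union with the input subset fills the local holes that naive dreaming would otherwise open in the surface.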