Bridging Generative Deep Learning and Computational Creativity
Sebastian Berns¹ and Simon Colton¹,²
¹Game AI Group, EECS, Queen Mary University of London, United Kingdom
²SensiLab, Faculty of IT, Monash University, Australia

Abstract

We aim to help bridge the research fields of generative deep learning and computational creativity by way of the creative AI community, and to advocate the common objective of more creatively autonomous generative learning systems. We argue here that generative deep learning models are inherently limited in their creative abilities because of a focus on learning for perfection. To highlight this, we present a series of techniques which actively diverge from standard usage of deep learning, with the specific intention of producing novel and interesting artefacts. We sketch out some avenues for improvement of the training and application of generative models and discuss how previous work on the evaluation of novelty in a computational creativity setting could be harnessed for such improvements. We end by describing how a two-way bridge between the research fields could be built.

Introduction

Methods in generative deep learning have become very good at producing high-quality artefacts with much cultural value. In particular, Generative Adversarial Networks (GANs) (Goodfellow et al. 2014) provide many exciting opportunities for image generation. Over the course of only a few years, research contributions have pushed models from generating crude low-resolution images to ones evoking visual indeterminacy, i.e., images which "appear to depict real scenes, but, on closer examination, defy coherent spatial interpretation" (Hertzmann 2019), and even further, to generating images of human faces indistinguishable from digital photographs (Karras, Laine, and Aila 2019). These are just a few of the many applications of generative deep learning around which the notion of 'creative AI' has emerged.

Closely following the fast-paced research on neural networks and generative models in particular, an online community has formed under the hashtag #CreativeAI (Cook and Colton 2018), one that has been particularly eager and successful in exploring unconventional applications and has established its place in workshops at major conferences with a focus on artificial neural networks. With extensive knowledge and experience in the development, application and evaluation of machine creativity, the computational creativity community can contribute to this progress by laying out potential ways forward. We aim here to build a bridge between generative deep learning and computational creativity by way of the creative AI community, and we propose avenues for improvements and cross-community engagement.

In the section below, we make a case for generative models as a successful and powerful technology which is inherently limited in its creative abilities by its uni-dimensional objective of perfection. The following section discusses how, in spite of its limitations, GANs have been used and abused as artwork production engines. We then explore how computational creativity research can contribute to further evolve such models into more autonomous creative systems, looking specifically at novelty measures as a first step towards this goal. We conclude by returning to the notion of bridging the two fields and describing future work.

Learning for Perfection

While the purpose of GANs, like all generative models, is to accurately capture the patterns in a data set and model its underlying distribution, guaranteeing convergence for this particular method remains a challenge (Lucic et al. 2018). Theoretical analyses of the GAN training objective suggest that the models fall significantly short of learning the target distribution and may not have good generalisation properties (Arora et al. 2017). It has further been suggested that GANs in particular might be better suited to purposes other than distribution learning. Given their high-quality output and wide artistic acceptance, we argue for the adaptation of this generative approach for computational creativity purposes.

Generative models are currently only good at producing 'more of the same': their objective is to approximate the data distribution of a given data set as closely as possible. This highlights two sides of the same fundamental issue. First, in practice it remains unclear whether models with millions of parameters simply memorise and reproduce training examples. Performance monitoring through a hold-out test set is rarely applied, and overfitting in generative models is not widely studied. Second, conceptually, such models are of only limited interest for creative applications if they produce artefacts that are insignificantly different from the examples used in training. Hence we further argue for an adaptation such that generative capabilities align with the objectives of computational creativity: to take on creative responsibilities, to formulate their own intentions, and to assess their output independently (Colton and Wiggins 2012).
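For concreteness, this 'learning for perfection' can be stated formally. In the standard formulation of Goodfellow et al. (2014), the generator G and discriminator D play the minimax game

```latex
\min_G \max_D \;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big],
```

whose global optimum is attained exactly when the generator's distribution matches the data distribution, i.e., when the model reproduces the given data as faithfully as possible ('more of the same').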
Figure 1: Example results from cross-domain training, i.e., fine-tuning StyleGAN with the Flickr-Faces-HQ data set (Karras, Laine, and Aila 2019) and a custom beetle data set. Reproduced with permission from M. Mariansky.¹

¹Tweet by @mmariansky: https://twitter.com/mmariansky/status/1226756838613491713

Figure 2: Samples from Broad, Leymarie, and Grierson (2020) of StyleGAN fine-tuned with a negated loss function. In its state of 'peak uncanny' the model started to diverge but has not yet collapsed into a single unrecognisable output.

Active Divergence

In order to produce artefacts in a creative setting, GANs still require expert knowledge and major interventions. Artists use a variety of techniques to explore, break and tweak, or otherwise intervene in the generative process. The following is a brief overview of some of those techniques. From a purely machine learning perspective, these exploits and accidents would be considered abuses and produce only sub-optimal results. Actively diverging from local likelihood maxima in a generator's internal representation is necessary to find those regions that hold sub-optimal but interesting and novel encodings.

Latent space search is a common practice among GAN artists, in which a neural network's internal representation is explored for interesting artefacts. Traversing from one point to another produces morphing animations, so-called 'latent space walks' (a minimal code sketch of such a walk follows this overview). The space is often surveyed manually. Wherever precise evaluation criteria are available, evolutionary algorithms can be employed to automate the search for artefacts that satisfy a given set of constraints (Fernandes, Correia, and Machado 2020).

Cross-domain training forcefully mixes two (or more) training sets of the same type but different depictions, such that a model is first fit to the images from one domain (e.g. human faces) and then fine-tuned on another (e.g. beetles). The resulting output combines features of both into cross-over images (fig. 1). Finding the right moment to stop fine-tuning is crucial, and human supervision in this process is currently indispensable.

Loss hacking intervenes at the training stage of a model, where the generator's loss function is manipulated in a way that diverts it towards sub-optimal (with respect to the traditional GAN training objective) but interesting results. Given a model that generates human faces, for example, the loss function can be negated in a fine-tuning process such that it produces faces that the discriminator believes are fake (fig. 2; Broad, Leymarie, and Grierson 2020); a minimal sketch of this idea also follows this overview. Again, human supervision and curation of the results is just as important as devising the initial loss manipulation.

Early stopping and rollbacks are necessary whenever a model becomes too good at the task it is being optimised for. Akin to the pruning of decision trees as a regularisation method, or to focusing on sub-optimal (in terms of fitness functions) artefacts produced by evolutionary methods, rollbacks can improve generalisation, resulting in artefacts that are unexpected rather than perfect.
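As a minimal illustration of the latent space walks mentioned above, the following sketch interpolates linearly between two latent vectors and decodes each intermediate point. It is illustrative only: the generator here is a random stand-in rather than a real pretrained model such as StyleGAN, and the latent dimensionality and frame count are arbitrary assumptions.

```python
import numpy as np

def latent_walk(generator, z_start, z_end, n_frames=16):
    """Linearly interpolate between two latent vectors and decode each
    intermediate point, yielding the frames of a 'latent space walk'."""
    frames = []
    for t in np.linspace(0.0, 1.0, n_frames):
        z = (1.0 - t) * z_start + t * z_end   # straight line in latent space
        frames.append(generator(z))           # decode the latent point into an image
    return frames

# Stand-in generator: a fixed random projection followed by tanh, reshaped
# to a 64x64 'image'. A real application would use a pretrained GAN here.
rng = np.random.default_rng(0)
W = rng.normal(size=(64 * 64, 128))
toy_generator = lambda z: np.tanh(W @ z).reshape(64, 64)

z_a, z_b = rng.normal(size=128), rng.normal(size=128)
walk = latent_walk(toy_generator, z_a, z_b)   # 16 frames morphing from z_a to z_b
```

In practice the end points are typically chosen by manual survey, exactly as described above, and spherical rather than linear interpolation is often preferred for Gaussian latent spaces.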
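The loss hacking technique can be sketched just as compactly. The loop below is an illustrative toy, not the actual setup of Broad, Leymarie, and Grierson (2020): the small fully connected generator and discriminator, the optimiser settings and the step count are placeholder assumptions, included only to show where the sign of the standard generator loss is flipped.

```python
import torch
import torch.nn as nn

# Placeholder networks; in practice these would be the components of a
# pretrained GAN (e.g. StyleGAN) loaded from a checkpoint.
generator = nn.Sequential(nn.Linear(128, 256), nn.ReLU(),
                          nn.Linear(256, 64 * 64), nn.Tanh())
discriminator = nn.Sequential(nn.Linear(64 * 64, 256), nn.ReLU(),
                              nn.Linear(256, 1))

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(100):                                # short fine-tuning run
    z = torch.randn(32, 128)
    pred = discriminator(generator(z))
    standard_loss = bce(pred, torch.ones_like(pred))   # usual objective: fool the discriminator
    loss = -standard_loss                              # negated: actively diverge from 'realistic' outputs
    opt_g.zero_grad()
    loss.backward()
    opt_g.step()
```

As noted above, such a run has no natural stopping point: a human has to judge when the outputs are at their most interesting and roll back before the model collapses into unrecognisable output.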
Summary

All of the above techniques require manual interventions that rely on human action and personal judgement. There are no well-defined general criteria for how much to intervene, at which point, or when to stop. It is central to an artistic practice to develop such standards, nurturing their individuality and their difference from other practices. A major theme in GAN art, however, and a commonality in the above non-standard usages, is the active divergence from the original objective of the tool of the trade, in pursuit of novelty and surprise. This dynamic appears to be, in contrast to other artistic disciplines, exceptionally pronounced due to the use of state-of-the-art technology that has yet to find its definite place and purpose and whose capabilities are open to be explored. We celebrate and support this endeavour and argue that computational creativity can help by pushing generative models further, towards new objectives.

New Objectives for Generative Models

Two avenues of improvement for generative deep learning come to mind in a creative setting. First, we can consider creativity support tools, where a person is in charge of the creative process and the technology takes on an assistive role. For this, generative models need to be more accessible, as active divergence techniques still require highly [...]

[...] a limited range (Brock, Donahue, and Simonyan 2019). On the other end, a temperature parameter can be applied to scale a network's softmax output (Feinman and Lake 2020). Both improve the quality of individual artefacts at the cost of sample diversity. While the original intention is to decrease randomness in order to obtain artefacts closer to the mean, they may also be able to achieve the inverse. Neural network-based methods have been proposed for the generation of novel artefacts, e.g., CAN (Elgammal et al.