Learning to Generate and Edit Hairstyles
Total Page:16
File Type:pdf, Size:1020Kb
Learning to Generate and Edit Hairstyles Weidong Yin1, Yanwei Fu1;y, Yiqiang Ma1 Yu-Gang Jiang2, Tao Xiang3, Xiangyang Xue1;2 1School of Data Science, Fudan University; 2School of Computer Science, Fudan University; 3 Queen Mary University of London , y Corresponding author. Email: [email protected] ABSTRACT One of the reasons is that there are large variations in hairstyles Modeling hairstyles for classication, synthesis and image edit- and in order to model these variations, large-scale datasets are ing has many practical applications. However, existing hairstyle needed. Unfortunately, such a large-scale hairstyle dataset does not datasets, such as the Beauty e-Expert dataset, are too small for exist. In Multimedia and computer vision communities, hairstyles developing and evaluating computer vision models, especially the are often labeled as attributes for face datasets. However, such recent deep generative models such as generative adversarial net- annotation is often crude, focusing mostly hair length and color. work (GAN). In this paper, we contribute a new large-scale hairstyle On the other hand, existing specialized hairstyle datasets such as dataset called Hairstyle30k, which is composed of 30k images con- Beauty e-Expert dataset [18] are too small to represent the diversity taining 64 dierent types of hairstyles. To enable automated gener- of human hairstyles in the wild. ating and modifying hairstyles in images, we also propose a novel In this paper, we introduce the rst large-scale hairstyle dataset GAN model termed Hairstyle GAN (H-GAN) which can be learned – Hairstyle30K to the community and hope that this will greatly eciently. Extensive experiments on the new dataset as well as boost the research into hairstyle modeling. Images in the dataset existing benchmark datasets demonstrate the eectiveness of pro- (see Fig. 1 for examples) are collected from the Web via search posed H-GAN model. engines using keywords corresponding a hairstyle ontology. This results in 64 dierent types of hairstyles in 30K images. On average, KEYWORDS each hairstyle class has around 480 images. The newly proposed dataset is used to train the H-GAN model proposed in this paper. Hairstyle dataset, Hairstyle Classication, Generative Adversarial Importantly, with 64 hairstyle classes, this is a ne-grained dataset Networks presenting a challenging recognition task, as veried by our exper- ACM Reference format: iments. 1 1;y 1 2 3 Weidong Yin , Yanwei Fu , Yiqiang Ma and Yu-Gang Jiang , Tao Xiang , Apart from releasing a new dataset, we also present a Hairstyle 1;2 Xiangyang Xue . 1997. Learning to Generate and Edit Hairstyles. In Pro- Generative Adversarial Network (H-GAN) model for automati- ceedings of ACM Conference, Washington, DC, USA, July 2017 (Conference’17), cally generating or modifying/editing hairstyles given an input 9 pages. image. Our H-GAN has three components: an encoder-decoding DOI: 10.475/123_4 sub-network, a GAN and a recognition subnetwork. Particularly, the encoder-decoding network is a variant of Variational Auto- 1 INTRODUCTION Encoders (VAE) [12]; the recognition sub-network shares the same Hairstyle can express one’s personalities, self-condence, and at- networks as the discriminator of GAN as in InfoGAN [5]. The titudes. It is thus an important aspect of personal appearance. A model is unique in that once trained, it can be used to perform computer vision model that enables recognition, synthesis, and various tasks including recognition, synthesis and modication. modication of hairstyles in images is of great practical use. For ex- Extensive experiments of our H-GAN algorithm on the proposed ample, with such as model, customer can take a photo of him/herself dataset and other general-purpose benchmark datasets validate the and then synthesize dierent hairstyles before going to the hair- ecacy of our model. dresser’s to make the most satisfactory one a reality. In addition, an Contributions. We make several contributions in this paper. Firstly, automated hairstyle recognition model can be used for recognizing to study the hairstyle related problems, we contribute a new large- person’s identity for security applications. scale hairstyle dataset – Hairstyle30k to the community. To the best Existing eorts on hairstyle modeling have been focused on rec- of our knowledge, this is the largest hairstyle dataset, especially in ommending the most suitable hairstyles [18], or interactively users’ terms of the number of hairstyle classes. Secondly, we present a editing [7, 22, 32]. However, there is no attempt so far to systemat- new deep generative model called – H-GAN which can eectively ically study hairstyles in images and no model available that can and eciently generate and modify the hairstyles of person images. address various hairstyle modeling task in a comprehensive manner. Extensive experiments demonstrate that our H-GAN is superior to a number of state-of-the-art alternative models. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for prot or commercial advantage and that copies bear this notice and the full citation 2 RELATED WORK on the rst page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). 2.1 Image Editing and Synthesis Conference’17, Washington, DC, USA © 2016 Copyright held by the owner/author(s). 123-4567-24-567/08/06...$15.00 Editing image with interaction. Recent advances in interactive DOI: 10.475/123_4 image segmentation have signicantly simplied the tasks of object Conference’17, July 2017, Washington, DC, USA Yin et al. Figure 1: Examples of our Hairstyle30K dataset with corresponding hairstyle class labels. segmentations [3, 4, 16, 25, 38]. Existing interactive segmentation information between attributes and generated images to generate approaches such as lazy snapping [16] and grab cut [25] as well as images with specied attributes. recent Generative Adversarial Networks (GAN) related methods [44] enable the users to achieve good quality object cutout with 2.2 Attribute Analysis a few of strokes. In comparison, the existing eorts on hairstyles Attribute-based people search. Hairstyles can be considered as and makeup editing are very primitive [7, 22, 32]. In theory, these a special type of person attributes. Such attributes can be used in interactive image editing works can be used for editing hairstyles. the applications of surveillance environments, including but not However, it can be tedious and time-consuming to manually modify limited in attribute-based people search [29, 34, 36] and person hairstyles via user-interaction. Fully automated image editing thus re-identication [15, 21, 31, 42]. Recent studies on person attributes becomes desirable. are focused on clothing. These include clothes recognition [37], Automated image editing. There are some recent eorts on fully clothing parsing [17] as well as clothing retrieval [2, 41]. The study automated images editing [10, 24, 28, 43]. In particular, two recent presented in this paper complements the existing clothing oriented studies [24, 28] propose approaches to modify the attribute of facial attribute analysis. images. The proposed H-GAN is an automated image editing model Face attribute analysis. It is another important research topic but focuses on hairstyle images. related to our hairstyle analysis. Facial attribute analysis was rst Image Editing and Synthesis. Our work is also related to previ- studied by Kumar et al. [13]. In terms of dierent visual features ous work on joint image editing and synthesis [10, 14, 24, 28, 39]. and distinctive learning paradigm, facial attribute analysis has been Shen et al. [28] manipulated the facial attributes by a GAN-based developed into three categories: (1) the methods using hand-crafted image transformation networks; nevertheless each trained model visual features [13], such as SIFT [20] and LBP [23]; (2) the meth- in [28] can only modify one special type of facial attribute im- ods utilizing the recent deep features [19, 36]; and (3) multi-task ages. In contrast, one trained model of our H-GAN can modify all methods for learning facial attribute [1, 6, 26]. hairstyles presented in the training data. In model proposed by [39] , conditional variational auto-encoder is used to generate facial 3 DATASET COLLECTION images of dierent attributes; due to the lack of the adversarial loss, Our hairstyle30k dataset is designed for studying the problem of the generative images are often blurry. In VAEGAN[14], VAE is hairstyle classication as well as other hairstyle related tasks includ- combined with GAN to generate more realistic image. However, ing synthesis and editing. To construct the dataset, we had down- compared to our H−GAN, it does not use attribute information and loaded more than 1 million images using various web search engines modication is achieved by calculating ¨residual attribute vector¨. (Google, Flicker, and Bing, etc) with hairstyle related search words, The main dierence between VAEGAN and H_GAN is thus the e.g. Beehive hairstyle. The full set of class names of hairstyles are fact that we add attribute information explicitly into the generator listed in Fig. 1. The initial downloaded images were rstly ltered so that we can specify dierent attributes that we want to change. by face detection algorithm. We subsequently pruned some irrele- Also we introduce a recognition network to maximize the mutual vant or erroneous images which has neither faces nor hairstyles. We then manually ltered out the irrelevant images for some hairstyles Learning to Generate and Edit Hairstyles Conference’17, July 2017, Washington, DC, USA that come without faces, e.g., Ducktail. We carefully annotated the the parameters of G and D with the loss functions LD = LGAN pruned images and classied them into dierent hairstyle classes. and LG = −LGAN for generator and discriminator respectively. Finally, we obtained 30k images with 41 types of male hairstyles and The generator can draw a sample z ∼ pprior ¹zº = N ¹0; 1º and 42 types of female hairstyles.