Experimenting with Analogy for Lyrics Transformation
Total Page:16
File Type:pdf, Size:1020Kb
WeirdAnalogyMatic: Experimenting with Analogy for Lyrics Transformation Hugo Gonc¸alo Oliveira CISUC, Department of Informatics Engineering University of Coimbra, Portugal [email protected] Abstract Our main goal is thus to explore to what extent we can rely on word embeddings for transforming the semantics This paper is on the transformation of text relying mostly on of a poem, in such a way that its theme shifts according a common analogy vector operation, computed in a distribu- to the seed, while text remains syntactically and semanti- tional model, i.e., static word embeddings. Given a theme, cally coherent. Transforming text, rather than generating it original song lyrics have their words replaced by new ones, related to the new theme as the original words are to the orig- from scratch, should help to maintain the latter. For this, we inal theme. As this is not enough for producing good lyrics, make a rough approximation that the song title summarises towards more coherent and singable text, constraints are grad- its theme and every word in the lyrics is related to this theme. ually applied to the replacements. Human opinions confirmed Relying on this assumption and recalling how analogies can that more balanced lyrics are obtained when there is a one-to- be computed, shifting the theme is a matter of computing one mapping between original words and their replacement; analogies of the kind ‘what is to the new theme as the origi- only content words are replaced; and the replacement has the nal title is to a word used?’. same part-of-speech and rhythm as the original word. However, we soon noticed that text resulting from the ex- clusive application of the analogy operation had a series of Introduction issues. Therefore, we describe some constraints introduced towards better lyrics, e.g, to guarantee that functional words Materialisations of human creativity rarely start from are not changed, syntax is coherent, or the original metre is scratch. Consciously or not, artists are inspired by what they kept. Yet, although more constraints lead to better structure experience, including other artists and their creations. This and singability, they lower the chance of selecting related is also true in the scope of Computational Creativity, where words, with a negative impact on the theme shift. To analyse many systems rely on an inspiration set. When it comes to the impact of different constraints on aspects like grammar, linguistic creativity, poetry generation systems may rely on semantics, novelty or singability, a selection of results with a corpus of human-created poems where templates (Toiva- different constraints was subjected to a human evaluation, nen et al. 2012; Colton, Goodwin, and Veale 2012) or which suggested that there should be a one-to-one mapping language models (Potash, Romanov, and Rumshisky 2015; between original words and their replacement, only content Yan 2016) are acquired from; the initial population of a ge- words should be replaced, and the replacement must have netic algorithm is derived from (Hamal¨ ainen¨ and Alnajjar the same part-of-speech and rhythm as the original word. 2019b); or a poem is selected and transformed to meet de- Although our experiments were performed in song lyrics, sired constraints (Bay, Bodily, and Ventura 2017). this would work similarly in any kind of poetry, or other We also follow a transformation approach, mostly sup- textual genres. ported by analogy, for producing new text. Briefly, given an The proposed approach constitutes the engine of a system original text, in this case, lyrics of a known song, and a word for lyrics transformation, which we baptised as WeirdAnal- representing a new theme, we compute analogies on a dis- ogyMatic (WAM) because the obtained results could poten- tributional semantics space and shift the theme of the lyrics tially be followed in the creation of parodies from known by replacing some of its words according to computed analo- songs, popularised by artists such as Weird Al Yankovic – gies. For this, we apply a common method for solving analo- e.g., with hits like Eat it (transformation of Michael Jack- gies of the kind ‘what is to b as a is to a0?’, known as the vec- son’s Beat it), Smells Like Nirvana (based on Nirvana’s tor offset or 3CosAdd, and popularised for assessing tradi- Smells Like Teen Spirit), or Like a Surgeon (based on tional models of word embeddings, like word2vec (Mikolov Madonna’s Like a Virgin). This kind of parody has also fea- et al. 2013) or GloVe (Pennington, Socher, and Manning tured several comedy TV shows (e.g., Saturday Night Live) 2014), both known for keeping syntactic and semantic reg- and advertising campaigns (e.g., These Bites are Made for ularities. Analogies are solved with the following operation Poppin’ a 2006 Pizza Hut ad by Jessica Simpson for Super on the vectors of the involved words ~a a~ +~b b~ (e.g., a − 0 ⇡ 0 Bowl, which is a transformation of These Boots are Made common example is king~ man~ + woman~ q u~ e e n ). for Walkin’; or the 2000 TV ad for Mountain Dew, a trans- − ⇡ Proceedings of the 11th International Conference on Computational Creativity (ICCC’20) 228 ISBN: 978-989-54160-2-8 formation of Queen’s Bohemian Rhapsody). All of those lyrics are somehow related to a theme, which we approxi- examples suggest that attempting at the automation of this mate by the song title. The paper is focused on experiments creation procedure may be worth. and necessary workarounds for taking advantage of analogy The remainder of the paper briefly overviews different ap- and still have a result that is not only syntactically and se- proaches for poetry and song lyrics generation, with a focus mantically coherent, but also singable. on those that, along the way, exploit word embeddings. We then describe our approach and illustrate with the result of adding more constraints, step-by-step. Before concluding, Step-by-Step Approach we present the results of the evaluation survey, together with Our goal is to transform a given text, so that it is still mean- examples of the most and least appreciated lyrics. ingful, but its semantics shifts to a new theme tn, given by a single word. For this, we assume that every word wo in Related Work the original text is somehow related to a fixed meaning in a Poetry generation has long been a research topic in Com- distributional space, seen as the original theme to. We then putational Creativity, with much work during the last 20 rely on analogy for computing new words wn for replacing years (Gonc¸alo Oliveira 2017). A prominent approach is each wo. In our experiments, we use song lyrics and make the generation based on templates, instantiated by simi- the rough approximation that to can be obtained from the les (Colton, Goodwin, and Veale 2012), instances of other song title1, i.e., we use a model of distributional semantics relations (Gonc¸alo Oliveira and Oliveira Alves 2016), or by for computing to as the weighted average of the vector of replacing certain words with others, with the same part-of- all content words in the title. Since, at least in the tested speech (PoS) (Agirrezabal et al. 2013), or associated to a models, words are ordered according to their frequency in target subject (Toivanen et al. 2012). While templates gener- the training corpus, we used their index in the model as their ally guarantee that syntactic rules are met, towards semantic weight. This can be seen as a cheap approximation to word coherence, poetry generators often have to rely on a model relevance, because more frequent words (i.e., less relevant) of semantics. For this, semantically-related words can be will have a lower index, thus lower weight, while less fre- acquired from semantic networks (Agirrezabal et al. 2013; quent ones (i.e., more relevant) will have a higher index. Gonc¸alo Oliveira and Oliveira Alves 2016), models of word To wrap it up, we assume that every word wo in the orig- associations (Toivanen et al. 2012), or of distributional se- inal lyrics is to the theme to as a new word wn is to a new mantics, such as word embeddings (Ghazvininejad et al. theme tn. So, once a new theme tn is selected, we can, for 2016; Ham¨ al¨ ainen¨ and Alnajjar 2019a). every wo, apply the 3CosAdd analogy solving method to the Alternative approaches to text generation, including cre- vectors of the involved words, and compute wn as follows: ative text, are based on language models, which can w = w t + t . n o − o n be learned from large corpora with recurrent neural net- Yet, we soon realised that following this with no ad- works (Yan 2016), often with LSTM layer(s) (Potash, Ro- ditional constraints resulted in text that was both hard to manov, and Rumshisky 2015). Yet, recently, the genera- sing and ungrammatical. For minimising those issues, some tion of different kinds of text has been attempted with larger constraints were added to the process of lyrics transforma- transformer-based language models, like GPT-2 (Radford et tion. Such constraints are thoroughly described in this sec- al. 2019), fine-tuned for a specific domain. In any of the tion, with their impact illustrated by results obtained. Dif- previous, the first step is to learn word embeddings from a ferent models of word embeddings were tested, but all re- corpus on the target domain. sults reported were obtained with the Stanford GloVe word Not so different from template-based, one last alternative vectors2 (Pennington, Socher, and Manning 2014), with for producing new text is to start with a single original text 300 dimensions, pre-trained in a corpus of 6B tokens from and replace some of its words towards the desired intent.