
DopeLearning: A Computational Approach to Rap Lyrics Generation∗

Eric Malmi, Aalto University and HIIT, Espoo, Finland (eric.malmi@aalto.fi)
Pyry Takala, Aalto University, Espoo, Finland (pyry.takala@aalto.fi)
Hannu Toivonen, University of Helsinki and HIIT, Helsinki, Finland ([email protected])
Tapani Raiko, Aalto University, Espoo, Finland (tapani.raiko@aalto.fi)
Aristides Gionis, Aalto University and HIIT, Espoo, Finland (aristides.gionis@aalto.fi)

ABSTRACT

Writing rap lyrics requires both creativity to construct a meaningful, interesting story and lyrical skills to produce complex rhyme patterns, which form the cornerstone of good flow. We present a rap lyrics generation method that captures both of these aspects. First, we develop a prediction model to identify the next line of existing lyrics from a set of candidate next lines. This model is based on two machine-learning techniques: the RankSVM algorithm and a deep neural network model with a novel structure. Results show that the prediction model can identify the true next line among 299 randomly selected lines with an accuracy of 17%, i.e., over 50 times more likely than by random. Second, we employ the prediction model to combine lines from existing songs, producing lyrics with rhyme and a meaning. An evaluation of the produced lyrics shows that in terms of quantitative rhyme density, the method outperforms the best human rappers by 21%. The rap lyrics generator has been deployed as an online tool called DeepBeat, and the performance of the tool has been assessed by analyzing its usage logs. This analysis shows that machine-learned rankings correlate with user preferences.

∗ When used as an adjective, dope means cool, nice, or awesome.
This is a pre-print of an article appearing at KDD'16 (arXiv:1505.04771v2 [cs.LG], 9 Jun 2016).

1. INTRODUCTION

Emerging from a hobby of African American youth in the 1970s, rap music has quickly evolved into a mainstream music genre with several artists frequenting Billboard top rankings. Our objective is to study the problem of computational creation of rap lyrics. Our interest in this problem is motivated by two different perspectives. First, we are interested in analyzing the formal structure of rap lyrics and in developing a model that can lead to generating artistic work. Second, with the increasing number of smart devices that we use on a daily basis, it is expected that the demand for systems that interact with humans in non-mechanical and pleasant ways will increase.

Rap is distinguished from other music genres by the formal structure present in rap lyrics, which makes the lyrics rhyme well and hence provides better flow to the music. Literature professor Adam Bradley compares rap with popular music and traditional poetry, stating that while popular lyrics lack much of the formal structure of literary verse, rap crafts "intricate structures of sound and rhyme, creating some of the most scrupulously formal poetry composed today" [5].

We approach the problem of lyrics creation from an information-retrieval (IR) perspective. We assume that we have access to a large repository of rap-song lyrics. In this paper, we use a dataset containing over half a million lines from the lyrics of 104 different rap artists. We then view the lyrics-generation problem as the task of identifying a relevant next line. We consider that a rap song has been partially constructed and treat the first m lines of the song as a query. The IR task is to identify the most relevant next line from a collection of candidate lines, with respect to the query. Following this approach, new lyrics are constructed line by line, combining lyrics from different artists in order to introduce novelty. A key advantage of this approach is that we can evaluate the performance of the generator by measuring how well it predicts existing songs. While conceptually one could approach the lyrics-generation problem by a word-by-word construction, so as to increase novelty, such an approach would require significantly more complex models and we leave it for future work.

Our work lies in the intersection between the areas of computational creativity and information retrieval. In our approach, we assume that users have a certain concept in their mind, formulated as a sequence of rap lines, and their information need is to find the missing lines, composing a song. Such an information need does not have a factual answer; nevertheless, users will be able to assess the relevance of the response provided by the system. The relevance of the response depends on factors that include rhyming, vocabulary, unexpectedness, semantic coherence, and humor. Tony Veale [26] illustrates other linguistically creative uses of information retrieval, e.g., for metaphor generation. He argues that phrases extracted from large corpora can be used as "readymade" or "found" objects, like objets trouvés in the arts, that can take on fresh meanings when used in a new context.

From the computational perspective, a major challenge in generating rap lyrics is to produce semantically coherent lines instead of merely generating complex rhymes. As Paul Edwards [9] puts it: "If an artist takes his or her time to craft phrases that rhyme in intricate ways but still gets across the message of the song, that is usually seen as the mark of a highly skilled MC."¹ As a result of record results in computer vision, deep neural networks [1] have become a popular tool for feature learning. To avoid hand-crafting a number of semantic and grammatical features, we introduce a deep neural network model that maps sentences into a high-dimensional vector space. This type of vector-space representation has attracted much attention in recent years and has exhibited great empirical performance in tasks requiring semantic analysis of natural language [19, 20].

While some of the features we extract from the analyzed lyrics are tailored for rap lyrics, a similar approach could be applied to generate lyrics for other music genres. Furthermore, the proposed framework could form the basis for several other text-synthesis problems, such as generation of text or conversation responses. Practical extended applications include the automation of tasks such as customer service, sales, or even news reporting.

¹ MC, short for master of ceremonies or microphone controller, is essentially a word for a rap artist.

Our contributions can be summarized as follows:

(i) We propose an information-retrieval approach to rap lyrics generation. A similar approach could be applied to other tasks requiring text synthesis.

(ii) We introduce several useful features for predicting the next line of a rap song and hence for generating new lyrics. In particular, we have developed a deep neural network model for capturing the semantic similarity of lines. This feature carries the most predictive power of all the features we have studied.

(iii) We present rhyme density, a measure for the technical quality of rap lyrics. This measure is validated with a human subject, a native-speaking rap artist.

(iv) We have built an online demo of the rap lyrics generator, openly available at deepbeat.org. The performance of the algorithm has been assessed based on the usage logs of the demo.

The rest of the paper is organized as follows. We start with a brief discussion of relevant work. Next, we discuss the domain of rap lyrics, introduce the dataset that we have used, and describe how rhymes can be analyzed. We proceed by describing the NextLine task and our approach to solving it, including the used features, the neural language model, and the results of our experiments. We apply the resulting model to the task of lyrics generation, showing also examples of generated lyrics. The final sections discuss these tasks and our model further, and present the conclusions of the work.

2. RELATED WORK

While the study of human-generated lyrics is of interest to academics in fields such as linguistics and music, artificial rhyme generation is also relevant for various subfields of computer science. Relevant literature can be found under the domains of computational creativity, information extraction, and natural language processing. Additionally, relevant methods can be found under the domain of machine learning, for instance, in the emerging field of deep learning.

Hirjee and Brown [12, 13] develop a probabilistic method, inspired by local-alignment protein homology detection algorithms, for detecting rap rhymes. Their algorithm obtains a high rhyme detection performance, but it requires a training dataset with labeled rhyme pairs. We introduce a simpler, rule-based approach in Section 3.3, which seemed sufficient for our purposes. Hirjee and Brown obtain a phonetic transcription for the lyrics by applying the CMU Pronouncing Dictionary [16], some hand-crafted rules to handle slang words, and text-to-phoneme rules to handle out-of-vocabulary words, whereas we use an open-source speech synthesizer, eSpeak, to produce the transcription. The computational generation of rap lyrics has been previously studied in [29, 28]. These works adopt a machine-translation approach, whereas we view it as an information-retrieval problem. Furthermore, we have deployed our lyrics generator as an openly accessible web tool to assess its performance in the wild.

Automated creation of rap lyrics can also be viewed as a problem within the research field of computational creativity, i.e., the study of computational systems which exhibit behaviors deemed to be creative [7]. According to Boden [4], creativity is the ability to come up with ideas or artifacts that are new, surprising, and valuable, and there are three different types of creativity. Our work falls into the class of "combinatorial creativity," where creative results are produced as novel combinations of familiar ideas. Combinatorial approaches have been used to create poetry before but were predominantly based on the idea of copying the grammar from existing poetry and then substituting content words with other ones [25].

In the context of web search, it has been shown that document ranking accuracy can be significantly improved by combining multiple features via machine-learning algorithms instead of using a single static ranking, such as PageRank [21]. This learning-to-rank approach is very popular nowadays and many methods have been developed for it [17]. In this paper, we use the RankSVM algorithm [14] for combining different relevance features for the next-line prediction problem, which is the basis of our rap-lyrics generator.

Neural networks have recently been applied to various related tasks. For instance, recurrent neural networks (RNNs) have shown promise in predicting text sequences [11, 23]. Other applications include tasks such as information extraction, information retrieval, and indexing [8]. Question answering has also been approached by using deep learning to map questions and answers to a latent semantic space, and then either generating a response [27] or selecting one [30]. Our neural network approach has some similarity to response selection as we also learn a mapping to a hidden space. On the other hand, our network architecture uses a feed-forward net, a suitable choice in our context where sentences are often relatively short and equal in length. Also, generation of lyrics is not typically considered as a response-selection task, but we interpret it as one.

3. ANATOMY OF RAP LYRICS

In this section, we first describe a typical structure of rap lyrics, as well as different rhyme types that are often used. This information will be the basis for extracting useful features for the next-line prediction problem, which we discuss in the next section. Then we introduce a method for automatically detecting rhymes, and we define a measure for rhyme density. We also present experimental results to assess the validity of the rhyme-density measure.

Table 1: A selection of popular rappers and their rhyme densities, i.e., their average rhyme lengths per word.

  Rank  Artist                  Rhyme density
  1.    Inspectah Deck          1.187
  2.    Rakim                   1.180
  3.    Redrama                 1.168
  30.   The Notorious B.I.G.    1.059
  31.   Lil Wayne               1.056
  32.                           1.056
  33.   2Pac                    1.054
  39.   Eminem                  1.047
  40.   Nas                     1.043
  50.   Jay-Z                   1.026
  63.   Wu-Tang Clan            1.002
  77.                           0.967
  78.   Dr. Dre                 0.966
  94.   The Lonely Island       0.870

3.1 Rhyming

Various rhyme types, such as perfect rhyme, alliteration, and consonance, are employed in rap lyrics, but the most typical rhyme type nowadays, due to its versatility, is the assonance rhyme [2, 9]. In a perfect rhyme, the words share exactly the same end sound, as in "slang – gang," whereas in an assonance rhyme only the vowel sounds are shared. For example, the words "crazy" and "baby" have different consonant sounds, but the vowel sounds are the same, as can be seen from their phonetic representations "kɹeIsi" and "beIbi."

An assonance rhyme does not have to cover only the end of the rhyming words but it can span multiple words, as in the example below (the rhyming part is highlighted). This type of assonance rhyme is called a multisyllabic rhyme.
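To make the rhyme notions above concrete, the following sketch shows one way to obtain phonetic transcriptions with eSpeak and compare the vowel skeletons of two words. It is a minimal illustration under assumptions, not the pipeline used in the paper: the espeak command-line flags and the vowel-symbol set are assumed and may need adjusting for a particular eSpeak version.

    import subprocess

    def transcribe(text):
        # Ask eSpeak for an IPA transcription without speaking it aloud.
        # (Flags assumed; espeak-ng exposes the same interface.)
        out = subprocess.run(["espeak", "-q", "--ipa", text],
                             capture_output=True, text=True)
        return out.stdout.strip()

    # A rough IPA vowel inventory (assumption); extend as needed.
    VOWELS = set("aeiouɪʊɛɔæʌəɜɐɒː")

    def vowel_skeleton(ipa):
        # Keep only vowel symbols, ignoring consonants, stress marks and spaces.
        return "".join(ch for ch in ipa if ch in VOWELS)

    if __name__ == "__main__":
        for word in ["crazy", "baby"]:
            ipa = transcribe(word)
            print(word, ipa, vowel_skeleton(ipa))
        # "crazy" and "baby" should share the same vowel skeleton, i.e. they
        # form an assonance rhyme even though the consonants differ.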

  "This is a job — I get paid to sling some raps,
   What you made last year was less than my income tax" [10]

It is stated in [10] that "[Multisyllabic rhymes] are hallmarks of all the dopest flows, and all the best rappers use them."

3.2 Song structure

A typical rap song follows a pattern of alternating verses and choruses. These in turn consist of lines, which break down to individual words and finally to syllables. A line equals one musical bar, which typically consists of four beats, setting limits to how many syllables can be fit into a single line. Verses, which constitute the main body of a song, are often composed of 16 lines [9].

Consecutive lines can be joined through rhyme, which is typically placed at the end of the lines but can appear anywhere within the lines. The same end rhyme can be maintained for a couple of lines or even throughout an entire verse. In the verses our algorithm generates, the end rhyme is kept fixed for four consecutive lines unless otherwise specified by the user (see Appendix A for an example).

3.3 Automatic rhyme detection

Our aim is to automatically detect multisyllabic assonance rhymes from lyrics given as text. For this, we first obtain a phonetic transcription of the lyrics by applying the text-to-phonemes functionality of the open-source speech synthesizer eSpeak.² The synthesizer assumes a typical American-English pronunciation. From the phonetic transcription, we can detect rhymes by finding matching vowel phoneme sequences, ignoring consonant phonemes and spaces.

3.3.1 Rhyme density measure

In order to quantify the technical quality of lyrics from a rhyming perspective, we introduce a measure for the rhyme density of the lyrics.³ A simplified description of the computation of this measure is provided below:

1. Compute the phonetic transcription of the lyrics and remove all but vowel phonemes.
2. Scan the lyrics word by word.
3. For each word, find the longest matching vowel sequence (i.e., a multisyllabic assonance rhyme) in the proximity of the word.
4. Compute the rhyme density by averaging the lengths of the longest matching vowel sequences of all words.

The rhyme density of an artist is computed as the average rhyme density of his or her (or its) songs. Intuitively speaking, rhyme density means the average length of the longest rhyme per word.
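The four steps above can be sketched in a few lines of code. The sketch below is a simplified reading of the measure (it only matches suffixes of single words and assumes a fixed proximity window), not the exact raplysaattori implementation; the window size is an assumption.

    def rhyme_density(vowel_words, window=15):
        # vowel_words: one vowel-phoneme string per word, in lyric order
        # (step 1, transcription and consonant removal, is assumed done).
        def longest_suffix_match(a, b):
            # Length of the longest common vowel suffix of two words, i.e.
            # the length of the assonance rhyme ending at these words.
            n = 0
            while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
                n += 1
            return n

        total = 0
        for i, w in enumerate(vowel_words):                 # step 2: scan words
            nearby = vowel_words[max(0, i - window):i]
            best = max((longest_suffix_match(w, v) for v in nearby), default=0)
            total += best                                   # step 3: longest match nearby
        return total / max(1, len(vowel_words))             # step 4: average over words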
3.3.2 Data

We compiled a list of 104 popular English-speaking rap artists and scraped all their songs available on a popular lyrics website. In total, we have 583 669 lines from 10 980 songs.

To make the rhyme densities of different artists comparable, we normalize the lyrics by removing all duplicate lines within a single song, as in some cases the lyrics contain the chorus repeated many times, whereas in other cases they might just have "Chorus 4X," depending on the user who has provided the lyrics. Some songs, like intro tracks, often contain more regular speech rather than rapping, and hence, we have removed all songs whose title contains one of the following words: "intro," "outro," "skit," or "interlude."

3.3.3 Evaluating human rappers' rhyming skills

We initially computed the rhyme density for 94 rappers, ranked the artists based on this measure, and published the results online [18]. An excerpt of the results is shown in Table 1.

Some of the results are not too surprising; for instance Rakim, who is ranked second, is known for "his pioneering use of internal rhymes and multisyllabic rhymes."⁴ On the other hand, a limitation of the results is that some artists, like Eminem, who use a lot of multisyllabic rhymes but construct them often by bending words (pronouncing words unusually to make them rhyme), are not as high on the list as one might expect. In Figure 1, we show the distribution of rhyme densities along with the rhyme density obtained by our lyrics generator algorithm, DeepBeat (see Section 5.3 for details).
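A minimal sketch of the normalization described above: removing duplicate lines within a song and dropping non-rap tracks by title. The (title, lines) input format is hypothetical; the actual scraping and cleaning code is not part of the paper.

    SKIP_WORDS = ("intro", "outro", "skit", "interlude")

    def normalize_songs(songs):
        """songs: iterable of (title, lines) pairs; returns cleaned pairs."""
        cleaned = []
        for title, lines in songs:
            # Drop regular-speech tracks such as intros and skits.
            if any(w in title.lower() for w in SKIP_WORDS):
                continue
            # Remove duplicate lines within a single song (e.g. repeated
            # choruses) so that artists' rhyme densities are comparable.
            seen, unique = set(), []
            for line in lines:
                key = line.strip().lower()
                if key and key not in seen:
                    seen.add(key)
                    unique.append(line)
            cleaned.append((title, unique))
        return cleaned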

² http://espeak.sourceforge.net/
³ For the details, see: https://github.com/ekQ/raplysaattori
⁴ The Wikipedia article on Rakim, http://en.wikipedia.org/wiki/Rakim (Accessed: 2016-02-11)

[Figure 1: Rhyme density distribution of 105 rappers. Histogram comparing human rappers with the DeepBeat generator; x-axis: rhyme density (approximately 0.9 to 1.4).]

3.4 Validating the rhyme density measure

After we had published the results shown in Table 1 online, a rap artist called Ahmen contacted us, asking us to compute the rhyme density for the lyrics of his debut album. Before revealing the rhyme densities of the 11 individual songs he sent us, we asked the rapper to rank his own lyrics "starting from the most technical according to where you think you have used the most and the longest rhymes." The rankings produced by the artist and by the algorithm are shown in Table 2.

Table 2: List of rap songs ranked by the algorithm and by the artist himself according to how technical he perceives them. Correlation is 0.42.

  Rank by artist   Rank by algorithm   Rhyme density
  1.               1.                  1.542
  2.               4.                  1.214
  3.–4.            9.                  0.930
  3.–4.            3.                  1.492
  5.               2.                  1.501
  6.–7.            10.                 0.909
  6.–7.            7.                  1.047
  8.–9.            6.                  1.149
  8.–9.            5.                  1.185
  10.              8.                  1.009
  11.              11.                 0.904

We can compute the correlation between the artist-produced and algorithm-produced rankings by applying the Kendall tau rank correlation coefficient. Assuming that all ties indicated by the artist are decided unfavorably for the algorithm, the correlation between the two rankings would still be 0.42, and the null hypothesis of the rankings being independent can be rejected (p < 0.05). This suggests that the rhyme density measure adequately captures the technical quality of rap lyrics.
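The rank correlation above can be computed with SciPy. The sketch below uses made-up ranks purely for illustration (they are not the Table 2 data); it assumes the artist's tied ranks have already been broken against the algorithm, as described in the text.

    from scipy.stats import kendalltau

    # Hypothetical example ranks for the same set of songs.
    rank_by_artist    = [1, 2, 4, 3, 5, 7, 6, 9, 8, 10, 11]
    rank_by_algorithm = [1, 4, 9, 3, 2, 10, 7, 6, 5, 8, 11]

    tau, p_value = kendalltau(rank_by_artist, rank_by_algorithm)
    # A positive tau with p < 0.05 indicates agreement between the
    # artist's notion of technical lyrics and the rhyme density measure.
    print(tau, p_value)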
4. NEXT LINE PREDICTION

We approach the problem of rap lyrics generation as an information-retrieval task. In short, our approach is as follows: we consider a large repository of rap lyrics, which we treat as training data, and which we use to learn a model between consecutive lines in rap lyrics. Then, given a set of seed lines, we can use our model to identify the best next line among a set of candidate next lines taken from the lyrics repository. The method can then be used to construct a song line-by-line, appending relevant lines from different songs. In order to evaluate this information-retrieval task, we define the problem of "next-line prediction."

4.1 Problem definition

As mentioned above, at the core of our algorithm for generating rap lyrics is the task of finding the next line of a given song. This next-line prediction problem is defined as follows.

Problem 1 (NextLine). Consider the lyrics of a rap song S, which is a sequence of n lines (s_1, . . . , s_n). Assume that the first m lines of S, denoted by B = (s_1, . . . , s_m), are known and are considered as "the query." Consider also that we are given a set of k candidate next lines C = {ℓ_1, . . . , ℓ_k}, and the promise that s_{m+1} ∈ C. The goal is to identify s_{m+1} in the candidate set C, i.e., to pick a line ℓ_i ∈ C such that ℓ_i = s_{m+1}.

Our approach to solving the NextLine problem is to compute a relevance score between the query B and each candidate line ℓ ∈ C, and return the line that maximizes this relevance score. The performance of the method can be evaluated using standard information-retrieval measures, such as mean reciprocal rank. As the relevance score we use a linear model over a set of similarity features between the query song-prefix B and the candidate next lines ℓ ∈ C. The weights of the linear model are learned using the RankSVM algorithm, described in Section 4.3.2.

In the next section, we describe the set of features that we use for measuring the similarity between the previous lines B and a candidate next line ℓ.

4.2 Feature extraction

The similarity features we use for the next-line prediction problem can be divided into three groups, capturing (i) rhyming, (ii) structural similarity, and (iii) semantic similarity.

(i) We extract three different rhyme features based on the phonetic transcription of the lines, as discussed in Section 3.3.

EndRhyme is the number of matching vowel phonemes at the end of lines ℓ and s_m, i.e., the last line of B. Spaces and consonant phonemes are ignored, so, for instance, the following two phrases would have three end rhyme vowels in common.

  Line         Phonetic transcription
  pay for      peI fO:ɹ
  stay warm    steI wO:ɹm

EndRhyme-1 is the number of matching vowel phonemes at the end of lines ℓ and s_{m−1}, i.e., the line before the last in B. This feature captures alternating rhyme schemes of the form "abab."

OtherRhyme is the average number of matching vowel phonemes per word. For each word in ℓ, we find the longest matching vowel sequence in s_m and average the lengths of these sequences. This captures rhymes other than end rhymes.

(ii) With respect to the structural similarity between lines, we extract one feature measuring the difference of the lengths of the lines.

LineLength. Typically, consecutive lines are roughly the same length, since they need to be spoken out within one musical bar of a fixed length. The length similarity of two lines ℓ and s is computed as

  1 − |len(ℓ) − len(s)| / max(len(ℓ), len(s)),

where len(·) is the function that returns the number of characters in a line.⁵ We compute the length similarity between a candidate line ℓ and the last line s_m of the song prefix B.

(iii) Finally, for measuring semantic similarity between lines, we employ four different features.

BOW. First, we tokenize the lines and represent the last line s_m of B as a bag of words S_m. We apply the same procedure and obtain a bag of words L for a candidate line ℓ. We then measure the semantic similarity of the two lines by computing the Jaccard similarity between the corresponding bags of words

  |S_m ∩ L| / |S_m ∪ L|.

BOW5. Instead of extracting a bag-of-words representation from only the last line s_m of B, we use the k previous lines and compute

  |(∪_{j=m−k}^{m} S_j) ∩ L| / |(∪_{j=m−k}^{m} S_j) ∪ L|.

In this way we can incorporate a longer context that could be relevant for the next line that we want to identify. We have experimented with various values of k, and we found that using the k = 5 previous lines gives the best results.

LSA. Bag-of-words models are not able to cope with synonymy or polysemy. To enhance our model with such capabilities we use a simple latent semantic analysis (LSA) approach. As a preprocessing step, we remove stop words and words that appear less than three times. Then we use our training data to form a line–term matrix and we compute a rank-100 approximation of this matrix. Each line is represented by a term vector and is projected on the low-rank matrix space. The LSA similarity between two lines is computed as the cosine similarity of their projected vectors.

NN5. Our last semantic feature is based on a neural language model. It is described in more detail in the next section.

[Figure 2: Network architecture. Word vectors for the previous lines s_i and the candidate line ℓ_i are produced by the MA-transformation, concatenated into line vectors and then into a single text vector, and passed through MLP layers to a softmax that classifies the candidate as the real or a random next line.]

⁵ We also tested the number of syllables in a line, but the results were similar.
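The sketch below illustrates three of the hand-crafted similarity features (EndRhyme, LineLength, and the BOW Jaccard similarity) in simplified form. It assumes that vowel-phoneme transcriptions are available for each line as sequences of phoneme symbols; it is an illustration of the definitions above, not the exact feature extractor used in the experiments.

    def end_rhyme(vowels_candidate, vowels_last):
        # Number of matching vowel phonemes at the ends of two lines;
        # the inputs are sequences of vowel phonemes (consonants and
        # spaces already removed).
        n = 0
        while (n < min(len(vowels_candidate), len(vowels_last))
               and vowels_candidate[-1 - n] == vowels_last[-1 - n]):
            n += 1
        return n

    def line_length_similarity(line_a, line_b):
        # 1 - |len(a) - len(b)| / max(len(a), len(b)), as defined above.
        la, lb = len(line_a), len(line_b)
        return 1.0 - abs(la - lb) / max(la, lb)

    def bow_jaccard(line_a, line_b):
        # Jaccard similarity between the token sets of the two lines.
        sa, sb = set(line_a.lower().split()), set(line_b.lower().split())
        union = sa | sb
        return len(sa & sb) / len(union) if union else 0.0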
4.3 Methods

In this section we present two building blocks of our method: the neural language model used to incorporate semantic similarity via the NN5 feature, and the RankSVM method used to combine the various features.

4.3.1 Neural language model

Our knowledge of all possible features can never be comprehensive, and designing a solid extractor for features such as semantic or grammatical similarity would be extremely time-consuming. Thus, attempting to learn additional features appears a promising approach. Since neural networks can have a vast number of available parameters and can learn complex nonlinear mappings, we experiment with whether they can be used to automatically extract relevant features. To this end, we design a neural network that learns to use raw text sequences to predict the relevance of a candidate next line given the previous lines.

In brief, our neural network starts by finding distributed (vector) representations for words. These are combined into distributed representations of lines, and further combined into vector representations of multiple lines, where the last line may be a real line or a randomly sampled line. Based on this vector representation of the text, our network learns to predict whether it believes that the last line is a suitable next-line candidate or not.

Network architecture. At the core of our predictor, we use multi-layered, fully-connected neural networks, trained with backpropagation. While the network structure was originally inspired by the work of Collobert et al. [6], our model differs substantially from theirs. In particular, we have included new input transformations and our input consists of multiple sentences.

The structure of our model is illustrated in Figure 2. We start by preprocessing all text to a format more easily handled by a neural network, removing most non-ASCII characters and one-word lines, and stemming and lower-casing all words. Our choice of neural network architecture requires fixed line lengths, so we also cap lines at 13 words and pad shorter lines. We build samples in the format: "candidate next line (ℓ_i)"; "previous lines (s_1, . . . , s_m)." For lines with too few previous lines, we add padding.

The first layers of the network are word-specific: they perform the same transformation for each word in a text, regardless of its position. For each word, we start with an exponential moving-average transformation that turns a character sequence into a vector. A simplified example of this is shown in Figure 3.⁶ We use this transformation as it is, compared to one-hot encoding, more robust to different spellings, coping well for instance with slang and spelling errors.

The transformation works as follows: we create a zero vector with the length of the set of possible characters. Starting with the first character in a word, we choose its corresponding location in the vector. We increment this value with a value proportional to the character's position in the word, counting from the beginning of the word. We get a word representation w = (w_a, w_b, . . . , w_z, . . .)^T, where

  w_a = Σ_c (1 − α)^c / Z,

where c runs over the indices at which the character at hand (here a) occurs, α is a decay hyperparameter, and Z is a normalizer proportional to the word length. For further details of this transformation, see [24]. To avoid always giving larger weight to the beginning of a word, we also concatenate to the vector this transformation computed backwards, and a vector of character counts in the word. Following this transformation, we feed each word vector to a fully-connected neural network.

[Figure 3: Simplified example of the exponential moving-average transformation. For the word "ACE", the slot of A receives full weight while the slots of C and E receive weights decayed by a factor of 0.8 per position.]

Our word-specific vectors are next concatenated in word order into a single line vector, and fed to a line-level network. Next, the line vectors are concatenated, so that the candidate next line is placed first and the preceding lines are presented in their order of occurrence. This vector is fed to the final layer, and the output is fed to a softmax function that outputs a binary prediction indicating whether the network believes one line follows the other or not. In our ensemble model, we use the activation before the softmax, corresponding to the confidence the network has in a line being the next in the lyrics.

Training. For training the neural network, we retrieve a list of previous lines for each of the lyrics lines. These lyrics are fed to the neural model in small batches, and a corresponding gradient descent update is performed on the weights of the network. To give our network negative examples, we follow an approach similar to [6], and generate fake line examples by choosing, for every second example, the candidate line uniformly at random from all lyrics.

A set of technical choices is necessary for constructing and training our network. We use two word-specific neural layers (500 neurons each), one line-specific layer (256 neurons), and one final layer (256 neurons). All layers have rectified linear units as the activation function. Our minibatch size is 10, we use the adaptive learning rate Adadelta [31], and we regularize the network with 10% dropout [22]. We train the network for 20 epochs on a GPU machine, taking advantage of the Theano library [3]. Analyzing hyperparameters, we find that we can improve results especially by going from one previous line to a moderately large context (5 previous lines), and by using a larger input window, i.e., giving the network more words from each line.

Manual evaluation. To get some understanding of what our neural network has learnt, we manually analyze 25 random sentences where a random candidate line was falsely classified as being the real next line. For 13 of the lines, we find it difficult to distinguish whether the real or the random line should follow. This often relates to the rapper changing the topic, especially when moving from one verse to another. For the remaining 12 lines, we notice that the neural network has not succeeded in identifying five instances with a rhyme, three instances with repeating words, two instances with consistent sentence styles (e.g., lines always having one sentence per line), one instance with consistent grammar (a sentence continuing to the next line), and one instance where the pronunciation of words is very similar in the previous and the next line. In order to detect the rhymes, the network would need to learn a phonetic representation of the words, but for this problem we have developed the other features presented in Section 4.2.

⁶ While we achieve our best results using this transformation, a traditional one-hot vector approach yields nearly as good results. A model that learns the transformation, such as a recurrent neural network, could also be tested in future work.
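A compact sketch of the exponential moving-average character transformation described above (see also [24]): each word becomes a fixed-length vector indexed by character, with positions near the start of the word weighted more heavily. The alphabet, the decay value, and the exact normalizer are assumptions made for illustration.

    import numpy as np

    ALPHABET = "abcdefghijklmnopqrstuvwxyz'"   # assumed character set
    CHAR_IDX = {ch: i for i, ch in enumerate(ALPHABET)}

    def ema_vector(word, alpha=0.3):
        """Forward EMA encoding: slot of character a accumulates
        (1 - alpha)^c / Z over the positions c where a occurs."""
        v = np.zeros(len(ALPHABET))
        z = max(1, len(word))                   # normalizer ~ word length
        for c, ch in enumerate(word.lower()):
            if ch in CHAR_IDX:
                v[CHAR_IDX[ch]] += (1.0 - alpha) ** c / z
        return v

    def word_representation(word, alpha=0.3):
        # Concatenate the forward EMA, the backward EMA, and raw character
        # counts, as described in the text, so word starts are not over-weighted.
        counts = np.zeros(len(ALPHABET))
        for ch in word.lower():
            if ch in CHAR_IDX:
                counts[CHAR_IDX[ch]] += 1
        return np.concatenate([ema_vector(word, alpha),
                               ema_vector(word[::-1], alpha),
                               counts])

    print(word_representation("dope").shape)    # (3 * len(ALPHABET),)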
4.3.2 Ranking candidate lines

We take existing rap lyrics as the ground truth for the next-line prediction problem. The lyrics are transformed into preferences

  s_{m+1} ≻_B ℓ_i,

which are read as "s_{m+1} (the true next line) is preferred over ℓ_i (a randomly sampled line from the set of all available lines) in the context of B (the preceding lines)." When sampling ℓ_i we ensure that ℓ_i ≠ s_{m+1}. Then we extract the features Φ(B, ℓ) listed in Section 4.2 to describe the match between the preceding lines and a candidate next line.

The RankSVM algorithm [14] takes this type of data as input and tries to learn a linear model which gives relevance scores for the candidate lines. These relevance scores should respect the preference relations given as training data for the algorithm. The relevance scores are given by

  r(B, ℓ) = w^T Φ(B, ℓ),                                   (1)

where w is a weight vector of the same length as the number of features. The advantage of having a simple linear model is that it can be trained very efficiently. We employ the SVMrank software for the learning [15]. Furthermore, we can interpret the weights as importance values for the features. Therefore, when the model is later applied to lyrics generation, the user could manually adjust the weights if he or she, for instance, prefers lyrics where the end rhyme plays a larger or smaller role.
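The preference data above can be turned into an ordinary linear classification problem by the standard RankSVM reduction: each preference s_{m+1} ≻_B ℓ_i yields a difference vector Φ(B, s_{m+1}) − Φ(B, ℓ_i) with label +1. The sketch below uses scikit-learn's LinearSVC in place of the SVMrank package used in the paper; the feature extraction Φ is assumed to be given.

    import numpy as np
    from sklearn.svm import LinearSVC

    def train_ranksvm(pref_pairs, C=1.0):
        """pref_pairs: list of (phi_true, phi_random) feature-vector pairs,
        where phi_true describes the true next line and phi_random a
        negative sample in the same context B."""
        X, y = [], []
        for phi_pos, phi_neg in pref_pairs:
            X.append(phi_pos - phi_neg); y.append(+1)   # true line preferred
            X.append(phi_neg - phi_pos); y.append(-1)   # symmetric example
        model = LinearSVC(C=C, fit_intercept=False).fit(np.array(X), np.array(y))
        return model.coef_.ravel()                      # weight vector w

    def relevance(w, phi):
        # Equation (1): r(B, l) = w^T Phi(B, l).
        return float(np.dot(w, phi))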
4.4 Empirical next-line prediction evaluation

Our experimental setup is the following. We split the lyrics by artist randomly into training (50%), validation (25%), and test (25%) sets. The RankSVM model and the neural network model are learned using the training set, while varying the trade-off parameter C among {1, 10^1, 10^2, 10^3, 10^4, 10^5} to find the value which maximizes the performance on the validation set. The final performance is evaluated on the unseen test set.

We use a candidate set containing the true next line and 299 randomly chosen lines. The performance is measured using mean rank, mean reciprocal rank (the reciprocal value of the harmonic mean of the ranks), and recall at N, where the value of N is varied. The results are computed based on a random sample of 10 000 queries.

Table 3: Next-line prediction results for k = 300 candidate lines. MRR stands for mean reciprocal rank and Rec@N for recall when retrieving the top N lines.

  Feature(s)     Mean rank  MRR    Rec@1  Rec@5  Rec@30  Rec@150
  Random         150.5      0.021  0.003  0.017  0.100   0.500
  LineLength     117.6      0.030  0.002  0.029  0.177   0.657
  EndRhyme       103.2      0.140  0.077  0.181  0.344   0.480
  EndRhyme-1     126.0      0.075  0.037  0.092  0.205   0.347
  OtherRhyme     123.3      0.047  0.016  0.055  0.190   0.604
  BOW            112.4      0.116  0.074  0.138  0.280   0.516
  BOW5            99.1      0.110  0.065  0.129  0.314   0.708
  LSA            111.3      0.089  0.051  0.107  0.262   0.662
  NN5             84.7      0.067  0.020  0.083  0.319   0.793
  FastFeats       73.5      0.224  0.160  0.272  0.476   0.802
  FastFeatsNN5    61.2      0.244  0.172  0.306  0.524   0.853
  All features    60.8      0.243  0.169  0.304  0.527   0.855

Table 3 shows the test performance for different feature sets. The features are combined by taking a linear combination of their values according to Equation (1). Each feature alone yields a mean rank below 150 and thus carries some predictive power. The best individual feature with respect to the mean rank is the output of the neural network model. However, if we look at recall at 1 (the probability to rank the true next line first), the best performance, 7.7%, is obtained by EndRhyme, which works very well in some cases but is overall inferior to NN5, since in some cases consecutive lines do not rhyme at all. Combining all features, we achieve a mean rank of 60.8 and can pick the true next line with 16.9% accuracy. The probability to pick the true next line at random is only 0.3%.

To enable the generation of rap lyrics in real time, we also tested employing only the features which are fast to evaluate, i.e., FastFeats = LineLength + EndRhyme + EndRhyme-1 + BOW + BOW5, and FastFeatsNN5 = FastFeats + NN5. With the latter feature set, the performance is almost identical to using the full feature set, and even FastFeats works relatively well.

5. LYRICS GENERATION

5.1 Overview

The lyrics generation is based on the idea of selecting the most relevant line given the previous lines, which is repeated until the whole lyrics have been generated. Our algorithm, DeepBeat, is summarized in Algorithm 1.

Algorithm 1: Lyrics generation algorithm DeepBeat.
  Input: Seed line ℓ_1, length of the lyrics n.
  Output: Lyrics L = (ℓ_1, ℓ_2, . . . , ℓ_n).
  L[1] ← ℓ_1                                   // Initialize a list of lines.
  for i ← 2 to n do
      C ← retrieve_candidates(L[i − 1])        // Sec. 5.2.1
      ĉ ← NaN
      foreach c ∈ C do
          // Check relevance and feasibility of the candidate.
          if rel(c, L) > rel(ĉ, L) and rhyme_ok(c, L) then
              ĉ ← c
      L[i] ← ĉ
  return L

The relevance scores for candidate lines are computed using the RankSVM algorithm described in Section 4. Instead of selecting the candidate with strictly the highest relevance score, we filter some candidates according to the rhyme_ok() function. It checks that consecutive lines are not from the same song and that they do not end with the same words. Although rhyming the same words can be observed in some contemporary rap lyrics, it has not always been considered a valid technique [9]. While it is straightforward to produce long "rhymes" by repeating the same phrases, it can lead to uninteresting lyrics.
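Algorithm 1 translates almost directly into code. The sketch below assumes that helper functions retrieve_candidates, relevance (Equation (1)), and rhyme_ok exist with the semantics described above; it is an outline of the greedy generation loop, not the deployed DeepBeat implementation.

    def generate_lyrics(seed_line, n, retrieve_candidates, relevance, rhyme_ok):
        """Greedy line-by-line generation following Algorithm 1."""
        lyrics = [seed_line]
        for _ in range(2, n + 1):
            candidates = retrieve_candidates(lyrics[-1])   # Sec. 5.2.1
            best, best_score = None, float("-inf")
            for c in candidates:
                # Keep the most relevant candidate that passes the rhyme
                # filter (not from the same song as the previous line, and
                # not ending in the same words).
                if rhyme_ok(c, lyrics):
                    score = relevance(c, lyrics)
                    if score > best_score:
                        best, best_score = c, score
            if best is None and candidates:
                best = candidates[0]   # fallback not specified in Algorithm 1
            lyrics.append(best)
        return lyrics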
5.2 Online demo

An online demo of the lyrics generation algorithm is available at deepbeat.org. This web tool was built in order to (1) make the generator available to the public, (2) provide the users easy ways of customizing the generated lyrics, and (3) collect limited usage logs in order to evaluate and improve the algorithm. The website was launched in November 2015 and as of June 2016 it has been visited by more than 42 000 users.

After the initial launch of the project, we started collaborating with some musicians to record the first songs written by DeepBeat,⁷ which showed us the importance of giving the user sufficient customization capabilities instead of merely outputting complete lyrics. With this in mind, we designed the online demo so that users can: (1) define keywords that must appear in the generated lyrics; (2) ask the algorithm to give suggestions for the next line and pick the best suggestion manually; and (3) write some of the lines themselves. A user can, for example, write the first line herself and let the algorithm generate the remaining lines. An interesting mode of usage we noticed some users adopting is to write every other line themselves and let the algorithm generate the rest.

⁷ The first song is available at: https://youtu.be/Js0HYmH31ko. More about the collaboration can be read at: https://howwegettonext.com/deepbeat-what-happens-when-a-robot-writes-rhymes-for-rappers-77d07c406ff5
When the user generates lyrics line-by-line, asking for suggestions from DeepBeat, we log the selected lines. In Section 5.3.2, we show how to evaluate the algorithm using the log data. Conveniently, we can also use the logs to refine the learned models employing the RankSVM approach, as done in [14].

5.2.1 Performance optimization

Candidate set retrieval. Results in Table 3 show that EndRhyme alone is a good predictor. Therefore, we define the set of candidate next lines as the 300 best rhyming lines instead of 300 random lines. This candidate set has to be retrieved quickly without evaluating each of the n_l = 583 669 lines in our database. Conveniently, this can be formulated as the problem of finding the k strings with the longest common prefix with respect to a query string, as follows:

1. Compute a phonetic transcription for each line.
2. Remove all but vowel phonemes.
3. Reverse the phoneme strings.

We solve the longest-common-prefix problem by, first, sorting the phoneme strings as a preprocessing step; second, employing binary search⁸ to find the line with the longest common prefix (ℓ_long); and third, taking the k − 1 lines with the longest common prefix around ℓ_long. The computational complexity of this approach is O(log n_l + k). For the online demo, we have set k = 300.

Feature selection. By default, deepbeat.org employs the FastFeats feature set, with which the generation of an 8-line verse takes about 0.3 seconds. The user can additionally enable the NN5 feature, in which case the algorithm will retrieve the top-30 lines based on FastFeats and then rerank these top lines based on FastFeatsNN5. We only evaluate 30 lines using NN5 since NN5 is much heavier to compute than the other features (it increases the generation time to about 35 seconds per 8-line verse) and since recall at 30 is only 3.8 percentage points lower for FastFeats compared to FastFeatsNN5. The NN5 feature could be used more heavily by acquiring a server which enables GPU computation or via parallelization (different candidate lines can be evaluated independently).

5.3 Empirical evaluation of generated lyrics

The lyrics generated by DeepBeat are evaluated by the rhyme density measure, introduced in Section 3.3.1, and by measuring the correlation between relevance scores assigned by DeepBeat and human preferences recorded via deepbeat.org.

5.3.1 Rhyme density of DeepBeat

We ran a single job to generate a hundred 16-bar verses with random seed lines. A randomly selected example verse from this set is shown in Appendix A. The rhyme density for the hundred verses is 1.431, which is, quite remarkably, 21% higher than the rhyme density of the top-ranked human rapper, Inspectah Deck.

One particularly long multisyllabic rhyme created by the algorithm is given below (the rhyming part is highlighted in the original lyrics):

  "Drink and drown in my own iniquity
   Never smile style is wild only grin strictly"

In this example, the first line is from the song Rap Game by D-12 and the latter from I Don't Give a F**k by AZ. The rhyme consists of nine consecutive matching vowel phonemes.

5.3.2 Online experiment

In order to evaluate the algorithm, we performed an online experiment using the demo. The idea was to employ an approach which is used for optimizing search engines [14], where the clicked search result r_i is logged and the following pairwise preferences are extracted: r_i ≻ r_j, for j = 1, . . . , i − 1 (the lines below the selected line are ignored since we cannot assume that the user has evaluated those). The objective is to learn a ranking model which assigns relevance scores that agree with the extracted preferences.

At deepbeat.org, when a user clicks the "Suggest Rhyming Line" button, 20 suggested candidate next lines are shown to the user. We wanted to see how often the line selected by the user was assigned a higher score than the lines above the selection. Our hypothesis was that the larger the absolute difference between the algorithm-assigned relevance scores of two lines, the more likely the user would pick the line preferred by the algorithm.

In the initial data we collected through the website, we noticed that users are more likely to select a line the higher it appears on the list of suggestions. Furthermore, users tend to prefer the fifth line, since it often appears in the same location of the screen as the "Suggest Rhyming Line" button, so a user who is just playing around with the tool without putting much thought into the content of the suggestions is likely to select the fifth suggestion. It is very challenging to get rid of this type of bias entirely when conducting an uncontrolled experiment in the wild, but to mitigate the biases, we shuffled the order of the suggested lines and removed the first three selections of each user, since we assumed that in the beginning users are more likely to play with the tool without thinking too much. Moreover, we wanted to create more variability among the suggestions, as the top 30 might be almost equally good, so we defined the set of suggested lines as the lines ranked 1–14, 298–300, and three randomly picked lines from the range 15–297.

To avoid degrading the usability of the tool too much, we only applied the aforementioned manipulations when NN5 was not enabled by the user. However, we also stored the text of the selected line and the previous lines to enable the computation of the relevance scores with FastFeatsNN5 as a post-processing step. In total, this experiment resulted in 34 757 pairwise preferences from 1 549 users.⁹

The results of the experiment are shown in Figure 4. They confirm our hypothesis that the higher the difference between the relevance scores of two lines, the more likely the users select the line which is evaluated as more suitable by the algorithm. Furthermore, including the NN5 feature improves the evaluations, which can be seen by studying the difference of the two data series in the figure. We may conclude that the learned RankSVM model generalizes and is able to successfully learn human preferences from existing rap lyrics, and that the developed deep neural network is an important component of the model.

⁸ The candidate set could be found even faster by building a trie data structure of the phoneme strings. However, in our experiments, the binary search approach was fast enough and more memory-efficient.
⁹ An anonymized version of the dataset, including the scores assigned by DeepBeat, is available at: https://github.com/ekQ/dopelearning
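The preference extraction used in this experiment can be sketched as follows: for each logged selection, the chosen suggestion is preferred over every suggestion shown above it, and a pair counts as agreeing with DeepBeat when the model also scores the chosen line higher. The log format and field names below are hypothetical.

    from collections import defaultdict

    def extract_preferences(click_log):
        """click_log: iterable of events, each a dict with the shown
        'suggestions' (in display order) and the 'selected' index.
        Returns (selected_line, line_above) preference pairs."""
        pairs = []
        for event in click_log:
            shown, i = event["suggestions"], event["selected"]
            # Lines below the selection are ignored: the user may not
            # have evaluated them.
            pairs.extend((shown[i], shown[j]) for j in range(i))
        return pairs

    def agreement_by_score_gap(pairs, score, bin_width=0.5):
        """Fraction of pairs where the model prefers the user's choice,
        binned by the absolute score difference (cf. Figure 4)."""
        hits, totals = defaultdict(int), defaultdict(int)
        for chosen, other in pairs:
            gap = abs(score(chosen) - score(other))
            b = int(gap // bin_width)
            totals[b] += 1
            hits[b] += score(chosen) > score(other)
        return {b: hits[b] / totals[b] for b in totals}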
[Figure 4: Probability of a deepbeat.org user to select a line with a higher score from a pair of lines, given the (binned) score difference of the lines. User preferences correlate with the scores assigned by DeepBeat. Two data series: DeepBeat and DeepBeat without NN; y-axis: probability to select line1 (roughly 0.50 to 0.70); x-axis: score(line1) − score(line2), binned from 0–0.5 up to 5–5.5.]

6. DISCUSSION

According to Boden [4], creativity is the ability to come up with ideas or artifacts that are (1) new, (2) surprising, and (3) valuable. Here, the produced lyrics as a whole are novel by construction, as the lines of the lyrics are picked from different original lyrics, even if individual lines are not novel. Lyrics produced with our method are likely to be at least as surprising as the original lyrics.

The (aesthetic) value of rap lyrics is more difficult to estimate objectively. Obvious factors contributing to the value are poetic properties of the lyrics, especially rhythm and rhyme, quality of the language, and the meaning or message of the lyrics. Rhyme and rhythm can be controlled with relative ease, even outperforming human rappers, as we have demonstrated in our experiments. The quality of individual lines is exactly as good as the dataset used.

The meaning or semantics is the hardest part for computational generation. We have applied standard bag-of-words and LSA methods and additionally introduced a deep neural network model in order to capture the semantics of the lyrics. The importance of these features has been proven by the experimental results for the next-line prediction problem and an online experiment. The features together contribute towards the semantic coherence of the produced lyrics, even if full control over the meaning is still missing.

The task of predicting the next line can be challenging even for a human, and our model performs relatively well on this task. We have used our models for lyrics prediction, but we see that components that understand semantics could be very relevant also to other text processing tasks, for instance conversation prediction.

This work opens several lines for future work. It would be interesting to study the automatic creation of story lines by analyzing existing rap songs, and of novel lines by modifying existing lines or creating them from scratch. Even more, it would be exciting to have a fully automatic rap bot which would generate lyrics and rap them synthetically based on some input it receives from the outside world. Alternatively, the findings of this paper could be transferred to other text processing tasks, such as conversation prediction, which could carry significant business potential when applied to tasks like customer service automation.

7. CONCLUSIONS

We developed DeepBeat, an algorithm for rap lyrics generation. Lyrics generation was formulated as an information retrieval task where the objective is to find the most relevant next line given the previous lines, which are considered as the query. The algorithm extracts three types of features of the lyrics, namely rhyme, structural, and semantic features, and combines them by employing the RankSVM algorithm. For the semantic features, we developed a deep neural network model, which was the single best predictor for the relevance of a line.

We quantitatively evaluated the algorithm with three measures. First, we evaluated prediction performance by measuring how well the algorithm predicts the next line of an existing rap song. The true next line was identified among 299 randomly selected lines with an accuracy of 17%, i.e., over 50 times more likely than by random, and it was ranked in the top 30 with 53% accuracy. Second, we introduced a rhyme density measure and showed that DeepBeat outperforms the top human rappers by 21% in terms of length and frequency of the rhymes in the produced lyrics. The validity of the rhyme density measure was assessed by conducting a human experiment which showed that the measure correlates with a rapper's own notion of technically skilled lyrics. Third, the rap lyrics generator was deployed as a web tool (deepbeat.org), and the analysis of its usage logs showed that machine evaluations of candidate next lines correlate with user preferences.

Acknowledgments

We would like to thank Stephen Fenech for developing the front end for deepbeat.org, and Jelena Luketina, Miquel Perelló Nieto, and Vikram Kamath for useful comments on the manuscript.

8. REFERENCES

[1] Y. Bengio, I. J. Goodfellow, and A. Courville. Deep learning. Book in preparation for MIT Press, 2015.
[2] D. Berger. Rap genius university: Rhyme types. http://genius.com/posts/24-Rap-genius-university-rhyme-types. Accessed: 2015-05-07.
[3] J. Bergstra, O. Breuleux, F. Bastien, P. Lamblin, R. Pascanu, G. Desjardins, J. Turian, D. Warde-Farley, and Y. Bengio. Theano: a CPU and GPU math expression compiler. In Proceedings of the Python for Scientific Computing Conference (SciPy), 2010.
[4] M. A. Boden. The Creative Mind: Myths and Mechanisms. Psychology Press, 2004.
[5] A. Bradley. Book of rhymes: The poetics of hip hop. Basic Books, 2009.
[6] R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa. Natural language processing (almost) from scratch. Journal of Machine Learning Research, 12:2493–2537, 2011.
[7] S. Colton and G. A. Wiggins. Computational creativity: The final frontier? In Proceedings of the European Conference on Artificial Intelligence, pages 21–26, 2012.
[8] L. Deng. An overview of deep-structured learning for information processing. In Proceedings of the Asian-Pacific Signal & Information Processing Annual Summit and Conference, pages 1–14, October 2011.
[9] P. Edwards. How to rap: the art and science of the hip-hop MC. Chicago Review Press, 2009.
[10] Flocabulary. How to write rap lyrics and improve your rap skills. https://www.flocabulary.com/multies/. Accessed: 2015-05-07.
[11] A. Graves. Generating sequences with recurrent neural networks. arXiv preprint arXiv:1308.0850, 2013.
[12] H. Hirjee and D. G. Brown. Automatic detection of internal and imperfect rhymes in rap lyrics. In Proceedings of the Tenth International Society for Music Information Retrieval Conference, pages 711–716, 2009.
[13] H. Hirjee and D. G. Brown. Using automated rhyme detection to characterize rhyming style in rap music. 2010.
[14] T. Joachims. Optimizing search engines using clickthrough data. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 133–142, 2002.
[15] T. Joachims. Training linear SVMs in linear time. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 217–226, 2006.
[16] K. Lenzo. The CMU Pronouncing Dictionary. http://www.speech.cs.cmu.edu/cgi-bin/cmudict. Accessed: 2015-05-07.
[17] T.-Y. Liu. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3):225–331, 2009.
[18] E. Malmi. Algorithm that counts rap rhymes and scouts mad lines. http://mining4meaning.com/2015/02/13/raplyzer/, 2015. Blog post.
[19] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, pages 3111–3119, 2013.
[20] J. Pennington, R. Socher, and C. D. Manning. GloVe: Global vectors for word representation. In Proceedings of the Empirical Methods in Natural Language Processing Conference, 2014.
[21] M. Richardson, A. Prakash, and E. Brill. Beyond PageRank: machine learning for static ranking. In Proceedings of the 15th International Conference on World Wide Web, pages 707–715. ACM, 2006.
[22] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15:1929–1958, 2014.
[23] I. Sutskever, J. Martens, and G. Hinton. Generating text with recurrent neural networks. In Proceedings of the 28th International Conference on Machine Learning, pages 1017–1024. ACM, 2011.
[24] P. Takala. Word embeddings for morphologically rich languages. In Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 2016.
[25] J. M. Toivanen, H. Toivonen, A. Valitutti, and O. Gross. Corpus-based generation of content and form in poetry. In Proceedings of the Third International Conference on Computational Creativity, pages 175–179, 2012.
[26] T. Veale. Creative language retrieval: A robust hybrid of information retrieval and linguistic creativity. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Volume 1, pages 278–287. Association for Computational Linguistics, 2011.
[27] O. Vinyals and Q. Le. A neural conversational model. In Proceedings of the ICML Deep Learning Workshop, 2015.
[28] D. Wu and K. Addanki. Learning to rap battle with bilingual recursive neural networks. In Proceedings of the 24th International Conference on Artificial Intelligence, pages 2524–2530, 2015.
[29] D. Wu, V. S. K. Addanki, M. S. Saers, and M. Beloucif. Learning to freestyle: Hip hop challenge-response induction via transduction rule segmentation. In Proceedings of the Empirical Methods in Natural Language Processing Conference, pages 102–112, 2013.
[30] L. Yu, K. M. Hermann, P. Blunsom, and S. Pulman. Deep learning for answer sentence selection. In Proceedings of the NIPS Deep Learning Workshop, 2014.
[31] M. D. Zeiler. Adadelta: an adaptive learning rate method. Technical report, 2012.

APPENDIX

A. SAMPLE VERSES

Table 4 shows an example of a generated verse. It was randomly selected from a set of a hundred verses, excluding the verses with profane language. The other verses are available at: https://github.com/ekQ/dopelearning

Table 4: A randomly selected verse from the set of 100 generated lyrics.

  Everybody got one (2 Chainz - Extremely Blessed)
  And all the pretty mommies want some (Mos Def - Undeniable)
  And what i told you all was (Lil Wayne - Welcome Back)
  But you need to stay such do not touch (Common - Heidi Hoe)
  They really do not want you to vote (KRS One - The Mind)
  what do you condone (Cam'ron - Bubble Music)
  Music make you lose control (Missy Elliot - Lose Control)
  What you need is right here ahh oh (Wiz Khalifa - Right Here)
  This is for you and me (Missy Elliot - Hit Em Wit Da Hee)
  I had to dedicate this song to you Mami (Fat Joe - Bendicion Mami)
  Now I see how you can be (Lil Wayne - How To Hate)
  I see u smiling i kno u hattig (Wiz Khalifa - Damn Thing)
  Best I Eva Had x4 (Nicki Minaj - Best I Ever Had)
  That I had to pay for (Ice Cube - X Bitches)
  Do I have the right to take yours (Common - Retrospect For Life)
  Trying to stay warm (Everlast - 2 Pieces Of Drama)