ARE POETRY AND LYRICS ALL THAT DIFFERENT?
Abhishek Singhi, Dan Brown
Cheriton School of Computer Science, University of Waterloo
Lyrics
- Written for the masses
- Written keeping the background music in mind
- Consist of repeated lines and segments

Poetry
- Attracts a more educated and sensitive audience
- Structurally more constrained, adhering to a particular meter and style
- Written against a silent background
- A few forms also repeat lines, such as the villanelle

Idea
- Synonymous words give poets or lyricists a choice.
- We use adjectives, as a majority have synonyms that can be used depending on context.
- Lyrics, unlike poetry, often repeat lines and segments, causing us to believe that lyricists tend to pick more rhymable adjectives.

Data Set
- Articles: We take the English Wikipedia and over 13,000 news articles from four major newspapers as our article data set.
- Lyrics: We took more than 10,000 lyrics from 56 very popular English singers.
- Poetry: We took more than 20,000 poems from more than 60 famous poets, like Robert Frost, William Blake and John Keats, over the last three hundred years.

Method
- Find the synonyms of all the words in the training data set using WordNet, Wikipedia and an online thesaurus.
- Prune synonyms obtained from the three sources which fall below an experimentally determined threshold for the semantic distance between the synonyms and the word.
- Find the probability distribution of words for all three types of documents.
- Use the probability distributions to classify the document.

Initial problem

                    Predicted
Actual        Lyrics   Articles   Poems
Lyrics          67        11        22
Articles        11        80         6
Poems           10        33        57

Table 1: The confusion matrix for document classification.

Rhyming adjectives

                    Lyrics   Poems
Mean                 33.2     22.9
Median               11        5
25th percentile       2        0
75th percentile      38       24

Table 2: Statistical values for the number of words an adjective rhymes with.

Semantic orientation

                    Lyrics    Poems
Mean                -0.050    -0.053
Median               0.000     0.000
25th percentile     -0.270    -0.270
75th percentile      0.130     0.130

Table 3: Statistical values for the semantic orientation of adjectives used in lyrics and poetry.

Poetic lyricists

Poetic lyricist                 % of lyrics misclassified as poetry
Bob Dylan                       42%
Ed Sheeran                      50%
Ani Di Franco                   29%
Annie Lennox                    32%
Bill Callahan                   34%
Bruce Springsteen               29%
Stephen Sondheim                40%
Morrissey                       29%
Average misclassification rate  36%

Table 4: Percentage of lyrics misclassified as poetry for poetic lyricists.

Non-Poetic lyricists

Non-poetic lyricist             % of lyrics misclassified as poetry
Bryan Adams                     14%
Michael Jackson                 22%
Drake                            7%
Backstreet Boys                 23%
Radiohead                       26%
Stevie Wonder                   17%
Led Zeppelin                     8%
Kesha                           18%
Average misclassification rate  17%

Table 5: Percentage of lyrics misclassified as poetry for non-poetic lyricists.

Adjective usage in lyrics vs poetry

Lyrics                          Poetry
proud, arrogant, cocky          haughty, imperious
sexy, hot, beautiful, cute      gorgeous, handsome
merry, ecstatic, elated         happy, blissful, joyous
heartbroken, brokenhearted      sad, sorrowful, dismal
pissed                          angry, bitter
afraid, nervous                 frightened, cowardly, timid
weak, fragile                   feeble, powerless
jealous                         envious, covetous

Table 6: Adjectives which are more likely to be used in lyrics rather than poetry and vice versa.

What next?
- Automatic poetry/lyrics generation
- Suggest alternative, better adjectives to make a document fit its genre better
- A better measure of distance between documents, where the weight assigned to a word depends on its probability of usage in a particular type of document
- This can be extended to different genres of writing like prose or fiction
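The last two Method steps (estimating a word probability distribution per document type, then classifying with it) can be illustrated with a minimal sketch. The poster does not specify the classifier, so the following is an assumption: a unigram model with add-one smoothing that scores a document by summed log-probabilities and picks the highest-scoring genre. The function names (`train`, `log_prob`, `classify`) and the toy corpus are hypothetical; the toy words are borrowed from Table 6 only for illustration, and the synonym lookup and pruning steps are not modelled here.

```python
import math
from collections import Counter


def train(docs_by_genre):
    """Count words per genre; return (counts, totals, vocabulary)."""
    counts = {
        genre: Counter(w for doc in docs for w in doc.lower().split())
        for genre, docs in docs_by_genre.items()
    }
    totals = {genre: sum(c.values()) for genre, c in counts.items()}
    vocab = set().union(*counts.values())
    return counts, totals, vocab


def log_prob(word, genre, model):
    """Log-probability of a word in a genre, with add-one smoothing."""
    counts, totals, vocab = model
    # Smoothing keeps a word unseen in one genre from zeroing out that genre.
    return math.log((counts[genre][word] + 1) / (totals[genre] + len(vocab)))


def classify(text, model):
    """Return the genre whose word distribution best explains the text."""
    counts, _, vocab = model
    # Words never seen in training carry no evidence, so they are skipped.
    words = [w for w in text.lower().split() if w in vocab]
    scores = {g: sum(log_prob(w, g, model) for w in words) for g in counts}
    return max(scores, key=scores.get)


# Hypothetical toy corpus, not the poster's data set.
toy_corpus = {
    "lyrics": ["cocky hot cute pissed nervous", "hot cute jealous fragile"],
    "poetry": ["haughty gorgeous sorrowful timid", "imperious handsome envious feeble"],
}
model = train(toy_corpus)
print(classify("cute and cocky", model))
print(classify("haughty envious", model))
```

Working in log space avoids numeric underflow on long documents. If a text contains no known words, every genre scores zero and the first genre is returned, so a real pipeline would need an explicit fallback for that case.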