Read Me Like a Book
Total Page:16
File Type:pdf, Size:1020Kb
Read Me Like A Book: Lessons in Affective, Topical and Personalized Computational Creativity Tony Veale School of Computer Science, University College Dublin, Dublin, Ireland. Abstract For personalization, the Twitter footprint of a target user – Context is crucial to creativity, as shown by the signif- whether their official bio or their recent tweets – offers a icance we attach to the labels P- and H-Creativity. It is textual context in which to situate the generation process. context that allows a system to truly assess novelty, or For topical creativity, the aggregated timelines of an array to ensure that its topical artifacts really are topical. An of online news sources, from a Twitter-addicted president important but an often overlooked aspect of context is to the breaking headlines of mainstream media, provide a personality. A CC system that is designed to reflect a dynamic context for machine creativity. We explore three specific aspect of the creative temperament, whether it modes of CC via these systems: a marriage of linguistic is humour, arrogance or whimsy, must stay true to this and artistic creativity that maps the digital personality of a assumed personality in its actions. Likewise, a system user, as reflected in what they tweet, into metaphors that that creates artifacts that are rooted in emotion must be sensitive to the personality or mood of its audience. are both textual and visual; a topical creator that generates But here we must tread carefully, as the assessment of metaphors for news stories rather than news readers; and personal qualities often implies judgement, and so few a book recommender system that leavens its user-tailored of us like to be judged, especially by our machines. To suggestions with humour, and which invents its own book better understand the upsides and pitfalls of topicality- ideas to supplement the titles in its well-stocked database. and personality-based CC systems, we unpack three of The principle that unites all three systems is the role of these systems here, and explore the lessons they offer. information compression in CC. One space of information may be compressed or decompressed to yield others, and The Wonder of You produce insightful generalizations or vivid elaborations in the process. Thus, compression is required to map a news Creativity can be an intensely personal affair. We put our- story to a linguistic metaphor, as the metaphor need only selves into what we create, relying on our experiences and capture the gist of the story. In fact, such information loss values to build artifacts we hope others will value too. In is desirable when it leads to generalization and ambiguity, doing so, we reveal our personalities. When we create for as metaphors should be objects of profound wonderment. others and assimilate the values of an audience, creativity When moving from online user personalities to metaphors becomes personal and personalized. While it is a tenet of we require the opposite, decompression, to inflate a low- computational creativity (CC) that an agent need not be a dimensional space of personality types into an elaborate, person to be creative, a creative system may nonetheless high-dimensioanl space of possible character metaphors. need a personality, or an appreciation of the personalities Current sentiment analysis techniques can place a user in of others, before it can create like a human (Colton et al. a space of a dozen or so psychological dimensions, while 2008). ‘Software is not human,’ to quote the CC refrain, metaphors will occupy a space that – even after a process yet CC systems must appreciate what matters to a human. of dimension reduction – has hundreds of dimensions. In In any case, we can only know a CC system’s personality, fact, even the extraction of psychological dimensions is a and it can only know ours, by what we do or say, making compression process, since the textual timelines that feed personal/personalized CC a special case of contextual CC. into sentiment analysis are converted into high-dimension For CC systems that create in language, context is itself a distributed spaces built with word co-occurrence statistics. linguistic artifact, as rooted e.g., in our social media time- In the next section we focus on personality-driven CC lines. Here we describe how best to use linguistic context with a system that maps recent user moods into metaphors to deliver various forms of topical and personalized CC. and pictures. Our approach is data- and knowledge-based, Specifically, we will explore the role of linguistic con- marrying textual data from a user profile with a symbolic text in the operation of three Twitter-based CC systems, model of the cultural allusions that underpin a machine’s ranging from one that uses context to ensure topicality to metaphors. Following that, we give statistical form to the ones that view context as the imprint of a user personality. notion that metaphors reside in a space of possibilities, so as to re-imagine metaphor creation as a mapping from one The metaphor whose vector presents the smallest angle space, a topic model of the news, into another, a space of (the largest cosine) to an incoming news vector is chosen metaphors that shares exactly the same dimensions. These as the one with the most relevance to that news item. We are whimsical systems that make sport of news and mood, built our joint space by compressing 380,000 news items, so we present one more system, a CC book recommender 210,000 tweets (from sources including @nytimesworld, that uses simple information-retrieval techniques to guide @CNNbrk and @FOXnews) and 22,846,672 metaphors its suggestions, but which also uses machine creativity in from MetaphorMagnet (which were made available to us some unsubtle ways. This specific system was built for a on request) into the same LDA space of 100 dimensions. recent science communications event, and user feedback We used the gensim package of Řehůřek & Sojka (2010) offers us some lessons on the willingness of humans to be to build the space, and concatenated word lemmas to their judged by machines. Although whimsy can diminish the POS tags to provide a richer feature set to the model. severity of a perceived criticism, humour must be wielded The best pairings produced by this conflation of spaces with care by our autonomous CC systems, especially if it are tweeted hourly by our bot, called @MetaphorMirror. is unbidden, or used for the furtherance of serious goals. The thematic basis of the compression means that some pairings show more literal similarity that others, as in: Metaphor Mirror On The Wall From @WSJ: Sultan Abdullah of Pahang has been Consider the problem of generating apt metaphors for the chosen as Malaysia’s new king. news. As a story breaks and headlines stream on Twitter, ↑↓ we want our metaphor machine to pair an original and in- From @MetaphorMirror: What is a sultan but a sightful metaphor to each headline. So a headline about ruling crony? What is a crony but a subservient sultan? extreme weather might be paired with a metaphor about What drives ruling sultans to be toppled from thrones, nature’s destructive might, or a political scandal might be appointed by bosses and to become subvervient cronies? paired to a crime metaphor. As metaphor theorists often speak of multiple spaces – e.g. Koestler (1964), Lakoff & As is evident here, MetaphorMagnet’s hardboiled world- Johnson (1980) and Fauconnier & Turner (2002) all see view shines through in these pairings, offering meanings different viewpoints as different spaces – it is tempting to and perspectives that, while not actually present in a head- model each space in a metaphor with its own vector space line, can be read into the headline if one is so inclined. So model (VSM), by equating vector spaces with conceptual some pairings show that the VSM has learnt the lessons of spaces. Yet this analogy is misleading, as different VSMs history by reading the news, and this shines through too: – constructed from different text corpora – must have different dimensions (even if they share the same number From @AP: Congo's new President Felix of dimensions) and we cannot directly perform geometric Tshisekedi sworn into office; country's first peaceful comparisons between the vectors of two different VSMs. transfer of power since independence. Since the principal reason for building a VSM is the ease ↑↓ with which semantic tests can be replaced with geometric From @MetaphorMirror: How might an elected ones, we should build a single vector space that imposes incumbent become an unelected warlord? What if the same dimensions on each conceptual metaphor space. elected incumbents were to complete tenures, grab It is useful then to view news headlines and metaphors as power and become unelected warlords. comprising two overlapping subspaces of the same VSM. For a news subspace we collect a large corpus of news Any stereotyping in this response is a product of the VSM content from the Twitter feeds of CNN, Fox News, AP, and its large corpus of past news, rather than any bias in Reuters, BBC and New York Times, and use a standard MetaphorMagnet. Though the latter has a symbolic model compression technique – such as LDA (Latent Dirchlet of warlords and democrats, it is the news space that unites Allocation; Blei et al., 2003), LSA (Latent Semantic this generic model with the specific history of the Congo. Analysis; Landauer & Dumais, 1997) or Word2Vec (Mik- So even as a system strives for topicality, it must have one olov, 2013) – to generate a vector for each headline. We foot planted in the past if its outputs are to seem informed. additionally build a large metaphor corpus by running the Metaphor Magnet system of Veale (2015) on the Google n-grams (Brants & Franz, 2006), to give millions of meta- Fifty Shades of Dorian Gray phors that stretch across diverse topics.