Changing the Mind of Transformers for Topically-Controllable Language Generation
Haw-Shiuan Chang, Jiaming Yuan, Mohit Iyyer, Andrew McCallum
CICS, University of Massachusetts Amherst
[email protected], [email protected], {mccallum,miyyer}@cs.umass.edu

Abstract

Large Transformer-based language models can aid human authors by suggesting plausible continuations of text written so far. However, current interactive writing assistants do not allow authors to guide text generation in desired topical directions. To address this limitation, we design a framework that displays multiple candidate upcoming topics, of which a user can select a subset to guide the generation. Our framework consists of two components: (1) a method that produces a set of candidate topics by predicting the centers of word clusters in the possible continuations, and (2) a text generation model whose output adheres to the chosen topics. The training of both components is self-supervised, using only unlabeled text. Our experiments demonstrate that our topic options are better than those of standard clustering approaches, and our framework often generates fluent sentences related to the chosen topics, as judged by automated metrics and crowdsourced workers.

[Figure 1: Given an input prompt, the Transformer-based LM provides K = 10 topics that might be mentioned next, and each topic is represented by M = 3 words. The user can guide the generation process by choosing a subset of topics.]

1 Introduction

Recently, Transformer-based language models (LMs) have achieved impressive performance in language generation tasks (Radford et al., 2019; Dai et al., 2019) such as open-domain story generation (See et al., 2019a). When writing with an LM, users often desire an intuitive and effective way to control what the LM is going to generate (Keskar et al., 2019). To address this need, interactive writing assistants provide options that reveal possible developments of the story and generate continuations guided by the user-selected options.

Interactive writing assistants have wide applications in creative writing (Roemmele and Gordon, 2015; Clark et al., 2018; Akoury et al., 2020), education (Luo et al., 2015), and gaming (Walton, 2020). Nevertheless, the existing systems' options usually do not provide fine-grained control and/or require substantial human labor. In some prior work (Keskar et al., 2019; Tu et al., 2019), users choose among a static set of predefined attributes (e.g., sentiment) that provide only coarse-grained control. Other work (Roemmele and Gordon, 2015; Clark et al., 2018) presents users with multiple generated continuations, which requires substantial reading effort and might not contain the topics that users want to see. Finally, the options could be nodes in a plot graph that are handcrafted (Luo et al., 2015) or derived from a collaboration between humans and machines (Li et al., 2013), but such choices are usually limited due to the high cost of preparing the options.

To address these limitations, we propose an interactive writing framework that provides a set of topics and guides the text generation by the user-chosen topics.
The topic options are generated dynamically based on the input prompt to provide fine-grained control, and our models are self-supervised without the need to define attributes or collect annotations. As depicted in Figure 1, a user can peek at the most probable K topics (shown as bags of words) appearing after the input prompt and control the generation by choosing among the topics. In Figure 2, we compare multiple generated sentences conditioned on different chosen topic(s) or specified word(s). For example, if the user chooses a topic about humanity, life, and spirituality, our system continues the input prompt "Barack Obama writes a new book" with "on spirituality and the role of religion in society". Then, we can use the generated text as the new input prompt and update the set of topics to include other, more relevant topics such as God, Christ, and eternal. The process can be repeated to create a plot tree.

Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics, pages 2601–2611, April 19–23, 2021. ©2021 Association for Computational Linguistics

[Figure 2: Examples of our generated options and continuations. We highlight the words in the continuation that are related to the chosen topics or to the specified word.]

A user can also control the generation by specifying word(s), if the user wants to see words that are not in the topic list or seeks a transition to a word that is not directly related to the input prompt. For example, a user can ask our system to generate a sentence about zombie. Consequently, the continuation of "Barack Obama writes a new book" becomes "about the United States entitled I Don't Care...You Bet I'm a Zombie".

The system is realized by two components: an option generator and a conditional text generator. Given a prompt, the option generator suggests a set of K topics. After a user chooses a subset of the topics and specifies some words, the embedding of every word or topic guides the conditional text generator to produce a continuation that is both consistent with the existing prompt and relevant to the chosen topics and words.

Both components are self-supervised and use pretrained GPT2 models (Radford et al., 2019) to encode the input prompt. During training, the option generator predicts the cluster centers of future words, which appear in the continuation of the prompt, based on the contextualized embeddings from GPT2. The conditional text generator fine-tunes GPT2 to predict the next words given the prompt and a few subsequent words. Since both components' input and output come only from the prompt and its continuation, training the system requires only a raw corpus, word tokenizers, and a list of stop words. This makes the proposed method suitable for open-domain story generation and easy to fine-tune for a specific domain.

In experiments, we demonstrate that our system recommends high-quality topics and often generates sentences that follow the chosen topics. We compare our option generator with global topic models such as LDA (Blei et al., 2001) and with local topic models that cluster the words in the input prompt. The results show that the proposed method generates significantly more topics that are plausible and promote the narrative. Moreover, we compare our conditional text generator with PPLM (Plug and Play Language Models) (Dathathri et al., 2020) and demonstrate that our generation is more fluent and more relevant to the chosen topics. Our code is available at https://github.com/iesl/interactive_LM.
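The self-supervised target construction for the option generator (clustering the continuation's words and using the cluster centers as prediction targets) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: toy 2-d vectors stand in for GPT2's contextualized embeddings, a plain k-means loop stands in for the authors' exact clustering setup, and the function name `cluster_centers` and the stop-word list are hypothetical.

```python
# Sketch: cluster the (non-stopword) continuation words' embeddings
# and return the cluster centers as training targets.
import numpy as np

STOP_WORDS = {"the", "of", "a", "and", "is"}  # illustrative list

def cluster_centers(words, embed, n_clusters=2, n_iters=20, seed=0):
    """Return n_clusters cluster centers over embeddings of non-stopword tokens."""
    vecs = np.array([embed[w] for w in words if w not in STOP_WORDS], dtype=float)
    rng = np.random.default_rng(seed)
    # initialize centers from randomly chosen word vectors
    centers = vecs[rng.choice(len(vecs), n_clusters, replace=False)]
    for _ in range(n_iters):
        # assign each word vector to its nearest center (squared L2 distance)
        dists = ((vecs[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        # recompute each center; keep the old one if its cluster went empty
        for k in range(n_clusters):
            if (assign == k).any():
                centers[k] = vecs[assign == k].mean(0)
    return centers
```

During training, the option generator's predicted centers would be regressed toward such targets; at test time, each predicted center is displayed to the user as its closest M words.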
2 Method

The proposed framework consists of two components: an option generator and a conditional text generator. In Figure 3, we illustrate the two components and their interaction. First, given the prompt x1, ..., xI input by a user, the option generator at the bottom of the figure outputs K topics. After the user chooses two topics (about book and election) and specifies one extra word (story), the conditional text generator produces the continuation.

[Figure 3: Illustration of the two components. (a) The conditional text generator: a GPT2 encoder reads the prompt together with the embeddings of the chosen topics (weighted averages of GloVe vectors, e.g., for book, books, novels and election, elections, Democratic) and the specified word story, inserted through a linear layer, and samples the continuation from the softmax output. (b) The option generator: the prompt is encoded, cluster centers c1, ..., c10 are predicted, and each center is shown to the user as a topic T1, ..., T10 via its closest M words.]

One straightforward way to generate topic options is to cluster the words of the whole corpus into global topics (e.g., using a topic model such as LDA) without considering the prompt. However, the prompt might have a narrow focus, and the related words of interest are all clustered into a single topic. A simple remedy is to cluster only the words in the prompt rather than all the words in the corpus. These topics are created dynamically and locally given a prompt and can capture more fine-grained aspects of the continuations. However, the topics derived from the prompt might provide less inspiration because the users have already seen the prompt. Another major drawback of this approach is that the generated topics might encourage the LM to generate repetitive sentences or make the narrative circle inside a loop.

Motivated by these challenges, we propose an option generator that predicts the cluster centers based on the prompt instead of clustering the words in
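The "closest M words" step in Figure 3 can be sketched as a cosine nearest-neighbor lookup: each predicted cluster center is displayed to the user as its M most similar vocabulary words. This is a minimal sketch with a toy 2-d vocabulary standing in for the GloVe embedding space used in the paper; `topic_words` is a hypothetical helper name.

```python
# Sketch: map a predicted cluster center to its M nearest vocabulary
# words by cosine similarity, forming one displayed topic.
import numpy as np

def topic_words(center, vocab_embed, M=3):
    """Return the M vocabulary words closest to `center` by cosine similarity."""
    words = list(vocab_embed)
    mat = np.array([vocab_embed[w] for w in words], dtype=float)
    # normalize rows and the query so the dot product is cosine similarity
    mat = mat / np.linalg.norm(mat, axis=1, keepdims=True)
    c = np.asarray(center, dtype=float)
    c = c / np.linalg.norm(c)
    sims = mat @ c
    top = np.argsort(-sims)[:M]
    return [words[i] for i in top]
```

With K predicted centers, calling this once per center yields the K topics of M words each that the interface shows (e.g., K = 10 and M = 3 in Figure 1).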