
Clues: Generating Text with a Hidden Meaning

David Hardcastle
Open University, Milton Keynes, MK7 6AA
Birkbeck, University of London, London, WC1E 7HX
[email protected] [email protected]

Abstract

This paper discusses the generation of cryptic crossword clues: a task that involves generating texts that have both a surface reading, based on a natural language interpretation of the words, and a hidden meaning in which the strings that form the text can be interpreted as a word puzzle. The process of clue generation realizes a representation of the hidden, puzzle meaning of the clue through the aggregation of chunks of text. As these chunks are combined, syntactic and semantic selectional constraints are explored, and through this understanding task a meaningful surface reading is recovered. This hybrid language generation/language understanding process transforms a representation of the clue as a word puzzle into a representation of some meaningful assertion in the domain of the real world, mediated through the generated multi-layered text; a text which has two separate readings.

1 Introduction

This paper discusses a system called ENIGMA which generates cryptic crossword clues: fragments of text that also have a hidden meaning, quite different from the surface reading. This raises an interesting research question: how to generate text that has multiple layers of meaning based on different syntactic rules and different semantic interpretations. The input to the realizer is a production of a high-level cryptic clue grammar whose terminals are the strings that participate in the puzzle presented by the clue. These conceptualizations of possible crossword clues contain no implicit syntactic or semantic information, and so a mechanism is required to ensure that the resulting surface text is syntactically correct and semantically appropriate, while the meaning of the text, derived directly from the input, is not disturbed during lexicalization. As with computational humour and poetry generation, the process of generation is unusual in that the content is not specified in the input (Ritchie, 2001; Manurung et al., 2000), and this leads to tractability problems when considering the wide range of lexicalization options (see also Ritchie, 2005: 4), requiring a bespoke solution.

1.1 Cryptic Crossword Clues

The cryptic crossword clues generated by ENIGMA consist of two separate indications of the solution word, one of which is a definition, the other a puzzle based on its orthography. Consider, for example, the following simple clue for noiseless:

Still wild lionesses (9)

Here noiseless is represented both by the synonym still (the definition) and a wordplay puzzle (an anagram of lionesses) indicated by the convention keyword wild. All of the clues generated by the system conform to Ximenean conventions (Macnutt, 1966), a set of guidelines that impose restrictions on inflection and word order to ensure that clues are 'fair', and also encourage the use of homographs and convention vocabulary to make them cryptic in nature.

It is important to note here that there are two separate readings of this clue: a surface reading in which the clue is also a fragment of English text, and the puzzle reading required to solve the clue. In the surface reading the word still is an adverb qualifying the adjective wild, while in the puzzle reading it is an adjective that is a synonym for noiseless.

There are many different types of crossword clue wordplay, including anagrams, homophones, writing words backwards, appending words together, and more besides. ENIGMA generates clues using seven of the eight main types listed in Macnutt (1966) and can also generate complex clues with subsidiary puzzles. This coverage, combined with the richness of lexical choice in cryptic crossword convention vocabulary, means that it is not uncommon for ENIGMA to generate several hundred valid clues for a single input word.

1.2 Requirements for Generation

Given a particular solution word, such as noiseless, the first step for the system is to determine the different ways in which the letters of the solution could be presented as a puzzle. For example, the basis of the clue could be that noiseless is an anagram of lionesses, or that it can be formed by running noise and less together, or that it is composed of a river (Oise) followed by the letter l, all placed inside the word ness. In its present form ENIGMA locates 154 such formulations for the input word noiseless. Each of these formulations can be represented as a clue tree under ENIGMA's domain grammar for cryptic crosswords, in which the terminal elements are the strings used to compose the solution word. The sample clue tree in Figure 1 represents the fact that noiseless can be formed by running noise and less together, a puzzle type known as a Charade (Macnutt, 1966; Manley, 2001).

These clue trees contain no linguistic information: the terminals should be thought of as strings, not as words. To lexicalize this data the system must construct a fragment of natural language that can be reinterpreted, through the resolution of homographs and a knowledge of special conventions, as a valid cryptic clue puzzle based on this non-linguistic structure. Along the way the syntax and semantics of the puzzle reading must not be disturbed, or the clue will lose its hidden meaning. At the same time, the natural language syntactic and semantic information that is missing from the input data must be imposed on the clue so that a valid surface reading is achieved.

2 Chunk by Chunk Generation

A complete clue does not need to be a sentence, or even a clause: it can be any valid fragment of text, and ENIGMA takes advantage of this fact to simplify the generation algorithm. The clue tree shown in Figure 1 is realized through a process of composition. First the symbol labeled A is realized. Next B1 and B2 are realized individually and then combined to form B. Now A and B can themselves be combined to form the clue. Each realization is a fragment of text, and I refer to each of these fragments as a chunk, although I note that they are rather different from chunks based on major heads (Abney, 1989), for the reasons set out below. To implement this process the system needs to be able to do two things: create chunks for each terminal in the clue tree, and merge chunks into successively larger ones until the root of the tree is reached. This recursive process enables ENIGMA to construct complex clues with subsidiary puzzles using the same implementation it uses for simple puzzles.
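The clue-tree representation described above can be sketched in code. This is an illustrative reconstruction, not ENIGMA's actual implementation; the class names, the `puzzle_type` labels, and the `solution_letters` helper are all invented for the example.

```python
# Illustrative sketch of a clue tree: terminals are raw strings with no
# linguistic information; internal nodes carry a puzzle type.
from dataclasses import dataclass
from typing import List, Union

@dataclass
class Terminal:
    string: str  # a raw string, e.g. "noise" -- not yet a word of English

@dataclass
class PuzzleNode:
    puzzle_type: str  # e.g. "Charade", "Anagram" (names invented here)
    children: List[Union["PuzzleNode", Terminal]]

def solution_letters(node) -> str:
    """Letters a subtree contributes to the solution (hypothetical helper)."""
    if isinstance(node, Terminal):
        return node.string
    if node.puzzle_type == "Charade":   # run the parts together, in order
        return "".join(solution_letters(c) for c in node.children)
    if node.puzzle_type == "Anagram":   # letters re-ordered; the multiset is fixed
        return "".join(sorted(solution_letters(node.children[0])))
    raise ValueError(node.puzzle_type)

# The Figure 1 formulation: noiseless = noise + less (a Charade)
charade = PuzzleNode("Charade", [Terminal("noise"), Terminal("less")])
assert solution_letters(charade) == "noiseless"

# Another formulation: noiseless as an anagram of lionesses
anagram = PuzzleNode("Anagram", [Terminal("lionesses")])
assert solution_letters(anagram) == "".join(sorted("noiseless"))
```

The point of the sketch is that a formulation constrains only the strings and their combination, leaving every lexical and syntactic decision about the surface text open.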
2.1 Word Order

When chunks are combined together they cannot interleave or nest. The reason for this is that each chunk represents a part of the hidden meaning of the clue, and word order is central to its interpretation. This is why ENIGMA uses a flat structure rather than a tree structure to build up the clue.
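A minimal sketch of this flat, order-preserving composition follows; the `Chunk` class and function names are invented for illustration, not taken from ENIGMA.

```python
# Chunks merge by plain concatenation, never by interleaving or nesting,
# so the order of the strings carrying the hidden meaning is preserved.
from dataclasses import dataclass
from typing import List

@dataclass
class Chunk:
    words: List[str]  # the surface words of this fragment, in order

def attach_left(left: Chunk, right: Chunk) -> Chunk:
    """left attaches to the left of right: flat concatenation only."""
    return Chunk(left.words + right.words)

wordplay = attach_left(Chunk(["wild"]), Chunk(["lionesses"]))
clue = attach_left(Chunk(["still"]), wordplay)
assert clue.words == ["still", "wild", "lionesses"]
```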

3 Implementation

Figure 1. A clue tree that represents appending noise and less to form noiseless.

Each chunk can attach to another chunk to its left, to its right, or via an intermediary word or phrase, such as a conjunction, something I call 'upward attachment'. The grammar that underpins these attachments is encoded as a set of three extension points[1] on each chunk: one specifying the relationships that can occur to the left, another those that can occur to the right, and the third specifying upward attachments. For example, the chunk wild lionesses has, amongst many others, an extension point to the left indicating that it can attach as direct object to a verb, one to the right indicating that it can attach to a verb as subject, and an upward attachment through which it can attach via a coordinating conjunction to another noun.

In addition to specifying the relationship and target type, each extension point also specifies an erasure[2] for the chunk to which it belongs: this erasure indicates a word in the chunk that can stand in for the chunk as a whole when determining attachment. It is important to note that the erasure is not equivalent to the syntactic head, and that different extension points on the same chunk may have different erasures. For example, in addition to looking for a verb to the left, the chunk wild lionesses also has an extension point looking for an adverb to the left, since an adverb could qualify the adjective wild. Therefore, the extension point looking for a verb erases wild lionesses to lionesses, so that a verb chunk looking for a noun to its right as direct object will accept it, whereas the extension point looking for an adverb erases this same chunk to wild, so that an adverb looking to its right for an adjective to qualify could also accept it. In this way the concept of erasure makes it possible for a wider variety of syntactic dependencies to be encoded in the same way on a single chunk, enabling polymorphic behaviour.

In some respects the extension points and associated erasures encoded onto each chunk act like the categories on functors in Combinatory Categorial Grammar (Clark et al., 2002), or edges on the agenda used in chart generation (Kay, 1996), as they specify the type and directionality of the arguments available and the type of the result. However, in addition to this syntactic information, the grammar also provides the mechanism through which semantic selectional constraints are enforced. The erasures do not just specify a type (such as noun or adjective) but also a member of the chunk: wild or lionesses in this case. This enables the erasures to be used to determine which semantic checks are required to validate the attachment, adding to ENIGMA's implementation of chunks the semantic constraints that Abney notes as missing from his formulation (1989: 15). For example, if the chunk wild lionesses attaches to a chunk to its left that erases to a verb and is looking for a direct object, then the extension point governing this attachment enforces syntactic correctness, but this is not enough. Since the clue tree only contains information about crossword conventions, a separate semantic check is now required to ensure that it makes sense for the verb to take the noun lionesses as its direct object, and this semantic check will be performed using the relation and the erasures as arguments.

So, for example, when the chunk still is combined to the left of wild lionesses the system performs a semantic check to ensure that still can qualify wild. If the verb calm (an alternative homograph of a synonym for noiseless) is attached to the left then the system checks that lionesses can be the direct object of calm, and so on.

[1] The term extension point is more commonly used to define the interfaces to plug-in components in extensible computer systems.
[2] In Object-Oriented Programming an erasure is a simplification or genericisation of a type through some interface; see for example (Bracha et al., 2001).

4 Sample output

Figure 2 depicts system output from ENIGMA representing the sample clue given in the introduction. Since so many clues are generated, the system also generates a list of justifications which it uses to determine a score and rank the clues. The output shown in Figure 2 only includes information that relates to this paper; the full listing also contains information about the structure of the clue and the difficulty of the clue as a word puzzle. All of the explanatory text is generated using templates.

  Clue for [noiseless]
  Clue [Still wild lionesses (9)]
  POS [still/AV0 wild/AJ0 lionesses/NN2]
  Homograph pun: to solve the clue 'still' must be read as
    Adjective but has surface reading Adverb
  Sense: dependency fit 'wild lionesses' of type Adjective
    Modifier characterized as 'inferred'
  Attachment: 'wild lionesses' attached via type Attributive
    Adjective Modifier
  Sense: 'still wild' sense-checked for Intensifying Adverb
    attachment using the lexicon
  Attachment: 'still wild' attached via type Adverbial
    Qualifier (of Adjective)
  Thematic Fit: 'wild' and 'lionesses' share common thematic
    content.

Figure 2. Sample output from ENIGMA.

5 Discussion

The resulting clue texts are syntactically and semantically valid under the symbolic language grammar of the domain, and at the same time are plausible fragments of natural language. I plan to perform a mix of qualitative and quantitative evaluations on a set of generated clues, a reference set of previously published clues, and a set of control clues generated with no syntactic or semantic constraints, grouped into subsets that share the same solution word.
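The extension-point and erasure mechanism that enforces this dual validity (section 3) might be sketched as follows. All names here are invented for illustration, and the toy lexicon stands in for the collocational semantic lexicon; this is not ENIGMA's actual code.

```python
# Sketch: an extension point licenses a syntactic relation in one
# direction and names the erasure -- the word that stands in for the
# whole chunk when the attachment is checked semantically.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ExtensionPoint:
    direction: str  # "left", "right" or "upward"
    relation: str   # e.g. "direct_object", "adverb_qualifier"
    erasure: str    # word standing in for the whole chunk

@dataclass
class Chunk:
    words: List[str]
    extension_points: List[ExtensionPoint]

def can_attach(left_word: str, chunk: Chunk, relation: str,
               sense_check: Callable[[str, str, str], bool]) -> bool:
    """Syntactic check via extension points, then a semantic check
    taking the relation and the erasure as arguments."""
    for ep in chunk.extension_points:
        if ep.direction == "left" and ep.relation == relation:
            return sense_check(relation, left_word, ep.erasure)
    return False

# 'wild lionesses' erases to 'lionesses' for a verb seeking a direct
# object, but to 'wild' for an adverb seeking an adjective to qualify.
wild_lionesses = Chunk(
    ["wild", "lionesses"],
    [ExtensionPoint("left", "direct_object", "lionesses"),
     ExtensionPoint("left", "adverb_qualifier", "wild")])

def toy_lexicon(relation, head, dependent):
    # stand-in for the collocational lexicon lookup
    return (relation, head, dependent) in {
        ("direct_object", "calm", "lionesses"),
        ("adverb_qualifier", "still", "wild")}

assert can_attach("calm", wild_lionesses, "direct_object", toy_lexicon)
assert can_attach("still", wild_lionesses, "adverb_qualifier", toy_lexicon)
assert not can_attach("still", wild_lionesses, "direct_object", toy_lexicon)
```

The design point is that the same chunk exposes different erasures to different neighbours, which is what gives chunks their polymorphic behaviour.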

ENIGMA constructs clues using their hidden meaning as the starting point. Lexical choice is very unrestricted, while word order is quite tightly constrained. This leads to combinatorial explosion in lexical choice, but of the intractably large number of possible productions for each clue very few also function as viable fragments of natural language. ENIGMA's approach is to work through the structure of the hidden clue and determine constraints on the surface reading on the fly. The compositional process reins in the combinatorial explosion by pushing language constraints down to the most local level at which they can operate.

ENIGMA uses various generic language understanding resources, built specifically for the application, during the generation process to ensure that the syntactic relationships behind the clue's surface reading are semantically supported:

- A Collocational Semantic Lexicon, mined from the British National Corpus and augmented using WordNet, determines if a proposed dependency relation between two words is semantically probable (Hardcastle, 2007). This lexicon is used to impose selectional constraints on syntactic dependency relations, such as between a verb and its direct object.
- A Word Association Measure, based on a distributional analysis of data in the British National Corpus, is used to evaluate the thematic coherence of the clue (Hardcastle, 2005).
- A Phrase Dictionary, derived from the Moby Compound word list[3], is used to identify aggregations that result in the creation of multi-word units such as compound nouns or phrasal verbs.

[3] http://www.dcs.shef.ac.uk/research/ilash/Moby/.

References

S. Abney. (1989). Parsing by Chunks. In The MIT Parsing Volume, edited by C. Tenny. The MIT Press.

G. Bracha, N. Cohen, C. Kemper, S. Mark, M. Odersky, S.-E. Panitz, D. Stoutamire, K. Thorup, and P. Wadler. (2001). Adding generics to the Java programming language: Participant draft specification. Technical Report, Sun Microsystems.

S. Clark, J. Hockenmaier, and M. Steedman. (2002). Building deep dependency structures with a wide-coverage CCG parser. In Proceedings of the 40th Meeting of the ACL, pages 327–334.

D. Hardcastle. (2005). An examination of word association scoring using distributional analysis in the British National Corpus: what is an interesting score and what is a useful system? In Proceedings of Corpus Linguistics, Birmingham.

D. Hardcastle. (2007). Building a Collocational Semantic Lexicon. Technical Report BBKCS-07-02. Birkbeck, London.

M. Kay. (1996). Chart Generation. In Proceedings of the 34th Meeting of the ACL, pages 200–204.

D. S. Macnutt. (1966). On the Art of the Crossword. Swallowtail.

D. Manley. (2001). Chambers Crossword Manual. Chambers.

H. Manurung, G. Ritchie and H. Thompson. (2000). Towards a Computational Model of Poetry Generation. In Proceedings of the AISB Symposium on Creative and Cultural Aspects and Applications of AI and Cognitive Science, pages 79–86.

G. Ritchie. (2001). Current Directions in Computational Humour. Artificial Intelligence Review 16(2), pages 119–135.

G. Ritchie. (2005). Computational Mechanisms for Pun Generation. In Proceedings of the 10th European Natural Language Generation Workshop, pages 125–132.
