Noun Generation for Nominalization in Academic Writing Dariush Saberi, John Lee Department of Linguistics and Translation City University of Hong Kong
[email protected],
[email protected] Abstract Previous studies on automatic manipulation of clausal structure have mostly concentrated on Nominalization is a common technique in aca- syntactic simplification, typically by splitting a demic writing for producing abstract and for- mal text. Since it often involves paraphras- complex sentence into two or more simple sen- ing a clause with a verb or adjectival phrase tences (Siddharthan, 2002; Alu´ısio et al., 2008; into a noun phrase, an important task is to Narayan and Gardent, 2014). More recent re- generate the noun to replace the original verb search has also attempted semi-automatic nomi- or adjective. Given that a verb or adjective nalization (Lee et al., 2018), which aims to para- may have multiple nominalized forms with phrase a complex clause into a simplex clause by similar meaning, the system needs to be able transforming verb or adjectival phrases into noun to automatically select the most appropriate one. We propose an unsupervised algorithm phrases. that makes the selection with BERT, a state- Noun generation is a core task in the nom- of-the-art neural language model. Experimen- inalization pipeline (Table2). Resources such tal results show that it significantly outper- as NOMLEX (Meyers et al., 1998) and CAT- forms baselines based on word frequencies, VAR (Habash and Dorr, 2003) have greatly facili- word2vec and doc2vec. tated this task by providing lists of related nouns, 1 Introduction verbs and adjectives.