Semantic Web 0 (0) 1
IOS Press

Using Natural Language Generation to Bootstrap Empty Articles: A Human-centric Perspective

Lucie-Aimée Kaffee*, Pavlos Vougiouklis and Elena Simperl
School of Electronics and Computer Science, University of Southampton, UK
E-mails: [email protected], [email protected], [email protected]
*Corresponding author. E-mail: [email protected].

Abstract. Nowadays natural language generation (NLG) is used in everything from news reporting and chatbots to social media management. Recent advances in machine learning have made it possible to train NLG systems to achieve human-level performance in text writing and summarisation. In this paper, we propose such a system in the context of Wikipedia and evaluate it with Wikipedia readers and editors. Our solution builds upon the ArticlePlaceholder, a tool used in 14 under-served Wikipedias, which displays structured data from the Wikidata knowledge base on empty Wikipedia pages. We train a neural network to generate text from the Wikidata triples shown by the ArticlePlaceholder, and explore how Wikipedia users engage with it. The evaluation, which includes an automatic, a judgement-based, and a task-based component, shows that the text snippets score well in terms of perceived fluency and appropriateness for Wikipedia, and can help editors bootstrap new articles. It also hints at several potential implications of using NLG solutions in Wikipedia at large, including content quality, trust in technology, and algorithmic transparency.

Keywords: Wikipedia, Wikidata, ArticlePlaceholder, Under-resourced languages, Natural Language Generation, Neural networks

1. Introduction

Wikipedia is available in 301 languages, but its content is unevenly distributed [1]. Under-resourced language versions face multiple challenges: fewer editors mean fewer articles and less quality control, making that particular Wikipedia less attractive for readers in that language, which in turn makes it more difficult to recruit new editors from among the readers.

Wikidata, the structured-data backbone of Wikipedia [2], offers some help. It contains information about more than 55 million entities, for example, people, places or events, edited by an active international community of volunteers [3]. More importantly, it is multilingual by design and each aspect of the data can be translated and rendered to the user in their preferred language [4]. This makes it the tool of choice for a variety of content integration affordances in Wikipedia, including links to articles in other languages and infoboxes. An example can be seen in Figure 1: in the French Wikipedia, the infobox shown in the article about cheese (right) automatically draws in data from Wikidata (left) and displays it in French.

In previous work of ours, we proposed the ArticlePlaceholder, a tool that takes advantage of Wikidata's multilingual capabilities to increase the coverage of under-served Wikipedias [5]. When someone looks for a topic that is not yet covered by Wikipedia in their language, the ArticlePlaceholder tries to match the topic with an entity in Wikidata. If successful, it then redirects the search to an automatically generated placeholder page that displays the relevant information, for example the name of the entity and its main properties, in their language. The ArticlePlaceholder is currently used in 14 Wikipedias (see Section 3.1).


Fig. 1. Representation of Wikidata triples and their inclusion in a Wikipedia infobox. Wikidata triples (left) are used to fill out the fields of the infobox in the article about fromage in the French Wikipedia.

In this paper, we propose an iteration of the ArticlePlaceholder to improve the representation of the data on the placeholder page. The original version of the tool pulled the raw data from Wikidata (available as triples with labels in different languages) and displayed it in tabular form (see Figure 3 in Section 3). In the current version, we use Natural Language Generation (NLG) techniques to automatically produce a text snippet from the triples instead. Presenting structured data as text rather than tables helps people uninitiated with the involved technologies to make sense of it [6]. This is particularly useful in contexts where one cannot make any assumptions about the levels of data literacy of the audience, as is the case for a large share of the Wikipedia readership.

Our NLG solution builds upon the general encoder-decoder framework for neural networks, which is credited with promising results in similar text-centric tasks, such as machine translation [7, 8] and question generation [9–11]. We extend this framework to meet the needs of different Wikipedia language communities in terms of text fluency, appropriateness to Wikipedia, and reuse during article editing. Given an entity that was matched by the ArticlePlaceholder, our system uses its triples to generate a short Wikipedia-style summary. Many existing NLG techniques produce sentences with limited usability in user-facing systems; one of the most common problems is handling rare words [12, 13], which are words that the model does not meet frequently enough during training, such as localisations of names in different languages. We introduce a mechanism called property placeholder [14] to tackle this problem, learning multiple verbalisations of the same entity in the text [6].

In building the system we aimed to pursue the following research questions:

RQ1 Can we train a neural network to generate text from triples in a low-resource setting? To answer this question we first evaluated the system using a series of predefined metrics and baselines. In addition, we undertook a quantitative study with participants from two under-served Wikipedia language communities (Arabic and Esperanto), who were asked to assess, from a reader's perspective, whether the text is fluent and appropriate for Wikipedia.

RQ2 How do editors perceive the generated text on the ArticlePlaceholder page? To add depth to the quantitative findings of the first study, we undertook a second, mixed-methods study within six Wikipedia language communities (Arabic, Swedish, Hebrew, Persian, Indonesian, and Ukrainian).

We carried out semi-structured interviews, in which we asked editors to comment on their experience with reading the summaries generated through our approach, and we identified common themes in their answers. Among others, we were interested to understand how editors perceive text that is the result of an artificial intelligence (AI) algorithm rather than being manually crafted, and how they deal with the special rare-entity tokens in the sentences. Those tokens represent realisations of infrequent entities in the text that data-driven approaches generally struggle to verbalise [12].

RQ3 How do editors use the text snippets in their work? As part of the second study, we also asked participants to edit the placeholder page, starting from the automatically generated text or removing it completely. We assessed text reuse both quantitatively, using a string-matching metric, and qualitatively through the interviews. Just like in RQ2, we were also interested to understand whether summaries with rare-entity tokens, which point to limitations in the algorithm, would be used when editing and how the editors would work around the tokens.

The evaluation helps us build a better understanding of the tools and experience we need to help nurture under-served Wikipedias. Our quantitative analysis of the reading experience showed that participants rank the text snippets close to the expected quality standards in Wikipedia, and are likely to consider them as part of Wikipedia. This was confirmed by the interviews with editors, which suggested that people believe the summaries to come from a Wikimedia-internal source. According to the editors, the new format of the ArticlePlaceholder enhances the reading experience: people tend to look for specific bits of information when accessing a Wikipedia page and the compact nature of the generated text supports that. In addition, the text seems to be a useful starting point for further editing, and editors reuse a large portion of it even when it includes rare-entity tokens.

We believe the two studies could also help advance the state of the art in two other areas: together, they propose a user-centred methodology to evaluate NLG, which complements automatic approaches based on standard metrics and baselines, which are the norm in most papers; at the same time, they also shed light on the emerging area of human-AI interaction in the context of NLG. While the editors worked their way around the tokens both during reading and writing, they did not check the text for correctness, nor did they query where the text came from and what the tokens meant. This suggests that we need more research into how to communicate the provenance of content in Wikipedia, especially in the context of automatic content generation and deep fakes [15], as well as algorithmic transparency.

Structure of the paper. The remainder of the paper is organised as follows. We start with some background for our work and related papers in Section 2. Next, we introduce our approach to bootstrapping empty Wikipedia articles, which includes the ArticlePlaceholder tool and its NLG extension (Section 3). In Section 4 we provide details on the evaluation methodology, whose findings we present in Section 5. We then discuss the main themes emerging from the findings, their implications and the limitations of our work in Section 6, before concluding with a summary of contributions and planned future work in Section 8.

Previous submissions. A preliminary version of this work was published in [14, 16]. In the current paper, we have carried out a comprehensive evaluation of the approach, including a new qualitative study and a task-based evaluation with editors from six language communities. By comparison, the previous publications covered only a metric-based corpus evaluation, which was complemented by a small quantitative study of text fluency and appropriateness in the second one. The neural network architecture has been presented in detail in [14].

2. Background and related work

We divide this section into three areas. First we provide some background on Wikipedia and Wikidata, with a focus on multilingual aspects. Then we summarise the literature on text generation and conclude with methodologies to evaluate NLG systems.

2.1. Multilingual Wikipedia and Wikidata

Wikipedia is a community-built encyclopedia and one of the most visited websites in the world. There are currently 301 language versions of Wikipedia, though coverage is unevenly distributed. Previous studies have discussed several biases, including the gender of the editors [17], and topics, for instance a general lack of information on the Global South [18].

Language coverage tells a similar story. [19] noted that only 67% of the world's population has access to encyclopedic knowledge in their first or second language. Another significant problem is caused by the extreme differences in content coverage between language versions. As early as 2005, Voss [20] found huge gaps in the development and growth of Wikipedias, which make it more difficult for smaller communities to catch up with the larger ones. Alignment cannot be simply achieved through translation: putting aside the fact that each Wikipedia needs to reflect the interests and points of view of its local community rather than iterate over content transferred from elsewhere, studies have shown that the existing content does not overlap, discarding the so-called English-as-Superset conjecture for as many as 25 language versions [1, 21]. To help tackle these imbalances, Bao et al. [22] built a system to give users easier access to the language diversity of Wikipedia.

Our work is motivated by and complements previous studies and frameworks that argue that the language of global projects such as Wikipedia (Hecht [21]) should express cultural reality (Kramsch and Widdowson [23]). Instead of using content from one Wikipedia version to bootstrap another, we take structured data labelled in the relevant language and create a more accessible representation of it as text, keeping those cultural expressions unimpaired.

To do so, we leverage Wikidata [2]. Wikidata was originally created to support Wikipedia's language connections, for instance links to articles in other languages or infoboxes (see the example from Section 1); however, it soon evolved into becoming a critical source of data for many other applications. Wikidata contains statements on general knowledge, e.g. about people, places, events and other entities of interest. The knowledge base is created and maintained collaboratively by a community of editors, assisted by automated tools called bots [3, 24]. Bots take on repetitive tasks such as ingesting data from a different source, or simple quality checks.

The basic building blocks of Wikidata are items and properties. Both have identifiers, which are language-independent, and labels in one or more languages. They form triples, which link items to their attributes, or to other items. Figure 2 shows a triple linking a place to its country, where both items and the property between them have labels in three languages.

Fig. 2. Example of labeling in Wikidata. Each entity can be labeled in multiple languages using the labeling property rdfs:label (not displayed in the figure).

We analysed the distribution of label languages in Wikidata in [4] and noted that while there is a bias to English, the language distribution is more balanced than on the web at large. This was the starting point for our work on the ArticlePlaceholder (see Section 3), which leverages the multilingual support of Wikidata to bootstrap empty articles in under-served Wikipedias.

2.2. Text generation

Our work is directly related to approaches that produce text from triples expressed as RDF (Resource Description Framework) or similar. Many of them rely on templates, which are either based on linguistic features, e.g. grammatical rules [25], or are hand-crafted [26]. An example is Reasonator, a tool for lexicalising Wikidata triples with templates translated by users.1 Such approaches face many challenges: they cannot be easily transferred to different languages or scale to broader domains, as templates need to be adjusted to any new language or domain they are ported to. This makes them unsuitable for under-resourced Wikipedias, which rely on small numbers of contributors.

To tackle this limitation, [27, 28] introduced a distant-supervised method to verbalise triples, which learns templates from existing Wikipedia articles. While this makes the approach more suitable for language-independent tasks, the templates assume that entities will always have the relevant triples to fill in the slots. This assumption is not always true. In our work, we propose a template-learning baseline and show that adapting to the varying triples available can achieve better performance.

Sauper and Barzilay [29] and Pochampally et al. [30] generate Wikipedia summaries by harvesting sentences from the Internet. Wikipedia articles are used to automatically derive templates for the topic structure of the summaries and the templates are afterwards filled using web content. Both systems work best on specific domains and for languages like English, for which suitable web content is readily available [31].

1 https://tools.wmflabs.org/reasonator/

A different line of research aims to explore knowledge bases as a resource for NLG [6, 27, 32, 33]. In all these examples, linguistic information from the knowledge base is used to build a parallel corpus containing triples and equivalent text sentences from Wikipedia, which is then used to train the NLG algorithm. Directly relevant to the model we propose are the proposals by Lebret et al. [34], Chisholm et al. [32], and Vougiouklis et al. [6], which extend the general encoder-decoder neural network framework from [7, 8]. They use structured data from Wikidata and DBpedia as input and generate one-sentence summaries that match the style of the English Wikipedia in a single domain. While this is a rather narrow task compared to other generative tasks such as translation, Chisholm et al. [32] discuss its challenges in detail and show that it is far from being solved. Compared to these previous works, the NLG algorithm presented in this paper is designed for open-domain tasks and multiple languages.

2.3. Evaluating text generation systems

Related literature suggests three ways of determining how well an NLG system achieves its goals. The first, which is commonly referred to as metric-based corpus evaluation [35], uses text-similarity metrics such as BLEU [36], ROUGE [37] and METEOR [38]. These metrics essentially compare how similar the generated texts are to texts from the corpus. The other two involve people and are either task-based or judgement/rating-based [35]. Task-based evaluations aim to assess how an NLG solution assists participants in undertaking a particular task, for instance learning about a topic or writing a text. Judgement-based evaluations rely on a set of criteria against which participants are asked to rate the quality of the automatically generated text [35].

Metric-based corpus evaluations are widely used as they offer an affordable, reproducible way to automatically assess the linguistic quality of the generated texts [14, 32, 34, 35, 39, 40]. However, they do not always correlate with manually curated quality ratings [35].

Task-based studies are considered most useful, as they allow system designers to explore the impact of the NLG solution on end-users [35, 41]. However, they can be resource-intensive: previous studies [42, 43] cite five-figure sums, including data analysis and planning [44]. The system in [43] was evaluated for the accuracy of the generated literacy and numeracy assessments by a sample of 230 participants, which cost as much as 25 thousand UK pounds. Reiter et al. [42] described a clinical trial with over 2000 smokers costing 75 thousand pounds, assumed to be the most costly NLG evaluation at this point. All of the smokers completed a smoking questionnaire in the first stage of the experiment, in order to find what portion of those who received the automatically generated letters from STOP had managed to quit.

Given these challenges, most research systems tend to use judgement-based rather than task-based evaluations [27, 28, 32, 35, 39, 40, 45, 46]. Besides their limited scope, most studies in this category do not recruit from the relevant user population, relying on more accessible options such as online crowdsourcing. Sauper and Barzilay [29] is a rare exception. In their paper, the authors describe the generation of Wikipedia articles using a content-selection algorithm that extracts information from online sources. They test the results by publishing 15 articles about diseases on Wikipedia and measuring how the articles change (including links, formatting and grammar). Their evaluation approach is not easy to follow, as the Wikipedia community tends to disagree with conducting research on their platform.2

The methodology we follow draws from all these three areas: we start with an automatic, metric-based corpus evaluation to get a basic understanding of the quality of the text the system produces and compare it with relevant baselines. We then use quantitative analysis based on human judgements to assess how useful the text snippets are for core tasks such as reading and editing. To add context to these figures, we run a mixed-methods study with an interview and a task-based component, where we learn more about the user experience with NLG and measure the extent to which NLG text is reused by editors, using a metric inspired by journalism [47] and plagiarism detection [48].

3. Bootstrapping empty Wikipedia articles

The overall aim of our system is to give editors access to information that is not yet covered in Wikipedia, but is available, in the relevant language, in Wikidata. The system is built on the ArticlePlaceholder, which displays Wikidata triples dynamically on under-served Wikipedias.

2 https://en.wikipedia.org/wiki/Wikipedia:What_Wikipedia_is_not#Wikipedia_is_not_a_laboratory

In this paper, we extend the ArticlePlaceholder with an NLG component that generates an introductory sentence on each ArticlePlaceholder page in the target language from Wikidata triples.

3.1. ArticlePlaceholder

As discussed earlier, some Wikipedias suffer from a lack of content, which means fewer readers and, in turn, fewer potential editors. The idea of the ArticlePlaceholder is to use Wikidata, which contains information about 55 million entities (by comparison, the English Wikipedia covers around 5 million topics), often in different languages, to bootstrap articles in under-served language versions. An initial version of this tool was presented in [5].

ArticlePlaceholders are pages on Wikipedia that are dynamically drawn from Wikidata triples. When the information in Wikidata changes, the ArticlePlaceholder pages are automatically updated. In the original release, the pages display the triples in a tabular way, purposely not reusing the design of a standard Wikipedia page to make the reader aware that the page was automatically generated and requires further attention. An example of the interface can be seen in Figure 3.

The ArticlePlaceholder is deployed on 14 Wikipedias with a median of 69,623.5 articles, between 253,539 (Esperanto) and 7,464 (Northern Sami).

3.2. Text generation

We use a data-driven approach that allows us to append a short text to ArticlePlaceholder pages.

3.2.1. Neural architecture

We reuse the encoder-decoder architecture introduced in previous work of ours (Vougiouklis et al. [6]), which was focused on a closed-domain text generation task for English. The model consists of a feed-forward architecture, the triple encoder, which encodes an input set of triples into a vector of fixed dimensionality, and a Recurrent Neural Network (RNN) based on Gated Recurrent Units (GRUs) [7], which generates a text snippet by conditioning the output on the encoded vector.

The model is displayed in Figure 4. The ArticlePlaceholder provides a set of triples about the Wikidata item of Nigragorĝa (i.e., Q1586267 (Nigragorĝa) is either the subject or the object of the triples in the set). Figure 4 shows how the model generates a summary from those triples, f1, f2 to fn. A vector representation h_f1, h_f2 to h_fn for each of the input triples is computed by processing their subject, predicate and object. Such triple-level vector representations help compute a vector representation for the whole input set, h_FE. h_FE, along with a special start-of-summary token, initialises the decoder, which predicts tokens ("[[Q1586267, Nigragorĝa]]", "estas", "rimarkinda" etc.).

Surface form tuples. A summary such as "Nigragorĝa estas rimarkinda ..." consists of regular words (e.g. "estas" and "rimarkinda") and mentions of entities in the text ("[[Q1586267, Nigragorĝa]]"). However, an entity can be expressed in a number of different ways in the summary. For example, "actor" and "actress" refer to the same concept in the knowledge graph. In order to be able to learn an arbitrary number of different lexicalisations of the same entity in the summary (e.g. "aktorino", "aktoro"), we adapt the concept of surface form tuples [6]. "[[Q1586267, Nigragorĝa]]" is an example of a surface form tuple whose first element is the entity in the knowledge graph and whose second is its realisation in the context of this summary.

Property placeholders. The model from [6], which was the starting point for the component presented here, leverages instance-type-related information from DBpedia in order to generate text that covers rare or unseen entities. We broadened its scope and adapted it to Wikidata without using external information from other knowledge bases. In the current setup, surface form tuples in the text that correspond to rare entities that are found to participate in relations in the input triple set are replaced by the token of the property of the matched relationship. We refer to those placeholder tokens [10, 13] as property placeholders. These tokens are appended to the target dictionary of the generated summaries. We used the distant-supervision assumption for relation extraction [49] for the property placeholders. After identifying the rare entities that participate in relations with the main entity of the article, they are replaced in the introductory sentence with their corresponding property placeholder tag (e.g. [[P17]] in Table 1). During testing, any property placeholder token that is generated by our system is replaced by the label of the entity of the relevant triple (i.e. the triple with the same property as the generated token). In Table 1, [[P17]] in the processed summary is an example of a property placeholder. In case it is generated by our model, it is replaced with the label of the object of the triple with which it shares the same property (i.e. Q490900 (Floridia) P17 (ŝtato) Q38 (Italio)).
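For illustration only, the following is a minimal PyTorch-style sketch of an architecture of the kind described above: a feed-forward triple encoder whose per-triple vectors are concatenated to initialise a GRU decoder. It is not the authors' implementation; all class names, dimensions and tensor layouts are assumptions made for the example.

```python
import torch
import torch.nn as nn

class TripleEncoder(nn.Module):
    """Feed-forward encoder: one vector h_f per (subject, predicate, object) triple."""
    def __init__(self, vocab_size, emb_dim, hidden_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.ff = nn.Linear(3 * emb_dim, hidden_dim)

    def forward(self, triples):            # triples: (batch, n_triples, 3) token ids
        e = self.embed(triples)            # (batch, n_triples, 3, emb_dim)
        e = e.flatten(start_dim=2)         # concatenate subject, predicate, object
        return torch.tanh(self.ff(e))      # (batch, n_triples, hidden_dim)

class SummaryDecoder(nn.Module):
    """GRU decoder conditioned on the concatenated triple vectors (h_FE in Figure 4)."""
    def __init__(self, target_vocab_size, emb_dim, hidden_dim, n_triples):
        super().__init__()
        self.embed = nn.Embedding(target_vocab_size, emb_dim)
        self.init_state = nn.Linear(n_triples * hidden_dim, hidden_dim)
        self.gru = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, target_vocab_size)

    def forward(self, h_triples, summary_ids):
        # h_FE: concatenation of the per-triple vectors
        h_fe = h_triples.flatten(start_dim=1)
        h0 = torch.tanh(self.init_state(h_fe)).unsqueeze(0)
        e = self.embed(summary_ids)        # summary begins with a start-of-summary token
        output, _ = self.gru(e, h0)
        # scores over regular words, surface form tuples and
        # property placeholders such as [[P17]]
        return self.out(output)
```

The target vocabulary in this sketch would contain words, surface form tuples and property placeholder tokens alike, mirroring the extended dictionary described above.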

Fig. 3. Example page of the ArticlePlaceholder as deployed now on 14 Wikipedias. This example contains information from Wikidata on Triceratops in Haitian Creole.

Fig. 4. Representation of the neural network architecture. The triple encoder computes a vector representation for each one of the three input triples from the ArticlePlaceholder, h_f1, h_f2 and h_f3. Subsequently, the decoder is initialised using the concatenation of the three vectors, [h_f1; h_f2; h_f3]. The purple boxes represent the tokens of the generated text. Each snippet starts and ends with special tokens: start-of-summary and end-of-summary.

Table 1
The ArticlePlaceholder provides our system with a set of triples about Floridia, in which the item of Floridia appears as either the subject or the object. Subsequently, our system summarises the input set of triples as text. We train our model using the summary with the extended vocabulary (i.e. "Summary w/ Property placeholders").

ArticlePlaceholder triples:
f1: Q490900 (Floridia) P17 (ŝtato) Q38 (Italio)
f2: Q490900 (Floridia) P31 (estas) Q747074 (komunumo de Italio)
f3: Q30025755 (Floridia) P1376 (ĉefurbo de) Q490900 (Floridia)

Textual summary: Floridia estas komunumo de Italio.
Summary w/ property placeholders: [[Q490900, Floridia]] estas komunumo de [[P17]].
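To make the preprocessing of Table 1 concrete, the sketch below replaces rare entity mentions with the property placeholder of the triple they participate in. The data structures, the "[[Q...]]" token format and the fallback token name are simplifications assumed for this example, not the authors' code.

```python
def add_property_placeholders(summary_tokens, aligned_triples, frequent_entities):
    """Replace rare entity mentions (surface form tuples) with property placeholders.

    summary_tokens    -- e.g. ["[[Q490900, Floridia]]", "estas", "komunumo", "de", "[[Q38, Italio]]", "."]
    aligned_triples   -- list of (subject, property, object) Wikidata identifiers
    frequent_entities -- entity mentions that stay in the target vocabulary
    """
    processed = []
    for token in summary_tokens:
        if token.startswith("[[Q") and token not in frequent_entities:
            entity_id = token[2:].split(",")[0]          # e.g. "Q38"
            # distant-supervision assumption: the rare entity takes part in one of
            # the input triples; reuse that triple's property as the placeholder
            prop = next((p for s, p, o in aligned_triples
                         if entity_id in (s, o)), None)
            # "<rare>" is a stand-in name for the paper's special rare-entity token
            processed.append(f"[[{prop}]]" if prop else "<rare>")
        else:
            processed.append(token)
    return processed

# With the triples of Table 1, the summary about Floridia would keep the main entity
# and turn the mention of Q38 (Italio) into [[P17]], as in the last row of the table.
```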

3.2.2. Training dataset

In order to train and evaluate our system, we created a dataset for text generation from knowledge base triples in two languages. We used two language versions of Wikipedia (we provide further details about how we prepared the summaries for the rest of the languages in Section 4.3), which differ in terms of size (see Table 2) and language support in Wikidata [4]. The dataset aligns Wikidata triples about an item with the first sentence of the Wikipedia article about that entity.

For each Wikipedia article, we extracted and tokenized the first sentence using a multilingual Regex tokenizer from the NLTK toolkit [50]. Afterwards, we retrieved the Wikidata item corresponding to the article and queried all triples where the item appeared as a subject or an object in the Wikidata truthy dump.3 In doing so, we relied on keyword matching against labels from Wikidata in the corresponding language, due to the lack of reliable entity linking tools for under-served languages.

We use property placeholders (as described in the previous section) to mitigate the lack of vocabulary typical for under-resourced languages. An example of a summary which is used for the training of the neural architecture is: "Floridia estas komunumo de [[P17]].", where [[P17]] is the property placeholder for Italio (see Table 1). In case a rare entity in the text is not matched to any of the input triples, its realisation is replaced by the special rare-entity token.

4. Evaluation methodology

We followed a mixed-methods approach to investigate the three questions discussed in the introduction (Table 3). To answer RQ1, we used an automatic metric-based corpus evaluation, as well as a judgement-based quantitative evaluation with readers of two language Wikipedias, who are native (or fluent, in the case of Esperanto) speakers of the language. We showed them text generated through our approach, as well as genuine Wikipedia sentences and news snippets of similar length, and asked them to rate their fluency and appropriateness for Wikipedia on a scale. To answer RQ2 and RQ3, we carried out an interview study with editors of six Wikipedias, with qualitative and quantitative components. We instructed editors to complete a reading and an editing task and to describe their experiences along a series of questions. We used thematic analysis to identify common themes in the answers. For the editing task, we also used a quantitative element in the form of a text reuse metric, which is described below.

4.1. RQ1 - Metric-based corpus evaluation

We evaluated the generated summaries against two baselines on their original counterparts from Wikipedia. We used a set of evaluation metrics for text generation: BLEU 2, BLEU 3, BLEU 4, METEOR and ROUGE_L. BLEU calculates n-gram precision multiplied by a brevity penalty, which penalizes short sentences to account for word recall. METEOR is based on the combination of uni-gram precision and recall, with recall weighted over precision. It extends BLEU by including stemming, synonyms and paraphrasing. ROUGE_L is a recall-based metric which calculates the length of the longest common subsequence between the generated summary and the reference.

4.1.1. Data

Both the Arabic and the Esperanto corpus are split into training, validation and test sets, with respective portions of 85%, 10%, and 5%. We used a fixed target vocabulary that consisted of the 15,000 and 25,000 most frequent tokens (i.e. either words or surface form tuples of entities) of the summaries in Arabic and Esperanto respectively.

We replaced any rare entities in the text that participate in relations in the aligned triple set with the corresponding property placeholder of the upheld relations. We include all property placeholders that occur at least 20 times in each training dataset. Subsequently, the dictionaries of the Esperanto and Arabic summaries are expanded by 80 and 113 property placeholders respectively. In case the rare entity is not matched to any subject or object of the set of corresponding triples, it is replaced by the special rare-entity token. Each summary in the corpora is augmented with the respective start-of-summary and end-of-summary tokens. The former acts as a signal with which the decoder initialises the text generation process, whereas the latter is outputted when the summary has been generated completely [8, 51].

4.1.2. Baselines

Due to the variety of approaches for text generation, we demonstrate the effectiveness of our system by comparing it against two baselines of different nature.

3 https://dumps.wikimedia.org/wikidatawiki/entities/
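As a rough illustration of the alignment step described in Sections 3.2.2 and 4.1.1, the sketch below tokenises an article's first sentence with an NLTK regular-expression tokenizer and marks entities by keyword matching against Wikidata labels. The tokenizer pattern, the data structures and the function name are illustrative assumptions, not the exact pipeline used for the paper.

```python
from nltk.tokenize import RegexpTokenizer

# crude multilingual word/punctuation tokenizer, similar in spirit to the one used
tokenizer = RegexpTokenizer(r"\w+|[^\w\s]")

def align_sentence_with_triples(first_sentence, triples, labels, language):
    """Find which triple entities are mentioned in the article's first sentence.

    triples -- (subject, property, object) id triples where the item is subject or object
    labels  -- dict: entity id -> {language code: label}
    Returns the sentence tokens and the entity ids whose labels occur in the sentence.
    """
    tokens = tokenizer.tokenize(first_sentence.lower())
    matched = []
    for s, p, o in triples:
        for entity in (s, o):
            label = labels.get(entity, {}).get(language)
            if not label:
                continue
            label_tokens = tokenizer.tokenize(label.lower())
            # naive keyword matching: the label's tokens appear as a contiguous span
            for i in range(len(tokens) - len(label_tokens) + 1):
                if tokens[i:i + len(label_tokens)] == label_tokens:
                    matched.append(entity)
                    break
    return tokens, matched
```

Keyword matching of this kind is deliberately simple; as noted above, it is used because reliable entity linking tools are not available for most under-served languages.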

Table 2
Recent page statistics and number of unique words (vocabulary size) of the Esperanto, Arabic and English Wikipedias in comparison with Wikidata. Retrieved 27 September 2017. Active users are registered users that have performed an action in the last 30 days.

Page stats | Esperanto | Arabic | English | Wikidata
Articles | 241,901 | 541,166 | 5,483,928 | 37,703,807
Average number of edits/page | 11.48 | 8.94 | 21.11 | 14.66
Active users | 2,849 | 7,818 | 129,237 | 17,583
Vocabulary size | 1.5M | 2.2M | 2M | –

Table 3
Evaluation methodology

 | Data | Method | Participants
RQ1 | Metrics and survey answers | Metric-based evaluation and judgement-based evaluation, quantitative | Readers of two Wikipedias
RQ2 | Interview answers | Task-based evaluation, qualitative (thematic analysis) | Editors of six Wikipedias
RQ3 | Interview answers and text reuse metrics | Task-based evaluation, qualitative (thematic analysis) and quantitative | Editors of six Wikipedias

Machine Translation (MT). For the MT baseline, we used Google Translate on English Wikipedia summaries. Those translations are compared to the actual target language's Wikipedia entry. This limits us to articles that exist in both English and the target language. In our dataset, the concepts in Esperanto and Arabic that are not covered by the English Wikipedia account for 4.3% and 30.5% respectively. This indicates the content coverage gap between different Wikipedia languages [1].

Template Retrieval (TP). Similar to template-based approaches for text generation [28, 52], we build a template-based baseline that retrieves an output summary from the training data based on the input triples. First, the baseline encodes the list of input triples that corresponds to each summary in the training/test sets into a sparse vector of TF-IDF weights [53]. Afterwards, it performs LSA [54] to reduce the dimensionality of that vector. Finally, for each item in the test set, we employ the K-nearest neighbours algorithm to retrieve the vector from the training set that is the closest to this item. The summary that corresponds to the retrieved vector is used as the output summary for this item in the test set. We provide two versions of this baseline. The first one (TP) retrieves the raw summaries from the training dataset. The second one (TPext) retrieves summaries with the special tokens for vocabulary extension. A summary can act as a template after replacing its entities with their corresponding property placeholders (see Table 1).

KN. The KN baseline is a 5-gram Kneser-Ney (KN) [55] language model. KN has been used before as a baseline for text generation from structured data [34] and provided competitive results on a single domain in English. We also introduce a second KN model (KNext), which is trained on summaries with the special tokens for property placeholders. During test time, we use beam search of size 10 to sample from the learned language model.

4.2. RQ1 - Judgement-based evaluation

We defined quality in terms of text fluency and appropriateness, where fluency refers to how understandable and grammatically correct a text is, and appropriateness captures how well the text 'feels like' Wikipedia content. We asked two sets of participants from two different language Wikipedias to assess the text snippets on a scale according to these two metrics.

Participants were asked to fill out a survey combining fluency and appropriateness questions. An example for a question can be found in Figure 5.

Fig. 5. Example of a question to the editors about quality and appropriateness for Wikipedia of the generated summaries in Arabic. They see this page after the instructions are displayed. First, the user is asked to evaluate the quality from 0 to 6 (Question 69), then they are asked whether the sentence could be part of Wikipedia (Question 70).

4.2.1. Recruitment

Our study targets any speaker of Arabic and Esperanto who reads that particular Wikipedia, independent of their contributions to Wikipedia. We wanted to reach fluent speakers of each language who use Wikipedia and are familiar with it, even if they do not edit it frequently. For Arabic, we reached out to Arabic-speaking researchers from research groups working on Wikipedia-related topics. For Esperanto, as there are fewer speakers and they are harder to reach, we promoted the study on social media such as Twitter and Reddit using the researchers' accounts.4 The survey instructions and announcements were translated to Arabic and Esperanto.5 The survey was open for 15 days.

4.2.2. Participants

We recruited a total of 54 participants (see Table 4). Coincidentally, 27 of them were from each language community.

4.2.3. Ethics

The research was approved by the Ethics Committee of the University of Southampton under ERGO Number 30452.

4.2.4. Data

For both languages, we created a corpus consisting of 60 summaries, of which 30 are generated through our approach, 15 are from news, and 15 are Wikipedia sentences used to train the neural network model. For news in Esperanto, we chose introductory sentences of articles in the Esperanto version of Le Monde Diplomatique.6 For news in Arabic, we did the same and used the RSS feed of BBC Arabic.7

4 https://www.reddit.com/r/Esperanto/comments/75rytb/help_in_a_study_using_ai_to_create_esperanto/
5 https://github.com/luciekaffee/Announcements
6 http://eo.mondediplo.com/, accessed on the 28th of September, 2017
7 http://feeds.bbci.co.uk/arabic/middleeast/rss.xml, accessed on the 28th of September, 2017

Table 4
Judgement-based evaluation: total number of participants (P), total number of sentences (S), number of participants who evaluated at least 50% of the sentences, average number of sentences evaluated per participant, and number of total annotations.

 | #P | #S | #P: S>50% | Avg #S/P | All Ann.
Arabic, Fluency | 27 | 60 | 5 | 15.03 | 406
Arabic, Appropriateness | 27 | 60 | 5 | 14.78 | 399
Esperanto, Fluency | 27 | 60 | 3 | 8.7 | 235
Esperanto, Appropriateness | 27 | 60 | 3 | 8.63 | 233

Table 5
Number of Wikipedia articles, active editors on Wikipedia (editors that performed at least one edit in the last 30 days), and number of native speakers in millions.

Language | # Articles | Active Editors | Native Speakers
Arabic | 541,166 | 5,398 | 280
Swedish | 3,763,584 | 2,669 | 8.7
Hebrew | 231,815 | 2,752 | 8
Persian | 643,635 | 4,409 | 45
Indonesian | 440,948 | 2,462 | 42.8
Ukrainian | 830,941 | 2,755 | 30

4.2.5. Metrics

Let S = {s_1, s_2, ..., s_n} be a set of sentences to be assessed and N = {N_1, N_2, ..., N_n} be the set of the total numbers of judgements for the sentences s_1, ..., s_n. Each participant was asked to assess the fluency of 60 sentences on a scale from 0 to 6 as follows:

(6) No grammatical flaws and the content can be understood with ease
(5) Comprehensible and grammatically correct summary that reads a bit artificial
(4) Comprehensible summary with minor grammatical errors
(3) Understandable, but has grammatical issues
(2) Barely understandable summary with significant grammatical errors
(1) Incomprehensible summary, but a general theme can be understood
(0) Incomprehensible summary

For each sentence, we calculated the mean fluency given by all participants, and then averaged over all summaries of each category. We defined F_{s_i} = {f_1^{s_i}, f_2^{s_i}, ..., f_{N_i}^{s_i}} s.t. f_j^{s_i} ∈ [0, 6] ∀ j ∈ [1, N_i] as the set of all the N_i fluency ratings that the sentence s_i has received. We computed the overall fluency as follows:

(1/n) Σ_{i=1}^{n} ( (1/N_i) Σ_{j=1}^{N_i} f_j^{s_i} )    (1)

To assess the appropriateness, participants were asked to assess whether the displayed sentence could be part of a Wikipedia article. We tested whether a reader can tell from just one sentence whether a text is appropriate for Wikipedia, using the news sentences as a baseline. This gave us an insight into whether the text produced by the neural network "feels" like Wikipedia text. Participants were asked not to use any external tools for this task and had to give a binary answer.

Given the same set of sentences, S, and the total number of annotations that each sentence in S has received, we defined A_{s_i} = {a_1^{s_i}, a_2^{s_i}, ..., a_{N_i}^{s_i}} s.t. a_j^{s_i} ∈ {0, 1} ∀ j ∈ [1, N_i] as the set of all the N_i appropriateness ratings that the sentence s_i has received. The average appropriateness score for the summaries of each method was calculated as follows:

(1/n) Σ_{i=1}^{n} ( (1/N_i) Σ_{j=1}^{N_i} a_j^{s_i} )    (2)
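Both scores in Equations (1) and (2) are macro-averages: each sentence's ratings are averaged first, and the per-sentence means are then averaged over all sentences of a category. A small sketch of the computation, with hypothetical input values, is shown below.

```python
def macro_average(ratings_per_sentence):
    """Average the per-sentence mean ratings, as in Equations (1) and (2).

    ratings_per_sentence -- list of rating lists; fluency ratings lie in [0, 6],
    appropriateness ratings are binary (0 or 1).
    """
    per_sentence_means = [sum(r) / len(r) for r in ratings_per_sentence if r]
    return sum(per_sentence_means) / len(per_sentence_means)

# hypothetical example: three sentences rated by different numbers of participants
fluency = macro_average([[6, 5, 5], [4, 4], [3, 5, 6, 2]])
appropriateness = macro_average([[1, 1, 0], [1, 0], [1, 1, 1, 0]])
```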

4.3. RQ2 and RQ3 - Task-based evaluation

We ran a series of semi-structured interviews with editors of six Wikipedias to get an in-depth understanding of their experience with reading and using the automatically generated text. Each interview started with general questions about the experience of the participant with Wikipedia and Wikidata, and their understanding of different aspects of these projects. The participants were then asked to open and read an ArticlePlaceholder page including text generated through the NLG algorithm, as shown in Figure 6. Finally, participants were asked to edit the content of a page, which contained the same information as the one they had read earlier, but was displayed as plain text in the Wikipedia edit field. The editing field can be seen in Figure 7.

Fig. 6. Screenshot of the reading task. The page is stored as a subpage of the author's userpage on Wikidata, therefore the layout copies the original layout of any Wikipedia. The layout of the information displayed mirrors the ArticlePlaceholder setup. The participant sees the sentence to evaluate alongside information included from the Wikidata triples (such as the image and statements) in their native language (Arabic in this example).

4.3.1. Recruitment

The goal was to recruit a set of editors from different language backgrounds to have a comprehensive understanding of different language communities. We reached out to editors on different Wikipedia editor mailing lists8 and tweeted a call for contribution using the lead author's account.9

We were in contact with 18 editors from different Wikipedia communities. We allowed all editors to participate, but had to exclude editors who edit only the English Wikipedia (as it is outside our use case) and editors who did not speak a sufficient level of English, which made conducting the interview impossible.

4.3.2. Participants

Our sample consists of n = 10 Wikipedia editors of different under-resourced languages (measured in their number of articles). We originally conducted interviews with 11 editors from seven different language communities, but had to remove one interview with an editor of the Breton Wikipedia, as we were not able to generate the text for the reading and editing tasks because of a lack of training data.

Among the participants, 4 were from the Arabic Wikipedia and participated in the judgement-based evaluation from RQ1. The remaining 6 were from under-resourced language communities: Persian, Indonesian, Hebrew, Ukrainian (one per language) and Swedish (two participants).

8 Mailing lists contacted: [email protected] (Arabic), [email protected] (Esperanto), [email protected] (Persian), Wikidata mailing list, Wikimedia Research mailing list
9 https://twitter.com/frimelle/status/1031953683263750144

Fig. 7. Screenshot of the editing task. The page is stored on a subpage of the author's userpage on Wikidata, therefore the layout is equivalent to the current MediaWiki installations on Wikipedia. The participant sees the sentence that they saw before in the reading task and the triples from Wikidata in their native language (Arabic in this example).

While Swedish is officially the third largest Wikipedia in terms of number of articles,10 most of the articles are not manually edited. In 2013, the Swedish Wikipedia passed one million articles, thanks to a bot called lsjbot, which at that point in time had created almost half of the articles on the Swedish Wikipedia.11 Such bot-generated articles are commonly limited, both in terms of information content and length. The high activity of a single bot is also reflected in the small number of active editors compared to the large number of articles (see Table 5).

The participants were experienced Wikipedia editors, with average tenures of 9.3 years in Wikipedia (between 3 and 14 years, median 10). All of them have contributed to at least one language besides their main language, and to the English Wikipedia. Further, n = 4 editors worked in at least two other languages beside their main language, while n = 2 editors were active in as many as 4 other languages. All participants knew about Wikidata, but had varying levels of experience with the project. n = 4 participants have been active on Wikidata for over 3 years (with n = 2 editors being involved since the start of the project in 2013), n = 5 editors had some experience with editing Wikidata, and one editor had never edited Wikidata but knew the project.

Table 6 displays the sentence used for the interviews in different languages. The Arabic sentence is generated by the network based on Wikidata triples, while the other sentences are synthetically created as described below.

4.3.3. Ethics

The research was approved by the Ethics Committee of the University of Southampton under ERGO Number 44971, and written consent was obtained from each participant ahead of the interviews.

4.3.4. Data

For Arabic, we reused a text snippet from the RQ1 evaluation. For the other language communities, we emulated the sentences the network would produce. First, we picked an entity to test with the editors. Then, we looked at the output produced by the network for the same concept in Arabic and Esperanto to understand possible mistakes and tokens produced by the network.

10 https://meta.wikimedia.org/wiki/List_of_Wikipedias
11 https://blog.wikimedia.org/2013/06/17/swedish-wikipedia-1-million-articles/

Table 6
Overview of sentences and number of participants in each language.

We chose the city of Marrakesh as the concept editors would work on. Marrakesh is a good starting point, as it is a topic possibly highly relevant to readers and is widely covered, but falls into the category of topics that are potentially under-represented in Wikipedia due to its geographic location [18].

We collected the introductory sentences for Marrakesh in the editors' languages from Wikipedia. Those are the sentences the network would be trained on and tries to reproduce. We ran the keyword matcher that was used for the preparation of the dataset on the original Wikipedia sentences. It marked the words the network would pick up or that would be replaced by property placeholders. Therefore, these words could not be removed.

As we were particularly interested in how editors would interact with the missing-word tokens the network can produce, we removed up to two words in each sentence: the word for the concept in its native language (e.g. Morocco in Arabic for non-Arabic sentences), as we saw that the network does the same, and the word for a concept that is not connected to the main entity of the sentence on Wikidata (e.g. Atlas Mountains, which is not linked to Marrakesh). An overview of all sentences used in the interviews can be found in Table 6.

4.3.5. Task

The interview started with an opening, explaining that the researcher will observe the reading and editing of the participant in their language Wikipedia. Until both reading and editing were finished, the participant did not know about the provenance of the text. To start the interview, we asked demographic questions about the participants' contributions to Wikipedia and Wikidata, and to test their knowledge of the existing ArticlePlaceholder. Before reading, they were introduced to the idea of displaying content from Wikidata on Wikipedia. Then, they saw the mocked-up page of the ArticlePlaceholder, as can be seen in Figure 6 in Arabic. Each participant saw the page in their respective language. As the interviews were conducted remotely, the interviewer asked them to share the screen so they could point out details with the mouse cursor. Questions were asked to let them describe their impression of the page while they were looking at it. Then, they were asked to open a new page, which can be seen in Figure 7. Again, this page would contain information in their language. They were asked to edit the page and to describe freely what they were doing at the same time. We asked them not to edit a whole page but only to write two to three sentences as the introduction to a Wikipedia article on the topic, with as much of the information given as needed. After the editing was finished, they were asked questions about their experience. Only then was the provenance of the sentences revealed.

Given this new information, we asked them about the predicted impact on their editing experience. Finally, we left them time to discuss open questions of the participants. The interviews were scheduled to last between 20 minutes and one hour.

4.3.6. Analysing the interviews

We interviewed a total of 11 editors of seven different language Wikipedias. The interviews took place in September 2018. We used thematic analysis to evaluate the results of the interviews. The interviews were coded by two researchers independently, in the form of inductive coding based on the research questions. After comparing and merging all themes, both researchers independently applied these common themes on the text again.

4.3.7. Editors' reuse metric

Editors were asked to complete a writing task. We assessed how they used the automatically generated text snippets in their work by measuring the amount of text reuse. We based the assessment on the editors' resultant summaries after the interviews were finished.

To quantify the amount of reuse in text we use the Greedy String-Tiling (GST) algorithm [56]. GST is a substring matching algorithm that computes the degree of reuse or copy from a source text in a dependent one. GST is able to deal with cases when a whole block is transposed, unlike other algorithms such as the Levenshtein distance, which counts a transposition as a sequence of single insertions or deletions rather than a single block move. Adler and de Alfaro [57] introduce the concept of edit distance in the context of vandalism detection on Wikipedia. They measure the trustworthiness of a piece of text by measuring how much it has been changed over time. However, their algorithm penalises copying of the text, as they measure every edit to the original text. In comparison, we want to measure how much of the text is reused, therefore GST is appropriate for the task at hand.

Given a generated summary S = s_1, s_2, ... and an edited one D = d_1, d_2, ..., each consisting of a sequence of tokens, GST identifies a set of disjoint longest sequences of tokens in the edited text that exist in the source text (called tiles), T = {t_1, t_2, ...}. It is expected that there will be common stop words appearing in both the source and the edited text. However, we were rather interested in knowing how much of the real structure of the generated summary is being copied. Thus, we set a minimum match length factor mml = 3 when calculating the tiles, s.t. ∀ t_i ∈ T : t_i ⊆ S ∧ t_i ⊆ D ∧ |t_i| ≥ mml, and ∀ t_i, t_j ∈ T, i ≠ j : t_i ∩ t_j = ∅. This means that copied sequences of single or double words will not count in the calculation of reuse. We calculated a reuse score gstscore by counting the lengths of the detected tiles, normalized by the length of the generated summary:

gstscore(S, D) = ( Σ_{t_i ∈ T} |t_i| ) / |S|    (3)

We classified each of the edits into three groups according to the gstscore, as proposed by [47]: 1) Wholly Derived (WD): the text snippet has been fully reused in the composition of the editor's text (gstscore > 0.66); 2) Partially Derived (PD): the text snippet has been partially used (0.66 ≥ gstscore > 0.33); 3) Non Derived (ND): the text has been changed completely (0.33 ≥ gstscore).
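To make the reuse measurement concrete, the following is a simplified greedy string-tiling sketch together with the gstscore of Equation (3) and the three reuse classes. It is an illustrative approximation written for this explanation, not the exact algorithm of [56] or the scripts used in the study.

```python
def greedy_string_tiling(source, edited, mml=3):
    """Greedily collect disjoint tiles of length >= mml shared by both token lists."""
    tiles, used_src, used_edit = [], set(), set()
    while True:
        best = None  # (length, start in source, start in edited)
        for i in range(len(source)):
            for j in range(len(edited)):
                k = 0
                while (i + k < len(source) and j + k < len(edited)
                       and source[i + k] == edited[j + k]
                       and i + k not in used_src and j + k not in used_edit):
                    k += 1
                if k >= mml and (best is None or k > best[0]):
                    best = (k, i, j)
        if best is None:
            break
        k, i, j = best
        tiles.append(source[i:i + k])
        used_src.update(range(i, i + k))
        used_edit.update(range(j, j + k))
    return tiles

def gst_score(source, edited, mml=3):
    """gstscore(S, D) = sum of tile lengths / |S|, cf. Equation (3)."""
    tiles = greedy_string_tiling(source, edited, mml)
    return sum(len(t) for t in tiles) / len(source)

def reuse_class(score):
    if score > 0.66:
        return "WD"   # wholly derived
    if score > 0.33:
        return "PD"   # partially derived
    return "ND"       # non derived
```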
5. Results

5.1. RQ1 - Metric-based corpus evaluation

As noted earlier, the evaluation used standard metrics in this space and data from the Arabic and Esperanto Wikipedias. We compared against five baselines: one in machine translation (MT), two based on templates (TP and TPext), the other main class of techniques in NLG, and two language models (KN and KNext). As displayed in Table 7, our model showed a significant enhancement compared to all baselines across the majority of the evaluation metrics in both languages. We achieved a 3.01 and 5.11 enhancement in BLEU 4 score in Arabic and Esperanto respectively over TPext, the strongest baseline.

The tests also hinted at the limitations of using machine translation for this task. We attributed this result to the different writing styles across language versions of Wikipedia. The data confirms that generating language from labels of conceptual structures such as Wikidata is a much more suitable approach.

Around 1% of the generated text snippets in Arabic and 2% of the Esperanto ones did not end in an end-of-summary token and were hence incomplete. We assumed the difference between the two was due to the size of the Esperanto dataset, which makes it harder for the trained models (i.e., with or without property placeholders) to generalise on unseen data (i.e. summaries for which the special end-of-summary token is generated). These snippets were excluded from the evaluation.

Table 7
Automatic evaluation of our model against all other baselines using BLEU 2-4, ROUGE-L and METEOR on the Arabic and Esperanto validation and test sets (Valid. / Test).

Arabic
  Model     BLEU 2          BLEU 3          BLEU 4          ROUGE-L         METEOR
  KN        2.28 / 2.40     0.95 / 1.04     0.54 / 0.61     17.08 / 17.09   29.04 / 29.02
  KNext     21.21 / 21.16   16.78 / 16.76   13.42 / 13.42   28.57 / 28.52   30.47 / 30.43
  MT        19.31 / 21.12   12.69 / 13.89   8.49 / 9.11     31.05 / 30.10   29.96 / 30.51
  TP        34.18 / 34.58   29.36 / 29.72   25.68 / 25.98   43.26 / 43.58   32.99 / 33.33
  TPext     42.44 / 41.50   37.29 / 36.41   33.27 / 32.51   51.66 / 50.57   34.39 / 34.25
  Ours      47.38 / 48.05   42.65 / 43.32   38.52 / 39.20   64.27 / 64.64   45.89 / 45.99
  w/ PrP    47.96 / 48.27   43.27 / 43.60   39.17 / 39.51   64.60 / 64.69   46.09 / 46.17

Esperanto
  Model     BLEU 2          BLEU 3          BLEU 4          ROUGE-L         METEOR
  KN        6.91 / 6.64     4.18 / 4.00     2.90 / 2.79     37.48 / 36.90   31.05 / 30.74
  KNext     16.44 / 16.30   11.99 / 11.92   8.77 / 8.79     44.93 / 44.77   33.77 / 33.71
  MT        1.62 / 1.62     0.59 / 0.56     0.26 / 0.23     0.66 / 0.68     4.67 / 4.79
  TP        33.67 / 33.46   28.16 / 28.07   24.35 / 24.30   46.75 / 45.92   20.71 / 20.46
  TPext     43.57 / 42.53   37.53 / 36.54   33.35 / 32.41   58.15 / 57.62   31.21 / 31.04
  Ours      42.83 / 42.95   38.28 / 38.45   34.66 / 34.85   66.43 / 67.02   40.62 / 41.13
  w/ PrP    43.57 / 43.19   38.93 / 38.62   35.27 / 34.95   66.73 / 66.61   40.80 / 40.74

The introduction of the property placeholders to our encoder-decoder architecture enhances our performance further, by 0.61–1.10 BLEU points (using BLEU 4). In general, our property placeholder mechanism benefits the performance of all the competitive systems.

Generalisation Across Domains. To investigate how well different models can generalise across multiple domains, we categorise each generated summary into one of 50 categories according to its main entity instance type (e.g. village, company, football player). We examine the distribution of BLEU-4 scores per category to measure how well the model generalises across domains (Figure 8).

Fig. 8. A box plot showing the distribution of BLEU 4 scores of all systems for each category of generated summaries. [Figure not reproduced; panels: Arabic and Esperanto; systems compared: Ours w/ PrP, TPext, KNext; y-axis: BLEU 4, 0-80.]

We show that i) the high performance of our system is not skewed towards some domains at the expense of others, and that ii) our model generalises well across domains, better than any other baseline. For instance, although TPext achieves the highest recorded performance in a small number of domains (plotted as the few TPext outliers in Figure 8), its performance is much lower on average across all domains. The strong generalisation of our model across domains is mainly due to the language model in the decoder layer of our model, which is more flexible than rigid templates and can adapt more easily to multiple domains. Although the Kneser-Ney template-based baseline (KNext) has exhibited competitive performance in a single-domain context [34], it fails to generalise in our multi-domain text generation scenario. Unlike our approach, KNext does not incorporate the input triples directly when generating the output summary, but only uses them to replace the special tokens after a summary has been generated. This might yield acceptable performance in a single domain, where most of the summaries share a very similar pattern, but it struggles to generate a different pattern for each input set of triples in multi-domain summary generation.
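The per-domain analysis behind Figure 8 reduces to grouping sentence-level BLEU-4 scores by the instance-type category of each summary's main entity and summarising each group. The sketch below illustrates only this grouping step; the record format and the sentence_bleu4 callable are hypothetical, not the actual analysis code.

```python
from collections import defaultdict
from statistics import quantiles  # Python 3.8+

def bleu4_by_category(records, sentence_bleu4):
    """Group sentence-level BLEU-4 scores by instance-type category.

    records: iterable of (category, reference_tokens, hypothesis_tokens) tuples (hypothetical format).
    sentence_bleu4: callable returning a sentence-level BLEU-4 score.
    """
    grouped = defaultdict(list)
    for category, reference, hypothesis in records:
        grouped[category].append(sentence_bleu4(reference, hypothesis))
    return grouped

def box_plot_stats(grouped):
    """Median and interquartile range per category, i.e. the statistics a box plot
    such as Figure 8 displays (assumes at least two scores per category)."""
    stats = {}
    for category, scores in grouped.items():
        q1, median, q3 = quantiles(scores, n=4)
        stats[category] = {"median": median, "q1": q1, "q3": q3, "n": len(scores)}
    return stats
```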

Table 8
Results for fluency (mean rating out of 6 and SD) and appropriateness (percentage of snippets judged to be part of Wikipedia).

                        Fluency           Appropriateness
                        Mean    SD        Part of Wikipedia
Arabic      Ours        4.7     1.2       77%
            Wikipedia   4.6     0.9       74%
            News        5.3     0.4       35%
Esperanto   Ours        4.5     1.5       69%
            Wikipedia   4.9     1.2       84%
            News        4.2     1.2       52%

5.2. RQ1 - Judgement-based evaluation

5.2.1. Fluency
As shown in Table 8, overall, the quality of the generated text is high (4.7 points out of 6 on average in Arabic and 4.5 in Esperanto). In Arabic, 63.3% of the summaries scored as much as 5 in fluency on average. In Esperanto, which had the smaller training corpus, the participants nevertheless gave as many as half of the snippets an average of 5, with 33% of them rated 6 by all participants. We concluded that in most cases the text we generated was very understandable and grammatically correct. In addition, the results were perceived to match the quality of writing in Wikipedia and news reporting.

5.2.2. Appropriateness
The results for appropriateness are summarised in Table 8. A majority of the snippets were considered to be part of Wikipedia (77% for Arabic and 69% for Esperanto). The participants thus seemed to identify a certain style and manner of writing with Wikipedia; by comparison, only 35% of the Arabic news snippets and 52% of the Esperanto ones could have passed as Wikipedia content in the study. Genuine Wikipedia text was recognised as such (in 77% and 84% of the cases, respectively).

Our model was able to generate text that is not only accurate from a writing point of view but that, in a high number of cases, felt like Wikipedia and could blend in with other Wikipedia content.

5.3. RQ2 - Task-based evaluation

As part of our interview study, we asked editors to read an ArticlePlaceholder page with the included NLG text and asked them to comment on a series of issues. We grouped their answers into several general themes around: their use of the snippets, their opinions on text provenance, the ideal length of the text, the importance of the text for the ArticlePlaceholder, and limitations of the algorithm.

Use. The participants appreciated the text snippets:

"I think it would be a great opportunity for general Wikipedia readers to help improve their experience, while reading Wikipedia" (P7).

Some of them noted that the text gave them a useful overview and quick introduction to the topic of the page, particularly for people trained in one language or non-English speakers:

"I think that if I saw such an article in Ukrainian, I would probably then go to English anyway, because I know English, but I think it would be a huge help for those who don't" (P10).

Provenance. As part of the reading task, we asked the editors what they believed was the provenance of the information displayed on the page. This gave us more context to the promising fluency and appropriateness scores achieved in the quantitative study. The editors made general comments about the page and tended to assume that the text was from other Wikipedia language versions:

[The text was] "taken from Wikipedia, from Wikipedia projects in different languages." (P1)

Editors more familiar with Wikidata suggested the information might be derived from Wikidata's descriptions:

"it should be possible to be imported from Wikidata" (P9).

Only one editor could spot a difference between the text and regular Wikidata triples:

"I think it's taken from the outside sources, the text, the first text here, anything else I don't think it has been taken from anywhere else, as far as I can tell" (P2).

Overall, the answers supported our assumption that NLG, trained on Wikidata labelled triples, could be naturally added to Wikipedia pages without changing the reading experience. At the same time, the task revealed questions around algorithmic complexity and capturing provenance. Both are relevant to ensure transparency and accountability and help flag quality issues.

Length. We were interested in understanding how we could iterate over the NLG capabilities of our system to produce text of appropriate length. While the model generated just one sentence, the editors thought it to be a helpful starting point:

"Actually I would feel pretty good learning the basics. What I saw is the basic information of the city so it will be fine, almost like a stub" (P4).

While generating larger pieces of text could arguably be more useful, reducing the need for manual editing even further, the fact that the placeholder page contained just one sentence made it clear to the editors that the page still requires work. In this context, another editor referred to a 'magic threshold' for an automatically generated text to be useful (see also Section 5.4). Their expectations for an NLG output were clear:

"So the definition has to be concise, a little bit not very long, very complex, to understand the topic, is it the right topic you're looking for or". (P1)

We noted that whatever the length of the snippet, it needs to match reading practice. Editors tend to skim articles rather than reading them in detail:

"[...] most of the time I don't read the whole article, it's just some specific, for instance a news piece or some detail about I don't know, a program in languages or something like that and after that, I just try to do something with the knowledge that I learned, in order for me to acquire it and remember it" (P6)

"I'm getting more and more convinced that I just skim" (P1)

"I should also mention that very often, I don't read the whole article and very often I just search for a particular fact" (P3)

"I can't say that I read a lot or reading articles from the beginning to the end, mostly it's getting, reading through the topic, 'Oh, what this weird word means,' or something" (P10)

When engaging with content, people commonly go straight to the part of the page that contains what they need. If they are after an introduction to a topic, having a summary at the top of the page, for example in the form of an automatically generated text snippet, could make a real difference in matching their information needs.

Importance. When reading the ArticlePlaceholder page, people looked first at our text:

"The introduction of this line, that's the first thing I see" (P4).

This is their way to confirm that they landed on the right page and that the topic matches what they were looking for:

"Yeah, it does help because that's how you know if you're on the right article and not a synonym or some other article" (P1).

This makes the text critical for the engagement with the ArticlePlaceholder page, where most of the information is expressed as Wikidata triples. Natural language can add context to structured data representations:

"Well that first line was, it's really important because I would say that it [the ArticlePlaceholder page] doesn't really make a lot of sense [without it] ... it's just like a soup of words, like there should be one string of words next to each other so this all ties in the thing together. This is the most important thing I would say." (P8)

Tags. To understand how people react to a fault in the NLG algorithm, we chose to leave the <rare> tags in the text snippets the participants saw during the reading task. As mentioned earlier, such tags refer to entities in the triples that the algorithm was unsure about and could not verbalise. We did not explain the meaning of the tokens to the participants beforehand, and none of them mentioned them during the interviews. We believe this was mainly because they were not familiar with what the tokens meant, and not because they were unable to spot errors overall. For example, in the case of Arabic, the participants pointed to a Wikidata property with an incorrect label, which our NLG algorithm reused. They also picked up on a missing label in the native language for a city. However, the <rare> tokens were not noticed in any of the 10 reading tasks until explicitly mentioned by the interviewer. The name of the city Marrakesh in one of the native languages (Berber) was realised using the <rare> token (see the Arabic sentence in Table 6). One editor explained that the fact that they are not familiar with this language (Berber), and can therefore not evaluate the correctness of the statement, is the main reason that they overlooked the token:

1 "the language is specific, it says that this is a lan- Editing experience While GST scores gave us an 1 2 guage that is spoken mainly in Morocco and Alge- idea about the extent to which the automatically gen- 2 3 ria, the language, I don’t even know the symbols erated text is reused by the participants, the interviews 3 4 for the alphabets [...] I don’t know if this is correct, helped us understand how they experienced the text. 4 5 I don’t know the language itself so for me, it will Overall, the text snippets were widely praised as help- 5 6 go unnoticed. But if somebody from that area who ful, especially for newcomers: 6 7 knows anything about this language, I think they 7 "Especially for new editors, they can be a good 8 might think twice" (P8). 8 starting help: "I think it would be good at least to 9 9 make it easier for me as a reader to build on the 10 10 5.4. RQ3 - Task-based evaluation page if I’m a new volunteer or a first time edit, 11 11 it would make adding to the entry more appealing 12 12 Our third research questions focused on how people (...) I think adding a new article is a barrier. For me 13 13 work with automatically generated text. The overall I started only very few articles, I always built on 14 14 aim of adding NLG to the ArticlePlaceholder is to help other contribution. So I think adding a new page 15 15 editors from under-served Wikipedia language com- is a barrier that Wikidata can remove. I think that 16 16 munities bootstrap missing articles without disrupting would be the main difference." (P4) 17 their editing practices. 17 18 As noted earlier, we carried out a task-based evalu- All participants were experienced editors. Just like 18 19 ation, in which the participants were presented with an in the reading task, they thought having a shorter text 19 20 ArticlePlaceholder page that included the text snippet to start editing had advantages: 20 21 21 and triples from Wikidata relevant to the topic. We car- "It wasn’t too distracting because this was so short. 22 22 ried out a quantitative analysis of the editing activities If it was longer, it would be (...) There is this magi- 23 23 via the GST score, as well as a qualitative, thematic cal threshold up to which you think that it would be 24 24 analysis of the interviews, in which the participants ex- easier to write from scratch, it wasn’t here there." 25 25 plained how and they changed the text. In the follow- (P10) 26 ing we will first present the GST scores, and then dis- 26 27 cuss the themes that emerged from the interviews. The length of the text is also related to the ability 27 28 of the editors to work around and fix errors, such as 28 29 Reusing the text As shown in Table 9, the snippets the tokens discussed earlier. When editing, 29 30 are heavily used and all participants reused them at participants were able to grasp what information was 30 31 least partially. 8 of them were wholly derived (WD) missing and revise the text accordingly: 31 32 and the other 2 were partially derived (PD) from the 32 "So I have this first sentence, which is a pretty good 33 text we provided, which means an average GST score 33 sentence and then this is missing in Hebrew and 34 of 0.77. These results are in line with a previous edit- 34 well, since it’s missing and I do have this name 35 ing study of ours, described in [16], with a sample of 35 here, I guess I could quickly copy it here so now 36 54 editors from two language communities, 79% and 36 it’s not missing any longer." 
(P3) 37 93% of the snippets were either wholly (WD) or par- 37 38 tially (PD) reused. The same participant checked the Wikidata triples 38 39 We manually inspected all edits and compared them listed on the same ArticlePlaceholder page to find the 39 40 to the ’originals’ - as explained in Section 4, we had 10 missing information, which was not available there ei- 40 41 participants from 6 language communities, who edited ther, and then looked it up on a different language ver- 41 42 6 language versions of the same article. In the 8 cases sion of Wikipedia. They commented: 42 43 where editors reused more of the text, they tended to 43 "This first sentence at the top, was it was written, 44 copy it with minimal modifications, as illustrated in 44 it was great except the pieces of information were 45 sequences A and B in Table 9. Three editors did not 45 missing, I could quite easily find them, I opened 46 change the text snippet at all, but only added to it based 46 the different Wikipedia article and I pasted them, 47 on the triples shown on the ArticlePlaceholder page. 47 that was really nice" (P3). 48 One of the common things that hampers the full 48 49 reusability are tokens. This can lead editors Other participants mentioned a similar approach, 49 50 to rewrite the sentence completely, as in the PD exam- though some decided to delete the entire snippet be- 50 51 ple in Table 9. cause of the token and start from scratch. 51 20 Kaffee et al. / Using Natural Language Generation to Bootstrap Empty Wikipedia Articles

Table 9
Number of snippets in each category of reuse. A generated snippet (top) and its edited version (bottom). Solid lines represent reused tiles, while dashed lines represent overlapping sub-sequences not contributing to the gstscore. The first two examples were created in the studies; the last one (ND) is from a previous experiment. [Example snippets not reproduced.]

Category   #
WD         8
PD         2
ND         0

However, the text they added turned out to be very close to what the algorithm generated.

"I know it, I have it here, I have the name in Arabic, so I can just copy and paste it here." (P10)

"[deleted the whole sentence] mainly because of the missing tokens, otherwise it would have been fine" (P5)

One participant commented at length on the presence of the <rare> tokens:

"I didn't know what rare [is], I thought it was some kind of tag used in machine learning because I've seen other tags before but it didn't make any sense because I didn't know what use I had, how can I use it, what's the use of it and I would say it would be distracting, if it's like in other parts of the page here. So that would require you to understand first what rare does there, what is it for and that would take away the interest I guess, or the attention span so it would be better just to, I don't know if it's for instance, if the word is not, it's rare, this part right here which is, it shouldn't be there, it should be more, it would be better if it's like the input box or something". (P1)

Overall, the editing task and the follow-up interviews showed that the text snippets were a useful starting point for editing the page. Missing information, presented in the form of <rare> markup, did not hinder participants from editing and did not make them consider the snippets less useful. While they were unsure about what the tokens meant, they intuitively replaced them with the information they felt was missing, either by consulting the Wikidata triples that had not been summarised in the text, or by trying to find that information elsewhere on Wikipedia.

6. Discussion

Our first research question focuses on how well an NLG algorithm can generate summaries from the Wikipedia reader's perspective. In most cases, the text is considered to be from the Wikimedia environment: readers do not clearly differentiate between the generated summary and an original Wikipedia sentence. While this indicates the high quality of the generated textual content, it is problematic with respect to trust in Wikipedia. Trust in Wikipedia, and how humans evaluate the trustworthiness of a certain article, has been investigated using both quantitative and qualitative methods. Adler et al. [58] and Adler and de Alfaro [57] develop a quantitative framework based on Wikipedia's history. Lucassen and Schraagen [59] use a qualitative methodology to code readers' opinions on the aspects that indicate the trustworthiness of an article. However, none of these approaches take the automatic creation of text by non-human algorithms into account. Therefore, a high-quality generated summary that is not distinguishable from a human-written Wikipedia summary can be a double-edged sword. While conducting the interviews, we realised that the generated Arabic summary contained a factual mistake. We could show in previous work [6] that such factual mistakes are relatively rare; however, they are a known drawback of neural text generation. In our case, the Arabic sentence stated that Marrakesh was located in the north, while it is actually in the center of the country. One of the participants lives in Marrakesh. Curiously, they did not notice this mistake, even while translating the sentence aloud during the interview:

"Yes, so I think, so we have here country, Moroccan city in the north, I would say established and in year (. . . )" (P1)

As we did not introduce the sentence as automatically generated, we assume this is due to the trust Wikipedians have in their platform and to the quality of the sentence:

"This sentence was so well written that I didn't even bother to verify if it's actually a desert city" (P3)

In order not to misinform readers and eventually disrupt this trust, future work will have to investigate ways of dealing with such wrongly generated statements. On a technical level, that will mean exploring ways to exclude sentences with factual mistakes. On an HCI level, ways of communicating the problems of neural text generation to the reader will need to be investigated. One possible solution is visualisations, as these can influence the trust given to a particular article [60], such as WikiDashboard [61]. A similar tool could be used for the provenance of text.

Supporting the previous results from the readers, editors also have a positive perception of the summaries. It is the first thing they read when they arrive at a page, and it helps them to quickly verify that the page is about the topic they are looking for.

When creating the summaries, we assumed their relatively short length might be a point for improvement from an editors' perspective, particularly as research suggests that the length of an article indicates its quality – basically, the longer, the better [62]. However, from the interviews with editors, we found that they mostly skim articles when reading them. This seems to be the more natural way of browsing the information in an article and is supported by the short summary, which gives an overview of the topic.

All editors we worked with are part of the multilingual Wikipedia community, editing in at least two Wikipedias. Hale [63, 64] highlights that users of this community are particularly active compared to their monolingual counterparts and confident in editing across different languages. However, taking potential newcomers into account, they suggest that the ArticlePlaceholder might be helpful to lower the barrier to starting to edit. Recruiting more editors has been a long-standing objective, with initiatives such as the Tea House [65] aiming at welcoming and comforting new editors; the Wikipedia Adventure employs a similar approach, using a tool with gamification features [66]. The ArticlePlaceholder, and in particular the provided summaries in natural language, can have an impact on how people start editing. In comparison to the Wikipedia Adventure, readers are exposed to the ArticlePlaceholder pages directly and, thus, it could lower their reservations about editing by offering a more natural start.

Lastly, we asked how editors use the textual summaries in their workflow. Generally, we can show that the text is highly reused. One of the editors mentioned a 'magic threshold' that makes the summary acceptable as a starting point for editing. This seems similar to post-editing in machine translation (or rather monolingual machine translation [67], where a user only speaks the target or source language). Translators have been found to oppose machine translation and post-editing, as they perceive it as more time consuming and as restricting their freedom in the translation (e.g. sentence structure) [68]. Nonetheless, it has been shown that a different interface can not only lead to reduced time and enhanced quality, but can also convince a user of the improved quality of the machine translation [69]. This underlines the importance of the right integration of machine-generated sentences, as we aim for in the ArticlePlaceholder.

The core of this work is to understand the perception of a community, such as the Wikipedians, of the integration of a state-of-the-art machine learning technique for NLG in their platform. We can show that such an integration can work and be supported. This finding aligns with other projects already deployed on Wikipedia. For instance, bots (short for robots) that monitor Wikipedia have become a trusted tool for fighting vandalism [70], so much so that they can even revert edits made by humans if they believe them to be malicious. The cooperative work between humans and machines on Wikipedia has also been theorised for machine translation. Alegria et al. [71] argue for the integration of machine translation in Wikipedia that learns from the post-editing of the editors. Such a human-in-the-loop approach is also applicable to our NLG work, where an algorithm could learn from the humans' contributions.

There is a need to investigate this direction further, as NLG algorithms will not achieve the same quality as humans. Especially in a low-resource setting, such as the one observed in this work, human support is needed. However, automated tools can be a great way of allocating the limited human resources to the tasks where they are most needed.

Post-editing the summaries can serve a purely data-driven approach such as ours with additional data that can be used to further improve the quality of the automatically generated content. To make such an approach feasible for the ArticlePlaceholder, we need an interface that encourages the editing of the summaries. The less effort this editing requires, the more we can ensure an easy collaboration of human and machine.

7. Limitations

We interviewed ten editors with different levels of Wikipedia experience. As all of them were already Wikipedia editors, the conclusions we can draw for new editors are limited. We focus on experienced editors, as we expect them to be the first to adopt the ArticlePlaceholder in their workflow. Typically, new editors will follow the existing guidelines and standards of the experienced editors; therefore, the focus on experienced editors gives us an understanding of how editors will accept and interact with the new tool. Further, it is difficult to sample from new editors, as there is a variety of factors that determine whether a contributor develops into a long-term editor or not [72].

The distribution of languages favours Arabic, as that community was the most responsive, which can be assumed to be due to previous collaborations. While we cover different languages, it is only a small part of the different language communities that Wikipedia covers in total. Most studies of Wikipedia editors currently focus on English Wikipedia [73]. Even the few studies that observe editors of multiple language Wikipedias do not include the span of insights from different languages that we provide in this study. In our study we treat the different editors as members of a unified community of under-served Wikipedia languages. This is supported by the fact that their answers and themes were consistent across different languages. Therefore, adding more editors of the same languages would not have brought additional benefit.

We aimed our evaluation at two populations: readers and editors. While the main focus was on the editors and their interaction with the new information, we wanted to include the readers' perspective. In the readers' evaluation we focus on the quality of the text (in terms of fluency and appropriateness), as this will be the most influential factor for their experience on Wikipedia. Readers, while usually overlooked, are an important part of the Wikipedia community [74]. Together, those two groups form the Wikipedia community, as new editors are recruited from the existing pool of readers and readers contribute in essential ways to Wikipedia, as shown by Antin and Cheshire [75].

Besides the Arabic sentence, the sentences the editors worked on were synthesised, i.e. not generated but created by the authors of this study. While those sentences were not created by natural language generation, they were created and discussed with other researchers in the field. We therefore focused on the most common problem in text-generation tasks similar to ours: the <rare> tokens. Other problems of neural language generation, such as factually wrong sentences or hallucinations [76], were excluded from the study, as they are not a common problem for short summaries such as ours. Further, our study setup focused on the integration in the ArticlePlaceholder, i.e. the quality of the sentences rather than their factual correctness.

8. Conclusion

We conducted a quantitative study with members of the Arabic and Esperanto Wikipedia communities and semi-structured interviews with members of six different Wikipedia communities to understand the communities' understanding and acceptance of generated text in Wikipedia. To understand the impact of automatically generated text for a community such as the Wikipedia editors of under-resourced languages, we surveyed their perception of generated summaries in terms of fluency and appropriateness. To deepen this understanding, we conducted 10 semi-structured interviews with experienced Wikipedia editors from 6 different language communities and measured their reuse of the original summaries.

The addition of the summaries seems to be natural for readers: we could show that Wikipedia editors rank our text close to the expected quality standards of Wikipedia, and are likely to consider the generated text as part of Wikipedia. The language the neural network produces integrates well with the existing content of Wikipedia, and readers appreciate the summary on the ArticlePlaceholder, as it is the most helpful element on the page to them.

We could show that editors would assume the text was part of Wikipedia and that the summary improves the ArticlePlaceholder page fundamentally. In particular, the summary being short supported the usual workflow of skimming the page for the information needed.

The missing-word (<rare>) token, which we included to gain an understanding of how users interact with faulty generated text, hindered neither the reading experience nor the editing experience. Editors are likely to reuse a large portion of the generated summaries. Additionally, participants mentioned that the summary can be a good starting point for new editors.

9. Acknowledgment

This research was supported by the EPSRC-funded project Data Stories under grant agreement number EP/P025676/1; by DSTL under grant number DSTLX-1000094186; and by the EU H2020 Programme under project No. 727658 (IASIS). We would like to express our gratitude to the Wikipedia communities involved in this study for embracing our research, and particularly to the participants of the interviews.

References

[1] B. Hecht and D. Gergle, The Tower of Babel Meets Web 2.0: User-generated Content and Its Applications in a Multilingual Context, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, 2010, pp. 291–300.
[2] D. Vrandečić and M. Krötzsch, Wikidata: A Free Collaborative Knowledgebase, Communications of the ACM 57(10) (2014), 78–85.
[3] L.-A. Kaffee and E. Simperl, Analysis of Editors' Languages in Wikidata, in: Proceedings of the 14th International Symposium on Open Collaboration (OpenSym 2018), 2018, pp. 21:1–21:5. doi:10.1145/3233391.3233965.
[4] L.-A. Kaffee, A. Piscopo, P. Vougiouklis, E. Simperl, L. Carr and L. Pintscher, A Glimpse into Babel: An Analysis of Multilinguality in Wikidata, in: Proceedings of the 13th International Symposium on Open Collaboration (OpenSym 2017), ACM, 2017, p. 14.
[5] L.-A. Kaffee, Generating ArticlePlaceholders from Wikidata for Wikipedia: Increasing Access to Free and Open Knowledge, 2016.
[6] P. Vougiouklis, H. Elsahar, L.-A. Kaffee, C. Gravier, F. Laforest, J. Hare and E. Simperl, Neural Wikipedian: Generating Textual Summaries from Knowledge Base Triples, Journal of Web Semantics (2018). doi:10.1016/j.websem.2018.07.002.
[7] K. Cho, B. van Merrienboer, Ç. Gülçehre, F. Bougares, H. Schwenk and Y. Bengio, Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, CoRR abs/1406.1078 (2014).
[8] I. Sutskever, O. Vinyals and Q.V. Le, Sequence to Sequence Learning with Neural Networks, in: Advances in Neural Information Processing Systems 27, Curran Associates, Inc., 2014, pp. 3104–3112.
[9] H. ElSahar, C. Gravier and F. Laforest, Zero-Shot Question Generation from Knowledge Graphs for Unseen Predicates and Entity Types, in: Proceedings of NAACL-HLT 2018 (Long Papers), 2018, pp. 218–228.
[10] I.V. Serban, A. García-Durán, Ç. Gülçehre, S. Ahn, S. Chandar, A.C. Courville and Y. Bengio, Generating Factoid Questions With Recurrent Neural Networks: The 30M Factoid Question-Answer Corpus, in: Proceedings of ACL 2016 (Long Papers), 2016.
[11] X. Du, J. Shao and C. Cardie, Learning to Ask: Neural Question Generation for Reading Comprehension, in: Proceedings of ACL 2017 (Long Papers), 2017, pp. 1342–1352. doi:10.18653/v1/P17-1123.
[12] T. Luong, I. Sutskever, Q.V. Le, O. Vinyals and W. Zaremba, Addressing the Rare Word Problem in Neural Machine Translation, in: Proceedings of ACL-IJCNLP 2015 (Long Papers), 2015, pp. 11–19.
[13] L. Dong and M. Lapata, Language to Logical Form with Neural Attention, in: Proceedings of ACL 2016 (Long Papers), 2016.
[14] L. Kaffee, H. ElSahar, P. Vougiouklis, C. Gravier, F. Laforest, J.S. Hare and E. Simperl, Learning to Generate Wikipedia Summaries for Underserved Languages from Wikidata, in: Proceedings of NAACL-HLT 2018 (Short Papers), 2018, pp. 640–645.
[15] P. Isola, J. Zhu, T. Zhou and A.A. Efros, Image-to-Image Translation with Conditional Adversarial Networks, in: Proceedings of CVPR 2017, 2017, pp. 5967–5976. doi:10.1109/CVPR.2017.632.
[16] L. Kaffee, H. ElSahar, P. Vougiouklis, C. Gravier, F. Laforest, J.S. Hare and E. Simperl, Mind the (Language) Gap: Generation of Multilingual Wikipedia Summaries from Wikidata for ArticlePlaceholders, 2018, pp. 319–334. doi:10.1007/978-3-319-93417-4_21.
[17] B. Collier and J. Bear, Conflict, criticism, or confidence: an empirical examination of the gender gap in Wikipedia contributions, in: Proceedings of CSCW 2012, 2012, pp. 383–392. doi:10.1145/2145204.2145265.
[18] M. Graham, B. Hogan, R.K. Straumann and A. Medhat, Uneven Geographies of User-generated Information: Patterns of Increasing Informational Poverty, Annals of the Association of American Geographers 104(4) (2014), 746–764.
[19] M.J. Pat Wu, State of Connectivity 2015: A Report on Global Internet Access, 2016. http://newsroom.fb.com/news/2016/02/state-of-connectivity-2015-a-report-on-global-internet-access/.
[20] J. Voss, Measuring Wikipedia, 2005.
[21] B.J. Hecht, The Mining and Application of Diverse Cultural Perspectives in User-generated Content, PhD thesis, Northwestern University, 2013.
[22] P. Bao, B. Hecht, S. Carton, M. Quaderi, M. Horn and D. Gergle, Omnipedia: Bridging the Wikipedia Language Gap, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, 2012, pp. 1075–1084.
[23] C. Kramsch and H. Widdowson, Language and Culture, Oxford University Press, 1998.
[24] T. Steiner, Bots vs. Wikipedians, Anons vs. Logged-Ins (Redux): A Global Study of Edit Activity on Wikipedia and Wikidata, in: Proceedings of the International Symposium on Open Collaboration (OpenSym '14), ACM, 2014, pp. 25:1–25:7. doi:10.1145/2641580.2641613.
[25] L. Wanner, B. Bohnet, N. Bouayad-Agha, F. Lareau and D. Nicklaß, Marquis: Generation of User-Tailored Multilingual Air Quality Bulletins, Applied Artificial Intelligence 24(10) (2010), 914–952. doi:10.1080/08839514.2010.529258.
[26] D. Galanis and I. Androutsopoulos, Generating Multilingual Descriptions from Linguistically Annotated OWL Ontologies: The NaturalOWL System, in: Proceedings of the Eleventh European Workshop on Natural Language Generation, ACL, 2007, pp. 143–146.
[27] D. Duma and E. Klein, Generating Natural Language from Linked Data: Unsupervised Template Extraction, in: Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013) – Long Papers, 2013, pp. 83–94.
[28] B. Ell and A. Harth, A Language-independent Method for the Extraction of RDF Verbalization Templates, in: Proceedings of the Eighth International Natural Language Generation Conference (INLG 2014), 2014, pp. 26–34.
[29] C. Sauper and R. Barzilay, Automatically Generating Wikipedia Articles: A Structure-aware Approach, in: Proceedings of ACL-IJCNLP 2009, 2009, pp. 208–216.
[30] Y. Pochampally, K. Karlapalem and N. Yarrabelly, Semi-Supervised Automatic Generation of Wikipedia Articles for Named Entities, in: Wiki@ICWSM, 2016.
[31] W.D. Lewis and P. Yang, Building MT for a Severely Under-Resourced Language: White Hmong, Association for Machine Translation in the Americas, 2012.
[32] A. Chisholm, W. Radford and B. Hachey, Learning to Generate One-sentence Biographies from Wikidata, in: Proceedings of EACL 2017 (Long Papers), 2017, pp. 633–642.
[33] Y. Mrabet, P. Vougiouklis, H. Kilicoglu, C. Gardent, D. Demner-Fushman, J. Hare and E. Simperl, Aligning Texts and Knowledge Bases with Semantic Sentence Simplification, Association for Computational Linguistics, 2016, pp. 29–36.
[34] R. Lebret, D. Grangier and M. Auli, Neural Text Generation from Structured Data with Application to the Biography Domain, in: Proceedings of EMNLP 2016, 2016, pp. 1203–1213.
[35] E. Reiter and A. Belz, An Investigation into the Validity of Some Metrics for Automatically Evaluating Natural Language Generation Systems, Computational Linguistics 35(4) (2009), 529–558. doi:10.1162/coli.2009.35.4.35405.
[36] K. Papineni, S. Roukos, T. Ward and W. Zhu, BLEU: A Method for Automatic Evaluation of Machine Translation, in: Proceedings of ACL 2002, 2002, pp. 311–318.
[37] C.-Y. Lin, ROUGE: A Package for Automatic Evaluation of Summaries, in: Text Summarization Branches Out: Proceedings of the ACL-04 Workshop, Barcelona, Spain, 2004.
[38] A. Lavie and A. Agarwal, METEOR: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments, in: Proceedings of the Second Workshop on Statistical Machine Translation (StatMT '07), 2007, pp. 228–231.
[39] G. Angeli, P. Liang and D. Klein, A Simple Domain-independent Probabilistic Approach to Generation, in: Proceedings of EMNLP 2010, 2010, pp. 502–512.
[40] I. Konstas and M. Lapata, A Global Model for Concept-to-text Generation, Journal of Artificial Intelligence Research 48(1) (2013), 305–346.
[41] C. Mellish and R. Dale, Evaluation in the Context of Natural Language Generation, Computer Speech & Language 12(4) (1998), 349–373. doi:10.1006/csla.1998.0106.
[42] E. Reiter, R. Robertson and S. Sripada, Acquiring Correct Knowledge for Natural Language Generation, Journal of Artificial Intelligence Research 18 (2003), 491–516. doi:10.1613/jair.1176.
[43] S. Williams and E. Reiter, Generating Basic Skills Reports for Low-skilled Readers, Natural Language Engineering 14(4) (2008), 495–525. doi:10.1017/S1351324908004725.
[44] E. Reiter, Natural Language Generation, in: The Handbook of Computational Linguistics and Natural Language Processing, Wiley-Blackwell, 2010, pp. 574–598. doi:10.1002/9781444324044.ch20.
[45] X. Sun and C. Mellish, An Experiment on Free Generation from Single RDF Triples, in: Proceedings of the Eleventh European Workshop on Natural Language Generation, ACL, 2007, pp. 105–108.
[46] A.-C. Ngonga Ngomo, L. Bühmann, C. Unger, J. Lehmann and D. Gerber, Sorry, I Don't Speak SPARQL – Translating SPARQL Queries into Natural Language, in: Proceedings of the 22nd International Conference on World Wide Web, ACM, 2013, pp. 977–988.
[47] P.D. Clough, R.J. Gaizauskas, S.S.L. Piao and Y. Wilks, METER: MEasuring TExt Reuse, in: Proceedings of ACL 2002, 2002, pp. 152–159.
[48] M. Potthast, T. Gollub, M. Hagen, J. Kiesel, M. Michel, A. Oberländer, M. Tippmann, A. Barrón-Cedeño, P. Gupta, P. Rosso and B. Stein, Overview of the 4th International Competition on Plagiarism Detection, in: CLEF 2012 Evaluation Labs and Workshop, Online Working Notes, 2012.
[49] M. Mintz, S. Bills, R. Snow and D. Jurafsky, Distant Supervision for Relation Extraction without Labeled Data, in: Proceedings of ACL-IJCNLP 2009, 2009, pp. 1003–1011.
[50] S. Bird, NLTK: The Natural Language Toolkit, in: Proceedings of COLING/ACL 2006, Sydney, Australia, 2006.
[51] T. Luong, H. Pham and C.D. Manning, Effective Approaches to Attention-based Neural Machine Translation, in: Proceedings of EMNLP 2015, 2015, pp. 1412–1421.
[52] A.M. Rush, S. Chopra and J. Weston, A Neural Attention Model for Abstractive Sentence Summarization, in: Proceedings of EMNLP 2015, 2015, pp. 379–389.
[53] T. Joachims, A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization, in: Proceedings of ICML 1997, 1997, pp. 143–151.
[54] N. Halko, P. Martinsson and J.A. Tropp, Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, SIAM Review 53(2) (2011), 217–288. doi:10.1137/090771806.
[55] K. Heafield, I. Pouzyrevsky, J.H. Clark and P. Koehn, Scalable Modified Kneser-Ney Language Model Estimation, in: Proceedings of ACL 2013 (Short Papers), 2013, pp. 690–696.
[56] M.J. Wise, YAP3: Improved Detection of Similarities in Computer Program and Other Texts, ACM SIGCSE Bulletin 28(1) (1996), 130–134.
[57] B.T. Adler and L. de Alfaro, A Content-driven Reputation System for the Wikipedia, in: Proceedings of the 16th International Conference on World Wide Web (WWW 2007), 2007, pp. 261–270. doi:10.1145/1242572.1242608.
[58] B.T. Adler, K. Chatterjee, L. de Alfaro, M. Faella, I. Pye and V. Raman, Assigning Trust to Wikipedia Content, in: Proceedings of the 2008 International Symposium on Wikis, Porto, Portugal, 2008. doi:10.1145/1822258.1822293.
[59] T. Lucassen and J.M. Schraagen, Trust in Wikipedia: How Users Trust Information from an Unknown Source, in: Proceedings of the 4th ACM Workshop on Information Credibility on the Web (WICOW 2010), 2010, pp. 19–26. doi:10.1145/1772938.1772944.
[60] A. Kittur, B. Suh and E.H. Chi, Can You Ever Trust a Wiki? Impacting Perceived Trustworthiness in Wikipedia, in: Proceedings of CSCW 2008, 2008, pp. 477–480. doi:10.1145/1460563.1460639.
[61] P. Pirolli, E. Wollny and B. Suh, So You Know You're Getting the Best Possible Information: A Tool that Increases Wikipedia Credibility, in: Proceedings of CHI 2009, 2009, pp. 1505–1508. doi:10.1145/1518701.1518929.
[62] J.E. Blumenstock, Size Matters: Word Count as a Measure of Quality on Wikipedia, in: Proceedings of WWW 2008, 2008, pp. 1095–1096. doi:10.1145/1367497.1367673.
[63] S.A. Hale, Multilinguals and Wikipedia Editing, in: Proceedings of the ACM Web Science Conference (WebSci '14), 2014, pp. 99–108. doi:10.1145/2615569.2615684.
[64] S.A. Hale, Cross-language Wikipedia Editing of Okinawa, Japan, in: Proceedings of CHI 2015, 2015, pp. 183–192. doi:10.1145/2702123.2702346.
[65] J.T. Morgan, S. Bouterse, H. Walls and S. Stierch, Tea and Sympathy: Crafting Positive New User Experiences on Wikipedia, in: Proceedings of CSCW 2013, 2013, pp. 839–848. doi:10.1145/2441776.2441871.
[66] S. Narayan, J. Orlowitz, J.T. Morgan, B.M. Hill and A.D. Shaw, The Wikipedia Adventure: Field Evaluation of an Interactive Tutorial for New Users, in: Proceedings of CSCW 2017, 2017, pp. 1785–1799.
[67] C. Hu, B.B. Bederson, P. Resnik and Y. Kronrod, MonoTrans2: A New Human Computation System to Support Monolingual Translation, in: Proceedings of CHI 2011, 2011, pp. 1133–1136. doi:10.1145/1978942.1979111.
[68] E. Lagoudaki, Translation Editing Environments, in: MT Summit XII: Workshop on Beyond Translation Memories, 2009.
[69] S. Green, J. Heer and C.D. Manning, The Efficacy of Human Post-editing for Language Translation, in: Proceedings of CHI '13, 2013, pp. 439–448. doi:10.1145/2470654.2470718.
[70] R.S. Geiger and D. Ribes, The Work of Sustaining Order in Wikipedia: The Banning of a Vandal, in: Proceedings of CSCW 2010, 2010, pp. 117–126. doi:10.1145/1718918.1718941.
[71] I. Alegria, U. Cabezón, U.F. de Betoño, G. Labaka, A. Mayor, K. Sarasola and A. Zubiaga, Reciprocal Enrichment Between Basque Wikipedia and Machine Translation, in: The People's Web Meets NLP: Collaboratively Constructed Language Resources, 2013, pp. 101–118. doi:10.1007/978-3-642-35085-6_4.
[72] S. Kuznetsov, Motivations of Contributors to Wikipedia, SIGCAS Computers and Society 36(2) (2006). doi:10.1145/1215942.1215943.
[73] K.A. Panciera, A. Halfaker and L.G. Terveen, Wikipedians Are Born, Not Made: A Study of Power Editors on Wikipedia, in: Proceedings of GROUP 2009, 2009, pp. 51–60. doi:10.1145/1531674.1531682.
[74] F. Lemmerich, D. Sáez-Trumper, R. West and L. Zia, Why the World Reads Wikipedia: Beyond English Speakers, in: Proceedings of WSDM 2019, 2019, pp. 618–626. doi:10.1145/3289600.3291021.
[75] J. Antin and C. Cheshire, Readers Are Not Free-riders: Reading as a Form of Participation on Wikipedia, in: Proceedings of CSCW 2010, 2010, pp. 127–130. doi:10.1145/1718918.1718942.
[76] A. Rohrbach, L.A. Hendricks, K. Burns, T. Darrell and K. Saenko, Object Hallucination in Image Captioning, in: Proceedings of EMNLP 2018, 2018, pp. 4035–4045.