Language Technology for Cultural Heritage Data (Latech 2007), Pages 41–48, Prague, Czech Republic, June
The Workshop Programme

9:30-9:35 Introduction
9:35-10:00 Penny Labropoulou, Harris Papageorgiou, Byron Georgantopoulos, Dimitra Tsagogeorga, Iason Demiros and Vassilios Antonopoulos: "Integrating Language Technology in a web-enabled Cultural Heritage system"
10:00-10:30 Oto Vale, Arnaldo Candido Junior, Marcelo Muniz, Clarissa Bengtson, Lívia Cucatto, Gladis Almeida, Abner Batista, Maria Cristina Parreira da Silva, Maria Tereza Biderman and Sandra Aluísio: "Building a large dictionary of abbreviations for named entity recognition in Portuguese historical corpora"
11:00-11:30 Eiríkur Rögnvaldsson and Sigrún Helgadóttir: "Morphological tagging of Old Norse texts and its use in studying syntactic variation and change"
11:30-12:00 Lars Borin and Markus Forsberg: "Something Old, Something New: A Computational Morphological Description of Old Swedish"
12:00-12:30 Dag Trygve Truslew Haug and Marius Larsen Jøhndal: "Creating a Parallel Treebank of the Old Indo-European Bible Translations"
12:30-12:55 Elena Grishina: "Non-Standard Russian in Russian National Corpus (RNC)"
12:55-13:20 Dana Dannélls: "Generating tailored texts for museum exhibits"
14:30-15:00 David Bamman and Gregory Crane: "The Logic and Discovery of Textual Allusion"
15:00-15:30 Dimitrios Vogiatzis, Galanis Dimitrios, Vangelis Karkaletsis and Ion Androutsopoulos: "A Conversant Robotic Guide to Art Collections"
15:30-16:00 René Witte, Thomas Gitzinger, Thomas Kappler and Ralf Krestel: "A Semantic Wiki Approach to Cultural Heritage Data Management"
16:30-17:15 Christoph Ringlstetter: "Error Correction in OCR-ed Data" (Invited Talk)
17:15-17:30 Closing

Workshop Organisers

Caroline Sporleder (Co-Chair), Saarland University, Germany
Kiril Ribarov (Co-Chair), Charles University, Czech Republic
Antal van den Bosch, Tilburg University, The Netherlands
Milena P. Dobreva, HATII, University of Glasgow, Scotland
Matthew James Driscoll, Kobenhavns Universitet, Denmark
Claire Grover, University of Edinburgh, Scotland
Piroska Lendvai, Tilburg University, The Netherlands
Anke Luedeling, Humboldt-Universitat, Germany
Marco Passarotti, Universita Cattolica del Sacro Cuore, Italy

Workshop Programme Committee

Ion Androutsopoulos, Athens University of Economics and Business, Greece
Timothy Baldwin, University of Melbourne, Australia
David Bamman, Perseus, USA
David Birnbaum, University of Pittsburgh, USA
Antal van den Bosch, Tilburg University, The Netherlands
Andrea Bozzi, ILC-CNR, Pisa, Italy
Kate Byrne, University of Edinburgh, Scotland
Paul Clough, Sheffield University, UK
Greg Crane, Perseus, USA
Milena P. Dobreva, HATII, University of Glasgow, Scotland
Mick O'Donnell, Universidad Autonoma de Madrid, Spain
Matthew James Driscoll, Kobenhavns Universitet, Denmark
Franciska de Jong, University of Twente, The Netherlands
Claire Grover, University of Edinburgh, Scotland
Ben Hachey, University of Edinburgh, Scotland
Djoerd Hiemstra, University of Twente, The Netherlands
Dolores Iorizzo, Imperial College London, UK
Christer Johansson, University of Bergen, Norway
Piroska Lendvai, Tilburg University, The Netherlands
Anke Luedeling, Humboldt-Universitat, Germany
Roland Meyer, University of Regensburg, Germany
Maria Milosavljevic, University of Edinburgh, Scotland
Marie-Francine Moens, Katholieke Universiteit Leuven, Belgium
Marco Passarotti, Universita Cattolica del Sacro Cuore, Italy
Martin Reynaert, Tilburg University, The Netherlands
Kiril Ribarov, Charles University, Czech Republic
Maarten de Rijke, University of Amsterdam, The Netherlands
Peter Robinson, ITSEE, UK
Maria Simi, University of Pisa, Italy
Caroline Sporleder, Saarland University, Germany

Table of Contents

David Bamman and Gregory Crane: "The Logic and Discovery of Textual Allusion" 1
Lars Borin and Markus Forsberg: "Something Old, Something New: A Computational Morphological Description of Old Swedish" 9
Dana Dannélls: "Generating tailored texts for museum exhibits" 17
Elena Grishina: "Non-Standard Russian in Russian National Corpus (RNC)" 21
Dag Trygve Truslew Haug and Marius Larsen Jøhndal: "Creating a Parallel Treebank of the Old Indo-European Bible Translations" 27
Penny Labropoulou, Harris Papageorgiou, Byron Georgantopoulos, Dimitra Tsagogeorga, Iason Demiros and Vassilios Antonopoulos: "Integrating Language Technology in a web-enabled Cultural Heritage system" 35
Eiríkur Rögnvaldsson and Sigrún Helgadóttir: "Morphological tagging of Old Norse texts and its use in studying syntactic variation and change" 40
Oto Vale, Arnaldo Candido Junior, Marcelo Muniz, Clarissa Bengtson, Lívia Cucatto, Gladis Almeida, Abner Batista, Maria Cristina Parreira da Silva, Maria Tereza Biderman and Sandra Aluísio: "Building a large dictionary of abbreviations for named entity recognition in Portuguese historical corpora" 47
Dimitrios Vogiatzis, Galanis Dimitrios, Vangelis Karkaletsis and Ion Androutsopoulos: "A Conversant Robotic Guide to Art Collections" 55
René Witte, Thomas Gitzinger, Thomas Kappler and Ralf Krestel: "A Semantic Wiki Approach to Cultural Heritage Data Management" 61

Author Index

Almeida, Gladis 47
Aluísio, Sandra 47
Androutsopoulos, Ion 55
Antonopoulos, Vassilios 35
Bamman, David 1
Batista, Abner 47
Bengtson, Clarissa 47
Biderman, Maria Tereza 47
Borin, Lars 9
Candido Junior, Arnaldo 47
Crane, Gregory 1
Cucatto, Lívia 47
Dannélls, Dana 17
Demiros, Iason 35
Dimitrios, Galanis 55
Forsberg, Markus 9
Georgantopoulos, Byron 35
Gitzinger, Thomas 61
Grishina, Elena 21
Haug, Dag Trygve Truslew 27
Helgadóttir, Sigrún 40
Jøhndal, Marius Larsen 27
Kappler, Thomas 61
Karkaletsis, Vangelis 55
Krestel, Ralf 61
Labropoulou, Penny 35
Muniz, Marcelo 47
Papageorgiou, Harris 35
Rögnvaldsson, Eiríkur 40
Silva, Maria Cristina Parreira da 47
Tsagogeorga, Dimitra 35
Vale, Oto 47
Vogiatzis, Dimitrios 55
Witte, René 61

The Logic and
Discovery of Textual Allusion

David Bamman, The Perseus Project, Tufts University, Medford, MA, [email protected]
Gregory Crane, The Perseus Project, Tufts University, Medford, MA, [email protected]

Abstract

We describe here a method for discovering imitative textual allusions in a large collection of Classical Latin poetry. In translating the logic of literary allusion into computational terms, we include not only traditional IR variables such as token similarity and n-grams, but also a comparison of syntactic structure. This provides a more robust search method for Classical languages since it accommodates their relatively free word order and rich inflection, and has the potential to improve fuzzy string searching in other languages as well.

1 Introduction

Five score years ago, a great American, in whose symbolic shadow we stand today, signed the Emancipation Proclamation ...

Thus begins Martin Luther King Jr.’s “I Have a Dream” speech of 1963. While the actual text of the Gettysburg Address is not directly quoted here, it is elicited by means of an allusion: King’s audience would immediately have recognized the parallels between his first four words and the “Four score and seven years ago” that began Lincoln’s own speech. By opening with this phrase, King is aligning Lincoln’s invocation of human equality with “the greatest demonstration for freedom in the history of our nation” for which he was then speaking.

While the term “allusion” is commonly applied to any reference to a person, place, or thing already known to the reader, we are using it here in the specific context of an imitative textual allusion – a passage in one text that refers to a passage in another. When Willy Loman calls each of his sons an “Adonis” in Death of a Salesman, there is no doubt that this is an allusion to a Classical myth, but it does not point to a definable referent in the record of written humanity (as King’s allusion refers specifically to the first six words of the Gettysburg Address).

The discovery of these allusions is a crucial process for the analysis of texts. As others have pointed out,[1] allusions have two main functions: to express similarity between two passages, so that the latter can be interpreted in light of the former; and simultaneously to express their dissimilarity, in that the tradition they recall is revised.[2] Allusions of this specific variety are perhaps most widely known as a trope of modernist authors such as Eliot and Joyce, but they are common in the Classical world as well – most strongly in the Greek poetry of the Hellenistic era, in the Roman poetry of the republic and early empire and in New Testament texts (which allude to prophecies recorded in the Old Testament). Given the long history of Latin literature, we must also keep in mind a text’s Nachleben – how it has been received and appropriated by the generations that follow it.[3]

[1] For an overview of the function and interpretive significance of allusions, see Thomas (1986).
[2] Cf. Bloom (1973).
[3] Cicero, for example, was widely admired by Renaissance humanists after Petrarch and provided a model for textual imitation. Cf. Kristeller (1979).

Uncovering allusions of this sort has long been the task of textual commentators, but we present a method here to automatically discover them in texts. Our approach has many similarities with research on text reuse (Clough et al., 2002), paraphrase and duplicate detection (Dolan et al., 2004), and locating textual reference (Takeda et al., 2003;

month, a week, a natural day,
That Faustus may repent and save his soul!
O lente, lente, currite noctis equi! (Act V, Scene 2)

And again, four centuries later, Vladimir Nabokov appropriates