The Prior-project: From Archive Boxes to a Research Community

Engerer, Volkmar Paul; Roued-Cunliffe, Henriette; Albretsen, Jørgen; Hasle, Per Frederik Vilhelm

Publication date: 2017

Citation for published version (APA): Engerer, V. P., Roued-Cunliffe, H., Albretsen, J., & Hasle, P. F. V. (2017). The Prior-project: From Archive Boxes to a Research Community. Abstract from Digital Humanities in the Nordic Countries, 2nd Conference, Gothenburg, Sweden.

Download date: 11 Oct. 2021

DHN 2017
Digital humaniora i Norden / Digital Humanities in the Nordic Countries
Göteborg, March 14–16 2017

CONFERENCE ABSTRACTS

Published by: The University of Gothenburg, Department of Literature, History of Ideas and Religion, 2017
ISBN: 978-91-88348-83-8
http://hdl.handle.net/2077/52239
Editor: Daniel Brodén
Cover logo: Dick Claésson
© The University of Gothenburg and the individual authors

Programme Committee

Christian-Emil Ore, University of Oslo (Chair)
Jenny Bergenmar, University of Gothenburg, Sweden (Co-chair)
Ilze Auziņa, University of Latvia, Latvia
Stefan Gelfgren, Umeå University, Sweden
Olga Holownia, University of Iceland, Iceland
Sakari Katajamäki, Finnish Literature Society – SKS
Rimvydas Laužikas, Vilnius University, Lithuania
Cecilia Lindhé, University of Gothenburg, Sweden
Liina Lindström, University of Tartu, Estonia
Mats Malm, University of Gothenburg, Sweden
Bente Maegaard, Copenhagen University
Annika Rockenberger, University of Oslo, Norway
Nina Tahmasebi, University of Gothenburg, Sweden
Mikko Tolonen, University of Helsinki, Finland

Local Organizing Committee

Jenny Bergenmar, University of Gothenburg (Chair)
Daniel Brodén, University of Gothenburg
Trausti Dagsson, University of Gothenburg
Cecilia Lindhé, University of Gothenburg
Mats Malm, University of Gothenburg
Julia Pennlert, University of Borås

Preface

The book you hold in your virtual hand contains the abstracts of the presentations to be given at the second conference for Digital Humanities in the Nordic Countries, DHN2017. The conference is held at the University of Gothenburg on March 14–16, 2017, and is organized by the Centre for Digital Humanities at the University of Gothenburg.

Digital Humanities, Humanities Computing, Computer Applications in the Humanities or Computational Methods in the Humanities: our field has had many names throughout its history, which goes back to the end of the 1940s, when Roberto Busa started his collaboration with IBM on producing a complete concordance to the works of Thomas Aquinas. Originally, Busa did not plan to use digital computers as we know them. His idea was to use punch cards and the corresponding semi-mechanical machinery to create the concordance. In retrospect, Busa’s many boxes of punch cards may seem extremely unsophisticated. This is of course not true. Firstly, punch cards represented the state of the art; secondly, what matters is the scholarly method and how the available machinery is exploited to achieve results. The same is true for DH today. Although the study and development of digital methods are important, DH mostly uses digital methods developed in other contexts, such as machine learning and general statistics.

The extensive digitization of our everyday lives has extended DH to a meta-level. An important part of DH is the study of the digitized society, also called the study of digital cultures at some universities. This wide scholarly landscape of DH is reflected in the three main topics listed in the call for papers for DHN2017:

• Nordic Textual Resources and Practices,
• Visual and Multisensory Representations of Past and Present,
• The Digital, the Humanities, and the Philosophies of Technology.

DH activities in the Nordic countries have a long history dating back at least to the early 1980s. However, there was no Nordic association for the digital and the humanities until a Swedish initiative, headed by Professor Mats Malm of the University of Gothenburg, came early in 2015. The organization Digital Humanities in the Nordic Countries was founded in 2015 and is now one of three DH organizations associated with the European Association for Digital Humanities, EADH. Through EADH our organization is connected to ADHO, the global Alliance of Digital Humanities Organizations.

The first conference, DHN2016, held in Oslo, was a big success. A second conference usually indicates whether there is still an interest. The response to the call for papers for DHN2017 was indeed good. We received 105 proposals for workshops, panels, presentations and posters. The final programme consists of three plenary keynotes, 56 paper presentations, 4 panel sessions and 14 posters, all presented in this book of abstracts.

The authors were asked to indicate the main topic of their submission. The first of the three main topics, “Nordic Textual Resources and Practices”, is the traditional topic of a DH conference and, not unexpectedly, 46 % of the submissions were tagged with it. 28 % were tagged with “Visual and Multisensory Representations of Past and Present”. In this category we find cultural heritage papers, arts, and visualization techniques used in text studies. The final topic, “The Digital, the Humanities, and the Philosophies of Technology”, is the least typical for traditional DH, and 24 % were tagged with this category. A large part of these are about topics not so uncommon in traditional DH. However, there are also many interesting presentations on the meta-level, which are less common. Future conferences should definitely encourage submissions in this third category. In general the submissions cover a wide range of DH. Digital Humanities in the Nordic countries is indeed an active, flourishing field.

We wish to give our warmest thanks to our colleagues in the Programme Committee and the Local Organising Committee, and also to the Scientific Committee, who did a splendid job in reviewing and evaluating the submissions (see also dhn2017.eu). Finally we would like to thank our sponsors for their generous funding enabling us to organize this conference: The Royal Swedish Academy of Letters, History and Antiquities, Sven och Dagmar Saléns stiftelse and the Department of Literature, History of Ideas, and Religion, University of Gothenburg. Bursary funding has been generously provided by Digital Scholarly Editions Initial Training Network, DiXiT.

Christian-Emil Ore
Chair of the Programme Committee, Chair of Digital Humanities in the Nordic Countries

Jenny Bergenmar
Chair of the Local Organizing Committee

Table of Contents

Plenary Lectures
New Natures of the Anthropocene and the Need for Humanistic Inquiry into the Digital
  Dolly Jørgensen
Fluid, Frozen, Aggregated: On Discursive Images, Visual Discourse, and the Rematerialization of Data
  Katja Kwastek
Towards a Macroscope for the Study of Nordic Literatures
  Peter Leonard & Timothy R Tangherlini

Panels
Digitizing Industrial Heritage: Models and Methods in the Digital Humanities
  Anna Foka, Finn Arne Jørgensen & Pelle Snickars
The Nordic Hub of DARIAH-EU: A DH Ecosystem of Cross-Disciplinary Approaches
  Koraljka Golub, Marcelo Milrad, Marianne Ping Huang, Mikko Tolonen, Andreas Bergsland & Mats Malm
New Research on Digital Newspaper Collections
  Patrik Lundell, Mikko Tolonen, Jani Marjanen, Hege Roivainen, Leo Lahti, Asko Nivala, Heli Rantala, Hannu Salmi, Johan Jarlbrink, Kristoffer Laigaard Nielbo, Mads Rosendahl Thomsen & Melvin Wevers
Web Archives: What’s in Them for Digital Humanists? Panel on Web Archiving in the Nordic Countries
  Caroline Nyvang, Lassi Lager, John Erik Halse, Olga Holownia & Pär Nilsson

Long Papers
Body Parts in Norwegian Books
  Lars Bagøien Johnsen & Siv Frøydis Berg
Confusing the Modern Breakthrough: Naïve Bayes Classification of Authors and Works
  Peter M Broadwell & Timothy R Tangherlini
Topical Discourse Networks: Methodological Approaches to Turkish Foreign Policy in Sub-Saharan Africa
  Fabian Brinkmann
Vectors or Bit Maps? Brief Reflection on Aesthetics of the Digital in Comics
  Daniel Brodén
Multilingual Clusters and Gender in Nordic Twitter
  Steven Coats
The Prior-project: From Archive Boxes to a Research Community
  Volkmar Engerer, Henriette Roued-Cunliffe, Jørgen Albretsen & Per Hasle
Mapping the Development of Digital History in Finland
  Mats Fridlund & Petri Paju
Visualising Genre Relationships in Icelandic Manuscripts
  Katarzyna Anna Kapitan, Timothy Rowbotham & Tarrin Wills
Spatiality, Tactility and Proprioception in Participatory Art
  Raivo Kelomees
The Elias Lönnrot Letters Online – Challenges of Multidisciplinary Source Material
  Kirsi Keravuori, Niina Hämäläinen & Maria Niku

Tagging Named Entities in 19th Century Finnish Newspaper Material with a Variety of Tools
  Kimmo Kettunen & Teemu Ruokolainen
The Digital Experience: Technology and Representation
  Lars Kristensen & Graeme Kirkpatrick
The Corpus of American Danish: A Corpus of Multilingual Spoken Heritage Danish and Corpus-based Speaker Profiles as a Way to Tackle the Chaos
  Karoline Kühl, Jan Heegård Petersen & Gert Foget Hansen
Rhythms of Fear and Joy in Suomi24 Discussions
  Krista Lagus, Mika Pantzar & Minna Ruckenstein
Long-Range Information Dependencies and Semantic Divergence Indicate Author Kehre
  Kristoffer Laigaard Nielbo & Katrine Frøkjær Baunvig
Finnish Internet Parsebank – A Web-crawled Corpus of Finnish with Syntactic Analyses
  Veronika Laippala, Aki-Juhani Kyröläinen, Jenna Kanerva, Juhani Luotolahti, Tapio Salakoski & Filip Ginter
Writing and Rewriting: The Colored Digital Visualization of Keystroke Logging
  Christophe Leblay & Gilles Caporossi
Word Spotting as a Tool for Scribal Attribution
  Lasse Mårtensson, Anders Hast & Alicia Fornes
Text Mining the History of Information Politics Through Thousands of Swedish Governmental Official Reports
  Fredrik Norén & Roger Mähler
Teaching and Learning the Mindset of the Digital Historian and More: Scaffolding Students’ Critical Skills in the Digital Humanities
  Thomas Nygren
New Multi-language Digitised Newspapers and Journals from Finland Available as Data Exports for Nordic Researchers
  Tuula Pääkkönen & Jukka Kervinen
Exploring User Engagement in Crowdsourcing Folk Traditions
  Sanita Reinsone
Bokhylla: A Case Study of the First Complete National Literature Database in the World
  Eivind Røssaak
Life Based Design for Human Researchers
  Pertti Olavi Saariluoma & Jaana Leikas
Leseutgave av Hrafnkels saga, Menotas koding og knytting til andre ressurser
  Fabian Schwabe
Mischievous Machines: A Design Criticism of Programmable Partners
  Jörgen Skågeby
“En temmelig lang fodtur”: hGIS and Folklore Collection in 19th Century Denmark
  Ida Storm, Timothy R Tangherlini, Georgia Broughton & Holly Nicol
Representations: The Analogue Photography as a Digital Source
  Arthur Tennøe
The Trading Faces: Online Exhibition and Its Strategies of Public Engagement
  Alda Terracciano
The New Lexicon Poeticum
  Tarrin Wills

Short Papers
What’s Missing in This Picture? Political Change and Wordscapes of Latvian Poetry
  Anda Baklanē
The Space Between: The Usefulness of Semi-distant Readings and Combined Research Methods in Literary Analysis
  Karl Berglund
“These Memories Won’t Last”: Visual Representations of the Forgotten
  Jennifer J Dellner
From Theory to Practice: The Sett i gang Web Portal
  Kari Lie Dorer
Automated Improvement of Search in Low Quality OCR Using Word2Vec
  Thomas Egense
Reading Moravian Lives: Overcoming Challenges in Transcribing and Digitizing Archival Memoirs
  Katherine Faull, Trausti Dagsson & Michael McGuire
Senses and Emotion of Early-modern and Modern Handicrafts – Digital History Approach
  Johanna Ilmakunnas
Reading Through the Machines: Epistemology, Media Archeology and the Digital Humanities
  Jonas Ingvarsson
Organizational and Educational Issues in Representing History through a Series of Data Sprints on Visual Data from an API
  Lars Kjær, Ditte Laursen, Stig Svenningsen & Mette Kia Krabbe Meyer
The Afterlife of Early Modern Portraiture in Digitized Museum Collections: Discovering Conventions and Forgotten Images
  Charlotta Krispinsson
Málið.is: An Icelandic Web Portal for Dissemination of Information on Language and Usage
  Ari Páll Kristinsson & Halldóra Jónsdóttir
[Re]use of Medieval Paintings in the Network Society: A Study of Ethics
  Pakhee Kumar
Digitization of Literary Fiction. Example of Jan Potocki's The Manuscript Found in Saragossa
  Rafał Kur
Multidisciplinary Terminology Work in the Humanities: New Form of Collaborative Writing
  Tiina Mirjami Käkelä-Puumala
Towards a Reader-friendly Digital Scholarly Edition
  Sebastian Köhler
Towards a Digital Edition of the Codex Regius of the Prose Edda: Philosophy, Method, and Some Innovative Tools
  Michael John MacPherson
Contributing to Nordic Cultural Commons through Hackathons
  Sanna-Maria Marttila
Young People’s Historical Thinking in the Face of Digitized Sources
  Åsa Olovsson

Sixties Biopoetics: A Media Archaeological Reading of Digital Infrastructure
  Jesper Olsson
Mapping Letters Across Editions
  Vemund Olstad & Hilde Bøe
The Battle of the Text – Quantitative Methodologies in Literary Studies
  Julia Pennlert
Spatial Humanities and the Norwegian Folklore Archive
  Kristina Skåden
How to Study Online Popular Discourse on Otherness – Public User Interfaces to Online Discussion Forum Materials
  Jaakko Suominen & Elina Vaahensalo
Socio-Economic Relations in Ptolemaic Pathyris: A Network Analytical Approach to a Bilingual Community
  Lena Tambs
Combining Data Sources for Language Variation Studies and Data Visualization
  Kristel Uiboaed, Eleri Aedmaa & Maarja-Liisa Pilvik
Places and Journeys of the Contemporary Norwegian Novel: A Pilot Study
  Kim Tallerås, Tonje Vold & David Massey
The Use of Medical Visualisation in Cultural Heritage Exhibitions
  Karin Wagner
Visualizing the Landscape of Contemporary Norwegian Novels
  Miroslav Zumrík

Posters
Interdisciplinary Collaboration for Making Cultural Heritage Accessible for Research
  Johanna Berg, Rickard Domeij, Jens Edlund, Gunnar Eriksson, David House, Zofia Malisz, Susanne Nylund Skog & Jenny Öqvist
Mapping Language Vitality
  Coppélie Cocq
Enemies of Books
  Olof Gunnar Essvik
Working with Digital Newspapers
  Katrine Gasser & Mogens Vestergaard Kjeldsen
Towards a Material Politics of Intensity – Mimetic, Virtual and Anarchistic Assemblages of Becoming-Non-Human/Machine in Minecraft
  Marleena Huuhka
The Cultural Heritage HPC Cluster
  Per Møldrup-Dalum
Collecting Speech Data over the Internet
  Tommi Nieminen & Tommi Kurki
Staging the Medieval Religious Play in Virtual Reality
  Annika Rockenberger
Use of Digital Methods to Switch Identity-related Properties
  Jon Svensson, Roger Mähler, Mats Deutschmann, Anders Steinvall & Satish Patel
Prozhito: Private Diaries Database
  Nataliya Tyshkevich & Ivan Drapkin
Creating Children’s Books in the Context of Pokémon Go, Museums and Cultural Heritage
  Lars Vipsjö
From Online Research Ethics to Researching Online Ethics
  Sari Östman & Riikka Turtiainen

Pre-conference Workshops
Higher Education Programs in Digital Humanities: Challenges and Perspectives
  Koraljka Golub, Jenny Bergenmar, Isto Huvila, Marcelo Milrad & Mikko Tolonen
Data Management for Humanities Scholars: An Introduction to Data Management Plans and the Cultural Heritage Data Reuse Charter
  Marie Puren & Charles Riondet
Developing a Repository and Suite of Tools for Scandinavian Literature
  Mads Rosendahl Thomsen, Timothy R Tangherlini & Kristoffer Laigaard Nielbo
Transkribus: Handwritten Text Recognition Technology for Historical Documents
  Louise Seaward & Maria Kallio


PLENARY LECTURES


New Natures of the Anthropocene and the Need for Humanistic Inquiry into the Digital

Dolly Jørgensen
Associate Professor, History of Technology & Environment, Luleå University of Technology, Sweden

Nature is not without a history, and new natures are constantly produced through human technology and activities. This idea is currently framed as the Anthropocene, the geological Era of Man, which has been proposed as an official geologic era beginning c. 1950. The evidence for this new geologic era is based on new materials like radioactive isotopes and plastics and the redistribution of materials like carbon from consumed fuels. The Anthropocene’s wide and deep human influence on the planet also corresponds with the creation of digital technologies in the modern era. Presenting examples from webcams, visitor information boards, databases, and other forms of digitally augmented nature, I will argue that it is through the digital that many humans now come to know and experience new nature. Knowledge of nature has always been mediated through technology, but the digital has enabled both greater physical distance and conceptual closeness with nature. I propose that digital humanities needs to move beyond thinking of the digital as a tool to thinking of the digital as a part of the ecosystem of the Anthropocene’s new nature, shaping and reshaping both culture and the environment.

Topic: The Digital, the Humanities, and the Philosophies of Technology

Fluid, Frozen, Aggregated: On Discursive Images, Visual Discourse, and the Rematerialization of Data

Katja Kwastek
Professor of Modern and Contemporary Art, Vrije University Amsterdam

After a brief overview of the history of digital art history, this lecture will discuss the discursive potential of (digital) images, remediating or responding to each other, circulating in the networks, aggregating as big image data visualizations, and serving as arguments within (scholarly) discourse. It will look both at the impact and implementation of these phenomena within academia and at their reflection and instrumentalization in artistic practice, including recent tendencies to rematerialize data and discourse in sculptures and installations.

Topic: Visual and Multisensory Representations of Past and Present

Towards a Macroscope for the Study of Nordic Literatures

Peter Leonard
Director of Yale University Library Digital Humanities Lab, United States of America
Timothy R Tangherlini
Professor, Scandinavian Section and Dept. of Asian Languages and Cultures, UCLA, United States of America

The study of Nordic literatures is one marked by a series of complexities that pose significant challenges as we move toward developing meaningful approaches to the study of literature at the scales made possible by the vast and successful digitization projects underway across the Nordic region. These complexities arise not only from the linguistic variation across the region, where seeming proximity of the languages can lead to a false sense of security, but also from divergences in concepts of canon, periodization, and the divergent cultural trajectories that characterize the region. Rather than resign in the face of this complexity, we should embrace it as a challenge that may allow us to make intriguing discoveries about literary influence, development and disruption. Modeling this complexity requires a bold turn toward an encompassing methodology that weds the time-tested benefits of close reading, philology, and literary history to emerging approaches of distant reading such as network analysis and probabilistic modeling. Consequently, we propose developing a “macroscope” for the study of Nordic literature. Katy Börner, writing in the CACM, articulates the power of the macroscope for the study of complexity, noting that the macroscope “provide[s] a ‘vision of the whole,’ helping us ‘synthesize’ the related elements and detect patterns, trends, and outliers while granting access to myriad details. Rather than make things larger or smaller, macroscopes let us observe what is at once too great, slow, or complex for the human eye and mind to notice and comprehend” (Börner 2011, 60). In our talk, we present some initial steps toward realizing a macroscope tuned specifically to the exigencies of the Nordic literary world.

We present a series of three tools that could be integrated into a rich study environment for Nordic literature. The first tool allows for the alignment of closely similar passages, allowing a scholar to focus on close comparisons between works. Sequence alignment has always been a powerful tool of the philologist, but has usually required painstaking, manual work. This tool makes such alignment far quicker and, by relaxing the standards of precision, can be used to detect similarities within and across authors. The second tool, subcorpus topic modeling (STM), radically lowers the barriers to entry for scholars interested in using probabilistic methods for discovering latent semantic patterns in corpora. The tool allows the user to fashion a virtual “fishing net” to discover similar semantic patterns (topics) in much larger, unlabeled corpora. The third tool makes use of word-embedding models to allow the user to trace differences in language use patterns within authorships, across authorships, and across historical periods.

We recognize that these tools constitute baby steps on the way to a more unified macroscope for the study of Nordic literature, and that many more tools based on sound methodology should be devised. Many additional challenges lie ahead, from the development of accessible and machine-actionable corpora to the development of tools that can consistently and accurately deal with linguistic differences across the modern Scandinavian languages. Yet the promise of these approaches, which will allow scholars to work across traditions, to engage reading from the distant to the close and everything in between, and to situate these studies in a rich historical context of cultural change, is too appealing not to engage.

Topic: Nordic Textual Resources and Practices
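
The idea behind the first tool in the macroscope abstract, aligning similar passages quickly while relaxing the standards of precision, can be illustrated with a minimal sketch. This is not the authors' tool. It assumes that "relaxed precision" can be approximated by ignoring case and punctuation, and it uses the `SequenceMatcher` class from Python's standard `difflib` module as a stand-in alignment method. The function names `tokenize` and `shared_passages` and the sample sentences are hypothetical.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text.lower())


def shared_passages(text_a, text_b, min_words=4):
    """Return word runs of at least min_words that occur in both texts.

    Only runs that appear in the same relative order in both texts are
    reported, because SequenceMatcher produces a monotonic alignment.
    """
    a, b = tokenize(text_a), tokenize(text_b)
    # autojunk=False keeps frequent words from being discarded in long texts.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        " ".join(a[m.a:m.a + m.size])
        for m in matcher.get_matching_blocks()
        if m.size >= min_words
    ]


if __name__ == "__main__":
    first = "The old king rode over the heath, and the sun went down behind the hills."
    second = "A young king rode over the heath; later the sun went down behind the hills!"
    for passage in shared_passages(first, second):
        print(passage)
```

Raising `min_words` makes the comparison stricter, while lowering it surfaces shorter and noisier overlaps. A tool for real corpora would also need to handle spelling variation and reordered passages, which this sketch does not attempt.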


PANELS


Digitizing Industrial Heritage: Models and Methods in the Digital Humanities

Anna Foka
Finn Arne Jørgensen
Pelle Snickars
Umeå University, Sweden

The Nordic cultural heritage sector is under strong pressure to digitize its collections, as a way of ensuring long-term preservation and access. Yet, digitization itself is no panacea, nor is it always entirely clear what it means to digitize something. The relationship between the original and the copy has long been the subject of scholarly analysis, yet new digitization technologies and digital fabrication methods such as 3D scanning, 3D printing, and even computational modeling raise intricate questions about representation, authenticity, and contextuality in cultural heritage.

This panel interrogates the intersection between digitizing archives and visualizing history with the goal of developing methodology of high relevance for the cultural heritage sector. We build on Turkel’s argument that ‘the process of digitization creates a representation that shares some of the attributes of an original’ and that technologies that are not frequently used by historians, for example, could allow us to capture and recreate particular attributes of documents, artefacts and environments (2011: 287). From this perspective, the panel aims to explore the possibilities and affordances of emerging technologies and methods that combine innovative digital visualizations with more traditional modes of understanding of the past.

The panel discusses three different cultural heritage perspectives to examine the specificity of digitization and its potential to bridge research, institutional heritage and interest from the general public. Our case studies spring from the project ‘Digital Models. Techno-historical collections, digital humanities & narratives of industrialization’, funded by the Royal Swedish Academy of Letters, History and Antiquities (2016–2019). The project is a collaboration between the Swedish National Museum of Science and Technology, with a national responsibility for technical and industrial heritage, and Humlab at Umeå University. Based on selected parts of the museum’s collections, the panel aims to explore the potential of digital technologies to reframe Swedish industrialization and the communication of society, people and environments.

The material from the museum’s collections selected for digitization (and research) is all related to different phases of Swedish industrialization. Hence, the underlying project idea is that heritage institutions should not only devote themselves to digitizing their collections, but also make it possible for researchers and visitors to use digital heritage through various kinds of applications, tools and software.

The starting point of the project is three selected categories of material in the museum collections which in various ways mirror the “three phases of industrialization”: (A) parts of the extensive collection of the business leader and industry historian Carl Sahlin (1861–1943); (B) all editions of the museum yearbook, Daedalus (1931–2014); and (C) 31 wooden models from the Swedish pre-industrial inventor Christopher Polhem’s “mechanical alphabet” from the early 1700s, belonging to his so-called Laboratorium mechanicum. These models were later (in the 1750s) inserted into The Royal Model Chamber (Kungliga modellkammaren), a Swedish institution for information and dissemination of technology and architecture set up in central Stockholm at Wrangelska palatset.

Our digitization methods are correlated with different industrial-historical periods, and in effect they will result in three sets of digital tools, applications and/or game prototypes focused on various narratives of Swedish industrialization. The Royal Model Chamber is, for example, a space of which our project has constructed a VR model using HTC Vive and Unity. The theme of (digital) reconstruction thus has both a profound and ambiguous historical dimension, since Polhem sincerely believed (as a pre-industrial inventor) that physical models were always superior to drawings and abstract representations.

As stated, it is not always clear what it means to digitize something, and the panel particularly seeks to address the challenges of digitizing disparate forms of data, gamification and visualizations in immersive, virtual reality environments. Following the London Charter on computer-based visualisation of heritage, it promotes “intellectual and technical rigour in digital heritage visualisation”. Yet in what way should one, for example, digitize Polhem’s models and his Laboratorium mechanicum? The London Charter defines principles for the use of computer-based visualisation methods “in relation to intellectual integrity, reliability, documentation, sustainability and access” (London Charter 2009). Indeed, the charter recognises that the range of available computer-based visualisation methods is constantly increasing. Still, what is the exact relation between “technical rigour” and virtual heritage (in a software culture permeated by constant updates)? In order to provoke a confrontation of (stupid) scanning versus (intelligent) simulation, we have for example both 3D scanned some of Polhem’s models using an ordinary iPhone (and Agisoft Photoscan software), and CT-scanned them (at Linköping University Hospital), with multitudinous images taken from different angles to produce a cross-sectional and tomographic 3D image (a kind of virtual slice, allowing one to see inside the models without breaking them).

The panel will feature short presentations from the three participants, with Foka, Jørgensen, and Snickars presenting one category each. Demonstrations of preliminary results and visualizations will also be presented, and there will be ample time for discussion and exchange with the audience.

Anna Foka is Assistant Professor in Information Technology and the Humanities at Humlab, Umeå University. Her research is focused on the rendering of the past with embodied and interactive technology, critical digital methods for creative and cultural heritage organizations, and digitalized infrastructures for the study of arts and humanities.

Finn Arne Jørgensen is Associate Professor of History of Technology and Environment at Umeå University, a position he will combine with serving as head of the Norwegian Museum of Travel and Tourism starting February 2017. His research explores the influence of mediating technologies on the use and experience of nature. Digital scholarship and digital media are a core component of his academic work and upcoming museum practice.

Pelle Snickars is Professor of Media and Communication Studies, specializing in digital humanities at Umeå University, with an affiliation to the research centre Humlab. His research has focused on the relationship between old and new media, media economy, digitization of cultural heritage, media history as well as the importance of new technical infrastructures for the humanities.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: models, digitization, 3D, visualization, heritage

References
London Charter (2009) The London Charter for the Computer-Based Visualization of Cultural Heritage. http://www.londoncharter.org/downloads.html
William J. Turkel (2011) “Intervention: Hacking history, from analogue to digital and back again,” Rethinking History 15(2), 287-296.

The Nordic Hub of DARIAH-EU: A DH Ecosystem of Cross-Disciplinary Approaches

Koraljka Golub
Marcelo Milrad
Linnaeus University, Sweden
Marianne Ping Huang
Aarhus University, Denmark
Mikko Tolonen
University of Helsinki, Finland
Andreas Bergsland
Norwegian University of Science and Technology
Mats Malm
University of Gothenburg, Sweden

Background and Motivation

The exploration of new forms of interaction between society and Information and Communication Technologies (ICT), with a focus on the Humanities, has the potential to become a key success factor for the values and competitiveness of the Nordic region, having in mind recent EU and regional political discussions in the field of Digital Humanities (European Commission, 2016; Vetenskapsrådet’s Rådet för forskningens infrastrukturer, 2014). Digital Humanities (DH) is a diverse and still emerging field that lies at the intersection of ICT and the Humanities, and which is being continually formulated by scholars and practitioners in a range of disciplines (see, for example, Svensson & Goldberg, 2015; Gardiner & Musto, 2015; Schreibman, Siemens, & Unsworth, 2016). The following are examples of current fields and topics: text-analytic techniques, categorization, data mining; Social Network Analysis (SNA) and bibliometrics; metadata and tagging; Geographic Information Systems (GIS); multimedia and interactive games; Music Information Retrieval (MIR); interactive visualization and media.

DARIAH-EU (http://dariah.eu) is Europe’s largest initiative on DH, comprising over 300 researchers in 18 countries, thereby opening up opportunities for international collaboration and projects. Among the Nordic countries, Denmark is the full partner with four universities: Copenhagen, Aarhus, Aalborg and University of Southern Denmark (DARIAH-DK). Danish DARIAH-EU activities are facilitated by the national DH Infrastructure DIGHUMLAB, hosted at the DARIAH-DK coordinating institution, Aarhus University. Linnaeus University joined in May 2016 as a collaborative partner, the first Swedish academic institution to do so. Finland (University of Helsinki) and Norway (Norwegian University of Science and Technology) also became collaborative partners, in November 2016. The Nordic Hub of DARIAH-EU (DARIAH-Nordic) held its first meeting on 8 November in Växjö, Sweden, in connection with the International Symposium on Digital Humanities (Växjö, 7-8 November, https://lnu.se/en/research/conferences/international-digital-humanities-symposium/).

The Digital Humanities in the Nordic Countries (DHN) organisation was established in 2015 in order to create a venue for interaction and collaboration between the Nordic countries, including the Baltic countries. The ambitions behind the DHN initiative thus largely overlap with those of the recently formed Nordic Hub of DARIAH-EU. The panel would like to present different perspectives on Nordic contributions to DH as well as the aims of DARIAH-Nordic, and to discuss possible joint opportunities and challenges in Nordic DH. With their tradition of supporting Humanities research and development, the Nordic countries may serve as a bastion for (Digital) Humanities. The Nordic Hub of DARIAH-EU and DHN may pave the way towards reaching that aim.

A DH Ecosystem of Cross-disciplinary Approaches

Mats Malm (previous chair of DHN) will present the visions and ambitions behind DHN and the recently established Centre for Digital Humanities at the University of Gothenburg, which will start a Master programme in Digital Humanities in the autumn of 2017. While both the Centre for Digital Humanities and DHN aim at broad inclusiveness, he will here focus on the use of textual databases for re-examining the history and cultural heritage of the Nordic countries. This implies collaboration on common textual resources and technologies for mining, at the same time as it raises a number of questions concerning cross-disciplinarity and exchange of perspectives and methods.

Mikko Tolonen will present the ongoing developments at the University of Helsinki (and in Finland) regarding Digital Humanities. This includes the recently launched Heldig (Digital Humanities Centre, https://www.helsinki.fi/en/researchgroups/helsinki-digital-humanities) and how it can relate to collaboration in DARIAH-EU. Tolonen will particularly discuss how the Digital Humanities infrastructure designed to be implemented at the University of Helsinki relates to ongoing grassroots research projects.

Andreas Bergsland will discuss the role that the arts might play within Digital Humanities. As a starting point, he will take the work that has been done at the Norwegian University of Science and Technology (NTNU): establishing ARTEC, an interdisciplinary task force at the intersection of art and technology. He will argue that some of ARTEC’s initiatives might have both opportunities and challenges partly converging with those of the DH field, but might also expand and enrich current practices. One such initiative, Adressaparken, is a commons area in Trondheim for exploration of sensor-based digital storytelling and an open arena for testing and experimenting with new experiences and new digital media. While most DH initiatives in Europe seem to focus on computational humanities projects, Bergsland will explore the unique potential of integrating artistic and creative practices into DH/ARTEC initiatives at NTNU.

Koraljka Golub and Marcelo Milrad will […] al programme in this field; and, 2) to establish a prominent regional research centre that combines in novel ways already existing expertise from different departments and faculties working in close collaboration and co-creation with people and different organizations (both public and private sector) from the surrounding society. The main goals of this new initiative (launched in February 2016) in the first phase (12-15 months) are twofold: first, to establish the foundations for the creation of a DH educational programme; and second, to carry out research and create an innovation centre in the wider region surrounding LNU, encompassing south-eastern Sweden. A combination of cross-disciplinary, cross-sector and international aspects would provide solid ground on which to build a more or less unique international distance Master-level programme. Addressing future societal challenges would eventually become possible 1) through highly skilled professionals whose training has been markedly enhanced by practice-informed education, and 2) through joint, cross-sector innovation.

Marianne Ping Huang will present DARIAH-EU related activities in a Danish and European context, focusing on initiatives for cultural creative participation, including born-digital cultural data and a presentation of open cross-sectoral innovation with DARIAH-EU Humanities at Scale (2015-2017). DARIAH-EU will set up its new Innovation Board in 2017 and host the first DARIAH-EU Innovation Forum with the Creativity World Forum in Aarhus, November 2017, intersecting with Aarhus European Capital of Culture 2017. DARIAH-EU’s move towards digitally enhanced public humanities, closer collaboration with GLAM (Galleries, Libraries, Archives, Museums) institutions, and public-private innovation will be discussed in light of the scope of DH and the Nordic Hub of DARIAH-EU.
present and analyse the cross-sector and cross-disciplinary Digital Humanities Initia- Discussion Points: Prospects and Challenges tive at Linnaeus University (LNU) along the The great breadth of cross-disciplinary and axes of its strengths, weaknesses, opportuni- organizational initiatives presented above ties and threats. Their long-term vision is to: presents significant potential for DH in 1) create a leading and innovative education- Nordic countries. Major opportunities lie in

24 the collaborative democratic tradition that References supports re-combining already existing ex- 1) European Commission. (2016). Horizon pertise and resources encompassing 1) diffe- 2020: Social Sciences & Humanities. rent universities, 2) various disciplines, and Available at 3) the wider community through input from https://ec.europa.eu/programmes/ho related public and private sectors. These rizon2020/en/area/social-sciences- points serve to unite and consolidate already humanities existing expertise in order to create new 2) Gardiner, E. and Musto, R. G. (2015). constellations for collaboration leading to The Digital Humanities: A Primer for new knowledge and products (expertise, Students and Scholars. Cambridge: education, research, public and relevant Cambridge University Press. commercial services). Possibilities to colla- 3) Schreibman, S., Siemens, R., and Un- borate across Nordic countries can take sworth, J. (2016). A New Companion place at a number of levels, including joint to Digital Humanities. (2nd ed.). research and innovation, education efforts, Malden, MA; Chichester, West Sussex, expertise and experience exchange, bringing UK: Wiley-Blackwell. in international views to address more reg- 4) Svensson, P., and Goldberg, D. T. (Eds.). ional challenges. Ensuing important value (2015), Between Humanities and the for the general public could be a (re)- Digital. Cambridge, Ma.: MIT Press. affirmation of the value of humanities in 5) Vetenskapsrådet’s Rådet för forskningens particular, and academic practices in general. infrastrukturer. (2014). Områdesö- Challenges would be discussed in terms verikt för forskningens infrastrukturer. 
of the emerging job market, the low number Available at of students pursuing carriers in humanities http://www.vr.se/download/18.2302f at the Master level (e.g., in Sweden), and the a711489c4798d4a35fa/141146122942 fact that DH as a field is still in its infancy, 3/Samtliga+områden+infrastruktur.p leading to it being quite difficult to get fun- df ding and grants to carry out long-term rese- arch that sustain our efforts over time. Rela- ted to sustainability is the question on how to promote a dialogue and collaboration with potential industrial partners in order to run collaborative projects that go beyond just research. Not the least, epistemological, conceptual and terminological differences in approaches by the different disciplines and sectors may present further challenges and therefore may require additional resources to reach an understanding. Further, while there is a strong collaborative spirit across Nordic countries, there will certainly be administra- tive issues with cross-university collaborat- ion as the current working structures are ba- sed on individual units.

Topics: The Digital, the Humanities, and the Philosophies of Technology
Keywords: Nordic Hub of DARIAH-EU, digital humanities in the Nordic countries (DHN), cross-disciplinary initiatives, cross-sectoral initiatives

New Research on Digital Newspaper Collections

Patrik Lundell
University of Mid Sweden
Mikko Tolonen
Jani Marjanen
Hege Roivainen
University of Helsinki, Finland
Leo Lahti
KU Leuven, University of Turku, Finland
Asko Nivala
Heli Rantala
Hannu Salmi
University of Turku, Finland
Johan Jarlbrink
University of Umeå, Sweden
Kristoffer Laigaard Nielbo
Mads Rosendahl Thomesen
Aarhus University, Denmark
Melvin Wevers
Utrecht University, Netherlands

The proposed panel focuses on new studies on digitized newspaper collections in the Netherlands, Denmark, Sweden and Finland. The panel consists of six individual papers that highlight different aspects of using digitized newspapers for research. The panel focuses on discussing semantic change in newspapers as a response to political crises and to technological innovation. It will also scrutinize the direction of conceptual innovation and the role of text reuse in how ideas and concepts “traveled” between newspapers. Finally, the papers will reflect on the role of newspapers in the long-term transformations of public discourse and to what extent the availability of digital newspaper collections has affected the theory and practice of historical inquiry. Each short paper presents the methods and preliminary results in the respective cases. The discussion that follows concentrates on how to use digitized newspaper collections to study patterns in the newspaper texts as well as the outer ramifications of publishing. It will also compare the different methodological approaches and address how the different projects have dealt with problems pertaining to OCR quality.

Online Newspaper Databases and Swedish History, 2009–2017 (Patrik Lundell)
This paper discusses aspects of the impact of online databases, in particular the newspaper databases of the Swedish National Library, on Swedish historical scholarship. How have Swedish historians reacted to the availability of these databases? One aim is to map out their actual use of these digital sources, also in relation to the use or non-use of other and complementary newspaper sources, since the first substantial newspaper database, Digitized Swedish Newspapers (Digitaliserade svenska dagstidningar), was launched in 2009. Another aim is to investigate explicit methodological considerations, or the lack of them, regarding this usage in terms of, for example, awareness of OCR-related accuracy and the selection of primary sources on which the databases are built. A third aim is to reflect on potential problems and shortcomings, assuming from preliminary investigations that the impact is as profound as the meta-reflections are scarce. The empirical case will be doctoral dissertations in various historical disciplines published from 2009 until today.

Patterns of Public Discourse in Finland: Combining Meta-data from Library Catalogues and the Finnish Historical Newspaper Library (Mikko Tolonen, Jani Marjanen, Hege Roivainen, Leo Lahti)
The Finnish Newspaper Library made digitally available by the National Library of Finland contains nearly all the printed newspapers between 1771 and 1910. This paper uses the metadata information about the newspapers to statistically trace the expansion of public discourse in Finland during the long nineteenth century. Rather than using the metadata as a tool to find information and relevant papers, we use it as a tool to analyze the structural changes in public discourse. By relating information on publication places, language, number of issues, number of words, size of papers, and publishers and comparing that to the existing scholarship on newspaper history and censorship we aim at reaching an improved bird's-eye view of newspaper publishing in Finland after 1771. We then compare the results to our previous study that uses library catalogues from the Royal Library in Sweden and the National Library in Finland to discuss the role of newspapers and books respectively in the public sphere. Finally, the paper addresses issues of representativity of the material, the need to clean up the existing data and potential shortcomings in the analysis due to missing data in catalogues or errors in the metadata.

Towards the Study of Text Reuse in Finnish Newspapers and Journals, 1771–1910 (Asko Nivala, Heli Rantala & Hannu Salmi)
The paper draws on the digitized collection of Finnish newspapers and journals. It includes all newspapers published between 1771 and 1910 in Finland. This material covers as much as 1,951,076 pages, half of it in Swedish, the other half in Finnish. In addition, there are digitized journals, in sum 1,099,527 pages prior to 1910. We have been working with this material in the consortium Computational History and the Transformation of Public Discourse in Finland, 1640–1910 (funded by the Academy of Finland, 2016–2019), which is based on a co-operation between the Universities of Helsinki and Turku and the National Library of Finland. The Turku team concentrates on full text mining, especially on the question of text reuse. Prior to the Berne Convention in 1886, there was no effective copyright law to regulate the circulation of texts. The newspaper business took advantage of this situation, and news items and stories, poems and anecdotes were copied from paper to paper. Newspapers were active proponents and producers of culture: their content included a mixture of textualities, from advertisements to jokes, and they participated in formulating cultural influences, standardizing prevailing phenomena and establishing conventions for the modern era. The paper discusses the problem of text reuse detection, its particular challenges with Finnish material, and the preliminary results of the project.

From a Canon of the Extraordinary to an Archive of the Everyday (Johan Jarlbrink)
The newspapers from the nineteenth century have, until recently, formed a gigantic paper archive without an index. With so many newspapers and texts, and no possibility to search and get an overview, most scholars have focused on a limited number of canonized genres, events, writers and titles. Whereas the newspapers were characterized by repetitive and miscellaneous elements, the historical research has been dominated by the extraordinary. Digitized newspapers are far from perfect, but the digital archive makes it possible to search and research the everyday banalities in a systematic way. My aim in this paper is to discuss and describe the possibilities of digital newspaper archives and digital tools for text analysis. I will argue that the benefit of the digitized archive is less the possibility to find textual gold nuggets than the opportunity to find patterns in the great mass of less spectacular texts. To illustrate the discussion I will present a digital analysis of word co-occurrences in newspaper texts about new technologies in the mid-nineteenth century. The analysis shows that very few texts describe the sensational possibilities of new technologies, often highlighted in previous research. What dominates the reports is bureaucratic banalities and technical details. When the trivialities of the everyday are taken more seriously we will get a better understanding of the meaning of the new technologies in the historical context.

Semantic Disruptions in Public Discourse: Topical Divergence and Change-Point Detection in Two Centuries of Danish Newspapers (Kristoffer Laigaard Nielbo, Mads Rosendahl Thomesen)
Culture and media researchers theorize that negative cultural events (e.g., war, terrorism, migration) trigger semantic disruptions in the field of public discourse (i.e., create new meanings that displace established cultural values). Until recently, researchers have been using limited and biased samples when trying to model such semantic disruptions. With increased access to high performance computing and digitized media collections, it has become possible for humanistic researchers to query and model millions of documents. We have therefore initiated a study that combines unsupervised statistical learning and information theory in order to model semantic disruption in Danish public discourse. The target data set, which serves as a discourse proxy, consists of 150 years of newspaper articles in the Danish Mediastream collection. In this talk we present research design, methods and our initial modeling based on simulated data.

Tracking the Consumption Junction – Long Range Dependencies and Predictive Causality in 20th Century Dutch Newspapers (Melvin Wevers, Kristoffer L. Nielbo)
Historian Roland Marchand argues that advertisements provide an insight into the social realities of the past: they inform us about the role consumer goods played in the lives of consumers. Marchand adds that the central purpose of an ad, however, is to sell merchandise, and thus ads not only reflect but also shape society. This raises the question: to what extent did advertisements determine how consumers viewed products? Or should the process actually be conceptualized as a more complex interplay between advertisements and consumers? The negotiation of meaning between producers and consumers has been conceptualized as the consumption junction. In this study we analyze the interplay between advertisements and consumers by comparing the long-term dependencies in word use and the predictive causal relationship between product terms in articles and advertisements in digitized newspapers. In the case of advertisements shaping society, one would expect the predictive causal direction to go from advertisements to articles. In the case of a more complex relationship, external factors determine the relationship between advertisements and articles. Results indicate that different scaling laws apply to advertisements and articles. Moreover, the ability of advertisements to shape society systematically appears to be dependent on product group.

Topics: Nordic Textual Resources and Practices
Keywords: newspapers, culture analytics, public discourse, historiography

Web Archives: What’s in Them for Digital Humanists? Panel on Web Archiving in the Nordic Countries

Caroline Nyvang
The National Library of Denmark
Lassi Lager
The National Library of Finland
John Erik Halse
The National Library of Norway
Olga Holownia
The British Library/IIPC
Pär Nilsson
The National Library of Sweden

Introduction
For the past 20 years the Nordic countries have been at the forefront of web preservation. The National Libraries of Denmark, Finland, Iceland, Norway and Sweden are among the founding members of the International Internet Preservation Consortium (IIPC), the objective of which is to acquire, preserve and make accessible knowledge and information from the Internet for future generations. The members of the IIPC have been working together on tools, techniques and standards that have enabled the creation of web archives. The Nordic countries have a long history of working together on technology development, techniques and methods for accessing archived web documents. The Nordic Web Archive (NWA), started in 2000, is one of their most successful initiatives and has underlined the value of cross-border collaborations.

While web archiving has been essential for experts working in the field of digital preservation, web archives are still an untapped resource for researchers, not least in the field of Digital Humanities. Projects and initiatives such as Buddah (Big UK Domain Data for the Arts and Humanities), “Archives Unleashed” datathons (organised in Canada and the US) or NetLab in Aarhus, to name just a few, show the importance of interdisciplinary collaborations between researchers and web archiving experts. A number of researchers have also worked with the Nordic National Libraries on projects based on web archiving, but there is still a lot that can be done in that respect and our hope is that the web archiving panel at the DHN conference will lead to a better understanding of the state of the art as well as researchers’ needs.

Therefore, the objectives of the panel are:
* to introduce the Nordic web archives: netarkivet.dk, Norsk Nettarkiv, Suomalainen verkkoarkisto, Kulturarw3 and Vefsafn.is;
* to highlight the value of web archives as a source for researchers;
* to discuss common platforms of collaborations as well as challenges posed by different legal frameworks and, consequently, different types of access;
* to discuss “new kinds of collaborations” between DH researchers and curators of online collections;
* to present use cases from the Nordic countries and beyond;
* to encourage DH researchers’ feedback on the type of datasets and tools they would like to work with;
* “to compare the collections across borders”.

Introduction to the National Libraries Web Archives
While all the Nordic National Libraries have collaborated on developing tools and platforms that can be used by their respective web archives, the main difference between them is related to legal frameworks and, consequently, availability and access.

Denmark: netarkivet.dk
Since 2005, the Royal Library and the State and University Library in Aarhus (in 2017, the two institutions will be merged into the Royal Library of Denmark) have been responsible for archiving the Danish Internet in accordance with the Legal Deposit law passed by parliament in December 2004.

The Danish Netarkivet (netarchive) is constructed by following a four-string approach: 1) Four times a year, all .dk domains are harvested; 2) Daily harvests of approx. 100 select sites ensure that very dynamic websites (e.g. news sites) are properly archived; 3) We initiate special harvests in relation to both predictable and unforeseen events (e.g. the 2015 terror attack in Copenhagen and the Eurovision Song Contest); 4) Special harvests are planned at the specific request of researchers as well as the general public. Netarkivet is the second largest archive in Denmark and the fourth largest in the world, based on the amount of archived data. However, we also face a number of unique challenges in relation to making the collections readily available for researchers due to the strict Danish data protection laws. Furthermore, we are challenged by the fact that Netarkivet is not curated, which makes finding relevant data extremely difficult.

Based on recent Danish use cases – a study of online memorial sites and contemporary literature blogs – the presentation will explore how the Humanities can utilize archived web to formulate new research questions or, perhaps, revisit old ones.

Finland: Suomalainen verkkoarkisto
As a part of its legal deposit duties, the National Library of Finland annually collects a representative sample of webpages that have *.fi or *.ax domain names and are located in Finland, or contain subject matter that is targeted to the Finnish public. Many news sites (both open and paywall contents) are harvested daily. Domain crawls are supplemented by theme and event based special harvests. The collections of the past years include for example national elections (2008 – 2015), the eruption of the Eyjafjallajökull volcano in Iceland (2010) and the “European Refugee Crisis” (2015). Social media contents (Twitter, YouTube and Facebook) are harvested selectively, mostly as part of the thematic crawls.

The Finnish web archive was launched in 2006. By the end of the year 2016, the size of the web archive was about 120 TB. Archived Finnish web is preserved in the National Digital Library’s Digital preservation service. Access to the index of harvested URLs is open to anyone, but access to the archive itself is available only at special legal deposit workstations. The new user interface will be opened in spring 2017, with metadata of, and easy access to, all of the theme and event based crawls and social media contents.

The National Library has collaborated with DH researchers and research communities in language detection of web pages (other than *.fi and *.ax domains), in selecting seeds for some thematic crawls and in mechanisms to collect social media contents. At the moment, mining of the web archive itself hasn't been possible due to copyright law and data protection regulation, but the library wants to collaborate with research communities and other web archives to find the best ways to make appropriate data sets of limited contents available for research use. Also, the library wants to get feedback on what is essential for DH researchers for their current needs – and what would be essential for future researchers.

Iceland: Vefsafn.is
The Icelandic Web Archive (Vefsafn.is) contains all websites hosted on the Icelandic domain .is and many web sites hosted elsewhere that are in Icelandic or refer directly to matters of interest to Iceland. Access to the complete Web Archive is open to the world except for web sites where the user must pay for access and web sites that for some reason are closed at the owner's request.

The .is domain has been harvested by the National and University Library of Iceland since October 2004 and the policy is to harvest the complete .is domain three times a year. In addition, selected web sites are harvested at least weekly, and for national events like elections relevant web sites are harvested. Additionally, material from the Internet Archive, covering .is from 1996-2004, is available in the archive.

Researchers from the University of Iceland have used Vefsafn for projects analysing linguistic corpora. Given that this is an open access archive, certainly more work can be done with the available material.

Norway: Norsk Nettarkiv
The National Library of Norway (NLN) started harvesting the Norwegian top level domain (.no) on a yearly basis in 2001. In 2008, however, the National Library had to stop full domain harvests since the Norwegian Data Inspectorate questioned the legal basis for this practice. From 2008 until now the National Library has been harvesting between 500 and 2500 subdomains under .no, after informing the website owners in writing first.

A revised version of the Norwegian Act on Legal Deposit came into force on 1 January 2016 and it enables the NLN to do full domain harvests of the Norwegian top level domain (.no), as well as to collect websites outside the .no domain that are either owned by Norwegian institutions or individuals, or adapted to Norwegian users. Importantly, the revised law also makes it possible for the National Library to make the web archive available for research and documentation purposes. Parts of the archives will be open to the public.

NLN currently works to establish a more up-to-date solution for the web harvesting activity, one that is flexible and scales to handle the full Norwegian top domain.

In terms of harvesting content, different approaches have been followed since 2001: 1) selective harvesting of web sites 2001-2004 and from 2009; 2) domain crawls once or twice a year since 2002; 3) event harvesting since 2001, for events of national interest, such as general and local elections, royal weddings etc. At present the archive is growing by about 175 TB a year. It is the aim of the Norsk Nettarkiv to make part of the archive available in open access, which will certainly create new research possibilities.

Although access to the archive has limitations, we see opportunities to collaborate with researchers in making derivative datasets available. We have one such project going now where we analyze the proportion of use of the two Norwegian languages (bokmål and nynorsk) to assist the Norwegian Language Council in their work.

Sweden: Kulturarw3
The National Library of Sweden (Kungliga biblioteket) started to harvest the web in 1997 under the project Kulturarw³. Today there are more than 1.7 billion items corresponding to approximately 72 terabytes of data. One part of the archive consists of bulk harvesting of the Swedish web. The collection includes both web servers located under the Swedish top level domain "se" and servers located elsewhere. This second part is identified as Swedish using geolocation. Harvesting is done roughly twice a year. A second collection comprises about 140 newspapers with a daily issue. These are harvested every day. The archive is open to everybody but only within the library.

Kulturarw3 - The Web Archive of the National Library of Sweden.

Topics: Nordic Textual Resources and Practices
Keywords: web archiving, digital preservation, digital born content, collaboration, tools development, curation, use cases


LONG PAPERS

Body Parts in Norwegian Books

Lars Bagøien Johnsen
Siv Frøydis Berg
National Library of Norway

Introduction
In this presentation we will discuss questions like how the human body is represented in literary works, and if there is a difference between fiction and nonfiction. We will also be looking at differences between authors, gender and time. For example, do literary works agree on the most frequent body parts?

One of the themes of this conference is visualization, and we will demonstrate how augmented tabular data can be used as a tool in discovering patterns, and formulating hypotheses about them, which should be of interest to scholars in the humanities. Another theme is texts, and features of texts, which we use throughout this investigation.

Although the body in itself, in biological terms, hasn’t changed much in the recent history of humans, its presentation and the focus on particular details differ to a certain extent across genre, time periods and authors. A discussion of how the body is used in modern culture is found in e.g. Christopher E Forth; Ivan Crozier (2005).

We report on a pilot study that considers body references in fiction and compares them with nonfiction in the form of encyclopedias. One specific hypothesis to be tested is whether there is a difference between fiction and nonfiction in their frequency of references to body parts using possessive pronouns like his and her.

Method and data
All the material is made available through the Norwegian National Library as a result of its digitization effort. Some parts of the investigation use feature sets of books that already are, or will be, made available to the research community, so that the questions and results reported here can be replicated and repeated on different selections.

Of particular interest are trends in Norwegian books during the period 1810 up to 2000, studied using the library’s n-gram viewer with a focus on the 20th century, while smaller selections of authors from the middle and later part of the century are compared.

The set of body words we consider is a list of approximately 35 different body parts ranging from head to toe, in singular and plural forms. In the present pilot study references to genitals, intestines or bodily fluids are not taken into account; words related to those parts of the body are reserved for future studies. These words are of particular interest when considering e.g. medical literature in comparison to fiction and religious writings (e.g. Forth & Crozier op.cit.).

Nouns expressing body parts are counted as they are, in addition to counts in the context of a possessive pronoun. Possessive constructions in Norwegian differ from Swedish and Danish in that in addition to “hans arm” (his arm) the possessor can be positioned behind a definite version of the noun, “armen hans” (literally the arm his). Modern Norwegian, however, seems to prefer the latter construction for body part possession.

Each count is connected to metadata. In the case of the n-gram viewer, it is Norwegian books across all genres, and in the case of authors the counts are linked to the author. The authors are studied in two groups, a set from the middle of the century and a set from the later part. For encyclopedias the counts are connected to each encyclopedia in turn. This gives a collection of tables which we show in the next section.


Figure 1. For an interactive graph, see: https://www.nb.no/sp_tjenester/beta/ngram_1/

In addition to metadata, body words are linked to other words in collocations, which are constructed from concordances with a size of 6 preceding and 6 following words. These concordances form a basis for creating so-called collocational word clusters. For that purpose we use the standard PMI measure (Pointwise Mutual Information), which here is computed by comparing the frequency of a word in the concordances with its total frequency within the whole book collection. The PMI value is also weighted with the logarithm of the absolute frequency within a concordance set, which will penalize rare words, especially those that come from OCR errors, while still not giving too much weight to high frequency words. For the methods described here, see e.g. Lewandowska-Tomaszczyk (2007), Romesburg (2004). These clusters give us a clue as to what activity and quality a certain body part is associated with, and are constructed from books across the whole digitized collection.

Results
Here we show some of the results, and start off with a view from the n-gram viewer, which displays trends of n-grams up to three. For parts of the top of the body, consider the following graph (Figure 1), which shows a rising trend starting around the middle of the 20th century for some of the body parts. Each part is shown as a summation of its different morphological and spelling variants. Note that “munn” (mouth) shows a relatively stable frequency (the n-gram viewer takes care of capitalized nouns, which are typical for 19th century Norwegian), while “øyne” (eyes), “ansikt” (face) and “hår” (hair) have a clear increase.
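The collocation scoring described in the method (a window of 6 words on either side, PMI against the whole collection, weighted by the logarithm of the concordance frequency) can be sketched as follows. This is only one plausible reading of that description: the exact formula, the logarithm bases and the function names `concordance_windows` and `weighted_pmi` are assumptions made for illustration, and the toy counts are invented.

```python
import math
from collections import Counter


def concordance_windows(tokens, target, size=6):
    """Yield the words up to `size` positions before and after each hit of `target`."""
    for i, tok in enumerate(tokens):
        if tok == target:
            yield tokens[max(0, i - size):i] + tokens[i + 1:i + 1 + size]


def weighted_pmi(conc_counts, corpus_counts):
    """Score each collocate by PMI times the log of its concordance frequency.

    Assumed form: log2(p(word | concordance) / p(word | collection)) * ln(f_conc).
    A word seen only once in the concordance gets weight ln(1) = 0, which is
    one way to penalize rare, possibly OCR-damaged, words.
    """
    n_conc = sum(conc_counts.values())
    n_corpus = sum(corpus_counts.values())
    scores = {}
    for word, f_conc in conc_counts.items():
        p_conc = f_conc / n_conc
        p_corpus = corpus_counts[word] / n_corpus
        scores[word] = math.log2(p_conc / p_corpus) * math.log(f_conc)
    return scores


# Invented toy counts: "blå" is specific to the concordance, "og" is common
# everywhere, and "xqz" stands in for a one-off OCR error.
conc = Counter({"blå": 4, "og": 4, "xqz": 1})
corpus = Counter({"blå": 8, "og": 400, "xqz": 1, "annet": 591})
for word, score in sorted(weighted_pmi(conc, corpus).items(), key=lambda kv: -kv[1]):
    print(word, round(score, 2))
```

In this reading the frequency weight pulls the one-off token to the bottom of the ranking, while the PMI factor keeps the very common word "og" well below the more specific collocate.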

Figure 2.


Figure 3 and 4.

While trends tell us a little, there may be variation behind them. For that purpose, we place them within the context of a selection of writers, starting with the modern set. Using the tables of counts for each writer, sorted on the total, the results look like this, where the columns are labeled by surnames (Figure 2, see previous page).

The table is augmented with a gradient color along the columns. One result that can be read off this table is that writers do share the same top references to body parts: “hodet” (head) as container, “eyes” and “ansikt” for expressions, and “hand” for action. Note that “hodet” (head) is either first or second with all.

Now, since body parts were also counted in the context of possessive pronouns, we can use the gradient table to show differences between writers and differences within the possessive construction. The following table displays how “ansiktet” (face) differs between the masculine and feminine possessor. The table also suggests that there is a connection to the gender of the writer, but whether that connection is meaningful requires further study (Figure 3).

However, one result that does look solid is the connection between “hår” and the gender of the possessor (Figure 4). All have “håret hennes” ranked well above “håret hans”, except for Roy Jacobsen, for whom the counts are close to each other.

Figure 5.

The group of writers from the middle of the 20th century shows a similar top six in body parts. This group also contains Agnar Mykle, who was prosecuted for the novel “Sangen om den røde rubin” (http://www.arkivverket.no/arkivverket/Arkivverket/Riksarkivet/Nettutstillinger/Skatter-fra-arkivet/Sangenom-den-roede-rubin). What can the table of counts tell us about the difference between Mykle and other writers of the same era? In order to answer this, we look at words from the torso, considering a relatively small set as seen in the table (Figure 5, see previous page).

As is readily seen, Mykle does differ from the others in key respects. His top words are “hofte” (hip) and “brystene” (breasts), which he shares more or less with most of the others. However, as we see from the table, Mykle also moves down the torso to “midjen” (the waist) and “rumpen” (the butt), and these two set him apart from the rest.

For the encyclopedias, our hypothesis that these contain only references to nouns without pronominal possessor appears to hold up.

Topics: Nordic Textual Resources and Practices, Visual and Multisensory Representations of Past and Present
Keywords: body, literature, genre, gender

References
Barbara Lewandowska-Tomaszczyk (2007) Corpus linguistics, computer tools, and applications: state of the art. P. Lang, Frankfurt am Main, New York.
H Charles Romesburg (2004) Cluster analysis for researchers. Lulu Pr.
Christopher E Forth; Ivan Crozier eds. (2005) Body parts: critical explorations in corporeality. Lanham: Lexington Books.

Confusing the Modern Breakthrough: Naïve Bayes Classification of Authors and Works

Peter M Broadwell
UCLA Library, United States of America
Timothy R Tangherlini
UCLA, United States of America

The Modern Breakthrough is widely considered to be one of the most important turning points in late nineteenth century Nordic literature, ushering in a period of literary experimentation predicated on a pivot toward naturalism. Georg Brandes’s iconic work, Det moderne gennembruds mænd (1883), provides a literary historical framework for the consideration of the movement, outlining in broad strokes the contours of this shift in literature and, in the portraits of a series of featured male authors, offering a touchstone for broader understanding of this movement. In 1983, Pil Dahlerup offered a corrective to Brandes’s work with Det moderne gennembruds kvinder. Here, Dahlerup surfaces the numerous female authors who were writing groundbreaking work in the shadows of the male-dominated literary world. A great deal of scholarship on the Modern Breakthrough considers the rich network of literary cross-influence that characterized the period. Influence, however, is a complex phenomenon and one that is hard to formalize. In the following work, we propose to explore the related phenomenon of similarity, predicated on the notion that the most sincere form of flattery is imitation. To what extent do writers from this period share aspects of language? Can we capture this sharing in a useful manner?

In earlier work, Leonard and Tangherlini (2013) showed how probabilistic topic modeling could be deployed to help discover similarities across the works of male and female authors of the period. Working at the level of the passage, they used a topic model of male Modern Breakthrough authors to identify passages from a large, poorly labeled corpus that exhibited topical similarity. In this case, the corpus consisted of all of the works in

Google Books written in Danish until 1923. With this approach, they were able to confirm Dahlerup’s identification of numerous female Modern Breakthrough authors.

In this work, we focus specifically on the authors identified by Brandes and Dahlerup as Modern Breakthrough authors, with their works constituting a well-defined corpus. Extending work by Broadwell, Mimno and Tangherlini on the classification of folk legends (2016), we develop a “hold one out” Naïve Bayes classifier trained on the machine-actionable works of the authors in our corpus, and also run standard text-similarity calculations on the corpus, including LDA topic inference and cosine similarity based on TF-IDF scores for unigrams, bigrams, and trigrams.

The multilingual nature of the corpus raises numerous problems that we are unable to address in a sophisticated manner with this work. To get around the problem of curating a single-language representation of the Modern Breakthrough that would likely miss a great deal of interesting overlap, we have chosen to consider Danish works as well as Danish translations of Swedish and Norwegian works. Similarly, to avoid problems of classifier failure based on orthographic differences, we have normalized the Danish in these works to comply with the orthographic conventions of 1948 (Hartvig Frisch).

To make the results useful for literary scholars, we present our analysis at two levels of aggregation. On the first level, we aggregate all of the works of a particular author into a single grouping. On the second level, each work (e.g., a novel) is a single grouping. Consequently, users can explore the varying levels of overlap between all authors in the corpus and among all works in the corpus. This significantly complicates the previous binary of male-female authors, and allows for various alternative groupings of authors.

During our analysis, each machine-actionable work is chunked into 500-word passages after applying basic orthographic normalization. We then run the passages in groupings as described above through the Naïve Bayes classifier and text similarity calculations. Instances of classification “confusion” – where the classifier “fails” in assigning all passages to their original grouping – suggest significant overlaps in style and content within or between authors’ oeuvres, which we compare to the output from the text similarity calculations. Such comparisons enact a fundamental principle of the “macroscope” as introduced by Katy Börner (2011) and extended to the humanities by Tangherlini (2013), namely the greater degree of insight made available when one can switch rapidly between multiple analytical perspectives on complex cultural phenomena. A related “macroscopic” feature of our analysis is the ability of the Naïve Bayes classifier interface to “drill down” to investigate a specific passage and even view the words that were most influential in assigning it to a category other than its original category.

We visualize the alternate classifications of the authors’ works via an interactive confusion matrix, in which the sizes of the dots drawn on the cells of the matrix indicate the number of passages with the actual “label” (author or author+work) in the same row on the vertical axis that were assigned by the classifier to the proposed label at the same column on the horizontal axis (Figure 1, at the end of the text). For instance, a passage from Herman Bang’s Ved Vejen may be properly classified by the NB classifier as a Herman Bang passage, or it may be assigned to another author in the corpus. The strong diagonal of blue circles that emerges in these visualizations represents those passages that the NB classifier has placed into the expected category. The presence of red dots off the main diagonal indicates where passages have “confused” the classifier. By selecting a red dot in the matrix, the user is taken to a list of passages labeled in one manner and classified in another manner. A list of words associated with the classifications at the top of the page allows one to understand the words that the classifier finds significant, from the original label at the beginning of the list to the new label at the end of the list (Figure 2, at the end of the text).

The results of the text cosine and LDA topic similarity comparisons are visualized via a similarity matrix, analogous in format

to the confusion matrix, with the degree of shading in each cell x,y indicating the similarity of the full texts associated with column x and row y (Figure 3, at the end of the text). Such matrices can also be converted to distance plots where points (representing texts) are placed closer together when they are more similar, although the two-dimensional nature of the plots can obscure important relationships (Figure 4, at the end of the text).

Initial experiments with a subset of the Modern Breakthrough authors indicate a low degree of inter-author similarity and confusion, with the F-score of the Naïve Bayes classifier averaging over 95 % for each author “label” when only the works’ authors are considered. Further dividing each author’s output into his or her individual works yields more intriguing results, showing a considerable degree of classifier confusion and overall text similarity among certain authors’ works (primarily Bang and Pontoppidan in our sample), but generally quite low confusion between the works of different authors. Visualizations of the text similarity matrices echo these results.

In further work, we plan to use moments of “misclassification” and overlap between authors and within the works of a single author to develop a better understanding of stylistic similarity and possible influence among the authors of the Modern Breakthrough. In particular, incorporating a temporal dimension into these analyses may help to estimate authorial influence by determining whether the classificatory “confusion” of a given author’s texts favors works by the authors that are considered to have influenced them. Alternately, such an analysis can suggest instances of text similarity and potential influence that extend or even contradict accepted narratives of Nordic literary history.

Topics: Nordic Textual Resources and Practices
Keywords: modern breakthrough, classification, influence, text similarity

Works cited
Börner, Katy. 2011. “Plug and Play Macroscopes.” Communications of the ACM 54 (3): 60-69.
Brandes, Georg. 1883. Det moderne gennembruds mænd. København: Gyldendal.
Broadwell, Peter, David Mimno and Timothy R. Tangherlini. 2016. “The Telltale Hat: Surfacing the Uncertainty in Folklore Classification.” Journal of Cultural Analytics 1 (2). In process.
Dahlerup, Pil. 1983. Det moderne gennembruds kvinder. København: Gyldendal.
Leonard, Peter and Timothy R. Tangherlini. 2013. “Trawling in the Sea of the Great Unread: Sub-Corpus Topic Modeling and Humanities Research.” Poetics 41 (6): 725-749.
Tangherlini, Timothy R. 2013. “The Folklore Macroscope: Challenges for a Computational Folkloristics.” The 34th Archer Taylor Memorial Lecture. Western Folklore 72 (1): 7-27.

Bibliography
Broadwell, Peter and Timothy R. Tangherlini. 2016. “GhostScope: Conceptual Mapping of Supernatural Phenomena in a Large Folklore Corpus.” In Maths Meets Myths: Quantitative Approaches to Ancient Narratives, edited by Ralph Kenna, Máirín MacCarron, and Pádraig MacCarron, 131-157. Cham, Switzerland: Springer International.
Broadwell, Peter, David Mimno and Timothy R. Tangherlini. 2016. “The Telltale Hat: Surfacing the Uncertainty in Folklore Classification.” Journal of Cultural Analytics 1 (2). In process.
Broadwell, Peter, Timothy R. Tangherlini and Hyun Kyong Hannah Chang. 2016. “Online Knowledge Bases and Cultural Technology: Analyzing Production Networks in Korean Popular Music.” In Series on Digital Humanities 7, 369-394. Taipei: National Taiwan University Press. In process.
Broadwell, Peter and Timothy R. Tangherlini. 2016. “WitchHunter: Tools for the Geo-Semantic Exploration of a Danish Folklore Corpus.” Journal of American Folklore 511 (Winter 2016): 14-42.

Figure 1. Confusion matrix of a subset of Danish-language works by authors from the Modern Breakthrough, with the size of the dots indicating the number of passages from each author and work on the horizontal axis that were categorized by a Naïve Bayes classifier as belonging to the author and work on the corresponding row of the vertical axis.
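A confusion matrix of this kind is just a tally of (actual label, assigned label) pairs, and a per-label F-score can be read off it. The Python sketch below is a generic illustration, not the authors’ implementation; the function names are assumptions and the passage counts are invented for the example.

```python
from collections import Counter


def confusion_matrix(pairs):
    """Tally (actual_label, predicted_label) pairs in a Counter keyed by the pair."""
    return Counter(pairs)


def f_score(matrix, label):
    """F1 for one label: the harmonic mean of precision and recall."""
    tp = matrix[(label, label)]
    fp = sum(n for (actual, pred), n in matrix.items()
             if pred == label and actual != label)
    fn = sum(n for (actual, pred), n in matrix.items()
             if actual == label and pred != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented passage-level results: (actual author, author chosen by the classifier).
results = (
    [("Bang", "Bang")] * 8
    + [("Bang", "Pontoppidan")] * 2
    + [("Pontoppidan", "Pontoppidan")] * 9
    + [("Pontoppidan", "Bang")] * 1
)
matrix = confusion_matrix(results)

# Off-diagonal cells are the "confused" passages (the red dots in the figure).
off_diagonal = {cell: n for cell, n in matrix.items() if cell[0] != cell[1]}
bang_f1 = f_score(matrix, "Bang")
```

In a plot, each cell count would set the size of the dot; the diagonal cells hold the passages assigned to their original label.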


Figure 2. Detailed “drill-down” view of the text passages from a single work (Bang’s novella Stille Eksistenser from 1886) identified by the Naïve Bayes classifier as most likely belonging to Bang’s novel Ved Vejen (also written in 1886). The color-coding of the words indicates that the classifier considered reddish words to be more closely associated with Stille Eksistenser, while the blue-tinted words are more closely related to Ved Vejen.
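The passage-level workflow behind such a drill-down view can be sketched in a few lines of Python. Everything here is an assumption made for illustration: a generic multinomial Naïve Bayes with add-one smoothing, “hold one out” read as leaving out one passage at a time, per-word log-odds as the measure of influence, and invented toy tokens with placeholder labels `work_a` and `work_b`.

```python
import math
from collections import Counter, defaultdict


def chunk(tokens, size=500):
    """Split a token list into consecutive passages of at most `size` words."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (a generic variant)."""

    def fit(self, passages, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for passage, label in zip(passages, labels):
            self.word_counts[label].update(passage)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def log_prob(self, word, label):
        counts = self.word_counts[label]
        return math.log((counts[word] + 1) / (sum(counts.values()) + len(self.vocab)))

    def predict(self, passage):
        total = sum(self.label_counts.values())

        def score(label):
            prior = math.log(self.label_counts[label] / total)
            # Words never seen in training are skipped.
            return prior + sum(self.log_prob(w, label) for w in passage if w in self.vocab)

        return max(self.label_counts, key=score)

    def influence(self, passage, label_a, label_b):
        """Per-word log-odds: positive favours label_a, negative favours label_b."""
        return {w: self.log_prob(w, label_a) - self.log_prob(w, label_b)
                for w in set(passage) if w in self.vocab}


def hold_one_out(passages, labels):
    """Classify each passage with a model trained on all the other passages."""
    predictions = []
    for i in range(len(passages)):
        model = NaiveBayes().fit(passages[:i] + passages[i + 1:],
                                 labels[:i] + labels[i + 1:])
        predictions.append(model.predict(passages[i]))
    return predictions


# Invented toy tokens; chunk size 4 instead of 500 to keep the example small.
tokens_a = "toget stationen toget stationen toget perronen toget stationen".split()
tokens_b = "stuen søstrene stuen søstrene stuen lampen stuen søstrene".split()
passages = chunk(tokens_a, size=4) + chunk(tokens_b, size=4)
labels = ["work_a", "work_a", "work_b", "work_b"]

predictions = hold_one_out(passages, labels)
model = NaiveBayes().fit(passages, labels)
odds = model.influence(["toget", "stuen"], "work_a", "work_b")
```

Sorting the `odds` values from most positive to most negative would give a word list running from one label to the other, loosely comparable to the color-coded list in the figure.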

Figure 3. A text similarity matrix for the same works as in Figure 1, based on the cosine similarity of the TF-IDF weights of the unigrams, bigrams, and trigrams in each work. Note the resemblance to Figure 1.
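A matrix like this can be computed without any external library. The following Python sketch builds TF-IDF vectors over unigrams, bigrams and trigrams and compares them by cosine similarity; the smoothed IDF formula, the function names and the three toy documents are assumptions, since the text does not specify the exact weighting used.

```python
import math
from collections import Counter


def ngrams(tokens, max_n=3):
    """Return all unigrams, bigrams and trigrams of a token list as tuples."""
    return [tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


def tfidf_vectors(docs):
    """Map each document to a sparse {ngram: tf * idf} dictionary."""
    term_counts = [Counter(ngrams(doc)) for doc in docs]
    doc_freq = Counter(term for counts in term_counts for term in counts)
    n_docs = len(docs)
    # Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1.
    idf = {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}
    return [{term: tf * idf[term] for term, tf in counts.items()}
            for counts in term_counts]


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dictionaries."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(docs):
    """Return the square matrix of pairwise cosine similarities."""
    vectors = tfidf_vectors(docs)
    return [[cosine(u, v) for v in vectors] for u in vectors]


# Invented toy documents standing in for full works.
docs = [
    "toget kom til stationen".split(),
    "toget kom til byen".split(),
    "lampen lyste i stuen".split(),
]
sim = similarity_matrix(docs)
```

Each cell `sim[y][x]` would set the shading of one cell in the plot; taking `1 - sim[y][x]` gives a simple distance that could feed a clustering plot.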

Figure 4. A text clustering plot of the works from Figures 1 and 3. The distance between the points (works) is indicative of their textual similarity as calculated for the similarity matrix shown in Figure 3.

Topical Discourse Networks: Methodological Approaches to Turkish Foreign Policy in Sub-Saharan Africa

Fabian Brinkmann
Ruhr-University Bochum, Germany

The Republic of Turkey undoubtedly changed in 2002 when the Adalet ve Kalkinma Partisi (Justice and Development Party, AKP) came into power after a period of economic and political struggle. With the end of the Cold War the Turkish Republic had been in a geopolitical situation which had an enormous impact on its self-perception and its international relationships. In a new geopolitical environment Turkey was searching for a new interpretation of its political role after it had lost the geopolitical standing it had during the Cold War. The search for a new geopolitical role would hugely influence the foreign policy of the Turkish Republic in the coming years, and the foreign policy of Turkey changed under the new circumstances. Developing the concept of ‘Strategic Depth’, which located Turkey at the centre of a ‘Eurasian-African landmass’, Turkish foreign policy became increasingly diversified, although whether this has been a dramatic reorientation or just a series of gradual shifts still remains a subject of debate (see for example Bagdonas 2012).

In this context, a lot has been written about the Turkish interests in the Balkans, the Caucasus, Central Asia and the Middle East, but still the political and economic interests of Turkey in Sub-Saharan Africa remain an underdeveloped part of the discussions about Turkish foreign policy. Against the background of these political developments a closer look at the argumentations, rhetorics and discourses behind the Turkish dedication towards Sub-Saharan Africa seems in order to provide an encompassing overview about the Turkish foreign policy towards Africa.

Naturally, the different actors of this particular political field cannot be ignored in this context since they are themselves dominant figures within the discursive formations. Among the most relevant actors in the field of Turkish foreign policy towards Sub-Saharan Africa are certainly the Turkish foreign aid agency TİKA and the Turkish bureau of religious affairs Diyanet, which both are increasingly active in Africa (see for example Ali 2011). For the Presidency for Turks Abroad and Related Communities (YTB), first approaches towards a more structured engagement with the African continent can also be identified (Öktem 2014). Besides these state-political actors, non-state actors will have to be taken into account as well. In this context the African engagements of Turkish NGOs, for example in the field of development aid, will have to be named. In a similar way the Turkish economic organizations (for an overview see Seufert 2012) and the think tanks that are both part of the first approaches towards cooperation with African countries (Uchehara 2008) are important players in this discursive field. Thus, the networks between the different state and non-state actors will have to be considered to provide an overview of the structures of the Turkish foreign policy discourse about Sub-Saharan Africa.

Based on ongoing research this paper will present how the Structural Topic Model, an R package developed by Roberts et al (2011, 2013, 2014, 2015, available online at structuraltopicmodel.com), can be used to uncover discursive structures and (discourse) networks of actors. It will describe the attempt to untangle the different political, economic, anticolonial, religious, historical and cultural discourses across intertwined actors via the mass data approach of Topic Modeling across different covariates in a diachronic and synchronic way.

Structural Topic Modeling (or STM) enables researchers to do additional things compared to other topic modeling approaches. It allows for the inclusion of metadata (i.e. date, actor, etc.) in the model. This can be done in two ways: topical prevalence and topical content. Topical prevalence allows us to look at the influence of the metadata on

the frequency of a topic (i.e. “Is a topic discussed more in one year as compared to another?”), while topical content allows us to observe how a particular topic is discussed (i.e. does a specific actor use different words than another). It can also be used to uncover the latent topic structures of documents. Thus, it can show how specific topics show up closely tied to other topics in a given corpus. Using these three aspects of STM it becomes possible to discover diachronic changes in topics and across actors as well as the topic structuring in the discourse of the examined documents (Wang 2011). The inclusion of metadata allows the researcher to build new and expanded questions into the research and make better inferences about relevant issues in the data of the corpus.

This mostly theoretical paper will show how these aspects of Structural Topic Modeling can be operationalized for an encompassing discourse analysis of actors in intertwined networks. It will show methodological-theoretical approaches derived from Critical Discourse Analysis, especially the concept of discourse strands (thematically consistent trends of discourse, which regularly appear in an overall societal discourse) by Siegfried Jäger (M. Jäger/S. Jäger 2007: 25) and the Discourse-Historical Approach of the Vienna School of Critical Discourse Analysis by Ruth Wodak. It aims at showing on a methodological-theoretical level how the opportunities of STM can be used to identify changes, similarities and differences between these actors in a discursive network.

Based on a solely Turkish corpus of all available documents, texts and utterances (such as for example press releases, activity reports, journals, speeches, etc.) published by the actors of this policy field, which have already been laid out above, the presented project aims at two things: a) uncovering the discursive macrostructures through the use of Topic Modeling, and b) investigating the discursive networks between different political actors by taking into account various document metadata.

Topic Modeling
Topic Modeling provides a way of ‘distant reading’ of documents uncovering topics and topical structures (Brauer/Fridlund 2013; Mimno 2012). In discourse analysis the term ‘topic’ has also been used by some scholars. Thus, Haslinger (2006) points out that topics could be understood as complexes of meaning that are talked about with different opinions. He does this in order to define specific discursive processes other than ‘discussion’ or ‘debate’. He argues that topics are the foundation of the structure of every form of communication and thus have to be a part of discourse analysis. Thus, this paper will try to show the methodological possibilities Structural Topic Modeling gives the researcher to undertake a structured analysis of discursive networks, debates and discussions across a multitude of intertwined actors.

All this will be done with the already laid out example of Turkish foreign policy in Sub-Saharan Africa. These discourses are of particular interest, since the Sub-Saharan space is a new field of Turkish foreign policy and the political and societal discourses that support these new developments have only just been developed in Turkish politics. Thus, different actors vie for discursive influence and power over this concrete policy field. Topic Modeling is able to uncover these discursive conflicts and differences.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: Turkey, foreign policy, topic modeling, network analysis, discourse analysis

Bibliography (selection)
Ali, Abdirahman: Turkey’s Foray into Africa. A New Humanitarian Power?, in: Insight Turkey 13/4 (2011), pp. 665-73.
Bagdonas, Özlem Demirtaş: A Shift of Axis in Turkish Foreign Policy or a Marketing Strategy? Turkey’s Uses of its ‘Uniqueness’ vis-à-vis the West/Europe, in: Turkish Journal of Politics, 3/2 (2012), pp. 111-132.
Brauer, René/Fridlund, Mats: Historizing Topic Models. A Distant Reading of Topic Modeling Texts within Historical Studies, in: Cultural Research in the Context of ‘Digital Humanities’. Proceedings of International Conference 3-5 October 2013, St. Petersburg 2013, pp. 152-163.
Haslinger, Peter: Diskurs, Sprache, Zeit, Identität. Plädoyer für eine erweiterte Diskursgeschichte, in: Eder, Frank X. [ed.]: Historische Diskursanalysen. Genealogie, Theorie, Anwendungen, Wiesbaden 2006, pp. 27-50.
Jäger, Margarete/Jäger, Siegfried: Deutungskämpfe. Theorie und Praxis Kritischer Diskursanalyse, Wiesbaden 2007.
Mimno, David: Computational Historiography. Data-Mining in a Century of Classic Journals, in: ACM Journal of Computing in Cultural Heritage 5/1 (2012), pp. 3:1-3:19.
Öktem, Kerem: Turkey’s New Diaspora Policy. The Challenge of Inclusivity, Outreach and Capacity (Istanbul Policy Research Paper), Istanbul 2014.
Roberts, Margaret E. et al.: stm: R Package for Structural Topic Models (Working Paper), 2011. (Available online at https://github.com/bstewart/stm/blob/master/vignettes/stmVignette.pdf?raw=true).
Roberts, Margaret E. et al.: The Structural Topic Model and Applied Social Science, presented at: Advances in Neural Information Processing Systems Workshop on Topic Models: Computation, Application, and Evaluation, 2013. (Available online at http://scholar.harvard.edu/files/bstewart/files/stmnips2013.pdf).
Roberts, Margaret E. et al.: Navigating the Local Modes of Big Data. The Case of Topic Models, presented at: Data Analytics in Social Science, Government and Industry, New York, forthcoming. (Available online at http://scholar.harvard.edu/files/dtingley/files/multimod.pdf).
Roberts, Margaret E. et al.: Structural Topic Models for Open-Ended Survey Responses, in: American Journal of Political Science 58/4 (2014), pp. 1064-1082.
Roberts, Margaret E. et al.: A Model of Text for experimentation in social sciences (Working Paper), 2015. (Available online at http://scholar.harvard.edu/files/bstewart/files/stm.pdf).
Seufert, Günter: Außenpolitik und Selbstverständnis. Die gesellschaftliche Fundierung von Strategiewechseln in der Türkei, Berlin 2012.
Uchehara, Kieran E.: Continuity and Change in Turkish Foreign Policy Toward Africa, in: Akademik Bakış 3 (2008), pp. 43-64.
Walker, Joshua W.: Turkey’s Global Strategy: Introduction: The Sources of Turkish Grand Strategy – the ‘Strategic Depth’ and ‘Zero-problems’ in context, in: Kitchen, Nicholas [ed.]: IDEAS reports – special reports (2011), pp. 6-12.
Wang, Xuerui: Structured Topic Models. Jointly Modeling Words and their Accompanying Modalities, Amherst 2009.

Vectors or Bit Maps? Brief Reflection on Aesthetics of the Digital in Comics

Daniel Brodén
University of Gothenburg, Sweden

In recent years, researchers from various disciplines have contributed to the emerging study of the digital in comics (see Goodbrey 2013; Digital Humanities Quarterly 2015). Scholars from film and media studies, for example, have demonstrated the uses of film theory that deals with circulation (production, distribution and consumption) for thinking about digital comics and web comics (see Werschler 2011). Others have written on issues of digital mediatisation, drawing on theories of adaptation and animation (see Burke 2014). However, scholars have shown less interest in how film theory can be useful to explore the aesthetics of the digital in comics (for another study of media forms in digital humanities that utilizes film theory, see Ng 2015).

The aim of this paper is to briefly reflect on this topic, drawing on Sean Cubitt’s prolific study on the history of moving images from a digital perspective, The Cinema Effect

(2004). Cubitt’s book is grounded in the concepts pixel, cut and vector. Concisely put, pixel describes the cinematic image’s appearance. Cut concerns how images are organised and differentiated through a film. Schematically transferred to the medium of comics, these concepts may designate the visual elements of images and grids, respectively. My interest lies in the vector, which concerns the relation between the image and the interpretative mind. In computer graphics the vector is a line drawn from the centre of the screen, connecting programmed points and existing only temporarily as it leaves behind trails of light. Cubitt utilizes the analogy of the disappearing line to conceptualize thinking in the cinema as a vector-like process that links images in space and time, drawing on the viewer’s experience of the flow of images. However, writing on digital-effects-driven Hollywood cinema, Cubitt also uses the concept of the bit map to argue that what he regards as the vector’s principle of openness has moved into something more fixed. Through the bit map he describes how cinema in the digital era has become not only more visually spectacular but also more composed and controlled, not least since serendipity is harder to achieve on a computer, an instrument of precision (2004: 251).

Given the limited space of the paper, it is hard to address the complications of exporting theoretical concepts from one medium to another or the fundamental differences between cinema and comics. Nor will I engage with the ideological argument Cubitt develops concerning the qualities of the bit map. However, it should be noted that he presents his concepts in response to a medium with some kind of indexical relation to reality, a relation created by the cinematographic apparatus that seems to capture events. But as Cubitt himself writes, digital cinema and other digital media do not primarily refer, they communicate (2004: 250). The same could be said about comics (both pre-digital and digital) and I simply want to use Cubitt’s concepts, the vector and the bitmap, in order to tease out some brief reflections on the aesthetics of the digital in comics.

Digital Colouring and Multimedia Styles
I will draw on two examples from mainstream comics. The first one concerns the breakthrough of digital colouring in the mid-1990s. In 1990 Frank Miller, the auteur behind iconic graphic novels such as Dark Knight Returns (1986), collaborated with artist Dave Gibbons of Watchmen (1986–1987, author Alan Moore) fame, on the dystopian action/satire series Give Me Liberty for Dark Horse Comics. A sequel, Martha Washington Goes to War, was published in 1994 and Gibbons’ idiosyncratic style, a combination of cartoonish and mundane realism, characterized both runs. But there were also significant differences. For example, on Martha Washington Goes to War Gibbons used a less gritty style, a choice that tied in with Miller’s more fantastical, high-concept storytelling. However, what is most interesting here is that whereas the colours in Give Me Liberty were hand-painted on watercolour paper with the then-advanced blue-line method, computer rendering was used in Martha Washington Goes to War. Simply by looking at the clean, smooth colour schemes the attentive reader could see that digital graphics were used.

Figure 1. Give Me Liberty (1990)

Figure 2. Martha Washington Goes to War (1994)

To some extent, this ties into how Cubitt associates the bit map with a more precisely composed aesthetic universe. In Martha Washington Goes to War there is even, arguably, a visible tension between Gibbons’ old-fashioned hatched line drawings and the slick, heavily graded colour palette. One can discern similar tensions in later works by Gibbons, such as the subsequent Give Me Liberty runs (1995–2007) or the high-concept spy thriller/parody The Secret Service (2012, author Mark Millar), and conspicuous uses of digital colouring have generally become a prominent element in mainstream comics.

It is worth pointing out that the aesthetics of digital colouring depends on the approach. The use of rendering, which has a certain three-dimensional feel, differs from a flat colouring approach, which tends to have more of an old-school feel. Moreover, most artists working in mainstream comics today combine digital and classic hands-on approaches. For example, on The Secret Service Gibbons used watercolour brushes and Indian ink as well as digital graphic tablets.

My second example concerns this kind of hybrid aesthetics: writer Brian Michael Bendis and artist Alex Maleev’s acclaimed run of Marvel Comics super-hero book Daredevil (2001–2006). Here, I should stress a point Cubitt makes: that the “distinction between bit map and vector […] so dear to first-year classes in computer graphics, is now approaching obsolescence” (2004: 249). According to Cubitt, the differences between the two principles no longer lie on “the familiar axis of verisimilitudinous painting and abstraction but along a line stretching from cartography at one end to architecture at the other. Somewhere in between lie the fields of virtual sculpture and computer-aided design manufacture” (ibid). This idea seems somewhat pertinent to Bendis and Maleev’s Daredevil run. Maleev has been described as one of a new, graphically astute breed of multimedia artists, who incorporate painting, drawing and cartooning as well as photography, collage and computer effects to expand the visual universe of their works (Schumer 2005). Combining an angular, sketchy approach and photorealism (working from photographs of models and cityscapes), he has crafted a style which goes beyond established modes of realism in mainstream comic books, yet retains an organic feel.

Figure 3. Daredevil no. 62 (2003)

It is tempting to mainly describe Maleev’s images in terms of a higher degree of realism. Arguably, they have another quality of verisimilitude compared to, for example, the artwork of Frank Miller and David Mazzucchelli’s defining Daredevil mini-series Born Again (1986), which is also characterized by gritty and realistic but nevertheless non-photorealistic stylization.

Figure 4. Daredevil no. 229 (1986)

However, Cubitt’s claim that the difference between the bit map and the vector should not simply be described from an axis of verisimilitudinous painting and abstraction complicates matters. Though it would not be accurate to place Maleev’s images in a field equivalent to the one Cubitt associates with digital cinema (in between virtual sculpture and computer-aided design manufacture), as the analogies become a little bit “off” in the context of the comics medium, it nevertheless seems reasonable to propose that Maleev’s imagery exists in an aesthetic border zone in which Miller and Mazzucchelli’s pre-digital Daredevil mini-series do not.

Conclusion
The aesthetics of the digital has brought new visual elements to comics, such as computer rendering of colours, but also hybrid, multimedia aesthetics, which do not necessarily on the surface seem that different from the ones of yesterday. It would probably not be impossible to create images with similar qualities to those of Maleev’s without the use of computers, but arguably it would require more work and time. In this perspective, what digital technology has perhaps enabled is an economy within the comics industry with which spectacular images can be manufactured (cf. Cubitt 2004: 248). But some differences between pre-digital and digital aesthetics in comics might also lie somewhere along a line that stretches toward absolute precision and control.

Topics: The Digital, the Humanities, and the Philosophies of Technology
Keywords: comics, digital aesthetics

Bibliography
1. Burke, L. (2016): “Sowing the Seeds: How 1990s Marvel Animation Facilitated Today’s Cinematic Universe”, M J McEniry, R Moses Peaslee & R G Weiner (eds), Marvel Comics into Film: Essays on Adaptations Since the 1940s, London: McFarland.
2. Cubitt, S. (2004). The Cinema Effect. Cambridge, Mass: MIT Press.
3. Digital Humanities Quarterly (2015), 9:4. [Issue on digital comics]
4. Goodbrey, D. (2013). “Digital Comics: New Tools and Tropes”. Studies In Comics, 4:1.
5. Ng, J. (2015). “The Cut between Us: Digital Remix and the Expression of Self”. In Svensson, P. & Goldberg, D T. (eds), Between Humanities and the Digital. Cambridge, Mass: MIT Press.
6. Schumer, A. (2005). “Super-hero Artists of the Twenty-first Century: Origins”. In Dooley, M. & Heller, S. (eds), The Education of Comics Artists. New York: Allworth Press.
7. Werschler, D. (2011). “Digital Comics, Circulation, and the Importance of Being Eric Sluis”. Cinema Journal, 50:3.

Multilingual Clusters and Gender in Nordic Twitter

Steven Coats
University of Oulu, Finland

Recent years have seen an increase in the relative prominence of computer-mediated communication (CMC) modalities such as texting, instant messaging, or posting on social media, and platforms such as Twitter have become multilingual sites with global

representation (Mocanu et al. 2013; Leetaru et al. 2013). At the same time, population movements and changes in education and media consumption have contributed towards an increasing bi- and multilingualization of local environments, trends that are particularly evident in the Nordic countries. National languages continue to receive reinforcement in education and state media, but bilingualism with English has become the norm in Fenno-Scandia, while greater population mobility and demographic changes have contributed to increased linguistic diversity in the population.

Large-scale quantitative studies of multilingualism on CMC and Twitter (e.g. Ronen et al. 2014, Hale 2014) have shed light on multilingual networks globally, and the ways in which Twitter language use can pattern with gender expression have also been investigated in linguistics and natural language processing research (e.g. Bamman et al. 2014, Burger et al. 2011, Rao et al. 2010).

In this study, online multilingualism in the Nordic countries is investigated by means of a quantitative analysis of geo-located Twitter messages. The research investigates the following questions: Which languages are favored by multilingual users in the Nordics? To what extent are the Nordic languages used by multilingual Twitter users located in Fenno-Scandia? What role is the global language English playing in this multilingual landscape? And finally, what are the similarities and differences between the multilingual networks of male and female users?

In a first step, the online linguistic behavior of bi- or multilingual persons using Twitter in the Nordic countries is investigated according to location and gender. In a second step, the influence of the languages themselves is analyzed by looking at the aggregate network behavior of language users according to gender. What does the structure of the networks of multilingual users tell us about the current state of the Nordic languages and their future prospects?

Previous Work
Much research has investigated the status of English, the extent of English use in various media or communicative contexts, and the attitudes of speakers towards the use of English in the Nordic countries. For example, in Iceland, the majority of Icelanders are exposed to English every day, while 21% of Icelanders report speaking English daily (Arnbjörnsdóttir 2011). Norwegians are reported to be essentially diglossic (Rindal 2010; Rindal and Percy 2013), and are generally unperturbed by the prospect of English displacing Norwegian in Norway (Sandøy 2010). For Sweden, Bolton and Meierkord (2013), for example, attest that while Swedish remains the “preferred language… in most domains” (93), English is dominant in academia and business. Similar findings are reported for Finland in the results of an extensive survey into the use of English in the country: for Finland, English has become “a language used in many domains and settings within Finnish society” (Leppänen et al. 2011: 16).

A number of studies of CMC and Twitter language have investigated aspects of English, including phenomena such as the discourse functions of hashtags (Wikström 2014; Squires 2015), lexical innovation in American English (Eisenstein et al. 2016), grammatical variation in English-language Twitter from Finland and the Nordic countries (Coats 2016a, 2016b), or the interaction of demographic parameters such as gender with lexical and grammatical features in American English (Bamman et al. 2014). For multilingualism, Ronen et al. (2014) compared the worldwide influence of languages by analyzing networks of bi- and multilingual book translations, Wikipedia author editors, and Twitter users, and found that English plays an important central role. Hale (2014) investigated global multilingual networks on Twitter, including the network associations of retweets and user mentions, and found that while most interaction networks are language-based and English is the most important single mediating language, other languages collectively represent a larger bridging force. Eleta and Golbeck (2014) demonstrate that multilingual users' language choice on Twitter reflects the predominant language of their social networks. While

it has been found that users of less represented languages are more likely to switch languages and that English has become the central mediating language, the interaction of multilingualism with gender has not yet been subject to research attention.

Data Collection
Tweets with populated place attributes were collected from the Twitter Streaming API in November 2016 using the Tweepy library in Python (Roesslein 2015). The country_code attribute was used to filter for only those tweets originating from the Nordic countries (including Åland and the Faroe Islands).

Language Determination
Since March 2013 Twitter objects include an automatically detected language field, lang, determined on the basis of probabilistic matching of byte sequences in various language training data. Like other automatic language detection modules, the Twitter algorithm performs poorly on very short sentences. For this reason, tweets whose language was reported as “undefined”, as well as those with fewer than 6 word tokens, were removed in a further filtering step. In total, the multilingual database consisted of 296,437 tweets by 33,347 unique users in 51 languages.

Gender was disambiguated on the basis of name lists provided by the statistical offices of the Nordic countries. The most extensive name information was available for Denmark, while public information available for the other Nordic countries was somewhat less extensive. 5,277 male and 6,095 female given names from the lists were matched with the value of the char_name attribute for each unique user in the dataset. The method assigned gender to approximately 65% of the tweets collected from the target area.

Quantification of Bilingualism Strength
A user in the dataset was determined to be bilingual for languages i,j if he or she had authored at least three tweets in both languages. Of the 51 languages in the dataset, male bilingual users were present for 40 languages and females for 41. The connection strength between languages i,j was quantified using the phi coefficient, calculated from a contingency table of the number of bilinguals.

Phi is equivalent to Pearson's product-moment correlation coefficient for two binary variables, and ranges in value from -1 to 1. Positive values indicate the language pairs are more strongly connected than would be expected based on the prevalence of the languages in the multilingual dataset. A t-statistic was calculated to test the significance of the correlation between languages. Links between languages that were statistically significant at p<0.1 were retained in the multilingualism network.

The network relationship between linguistic communities was represented by an N by N matrix of the number of bilingual users, where N represents the number of languages. To reduce the number of false positives due to language misidentification, only connections with fewer than bilingual users were considered.

Network relationships were visualized using the R packages igraph and visNetwork (Csardi and Nepusz 2006; Almende and Thieurmel 2016).

Results
In terms of overall language representation, and in accord with earlier findings (Mocanu et al. 2013; Coats 2016b), English is the most prevalent language in the data, with approximately 32% of the data in English, followed by 26% in Swedish, 13% in Finnish, 6% in Norwegian, 5% in Danish, and 2% in Icelandic.

Overall, 6.2% of the users in the complete data set qualify as multilinguals. For the gendered subcorpora, 6.49% of male users and 6.42% of female users fulfilled the criteria for multilingualism. Taken together, these findings match well with those reported by Hale (2014), who reported that 11% of Twitter users in a global sample collected in 2011 are multilingual.

A multilingualism network for the entire Nordic region was created without taking gender into account. Node size corresponds to the number of multilingual users for a language. Edge width corresponds to the strength of the connection (number of bilinguals) for a language pair.

tudes towards the traditional languages of the region. For linguistic communities of languages not traditionally present in the Nordic countries, the network associations may reflect recent gender imbalances in migration.
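The filtering and bilingualism measures described under Data Collection, Language Determination and Quantification of Bilingualism Strength can be illustrated with a minimal Python sketch. It is not the study's own code. The tweet layout (place.country_code, lang, text, user.id_str), the value "und" for undetermined language, the country-code set, and the helper names keep_tweet, languages_per_user and phi_and_t are assumptions made for illustration. The t-statistic uses the usual formula for a Pearson correlation with n − 2 degrees of freedom, which the abstract does not spell out, and the abstract also does not say exactly which users enter the contingency table.

```python
from collections import Counter, defaultdict
from math import sqrt

# Assumed ISO country codes for the target area (AX = Åland, FO = Faroe Islands).
NORDIC_CODES = {"DK", "FI", "IS", "NO", "SE", "AX", "FO"}
UNDEFINED_LANG = "und"  # assumed value of `lang` when no language was detected
MIN_TOKENS = 6          # tweets with fewer word tokens are dropped
MIN_TWEETS = 3          # tweets needed per language to count a user for it


def keep_tweet(tweet):
    """Apply the country, language and length filters to one tweet dict."""
    place = tweet.get("place") or {}
    return (
        place.get("country_code") in NORDIC_CODES
        and tweet.get("lang") not in (None, UNDEFINED_LANG)
        and len(tweet.get("text", "").split()) >= MIN_TOKENS
    )


def languages_per_user(tweets):
    """Map each user id to the languages with at least MIN_TWEETS kept tweets."""
    counts = defaultdict(Counter)
    for tweet in tweets:
        if keep_tweet(tweet):
            counts[tweet["user"]["id_str"]][tweet["lang"]] += 1
    return {
        user: {lang for lang, n in per_lang.items() if n >= MIN_TWEETS}
        for user, per_lang in counts.items()
    }


def phi_and_t(user_langs, lang_i, lang_j):
    """Phi coefficient and t-statistic from the 2x2 table of users by language."""
    n11 = n10 = n01 = n00 = 0
    for langs in user_langs.values():
        has_i, has_j = lang_i in langs, lang_j in langs
        if has_i and has_j:
            n11 += 1
        elif has_i:
            n10 += 1
        elif has_j:
            n01 += 1
        else:
            n00 += 1
    n = n11 + n10 + n01 + n00
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0, 0.0  # phi is undefined for an empty margin; 0 is a convention here
    phi = (n11 * n00 - n10 * n01) / denom
    if abs(phi) >= 1.0:
        return phi, float("inf") if phi > 0 else float("-inf")
    t = phi * sqrt(n - 2) / sqrt(1 - phi * phi)
    return phi, t


# Toy data: four users and the languages each one counts for.
users = {"a": {"en", "sv"}, "b": {"en", "sv"}, "c": {"en", "fi"}, "d": {"fi", "da"}}
phi_en_sv, t_en_sv = phi_and_t(users, "en", "sv")
```

In this sketch a link would be kept only if the t-statistic exceeds the critical value for the chosen significance level; that comparison, and the name-based gender assignment, are left out.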

Topics: Nordic Textual Resources and Practices
Keywords: computer-mediated communication, Twitter, multilingualism, NLP

For the region as a whole, a network of 22 languages and 40 edges describes the statistically significant bilingual links. English clearly plays the most important role: it is connected to all of the languages for which a statistically significant phi value was calculated. Swedish has the next highest number of connections, connecting to 11 of the 22 language nodes. Other Nordic languages have fewer active bilingualism links on Twitter: Danish has 4, Norwegian and Finnish 3, and Icelandic is only connected to English.

Multilingual networks were also created for individual Nordic countries. In them, the principal national language(s) figure prominently, but English remains important as a bridge between linguistic communities.

The multilingualism networks created by gender for the Nordic region may reflect some cultural and demographic facts. While English plays the central role in both the male and female clusters, the languages represented as well as the strength of links between languages are somewhat different for males and females. It can be shown that for Nordic languages, link strength for gendered clusters may reflect common cultural atti-

Bibliography
Coats, Steven. (2016a). Grammatical feature frequencies of English on Twitter in Finland. In Lauren Squires (Ed.), English in computer-mediated communication: Variation, representation, and change, 179–210. Berlin: de Gruyter Mouton. https://doi.org/10.1515/9783110490817-009
Coats, Steven. (2016b). Grammatical frequencies and gender in Nordic Twitter Englishes. In Darja Fišer and Michael Beißwenger (Eds.), Proceedings of the 4th conference on CMC and social media corpora for the humanities, 12–16. Ljubljana: U. of Ljubljana Academic Publishing. (http://nl.ijs.si/janes/wp-content/uploads/2016/09/CMC-conference-proceedings-2016.pdf)

The Prior-project: From Archive Boxes to a Research Community

Volkmar Engerer
Henriette Roued-Cunliffe
Jørgen Albretsen
Per Hasle
University of Copenhagen, Denmark

Introduction
A very important part of Digital Humanities (DH) is the development, use and discussion of digital research infrastructures within the humanities field. In fact, the notion of DH is often almost identified with this kind of endeavours. We ourselves think that such a conception is too narrow and misses important points, but we shall not attempt a general discussion here. However, it is important to note that data and representations within the humanities are often more heterogeneous and more dependent on domain expertise than are datasets within the STEM fields and in fact even the social sciences, whose datasets tend to be more regular and more amenable to standardized tools for storing, retrieving, analysing and visualising.

In this paper, we present a DH research infrastructure which relies heavily on a combination of domain knowledge with information technology. The general goal is to develop tools to aid scholars in their interpretations and understanding of temporal logic. This in turn is based on an extensive digitisation of Arthur Prior’s Nachlass kept in the Bodleian Library, Oxford. The DH infrastructure in question is the Prior Virtual Lab (PVL). PVL was established in 2011 in order to provide researchers in the field of temporal logic easy access to the papers of Arthur Norman Prior (1914-1969), and officially launched together with Prior’s Nachlass at the Arthur Prior Centenary Conference at Balliol College, Oxford, in August 2014 (Arthur Prior Centenary Conference 2014; Albretsen et al. 2016a).

Prior was a distinguished logician, philosopher, and in his younger years also a theologian. He is best known for his work on time and for being the founding father of modern temporal logic, beginning in New Zealand in the early 1950s. In 1956 he presented his ideas at the John Locke Lectures in Oxford. Following this he took up a professorship in Manchester (1959-1965) and was later appointed Reader in the University of Oxford and Fellow of Balliol College (1966-1969). Prior died, aged 55, from a heart attack, while on a lecturing tour in Norway (Priorstudies 2016).

Prior’s archive is now kept in the Bodleian Library in Oxford and is still subject to copyright. Following an agreement between the team behind PVL, the Prior family, and the Bodleian Library, the team has been permitted to take digital photos of the archive material. This work is ongoing and has currently resulted in approx. 7000 photos, which have been reassembled in the PVL so that they mirror the original documents, integrating a facility for transcribing them and adding user comments (PVL 2016).

Current Prior Virtual Lab
The restricted access has inspired the term Virtual Closed Collaborative Community. To get access a potential new user must first contact the project team and ask for a login, stating her or his areas of research and in which ways that person’s work in the PVL can add to the collaborative effort of publishing Prior’s Nachlass, the digital edition of Prior’s hitherto unpublished papers as well as other relevant material such as correspondence between Prior and other researchers, Prior’s notebooks and scrapbooks etc. In PVL users can follow each other's progress and add comments to on-going transcriptions (Albretsen et al. 2016b). After an editorial process the transcribed texts are made available in PDF format combined with a prototype search facility (Nachlass 2016). Those users who have contributed to the transcription are credited in the footnotes of each transcribed edition. Our experience to date is that the current PVL is in need of enhanced search facilities combined with underlying metadata structures in the Nachlass.

New Digital Humanities project
Prior’s archive includes various documents such as drafts of philosophical essays, letter correspondence between Prior and other scholars, or sudden ideas scribbled as handwritten notes. These documents are currently used as information sources about Prior’s convictions, theories, his life, relations to colleagues etc. However, when digitised, transcribed, and connected to each other through a database structure, they become a research object in their own right. Because of this transformation, Prior scholars can now explore patterns in the structure of the documents that were not visible before.

The further development of PVL is headed by a research group (the authors of this abstract) at the Royal School of Library and Information Science, University of Copenhagen. This activity forms an important part of the “Prior project”, which in 2016 received funding from the Danish Council for Independent Research | Humanities to carry out the research project The Primacy of Tense: A.N. Prior Now and Then, duration three years (DFF Grant 2016). The further development of PVL will be split into work on the data repository and the interface as two separate entities. The team behind PVL are to varying degrees Prior scholars, digital humanists, information scientists, and database engineers. Moreover, we have a lively exchange with the other project researchers as well as the users of PVL. The project aims to combine this cross-disciplinary expertise in order to integrate community-specific practices of Prior scholars into the data structures and interfaces of the digital tools they use. It must be said that the information behaviour of the users has not yet been studied systematically. Such a study is one of the points within our project plan.

Data repository
In order to extend the existing facilities of the PVL it is necessary to offer a data and query structure that enables Prior scholars to explore the documents with varying and flexible parameters such as references to logicians, publications being mentioned and theories discussed. The new PVL architecture aims to separate the data structure from the interface and to develop a sustainable dataset that is suitable for both new and future interface designs. It applies traditional information science knowledge (mostly generated in the library domain) to the data repository, drawing on insights from indexing theory, metadata research, knowledge organization, information retrieval, and theories of information seeking.

A further refinement of the PVL’s data structure can be achieved by ontologies. The information scientific concept of an ontology encompasses the sphere of indexing terms and related search terminology at the same time, and therefore regards index terms as closely related to the vocabulary used by specialists in their domain. The step from traditional thesauri and classification schemes to ontologies of knowledge domains integrates semantic web principles into the description of data and introduces a controlled language for knowledge representation with a built-in logic. This also makes it possible to derive information which is not explicitly contained in the descriptive terms themselves (Antoniou et al. 2012: 4). To be a bit more specific, let’s give an example. The metadata established (or to be established) in this infrastructure will contain not only standardised metadata such as author, title, date, etc., but also and in fact more so domain specific metadata such as types of temporal logic, e.g. A-series and B-series logics, hybrid logic, metric and non-metric tense logic, and so on. These notions can only be established by domain experts and not by general information specialists. At the same time, they are exactly the kind of metadata that makes it possible to search and chart the kind of patterns that experts in the field are looking for.

New interface for PVL
It is our goal to make the PVL a research portal, where query results are presented to scholars through an interface that facilitates the identification of new relationships and patterns, and offers alternative ways of understanding and analysis. This work will build on concepts identified in Roued-Cunliffe’s (2011) research on Decision Support Systems for the reading of ancient documents. This research examines digital tools useful for the transcription, interpretation and publication of the Vindolanda Tablets (2010) from the Roman occupation in Britain. However, many of the conclusions are equally relevant for scholarship on other handwritten documents such as Prior’s. Building the new interface comprises the task of bringing together Prior community practices with system design, metadata structure and the system’s affordances in terms of Prior researchers’ information seeking behaviour.
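The combination of standardised and domain-specific metadata described under Data repository, and the idea that an ontology lets a system derive information not stated in the index terms themselves, can be sketched in a few lines of Python. This is an illustration only, not the PVL data model. The Record class, its field names, the BROADER table and the search function are all hypothetical, as are the example records. The term hierarchy is an assumption built from the logic types the text lists, and it treats metric and non-metric tense logic as narrower than an assumed "tense logic" term.

```python
from dataclasses import dataclass, field

# Hypothetical mini-ontology: each index term points to its broader term.
BROADER = {
    "metric tense logic": "tense logic",
    "non-metric tense logic": "tense logic",
    "tense logic": "temporal logic",
    "hybrid logic": "temporal logic",
    "A-series logic": "temporal logic",
    "B-series logic": "temporal logic",
}


def with_broader_terms(term):
    """Return the term together with every broader term reachable from it."""
    terms = [term]
    while terms[-1] in BROADER:
        terms.append(BROADER[terms[-1]])
    return set(terms)


@dataclass
class Record:
    # Standardised metadata.
    title: str
    author: str
    date: str
    # Domain-specific metadata, assigned by domain experts.
    logic_types: set = field(default_factory=set)
    logicians_mentioned: set = field(default_factory=set)


def search(records, logic_type=None, logician=None):
    """Find records by logic type (including derived broader types) or logician."""
    hits = []
    for record in records:
        derived = set().union(*(with_broader_terms(t) for t in record.logic_types))
        if logic_type is not None and logic_type not in derived:
            continue
        if logician is not None and logician not in record.logicians_mentioned:
            continue
        hits.append(record)
    return hits


# Invented example records, not items from the Nachlass.
records = [
    Record("Example draft A", "A. N. Prior", "undated",
           {"metric tense logic"}, {"Example Logician"}),
    Record("Example letter B", "A. N. Prior", "undated", {"hybrid logic"}),
]
tense_hits = search(records, logic_type="tense logic")
```

Here the first record is indexed only as "metric tense logic", yet a query for "tense logic" or "temporal logic" still finds it, because the broader terms are derived from the hierarchy rather than stored on the record. Keeping the term hierarchy apart from the records is one way to let experts revise the ontology without re-indexing documents.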

Conclusion
PVL as well as the general website concerned with Prior’s work and his archive in the Bodleian Library has without doubt already for quite some time been a useful DH infrastructure for researchers. This is evident not least in many papers from the Arthur Prior Centenary Conference, cf. (Albretsen et al. 2016a), to which the Nachlass material made available through PVL was crucial. Moreover, this infrastructure clearly could not have been developed without specific expertise on temporal logic and Prior’s work. PVL is, in all modesty, a showcase of how important humanities material kept in a research library can be digitised using domain knowledge (and indeed only when using domain knowledge), and made available and useful for the relevant research community, making up the Virtual Closed Collaborative Community. In this manner it also reflects an important characteristic of many research infrastructures for the humanities, namely a particularly strong call for domain expertise for their useful development. The perspectives for taking PVL to its next level raise some new information scientific, not to say epistemological, issues of great importance. The development of a relevant ontology together with search options and visualisations of search results as well as other PVL material is in fact not just about making powerful tools available for research in temporal logic and in Prior’s work; it is itself such research. The structure to be achieved is not neutral. It is itself a kind of theory about the internal coherence in Prior’s work, a “statement” about its overall architecture. This we intend to elaborate in a longer ensuing paper, but we hope to have established a convincing case to the effect that our infrastructure does indeed form a sufficient basis for studying some pertinent epistemological issues for DH.

Topics: The Digital, the Humanities, and the Philosophies of Technology
Keywords: digital epistemology, domain analysis, ontology, research infrastructure, virtual closed collaborative community.

References
Albretsen, J., Hasle, P., and Øhrstrøm, P. 2016a. Special Issue on The Logic and Philosophy of A.N. Prior. Synthese. Volume 193 Number 11. Guest edited by Jørgen Albretsen, Per Hasle, and Peter Øhrstrøm. http://link.springer.com/journal/11229/193/11/page/1. Retrieved November 14, 2016.
Albretsen, J., Hasle, P., and Øhrstrøm, P. 2016b. The Virtual Lab for Prior Studies: An example of a Closed Collaborative Community, DRAFT paper. http://research.prior.aau.dk/anp/pdf/The_Virtual_Lab_for_Prior_Studies_article_draft.pdf. Retrieved November 14, 2016.
Antoniou, Grigoris, Groth, Paul, van Harmelen, Frank & Hoekstra, Rinke. 2012. A semantic Web primer. 3rd ed. Cambridge, Mass.: MIT Press.
Arthur Prior Centenary Conference. 2014. http://conference.prior.aau.dk/. Retrieved November 14, 2016.
DFF Grant. 2016. The Primacy of Tense: A.N. Prior Now and Then, funded 2016-2019 by the Danish Council for Independent Research | Humanities. DFF|FKK Grant-ID: DFF – 6107-00087. http://ufm.dk/forskning-og-innovation/tilskud-til-forskning-og-innovation/hvem-har-modtaget-tilskud/2016/bevillinger-fra-det-frie-forskningsrad-kultur-og-kommunikation-til-dff-forskningsprojekt-2-juni-2016. Retrieved November 14, 2016.
Nachlass. 2016. http://nachlass.prior.aau.dk. Retrieved November 14, 2016.
Priorstudies. 2016. http://www.priorstudies.org. Retrieved November 14, 2016.

PVL. 2016. http://research.prior.aau.dk. Retrieved November 14, 2016.
Roued-Cunliffe, H. 2011. A decision support system for the reading of ancient documents (Doctoral thesis). University of Oxford. https://ora.ox.ac.uk/objects/uuid:9d547661-4dea-4c54-832b-b2f862ec7b25. Retrieved November 14, 2016.
Vindolanda Tablets. 2010. Vindolanda Tablets Online II. http://vto2.classics.ox.ac.uk. Retrieved November 14, 2016.

Recent publications by the first author:
Engerer, Volkmar (accepted 13 July, 2016): “Control and Syntagmatization. Vocabulary Requirements in Information Retrieval Thesauri and Natural Language Lexicons”, Journal of the Association for Information Science and Technology.
Engerer, Volkmar (in press): „Informationswissenschaft für Linguisten. Die Sprache des Information retrieval“ (Proceedings of the GeSuS annual conference in St. Petersburg, Russia, 2015).
Engerer, Volkmar (in press): „Das Vokabular zwischen Sprach- und Informationswissenschaft“ (Proceedings of the GeSuS annual conference in Brno, Czech Republic, 2016).
Engerer, Volkmar, (2016), “Exploring interdisciplinary relationships between linguistics and information retrieval from the 1960s to today”, Journal of the Association for Information Science and Technology, Article first published online April 4, 2016 (Early view). DOI: 10.1002/asi.23684.
Engerer, Volkmar, (2014), „Indexierungstheorie für Linguisten. Zu einigen natürlichsprachlichen Zügen in künstlichen Indexsprachen“, in: Schönenberger, Manuela, Volkmar Engerer, Peter Öhl & Bela Brogyanyi (eds.) (2014), Dialekte, Konzepte, Kontakte. Ergebnisse des Arbeitstreffens der Gesellschaft für Sprache und Sprachen, GeSuS e.V., 31. Mai – 1. Juni 2013 in Freiburg/Breisgau, Jena, pp. 61–74.
Engerer, Volkmar, (2014), „Thesauri, Terminologien, Lexika, Fachsprachen. Kontrolle, physische Verortung und das Prinzip der Syntagmatisierung von Vokabularen“, Information, Wissenschaft & Praxis, 65/2 (2014), pp. 99–108. [BFI 1]
Engerer, Volkmar, (2012), „Informationswissenschaft und Linguistik. Kurze Geschichte eines fruchtbaren interdisziplinären Verhältnisses in drei Akten“, SDV – Sprache und Datenverarbeitung. International Journal for Language Data Processing, 36/2 (2012), pp. 71–91 (= Hermann Cölfen (ed.), E-Books – Fakten, Perspektiven und Szenarien)

Mapping the Development of Digital History in Finland

Mats Fridlund
Petri Paju
Aalto University, Finland

In 2015, the field of digital history in Finland saw a tremendous development (see Parland-von Essen 2016), and in many ways this has continued ever since. This paper presents work by the project “Towards a Roadmap for Digital History in Finland: Mapping the Past, Present & Future Developments of Digital Historical Scholarship.” The ongoing project was awarded funding by the Kone Foundation in December 2015 and lasts for 12 months. On its steering board, the project involves several of the most active Finnish digital historians who as a group also felt the need for and came up with the idea of the project amidst this fast development. The Principal Investigator of the project is Professor Mats Fridlund (Aalto University) and Dr Petri Paju does the research.

In this paper, we aim to discuss some key questions and challenges, firstly in the project's own work and goals, and secondly in the field of digital history research in Finland in general.

Further, we aim to place these Finnish developments within the context of the larger digital humanities (as well as history) movement in the Nordic countries.

The project work started in February 2016. Its information gathering, including interviews, has been carried out from March onwards. To prepare historians for its inquiry, the project organized a public Opening Seminar titled “Digital History in Finland: Possible Futures” in Helsinki on 15 April 2016. It featured expert speakers from history’s neighboring disciplines (archeology and historical linguistics) and presentations about computational history in Finland as well as about big data from a historian’s point of view.

The online inquiry was open from late April until the end of June 2016. It was widely advertised with, for example, articles written in both Swedish and Finnish. Altogether seventeen (17) persons responded to the inquiry. This somewhat low number of respondents will be complemented by results from other recent surveys and user studies, of which there are a few.

Based on these inquiry answers, the researcher of the project compiled a report. The report, called “Digitaalinen historiantutkimus kyselytuloksia” (Paju 2016, 12 pages, with an abstract and key results in English), was made public in the project’s blog on 1 September 2016. The main results of the report centered on the complexity of defining digital history and the researchers’ difficulties with such an identity. Moreover, several critical issues were identified, namely creating better, up-to-date information channels of digital history resources and events, providing relevant education, skills, and teaching by historians, and the need to help historians and information technology specialists to meet and collaborate better and more systematically than before. Meanwhile there is a lot happening in the field of digital history that should and will be somehow included in the mapping. This now on-going project should have fresh results from mapping this fast changing domain and compiling a roadmap for it by the time of the possible presentation in Gothenburg in mid-March 2017.

The presentation could fit with the conference subtheme of The Digital, the Humanities, and the Philosophies of Technology. Our scholarly perspective comprises research in the history of science and technology and to some extent science and technology studies. Drawing on some of these research traditions, one recent inspiring study has been Smiljana Antonijević’s book Amongst Digital Humanists: An Ethnographic Study of Digital Knowledge Production (2015).

Topics: The Digital, the Humanities, and the Philosophies of Technology
Keywords: digital history, history of the digital humanities, Finland, survey project, mapping the field

Sources
Antonijević, Smiljana: Amongst Digital Humanists: An Ethnographic Study of Digital Knowledge Production. Palgrave Macmillan, New York 2015.
Project’s blog: https://digihistfinlandroadmapblog.wordpress.com/
Paju, Petri: ”Digitaalinen historiantutkimus kyselytuloksia.” Report from the project Towards a Roadmap for Digital History in Finland, available online from 1 September 2016: https://digihistfinlandroadmapblog.wordpress.com/2016/09/01/raportti-kyselyvastauksista/
Parland-von Essen, Jessica: ”Tankar kring den snabba utvecklingen i Finland år 2015”, Historia i en digital värld, January 5, 2016.

Bibliography
Fridlund, Mats & Daniel Sallamaa: ”Radikale Mittel, gemäßigte Ziele: Repression und Widerstand im Großfürstentum Finnland“, Osteuropa 66 (2016):4, 35-47.
Fridlund, Mats: ”Motståndets materialitet: oppositionella ting, tekniker och kroppar under ofärdsår”, in: Nina Wormbs, & Thomas Kaiserfeld,

eds. Med varm hand (Stockholm, 2015), 53-84.
Fridlund, Mats & René Brauer: “Historizing topic models: A distant reading of topic modeling texts within historical studies”, in: LV Nikiforova & NV Nikiforova, eds., Cultural Research in the Context of “Digital Humanities” (St Petersburg, 2013), 152-163.
Paju, Petri: “Digitaalinen historiantutkimus kyselytuloksia.” Report from the project Towards a Roadmap for Digital History in Finland, available online from 1.9.2016: https://digihistfinlandroadmapblog.wordpress.com/2016/09/01/raportti-kyselyvastauksista/

Visualising Genre Relationships in Icelandic Manuscripts

Katarzyna Anna Kapitan
University of Copenhagen, Denmark
Timothy Rowbotham
University of York, United Kingdom
Tarrin Wills
University of Copenhagen, Denmark

Purpose
The proposed paper is based on research which arose from a collaboration between the three of us, working respectively on the writing and reception of medieval Icelandic legendary histories (Rowbotham); transmission history and applications of digital tools in philological research (Kapitan); and understanding the manuscript context for prose and poetic texts (Wills). We discovered that we had between us access to enough data and expertise to expand considerably on previous analyses of the relationships between Old Norse texts as preserved in medieval and later manuscripts and, furthermore, that these analyses could be used to refine our definitions of literary genres and the place of individual texts within those categories.

We focused initially on texts belonging to a group of so-called ‘legendary’ sagas, or ‘mythical-heroic’ sagas (Ice. fornaldarsögur), since the question of this group’s genre status - specifically, whether fornaldarsögur (FAS) ought to be considered a distinct genre, or be analysed alongside their ‘cousins’ riddarasögur (RIDD) - has been widely discussed in the literature. Contradictory opinions concerning genre classification have been offered by leading scholars in the field. Mitchell (1991, 21) and Aðalheiður Guðmundsdóttir (2001, cxlvii) have suggested that fornaldarsögur were considered a distinct category of literature already in the Middle Ages, as they are frequently bound together in manuscripts, whereas Driscoll (2005, 193; see also Ármann Jakobsson 2012, 24) has suggested, also on the basis of their codicological context, that riddarasögur and fornaldarsögur should be treated as one literary group. Despite their opposing conclusions, the consensus among these scholars is that the codicological context of these texts is key to understanding the genre they represent.

Though it is necessary to look into medieval manuscripts to reach the medieval reader’s understanding of the genre, we must take into consideration the huge loss of medieval manuscripts, and thus recognise that our knowledge of the medieval tradition is fragmentary. Due to this lack of data, looking into sixteenth- and seventeenth-century manuscripts may deliver important information about the medieval tradition, since there is some probability that post-medieval manuscripts are close copies of their medieval exemplars, and thus might preserve the texts’ original context. Therefore, we have decided to look at all available manuscript descriptions collected in handrit.org, fasnl.ku.dk and the Skaldic Project Database. The method we have pursued for identifying genre association has been to analyse the complex manuscript context of these texts, on the basis that analysis of this context helps to inform our understanding of the genre classification of medieval Norse literature. The approach we have developed has been applied across the corpus to understand genre relationships as represented by the manuscript tradition.

Method
Our paper focuses on an interpretation of the relationships between Old Norse texts based on a statistical analysis of digitized manuscript descriptions. Since the initial focus of our research was an interpretation of genre associations within the corpus of fornaldarsögur, an obvious point of departure was the online catalogue fasnl.ku.dk. The catalogue lists all the manuscripts in which fornaldarsaga texts are found, including information on their format and layout, the other texts they preserve and when, where and by and/or for whom they were written.

Further data came from other projects. The Dictionary of Old Norse Prose (ONP) has produced a comprehensive list of works within the scope of that project (published in their Registre volume and with subsequent revisions), along with detailed information about the manuscripts for each work, including the dating of the manuscripts and the location of each work within the manuscript. This data was supplied to the Skaldic Project and has also been used (with permission) here. The Skaldic Project itself has supplemented the manuscript information with the poetry relevant to that project and other manuscripts that were not recorded in the ONP data tables. Additionally, relevant data for manuscripts not containing fornaldarsögur has been supplemented by the XML descriptions in handrit.org.

The ONP and Skaldic Project manuscript information is structured with texts linked to the manuscripts using a relational database model. fasnl.ku.dk and handrit.org, in contrast, give XML descriptions of the manuscripts. One of the challenges in addressing this question is taking the complex manuscript descriptions, constructed as TEI XML, and extracting the relationships between the texts contained within them. The manuscript descriptions from fasnl.ku.dk and handrit.org were designed to give a detailed description of each object and its structure, but do not definitively describe the relationships between items in different manuscripts. Consequently the same text in two descriptions may be labelled with different ‘uniform’ titles or even genre classes. In order to build a visualisation and analysis of the relationships between texts and genres we have had to define these relationships ourselves. We describe firstly how one particular genre, the legendary sagas, was supplemented and normalised. Secondly, we describe how manuscript data from fasnl.ku.dk, handrit.org, the Skaldic Project and the Dictionary of Old Norse Prose were merged, including processes for normalising text names and generic classifications.

Gephi, an open source visualisation software, was used to analyse 153963 connections between 1518 texts. A network of relationships between all the texts was achieved by application of the ForceAtlas2 layout (Jacomy et al. 2011). ForceAtlas2 is a force-directed layout in which nodes repulse each other like magnets while edges attract the nodes they connect like springs. In our network, inspired by the RIDD network presented by Hall (2013), texts are represented as nodes, while edges represent manuscripts. The thicker the edge between two texts, the greater the number of manuscripts in which these texts appear together. Unlike in Hall’s (2013) network, the size of the nodes is standardized and independent of the number of connections created by the texts.

Further analysis weights the connections between texts according to length (using page counts), as a large number of very small texts (i.e. þættir) can disproportionately influence the network by generating more connections. Additionally, we have compared results using different watershed dates for the manuscript tradition, including 1728 (the year of the great fire of Copenhagen) and 1829 (the publication year of Rafn’s Fornaldarsögur norðrlanda).
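The steps described in the Method (reading text titles out of TEI manuscript descriptions, linking texts that share a manuscript, optionally weighting by length and cutting off at a watershed date) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the TEI fragment, the shelfmarks, titles, dates and page counts are invented, the function names are hypothetical, and the length weighting (the page count of the shorter text in each pair) is one possible reading of "weights the connections according to length".

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented TEI fragment for illustration; real catalogue records are far richer.
TOY_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <msDesc><msContents>
    <msItem><title>Saga A</title></msItem>
    <msItem><title>Saga B</title></msItem>
  </msContents></msDesc>
</TEI>"""

# Invented manuscripts: date of writing and {normalised title: page count}.
TOY_MANUSCRIPTS = {
    "MS-1": {"date": 1400, "texts": {"Saga A": 30, "Saga B": 50, "Þáttr C": 2}},
    "MS-2": {"date": 1650, "texts": {"Saga A": 28, "Saga B": 45}},
    "MS-3": {"date": 1800, "texts": {"Saga B": 48, "Þáttr C": 3}},
}


def titles_from_tei(xml_text):
    """Collect the title of every msItem in one TEI manuscript description."""
    root = ET.fromstring(xml_text)
    titles = root.findall(".//tei:msItem/tei:title", TEI_NS)
    return [t.text.strip() for t in titles if t.text]


def cooccurrence_edges(manuscripts, latest_date=None, weight_by_length=False):
    """Sum an edge weight for every pair of texts sharing a manuscript."""
    edges = Counter()
    for ms in manuscripts.values():
        if latest_date is not None and ms["date"] > latest_date:
            continue  # manuscript is later than the chosen watershed date
        pages = ms["texts"]
        for a, b in combinations(sorted(pages), 2):
            # Assumption: a pair counts for the page count of its shorter text.
            edges[(a, b)] += min(pages[a], pages[b]) if weight_by_length else 1
    return edges


def edges_to_csv(edges):
    """Serialise edges as a Source/Target/Weight table for spreadsheet import."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Source", "Target", "Weight"])
    for (a, b), weight in sorted(edges.items()):
        writer.writerow([a, b, weight])
    return buffer.getvalue()
```

Calling `cooccurrence_edges(TOY_MANUSCRIPTS, latest_date=1728)` keeps only the two earlier toy manuscripts, and `weight_by_length=True` lets the very short text contribute far less to an edge than the two long sagas do. The Source/Target/Weight column names follow the usual convention for edge tables imported into Gephi; the layout step itself would still be run in Gephi.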

Figure 1: Network of Icelandic literature. FAS - pink; FORNS, FORNTH - red; ISL, ISLT - green; KON, KONTH - blue; RIDD - yellow; RIDDST - orange; EDD - white

Findings
As presented in Figure 1, the group of fornaldarsögur (pink) is positioned between íslendingasögur (green) and riddarasögur (yellow), and mixes with fornaldarsögur síðari tíma and fornaldarþættir (red). Konungasögur (blue) show close affiliation with íslendingasögur, while eddic poetry (white) forms a separate group, which is connected to fornaldarsögur through Hervarar saga ok Heiðreks. This connection can be explained by the fact that riddles from Hervarar saga ok Heiðreks were often copied independently from the saga and included in manuscripts together with other poems, but in the catalogues they appear as witnesses of the saga.

The data collected and visualised is of great value to the study of medieval Icelandic literature, but its great volume presents a significant challenge to researchers wishing to provide a detailed philological analysis. To begin to analyse the data, we decided to take a small number of texts as case studies and, marrying the approaches of philological research with those of the digital humanities, examine relationships between an individual fornaldarsaga and the texts it is linked to in the manuscript transmission. The selection of case studies was initiated by focussing on a number of texts that were of interest from a literary critical perspective, and that we regarded as somewhat ‘peripheral’ to the fornaldarsaga genre. These included texts such as Hrómundar saga Gripssonar, for which we have only indirect evidence of its existence in the Middle Ages; Þjálar-Jóns saga, which has often been regarded in scholarship as a riddarasaga; and those texts, such as Helga þáttr Þórissonar and Norna-Gests þáttr, that were originally included as episodes (or þættir) in longer konungasögur, but since the nineteenth century have been included in the fornaldarsaga corpus. The XSLT scripts used in the earliest stages of our research confirmed that these texts, among others, were noteworthy for the frequency with which they appear alongside genres such as riddarasögur and konungasögur.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: manuscript studies, network analysis, data visualisation, genre

Bibliography
Aðalheiður Guðmundsdóttir. (2001). Úlfhams saga, Reykjavík: Stofnun Árna Magnússonar.
Ármann Jakobsson. (2012). “The Earliest Legendary Saga Manuscripts”. In: The Legendary Sagas: Origins and Development. Eds. Annette Lassen, Agneta Ney, Ármann Jakobsson. (pp. 21–32). Reykjavík.
Driscoll, M. J. (2005). Late prose fiction ('lygisögur'). In A Companion to Old Norse-Icelandic Literature and Culture. (31 ed., pp. 190-204). Oxford: Blackwell Publishing Ltd.
Hall A., Parsons K. (2013). “Making stemmas with small samples, and digital approaches to publishing them: testing the stemma of Konráðs saga keisarasonar”, Digital Medievalist 9.
Jacomy M., Heymann S., Venturini T., Bastian M. (2011). “ForceAtlas2, A Graph Layout Algorithm for Handy Network Visualization”, http://webatlas.fr/tempshare/ForceAtlas2_Paper.pdf
Mitchell, S. (1991). Heroic sagas and ballads. Ithaca and London: Cornell University Press.
Rafn C.C. (1829). Fornaldarsögur Norðrlanda. Vol I-III. Kaupmannahöfn.

Online resources
Ordbog over det norrøne prosasprog Registre: http://onpweb.nfi.sc.ku.dk/mscoll_d_menu.html
Online catalogue handrit.org: http://handrit.org
Skaldic Project: http://skaldic.abdn.ac.uk/db.php?
Stories for all time Project: http://fasnl.ku.dk/
Gephi The Open Graph Viz Platform: https://gephi.org

Spatiality, Tactility and Proprioception in Participatory Art

Raivo Kelomees
Estonian Academy of Arts, Estonia

In this presentation I analyse performances, artworks and installations in audiovisual and contemporary art which emphasise tactile and corporeal experiences. This tendency can be observed in technological art, cinema and large visual attractions. I aim to demonstrate that due to technical developments and new tools, the possibilities now exist for new aesthetic experiences in which the body’s position and its biological reactions are decisive. This leads to the question of how the critical or theoretical point of view of an artwork changes when the spectator’s reactions to it are documented and quantified in real time and are changed into source material for the next stage(s) of the artwork. Does this constitute the next step in the research of interactive artworks which were based on the subjective analysis of the participant’s reactions? Does it allow us to rewrite art-analytical analyses, which were based on the subjective analysis of the researchers?

The main emphasis of this presentation is the proprioceptive experience in art. I will start with an analysis of earlier inventions and analogous practices which introduce corporeal artistic experience. I then investigate whether we can talk about the ‘proprioceptive image’ in the same way that we can speak about the artistic, musical or literary image. This analysis is influenced by a media archaeological approach, in particular Erkki Huhtamo's interpretation in which his approach is termed “media archaeology as topos study” or simply “topos archaeology.” I aim to demonstrate how this topos ("haptic and corporeal experience in audiovisual performances and visual art" or "spatiality, tactility and proprioception in participatory art") changes and "transfigures" those examples in which the corporeal experience is translated into digital data and subsequently used for manipulations of the artwork. Before starting to analyse the works of Jeffrey Shaw, Char Davies and Bill Seaman in the sub-chapter "Tactility and proprioception in media art", I will provide a series of historical examples which lead to contemporary developments in media art.

The main focus of the text is on changes in the "art world", with an emphasis on fields which could be called media art, new media, electronic art, and contemporary art. To a lesser extent there is also a focus on discussions happening in crossmedia and transmedia, even though some projects are not easy to define, or belong to the fields of both new media and transmedia. This particularly concerns those works of multimedia where the tactile experience on screen is gradually becoming spatial and corporeal. Another topic under analysis is how clear the tendency is to make the audio-visual experience tactile, tangible and physically experienceable, in contrast to the virtual experience.

In my discussion of multi-screen and physically perceptible environments I want to show situations, solutions and artworks from the beginning of a so-called television era, and in experiments of the expansion of the cinematic experience, in which:
* an “interrelation” occurs between the visual screen content and a "communication" occurs between screens: the visual or auditive content on different screens is transferred from one to another, and a narrative is split between different (two or more) screens;
* a connection occurs between screen images and stage activity: actors in physical space and screen-space are acting in collaboration or antagonism to each other;
* viewers are influencing and leading the screen content: screen environments which surround viewers are gradually changed into environments which are shaped by users/viewers;
* viewers or actors are "in the image": viewers or actors are corporeally in the image or influencing it directly;
* the spectator’s physiology is influencing or leading the screen content: the viewer’s participation in the presentation of images is influenced by the biological data of the same viewer. This means that biological data (such as Heart Rate Variability, HRV; Galvanic Skin Response, GSR etc.) are used as input data for audiovisual variations.

Amongst early examples the following works are analysed: Raoul Grimoin-Sanson's "Cinéorama", 1900; Charles and Ray Eames' “Glimpses of the United States” in Moscow in 1959; Czechoslovakia’s "Laterna Magika" at the World Fair in Brussels in 1958, designed by Josef Svoboda; "Polyvision" by Josef Svoboda and Jaroslav Frič; Josef Svoboda's "Diapolyekran" at Expo'67 in Montreal; Roman Kroitor's "Labyrinthe" at Expo'67 Montreal; Radúz Činčera's "Kinoautomat" at Expo'67; and other projects.

Discussing contemporary environments of stage performances, digital art and research practices I will present predecessors like Robert Whitman and his "Prune Flat" (1965) and "Shower" (1964); Tony Oursler's works; the British theatre company "Moving Being"; Steve Dixon's group "Chameleons"; Peeter Jalakas' performance "Estonian Games. Wedding" ("Eesti mängud. Pulm", 1996). I also discuss the "digital theatre" and "cyberformance" groups Troika Ranch and Dumb Type. The goal in presenting these examples is to illustrate the attempts in cinema, theatre, art and research environments to create multi-screen environments that engage the audience, offering them entertainment, information and an explorative experience. The tendency is to make the visual medium tangible and corporeal so that in some examples in interactive art the viewer "puts his hands" into the artwork.

In this text I will formulate the definition of proprioception, which means the spatial orientation arising from stimuli within the body itself. This term is used to cover sensorial systems which give information about position, posture, orientation and movement of the body (and its parts) in space. With regard to a proprioceptively perceived artwork we can talk about the situation where the viewer’s whole body and behaviour is involved in the decisive interaction. I will choose three examples of interactive art to analyse from the proprioceptive point of view: Jeffrey Shaw's "Legible City" (1989), Char Davies’ "Osmose" (1995) and Bill Seaman's "Exchange Fields" (2000). Transferring proprioceptive cognition into interactive, participative and tactile art allows us to enquire whether the corporeal experience is interesting and aesthetically novel. Also, does the corporeal experience make these artworks proprioceptively distinctive? I conclude that "Legible City" is more ordinary than "Osmose" and "Exchange Fields", in which the viewer’s proprioceptive participation is original.

In this analysis I avoid discussion of biofeedback-based interactive art and cinema. The goal of the presentation is to prove that the expansion of the viewers’ experience in cinema and art has reached a corporeal and tactile experience. In these artworks the visual-auditive-spatial presentation is related to the viewer’s physical activity or reactions. Building on a series of historical examples I prove the existence of the trend and the historical tendency that was already visible in trompe l'oeil paintings: the desire to erase the difference between the artificial and real worlds. It is interesting to see a consistency of attempts to "break the barrier" between reality and artificiality which occurs on different technical levels of complexity. We can talk about a cultural topos that makes the virtual tangible, visible not only in visual art and media art but also in experimental solutions in cinema.

Firstly I focus on artworks in which "immersion" is happening to a maximum extent and where the proprioceptive "sense" defines the aesthetic experience. Since proprioception is a complex corporeal-physiological feedback mechanism it would be wrong to call it "a sense", but undoubtedly it has been unjustly omitted in discussions about art. This presentation aims to foreground this term and to demonstrate that we can talk about a proprioceptive aesthetic experience.

I conclude that artworks which are made for tactile, proprioceptive and biofeedback experiences are made for experimental and research purposes. The creation of these works depends on the availability and cheapness of the respective sensor technologies, the level of competency of artists, designers and programmers, and the rise of new collaborative practices.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: tactility, biofeedback, proprioception, participatory art, interactive art, corporeal-physiological feedback

References
N. Carpentier, Media and Participation: A site of ideological-democratic struggle. Bristol, UK and Chicago, USA: Intellect, 2011, 276-308.
Book of Imaginary Media: Excavating the Dream of the Ultimate Communication Medium. Edited by Eric Kluitenberg. Rotterdam: NAI Publishers, 2006.
J. D. Bolter and D. Gromala, Windows and Mirrors. Interaction Design, Digital Art, and the Myth of Transparency. MIT Press, Cambridge MA, 2005, p. 28.
S. Dixon, Digital Performance. A History of New Media in Theater, Dance, Performance Art, and Installation. The MIT Press, Cambridge, MA, 2007.
M. Bielicky, Prague–A Place of Illusionists, in: Future Cinema. The Cinematic Imaginary after Film. Jeffrey Shaw/Peter Weibel (eds), The MIT Press, Cambridge, MA/London, 2003.
O. Grau, Virtual Art. From Illusion to Immersion, The MIT Press, Cambridge, Mass., 2003.
Glimpses of the U.S.A. (1959) [excerpt], https://www.youtube.com/watch?feature=player_embedded&v=Ob0aSyDUK4A
C. Hales, Spatial and Narrative Constructions for Interactive Cinema, with particular reference to the work of Radúz Činčera. In: Expanding Practices in Audiovisual Narrative, ed. by R. Kelomees, C. Hales. Cambridge Scholars Publishing, 2014.
V. Havránek, Laterna Magika, Polyekran, Kinoautomat, in: Future Cinema. The Cinematic Imaginary after Film. Jeffrey Shaw/Peter Weibel (eds), The MIT Press, Cambridge, MA/London, 2003.
E. Huhtamo, Obscured by the Cloud: Media Archeology, Topos Study, and the Internet. In: ISEA2014 Dubai: Location. Proceedings of the 20th International Symposium on Electronic Art. Ed. by Thorsten Lomker. Zayed University Books, Dubai, UAE.
E. Huhtamo, Twin-Touch-Test-Redux: Media Archaeological Approach to Art, Interactivity, and Tactility, in: MediaArtHistories, ed. Oliver Grau. Cambridge, Mass: The MIT Press, 2006.
E. Huhtamo, Illusions in Motion: Media Archeology of the Moving Panorama and Related Spectacles. Cambridge, MA: MIT Press, 2013.
E. Huhtamo, Resurrecting the technological past: An introduction to the archeology of media art. Intercommunication, 14, 2. (1995).
E. Huhtamo, From Kaleidoscomaniac to
Cybernerd: Towards an Archeology of the Media. Leonardo, Vol. 30, No. 3 (1997).
I. Ibrus & C. A. Scolari, Introduction: Crossmedia innovation. In I. Ibrus & C. A. Scolari (Eds.), Crossmedia Innovations: Texts, Markets, Institutions. Frankfurt: Peter Lang.
Into the Light. The Projected Image in American Art 1964–1977, Chrissie Iles (ed.), exhib. cat., Whitney Museum of American Art. New York/Harry N. Abrams, New York, 2001.
H. Jenkins, Transmedia Storytelling 101, http://henryjenkins.org/2007/03/transmedia_storytelling_101.html#sthash.SGZaSez6.dpuf
O. Kruglanski "As Much As You Love Me", http://archive.aec.at/prix/#35286
L. Manovich, Soft Cinema, http://www.softcinema.net/mission_to_earth.htm
L. Manovich, Information as an Aesthetic Event, 2007, http://manovich.net/content/04-projects/056-information-as-an-aesthetic-event/53_article_2007.pdf
L. Manovich, What is Visualization? Visual Studies, vol. 26, no. 1 (2011): 36-49, http://manovich.net/content/04-projects/064-what-is-visualization/61_article_2010.pdf
Media Archaeology: Approaches, Applications, and Implications. Edited by Erkki Huhtamo & Jussi Parikka. Berkeley, CA: University of California Press, 2011.
B. Montero (2006). Proprioception as an Aesthetic Sense. Journal of Aesthetics and Art Criticism 64 (2):231-242.
S. Natale, Understanding Media Archaeology, Canadian Journal of Communication, Vol 37 (3), http://www.cjc-online.ca/index.php/journal/article/viewFile/2577/2336.
M. Saldre, Tühirand eesti kultuurimälus: kirjandus-, filmi- ja teatripärase esituse transmeedialine analüüs. Acta Semiotica Estica VII, 2010, 160–182.
M. Saldre and P. Torop, Transmedia space. In: Ibrus, Indrek and Carlos A. Scolari (Eds.). Crossmedia Innovations: Texts, Markets, Institutions. Frankfurt etc.: Peter Lang, 2012.
M. Schjødt, Switching, http://www.switching.dk/en/
F. Sparacino, G. Davenport, A. Pentland, Media in performance: Interactive spaces for dance, theater, circus, and museum exhibits. IBM Systems Journal Vol. 39, Nos. 3 & 4, 2000, p. 479, http://alumni.media.mit.edu/~flavia/Papers/ibm_sparacino.pdf
F. Sparacino, C. Wren, G. Davenport, A. Pentland, Augmented Performance in Dance and Theater. International Dance and Technology 99 (IDAT99), at Arizona State University, Feb. 25-28, 1999, http://alumni.media.mit.edu/~flavia/Papers/flavia_augmented_performance.pdf
M. Teemus, Reisides toas. Pano-, kosmo- ja diaraamadest Tallinnas ja Tartus (1826-1850). Tartu Ülikooli Kirjastus, 2005.
F. Thalhofer, Planet Galata, http://www.thalhofer.com/_data/PAGES/xproject_2010_PlanetGalata.html
F. Thalhofer, Love Story Project, http://www.thalhofer.com/_data/PAGES/xproject_2002_LoveStoryProject.html
Proprioception, US National Library of Medicine Medical Subject Headings (MeSH), https://www.nlm.nih.gov/cgi/mesh/2011/MB_cgi?mode=&term=Proprioception
B. Seaman "Exchange Fields" (2000), http://projects.visualstudies.duke.edu/billseaman/seamanvanberkel/exchange_fields/exchange_fields.htm
See This Sound: Promises in Sound and Vision. Ed. by Claudia Albert, Amy Alexander, Rainer Bellenbaum, Dieter Daniels, Sandra Naumann. Walther König, Köln, 2010.
P. Tikka, Enactive Cinema. Simulatorium Eisensteinense. University of Art and Design Helsinki, 2008.
P. Weibel, The Post-Gutenberg Book. The CD-ROM between Index and Narration, in: artintact 3, Artists' interactive CD-ROMagazin. Cantz Verlag 1996.
S. Zielinski, Deep Time of the Media: Toward an Archaeology of Hearing and Seeing by Technical Means. Cambridge, MA: MIT Press, 2006.
G. Youngblood, Expanded Cinema, Dutton, New York, 1970.

The Elias Lönnrot Letters Online – Challenges of Multidisciplinary Source Material

Kirsi Keravuori
Niina Hämäläinen
Maria Niku
Finnish Literature Society SKS

The Finnish Literature Society SKS will launch the Elias Lönnrot Letters Online in April 2017. The new digital edition is part of the Society's "Open Science and Cultural Heritage" project, which seeks to develop scholarly online materials and tools.

The correspondence of Elias Lönnrot (1802–1884, doctor, philologist and creator of the national epic Kalevala) comprises 2 500 letters or drafts written by Lönnrot and 3 500 letters received. The online edition is the conclusion of several decades of research, of transcribing and digitizing letters and of writing commentaries. Part of Lönnrot's letters were published already in the beginning of the 20th century, and the Selected letters came out in 1990. Since then Lönnrot's correspondence has been digitized in an Academy of Finland project, and transcribed between 2005 and 2016. The online edition will be the first complete publication of his vast correspondence. We will begin the publication with the approximately 1 800 private letters written by Lönnrot.

The online edition is designed not only for those interested in the life and work of Lönnrot himself, but more generally for scholars and the general public interested in the work and mentality of the Finnish 19th century nationalistic academic community, their language practices both in Swedish and in Finnish, and in the study of epistolary culture. The rich, versatile correspondence offers source material for research in biography, folklore studies and literary studies; for general history as well as medical history and the history of ideas; for the study of ego documents and networks; and for corpus linguistics and history of language.

While being fully aware of the significance and the multidisciplinary use of the Lönnrot letters, the group working with the online publication is faced with the usual challenges of humanistic research and publication projects: insecure and discontinuous funding, the time-consuming process of transcribing extensive source materials, and the fast development of technical solutions. The SKS decided to prioritize the prompt online publication of Lönnrot letters with good, practical tools for researchers and open, accessible data for those that want to develop the material further.

The extensive source material together with the priority on prompt publication and the small staff made it necessary to find a publishing platform that would require relatively light modification and would be easy to manage, and where the process of importing the source material could be easily automated. The group first considered the edition platform used by the SKS's Edith – Critical Editions of Finnish Literature and the Svenska litteratursällskapet's Zacharias Topelius Skrifter (http://www.topelius.fi/). However, this was found to be too labour-intensive and complex. In particular the commentary tool included in the edition platform was deemed to be unnecessary for the purposes of the Elias Lönnrot Letters Online, which will include only a limited
amount of commentaries. The SKS had prior experience with Omeka, the open-source web-publishing platform for the display of library, museum, archives, and scholarly collections and exhibitions. A trial period demonstrated that this platform was the best option available for the planned publication.

As an open-source tool, Omeka is low cost and does not involve complex permission and copyright issues. Its item format, with Dublin Core metadata fields and the ability to attach files to the items, is well suited to the source material, in which each letter and draft forms an individual document consisting of facsimile images and transcription encoded in XML/TEI5. Omeka's collections feature makes it easy to organize the source material into collections according to letter recipients.

A number of plugins available for Omeka provide added functionality for importing and displaying documents. The CSV import plugin, combined with a simple XSLT script, enables mass import of documents together with the image and TEI files attached to each document. The SolrSearch plugin, built on Apache's Solr, provides an open text search that encompasses both the metadata fields and the transcription. Some image viewer plugins are usable for displaying the facsimile images for each document.

TEI5 enables detailed encoding of the source material. However, the project group decided on a light encoding of the transcriptions. Lönnrot's own underlinings, additions and deletions of text, and unclear and undecipherable parts of the transcriptions are marked with TEI tags. Information contained in the transcriptions, such as personal and place names, is left unmarked. Similarly, different kinds of additions of text (above lines, in the margins etc.) are not differentiated. This is partly due to the extensive amount of manual work such detailed encoding requires, and partly due to the functionality provided by the publication. The open text search provides easier and quicker access to the same information as the encoding would. The TEI documents will be made available as free downloads, which enables researchers and other users to modify them for their own purposes.

A researcher who uses digitized letters as source material is faced with some challenges related to the material and its context. We ask what kind of information is lost in using online publications, where the materiality of the letters and their connection to the real-life physical objects in the archive are weakened. Can a good interface help convey information about the original letters and the entity they form in the archive? Just like the archive can be a place hiding, concealing and covering documents if nobody looks for them, a digital edition also needs an active user whose questions render the documents meaningful. Issues concerning context and contextualization and their relation to the digitization and to the online presentation of archival material are therefore of great importance. A researcher needs to be able to place the archival material in a wider historical and cultural context in order to make sense of it. To make digitized material understandable and meaningful, we need to provide as much contextual information as possible, e.g. precise information on the original archival material and the processes of selection, digitization, and edition it has undergone. Contextual information on the text of the letters is also required to help the reader understand the meanings within the text itself and the circumstances of letter writing. Lönnrot's letters have been available and well catalogued in the archives of SKS, but our comprehension of his correspondence is filtered through selected publications such as Journeys of Elias Lönnrot (1902), his biography (1931, 1935) and Selected works of Elias Lönnrot 1: Letters (1990). Therefore, we will reflect on how our perceptions could be widened, and possibly changed, by the complete digital edition of the correspondence.

We will demonstrate how a hypothetical researcher might use our online publication as a tool to access Lönnrot's letters and find answers to questions related to his/her research problem. We will show the benefits the tool offers in comparison to the traditional methods of accessing this kind of
source material, as well as address the potential limitations that might arise from the technical solutions adopted. The benefits and potential limitations are related to how the material is displayed and what kind of search tools are provided. Are the digitized letters and their transcriptions easily accessible and are features such as zooming in on the facsimiles or moving from page to page within a letter easy to use? How do the search options help the researcher find the information he/she needs? How can we help scholars make new interpretations based on digitized material?

We'll finish with the challenges of building platforms and interfaces for the multidisciplinary scholarly community. We have opted for an interface designed with the cultural historian in mind rather than focusing on Lönnrot himself. Thus the letters are published "vertically", all the letters to a particular addressee at a time, instead of "horizontally", year by year. We know that linguists are interested in the letters as well, but instead of attempting to build an interface that caters for them too, SKS will share the data with Finn-Clarin and the Language Bank, where linguists can use it together with other similar materials in Finnish and in Swedish. As Lönnrot's letters form an exceptionally vast collection of manuscripts written by one hand, we are handing part of the letters together with their transcriptions over to THE READ project (Recognition and Enrichment of Archival Documents). And finally, we are co-operating with the project STRATAS – Interfacing structured and unstructured data in sociolinguistic research on language change.

As a significant part of the Lönnrot letters are written in Swedish, we hope to find ideas for Nordic co-operation in Göteborg.

Topics: Nordic Textual Resources and Practices
Keywords: Online publishing, open science, correspondences


Tagging Named Entities in 19th century Finnish Newspaper Material with a Variety of Tools

Kimmo Kettunen
Teemu Ruokolainen
The National Library of Finland

Introduction

Digital newspapers and journals, either OCRed or born digital, form a growing global network of data that is available 24/7, and as such they are an important source of information. As the amount of digitized journalistic information grows, also tools for harvesting the information are needed. Named Entity Recognition (NER) has become one of the basic techniques for information extraction of texts since the mid-1990s (Nadeau and Sekine, 2007). In its initial form NER was used to find and mark semantic entities like person, location and organization in texts to enable information extraction related to this kind of material. Later on other types of extractable entities, like time, artefact, event and measure/numerical, have been added to the repertoires of NER software (Nadeau and Sekine, 2007). In this paper we report evaluation results of NER for historical 19th century Finnish. Our historical data consists of an evaluation collection out of an OCRed Finnish historical newspaper collection 1771–1910 (Kettunen and Pääkkönen, 2016).

Kettunen et al. (2016) have reported first NER evaluation results of the historical Finnish data with two tools, FiNER and ARPA. FiNER is provided by the Fin-CLARIN consortium, ARPA is a semantic web tool produced by the Semantic Computing group at the Aalto University. Both tools achieved maximal F-scores of about 60 at best, but with many categories the results were much weaker. Word level accuracy of the evaluation collection was about 73 percent, and thus the data can be considered very noisy. NER results for modern Finnish have not been reported extensively so far.
Silfverberg (2015) mentions a few results in his description of transferring an older version of FiNER to a new version. With modern Finnish data F-scores around 90 are achieved.

In this paper we add two more analysis tools to our earlier NER repertoire. Finnish Semantic Tagger (FST) is not a NER tool as such; it has first and foremost been developed for semantic analysis of full text. The FST assigns a semantic category to each word in text employing a comprehensive semantic category scheme (USAS Semantic Tagset, available in English [1] and also in Finnish [2]; Löfberg et al., 2005). The scheme contains three name related categories: persons, locations and organizations. Our other new tool is Connexor's NER software [3], which is a commercial tool for modern Finnish.

Results for the Historical Data

Our historical Finnish evaluation data consists of 75 931 lines of manually annotated newspaper text, one word per line. Most of the data is from the last decades of the 19th century. Earlier NER evaluations with this data have achieved at best F-scores of 50–60 in some name categories (Kettunen et al., 2016). Our baseline tagger, FiNER, is described more in Kettunen et al. (2016). Shortly described, it is a rule-based NER tagger that uses morphological recognition, morphological disambiguation, gazetteers (name lists), pattern and context rules for name tagging.

We evaluated performance of our different NER tools using the conlleval [4] script used in Conference on Computational Natural Language Learning (CONLL). Conlleval uses standard measures of precision, recall and F-score, the last one defined as 2PR/(R+P), where P is precision and R recall (Manning and Schütze, 1999, p. 269). As the FST and Connexor's tagger do not distinguish multipart names with their boundaries, only a comparable loose evaluation without entity boundary detection is reported here (Poibeau and Kosseim, 2001).

Table 1 shows F-score results of four evaluations of locations and persons in our evaluation data. EnamexPrsHums contain both first names and last names; EnamexLocXxx is a general location category that combines three more refined location categories to one.

Tool        EnamexPrsHum                      EnamexLocXxx
            F-score   Number of found tags    F-score   Number of found tags
ARPA        52.9      3636                    52.4      2933
Connexor    56.4      5321                    60.9*     1802
FiNER       58.1*     2681                    57.5      1541
FST         51.1      1496                    56.7      1253

Table 1. Evaluation of four tools with loose criteria and two name categories in the historical newspaper collection. Best results are marked with an asterisk.

All taggers recognize locations and persons quite evenly, differences are small. Our baseline tagger FiNER achieves best F-score with persons, Connexor with locations. Performance of the taggers is quite bad, which is expectable as the data is very noisy.

It is evident that the main reason for low NER performance of the tools is the quality of the OCRed texts. If we analyze the tagged words with a morphological analyzer (Omorfi v. 0.3 [5]), we can see that wrongly tagged words are of lower quality than those that are tagged correctly. Figures are shown in Table 2. Thus improvement in OCR quality will most probably bring forth a clear improvement in NER of the material.

[1] http://ucrel.lancs.ac.uk/usas/USASSemanticTagset.pdf
[2] https://github.com/UCREL/Multilingual-USAS/raw/master/Finnish/USASSemanticTagset-Finnish.pdf
[3] https://www.connexor.com/nlplib/?q=technology/name-recognition
[4] http://www.cnts.ua.ac.be/conll2002/ner/bin/conlleval.txt, author Erik Tjong Kim Sang, version 2004-01-26
[5] https://github.com/flammie/omorfi
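
As a minimal sketch of the measures that conlleval reports, the following Python function computes precision, recall and the F-score 2PR/(R+P) from raw counts. The function name and the example counts are illustrative assumptions and are not figures from the evaluation; scores are scaled to 0–100, the scale on which the F-scores in this paper are reported.

```python
def precision_recall_f(correct: int, found: int, gold: int) -> tuple[float, float, float]:
    """Return precision, recall and F-score on a 0-100 scale.

    correct: tags assigned by the tool that agree with the manual annotation
    found:   all tags assigned by the tool
    gold:    all tags present in the manual annotation
    """
    precision = 100.0 * correct / found if found else 0.0
    recall = 100.0 * correct / gold if gold else 0.0
    if precision + recall == 0.0:
        # Avoid division by zero when nothing was tagged correctly.
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (recall + precision)
    return precision, recall, f_score


# Hypothetical counts, chosen only to illustrate the arithmetic.
p, r, f = precision_recall_f(correct=60, found=100, gold=150)
print(f"P={p:.1f} R={r:.1f} F={f:.1f}")  # P=60.0 R=40.0 F=48.0
```

Because the F-score is the harmonic mean of P and R, it stays closer to the lower of the two values, so a tool that finds many tags with low precision is not rewarded for volume alone.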

Word unrecognition rate (%)    Locations   Persons
ARPA, right tag                1.9         4.5
Connexor, right tag            10.2        25.0
FiNER, right tag               6.3         12.8
FST, right tag                 5.6         0.06
ARPA, wrong tag                22.7        29.3
Connexor, wrong tag            53.5        57.4
FiNER, wrong tag               38.3        34.0
FST, wrong tag                 44.0        33.3

Table 2. Unrecognition rates for rightly and wrongly tagged words, percent.

Development of a New Statistical Tagger

Our baseline tagger FiNER employed in the above experiments is a rule-based system utilizing morphological analysis, gazetteers, and pattern and context rules. However, while there does exist some recent work on rule-based systems for NER (Kokkinakis et al., 2014), the most prominent research on NER has focused on statistical machine learning methodology for a longer time (Nadeau and Sekine, 2007; Neudecker 2016). Therefore, we are currently developing a statistical NER tagger for historical Finnish text. For training and evaluation of the statistical system, we are manually annotating newspaper and magazine text from the years 1862–1910 with classes person, organization, and location. The text contains approximately 650,000 word tokens. Subsequent to annotation, we can utilize freely available toolkits, such as the Stanford Named Entity Recognizer (Finkel et al., 2005), for teaching the NER tagger. We expect that the rich feature sets enabled by statistical learning will alleviate the effect of poor OCR quality on the recognition accuracy of NEs. For recent work on statistical learning of NER taggers for historical data, see Neudecker (2016).

Discussion

In this paper we have shown results of NE tagging of historical OCRed Finnish with four tools: FiNER, ARPA, a Finnish Semantic Tagger (the FST), and Connexor's NE software. FiNER and Connexor's tagger are dedicated NER tools for modern Finnish, but the FST is a general semantic tagger and ARPA a semantic web linking tool. Our results show that they all tag names of locations and persons almost at the same level in the noisy OCRed historical newspaper collection. FiNER is best with names of persons, Connexor with locations. Differences between tagger performances are at most 7–8 percentage points.

In general our results show that NE tagging in a noisy historical newspaper collection can be done to a reasonable extent with tools that have been developed for modern Finnish. Still, it seems obvious that better results could be achieved with a new tool trained on the noisy historical data. We have ongoing development work with regards to this. We also try to improve the quality of our OCRed text data with new OCRing and post-correction. Together these should yield better NER results in the future.

Finally, a note about usage of Named Entity Recognition is in order. Named Entity Recognition is a tool that needs to be used for some useful purpose. In our case extraction of person and place names is primarily a tool for improving access to the Digi collection. After getting the recognition rate of some NER tool to an acceptable level, we need to decide how we are going to use extracted names in Digi. Some exemplary suggestions are provided by the archives of La Stampa [6] and Trove Names (Mac Kim and Cassidy, 2015). La Stampa style usage of names provides informational filters after a basic search has been conducted in the newspaper collection. The user can further look for persons, locations and organizations mentioned in the article results. This kind of approach enhances browsing access to the collection (Bates, 2007; McNamee, Mayfield and Piatko, 2011; Toms, 2000). Trove Names' name search takes the opposite approach: the user searches first for names and then gets articles where the names occur. We believe that La Stampa style usage of names in the GUI of a newspaper collection is more informative and useful for users, as the Trove style can be achieved with the normal search function in the GUI of the newspaper collection.

[6] http://www.archiviolastampa.it/

Our main emphasis with NER will be to use the names with the newspaper collection as a means to improve structuring, browsing and general informational usability of the collection. A good enough coverage of the names with NER needs to be achieved also for this use, of course. A reasonable balance of P/R should be found for this purpose, but also other capabilities of the software need to be considered. These remain to be seen later, if we are able to connect functional NER to our historical newspaper collection's user interface.

Acknowledgements

This work is funded by the EU Commission through its European Regional Development Fund, and the program Leverage from the EU 2014–2020.

Topics: Nordic Textual Resources and Practices
Keywords: named entity recognition, historical newspaper collections, Finnish

References

Bates, M. (2007). What is Browsing – really? A Model Drawing from Behavioural Science Research. Information Research 12. http://www.informationr.net/ir/12-4/paper330.html.
Finkel, J.R., Grenager, T. and Manning, C. (2005). Incorporating non-local information into information extraction systems by Gibbs sampling. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics (ACL 2005), 363–370, available at http://dl.acm.org/citation.cfm?id=1219885.
Kettunen, K., Mäkelä, E., Kuokkala, J., Ruokolainen, T. and Niemi, J. (2016). Modern Tools for Old Content - in Search of Named Entities in a Finnish OCRed Historical Newspaper Collection 1771-1910. LWDA 2016, available at: http://ceur-ws.org/Vol-1670/paper-35.pdf
Kettunen, K. and Pääkkönen, T. (2016). Measuring Lexical Quality of a Historical Finnish Newspaper Collection – Analysis of Garbled OCR Data with Basic Language Technology Tools and Means. In LREC 2016, Tenth International Conference on Language Resources and Evaluation, available at http://www.lrec-conf.org/proceedings/lrec2016/pdf/17_Paper.pdf.
Kokkinakis, D., Niemi, J., Hardwick, S., Lindén, K., and Borin, L. (2014). HFST-SweNER – a New NER Resource for Swedish. In Proceedings of LREC 2014, available at: http://www.lrec-conf.org/proceedings/lrec2014/pdf/391_Paper.pdf.
Löfberg, L., Piao, S., Rayson, P., Juntunen, J-P, Nykänen, A. and Varantola, K. (2005). A semantic tagger for the Finnish language, available at http://eprints.lancs.ac.uk/12685/1/cl2005_fst.pdf.
McNamee, P., Mayfield, J.C., and Piatko, C.D. (2011). Processing Named Entities in Text. Johns Hopkins APL Technical Digest, 30, 31–40.
Mac Kim, S., Cassidy, S. (2015). Finding Names in Trove: Named Entity Recognition for Australian. In Proceedings of Australasian Language Technology Association Workshop, available at https://aclweb.org/anthology/U/U15/U15-1007.pdf.
Manning, C. D., Schütze, H. (1999). Foundations of Statistical Natural Language Processing. The MIT Press, Cambridge, Massachusetts.
Nadeau, D., and Sekine, S. (2007). A Survey of Named Entity Recognition and Classification. Linguisticae Investigationes, 30(1): 3–26.
Neudecker, C. (2016). An Open Corpus for Named Entity Recognition in Historic Newspapers. In LREC 2016, Tenth International Conference on Language Resources and Evaluation, available at http://www.lrec-conf.org/proceedings/lrec2016/pdf/110_Paper.pdf.
Poibeau, T. and Kosseim, L. (2001). Proper Name Extraction from Non-Journalistic Texts. Language and Computers, 37(1): 144–157.
Silfverberg, M. (2015). Reverse Engineering a Rule-Based Finnish Named Entity Recognizer. Paper presented at Named Entity Recognition in Digital Humanities Workshop, June 15, Helsinki, available at: https://kitwiki.csc.fi/twiki/pub/FinCLARIN/KielipankkiEventNERWorkshop2015/Silfverberg_presentation.pdf
Toms, E.G. (2000). Understanding and Facilitating the Browsing of Electronic Text. International Journal of Human-Computer Studies, 52, 423–452.


The Digital Experience: Technology and Representation

Lars Kristensen
University of Skövde, Sweden
Graeme Kirkpatrick
University of Manchester

The research presented is part of an ongoing project regarding post-critique and digital culture. It is our objective to describe an experience that is particular to the digital, an aesthetics that is only achieved with/through digital paraphernalia. Media histories tell us that newness has always been projected as giving some kind of vertiginous bodily experience; an experience in which our bodies are too slow to react to the represented or too inexperienced to follow through, displacing or disrupting our sense of being in control. These experiences are part of capitalist celebrations of new technologies where humans can test their readiness for capitalist projected future realities. Currently it is various kinds of different 'realities' (AR, VR, MR) that are flocked as 'affordable' for consumers in order to promote 'new' experiences. We reject this narrative and argue that these 'new' experiences are as old as their ancestors, but argue that the digital experience is in fact not that different from a real experience.

As a consequence of this experience of non-difference, which can be summed up as indifference to different realities, we observe a deflating critical effect in the digital experience as it takes us closer to the real. We no longer only occasionally escape from reality, but reality is also the escape. The ever presence of the digital escape in our everyday life is the escape from critique. We argue that we have arrived at an understanding of the digital experience as near real or as a slightly altered state of reality, but that in this bodily state of being, an embodied 'gratuitous' experience (Williams 1991), the meaning of critique becomes pointless, since our sense of the real and the represented has narrowed to indistinctive proportions. We are resigned to a state of a frozen standstill (Benjamin 1999) where critical distance allows for minimal or no reflections. This means that there is no end to the escape but only continuous digital escapes into bodily experiences.

Viewed in this way, critique is part of the problem of digital media and not the first step towards any kind of solution. Rather than constituting a special, epistemically privileged vantage point it should be part of what we are talking about when we begin media analysis. Just as we should attend to the structure of feeling that makes certain kinds of media possible (Williams 1968), so we must include in that reading an account of the critical interpretation that articulates and manages the feeling responses of audiences. Our approach to digital media starts with the idea that we need to give up the idea of critique as a standpoint outside media and that this step is essential to what digital media are; to their specific difference from older forms.

The hazard in such an approach is that we might appear to be advocating a kind of quietism or a post-modern, post-political
celebration of the popular. However, we retain a political concern with how media serve contemporary domination. Our theoretical perspective is informed by Jacques Rancière's (2007) notion that media do not so much deceive or mislead their audiences as they cleave them to a particular perception of reality, namely, one that is partitioned, divided. Viewed in this way media are not deceptive or even manipulative, as critique would have it; their divisiveness – their violence – is of a different order. It consists in their articulation, or better, superimposition, of the sensed and the sensible.

Digital technology

In this part we generalize some of the findings of contemporary sociology of art (Heinich 1998; Hennion 1995) onto modern media more generally. We will argue that, viewed in this perspective, much of Adorno's theory of media and culture has been banalised. Where he wrote of a loss of authentic experience, it is clear that all media are now experiential, obliging us to act and to think. Where he identified the subject-object relation as the locus of 'critical tensions', and charged the modern subject with the 'critical' task of breaking itself apart in order to let objects speak, we now struggle to shut them up. Where critical theory targeted a monolithic 'hegemonic technological rationality' (Feenberg 2002), contemporary media accommodate a variety of thinking styles and present diverse openings to experience of the incommensurate, of that which does not fit.

To make this argument, we will use the example of the computer game, Dark Souls (2009-2012). In its difficult complexity, in the fact that it demands a response from the player to become anything at all, and in the obscurity of its literal meaning this game resembles the modernist work of art. Unlike those works, though, the video game does not require theory to redeem its meaning. Playing through the game is an experience of cleaving form from the dark matter of the digital machine but, detached from anything like critique, this form does not oppose the tyranny of meaning or subvert a hegemonic codification. At best, the experience we have with a computer game may separate players from everyday world-orderings and oblige them to question what makes the experience cohere. This reflects a shift in the relation of subject and media object more generally, in which their relationship is more symmetrical and entwined. In consequence 'art' loses its privileged status and 'mediatisation' loses its association with 'top-down' forms of domination (Kirkpatrick 2011).

Digital representation

In this part, we explore the indistinction of representation and reality in the digital experience. Being post-critical means that the world is not 'out there' beyond media and represented by them but rather constituted by being mediated through technology. More precisely, it is in moving between mediatic instants that we pull together a world, or worlds of experience, out of what is. Reality here is what gets instituted (Latour 2013: 280) or what is included 'as one' in the count (Badiou 2007). Under these circumstances reality is in between fiction and authentic life: in a strange and anti-climactic fulfillment of surrealism, the element of pretense is a recognized part of real life.

This situation corresponds to what Hal Foster calls the new Alexandrine, in which previously sharpened differences have become "stagnant incommensurabilities" (Foster 2004: 28). The infinite complexity of contemporary mediation, in which fantasy permeates reality and playful subjects disappear into a nether world where life is never fully life and death is only temporary, also merits description as neo-baroque (Ndalianis 2003). We will explore these ideas through the case study of This Is Not a Film (2011) by Jafar Panahi. In this film, Panahi re-enacts scenes from his earlier films in his living room in Teheran, while being under house arrest for alleged crimes against national security. We will discuss the film in relation to contemporary media where everyone and no one is an 'actor'; everyone, including Panahi, knowingly acts out a role that is and is not them. The example illustrates the way in which our experience of the
inchoate, or of being as difference, works because it does not work: it is our stumbling failures and awkward mistakes that bridge the gaps and produce a world, as well as it is Panahi's. Moreover, this activity is something we find thematised by 'un-critical' participants in media, whose activity instantiates the contemporary social imaginary.

Topics: The Digital, the Humanities, and the Philosophies of Technology
Keywords: theory of aesthetics, technology, experience, games, film

References

Badiou, A. (2007) Being and Event. Trans. O. Feltham. London: Continuum.
Benjamin, W. (1999) The Arcades Project. Cambridge, Massachusetts: Harvard University Press.
Feenberg, A. (2002) Transforming Technology. Oxford: Oxford University Press.
Foster, H. (2004) Design and Crime (and other diatribes). London: Verso.
Heinich, N. (1998) The Glory of Van Gogh: an anthropology of admiration. Trans. P. L. Browne. New Jersey: Princeton University Press.
Hennion, A. (1995) La Passion Musicale. Paris: Éditions Métailié.
Kirkpatrick, G. (2011) Aesthetic Theory and the Video Game. Manchester: Manchester University Press.
Latour, B. (2013) An Enquiry into Modes of Existence: an anthropology of the moderns. Trans. C. Porter. New York: Harvard University Press.
Ndalianis, A. (2003) Neo-baroque Aesthetics and Contemporary Entertainment. London: MIT Press.
Ranciere, J. (2009) The Future of the Image. Trans. G. Eliott. London: Verso.
Williams, L. (1991) 'Film Bodies: Gender, Genre, and Excess', Film Quarterly, 44(4) (Summer), pp. 2-13.
Williams, R. (1968) The Long Revolution. London: Chatto and Windus.


The Corpus of American Danish: A Corpus of Multilingual Spoken Heritage Danish and Corpus-based Speaker Profiles as a Way to Tackle the Chaos

Karoline Kühl
Jan Heegård Petersen
Gert Foget Hansen
University of Copenhagen, Denmark

During the last years (2014–2016), we have overcome a number of challenges while establishing the Corpus of American Danish (CoAmDa) within the project 'Danish Voices in the Americas' at the University of Copenhagen. The challenges have emerged from the digitization and integration of legacy data, the challenges of transcribing non-standardized data (with regard to spokenness and multilingual language use) and, finally, from using the corpus for research of language variation and change in emigrant and heritage speakers.

This paper will give a presentation of CoAmDa and its sub-corpora, the Corpus of North American Danish (data from Canada and the USA) and the Corpus of South American Danish (data from Argentina), as a newly established linguistic resource in the Nordic region. We aim at presenting and discussing corpus-based sociolinguistic speaker profiles as a newly developed tool for coping with a huge number of speakers who diverge massively with regard to their speech production.

As of December 2016, the CoAmDa amounts to 1.47 million tokens (165 hrs) produced by 264 speakers (born between 1876 and 1965). Basic speaker metadata like gender, birth year, time of emigration, home area in homeland and residence at the time of the recording are available for most speakers. The data is orthographically transcribed, aligned with sound, PoS-tagged, and the CoAmDa is currently being annotated with a basic syntactic annotation (main clauses, declarative clauses, subject, finite
verb and sentence adverbials). During the transcription process, words were coded according to language used. To make the coding as time efficient as possible, automated procedures were developed to the effect that transcribers needed only to code language use for words in languages other than the designated default language (in our case Danish), such as English or Spanish, as well as word-internal switching (occurring either between stems or between stem and suffixes) and words that could not unambiguously be assigned to one language due to an intermediate pronunciation (e.g., Danish søster and English sister).

A subsequent automated procedure assigned a language code to the remaining words, based on the designated default language, and language codings based on the orthographic transcription. Some words were categorized by common traits such as interrupted words beginning or ending in a dash (-) and proper nouns by being capitalized. Some words belonging to (semi-)closed sets such as discourse markers were coded based on a small lookup table. Based on this preliminary coding of language use, dictionaries (comprehensive word-language-category lookup tables) for either language combination were generated which were then meticulously proofread. Two different dictionaries were necessary, since the words that contain word-internal language switching and/or those that are ambiguous with regard to language assignment are not the same between Danish vs. English and Danish vs. Spanish. The final language coding was arrived at by automatically checking the preliminary language coding against the dictionaries, thus ensuring a very high accuracy of the language coding.

For future projects, the dictionaries created in this process may facilitate (semi-)automated designation of language use in transcriptions of Danish mixed with either English or Spanish.

Emigrant and heritage speakers are notorious for the variation in language competence and language production (Polinsky & Kagan 2007). Accordingly, we observe a great deal of variance within the language production of the North American Danes and the Argentine Danes: The variance concerns fluency (i.e. amount of empty and filled pauses, hesitation phenomena, restarts), the amount of word-internal codeswitching and code shifting between utterances as well as the kind and amount of non-Danish linguistic variables (e.g., the use of 'English' word order in American Danish, see Kühl & Heegård Petersen to appear). Within the tradition of Labovian sociolinguistics, our aim is to look for ties between this variance in speech production and the sociolinguistic variables (i.e. speaker metadata). Ties between language use and sociolinguistic characteristics cannot be assumed per se in research on emigrant and heritage speakers, as the emigration process typically will result in a mix-up of the connections between linguistic variables and sociological characteristics established in the homeland. Non-intuitive knowledge about the speakers' speech production, including differences in lemma production, amount and type of code-switching, speech rate, hesitations, repetitions and restarts, is thus a desideratum.

In order to establish these connections anew for the speakers in the corpus, we draw on the above-mentioned transcription and annotations, i.e. codings of language choice, syntax, pauses and hesitations, in combination with the speaker metadata. Using these data to create corpus-based sociolinguistic speaker profiles provides us with the possibility of recognizing patterns across the language production of many speakers, but also in the language production of single speakers. The paper will demonstrate how this tool enables a more objective assessment of a speaker's or a group of speakers' language production and competence, in addition to presenting the Corpus of American Danish as a linguistic resource and discussing the process of establishing the corpus.

Topics: Nordic Textual Resources and Practices
Keywords: corpus, multilingualism, spoken language
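
The preliminary language coding described in this abstract (manual codes for non-default, switched or ambiguous words, then automatic codes from surface traits, a small lookup table and the default language) can be sketched in Python roughly as follows. This is an illustrative reconstruction and not the project's code: the function name `code_token`, the category labels and the sample lookup table are assumptions made for the example, as is the premise that sentence-initial words are not capitalized in the transcriptions.

```python
from typing import Optional

DEFAULT_LANGUAGE = "da"  # designated default language (Danish)

# Hypothetical lookup table for a (semi-)closed set such as discourse markers.
DISCOURSE_MARKERS = {"well": "en", "okay": "en", "bueno": "es"}


def code_token(token: str, manual_code: Optional[str] = None) -> str:
    """Assign a preliminary language or category code to one transcribed token."""
    if manual_code is not None:
        # Transcribers code only non-default, word-internally switched or
        # ambiguous words; their decision takes precedence.
        return manual_code
    if token.startswith("-") or token.endswith("-"):
        # Interrupted words begin or end in a dash.
        return "interrupted"
    if token[:1].isupper():
        # Proper nouns are recognized by capitalization.
        return "proper_noun"
    if token.lower() in DISCOURSE_MARKERS:
        return DISCOURSE_MARKERS[token.lower()]
    # Everything else falls back to the default language.
    return DEFAULT_LANGUAGE


# Invented example utterance; only "sister" carries a manual code.
tokens = [("min", None), ("sister", "ambiguous"), ("bor", None), ("i", None),
          ("Chicago", None), ("well", None), ("og-", None)]
codes = [code_token(tok, manual) for tok, manual in tokens]
print(codes)
# ['da', 'ambiguous', 'da', 'da', 'proper_noun', 'en', 'interrupted']
```

The order of the checks encodes a priority: human decisions first, then unambiguous surface traits, and the default language last, so that transcribers never need to touch the large majority of plain Danish tokens.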

References
Kühl, Karoline & Jan Heegård Petersen (2016) Ledstillingsvariation i amerikadanske hovedsætninger med topikalisering. In Ny forskning i grammatik 23. Available online at http://ojs.statsbiblioteket.dk/index.php/nfg/article/download/24650/21598.
Polinsky, Maria & Olga Kagan (2007) Heritage Languages: In the 'Wild' and in the Classroom. In Languages and Linguistics Compass 1 (5), 386-395.

Rhythms of Fear and Joy in Suomi24 Discussions

Krista Lagus
Mika Pantzar
Minna Ruckenstein
University of Helsinki, Finland

Suomi24 is the largest social media platform in Finland. Every day, over 15,000 messages are posted in its nearly 2,900 discussion groups. In its slightly more than 15 years of existence, about 80 million messages in total have been posted on Suomi24, of which we have had access to about 56 million (excluding, for example, deleted messages; see the material description by Lagus et al. 2016). The readers and the posters of the messages represent the Finnish population well in their geographical distribution, for example. In this presentation, we contemplate how and to what extent it is possible to recognise emotional movement and rhythm from this kind of material.

Over the past few years, sentiment analysis (Abbasi et al. 2008; Liu 2010; Honkela et al. 2014) has become an established method of describing emotional states by means of social media content. Twitter or Facebook posts are a window to recognising people's views on abortion, NATO, or political parties, among other things. Sentiment analysis can be used to examine the posters' subjective views on various topics, whereas topic mining (Purhonen & Toikka 2016; Winter & Wiberg 2016) is used to examine facts or themes subject to discussion. Sentiment analysis most often entails examining the negative/positive dichotomy in the material, as in reviews indicating good vs. bad (e.g., film or other product reviews), indications of shared/different opinion or approval/disapproval, and expressions of positive/negative emotional state.

With social media, the entity we call 'society' has become more visible, more tangible, and more concrete than before. Social media make visible and renew the process through which we collectively produce thoughts and thus reshape society. In social media conversations, emotions are transmitted from one participant and one discussion to another. This results in occasional emotional rushes and affective contagion. Often people's emotions become synchronised and eventually form shared rhythms, in a process called entrainment, and larger entities: recognisable wave motions.

We approach the social media discussions by means of rhythmanalysis (Lefebvre 2004), focusing on social rhythms produced by different practices and systems. These kinds of rhythms can be detected in cities and in people's biologies, resulting in fluctuations of stress and recovery that could be detected by means of heart rate variability measurements (Pantzar et al. 2016). Our presentation considers whether it is possible to detect similar rhythms in social media text. We ask what kind of fluctuation can be detected in the emotional discourse between weekdays and weekends, and whether one can see systematic fluctuation depending on the time of day, as we witnessed in the above-mentioned stress data, collected from 35 people (the study revealed that on Saturdays, after a spike in early afternoon, stress declines towards the evening, whilst on weekdays and Sundays, the level of stress reaches its peak in the evening, and on weekdays also at 8–9am).

The presentation focuses on the approximately 56 million messages in the Suomi24 dataset. We describe how the lexicons for the respective emotion categories were chosen. We will then present time cycles calculated from the frequencies of words expressing fear and joy. Finally, we discuss to what extent our word-frequency examination in connection with emotional discourse represents meanings related to emotions and to experiencing emotions.

When discussing emotional discourse, we are especially interested in getting at long-term mood-like emotional states and also emotional discourse related to or contributing to the activation of these mood states. Previous efforts to recognise long-term emotional states or moods include the study of six basic emotions (Strapparava & Mihalcea 2008), recognition of eight basic emotions from Twitter data (Roberts et al. 2012), and recognition of emotional states representing five distinct aspects of well-being as represented in various types of news and discussion data (Honkela et al. 2014).

The method we applied comprises the following components:
1. Identification of the emotional states subject to study and their more precise definition using a vocabulary-based approach.
2. Evaluating the preliminary emotion features qualitatively using the Korp tool (Borin et al., 2012) and adjusting the set of features accordingly.
3. Calculating the averaged occurrences of emotion expressions over various time cycles and visualizing them.

As a starting point, we selected for observation two quite distinct categories of emotional discourse, which appear as somewhat dichotomous: categories of fear ('Fear/Worry') and joy ('Joy/Happiness'). We chose fear and worry to be the starting point because we deemed them likely to have more multifaceted contexts than other considered moods such as sadness or anger and, hence, to provide a richer target of analysis. Moreover, the mood of fear and worry may function as a seedbed of social aggressions.

In our dataset, the ways in which both the Fear/Worry lexicon and the Joy/Happiness lexicon manifest themselves show distinct differences by season, day of the week, and time of day. On a monthly level (see Figure 1, at the end of the text), the Joy/Happiness discourse seems to follow the rhythm of holidays. Joy/Happiness discourse thrives throughout the summer, particularly in July. We can see another peak over Christmas. In contrast, the Fear/Worry discourse is more evenly distributed over the year, but peaks in September. We might ask whether these figures reflect the real-world seasonal stress curve. With the Fear/Worry spike in September and the concurrent low point for the Joy/Happiness curve, we could ask whether there is a spike in the population's stress level in September, as the working population have returned to work after the summer holidays and as schools too are running as usual. Are there any other studies that would indicate September to be a particularly challenging time in people's lives? A similar pattern is found in January, after Christmas.

On a weekly level (see Figure 2, at the end of the text), both the Joy/Happiness discourse and the Fear/Worry discourse reach their peak on Sundays. The former is at its lowest on Thursdays, and the Fear/Worry discourse reaches its lowest point on Saturdays. Thursday is a common banking day, that is, the day when people pay their bills. We might also consider the possible relationship of alcohol use to these results, especially in the case of Saturday evening. Is the spike in the Joy/Happiness curve at around 9pm on Saturdays a sign of a 'buzz' and the spike in the Fear/Worry curve on Saturday night a sign of coming down from that 'high'?

On an hourly level, the Joy/Happiness discourse is at its highest slightly before midnight, after which the curve keeps dropping until 10am. After that, the proportion of Joy/Happiness discourse starts to increase slowly. With the Fear/Worry discourse, we see a much clearer hourly rhythm. In the discussions, there is a concentration of fear and worry between 2 and 4am. This can be explained by the fact that the group of people taking part in the discussion at these hours is very limited and specific. For instance, the messages are longer than average and the number of participants is far below the daily average (Lagus et al. 2016).
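
The third component of the method (averaging occurrences of emotion expressions over time cycles) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration only: the two tiny lexicons, the whitespace tokenisation, the normalisation by token count, and the function name emotion_cycles are hypothetical and do not reproduce the lexicons or the pipeline used in the study.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical mini-lexicons (assumption); the study's Finnish lexicons are not reproduced here.
LEXICONS = {
    "Fear/Worry": {"pelko", "huoli"},
    "Joy/Happiness": {"ilo", "onni"},
}


def emotion_cycles(messages, key):
    """Share of tokens found in each emotion lexicon, grouped by a time-cycle key.

    messages: iterable of (datetime, text) pairs
    key: function mapping a datetime to a cycle slot (hour, weekday, month, ...)
    Returns {category: {slot: lexicon hits / all tokens in that slot}}.
    """
    hits = defaultdict(Counter)   # category -> slot -> number of lexicon hits
    totals = Counter()            # slot -> number of tokens
    for timestamp, text in messages:
        slot = key(timestamp)
        tokens = text.lower().split()  # naive tokenisation (assumption)
        totals[slot] += len(tokens)
        for category, lexicon in LEXICONS.items():
            hits[category][slot] += sum(1 for token in tokens if token in lexicon)
    return {
        category: {
            slot: hits[category][slot] / totals[slot]
            for slot in totals
            if totals[slot]
        }
        for category in LEXICONS
    }


# Toy messages, invented for this sketch.
messages = [
    (datetime(2016, 9, 5, 3, 0), "huoli ja pelko valvottaa"),
    (datetime(2016, 7, 16, 21, 0), "ilo ja onni kesällä"),
    (datetime(2016, 7, 17, 21, 0), "tavallinen ilta ilman uutisia"),
]

by_hour = emotion_cycles(messages, key=lambda ts: ts.hour)
by_weekday = emotion_cycles(messages, key=lambda ts: ts.weekday())  # 0 = Monday
by_month = emotion_cycles(messages, key=lambda ts: ts.month)
```

Passing the grouping function as a parameter lets the same counting routine serve the monthly, weekly and hourly cycles discussed above; plotting the resulting dictionaries would give curves of the kind the text describes.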

Discussion
The emotional discourse in the climate of interaction surrounding us tunes us in to the emotional atmosphere that we live in, and it affects the emotional nature of our actions in the following days. Hence, an interesting empirical question arises: what kind of emotional discourse does this nation collectively produce and read on Internet fora? Empirical research directed to such issues might give us a stronger foundation for answering questions about the recent evolution of the Finnish social climate or emotional terrain.

As shown by Facebook's study of the contagious nature of emotions expressed in the virtual world, once people read messages of either positive or negative polarity in their news feed, it affects the positive and negative expressions in their own messages in the following days (Kramer 2012; Kramer et al. 2014). What the illustrations calculated here inform us of is the emotional landscape of a particular discussion forum against cyclic time, a natural way to think about our own lives. While we might not deduce the entire emotional state of a nation from these figures, we might nevertheless consider when it is emotionally most beneficial for us to participate in social media discussions. Furthermore, these figures might suggest times at which to add particular support for those who might need it most. Much as a rhythm map of city traffic might depict times of road rage versus times of polite and calm driving, this mapping depicts an emotionally informed timescape of the virtual social climate.

Topics: The Digital, the Humanities, and the Philosophies of Technology
Keywords: social media, social sciences, sentiment analysis

Bibliography
Abbasi, A., Chen, H., & Salem, A. (2008). Sentiment analysis in multiple languages: Feature selection for opinion classification in Web forums. ACM Transactions on Information Systems (TOIS), 26(3), 12.
Anderson, C. (2008). The end of theory: The data deluge makes the scientific method obsolete. Wired, 7/2008. Available at http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory.
Borin, L., Forsberg, M., & Roxendal, J. (2012). Korp – the corpus infrastructure of Språkbanken. In: Proceedings of LREC 2012 (pp. 474–478).
Honkela, T., Korhonen, J., Lagus, K., & Saarinen, E. (2014). Five-dimensional sentiment analysis of corpora, documents and words. In: Advances in Self-organizing Maps and Learning Vector Quantization (pp. 209–218). Springer International Publishing.
Kramer, A.D. (2012). The spread of emotion via Facebook. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 767–770). ACM.
Kramer, A.D., Guillory, J.E., & Hancock, J.T. (2014). Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences, 111(24), 8788–8790.
Lagus, K.H., Pantzar, M., Ruckenstein, M.S., & Ylisiurua, M.J. (2016). Suomi24 Muodonantoa aineistolle ['Suomi24: Shaping the data']. Valtiotieteellisen tiedekunnan julkaisuja ['Publications of the Faculty of Social Sciences'], 10, May 2016. Helsinki: Unigrafia. 44 pages.
Lefebvre, H. (1992/2004). Rhythm Analysis, Space, Time and Everyday Life (Athlone Contemporary European Thinkers series), transl. by S. Elden and G. Moore. London: Continuum.
Liu, B. (2010). Sentiment analysis and subjectivity. In: Handbook of Natural Language Processing, Vol. 2 (pp. 627–666).
Pantzar, M., Ruckenstein, M., & Mustonen, V. (2016). Social rhythms of the heart. In: Health Sociology Review, forthcoming, http://dx.doi.org/10.1080/14461242.2016.1184580
Purhonen, S. & Toikka, A. (2016). ”Big datan” haaste ja uudet laskennalliset tekstiaineistojen analyysimenetelmät. Esimerkkitapauksena aihemallianalyysi tasavallan presidenttien uudenvuodenpuheista 1935–2015 ['The challenge of "big data" and new computational methods for text analysis: An example from a topic model of New Year's speeches of Finnish Presidents, 1935–2015']. Sosiologia ['Sociology'], 53(1), 6–27.
Roberts, K., Roach, M.A., Johnson, J., Guthrie, J., & Harabagiu, S.M. (2012). EmpaTweet: Annotating and detecting emotions on Twitter. In: Proceedings of LREC 2012 (pp. 3806–3813).
Strapparava, C. & Mihalcea, R. (2008). Learning to identify emotions in text. In: Proceedings of the 2008 ACM Symposium on Applied Computing (pp. 1556–1560). ACM.
Winter, L. & Wiberg, M. (2016). Presidentin uudenvuodenpuheet: Kvantitatiivisen tekstianalyysin mahdollisuuksia ['The Presidents' New Year's speeches: The possibilities of quantitative text analysis']. Politiikka ['Politics'], 58(1), 80–88.

Long-Range Information Dependencies and Semantic Divergence Indicate Author Kehre

Kristoffer Laigaard Nielbo
Aarhus University, Denmark
Katrine Frøkjær Baunvig
University of Southern Denmark

Introduction
Across academic disciplines that study literary and intellectual history there are ongoing discussions of if and when culturally important writers of fiction or non-fiction underwent a personal paradigm shift (a Kehre). The collected writings of philosopher Martin Heidegger, writer Milan Kundera, and theologian Martin Luther all show indications of such a Kehre. The temporal identification of a Kehre, however, represents a significant methodological challenge. In this paper, we describe a novel approach to the identification of change in the history of highly productive writers. The approach combines information theory and fractal analysis to substantiate claims about an intellectual Kehre, as exemplified by the Danish liberal thinker, theologian and romantic writer N.F.S. Grundtvig (1783-1872).

Methods
The corpus consists of the collected writings of N.F.S. Grundtvig (N = 988). Shannon entropy and Latent Dirichlet Allocation (LDA) were used to model the lexical density and the semantic content, respectively, of each of Grundtvig's writings. Adaptive Fractal Analysis (AFA) was used to estimate the Hurst exponent (i.e., a measure of long-range dependencies, LRD, in time series) of lexical density in windowed time slices (n = 589). Kullback-Leibler (KL) divergence was applied to an LDA model's topic distributions in order to estimate time-dependent semantic change.

Results
AFA indicated an 'early style' phase of persistent lexical innovation in Grundtvig's early writings, which contrasts with otherwise short-term and anti-persistent dynamics. KL divergence, on the other hand, identified a short 'late style' phase of semantic innovation in Grundtvig's late writings.

Discussion
We argue that early and late style phases are signatures of one, or several, Kehren in Grundtvig's writings that reflect a combination of individual mental history and general cognitive development. At a methodological level, we argue that LRD and information theory can substantiate qualitative observations of paradigm shifts in cultural systems at both the micro and the macro level.

Topics: Nordic Textual Resources and Practices
Keywords: author development, long-range dependencies, information theory, culture analytics
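
Two of the measures named in the Methods section, Shannon entropy and Kullback-Leibler divergence, can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the whitespace-level token lists, the base-2 logarithm, the small smoothing constant eps, and the two example topic vectors are hypothetical and are not taken from the study. Fitting the LDA model and estimating the Hurst exponent with Adaptive Fractal Analysis are not shown.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy (in bits) of the word distribution of one document."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def kl_divergence(p, q, eps=1e-12):
    """KL divergence D(p || q) in bits between two topic distributions.

    p and q are equal-length sequences of probabilities; eps is a small
    smoothing constant (an assumption of this sketch) that avoids division
    by zero when q assigns zero probability to a topic.
    """
    return sum(
        pi * math.log2((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


# Toy inputs, invented for this sketch.
uniform_doc = ["a", "b", "a", "b"]        # two equally frequent words
repetitive_doc = ["a", "a", "a", "a"]     # a single repeated word
earlier_topics = [0.5, 0.5]               # hypothetical topic distribution
later_topics = [1.0, 0.0]                 # hypothetical topic distribution

entropy_uniform = shannon_entropy(uniform_doc)
entropy_repetitive = shannon_entropy(repetitive_doc)
semantic_shift = kl_divergence(later_topics, earlier_topics)
```

Computing the entropy for each text in chronological order yields a time series of the kind to which a Hurst exponent could then be fitted, and computing the divergence between topic distributions of different periods gives one possible proxy for semantic change. How the study windowed or paired its documents is not stated in the abstract.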

Finnish Internet Parsebank – A Web-crawled Corpus of Finnish with Syntactic Analyses

Veronika Laippala
Aki-Juhani Kyröläinen
Jenna Kanerva
Juhani Luotolahti
Tapio Salakoski
Filip Ginter
University of Turku, Finland

This paper presents the Finnish Internet Parsebank (FIP), a freely available web-crawled corpus of Finnish, its user interface, and some recent studies enabled by it.

The FIP consists of nearly 4 billion words automatically collected from the Web. It has full morphological and dependency syntax analyses. At the word level, this includes the part-of-speech classes of the words and their morphological features (such as noun, singular and genitive), and at the sentence level, the sentence structure and the syntactic functions of the words in it (such as nominal subject). These are marked following the Universal Dependencies (UD) scheme, a syntactic model seeking cross-linguistically consistent annotations and attested in 47 languages. The UD allows for novel insights into many linguistic research problems by enabling their study across languages. For instance, the characteristics of different texts can be analysed not only in one language but across several languages. Also, many language-technology applications, such as machine translation, profit from these harmonised markings.

The FIP is available through a user interface at http://bionlp-www.utu.fi/dep_search/ and as a downloadable version, shuffled at the sentence level, at http://bionlp.utu.fi. The advanced user interface is described in detail in Luotolahti et al. (2015). The interface allows for the search of individual words (such as boy), of words with specific morphological or syntactic features (such as boy as the sentence subject), and of specific morphological and/or syntactic constructions regardless of their lexical realisation (such as all sentence subjects). Searches with restrictions are also possible (such as verbs without a subject). The interface returns the sentences including the searched expression as well as its linguistic contexts. The search hits can also be downloaded. In addition to the FIP, the user interface includes several other language resources, such as corpora in other languages following the UD scheme.

The currently available version of the FIP is composed of 3,662,727,698 words in total. These include 28,585,422 lemmata, 39,688,642 unique words and 275,690,022 sentences. The FIP was collected using two methods. First, all Finnish texts were detected from the 2012 release of the Common Crawl dataset. Common Crawl is a U.S. non-profit organisation that builds and maintains web-crawled data. Second, we launched a dedicated web crawl targeting Finnish data, not limited to the .fi domain. The crawl was realised using the SpiderLing crawler, which is designed for collecting linguistic data.

For the linguistic analysis of the data, the raw texts were first segmented into sentences and words with a sentence splitter and tokenizer developed using the Apache OpenNLP toolkit and trained on our previously manually developed language resource, the Turku Dependency Treebank (TDT) (Haverinen et al. 2014). The part-of-speech classes of the words and their morphological features were assigned with the Marmot tagger (Mueller et al., 2013), the morphological analyzer OMorFi (Pirinen, 2011) and a system transforming the OMorFi output to UD (Pyysalo et al., 2015). An evaluation of this analysis pipeline showed an accuracy of 97.0% for the parts of speech and 94.0% for the full representation of the morphological features, which is comparable to the state-of-the-art results in other languages (Pyysalo et al. 2015). The dependency syntax analysis of the sentence structure is carried out using the parser of Bohnet (2010), which is also trained on a version of TDT in the UD scheme. The parser performance is 81.4% labeled attachment score, i.e. the percentage of words that are assigned both the correct head word and the correct dependency type.

Thanks to its size, linguistic variation and syntactic analyses, the FIP opens novel possibilities for all disciplines working on textual data. Among others, these advantages have allowed us to develop large-scale quantitative methods for a detailed linguistic analysis of the characteristics of both different kinds of texts and individual words or expressions, such as discourse markers expressing reactions or interaction (for instance, kyllä 'sure' and tokin 'certainly' (Laippala et al. 2016, forthc.)). These methods also allow the automatic identification of, for instance, machine translations and informal texts in the FIP (Laippala et al. 2015).

In addition, the FIP has been applied to the study of very rare linguistic constructions, typical of spoken or informal language varieties and not necessarily found in traditional, manually collected corpora. For example, Huumo et al. (forthc.) explore the variation of an extremely rare and grammatically questionable syntactic construction, where a transitive sentence, i.e. a sentence with an object, includes a subject in the partitive case, as in Useita uudehkoja autoja on reputtanut tämän testin 'Several newish cars have failed this test'. Wessman (2016) studies the use of a novel syntactic construction typical of spoken language, where the subordinate conjunction koska 'because' is attached to a noun phrase instead of a verb phrase, as in 'I am tired, because the headache'.

Another line of investigation that utilizes the data available in the FIP concerns the representation of clausal semantics using neural networks, specifically, modeling the semantic fit of arguments in a transitive construction. The implemented neural network builds a semantic representation for a transitive construction based on word2vec (Mikolov et al. 2013). We are currently testing the performance of the implemented model in several tasks, such as a cloze task and modeling reading times using eye-tracking. The results of the current model appear promising, as the model estimates are correlated with human preferences when asked to complete transitive constructions (cloze task) and with reading times (eye-tracking) (Kyröläinen et al. 2016). The morphosyntactic analysis of the FIP allows us to model transitive constructions and, importantly, even rare verbs occur in sufficient quantity to make it possible to build semantic representations for them.

Finally, the FIP data has been used to improve the language technology available for the Finnish language, especially machine translation (MT). A better language model as well as a reinflection generation model were induced from the FIP data, resulting in improved MT performance, especially for the English to Finnish direction (Tiedemann et al. 2016). In an ongoing effort, the FIP data is being used to fully automatically gather a parallel corpus for MT system training.

Topics: Nordic Textual Resources and Practices
Keywords: Web corpus, Universal Dependencies, big data, corpus linguistics, natural language processing

References
Bohnet, Bernd. 2010. Top accuracy and fast dependency parsing is not a contradiction. In Proceedings of COLING'10, pages 89–97.
Haverinen, Katri, Nyblom, Jenna, Viljanen, Timo, Laippala, Veronika, Kohonen, Samuel, Missilä, Anna, Ojala, Stina, Salakoski, Tapio and Ginter, Filip. 2014. Building the essential resources for Finnish: the Turku Dependency Treebank. Language Resources and Evaluation, 48(3):493–531.
Huumo, Tuomas, Kyröläinen, Aki-Juhani, Kanerva, Jenna, Luotolahti, Juhani, Salakoski, Tapio, Ginter, Filip, and Laippala, Veronika (forth.) Distributional Semantics of the Partitive A Argument Construction in Finnish. In Luodonpää-Manni, M., E. Penttilä and J. Viimaranta (eds.). Empirical approaches to cognitive linguistics: Analysing real-life data. Newcastle Upon Tyne: Cambridge Scholars Publishing.

83 Laippala, Veronika, Kyröläinen, Aki-Juhani, open source development of a mor- Kanerva, Jenna, Luotolahti, Juhani, phological analyser. In Proceedings of Salakoski, Tapio, and Ginter, Filip the 18th Nordic Conference of Com- (forth.) Dependency profiles as a tool putational Linguistics (NODALIDA), for big data analysis of linguistic con- pages 299–302. structions: A case study of emoticons. Pyysalo, Sampo, Kanerva, Jenna, Missila, Journal of Estonian and Finno-Ugric Anna, Laippala, Veronika and Ginter, Linguistics. Grammar in Use: Ap- Filip. 2015. Universal Dependencies proaches to Baltic Finnic. for Finnish. In Proceedings of the Laippala, Veronika, Kyröläinen, Aki-Juhani, 20th Nordic Conference of Computa- Komppa, Johanna, Vilkuna, Maria, tional Linguistics (Nodalida 2015), Kalliokoski, Jyrki, and Ginter, Filip. pages 163–173. 2016. Sentence-initial discourse mark- Tiedemann, Jörg, Cap, Fabienne, Kanerva, ers in the Finnish Internet. In Text- Jenna, Ginter, Filip, Stymne, Sara, Link 2016 Handbook. Harmattan. Östling, Robert, and Weller-Di Marco, Laippala, Veronika, Kanerva, Jenna, Pyysalo, Marion. 2016. Phrase-Based SMT for Sampo, Missilä, Anna, Salakoski, Finnish with More Data, Better Mod- Tapio, and Ginter, Filip. 2015. Syntac- els and Alternative Alignment and tic N-grams in the Classification of the Translation Tools. In Proceedings of Finnish Internet Parsebank: Detecting the First Conference on Machine Translations and Informality. Proceed- Translation, pages 391-398. ings of the 20th Nordic Conference of Wessman, Kukka-Maaria. 2016 Koska inter- Computational Linguistics net. Finiittiverbittömän koska X kon- (NODALIDA 2015), May 11–13, struktion syntaksi ja variaatio. Master’s 2015 in Vilnius, Lithuania. thesis, School of languages and trans- Luotolahti, M. Juhani, Kanerva, Jenna, lation studies, University of Turku. Pyysalo, Sampo, and Ginter, Filip

2015. SETS: Scalable and Efficient Bibliography Tree Search in Dependency Graphs. Laippala, Veronika, Kyröläinen, Aki-Juhani, Proceedings of the 2015 Conference Kanerva, Jenna, Luotolahti, Juhani, of the North American Chapter of the Salakoski, Tapio, and Ginter, Filip Association for Computational Lin- (forth.) Dependency profiles as a tool guistics: Demonstrations. Association for big data analysis of linguistic con- for Computational Linguistics, 51--55. structions: A case study of emoticons. Kyröläinen, Aki-Juhani and Luotolahti, M. Journal of Estonian and Finno-Ugric Juhani and Hakala, Kai and Ginter, Linguistics. Grammar in Use: Ap- Filip 2016. Modeling cloze probabili- proaches to Baltic Finnic. ties and selectional preferences with Huumo, Tuomas, Kyröläinen, Aki- neural networks. DSALT: Distribu- Juhani, Kanerva, Jenna, Luotolahti, tional semantics and linguistic theory, Juhani, Salakoski, Tapio, Ginter, Filip, August 08–15, 20106 in Bolzano, Italy. and Laippala, Veronika (forth.) Distri- Mueller, Thomas, Schmid, Helmut, and butional Semantics of the Partitive A Schutze, Hinrich. 2013. Efficient Argument Construction in Finnish. In higher-order CRFs for morphological Luodonpää-Manni, M., E. Penttilä and tagging. In Proceedings of the 2013 J. Viimaranta (eds.).Empirical ap- Conference on Empirical Methods in proaches to cognitive linguistics: Ana- Natural Language Processing, pages lysing real-life data. Newcastle Upon 322–332. Tyne: Cambridge Scholars Publishing. Pirinen, Tommi A. 2011. Modularisation of Laippala, Veronika, Kanerva, Jenna, Finnish finite-state language descrip- Pyysalo, Sampo, Missilä, Anna, Sala- tion–towards wide collaboration in koski, Tapio, and Ginter, Filip. 2015.

84 Syntactic N-grams in the Classification computer science and cartography (Kirk, of the Finnish Internet Parsebank: De- 2012), in addition to textgenetic analysis. tecting Translations and Informality. It is important to consider two comple- Proceedings of the 20th Nordic Con- mentary concepts on the same visual surface ference of Computational Linguistics when creating data visualizations, namely, (NODALIDA 2015), May 11–13, data representation (visual variables in the 2015 in Vilnius, Lithuania. creation of graphs or charts) and data pre- sentation (appearance and delivery format of the entire data visualization design, colors, the interactive features and the annotations). Writing and Rewriting: The (Aligner, et al., 2011) Colored Digital Visualization The writing process is difficult to grasp as a whole. From a computer science and mat- of Keystroke Logging hematics standpoint, there are only two di-

mensions to this process: the temporal di- Christophe Leblay mension, involving the specific moment University of Turku, Finland when each operation was made; and the Gilles Caporossi spatial one, which corresponds to the exact HEC Montréal, Canada position of the operation in the list. Because

this definition is highly decontextualized, As they contain a lot of data, keystroke log- some writing process representations also ging files, are difficult to read and analyze use a third dimension, chronology, which is (Wengelin, et al., 2009). There are many rea- a simplification of the temporal aspect sons for this, including their chronological (Bécotte-Boutin, Caporossi & Hertz, 2015). format and high number of complex details The writer adds and removes characters (Kollberg, 1996). However, representations chronologically in time, but the overall state of writing are, so far, one of the main tools of the text changes as the writer modifies it. used to analyze it. The reason why analyzing Genetic criticism studies precisely the diffe- the writing process is so important derives rent states of the text. Those three dimens- from the genetic methodology, where the ions then concern genetic operations at the more a text is changed or modified, the bet- most basic level. Each operation of the wri- ter it becomes (Leblay, 2011). The ultimate ting process can be considered as a substi- goal then becomes to understand how modi- tution operation (Van Waes & Schellens, fications continue to improve the text and 2003). An insertion would be the replace- how modifications are done in order to un- ment of an empty space by a keystroke, and derstand the way the text continuously im- the deletion or replacement of a keystroke proves. by an empty space. These operations are The goal of data representations is to characterized by the fact they are done in a help researchers with their analysis, to assist single step with the mouse or keyboard. them in understanding the data and finding More complex operations, such as substitut- patterns in it. Visualization is more than just ion and replacement, which are done in two drawings of data; it is an analysis tool (Many- steps (Caporossi & Leblay, 2011), are consi- ika, et al., 2011). 
Seeing how data interacts dered to be combinations of the simple op- makes it possible to discover and understand erations. patterns and changes over time within a da- Another aspect of the writing process is tabase (Minelli, et al., 2013; Yau, 2011). For the micro and macro aspects of the text, i.e., a researcher to use representations in a way the detailed operations performed and the that does more than just describe a dataset process’overall structure. Because those two requires visualization techniques. These te- aspects cannot be visualized together in the chniques are multidisciplinary and include same representation unless interactivity and statistics, cognitive science, graphic design, the view adjustment feature are used (Aig-

85 ner, et al., 2011) researchers usually use seve- awareness during text composition. ral representations to understand the process Learning and Individual Differences, more completely (Alamargot, et al., 2011; 21 (5), 505-516. Breetvelt, et al., 1994; Caporossi & Leblay, Breetvelt, I., Van Den Bergh, H., & Rijlaars- 2011; Cox, et al., 2009; Doquet-Lacoste, dam, G. (1994). Relations between 2003; Haas, 1989; Latif, 2008; Leijten & Van Writing Processes and Text Quality: Waes, 2013; Southavilay, et al., 2013; Van When and How. Cognition and In- Waes & Schellens, 2003). struction, 12 (2), 103-123. Actual visualizations of the writing pro- Caporossi, G., & Leblay, C. (2011). Online cess are bidimensional, and because of that, Writing Data Representation: A Graph they focus for example on revision, the tem- Theory Approach. In Lecture Notes in poral aspect or the writer’s retrospection Computer Sciences 7014, 80-89. (Latif, 2008). Even if it is important to ana- Cox, M., Ortmeier-Hopper, C., & Tirabassi, lyze and understand the spatiotemporal di- K. E. (2009). Teaching Writing for the mension of the process (Stromqvist, et al., "Real World": Community and Work- 2006), none of the actual visualizations re- place Writing. The English Journal, 98 present the problem completely. (5), 72-80. We propose new visualizations based on Doquet-Lacoste, C. (2003). Étude Génétique mathematical graphs that consist of nodes de l'Écriture sur Traitement de Texte (points) and edges (lines eventually joining d'Élèves de Cours Moyen 2, Année the nodes). As such, graphs are based on 1995-1996. Paris: Université Sorbonne relationships between nodes and may be nouvelle. used for modeling purposes. This colored Haas, C. (1989). How the Writing Medium representation is halfway between detailed Shapes the Writing Process: Effects of representations and overviews. The dynamic Word Processing on Planning. 
Rese- aspect of the writing process is highlighted arch in the Teaching of English, 23 (2), (Caporossi & Leblay, 2011; Leblay & Capo- 181-207. rossi, 2014). One of its strength is that it Kirk, A. (2012). Data visualization: a suc- clearly shows the temporal and chronologi- cessful design process [electronic cal relationships between operations, facilita- book]. Packt Pub. ting their identification in a structured way. Kollberg, P. (1996). Rules for the S-notation: Another advantage of this visualization of a computer-based method for repre- the writing process is that it “can handle senting revisions. Stockholm, Sweden: moving text positions” (Southavilay, et al., IPLab, Royal Institute of Technology 2013). (KTH). Latif, M. M. (2008). A State-of-the-Art Re- Topics: Visual and Multisensory Representat- view of the Real-Time Computer- ions of Past and Present Aided Study of the Writing Process. Keywords: classification, keystroke logging, International Journal of English Stu- digital colored visualization, textgenetics, dies, 8 (1), 29-50. time-oriented production Leijten, M., & Van Waes, L. (2013). Keystroke Logging in Writing Rese- Bibliography arch: Using Inputlog to Analyze and Aigner, W., Miksch, S., Schumann, H., & Visualize Writing Processes. 30 (3), Tominski, C. (2011). Visualization of 358-392. Time-Oriented Data. Human- Leblay, C. & Caporossi, G. (2014). Temps Computer Interaction Series. London: de l'écriture: enregistrements et re- Springer. présentations. Louvain-la-Neuve: Aca- Alamargot, D., Caporossi, G., Chesnet, D., demia. & Ros, C. (2011). What makes a skilled Manyika, J., Chui, M., Brown, B., Bughin, J., writer? Working memory and audience Dobbs, R., Roxburgh, C., et al. (2011).

Big data: the next frontier for innovation, competition, and productivity. McKinsey Global Institute.
Minelli, M., Chambers, M., & Dhiraj, A. (2013). Big data, big analytics: Emerging business intelligence and analytic trends for today's businesses. Wiley Publishing.
Southavilay, V., Yacef, K., Reimann, P., & Calvo, R. A. (2013). Analysis of Collaborative Writing Processes Using Revision Maps and Probabilistic Topic Models. Proceedings of the Third International Conference on Learning Analytics and Knowledge, 38-47.
Stromqvist, S., Holmqvist, K., Johansson, V., Karlsson, H., & Wengelin, A. (2006). What Keystroke Logging can Reveal about Writing. In K. P. Lindgren (Ed.), Computer Keystroke Logging and Writing. Elsevier, 45-71.
Van Waes, L., & Schellens, P. J. (2003). Writing Profiles: The Effect of the Writing Mode on Pausing and Revision Patterns of Experienced Writers. Journal of Pragmatics, 35, 829-853.
Wengelin, A., Torrance, M., Holmqvist, K., Simpson, S., Galbraith, D., Johansson, V., et al. (2009). Combined eyetracking and keystroke-logging methods for studying cognitive processes in text production. Behavior Research Methods, 41 (2), 337-351.
Yau, N. (2011). Visualize this: the flowing data guide to design, visualization and statistics. Indianapolis: Wiley Publishing.

Word Spotting as a Tool for Scribal Attribution

Lasse Mårtensson
University of Gävle, Sweden
Anders Hast
Uppsala University, Sweden
Alicia Fornes
Autonomous University of Barcelona, Spain

Word Spotting is a set of methods for localizing word forms in handwritten text. The project group behind the current abstract has previously used it on medieval Swedish manuscripts, namely on Cod. Ups C 64 (Latin) and Cod. Ups. C 61 (Old Swedish), see Wahlberg et al. (2011) and Wahlberg et al. (2014). The most common usage for Word Spotting is to extract words for different purposes, for instance for linguistic investigations. From a technical perspective, there are several different variants of Word Spotting, but in most cases the searching process is built up on a template of the word form in question being chosen, and then the computer identifies graph sequences in the manuscript, charter etc. that are similar to the template. For further details on the technical aspects of the method, see below.

In the present investigation, the Word Spotting method is used for another purpose, namely scribal attribution, i.e. identifying individual scribes. Our material is the medieval Swedish charter corpus in its entirety, as far as they have been photographed (more than 10 000 charters). These are preserved at Svenskt diplomatarium, Riksarkivet. As stated above, the basic concept of the Word Spotting method is that a word template is chosen as a point of reference, from which the other similar word forms are identified. From a linguistic perspective, the template consists in a graph sequence, as such unique and produced by a certain scribe at a certain time. This means that the template contains some characteristics of the scribe that produced it. For our purpose, the template is not used for identifying all the word forms in the corpus that the template represents, but for identifying the instances when the word forms (and individual letters; see below) have been executed in a way similar to the template.

For the purpose of scribal attribution, not only graph sequences (in this case word forms) are of interest, but also individual graphs (letters). The shape of letters has for a long time been considered as a key issue for scribal attribution in the palaeographic research. One could mention Per-Axel Wiktorsson’s work in four volumes, Sveriges medeltida skrivare (2015), where Wiktorsson identifies the scribes mainly on the basis of

the shape of seven letters: ‘g’, ‘w-’, ‘æ’, ‘ø’, ‘y’, ‘n’, ‘k’ and ‘h’ (p. 27). We have therefore focused on the identification of specific letters, and especially those consisting of several components, with a more complicated ductus, more specifically those used by Wiktorsson. These are, of course, more likely to show individual traits than simpler formations such as ‘i’, ‘o’ etc. In our investigations thus far, we have made searches for ‘g’, ‘æ’ and ‘k’, and we will continue with the other ones listed by Wiktorsson.

From a technical perspective, the search for individual letters poses a greater challenge than the search for sequences, as the number of measuring points for the former is much smaller. Thus, a great deal of time has been put into optimizing the technical aspects of the method. The current state of the art in HTR (Handwritten Text Recognition; Llados et al. 2012) can be divided into at least two categories: 1) Segmentation techniques (Rath et al. 2007) need to segment the documents into text lines or even into words. Therefore, the performance of these techniques highly depends on the accuracy of the line or word segmentation algorithms. To this approach belong the above mentioned Wahlberg et al. (2011) and Wahlberg et al. (2014). 2) Segmentation-free approaches (Leydier et al. 2009) divide the manuscript into zones, or cells. Our approach belongs to this category and we use a so-called sliding window to match the template with the content of the window, in this case the handwritten document being investigated. The unique quality of our approach is that we can perform what has been done for a long period of time in the area known as image registration. In image registration, template images are matched to find identical images. In our case, dealing with handwritten text, this must be done in a different way, since the template and the word within the sliding window are not identical: all graphs are unique and always display some incidental variation, however small. Therefore, the algorithm must be much more relaxed than in the case of ordinary image registration, i.e. allowing for variance (without losing accuracy). The current method can be used for searching for both words and graphs, and even for parts of graphs.

The fact that there are matches in the Word Spotting and the Letter Spotting process does not automatically lead to the conclusion that the letters have been produced by the same scribe. Instead the matches should be seen as suggestions, to be further evaluated by a human researcher. The matches represent graph sequences that display similarities with the template regarding the measuring points. If for instance matches are found regarding ‘g’ in certain documents, one would also expect matches in the same documents regarding other letters. This is, however, not always the case, and thus one must evaluate the results of the searches with great care.

One great difficulty when dealing with scribal attribution in medieval documents is the absence of ground truth. It is very rare that we know who actually held the pen in these documents, and when the scribes are known, they are in most cases known through earlier attributions. When working with new methods for scribal attribution, it is not satisfactory to rely on previous attributions only. If one used previous attributions to evaluate the methods, one would risk going in circles, forming the new methods on the previous work. For that purpose, we have established a set of charters where the scribes have been identified on external evidence, i.e. not through attributions on palaeographic grounds etc. Most important are the charters containing a notice from a recording clerk, stating that this person has written the document in his own hand (see Wiktorsson 2015: 28). These charters function as our point of reference in the searches in the corpus.

This investigation is a part of an ongoing project, called “New Eyes on the Scribes of medieval Sweden” (Riksbankens jubileumsfond). The aim of this project is to investigate and map the characteristics of the script and the scribes in the medieval Swedish charters. Within this project, we use several methods, each aiming at measuring certain features of the script (see e.g.

Mårtensson et al. 2015). Hence, the current Word Spotting investigation should not be seen as one isolated attempt at solving the issue of scribal attribution, but as a part of a large scale mapping of script features. The purpose of this project is not to find one single method that will work as an automatic tool for scribal attribution. It is through the collected evidence of several methods for measuring script features that a new mapping of the medieval scribes will be achieved.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: word spotting, scribal attribution, palaeography, image analysis

Bibliography
Leydier, Y., A. Ouji, F. LeBourgeois, and H. Emptoz (2009). Towards an omnilingual word retrieval system for ancient manuscripts. In: Pattern Recognition, vol. 42, no. 9, pp. 2089–2105, 2009.
Llados, J., M. Rusinol, A. Fornes, D. Fernandez, and A. Dutta (2012). On the influence of word representations for handwritten word spotting in historical documents. In: International Journal of Pattern Recognition and Artificial Intelligence, vol. 26, no. 05, 2012.
Mårtensson, L., F. Wahlberg and A. Brun (2015). Digital Palaeography and the Old Swedish Script. The Quill Feature Method as a Tool for Scribal Attribution. In: Arkiv för nordisk filologi 130/2015.
Rath, T., and R. Manmatha (2007). Word spotting for historical documents. In: IJDAR, pp. 139–152, 2007.
Wahlberg, F., M. Dahllöf, L. Mårtensson and A. Brun (2011). Data Mining Medieval Documents by Word Spotting. In: Proceedings of the 2011 Workshop on Historical Document Imaging and Processing.
Wahlberg, F., M. Dahllöf, L. Mårtensson and A. Brun (2014). Spotting Words in Medieval Manuscripts. In: Studia Neophilologica 86/2014.
Wiktorsson, P.-A. (2015). Skrivare i det medeltida Sverige. Vol. 1. Skara: Skara stifthistoriska sällskap.

Text Mining the History of Information Politics Through Thousands of Swedish Governmental Official Reports

Fredrik Norén
Roger Mähler
Umeå University, Sweden

Why did “information”, a concept and a keyword that we take for granted in our modern vocabulary, infiltrate the official language in the twentieth century? In this presentation, I will show how the rise of the governmental information discourse, in the 1960s and the 1970s, can be understood within a larger theme of “development”, and how the concept of information became regarded as a silver bullet for the bureaucratic apparatus to tackle problems in society. This is done by topic modeling (with LDA/MALLET) the corpora of Swedish Governmental Official Reports (8 000 reports, 1922–), in particular by examining co-occurring topics, a less common approach within the practical use of probabilistic topic modeling. The scope and the long time-span of the report series make it an internationally unique source, especially when it comes to studying the emergence of interests and attitudes of a single state through time and, furthermore, to viewing it as the “voice” of the Swedish state. Today, when incalculable amounts of texts, like the report collection, are not only available online but also searchable, down to each single word, this presentation emphasizes the need for the humanities to accept the challenge of potentially rewriting parts of history. That is, how changes of language, in millions of documents, can be linked to, and create new understanding for, developments in society.

In collaboration with Humlab, the digital humanities hub at Umeå University, and

software developer Roger Mähler, a method was developed that utilized the output data from MALLET and was expected to do three things: 1) give insight into the number and diversity of reports that the information topic occurred in through time, 2) discover and visualize larger clusters of topics and give the information topic a position, and hence a context, in those clusters, and 3) enrich the analysis by combining distant and close readings.

The Swedish Governmental Official Report series is available for public access at The Riksdag's open data website (data.riksdagen.se), and part-of-speech tagged versions are available at Språkbanken (spraakbanken.gu.se) as downloadable XML-files. This study extracted the word stem of all nouns from the reports of the 1960s, 1970s and 1980s, which is sufficient for the study of themes in a text, and each report was split into chunks of 1 000 nouns each. MALLET was then used to compute three distinct LDA topic models, one for each decade, and each consisting of 500 topics. A manual review of the generated topics showed that each decade had been assigned a distinctive topic of a general information discourse. The computed average topic weights for each report, based on weights in each chunk, were used to visualize a network of all reports, and their most dominant topics, with weaker report-to-topic links filtered out based on a configurable threshold. Our pre-study showed that LDA topic modeling often computes a very dominant topic with a weight of over 60 %, while the weight of the following topics dropped significantly. Hence, a generous threshold of 0.01 proved to capture both a discursive core and its periphery.

The method showed three things. Firstly, a distinctive information discourse evolved over time in the official language in the state bureaucracy. By highlighting reports in which the information topic was dominant, it was clear that the number of co-occurring reports increased fourfold from the 1960s to the 1980s.

Secondly, the three datasets, one for each decade, were imported, separately, to Gephi. Each graph was modulated by the Closeness-Centrality, Force Atlas and Modularity Class algorithms to sort topics and reports into larger thematic clusters, that is meta-topics. In order to classify a meta-topic, every topic and report that belonged to a cluster was manually examined as a way to identify the common theme of the topic cluster. For example, one interesting result, which was found in the pre-study of the network graph of the 1970s, situated the information topic within a cluster that had been labeled as a theme of development (of various political issues).

Thirdly, by adding close reading to the analysis, the actual reports in the cluster of “development” presented concrete insights and illustrations of how to understand and synthesize the connection between “information” and “development” and how “information” became regarded as a universal tool for handling problems and challenges in society. Hence, the dynamic interaction between closeness and distance helped to strengthen both perspectives and enrich the end result.

Topics: Nordic Textual Resources and Practices
Keywords: topic modeling, mallet, swedish governmental official reports, gephi

Teaching and Learning the Mindset of the Digital Historian and More: Scaffolding Students’ Critical Skills in the Digital Humanities

Thomas Nygren
Uppsala University, Sweden

Katherine Hayles (2012, p. 21) notes how “[y]oung people practice hyper reading extensively when they read on the web, but they frequently do not use it in rigorous and disciplined ways.” This is an important observation but what does this mean in practice? What does it mean to read in “rigorous and disciplined ways”, online and offline, and how can we support students’ habits of mind when they read and interpret sources and information from and about the past? In this paper I will present some empirical studies to highlight challenges and opportunities to support students’ success in navigating in the digital world of humanities.

Going to the sources and making sense of fragments from the past in archives, digital and analogue, requires historians to become experts in sourcing, corroborating and contextualizing the sources, and more. To better understand how experts and novices read historical documents we designed an eye-tracking study to track reading differences between historians and students when they read historical documents from the time of the French revolution (Mulvey & Nygren, work in progress). We tracked eye movements of four historians, four university students, and eight high school students reading with an infrared non-intrusive eye-tracking camera. The stimuli contained four historical sources with information regarding human rights at the time of the French revolution. Sources were selected based upon their usefulness to test historical thinking (Wineburg, 1991); this means that they should be primary sources with important source information and hold valuable information for answering a complex historical question, challenging the reader to source, corroborate and contextualize the information.

The material participants in this study read included excerpts from four primary sources, namely (1) Declaration of the Rights of Man and Citizen, 26 August 1789, Paris, France; (2) The Declaration of the Rights of Woman by Marie Gouze [published under the pseudonym Olympe de Gouges], September 1791, Paris, France; (3) A particular account of the insurrection of the Negroes of St. Domingo, 1791, Paris, France; and (4) Haitian Declaration of Independence, January 1804, Gonaives, Haiti. In total there were 10 pages, 1828 words, for participants to read. The preliminary findings show that historians seem to focus more on sourcing and central historical aspects of documents, whereas high school students focus more on dramatic events and racist language. Guided by a central historical question in a directed reading, these patterns change, especially for university students, but also for high school students, who now read documents more in line with historians’ initial reading strategies. Our findings shed new light on how experts’ and novices’ historical literacy differs, and how these differences may relate to their abilities to critically scrutinize sources, corroborate evidence, and understand central aspects of historical events. Participants’ scores in the post-tests correlate to some extent with participants’ reading focus, indicating how a more professional focus may be scaffolded by a historical question in ways that can help students pay more attention to source information.

However, digital historians do not only closely scrutinize documents, they also use archives with large sets of data. In my talk I will also present some indications from quasi-experimental studies showing that novice users of Swedish digital archives may lose their awareness of their theoretical position and empathy when using large data sets (Nygren 2014). When facing a large set of data and statistics it may be an instinctive reaction to start sorting the information and quantifying, rather than reflecting on the starting point of your research and critical perspectives, thus conducting a more interpretive investigation. Close reading of a few documents may perhaps make it easier to hermeneutically scrutinize the information. However, digital tools and material can also be used to closely analyze how smaller fragments fit into a bigger picture, but this takes a critical awareness of the materiality and a focus on the research question, which both students and researchers need to bear in mind when “going” to the archives.

Evidently it is possible for students to navigate digital databases and learn history in new ways using affordable technological resources. But this needs to be supported by hard and soft scaffolding (Brush & Saye, 2002). Hard scaffolds built into the databases can make databases more useful for others than just historians with expertise of the

digital architecture. Soft scaffolds designed by teachers and historians can make it possible for students to use Swedish digital databases designed for professional historians (Nygren & Vikström, 2013; Nygren, Sandberg & Vikström, 2014). In previous studies we find that students can learn to walk in the shoes of the digital historian and use primary digitized sources in constructive ways, but they often stumble when it comes to contextualizing the information. A central aspect here is the challenge of historical empathy. Understanding people in the past by their own standards means that we need to contextualize the information and try to shift perspectives. There is a challenge to understand the past as a “foreign country,” a place where language and concepts as well as context differ in fundamental ways from our contemporary world (Lowenthal, 1985). This cognitive and emotional ability to understand unfamiliar perspectives across time and space, often labeled historical empathy, is a central but also complicated matter in historical studies (Davis et al 2001). Closeness and distance are vital in our understanding of the past and digital tools may help us see things in a larger perspective and also zoom in on selected parts in time and space. Closeness to primary sources and the environments studied may be a way to overcome temporal and spatial gaps of understanding. Materiality may certainly affect our construction of knowledge (Latour 1999) and researchers and students in digital history may benefit from physical reminders of the complexity of the fragmentary remains behind neat data. Mixing the digital with tangible materials may be an important scaffold to consider.

Digital tools can also be used to complement printed material, in ways that may help students and historians overcome the unreachability of the past so evident when going to the archives (Robinson 2010). Using visualizations makes it possible to organize the information in new ways, for instance linking it to geographical locations and on temporal scales. Digital tools can be used to collect different types of data and to explore relationships in time and space beyond what may be possible in more traditional explorations using pen and paper (Nygren, Frank, Bauch & Steiner, 2016). In digital history the writing of history may be more than creating text. With digital platforms, readers/users can create and present multiple interlinked narratives; integrate images, maps, commentary, and primary sources in the same field of vision; and curate and shape the reader/user’s experience, allowing for a hybrid experience (cf Thomas III & Ayers 2003). But as a final presentation, visualizations need to help the reader/user see behind the seductive cleanliness of data presentations and animations. It is also important to bear in mind how multiple narratives and multimodal presentations may confuse the reader/user rather than give a richer understanding of the topic. There may be a risk of cognitive overload for readers in hypertext environments (cf Gerjets & Scheiter 2003). All users need to learn to become critical readers and navigators in multimodal environments, and digital historians and students need to understand the audience in somewhat new ways when a digital tool becomes a publishing tool (Hayles, 2012). Students and historians also need to be able to review scholarship in new media, if we want to make use of new opportunities and safeguard quality in historical scholarship and prepare students for a future in academia and beyond (Presner 2010; Nygren, Foka & Buckland 2014).

Last but not least we need to consider the uses of history in contemporary digital media. In an ongoing non-intrusive study of teachers contrasting contemporary uses of history with primary sources from the era of the civil rights movement, we observe that students rarely critically scrutinize contemporary uses of the ideas of Martin Luther King Jr. when misinterpretations are underlined in the media (Nygren & Johnsrud, in review). For students, critically examining evidence seems to be difficult when authorities and media augment oversimplified and popular perceptions of the past. However, we also find that students can learn a more nuanced perspective on the life and deeds of MLK, findings still observable a year after the

initial teaching took place. The results from this study show important potentials and limitations when trying to stimulate the learning of core content, critical thinking and values of social justice using primary historical sources and contemporary media representations that attempt to make the past historical and practical. Connecting the past to the present and critically scrutinizing contemporary media is asking more from students than what we ask from historians. And historians actually do not seem to be very good at critically scrutinizing online information (Wineburg & McGrew, 2016). We need to better understand this challenge and how to deal with this in schools and academia in a digital age.

Scaffolding students to read and write with the affordances offered by the digital humanities is certainly a tall order for teachers. To make this challenge a bit less complex I suggest a focus on a few central mindsets, namely, skills to criticize, empathize and create. Having the skill to criticize means that students, and scholars, in the digital humanities need to be able to: critically examine and corroborate various types of sources (such as text, image, and audio), critically read between the lines, read close and distant, critically explore and experiment with various digital tools, understand different critical and ethical perspectives, and formulate critical questions. This critical mindset is a central part of being a rigorous reader in the digital humanities. A skillful reader is also able to empathize with multiple perspectives. This central aspect of the humanities involves classic challenges to understand: historiography, different human perspectives, not least the mind of the author, the reader, the creator and ordinary people in foreign cultures and countries. Today this means not least understanding human existence and making in analog and digital worlds. This means that students must learn to contextualize the information and empathize in cognitive and emotional ways with distant worldviews. Last but not least students in digital humanities need to create accounts to process and communicate their thoughts in nuanced ways. Writing is still a central skill. But in the digital humanities there are now opportunities to think, make, enact and experiment in a diversity of forms and in collaborations. Some ideas might certainly benefit from being treated and presented in non-textual ways. But how do we stimulate all this in practice? The answer today is that we do not really know, and we need more empirical studies to connect our theoretical understanding to the learning of students.

It is time to move beyond anecdotal evidence about how to support learning in the digital humanities. The research presented here provides some small insights into the potentials and pitfalls of teaching and learning critical mindsets useful in the digital humanities. This research highlights how it is possible for at least some students to learn to read like historians, navigate digital archives and deconstruct contemporary media myths about the past. But our research also highlights the complexity of teaching and learning in the digital humanities, how little we know, and how important it is to support students’ humanistic habits of mind.

Topics: Nordic Textual Resources and Practices
Keywords: teaching, learning, history, archives, skills

References
Brush, T. A., & Saye, J. W. (2002). A summary of research exploring hard and soft scaffolding for teachers and students using a multimedia supported learning environment. The Journal of Interactive Online Learning, 1(2), 1-12.
Davis, O. L., Yeager, E. A., & Foster, S. J. (Eds.). (2001). Historical empathy and perspective taking in the social studies. Rowman & Littlefield.
Gerjets, P., & Scheiter, K. (2003). Goal configurations and processing strategies as moderators between instructional design and cognitive load: Evidence from hypertext-based instruction. Educational psychologist, 38(1), 33-41.
Hayles, N. K. (2012). How we think: Digital media and contemporary technogenesis. University of Chicago Press.

Latour, B. (1999). Pandora's Hope: Essays on the Reality of Science Studies. Cambridge, Mass.: Harvard University Press.
Lowenthal, D. (1985). The Past is a Foreign Country. Cambridge: Cambridge University Press.
Nygren, T. (2014). Students Writing History Using Traditional and Digital Archives. Human IT 12 (3): 78–116.
Nygren, T., Foka, A., & Buckland, P. I. (2014). The status quo of digital humanities in Sweden: past, present and future of digital history. H-Soz-Kult.
Nygren, T., Frank, Z., Bauch, N. & Steiner, E. (2016) Connecting with the Past: Opportunities and Challenges in Digital History, in Research Methods for Creating and Curating Data in the Digital Humanities, eds. M. Hayler & G. Griffin, Edinburgh University Press, 2016, 62–86.
Nygren, T., & Vikström, L. (2013). Treading old paths in new ways: upper secondary students using a digital tool of the professional historian. Education Sciences, 3(1), 50-73.
Nygren, T., Sandberg, K., & Vikström, L. (2014). Digitala primärkällor i historieundervisningen: en utmaning för elevers historiska tänkande och historiska empati. Nordidactica: Journal of Humanities and Social Science Education, (2), 208-245. [Digital primary sources in history education: A challenge for students’ historical thinking and historical empathy]
Presner, T. (2010). Digital Humanities 2.0: a report on knowledge. Connexions Project, 2010.
Robinson, E. (2010). Touching the Void: Affective History and the Impossible, Rethinking History: The Journal of Theory and Practice, 14:4, 503-520.
Thomas III, W. G., & Ayers, E. L. (2003). The differences slavery made: A close analysis of two American communities.
Wineburg, S. & McGrew, S. (2016) Why Students Can't Google Their Way to the Truth, Education Week, 36(11), 22, 28

New Multi-language Digitised Newspapers and Journals from Finland Available as Data Exports for Nordic Researchers

Tuula Pääkkönen
Jukka Kervinen
National Library of Finland

To respond to the needs of researchers, especially those in digital humanities, we have created specific data export packages from the digitised materials (Pääkkönen, Kervinen, Nivala, Kettunen, & Mäkelä, 2016), which currently span until 1910. We have developed a custom XML format for the export packages, which contains the post-processing results from the digitisation. There is one XML file per page of a newspaper, which contains three pieces of information: the metadata, ALTO XML (Technical Metadata for Layout and Text Objects standard) and the textual content of a digitised page. These three parts make the information developed within the library available for many kinds of research opportunities, from the bibliographic metadata to the content analysis. Also, within the custom XML file the simple raw text format gives additional possibilities for the researchers to focus on their research questions. We hope that by opening both the metadata and the content, it is possible to create collaborations between the library and researchers on tools and method development for the materials, for example optical character recognition (OCR) and post-correction fixes. The material is as-is in the export packages, so it has a varying number of OCR errors; the OCR quality ranges 57-76% for the 19th century data (Kettunen & Pääkkönen, 2016), for example. Potential OCR and metadata issues are something that the researcher needs to be aware of when taking the material set into use. However, they are also an opportunity to work together to improve the data content for everybody. The library can play a central role by connecting different researchers who face the same issues with the raw content and the

94 library can benefit by being able to utilize ing metadata or data, which has now re- the research results for example in improv- vealed itself. Therefore, the material as re- ing the quality of the materials onwards search data requires awareness of the limita- (Pääkkönen & Kervinen, 2016). tions by the researchers, even though our Digitisation and export packages can also attempt is to offer as authentic material as it be seen as technical infrastructure for the is got after the digitisation post-processing. research data creation, but thinking the de- The first version of the export packages tails of the technical requirements is not contain material until 1910, but our internal enough. Therefore, we have also taken steps tools make it possible to generate the export to analyze the legal aspects of opening the packages to the newer material when the data, namely the incoming General Data need arises and new contract models are in Protection Regulation and the copyright di- place. The near proximity of the tools to the rective proposal (DSM-directive) of the EU digitisation chain offers benefits as digitisa- (European Commission, 2016), because with tion progresses, new export packages can be the versatile material of the newspapers we created with a cost-effective way. In this pa- need think both the people appearing in the per, we will tell how the export packages of content and the original authors view. With the material were created, where you can get these two viewpoints, and preparing to the them at the moment and which constraints incoming changes, we can start responding the materials have. The interesting material to the new requirements regardless of the of the digitised collections of Finland con- way how the material is provided to the us- tain material for example in Finnish, Swe- ers. 
Together with the Finnish Copyright dish, Russian, German and Sami making the Society (Kopiosto) for the newspapers and language-base possibly interesting for Nor- journals and couple of media houses we dic collaboration. For example, based on have created a process and a tool within the the feedback and information queries in our presentation system to manage copyright presentation system at digi.kansalliskirjasto.fi redaction requests. As a brief overview to there is steady flow of visitors from Sweden, this tool, the tool allows National Library of who are interested on the various news, big Finland to redact specific part of the digit- and small, which have appeared in both ised contents, based on the request of the sides of the border. So far, the export pack- right holder, while still allowing the rest of ages have been delivered to a Comic re- the material stay intact. After the redaction, search project of Academy of Finland and to that particular section of the digitised mate- the few researchers of Helsinki Centre for rial can only be read in the legal deposit li- Digital Humanities (http:// heldig.fi) We braries while it stays redacted in the public will also tell some aspects of earlier digital internet or in locations, where the accessibil- humanities projects, which have been im- ity to the materials has been extended by portant in developing collections (Kettunen, contracts (Karppinen, Kaukonen, 2016), features to the presentation system Pääkkönen, & Sorjonen, 2016). On the oth- and start of collaboration with researchers, er hand, for the researcher use, there are from who we have got feedback via initial plans to enhance the user management of user queries or via direct contacts. Besides the presentation system further, so that we the export packages, the new processes, and can offer materials via it to the researchers recently created contract models make it with whom we have agreements in place. 
possible to open up materials for the re- With help of the preparation to the incom- search use, thus enabling us to implement ing changes to the regulations and new the openness and digital humanities policies technical features, there are opportunities of the National Library of Finland in the fu- for new collaborations even across borders. ture (National Library of Finland, 2016). However there are limitations in the re- search data, as via processing them further we have also noticed malformatted or miss-
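The three-part page file described in this abstract (metadata, ALTO XML and raw text) could be read along the following lines. This is only a minimal sketch: the wrapper element names `page`, `metadata` and `text`, the sample content, and the use of the ALTO version 2 namespace are assumptions made for illustration, since the abstract does not specify the export schema. Only the ALTO `String` element with its `CONTENT` and `WC` (word confidence) attributes comes from the ALTO standard itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical page file: the wrapper elements and values are invented for
# illustration; the real export schema may differ.
SAMPLE_PAGE = """<page>
  <metadata>
    <title>Example newspaper</title>
    <date>1905-03-14</date>
    <language>fi</language>
  </metadata>
  <alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
    <Layout><Page><PrintSpace><TextBlock>
      <TextLine>
        <String CONTENT="Helsinki" WC="0.91"/>
        <SP/>
        <String CONTENT="14" WC="0.48"/>
      </TextLine>
    </TextBlock></PrintSpace></Page></Layout>
  </alto>
  <text>Helsinki 14</text>
</page>"""

# Assumed ALTO version; adjust the namespace to match the actual files.
ALTO_NS = {"alto": "http://www.loc.gov/standards/alto/ns-v2#"}


def read_page(xml_text):
    """Split one page file into metadata, OCR words and raw text."""
    root = ET.fromstring(xml_text)
    # Bibliographic metadata as a flat dictionary.
    metadata = {
        child.tag: (child.text or "").strip() for child in root.find("metadata")
    }
    # Each ALTO String is one recognised word, optionally with a confidence.
    words = [
        (s.get("CONTENT"), float(s.get("WC", "nan")))
        for s in root.iterfind(".//alto:String", ALTO_NS)
    ]
    raw_text = (root.findtext("text") or "").strip()
    return metadata, words, raw_text


def low_confidence_words(words, threshold):
    """Return words whose recorded confidence is below the threshold."""
    return [content for content, confidence in words if confidence < threshold]


metadata, words, raw_text = read_page(SAMPLE_PAGE)
print(metadata["date"], raw_text, low_confidence_words(words, 0.5))
```

Keeping the three parts separate mirrors the research uses named in the abstract: the metadata supports bibliographic work, the raw text supports content analysis, and the ALTO layer is where OCR post-correction would start.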

Acknowledgments
This work was funded by the EU commission through its European Regional Development Fund and the program Leverage from the EU 2014-2020.

Topics: Nordic Textual Resources and Practices
Keywords: newspapers, digital resources, accessibility, research use

References
European Commission. (2016). Proposal for a Directive of the European Parliament and of the Council on copyright in the Digital Single Market. Retrieved 22 September 2016, from https://ec.europa.eu/digital-single-market/en/news/proposal-directive-european-parliament-and-council-copyright-digital-single-market
Hölttä, T. (2016). Digitoitujen kulttuuriperintöaineistojen tutkimuskäyttö ja tutkijat. Retrieved from http://urn.fi/URN:NBN:fi:uta-201603171337
Karppinen, P., Kaukonen, M., Pääkkönen, T., & Sorjonen, M. (2016). Contracts Enabling Collaboration of The National Library of Finland with Media Houses in Electronic Deposit. Presented at the IFLA World Library and Information Congress, Columbus, Ohio, United States.
Kettunen, K., & Pääkkönen, T. (2016). Measuring lexical quality of a historical Finnish newspaper collection – analysis of garbled OCR data with basic language technology tools and means. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016). Retrieved from https://www.researchgate.net/profile/Kimmo_Kettunen/publication/299515022_Measuring_Lexical_Quality_of_a_Historical_Finnish_Newspaper_Collection_-_Analysis_of_Garbled_OCR_Data_with_Basic_Language_Technology_Tools_and_Means/links/56fd194208aeb723f15d61be.pdf
National Library of Finland. (2016). Duties and strategy [Text]. Retrieved 17 May 2016, from https://www.kansalliskirjasto.fi/en/duties-and-strategy
Pääkkönen, T., & Kervinen, J. (2016). Historiallisten digitoitujen sanoma- ja aikakauslehtien avaaminen avoimena datana tutkijoille. Informaatiotutkimus; Vol 35, Nro 3 (2016): Informaatiotutkimuksen Päivät 2016. Retrieved from http://ojs.tsv.fi/index.php/inf/article/view/59442/20626
Pääkkönen, T., Kervinen, J., Nivala, A., Kettunen, K., & Mäkelä, E. (2016). Exporting Finnish Digitized Historical Newspaper Contents for Offline Use. D-Lib Magazine, 22(7/8). https://doi.org/10.1045/july2016-paakkonen

Exploring User Engagement in Crowdsourcing Folk Traditions

Sanita Reinsone
University of Latvia

The mass digitizing of the holdings of tradition archives over the last few decades has introduced a significant number of digitized cultural artefacts to the wider public. As information technology has developed, knowledge of how tradition archive materials could and should be digitally maintained has advanced as well. Folklore archivists have developed and sought digital platforms appropriate for their specific collections and suitable solutions for virtual representation, further processing, and (re-)use of digitized data, not only to ensure long-term preservation of the cultural artefacts and to create new access routes to collections, but also to remain close to contributors and to continue the archiving of new vernacular knowledge.

In recent years, projects that provide tools and invite volunteers to transform digital content from one format into another have become one of the most widespread phenomena among Digital Humanities research and cultural heritage institutions. One such tool is transcription, which is still one of the most common forms of crowdsourcing in digital humanities. The benefits of such collaborative activities for tradition archives are apparent, as most historical tradition archives consist of vast quantities of handwritten or typescript text collections that cannot be converted into digital text automatically. By turning manuscript images into digital texts, documented knowledge becomes accessible not only through the metadata created by the archive system but also through its content.

Although the number of crowdsourcing projects is increasing substantially in the digital humanities and cultural heritage fields, tradition archives have not yet been as eager to carry out transcription crowdsourcing projects. However, the experiences of those few tradition archives that have managed to launch crowdsourcing campaigns for folklore manuscript transcription, such as the Irish National Folklore Collection (University College Dublin) and the Archives of Latvian Folklore (Institute of Literature, Folklore and Art, University of Latvia), are impressively positive.

This paper will provide an in-depth analysis of the crowdsourcing campaign “Valodas talka”7 carried out by the Archives of Latvian Folklore (ALF) in cooperation with the UNESCO Latvian National Commission in 2016. Targeted at a school audience and lasting for 71 days, the campaign yielded almost 15,000 transcribed manuscript pages. More than 1,500 participants from 120 educational institutions took part.8

The Talka website randomly displays manuscript pages in a slider carousel, providing users with an easy way of finding the most suitable files for transcription. In addition, simple game elements were included to provide additional incentives for younger-generation users to engage: each transcribed character earned the user one Talka point. Users could collect points individually and/or collectively for their schools if they indicated which school they represented. The Top 10 individual users and Top 10 schools were displayed on the front page. The competition feature served to encourage schoolchildren to compete for the most points and to contribute to their school’s position by various means, e.g. by involving friends and family members.

Despite all the positive engagement through crowdsourcing, some members did submit inappropriate responses in order to falsely inflate their scores. However, each of the transcribed pages was carefully proofread by the editors, corrected, and accepted, or deleted if the submitted transcription proved to be fraudulent.

A crowdsourcing initiative based on gamification and competition, while producing high youth involvement, should be regarded with caution, because it can have negative consequences as well. Compared to other kinds of campaigns9, participants are motivated more by the intensity of the exercise and by competing than by the idea of volunteering as such. Because of this, they may be less inclined to be careful contributors, leading to more work for editors in the end. Besides, a competition tends to create an inner tension in the young volunteer community, as members observe each other’s activity, examine each other’s contributions, and are sensitive towards the review process of both their own contributions and those of others, which creates an additional communication load for curators.
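The point scheme described above (one Talka point per transcribed character, with individual and school rankings and editorial review) can be sketched as follows. This is an illustrative sketch, not the campaign's actual implementation: the function names, the tuple layout of a submission, the choice to count every character including spaces, and the rule that rejected pages earn no points are all assumptions.

```python
from collections import Counter


def score_submissions(submissions):
    """Tally points per user and per school.

    Each submission is a hypothetical tuple:
    (user, school or None, transcribed_text, accepted_by_editor).
    """
    user_points = Counter()
    school_points = Counter()
    for user, school, text, accepted in submissions:
        if not accepted:
            # Assumption: a page rejected by the editors earns nothing.
            continue
        points = len(text)  # assumption: every character counts as one point
        user_points[user] += points
        if school:
            # Points also count for the school the user said they represent.
            school_points[school] += points
    return user_points, school_points


def top(points, n=10):
    """Return the n highest scorers, as shown on a front-page ranking."""
    return points.most_common(n)


submissions = [
    ("anna", "School A", "abc", True),
    ("ben", "School A", "de", True),
    ("cid", None, "xxxxxxxx", False),  # rejected as fraudulent
    ("dan", None, "z", True),
]
user_points, school_points = score_submissions(submissions)
print(top(user_points), top(school_points))
```

Scoring only after editorial acceptance is one way to blunt the score inflation mentioned in the abstract, at the cost of delaying feedback to participants.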

7 The title references ethnographical collective work in the fields and can be translated as ‘collaborative work for language’. Campaign’s website: http://talka.garamantas.lv
8 The average school participant was 12 to 18 years old. Teachers had a significant mediating role in attracting and encouraging students to get involved.
9 The comparison is made with the second crowdsourcing campaign of the ALF (http://lv100.garamantas.lv/en), which is based on the pure concept of volunteering with no competition.

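User group No. / Level of involvement | Number of users | Percentage of users | Division of groups by contribution (transcribed characters) | Contribution (%) | Contribution (transcribed characters)
--------------------------------------|-----------------|---------------------|--------------------------------------------------------------|------------------|--------------------------------------
1 / Low                               | 851             | 55.0%               | ≤ 1 000                                                      | 3.6%             | 394 430
2 / Some                              | 582             | 37.6%               | 1 001–10 000                                                 | 16%              | 1 756 388
3 / High                              | 92              | 6.0%                | 10 001–100 000                                               | 23.4%            | 2 570 067
4 / Very high                         | 21              | 1.4%                | ≥ 100 001                                                    | 57%              | 6 252 204
Total                                 | 1546            |                     |                                                              |                  | 10 973 089

Figure 1.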
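The percentage columns in Figure 1 follow directly from the user counts and character counts. As a quick check, the short sketch below recomputes them; the variable and function names are illustrative, and the figures are copied from the table.

```python
# (group, number of users, transcribed characters), copied from Figure 1.
GROUPS = [
    ("1 / Low", 851, 394_430),
    ("2 / Some", 582, 1_756_388),
    ("3 / High", 92, 2_570_067),
    ("4 / Very high", 21, 6_252_204),
]


def shares(groups):
    """Return (group, % of users, % of transcribed characters) per group."""
    total_users = sum(users for _, users, _ in groups)
    total_chars = sum(chars for _, _, chars in groups)
    return [
        (
            name,
            round(100 * users / total_users, 1),
            round(100 * chars / total_chars, 1),
        )
        for name, users, chars in groups
    ]


for name, user_share, char_share in shares(GROUPS):
    print(f"{name}: {user_share}% of users, {char_share}% of characters")
```

The recomputed shares agree with the table: the 21 users in Group 4 account for about 57% of all transcribed characters, while the 851 users in Group 1 account for about 3.6%.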

The first crowdsourcing campaign of folklore manuscript transcription provided the Archives of Latvian Folklore not only with an opportunity to significantly increase the textual corpora of the digital archives10 and to increase the number of visitors and the archives' popularity on a national scale, but also with several valuable lessons on campaign strategy, management of communication with users, and editorial workflow planning. It was one of the most intense periods of communication with mass media and users of digital archives the ALF has ever experienced. It was also the largest crowdsourcing initiative to date in the field of the intangible cultural heritage of Latvia. On average 210 manuscript pages were transcribed per day, and the highest attainment was 520 transcribed pages in one day.

User involvement analysis indicates a remarkable disproportion of engagement activity among Talka participants. A medium level of activity was demonstrated by 37.6% of users (Group 2, see Figure 1), whose contribution equaled 16% of all transcriptions, whereas the second highest involvement user group (6%, Group 3) provided 23.4% of all contributions. The lowest involvement user group, consisting of 55% of all users (Group 1), provided 3.6% of all contributions. This contrasts sharply with the very high involvement group, which included only 1.4% of all users (Group 4) but provided the most prominent results in terms of quantity, 57% of all contributions (Figure 1).

The analysis of user statistics collected during the campaign, after the campaign, and while launching a new campaign11 suggests the high importance of the group of permanent users who have already formed a virtual community of trusted contributors and experts. They not only provide a permanent flow of contributions that can easily be followed by anyone but also positively influence the overall quality of submitted contributions. This makes sense because it is customary for new users to undergo some kind of “consulting” by exploring recent records submitted by other users before they begin transcribing a manuscript themselves. If these samples are correctly done by a trusted user, the contribution stands as an example of good practice and helps to spread it further.

Topics: Nordic Textual Resources and Practices
Keywords: crowdsourcing, vernacular knowledge, tradition archives, society engagement, knowledge production

10 Website of the Digital Archives of Latvian Folklore: http://garamantas.lv/en
11 Open-ended crowdsourcing campaign for folklore manuscript transcription “Simtgades burtnieks” (The Wizards of Centenary) was launched in June 2017 by the Archives of Latvian Folklore in cooperation with Latvian Centenary Bureau and National Radio and Television. Campaign’s website: http://lv100.garamantas.lv/en

Bokhylla: A Case Study of the First Complete National Literature Database in the World

Eivind Røssaak
National Library of Norway

This paper presents a part of my Norwegian Research Council funded research on how digitization refashions a nation's memory. Digitization within the cultural heritage sector is crucial here, and I will focus on the National Library of Norway in this presentation. A key digital resource in Norway is the National Library’s digitized book collection, “Bokhylla” (The Bookshelf). Tech companies and libraries around the world have, at least since the mid-1990s, struggled to find good solutions for presenting and preserving our cultural heritage in a digital age. The challenges are institutional, legal, technological and aesthetic, and they have fostered a series of innovations within these fields. This paper will explore and present these innovations.

A key challenge when it comes to creating digital access to book repositories is legal barriers. When the National Library of Norway established its main digital resource, Bokhylla, an important part of its prehistory was a complex legal agreement with the copyright holders enabling the library to make available all books ever printed in the nation up until 2001. While Google has been in a continuous legal grey zone, Bokhylla has become a highly original and complex digital artifact.

Methodologically, my approach is inspired by a Science and Technology approach which relies on Bruno Latour’s Actor Network Theory (ANT) in particular. The ANT approach uses “decomposition” of the artifact to see how it is generated historically and by a variety of actors (human and non-human). It took ten years to create Bokhylla the way it looks today. It is a legal, technological and aesthetic artifact. Each of these three elements relies on a complex network of actors constituting what I will call “digital infrastructures”. What were the infrastructures enabling the library to construct a robust legal agreement? The labor union field of Norway will be explored. What technological premises enabled the Bokhylla solution? The file and database architecture will be explored. How was the “analogue” book, aesthetically speaking, turned into a digital artifact still resembling or signifying “literature” as we used to know it? The modes of interaction and interfaces will be explored. And finally: what sort of DH research does this construction enable? Some of the applications and research connected to Bokhylla will be presented.

Topics: Nordic Textual Resources and Practices, The Digital, the Humanities, and the Philosophies of Technology
Keywords: memory, cultural heritage, technology, new media, digital books, bokhylla, the national library, digital humanities

Bibliography
2016. Memories in Motion (co-eds. I. Blom and T. Lundemo), Amsterdam: Amsterdam University Press.
2011. Between Stillness and Motion: Film, Photography, Algorithms. Editor. Amsterdam: Amsterdam University Press.
2010. The Archive in Motion: New Conceptions of the Archive in Contemporary Thought and New Media Practices. Editor. Oslo: National Library.
2010. The Still/Moving Image: Cinema and the Arts, Saarbrucken: Lambert Academic Publishing.
2005. Selviakttakelse – en tendens i kunst og litteratur, [Observation of the Self – in the Arts] Bergen: Fagbokforlaget.
2004. Kyssing og slåssing. Fire kapitler om film, [Eros and Agon: Four Chapters on Film] Oslo: Pax (with C. Refsum).
2001. Sic. Ved litteraturens grenser, [An experimental history of margins in art, literature and philosophy] Oslo: Spartacus.

Life Based Design for Human Researchers

Pertti Olavi Saariluoma
University of Jyväskylä, Finland
Jaana Leikas
VTT technology center of Finland

Technology is only valuable to the extent that it can enhance the quality of life. When solving complex engineering problems, it is easy to forget the basic reason why technologies are designed and developed. They are developed to improve the quality of human life.

Designing technology to improve the quality of human life requires a multidisciplinary design approach. On one hand, multidisciplinary teams can give designers with a technical background the opportunity to better acquaint themselves with human research by working with human researchers. On the other hand, human researchers should be more aware of the various roles they can play in the process of designing and developing new technological solutions for people. Human researchers can be provided with concepts, facts, methods and theories that are useful in many aspects of design.

A study of the multitude of human–technology interaction (HTI) paradigms illustrates that they can be integrated into four research programs, each characterized by a separate design question. The traditional HTI design discourses can thus be systematized by showing that HTI design thinking must always meet four fundamental design questions:
1. functionalities and technical user interface design;
2. fluency and ease of use;
3. elements of experience and liking; and
4. the position of technology in human life.

As these fundamental questions are necessarily present when designing the human dimension of technology, it is useful to understand the logic behind them. The questions define the basic tasks in HTI design: deciding on the functionalities of the artefact, understanding how best to use them, understanding the overall experience when using them, and finally, and most importantly, understanding the technology’s role in human life. The latter is always present in design, either consciously or tacitly.

The first question in HTI design concerns how the behaviour (functionalities) of a technical artefact can be controlled. How should the artefact be manoeuvred so that it can reach its expected state or carry out the expected processes? During a human action, an expected state can refer to a process that makes it possible for the user to reach her goal. In this sense, the expected state of a sailing boat can be as much about sailing on the sea as reaching a destination.

How to control the behaviour of a technical artefact is a fundamental problem in HTI design. No technical artefact can exist without providing its users with the means to use it. The problem can be called technical UI. First, the behaviour of the artefact must be logically linked to the human action in question. In the case of a lift, the technical capacity to move from one floor to another is the artefact behaviour that makes it possible to support people’s movement in a block of flats. Second, it is essential to link the behaviour of the artefact to users’ actions via a user interface. In the case of a lift, this often refers to the set of control buttons indicating which floor the lift should stop on. However, the latter presupposes design knowledge of how people use the artefact.

The next fundamental question in HTI design concerns the fit of the technology with users’ ability to use it. This problem can be called usability. It concerns the human dimension of the user interface and opens up a new set of questions and sub-questions that can be answered with the help of human research and underlying psychological concepts. In order to guarantee smooth and easy interaction, user interface architectures should explicitly organize dialogues. For example, elements with similar functions should be associated in a sense-making manner. The foundations of understandable user interfaces can be sought in human research, that is, in answers to such questions as why a particular architecture is favoured over another.

However, interaction is not only cognitive; there are many dynamic aspects to be discussed in design. Emotions are important, as they define the human position towards specific issues (Frijda, 1988). In design, the question is not only about positive emotions but also about asking which emotions are relevant in the particular interaction situation. One should be able to feel angry when the cause is irritating enough and happy when experiencing positive actions. Otherwise, the human emotional system does not operate in a rational manner. In interaction design, it is essential to consider how people experience a situation or event that arouses their emotional responses (Frijda, 1988). Designers mostly strive to create a positive mood in their clients when interacting with the product. To understand the emotional dimensions of human experience it is necessary to understand human emotions and motivation, which is closely linked with emotions. People often pursue positive mental states, and are therefore motivated to use artefacts that help them do so. The motives for doing something can be complex and long lasting. The modern psychology of motivation offers a sophisticated framework for analysing the motives for using technologies. The importance of this sub-discourse is obvious: designers need to know why people use some technical artefacts and ignore others.

Finally, it is essential to ask what a technology is intended to be used for in the first place. One can call this problem designing for life. Why is it used? What is its position in people’s life? Answering these questions is a prerequisite for successful interaction design, and calls for an understanding of the life settings that the artefact is intended to support. This is possible with the help of a general notion of ‘form of life’ that can describe any domain or context of life in relation to technology. Form of life is a system of actions in a specific life setting with its rules and regularities, and facts and values that explain the sense of individual deeds and practices in them.

In HTI design, innovative thought processes are often more or less unorganized. Because of hectic design cycles, interaction designers do not necessarily have the time to apply systematic methods or use scientific knowledge to construct interfaces. Although it is common in engineering design to apply the laws of nature and other scientific facts, this approach is unfortunately not often taken in HTI design.

If an organization wishes to exploit scientific knowledge in interaction design processes, it is important to create systematic procedures for doing so, for example by defining the relationships of relevant design concepts and questions and organizing design processes around them. The concept of usability, for example, opens up a large set of questions and sub-questions that can be both general and product specific in nature.

Such systems of questions and answers make it logical to ask whether they could be ontologised and thus used to give structure to HTI design processes (Saariluoma, Cañas and Leikas 2016). Knowledge management in design has been a topical issue for some time (Gero, 1990). A key concept in this discussion is ontologies, which are organized sets of domain-specific concepts (Chandrasekaran, Josephson, & Benjamins, 1999) that describe the most general concepts in a given field; they are widely used in knowledge management. Ontologies can be seen as theories of information contents (Chandrasekaran, Josephson, & Benjamins, 1999; Gero, 1990).

Traditionally, ontologies have mainly been used to describe the structures of some domains as facts. For example, some products have been described as sets of elements and relations. In such instances, ontologies have had the role of information storage and retrieval. However, when considering design as a dynamic thought process, it is more worthwhile to discuss ontology as a question structure to be used to manage design problems. Ontologies in this sense can be seen as tools for creative thought rather than as information storage. Ontologies for HTI design can thus be used to generate sets of design questions describing the interaction, that is, the questions that must always be answered when a technology and its relationship to users is designed, and to conduct the HTI design process accordingly. Concepts, questions and ontologies provide a means of managing corporate thinking and making corporate knowledge explicit. When thought processes are explicit, it is possible to support them, to provide correct knowledge to thinking, to foster innovations and to move tacit knowledge from one process to another. Answering the four fundamental questions presupposes a largely unified argumentation based on human research.

Topics: The Digital, the Humanities, and the Philosophies of Technology
Keywords: design science, life-based design, design ontologies

Bibliography
Chandrasekaran, B., Josephson, J. R., & Benjamins, V. R. (1999). What are ontologies, and why do we need them? Intelligent Systems and their Applications, 14, 20–6.
Gero, J. S. (1990). Design prototypes: A knowledge representation schema for design. AI Magazine, 11, 26–36.
Frijda, N. H. (1988). The laws of emotion. American Psychologist, 43, 349–58.
Saariluoma, P., Cañas, J. & Leikas, J. (2016). Designing for life: A human perspective on technology development. London: Palgrave Macmillan.

Leseutgave av Hrafnkels saga, Menotas koding og knytting til andre ressurser [A reading edition of Hrafnkels saga, Menota's encoding and links to other resources]

Fabian Schwabe
University of Tübingen, Germany

When working on a digital edition of an Old Norse text, one needs to think about a proper encoding of that text, so that the text can not only be displayed on a web page but is also available in a format that others can understand and use. In this area it is certainly wise to take a close look at the recommendations of the Medieval Nordic Text Archive (Menota). These offer a worked-out proposal for how an XML encoding can be used to describe Old Norse manuscripts and store their content.

This encoding builds on the proposals of the Text Encoding Initiative (TEI), which has by now become a standard for encoding texts in the humanities. The difference between the two encoding systems is that TEI tries to offer a more general encoding for almost every text genre imaginable, while Menota currently has a very limited field of application in view, because its focus is on the individual manuscript and on the lemmatisation of the word forms used in that manuscript. But this is meant to be only the beginning. Odd Einar Haugen, one of the initiators of Menota, describes in his article Stitching the Text Together (Haugen 2010) how a (new) eclectic edition of a text can arise on the basis of the manuscripts in the form of simple documentary editions. If it is to be digital, the edition has to be encoded once more. At the moment this is not among Menota's goals, but in the long run it will undoubtedly be included, as discussed in Haugen's article.

Instead of waiting for an XML encoding of all the relevant manuscripts of a text in order to produce an eclectic edition, one can also produce somewhat less ambitious reading editions of texts that attract great interest or are of great importance within Old Norse philology, in research or teaching. With a view to teaching beginners in Old Norse, I have worked on such a reading edition of Hrafnkels saga Freysgoða. The aim of the edition is a grammatically and semantically self-explanatory text. That is, every single word form in the saga text has been or is being lemmatised, so that a reader interested in the language gets enough help to understand the syntax and the meaning of the words.

In addition, all words are linked to the dictionaries of Fritzner and Cleasby/Vigfusson, which give translations into Norwegian and English respectively, and to Noreen's grammar, which gives more information about the declension of words and about individual word forms. For the most part the linking works with simple hyperlinks. This works quite well with Noreen's grammar, which Andrea de Leeuw van Weenen has converted into an HTML file, and with Fritzner's dictionary, which was organised as a database by the project Eining for digital dokumentasjon (EDD). The Cleasby/Vigfusson dictionary was converted into a file with simple markup by Sean Crist of the Germanic Lexicon Project. The links to the project's website are only an interim solution until my own processing of the data in the file, which is freely available, is finished. I will work on displaying only the relevant dictionary entries in a clear and simple layout. About 90% of the words in the edition have now been analysed; the edition, or reading edition as I call it, is available at http://ecenter.uni-tuebingen.de/hrafnkels-saga/start.html.

The digital text with all its linguistic annotations is encoded as XML following the Menota standard, while Ordbog over det norrøne prosasprog is the basis for the normalisation. The lemmatisation follows Menota's encoding system, but to reach the aims of the edition the system had to be extended so that the grammatical encoding could be more detailed. In Menota's encoding system verbs can be classified as weak, strong or reduplicating, while nouns can only be categorised as common or proper nouns. Prepositions and conjunctions can be specified in considerable detail: a preposition either governs a case or is used adverbially, and conjunctions are divided into subordinating and coordinating conjunctions. But the encoding system did not provide for specifying the inflectional classes of nouns or verbs. Inflectional classes are very interesting for beginners, because they help them identify word forms and find their way around an Old Norse text. The extension of the encoding system that I made means that the edition can be linked to other online resources, which in turn improve the edition text itself.

A revision of Menota's handbook on the XML encoding is currently under way. I have already reported back to Menota that it should be possible to describe the grammar more precisely. This extension will probably be one result of the revision. With this encoding system one has a tool that can be used not only to encode manuscripts but that is also detailed enough to find application in other grammatically oriented projects. In addition, Menota's XML encoding can become a digital tool for different types of edition (within Old Norse philology), and not only for the documentary editions for which the encoding is used today.

Topics: Nordic Textual Resources and Practices
Keywords: digital edition, working with online resources, teaching

Bibliography
– Cleasby, Richard and Gudbrand Vigfusson, An Icelandic-English Dictionary, Oxford 1874.
– Fritzner, Johan, Ordbog over det gamle norske sprog, 4 vols., 2nd ed., Kristiania 1883-96.
– Noreen, Adolf, Altisländische und altnorwegische Grammatik. Laut- und Flexionslehre unter Berücksichtigung des Urnordischen (Sammlung kurzer Grammatiken germanischer Dialekte A, 4), 4th ed., Halle/Saale 1923.
– Haugen, Odd Einar, Stitching the Text Together: Documentary and Eclectic Editions in Old Norse Philology. In Quinn, Judy & Lethbridge, Emily (eds.), Creating the Medieval Saga: Versions, Variability and Editorial Interpretations of Old Norse Saga Literature, Viborg 2010, pp. 39-65.
– Heimskringla.no, Hrafnkels saga Freysgoða after Guðni Jónsson - http://heimskringla.no/wiki/Hrafnkels_saga_Freysgo%C3%B0a
– Reading edition of Hrafnkels saga - http://ecenter.uni-tuebingen.de/hrafnkels-saga/start.html.
– Digital version of Altnordische Grammatik by Adolf Noreen - http://www.arnastofnun.is/solofile/1016380.
– EDD, Johan Fritzner's dictionary - http://www.edd.uio.no/perl/search/search.cgi?appid=86&tabid=1275.
– Germanic Lexicon Project - http://lexicon.ff.cuni.cz/.
– Menota's handbook 2.0 - http://menota.org/HB2_index.xml.
– TEI: P5 Guidelines - http://www.tei-c.org/Guidelines/P5/index.xml

machines as agents in the world. Recurring cultural myths have had a tendency to present agential technologies as either omnipotent masters or as completely loyal servants. However, social robots are likely to complicate that dichotomy. As effectively illustrated by Cozmo, its programmability creates an interesting tension between master and servant. That is, Cozmo is both pre-programmed (with certain secondary agency) and programmable (through its so-called software development kit or SDK). In its pre-programming Cozmo simulates emotive and social capabilities. It is, for example, programmed to take certain mischievous initiative when interacting with its surroundings, including human and animal actants. We argue that this is a necessary component of a partner technology, to be playful, to take initiative, and to display some eccentricity. An eccentric relation enacts an amalgamation of, in the case of Cozmo, a quirky personality, a mischievous tendency, and a capability to simulate anger or disappointment. Notably though, an eccentric partner relation cannot be allowed to become too eccentric.

Mischievous Machines:
A com- pletely disobedient and fully self-aware tech- A Design Criticism of nology has been a prime symbol of fear in Programmable Partners several science fiction narratives (e.g. Ex Machina, Matrix, Colossus – The Forbin Jörgen Skågeby Project). Stockholm university, Sweden Cozmo’s programmability also allows users to program Cozmo. From a human- This paper presents the results from a design technology relations perspective, the SDK critical reading (Bardzell & Bardzell, 2015; offers a way to open up a black box (Hertz Bardzell, Bolter, & Lowgren,̈ 2010) of the & Parikka, 2012) and form a more concrete AI-powered social robot Cozmo. Cozmo and design-oriented relation to technology. was released to the market during the fall of Nevertheless, this also spurs a tension 2016 and is described as a “supercomputer between a machine and an “almost human” on treads”. It comes in the form of a small (as Anki, the company behind Cozmo, forklift-like vehicle, which most prominent themselves put it). The illusion, if you will, features are the caterpillar bands that drive of Cozmo breaks down slightly when the it, the lift in front of it, and a screen, ef- ‘magic’ of it is revealed in a symbolic envi- fectively displaying stylized graphical facial ronment. The question is if the “almost hu- expressions. man” (eccentric) qualities will wither when The design critique will focus on the laid bare. At the same time, if Cozmo was notion of programmability (Chun, 2008; Pa- only completely obedient (programmed for rikka, 2014) and how this condition may predictability) it would soon become boring. affect human-technology relations (Ihde, In other words, eccentricity has to be ba- 1990; Nørskov, 2015; Verbeek, 2011). Pro- lanced. If Cozmo was really self-aware and grammability is the very precondition for only simulated a well-adjusted eccentric

partnership while interacting with humans, and pursued its private, potentially undesirable, agenda while on its own (or when in interaction with other partner technologies), it would turn into a Trojan technology. This balancing between a strong-willed eccentric partner and a wilful Trojan technology, and the question of where a line can be drawn, may arguably be what will signify human-machine relations in times to come.

Topics: The Digital, the Humanities, and the Philosophies of Technology
Keywords: humanistic HCI, human-technology relations, social robots, programmability, coactive technologies

Bibliography
Skågeby, J. (in press) Im/possible desires: media temporalities and (post)human-technology relationships. Confero: Essays on Education, Philosophy and Politics, 4(2).
Skågeby, J. (2016) Media futures: premediation and the politics of performative prototypes. First Monday, 21(2).
Skågeby, J. (2015) The media archaeology of file sharing: broadcasting computer code to Swedish homes. Popular Communication – The International Journal of Media and Culture, 13(1): 62-73.

“En temmelig lang fodtur”: hGIS and Folklore Collection in 19th Century Denmark

Ida Storm
Timothy R Tangherlini
Georgia Broughton
Holly Nicol
University of California Los Angeles, United States of America

Introduction
Folklore has played a significant role in the “imagining of the nation” since the inception of the field in the late 18th century. In Scandinavia, the “golden age” of folklore collection of the 19th century coincided with rapid changes in political, economic, and social organization. Although some later folklorists have expressed skepticism about these collections, this skepticism is often based on perceived notions of how these collections came to be, rather than a deep exploration of the actual practices of the collectors themselves. We show how techniques from historical Geographic Information Systems (hGIS) wedded to time-tested archival research methods can reveal how a folklore collection came into being. By detailing the routes taken by the Danish folklore collector Evald Tang Kristensen (1843-1929) over the course of his fifty-year career, we trace not only his selection biases for geographic areas (and by extension, social and economic classes), but also the impact that intellectual currents, political developments and changes to transportation infrastructure have on his collecting.

Tang Kristensen over the course of his sixty-year career traveled over 67,000 km, largely on foot, visiting ~4,500 storytellers in 4,203 unique places, recording these stories in ~24,000 field diary pages. In this work, we focus on determining how, when and where Tang Kristensen traveled in Denmark as he created his collection. We develop detailed route maps projected onto appropriate historical base maps showing his movement through the countryside. We develop aggregate statistics that allow us to

understand, at a granular level, his collecting habits. In addition, we align the field trips with his writing about collecting, allowing us to approach a “thick description” (Geertz 1973) of folklore collecting in late 19th century Denmark. In all, we map 267 field collecting trips, starting in 1868 and ending in 1916. This work considerably extends the qualitative assessments of Tang Kristensen’s collecting (Christiansen 2013) and is a key contribution toward the development of the “Folklore Macroscope” (Tangherlini 2013).

Data Extraction
When we began this work, there was no existing catalog of Tang Kristensen’s field collecting routes; we had to devise this catalog ourselves by coordinating annotations in his handwritten field diaries with his four-volume memoir Minder og Oplevelser (1923-1927). The memoir is based largely on letters he wrote home detailing all of his stops while out collecting, and includes information on means of transportation as well as travel dates. Our team began by making “proto-routes.” We extracted trip start and end dates, as well as all stops and stop order for each trip by hand, and aligned these proto-routes with the field diaries (Figure 1). In later work, we will also align field stops with our electronic catalog of informants.

Address Locator
Finding the locations for the stops we extracted in our “proto-routes” was a significant challenge. As with most historical data, places can be difficult to locate: some are very small, names have changed, and some places have disappeared. Contemporary gazetteers are inadequate to the task and often confound, rather than solve, queries. To address this problem, we downloaded the historical place name database developed by the Afdeling for Navneforskning, Københavns Universitet, and used it to generate a customized “Address Locator” for 19th century Denmark. We matched the stops from each field trip with the address locator, generating a “best guess” for each field trip. Multiple places with the same name were resolved through the ESRI ArcMap “Interactive Rematch” interface. To derive the final field trip stops, each trip was inspected individually.

Routes
With the stops in a provisional sequential order, we created the “most likely route” for each trip. A basic assumption was that, unless otherwise specified, Tang Kristensen would take the shortest path between two points, an assumption that aligns with the underlying “Network Analyst” algorithm in ArcMap. We used a transportation network from OpenStreetMap pruned against the cadastral survey maps of ~1880, the highest resolution historical maps from the era. Since Tang Kristensen occasionally traveled by boat, ferry routes based on ferry schedules and close study of historical maps were also added. By feeding the provisional sequential stops to the network analyst, we were able to create the most likely routes for each trip as a single line record. These routes were then visualized as a line with sequentially numbered stops (Figure 2, see previous page). The visualization is augmented by simple statistics, such as route length, as well as descriptors from our database, including dates of collection, field diary page ranges, and modes of transportation.

Animations
Animations provide a dynamic representation of Tang Kristensen’s movement through the countryside. Current animations reveal, for example, the numerous times where he backtracked, and can be used to augment the understanding derived from the static maps. To allow for sequential animations of all field trips, we devised an additional “absolute order” field, and split all routes into inter-stop segments.
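The routing step described under Routes and the split into inter-stop segments described under Animations can be illustrated with a small sketch. It is not the project's ArcMap workflow: a plain Dijkstra search stands in for the Network Analyst, and the place names, distances, and the `NETWORK`, `shortest_path` and `trip_segments` names are all invented for illustration.

```python
import heapq

# Hypothetical toy network: invented places, distances in km.
# None of these values come from the project data.
NETWORK = {
    "A": {"B": 4.0, "C": 10.0},
    "B": {"A": 4.0, "C": 3.0, "D": 8.0},
    "C": {"A": 10.0, "B": 3.0, "D": 2.0},
    "D": {"B": 8.0, "C": 2.0},
}


def shortest_path(network, start, goal):
    """Dijkstra's algorithm: return (distance_km, list of nodes on the path)."""
    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, weight in network[node].items():
            if neighbour not in settled:
                heapq.heappush(queue, (dist + weight, neighbour, path + [neighbour]))
    raise ValueError(f"no path from {start} to {goal}")


def trip_segments(network, trips):
    """Split every trip into inter-stop segments, each routed by shortest path.

    `trips` is a list of stop sequences in provisional order. The
    `absolute_order` counter runs across all trips so that segments can be
    replayed one after another, as in a sequential animation.
    """
    segments = []
    absolute_order = 0
    for trip_id, stops in enumerate(trips, start=1):
        for origin, destination in zip(stops, stops[1:]):
            absolute_order += 1
            km, path = shortest_path(network, origin, destination)
            segments.append({
                "trip": trip_id,
                "absolute_order": absolute_order,
                "from": origin,
                "to": destination,
                "km": km,
                "path": path,
            })
    return segments


segments = trip_segments(NETWORK, [["A", "D"], ["D", "B", "A"]])
total_km = sum(segment["km"] for segment in segments)
```

Routing each pair of consecutive stops separately, rather than the trip as a whole, keeps backtracking visible: a stop visited twice simply produces two segments with different `absolute_order` values.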

Travel Statistics
By splitting routes into inter-stop segments, we could develop more detailed statistics regarding segment length, speed of travel, and travel mode. More importantly, we can now aggregate segment statistics and align this information with other data, allowing us to address a broad range of questions. For example, we can see how far Tang Kristensen traveled when he lived in a specific place, his travel distances at different times of year, and his travel distances in different parts of the country. Furthermore, we can consider changes in average travel segment or field trip distance over time. Future work will align stops with storytellers, allowing us to include story statistics with the field trip statistics. Population data and transportation data will further add to this picture (Figure 3 and 4, see previous page).

Topic Models and Field Trip Descriptions
We consider each field trip description in Minder og Oplevelser a “document” and use this collection of documents to constitute the corpus of field trip descriptions. Using a probabilistic topic modeling algorithm (LDA), we model these descriptions at varying topic levels (k = 10 to 30 at intervals of 10) to uncover latent topics in his descriptions. This modeling allows us another method for aggregating field trips. We can then explore the characteristics of field trips associated with a particular topic. This work is a preliminary step toward aligning the field trips with the stories collected on those field trips (Figure 5, see following page).

Conclusions
Our work reveals the shifting parameters of Tang Kristensen’s field collecting, from his intensely local focus early on to his more expansive and confident travels at the end of his career, when his collecting was no longer aligned with Romantic nationalist goals, but more in tune with a thick descriptive approach to Jutlandic rural life. By using hGIS techniques, we can provide a degree of detail about his travels missing in earlier studies. Our approach enables a truly macroscopic approach to folklore collecting, allowing us to interrogate Tang Kristensen’s field collecting at varying levels of resolution. For example, we can move from the micro-consideration of a single field trip, to a meso-consideration of all trips that included a particular parish, to a macro-consideration of all of his trips taken as a whole.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: historical GIS, folklore, named-entity detection/extraction, culture analytics

References
Christiansen, Palle Ove. Tang Kristensen og tidlig feltforskning i Danmark. National etnografi og folklore 1850-1920. Copenhagen: The Royal Danish Academy of Sciences and Letters, 2013.
Tang Kristensen, Evald. Minder og Oplevelser. Volumes 1-4. Viborg: Forfatterens forlag, 1923-27.
Tangherlini, Timothy R. "The Folklore Macroscope." Western Folklore 72.1 (2013): 7-27.
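As an illustration of the aggregation described under Travel Statistics in the abstract above, the following sketch groups inter-stop segments by an arbitrary field and reports totals, mean segment length and average speed. The record layout, field names and all values are assumptions made for the example; they do not reproduce the authors' database.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical segment records; every value here is invented.
SEGMENTS = [
    {"trip": 1, "year": 1868, "mode": "foot", "km": 12.0, "hours": 3.0},
    {"trip": 1, "year": 1868, "mode": "foot", "km": 8.0, "hours": 2.0},
    {"trip": 2, "year": 1890, "mode": "train", "km": 60.0, "hours": 2.0},
    {"trip": 2, "year": 1890, "mode": "foot", "km": 10.0, "hours": 2.5},
]


def summarise(segments, key):
    """Aggregate segment statistics by the given field (e.g. 'mode' or 'year')."""
    groups = defaultdict(list)
    for segment in segments:
        groups[segment[key]].append(segment)

    summary = {}
    for value, members in groups.items():
        total_km = sum(s["km"] for s in members)
        total_hours = sum(s["hours"] for s in members)
        summary[value] = {
            "segments": len(members),
            "total_km": total_km,
            "mean_km": mean(s["km"] for s in members),
            # Speed is total distance over total time, not a mean of
            # per-segment speeds, so long segments weigh more.
            "km_per_hour": total_km / total_hours,
        }
    return summary


by_mode = summarise(SEGMENTS, "mode")
by_year = summarise(SEGMENTS, "year")
```

The same function can be pointed at any other field attached to a segment, such as a parish, a season, or a place of residence, which is the kind of alignment with other data that the abstract describes.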

Representations: The Analogue Photography as a Digital Source

Arthur Tennøe
National Library of Norway

Libraries have had to consider both the medium and the message, in different ways. Twenty-five years ago the photographic archive could be called a frozen stream of pictures. Since then we have faced vast changes in the production, consumption and distribution of photography in libraries. The National Library of Norway has initiated and been involved in several projects that focus on the current shifts: The archive in motion, 80 million pictures and The ends of photography. The presentation will focus on some aspects of these projects and the themes involved.

An archive is never the only end of photography. It endeavours to take its deposits toward other ends and uses. In the 1990s the library opened the first digital photo databases in Norway, and later went on to digitize, by photographing, the whole national collection of the library and to make it accessible through an integrated search across objects from all media, a work in progress.

Our ambition is to be an active part of the research infrastructure. Our task is to make understanding and interpretation possible, to provide the adequate contexts, and to make every relevant document in all sorts of media a source of research for the future. This involves challenges of collecting, registering, cataloguing, conservation and preservation, digitizing and long-term preservation. To give access to photography, we have to make choices along the way that will affect the possibilities for use now and in the future. What is lost and gained in the translation from analogue to digital, with regard to the photographic content, negatives and vintage prints, series, the original context of use and the archival aspects? We cooperate on international database projects. What are the consequences of the new digital archiving? The presentation will use examples from daguerreotype to digital in our library.

For the last twenty-five years the National Library of Norway has been converting its photographic collections into digital format. Its mission is to give access both to the content and to the medium that carries and creates it. It started with databases before the internet, and there has been a long road of developments since. This has qualitative and quantitative implications for the use and understanding of the collections. It started with manual registration and small files and is now heading towards automation of all processes with high-volume output.

The analogue photograph, however, is a complex object in itself. The different stages of a photographic project result in different material artefacts. From the nineteenth century onwards photographs were produced for many different ends: portraits, prospects, records and private archives. The choices we make will make a difference for the output. Photography also began to migrate into various printed documents such as newspapers, magazines, books, advertisements, postcards etc. As a result of the digitization of these other media, both printed versions of the photographic images and metadata are already present in the library's big data collection, or available from other sources. Recent projects cover items as different as unique daguerreotypes from the 1840s, thousands of aerial photographs from all municipalities of Norway, and newspaper archives with millions of photographs. They will be published on our new website, both to be viewed and to serve as a proper source for research. New methods of scanning, image recognition and OCR create new possibilities for search, understanding and contextualisation of the objects and their content.

The Daguerreobase is an online application designed to contain detailed information about daguerreotypes. Members can view, edit and store records of individual daguerreotypes and establish relations to other records based on a wide range of characteristics. This includes collections, owners, creators, hallmarks, housing models, sizes, materials and free text descriptions. Daguerreotypes, mainly created in the 1840s and 50s, were a very international activity and need to be studied from this perspective. Metadata may originally be very sparse, but through this cross-over project a new foundation has been established for the study of these important historical objects, rarely seen outside the archives before. The partners include a blend of institutions from all over Europe: Bergen City Museum; FotoMuseum Provincie Antwerpen; Museo Universidad de Navarra; Biblioteca Panizzi, Reggio Emilia; Museum of Decorative Arts in Prague; National Media Museum, Bradford; Musée Gruérien, Bulle; Rijksmuseum, Amsterdam; Agence Roger-Viollet, Paris; Albertina, Wien; The Royal Collection Trust; Finnish Museum of Photography; Oslo Museum; private collections and others.

Aerial photography is photography of landscapes and buildings taken from the air. The earliest aerial photographs were taken from balloons. The first to do so, as far as we know, was the famous French photographer Nadar. As early as 1858 he took photographs of Paris seen from above.

Aerial photographs are divided into two main types. Vertical photographs are taken straight down from above (perpendicular), and oblique photographs are taken downwards at an angle (acute angle). Vertical photographs have usually been used for map production, surveying and military intelligence. Oblique photographs give a more three-dimensional impression of the subject. We glance obliquely down on mountains, hills and buildings. The subject is thus observed not only from above, but also partly from the side. In many ways we get more topographic information from oblique photographs than from vertical photographs. Oblique photographs are thus a rich topographical source material with a variety of applications. They can be used for everything from solving border disputes and detecting changes in buildings and vegetation to documenting historic gardens and other things that may have changed or disappeared since the pictures were taken. Aerial photography has even been used for counting reindeer on the Hardangervidda (1948). Collecting this archive and transforming all its different parts into digital representations is a complex process.

An even newer project is the huge press photo archive of the Norwegian newspaper Bergens Tidende. From 2017 the library will start its newest production line, which will digitize the archive of over 5 million photos. This demands a new look at the possibilities of automating processes, starting even before the collection is transported to the library. In the National Library's database this collection also converges with materials digitized in other projects.

Bokhylla.no (The Bookshelf) is a collaboration project designed to provide online access to literature published in Norwegian, based on a formal agreement between the National Library of Norway and Norwegian rights holders represented by Kopinor. The service will cover around 250,000 books when completed in 2017. Books from the entire 20th century will be available to anyone with a Norwegian IP address. Books not protected by copyright may be downloaded.

The digital newspaper service is based on agreements between the National Library and a growing number of Norwegian newspapers. These agreements secure digital delivery of new publications and the digitization of historical newspaper archives. A central aspect of these agreements is the right of Norwegian libraries to make newspaper archives available on their premises.

Many photos have already been published in books and newspapers. Together these sources will give us unprecedented possibilities for tracing the shooting, alternative takes, all publications, and the historical impact of the photos.

In sum, the paper will give the background for the new situation of photography as a source of knowledge in digital contexts and on this basis reflect critically on subjects ranging from quantitative perspectives, contextualization and metadata construction to the specific qualities of the photographic medium.

111 Topics: Visual and Multisensory Representat- munity group representatives, academics and ions of Past and Present experts in the field. The process was meant Keywords: Analogue-photography, digital- to open up the curatorial practice and move source, metadata, visual-media, big-data it towards current forms of public engage- ment, co-creation and democratic participa- Bibliography tion in knowledge production. The paper Forglemmegei - Autochrome, i Historikeren will discuss how this approach bore direct n. 3 2016 consequences on the creation of the online Norwegian official photo no. L.8846., i platform, with regards to its design, as well Historikeren n.2 2016 as the kind of archive material, which was Bergensbanen, i Wilse mitt norge, Oslo 2015 made accessible for the first time on the online platform. The critical question was not only to in- The Trading Faces: Online clude voices from outside the heritage sector, Exhibition and Its Strategies but also to re-mould the practice of archiving and the representation of archival material, as of Public Engagement well as to resist the tendency of objectifying the past within rigid co-ordinates of time and Alda Terracciano space. The paper will discuss the involve- UCL, United Kingdom ment of black artists in the curatorial process as a way of privileging a synchronic rather This paper will explore the dynamic relation- than diachronic approach to history and ship between artistic practices and digital memory. This resulted in a number of essays humanities in the creation of an online plat- on art forms originated in Africa, which were form launched in 2009 to commemorate the produced by a number of artists of African 200th year anniversary of the parliamentary descent living in the UK, not only to provide abolition of the Transatlantic Slave Trade in a critical and historical context to the exhibi- Britain. 
It will consider how orality was used tion, but also to shift its focus from the ob- in the process of selecting and digitising ar- ject of the analysis to the discourse that pro- chival material related to the history of per- duces it. This is an approach to history in- forming arts produced by people of the Af- debted to the African practice of ‘Orature’, rican diaspora in the UK. More specifically, which implies a circularity of knowledge and the way in which African oral traditions and a creative exchange between performers and techniques of storytelling played a role in the members of the audience. process of designing and constructing the The paper will also discuss how the herit- first online exhibition on the legacy of the age of the Transatlantic Slave Trade within Transatlantic Slave Trade in British perform- British performing arts and society was set ing arts and society. against the wider context of cultural identity Amongst its various activities, the project, and performance, and in particular contem- which was produced by a consortium of porary forms of migration and human trade partners including Future Histories, Talawa in Britain. To do so it will analyse the Voices Theatre Company, The National Archives, section of the exhibition, which juxtaposed and the Victoria and Albert Museum, fo- the experiences of 18th century African abo- cused on the preservation and cataloguing of litionists Olaudah Equiano and Mary Prince, a number of black theatre archives from the extracted from their autobiographical ac- Theatre Collections of the Victoria & Albert counts, to the testimonies of two present Museum, and the Talawa Theatre Company day migrants from China and Russia named archive, as well as the digitisation of 257 ar- as Natasha and Liu to protect their real iden- chive items (totalling to about 600 document tities. pages) for online publication. 
The selection Supported by historical essays and links process was based on consultation with a to further resources, Equiano and Prince's number of lecturers, students, artists, com-

112 views were set against the stories of Natasha and current intercultural artistic and cultural and Liu, whose memories of their degrading heritage practices in Europe. treatment in the UK, recorded and filmed in As the project evaluator commented in London in April 2008, uncannily resonated the End of Project Evaluation Report dated with those voices from the past. Their 8 March 2009: memories reflected the pernicious continuity “Inspiring confidence and trust in the of two key aspects of the Transatlantic Slave public to submit material to a public site is Trade: economic exploitation and the in- no mean feat: such work needs to be done fringement of human rights. through outreach activities such as presenta- By asking the question “Has slavery really tions and discussions; and by encouraging ended?” the exhibition looked at these two established contacts such as lecturers, teach- moments in human history in their resem- ers, community course directors etc. to inte- blance, as well as crucial differences. The grate the Online Exhibition into their pro- history of the Transatlantic Slave Trade was grammes so as to enable more engagement one of human subjugation, but also of racial and submissions by the public. Funding discrimination, as the de-humananization of pending, efforts to publicise the site in order people from the African continent was key to receive more submissions should contin- to economic exploitation. The condition of ue by those responsible for Trading Faces: people trapped in human trafficking today Recollecting Slavery site’s maintenance so as resembles the past, but is also different: the target of 25 submissions for ‘Open shorter periods of so-called ‘enslavement’, Doors’ is now met by the end of this year general absence of a racial bias, different ju- 2009.” (Raminder Kaur) ridical status, and so on. 
Nonetheless, forms The paper will consider the challenges of enslavement of human beings are still tak- faced by FUTURE HISTORIES, the organ- ing place in dirty, dangerous and difficult isation responsible for the delivery of the work in Britain, in the running of private online exhibition, in its attempt at stimulat- homes, the care of the elderly and disabled ing the production of new material to be up- and in keeping the sex industry alive. Many loaded on the online platform. It will frame vulnerable people are trafficked or smuggled them within the wider context of the organi- into the UK today. Natasha from Russia and sation attempts at ‘popularizing’ the use of Liu Bao Ren from China are two of them. primary resources beyond academic and The attempt of the online exhibition was post-colonial intellectual circles, referencing to bring alive the resilience and resistance of the categories of ‘speech’ and ‘history’ to people of the African descent by activating reflect on the intrinsic intersubjectivity of intimate narrative points through different the archiving medium and the multiplicity of configurations, both visual and audio, which voices encompassed by black British per- would facilitate the exploration of emotional formance. and political connections between histories and stories from the past and the present. Topics: Visual and Multisensory Representat- To better elucidate this point, the presen- ions of Past and Present tation of the paper will intersperse research Keywords: black cultural heritage outcomes from the exhibition with the screening of video sections related to the Bibliography experience of Natasha, an underage young Terracciano, Alda. “Future Histories – An Russian woman forced to prostitution in the Activist Practice of Archiving,” in UK today and Liu, a Chinese man who ex- Popular Postcolonialisms: Discourses perienced forced labour in the UK. of Empire and Popular Culture, edited Finally, the paper will consider the unfin- by Atia N. 
& Houlden K. London: ished issues of public engagement and active Routledge. Awaiting publication. contribution to the project, setting this aim Terracciano, Alda. “Mapping Memory once again within the context of Orature Routes: A multi-sensory experience of

7 cities in 7 minutes,” in Curating the City. Proceedings of Challenge the Past / Diversify the Future, University of Gothenburg. Awaiting publication.
Terracciano, Alda. “Trans-national politics and cultural practices of the Trading Faces online exhibition,” in Black arts in Britain: Literary, Visual, Performative, edited by Annalisa Oboe and Francesca Giommi. Roma: Aracne, 2011.
Terracciano, Alda. The Future Histories Research Toolkit for African, Asian and Caribbean Performing Arts. 2009. Accessed 24 June 2016. http://www.tradingfacesonline.com
Terracciano, Alda. “The Black Theatre Forum and the Experiments of the Black Theatre Seasons,” in Alternatives Within Mainstream: British Black and Asian Theatre, edited by Dimple Godiwala. Newcastle: Cambridge Scholars Press, 2006.
Terracciano, Alda and Kaur, Raminder. “South Asian / BrAsian Performing Arts,” in A Postcolonial People: South Asians in Britain, edited by Ali, N. Kalra, V. and Sayyid, S. London: C. Hurst & Co, 2006.
Terracciano, Alda. “Together We Stand,” in Navigating Difference, edited by Heather Maitland. London: Arts Council England, 2005.

The New Lexicon Poeticum

Tarrin Wills
Københavns Universitet, Denmark

The New Lexicon Poeticum (lexiconpoeticum.org) is a project to produce a new lexicographic resource covering Old Norse poetry (initially the category known as skaldic poetry). It is based on the corpus produced by the Skaldic Project (SkP; supported project no. 60 of the Union Académique Internationale, with funding provided by UK Arts & Humanities Research Council, Australian Research Council, Joint Committee of the Nordic Research Councils for Humanities, the National Endowment for the Humanities, Deutsche Forschungsgemeinschaft and other bodies). SkP is nearing completion, with over 80% of the corpus entered into its digital resource.

SkP was inspired by major problems with previous research, in particular Finnur Jónsson’s edition (1915-18) of the corpus of skaldic poetry (Skj) and the dictionary based on it (Lexicon Poeticum, 2nd ed. 1931). While Skj is a monumental work which has provided the foundation for almost a century of skaldic poetry studies, Finnur Jónsson used a heavy hand of intervention, with frequent and silent emendation. His lexicon, based on his own corpus, is therefore founded on a body of material that does not accurately reflect the manuscript evidence. It includes a large number of words that only exist through editorial conjecture, and omits large numbers of words that are evidenced in the manuscript tradition, particularly as manuscript variants are largely ignored. This situation has left a significant gap in methodologies between the material evidence of the poetic lexicon and the resources to analyse it.

SkP provides the foundation for the current project because it will have re-edited the entire corpus based on current philological and textual editing methodologies. The edition is in the form of a digital resource (skaldic.abdn.ac.uk) from which the printed volumes are exported. It links together the normalised, occasionally emended edition with variant readings, manuscripts, secondary literature, prose contexts and previous editions. It includes unnormalised transcriptions of the main manuscripts of the corpus and significant numbers of variant manuscripts. The new resource will be linked directly to these resources, enabling the lexicon to be understood in its complex contexts.

ONP, founded in 1939, is the major dictionary of Old Norse. The poetic corpus was specifically excluded from ONP because of the lack of a reliable edition of this material, a lack that is now being addressed by SkP. ONP has a sophisticated database with a web interface that links the lexicon to the citation index and textual corpus. It uses reliable diplomatic editions and manuscript spellings, but is reliant on those editions rather than the manuscripts themselves.

The skaldic project's corpus is in a relational database structure with all words entered as separate items, with a normalised syntax and translation linked to each word, along with linked manuscript information including variants. It differs from lemmatised XML texts in that the lemmata (headwords) are linked to the (future) dictionary entry. The nature of the corpus is such that there are a very large number of headwords: with 100,000 words lemmatised, over 13,000 headwords have been linked to the corpus. Lemmatising produces an automatic concordance with a full set of contextual translations. Owing to the structure of the corpus database, each headword can be linked to its manuscript witnesses and to nominal periphrases (kennings) in which it occurs.

There are a number of questions that arise from the project as it has been conceived:

1. How to create interfaces for linking hundreds of thousands of words to tens of thousands of headwords. Additionally, variants add another 20% to the corpus, but need their status and relationship to the manuscript preserved. All this information must be in a form that can be checked and updated. Some forms of analysis were performed by the original project (diction (kenningar and heiti), translations, free text variants); others were not (lexical variants, lemmatising, compounds).
2. How to maintain alignment with both the original database and other lexicographic projects, particularly ONP, so that a word’s use and history can be researched across corpora.
3. As a more general question, how to create a meaningful and useful lexical resource when the original and underlying corpus is so rich in itself, with translation, notes and commentary linked to each word, and how to publish it in the current metrics-driven research environment.
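
The relational structure described above, in which every word of the corpus is a separate row linked to a headword, is what makes the concordance "automatic": it is a single join. The following is a minimal sketch only, assuming a much simplified two-table layout. The table names, column names, stanza labels and sample rows are hypothetical illustrations and are not taken from the Skaldic Project database.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE headword (
            id    INTEGER PRIMARY KEY,
            lemma TEXT UNIQUE
        );
        CREATE TABLE word (
            id          INTEGER PRIMARY KEY,
            stanza      TEXT,     -- made-up stanza label
            position    INTEGER,  -- position of the word in the stanza
            form        TEXT,     -- form as it stands in the edited text
            translation TEXT,     -- contextual translation of this instance
            headword_id INTEGER REFERENCES headword(id)
        );
    """)
    con.executemany(
        "INSERT INTO headword (id, lemma) VALUES (?, ?)",
        [(1, "konungr"), (2, "gefa")],
    )
    con.executemany(
        "INSERT INTO word (stanza, position, form, translation, headword_id) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            ("A 1", 1, "konungr", "king", 1),
            ("A 1", 2, "gaf", "gave", 2),
            ("B 3", 4, "konungi", "king", 1),
        ],
    )
    return con


def concordance(con, lemma):
    """Return every instance of a headword with its contextual translation."""
    rows = con.execute(
        """
        SELECT w.stanza, w.position, w.form, w.translation
        FROM word AS w
        JOIN headword AS h ON h.id = w.headword_id
        WHERE h.lemma = ?
        ORDER BY w.stanza, w.position
        """,
        (lemma,),
    )
    return rows.fetchall()
```

Calling `concordance(build_demo_db(), "konungr")` lists both invented instances of the headword together with their glosses. Manuscript witnesses and variants would, under the same assumption, be further tables keyed to `word.id`.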

User interfaces

The original skaldic project uses a web interface to enter, edit and manage the data of the project. Relational databases differ from XML as there is no inherent connection between the data structure and its digital storage (serialisation). This has the advantage that the data can easily be exported in a number of ways, but direct editing of the data is not easy to perform. Early on I developed a web application for both viewing the edition, browsing the contextual information and editing the data, with customised forms for entering the textual data, and a generic interface for dealing with other information. This allowed editors to produce editions where a putative natural prose order is linked to each text (allowing for easier interpretation and potential morphosyntactic analysis), as well as a translation, with each word linked and reordered. Each stanza has a full set of linked manuscript references, as well as variants linked to both the words and manuscripts.

The process of lemmatising has been performed on the original corpus, again facilitated by the user interface. A web form lists all the words in a stanza or block of text. The user can select the lemma if it has the same form as the text, or look up the lemma by entering a search term. Variations in form and spelling are saved and used to prompt the user when they next occur, although all choices must be confirmed manually. The word list was originally taken from ONP (with permission) and has been supplemented as new headwords are identified (Figure 1, see previous page).

The new lexicon will include all variant manuscript readings, something that previous lexica poetica have not documented systematically. As the original variants were entered as free text, rather than as words within the data structure for words in the database, the new project needs to add these to the corpus. To aid this process I have created a web form which uses the variant apparatus in the corpus database to prompt the user to add lexical variants and link them to headwords. This is a complex process, with no direct correspondence between the words linked in the main text and those in the variants, but the interface attempts to analyse the information in the database to facilitate the process (Figure 2).

Relationship to other dictionaries

The original word list for the lexicon was copied from ONP almost a decade ago. Unfortunately the original unique identifiers for this list were not saved, and both the original ONP wordlist and the new lexicon’s wordlist have continued to evolve. The connection between headwords in the two lexica is not reliable but we are making efforts to recover and check this information so that a single interface can be built to both resources.

There are still some questions regarding the nature and function of the new lexicon. The process of lemmatising a corpus with translations linked to each word already produces a concordance of all words with a gloss that effectively gives the interpretation of that word by the editor. Further information about each word can often be found in the notes linked to the word. What, then, does a dictionary entry for the word add to the information already available? Additionally, the prose dictionary ONP will have more comprehensively covered the more common words in the lexicon. Should LP simply supplement that lexicon, or should it be a full description of the skaldic lexicon in its own right? These questions derive from broader issues about the nature of traditional scholarship as DH methods become increasingly sophisticated.

The linking of the rich corpus to dictionary headwords in itself provides an enormous amount of information for each word. The current interface shows all instances of each word with contextual translation and linked notes where relevant, plus compounds. Words occurring within kennings (nominal periphrases) are also explained in this context. Additionally, using the linked manuscript information, all manuscripts representing the word in both the base text and variants can be listed.

Analysis can be performed on this information to see, for example, the way parts of speech are distributed within each stanza and half-stanza of poetry. We plan to perform more nuanced analyses of the metrics by using the grammatical information linked by this process to identify line types (e.g. the Sievers/Kuhn system).

Additional dating information for both the manuscripts and the poetry (albeit unreliable at this stage) allows us to trace the history of the word in its poetic and material sources. Likewise, adding geographical data based on the poem’s place of composition and/or recitation allows us to perform diatopic analyses of the words and language of the corpus.

Topics: Nordic Textual Resources and Practices
Keywords: Old Norse, lexicography, poetry, relational databases, web interfaces

Bibliography
Tarrin Wills, ‘The thirteenth-century runic revival in Denmark’, NOWELE 67 (2016), 114-129.
Tarrin Wills, ‘Social Media as a Research Method’, Communication, Research & Practice [special issue ‘Digital Media Research Methods: How to research and the implications of new media data’], 2:1 (2016), 7-19. doi:10.1080/22041451.2016.1155312
Tarrin Wills, ‘Semantic modelling of the Pre-Christian Religions of the North’, Digital Medievalist 9 (2014)
Tarrin Wills, ‘Relational Data Modelling of Textual Corpora: The Skaldic Project and its Extensions’, Literary and Linguistic Computing [Digital Scholarship in the Humanities] (2013) doi:10.1093/llc/fqt045.
Odd Einar Haugen, Matthew Driscoll, Karl Gunnar Johansson, Rune Kyrkjebø, Tarrin Wills, The Menota Handbook: Guidelines for the electronic encoding of medieval Nordic primary sources (Bergen: Medieval Nordic Text Archive (Menota), 2008).



SHORT PAPERS


What’s Missing in This Picture? Political Change and Wordscapes of Latvian Poetry

Anda Baklanē
National Library of Latvia

Lexical and semantic change is an ongoing process in language, and in literature in particular; following cultural and technological developments, new words enter circulation while others are shunned. Political factors also contribute greatly to this process, altering the vocabularies of discourses both evidently and subtly. Drawing conclusions from a word-list analysis of a comprehensive corpus of 20th-century Latvian poetry, the paper looks into the lexical change that followed distinctive political turning points in Latvian history: the Soviet occupation in the 1940s and the regaining of independence in 1990.

It has been previously established that in the aftermath of World War II the literary process in Latvia was greatly affected by censorship and by the new ideological tasks that writers had to assume. A number of topics were officially banned from creative writing (such as criticism of Soviet life, along with references to mysticism, religion, certain historical events etc.), while other topics and utterances quietly vanished from the literary discourse, since features such as expressions of sadness, displays of intimate feelings, and vagueness in general were harshly criticized. The range of topics as well as the vocabularies of authors broadened notably again in the 1970s and 1980s; however, the textual scene already differed remarkably from that of the 1930s. At the beginning of the 1990s, the collapse of the publishing industry trimmed the production of literature; nevertheless, the change of political regime opened seemingly endless possibilities for topics (or, for that matter, avant-garde non-topics) that could now be discussed.

The aim of this study was to explore whether and how these developments can be traced and described in a computational analysis of word usage, as well as to look for patterns that are less apparent and only identifiable via computation. The corpus examined in this study comprises 480 poetry books that were first published between 1920 and 1999 (samples of 60 volumes per decade). This is a new dataset, which was aggregated during the winter of 2016/2017 and has not been statistically studied before. For this paper, the analysis is based on results retrieved from two different tools: the corpus analysis toolkit ‘AntConc’ and the environment for statistical computing ‘R’.

While the analysis of word lists often yields interesting conclusions and hints for further research, there is always the challenge of displaying the results. Visualizations are important not only as a means of presenting the information in a way that is audience-friendly and appealing, but also as cognitive tools that can help the researcher to discover overlooked links and anomalies. In this paper, several visualization tools are explored for displaying the “wordscapes” of Latvian poetry as they change over time, from simple Excel graphs and easy-to-use contemporary web-based tools, such as ‘Voyant’, to more sophisticated network visualization tools, among them the open-source software ‘Gephi’, which require more skill but can also render more exciting and possibly revealing results.

In order to introduce digital methods into the mainstream of humanities research, it is important to develop tools (or user interfaces) that are not forbiddingly complicated. Hence, the approach in this study was not to look for “new” and particularly smart methods, but rather to find simple and mature solutions that could be recommended to researchers as ready-to-use and effective when working, for example, with the digitized materials of cultural heritage at the National Library.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: digital literary stylistics, digital culture studies
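
The word-list comparison across decades that this abstract describes can be illustrated with a short sketch. The study itself names AntConc and R as its tools; Python is swapped in here purely for illustration, and the function names, the simple tokenisation and the toy input are assumptions rather than the author's procedure.

```python
import re
from collections import Counter, defaultdict


def decade_of(year):
    """Map a publication year to the start of its decade, e.g. 1937 -> 1930."""
    return (year // 10) * 10


def word_counts_by_decade(books):
    """Count word forms per decade.

    `books` is an iterable of (year, text) pairs. Tokenisation is a plain
    Unicode-aware split on word characters, which is an assumption: a real
    study would probably also lemmatise the inflected forms.
    """
    counts = defaultdict(Counter)
    for year, text in books:
        tokens = re.findall(r"\w+", text.lower())
        counts[decade_of(year)].update(tokens)
    return counts


def relative_frequency(counts, word):
    """Share of all tokens in each decade that are the given word form."""
    result = {}
    for decade in sorted(counts):
        total = sum(counts[decade].values())
        result[decade] = counts[decade][word.lower()] / total if total else 0.0
    return result
```

Relative rather than absolute frequencies are used so that decades can be compared even when the sampled volumes differ in length. The resulting per-decade series is the kind of table that could then be handed to a charting or network visualization tool.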

The Space Between: The Usefulness of Semi-distant Readings and Combined Research Methods in Literary Analysis

Karl Berglund
Uppsala University, Sweden

The study of literature has traditionally been a qualitative scientific endeavour. Researchers have generally analysed few and canonised works, and these works have been examined in great detail, with “close readings” being the typical choice of method. Franco Moretti’s term “distant reading” and the rise of digital humanities and different sorts of text mining methods have, at least partly, changed this.

Moretti and his ilk in a way turned the scholarship of literature upside down by focusing on units much bigger or much smaller than singular works of literature: “devices, themes, tropes – or genres and systems”, as Moretti put it. Instead of reading the most well-known works of a specific period or genre, it was suddenly possible to read almost all literature published and draw other sorts of conclusions (though this also meant handing over much of the reading process to computers).

Most literary scholars engaged in text mining have used very large data corpora. The general rule seems to have been “the bigger data, the better”. This is certainly true when it comes to showing statistical patterns etc. However, the bigger the material, the longer the distance between the machine-generated results and the qualitative analysis of these results. If your corpus consists of thousands of books it is simply not possible to know the content of this corpus very well. This is at the same time the strength and the weakness of the text mining research on literature conducted in recent years.

In my opinion, debates about pros and cons of text mining methods in the study of literature have been far too black and white. Instead of either or (big data or canonised work, distant reading or close reading) I argue for a position in between. My field of study is contemporary Swedish crime fiction. In an on-going study I use a corpus of 116 Swedish crime fiction novels published 1998–2015 and written by the most well-known and commercially successful authors of this period. Hence, I analyse neither the entire genre nor only the most renowned novels within it, but instead around ten per cent of all Swedish novels published in this period (the top decile). This choice makes it possible both to get the bigger picture and to be very familiar with the material.

Moreover, I approach this corpus through a combination of methods, where some are computer aided and digital (word frequencies, topic modelling), others more traditional and analogue (shallow thematically-oriented readings of the entire corpus). Together these methods provide solid knowledge of the genre that is both quantitative and qualitative.

In my presentation I argue that such a combination of methods on semi-big data or corpora can be very fruitful to many literary studies, with material from different epochs and genres. Literary scholars should start to make use of this “space between” the very distant and the very close, and let computer-aided methods serve as a helping hand rather than a goal in itself.

Topics: Nordic Textual Resources and Practices
Keywords: distant reading, text mining, method, popular fiction, crime fiction

Bibliography
Mordförpackningar. Omslag, titlar och kringmaterial till svenska pocketdeckare 1998–2011, (Uppsala: Uppsala universitet, 2016), 283 pp.
”Ett halvt sekel litteratursociologi. En kvantitativ genomgång av skriftserien Skrifter utgivna av Avdelningen för litteratursociologi vid Litteraturvetenskapliga institutionen i Uppsala 1967–2015”, Spänning och nyfikenhet. Festskrift till Johan Svedjedal, (ed.) Gunnel

Furuland, Andreas Hedberg, Jerry Määttä, Petra Söderlund & Åsa Warnqvist (Möklinta: Gidlunds, 2016), pp. 482–499
[Review: Matthew L. Jockers, Macroanalysis: Digital Methods and Literary History, Urbana: University of Illinois Press, 2013], Samlaren. Tidskrift för forskning om svensk och annan nordisk litteratur, vol. 135, 2014, s. 342–345
“A Turn to the Rights: The Advent and Impact of Swedish Literary Agents”, Hype: Bestsellers and Literary Culture, (ed.) Jon Helgason, Sara Kärrholm & Ann Steiner, (Lund: Nordic Academic Press, 2014), pp. 67–88
Deckarboomen under lupp. Statistiska perspektiv på svensk kriminallitteratur 1977-2010, (Uppsala: Uppsala universitet, 2012), 224 pp.

”These Memories Won’t Last”: Visual Representations of the Forgotten

Jennifer J Dellner
Ocean County College, United States of America

“These Memories Won’t Last” is a digital comic (2012) by Stuart Campbell that depicts his grandfather’s descent into Alzheimer’s and the attempts of both the grandfather and Campbell to piece together and make sense of two simultaneous pasts: the grandfather and his life as a WW II soldier as well as Campbell’s memories of his grandfather’s forgetting. Beginning with Campbell’s digital comic, this study examines two other pieces of e-literature, Strasser and Coverly’s “in the white darkness” (2004) and Wilks’ Rememori (2012), a digital poem and game respectively, whose common aim is to present experiences and representations of memory loss: while primarily visual pieces, each seeks to invoke in the reader the diminished ability to access and make sense of the past.

Drawing upon epistemological theories of constructivism and emergent learning (e.g. McMurtry, Osberg, Biesta and Cilliers) and those concerned with the affordances of digital art, representation, and interaction (e.g. Strickland, Coverly/Luesebrink, Augsburg), the paper explores the ways in which the works themselves theorize relationships between experiential knowledge and constructions of “the past.” Since a good deal of digital literature is non-linear, it is not enough that these pieces simply use ergodic modalities; instead, the focus of the study is specific visualizations of the forgotten or fading past. Thinking of knowledge in relational terms is to see it as a dynamic relationship between the knower and world, a participatory relationship where knowledge allows the knower “to interact effectively with something else” (McMurtry). Memory loss reconfigures that world and results in a loss of the efficacy of interaction as modes of relating begin to weaken, shift, and disappear. The fact that these pieces are about memory loss suggests specific configurations of the experience of knowing/forgetting and allows us to backwards engineer, in a sense, the epistemological implications that underpin the visualization of what is being lost. As such, the back end of the visualization will be explored in terms of its relationship to these ideas.

Strickland (2009) writes, “time-space processing in e-lit is of another sort [from print literature]. It encompasses … kinds of time-space processing that authors set out deliberately to explore, because the computational situation allows them to imagine and build with their (code) writing.” In “These Memories Won’t Last,” the image of a disappearing rope serves to link vignettes of the narrative together at the same time as it signals the grandfather’s inability to do so. The more one manipulates it or scrolls back, the more it fades and becomes irretrievable. This design is ironically dependent on jQuery architecture, a feature of which is chaining, represented, I argue, as the rope or thread that stands for the grandfather’s memories and his attempts to chain or link them into a coherent past of memories; as these fail, the very chaining in the code enables this representation. The paper concludes with an examination of the tensions between the visualization of pasts lost and the techno-artistic choices that encode them.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: memory, forgetting, e-literature, digital comic, design

Bibliography
Forthcoming: DH as Intervention, Hybrid Pedagogy, early 2017
McMurtry, A. & Dellner, J. (2014) “Relationalism: An interdisciplinary epistemology. Or, why our knowledge is more like a coral reef than fish scales.” Integrative Pathways: Newsletter of the Association for Interdisciplinary Studies, 36 (3), 1, 6-12
Chapter in a Book: “Children of the Island: Ovid in the Poetry of Eavan Boland and Derek Mahon,” in (ed. J. Ingleheart) Two Thousand Years of Solitude: Exile After Ovid, Oxford University Press. 2011
“The Big End: William Gibson and the Ecology of Cool,” American Fiction Reflecting Global Ecological Concerns, ed. Linda Cook, Cambridge Scholars Press. Tentative: Under Contract

From Theory to Practice: The Sett i gang Web Portal

Kari Lie Dorer
St. Olaf College, Minnesota, United States of America

The Sett i gang web portal began as a collaborative student-faculty project in 2014 and is currently used by approximately 300 students at approximately 15 universities in North America. The portal was created based on an understanding of the scholarship of teaching and learning within beginning Norwegian language instruction and also within an online learning environment. It is an extension of two of my earlier projects, the theoretical findings from my dissertation (Lie, 2008) and the first edition of Sett i gang (Aarsvold & Lie), a text geared towards North Americans that I co-authored and taught from for 10+ years.

Before the portal’s conception, students used a print-only curriculum entitled Sett i gang (a print workbook and print glossary to supplement a print textbook) for beginning language learning. Now, the second edition of Sett i gang uses technology to motivate and stimulate language learning in new and meaningful ways by combining a print textbook with an online web portal. This web portal houses thousands of language learning resources together in one location for first year Norwegian language learners. It is expansive as well, housing 800+ webpages, 500+ interactive activities, 500+ flashcards and vocabulary games, 500+ audio clips, additional resources for instructors and many links to authentic materials online.

The portal was built from an understanding of how theory and practice meet in the interdisciplinary fields of Applied Linguistics, Foreign Language Learning, Educational Technology, and Online Learning. This presentation will focus on 10 specific research findings from the above-mentioned fields, which shed light on how students experience an online learning environment differently from a face-to-face environment. Additionally, this talk will examine how specific research findings have helped to create a platform that can provide learners with the learning experience they need to be successful.

These findings and references to studies include: immediate feedback (Northrup, 2002; Brown, 1996; Lie 2008; Csíkszentmihályi, 1990); proximal goals & mastery experiences (Bandura, 1986); advance organizers (Ausubel, 1968; Chen & Hiumi’s, 2009); authentic texts (Harmer, 1991; Lee, 1995); authentic tasks (Reeves, Herrington, Oliver & Woo, 2004); life-long learning (Bilash, Gregoret & Loewen, 1999); raising metalinguistic awareness (Roth, Speece, Cooper, & de la Pazas, 1996; Sorace, 1985; Alderson, Clapham & Steel, D., 1996); reducing foreign language anxiety (Horwitz, Horwitz & Cope, 1986; Horwitz & Young, 1991; Crookall & Oxford, 1991; Krashen, 1985); and reducing technological anxiety (Saadé & Kira, 2009).

I will also discuss how this project is one small portion of a four-year $700,000 Andrew Mellon Foundation grant aimed at exploring and developing the digital humanities at St. Olaf College. One unique piece of this project is the emphasis given to faculty-student collaboration; it funds faculty to explore new ways of teaching and new lines of inquiry for research while also enabling students to learn digital research methodologies relevant to careers in the humanities and humanistic social sciences.

I will conclude with the preliminary results of an intensive research project conducted on student use and perception of the portal which seeks to complete the theory to practice and back to theory cycle, again a student-faculty research collaboration.

Topics: The Digital, the Humanities, and the Philosophies of Technology
Keywords: applied linguistics, foreign language learning

Automated Improvement of Search in Low Quality OCR Using Word2Vec

Thomas Egense
Statsbiblioteket, Denmark

In the Danish Newspaper Archive[1] you can search and view 26 million newspaper pages. The search engine[2] uses OCR (optical character recognition) from scanned pages, but often the software converting the scanned images to text makes reading errors. As a result the search engine will miss matching words due to OCR errors. Since many of our newspapers are old and the scans/microfilms are also of low quality, the resulting OCR constitutes a substantial problem. In addition, the OCR converter performs poorly with old font types such as fraktur.

One way to find OCR errors is by using the unsupervised Word2Vec[3] learning algorithm. This algorithm identifies words that appear in similar contexts. For a corpus with perfect spelling the algorithm will detect similar words: synonyms, conjugations, declensions etc. In the case of a corpus with OCR errors the Word2Vec algorithm will find the misspellings of a given word, caused either by bad OCR or, in some cases, by journalists. A given word appears in similar contexts despite its misspellings and is identified by its context. For this to work the Word2Vec algorithm requires a huge corpus, and for the newspapers we had 140GB of raw text.

Given the words returned by Word2Vec we use a Danish dictionary to remove the same word in different grammatical forms. The remaining words are filtered by a similarity measure using an extended version of Levenshtein distance taking the length of the word and an idempotent normalization taking frequent one and two character OCR errors into account.

Example: Let’s say you use Word2Vec to find words for banana and it returns: hanana, bananas, apple, orange. Remove bananas using the (English) dictionary since this is not an OCR error. For the three remaining words only hanana is close to banana and it is thus the only misspelling of banana found in this example. The Word2Vec algorithm does not know how a word is spelled/misspelled; it only uses the semantic and syntactic context.

This method is not an automatic OCR error corrector and cannot output the corrected OCR. But when searching it will appear as if you are searching in an OCR corrected text corpus. Single word searches on the full corpus give an increase from 3 % to 20 % in the number of results returned. Preliminary tests on the full corpus show only relatively few false positives among the additional results returned, thus increasing recall substantially without a decline in precision.

The advantage of this approach is a quick win with minimum impact on a search engine [2] based on low quality OCR. The

algorithm generates a text file with synonyms that can be used by the search engine. Not only single words but also phrase search with highlighting works out of the box. An OCR correction demo[4] using Word2Vec on the Danish newspaper corpus is available on the Labs[5] pages of The State And University Library, Denmark.

Topics: Nordic Textual Resources and Practices
Keywords: text, corpora, NLP, OCR

References
[1] Mediestream, The Danish digitized newspaper archive. http://www2.statsbiblioteket.dk/mediestream/avis
[2] SOLR or Elasticsearch etc.
[3] Mikolov et al., Efficient Estimation of Word Representations in Vector Space https://arxiv.org/abs/1301.3781
[4] OCR error detection demo (change word parameter in URL) http://labs.statsbiblioteket.dk/dsc/ocr_fixer.jsp?word=statsminister
[5] Labs for State And University Library, Denmark http://www.statsbiblioteket.dk/sblabs/

Reading Moravian Lives: Overcoming Challenges in Transcribing and Digitizing Archival Memoirs

Katherine Faull
Bucknell University, United States of America
Trausti Dagsson
University of Gothenburg, Sweden
Michael McGuire
Bucknell University, United States of America

The Moravian Lives project aims to digitize, transcribe, and publish for analysis more than 60,000 manuscript and print memoirs, written by members of the Moravian Church between 1750-2012. These memoirs are housed in archives throughout the world, making it difficult for scholars to engage with them as an entire corpus. Furthermore, of the 18th-century memoirs, over 90 % are in manuscript form. As project collaborators establish the foundations of a massive digital archive that houses facsimiles of the memoirs, we wrestle with how best to publish the memoirs in machine-readable format: existing optical character recognition (OCR) software does not reliably manage 18th century German script; in addition, the volume of pages to be transcribed challenges traditional transcription capabilities. Research teams at Bucknell and the University of Gothenburg in Sweden are collaborating to develop a suite of tools that will support large-scale controlled crowdsourcing of transcription and exportation of text and data sets to support a wide range of research needs by scholars in fields ranging from autobiography to theology, religious history, social history, historical and computational linguistics, and gender studies. In this paper, Katie Faull and Trausti Dagsson will discuss the challenges we face as we establish best practice for developing an interactive platform for editing and accessing this critically significant collection.

Topics: Nordic Textual Resources and Practices
Keywords: transcription, digital history, autobiography, metadata, Moravian

Bibliography
“Doing DH in the Classroom: Transforming the Humanities Curriculum through Digital Engagement” (with Diane Jakacki) Doing Digital Humanities: Practice, Training and Research. Richard J. Lane, Raymond Siemens, and Constance Crompton, eds. Abingdon, UK: Routledge. Forthcoming.
“Reifying the Maker as Humanist” (with Diane Jakacki and John Hunter). Making Humanities Matter, Jentery Sayers, ed. Minneapolis, MN: U. of Minnesota Press. Forthcoming.
Faull, Katherine (with Diane Jakacki). “Digital Learning in an Undergraduate

Context: Promoting Long Term Student-Faculty Collaboration.” Digital Scholarship in the Humanities. DOI: http://dx.doi.org/10.1093
“Anna Nitschmann” Pietismus Handbuch, ed. Wolfgang Breul, Mohr Siebeck Verlag. Forthcoming.

Senses and Emotion of Early-modern and Modern Handicrafts – Digital History Approach

Johanna Ilmakunnas
University of Turku, Finland

The proposed paper explores handicrafts (embroidery, plain sewing, shellwork, papercuts, silhouettes, woodturning etc.) in Europe, c. 1700–1850 and the sensory and emotional practices linked to them. The paper discusses how manual work can be found in the wealth of sources, digital and non-digital, textual, visual and material. The paper also aims to explore what possibilities and restrictions historians may encounter while using digitized museum collections as source material. The paper will discuss the possibilities of exploring previously relatively closed museum collections of objects and the potential for novel approaches that digitized collections offer for historical research. It also discusses the opportunities text and image recognition bring to a subject that has been little researched despite important recent work on handiwork and research projects digitizing sources (e.g. the project ‘Lady’s Magazine: Understanding the Emergence of a Genre’ at the University of Kent). Furthermore, restrictions such as insufficient information on images, inadequate metadata or strict copyright regulations will be discussed.

The paper presents a new research project that explores handiwork done by European elites. It is part of a larger project on the work and profession of early-modern European elites, led by Prof. Johanna Ilmakunnas. Within the project, citizen science and crowdsourcing will be used especially when collecting visual and material interpretations of early-modern and modern handiwork. Furthermore, the project will apply text recognition tools (Transkribus) developed within the EU H2020 project ‘READ – Recognition and Enrichment of Archival Documents’.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: material culture, elites, 18th century, 19th century, Europe

Bibliography
Johanna Ilmakunnas & Jon Stobart (eds), A Taste for Luxury in Early Modern Europe: Display, Acquisition and Boundaries. Bloomsbury 2017.
Johanna Ilmakunnas, Marjatta Rahikainen & Kirsi Vainio-Korhonen (eds), Early Professional Women in Northern Europe, c. 1650–1850. Routledge 2017.
Johanna Ilmakunnas, ‘Embroidering Women & Turning Men: Handiwork, Gender and Emotions in Sweden and Finland, c. 1720–1820’, Scandinavian Journal of History, Special Issue on Gender, Material Culture and Emotions in Scandinavian History. 41:3 (2016), pp. 306–331. DOI:10.1080/03468755.2016.1179831.
Johanna Ilmakunnas, Joutilaat ja ahkerat: Kirjoituksia 1700-luvun Euroopasta. Siltala Publishing 2016, 272 p. [Idle and industrious: Writings from eighteenth-century Europe]

Reading Through the Machines: Epistemology, Media Archeology and the Digital Humanities

Jonas Ingvarsson
University of Skövde, Sweden

In this presentation I approach the more abstract relations between art and digital culture, the dimension of »digital epistemology», where »the digital» is regarded not as a

set of technologies, structures or gadgets, but rather as a lens (Lindhé 2013; O’Gorman 2006), through which we focus on culture, history and our own contemporary times. Initially, this has meant relating literary texts to digital culture and history even though these texts do not explicitly mention digital culture, computers or networks, and are not published on a digital platform. By performing these readings, I have also found it productive to relate the forms of the digital to early modern aesthetic genres, many of which – for example the emblem (Daly 1979; Agrell 1994; Manning 2002) and the cabinets of curiosities (Bredekamp 1995) – were regarded not only as genres but as modes of thought. The entrance into digital culture and digital aesthetics therefore also becomes a historical tool. Moreover, the connections between our own digital age and early modern modes of thought could foster a new understanding of our own technological times.

This short presentation will introduce some of the critical perspectives I have probed in an ongoing research project. While discussing these perspectives, I use »mode of thought» as an epistemological concept, and »lens» as the driving metaphor. As a background, though, I should mention professor Alan Liu’s short paper on the notion of »the epistemology of the digital» (Liu, 2014). Liu identifies a few important fields where digital environments could or should influence the academic curriculum in general and the Humanities in particular. The point of Liu’s text is that digital knowledge is not a concern only related to digital objects and electronic culture, big data, the digitalization of the cultural heritage and new positivist trends in its wake – no, digital knowledge should announce an epistemic shift for the academic practice as such. The aim of Liu’s »provocation more than a prescription» is to challenge the basic structures of knowledge distribution and production within the academic field.

In this presentation, I will narrow down these challenges to a few more concrete aspects of how digital epistemology can inform the analyses of literature and cultural artifacts. Digital epistemology in this mode functions as a multifocal lens by which we zoom in and explore the digital not only as technology, object or network, but as a critical concept and historical facticity in the reflection upon our cultural environments. In this presentation, then, I intend to propose a few intersecting – and heuristic – approaches to digital epistemology:

1. Relating literary texts and artworks to digital culture and digital history. That is: What does it mean to relate cultural artifacts to the communicational and organizational logic that has been put forward – in different ways – by digital technology since the 1950s?
2. Reading analogue literature and art as if they were electronic texts. That is: What happens if we analyze for example a print novel in terms of embodiment, processes, performativity, materiality and even »software», or other »buzz concepts» in the analytic tradition of electronic texts and digital culture? Will this encourage a focus not on what an artwork means but on what it does?
3. Juxtaposing expressions of digital culture with early modern modes of thought. That is: How do social media such as Instagram, Facebook and Twitter relate to the Salon Culture? How do computer games and web pages relate to the aesthetics of the Renaissance emblem? How do the results of Internet search engines relate to the Cabinets of Curiosities, or to the archival »principle of pertinence» (sort by subject rather than provenance)?

These lines of digital epistemology do have one thing in common: the digital is seen as a mode of thought, rather than as a set of gadgets, machines or electronic networks. The concept of digital epistemology suggests that the humanities curriculum should be revised, since »the digital» – understood as a perspective, or a set of lenses – shifts our focus in the treatment of contemporary culture as well as of historical topics and aesthetics.

Topics: The Digital, the Humanities, and the Philosophies of Technology
Keywords: digital epistemology, media archaeology, hypertext theory, game philosophy

Organizational and Educational Issues in Representing History through a Series of Data Sprints on Visual Data from an API

Lars Kjær
Ditte Laursen
Stig Svenningsen
Mette Kia Krabbe Meyer
The National Library, Denmark

While archives and libraries have made digital data available in dissemination platforms for decades, with access to one single object at a time, they have little – but growing – experience in making data available as datasets through APIs and making them available in user-friendly ways. Correspondingly, students and researchers in the field of humanities have little – but growing – understanding of using digital data from APIs. In this presentation, we will present the results of a university and library collaboration on making data available and bringing it into play through an API, in a series of data sprints. We base our presentation on interviews with participants, on analysis of the products that they made during the data sprints, and on our own experiences as organizers and data providers.

A data sprint is in our definition an intensive period where a group of people work with selected data by collecting, refining, analyzing and visualizing it to solve a problem, to create insights, and to learn about a topic. About 50 students and researchers from Copenhagen University and Copenhagen IT University joined the exploration of the material in three data sprints during autumn 2016 (http://kub.kb.dk/humlab/datasprint). The participants had very different skills within humanities and IT. For instance, some were experts on the subject of colonial history, others had technical programming skills, and others just had an interest in combining and learning about using digital data in new ways. In turn, we as organizers brought a variety of competences, i.e. in the archival material, in visual culture, in spatial humanities, in technology and in running data sprints.

The data involved were maps, images and metadata related to the former Danish colonies. 2017 is the centenary of the sale of the Danish West Indies and the material raises a range of possible questions like: What is drawn, surveyed and photographed? What is the origin and the context of the material and how did it find its way to the collections of the library? How do we communicate it in today's postcolonial society?

While we are still running the events and processing interviews and experiences when writing this abstract, preliminary results suggest that a data sprint is a suitable format for creating an interdisciplinary and cross-material framework for releasing the potential of digital humanities in relation to digitized cultural heritage in archives and national libraries. However, there is also a need for improving access to, as well as the interoperability of, the digital data held by the library and other cultural institutions. Moreover, there is a need for setting up boot camps/workshops prior to data sprint events to strengthen the participants' digital skills in tools such as Tableau, OpenRefine, Python and Geographic Information Systems (GIS).

On a broader canvas, this study provides empirical evidence of organizational barriers and possibilities for archives and libraries in making digital data available in new ways, as well as support for recent discussions on educational issues in balancing a strong theoretical and methodological grounding in humanities with an understanding of technology.

Topics: Nordic Textual Resources and Practices
Keywords: data sprints, API, visual data, open data

Bibliography
http://kub.kb.dk/humlab/datasprint
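The data sprint steps named in this abstract (collecting, refining, analyzing) can be illustrated with a minimal Python sketch. It is not the organizers' workflow and it does not call the library's API: the records, the field names ("type", "year", "title") and the helper functions are all hypothetical placeholders, standing in for whatever a real API response would contain.

```python
from collections import Counter

# Hypothetical records standing in for a collected API response.
# Field names and all values are invented for illustration only.
SAMPLE_RECORDS = [
    {"type": "map", "year": "1856", "title": "Placeholder map A"},
    {"type": "image", "year": " 1901 ", "title": "Placeholder image B"},
    {"type": "image", "year": "", "title": "Placeholder image C"},
    {"type": "map", "year": "1799", "title": "Placeholder map D"},
]


def refine(records):
    """Refining step: trim whitespace and drop records without a numeric year."""
    cleaned = []
    for record in records:
        year = record.get("year", "").strip()
        if year.isdigit():
            cleaned.append({**record, "year": int(year)})
    return cleaned


def count_by_type(records):
    """Analysis step: count how many records of each type remain."""
    return Counter(record["type"] for record in records)


def count_by_century(records):
    """Analysis step: group records by century start year (1856 -> 1800)."""
    return Counter(record["year"] // 100 * 100 for record in records)


if __name__ == "__main__":
    refined = refine(SAMPLE_RECORDS)
    print(count_by_type(refined))
    print(count_by_century(refined))
```

The resulting counts could then be passed to a visualization tool of the participants' choice; that step is left out here because it depends on the tool used.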

The Afterlife of Early Modern Portraiture in Digitized Museum Collections: Discovering Conventions and Forgotten Images

Charlotta Krispinsson
Stockholm University, Sweden

The aim of this paper is to discuss how digitized museum collections of early modern portraiture added analytical possibilities to my recently finished PhD project (Historiska porträtt som kunskapskälla: Samlingar, arkiv och konsthistorieskrivning, Nordic Academic Press, 2016).

A methodological point of departure for the project was to treat early modern portraiture as a material as well as mental category of images. My interest was to study the modern reception history of this category, ca 1880-1945. For this reason, the investigation started with a need to take stock of the characteristics of early modern painted portraits. The previous research on early modern portraiture is vast, but is often characterised by an aesthetic and art theoretical focus on singular works that do not reflect the historical artistic production of portraits in different media as a whole.

The Swedish national portrait collection (part of the collection of Nationalmuseum) consists of ca 3 000 objects. It is, together with the collections of the National Portrait Gallery in London, one of the largest collections of portraiture in Europe. Together they comprise a large quantity of portraits, selected mainly according to the name of the depicted subject (and not according to the artistic merits of the portrait painter). Browsing these digitized museum collections of early modern portraiture thus made it possible to better detect visual conventions characteristic of the historical production as a whole.

Early modern portraiture and contemporary selfies are both images of individuals where identity reflected through stereotypical expressions of self is key. To continue this comparison, digitized national portrait collections provide a centralized data mass of images, similar to the data set of 3 200 selfies provided by one of the most well-known projects in digital humanities today, SelfieCity, coordinated by media theorist Lev Manovich.

In my presentation, I would like to compare methods and outcomes between my project and the methodological foundation of SelfieCity, and also expand upon how digitized museum collections could provide new opportunities to art historical research. Scanning through the afterlife of early modern portraits in digitized national portrait collections provided different kinds of historical insights than close readings of a few, select portraits could. It showed how the typical kind of early modern portraiture put on display in art museums today (chosen for originality, artistic quality, or the work's position in the history of art) needs to be regarded as a rare exception in the total production of portraits, just as the iconic selfie is a rare exception to the big data of quickly forgotten, digital images.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: art history, SelfieCity, early modern portraiture

Bibliography
Historiska porträtt som kunskapskälla: Samlingar, arkiv, konsthistorieskrivning, diss. Stockholm, Nordic Academic Press, Lund 2016.
”Collecting Faces: Art History and the Epistemology of Portraiture. The Case of the Swedish Portrait Archive”, Sensorium Journal, no. 1, 2016.
”Collection BIOMUS / Museum Fantasies”, How to gather? Acting in a Center in a City in the Heart of the Island of Eurasia, utst. katalog, Moscow Biennale Art Foundation, Moskva 2015.
”Aby Warburg’s Legacy and the Concept of Image Vehicles. ”Bilderfahrzeuge”: On the Migration of Images, Forms and Ideas. London 13-14 March 2015”, Konsthistorisk tidskrift, nr. 4, vol. 84, 2015.

”The Challenge of the Object. CIHA:s (Congrès International d'Histoire de l'Art) 33:e internationella kongress för konstvetare. Nürnberg 15–20 juli 2012.”, Konsthistorisk tidskrift, vol. 81, 2012:3.
”Catharina Nolin: En svensk lustgårdskonst - Lars Israel Wahlman som trädgårdsarkitekt, Stockholm 2008”, Konsthistorisk tidskrift, vol. 80, 2011:1.
”Lars Nilsson och svensk postmodernism före 1987”, Valör 2011:1.

Málið.is: An Icelandic Web Portal for Dissemination of Information on Language and Usage

Ari Páll Kristinsson
Halldóra Jónsdóttir
The Árni Magnússon Institute for Icelandic Studies, Iceland

A new web portal on the Icelandic language, and language use, was opened in Iceland in November 2016. The users of málið.is only need this single web address in order to access abundant reliable and authoritative information, guidance, help and advice on the Icelandic language, its use and nuances, historically and contemporarily. This concerns e.g. orthographical matters, grammatical issues such as inflections, grammatical agreement, and a variety of other questions of syntax, word formation, semantics, the lexicon, the history and etymologies of particular lexical entities, phraseology, synonymity, terminologies and translations of technical vocabulary, and many questions of language and usage.

Previously, these resources were accessible via a variety of different formats, user interfaces, web addresses, search methods and functions, which caused problems for many users as they were typically not aware of all possibilities.

Among the principal target groups of málið.is are students, and writers of non-fiction, while the web portal is nonetheless designed to serve the Icelandic-speaking public in general. Málið.is strives for plain and non-technical exposition and conciseness, whenever possible.

A major challenge in the process of creating and launching málið.is was the different nature and content of the various language resources on the one hand, and the different motivations and expectations of individual users on the other hand. Some of the data are explicitly of a prescriptive nature, while others have a primarily descriptive function. As we do not expect users (except those who have linguistic training) to be immediately familiar with this fundamental distinction, we realized that this could perhaps lead to misinterpretation of the data presented. However, since málið.is facilitates the comparison between the two data types, our conclusion is that users will be able to acknowledge the distinction. Indeed, one theoretical contribution of málið.is is that it highlights the difference between descriptive and prescriptive language resources, for the benefit of students and researchers.

The name of the web portal is its web address: málið.is, which translates as ’the language.is’. The functions of this portal are in many ways similar to the Danish web portal sproget.dk. Indeed, the Danish portal, initiated and operated by our colleagues at the Danish Language Council and Society for Danish Language and Literature, served as a model and an inspiration as we were planning this Icelandic web portal, at The Árni Magnússon Institute for Icelandic Studies. Thus, málið.is is an example of fruitful Nordic cooperation in the field of digital humanities. The two portals differ in some details, e.g. in that málið.is primarily focusses on its source databases and search results, while the sproget.dk main site also offers the user a variety of links, games, suggestions etc. The team behind the planning of málið.is is not convinced that it is feasible to add much material of this type on the website. Another difference worth mentioning is that while sproget.dk e.g. explicitly comments that the spelling of the ODS is not necessarily in harmony with

modern spelling rules, málið.is leaves the interpretation of the data to the user.

Topics: Nordic Textual Resources and Practices
Keywords: web portal, Icelandic, language resources, Nordic cooperation, dissemination of knowledge

Bibliography (selected)
Kristinsson, Ari Páll. 2016. Language in public administration in present-day Iceland: some challenges for majority language management. In: Language use in public administration. Theory and practice in the European states. Pirkko Nuolijärvi & Gerhard Stickel eds. European Federation of National Institutions for Language. Budapest: Research Institute for Linguistics, Hungarian Academy of Sciences. Pp. 83-92.
Kristinsson, Ari Páll. 2016. English Language as ‛Fatal Gadget’ in Iceland. In: Why English? Confronting the Hydra. Pauline Bunce, Robert Phillipson, Vaughan Rapatahana & Ruanni Tupas eds. Bristol: Multilingual Matters. Pp. 118–128.
Kristinsson, Ari Páll. 2016. Om følgerne af leksikalsk purisme i Island. [On the consequences of lexical purism in Iceland] Dansk Noter 1/2016:40‒44.
Kristinsson, Ari Páll. 2016. Editor of Orð og tunga 18. [Orð og tunga is a peer-reviewed journal on language and linguistics, published annually by the Árni Magnússon Institute for Icelandic Studies.]

[Re]use of Medieval Paintings in the Network Society: A Study of Ethics

Pakhee Kumar
IMT School of Advanced Studies Lucca, Italy

The internet society is a “network society” (Castells, 2014) characterised by quickness of information. It is also an “era of blurring boundaries between interpersonal and mass, professional and amateur, bottom-up and top-down communications” (Shifman, 2014). In this society, the cultural participants are not interested in being passive consumers (Kolb, 2005) of culture; rather, they recreate culture by reusing, remixing, and recirculating it. One of the intrinsic characteristics of the internet society is the use of images to convey sentiments, ranging from cynicism to humour and often concerning contemporary issues. Such images are referred to as memes (see Fig. 1). In fact, this simplified, clear and concise way of expressing complex sentiments is an essential and indispensable part of the contemporary digital culture.

Figure 1. Memes created by Medieval Reactions using paintings.

The word meme was introduced by Dawkins (1989, p. 92) to explain a concept of culture. He noted that the meme is the new replicator, a noun that conveys the idea of a unit of cultural transmission, or a unit of imitation. Further, the Oxford dictionary defines meme as “a humorous image, video, piece of text, etc., that is copied (often with slight variations) and spread rapidly by Internet users” (meme, n.d.) to spread a particular idea (Colin & Knobel, 2007).

Creating a meme does not require any particular artistic skills, only a connection to the internet; hence, memes can be created, circulated and consumed by anyone. This reflects the freedom to participate envisioned by Berners-Lee (2010, p. 82) that “people must be able to put anything on the Web, no matter what computer they have, software they use, or human language they speak and regardless of whether they have a wired or wireless connection”. In this process, not only is the original context of the paintings/artwork lost, but the meaning of the painting is also altered. However, this does not diminish the popularity of such images. Shifman (2014) raised a few questions regarding this issue: how did such a bizarre piece of culture become so successful? Why are so many people investing so much effort in inventing it? Why do some of these amateur imitations attract millions of viewers?

One possible reason for their popularity may be that memes are minimalistic and therefore an easy way of catching attention. Moreover, the relationship to contemporary issues further adds to their popularity. Lastly, the attempt to find humour even in immoral situations may also be a reason for their popularity. Indeed, every age looks at the past in a different way. However, the question is whether this creation and consumption degrades the original content or enhances it by utilising amateur and untrained yet skilful people.

This paper will examine the reuse of medieval paintings in the internet age. It will examine various typologies of reuse to represent contemporary sentiments. Lastly, the paper will examine the ethics related to the circulation of such images on the internet.

Topics: The Digital, the Humanities, and the Philosophies of Technology
Keywords: digital culture, meme, ethics

Digitization of Literary Fiction. Example of Jan Potocki's The Manuscript Found in Saragossa

Rafał Kur
Jagiellonian University, Poland

The Manuscript Found in Saragossa (original title Manuscrit trouvé à Saragosse) was written by Jan Potocki in the years 1797-1805. The work consists of several plots making up different stories. The thick web of connections between places, plots and protagonists in The Manuscript Found in Saragossa, while partly following the story-within-a-story formula, goes beyond it; it more closely resembles a tangle or a maze. While it is in fact one story, it is told in several dozen ways and it is filled with quotes and repetitions. This kind of composition, recorded on paper, in which one story includes another one, while within the second one emerges a third, still obscures from the reader the web of internal connections between the narrators, characters, events and places.

That is why a Krakow literary community, with the help of IT specialists and graphic designers, created a reinterpretation of the work, adding a visual layer. The completed work was made available on a website. Owing to the project, the book may be read anew, and readers can discover more fully the talent and imagination of Potocki, which are overwhelmed by the sheer number of pages in the traditional printed form.

Digital text is a web and a database, a space in which distant elements are only one click away from each other. Each of the 66 day-chapters was given a plaque, owing to which we will not get lost in the labyrinth of plots and characters, and we will be able to follow individual plots in a free order without the fear of missing a part of the story. The only required tool is a web

browser. A clear and simple graphical interface leads the reader wherever one wishes, while the visual setting creates a unique atmosphere. The Manuscript Found in Saragossa highlights the vividness of the form of the story, but the creators additionally used the iconography of the film adaptation of the novel (The Saragossa Manuscript, 1965, directed by Wojciech Has), that is, its gothic, picturesque, grotesque and vintage elements.

I chose the example of the work of Potocki, since it is an interesting, fresh electronic adaptation of a novel and it fits perfectly the literary and digital game of associations. Equally good material could be the stories "The Garden of Forking Paths" of Borges (El jardín de los senderos que se bifurcan, 1941), "Hopscotch" by Cortazar (Rayuela, 1963), or "Life a User's Manual" by Perec (La vie mode d’emploi, 1978).

Novels of this kind usually leave the reader tired of ever new, budding stories and confused by the names of the characters and of the novel's locations. Books of this type are not easy for an average reader. That is why, while presenting the example of the prose of Count Potocki in a new digital setting, I would also like to show a method of refreshing literary texts. The visualisation and the interface, while being basic tools for this type of novel, are becoming increasingly widespread.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: digitization, visualisation of literary narrative, internet, eighteenth-century literature

Multidisciplinary Terminology Work in the Humanities: New Form of Collaborative Writing

Tiina Mirjami Käkelä-Puumala
University of Turku, Finland

In my paper, I’ll present a multidisciplinary terminology project that started in February 2015 in the Bank of Finnish Terminology in Arts and Sciences (BTA). The BTA was founded in 2011 as a permanent open access termbase for all fields of research in Finland. One of the main goals was to create a collaborative environment for experts from different fields. Term entries in the BTA are written by experts, but the termbase uses a Semantic MediaWiki platform, which offers all registered users the possibility to participate in the discussion about scientific terms. The Bank of Finnish Terminology has been funded by the University of Helsinki and the Academy of Finland.

In the beginning of 2015 I started, with a couple of researchers, a special working group within the BTA that focused on terms that were used across different humanities disciplines (terms like representation, sign, performance, discourse, affect, realism, critique, text, code, expression, etc.). The group consisted of experts from aesthetics, linguistics, literary studies, philosophy, semiotics and theatre studies. Our goal was to collaboratively write definitions and descriptions for multidisciplinary terms in the humanities. Our work represents a new form of collaborative writing made possible by digitalization. Firstly, it exceeds disciplinary boundaries that have usually been very strong in terminology work and provides via hyperlinks much more information on the multidisciplinary use of scientific terms. Secondly, the collaborative writing process and multidisciplinary approach create a new way of depicting and understanding conceptual history, which is essential for both research and higher education (not to mention the general public). Thirdly, the collaborative work represents a form of academic communication that is ongoing, self-correcting, and not confined to the conditions of predigital academic publishing.

Links
http://tieteentermipankki.fi/wiki/Termipankki:Etusivu/en (in English)
http://tieteentermipankki.fi/wiki/Termipankki:Etusivu (in Finnish)
http://tieteentermipankki.fi/wiki/Monitieteinen_termity%C3%B6 (in Finnish)

Topics: Nordic Textual Resources and Practices
Keywords: humanities, terminology, multidisciplinary

Bibliography
"Interdisciplinary Terminological Work, Family Resemblance and Interdisciplinary Concept Analysis." Markku Roinila & Tiina Käkelä-Puumala (Presentation at conference Crossing Borders 2015). https://www.academia.edu/20941996/Interdisciplinary_Terminological_Work_Family_Resemblance_and_Interdisciplinary_Concept_Analysis
(forthcoming 2017) "This Land Is My Land, This Land Also Is My Land”: Real Estate Narratives in Pynchon’s Fiction" Textual Practice 1:2017
“Postmodern ghosts and the politics of invisible life.” Death in Literature. Sari Kivistö and Outi Hakola (eds.). Cambridge Scholars Publishing, 2014.

Towards a Reader-friendly Digital Scholarly Edition

Sebastian Köhler
Society of Swedish Literature in Finland

Looking at digital scholarly editions today you will usually find that “digital” implies “to be used in a desktop environment”. Digital scholarly editions are seldom adapted, let alone optimized, for small-screen devices like smartphones and tablets, and often have features superfluous to users who just want to read the “plain” text. Consequently, digital scholarly editions are generally not expressly reader-friendly in a non-scholarly sense and fail to meet the needs of a wider public.

In this paper I will present the Digital Edition 2 platform of the Society of Swedish Literature in Finland, with special consideration of its lightweight user interface, targeted primarily at read- rather than research-oriented users, as well as smartphone and tablet use.

The platform has been in development since October 2016 and utilises strictly open-source software in order to facilitate long-term maintenance. It is implemented with a mobile-first approach as a progressive web app built on the Angular 2 and Ionic 2 frameworks. A RESTful API handles communication between the backend and the user interface. The platform is initially built to host two digital critical editions, Zacharias Topelius Skrifter and Henry Parlands Skrifter; however, it is intended as a generic platform able to accommodate future scholarly editions of other types as well. The live demonstration of the platform will showcase material from Zacharias Topelius Skrifter.

The responsive user interface of the platform enables presenting the digital edition in two modes depending on media and user choice: basic and advanced mode. The basic mode revolves around the reading text and a minimal set of paratextual materials. The idea is to display the text in a more accessible form to primarily support reading, rather than studying. Thus, for instance, annotations are available, but variants, facsimiles and transcribed manuscripts are not. These appear only in the advanced mode. The basic mode is essentially a stripped-down version of the full digital edition, featuring a limited critical apparatus with some, but not all, aspects of the history and transmission of the text, as well as limited scholarly paratexts, search options and tools.

This restricted scope, combined with the fact that the basic mode is first and foremost intended for access on mobile devices, though also accessible on PCs and laptops, should provide a reading experience suitable for a wider audience than that of the traditional scholarly edition. This includes, among others, students, teachers, non-editor scholars and people passionate about literature in general.

For research-oriented users the advanced mode with the complete set of features will be available in desktop environments.

The critical and annotated edition of the writings of Zacharias Topelius currently comprises six volumes, the first published in 2010. Thus far the digital edition contains

the equivalent of about 4,000 pages of text by Topelius, 400 pages of introductions by editors and 10,000 annotations. It is freely accessible at topelius.fi.

Figure 1. A mock-up of the basic mode on a smartphone, displaying an annotation to the novel Fältskärns berättelser by Topelius.

Topics: Nordic Textual Resources and Practices; The Digital, the Humanities, and the Philosophies of Technology
Keywords: digital scholarly edition, progressive web app, reader-friendly

Bibliography
Köhler, Sebastian, Boel Hackman & Carola Herberts, Kommentar till Edith Södergrans Dikter och aforismer. Varia, SSLS 563:3, Helsingfors: Svenska litteratursällskapet i Finland 2016
Köhler, Sebastian, "'Det gjelder å feste blikket'. Kampen mot nihilismen i Karl Ove Knausgårds Min kamp", Norsk litterær årbok 2014, red. Heming Gujord & Per Arne Michelsen, Oslo: Det Norske Samlaget 2014, s. 212–226

Towards a Digital Edition of the Codex Regius of the Prose Edda: Philosophy, Method, and Some Innovative Tools

Michael John MacPherson
University of Iceland

The Codex Regius of the Prose Edda (GKS 2367 4to, or R) is the subject of a new project based at the Árni Magnússon Institute in Iceland. One of the aims of this project is to produce a multi-level, fully lemmatized, and morphologically analyzed digital edition of the manuscript in TEI-XML. The purpose of this talk is to address the motivation for such an edition in the context of current textual research on the Prose Edda, and to present some tools developed internally which are intended to make the edition more flexible while reducing the resources required.

Recent publications have increased our understanding of the prehistory of two other main manuscripts of the Prose Edda, Codex Wormianus and Codex Upsaliensis (Johansson 1997 and Mårtensson 2013), and one of the proposed outcomes of the project is to perform an analogous study of R. The proposed method, modeled on these earlier publications, involves an investigation into the palaeographic, graphemic, and orthographic norm upheld by its main scribe. Deviations from this norm can sometimes be explained as influence from the scribe's exemplar, allowing us to reconstruct the prehistory of the manuscript. In an attempt to move towards a more quantitative and

reproducible approach, this investigation will leverage the digital edition as a source of information about the scribe's norm. A close transcription policy was developed to account for significant variation at the palaeographic level, with the main effort of transcription dedicated to capturing this variation.

This type of close transcription is often time-consuming. A novel tool was developed leveraging open source grapheme-to-phoneme (G2P) software. G2P is commonly applied in text-to-speech problems, where computers need to guess the best pronunciation of a word based on its orthography. Instead, models of the scribe's practice were trained using a sample of the close transcription and existing normalized sources. Close transcriptions were then generated for the remaining unseen text using the existing normalized sources. This method generates entirely correct words 70 % of the time, with most of the wrong words being only off by one or two letters. The generated text was then incorporated into the workflow of the project's transcribers.

This is then followed up with a grapho-phonetic markup based on Alex Speed Kjeldsen's work on Icelandic Original Charters Online (forthcoming) and in his doctoral thesis (2010). This involves mapping each character in the transcription with a corresponding theoretical etymological phonetic value based on our understanding of Old Norse language history. This allows for the exploration of grapho-phonetic relationships first implemented with success by Weinstock in 1967. Further tools are then developed for the automatic generation of transcriptions according to multiple diplomatization and normalization schemes. It also allows for modernization, granting access to the natural language processing toolkit IceNLP for lemmatization and morphological analysis.

The result is a highly flexible digital edition designed from the ground up to describe the habits of its main scribe, allowing for quantitative queries of philological criteria which can be easily reproduced by future researchers.

Topics: Nordic Textual Resources and Practices, The Digital, the Humanities, and the Philosophies of Technology
Keywords: digital philology, prose Edda, linguistics, TEI, grapheme-to-phoneme

Bibliography
Johansson, Karl G. Studier i Codex Wormianus: Skrifttradition och avskriftsverksamhet vid ett isländskt skriptorium under 1300-talet. Göteborg: Novum Grafiska AB, 1997.
Kjeldsen, Alex Speed. Et mørt håndskrift og dets skrivere: Filologiske studier i kongesagahåndskriftet Morkinskinna. PhD thesis, University of Copenhagen, 2010.
Kjeldsen, Alex Speed. Icelandic Original Charters Online. Forthcoming.
Mårtensson, Lasse. Skrivaren och förlagan: Norm och normbrott i Codex Upsaliensis av Snorra Edda. Oslo: Novus AS, 2013.
Weinstock, John Martin. A Graphemic-Phonemic Study of the Icelandic Manuscript AM 677 4to B. PhD thesis, University of Wisconsin, 1967.

Contributing to Nordic Cultural Commons through Hackathons

Sanna-Maria Marttila
Aalto University, Finland

Digitalization has affected nearly all aspects of our society, albeit in different ways. For cultural and memory institutions, it has created enormous potential to expand public access to their (digital) holdings and establish and renew collaborative relationships with their visitors. Along with the digitizing of cultural heritage, new digital tools are also creating novel ways for people to access, appropriate and reinvent culture. Despite these developments, cultural and memory institutions are not providing as much access as they could to their digitized collections (Bellini, et al. 2014), nor are they

creating good conditions for people's creative re-use activities (Terras, 2015). This short paper explores how cultural hackathons can enhance creative re-use of the digital holdings of memory and cultural institutions, and contribute to co-designing, building and sustaining open cultural commons.

Governmental bodies, businesses and cultural institutions alike are hosting hackathons to stimulate innovation with their digital offerings and resources. This emerging approach has become an effective and favoured way to encourage exploration and creativity with digital technologies (Briscoe and Mulligan 2014). A hackathon is often described as a problem-solving event built on intensive software programming and development in a short period of time (Topi and Tucker 2014). It can also refer to a competition where participants can pitch and develop their ideas and prototypes together with others. Often these events draw together software developers and designers from various fields to collaborate, either in teams or working together, on a specific problem, idea or theme. The short time frame has also been seen as a challenge of hackathons, as prototypes are rarely developed into finalized products that could generate revenue or monetary business value (Komssi et al. 2014).

The empirical material is based on long-term engagement and action research on designing and organizing cultural hackathons in Finland, Denmark and Sweden in recent years, and personal reflections on these experiences. Through these case studies on the Hack4FI, Hack4DK and Hack4Heritage hackathons, which focus on creative re-use of digital cultural heritage materials, this article explores the hackathon as an approach and a way to engage people in public matters (such as discussion on intellectual property rights) and in building shared cultural common-pool resources. Through critical analysis, the author reflects on whether and how these arranged events can support social and digital innovation, and the creation of new services, tools and practices. The paper sheds light on the strategies and tactics of organizing and facilitating hackathons, and furthermore discusses how they can contribute to building and sustaining open cultural commons.

Topics: Nordic Textual Resources and Practices
Keywords: cultural commons, open digital heritage, hackathon

References
Bellini, F., Passani, A., Spagnoli, F., Crombie, D., & Ioannidis, G. (2014). MAXICULTURE: Assessing the Impact of EU Projects in the Digital Cultural Heritage Domain.
Briscoe, G., & Mulligan, C. (2014). Digital Innovation: The Hackathon Phenomenon.
Terras, M. (2015). Opening Access to collections: the making and using of open digitised cultural content, Online Information Review, Vol. 39 Iss: 5, pp. 733–752.
Topi, H., and Tucker, A. (2014). Computing Handbook, Third Edition: Information Systems and Information Technology. CRC Press.

Young People's Historical Thinking in the Face of Digitized Sources

Åsa Olovsson
Uppsala university, Sweden

The framework for my research is a database within a project called Gender and Work (GaW) at Uppsala university, which presents how men and women made a living during the period 1550-1800. The sources here are mainly court records from different parts of Sweden. Besides gender and work, the database provides a wide base of information enabling different themes of interest for young people. Some examples are sexuality, relations, marriage and children. I believe this kind of modern digital technology may prove to be essential for the development of

history education. With the help of available primary sources and usable methods, the students can achieve a nuanced view of history as such and of proper scientific thinking, including methodology. The aim of the project I am planning in the field of digital humanities is to produce concrete tools for history teaching and learning in upper secondary school in Sweden. The curriculum clearly marks scientific thinking as key. Thus the use of primary sources in the classroom should be a fundamental part of history education.

Design study methodology is suitable for this project, since it provides the opportunity to conduct research in real life classroom situations. Within this method, it is possible to develop concrete tools. These tools could be scaffolding of different kinds, assisting and supporting students in their learning process. For example, I will create complete paths or activities for the themes mentioned. In addition, a study handbook is necessary for teachers and students, with instructions on how to read and interpret sources, as is a glossary that explains concepts from the current theme. The scaffolding could be written, but it is also possible to use multimedia and make tutorial videos.

I am especially interested in exploring the students' historical thinking and what meaning they make of history, their history. How will that meaning evolve under exposure to authentic sources from the digital archive? In order to know if this method works, a comparison is necessary with a control group working in a more traditional way. Surveys or interviews are necessary before and after working with the database, to follow the students' progress. Working with primary sources from the database will make the students active learners instead of passive recipients. They will have to mediate the digitized sources themselves, and in doing so I believe they will become co-creators, see new perspectives and create meaning. Preliminary research questions are: How can GaW be used as a learning tool? How does teaching with digitized sources affect students' reasoning and learning in history? A related question is whether or not the students are influenced by the sources they work with. Another question is whether the past, present and future become visible through the studied phenomenon. Finally I will find out how the students' expressions may be explained. Hopefully this way of working will activate students' narratives about the past and through that also develop their historical thinking.

The search for relevant theories is still in progress. Since I believe that this form of teaching may engage the students' emotional sides, theories concerning emotion and historical learning could be useful. To my knowledge so far, this study will take me to mainly unknown territory. Only a few national studies have had a specific focus on what happens when students are exposed to digitized sources. The use of the database GaW in history education has never been studied from a didactic perspective.

Topics: Nordic Textual Resources and Practices
Keywords: primary sources, digital archives, design study, cultural heritage, gender and work

References
Brush, Thomas A and Saye, John W. "A Summary of Research Exploring Hard and Soft Scaffolding for Teachers and Students Using a Multimedia Supported Learning Environment" http://www.ncolr.org/jiol/issues/pdf/1.2.3.pdf
Lévesque, S. (2007) Can Computational Technology Improve Students' Historical Thinking? Experience from the Virtual Historian© with Grade 10 Students. Journal of the Ontario History and Social Science Teachers Association, (Printemps):
Nygren, Thomas, Sandberg Karin & Vikström, Lotta "Digitala primärkällor i historieundervisningen: En utmaning för elevers historiska tänkande och historiska empati" [Digital primary sources in history education: A challenge for students' historical thinking and historical empathy] Nordidactica, 2014, 2

Nygren, Thomas and Vikström, Lotta (2013) "Treading Old Paths in New Ways" in Education Sciences
Sandberg, Karin (2014) Möte med det förflutna. Digitaliserade primärkällor i historieundervisningen. Licentiate thesis. Umeå: Umeå universitet
Shavelson, Richard J., Phillips, D. C., Towne, Lisa and Feuer, Michael J., On the Science of Education Design Studies

Sixties Biopoetics: A Media Archaeological Reading of Digital Infrastructure

Jesper Olsson
Linköping University, Sweden

Through the rise of planetary scale computing and global digital infrastructure during the last decades (Cf. Bratton 2016, Gabrys 2016, Starosielski 2015, and others), a new ecology of nature and culture, bodies, protocols, and machines has emerged. Including everything from the internet of things, i.e. sensor topographies, smart fridges, and wearables, to underwater cables from the 19th century and distant server halls, this process has had radical epistemic, economic, aesthetic, political, and social consequences. Not least, it has challenged and dissolved the charged boundaries between humans and their surroundings, necessitating an analysis that thinks and analyzes 'naturecultures' (Haraway 2003) as always intertwined and merged. Accordingly, water, air, minerals, plants, animals, cellphones, optical fiber, pads, pods, and satellite technologies are part of one ecology – one world, many forms, to paraphrase Gilles Deleuze.

However, this tangible transformation did not take place in an instant. It has a long and dwindling material history. In this paper I will try to disentangle some of the strands of this history by returning to artistic and literary practices of the late 1960s and early 1970s. Focusing on some aesthetic experiments and poetic speculations of the period, I hope to shed light on the contemporary digital ecology and its larger cultural implications. Specifically, I will look into a series of essays (under the rubric 'Semicolon') and some text-sound poetic experiments by the Swedish poets and composers Lars-Gunnar Bodin and Bengt Emil Johnson, in which they approach what might be called a 'biopoetics', bringing bodies and machines in closer contact and trying to re-articulate, even dissolve, the mediating moments in the assemblage artist–artwork/art event–viewer/reader/listener. I will also bring up and discuss the somewhat later 'bio-music' of Manford Eaton, partly taking its cue from some works by the American composer Alvin Lucier, not least, perhaps, 'Music for Solo Performer' (1965), in which EEG electrodes were attached to the skull of the composer and performer and connected to a sound system, which then generated sound and music.

In these works, a different ecology of naturecultures is imagined and explored, partly through an artistic misuse of technologies, which 'prehends' (to use A. N. Whitehead's concept) a contemporary media ecological formation. My aim is, thus, to explore how a seemingly marginal cultural practice, such as avant-garde poetry, can function as a media archaeological probe or platform for analyzing and experiencing some of the other, submerged layers and temporalities of our brand new machine park and the various effects it displays.

Topics: The Digital, the Humanities, and the Philosophies of Technology
Keywords: media archaeology, media ecology, infrastructure, natureculture, poetry

Bibliography
Bratton, Benjamin, The Stack, MIT Press 2016
Gabrys, Jennifer, Program Earth: Environmental Sensing Technology and the Making of a Computational Planet, University of Minnesota Press 2016
Haraway, Donna, The Companion Species Manifesto, University of Chicago Press 2003
Starosielski, Nicole, The Undersea Network, Duke UP 2015

Mapping Letters Across Editions

Vemund Olstad
Directorate for Cultural Heritage, Norway
Hilde Bøe
Munch Museum, Norway

Mapping published letters

Over the last 20 to 30 years, a number of digital critical editions of important authors and cultural figures have been published in the Nordic countries, or have had their publication work set in motion. Among the larger projects are:
* Henrik Ibsens skrifter (Norway)
* eMunch (Norway)
* Ludvig Holbergs skrifter (Denmark/Norway)
* Grundtvigs Værker (Denmark)
* The Linnean Correspondence (Sweden)
* Zacharias Topelius skrifter (Finland)

The scholarly editing community in the Nordic countries is relatively small and easy to survey, which has made it possible to exchange experience to a large extent across both projects and national borders. One consequence is that the source material for the different editions very largely uses the same encoding standard and format (TEI XML). The way the source data are presented to users naturally varies from project to project, and we want to look more closely at the possibility of presenting material from several Nordic edition projects together in a map-based solution, by bringing together selected data and metadata from different edition projects.

In the period leading up to DHN 2017, the Directorate for Cultural Heritage (Riksantikvaren), as one of the agencies taking part in the K-lab collaboration (http://www.riksantikvaren.no/Veiledning/Data-og-tjenester/K-lab), will work together with the Munch Museum to georeference letters from the edition projects eMunch and Henrik Ibsens skrifter so that they can be placed on an interactive map. The idea behind this pilot project is to create a map-based entry point to larger Nordic letter collections, where users navigating the map can see where letters were written and who wrote them, bring up basic information about each letter, and then go on to more detailed information on the projects' own edition pages.

For this work we want to use the map software that was developed in connection with the project Kultur- og naturreise. Source code and a description are available here: http://knreise.no/demonstratorer/. This map solution is administered and developed further by K-lab, and it will be developed further in 2017, among other things with timeline functionality, which will allow us to make a chronological selection of the georeferenced letters. A very early prototype is available at http://knreise.github.io/demonstratorer/demonstratorer/historiskeBrev.html.

In addition to the practical work of presenting letters on a map, we will try to look a little more closely at how edition projects work to prepare their source data for other users. An important precondition for gathering edition data in a centralized solution is that the data are available through a service from which they can be retrieved. This can take the form of external endpoints (a REST API or a SPARQL endpoint, for example), or of uploading data to an aggregator service (Norvegiana / K-Samsök / Europeana). How good are edition projects at thinking about this kind of re-use? What can be done to coordinate efforts in this area?

Topics: Nordic Textual Resources and Practices
Keywords: maps, geotags, letters

The Battle of the Text – Quantitative Methodologies in Literary Studies

Julia Pennlert
Umeå University, Sweden

It is often claimed that our digital present time gives the literary scholar possibilities to question and reconfigure what it is to read or analyze a literary text. This statement is part of a larger discourse that emphasizes that

our digital time is a time of change. Because online publication venues have become a vital part of literary culture, and because projects digitize literary texts and present them in online archives such as the Swedish Litteraturbanken (litteraturbanken.se), the literary text is attached to others in a network of literary publications. The notion of a digital text is often explained as the main reason why the literary scholar needs to address methodological issues. As a result, the literary scholar is part of the discourse that underlines change, and as a consequence the researcher has to adjust or develop new methods to read, analyze or study a certain text.

During the last decade literary studies has been characterized by a methodological turn, especially within the field of digital humanities. Several theorists have presented 'new' types of reading, for example "distant reading" (Moretti), "macro-analysis" (Jockers) or "hyper-reading" (Hayles). These new forms of studying and analyzing texts have been discussed and criticized. These methods present a new optic, or gaze, to study or analyze a certain text. In Literary Studies in the Digital Age (2013), Tanya Clement describes the computer-assisted method as a way for the researcher to get an overview of a vast material by using a "magnifying glass upside down."

However, these methodologies can be compared to equivalent historical discussions that highlight what a literary scholar (should) study and how a reading of a text should be performed. In a Swedish research context, technological tools in literary studies were discussed especially during the 1960s and 1970s. During this time several suggestions on how researchers can adapt their methodologies were presented, for example in Litteraturvetenskap – Nya Mål och Metoder (1966) or Forskningsfält och metoder inom Litteraturvetenskap (1970).

In my paper I will compare the methodological turn within digital humanities with historical examples and argue for why it is important to trace similar historical movements and descriptions into our present digital time. I will also present an alternative method, a combination of methodologies that can be used as a productive way out of the sometimes polemical discussions on what literary studies can be and how they can be conducted.

Topics: The Digital, the Humanities, and the Philosophies of Technology
Keywords: distant reading, quantitative analysis, statistical readings, literary methodologies

Bibliography
"Textuella bataljer - om kvantitativa metoder i litteraturvetenskapens tjänst" (forthcoming article in the anthology Kvantitativa Metoder inom Humaniora och samhällsvetenskap)

Spatial Humanities and the Norwegian Folklore Archive

Kristina Skåden
University of Oslo, Norway

The idea of this paper is to present the ongoing work on "Spatial Humanities" at the Norwegian Folklore Archive and the Department of Cultural Studies and Oriental Languages (IKOS) at the University of Oslo. The main focus will be on how Spatial Humanities may be of interest for education and research in the field of cultural history and museology.

In the spring term of 2017, the department will start up two new and innovative projects. Firstly, a course at MA level, "Cultural heritage production, Eilert Sundt and Digital Humanities": http://www.uio.no/studier/emner/hf/ikos/KULH4015/kulh4015var2017.html

In 2017 it is 200 years since the important cultural and social scientist Eilert Sundt was born. This event is the starting point for the education in critical cultural heritage production. The aim is that the students, through a theoretical and practical digital humanities approach, will produce an Eilert Sundt map. Some questions the students will work on: How will the mapping of different qualitative sources enrich the understanding of Eilert Sundt's research? What kind of space is produced by this mapping practice?

Secondly, the IKOS department will develop a digital mapping tool for use in research, education and communication. This project is in progress, and it will therefore be interesting to discuss different opportunities and pitfalls with Nordic colleagues.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: spatial humanities, mapping, cultural heritage

How to Study Online Popular Discourse on Otherness – Public User Interfaces to Online Discussion Forum Materials

Jaakko Suominen
Elina Vaahensalo
University of Turku, Finland

Suomi24 (suomi24.fi, established in 1998) is Finland's biggest online discussion forum and leading topic-centric social media, and one of the largest non-English online discussion forums in the world. According to the TNS Metric service, over 80 % of Finns visit the site monthly, at least when searching for information on various topics with Google or other search engines. The Citizen Mindscapes research initiative has collaborated with Aller Media, the owner of Suomi24, as well as with FIN-CLARIN, and opened the discussion forum posts for research use. The data, available e.g. via the KORP user interface (https://korp.csc.fi/), consist of over 2 billion words, 53 million comments and almost 7 million threads, and cover online discussions over 15 years. Thus, the data give opportunities for longitudinal studies considering very many aspects, focusing not only on questions of online cultures but also on questions of the change of Finnish society in general.

However, there are also other ways to study the Suomi24 discussions, at least partially, than with the data stored in the Finnish Language Bank and opened to researchers. This methodological paper critically examines the possibilities to search and browse Suomi24 discussions not only with the above-mentioned KORP interface but also with Suomi24's own search engine, Google search, the Internet Archive, and the National Library's collection of Finnish websites. We ask how the different user interfaces affect the ways of finding discussions, contextualizing them and using them in other ways in digital cultural research. We use a study analyzing online popular discourses on otherness as the methodological case example. The study introduced here is part of the "Citizen Mindscapes – Detecting Social, Emotional and National Dynamics in Social Media" project funded by the Academy of Finland Digital Humanities Research Programme (http://www.aka.fi/digihum).

Topics: Nordic Textual Resources and Practices, The Digital, the Humanities, and the Philosophies of Technology
Keywords: online discussion forums, methodology, search engines, contextualization

Bibliography
Suominen, Jaakko (2016): "How to Present the History of Digital Games: Enthusiast, Emancipatory, Genealogical and Pathological Approaches." Games & Culture, Published online before print, June 20, 2016, doi: 10.1177/1555412016653341
Suominen, Jaakko (2016): "Helposti ja halvalla? Nettikyselyt kyselyaineiston kokoamisessa." Korkiakangas, Pirjo, Olsson, Pia, Ruotsala, Helena, Åström, Anna-Maria (eds.): Kirjoittamalla kerrotut – kansatieteelliset kyselyt tiedon lähteinä. Ethnos-toimite 19. Ethnos ry., Helsinki, 103–152. [Easy and Cheap? Online surveys in cultural studies]
Suominen, Jaakko & Sivula, Anna (2016): "Digisyntyisten ilmiöiden historiantutkimus." In Elo, Kimmo (ed.): Digitaalinen humanismi ja historiatieteet. Historia Mirabilis 12. Turun Historiallinen Yhdistys, Turku, 96–130. [Historical Research of Born Digital Phenomena]
Suominen, Jaakko – Saarikoski, Petri – Turtiainen, Riikka – Östman, Sari (2016, accepted): "Survival of the Most Flexible? National social media services in global competition: The Finnish Case." In Goggin, Gerard & McLelland, Mark (Eds.), Routledge Companion to Global Internet Histories, forthcoming. Routledge, London.

Socio-Economic Relations in Ptolemaic Pathyris: A Network Analytical Approach to a Bilingual Community

Lena Tambs
University of Cologne, Germany

Sometime between 165 and 161 BCE, a subdivision of the larger military camp of Krokodilopolis was established at Pathyris, c. 30 km south of Thebes in ancient Egypt. Following Upper Egyptian practice, the community mainly consisted of local soldiers and their families (Vandorpe 2011: 295-296). However, progressive efforts to Hellenize the region soon led to Pathyris evolving into a bi-cultural society, with co-existing Egyptian and Greek languages, institutions and practices.

From the time of its establishment until its abandonment in 88 BCE, the structural and cultural complexity of the Pathyrite community can be studied in some detail. This is made possible by a considerable amount of surviving documentary sources. To date, a total of 21 Greek-Demotic archives have been reconstructed (Vandorpe 1994; Vandorpe & Waebens 2009), providing detailed information about the camp, its inhabitants and their affairs.

Previous work on the archives as well as the recent development of open access databases such as Trismegistos (TM) [12] enables large scale and systematic examination of the texts. Despite the enormous potential of the sources, methods more traditionally used in the field of Egyptology generally fall short in terms of comprehending and embracing the diversity and complexity they represent. For big data projects such as this, newly developed digital tools can help unlock the potential embedded in the source material.

Particularly relevant for the current project is 'Social Network Analysis' (SNA), aspects of which have been fruitfully applied to written sources from ancient Egypt in the past (e.g. Ruffini 2008; Broux 2015; Broux & Depauw 2015; Cline & Cline 2015; TM Networks) [13]. Within a network perspective, ancient societies can be conceptualised as dynamic 'whole-networks' (Marsden 2005: 8) that are structurally composed of complex systems of overlapping, collaborating, and competing sub-networks. My working hypothesis is that using the network analytical software 'Gephi', and employing various analytical tools embedded in SNA to map a high number of specific relations, will facilitate subsequent analysis and interpretation of emerging patterns of socio-economic connectivity in Pathyris.

The current talk provides an outline of the project's main objectives, theoretical approaches and applied methodologies. Arguing for the applicability and usefulness of formal SNA, not only as a powerful visualisation tool but also as a multi-functional digital toolbox and interactive interface for analysis and hypothesis testing, examples will be drawn from a case study of the 'Archive of Horos, son of Nechouthes' (TM Arch 106).

Topics: Visual and Multisensory Representations of Past and Present
Keywords: texts, ancient Egypt, social network analysis, Gephi 0.9.1

[12] http://trismegistos.org
[13] http://trismegistos.org/network

Bibliography
Broux, Y. 2015, 'Graeco-Egyptian Naming Practices: A Network Perspective', in: Greek, Roman and Byzantine Studies, vol. 55, pp. 706-720
Broux, Y. & M. Depauw 2015, 'Developing Onomastic Gazetteers and Prosopographies for the Ancient World through Named Entity Recognition and Graph Visualization: Some Examples from Trismegistos People', in: Social Informatics. SocInfo 2014 International Workshops, Barcelona, Spain, November 10, 2014. Revised Selected Papers, Aiello, L. M. & D. McFarland (eds.), pp. 304-313
Cline, D. H. & E. H. Cline 2015, 'Text Messages, Tablets and Social Networks: The "Small World" of the Amarna Letters', in: There and Back Again – the Crossroads II: Proceedings of an International Conference held in Prague, September 15-18, 2014, Mynářová, J., Pavel, O. & P. Pavúk (eds.), pp. 17-44
Marsden, P. V. 2005, 'Recent Developments in Network Measurement', in: Models and Methods in Social Network Analysis, Carrington, P. J. et al. (eds.), Structural Analysis in the Social Sciences, vol. 27, Cambridge: Cambridge University Press, pp. 8-30
Ruffini, G. R. 2008, 'Social Networks in Byzantine Egypt', Cambridge: Cambridge University Press
Vandorpe, K. 2011, 'A Successful, but fragile biculturalism. The Hellenization process in the Upper Egyptian town of Pathyris under Ptolemy VI and VII', in: Ägypten zwischen innerem Zwist und äusserem Druck: Die Zeit Ptolemaios' VI. bis VIII. Internationales Symposion Heidelberg 16.-19.9.2007, Jördens, A. & J. F. Quack (eds.), Philippika 45, Wiesbaden: Harrassowitz Verlag
Vandorpe, K. 1994, 'Museum Archaeology or How to Reconstruct Pathyris Archives', in: Acta Demotica: Acts of the Fifth International Conference for Demotists, Pisa, 4th – 8th September 1993, Bresciani, E. (ed.), EVO 17, pp. 289-300
Vandorpe, K. & S. Waebens 2009, Reconstructing Pathyris' Archives. A Multicultural Community in Hellenistic Egypt, Collectanea Hellenistica III, Brussel: Koninklijke Vlaamse Academie van Belgie & l'Union Academique Internationale

Combining Data Sources for Language Variation Studies and Data Visualization

Kristel Uiboaed
Eleri Aedmaa
Maarja-Liisa Pilvik
University of Tartu, Estonia

Mapping and cartographic visualization is an essential component of dialectology, as it is of several other fields of the humanities. The current paper introduces an ongoing project on digitizing, combining and visualizing data of different types and sources for linguistic research purposes. In the presentation, we introduce the applied methods, tools and basic workflow of our project "Spatial Data in Linguistics" (http://rurake.keeleressursid.ee/).

The initial idea of the project was to digitize the maps in the only existing atlases on Estonian dialects (Saareste 1938, 1941, 1955) and make them publicly available. The digitization was necessary for presenting and analyzing the old atlas data with contemporary methods and tools, thus enabling queries and a wider range of visualization options, among other things. We, therefore, created a new resource for automatic processing of old dialectological data and made it available for other research purposes as well.

We proceeded with combining the digitized atlas data with the data from the Estonian Dialect Corpus (CED). The comparison and simultaneous analysis of different data sources make it possible to shed light on the studied phenomenon from different perspectives, thereby creating a deeper understanding of the spread and actual frequency of linguistic material. These kinds of combined data are not only of interest to

linguists but can also be used in other areas of the humanities, as they convey information about history, ethnography etc. Making the available data reusable and accessible to different disciplines is an important facet of modern research practice and should be encouraged. It is also necessary to share and develop the tools and techniques for working with this data. We therefore also introduce the tools we have used in our project and demonstrate how GIS and R can be combined to present spatial data, and how R can be applied for producing interactive applications of data visualization. Employing such widely used software as GIS applications and R makes our contributions generalizable and usable also for other researchers of various fields.

Topics: Nordic Textual Resources and Practices
Keywords: textual data processing, spatial data, corpus linguistics, data visualization

References
CED = Corpus of Estonian Dialects
QGIS Development Team 2016, QGIS Geographic Information System. Open Source Geospatial Foundation. URL http://www.qgis.org/.
R Development Core Team 2016, R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, R version 3.2.4, Vienna, Austria, http://www.r-project.org/.
Saareste, Andrus 1938, Eesti murdeatlas. I vihik = Atlas des parlers estoniens. I fascicule. [Atlas of Estonian Dialects. I part] Tartu: Eesti Kirjanduse Selts.
Saareste, Andrus 1941, Eesti murdeatlas. II vihik = Atlas des parlers estoniens. II fascicule. [Atlas of Estonian Dialects. II part] Tartu: Teaduslik Kirjandus.
Saareste, Andrus 1955, Petit atlas des parlers estoniens: Väike eesti murdeatlas. [Small atlas of Estonian Dialects] Uppsala: Almqvist & Wiksell.

Places and Journeys of the Contemporary Norwegian Novel: A Pilot Study

Kim Tallerås, Tonje Vold & David Massey
Oslo and Akershus University College of Applied Sciences, Norway

In Atlas of the European Novel 1800-1900 (1998), Franco Moretti investigates the European novel from the point of view of maps. What do geography and settings mean in these storylines? Moretti's provoking and compelling idea is that "each space determines, or at least encourages, its own kind of story" (p. 70). A corresponding study has not been conducted in the Norwegian context, although Norwegian literature is typically very conscious of geography and of the meanings of regional and local specifics in a thinly populated country of mountains and valleys, fjords and a long coastline.

The National Library of Norway have [...] of Norwegian literature, which represent a promising basis for a large-scale automated analysis of geographical information. How can this digitized collection be used in order to investigate Moretti's idea further through a 'distant' reading of Norwegian novels? This research question calls for an interdisciplinary approach and methods from the digital humanities. In this short paper, we introduce a pilot study for a Norwegian "Atlas" project by focusing on places and journeys in contemporary Norwegian novels. The study includes a test conducted on a limited selection of digitized novels, in order to discover challenges and opportunities for an automated analysis. The thematic limitations of journeys and contemporary literature reflect a methodological need for a consistent and comparable selection, but also enable certain perspectives on e.g. gender, urbanism and environmental criticism that we wanted to include in the project.

In the test, a simple schema of entity and relationship types was developed experimentally based on text snippets from the sample corpus. Then, three annotators used Brat [14] and the schema to annotate geographical entities and contextual information found in the sample novels. Finally, the resulting annotations were analyzed to see i) what types and extent of information we find in the novels and ii) which textual features (hypernyms etc.) could be used as evidence in automated processing.

The project's long-term goal is twofold. On the one hand, we want to contribute to literary studies: What do geography and settings (fjords and valleys, small towns and cities etc., and the journeys between them) signify in contemporary Norwegian novels? On the other, we want to investigate and contribute to digital humanities methods that can be used in order to exploit the newly digitized Norwegian text corpus.

Topics: Nordic Textual Resources and Practices
Keywords: literary studies, entity recognition, information extraction, annotation, visualisation

References
Moretti, Franco (1998). Atlas of the European Novel 1800–1900. London: Verso

14 http://brat.nlplab.org/

The Use of Medical Visualisation in Cultural Heritage Exhibitions

Karin Wagner
Gothenburg University, Sweden

This paper deals with how medical imaging is used in visualising cultural heritage, taking the British Museum's exhibition Ancient lives, new discoveries (2014-15) as its case study. In the exhibition, eight exemplars from the museum's collection of mummies were on display together with medical visualisations that had been composed into interactive displays. The technology used in the exhibition was similar to the technology that has been developed for the virtual autopsy table at the Center for Medical Image Science and Visualization (CMIV) at Linköping University, Sweden. Huge data sets generated by computer tomography (CT) were used to create three-dimensional images of the mummies, revealing different layers of the body: skin, muscles, organs, and the bone structure as well as the cartonnage case and the wrappings that surrounded the mummies. Also revealed were amulets and other objects hidden beneath the wrappings. Some of these amulets had been 3D-printed and the replicas were displayed side by side with the screen-based visualisations. To help visitors interpret the images, colour coding was used. The surface layer of the visualisation was coloured blue, and inside the mummies embalming tools had been coloured green and organs had been given different shades of blue. The 3D-printed replicas of the amulets were of white plastic material, because it was not possible to decide what metal the original amulets were made of, although it was probably gold or silver. Using white was a way of indicating uncertainty and keeping to the facts. All the visualisations in the exhibition were life-size, as this was considered important for the understanding of the images. This paper will discuss the importance of colour coding in visualisations used in cultural heritage exhibitions. What does colour mean in this context and how does it relate to colour coding in medical visualisations meant for an audience of medical professionals? How does scale influence our understanding of reproductions of objects? We are used to seeing art and cultural heritage artefacts reproduced in smaller scale in books, but with digital visualisations the possibility to offer life-size reproductions of museum objects has been greatly increased. The potential of 3D-printing and the tactile dimension this can add to cultural heritage exhibitions will also be explored, as well as how different types of visualisations can work together.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: medical visualisation, colour coding, scale, 3D-printing

Bibliography
Wagner, K. 2015 "Reading packages: social semiotics on the shelf", Visual Communication, vol. 14, no. 2, pp. 193-220.

Visualizing the Landscape of Contemporary Norwegian Novels

Miroslav Zumrík
Slovak Academy of Sciences, Slovak Republic

I would like to present my research idea, which deals with visualization of contemporary Norwegian novel production in a given time period (a year, a decade), that is, with detecting common features, tendencies and extremes with respect to the narrative structure of the novels in question. My project could thus be seen as an enhancement of what is already being done in the series of Samlaget's Norwegian Literary Yearbook (Norsk Litterær Årbok) in the section "A year in the novel" (Romanåret). I would argue that a computationally aided and statistically evaluated analysis of a representative set of novels could provide literary scholars with a suitable empirical background for making statements and formulating hypotheses on narrative/stylistic tendencies employed by contemporary Norwegian novelists, such as the use and distribution of narrators and tenses. One of the aims of the presentation, which will focus mostly on the new idea and also on the ideas developed in my PhD work, is to address and find research fellow(s) in Norway who would be willing to cooperate on starting a project of visualization of narrative features in contemporary Norwegian novels.

As a theoretical background for my project, I would like to employ the theory of fictional worlds, as created by the Czech-Canadian scholar Lubomír Doležel. The theory aims at discovering narrative patterns in a given text and thus reconstructing the "fictional world" as a linguistically constructed semiotic object. Doležel starts with looking at the "texture" of the text in question, that is, at the distribution of its linguistic features (tenses, persons/narrators, chapters, paragraphs, etc.) with emerging regularities and irregularities. This basic structure can be further analyzed as a way of expressing the text's extensional structure (themes, events, motifs, characters) and intensional structure (rendered by Doležel's "functions" of authentication and saturation). In the end, one arrives at the stylistic pattern(s) or "shape" of a given narrative text/fictional world. I would like to stress that such analysis does not rely on semantic annotation and does not require in-depth plot segmentation. This next step would require much more time and effort, given that discerning temporal structure (with respect to categories proposed by Genette), or key motifs/events, is already a question of interpretation. Doležel's theory, as I understand it, complies with some recent research tendencies within narratology: the interest in peculiar, experimental, "unnatural" narratives (Hansen – Iversen – Nielsen – Reiter (eds.) 2011, Alber – Hansen (eds.) 2014), the use of computational methods and tools on narrative texts (Weixler – Werner (eds.) 2015, Brunner 2015, Gius 2015), and reconstructing the "shape" of fictional worlds (Pettersson 2016). I have already applied this theory in my PhD thesis, where I dealt with the novel "In the Shadow of Singularity" (2013) by the Norwegian writer Thure Erik Lund. The novel makes extensive use of a disembodied "we" narrator from the "posthuman" future, re-telling the history of both mankind and the author of the novel in which it appears, and stating that it had "used" the writer as a vehicle in order to "protrude" back to the "presingular" time.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: contemporary Norwegian novel, narratology, theory of fictional worlds


POSTERS

Interdisciplinary Collaboration for Making Cultural Heritage Accessible for Research

Johanna Berg
The Swedish National Archives
Rickard Domeij
Swedish Language Council
Jens Edlund
KTH Royal Institute of Technology, Sweden
Gunnar Eriksson
Swedish Language Council
David House
Zofia Malisz
KTH Royal Institute of Technology, Sweden
Susanne Nylund Skog
Jenny Öqvist
The Folklore Archives, Sweden

In this poster, we will present plans and initial experiences of collaborating between disciplines within the newly started project Tilltal, as well as of studying users and research activities related to the use of memory archives for research.

Currently, the large amounts of recorded speech available at Swedish memory institutions are rarely used due to the lack of effective methods for handling archival audio material. The aim of the project Tilltal is to examine how speech technology methods can make speech recordings at public memory institutions more accessible to researchers. The project explores how speech technology methods and tools can be adapted and developed to process large amounts of historical voice recordings from the archives of the Institute for Language and Folklore (ISOF). To make this possible, language technologists, SSH researchers and data holders will work in close cooperation within the project.

In a workshop we had in 2015, the idea was to put together groups of three partners: an SSH researcher, a speech technologist, and a data holder. We wanted to take the SSH researchers' current work procedure as the starting point and give them the opportunity to describe their work in small groups, allowing data holders and speech technologists to suggest ways in which the research process could be facilitated by large speech data sets and speech technology. The hands-on task was to come up with suggestions for research projects. As a result, the three sub-studies within the Tilltal project were conceived. They examine how speech technology can be used to investigate research questions within three disciplines: folklore, dialectology and conversation research.

Along with the three sub-studies, a usage study will be performed that applies activity theory to survey the research activities surrounding the archival materials, cf. Nardi 1996. Considering the needs of the researchers, we will model characteristic situations of use following ideas in Hansen et al. 2014, propose language technology solutions and assess their usefulness in practice by means of use cases, in the spirit of Jacobson et al. 1992, 2011.

The long-term goal of the Tilltal project is to make the Swedish speech archives more accessible in general, and to SSH researchers in particular. We hope to achieve this not only by describing methods by which speech technology can be used to reach SSH research goals, but also by providing examples of fruitful interdisciplinary collaborations. In the poster we will present our plans, experiences and results so far regarding interdisciplinary collaboration, surveying of research activities and use case modelling.

The project is a collaboration between Digisam, The Institute of Language and Folklore (ISOF), the Royal Institute of Technology (KTH) and Sweclarin. It is funded by the Swedish Foundation for Humanities and Social Sciences from 2017 to 2020.

Topics: Nordic Textual Resources and Practices
Keywords: speech technology, folklore, dialectology, conversation analysis, user studies

References
Berg, Johanna, Rickard Domeij, Jens Edlund, Gunnar Eriksson, David House, Zofia Malisz, Susanne Nylund Skog and Jenny Öqvist (forthcoming). TillTal – making cultural heritage accessible for speech research. In: Selected Papers from the CLARIN 2016 Conference, Aix-en-Provence, 25-28 October 2016. Linköping Electronic Conference Proceedings.
Hansen, Preben, Anni Järvelin, Gunnar Eriksson, Jussi Karlgren (2014). A Use Case Framework for Information Access Evaluation. In: Paltoglou, Georgios, Loizides, Fernando, Hansen, Preben (eds.), Professional Search in the Modern World: COST Action IC1002 on Multilingual and Multifaceted Interactive Information Access (pp. 6-22). Springer.
Jacobson, I., Christerson, M., Jonsson, P., and Overgaard, G. (1992). Object-Oriented Software Engineering: A Use Case Driven Approach. Addison-Wesley.
Jacobson, Ivar, Ian Spence, Kurt Bittner (2011). Use Case 2.0: The Guide to Succeeding with Use Cases. Ivar Jacobson International.
Nardi, B. A. (1996). Context and consciousness: Activity theory and human-computer interaction. MIT Press.

Mapping Language Vitality

Coppélie Cocq
Umeå University, Sweden

This poster will present an ongoing project that aims to visualize urban linguistic landscapes through digital mapping, with the purpose of approaching representations and conditions for multilingualism in Sweden.

Languages available in our environment in the form of words and images, and displayed in public places, have been the focus of scrutiny within a rapidly growing research area called Linguistic Landscape Studies (see for instance Blommaert 2013; Shohamy et al. 2008; Shohamy & Gorter, 2010).

A linguistic landscape is shaped by the combination of different forms of official and less official signs, i.e. "road signs, advertising billboards, street names, place names, commercial shop signs, and public signs on government buildings [in a given] territory, region, or urban agglomeration" (Landry & Bourhis, 1997:25). Linguistic Landscape Studies attend to the consequences that the visibility and materialization of languages can have, not only through their informative and symbolic functions but also through their impact on language vitality (Landry & Bourhis, 1997:45). Attitudes toward a language, in relation to visibility and use in public spaces, influence the language's prerequisites and conditions for acquisition and revitalization (Grenoble & Whaley 2006; Hyltenstam 1991).

One of the focuses of the project Mapping Language Vitality that will be illustrated and described in the poster is the case of official minority and Indigenous languages in Sweden. The historical hierarchical relationship between these languages and majority languages, and recent changes in minority politics, motivate this choice of focus. Also, the practice of naming places has an additional dimension in Indigenous contexts: to name a place in the Indigenous language of the inhabitants is described as a part of a decolonization project (Tuhiwai Smith 2012: 158).

In the poster, we propose to present a pilot study conducted in 2015-2016 in Umeå, the prototype (deep map) that has been developed, and the preliminary results upon which we plan to pursue the project. In the pilot study, about 400 linguistic expressions were photographed in Umeå's cityscape, of which about 150 were coded. The linguistic expressions consist of fixed signs and signboards, posters, temporary vernacular signs etc. A visualization was created by producing a digital map in order to see where and when languages, for example Ume Sami or Finnish, were materialized in the city. The photographs were coded and assigned characteristics such as language, position, type of sign, sender, addressee etc. and placed geographically on an interactive map with filterable categories, i.e. one that enables the user

to navigate through layers based on the data linked to the images.

In the next step of the project, the same model and tools are used and further developed to include and analyze a larger set of data (photos) covering several geographical areas. The digital map is central in order to visualize connections between languages (as they materialize in the landscape) and other layers of information. Also, this form of digital visualization enables us to explore how and to what extent different languages coexist, and examine how this looks in relation to the majority language Swedish and other socio-cultural and socio-linguistic factors.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: languages, revitalisation, mapping, linguistic landscape

References
Blommaert, Jan. 2013. Ethnography, Superdiversity and Linguistic Landscapes. Chronicles of Complexity. Toronto: Multilingual Matters.
Grenoble, Lenore A., and Lindsay J. Whaley. 2006. Saving Languages: An Introduction to Language Revitalization. Cambridge: Cambridge University Press.
Hyltenstam, Kenneth. 1996. Tvåspråkighet Med Förhinder?: Invandrar- Och Minoritetsundervisning. Lund: Studentlitteratur.
Kasanga, L. 2012. Mapping the linguistic landscape of a commercial neighbourhood in Central Phnom Penh. Journal of Multilingual and Multicultural Development 33(6):1-15
Landry, R., & Bourhis, R. Y. 1997. Linguistic Landscape and Ethnolinguistic Vitality: An Empirical Study. Journal of Language and Social Psychology, 16(1), 23–49.
Shohamy, Elana & Gorter, Durk. 2008. Linguistic Landscape. Expanding the Scenery. Taylor and Francis.
Shohamy, E. et al. 2010. Linguistic Landscape in the city. Multilingual Matters.
Tuhiwai Smith, Linda. 2012. Decolonizing Methodologies: Research and Indigenous Peoples. Zed Books.

Bibliography
Recent publications:
Cocq, C. Turning the Inside Out: Social Media and the Broadcasting of Indigenous Discourse. (European Journal of Communication, accepted for publication). Co-author: Lindgren, Simon.
Cocq, C. Narrating Climate Change: Conventionalized Narratives in Concordance and Conflict. (Narrative Works, accepted for publication 2016). Co-author: Andersson, Daniel.
Cocq, C. Exploitations or Preservation? Your choice! Digital modes of expressions for perceptions on nature and the land. In: Communicating environment. How different communication forums react to ecological dangers. Ed. Heike Graf. Cambridge: Open Book Publishers. 2016.
Cocq, C. Reading small data in indigenous contexts: ethical perspectives. In: Research Methods for Reading Digital Data in the Digital Humanities, Eds. Gabriele Griffin and Matt Hayler. Edinburgh University Press. 2016.
Cocq, C. Mobile Technology in Indigenous Landscapes. In: Indigenous People and Mobile Technologies. Eds. Laurel Evelyn Dyson, Stephen Grant and Max Hendriks. Routledge, New York and Milton Park, UK. Pp. 147-159, 2016.
Cocq, C. Indigenous voices on the web: Folksonomies and Endangered Languages. Journal of American Folklore, Vol. 128, no 509, 2015. pp. 273-285.

Enemies of Books

Olof Gunnar Essvik
Gothenburg University, Sweden

I download the book from the Internet, https://books.google.com, Enemies of books, written by William Blades, published in 1881. A book on the decay of books. The enemies of the physical book – fire, water, gas, the bookworm, dirt, bigotry etc. A digitized book with no identity. Black letters on a white background. I buy a copy of the book from 1881. A yellowed and stained copy. The book bears traces of a former owner, one Dr. Sarolea. Newspaper cuttings on his death and his extensive book collection. I compare the two texts, the original and the digitized copy. Using my computer I create a tool for binding books. I combine and modify traditional tools that have been used for hundreds of years and print out the components on a 3D-printer. The next day I print out another copy of the digitized book, and using the 3D-printed tool I make an exact replica of the book from 1881, in its original design. I construct a manual describing the process and upload the files to the Internet.

This is a description of a project which I began in 2014 as an artistic development project financed by the University of Gothenburg. A project exploring digitization, human traces and the unique copy, and at the same time an act of resistance against digitization and technology. The outcome of the project was a book, presented together with the tool used to produce it.

The project has been presented at art museums, conferences and universities, both as workshops and as performance lectures. In 2016/2017 I will publish two new books within the project: one about code, marbling and chance (a chapter in the book DATA BROWSER 06, Autonomedia (will be released late 2016)) and another book about shadow libraries and pirated books (Georges Perec, The Machine, Rojal förlag 2017).

I would like to present this project as an experimental poster together with objects from the project. Objects such as coded marbleized papers, 3D-printed bookmaking tools, books and posters describing the process of bookbinding using digital tools. I have also done performances during conferences where I make calendars for 100 years. There are a number of options that could be discussed depending on space and your interest.

Read more and see pictures at: http://www.rojal.se/theenemiesofbooks/
The upcoming article about marbled books and code can be found here: http://www.rojal.se/chanceexecution

Topics: Visual and Multisensory Representations of Past and Present
Keywords: bookbinding, shadow libraries, calendars, 3d printed, objects, time, 100 years

Bibliography
Books
Essvik, Olle & Nordqvist Joel, Den här datorn, 2015, Rojal Förlag, Gothenburg, Sweden
Essvik, Olle & Nordqvist Joel, Virtuella Utopier, 2015, Rojal Förlag, Gothenburg, Sweden
Essvik, Olle, Enemies of Books, 2014, Rojal Förlag, Gothenburg, Sweden
Articles/papers
Essvik, Olle, Chance Execution, Data Browser 06, Executing Practices, (http://www.data-browser.net/06/), Autonomedia, New York, USA (upcoming release late 2016)
Essvik, Olle, The museum: as a game, Art and Game Obstruction, Skövde Högskola 2016, Rojal Förlag, Gothenburg, Sweden (upcoming release late 2016)
Essvik, Olle, For Years to Come, 2015, In: PARSE Conference, Nov 4-5, The 1st PARSE Biennial Research Conference on TIME. (Conference paper)

Working with Digital Newspapers

Katrine Gasser
Mogens Vestergaard Kjeldsen
Royal Danish Library

Introduction
Since 2014 the Royal Danish Library, Aarhus (RDL) in Denmark has been digitizing historic newspapers. To date more than 25 million newspaper pages have been scanned and OCR processed. This has not only generated one of the biggest digital newspaper archives in the world, but also a huge amount of text documenting the Danish history and language from the 1800s up until our time.

As a means of inspiration for researchers in the digital humanities, we at RDL find it highly relevant to show some of the possibilities given when taking a digital research approach to the newspaper archive. There are many ways of exploring and approaching a text corpus like this. We have selected visualization, mapping and OCR correction as the starting points. This has produced the beta version tools Smurf, Dots and W2C, which are described in detail below. We find all the tools suitable for implementation with other text corpora than the digital newspaper archive.

Smurf
Smurf is a tool that visualizes how the use of words/phrases in Danish newspapers has evolved since the 18th century. The visualization consists of a graph with a timeline on the X axis and, on the Y axis, the occurrence of the searched word or terms in the digitized newspapers. With Smurf you can search for words or phrases and see graphs, do multiple searches and compare graphs. Clicking a graph point will take you directly to the source newspaper.

Smurf has been available to the public since mid-September 2016 and has been presented to various researchers but also to groups of students (history / literature). The tool's full potential still needs to be explored. Link: http://labs.statsbiblioteket.dk/smurf

Dots
Dots is a visualization tool based on the newspaper corpus (1800–2013) and a map from Kortforsyningen which contains coordinates and names of cities in Denmark. Dots visualizes the occurrences of words from the newspapers on a map. Dots has a timeline where it is possible to limit the search to a specific period. Furthermore, the timeline makes it possible to track the geographic presence of a word or sentence over time.

Dots is a fairly newly developed tool and will be public in Spring 2017.

Word2vec
Word2vec is a high-dimensional word embedding tool based on an unsupervised machine learning algorithm using a simple neural network. It maps each unique word in a large text corpus to a vector. The vector representation of the words reflects semantic properties of the words. Words that appear in the same context will be close in the vector space (similar words). But distance between words can also be used to find analogies. The Word2vec tool features several corpora including a very large one based on the digital newspaper archive.

Currently, RDL is examining whether Word2vec can be used to correct OCR scanned text by comparing a list of words from Word2vec with words from a Danish dictionary. Link: http://labs.statsbiblioteket.dk/dsc/

Future research options
RDL holds various collections and archives including the Danish Web Archive. We imagine the tools presented above have potential to be used to explore content in the archive.

The Danish Web Archive
Netarkivet.dk is the Danish web archive. It contains the Danish part of the Internet from July 2005 onwards. Due to Danish laws on personal data and data protection, access to netarkivet.dk is restricted to researchers with permission for relevant research projects. The archive may be accessed at http://netarkivet.dk/.

Key learnings
One of the key learnings so far is the value of incorporating other digital sources. As an example, the STO dictionary (Den store danske SprogTeknologiske Ordbase) (Braasch & Olsen 2004), delivered from the Centre for Language Technology, University of Copenhagen, is essential in making the tool Word2vec useful when trying to identify errors in the OCR generated text. The tools presented have been customized to the library's newspaper archive and are not open source.

Topics: Nordic Textual Resources and Practices
Keywords: text, mapping, corpora, archive
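The comparison of a Word2vec word list with a dictionary, as described in the Word2vec section, can be sketched as follows. This is only an illustration under stated assumptions, not RDL's implementation: the three-dimensional vectors, the words and the helper `suggest_corrections` are all invented, and the step of proposing the nearest dictionary word by cosine similarity is an assumption about how such a comparison might be used.

```python
import math

# Toy 3-dimensional "embeddings"; real word2vec vectors have far more dimensions.
# All words and numbers are invented for illustration.
VECTORS = {
    "avis": [0.90, 0.10, 0.00],
    "avls": [0.88, 0.12, 0.01],  # imagined OCR misreading of "avis"
    "blad": [0.80, 0.20, 0.10],
    "hest": [0.00, 0.10, 0.90],
}

# Stand-in for a real lexicon such as the STO dictionary.
DICTIONARY = {"avis", "blad", "hest"}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def suggest_corrections(vectors, dictionary):
    """Map each corpus word missing from the dictionary to its nearest dictionary word."""
    known = [word for word in vectors if word in dictionary]
    suggestions = {}
    for word, vector in vectors.items():
        if word in dictionary:
            continue  # the word is attested in the dictionary, nothing to correct
        suggestions[word] = max(known, key=lambda k: cosine(vector, vectors[k]))
    return suggestions


if __name__ == "__main__":
    print(suggest_corrections(VECTORS, DICTIONARY))
```

The sketch relies on the property stated in the abstract, that words appearing in the same contexts lie close together in the vector space; a misread word form would then tend to sit near its correct spelling.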

References
Braasch, Anna & Olsen, Sussi. 2004: STO: A Danish Lexicon Resource - Ready for Applications. In Proceedings of the Fourth International Conference on Language Resources and Evaluation, vol IV. Lisbon, pp. 1079-1082.

Towards a Material Politics of Intensity – Mimetic, Virtual and Anarchistic Assemblages of Becoming-Non-Human/Machine in Minecraft

Marleena Huuhka
University of Tampere, Finland

My poster for the Nordic Digital Humanities Conference will present my ongoing PhD project titled "Towards a Material Politics of Intensity – Mimetic, Virtual and Anarchistic Assemblages of Becoming-Non-Human/Machine in Minecraft".

My doctoral thesis examines Minecraft and other such sandbox building games as material, mimetic, virtual, nomadic and anarchistic performance rhizomes and locations of becoming-something created in cooperation with human and non-human agencies. Human agents involved are for example human players, human spectators and human game designers. Non-human agents include game devices, pixels, electricity, programming language, game avatars and virtual game environments. My research is located at the intersections of performance research, new materialist philosophy, and game research.

My research deconstructs the subject-object dichotomy between human and non-human agents. In virtual game performances non-human agents participate in the production of the performances together with the human player. The avatars' movements, though orchestrated by human hands, are the results of human/non-human cooperation. The performance is thus constructed in a shared process of different yet equally important agents. My claim is that in this process the human agent merges into a greater agential ensemble of non-human quality. This approach stems from the work started by Gilles Deleuze and Félix Guattari, and more recently continued by philosophers and media theorists such as Jane Bennett, Jussi Parikka, and Rosi Braidotti.

The performance research aspect of my thesis is to produce an-archic performances and performance spaces, as introduced by director and theatre theorist Antonin Artaud, in a video game context. For Artaud theatre is "the sense of gratuitous urgency with which they are driven to perform useless acts of no present advantage." The Artaudian performance is by nature anarchic/anarchistic: it does not concede to negotiate with power structures or to affect via political channels; rather, it works primarily through performative demonstrations, which gain their quality outside the norm system.

This kind of performativity can be created in video games through the strategies of counterplay. Counterplay (Nakamura & Wirman 2005, Apperley 2010) means ways of playing in which the player searches the virtual environment for ways of being and acting unthought-of or unintended by the developers of the game. Usually these practices go against the set goals or intended uses of the game in question. Counterplay is thus gratuitous action done purely for the sake of itself. The combination of Artaud's theory with game theory provides a fresh angle of approach to games as performances.

My thesis concentrates on the following questions: 1) in what performative assemblages do the human and nonhuman agents participate in video games, and what kind of meanings are constructed in these assemblages; 2) what kind of performative subjectivities are created through the potentialities of mimesis, virtuality, and becoming-something; and 3) does the Artaudian, anarchistic practice of counterplay open up possibilities for nomadic existence, and if so, why?

Video games have a growing influence on our thinking, societies and cultures. The importance of transdisciplinary research is thus greater than ever. My research opens up new spaces of possibilities by combining performance research, game research and new materialist philosophy. I suggest new ways of participation, resistance and being that transcend the boundaries of species and materialities. The approaches participate in the discussions in the field of digital humanities, and link the pleasure of game play with the critique of hypercapitalism.

In my poster I will present the key concepts of my research introduced above, accompanied by practical examples from my own game play experiences.

Topics: The Digital, the Humanities, and the Philosophies of Technology
Keywords: theatre, game research, new materialism, Minecraft

References
Artaud, Antonin 1983. Kohti kriittistä teatteria. Delfiinikirjat, Otava.
Apperley, Thomas H. 2010. Gaming Rhythms: Play and Counterplay from the Situated to the Global. Institute of Network Cultures.
Bennett, Jane 2010. Vibrant Matter. A political ecology of things. Duke University Press.
Braidotti, Rosi 1994. Nomadic Subjects. Columbia University Press.
Braidotti, Rosi 2013. The Posthuman. Polity.
Deleuze, Gilles & Guattari, Félix 1987 (trans. Massumi, Brian). A Thousand Plateaus. Capitalism and Schizophrenia (Mille Plateaux. Capitalisme et Schizophrénie). The University of Minnesota Press.
Dolphijn, Rick & Van Der Tuin, Iris 2012. New Materialism: Interviews & Cartographies. Open Humanities Press.
Galloway, Alexander R. 2006. Gaming. Essays on Algorithmic Culture. University of Minnesota Press.
Nakamura, Rika & Wirman, Hanna 2005. “Girlish Counter-Playing Tactics.” in Game Studies, volume 5, issue 1. http://www.gamestudies.org/0501/nakamura_wirman/
Parikka, Jussi 2014. The Anthrobscene. The University of Minnesota Press.
Parikka, Jussi 2015. A Geology of Media. The University of Minnesota Press.

Bibliography
2017 Journeys in Intensity – Human and Non-human Co-agency, Neuropower and Counter-Play in Minecraft in RECONFIGURING HUMAN AND NON-HUMAN: TEXTS, IMAGES AND BEYOND. Eds. Karkulehto, Koistinen & Varis. Peer reviewed.
2016 Experience the Wild – Non-human Agency and Performership in Video Games in NÄYTTÄMÖ JA TUTKIMUS 6. Eds. Arlander, Gröndahl, Kinnunen & Silde. Peer reviewed.
2015 Labyrinth – Perspectives on Games and Performances in ESITYSTUTKIMUS. Eds. Arlander, Erkkilä, Riikonen & Saarikoski. Partuuna. With Marjukka Lampo.

The Cultural Heritage HPC Cluster

Per Møldrup-Dalum
State and University Library, Denmark

The Danish e-Infrastructure Cooperation (DeIC) has been charged with spreading High-Performance Computing (HPC) to new research areas, such as the humanities and social science areas. In order to respond to this, DeIC and the State and University Library have agreed to establish the DeIC National Cultural Heritage Cluster, State and University Library.

The cultural heritage cluster applies state-of-the-art technologies within data science, and for the first time ever facilitates quantitative research projects on the digital Danish cultural heritage – e.g. radio and TV programmes, websites and historical newspapers.

Collections Available to Research Projects
The State and University Library and the Royal Library together are responsible for

collecting and preserving Danish cultural heritage, including the digital cultural heritage. This digital cultural heritage is divided into numerous collections, each with its own properties, formats and possibilities. Examples of collections that are now made available to researchers include radio/TV, the Netarchive and the Danish Newspaper Collection.

The radio/TV collection contains more than 1 million hours of TV broadcasts and more than 1.5 million hours of radio programmes broadcast on Danish channels from the 1980s until today. The collection's data are made accessible as audio and video files. The collection also contains large amounts of metadata, such as programme titles, broadcast times and subtitles, depending on the epoch from which the material originates. Read more at mediestream.dk.

The Netarchive contains more than 800 TB of data, corresponding to more than 20 billion objects gathered from the Danish part of the Internet from 2005 until today. This archive also contains both data and metadata, and both are made available to research projects. The Netarchive is a joint national project between the Royal Library and the State and University Library, and you can read more at netarkivet.dk.

The digital newspaper collection contains 25 million newspaper pages from the 1700s until today. All of these pages are stored as image files along with a large amount of metadata and optical character recognition (OCR) data.

The Cultural Heritage Cluster is also available to research projects that bring their own data.

Platform
The Cultural Heritage Cluster is to support new areas and methodologies, particularly within digital humanities. It was therefore decided to design a system that would make it easy to conduct well-established analyses without having to compromise in relation to advanced and bespoke methods. The Cultural Heritage Cluster is making IBM's BigInsights platform available to research projects. This platform consists of the Open Data Platform (ODPi) and commercial products. ODPi features most of the current Hadoop technologies, e.g. Spark and MapReduce.

BigInsights adds a number of commercial applications to ODPi, most prominently BigSheets and Text Analytics. BigSheets uses the spreadsheet metaphor and makes it easy to get started analysing billions of rows of structured data. Text Analytics is a browser-based work area for analysis of unstructured data, e.g. text corpora, and it comes with a number of complete modules for e.g. POS, NER, and sentiment analysis.

RStudio and Jupyter technologies will also be available for performing more programmatic and advanced analyses, ensuring that the system can scale to arbitrarily complex research projects.

Pilot Projects
In the first phase, three pilot projects will utilise the system's new facilities. The State and University Library, in collaboration with the DeIC eScience centre of competence, will make facilities available and offer training in use of the system to the researchers working on these projects free of charge. In 2017 and 2018, DeIC and the State and University Library will offer further, fully financed pilot projects through open project invitations.

In the course of 2017, it will be possible to buy calculation time and consultancy assistance under a transparent price model, which will be developed in connection with the first pilot projects.

The three planned pilot projects are:
* Probing a Nation's Web Domain, run by Professor Niels Brügger from Aarhus University. The project will analyse the Danish part of the Internet as it has developed from 2005 until today. Their data source will primarily be metadata from the Netarchive.
* Digital Footprints Research Group, run by Anja Bechmann, Aarhus University. This project will analyse data from social media. The data source will be both the project's own data and data from the Netarchive.
* A project run by Sabine Kirchmeier-Andersen from the Danish Language

Council's research institute. This project will analyse the development in the Danes' language usage on social media, and the data source will be the Netarchive.

Topics: Nordic Textual Resources and Practices
Keywords: HPC, big data, distant reading, quantitative research, EDA

Collecting Speech Data over the Internet

Tommi Nieminen
University of Eastern Finland
Tommi Kurki
University of Turku, Finland

Collecting speech data for linguistic purposes has always been a notoriously tedious and labour-intensive process. Even when it is possible to gather the informants all in one place, the recording studio, the fine-tuning of the equipment, test recordings, and other initial organizing of the session take so much time that even in an optimal situation it is rarely possible to have one- to two-hour recordings of more than a handful of informants during one workday. And when the research is targeting the areal or social variation of speech, it is hardly ever possible to have the informants come to the researcher; it has to be the other way around. Thus in many cases it takes one to two workdays per informant, which means that gathering speech corpora requires considerable investments in work time, and that the build-up of large corpora tends to be extremely slow.

This was the dilemma we created the Prosovar project, or more fully “The dialectal and social variation of Finnish prosody”, to solve; the project was funded by the Kone Foundation from 2013 to 2015. In it, we experimented with gathering speech data over the Internet using Web 2.0 techniques: the informants sat comfortably in their homes and used their own computers or mobile devices to connect to our site (https://puhu.utu.fi/) to record their own speech samples. Our intention was to crowdsource this most tedious phase of the process to the informants themselves as fully as possible.

The site hosted several different kinds of tasks for the informants. Some of them just told the user what to read; this of course has a seriously deteriorating effect on the naturalness of the informants’ speech. In order to collect more spontaneous speech, other tasks gave the informants only the barest instructions of what to do. For instance, there was a task where the informant was shown two pictures with minor differences; their task was to spot the differences and report them verbally. In another task, the informant was shown a map of an imaginary city and asked to guide a stranger from one point to another. In still another, the informant was to assume the role of a buyer in a marketplace and ask for some berries from the salesman. And so on.

Of course, in order to count as crowdsourcing instead of being only a transfer of responsibilities from the researchers to the informants, it is necessary to offer some real bait for the users. Our bait was to create some game-likeness in the site. The informants could for example try to recognize the dialects of other informants – i.e., they could listen to other informants’ speech samples and hazard a guess. This game-like nature was, however, never fully realized during Prosovar’s lifetime.

Now that the project is almost over (the site will close in December 2016) we are in a position to report what you can and cannot do in this way. First of all, we now know that it is entirely possible and extremely time-saving. It is obvious that the Web 2.0 and its facilities are here to stay even when the researcher is interested in speech data.

There are nevertheless several obstacles. First of all, since the recordings are fully carried out by the informants, it is not possible to control the recording settings (in any other way than counselling the users). This tends to leave us at the mercy of the whims of the web browsers and their plugins and add-ons. Secondly, because of this and other considerations, the quality of the speech data

is very variable. In this case, we were luckier than others might be since the features of speech we were interested in, the prosodic features, are more robust than some others. Still, since the recording almost always involved (lossy) packing, the spectral information in the sound data is always somewhat corrupted. Thirdly, there is no obvious technological candidate for implementing the recording. We used Flash although it is quickly being phased out. For security reasons, mobile devices often block the access to the recording device totally when called from a remote connection. This creates further problems we hope to be able to solve in the future.

Topics: The Digital, the Humanities, and the Philosophies of Technology
Keywords: speech corpora, web 2.0, crowdsourcing

Bibliography
Małisz, Zofia & O’Dell, Michael & Nieminen, Tommi & Wagner, Petra 2016: Perspectives on speech timing: Coupled oscillator modeling of Polish and Finnish. Phonetica 73: 233–259.
O’Dell, Michael L. & Nieminen, Tommi & Lennes, Mietta 2015: Hazard regression for modeling conversational silence. – ICPhS 2015: Proceedings. 18th International Congress of Phonetic Sciences, 10–14 Aug 2015, SECC, Glasgow, Scotland, UK.
Nieminen, Tommi & O’Dell, Michael L. 2013: Visualizing speech rhythm: A survey of alternatives. – Eva-Liina Asu & Pärtel Lippus (eds.), Nordic Prosody: Proceedings of the XIth Conference, Tartu 2012. Peter Lang, Frankfurt am Main. 265–274.

Staging the Medieval Religious Play in Virtual Reality

Annika Rockenberger
University of Oslo, Norway

Late Medieval Germany saw the emergence of the so-called religious play as the predominant ‘dramatic’ form in an institutional context, while the theatre tradition of Greek and Latin Antiquity had been discontinued. Contents of these plays were mostly taken from the New and Old Testament (incl. Apocrypha) as well as hagiography. However, settings and themes from the secular sphere were often included as social satire and for comic relief. Performed during Christian Holidays, and often intertwined with liturgy and church parades, their venues and ‘stages’ were either set in(side) sacral spaces (churches or other religious buildings) or within close proximity, like central markets or town squares. Religious plays are often believed to have been performed over the course of several days, employing a multiple setting (‘Simultanbühne’) that allowed for a non-chronological as well as a perpetual acting and a non-stationary, oscillating focus of the audience.

Both the older and the more current scholarly editions of these religious plays often do not take their unique character into account when it comes to performance and setting (Auditor 2009). Instead, as a result of anachronistic projections, the antique theatre or more modern forms of drama are evoked by mode of representation, which has led to inadequate, misinformed, or flawed interpretations of single plays and an overall misconception of medieval dramatic forms (Wolf 2004, De Marco 2006, Schulze 2012).

Against this, I propose to take full advantage of current technological possibilities such as Virtual Reality and 3D-modelling to create a probabilistic model of the medieval multi-sensory, multi-setting ‘stage’ that allows testing common assumptions and new hypotheses about setting, artistic performance, mise-en-scène, audience-performer-interaction and participation in religious plays.

My corpus consists of Medieval German plays from late C14th to C16th, most of which are accessible in scholarly editions (Bergmann 1986). I will extract staging information from the plays’ paratexts and relevant contexts, especially of plays where church buildings or staging spaces are known or can be inferred from historical sources; in single cases these historical buildings are still ‘intact’. This material will be the starting point from which I create a probabilistic model – an experiential analogy (Foka, Arvidsson 2016) – of the performance space using 3D-modelling. Further, I will make use of a VR engine (UNREAL Engine), following a set of pre-preparations, to re-enact single scenes or entire plays.

Since we know very little about the How of performing a medieval religious play but can infer some information from spatial setting and artistic motion sequences and patterns, the modelling of the performances, the audience-performer-interaction and especially the multiple setting will have to be experimental. The following questions serve a heuristic function:

(1) Given an exclusively ‘inside church-building’ venue (as accounted for in most of the shorter plays), where and how would the multiple setting have to be installed to ensure (a) feasible stage arrangement (size, shape, height and number of scene-space(s) for n performers with at least minimum visibility and audibility for the audience), (b) feasible scene arrangement (‘storyline’, narration, salvation-historical ‘logic’ in relation to spatial, movemental, and sensory confinement of audience), (c) feasible time or duration arrangement (within/together with/in addition to a liturgical performance or parade; in relation to seasonal constraints (daylight/temperature/duties/etc.); in relation to physical constraints (endurance of performers, audience; attention span, sensory overload)); (d) in regard to additional, more general environmental and circumstantial constraints like lighting, acoustics, but also ephemeral things like sounds, noise, and smells?

(2) Given a ‘mixed’ venue making use of both church buildings and open spaces (markets, town squares etc.), how would multiple setting, stage, scene and time arrangement have to be done differently? What other general constraints apply here?

(3) Given the great number of text lines per play and taking into account the aforementioned considerations, I believe that the manuscripts do not provide ‘the text as it was to be performed’ but rather serve as a complete compilation of possible scenes related to a specific Holiday of which a stage director had to pick those scenes he deemed relevant and fitting for a concrete performance within the spatial, physical, and time constraints of the location.

With the proposed poster, I aim to visualize these guiding questions and how VR and 3D-modelling can help answer them. As an example I chose the well-known Neustifter-Innsbrucker Osterspiel from 1391, which is believed to originate in Southern Thuringia, Germany, possibly in the town of Schmalkalden, where the medieval church building has survived and can thus be sampled for testing simple VR modelling using Google Cardboard and photo spheres.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: virtual reality, middle high German, medieval times, religious play, liturgical performance

Bibliography
(German) Medieval Religious Plays
Hofmeister, Wernfried, Cora Dietl, and Astrid Böhm, eds. Das Geistliche Spiel Des Europäischen Spätmittelalters. Wiesbaden: Reichert, 2015. Print. Jahrbuch Der Oswald-von-Wolkenstein-Gesellschaft / Oswald-von-Wolkenstein-Gesellschaft. - Wiesbaden : Reichert, 1981- 15.
Mattern, Tanja. ‘Liturgy and Performance in Northern Germany: Two Easter Plays from Wienhausen’. A Companion to

Mysticism and Devotion in Northern Germany in the Late Middle Ages. Ed. Elisabeth Andersen, Henrike Lähnemann, and Anne Simon. Leiden: Brill, 2014. 285–315. Print.
Schulze, Ursula. Geistliche Spiele Im Mittelalter Und in Der Frühen Neuzeit: Von Der Liturgischen Feier Zum Schauspiel ; Eine Einführung. Berlin: Schmidt, 2012. Print.
Prosser-Schell, Michael. Szenische Gestaltungen Christlicher Feste: Beiträge Aus Dem Karpatenbecken Und Aus Deutschland. Münster [u.a.]: Waxmann, 2011. Print. Schriftenreihe Des Johannes-Künzig-Instituts / Johannes-Künzig-Institut Für Ostdeutsche Volkskunde. - Freiburg, Br : Johannes-Künzig-Inst. Für Ostdeutsche Volkskunde, 1998 13.
Auditor, Anne. ‘Die “Innsbrucker Spielhandschrift”. Überlegungen zu einer Neuedition’. ‘Texte zum Sprechen bringen’. Philologie und Interpretation. Festschrift für Paul Sappler. Ed. Christiane Ackermann and Ulrich Barton. Tübingen: Max Niemeyer Verlag, 2009. 297–305. Print.
Bergmann, Rolf. Katalog Der Deutschsprachigen Geistlichen Spiele Und Marienklagen Des Mittelalters. München: Beck [in Komm.], 1986. Web. Veröffentlichungen Der Kommission Für Deutsche Literatur Des Mittelalters Der Bayerischen Akademie Der Wissenschaften.
Ogden, Dunbar H. The Staging of Drama in the Medieval Church. Newark, Del.: Univ. of Delaware Press, 2002. Print.
Hoffmann, Yvonne. Festtagsgeschehen Und Formgenese in Den Gewölben Der Spätgotik. Mannheim: Waldkirch, 2008. Print.
De Marco, Barbara, ed. Performance in the Middle Ages and Renaissance. Binghamton: Center for Medieval and Renaissance studies, 2006. Print. Mediaevalia : An Interdisciplinary Journal of Medieval Studies Worldwide. - Binghamton, NY : Center, 1975- 1.
Wolf, Klaus. ‘Für eine neue Form der Kommentierung geistlicher Spiele. Die Frankfurter Spiele als Beispiel der Rekonstruktion von Aufführungswirklichkeit’. Ritual und Inszenierung. Geistliches und weltliches Drama des Mittelalters und der frühen Neuzeit. Ed. Hans-Joachim Ziegeler. Tübingen: N.p., 2004. 273–312. Print.

3D-modelling, Virtual Reality, Augmented Reality in Historical and Archeological Research
Greengrass, Mark, and Lorna M. Hughes, eds. The Virtual Representation of the Past. Aldershot: Ashgate, 2008. Print. Digital Research in the Arts and Humanities.
Carter, Brian Wilson. ‘The Evolution of Virtual Harlem: Bringing the Jazz Age to Life’. Digital Humanities 2016: Conference Abstracts. Kraków: N.p., 2016. 143–147. Web.
Foka, Anna, and Viktor Arvidsson. ‘Experiential Analogies: A Sonic Digital Ekphrasis as a Digital Humanities Project’. Digital Humanities Quarterly 10.2 (2016): n. pag. Web. 28 Oct. 2016.
Scheuermann, L., L. Jantke, and W. Scheuermann. ‘Erlebter Raum Im Rom Der Späten Republik - Eine Digitale Forschungsumgebung’. Digital Humanities 2016: Conference Abstracts. Kraków: N.p., 2016. 670–671. Web.

Use of Digital Methods to Switch Identity-related Properties

Jon Svensson
Roger Mähler
Umeå University, Sweden
Mats Deutschmann
Örebro University, Sweden
Anders Steinvall
Satish Patel
Umeå University, Sweden

It has long been observed that language is at the heart of mechanisms leading to stereotyping and inequality. In fact, language is a major factor in our evaluation of others, and it has experimentally been demonstrated that individuals are judged in terms of intellect and other character traits on the basis of their language output alone. Thus, awareness of such mechanisms is of crucial importance in education, especially in the training of groups who will be working with people in their future profession, such as teachers, police, psychologists and nurses. Although some courses deal with the consequences of linguistic stereotyping on a theoretical level, there is a need to provide students with a deeper understanding of how they themselves are affected by such processes, so that this knowledge can have an impact on their future practices.

The RAVE research project at Umeå University addresses exactly this issue by exploring and developing (pedagogical) methods for revealing sociolinguistic stereotyping in regard to identity-related properties such as gender, age, physical appearance, ethnicity, etcetera. The main approach is the use of digital matched-guise testing techniques with the ultimate goal of creating an online, packaged and battle-tested method available for public use. The project is, however, not only devoted to creating the product itself but also to testing and evaluating the effectiveness of various digital methods and configurations with respect to raising awareness of linguistic stereotypes, that is, to answering questions such as what is the best script, the best media, the best setup, the proper length, the appropriate content, etcetera.

The method relies to a great extent on a treatment where two groups of test subjects (i.e. students) are exposed to a scripted dialogue between two characters, let’s say “Terry” and “Robin”, in two different versions in which each character is assigned presumed stereotypical properties. In one version, for example, “Terry” may sound like a man, while the other recording has been manipulated for pitch and timbre so that “Terry” sounds like a woman. After the exposure, the test subjects are presented with a survey where they are asked to respond to questions related to linguistic behaviour and character traits of the interlocutors. The responses of the two sub-groups are then compared and followed up in a debriefing session, where issues such as stereotype effects are discussed.

The project produces the two property-bent versions based on a single recording, and the switch of the property (for instance gender or age) is done using digital methods. The reason for this procedure is to minimize the number of uncontrolled parameters in the experiment. It is a very difficult, if not an impossible, task to transform these identity-related aspects of a voice recording, such as gender or accent, into a “perfect” voice – a voice that is opposite in the specific aspect, but equivalent in all other aspects – and to do so without changing other properties in the process or introducing artifacts.

This project doesn’t strive for perfection. Instead the focus is on the perceived credibility of the scripted dialogue. Various kinds of techniques, for instance the use of audio-visual cues, can be used both to distract the test subject from the “artificial feeling” and to reinforce the target property. For instance, to reinforce the gender, we can use visual cues, switching images between a man and a woman. We can also add distractions that lessen the listeners’ focus on the speaker. It is also possible to use scrambling techniques, for instance by setting up the dialogue as a low-quality phone call or a Skype session.

This poster session will present the experiences gained and lessons learned so far in the ongoing project. We will give an overview of the various methods used and tested so far, starting from methods used in prior projects with rather simple, low-quality, gender-morphed voices in Second Life reinforced with avatars, to more sophisticated qualitative attempts made by audio experts and with the use of sophisticated software.

The focus will be on the credibility aspects and the methods to determine if a certain dialogue is perceived as credible or not. The presentation covers aspects such as selection of narrators, use of actors, use of standard audio manipulation software as well as dedicated phonetic software such as PRAAT, the use of speech synthesizers, and morphing towards a reference (or imposter) voice. The poster will be supplemented by an mp3 player and a headset where visitors can listen to samples.

Topics: Visual and Multisensory Representations of Past and Present
Keywords: language, stereotyping, voice-morphing, matched-guise technique, perceived credibility

Prozhito: Private Diaries Database

Nataliya Tyshkevich
Ivan Drapkin
National Research University, Moscow, Russian Federation

Despite the continuing interest in ego- and microhistory research, with particular focus on collective memory (Burns 2006), scholars still tend to neglect personal writings. The most valuable of them are private diaries with entries, usually tied to a chronological line, representing personal narratives of people of different ages and cultural and social levels. There are many projects based on the personal diaries of one person, such as the diaries of George Orwell (George Orwell Diaries 1938–1942). At the same time, multi-author online diary corpora are rather popular; see, for example, Teddiman (2009). However, there exist neither relevant diary subcorpora of any national European corpora (such as the British National Corpus), nor special multilingual databases of personal diaries.

We are the first to present a global database of private diaries, tied to a chronological line, representing personal narratives of people of different ages and cultural and social levels. “Prozhito” is a non-profit project, which blends the structural experience of blog platforms and the archival tradition of curating personal writings. Our database includes 400 diverse non-authorized diaries or 150,000 entries from the 19th and 20th centuries in Russian and Ukrainian, with the possibility of multilingual expansion. A researcher can work not only with particular texts but with the whole collection of diary entries, using complex search queries by author’s gender and age and journal types (e.g. war, tourist, dream etc.), and filtering results by exact dates and places of records.

The first version of the site was opened April 24, 2015 and contained 100 journals and 30,000 diary entries, collected from public internet sources. As a result of collaborations with the Russian media, the audience of Prozhito groups in social networks has

grown several times and there was a steady stream of volunteers. In November 2016 we launched a new version, focused primarily on ordinary users. The main features of the latest version are intelligent search tools, a developed system of classification (genre, author, language, geographical origin) and a multilingual architecture.

On the home page users can use a simple tool to search across the notes of all diaries by keyword and dates and observe collections of notes and random quotes. The extended search panel on the Diary page provides parameters for selecting authors and searching notes by author’s name, last name, age, note tags or keywords etc. On the Diary page users can see the full list of uploaded diaries, categorized by language. Every diary author has his own profile page with information about the author and his diary.

From the beginning, the project was designed as a heavily visited web platform with unlimited scalability. The architecture of the current version combines several databases: MySql, Sphinx, Redis. In MySql we store all the basic entities: a person, diary, entries, comments, thematic tags, copyright status, preview, etc.

Search by various parameters, as well as the main morphological search, is implemented through the Sphinx system. Sphinx indexes the words of all entries, diary dates and other parameters.

A feature of the loaded data is that not all known records have an exact date of writing. To include these entries in the sample, each entry is automatically assigned, at the diary-processing stage, the estimated date period in which it was written.

The bibliography of published private diaries, compiled by participants of the project, currently contains more than 1,500 units. Project participants identify existing publications, approach the heirs, owners or publishers for the electronic text, search for a copy of the text, and scan and recognize the book.

In addition to working with the already published material, Prozhito implements its own publishing program.

Work with texts from family collections needs to strike a balance between personal and public to avoid publishing information that could possibly compromise third parties. This is one of the most difficult tasks of publishing private archives, the key to which can only be found in close cooperation with the heirs and administrators of family archives. In Prozhito the manuscript owners (person or family) continue to participate in its preparation for publication and control the text at all the steps of its transformation from the manuscript to the machine-readable database unit. They have the right to exclude fragments considered inappropriate for ethical reasons.

Working with a family history often activates intrafamily communication, but the information stored in the family archives is of interest not only to the family members. The Prozhito project allows any user to explore the diary data and provides a wealth of material for researchers of everyday life.

Topics: Nordic Textual Resources and Practices
Keywords: private writing, database, corpora, private diaries

Creating Children’s Books in the Context of Pokémon Go, Museums and Cultural Heritage

Lars Vipsjö
University of Skövde, Sweden

The poster will show the contents of Kiras och Luppes Bestiarium (Kira and Luppe's Bestiary) – a children's book series set in Skaraborg cultural environments. The books are supported by an Augmented Reality app, KLUB Bestiarium, which can be downloaded for free from App Store or Google Play. Users of the AR app will see the mythical beings from the books appear also above the books in 3D. This happens when you hold the mobile phone's or tablet's camera over letter shapes (drop caps/initials) found on certain pages.

The stories are designed together with museum educators and written and illustrated by game development students at the University of Skövde. The books are published in cooperation with Lokrantz book publishers in Lidköping and the app company Solutions Skovde. The project was first presented to the public at the Book and Library Fair in Gothenburg in autumn 2016. Two books are published and several more are under construction. In the books, the friends Kira and Luppe help the troll researcher Lovis save mythical beings from the evil ringmaster, who is actually also a troll, a mountain king. The ringmaster wants to force other mythical creatures to perform at his circus.

The idea is that the fairytale figures will, through the app, also be portrayed and “come to life” at the local heritage sites and museums where the stories are unfolding. Mythical beings, when found, are scanned and saved in the app's bestiary, where the users can find facts to read about them.

Idea and outcome targets: Because the books and the app's content are produced together with children and museum educators (target group 7-11 years), the material is thought to be useful in teaching situations.

Bibliography
Vipsjö, Lars och Johansson, Therese (2016). Kiras och Luppes Bestiarium. Jättinnan. Lidköping: Lokrantz förlag. ISBN 978-91-98351323
Vipsjö, Lars och Hansson, Matilda (2016), Harmannen. Lidköping: Lokrantz förlag. ISBN 978-91-983513-0-9
Arvas, Maina (2015). ”Lars Vipsjö om onda stereotyper”. Intervju under temat ”Detalj” i Tecknaren #5 2015, Stockholm: Föreningen Svenska Tecknare.
“Scarred and evil – A villain stereotype that does not inspire empathy when he loses” (2015) In Westin, J., Foka, A. and Chapman, A. (eds) Challenge the past / diversify the future – proceedings March 19-21 2015 Gothenburg: University of Gothenburg. http://hdl.handle.net/2077/38407
Vipsjö, Lars och Bergsten, Kevin (2014) Tecknad karaktär – anatomi, fysionomi och psykologi. Lund: Studentlitteratur. ISBN 978-91-44-08482-4
Within museums, libraries and schools, the From Online Research books and app. can be tailored to local requirements. Kiras and Luppes Bestiary Ethics to Researching (KLUB) is a part of the project Kastis - Kul- Online Ethics turarv och spelteknologi i Skaraborg (Cul- tural heritage and gaming technology in Ska- Sari Östman & Riikka Turtiainen raborg) supported by Skaraborgs kom- University of Turku, Finland munalförbund (Skaraborg municipal associ- ation), the University of Skövde and a num- Along with the rise of a research field called ber of Skaraborg Municipalities. Some of the digital humanities, online specific research books in KLUB has also received funding in ethics plays an especially significant role. Re- the form of reading promotion support search on the same (Internet related) topic is from the Västra Götaland region. KASTiS is usually multidisciplinary, and understanding a sub-regional collaborative and knowledge research ethics even inside the same research platform for the sustainable use of game te- community may vary essentially. It is im- chnology and interactive media in Skaraborg. portant to recognise and pay attention to The project started August 1, 2015 and ends online specific contexts as well as the resear- on July 31, 2018. cher’s own disciplinary background. In our poster, we will present a model Topics: Visual and Multisensory Representat- which we have under developing process: ions of Past and Present this fourfold table will help researchers in Keywords: augmented children books, cultural positioning themselves as ethical actors on heritage multidisciplinary online-related fields. The positioning is based on the researchers’

166

Figure 1. Axis of Digitality & Humanities. Suominen & Haverinen 2015. topics and their disciplinary backgrounds in of such ethical matters, which are charac- relation to the position of the internet in teristic for humanistic research. their study. In Figure 2 (see following page), we have The model is based on Jaakko Suomi- situated 15 informants to our model. They nen’s and Anna Haverinen’s Axis of Digitality are all researchers in a multidisciplinary con- & Humanities, which is a tool for positioning sortium project, which studies Suomi24, Fin- yourself on the field of Digital Humanities land’s oldest and largest online discussion as such. (Figure 1. Suominen & Haverinen forum. They answered to a survey, which 2015; translated S. Östman & R. Turtiainen, mapped their backgrounds and current rese- 2016, p. 3.) arch as well as their understanding of rese- Our model for ethical positioning builds arch ethics. According to their answers, we on basis of this fourfold table: we suggest have defined their positions in the coordi- that researchers would ask themselves − in nation. Preliminary results suggested that addition to their subjects and backgrounds − researchers from quantitatively and the- whether the internet is a tool, the source or the oretically oriented disciplines on average see subject to their study. It could be just one or research ethics less as a reflective, analytical even all three; in latter case we would consi- process and more so as concerning copy- der internet as research environment of the right laws, for example. On the contrary, study. On the basis of the position of the empirically and understandingly oriented (of- internet, the individual take to research and ten cultural studies -based) researchers ten- e.g. the disciplinary background, we suggest ded to be more analytical, versatile, and re- the researchers would locate themselves in a flective in their ethical views. 
Humanistic coordination system like the one in Figure 2 researchers also had more education about (see next page) in order to see the relevance the subject (Östman & Turtiainen 2016, 72– 73).

Figure 2. Model for ethical positioning. Östman & Turtiainen 2016, 72.

We are currently studying multidisciplinary research ethics further and developing the model for ethical positioning. The questions include the following:
1. What kinds of ethical questions are relevant among the different disciplines taking part in a multidisciplinary consortium?
2. What kinds of multidisciplinary ethical guidelines could be provided as a result of studying online research ethics in such a consortium?

Topics: The Digital, the Humanities, and the Philosophies of Technology
Keywords: digital humanities, digital culture, research ethics, online research ethics

Literature
Suominen, J., & Haverinen, A. (2015). Koodaamisen ja kirjoittamisen vuoropuhelu? – Mitä on digitaalinen humanistinen tutkimus. Ennen ja nyt. Retrieved from http://www.ennenjanyt.net/2015/02/koodaamisen-ja-kirjoittamisen-vuoropuhelu-mita-on-digitaalinen-humanistinen-tutkimus/#identifier_14_1502
Östman, S., & Turtiainen, R. (2016). From Research Ethics to Researching Ethics in an Online Specific Context. In Media and Communication, vol. 4, iss. 4, pp. 66−74. Retrieved from http://www.cogitatiopress.com/ojs/index.php/mediaandcommunication/article/view/571

Bibliography
Östman, S., & Turtiainen, R. (2016a). From Research Ethics to Researching Ethics in an Online Specific Context. In Media and Communication, vol. 4, iss. 4, pp. 66−74. Retrieved from http://www.cogitatiopress.com/ojs/index.php/mediaandcommunication/article/view/571
Östman, S., & Turtiainen, R. (2016b). Understanding Ethics in Digital Humanities. Guidelines and tools for conducting research in online contexts. In Folklore Fellows Communications no. 310 (eds. Pekka Hakamies & Anne Heimo). (In publishing process.)


PRE-CONFERENCE WORKSHOPS


Higher Education Programs in Digital Humanities: Challenges and Perspectives

Koraljka Golub
Linnaeus University, Sweden
Jenny Bergenmar
University of Gothenburg
Isto Huvila
Uppsala University, Sweden
Marcelo Milrad
Linnaeus University, Sweden
Mikko Tolonen
University of Helsinki, Finland

Introduction
Different aspects of higher education programs in Digital Humanities (DH), including whether, what and how they should be organized, are currently being discussed at many higher education institutions in the Nordic countries and beyond. In recent years the establishment of new educational programs under the title of Digital Humanities, for example in the USA, the UK and Germany, has been an indication of a perceived need for developing such specific curricula. DARIAH-EU has a dedicated research and education centre under the title of Virtual Competency Centre (VCC) Research and Education Liaison (http://www.dariah.eu/activities/research-and-education.html). DARIAH-EU also runs a registry of Digital Humanities education in Europe (http://dh-registry.de.dariah.eu) which, as of 10 January 2017, lists as currently active 17 Bachelor degrees, 38 Master degrees, and 8 individual courses. The University of Stuttgart and the University of Trier are just two examples of institutions that run programs under the actual title of Digital Humanities. Similarly, EADH (European Association for Digital Humanities) provides a list of education programs, courses and seminars in Europe (http://eadh.org/education) and names 7 undergraduate programs and courses, all with terms like Digital Humanities, Humanities Computing and related in the title; 20 postgraduate ones with a more mixed array of titles; and 4 PhD programs, all with the title of Digital Humanities or very similar (University College London, King’s College London, a cluster of 4 Irish universities, and University of Passau).

In the Nordic countries similar efforts are underway at the University of Gothenburg (http://lir.gu.se/english/education/masters-second-cycle/master-s-programme-in-digital-humanities), which is launching a Master in Digital Humanities in autumn 2017. The University of Helsinki (https://www.helsinki.fi/en/researchgroups/helsinki-digital-humanities) is also offering a set of courses in Digital Humanities. Linnaeus University (https://lnu.se/digihum/) aims to develop an international distance Master program in Digital Humanities, with a pilot program starting in the autumn of 2017. At the same time, at other universities, courses in digital methods and topics have been integrated into existing and new programs as specific compulsory and elective modules, or by including Digital Humanities related topics and perspectives as a part of other courses.

However, what a dedicated course, module or program in the field of Digital Humanities should cover is not always clear. There is considerable variation between different offerings, including diverse content and approaches. The vast range of disciplines, fields, areas and topics relevant to Digital Humanities presents a challenge as to what to include in a dedicated program, how to address the different challenges related to bringing together different disciplinary traditions and methods, and how to accommodate professional, infrastructural and academic requirements for such initiatives. Moreover, there are several challenges associated with what is expected from the outcomes of these new educational programs and efforts. Which job positions and tasks could a graduate Digital Humanist take on after completing a Digital Humanities program? Is there a need for Digital Humanists as such, or should education in all humanities subjects be more inclusive of digital technology-related, cross-disciplinary and cross-sectorial topics? If the latter is the case, do we need entire programs, or could the alternative of focusing on dedicated modules or individual courses address existing and emerging needs of both the academic and the non-academic spheres? Furthermore, if both approaches were deemed to have their merits, how do they differ, overlap and complement each other in the context of educating future researchers and professionals for different sectors of society?

The aim of this proposed workshop at DHN 2017 is to bring together scholars, educators and others interested in different aspects of Digital Humanities education to explore the current potential, challenges and opportunities related to the teaching and learning of Digital Humanities. The workshop will provide an opportunity to share experiences and to discuss existing programs, modules and courses in Digital Humanities, research and development activities, evaluation approaches, lessons learned, and findings. A further objective is to engage systematically in discussions in common areas of interest with selected related communities and to investigate potential co-operation and concrete collaborative activities.

The workshop will allow major established programs and initiatives to report results and newcomers to interact with established people in the field, so that the entire community can critically discuss topical issues. The DHN venue encourages participation by Digital Humanities teachers, researchers and developers from different perspectives (reflecting the different conference threads). As the first workshop on education at DHN, it may set the path for future workshops at the annual DHN conferences in order to establish and provide a regular forum for discussions on education in Digital Humanities in the Nordic countries and beyond.

Workshop themes
The proposed workshop will have three themes as the main focus, together with topical presentations arising from the workshop CfP. The main themes are:
1. Existing programs, modules or individual courses in Digital Humanities: design, target student groups, content, job market, evaluation, experiences and lessons learned.
2. Currently developed programs, modules or individual courses in Digital Humanities: approaches to the design, target student groups and related issues.
3. Cross-disciplinary and cross-sectorial collaboration in Digital Humanities education.

Workshop structure
Indicative agenda structure, covering approximately 4 hours:
Session 1: Welcome, introduction, mutual presentations (30 min);
Session 2: Presentations on the main themes (90 min);
Session 3: Directed discussion emerging from the main session (30 min);
Session 4: Presentation and discussion of submitted papers on timely and related topics according to the CfP (60 min);
Session 5: Concluding discussion, including options for co-operation (30 min).

Audience
The intended audience includes: teachers and managers at existing and developing Digital Humanities programs; researchers working with topics in Digital Humanities education; professionals who are interested in taking a Digital Humanities program, modules, or courses.
Number of participants: 20
Register to: [email protected]

Data Management for Humanities Scholars – An Introduction to Data Management Plans and the Cultural Heritage Data Reuse Charter

Marie Puren
Charles Riondet
INRIA, France

With the growth of the Open Science movement in the past few years, researchers have been increasingly encouraged by their home institutions, their funders, and society at large to share the data they produce. Significantly, the Horizon 2020 Research and Innovation Programme has undertaken to open the research data produced by H2020-funded projects. A new model of data sharing is emerging, and the challenges this new model raises are having an ever more dramatic impact on the research ecosystem. Rather than seeing it as an additional constraint, scholars can benefit from the advantages that this model of openness offers. Sharing their data allows them to collaborate with fellow researchers within the same discipline or with colleagues from other disciplines, to reduce costs by avoiding duplication of data collection, to make validation of results easier, and to increase the impact and visibility of their research outputs.

Opening research data induces not only a change in mentality but also a change in work methods. Data management has to be seen as the baseline of the research lifecycle. It should therefore be considered as early as possible in a research project and should be flexible enough to evolve throughout the project. For researchers, this practice means planning and deciding how data will be collected, organised, managed, stored, preserved and shared during a research project and after the project is completed. These requirements can best be addressed by setting up so-called data management plans. This method is fairly new to most Humanities scholars, although it is a key element of good data management. Data management plans (DMPs) have the particular advantage of taking into account the fact that data have a longer lifespan than the research project that creates them. DMPs are conceived and applied in order to ensure that data will be preserved and useful both now and in the future, for both their creators and their reusers. Moreover, in order to support open access to research data, several funders make data sharing mandatory, and their applicants must therefore provide a data management plan.

Data management plans are not simply management tools at project level; they also allow broader reflection on research data in the Humanities on a larger scale. Although they can apply to any data in any research field, we have chosen to make their benefit easier to grasp by addressing one specific use case, one that applies to a wide range of Humanities research projects. The focus here will be on the reuse, for research purposes, of data emanating from Cultural Heritage Institutions. In this specific situation, there is often no clear policy on interactions between institutions and scholars. Researchers therefore find it difficult to develop a clear data management policy for research projects involving Cultural Heritage data.

The Cultural Heritage Data Reuse Charter we are currently developing in the context of DARIAH-EU and other research infrastructures tackles this issue by offering an online environment dedicated to all actors taking part in the scholarly reuse of digital data generated by Cultural Heritage Institutions. The Charter online environment will allow the main actors to declare general principles (common work ethics) and, more broadly, to express their position on all the relevant information needed to understand how a given dataset can be reused. Institutions will be able to declare their collections, and researchers their research interests and existing publications, so that these are connected together. The Charter will also help document the knowledge generation process and, consequently, increase the quality of data and metadata accessible to research. Signing the Charter will also imply making a statement about the technical quality of the data to be reused, or of the data derived from such reuse. More broadly, the Charter will offer a concrete implementation framework for the FAIR principles (making data findable, accessible, interoperable and reusable). Finally, clarifying the reuse conditions of cultural heritage data, and thereby also the relationships between scholars and GLAM institutions, will widen the opportunities for cooperation.

Within the framework of this future online environment, Cultural Heritage Institutions (CHI) and scholars will be able to make explicit their constraints concerning data reuse. The Charter will not only allow CHI to clarify their policy on data reuse and give researchers a precise overview of their rights; it will also allow CHI and researchers to handle the digital data they produce more easily and therefore help them define their data management strategy. In other words, the Charter will connect strongly with data management planning, whose main goal is to state clearly the data policy of a research project, and will be an essential asset for data management planning in research on Cultural Heritage.

Workshop provisional program
We expect the workshop to last about three hours. Detailed presentations will be accompanied by open discussion, where we would like to take advantage of the presence of DH researchers and representatives of Cultural Heritage Institutions to engage in a fruitful exchange.
The workshop will be divided into three sub-sessions:

1) Data management for researchers: Overview and challenges (60 min)
In this session, we will discuss the new model of data sharing that is currently emerging, as described above. Participants will also get an overview of research data management and data management planning. Data management can offer many advantages, such as higher quality data, increased visibility and a better citation rate. In this approach, research data is an asset and a resource that can be shared, with mutual benefits for the person who shares the data and the one who collects the data.

2) Sharing research data: methods, tools and benefits (presentation + hands-on session: 60 min)
Sharing their research data allows researchers to organise and retrieve them effectively, to ensure their security, to collaborate with fellow researchers within the same discipline or from other disciplines, to reduce costs by avoiding duplication of data collection, to make validation of results easier, and to increase the impact and visibility of their research outputs. Many are still reluctant to share their data but, fortunately, data sharing is gradually evolving towards greater openness with the movement for Open Science and the development of Open Access. However, researchers need to be aware of the benefits of sharing their research data, because the decision to share (or not) rests most of the time on the shoulders of the researchers, who decide whether and how to share their data.

3) A future pan-European framework for exchanging information about Cultural Data reuse: the Charter online environment (presentation and discussion: 60 min)
In this session, we will present the prototype of a future online environment dedicated to Cultural Heritage data reuse. By taking into account the longer lifespan of Cultural Heritage data, this future tool will offer many valuable elements (e.g. documentation, guidelines, list of services) that could be used to easily create data management plans:
* Long-term and persistent access to metadata, texts, images;
* Licensing of the content;
* Formats and standards;
* Dissemination of both CHI information and research (visibility of the work of all stakeholders);
* Retro-provision (communicating enrichments based on CHI data to the CHI they originally emanate from);
* Quality control at all levels according to appropriate standards.

174 * This session will be dedicated to discuss and Practice, 2010. Accessed October the features that could be offered by this on- 30, 2016. https://goo.gl/hqKuKQ line environment, regarding data manage- Data management planning ment planning and improved cooperation Committee on Ensuring the Utility and In- between relevant actors (Cultural Heritage tegrity of Research Data in a Digital institutions, researchers, data centers, infra- Age, National Academy of Sciences, structures and other facilities). Ensuring the Integrity, Accessibility, and Stewardship of Research Data in Approximate number of participants: 20. the Digital Age, National Academy Authors: Marie Puren and Charles Riondet, Press, 2009. https://goo.gl/URJglu Ph.D., are junior researchers in Digital Hu- Licensing manities at the French Institute for Research Europeana, The Europeana Licensing in Computer Science and Automation (IN- Framework. Accessed October 28, RIA) in Paris. They currently work on the 2016. https://goo.gl/947T4z creation of a Data Management Plan for the Standards PARTHENOS H2020 project. Marie Puren Laurent Romary, “Stabilizing knowledge also contributes to the IPERION H2020 through standards - A perspective for project, especially by upgrading its Data Ma- the humanities”,Going Digital: Evolu- nagement Plan. Charles Riondet is also in- tionary and Revolutionary Aspects of volved in H2020 EHRI project as a meta- Digitization, Science History Publica- data and standards specialist. tions, 2011.

Topics: The Digital, the Humanities, and the Philosophies of Technology Developing a Repository Keywords: data management, research data, reuse, cultural heritage institutions, and Suite of Tools for cooperation Scandinavian Literature

Bibliography Mads Rosendahl Thomsen Principles of the Data Reuse Charter Aarhus University, Denmark Laurent Romary, Mike Mertens, Anne Bail- Timothy R Tangherlini lot, “Data fluidity in DARIAH – UCLA, United States of America pushing the agenda forward”, BIBLI- Kristoffer Laigaard Nielbo OTHEK Forschung und Praxis, De Aarhus University, Denmark Gruyter, 2016, 39 (3), pp.350-357. Background information on Open Access to publica- The goal of the workshop is to set bench- tions marks for the further development of a Murray-Rust, Peter, “Open Data in Sci- machine-readable corpus of Scandinavian ence”, Serials Review, Vol 34, No 1. literary texts which is part of a project that is Accessed October 28, 2016. the continuation of two Carnegie-Mellon https://goo.gl/9ZqdiQ Foundation sponsored conferences on com- putational approaches to Scandinavian litera- Data management and curation ture. A third conference is planned for Ray, Joyce, Putting Museums in the Data UCLA in November 2017. Curation Picture, Springer, 2014. At the workshop the practical implemen- University College London, Advancing Re- tation of the following goals will be discus- search and Practice in Digital Curation sed: and Publishing. Summary Report and 1) a preprocessed benchmark corpus of Recommendations of the Workshop selected literary texts in the Scandinavian on Next Steps in Research, Education language;

175 2) a wider machine readable Scandinavian of Handwritten Text Recognition (HTR) corpus. The corpora will be assembled from and other cutting-edge technologies. DSL, Litteraturbanken and Norwegian lib- The workshop is aimed at researchers raries; and students who are interested in the 3) a portfolio of tools with documentat- transcription, searching and publishing of ion. historical documents. It will introduce parti- The workshop is focused on aligning the cipants to the technology behind the READ needs of literary scholars with the technical project and demonstrate the Transkribus solutions that can be developed by the core transcription platform. Our team has already group members. conducted over 20 similar workshops over Professor with Special Responsibilities the course of the past year, including several Mads Rosendahl Thomsen sessions with digital humanities scholars and ([email protected], Aarhus University), Pro- students. fessor Timothy Tangherlini Transkribus can be freely downloaded ([email protected], UCLA) and Asso- from the Transkribus website. Participants ciate Professor Kristoffer L. Nielbo will be instructed to create a Transkribus ([email protected], Aarhus U) will chair the account and install Transkribus on their lap- workshop. We expect to attract 10-12 other tops in advance of the workshop. They will participants from Scandinavian and the US, also be asked to upload a few images of including scholars from Gothenburg Uni- historical documents to Transkribus prior to versity and Oslo University who have taken the session. They should bring their laptops part in prior meetings. along to the workshop. The workshop will consist of four parts: Topics: Nordic Textual Resources and Practices 1. 
Keywords: Nordic literature, text mining, corpora building

Transkribus: Handwritten Text Recognition Technology for Historical Documents

Louise Seaward
University College London, United Kingdom
Maria Kallio
National Archives, Finland

Transkribus (https://transkribus.eu/Transkribus/) is a platform for the automated recognition, transcription and searching of handwritten historical documents. Transkribus is part of the EU-funded Recognition and Enrichment of Archival Documents (READ) (http://read.transkribus.eu/) project. The core mission of the READ project is to make archival material more accessible through the development and dissemination

1. Introduction to Handwritten Text Recognition (HTR) technology (20 min)
The introduction to this workshop will explain how new algorithms and technologies are making it possible for computer software to process handwritten text. Handwritten Text Recognition (HTR) technology works differently from Optical Character Recognition (OCR) for printed texts (Leifert et al., 2016). Rather than focusing on individual characters, HTR engines process the entire image of a word or line, scanning it in various directions and then putting this data into a sequence. This introduction will outline the workings of HTR technology and show examples of the successful automatic transcription and searching of historical documents. The latest experiments demonstrate that Transkribus can automatically generate transcripts with a Character Error Rate of 5 %. This means that 95 % of the characters in the transcript would be correct.

2. Overview of the READ project (20 min)
This presentation will give an overview of the READ project and the specific tools it is creating. Computer scientists working on READ are developing HTR technology using thousands of manuscript pages with varying dates, styles, languages and layouts. Testing the technology on a large and diverse data set will make it possible for computers to automatically transcribe and search any kind of handwritten document, from the Middle Ages to the present day, from old Greek to modern English. This research has huge implications for the accessibility of the written records of human history. The READ project is making this technology available through the Transkribus platform but also developing other tools designed to make it easier for archivists, researchers and the public to work with historical documents. The workshop leaders will present prototypes of some of these tools. These include a system of automatic writer identification, an e-learning app to enable users to train themselves to read a particular style of writing, a mobile app to allow users to digitise and process documents in the archives and a crowdsourcing platform where volunteers can transcribe with the assistance of HTR technology. These tools will be open source and are designed to be used and adapted by other institutions and projects.

3. Introduction to Transkribus (20 min)
HTR technology is made available through the Transkribus platform, which is programmed with JAVA and SWT (Mühlberger et al.). A transcription of a handwritten document can be undertaken in Transkribus for two main purposes. The first is a simple transcription – this allows users to train the HTR engine to automatically read historical papers. The second is an advanced transcription – this allows users to create a transcription of a document which may serve as the basis of a digital edition. This presentation will explain both uses of Transkribus.

HTR engines are based on algorithms of machine learning. The technology needs to be trained by being shown examples of at least 30 pages of transcribed material. This helps it to understand the patterns which make up words and characters. This training material is known as ‘ground truth’ (Zagoris et al., 2012, Gatos et al., 2014). The workshop leaders will demonstrate how ‘ground truth’ training data can be prepared using Transkribus.

Transkribus can also be used simply for transcription. This presentation will explain how to create a rich transcription of a document in the platform, using structural mark-up, tagging, document metadata and an editorial declaration.

4. Working independently with Transkribus (120 min)
In the last part of the workshop, the participants will be able to try out the functions of Transkribus on their own laptops. They will be supported by the workshop leaders who will explain the different elements of the platform and then give participants the chance to practice each function for themselves. The workshop leaders will circulate around the room to answer any questions.

The workshop leaders will demonstrate the following tasks. After each demonstration, participants will be given 10-15 minutes to practice what they have learned.
* Document management – how to upload, view, save, move and export documents in standard formats (PDF, TEI, docx, PAGE XML)
* User management – how to allow specific users to view and edit documents
* Layout analysis – how to segment your documents to create training data for the HTR engines
* Transcription – how to create a rich transcript with tags and mark-up
* HTR – how to apply HTR models to automatically generate transcripts, how to conduct a keyword search of your documents, how to assess the accuracy of automatically generated transcripts

The workshop will close with a Question and Answer session where participants can clarify anything they are unsure about. They will also have the opportunity to provide feedback on the Transkribus tool via our user survey.
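The Character Error Rate mentioned in the introduction, and the task of assessing the accuracy of automatically generated transcripts, can be illustrated with a small calculation. The abstract does not say how Transkribus computes the figure, so the following is only a minimal sketch that assumes the common definition: the Levenshtein (edit) distance between the ground-truth transcript and the automatic transcript, divided by the length of the ground truth. The function names and the sample strings are hypothetical and are not taken from Transkribus.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `hypothesis` into `reference`."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the number of ground-truth characters
    (assumed definition; not confirmed by the abstract)."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical ground truth and automatic transcript of one line.
    ground_truth = "the written records of human history"
    automatic = "the writen records of humam history"
    print(f"CER: {character_error_rate(ground_truth, automatic):.3f}")
```

Under this definition, inserted characters also count as errors, so reading a Character Error Rate of 5 % as "95 % of the characters are correct" is a close approximation rather than an exact equivalence.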

Number of participants: 15
Participants will need to bring their own laptops and install Transkribus (https://transkribus.eu/Transkribus/) before attending the workshop.

Topics: Nordic Textual Resources and Practices
Keywords: digitisation, handwritten text recognition, digital scholarly editing, crowdsourcing, OCR

References
Leifert, G., Strauß, T., Grüning, T., and Labahn, R., ‘Cells in Multidimensional Recurrent Neural Networks’ (2016), https://arXiv.org/abs/1412.2620v02
Mühlberger, G., Colutto, S., Kahle, P., ‘Handwritten Text Recognition (HTR) of Historical Documents as a Shared Task for Archivists, Computer Scientists and Humanities Scholars. The Model of a Transcription & Recognition Platform (TRP)’ (pre-print)
Gatos, B., Louloudis, G., Causer, T., Grint, K., Romero, V., Sánchez, J.A., Toselli, A.H., and Vidal, E., ‘Ground-Truth Production in the tranScriptorium Project’, Document Analysis Systems (DAS), 2014 11th IAPR International Workshop on Document Analysis Systems (2014), 237-244
Stamatopoulos, N., and Gatos, B., ‘Goal-oriented performance evaluation methodology for page segmentation techniques’, 13th International Conference on Document Analysis and Recognition (ICDAR) (2015), 281-285.
Zagoris, K., Pratikakis, I., Antonacopoulos, A., Gatos, B., and Papamarkos, N., ‘Handwritten and Machine Printed Text Separation in Document Images Using the Bag of Visual Words Paradigm’, in: Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference, Bari (2012), 103-108. DOI: 10.1109/ICFHR.2012.207.

Contact information: The workshop will be delivered by Louise Seaward (University College London) and Maria Kallio (National Archives Finland). The contact is Louise Seaward.
Dr Louise Seaward, Bentham Project, Faculty of Laws, University College London, Bidborough House, 38-50 Bidborough Street, London, WC1H 9BT
[email protected]
+44 020 3108 8397

Bibliography
Seaward, L. 'The Small Republic and the Great Power: Censorship between Geneva and France in the later Eighteenth Century', The Library: Transactions of The Bibliographical Society, Forthcoming
Seaward, L. (2014) 'The Société typographique de Neuchâtel (STN) and the Politics of the Book Trade in late Eighteenth-Century Europe, 1769-1789', European History Quarterly, vol. 44 (3), pp. 439-479
Seaward, L. (2014), 'Censorship through Cooperation: The Société typographique de Neuchâtel (STN) and the French Government, 1769-89', French History, vol. 28 (1), pp. 23-42
