Digital humanities at the University of Tartu: state of the art
Andres Kimber, Liina Lindström, Peeter Tinits
Tallinn, 27.09.2019

Centre for Digital Humanities and Information Society of the University of Tartu
Founded in 2018; some activities started earlier
Goal: develop interdisciplinary teaching and research in the field of DH and IS
Partners: all institutes in the Faculty of Arts and Humanities, Institute of Computer Science, Institute of Social Studies, Tartu University Library
Council
digihum.ut.ee
[Note: By the way, I found out that the cultural evolution seminar (kult-evo-sem) started in 2015 and DH was a keyword in its first call. Basically a thematic event that took place at UT, but I am not sure whether it is related.]

Staff
Head of the Centre: Liina Lindström (Institute of Estonian and General Linguistics)
Head of the Council: Andra Siibak (Institute of Social Studies)
Project manager: Andres Kimber (from 1.10.: Ann Siiman)
Specialist in DH: Peeter Tinits
Lecturer in Computational Linguistics: Siim Orasmaa
Junior Researcher in Applied Dialectology: Maarja-Liisa Pilvik
Visiting lecturer in DH: Joshua Wilbur
Visiting lecturer in DH: Artjoms Šeļa

Learning and teaching Digital Humanities

Programme & courses
Elective module of DH for all MA programmes in the Faculty of Arts and Humanities since 2017
Minor in DH since 2019/20
Funding from HITSA
Elective courses by visiting lecturers
Summer Schools: Digital Methods in Humanities and Social Sciences 2018, 2019
Our main target group has been MA and PhD students

Guest lecturers
Visiting lecturers in DH since 2016
Funded by ASTRA
Lecturers with different disciplinary backgrounds, so that courses can be offered to students from different fields
List of lecturers: David Lorenz (quantitative linguistics), Leandro Ezequiel Koile (phylogenetic & quantitative linguistics), Néhémie Strupler (computational archaeology), Artjoms Šeļa (literary studies), Kimmo Elo (social sciences), Joshua Wilbur (documentary linguistics/language technology), Timothy Tangherlini (computational folkloristics)

Activities
Linguistics (dialectology, historical linguistics, corpus linguistics, phonetics, documentary linguistics)
Archaeology (GIS, photogrammetry)
Literary Studies (stylometry)
Cultural Evolution (computational methods)
Social Science
NLP tools

Case studies: Corpus-based dialectology
Liina Lindström, Maarja-Liisa Pilvik et al.
Corpus of Estonian Dialects, compiled 1998-2018
Recordings (from the 1960s-1970s) → transcriptions → morphological tagging → data analysis & visualization on maps
Quantitative variation analysis; frequency & frequency maps; distribution of dialectal features etc.
Maps & GIS applications for dialect research: http://rurake.keeleressursid.ee/index.php/apps/

Case studies: Corpus-based dialectology
Simple frequency maps: 1sg pronoun omission rate in Estonian dialects
Combined with methods used in variation studies: (mixed-effects) logistic regression models, conditional inference trees, random forests, etc.

Case studies: Communal court minute books
● Some materials from 1866 to 1890 previously digitized in 2004
● In HTML format
● 22 municipalities, 2867 files
● Server crash → rescued via web archives

Case studies: Communal court minute books
Aigi Rahi-Tamm, Kadri Muischnek, Liina Lindström, Maarja-Liisa Pilvik, Siim Orasmaa, Gerth Jaanimäe et al., in collaboration with Estonian National Archives
Cleaning and morphologically annotating the texts; Named Entity Recognition
EstNLTK tools (Python libraries for analyzing Estonian)
Highly varying language: North and South Estonian (dialectal features); old and new spelling system; handwritten texts
Annotated texts available in the Corpus of Old Literary Estonian
Crowdsourcing platform opened in 2019: http://www.ra.ee/vallakohtud/

Case studies: Spelling variation and prescription
1880-1920 processes of standardization
Text corpora & dictionary advice

Case studies: Historical biographies
Metadata on culturally significant people (1800-1930) => database.
Useful for cultural history, linguistic analyses, etc.
Visualization link

Case studies: cooperation with GLAMs
Cooperation with Estonian National Library on text corpora

Case studies: Cultural evolution of films
1910-2019 film crews
Growth in: size, structure, complexity

Case studies: unmasking academic “forgery”
- 1978: Collection of poems by Gavriil Batenkov (1793-1863), published by literary scholar Aleksandr Iliushin.
- ~40% of the texts in the collection have no confirmed manuscript source. Possible forgery?
- Extensive close linguistic and formal examination led to inconclusive results (Shapir 2000)
- A way to resolve this: the “unmasking” method (Koppel et al. 2012) combined with multi-level features (lexical, morphological, versification: rhythm, rhyme)
- Unmasking basically asks: how does author A behave in relation to themselves? How in relation to others? Is the Pseudo-author similar to the actual author A?

Case studies: unmasking academic “forgery”
How do real same-author samples behave?
Grey lines: author vs. others
Red line: how each particular author classifies vs. themselves: fast decrease, low precision rates

Case studies: unmasking academic “forgery”
How does Pseudo-Batenkov behave vs. his “real” self?

Case studies: Archaeological data management
● Archaeological data in four different databases
● COST Action: SEADDA (Saving European Archaeological Data from the Digital Dark Age)
Estonian Museum Information System: muis.ee
National registry of cultural monuments: register.muinas.ee

Case studies: Archaeological data management
● Digitisation of reports and local lore
● Working towards aggregating data in line with the FAIR principles
University of Tallinn: arheoloogia.ee
University of Tartu: tara.ut.ee
[Figure: Distribution of Estonian archaeological sites]

Case studies: 3D models of cultural heritage
● Photogrammetry, laser scanning & RTI
● Official data management and metadata guidelines are lacking
Cellar of a 14th-century merchant house in Tartu. Model by Andres Kimber / University of Tartu. https://skfb.ly/6NG7O
Mummy of a boy (4th-2nd century BC) at UT Art Museum. Model by Ragnar Saage / University of Tartu. https://skfb.ly/6HrZO

Case studies: GIS and spatial analysis in archaeology
● Mapping, fieldwork planning and reporting
● Analysing land use patterns and landscape perception
Total viewshed analysis of Rebala Heritage Reserve. Andres Kimber / University of Tartu
Predictive model of settlement locations. Model by Allar Haav / University of Tartu

Case studies: Documentary Linguistics and Language Technology (Pite Saami)
Automatic annotations:
● word
● lemma
● part of speech
● morphology
● English glosses

Case studies: Text mining cultural transitions
Deep Transitions in socio-technical systems (1900-2019)
4-year project in Social Science
Newspaper texts -> social history

Events
Cultural Evolution Seminar (2015-2018)
DH-lab
  Room to work in every Friday
  Learning together, works in progress
Summer School Digital Methods in Humanities and Social Sciences 2018, 2019
  ~100 students
  4-5 days of workshops
  Target group: PhD & MA
  4 graduate schools
DH conferences
  In collaboration with Estonian Digital Humanities Society and other institutions.

Thank you!